XBMC Community Forum
Regular Expressions - Printable Version

+- XBMC Community Forum (http://forum.xbmc.org)
+-- Forum: Help and Support (/forumdisplay.php?fid=33)
+--- Forum: XBMC General Help and Support (/forumdisplay.php?fid=111)
+--- Thread: Regular Expressions (/showthread.php?tid=25349)


Regular Expressions - blakholephysics - 2007-03-20 19:51

It seems like I'm missing something but I'm having trouble with the regular expressions. I don't understand how to modify them.

Could someone explain some of this to me?

Code:
<tvshowmatching>
    <regexp>\[[Ss]([0-9]*)\]_\[[Ee]([0-9]*)[^\\/]*</regexp>
    <regexp>[\._ \-]([0-9]*)x([0-9]*)[^\\/]*</regexp>
    <regexp>[\._ \-][Ss]([0-9]*)[\.\-]?[Ee]([0-9]*)[^\\/]*</regexp>
    <regexp>[\._ \-]([0-9]*)([0-9][0-9])[\._ \-][^\\/]*</regexp>
    <twopart>
        <regexp>\[[Ss]([0-9]*)\]_\[[Ee][0-9][0-9]\-([0-9]*)\][^\\/]*</regexp>
        <regexp>[\._ \-][Ss]([0-9]*)[^0-9]*[Ee][0-9][0-9]\-([0-9]*)[^\\/]*</regexp>
        <regexp>[\._ \-][0-9]*x[0-9]*[\._ \-]*([0-9]*)x([0-9]*)[^\\/]*</regexp>
    </twopart>
</tvshowmatching>

Specifically, I format my tv shows with /tvshowname/season#/episode# - episodetitle.extension

And I'm not sure how to modify this part of the advanced settings to do that.


- spiff - 2007-03-20 20:10

first, there was a bug in svn that i fixed @ rev 8255.

second;

a regexp for your scheme would be:

season([0-9]+)[\\/]episode([0-9]+)[^\\/]*

now, the explanation (man, i'm patient today):

the season part should be self-explanatory; we need it to say 'season' at the start of the pattern. then we need 1 or more digits. we state that we only allow digits by the [0-9] part. this defines a list of allowed characters - 0-9 is expanded into 0 1 2 3 4 5 6 7 8 9. we say that we need one or more of these using the + sign. now, since this is the season number and that's what we are after, we select this part. this is done by enclosing it in parentheses - ([0-9]+). now we need either a backslash or a slash. since a backslash has a special meaning in regular expressions we tell the parser that we mean a literal \ by doing \\. then it needs episode literally, and we do the same thing to select the episode number. finally we want to make sure we match at the end of the path and not in the middle of it. we do that by specifying a list of not allowed characters, namely \ and /. this we do by putting a ^ inside the []'s, inverting the meaning of the list. finally we need 0 or more of these non-slash characters, which we indicate by a *.
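
The walkthrough above can be checked with a short Python sketch. Python's re module stands in for XBMC's regexp engine here, which is an assumption, as is the use of search() to look for the pattern anywhere in the path. The season_episode helper and the example paths are hypothetical and not taken from the thread.

```python
import re

# spiff's pattern, copied from the post. A raw string keeps the backslashes
# exactly as they would be typed into the <regexp> element.
PATTERN = re.compile(r"season([0-9]+)[\\/]episode([0-9]+)[^\\/]*")


def season_episode(path):
    """Return (season, episode) as ints, or None if the pattern is not found."""
    # Assumption: search() (find the pattern anywhere in the string) is a
    # reasonable stand-in for how the scanner applies the expression.
    m = PATTERN.search(path)
    if m is None:
        return None
    # group(1) and group(2) are the two parenthesised parts of the pattern.
    return int(m.group(1)), int(m.group(2))


# Hypothetical example paths.
print(season_episode(r"tvshowname\season2\episode05 - title.avi"))  # (2, 5)
print(season_episode("tvshowname/season2/episode05 - title.avi"))   # (2, 5)
# A capital S does not match the lowercase 'season' in the pattern.
print(season_episode(r"tvshowname\Season2\episode05 - title.avi"))  # None
```

The first two calls show that [\\/] accepts either kind of slash; the third shows that the literal text in the pattern has to appear exactly as written.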


- jmarshall - 2007-03-21 00:26

Send that man a beer (hell - a whole crate for that effort) :)


- J_K_M_A_N - 2007-03-21 13:55

Wow spiff. That was VERY helpful. Thank you for being patient today. I may actually be able to work mine out now.

If I figure that out, maybe I can figure out how to change the wiki. I will do my best to make you proud.

J_K_M_A_N


- blakholephysics - 2007-03-21 21:45

That explanation is great. (we really do need something like that in the wiki)

I'm trying to read these expressions and looking at the first expression listed in the wiki:
Code:
\[[Ss]([0-9]*)\]_\[[Ee]([0-9]*)[^\\/]*

Why a backslash in the beginning? (does it indicate a literal "[" and then later a literal "]"?)
Why the underscore?
Why Ss and Ee? I thought that they were case insensitive.

Code:
[\._ \-]([0-9]*)x([0-9]*)[^\\/]*

Why two backslashes in the first segment? [\._ \-]

Now with what I am trying to do: season #\# - Episodetitle.extension

Would this be the correct expression?
Code:
season ([0-9]+)[\\/]([0-9]+) - [^\\/]



- DonJ - 2007-03-21 23:07

\[ means a literal [
[Ss] means a capital S or a small s.
\] means a literal ]
_ means underscore
[Ee] means E or e

regular expressions are case sensitive by default

[\._ \-] means a literal . or an underscore or a space or a literal -; the two backslashes just escape the . and the -
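
These points can be tried out in Python. This is only a sketch: Python's re module is assumed to behave like XBMC's engine for these simple patterns, and the file names, the strict pattern and the groups helper are made up for illustration.

```python
import re

# First pattern quoted from the wiki: literal [ and ], either case of S and E.
bracketed = re.compile(r"\[[Ss]([0-9]*)\]_\[[Ee]([0-9]*)[^\\/]*")
# Second pattern: one separator character (. _ space -), then season x episode.
separated = re.compile(r"[\._ \-]([0-9]*)x([0-9]*)[^\\/]*")
# Hypothetical variant with a plain capital S, to show case sensitivity.
strict = re.compile(r"\[S([0-9]*)\]")


def groups(pattern, name):
    """Return the captured groups as a tuple, or None when nothing matches."""
    m = pattern.search(name)
    return m.groups() if m else None


print(groups(bracketed, "Show_[s01]_[e02]_Title.avi"))  # ('01', '02')
print(groups(bracketed, "Show_[S01]_[E02]_Title.avi"))  # ('01', '02')
print(groups(strict, "Show_[s01]_[e02]_Title.avi"))     # None
print(groups(separated, "Show.1x02.Title.avi"))         # ('1', '02')
print(groups(separated, "Show 1x02 Title.avi"))         # ('1', '02')
# No separator character directly before the season number, so no match.
print(groups(separated, "Show1x02.avi"))                # None
```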

Your regular expression looks ok to me.
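
A quick way to see what the proposed expression captures is to run it in Python (a stand-in for XBMC's engine, which is an assumption). The path is a hypothetical example of the season #\# - Episodetitle.extension layout, and the find helper is made up for this sketch. It also tries a variant with a trailing *, as in spiff's pattern, to show how that changes the matched text.

```python
import re

# The expression as proposed in the question (no trailing *).
proposed = re.compile(r"season ([0-9]+)[\\/]([0-9]+) - [^\\/]")
# The same expression with a trailing *, in the style of spiff's pattern.
with_star = re.compile(r"season ([0-9]+)[\\/]([0-9]+) - [^\\/]*")

# Hypothetical example path.
path = r"tvshowname\season 1\02 - Episodetitle.avi"


def find(pattern, text):
    """Return (captured groups, full matched text), or None if not found."""
    m = pattern.search(text)
    if m is None:
        return None
    return m.groups(), m.group(0)


# Both variants capture season '1' and episode '02'.
# Without the *, the match stops after the first character of the title.
print(find(proposed, path))
# With the *, the match runs to the end of the file name.
print(find(with_star, path))
```

Under these assumptions both forms pick out the same season and episode numbers; they differ only in how much of the file name the overall match covers.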


- J_K_M_A_N - 2007-03-22 00:07

Man I feel like a complete MORON! I have my files like so:

Series Name\Season #\Series Name - Season # - Episode ## - Episode Name.avi

I thought the expression I would want would be:

[\\/][\-][Season]([0-9]+)[\-][Episode]([0-9][0-9])[^\\/]*

But that doesn't work. I am taking the season number from the file name. Should I take it from the directory instead? Can anyone give me a clue?

J_K_M_A_N

(sorry spiff :( )


- szsori - 2007-03-22 00:19

J_K_M_A_N Wrote:Series Name\Season #\Series Name - Season # - Episode ## - Episode Name.avi

I thought the expression I would want would be:

[\\/][\-][Season]([0-9]+)[\-][Episode]([0-9][0-9])[^\\/]*

But that doesn't work. I am taking the season number from the file name. Should I take it from the directory instead? Can anyone give me a clue?

Can you give an example filename and path? It's hard to tell which of the following you have:

Lost\1\Lost - 1 - 01 - Pilot.avi
Lost\Season 1\Lost - Season 1 - Episode 01 - Pilot.avi

With an example we can be sure we give you exactly what you need.


- spiff - 2007-03-22 00:22

you're missing the spaces...
plus you're saying that your pattern should start with a \ or a /, inconsistent with the seriesname part of your example.
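
One detail the replies do not spell out: in standard regexp syntax, square brackets define a set of single characters, so [Season] matches exactly one of S, e, a, s, o, n rather than the word Season. A minimal Python sketch (Python's re standing in for XBMC's engine, which is an assumption) using the second example path from szsori's post:

```python
import re

# J_K_M_A_N's attempt, copied verbatim from the post.
attempt = re.compile(
    r"[\\/][\-][Season]([0-9]+)[\-][Episode]([0-9][0-9])[^\\/]*"
)

# Example path from szsori's post.
name = r"Lost\Season 1\Lost - Season 1 - Episode 01 - Pilot.avi"

# The attempt needs a slash immediately followed by a '-', which the path
# never contains, so nothing is found.
print(attempt.search(name))  # None

# [Season] is a character class: it matches a single character from the set.
print(re.fullmatch(r"[Season]", "S") is not None)       # True
print(re.fullmatch(r"[Season]", "Season") is not None)  # False
# Without the brackets the letters must appear literally, in order.
print(re.fullmatch(r"Season", "Season") is not None)    # True
```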


- jmarshall - 2007-03-22 00:34

J_K_M_A_N (man that's hard to type :P)

You are wanting to match:

\Series Name - Season # - Episode ## - Episode Name.avi

Assuming your spacing includes those spaces, you need the "Season " then you need the number, then " - Episode " then the number:

Season ([0-9]+) - Episode ([0-9]+)[^\\/]*

You don't care what's after the Episode number, except that you don't want a slash as it has to be in the filename.

I'm sure you can mod it to include/not include spaces as necessary.
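
As a rough check, the pattern can be run in Python against the example paths from szsori's post. Python's re is assumed to behave like XBMC's engine here. The relaxed variant with optional spaces, the parse helper and the tight file name are hypothetical illustrations of one way to mod it, not something from the thread.

```python
import re

# The pattern from the post above.
strict = re.compile(r"Season ([0-9]+) - Episode ([0-9]+)[^\\/]*")
# Hypothetical variant: " ?" makes each space optional.
relaxed = re.compile(r"Season ?([0-9]+) ?- ?Episode ?([0-9]+)[^\\/]*")


def parse(pattern, path):
    """Return (season, episode) as ints, or None if the pattern is not found."""
    m = pattern.search(path)
    return (int(m.group(1)), int(m.group(2))) if m else None


spaced = r"Lost\Season 1\Lost - Season 1 - Episode 01 - Pilot.avi"
tight = r"Lost\Season 1\Lost-Season1-Episode01-Pilot.avi"  # hypothetical
bare = r"Lost\1\Lost - 1 - 01 - Pilot.avi"

print(parse(strict, spaced))   # (1, 1)
print(parse(strict, tight))    # None: the spaces are required
print(parse(relaxed, tight))   # (1, 1)
print(parse(relaxed, spaced))  # (1, 1)
print(parse(strict, bare))     # None: no 'Season' or 'Episode' text
```

Note that the "Season 1" directory on its own is not enough for a match, because the pattern also needs the " - Episode " part; the numbers come from the file name.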

And please: add your example regexp to the wiki, along with the names it matches and your explanation of how it works.

Cheers,
Jonathan