Kodi Community Forum
Extra REGEX for TV Show Episode matching - Printable Version




RE: Extra REGEX for TV Show Episode matching - Neroes - 2012-12-30

(2012-12-27, 14:21)shms Wrote: I am totally oblivious to this. Could someone please generate a regex for: seventwenty-weedss01e12.avi, avchd-tb-s03e10-720p-x264.avi and flhd-bes02e04-1080p.avi?

It would be easier just to make a folder called weeds, put the avi file in it, and rename the file to s01e12.avi.
If you want more structure in your file system, add a folder named after the season,
like this:
\weeds\season 1\S01E12.avi
*Capitalization doesn't matter.


RE: Extra REGEX for TV Show Episode matching - Ned Scott - 2012-12-30

(2012-12-24, 12:50)Cav99 Wrote: Hello,

I know that this is an annoying request, but I've been naming my TV shows in a non standard format for about 10 years now, and I want to use XBMC over Mediaportal but XBMC won't read the naming convention I have. The annoying thing is Mediaportal does! Anyway, I have over 200 TV shows, so I hope to avoid renaming. Any help you can provide would be appreciated. My structure is:

\[TV Show Name]\Season 01\1.01 Pilot.avi

Is there a regex that allows this to be read? I've tried unsuccessfully to make this work, and any info you could provide would be much appreciated.

Thanks

This should work by default.


RE: Extra REGEX for TV Show Episode matching - Cav99 - 2013-01-06

(2012-12-30, 03:55)Ned Scott Wrote:
This should work by default.

Thanks for replying. XBMC reads the TV shows and creates the database entries for them, but cannot read the episodes themselves. I think it's because I used a dot (".") instead of, say, an "x" between the season and the episode number. Do you have any suggestions for regexes? Thanks.
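
A pattern for this layout can be prototyped outside XBMC before touching advancedsettings.xml. The Python sketch below is only an illustration: the expression and the helper name `parse_dotted` are assumptions, not an XBMC default, and XBMC's regex engine may differ from Python's `re` in small ways. It captures the season before the dot and the episode after it, which is the two-group shape the `<tvshowmatching>` entries in this thread use.

```python
import re

# Hypothetical pattern for "<season>.<episode> <title>" file names such as
# "1.01 Pilot.avi". Group 1 is the season, group 2 the episode.
DOTTED = re.compile(r"[\\/](\d{1,2})\.(\d{2})[ ._-][^\\/]*$")


def parse_dotted(path):
    """Return (season, episode) as ints, or None if the path does not match."""
    m = DOTTED.search(path)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


print(parse_dotted(r"\Some Show\Season 01\1.01 Pilot.avi"))   # (1, 1)
print(parse_dotted(r"\Some Show\Season 02\2.13 Finale.avi"))  # (2, 13)
print(parse_dotted(r"\Some Show\Season 01\Pilot.avi"))        # None
```

If the expression behaves as wanted against real paths, the same text would go inside a `<regexp>` element; whether XBMC accepts it unchanged is untested here.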


RE: Extra REGEX for TV Show Episode matching - Evin - 2013-01-29

(2012-09-08, 23:39)mightybalthazar Wrote:
(2012-09-07, 04:50)mightybalthazar Wrote: Another Anime tip and a question of my own....

I have found that by adding the simple regexp "<regexp>[/\._ \-]()([0-9]+)(-[0-9]+)?</regexp>" from the wiki for absolute numbering of single season TV shows that my anime scrapes correctly 99% of the time. However, where this fix ultimately fails is with episodes that count into the triple digits.

For example, the following file will be found as expected (I use AniDB.net to scrape):

Hanasaku Iroha
|-----[Coalgirls]_Hanasaku_Iroha_01_(1920x1080_Blu-Ray_FLAC)_[DF0A6D51].mkv

Even version files are found with no hiccups:

Durarara
|-----[EC]Durarara - 01v3(1280x720 h264)[9FA1A46E].mkv

However, when episodes surpass 99, the scraper assumes the first digit is the season; thus, "[Taka]_Naruto_Shippuuden_135_[480p][9073B8C2].ogm" returns as "1x35. Naruto Shippuuden"

Is there any way to get to the XBMC defaults so that I can disable foo.101.* as a naming convention? I never name any of my episodes this way (really, does anyone?), and I think it would easily solve my issue, and work for other anime watchers as well.

Thanks in advance for any help with this matter!

(2012-09-08, 21:58)mightybalthazar Wrote:
(2012-09-07, 09:09)Haudrauf Wrote: Hi,
have you tried to rename the triple-digit-episodes like
"[Taka]_Naruto_Shippuuden_S01E135_[480p][9073B8C2].ogm"?
Will the episode be scraped correctly?

Yes, that does work, but it is not desirable. I like to keep my episode naming conventions consistent, and since absolute numbering is the standard for anime, that's what I'd like to stick with.

I'd rather not have:
[SubGroup]Anime.Title.97(1080p.FLAC)[1234ABCD]
[SubGroup]Anime.Title.98(1080p.FLAC)[1234ABCD]
[SubGroup]Anime.Title.99(1080p.FLAC)[1234ABCD]
[SubGroup]Anime.Title.S1E100(1080p.FLAC)[1234ABCD]

Nor do I want to rename 99 episodes to match the "S1E01" formatting. Ideally, there would be a way to remove "foo.101.*" formatting from the defaults. That way all my TV shows can follow S1E01 and my anime can follow absolute numbering, and I will not have to go through the laborious task of renaming all my files.

Huzzah! I think I have this working.

What I did was copy the default XBMC settings, remove the foo.101 line, and add the single season matching line. Omitting any append/prepend actions, this overwrites the defaults and looks to be working fairly well on both my TV (S01E01) and anime (1, 2, 3, etc).

I found one hiccup with anime named using "EP" as an episode prefix, but this isn't common among sub groups, so I am just going to rename.
Example:
[Elysium]Show.Title.EP01(BD.1080p.FLAC)[1234ABCD]

This does not pick up multiple-part episodes (S01E12-13, S06E06E07, or 135-136). It will only grab the first episode in the set (using TVDB, AniDB), but I am not sure this is a settings problem as much as a database problem. It's not that big a deal to me, so I probably won't look into it any further.


Code:
<tvshowmatching>
<regexp>\[[Ss]([0-9]+)\]_\[[Ee]([0-9]+)([^\\/]*)</regexp> <!-- foo_[s01]_[e01] -->
<regexp>[\._ \-]([0-9]+)x([0-9]+)([^\\/]*)</regexp> <!-- foo.1x09 -->
<regexp>[\._ \-][Ss]([0-9]+)[\.\-]?[Ee]([0-9]+)([^\\/]*)</regexp> <!-- foo s01e01, foo.s01.e01, foo.s01-e01 -->
<regexp>[\._ \-]p(?:ar)?t[._ -]()([ivxlcdm]+)([\._ \-][^\\/]*)</regexp> <!-- Pt.I, Part XIV -->

<regexp>[/\._ \-]()([0-9]+)(-[0-9]+)?</regexp> <!-- Single Season Matching -->
</tvshowmatching>

Hope this is of some help to others. I will run it through some more tests, but it looks to be working like I hoped.
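
The effect of this list can be checked offline. The sketch below copies the five expressions from the block above and runs them in Python, assuming XBMC tries them top to bottom and stops at the first hit, and that Python's `re` treats these patterns the same way XBMC's engine does. `first_match` is a made-up helper, not part of XBMC.

```python
import re

# The expressions from the <tvshowmatching> block above, in the same order.
PATTERNS = [
    r"\[[Ss]([0-9]+)\]_\[[Ee]([0-9]+)([^\\/]*)",                # foo_[s01]_[e01]
    r"[\._ \-]([0-9]+)x([0-9]+)([^\\/]*)",                      # foo.1x09
    r"[\._ \-][Ss]([0-9]+)[\.\-]?[Ee]([0-9]+)([^\\/]*)",        # foo s01e01
    r"[\._ \-]p(?:ar)?t[._ -]()([ivxlcdm]+)([\._ \-][^\\/]*)",  # Pt.I, Part XIV
    r"[/\._ \-]()([0-9]+)(-[0-9]+)?",                           # single season
]


def first_match(name):
    """Return (season, episode) strings from the first pattern that hits."""
    for pattern in PATTERNS:
        m = re.search(pattern, name)
        if m:
            return m.group(1), m.group(2)
    return None


# With no foo.101 rule in the list, 135 stays a single episode number.
print(first_match("[Taka]_Naruto_Shippuuden_135_[480p][9073B8C2].ogm"))  # ('', '135')
# Regular SxxEyy names are caught by the third pattern.
print(first_match("Show.Title.S01E02.720p.mkv"))                         # ('01', '02')
# The "EP" prefix is skipped and the resolution is picked up instead.
print(first_match("[Elysium]Show.Title.EP01(BD.1080p.FLAC)[1234ABCD]"))  # ('', '1080')
```

The last line shows one way the "EP" hiccup can arise under these assumptions: no separator character sits directly in front of the episode digits, so the single season pattern moves on to the next number it can find.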

Hi, can anyone please tell me how to achieve this? I can't find the userdata folder with the default ones... :/

Or is there another way to scrape, for example, Anime/Shows/Fairy Tail/Fairy Tail - 123.mkv in absolute ordering? It doesn't work for triple digits...

Thx in advance!



RE: Extra REGEX for TV Show Episode matching - gahenna - 2013-02-02

Sorry if this has been covered already. I have been studying the code you posted and it seems to pick up anomalies, but I have issues with a few specific shows. Is there a way to write a case that directs a file matching a pattern towards a specific show? A few of the shows I have trouble with are The Big Bang Theory and Castle (2009). I can post some debug info later to help determine WHY they fail to be recognized properly, but I was hoping to find out whether regex was going to help me.


RE: Extra REGEX for TV Show Episode matching - gbandit - 2013-02-03

Hi,

There's an advanced setting used in the linked file that I can't find listed on the Wiki. Does anyone know if this is even a valid setting in Frodo or just an omission on the wiki?
Code:
<myvideos>
   <extractthumb>false</extractthumb> <!-- Dont create random thumbnails. Either scrape them from the internet or dont have them -->
</myvideos>

Thanks.


RE: Extra REGEX for TV Show Episode matching - scudlee - 2013-02-03

It's actually a GUI setting...

Just not one you can actually set using the GUI...


RE: Extra REGEX for TV Show Episode matching - gbandit - 2013-02-03

Interesting. Thanks for that scudlee.


RE: Extra REGEX for TV Show Episode matching - Griever92 - 2013-02-07

Currently, I'm using the following regex to match episodes among my anime content, but it does not appear to be working at all.
Code:
<tvshowmatching>
        <regexp>(?i)()(?:[\. _-]|ep)(\d{1,3})[\. _-v].*[[({][\da-f]{8}[])}]</regexp>
    </tvshowmatching>

*I have also used prepend and append with no change.

It begins the video info scan at line 16, and fails each file consecutively starting at line 19.

http://pastebin.com/SjVwH5Py

(Obviously this is occurring across multiple series; this one is just an example.)


RE: Extra REGEX for TV Show Episode matching - bondaki - 2013-02-07

Edit: Never mind... I misread something...

For anime, what I do is use this, just after the 3-number TV show regexes in prepend from this forum; otherwise it finds many false positives:

<regexp>[/\._\-]()(\d{2,3})([^\\/]+)$</regexp>


This doesn't look for a CRC, so you might want to add that bit if all your anime have CRCs in the file name.


RE: Extra REGEX for TV Show Episode matching - Griever92 - 2013-02-07

(2013-02-07, 09:26)bondaki Wrote: For anime, what I do is use this, just after the 3-number TV show regexes in prepend from this forum; otherwise it finds many false positives

<regexp>[/\._\-]()(\d{2,3})([^\\/]+)$</regexp>

This doesn't seem to do anything different in my case. (Inserted into the prepend area of v2.3 where you specified)
Either that, or XBMC has decided it's no longer listening to changes made to advancedsettings.xml

What kind of response should I be looking for from the regex in order for XBMC to 'understand' what the episode number is?
Running the original regex I had (provided in v2.3 from Post #2) through pythonregex.com, using Code Geass Hangyaku no Lelouch - 01 - The Day a New Demon Was Born - [OZC](bfe4e9eb).mkv as an example, returns the following.

Code:
>>> regex = re.compile("(?i)()(?:[\. _-]|ep)(\d{1,3})[\. _-v].*[[({][\da-f]{8}[])}]")
>>> r = regex.search(string)
>>> r
<_sre.SRE_Match object at 0xe539b65a74cdded0>
>>> regex.match(string)
None

# List the groups found
>>> r.groups()
(u'', u'01')

# List the named dictionary objects found
>>> r.groupdict()
{}

# Run findall
>>> regex.findall(string)
[(u'', u'01')]

# Run timeit test
>>> setup = ur"import re; regex =re.compile("(?i)()(?:[\. _-]|ep)(\d{1,3})[\. _-v].*[[({][ ...
>>> t = timeit.Timer('regex.search(string)',setup)
>>> t.timeit(10000)
0.179860115051

Now I can see within here that it's grabbing the episode number and seemingly ignoring every other piece of data, but what is XBMC looking for exactly, in order to determine that this 'file' is a valid episode, and proceed with scraping the information?


RE: Extra REGEX for TV Show Episode matching - bondaki - 2013-02-07

XBMC uses the regexes to get the season and episode numbers. In the case of anime, the empty () group sends nothing for the season.

Post your advancedsettings.xml and its location; you may have made a mistake in there.
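
The group-to-field mapping described here can be reproduced in Python. The sketch below treats capture group 1 as the season and group 2 as the episode; `season_episode` is a hypothetical helper, and Python's `re` is assumed to behave like XBMC's engine for these two patterns. The first pattern is the one from the post above with the inner `[` escaped (Python 3 otherwise warns about a nested set; the match is unchanged), and the second is the shorter anime pattern exactly as posted.

```python
import re

# Pattern from the pythonregex.com session above, inner "[" escaped.
GRIEVER = re.compile(r"(?i)()(?:[\. _-]|ep)(\d{1,3})[\. _-v].*[\[({][\da-f]{8}[])}]")
# Shorter anime pattern, exactly as it appears in the thread.
BONDAKI = re.compile(r"[/\._\-]()(\d{2,3})([^\\/]+)$")

NAME = ("Code Geass Hangyaku no Lelouch - 01 - "
        "The Day a New Demon Was Born - [OZC](bfe4e9eb).mkv")


def season_episode(pattern, name):
    """Map group 1 to season and group 2 to episode; None means no match."""
    m = pattern.search(name)  # search(), because the hit is mid-string
    if m is None:
        return None
    season, episode = m.group(1), m.group(2)
    # An empty group 1 means the pattern supplied no season at all.
    return (int(season) if season else None, int(episode))


print(season_episode(GRIEVER, NAME))  # (None, 1)
print(GRIEVER.match(NAME))            # None: match() only tries position 0
print(season_episode(BONDAKI, NAME))  # None
print(season_episode(BONDAKI, "Show_Title_-_12_[ABCD1234].mkv"))  # (None, 12)
```

As written in the thread, the shorter pattern's separator class has no space in it, so in Python it does not hit a name that uses " - 01 - ", while an underscore-separated name matches. The `None` from `regex.match` in the session above is not a failure; it only reflects that `match` anchors at the start of the string.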


RE: Extra REGEX for TV Show Episode matching - Griever92 - 2013-02-07

http://pastebin.com/dUDgDGqB

C:\Program Files (x86)\XBMC\userdata\advancedsettings.xml
(Have tried using this file in AppData\Roaming\XBMC\userdata\ as well)

[EDIT]
Looks like I got it working now using the v2.3 set in Post#2.
I had to go in and clear out the scraper cache for anidb and restart XBMC, but it's populating now.
(AppData\Roaming\XBMC\cache\scrapers\metadata.anidb.net)


RE: Extra REGEX for TV Show Episode matching - kwanbis - 2013-02-11

This is very weird. For my Once Upon A Time episodes, it appears to recognize all of them, but the 10th episode of the 1st season is not showing.

Also, NONE of the 2nd season episodes are showing. I have tried with 1x?? and s01e??.

Here is the log (http://pastebin.com/6sEurV8P) and a screenshot (http://i.imgur.com/5xkUEhj.jpg).

I haven't checked other episodes/series.

Any ideas? By the way, I'm using a standard FRODO install.


RE: Extra REGEX for TV Show Episode matching - scudlee - 2013-02-11

1x10 isn't found because of the .7 immediately after the episode number:

once.upon.a.time.s01e10.7.15.am.webdl.mkv.

The episode is interpreted as being the 7th sub-part of episode 10 (which doesn't exist). You need to rename the file to put either a space or underscore between the actual episode number and the 7.15 of the episode title.
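
The sub-part behaviour can be illustrated with a small Python sketch. The exact default expression is not shown in this thread, so the pattern below is an assumption: an SxxEyy rule whose episode group also accepts an optional sub-part suffix (a letter a-i, or ".1" to ".9" when no further digit follows). `episode_token` is a made-up helper name.

```python
import re

# Assumed SxxEyy pattern with an optional sub-part suffix on the episode.
# Illustration only; not copied from XBMC's defaults.
SXXEYY = re.compile(
    r"[Ss]([0-9]+)[\]\[ ._-]*[Ee]"
    r"([0-9]+(?:(?:[a-i]|\.[1-9])(?![0-9]))?)"
    r"([^\\/]*)$"
)


def episode_token(name):
    """Return the raw text captured for the episode, or None."""
    m = SXXEYY.search(name)
    return m.group(2) if m else None


# ".7" directly after the episode number is swallowed as a sub-part.
print(episode_token("once.upon.a.time.s01e10.7.15.am.webdl.mkv"))  # 10.7
# A space or underscore after the number keeps the title separate.
print(episode_token("once.upon.a.time.s01e10 7.15.am.webdl.mkv"))  # 10
print(episode_token("once.upon.a.time.s01e10_7.15.am.webdl.mkv"))  # 10
```

Under that assumption the original name yields "10.7", which matches the explanation above, and either suggested rename yields a plain "10".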


Speaking of episode titles... The reason season 2 isn't included should become apparent if you compare the file names to the titles being scraped.