Best approach to scraping scene releases?
#1
Hi,

I've been trying to scrape folders of scene releases in their original format, and so far I've only managed to successfully scrape movies.

What's the best approach to scraping scene releases of TV shows, where most of the show information is in the folder name?

An example might be Cops.S26E13.720p.HDTV.x264-SYS/ which contains a bunch of rar files and a .nfo file.

This is my advancedsettings.xml that allows me to scrape movies:
Code:
<advancedsettings>
  <video>
    <cleanstrings>
      <regexp>[ _\,\.\(\)\[\]\-](ac3|dts|custom|dc|divx|divx5|dsr|dsrip|dutch|dvd|dvdrip|dvdscr|dvdscreener|screener|dvdivx|cam|fragment|fs|hdtv|hdrip|hdtvrip|internal|limited|multisubs|ntsc|ogg|ogm|pal|pdtv|proper|repack|rerip|retail|r3|r5|bd5|se|svcd|swedish|german|read.nfo|nfofix|unrated|ws|telesync|ts|telecine|tc|brrip|bdrip|480p|480i|576p|576i|720p|720i|1080p|1080i|hrhd|hrhdtv|hddvd|bluray|x264|h264|xvid|xvidvd|xxx|www.www|cd[1-9]|\[.*\])([ _\,\.\(\)\[\]\-]|$)</regexp>
      <regexp>(\[.*\])</regexp>
    </cleanstrings>
  </video>

  <tvshowmatching>
    <regexp>[Ss]([0-9]+)[][ ._-]*[Ee]([0-9]+)([^\\/]*)$</regexp>  <!-- foo.s01.e01, foo.s01_e01, S01E02 foo, S01 - E02 -->
    <regexp>[\._ -]()[Ee][Pp]_?([0-9]+)([^\\/]*)$</regexp>  <!-- foo.ep01, foo.EP_01 -->
    <regexp>([0-9]{4})[\.-]([0-9]{2})[\.-]([0-9]{2})</regexp>  <!-- foo.yyyy.mm.dd.* (byDate=true) -->
    <regexp>([0-9]{2})[\.-]([0-9]{2})[\.-]([0-9]{4})</regexp>  <!-- foo.mm.dd.yyyy.* (byDate=true) -->
    <regexp>[\\/\._ \[\(-]([0-9]+)x([0-9]+)([^\\/]*)$</regexp>  <!-- foo.1x09* or just /1x09* -->
    <regexp>[\\/\._ -]([0-9]+)([0-9][0-9])([\._ -][^\\/]*)$</regexp>  <!-- foo.103*, 103 foo -->
    <regexp>[\/._ -]p(?:ar)?t[_. -]()([ivx]+)([._ -][^\/]*)$</regexp>  <!-- Part I, Pt.VI -->
  </tvshowmatching>
</advancedsettings>

Thanks in advance.
#2
I've also tried using the latest version of the extra regex for TV shows.

Here's a snippet of my log:

Code:
02:23:49 T:4576 WARNING: No information found for item 'ftp://…/…/Cops.S26E02.HDTV.x264-2HD/', it won't be added to the library.
02:23:50 T:4576 WARNING: No information found for item 'ftp://…/…/Cops.S26E03.720p.HDTV.x264-2HD/', it won't be added to the library.
02:23:50 T:4576 WARNING: No information found for item 'ftp://…/…/Cops.S26E03.HDTV.x264-2HD/', it won't be added to the library.
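One way to sanity-check the first tvshowmatching expression outside of XBMC is to run it against the example folder name in Python. This is only a rough sketch: the pattern is a Python transliteration of that expression (the bracket class is written with escapes), parse_episode is a hypothetical helper, and it says nothing about which part of the path XBMC actually feeds to the regex. It does show that the expression is anchored with $ and excludes slashes, so a path that ends in "/" (as the items in the log do) cannot match.

```python
import re

# Python transliteration of the first <tvshowmatching> expression.
# Assumption: XBMC's regex engine treats this pattern the same way.
EPISODE_RE = re.compile(r"[Ss]([0-9]+)[\]\[ ._-]*[Ee]([0-9]+)([^\\/]*)$")


def parse_episode(path):
    """Hypothetical helper: return (season, episode) or None if no match."""
    match = EPISODE_RE.search(path)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Bare folder name: the season and episode are found.
print(parse_episode("Cops.S26E13.720p.HDTV.x264-SYS"))   # (26, 13)

# Same name with a trailing slash: the final group cannot consume "/",
# so the $ anchor fails and nothing matches.
print(parse_episode("Cops.S26E13.720p.HDTV.x264-SYS/"))  # None
```

If the scanner really does test the folder path with its trailing slash, this would be consistent with the warnings above, but that is an assumption to verify against a debug log.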