Thread: Is it possible for the themoviedb scraper to ignore a prefix? (/showthread.php?tid=187569)
Kodi Community Forum (https://forum.kodi.tv), Forum: Movie Scrapers (https://forum.kodi.tv/forumdisplay.php?fid=302)

Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-02-26

As not all movie series come in the correct order when sorted alphabetically (an extreme example is the 26-movie Zatoichi collection), I would like to have a prefix in the filename in Windows Explorer.

However, I'm having trouble finding a prefix that the themoviedb scraper ignores, so that it still finds my movies. For some series my prefix works fine (e.g. James Bond), but for others it no longer finds most movies (Zatoichi, Mad Max and many more).

I have tried many different prefixes, but none work well:

1-moviename
2-moviename
3-moviename
...

or

[1] moviename
[2] moviename
[3] moviename
...

or

1979 moviename
1981 moviename
1985 moviename

So my questions:
* does anyone know a prefix that might work?
* does anyone know a hack / patch so that the themoviedb scraper can ignore the prefix? (e.g. some regex in advancedsettings.xml)

Thanks!

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Prof Yaffle - 2014-02-26

I've done variations of (1) and (2) with no problems... I've simply numbered the films, made sure the date is there, and off it went, e.g. 1. moviename [year].

You can also specify the IMDb reference on a manual search, which solves a multitude of lookup problems.

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-02-26

1. Mad.Max.1979.1080p.DTS.HDMA --> not found
2. Mad.Max.2.1981.1080p.AC3.5.1.HQ --> not found
3. Mad.Max.Beyond.Thunderdome.1985.1080p.BluRay.x264-CiNEFiLE --> found

while

Mad.Max.1979.1080p.DTS.HDMA --> found
Mad.Max.2.1981.1080p.AC3.5.1.HQ --> found
Mad.Max.Beyond.Thunderdome.1985.1080p.BluRay.x264-CiNEFiLE --> found

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Prof Yaffle - 2014-02-26

What about "1. Mad Max [1979] - DTS HDMA" or variations? I wonder if the dots are confusing things as delimiters. Or "1 - Mad.Max.....". Or "1 - Mad.Max [1979] ....".

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-02-26

Thanks for the suggestion! But that would kind of mess up my entire naming convention. I prefer keeping the movie names as they are... only the prefix is changeable... I don't really feel like renaming 1500 movies today.

The dots work fine in all other situations (without a prefix), though.

RE: Is it possible for the themoviedb scraper to ignore a prefix? - scudlee - 2014-02-26

This can't be done without editing the scraper. Have a look at this thread for the basic idea.

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-02-28

Thanks for the tip!!

After many hours of messing around, I'm finally getting somewhere, but I'm still having issues getting it right...

I have modified my <CreateSearchUrl>, but I'm having trouble getting the regex right. Here is a working one:

Code:
<CreateSearchUrl dest="3">

However, I would like the following to be possible: "[1] Mad.Max.1979.1080p.DTS.HDMA" and "[10] Mad.Max.1979.1080p.DTS.HDMA"

The following regexes do not work for allowing the space:
<expression noclean="1">\[[0-9]\] (.*)</expression>
<expression noclean="1">\[[0-9]\]\s(.*)</expression>
<expression noclean="1">\[[0-9]\]%20(.*)</expression>
<expression noclean="1">\[[0-9]\]+(.*)</expression>

The following regexes do not work for allowing 2 digits:
<expression noclean="1">\[[0-9]+\]_(.*)</expression>
<expression noclean="1">\[[0-9]{1,2}\]_(.*)</expression>
<expression noclean="1">\[[0-9][0-9]*\]_(.*)</expression>

I also can't view what is being fed to buffer 1 ($$1), so it is very hard to debug... The link to scrap on this page does not work anymore:
http://wiki.xbmc.org/index.php?title=HOW-TO:Write_media_scrapers

Can anyone help me out?

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-02-28

Seems like I'm not that far yet. The incomplete regex that I thought was working actually isn't working very well yet.

<expression noclean="1">\[[0-9]\]_(.*)</expression>

recognizes
[1]_Mad.Max.1979.1080p.DTS.HDMA
but does NOT recognize
[5]_Mad.Max.1979.1080p.DTS.HDMA

I don't understand it....

Does anyone know how to display or log the input and the output of the <createsearchurl>?

RE: Is it possible for the themoviedb scraper to ignore a prefix? - scudlee - 2014-02-28

If you have debug logging turned on then you should be able to see what is in the $$1 buffer, as it gets passed directly as the query parameter of the URL (assuming the added clean-up regex doesn't match).

The third space regex is the one that makes sense (spaces get percent-encoded). All of the 2-number regexes look valid.

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-02-28

Ah yes, debug does log these kinds of things... thanks!

E.g. with the regex
<expression noclean="1">\[[0-9]\]_(.*)</expression>
and the movie

[5]_Mad.Max.1979.1080p.DTS.HDMA

Code:
17:10:08 T:8700 DEBUG: VideoInfoScanner: Scanning dir 'D:\Videos\test\Mad Max Series (NL Subbed)\[5]_Mad.Max.1979.1080p.DTS.HDMA\' as not in the database

[1]_Mad.Max.1979.1080p.DTS.HDMA

Code:
17:18:05 T:5784 DEBUG: VideoInfoScanner: Scanning dir 'D:\Videos\test\Mad Max Series (NL Subbed)\[1]_Mad.Max.1979.1080p.DTS.HDMA\' as not in the database

Does anyone have an idea what I'm doing wrong?

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-02-28

I also just tried with the regexp within the main regexp, but it still doesn't work.

Code:
<CreateSearchUrl dest="3">

RE: Is it possible for the themoviedb scraper to ignore a prefix? - scudlee - 2014-02-28

Looking at the output, it looks like the square brackets are also being percent-encoded, so you'd want a regex like:

Code:
<expression noclean="1">%5b[0-9]+%5d%20(.*)</expression>

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-03-01

Good point! Thanks! But unfortunately it's still not working.

Code:
<CreateSearchUrl dest="3">

RE: Is it possible for the themoviedb scraper to ignore a prefix? - scudlee - 2014-03-01

Aww crap. I just tested it... I forgot about an inescapable bit of core code: underscores are always converted to spaces, but periods are only converted to spaces if there are no actual spaces in the name; otherwise they are left as-is.

So, "[1]_Mad.Max.1979.1080p.DTS.HDMA" will get cleaned up to "[1] Mad Max" and then get percent-encoded to "%5b1%5d%20Mad%20Max" for the scraper.
Whereas "[1] Mad.Max.1979.1080p.DTS.HDMA" will get cleaned up to "[1] Mad.Max" and then get percent-encoded to "%5b1%5d%20Mad.Max".

Using the underscore, you can clean to "Mad%20Max" and get a match, but with the space you'd be left with "Mad.Max", which doesn't match. No easy way around that.

The code you posted worked for me using underscores. Relevant lines from the debug log:

Code:
16:43:20 T:7140 DEBUG: VideoInfoScanner: No NFO file found. Using title search for 'E:\Videos\Test\[1]_Mad.Max.1979.1080p.DTS.HDMA\movie.disc'

RE: Is it possible for the themoviedb scraper to ignore a prefix? - Mastakilla - 2014-03-03

Thanks for that extremely crucial bit of information. That explains a lot...

I'm now using the following (and it works!):

Code:
<CreateSearchUrl dest="3">

I'm using the following prefixes now:
[1].Mad.Max.1979.1080p.DTS.HDMA
[2].Mad.Max.2.1981.1080p.AC3.5.1.HQ
etc.

It also works for multi-digit numbers like [11].

Thanks again for the support!
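
For readers who want to experiment with prefixes outside Kodi, the clean-up order scudlee describes can be approximated with a rough Python sketch. This is not Kodi's code and not the scraper's XML: the function names `clean_title`, `encode` and `search_term` are hypothetical, the year-truncation regex is only an assumption that reproduces the "[1] Mad Max" example above, and the hex digits are lower-cased purely to match the encoded strings quoted in the thread. The only piece taken directly from the thread is scudlee's expression `%5b[0-9]+%5d%20(.*)`.

```python
import re
from urllib.parse import quote

# Assumption: a crude stand-in for "drop the year and everything after it".
YEAR_AND_REST = re.compile(r"[._ ](19|20)\d{2}([._ ].*)?$")
# scudlee's expression from the thread, applied to the percent-encoded title.
PREFIX = re.compile(r"%5b[0-9]+%5d%20(.*)")


def clean_title(name: str) -> str:
    """Approximate the core clean-up as described in the thread."""
    had_space = " " in name           # checked before underscores are touched
    name = YEAR_AND_REST.sub("", name)
    name = name.replace("_", " ")     # underscores always become spaces
    if not had_space:                 # periods only when no real space existed
        name = name.replace(".", " ")
    return name


def encode(title: str) -> str:
    """Percent-encode, with lower-case hex as shown in the thread."""
    quoted = quote(title, safe="")
    return re.sub(r"%[0-9A-F]{2}", lambda m: m.group(0).lower(), quoted)


def search_term(name: str) -> str:
    """Strip a leading [n] prefix from the encoded title, if present."""
    encoded = encode(clean_title(name))
    match = PREFIX.fullmatch(encoded)
    return match.group(1) if match else encoded


for folder in (
    "[1]_Mad.Max.1979.1080p.DTS.HDMA",
    "[1] Mad.Max.1979.1080p.DTS.HDMA",
    "[11].Mad.Max.1979.1080p.DTS.HDMA",
):
    print(folder, "->", search_term(folder))
```

Under these assumptions the underscore and period prefixes both reduce to "Mad%20Max", while the space prefix leaves "Mad.Max", which is the difference scudlee points out.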