olympia
Team-XBMC Member Joined: May 2008 Reputation: 30 |
2011-04-10 13:16
Post: #11
Be aware that the title in $$1 in CreateSearchURL is URL encoded, so you have to create the regexp for paul%20haggis%20%2d%20crash in this example.
So something like:
Code:
<expression noclean="1">.+?%20%2d%20(.+?)(?:%20part[1-9]%20)?$</expression>
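To see what that expression captures, here is a minimal sketch in Python rather than in the scraper itself. It assumes Python's re module treats this pattern the same way XBMC's regex engine does, and extract_title is a hypothetical helper name, not part of any scraper.

```python
import re
from urllib.parse import unquote

# Pattern copied from the post; it is meant to run against the
# URL-encoded title that arrives in $$1.
PATTERN = re.compile(r".+?%20%2d%20(.+?)(?:%20part[1-9]%20)?$")


def extract_title(encoded):
    """Return the decoded title after the ' - ' separator, or None if no match."""
    match = PATTERN.search(encoded)
    return unquote(match.group(1)) if match else None


print(extract_title("paul%20haggis%20%2d%20crash"))
print(extract_title("sidney%20lumet%20%2d%20dog%20day%20afternoon"))
```

Note that the optional part group only matches when it is followed by a trailing %20, and a title without the encoded " - " separator does not match at all.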
ASUS P5N7A-VM - Intel C2D E7300 - 2GB RAM - Silverstone GD02 Black - MS MCE Remote - Patriot Warp 32GB SSD Drive |
nucleo
Junior Member Posts: 8 Joined: Apr 2011 Reputation: 0 |
2011-04-10 16:02
Post: #12
olympia Wrote: Be aware that the title in $$1 in CreateSearchURL is URL encoded, so you have to create the regexp for paul%20haggis%20%2d%20crash in this example.
Wow, that's really helpful! Thanks for the tip, I'll try.
nucleo
Junior Member Posts: 8 Joined: Apr 2011 Reputation: 0 |
Finally I got everything working the way I want. Many thanks to olympia and bambi73. Without you I would never have sorted this out.
I'm posting my results here in case somebody finds them useful for organizing their own movie collection. File names should contain the movie title and year. I tried 2 formats, and both work perfectly:
1. Sidney Lumet - Dog Day Afternoon (1975).part1.avi
2. Sidney Lumet - Dog Day Afternoon part1 (1975).avi
And the third is the obvious one, for when the movie is not split into parts:
3. Sidney Lumet - Dog Day Afternoon (1975).avi

There are 3 very important points I've learned about XBMC scraping:
1. XBMC automatically cuts the year and the file extension from the file name before the scraper even starts working, so "Sidney Lumet - Dog Day Afternoon (1975).avi" becomes "Sidney Lumet - Dog Day Afternoon". This is what reaches the scraper in buffer $$1 (well, not exactly this, see item 3). The buffer $$2 in this case will contain "1975", the year stripped of its parentheses, again before the scraper starts.
2. XBMC automatically recognizes words like "part[1-9]" and "cd[1-9]", cuts them off, and displays several parts as one item in the movie library. No further action is required from the scraper. Thus "Sidney Lumet - Dog Day Afternoon (1975).part1.avi" and "Sidney Lumet - Dog Day Afternoon (1975).part2.avi" are scraped as one item, which arrives in buffer $$1 as "Sidney Lumet - Dog Day Afternoon" (well, not exactly this, see item 3) before the scraper applies any regular expressions.
3. Items in $$1 reach the scraper URL-encoded and lower-cased. Thus in our example $$1 will actually contain sidney%20lumet%20%2d%20dog%20day%20afternoon. All spaces are replaced with %20 and the dash is replaced with %2d.

I had to modify the default scrapers a little so that they work with my file naming. Here is what I have for now:

1. TMDB scraper (on Ubuntu: ~/.xbmc/addons/metadata.themoviedb.org/tmdb.xml)
Code:
<CreateSearchUrl dest="3">
Unfortunately, TMDB appears not to contain information about some of my movies (Woody Allen - Manhatten - what the heck, is it so rare?). That's why I also used another scraper, IMDB.
EDIT: It was my typing mistake. Manhatten should be ManhattAn. And of course TMDB could hardly find a misspelled word, but anyway I'm happy it was found at all.

2. IMDB scraper (on Ubuntu: ~/.xbmc/addons/metadata.imdb.org/imdb.xml)
Code:
<CreateSearchUrl dest="3" SearchStringEncoding="iso-8859-1">
Again, the same regexp, but now there is also an inner one for the year. It was already there and I didn't touch it, though I find it strange to use an inner regexp that just adds %20 before the year in buffer $$4, when you can simply add "%20($$2)" to the URL directly. Anyway, the scraper works 95% of the time for me, so please consider doing this yourself if you need to.

If this information is at all useful and somebody can point me to the corresponding Wiki page, I can add it there. Or somebody can do it for me, if I cannot access that wiki.
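The three points above can be approximated in a few lines of Python. This is only a sketch of the behaviour as described in this post, not XBMC's actual implementation: the function names simulate_buffers and append_year are hypothetical, and the patterns for years and part markers are simplified assumptions.

```python
import os
import re
from typing import Tuple
from urllib.parse import quote


def simulate_buffers(filename: str) -> Tuple[str, str]:
    """Approximate what the post says arrives in $$1 (title) and $$2 (year)."""
    # Point 1: drop the file extension, then pull out a "(YYYY)" year.
    name, _ext = os.path.splitext(filename)
    year_match = re.search(r"\((\d{4})\)", name)
    year = year_match.group(1) if year_match else ""
    name = re.sub(r"\s*\(\d{4}\)", "", name)
    # Point 2: remove stacking markers such as "part1" or "cd2".
    name = re.sub(r"[ .](?:part|cd)[1-9]\b", "", name, flags=re.IGNORECASE)
    # Point 3: lower-case and URL-encode. quote() leaves "-" untouched,
    # so the dash is encoded by hand to match the %2d seen in the post.
    encoded = quote(name.strip().lower(), safe="").replace("-", "%2d")
    return encoded, year


def append_year(title: str, year: str) -> str:
    """Mimic adding "%20($$2)" directly to the search string."""
    return f"{title}%20({year})" if year else title


for example in (
    "Sidney Lumet - Dog Day Afternoon (1975).part1.avi",
    "Sidney Lumet - Dog Day Afternoon part1 (1975).avi",
    "Sidney Lumet - Dog Day Afternoon (1975).avi",
):
    print(simulate_buffers(example))
```

All three file names from the post produce the same pair, which is why both naming formats work with the same regexp.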
(This post was last modified: 2011-04-10 22:58 by nucleo.)
olympia
Team-XBMC Member Joined: May 2008 Reputation: 30 |
2011-04-10 20:20
Post: #14
Actually both (the tmdb year addition and the imdb inner regex) are good catches. Thank you for sharing. I will tune the official scrapers accordingly.
Not sure why the year was not added to the tmdb search URL before. One possibility is that the API did not support it yet at the time the scraper was written.