[RELEASE] FilmAffinity (Spanish) scraper - Printable Version
Kodi Community Forum (https://forum.kodi.tv), Forum: Movie Scrapers (https://forum.kodi.tv/forumdisplay.php?fid=302)
Thread: [RELEASE] FilmAffinity (Spanish) scraper (/showthread.php?tid=25389)

- pancheto - 2011-11-19
I'm good at regular expressions because of my job, so I'm considering getting into the scraper world to understand how the variables really work, which is what is stopping me from modifying this one. The year option was the first thing I thought of when I opened the scraper code, but since I assumed that \1 contained the whole file name and that the year could not be split from it, I went no further. If you give me a hint on how to split the year from the name, I would try learning scraper coding and maybe contribute to this great scraper.

suggested modifications to filmaffinity scraper v1.4.1 - pancheto - 2011-11-21
After revising the code, I've come up with a few updates that improve the scraper's work. Since it's still an alpha version (I'm not yet into proper scraper coding), I will only post the line changes here for anyone to test. The first one is the main change, which works fantastically (it is able to get all of FilmAffinity's info from ~700 file names without any error), and the other one is just a suggestion to bypass the Google search.
line 11:
Code:
<RegExp input="$$1" output="<url>http://www.filmaffinity.com/es/advsearch.php?stype[]=title&fromyear=$$2&toyear=$$2&stext=\1</url>" dest="3">
Code:
<RegExp input="$$1" output="<url>http://www.filmaffinity.com/es/search.php?stext=\1&amp;stype=none</url>" dest="3">
lines 124 to 138:
Code:
<RegExp input="$$9" output="<url function="GoogleToIMDB">http://www.imdb.com/search/title?year=$$6&title=$$9</url>" dest="5+">
Code:
<RegExp input="$$9" output="<url function="GoogleToIMDB">http://www.google.com/search?q=site:imdb.com\1</url>" dest="5+">
Note: the search suggestion tries to bypass the Google search not because Google performs badly (on the contrary, it performs much better than my suggestion), but because when Google detects a few hundred searches with the same structure coming from the same IP, it blocks the results page (after ~200 automatic searches it asks for a captcha to be solved). For that reason, updating an entire library of hundreds of files is very complicated. I have tried replacing the http://www.google.com/ queries with http://www.google.es/ or even https://www.google.com/, but without luck.

suggested modifications (tested and working) to filmaffinity scraper v1.4.1 - pancheto - 2011-11-21
Forget about my previous post, as I finally worked out how to bypass Google's search limitations, which were stopping me from batch updating my entire library. In summary, the modifications I suggest to the current FilmAffinity v1.4.1 scraper are only 2, very simple yet very useful ones:
Some future improvement? Sure, there is plenty, but what struck me as obvious was that some thumbnails were not being downloaded properly from FilmAffinity. I can download them manually through XBMC, as it will suggest IMDb's ones, but I was wondering why movies like Nixon (http://www.filmaffinity.com/es/film737736.html) don't get a thumbnail. In fact, there's no lightbox over them, so the img code probably looks different. I guess I'll leave this to the proper scraper developers, so they can debug it and release a new scraper version including my 2 previous suggestions.

- MaDDoGo - 2011-11-22
Hi, I looked at your modifications and (after agjacome's tests) I merged them into the repo, so the scraper is enhanced with your modifications. Thanks for your time and the modifications.

poster searching improvements - pancheto - 2011-11-24
I have implemented a few improvements to FilmAffinity poster searching on GitHub, hoping that you'll find them useful.

- itombs - 2011-11-25
Hi, the number of votes has not been working for a few days. I updated to MaDDoGo's GitHub version, but the number of votes still does not work. Please fix the number of votes. Thanks a lot.

- pancheto - 2011-11-26
The scraper code looks for the number of votes between brackets, but now (FilmAffinity is currently reworking its look) it appears without them. I'll report this to MaDDoGo, hoping to have it solved in the next version.

- itombs - 2011-12-01
When could the problem with the number of votes be fixed? Is there any news about this?

- pancheto - 2011-12-01
The fix has already been submitted to XBMC's main repository, and you should see it updated on your system as version 1.4.3. If the addon doesn't get updated automatically, try doing it manually, or download the code from MaDDoGo's GitHub repository.

- itombs - 2011-12-02
pancheto Wrote: the fix has been already submitted to XBMC's main repository, and you should see it updated on your system as version 1.4.3.
if the addon doesn't get updated automatically try doing it manually, or downloading the code from MaDDoGo's github repository.
Thanks a lot. Works well. See you.

Incorrect title when english title between () - serieofilo - 2012-01-01
Hello, I've been using the FA scraper for some time and I've found the following inconsistency when downloading information for movies with English/Spanish names between (). The title has extra blank spaces between the movie name and the English/Spanish name between (), but only when the movie is not a video file on the disk but a .disc file (a pointer to an external DVD disc).
For example, this pointer to a DVD is getting incorrect information, with 3 spaces between Jersey and the (:
Una chica de Jersey (Jersey Girl) (2004).dvd.disc
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
This one is getting good information, with only 1 space between the name and the (:
The Reader (El lector) (2008).avi
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
Any idea what the problem is? Thank you.

- pancheto - 2012-01-02
I've tried to replicate your issue, and although I've found some interesting things, I haven't found any problem in the scraper, since the search results don't depend at all on the file name or its extension, but on the film entry's FilmAffinity code itself. The problem with Jersey Girl in particular, regardless of the file and the nature of its file name, is that its FilmAffinity entry does indeed have 3 spaces in its title:
Code:
<title>Una chica de Jersey   (Jersey Girl) (2004) - FilmAffinity</title>

- pancheto - 2012-01-11
I've just addressed this problem (the extra spaces coming from FilmAffinity) on GitHub. I'm sure it will soon be committed to the master branch, and then to the official XBMC repository.

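The extra-space cleanup described above can be illustrated outside the scraper. What follows is only a minimal Python sketch, not the fix that was committed on GitHub (the scraper itself is written as XML RegExp rules). The function name clean_title and the removal of the " - FilmAffinity" page suffix are assumptions made for illustration.

```python
import re


def clean_title(raw_title: str) -> str:
    """Collapse runs of whitespace and drop the ' - FilmAffinity' page suffix.

    Hypothetical helper for illustration; not part of the actual scraper.
    """
    # Replace any run of whitespace (spaces, tabs, newlines) with one space.
    title = re.sub(r"\s+", " ", raw_title).strip()
    # Assumption: page titles end with ' - FilmAffinity', as in the example above.
    suffix = " - FilmAffinity"
    if title.endswith(suffix):
        title = title[: -len(suffix)]
    return title


if __name__ == "__main__":
    print(clean_title("Una chica de Jersey   (Jersey Girl) (2004) - FilmAffinity"))
```

Titles that already have single spaces, such as "The Reader (El lector) (2008)", pass through unchanged, so the cleanup would be safe to apply to every entry.
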
RE: FilmAffinity (Spanish) scraper - tonybeccar - 2012-03-28
Hello, I've been using the FA scraper ever since I've had XBMC, and one feature that I now see available in the IMDb scraper is the only thing that, IMO, this script is missing! The IMDb script has an option to scrape the movie title based on a predefined country. This is really useful for me, and I assume for many others, because if a person doesn't live in Spain some titles may be confusing, which means renaming lots of movies by hand. So I'm asking: would it be possible to include this feature in the FA scraper? Maybe a copy-paste from the IMDb scraper? Thanks in advance!

RE: FilmAffinity (Spanish) scraper - pancheto - 2012-03-28
The main idea behind the FA scraper is indeed to search on FA. Although the scraper tries to enrich those search results with information from other sources (such as IMDb), an entry on FA can only be located by its Spanish title (logical, since this is a Spanish community) or by its original title. I know agjacome is working on a way to store the original title in the XBMC db instead of the Spanish one, but I don't think searching by titles in other languages should be the aim of the FA scraper.