XBMC Community Forum
[RELEASE] FilmAffinity (Spanish) scraper


- spiff - 2008-10-21 22:50

updated scraper is now in svn, r15969


- spiff - 2008-10-21 23:07

oh, and the search string encoding worked fine for me. i made a directory named cariño, set content, did the lookup. got the list your url pointed to.


- fidoboy - 2008-10-22 00:24

And where is the SVN? Can you provide a link to download the scraper, or attach it here?

regards,

Fido


- w00dst0ck - 2008-10-22 10:02

SVN: https://xbmc.svn.sourceforge.net/svnroot/xbmc/branches/linuxport/XBMC/system/scrapers/video/


@HectorziN:
It is possible to get the IMDb link with a Google search:
site:imdb.com +original title +year

I'm using a Google wrapper to get the IMDb ID for fanart in my moviemaze scraper.

Code:
<!-- URL to Google and Fanart -->
<!-- Builds the Google query URL from buffer 8 and hands the result to GoogleToIMDB -->
<RegExp conditional="fanart" input="$$8" output="&lt;url function=&quot;GoogleToIMDB&quot;&gt;http://www.google.com/search?q=site:imdb.com+moviemaze\1&lt;/url&gt;" dest="5+">
    <!-- Capture the text in parentheses inside the h2 heading into buffer 7 -->
    <RegExp input="$$1" output="\1" dest="7">
        <expression>&lt;h2&gt;\((.*)\)&lt;</expression>
    </RegExp>
    <!-- Split buffer 7 on spaces and commas; append each token to buffer 8 as "+token" -->
    <RegExp input="$$7" output="+\1" dest="8+">
        <expression repeat="yes">([^ ,]+)</expression>
    </RegExp>
    <expression></expression>
</RegExp>

<!-- GoogleToIMDB -->
<!-- Finds the IMDb ID in the Google results and requests the backdrop list for it -->
<GoogleToIMDB dest="5">
    <RegExp input="$$2" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;&gt;&lt;details&gt;\1&lt;/details&gt;" dest="5">
        <RegExp input="$$1" output="&lt;url function=&quot;GetFanart&quot;&gt;http://api.themoviedb.org/backdrop.php?imdb=\1&lt;/url&gt;" dest="2+">
            <expression>/title/([t0-9]*)</expression>
        </RegExp>
        <expression noclean="1"/>
    </RegExp>
</GoogleToIMDB>

<!-- Fanart -->
<!-- Turns every backdrop URL in the response into a thumb entry -->
<GetFanart dest="5">
    <RegExp input="$$2" output="&lt;details&gt;&lt;fanart url=&quot;http://themoviedb.org/image/backdrops&quot;&gt;\1&lt;/fanart&gt;&lt;/details&gt;" dest="5">
        <RegExp input="$$1" output="&lt;thumb preview=&quot;/\1/\2_poster.jpg&quot;&gt;/\1/\2.jpg&lt;/thumb&gt;" dest="2">
            <!-- The dot before jpg is escaped so it only matches a literal "." -->
            <expression repeat="yes">/([0-9]*)/([t0-9-]*)\.jpg&lt;/URL</expression>
        </RegExp>
        <expression noclean="1">(.+)</expression>
    </RegExp>
</GetFanart>
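
The GoogleToIMDB step above may be easier to follow outside XBMC. The following Python sketch only illustrates the regex logic; it is not how XBMC runs scrapers. The function names and the sample HTML are made up for illustration, and the pattern is tightened to require "tt" followed by digits, which is an assumption about the ID format rather than what the scraper itself uses.

```python
import re

# Pattern modelled on the scraper's /title/([t0-9]*) expression,
# tightened (assumption) to require "tt" followed by digits.
IMDB_ID = re.compile(r"/title/(tt[0-9]+)")


def first_imdb_id(html):
    """Return the first IMDb ID found in a results page, or None."""
    match = IMDB_ID.search(html)
    return match.group(1) if match else None


def backdrop_url(imdb_id):
    """Build the backdrop lookup URL used in the scraper above."""
    return "http://api.themoviedb.org/backdrop.php?imdb=" + imdb_id


# Made-up snippet standing in for a Google results page.
sample = '<a href="http://www.imdb.com/title/tt0120737/">...</a>'
imdb_id = first_imdb_id(sample)
print(imdb_id)
print(backdrop_url(imdb_id))
```

Taking only the first match is the weak point HectorziN asks about below: if the first result is a different film, the fanart will be wrong too.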



- HectorziN - 2008-10-22 12:02

w00dst0ck Wrote:SVN: https://xbmc.svn.sourceforge.net/svnroot/xbmc/branches/linuxport/XBMC/system/scrapers/video/


@HectorziN:
It is possible to get the IMDb link with a Google search:
site:imdb.com +original title +year

I'm using a Google wrapper to get the IMDb ID for fanart in my moviemaze scraper.

Thanks! It is a great idea, but does it always return the same movie? It could return a wrong one, right?


- HectorziN - 2008-10-22 12:12

spiff Wrote:oh, and the search string encoding worked fine for me. i made a directory named cariño, set content, did the lookup. got the list your url pointed to.

Not a directory; the movie itself must be called cariño, or be another movie with a title containing ñ.

If you search for a movie with the ñ character, the scraper cannot find it because of the encoding. Searching from a web browser on filmaffinity.com works.

Could you test it? And do you know the value I need to use for searchstringencoding?

many thanks!


- spiff - 2008-10-22 12:48

Confused

i repeat;
i made a directory named cariño, set content (including scan by dir name obviously), did the lookup. got the list your url pointed to.


- fidoboy - 2008-10-22 12:56

Hi,

The URL encoding for the ñ character is %F1 (in ISO-8859-1). Anyway, here is the complete list (accents, etc.):

http://www.jairoblanco.com/guia-rapida/html/html-url-encode-codificacion/
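
The value depends on which character encoding is applied before percent-encoding. A small Python sketch (not part of the scraper, only a way to check the values) shows the difference between ISO-8859-1 and UTF-8 for the same title:

```python
from urllib.parse import quote

title = "cariño"

# ISO-8859-1 (Latin-1): ñ is the single byte 0xF1.
latin1 = quote(title, encoding="iso-8859-1")

# UTF-8: ñ is the two bytes 0xC3 0xB1.
utf8 = quote(title, encoding="utf-8")

print(latin1)  # cari%F1o
print(utf8)    # cari%C3%B1o
```

If a site expects one form and receives the other, a search for a title containing ñ may return nothing even though the same search works in a browser.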

greets,


- w00dst0ck - 2008-10-22 13:17

HectorziN Wrote:Thanks! It is a great idea, but does it always return the same movie? It could return a wrong one, right?

I've included moviemaze in my search string. If moviemaze is listed in the movie's external reviews list on imdb.com, I can be sure that it's the same movie.
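
As a rough sketch of how that search string is assembled, the Python below mirrors the two scraper steps: split the original title on spaces and commas, prefix each token with "+", and append the result to the fixed "site:imdb.com+moviemaze" query. The function name and the sample title are hypothetical, and no percent-encoding of special characters is done here.

```python
import re


def google_query_url(original_title, site_tag="moviemaze"):
    """Build the Google search URL the way the scraper's RegExp steps do."""
    # Same split as the scraper's ([^ ,]+) expression with repeat="yes".
    tokens = re.findall(r"[^ ,]+", original_title)
    # Each token is appended as "+token", like output="+\1" dest="8+".
    terms = "".join("+" + token for token in tokens)
    return "http://www.google.com/search?q=site:imdb.com+" + site_tag + terms


print(google_query_url("The Dark Knight"))
```

Keeping the extra site name in the query narrows the results to IMDb pages that also mention that site, which is what makes a wrong match less likely.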


- HectorziN - 2008-10-22 17:58

spiff Wrote:Confused

i repeat;
i made a directory named cariño, set content (including scan by dir name obviously), did the lookup. got the list your url pointed to.

OK, but the problem I have is this one:
I have a folder called Movies.
In this folder there are a lot of movies,
one of them called "Cariño estoy hecho un perro".
I search for information on this movie using the filmaffinity scraper
and no results are found. If I change Cariño to Carino, it works.

The problem is that the search is not done with ISO encoding, and I don't know which value to set in searchstringencoding.


- fidoboy - 2008-10-24 02:51

Hectorzin, have you read my answer? You must encode your string: replace "cariño" with "cari%F1o" in your URL...

regards,


- spiff - 2008-10-24 09:08

that will be done by the URL encoding applied prior to passing the argument to the scraper function...


- HectorziN - 2008-10-24 20:17

My scraper is quite complex. Is there any application to help debug it?
I want to include impawards posters and I can't get it to work.

Thanks


- w00dst0ck - 2008-10-25 12:56

I use xbmc for windows and watch the xbmc.log

There are also some online RegEx testers.
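
A scraper expression can also be tried offline against a saved page. The Python sketch below is one way to do that; the helper name and the sample text are made up, and Python's regex engine may not behave identically to the one XBMC uses, so treat a match here as a hint rather than proof.

```python
import re


def show_matches(pattern, text):
    """Return the captured groups of every match of pattern in text."""
    return [m.groups() for m in re.finditer(pattern, text)]


# Fanart expression from earlier in the thread, with the XML entities
# decoded and the dot before "jpg" escaped.
FANART = r"/([0-9]*)/([t0-9-]*)\.jpg</URL"

# Made-up sample standing in for a saved API response.
sample = "<URL>http://example.invalid/image/backdrops/123/tt0120737-1.jpg</URL>"

for groups in show_matches(FANART, sample):
    print(groups)
```

Remember to decode entities such as &lt; back to < when copying an expression out of the scraper XML, otherwise the pattern will not match the raw page.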


- HectorziN - 2008-10-27 11:29

w00dst0ck Wrote:I use xbmc for windows and watch the xbmc.log

There are also some online RegEx testers.

Where is the log file stored in the Windows Atlantis version?

thanks