MyMovies.it - Italian Scraper

sipontino Offline
Junior Member
Posts: 35
Joined: Dec 2008
Reputation: 0
Post: #16
The version I made seems to work with both the new and the old links,
but I don't know why it doesn't download actor thumbs, even though they work in the scraper XML tester.

I'm trying to add fanart support, but I don't understand how to get the IMDb ID back into the details section and then call TMDb's IMDb lookup API.
muttley:bd Offline
Senior Member
Posts: 147
Joined: Oct 2008
Reputation: 0
Post: #17
See my scraper; thumbs and fanart work well there.
sipontino Offline
Junior Member
Posts: 35
Joined: Dec 2008
Reputation: 0
Post: #18
I tried many times, but with no success.
For thumbs, my alternative method is working,
but for fanart I need your help, Mutt.

Thanks in advance!
muttley:bd Offline
Senior Member
Posts: 147
Joined: Oct 2008
Reputation: 0
Post: #19
My code has a lot of comments; you only need to look at this part of it:
Code:
<!-- Tmdb Backdrops through imdbId -->
      <RegExp conditional="backdrops" input="$$7" output="&lt;url function=&quot;GetTMDBFanartByIMDBId&quot;&gt;http://api.themoviedb.org/2.0/Movie.imdbLookup?imdb_id=\1&amp;amp;api_key=57983e31fb435df4df77afb854740ea9&lt;/url&gt;" dest="5+">
              <!-- Get imdbId -->        
        <RegExp input="$$6" output="&lt;url function=&quot;GetImdbId&quot;&gt;http://akas.imdb.com/find?s=tt;q=\1&lt;/url&gt;" dest="7">
          <!-- Italian film title -->
          <RegExp input="$$1" output="\1" dest="4">
            <expression noclean="1">&lt;title&gt;([^\(]+) \(</expression>
          </RegExp>
          <!-- Original film title: not always present -->
          <RegExp input="$$1" output="\1" dest="3">
            <expression cs="true" noclean="1" clear="yes">Titolo originale[^&gt;]+&gt;([^&lt;]+)&lt;</expression>
          </RegExp>
          <!-- Test Original film title -->
          <RegExp input="$$3" output="\1" dest="4">
            <expression>(.+)</expression>
          </RegExp>
          <!-- For better search -->
          <!-- Get Film Date -->
          <RegExp input="$$1" output="$$4 (\1)" dest="4">
            <expression noclean="1">&lt;title&gt;[^\(]+ \(([0-9]{4})</expression>
          </RegExp>
          <!-- Substitute "space" with "%20"...a sort of urlencoding -->
          <!-- when supported use: encode="1" -->
          <RegExp input="$$4" output="\1%20\2" dest="6">
            <expression repeat="yes" noclean="1,2">(.*?) ([^ ]*)</expression>
          </RegExp>
          <expression noclean="1"></expression>
        </RegExp>
              <expression noclean="1"></expression>
      </RegExp>

The rest of the code, at the end of the scraper, is copied from common/imdb.xml,
only because the <include>common/imdb.xml</include> tag doesn't work for me.

If you have a specific question, I'll gladly help you.
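
The space-to-%20 substitution in the last nested RegExp of the snippet can be tried outside XBMC. This is only a minimal Python sketch, not part of the scraper: the sample title is made up, and `re.sub` is just an approximation of how `repeat="yes"` is applied.

```python
import re
from urllib.parse import quote

# Hypothetical sample: title plus year, roughly what buffer 4 would hold.
title = "La vita e bella (1997)"

# Same pattern and replacement as the scraper's "%20" RegExp.
# re.sub replaces every match, which approximates repeat="yes".
regex_encoded = re.sub(r"(.*?) ([^ ]*)", r"\1%20\2", title)

# Full percent-encoding for comparison; it also escapes the parentheses.
fully_encoded = quote(title)

print(regex_encoded)
print(fully_encoded)
```

The regex only handles spaces, so other characters (accents, ampersands, parentheses) pass through unencoded, which is presumably why the comment suggests switching to encode="1" once it is supported.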
sipontino Offline
Junior Member
Posts: 35
Joined: Dec 2008
Reputation: 0
Post: #20
Code:
<!-- Tmdb Backdrops through imdbId -->
      <RegExp conditional="backdrops" input="$$7" output="&lt;url function=&quot;GetTMDBFanartByIMDBId&quot;&gt;http://api.themoviedb.org/2.0/Movie.imdbLookup?imdb_id=\1&amp;amp;api_key=57983e31fb435df4df77afb854740ea9&lt;/url&gt;" dest="5+">
              <!-- Get imdbId -->        
        <RegExp input="$$6" output="&lt;url function=&quot;GetImdbId&quot;&gt;http://akas.imdb.com/find?s=tt;q=\1&lt;/url&gt;" dest="7">

Looking at the code above, my doubt is what the second function, which writes its output to dest 7, actually passes to the first function, which reads $$7 as input.

From my tests, it seems that I get nothing in input $$7.
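
To reason about the question, here is a toy Python model of the two nested RegExp elements. It is a sketch under stated assumptions, not XBMC's implementation: it assumes a nested RegExp runs before its parent, that an empty expression behaves like `(.*)`, and that no URL is fetched between the two steps. The helper `run_regexp` and the sample title are hypothetical.

```python
import re

def run_regexp(buffers, src, pattern, output, dest, append=False):
    """Toy model of one <RegExp>: match buffers[src], expand output, store in dest."""
    # Assumption: an empty <expression> acts like "(.*)", so \1 is the whole input.
    match = re.search(pattern or r"(.*)", buffers.get(src, ""), re.DOTALL)
    if match is None:
        return  # no match: the destination buffer is left untouched
    result = match.expand(output)
    buffers[dest] = (buffers.get(dest, "") + result) if append else result

buffers = {6: "Franklyn%20(2008)"}  # hypothetical encoded title in buffer 6

# Inner RegExp first: it only writes a <url> string into buffer 7.
run_regexp(buffers, 6, "",
           r'<url function="GetImdbId">http://akas.imdb.com/find?s=tt;q=\1</url>', 7)

# Outer RegExp second: its \1 is that whole string, not an IMDb id.
run_regexp(buffers, 7, "", r"imdb_id=\1", 5, append=True)

print(buffers[7])
print(buffers[5])
```

Under these assumptions, $$7 holds the text of the `<url>` element rather than an IMDb ID, so the outer output cannot contain a usable imdb_id at that point.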
KoTiX Offline
Fan
Posts: 518
Joined: Jun 2004
Reputation: 6
Post: #21
After two days of banging my head trying to get fanart working, I just copied the code from the moviemaze scraper, and it seems to work OK:

Code:
<!--URL to Google and Fanart-->
            <RegExp input="$$8" output="&lt;url function=&quot;GetTMDBFanartByIMDBId&quot;&gt;http://www.google.com/search?q=site:imdb.com\1&lt;/url&gt;" dest="5+">
                <RegExp input="$$1" output="&quot;\1&quot;" dest="7">
                    <expression>&lt;title&gt;([^\(]+) \(([0-9]{4})</expression>
                </RegExp>
                <RegExp input="$$7" output="+\1" dest="8+">
                    <expression repeat="yes">([^ ,]+)</expression>
                </RegExp>
                <expression></expression>
            </RegExp>

sipontino Offline
Junior Member
Posts: 35
Joined: Dec 2008
Reputation: 0
Post: #22
I solved it by putting the second URL in GetSearchResults:

Code:
<?xml version="1.0" encoding="utf-8"?>
<scraper name="MyMovies" date="2009-09-09" content="movies" framework="1.0" thumb="MyMovies.png" language="it">
  <NfoUrl dest="3">
    <RegExp input="$$1" output="\1" dest="3">
      <expression noclean="1">(http://www\.mymovies\.it/dizionario/recensione\.asp\?id=[0-9]+)</expression>
    </RegExp>
  </NfoUrl>
  <CreateSearchUrl dest="3">
    <RegExp input="$$1" output="http://www.mymovies.it/database/ricerca/default.asp?q=\1" dest="3">
      <expression noclean="1"></expression>
    </RegExp>
  </CreateSearchUrl>
  <GetSearchResults dest="8">
    <RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
      <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\2 (\5, \4)&lt;/title&gt;&lt;url&gt;http://www.mymovies.it/dizionario/recensione.asp?id=\1&lt;/url&gt;&lt;url&gt;http://akas.imdb.com/find?s=tt;q=\2%20(\5)&lt;/url&gt;&lt;id&gt;\1&lt;/id&gt;&lt;/entity&gt;" dest="5">
        <expression repeat="yes" noclean="1,3">&lt;h3 style="margin:0px;"&gt;[^&lt;]*&lt;a href="http://www\.mymovies\.it/dizionario/recensione\.asp\?id=([0-9]+)" title="[^"]+"&gt;([^&lt;]+)&lt;/a&gt;[^7]+&lt;div class="linkblu2" style="padding-right:7px; text-align:justify;"&gt;[^&lt;]+Un film di &lt;b&gt;[^&lt;]*&lt;a href="http://www\.mymovies\.it/biografia/\?r=([0-9]+)"&gt;([^&lt;]+)&lt;/[ab]&gt;[^;]+anno=([^"]+)</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetSearchResults>
  <GetDetails clearbuffers="no" dest="3">
    <RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
      <RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;&lt;year&gt;\2&lt;/year&gt;" dest="5">
        <expression noclean="1,2">&lt;title&gt;([^\(]+) \(([0-9]{4})</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;genre&gt;\2&lt;/genre&gt;" dest="5+">
        <expression noclean="1">&lt;a title="Film ([^"]+)" href="http://www.mymovies.it/film/\1/"&gt;([^&lt;]+)</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;runtime&gt;\1&lt;/runtime&gt;" dest="5+">
        <expression noclean="1">durata ([0-9]+) min\.</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;director&gt;\1&lt;/director&gt;" dest="5+">
        <expression>Un film di (.+?) con</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;tagline&gt;\1&lt;/tagline&gt;" dest="5+">
        <expression noclean="1" trim="1"> &lt;strong class="rec_lancio" &gt;([^&lt;]+)&lt;/strong&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;plot&gt;\1&lt;/plot&gt;" dest="5+">
        <expression repeat="yes" trim="1">&lt;td rowspan="2" valign="top"&gt;[\s]+&lt;p&gt;[\s]+[^&gt;]+&gt;[\s]+[^&gt;]+/&gt;[\s]+&lt;/a&gt;[\s]+(.+) &lt;/p&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;url function=&quot;GetPosters&quot;&gt;\1&lt;/url&gt;" dest="5+">
        <expression noclean="1">&lt;td class="rec_link_disattivo"&gt;&lt;a title="[^"]+" href="([^"]+)"&gt;Poster&lt;/a&gt;&lt;/td&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;url function=&quot;GetMovieTrailer&quot;&gt;\1&lt;/url&gt;" dest="5+">
        <expression>&lt;td class="rec_link_disattivo"&gt;&lt;a title="[^"]+" href="([^"]+)"&gt;Trailer&lt;/a&gt;&lt;/td&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;rating&gt;\1\2&lt;/rating&gt;" dest="5+">
        <expression>&lt;div style="text-align:center; font-size:23px; font-weight:bold; letter-spacing:1px; margin:0px 11px 7px 11px"&gt;([0-9]+)\,([0-9]+)&lt;span style="font-size:11px"&gt;/([^&lt;]+)&lt;/span</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;url function=&quot;GetMovieCast&quot;&gt;\1&lt;/url&gt;" dest="5+">
        <expression>&lt;td class="rec_link_disattivo"&gt;&lt;a title="[^"]+" href="([^"]+)"&gt;Cast&lt;/a&gt;&lt;/td&gt;</expression>
      </RegExp>
      <RegExp input="$$2" output="&lt;url function=&quot;GetFan&quot;&gt;http://api.themoviedb.org/2.1/Movie.imdbLookup/en/xml/57983e31fb435df4df77afb854740ea9/tt\1&lt;/url&gt;" dest="5+">
        <expression noclean="1">/title/tt([0-9]+)/</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetDetails>
  <GetMovieTrailer dest="5">
    <RegExp input="$$1" output="&lt;details&gt;&lt;trailer&gt;\1&lt;/trailer&gt;&lt;/details&gt;" dest="5+">
      <expression noclean="1">"file=([^&amp;]+)</expression>
    </RegExp>
  </GetMovieTrailer>
  <GetMovieCast dest="5">
    <RegExp input="$$2" output="&lt;details&gt;\1&lt;/details&gt;" dest="5+">
      <RegExp input="$$1" output="&lt;actor&gt;&lt;name&gt;\2&lt;/name&gt;&lt;role&gt;\3&lt;/role&gt;&lt;thumb&gt;\1&lt;/thumb&gt;&lt;/actor&gt;" dest="2+">
        <expression repeat="yes" noclean="1">src="([^"]+)"[\s]+alt="([^"]+)" /&gt;[\s]+&lt;/a&gt;[\s]+&lt;div style=[^&gt;]+&gt;[\s]+&lt;a href="[^&gt;]+&gt;[^&lt;]+&lt;/a&gt;[\s]+&lt;div style="[^&gt;]+&gt;([^&lt;]+)&lt;/div&gt;</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetMovieCast>
  <GetPosters dest="5">
    <RegExp input="$$6" output="&lt;details&gt;\1&lt;/details&gt;" dest="5+">
      <RegExp input="$$1" output="&lt;thumb&gt;http://www.mymovies.it/filmclub/\2/\3/\4/locandina\5&lt;/thumb&gt;" dest="6+">
        <expression repeat="yes" noclean="1">&lt;td align="center" valign="middle" style="background-color:#eeeeee; border:solid 1px #AEAEAE;"&gt;[\s]+&lt;a href="([^"]+)"&gt;&lt;[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[\s]+src="http://www.mymovies.it/filmclub/([0-9]+)/([0-9]+)/([0-9]+)/imm([^"]+)</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetPosters>
  <GetFan dest="5">
    <RegExp input="$$4" output="&lt;details&gt;&lt;fanart&gt;\1&lt;/fanart&gt;&lt;/details&gt;" dest="5+">
      <RegExp input="$$1" output="&lt;thumb preview=&quot;\2&quot;&gt;\1&lt;/thumb&gt;" dest="4">
        <expression repeat="yes" noclean="1">&lt;image type="backdrop" size="original" url="([^"]+)" id="[0-9]+"/&gt;[^&gt;]+&gt;[\s]+&lt;image type="backdrop" size="thumb" url="([^"]+)" id="[0-9]+"/&gt;</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetFan>
</scraper>

Try it; it fetches fanart directly through the TMDb API's IMDb lookup.
(This post was last modified: 2009-10-03 10:13 by sipontino.)
ZIOLele Offline
Senior Member
Posts: 130
Joined: Oct 2008
Reputation: 0
Post: #23
It almost works, but if you try to scrape info for movies that have only one poster, the scraper can't load the poster.

For example, try scraping Fantozzi or Cleaner (2007, with S.L. Jackson).

Hope this helps iron it out.

ZIOLele
sipontino Offline
Junior Member
Posts: 35
Joined: Dec 2008
Reputation: 0
Post: #24
Simply modify the GetPosters function as shown below:

Code:
<GetPosters dest="5">
    <RegExp input="$$6" output="&lt;details&gt;\1&lt;/details&gt;" dest="5+">
      <RegExp input="$$1" output="&lt;thumb&gt;http://www.mymovies.it/filmclub/\2/\3/\4/locandina\5&lt;/thumb&gt;" dest="6+">
        <RegExp input="$$1" output="&lt;thumb&gt;\1&lt;/thumb&gt;" dest="6">
          <expression repeat="yes" noclean="1">&lt;td align="center" valign="middle" style="background-color:#eeeeee; border:solid 1px #AEAEAE;"&gt;[\s]+&lt;img width="[0-9]+" style="margin-top:[0-9]+px; margin-bottom:[0-9]+px;" title="[^"]+" alt="[^"]+" src="([^"]+)" /&gt;</expression>
        </RegExp>
        <expression repeat="yes" noclean="1">&lt;td align="center" valign="middle" style="background-color:#eeeeee; border:solid 1px #AEAEAE;"&gt;[\s]+&lt;a href="([^"]+)"&gt;&lt;[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[\s]+src="http://www.mymovies.it/filmclub/([0-9]+)/([0-9]+)/([0-9]+)/imm([^"]+)"</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetPosters>

Thank you for bringing this error to our attention!
Now, about posters: the only remaining problem is that for some movies the first (Italian) poster fails to download, but that happens because mymovies doesn't host the file even though it is linked.
(This post was last modified: 2009-10-03 20:30 by sipontino.)
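
The idea behind the modified GetPosters, matching the single-poster layout first and then appending any gallery matches to the same buffer, can be sketched in Python. The markup and patterns here are hypothetical and heavily simplified (the real mymovies.it pages and expressions are much longer), and the rewrite of gallery URLs to the `locandina` file is left out.

```python
import re

# Hypothetical, heavily simplified markup; the real pages differ.
single_page = '<td><img src="http://example.invalid/locandina.jpg" /></td>'
gallery_page = (
    '<td><a href="p1"><img src="http://example.invalid/1/imm_a.jpg" /></a></td>'
    '<td><a href="p2"><img src="http://example.invalid/2/imm_b.jpg" /></a></td>'
)

# Poster shown directly, not wrapped in a link (single-poster layout).
SINGLE = re.compile(r'<td><img src="([^"]+)" />')
# Linked thumbnails (gallery layout).
GALLERY = re.compile(r'<td><a href="[^"]+"><img src="([^"]+)" />')

def collect_posters(html):
    """Collect poster URLs from both layouts, single-poster matches first."""
    thumbs = SINGLE.findall(html)     # like the innermost RegExp writing to dest="6"
    thumbs += GALLERY.findall(html)   # like the next RegExp appending to dest="6+"
    return thumbs

print(collect_posters(single_page))
print(collect_posters(gallery_page))
```

Because the two patterns are tried independently, a page with only one poster still yields a result instead of an empty list.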
sipontino Offline
Junior Member
Posts: 35
Joined: Dec 2008
Reputation: 0
Post: #25
Hi all, the MyMovies.it scraper finally seems to be finished.
I used the common tmdb.xml include for the fanart part.
It was tested by me and KoTiX with very good results.
In the settings you can disable trailers and fanart (fanart is slow because of TMDb).

It can get thumbs, fanart (from TMDb), and trailers.
It can't get much more complete than this, considering also that mymovies.it is not
very "stable", since the site keeps changing.

Code:
<?xml version="1.0" encoding="utf-8"?>
<scraper name="MyMovies.it" date="2009-09-09" content="movies" framework="1.0" thumb="MyMovies.png" language="it">
  <include>common/tmdb.xml</include>
  <GetSettings dest="3">
    <RegExp input="$$5" output="&lt;settings&gt;\1&lt;/settings&gt;" dest="3">
      <RegExp input="$$1" output="&lt;setting label=&quot;Get TMDB Backdrops (Very slow)&quot; type=&quot;bool&quot; id=&quot;backdrops&quot; default=&quot;true&quot;&gt;&lt;/setting&gt;" dest="5+">
        <expression></expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;setting label=&quot;Get Trailer&quot; type=&quot;bool&quot; id=&quot;trailer&quot; default=&quot;true&quot;&gt;&lt;/setting&gt;" dest="5+">
        <expression></expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetSettings>
  <NfoUrl dest="3">
    <RegExp input="$$1" output="\1" dest="3">
      <expression noclean="1">(http://www\.mymovies\.it/dizionario/recensione\.asp\?id=[0-9]+)</expression>
    </RegExp>
  </NfoUrl>
  <CreateSearchUrl dest="3">
    <RegExp input="$$1" output="http://www.mymovies.it/database/ricerca/default.asp?q=\1" dest="3">
      <expression noclean="1"></expression>
    </RegExp>
  </CreateSearchUrl>
  <GetSearchResults dest="8">
    <RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
      <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\2 (\5, \4)&lt;/title&gt;&lt;url&gt;http://www.mymovies.it/dizionario/recensione.asp?id=\1&lt;/url&gt;&lt;id&gt;\1&lt;/id&gt;&lt;/entity&gt;" dest="5">
        <expression repeat="yes" noclean="1,3">&lt;h3 style="margin:0px;"&gt;[^&lt;]*&lt;a href="http://www\.mymovies\.it/dizionario/recensione\.asp\?id=([0-9]+)" title="[^"]+"&gt;([^&lt;]+)&lt;/a&gt;.+?&lt;div class="linkblu2" style="padding-right:7px; text-align:justify;"&gt;[\s]+Un film di &lt;b&gt;[^&lt;]*&lt;a href="http://www\.mymovies\.it/biografia/\?r=([0-9]+)"&gt;([^&lt;]+)&lt;/[ab]&gt;[^;]+anno=([^"]+)</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetSearchResults>
  <GetDetails clearbuffers="no" dest="3">
    <RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
      <RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;&lt;year&gt;\2&lt;/year&gt;" dest="5">
        <expression noclean="1,2">&lt;title&gt;([^\(]+) \(([0-9]{4})</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;genre&gt;\2&lt;/genre&gt;" dest="5+">
        <expression noclean="1">&lt;a title="Film ([^"]+)" href="http://www.mymovies.it/film/\1/"&gt;([^&lt;]+)</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;runtime&gt;\1&lt;/runtime&gt;" dest="5+">
        <expression noclean="1">durata ([0-9]+) min\.</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;director&gt;\1&lt;/director&gt;" dest="5+">
        <expression noclean="1" trim="1">Un film di (.+?) con</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;tagline&gt;\1&lt;/tagline&gt;" dest="5+">
        <expression noclean="1" trim="1"> &lt;strong class="rec_lancio" &gt;([^&lt;]+)&lt;/strong&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;plot&gt;\1&lt;/plot&gt;" dest="5+">
        <expression repeat="yes" trim="1">&lt;td rowspan="2" valign="top"&gt;[\s]+&lt;p&gt;[\s]+[^&gt;]+&gt;[\s]+[^&gt;]+/&gt;[\s]+&lt;/a&gt;[\s]+(.+) &lt;/p&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;thumb&gt;\1&lt;/thumb&gt;" dest="5+">
        <expression noclean="1">&lt;img style="float:left; border:solid 1px gray; padding:3px; margin:5px;" src="([^"]+)" width="[0-9]+px" height="[0-9]+px" alt="[^"]+" /&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;url function=&quot;GetPosters&quot;&gt;\1&lt;/url&gt;" dest="5+">
        <expression noclean="1">&lt;td class="rec_link_disattivo"&gt;&lt;a title="[^"]+" href="([^"]+)"&gt;Poster&lt;/a&gt;&lt;/td&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;url function=&quot;GetMovieTrailer&quot;&gt;\1&lt;/url&gt;" dest="5+">
        <expression>&lt;td class="rec_link_disattivo"&gt;&lt;a title="[^"]+" href="([^"]+)"&gt;Trailer&lt;/a&gt;&lt;/td&gt;</expression>
      </RegExp>
      <RegExp conditional="backdrops" input="$$7" output="&lt;url function=&quot;GetTMDBFanartByIMDBId&quot;&gt;http://www.imdb.it/find?s=all&amp;q=\1&lt;/url&gt;" dest="5+">
        <RegExp input="$$1" output="\1" dest="4">
          <expression noclean="1" trim="1">&lt;title&gt;([^\(]+) \(</expression>
        </RegExp>
        <RegExp input="$$1" output="$$4 (\1)" dest="4">
          <expression noclean="1">&lt;title&gt;[^\(]+ \(([0-9]{4})</expression>
        </RegExp>
        <RegExp input="$$4" output="\1+\2" dest="6">
          <expression repeat="yes" noclean="1,2" trim="1">(.*?) ([^ ]*)</expression>
        </RegExp>
        <RegExp input="$$6" output="\1\2" dest="7">
          <expression repeat="yes" noclean="1,2" trim="1">(.*?)([^&amp;]*)</expression>
        </RegExp>
        <expression noclean="1"></expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;rating&gt;\1\2&lt;/rating&gt;" dest="5+">
        <expression>&lt;div style="text-align:center; font-size:23px; font-weight:bold; letter-spacing:1px; margin:0px 11px 7px 11px"&gt;([0-9]+)\,([0-9]+)&lt;span style="font-size:11px"&gt;/([^&lt;]+)&lt;/span</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;url function=&quot;GetMovieCast&quot;&gt;\1&lt;/url&gt;" dest="5+">
        <expression>&lt;td class="rec_link_disattivo"&gt;&lt;a title="[^"]+" href="([^"]+)"&gt;Cast&lt;/a&gt;&lt;/td&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;studio&gt;\1&lt;/studio&gt;" dest="5+">
        <expression noclean="1" trim="1">&gt;[0-9]+&lt;/a&gt;&lt;/strong&gt;.[^-]+-([^&lt;]+)&lt;strong&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;mpaa&gt;\1&lt;/mpaa&gt;" dest="5+">
        <expression noclean="1">ratings: [^=]+=[^=]+=[^&gt;]+&gt;([^&lt;]+)&lt;/a&gt;&lt;/strong&gt;</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetDetails>
  <GetMovieTrailer dest="5">
    <RegExp conditional="trailer" input="$$1" output="&lt;details&gt;&lt;trailer&gt;\1&lt;/trailer&gt;&lt;/details&gt;" dest="5+">
      <expression noclean="1">"file=([^&amp;]+)</expression>
    </RegExp>
  </GetMovieTrailer>
  <GetPosters dest="5">
    <RegExp input="$$6" output="&lt;details&gt;\1&lt;/details&gt;" dest="5+">
      <RegExp input="$$1" output="&lt;thumb&gt;http://www.mymovies.it/filmclub/\2/\3/\4/locandina\5&lt;/thumb&gt;" dest="6+">
        <RegExp input="$$1" output="&lt;thumb&gt;\1&lt;/thumb&gt;" dest="6">
          <expression repeat="yes" noclean="1">&lt;td align="center" valign="middle" style="background-color:#eeeeee; border:solid 1px #AEAEAE;"&gt;[\s]+&lt;img width="[0-9]+" style="margin-top:[0-9]+px; margin-bottom:[0-9]+px;" title="[^"]+" alt="[^"]+" src="([^"]+)" /&gt;</expression>
        </RegExp>
        <expression repeat="yes" noclean="1">&lt;td align="center" valign="middle" style="background-color:#eeeeee; border:solid 1px #AEAEAE;"&gt;[\s]+&lt;a href="([^"]+)"&gt;&lt;[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[^"]+"[\s]+src="http://www.mymovies.it/filmclub/([0-9]+)/([0-9]+)/([0-9]+)/imm([^"]+)"</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetPosters>
  <GetMovieCast dest="5">
    <RegExp input="$$2" output="&lt;details&gt;\1&lt;/details&gt;" dest="5+">
      <RegExp input="$$1" output="&lt;actor&gt;&lt;name&gt;\2&lt;/name&gt;&lt;role&gt;\3&lt;/role&gt;&lt;thumb&gt;\1&lt;/thumb&gt;&lt;/actor&gt;" dest="2+">
        <expression repeat="yes" noclean="1">src="([^"]+)"[\s]+alt="([^"]+)" /&gt;[\s]+&lt;/a&gt;[\s]+&lt;div style=[^&gt;]+&gt;[\s]+&lt;a href="[^&gt;]+&gt;[^&lt;]+&lt;/a&gt;[\s]+&lt;div style="[^&gt;]+&gt;([^&lt;]+)&lt;/div&gt;</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetMovieCast>
</scraper>

vdrfan Wrote: Once there's an updated and working scraper, please create a new trac ticket and attach the scraper so we can push it to SVN. Thanks.

If you want to push it to SVN, please go ahead; I'm not very comfortable with Trac.

Bye all
(This post was last modified: 2009-10-07 22:03 by sipontino.)
vdrfan Offline
Team-XBMC Developer
Posts: 2,891
Joined: Jan 2008
Reputation: 8
Location: Germany
Post: #26
If you want this to be included in XBMC, create a new trac ticket and attach it.

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules
For troubleshooting and bug reporting please make sure you read this first.
sipontino Offline
Junior Member
Posts: 35
Joined: Dec 2008
Reputation: 0
Post: #27
vdrfan Wrote: If you want this to be included in XBMC create a new trac ticket and attach it.

OK, no problem, I will do it soon!

Is there a scraper with working actor thumbs? It seems that even though
the output is good, no actor thumbs are displayed.
Any help would be appreciated, considering that the scraper XML tester
displays the actor thumbs correctly.

Here is an output example:

Code:
<details><actor><name>Steven Spielberg</name><role>Regista</role><thumb>http://www.mymovies.it/filmclub/registi/771.jpg</thumb></actor><actor><name>Harrison Ford</name><role>Indiana Jones</role><thumb>http://www.mymovies.it/filmclub/attori/262.jpg</thumb></actor><actor><name>Karen Allen</name><role>Marion Ravenwood</role><thumb>http://www.mymovies.it/filmclub/attori/4890.jpg</thumb></actor><actor><name>Cate Blanchett</name><role>Irina Spalko</role><thumb>http://www.mymovies.it/filmclub/attori/26216.jpg</thumb></actor><actor><name>Shia LaBeouf</name><role>Mutt Williams</role><thumb>http://www.mymovies.it/filmclub/attori/58848.jpg</thumb></actor><actor><name>John Hurt</name><role>Abner Ravenwood</role><thumb>http://www.mymovies.it/filmclub/attori/2195.jpg</thumb></actor><actor><name>Ray Winstone</name><role>Mac</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name>Jim Broadbent</name><role>Il professore di Yale</role><thumb>http://www.mymovies.it/filmclub/attori/17593.jpg</thumb></actor><actor><name> Igor Jijikine</name><role>Dovchenko</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name>Pavel Lychnikoff</name><role>Un soldato russo (per Pasha D. Lychnikoff)</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name> Andrew Divoff</name><role>Un soldato russo</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name>Alan Dale</name><role>General Ross</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name>Frank Marshall</name><role>Produzione</role><thumb>http://www.mymovies.it/filmclub/registi/622.jpg</thumb></actor><actor><name> Denis L. 
Stewart</name><role>Produzione - altri</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name>Kathleen Kennedy</name><role>Produttore esecutivo</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name> George Lucas</name><role>Soggetto</role><thumb>http://www.mymovies.it/filmclub/registi/902.jpg</thumb></actor><actor><name>Jeff Nathanson</name><role>Soggetto</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name> Philip Kaufman</name><role>Soggetto</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name>David Koepp</name><role>Sceneggiatura</role><thumb>http://www.mymovies.it/filmclub/registi/9107.jpg</thumb></actor><actor><name>Janusz Kaminsky</name><role>Fotografia</role><thumb>http://www.mymovies.it/v7/img/icon_cast/fotografia_big.jpg</thumb></actor><actor><name>Michael Kahn</name><role>Montaggio</role><thumb>http://www.mymovies.it/v7/img/icon_cast/montaggio_big.jpg</thumb></actor><actor><name>John Williams</name><role>Musica</role><thumb>http://www.mymovies.it/v7/img/icon_cast/musica_big.jpg</thumb></actor><actor><name>Guy Dyas</name><role>Scenografia</role><thumb>http://www.mymovies.it/v7/img/icon_cast/scenografia_big.jpg</thumb></actor><actor><name>Bernie Pollack</name><role>Costumi</role><thumb>http://www.mymovies.it/v7/img/icon_cast/costumi_big.jpg</thumb></actor><actor><name>Mary Zophres</name><role>Costumi</role><thumb>http://www.mymovies.it/v7/img/icon_cast/costumi_big.jpg</thumb></actor><actor><name>Pablo Helman</name><role>Effetti</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor><actor><name>Ben Burtt</name><role>Effetti - altri</role><thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor></details>

Thanks in advance
vdrfan Offline
Team-XBMC Developer
Posts: 2,891
Joined: Jan 2008
Reputation: 8
Location: Germany
Post: #28
Output looks sane. Note that in order to fetch actor thumbs, you have to enable it in the video settings.
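
One way to double-check such output outside XBMC is to parse it with a standard XML parser. This is a minimal Python sketch using a shortened copy of the output quoted in this thread; treating `star_big.jpg` as a generic placeholder icon is an assumption based on the URL, not something confirmed here.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the GetMovieCast output quoted in this thread.
output = (
    "<details>"
    "<actor><name>Harrison Ford</name><role>Indiana Jones</role>"
    "<thumb>http://www.mymovies.it/filmclub/attori/262.jpg</thumb></actor>"
    "<actor><name> Igor Jijikine</name><role>Dovchenko</role>"
    "<thumb>http://www.mymovies.it/v7/img/icon_cast/star_big.jpg</thumb></actor>"
    "</details>"
)

root = ET.fromstring(output)  # raises ParseError if the XML is malformed

actors = []
for actor in root.findall("actor"):
    name = (actor.findtext("name") or "").strip()  # some names carry a leading space
    thumb = actor.findtext("thumb") or ""
    # Assumption: star_big.jpg is the site's generic placeholder icon.
    actors.append((name, thumb, thumb.endswith("star_big.jpg")))

for entry in actors:
    print(entry)
```

If the parse succeeds and every actor has a thumb URL, the missing thumbs are more likely a settings issue than a scraper issue.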

muttley:bd Offline
Senior Member
Posts: 147
Joined: Oct 2008
Reputation: 0
Post: #29
After (yet another) graphic restyling of the site, you can find the latest version here:
http://muttley.eb2a.com/2009/xbmc-scrape...movies-it/ (Italian page)

See the first post.
KoTiX Offline
Fan
Posts: 518
Joined: Jun 2004
Reputation: 6
Post: #30
Sorry Muttley, but this version of the scraper doesn't work for me. It doesn't find any movies at all; the expression in GetSearchResults doesn't match any result on the Google page:

Code:
22:43:03 T:3360 M:1715896320   DEBUG: FileCurl::Open(0012D8C8) http://www.google.it/cse?q=franklyn&cx=partner-pub-1699801751737986%3Ax7j961-1g3m&ie=ISO-8859-1&sa=Cerca&num=20
22:43:03 T:3360 M:1716051968   DEBUG: FileCurl::Close(0012D8C8) http://www.google.it/cse?q=franklyn&cx=partner-pub-1699801751737986%3Ax7j961-1g3m&ie=ISO-8859-1&sa=Cerca&num=20
22:43:03 T:3360 M:1716047872   DEBUG: scraper: GetSearchResults returned <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?><results></results>
22:43:03 T:3360 M:1716047872   ERROR: CIMDB::Process: Error looking up movie Franklyn
22:43:03 T:3360 M:1716047872   DEBUG: Thread 3360 terminating

Did you upload the right file to your site?
(This post was last modified: 2009-10-28 23:56 by KoTiX.)