I've never written a scraper, a quick help?

spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #31
uhrr, you make only one selection, so why do you think you can make the title be \2?
also check the debug log: it shows you exactly what is returned and what's going on.

the outer regexp is fine.
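
To illustrate the point about selections, here is a minimal Python sketch. The pattern and the input string are hypothetical, not the scraper from this thread. It only shows that a regular expression with one pair of parentheses has one capture group, so there is nothing for \2 to refer to.

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
# One pair of parentheses means exactly one selection (capture group).
pattern = re.compile(r"<title>([^<]*)</title>")
match = pattern.search("<title>Apt Pupil</title>")

print(pattern.groups)   # number of capture groups defined by the pattern
print(match.group(1))   # the first (and only) selection

try:
    match.group(2)      # there is no second selection to read
except IndexError as err:
    print("no second selection:", err)
```

Python raises an error when a missing group is requested. Whatever the scraper engine does in that case, the title has to come from \1 unless the expression gains a second pair of parentheses.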
small_frenchy Offline
Junior Member
Posts: 32
Joined: Jun 2009
Reputation: 0
Post: #32
Okay, one more step :) Thanks for your help.

Now it seems I have a problem with GetDetails.

The log shows:

17:06:44 T:6876 M:944844800 DEBUG: FileCurl::Open(0012B9F4) http://localhost/CMMServer/GetDetails.aspx?s=Apt pupil (1998)
17:06:44 T:6876 M:944803840 DEBUG: XFILE::CFileCurl::CReadState::FillBuffer: curl failed with code 22
17:06:44 T:6876 M:944803840 ERROR: CFileCurl::CReadState::Open, didn't get any data from stream.
17:06:44 T:6876 M:944803840 DEBUG: FileCurl::Close(0012B9F4) http://localhost/CMMServer/GetDetails.aspx?s=Apt pupil (1998)

But when I open this URL (http://localhost/CMMServer/GetDetails.aspx?s=Apt pupil (1998)) in Firefox, I get:

Code:
<?xml version="1.0" encoding="utf-8"?>
<details>
  <title>Un Élève doué</title>
  <rating>6,500000</rating>
  <year>1998</year>
  <top250>0</top250>
  <votes>14,976</votes>
  <outline>
    Fasciné par le cours de son professeur de sociologie à propos de l'Holocauste, Todd Bowden 16 ans et élève particulièrement brillant, se conscacre à des recherches sur le sujet. Un jour il croise un vieil homme en qui il croit reconnaitre l'ancien directeur du camp de Patin, recherché pour crimes contre l'humanite, Kurt Dussander. Entre l'élève curieux et l'ancien nazi, d'étranges relations de pouvoir vont se nouer.
  </outline>
  <plot>
    Fasciné par le cours de son professeur de sociologie à propos de l'Holocauste, Todd Bowden 16 ans et élève particulièrement brillant, se conscacre à des recherches sur le sujet. Un jour il croise un vieil homme en qui il croit reconnaitre l'ancien directeur du camp de Patin, recherché pour crimes contre l'humanite, Kurt Dussander. Entre l'élève curieux et l'ancien nazi, d'étranges relations de pouvoir vont se nouer.
  </plot>
  <tagline>S'il y a une fin au scénario, on ne peut en dire autant des interrogations qui vous assaillent à la sortie de la salle. Mais c'est un des rares films qui changent à la fois le cinéma et le spectateur.</tagline>
  <runtime>1h 51min</runtime>
  <thumb>http://localhost/CMMServer/temp/thumb_6d1b2267-818a-4505-a187-684208148db0.jpg</thumb>
  <fanart>
    <thumb>http://localhost/CMMServer/temp/fanart_6d1b2267-818a-4505-a187-684208148db0.jpg</thumb>
  </fanart>
  <mpaa />
  <playcount>0</playcount>
  <lastplayed />
  <id>tt0118636</id>
  <genre>Drame / Thriller</genre>
  <credits />
  <director>Bryan Singer</director>
  <premiered />
  <status />
  <code />
  <aired />
  <studio />
  <trailer />
  <actor>
    <name>Brad Renfro</name>
    <role>Todd Bowden</role>
    <thumb>http://localhost/CMMServer/temp/actor_9af31911-7099-488a-8f53-849e80dbcae6.jpg</thumb>
  </actor>
  <actor>
    <name>Ian McKellen</name>
    <role>Kurt Dussander</role>
    <thumb>http://localhost/CMMServer/temp/actor_4cd2042a-e453-414c-b5ab-9c363568b4e4.jpg</thumb>
  </actor>
  <actor>
    <name>Joshua Jackson</name>
    <role>Joey</role>
    <thumb>http://localhost/CMMServer/temp/actor_edf561fd-f573-4684-a38b-d4c3afe27181.jpg</thumb>
  </actor>
  <actor>
    <name>Mickey Cottrell</name>
    <role>Sociology Teacher</role>
    <thumb />
  </actor>
  <actor>
    <name>Michael Reid MacKay</name>
    <role>Nightmare Victim</role>
    <thumb>http://localhost/CMMServer/temp/actor_454e991d-a837-407b-bee9-2448ddfafa8e.jpg</thumb>
  </actor>
  <actor>
    <name>Ann Dowd</name>
    <role>Monica Bowden</role>
    <thumb />
  </actor>
  <actor>
    <name>Bruce Davison</name>
    <role>Richard Bowden</role>
    <thumb>http://localhost/CMMServer/temp/actor_3609fd9f-7302-412d-85be-4e48a688d250.jpg</thumb>
  </actor>
  <actor>
    <name>James Karen</name>
    <role>Victor Bowden</role>
    <thumb>http://localhost/CMMServer/temp/actor_ea4d21c2-a9c2-4601-802b-20a15a93e6e6.jpg</thumb>
  </actor>
  <actor>
    <name>Marjorie Lovett</name>
    <role>Agnes Bowden</role>
    <thumb />
  </actor>
  <actor>
    <name>David Cooley</name>
    <role>Gym Teacher</role>
    <thumb />
  </actor>
  <actor>
    <name>Blake Anthony Tibbetts</name>
    <role>Teammate</role>
    <thumb />
  </actor>
  <actor>
    <name>Heather McComb</name>
    <role>Becky Trask</role>
    <thumb>http://localhost/CMMServer/temp/actor_1bbb9d96-454a-463e-a08f-6326621d16e7.jpg</thumb>
  </actor>
  <actor>
    <name>Katherine Malone</name>
    <role>Student</role>
    <thumb />
  </actor>
  <actor>
    <name>Grace Sinden</name>
    <role>Secretary</role>
    <thumb />
  </actor>
  <actor>
    <name>David Schwimmer</name>
    <role>Edward French</role>
    <thumb>http://localhost/CMMServer/temp/actor_721e8f8a-5431-4409-ba17-bd2eca668c22.jpg</thumb>
  </actor>
  <artist />
</details>
spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #33
firefox accepts broken urls; by the looks of it, we do not. spaces are not valid in urls (each one needs to be encoded as %20). either replace them in the scraper, or have your server output a url-safe version in addition.
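
For reference, curl error code 22 means the server answered with an HTTP error status (400 or above), which fits a request URL the server could not accept as sent. A minimal sketch of the URL-safe encoding, in Python and assuming the title is passed in the `s` query parameter as in the log above; the helper name `details_url` is hypothetical and not part of XBMC or CMMServer:

```python
from urllib.parse import quote

# Endpoint taken from the log above.
BASE_URL = "http://localhost/CMMServer/GetDetails.aspx"

def details_url(title: str) -> str:
    """Build a GetDetails URL with the title percent-encoded.

    Hypothetical helper for illustration only.
    """
    # safe="" encodes every reserved character, so a space becomes %20.
    return f"{BASE_URL}?s={quote(title, safe='')}"

print(details_url("Apt pupil (1998)"))
```

The parentheses are percent-encoded as well, which is harmless: the server decodes them back when it reads the query string.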
small_frenchy Offline
Junior Member
Posts: 32
Joined: Jun 2009
Reputation: 0
Post: #34
Finally it works, many thanks to you spiff... sorry for wasting your time... Now I can finish my project!

Once it is usable, I will let it find its way to XBMC users who want a tool to unify scraping on one computer on their LAN... Thanks a lot again.