Quick Scraper Question (Hope so:))

Schenk2302 Offline
Senior Member
Posts: 103
Joined: Feb 2009
Reputation: 4
Post: #1
Hi everyone,

I'm trying to write a scraper but I'm stuck on one step.

I use scrap.exe to test my scraper:

CreateSearchUrl returns the expected result.

GetSearchResults returns the expected result.

The details URL is correct.

But GetDetails returns nothing, and scrap.exe reports the error: Unable to parse details.xml

Here's my code:

PHP Code:
<scraper name="TEST" content="movies" thumb="cinefacts.gif" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" language="de">
    <CreateSearchUrl dest="3">
        <RegExp input="$$1" output="http://www.cinefacts.de/suche/suche.php?name=\1" dest="3">
            <expression noclean="1"/>
        </RegExp>
    </CreateSearchUrl>
    <GetSearchResults dest="8">
        <RegExp input="$$5" output="<?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?><results>\1</results>" dest="8">
            <RegExp input="$$1" output="<entity><title>\3 \4</title><url>http://www.cinefacts.de/kino/\1/\2/filmdetails.html</url></entity>" dest="5">
                <expression repeat="yes">><a href=&quot;/kino/([0-9]*)/(.[^\/]*)/filmdetails.html&quot;>[^>]*(.[^<]*)</b></a><br>[^>]*[^\t]+\t+[^&nbsp;]+[^0-9]+([^<]+)</expression>
            </RegExp>
            <expression noclean="1"/>
        </RegExp>
    </GetSearchResults>
    <GetDetails dest="3">
        <RegExp input="$$5" output="<details>\1</details>" dest="3">
            <!-- Title -->
            <RegExp input="$$1" output="<title>\1</title>" dest="5+">
                <expression trim="1" noclean="1"><h1>([^<]*)</expression>
            </RegExp>
        </RegExp>
    </GetDetails>
</scraper>

Maybe someone could take a quick look at this and point me in the right direction.

Thanks so much in advance

Schenk
spiff Offline
Grumpy Bastard Developer
Posts: 12,179
Joined: Nov 2003
Reputation: 82
Post: #2
unfortunately scrap.exe is outdated and we lost the source.

and the reason it does not work is that you are missing the expression for the outermost RegExp in GetDetails, i.e.
Code:
....
</RegExp>
<expression noclean="1"/>
</RegExp>
</GetDetails>
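
A missing `<expression>` like this is easy to overlook in nested RegExp blocks. As a rough sketch (not part of XBMC or scrap.exe), a few lines of Python can list every RegExp element that has no direct expression child. The function name `regexps_missing_expression` and the cut-down sample are hypothetical, and the sample entity-escapes the markup inside attribute values because Python's strict XML parser would otherwise reject it:

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down GetDetails function for illustration only.
# Markup inside attribute values and expression text is entity-escaped
# so that a strict XML parser accepts it.
SCRAPER_XML = r"""
<scraper name="TEST" content="movies">
    <GetDetails dest="3">
        <RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
            <RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;" dest="5+">
                <expression trim="1" noclean="1">&lt;h1&gt;([^&lt;]*)</expression>
            </RegExp>
        </RegExp>
    </GetDetails>
</scraper>
"""


def regexps_missing_expression(xml_text):
    """Return the dest attribute of every RegExp without a direct <expression> child."""
    root = ET.fromstring(xml_text)
    return [
        node.get("dest")
        for node in root.iter("RegExp")      # walks all nesting levels
        if node.find("expression") is None   # find() only looks at direct children
    ]


missing = regexps_missing_expression(SCRAPER_XML)
print(missing)
```

For the sample above, only the outermost RegExp (dest="3") is reported, which matches the fix described in this post.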

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
(This post was last modified: 2009-04-22 22:19 by spiff.)
Schenk2302 Offline
Post: #3
Hi Spiff,

Thanks for your answer, that solved the problem with scrap.exe :)

But now I have tried it in XBMC and it doesn't work. I know that scrap.exe is outdated, but is there any way to see at which point XBMC gets stuck with my scraper, or better, why it doesn't work? Are there any scraper logs? At this point I have absolutely no clue where to start looking for the error, because with scrap.exe everything works fine. Thanks again for any hints or info.

Greetz

Schenk
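
While waiting for proper scraper logging, one way to narrow things down is to run a single expression against a saved copy of the page outside XBMC. The following is only a sketch under stated assumptions: the sample HTML is made up, `extract_title` is a hypothetical helper, and Python's `re` module is not the regex engine XBMC uses, so a match here does not guarantee identical behaviour in XBMC.

```python
import re

# Hypothetical excerpt of a details page; the real cinefacts.de markup may differ.
SAMPLE_HTML = "<html><body><h1>  Gran Torino </h1><p>...</p></body></html>"

# Same pattern as the Title expression in GetDetails.
TITLE_PATTERN = re.compile(r"<h1>([^<]*)")


def extract_title(html):
    """Return the trimmed first capture group, or None if the pattern does not match."""
    match = TITLE_PATTERN.search(html)
    if match is None:
        return None
    # strip() stands in for trim="1" (assumed to trim surrounding whitespace)
    return match.group(1).strip()


title = extract_title(SAMPLE_HTML)
print(repr(title))
```

If this returns None for the real page, the expression itself is the first suspect. If it matches, the problem is more likely in how the buffers are wired together.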
spiff Offline
Post: #4
my answer depends on two things;
1) you speak c++ and can compile
or
2) you can compile
or
3) neither

Schenk2302 Offline
Post: #5
:D

Maybe 2), but more likely 3).

Could you explain why you ask?

Thanks

Schenk
spiff Offline
Post: #6
if 1 i could have gotten away with instructions
2 means i'll have to do a patch for you which i will do shortly - here it is: http://dureks.dyndns.org:8080/scraperlog.diff
3 means i don't have to do anything

:)

(This post was last modified: 2009-04-22 22:41 by spiff.)
Schenk2302 Offline
Post: #7
spiff Wrote:if 1 i could have gotten away with instructions
2 means i'll have to do a patch for you which i will do shortly
3 means i don't have to do anything

:)


2 sounds like something I could try.
3 makes me cry, because I want that Cinefacts scraper working :)
Schenk2302 Offline
Post: #8
A little side note:

I wrote a cinefacts.de scraper for MediaPortal, but I have now switched to XBMC and would like to use it here. It was already hard for me to do this in MP, and in XBMC I'm getting depressed because it's totally different :)
spiff Offline
Post: #9
heh, different does not mean bad. don't give up, you'll get the hang of it =P

Schenk2302 Offline
Post: #10
Spiff, I know I'm being kind of lazy, but is there a compiled version with your patch available for download, or do I really have to compile it on my own? That really scares me :o