Quick Scraper Question (Hope so:))

spiff Online
Grumpy Bastard Developer
Posts: 12,176
Joined: Nov 2003
Reputation: 82
Post: #11
that was the prerequisite for 2)

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
Schenk2302 Offline
Senior Member
Posts: 103
Joined: Feb 2009
Reputation: 4
Post: #12
Hey Spiff,

I don't want to waste your time, but I have one question left. My scraper is coming along and the first parts work well. I'm now parsing the genres, which works, but the output looks like "Action, Thriller, Horror". How do I get rid of the commas?

Thanks in advance

Schenk
spiff Online
Grumpy Bastard Developer
Posts: 12,176
Joined: Nov 2003
Reputation: 82
Post: #13
Code:
<RegExp input="$$2" output="\1\2" dest="3">
    <expression noclean="1,2" repeat="yes">(.*?),(.*)</expression>
</RegExp>

also you should use multiple <genre> tags so maybe something like this?
Code:
<RegExp input="$$2" output="&lt;genre&gt;\1&lt;/genre&gt;" dest="3">
    <expression noclean="1" repeat="yes">(.*?),</expression>
</RegExp>
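
One thing to watch with the second pattern: it only matches text that is followed by a comma, so a final genre with no trailing comma would be skipped. A minimal sketch of a variant, assuming $$2 holds only the comma-separated genre list and assuming the engine's trim attribute is available to strip the space left after each comma:

```xml
<!-- Sketch only: match each run of non-comma characters so the last genre is kept too. -->
<!-- trim="1" is an assumption here; it is meant to strip whitespace around capture 1. -->
<RegExp input="$$2" output="&lt;genre&gt;\1&lt;/genre&gt;" dest="3">
    <expression noclean="1" trim="1" repeat="yes">([^,]+)</expression>
</RegExp>
```

If trim is not supported in your build, the leading space could instead be consumed in the pattern itself, for example by putting an optional space before the capture group.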

Schenk2302 Offline
Senior Member
Posts: 103
Joined: Feb 2009
Reputation: 4
Post: #14
Another question, sorry in advance:

If my movie title contains a German umlaut (ä), the scraper can't find the movie, but when I write "ae" instead of the umlaut it is found. I tried all the encoding options but can't find the answer:

Here's my code:

Code:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<scraper name="Cinefacts.de" content="movies" thumb="cinefacts.jpg" language="de">
    
    <CreateSearchUrl dest="3">
        <RegExp input="$$1" output="http://www.cinefacts.de/suche/suche.php?name=\1" dest="3">
            <expression noclean="1"/>
        </RegExp>
    </CreateSearchUrl>

    <GetSearchResults dest="8">
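        <!-- Outer RegExp: wraps the collected entities from buffer 5 in the results document -->
        <RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
            <!-- Inner RegExp: emits one entity per search hit into buffer 5 -->
            <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\3 (\4)&lt;/title&gt;&lt;url&gt;http://www.cinefacts.de/kino/\1/\2/filmdetails.html&lt;/url&gt;&lt;/entity&gt;" dest="5">
                <expression repeat="yes">&gt;&lt;a href=&quot;/kino/([0-9]*)/(.[^\/]*)/filmdetails.html&quot;&gt;[^&lt;]*&lt;b title=&quot;([^&quot;]*)&quot; class=&quot;headline&quot;&gt;[^&lt;]+&lt;/b&gt;&lt;/a&gt;&lt;br&gt;[^&lt;]+&lt;br&gt;+[^0-9]+([^&lt;]*)&lt;/td&gt;</expression>
            </RegExp>
            <expression noclean="1"/>
        </RegExp>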
    </GetSearchResults>

</scraper> 


Thanks again for any hints!

Schenk
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #15
Schenk2302 Wrote:Another question, sorry in advance:

If my movie title contains a German umlaut (ä), the scraper can't find the movie, but when I write "ae" instead of the umlaut it is found. I tried all the encoding options but can't find the answer:

Here's my code:



thanks again for any hints!!!

Schenk


Try escaping the character with its code, '\xE4'.
I'm not sure whether the regular expression engine supports that, though.
spiff Online
Grumpy Bastard Developer
Posts: 12,176
Joined: Nov 2003
Reputation: 82
Post: #16
you need to set the SearchStringEncoding on the CreateSearchUrl function
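
A minimal sketch of what that might look like for the scraper posted above, assuming the attribute goes directly on the CreateSearchUrl element and that iso-8859-1 is the charset the site expects in its search URLs (both are assumptions, not confirmed in this thread):

```xml
<!-- Sketch only: the encoding value is assumed to match the site's charset. -->
<CreateSearchUrl dest="3" SearchStringEncoding="iso-8859-1">
    <RegExp input="$$1" output="http://www.cinefacts.de/suche/suche.php?name=\1" dest="3">
        <expression noclean="1"/>
    </RegExp>
</CreateSearchUrl>
```

With this in place the search string should be converted to that encoding before it is put into the URL, so a title containing ä no longer needs to be typed as "ae".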

Schenk2302 Offline
Senior Member
Posts: 103
Joined: Feb 2009
Reputation: 4
Post: #17
spiff Wrote:you need to set the SearchStringEncoding on the CreateSearchUrl function

As always, thanks for this one, Spiff!
Schenk2302 Offline
Senior Member
Posts: 103
Joined: Feb 2009
Reputation: 4
Post: #18
Is there something equivalent for the GetDetails section? The plot is displayed with the HTML tags for the umlauts.
Schenk2302 Offline
Senior Member
Posts: 103
Joined: Feb 2009
Reputation: 4
Post: #19
My <GetThumbnailLink dest="5"> outputs as many URLs as there are covers, but my <GetThumbnail dest="5"> only outputs the thumb from the first URL. How can I get all the URLs output, i.e. get all the thumbs?

Thanks in advance, and sorry for the poor English.

Schenk
spiff Online
Grumpy Bastard Developer
Posts: 12,176
Joined: Nov 2003
Reputation: 82
Post: #20
<details><thumbs><thumb>..</thumb><thumb>..</thumb></thumbs></details>

also see http://forum.xbmc.org/showthread.php?tid=48643
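
A rough sketch of how a function could produce that structure, following the same nested pattern as the GetSearchResults code earlier in the thread: an inner RegExp with repeat="yes" emits one thumb per match, and the outer one wraps the collected result. The buffer numbers, the example.com URL, and the img pattern are all hypothetical placeholders, not taken from the actual site:

```xml
<!-- Sketch only: URL and pattern below are placeholders for illustration. -->
<GetThumbnail dest="5">
    <RegExp input="$$2" output="&lt;details&gt;&lt;thumbs&gt;\1&lt;/thumbs&gt;&lt;/details&gt;" dest="5">
        <!-- Inner RegExp: one thumb element per cover found on the page, collected in buffer 2 -->
        <RegExp input="$$1" output="&lt;thumb&gt;http://www.example.com/covers/\1&lt;/thumb&gt;" dest="2">
            <expression repeat="yes">img src="/covers/([^"]*)"</expression>
        </RegExp>
        <expression noclean="1"/>
    </RegExp>
</GetThumbnail>
```

The key point is the repeat="yes" on the inner expression; without it only the first match would be emitted, which would fit the single-thumb behaviour described above.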

(This post was last modified: 2009-04-28 22:07 by spiff.)