Quick Scraper Question (Hope so:))

Schenk2302 Offline
Senior Member
Posts: 103
Joined: Feb 2009
Reputation: 4
Post: #51
spiff Wrote: there is no <poster> tag, where did you get that idea from?

I thought the tag name was free to choose, so I just replaced fanart.
spiff Online
Grumpy Bastard Developer
Posts: 12,172
Joined: Nov 2003
Reputation: 81
Post: #52
well, all the time spent over the last few days struggling with <thumbs> and <thumb> should have made it clear how you add thumbs to the result

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
Schenk2302
Post: #53
Hey spiff,

after struggling the whole night and day, I finally made it. This now works, and in the end it was easier than I had thought all day! But I'm not finished bothering you: please let me know if you could help me here.

Code:
<RegExp input="$$1" output="&lt;url function=&quot;GetPosterLinkURL&quot;&gt;http://www.moviemaze.de/suche/result.phtml?searchword=\1&lt;/url&gt;" dest="5+">

The problem here is that it only searches for the first word.
For example, for Das Hundehotel it only searches for Das; same for The Last..., where it only searches for The. Is there any way to change this?

Thanks again

Schenk
spiff
Post: #54
you need to run a replacement regexp, replacing ' ' with %20. something along these lines:

(grab the relevant title in, e.g. $$5)
Code:
<RegExp input="$$5" output="\1%20\2" dest="7">
  <expression repeat="yes">(.*?) (.*)</expression>
</RegExp>

Schenk2302
Post: #55
spiff Wrote: you need to run a replacement regexp, replacing ' ' with %20. something along these lines:

(grab the relevant title in, e.g. $$5)
Code:
<RegExp input="$$5" output="\1%20\2" dest="7">
  <expression repeat="yes">(.*?) (.*)</expression>
</RegExp>

As usual, I don't understand where this goes in here. Is it an inner, an outer, or a separate regexp?

Code:
<!-- Moviemaze Poster URL -->
<RegExp input="$$1" output="&lt;url function=&quot;GetPosterLinkURL&quot;&gt;http://www.moviemaze.de/suche/result.phtml?searchword=\1&lt;/url&gt;" dest="5+">
  <expression noclean="1">&lt;h1&gt;([^&lt;]*)</expression>
</RegExp>
spiff
Post: #56
1. grab whatever you want to search for into a buffer (as i already stated).
Code:
<RegExp input="$$1" output="\1" dest="6">
  <expression noclean="1">&lt;h1&gt;([^&lt;]*)</expression>
</RegExp>

2. run the replacement regexp
Code:
<RegExp input="$$6" output="\1%20\2" dest="7">
  <expression repeat="yes">(.*?) (.*)</expression>
</RegExp>

3. finally construct the url based on your new and shiny space-replaced title
Code:
<RegExp input="$$7" output="&lt;url function=&quot;GetPosterLinkURL&quot;&gt;http://www.moviemaze.de/suche/result.phtml?searchword=\1&lt;/url&gt;" dest="5+">
  <expression noclean="1"/>
</RegExp>
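
For readers who want to check the three steps outside of XBMC, here is a minimal Python sketch of the same idea: extract the title, encode the spaces, build the search URL. It is not scraper code. The function name `build_search_url` and the sample HTML are made up for illustration, and step 2 swaps in Python's standard `urllib.parse.quote` instead of a replacement regexp.

```python
import re
from urllib.parse import quote

# Search URL taken from the thread; everything else here is illustrative.
SEARCH_URL = "http://www.moviemaze.de/suche/result.phtml?searchword="


def build_search_url(page_html):
    """Hypothetical helper mirroring the three scraper steps."""
    # Step 1: grab the title from the first <h1> (same idea as <h1>([^<]*)).
    match = re.search(r"<h1>([^<]*)", page_html)
    if match is None:
        return None
    title = match.group(1).strip()

    # Step 2: percent-encode the title so spaces become %20.
    encoded = quote(title)

    # Step 3: append the encoded title to the search URL.
    return SEARCH_URL + encoded


# Made-up sample page, only to exercise the function.
sample = "<html><body><h1>Der letzte Zug</h1></body></html>"
print(build_search_url(sample))
```

With the sample page this prints the search URL ending in `searchword=Der%20letzte%20Zug`, which is the form the scraper buffer should end up with.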

Schenk2302
Post: #57
spiff Wrote:
Code:
<RegExp input="$$1" output="\1" dest="6">
  <expression noclean="1">&lt;h1&gt;([^&lt;]*)</expression>
</RegExp>

Code:
<RegExp input="$$6" output="\1%20\2" dest="7">
  <expression repeat="yes">(.*?) (.*)</expression>
</RegExp>

Code:
<RegExp input="$$7" output="&lt;url function=&quot;GetPosterLinkURL&quot;&gt;http://www.moviemaze.de/suche/result.phtml?searchword=\1&lt;/url&gt;" dest="5+">
  <expression noclean="1"/>
</RegExp>

Thanks again spiff, I now understand it and got it working, but I think it now only searches for e.g. Der letzte and not Der letzte Zug. Maybe I have to change the expression, but I don't know what the old expression, (.*?) (.*), is doing.

-Schenk
spiff
Post: #58
well, you should know what that does; it is just a regular expression.

that being said, my bad. you want:
Code:
<RegExp input="$$6" output="\1%20" dest="7">
  <expression repeat="yes">([^ ]+)</expression>
</RegExp>
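
To see why the first expression stops after one space while this one handles every word, the two can be compared in a small Python sketch. It assumes that repeat="yes" expands the output once per successive match and concatenates the results; that is a reading of this thread, not confirmed scraper documentation. The helper `apply_repeat` is hypothetical.

```python
import re


def apply_repeat(pattern, template, text):
    """Hypothetical stand-in for a repeat="yes" expression:
    expand the output template for each successive match and concatenate."""
    return "".join(m.expand(template) for m in re.finditer(pattern, text))


title = "Der letzte Zug"

# First attempt: the greedy (.*) swallows the rest of the title in the
# first match, so there is nothing left to match a second time.
first_try = apply_repeat(r"(.*?) (.*)", r"\1%20\2", title)

# Corrected version: every run of non-space characters is one match,
# and each match is emitted followed by %20.
fixed = apply_repeat(r"([^ ]+)", r"\1%20", title)

print(first_try)  # Der%20letzte Zug
print(fixed)      # Der%20letzte%20Zug%20
```

Under that assumption the corrected expression leaves a trailing %20 after the last word, which a search page will usually tolerate.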

w00dst0ck Offline
Junior Member
Posts: 37
Joined: Aug 2008
Reputation: 0
Location: Germany
Post: #59
@Schenk2302:
When I wrote the moviemaze.de scraper and had to deal with RegEx for the first time, this site helped me.
http://www.regex-tester.de/regex.html
Schenk2302
Post: #60
w00dst0ck Wrote: @Schenk2302:
When I wrote the moviemaze.de scraper and had to deal with RegEx for the first time, this site helped me.
http://www.regex-tester.de/regex.html

Hi woodstock,

yes, thank you, I have already looked there too, but sometimes the penny just doesn't drop.

Regards

Schenk