Need help with scraper, removing space
#1
Code:
<RegExp input="$$1" output="&lt;url function=&quot;GetTrailer1&quot;&gt;http://www.site.com/?q=\1&lt;/url&gt;" dest="5+">
    <expression>&lt;title&gt;([^&lt;]*)&lt;/title&gt;</expression>
</RegExp>

This returns the URL with the matched title inserted, e.g. "http://www.site.com/?q=The Movie". But this URL returns an error in curl, whereas "http://www.site.com/?q=The%20Movie" would work. How can I replace every space in the matched text with %20?

btw, would trim help me here?
#2
Code:
<RegExp input="$$1" output="\1%20\2" dest="5">
  <expression repeat="yes">(.*?) (.*)</expression>
</RegExp>
#3
I tried that, but it only works on titles with two words. If there are more than two words, the spaces aren't replaced by %20.

btw, is it possible to scan a movie with imdb, then scan it again with another scraper to just add a trailer without replacing any other details?
Reply
#4
No, but you can add the trailer lookup to the imdb scraper.

You need to massage the expression a bit to accept more than one space; it was just to give you the idea.
Code:
<RegExp input="$$1" output="\1%20\2" dest="5">
  <expression repeat="yes">(.*?) ([^ ]*)</expression>
</RegExp>
Something along those lines should do it.
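
To see why the first expression only handles two-word titles, here is a rough Python sketch of the two patterns. It assumes repeat="yes" applies the expression to each successive non-overlapping match and concatenates the outputs; that is a reading of the behaviour reported in this thread, not something checked against the scraper engine. The helper name apply_repeat is made up for illustration.

```python
import re

def apply_repeat(pattern, template, text):
    """Emulate the assumed repeat="yes" behaviour: expand the output
    template for every successive match and join the pieces."""
    return "".join(m.expand(template) for m in re.finditer(pattern, text))

title = "The Big Movie"

# Greedy (.*) swallows the rest of the title, so only the first space is replaced.
first_try = apply_repeat(r"(.*?) (.*)", r"\1%20\2", title)

# [^ ]* stops at the next space, so each later match handles one more space.
second_try = apply_repeat(r"(.*?) ([^ ]*)", r"\1%20\2", title)

print(first_try)
print(second_try)
```

Under that assumption the first pattern leaves the second space untouched, while the second pattern replaces both.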
#5
Thanks. I changed it up a little and got it working.
#6
Hey, thanks! It helped me a lot!
#7
Isn't there an encode attribute on the RegExp tag that should do the trick?
#8
These days, yes, but not back when this topic was alive.
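
For comparison, this is what URL encoding does to a title in general, sketched in Python with urllib.parse.quote. It is not the scraper's encode option, whose exact syntax is not shown in this thread; build_search_url is a hypothetical helper, and the site URL is the placeholder from the first post.

```python
from urllib.parse import quote

def build_search_url(title):
    # safe="" percent-encodes every reserved character, including "/" and "&".
    return "http://www.site.com/?q=" + quote(title, safe="")

print(build_search_url("The Movie"))
```

Unlike the space-only regex, this also escapes characters such as "&" that would otherwise break the query string.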
#9
spiff Wrote: these days yes. but not back when this topic was alive.
BTW, neither the \s construct nor the trim switch matches TAB characters.
#10
I added \t to trim a few weeks back.