REGEX class handles multiline?

AaronD Offline
Senior Member
Posts: 252
Joined: Jan 2007
Reputation: 0
Location: Dubai, United Arab Emirates
Post: #1
This is for anyone familiar with the scraper implementation, specifically the capabilities of PCRE (not looking at anyone in particular :rolleyes:).

Stepping through the code, I have the following expression text:
Code:
<thumb([^>]*)>.*?url="([^"]*)".*?size="original".*?</thumb>

And the following input string, copied literally from the Text visualizer in Visual Studio:
Code:
<thumb>
          <image url="http://i2.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-original.jpg" size="original" width="1418" height="1944"/>
          <image url="http://i3.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-mid.jpg" size="mid" width="500" height="685"/>
          <image url="http://i1.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-cover.jpg" size="cover" width="185" height="253"/>
          <image url="http://i1.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-thumb.jpg" size="thumb" width="92" height="126"/>
        </thumb>

I've tried several third-party regex testers and evaluators, and they all fail to find a match, but only because of the carriage returns: if I remove those so the text is all on a single line, the regex matches. What puzzles me is that the original expression text from the scraper file also fails in those other regex testers.

So my question: does PCRE work across multiple lines? And if so, why doesn't it work with my regex? Is it just an oddity of PCRE?
jmarshall Offline
Team-XBMC Developer
Posts: 24,523
Joined: Oct 2003
Reputation: 138
Post: #2
PCRE should work across multiple lines, yes - there's a flag we specify. Not sure why it's not matching your regexp?
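For what it's worth, the behaviour described above is what you'd expect from any regex engine where `.` does not match a newline by default, so `.*?` cannot cross from one line to the next unless a dot-all option is set. In PCRE that option is PCRE_DOTALL; which flag XBMC actually passes isn't stated in this thread, so treat that part as an assumption. Below is a rough sketch using Python's `re` module rather than PCRE (the default-versus-dot-all behaviour is the same idea), with the input shortened to two image lines. The names `PATTERN`, `TEXT`, `default_match` and `dotall_match` are made up for the illustration.

```python
import re

# The expression from post #1, unchanged.
PATTERN = r'<thumb([^>]*)>.*?url="([^"]*)".*?size="original".*?</thumb>'

# Shortened copy of the input from post #1 (two of the four image lines).
TEXT = """<thumb>
          <image url="http://i2.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-original.jpg" size="original" width="1418" height="1944"/>
          <image url="http://i3.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-mid.jpg" size="mid" width="500" height="685"/>
        </thumb>"""

# Default mode: '.' stops at every newline, so '.*?' cannot get from
# '<thumb>' to the 'url=' attribute on the following line.
default_match = re.search(PATTERN, TEXT)

# Dot-all mode: '.' also matches newlines, so the whole block can match.
dotall_match = re.search(PATTERN, TEXT, re.DOTALL)

print("default:", default_match)
print("dotall url:", dotall_match.group(2) if dotall_match else None)
```

Many regex testers leave dot-all switched off unless you tick a "dot matches newline" or "single line" option, which would explain the failures there. Note that the "multiline" option in most engines only changes how `^` and `$` behave and does not affect `.`.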

AaronD Offline
Senior Member
Posts: 252
Joined: Jan 2007
Reputation: 0
Location: Dubai, United Arab Emirates
Post: #3
Actually, I realised afterwards that mine was a stupid post. It's scraping websites with multiple lines, isn't it? Of course it must work with carriage returns.

Oh well, I guess I will just put it down to an oddity of PCRE and move on.

Thanks