REGEX class handles multiline?
#1
This is for anyone familiar with the scraper implementation, specifically its use of PCRE (not looking at anyone in particular).

Stepping through the code, I have the following expression text:
Code:
<thumb([^>]*)>.*?url="([^"]*)".*?size="original".*?</thumb>

And the following input string, copied literally from the Text visualizer in Visual Studio:
Code:
<thumb>
          <image url="http://i2.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-original.jpg" size="original" width="1418" height="1944"/>
          <image url="http://i3.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-mid.jpg" size="mid" width="500" height="685"/>
          <image url="http://i1.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-cover.jpg" size="cover" width="185" height="253"/>
          <image url="http://i1.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-thumb.jpg" size="thumb" width="92" height="126"/>
        </thumb>

I've tried several third-party regex testers and evaluators, and they all fail to find a match, but only because of the carriage returns. If I remove those so the text is all on a single line, the regex matches. What puzzles me is that the original expression text from the scraper file also fails in those same regex testers.

So my question: does PCRE work across multiple lines? And if so, why doesn't it work with my regex? Is it just an oddity of PCRE?
#2
PCRE should work across multiple lines, yes; there's a flag we specify. I'm not sure why it's not matching your regexp.
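
To illustrate why a flag matters here, below is a minimal sketch using Python's `re` module as a stand-in for PCRE (an assumption: the scraper itself uses PCRE, and this thread does not say which flag it sets). By default `.` does not match a newline, so `.*?` cannot step from `<thumb>` onto the next line. In PCRE the option that changes this is `PCRE_DOTALL`, or `(?s)` inline; Python's equivalent is `re.DOTALL`. The names `PATTERN`, `SAMPLE` and `first_url` are made up for this example, and the sample is shortened to two of the four image lines.

```python
import re

# Expression from post #1, unchanged.
PATTERN = r'<thumb([^>]*)>.*?url="([^"]*)".*?size="original".*?</thumb>'

# Shortened copy of the input from post #1, line breaks preserved.
SAMPLE = """<thumb>
          <image url="http://i2.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-original.jpg" size="original" width="1418" height="1944"/>
          <image url="http://i3.themoviedb.org/posters/42f/4bc95005017a3c57fe02342f/antichrist-mid.jpg" size="mid" width="500" height="685"/>
        </thumb>"""


def first_url(pattern, text, flags=0):
    """Return the captured URL (group 2), or None if the pattern does not match."""
    match = re.search(pattern, text, flags)
    return match.group(2) if match else None


# Default behaviour: '.' stops at a newline, so nothing matches.
without_dotall = first_url(PATTERN, SAMPLE)

# With the dot-matches-newline flag, '.*?' can cross line breaks.
with_dotall = first_url(PATTERN, SAMPLE, re.DOTALL)

# The same flag written inline, which many regex testers accept.
inline_flag = first_url("(?s)" + PATTERN, SAMPLE)

print(without_dotall)
print(with_dotall)
print(inline_flag)
```

If a third-party tester has no "dot matches newline" (sometimes labelled "single line") option enabled, it would behave like the first call, which may be why the expression only matched once the input was flattened onto one line.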
#3
Actually, I realised afterwards that mine was a stupid post. It's scraping websites with multiple lines, isn't it, so of course it must work with carriage returns.

Oh well, I guess I will just put it down to an oddity of PCRE and move on.

Thanks