Solved Regex are greedy? / extracting content from multiple lines
#1
Hi everyone.

I've started building a movie scraper, and extracting content from a single line like this works:
html:
<h1 class="text-serif">LORD OF THE RINGS</h1>

However, extraction of content from multiple lines does not work as expected.
html:
        <a href="/person/filme/2358">
            Ralph Bakshi
        </a>
Corresponding regex, not escaped:
Code:
<a href="/person/filme/.*">(.*)</a>

That regex does not stop at the first closing </a> tag; it includes much more text, up to some later closing </a> tag.
So I guess the regex is greedy. Is there a way I can change that?
Alternatively, is there a way I can match whitespace and line breaks in the regex? I wasn't successful when I tried.

Thank you,
Ben
#2
Use .*? (the non-greedy form of .*) to stop at the first match.
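A minimal sketch of the difference, using Python's re module purely for illustration (the scraper's own regex engine may handle line breaks and flags differently, so treat this as an assumption). The second link in the sample HTML is made up for the demo. re.DOTALL lets . match line breaks, and \s* absorbs the whitespace around the name.

```python
import re

# Sample input: the first link is from the question, the second is hypothetical.
html = """
        <a href="/person/filme/2358">
            Ralph Bakshi
        </a>
        <a href="/person/filme/1234">
            Some Other Person
        </a>
"""

# Greedy: .* grabs as much as it can, so one match runs to the LAST </a>.
greedy = re.compile(r'<a href="/person/filme/.*">(.*)</a>', re.DOTALL)

# Non-greedy: .*? grabs as little as it can, so each match ends at the
# first </a> that follows. \s* skips the line breaks and indentation.
lazy = re.compile(r'<a href="/person/filme/.*?">\s*(.*?)\s*</a>', re.DOTALL)

greedy_hits = greedy.findall(html)  # both links collapse into a single match
names = lazy.findall(html)          # one clean name per link

print(len(greedy_hits))
print(names)
```

The greedy pattern yields a single match spanning both links, while the non-greedy one returns each name on its own with the surrounding whitespace trimmed.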
#3
Great tip! Thanks a lot!
(I'll open another thread for my next question)
#4
Thread marked solved.