Regex to truncate after a marker problem
#1
Hello,

Please help, I've stared at this thing for hours now. Not even sure if it's my regex dumbness or a ScraperEdit bug.

I'm trying to write CreateSearchUrl regexps for movies with differing filename formats, such as:
Eat Drink Man Woman.mp4
The Palm Beach Story (HQ) arte 2015-02-16 20h15.mov
Ben Hur (1959).mov
Chicago (2002) [Musical].mov

Ideally, the regex would cut the title off at the markers "(HQ)" and "[", dropping the marker and everything after it. (A bare "(" is no good as a marker, because the title would then lose the year.)
What I have tried so far in ScraperEdit (here for "(HQ)"):

Negative lookahead
Code:
(.+)(?! \(HQ\).+)(\..{3,})
but \1 is greedy, and making it lazy doesn't help; (.+)*?(?! \(HQ\)… also returns the full filename "The Palm Beach Story (HQ) arte 2015-02-16 20h15.mov"
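
To illustrate why the lookahead does not truncate, here is a minimal sketch using Python's re module as a stand-in. That is an assumption: ScraperEdit and Kodi use their own regex engine, which may behave differently. The helper name first_group is made up for this example. The lookahead is only tested at the single position where \1 ends, so it stops \1 from ending directly in front of " (HQ)", but it does not stop \1 from containing " (HQ)".

```python
import re

# Pattern copied from the post. Python's re is only a stand-in for ScraperEdit.
LOOKAHEAD_PATTERN = re.compile(r"(.+)(?! \(HQ\).+)(\..{3,})")


def first_group(filename):
    """Return group 1 of the lookahead pattern, or None if there is no match."""
    match = LOOKAHEAD_PATTERN.match(filename)
    return match.group(1) if match else None


name = "The Palm Beach Story (HQ) arte 2015-02-16 20h15.mov"

# Group 1 backtracks only until the extension can match. At that position the
# next characters are ".mov", not " (HQ)", so the negative lookahead succeeds
# and the marker stays inside group 1.
print(first_group(name))
```

In this sketch group 1 comes back as everything before ".mov", marker included, which matches what the post describes.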

If statement
Code:
(?(Marker)(TitleBeforeMarker)|(FullTitle))
but apparently the if-clause is not supported; even regex101.com says "(?" is an invalid group structure.

The apparently simplest approach
Code:
(.+)( \(HQ\).+)?(\..{3,})
also has the greedy problem.
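
One way around the greedy problem, sketched here in Python's re module (an assumption, since ScraperEdit's engine may differ), is to make the title lazy, keep the marker group optional, and anchor the extension to the end of the string. The anchor forces the lazy title to grow until either a marker or the extension follows. The names TITLE_PATTERN and search_title are hypothetical, and the extension is written as \.[^.]{3,} rather than \..{3,}, which is a small deviation from the patterns above.

```python
import re

TITLE_PATTERN = re.compile(
    r"^(.+?)"                 # title, lazy so it stops at the first marker
    r"(?: \(HQ\).*| ?\[.*)?"  # optional marker "(HQ)" or "[" plus what follows
    r"(\.[^.]{3,})$"          # extension, anchored to the end of the filename
)


def search_title(filename):
    """Return the title with marker and extension removed, or None."""
    match = TITLE_PATTERN.match(filename)
    return match.group(1) if match else None


for name in (
    "Eat Drink Man Woman.mp4",
    "The Palm Beach Story (HQ) arte 2015-02-16 20h15.mov",
    "Ben Hur (1959).mov",
    "Chicago (2002) [Musical].mov",
):
    print(search_title(name))
```

With the four sample filenames this yields "Eat Drink Man Woman", "The Palm Beach Story", "Ben Hur (1959)" and "Chicago (2002)" in Python. Whether the same pattern behaves identically in ScraperEdit would need to be checked there.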

Either|Or, searching for the truncated version first
Code:
(.+)( \(HQ\).+)(\..{3,})|(.+)(\..{3,})
This returns only the title, in either \1 or \4, but
simply combining both in the output
Code:
<url>http://........?q=\1\4</url>
doesn't work, I guess because one of \1 and \4 is always empty.
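
For comparison, this is how the same either/or pattern behaves in Python's re module. It is only a sketch of the idea and not a statement about ScraperEdit, whose handling of groups that took no part in the match I cannot confirm. In Python such groups are None, so the caller has to pick whichever one is set. The function name title_from_either_branch is made up for this example.

```python
import re

# Either/or pattern copied from the post: truncating branch first, plain second.
EITHER_OR = re.compile(r"(.+)( \(HQ\).+)(\..{3,})|(.+)(\..{3,})")


def title_from_either_branch(filename):
    """Return the title from whichever branch of the alternation matched."""
    match = EITHER_OR.match(filename)
    if match is None:
        return None
    # Only one branch takes part in a match; the other branch's groups are None.
    return match.group(1) or match.group(4)


print(title_from_either_branch("The Palm Beach Story (HQ) arte 2015-02-16 20h15.mov"))
print(title_from_either_branch("Eat Drink Man Woman.mp4"))
```

Here the first filename is handled by the truncating branch (group 1) and the second by the plain branch (group 4).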

So I split it into two regexps.
The first to execute does no truncation and has "clear destination" checked:
Code:
(.+)(\..{3,})
It is followed by the regex with truncation, with "clear destination" unchecked:
Code:
(.+)( \(HQ\).*)(\..{3,})
Both give only the desired title in \1. If the second regexp finds something to truncate, its result should overwrite the first. But in ScraperEdit I get a discrepancy between what "find matches" returns for the truncating expression and the URL created in the debugger. Running on "The Palm Beach Story (HQ) arte 2015-02-16 20h15.mov", "find matches" returns the desired truncated "The Palm Beach Story" as \1, but the debugger does not truncate and produces the URL "…?q=The+Palm+Beach+Story+%28HQ%29+arte+2015-02-16+20h15".

Where am I going wrong?
#2
ScraperEdit is broken, at least in my configuration.

Some regexps that didn't work in it did work in Kodi.