2009-04-18, 11:06
Posted here as I could not decide which forum was appropriate. Feel free to move.
I have some questions about XBMC's specific implementation of regex.
1. I was certain XBMC didn't support advanced metacharacters, e.g. \d
However, this wiki page is full of them:
http://wiki.xbmc.org/?title=Regular_Expr...)_Tutorial
Is this just a work in progress? If so, it really should carry a big warning, as almost every example would be inappropriate for XBMC.
Edit: If XBMC really is fully PCRE compatible, these examples are perfect.
Edit: Confirmed that XBMC now supports regex metacharacters it previously did not, such as \d
2. I thought that TV show matching only needed two () matches; one for season and one for episode.
However the default set listed on the wiki here:
http://wiki.xbmc.org/?title=Advancedsett...atching.3E
e.g. <regexp>\[[Ss]([0-9]+)\]_\[[Ee]([0-9]+)([^\\/]*)</regexp> <!-- foo_[s01]_[e01] -->
has 3 capture groups. Why 3?
Edit: Confirmed via here and IRC (thanks cpt) that the 3rd group is for two-part episode matching.
3. This expression appears all over the place at the end of most TV matching regexes: [^\\/]*
I understand what it means, but why is such a greedy match needed?
Edit: Explained by the answer to question 2.
4. Several forum examples by XBMC devs use case-sensitive matching, e.g. [a-zA-Z].
I was under the impression that strings were converted to lower case by the code before being regex matched?
Edit: XBMC has been improved over time. You no longer need to care about case at all, since everything is forced to lower case prior to a match.
5. Can someone point me at an explanation of two-part matching? I was sure there was a guide somewhere, but I just can't find it now.
Edit: The only real docs on this are in the source comments.