2011-04-09, 15:12
Hi
I am trying to use the existing TMDB scraper for my movie collection. Unfortunately, it seems to send the whole file name to the search API, and all my files are named like "<Director Name> - <Title> [part1-2] (<Year>)". So I need to change the regexps a little to match my file names. I found a guide here:
http://wiki.xbmc.org/index.php?title=HOW...s_guide%29
Unfortunately, it is very difficult to understand from that guide how to work with existing scrapers.
Here is an excerpt from the TMDB scraper:
Code:
<CreateSearchUrl dest="3">
    <RegExp input="$$1" output="&lt;url&gt;http://api.themoviedb.org/2.1/Movie.search/$INFO[language]/xml/57983e31fb435df4df77afb854740ea9/\1&lt;/url&gt;" dest="3">
        <RegExp input="$$2" output="+\1" dest="4">
            <expression clear="yes">(.+)</expression>
        </RegExp>
        <expression noclean="1"/>
    </RegExp>
</CreateSearchUrl>
The regexp I need for my files is simple and obvious:
Code:
.+\s-\s(.+)\s(part[1-9]\s)?\(.+\)
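To check the pattern outside XBMC, here is a small Python sketch. It assumes the scraper's regex engine handles greedy and lazy quantifiers the way Python's `re` module does. The file names, the `LAZY` variant and the helper `title_and_part` are made up for illustration and are not part of the scraper. In this sketch the posted pattern lets the greedy title group swallow the part marker, while a lazy title group keeps the two apart.

```python
import re

# Pattern exactly as posted above.
POSTED = re.compile(r".+\s-\s(.+)\s(part[1-9]\s)?\(.+\)")
# Hypothetical variant: the title group is lazy, so the optional
# part marker can land in its own group.
LAZY = re.compile(r".+\s-\s(.+?)\s(part[1-9]\s)?\(.+\)")


def title_and_part(pattern, filename):
    """Return (title, part) as captured by pattern, or None if there is no match."""
    m = pattern.match(filename)
    if m is None:
        return None
    part = m.group(2).strip() if m.group(2) else None
    return m.group(1), part


if __name__ == "__main__":
    # Made-up file names in the "<Director Name> - <Title> [part1-2] (<Year>)" format.
    for name in ("Ridley Scott - Alien part1 (1979)",
                 "Ridley Scott - Blade Runner (1982)"):
        print(name)
        print("  posted:", title_and_part(POSTED, name))
        print("  lazy:  ", title_and_part(LAZY, name))
```

With the posted pattern the first sample yields the title "Alien part1" and no part; with the lazy variant it yields "Alien" and "part1". Whether the part marker should reach the search API at all is a separate decision.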
I just don't get how the inner regexp in the scraper is connected to the outer one: the inner one works with buffers 2 and 4, and the outer one works with buffers 1 and 3. So to me they shouldn't be related at all...
Furthermore, the inner regexp is (.+), which should match the whole string, and its output is +\1. Does that mean it just adds "+" in front of the string?
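To make my reading of the inner regexp concrete, here is a Python sketch of what I think a single RegExp element does. It assumes the output template substitutes `\1` the way Python's `Match.expand` does. The helper `apply_regexp` and the value "1979" are hypothetical stand-ins, since I don't know what buffer 2 actually holds, and the empty result on a failed match is also an assumption.

```python
import re


def apply_regexp(expression, output, buffer_value):
    """Mimic one RegExp element: match the expression, then fill in the output template."""
    m = re.search(expression, buffer_value)
    if m is None:
        # Assumption: nothing is written to the destination when there is no match.
        return ""
    return m.expand(output)


if __name__ == "__main__":
    # "1979" is a made-up stand-in for the contents of buffer 2.
    print(apply_regexp(r"(.+)", r"+\1", "1979"))  # prints +1979
```

If that reading is right, buffer 4 would end up holding the contents of buffer 2 with "+" prepended, or nothing at all when buffer 2 is empty.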
Could someone please explain how I can modify the scraper above to use my regexp?