2012-04-29, 08:48
Hi,
I'm currently working on a music video scraper that retrieves data from Last.fm and MusicBrainz, and fanart from HTBackdrops. It's almost complete. I have only one small problem: I can't access the "fanart" tab, even though the debugger shows that the fanart is clearly found and correctly parsed.
Here is the XML (I hid the real URL, because it's only a test server and can't handle a lot of requests):
Code:
<?xml version="1.0" encoding="utf-8"?><scraper framework="1" date="2012-01-18" name="Last.fm Music Video Scraper" content="musicvideos" thumb="icon.png" language="en">
<CreateSearchUrl dest="3">
<RegExp input="$$1" output="<url>http://ws.audioscrobbler.com/2.0/?method=track.search&track=\1&api_key=b25b959554ed76058ac220b7b2e0a026</url>" dest="3">
<expression noclean="1" />
</RegExp>
</CreateSearchUrl>
<GetSearchResults dest="3">
<RegExp input="$$5" output="<results>\1</results>" dest="3">
<RegExp input="$$8" output="<entity>\1</entity>" dest="5">
<RegExp input="$$1" output="<title>\2 - \1</title>" dest="8">
<expression><track>\s*<name>([^<]*)</name>\s+<artist>([^<]*)</artist></expression>
</RegExp>
<RegExp input="$$1" output="<url>http://xxxxx.com/scraper/search.php?artist=\2&track=\1</url>" dest="8+">
<expression encode="1,2"><track>\s*<name>([^<]*)</name>\s+<artist>([^<]*)</artist></expression>
</RegExp>
<expression repeat="yes" noclean="1" />
</RegExp>
<expression noclean="1" />
</RegExp>
</GetSearchResults>
<GetDetails dest="3">
<RegExp input="$$5" output="<details>\1</details>" dest="3">
<RegExp input="$$1" output="\1" dest="7">
<expression noclean="1"><artist>(.*)</artist></expression>
</RegExp>
<RegExp input="$$1" output="<title>\1</title>" dest="5">
<expression><track>(.*)</track></expression>
</RegExp>
<RegExp input="$$1" output="<artist>\1</artist>" dest="5+">
<expression><artist>(.*)</artist></expression>
</RegExp>
<RegExp input="$$1" output="<year>\1</year>" dest="5+">
<expression><year>(.*)</year></expression>
</RegExp>
<RegExp input="$$1" output="<album>\1</album>" dest="5+">
<expression><album>(.*)</album></expression>
</RegExp>
<RegExp input="$$1" output="<thumb>\1</thumb>" dest="5+">
<expression><thumb>(.*)</thumb></expression>
</RegExp>
<RegExp input="$$1" output="<genre>\1</genre>" dest="5+">
<expression repeat="yes"><genre>(.*)</genre></expression>
</RegExp>
<RegExp input="$$1" output="<plot>\1</plot>" dest="5+">
<expression><plot>(.*)</plot></expression>
</RegExp>
<RegExp input="$$7" output="<chain function="GetHTBFanart">\1</chain>" dest="5+">
<expression/>
</RegExp>
<expression noclean="1" />
</RegExp>
</GetDetails>
<GetHTBFanart dest="5">
<RegExp input="$$1" output="<details><url function="ParseHTBFanart" post="yes" cache="htb-images-\1.xml">http://htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/searchXML?keywords=\1&default_operator=and&aid=1,5</url></details>" dest="5">
<expression noclean="1" />
</RegExp>
</GetHTBFanart>
<ParseHTBFanart dest="5">
<RegExp input="$$13" output="<details><fanart>\1</fanart></details>" dest="5">
<RegExp input="$$1" output="<thumb preview="http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/\1/thumbnail">http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/\1/fullsize</thumb>" dest="13">
<expression repeat="yes" noclean="1"><id>([^<]+)</id>\n[^<]+<aid>1</aid></expression>
</RegExp>
<expression noclean="1">(.+)</expression>
</RegExp>
</ParseHTBFanart>
</scraper>
Here is what comes out of the debugger for the video "The Veronicas - Untouched":
Code:
02:31:13 T:8304 DEBUG: scraper: GetDetails returned <details><artist>The Veronicas</artist><album>Untouched</album><thumb>http://userserve-ak.last.fm/serve/300x300/58306283.png</thumb><genre>Pop / Female Vocalists / Dance / The Veronicas / Australian</genre><plot>"Untouched" is the second single by The Veronicas from their sophomore album, Hook Me Up. It was released in December 2007 to Australia. It is also the first single from the same album in North America and Europe. The song is written by Jess and Lisa and also has writing credits from Toby Gad. According to the Untouched Songfacts, the song is "about a long distance relationship and having to interact over the technology of today ."
It peaked at #2 on the Australian Top 50. After being officially released to US radio in April 2008, it took six months before the single took off. By Christmas, the single had been added to Z100, the biggest station in the US. In Canada, it took the single a little longer to take off, but eventually outpeaked the US' peak. The song peaked at #17 in the US, and #5 in Canada. Due to its success, the single is planned to be released in Europe. It's already charted in Finland at #20, and the Czech Republic at #12. The single had also peaked at #9 in New Zealand and #71 in Chile. It is said to be their biggest single world wide and their breakthrough into the US market. The single also became the 12th most played song on American pop radio for the week of February 15, 2009.
User-contributed text is available under the Creative Commons By-SA License and may also be available under the GNU FDL.</plot><chain function="GetHTBFanart">The Veronicas</chain></details>
02:31:13 T:8304 DEBUG: scraper: GetHTBFanart returned <details><url function="ParseHTBFanart" post="yes" cache="htb-images-The Veronicas.xml">http://htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/searchXML?keywords=The Veronicas&default_operator=and&aid=1,5</url></details>
02:31:13 T:8304 DEBUG: FileCurl::Open(0025A8D8) http://htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/searchXML
02:31:13 T:8304 DEBUG: scraper: ParseHTBFanart returned <details><fanart><thumb preview="http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/1682/thumbnail">http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/1682/fullsize</thumb><thumb preview="http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/5525/thumbnail">http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/5525/fullsize</thumb><thumb preview="http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/5526/thumbnail">http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/5526/fullsize</thumb><thumb preview="http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/5527/thumbnail">http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download/5527/fullsize</thumb></fanart></details>
02:31:13 T:8304 DEBUG: Thread CVideoInfoDownloader 8304 terminating
02:31:13 T:8440 DEBUG: VideoInfoScanner: Adding new item to musicvideos:L:\Music Videos\Old - new\The Veronicas - Untouched.avi
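To rule out malformed output, the block returned by ParseHTBFanart can be checked outside XBMC. This is only a minimal Python sketch, not part of the scraper: the helper name `fanart_urls` is made up for illustration, and the sample string is a shortened copy of the log line above (the first two of the four thumbs).

```python
import xml.etree.ElementTree as ET

# Shortened copy of the ParseHTBFanart output from the debug log
# (only the first two <thumb> entries are reproduced here).
BASE = "http://www.htbackdrops.com/api/7681a907c805e0670330c694e788e8e8/download"
details = (
    "<details><fanart>"
    f'<thumb preview="{BASE}/1682/thumbnail">{BASE}/1682/fullsize</thumb>'
    f'<thumb preview="{BASE}/5525/thumbnail">{BASE}/5525/fullsize</thumb>'
    "</fanart></details>"
)


def fanart_urls(details_xml):
    """Return (preview, fullsize) URL pairs found under <details><fanart>.

    Hypothetical helper: ET.fromstring raises ParseError if the block
    is not well-formed XML.
    """
    root = ET.fromstring(details_xml)
    return [(t.get("preview"), t.text) for t in root.findall("./fanart/thumb")]


pairs = fanart_urls(details)
for preview, full in pairs:
    print(preview, "->", full)
```

If this parses and lists the expected pairs, the returned block is at least well-formed, which would suggest the problem lies in how the result is used rather than in the regular expressions.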
So yeah, the scraper clearly finds some fanart, but I can't access it: the "fanart" tab is unclickable.
Do you know where my error is, or what the problem is?
Thanks!
Sam