2009-04-22, 22:01
Hi everyone,
I'm trying to write a scraper but I'm stuck on one step.
I use scrap.exe to test my scraper:
CreateSearchUrl returns the expected result.
GetSearchResults returns the expected result.
The details URL is correct.
But GetDetails returns nothing, with the error: Unable to parse details.xml
Here's my code:
<scraper name="TEST" content="movies" thumb="cinefacts.gif" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" language="de">
	<CreateSearchUrl dest="3">
		<RegExp input="$$1" output="http://www.cinefacts.de/suche/suche.php?name=\1" dest="3">
			<expression noclean="1"/>
		</RegExp>
	</CreateSearchUrl>
	<GetSearchResults dest="8">
		<RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
			<RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\3 \4&lt;/title&gt;&lt;url&gt;http://www.cinefacts.de/kino/\1/\2/filmdetails.html&lt;/url&gt;&lt;/entity&gt;" dest="5">
				<expression repeat="yes">&gt;&lt;a href="/kino/([0-9]*)/(.[^\/]*)/filmdetails.html"&gt;[^&gt;]*(.[^&lt;]*)&lt;/b&gt;&lt;/a&gt;&lt;br&gt;[^&gt;]*[^\t]+\t+[^ ]+[^0-9]+([^&lt;]+)</expression>
			</RegExp>
			<expression noclean="1"/>
		</RegExp>
	</GetSearchResults>
	<GetDetails dest="3">
		<RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
			<!--Title -->
			<RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;" dest="5+">
				<expression trim="1" noclean="1">&lt;h1&gt;([^&lt;]*)</expression>
			</RegExp>
		</RegExp>
	</GetDetails>
</scraper>
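To make clear what I'm aiming for: as far as I understand it, GetDetails should wrap the captured title in a details element and leave something like the following in buffer 3. This is only a sketch of the intended result; the title text is a made-up placeholder, not real output from scrap.exe.

```xml
<!-- Hypothetical target output of GetDetails; "Example Movie Title" is a placeholder -->
<details><title>Example Movie Title</title></details>
```

Since the error says details.xml can't be parsed, I assume whatever the scraper actually produces is not well-formed XML like the above, but I can't see where it goes wrong.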
Maybe someone could take a quick look at this and point me in the right direction.
Thanks so much in advance
Schenk