Scraping inconsistency scrap.exe/xbmc?
#1
I'm making a scraper for AsianDB.com. It seems to work flawlessly under scrap.exe, but XBMC misses a lot of info it retrieves. Here's an example details XML output:

Code:
<details>
    <title>Violent Cop</title>
    <year>1989</year>
    <director>Takeshi Kitano</director>
    <runtime>103mins</runtime>
    <thumb>http://www.asiandb.com/data/title/mini/4141.jpg</thumb>
    <rating>7</rating>
    <votes>3</votes>
    <genre>Action</genre>
    <genre>Crime</genre>
    <credits>Takeshi Kitano</credits>
    <credits>Hisashi Nozawa</credits>
    <actor>
        <name>Takeshi Kitano</name>
    </actor>
</details>

XBMC doesn't extract the director, genre, credits (correct way to enter writers?) and actors, but does get all other items.

Is there a bug in my XML output? (Note: pretty-printed for readability, no extra whitespace in actual XML)

Also, pressing X+Y during boot did get me into debug mode, but that didn't tell me much about the scraping process. Is there a method (like in the old days) to set the debug level to 'insane' or similar?

Thanks for any help you can give,

ezd
#2
For reference, I've upped the current asiandb.xml to pastebin.
#3
i'm at a conference this week, so this post is only to say that i cannot see anything wrong at first glance. i hardly have internet access, so i have to wait until i get back home to investigate.
#4
Thanks for the heads up, no hurry here, mostly did this for the Greater XBMC Good.

Enjoy your conference!
#5
before each of those you have regexps that grab the relevant pieces of the html. on those you don't specify 'noclean="1"' and hence all html tags are stripped off. i guess scrap.exe doesn't honor this.
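
A minimal sketch of the idea, not taken from asiandb.xml: the buffer numbers, the class name, and both patterns are hypothetical and only show where noclean="1" would go, assuming the usual nested RegExp layout of XBMC scrapers.

Code:
<GetDetails dest="3">
    <RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
        <!-- Step 1: copy a chunk of the page into buffer 6 (hypothetical pattern). -->
        <!-- noclean="1" asks XBMC to leave capture group 1 as-is, tags included. -->
        <RegExp input="$$1" output="\1" dest="6">
            <expression noclean="1">&lt;td class="director"&gt;(.*)&lt;/td&gt;</expression>
        </RegExp>
        <!-- Step 2: this pattern depends on the <a> tags surviving step 1. -->
        <RegExp input="$$6" output="&lt;director&gt;\1&lt;/director&gt;" dest="5+">
            <expression>&lt;a[^&gt;]*&gt;([^&lt;]*)&lt;/a&gt;</expression>
        </RegExp>
        <!-- Buffer 5 now holds XML tags, so the outer match must not be cleaned either. -->
        <expression noclean="1"></expression>
    </RegExp>
</GetDetails>

Without noclean="1" in step 1, the chunk would reach step 2 with its tags already stripped, so the second pattern would find nothing and the field would come out empty.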
#6
any progress on this scraper?
i really need this one.
#7
then i suggest you finish it
#8
wise-ass... if i could, don't you think i would?
some people have taught themselves programming skills, others artistic skills.
#9
it doesn't take programming skills. that's the whole reason i created the scraper system. it only takes some logic and reading a 10 min regexp guide.
#10
if you think it's that easy for everyone, then why is 'ezd' having problems with it?
I'm pretty much code-blind, but if you (or anyone else) could give a little help i might give it (another) try.
#11
ezd made a simple screwup, which i explained.

i'll answer specifics.
#12
sorry for the triple post, editing posts doesn't seem to work for me for some reason.
a mod can combine/delete the posts if they feel the need.

the changes i made were pretty much only adding noclean="1" in the right places.
i also tried that with the stuff that's still not working (tagline, plot, cast, MPAA rating) but that didn't change anything.
so either i edited those wrong or something else is wrong that i'm missing.

-blaize
#13
how the heck did you manage ? the 2 posts are 10 (TEN) minutes apart!
#14
i know, i went back a page (history), but because i'm walking back and forth between my PC and box i got confused and thought i pressed edit (still can't find that button >_>).
that's how i reposted it.

Actors are working now, it was a stupid typo.
#15
Sorry for the slow reply, I've been away for a while. Thanks, Spiff, for your reply. I still had a strange problem with XBMC hanging when I enabled plot extraction, but it doesn't occur in the new build, so I've uploaded the scraper to Sourceforge for inclusion.

For those in a hurry to enable Asiandb:

https://sourceforge.net/tracker/?func=de...tid=581840

Cheers,

ezd