Kodi Community Forum

Full Version: Scraping inconsistency scrap.exe/xbmc?
Pages: 1 2
I'm making a scraper for AsianDB.com. It seems to work flawlessly under scrap.exe, but XBMC misses a lot of info it retrieves. Here's an example details XML output:

Code:
<details>
    <title>Violent Cop</title>
    <year>1989</year>
    <director>Takeshi Kitano</director>
    <runtime>103mins</runtime>
    <thumb>http://www.asiandb.com/data/title/mini/4141.jpg</thumb>
    <rating>7</rating>
    <votes>3</votes>
    <genre>Action</genre>
    <genre>Crime</genre>
    <credits>Takeshi Kitano</credits>
    <credits>Hisashi Nozawa</credits>
    <actor>
        <name>Takeshi Kitano</name>
    </actor>
</details>

XBMC doesn't extract the director, genre, credits (correct way to enter writers?) and actors, but does get all other items.

Is there a bug in my XML output? (Note: pretty-printed for readability, no extra whitespace in actual XML)

Also, pressing X+Y during boot did get me into debug mode, but it didn't tell me much about the scraping process. Is there a method (like in the old days :D) to set the debug level to 'insane' or similar?

Thanks for any help you can give,

ezd
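One quick way to rule out a malformed-XML problem is to run the details output through a generic XML parser. The sketch below uses Python's standard library on the sample above; it only checks that the document is well-formed and lists what a generic parser sees, and it does not reproduce how XBMC itself reads the details (the `fields` dictionary is just an illustrative helper, not part of any scraper API).

```python
import xml.etree.ElementTree as ET

# The details output from the post above, pretty-printed as shown there.
DETAILS = """<details>
    <title>Violent Cop</title>
    <year>1989</year>
    <director>Takeshi Kitano</director>
    <runtime>103mins</runtime>
    <thumb>http://www.asiandb.com/data/title/mini/4141.jpg</thumb>
    <rating>7</rating>
    <votes>3</votes>
    <genre>Action</genre>
    <genre>Crime</genre>
    <credits>Takeshi Kitano</credits>
    <credits>Hisashi Nozawa</credits>
    <actor>
        <name>Takeshi Kitano</name>
    </actor>
</details>"""

# fromstring raises xml.etree.ElementTree.ParseError if the XML is malformed.
root = ET.fromstring(DETAILS)

# Collect every top-level field; repeated tags (genre, credits) become lists.
fields = {}
for child in root:
    # <actor> wraps its value in a nested <name> element.
    value = child.findtext("name") if child.tag == "actor" else child.text
    fields.setdefault(child.tag, []).append(value)

for tag, values in fields.items():
    print(tag, values)
```

If this parses cleanly and the director, genre, credits and actor entries all show up, the output XML is probably not the culprit and the problem is more likely on the scraper-definition side.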
For reference, I've upped the current asiandb.xml to pastebin.
i'm at a conference this week, so this post is only to say that i cannot see anything wrong at first glance. i hardly have any internet access, so i have to wait until i get back home to investigate.
Thanks for the heads up, no hurry here, mostly did this for the Greater XBMC Good :)

Enjoy your conference!
before each of those you have regexps that grab the relevant pieces of the html. on those you don't specify 'noclean="1"', and hence all html tags are stripped off. i guess scrap.exe doesn't honor this.
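The effect described above can be illustrated outside XBMC with a small two-stage extraction in Python. This is only a sketch of the idea, not the scraper engine: the HTML fragment, the `grab_block` and `extract_genres` helpers and the `clean` flag are all made up for illustration, and the real AsianDB markup and the real scraper expressions will differ.

```python
import re

# Hypothetical HTML fragment; the real AsianDB page markup may look different.
PAGE = '<div class="genres"><a href="/g/1">Action</a><a href="/g/2">Crime</a></div>'

# Matches any HTML tag, used to imitate the default "clean" step.
TAG = re.compile(r"<[^>]+>")


def grab_block(html, clean):
    """Stage 1: capture the genre block, optionally stripping its HTML tags."""
    match = re.search(r'<div class="genres">(.*?)</div>', html, re.S)
    block = match.group(1) if match else ""
    return TAG.sub("", block) if clean else block


def extract_genres(block):
    """Stage 2: pull each genre out of the block; this relies on the <a> tags."""
    return re.findall(r"<a[^>]*>([^<]+)</a>", block)


# With cleaning, stage 2 receives "ActionCrime" and has no tags left to match.
cleaned = extract_genres(grab_block(PAGE, clean=True))
# Without cleaning (the noclean case), the tags survive and stage 2 works.
raw = extract_genres(grab_block(PAGE, clean=False))

print(cleaned)  # []
print(raw)      # ['Action', 'Crime']
```

A tool that never strips tags would behave like the second call in both cases, which would be consistent with the scraper looking fine in scrap.exe while XBMC comes up empty for the same fields.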
any progress on this scraper?
i really need this one. :D
then i suggest you finish it
wise-ass... if i could, don't you think i would?
some people have taught themselves programming skills, others artistic skills.
it doesn't take programming skills. that's the whole reason i created the scraper system. it only takes some logic and reading a 10-minute regexp guide.
if you think it's that easy for everyone, then why is 'ezd' having problems with it?
I'm pretty much code-blind, but if you (or anyone else) could give a little help i might give it (another) try.
ezd had made a simple screwup, which i explained.

i'll answer specifics.
sorry for the triple post, editing posts doesn't seem to work for me for some reason.
a mod can combine/delete the posts if they feel the need.

the changes i made were pretty much only adding noclean="1" in the right places.
i also tried that with the stuff that's still not working (tagline, plot, cast, MPAA rating) but that didn't change anything.
so either i edited those wrong or something else is wrong that i'm missing.

-blaize
how the heck did you manage? the 2 posts are 10 (TEN) minutes apart!
i know, i went back a page (history), but because i'm walking back and forth between my PC and box i got confused and thought i pressed edit (still can't find that button >_>)
that's how i reposted it.

Actors are working now; it was a stupid typo :S
Sorry for the slow reply, I've been away for a while. Thanks, Spiff, for your reply. I still had a strange problem with XBMC hanging when I enabled plot extraction, but it's gone in the new build, so I've uploaded the scraper to SourceForge for inclusion.

For those in a hurry to enable Asiandb:

https://sourceforge.net/tracker/?func=de...tid=581840

Cheers,

ezd