Kodi Community Forum

Full Version: IMDB scraper issue
My folder format is, for example, [Ice Age - The Meltdown (2006)]. When the IMDb scraper runs on this, it always picks up http://www.imdb.com/title/tt0795398/, which is actually the video game. Why isn't it picking up the correct title, which is almost a 100% match? Also, why does the scraper pick up video game titles at all? Shouldn't it be ignoring those entries?
Don't you have an nfo with the above link in it?
NFOs are only created after exporting the library. I'm talking about the first initial scan, or a new scan with newly added content.
What happens if you refresh the movie? Do you see the correct one in the list?
Yes, I do see the correct one, right after the first option. If it didn't take video games (VG) into consideration, it would no doubt scrape the correct one.
And I don't know how I could name it any better, considering "Ice Age 2: The Meltdown" is the game and "Ice Age: The Meltdown" is the movie. My folder/file name doesn't have the 2, so it should be a better match for the movie name, in my opinion. While manual work is possible, this does not happen with just one movie, some movies may get overlooked, and I do frequently start a scan from scratch from time to time.
So use export then...
If I export, it's just going to export what it has already been scraped as. I don't think you understand the issue I'm raising: whether this is a bug or not.
I understand. I just don't see why you are complaining if very few movies fail to scrape. I also don't see why it makes sense to always re-scrape everything from scratch (with this approach you are also hitting the content providers quite hard, by the way).

...and finally, you can always place an nfo file containing only the correct IMDb link next to the movie, and the scraper will use it to scrape.

Let me know if you find a good and safe way to filter out VG from the IMDb search and I will have a look at it.
I never meant to seem like I'm complaining (if I were complaining, I would have searched for and fixed it myself instead of raising the issue here). I'm raising an issue, which is a completely different thing. I asked a question about behaviour that may turn out to be a bug. And if the cause is an overlooked regular expression that is easily fixable, then maybe a dev of the IMDb scraper can attend to it for the next release. After all, XBMC is as fluid as it is; it only seems right for add-ons and extras to follow suit.

I will also have a look into the nfo file routine. The reason I don't stick with NFOs is that I've had situations where I decided to change scrapers, which requires the NFOs to be removed. Hence starting again.

I'll have a look into the code for the IMDb scraper later today and get back to you. I was hoping someone else might have noticed this and had their own fix to share.
It's not a regular expression and/or scraper issue. XBMC core sorts the search results returned by the scraper on its own, trying to list the best match first.

I am not asking you to look at the IMDb scraper code. The scraper obviously uses the IMDb search engine. I meant that if you can find a _good_ way on the IMDb site to filter out VG, then I will make sure the scraper gets updated with your finding.
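
To illustrate the "best match first" idea, here is a minimal sketch in Python. It is not XBMC's actual sorting code, which is not shown in this thread and may behave differently; the normalization rules and the use of difflib are assumptions made purely for illustration.

```python
import difflib
import re


def normalize(title):
    """Drop a trailing (year), lowercase, and reduce punctuation to single spaces."""
    title = re.sub(r"\(\d{4}\)\s*$", "", title)
    title = re.sub(r"[^a-z0-9]+", " ", title.lower())
    return title.strip()


def rank_results(query, candidates):
    """Return candidates sorted from most to least similar to the query (hypothetical helper)."""
    wanted = normalize(query)

    def score(candidate):
        return difflib.SequenceMatcher(None, wanted, normalize(candidate)).ratio()

    return sorted(candidates, key=score, reverse=True)


# Titles taken from this thread; the result order mimics the game being returned first.
results = ["Ice Age 2: The Meltdown", "Ice Age: The Meltdown"]
ranked = rank_results("Ice Age - The Meltdown (2006)", results)
print(ranked[0])
```

With this particular normalization the folder name and the movie title become identical strings, so the movie is ranked first. A different scoring scheme could order them differently.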

Well, I made my own scraper for IMDb about a month ago, for a script that was executed after a copy process on my system. My script used curl, plus regular expressions on the returned HTML, to extract the required information. Since I haven't read the code, I can only assume the XBMC scraper uses the same kind of process. Since IMDb doesn't have a public API, I assume the information is gathered via curl or similar (especially considering that scrapers break when the HTML layout changes). For example, on http://www.imdb.com/title/tt0303016/ you can see that the title says video game, and the same goes for http://www.imdb.com/title/tt0795398/. It is also shown as (VG) in a search result.
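
A rough sketch of what filtering on that marker could look like, in Python. The (id, label) result format is an assumption, the helper name is made up, and the movie's ID below is a placeholder rather than a real IMDb ID. This is not how the XBMC scraper works; it only shows the idea.

```python
import re

# Matches "(VG)" as shown in search results and "(Video Game 2006)"-style
# markers as shown in page titles. The exact marker text is an assumption.
VG_MARKER = re.compile(r"\((?:VG|Video Game)\b[^)]*\)", re.IGNORECASE)


def drop_video_games(results):
    """Keep only (imdb_id, label) pairs whose label has no video game marker."""
    return [(imdb_id, label) for imdb_id, label in results
            if not VG_MARKER.search(label)]


results = [
    ("tt0795398", "Ice Age 2: The Meltdown (2006) (VG)"),  # the game from this thread
    ("tt0000000", "Ice Age: The Meltdown (2006)"),         # placeholder ID for the movie
]
kept = drop_video_games(results)
print(kept)
```

This only works as long as every video game entry actually carries such a marker, so it is fragile in the same way as any other HTML-scraping approach.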
Oh, and on top of that, there is an advanced search feature which can exclude video games.
Also, if I have my script create an nfo file within the movie folder, do I only need to have
Code:
<movie>
   <id>(MOVIE_ID)</id>
</movie>

Then when it scans, will it get the movie and the rest of the details automatically?
Unfortunately not - it's a bit messy. You need the IMDb URL. IIRC we mainly match on tt######, but there may be more to it than that, so just drop in the whole URL to be sure.
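
For a script that generates these files, a minimal Python sketch could look like the following. The file name movie.nfo, the helper names, and the placeholder ID are assumptions; the URL pattern follows the links posted earlier in this thread.

```python
import re
import tempfile
from pathlib import Path

IMDB_ID = re.compile(r"tt\d+")  # the "tt######" part mentioned above


def imdb_url(imdb_id):
    """Build a full IMDb title URL from an ID such as 'tt0000000'."""
    if not IMDB_ID.fullmatch(imdb_id):
        raise ValueError("not an IMDb title ID: %r" % imdb_id)
    return "http://www.imdb.com/title/%s/" % imdb_id


def write_url_nfo(movie_dir, imdb_id, name="movie.nfo"):
    """Write an nfo file containing only the IMDb URL (file name is an assumption)."""
    path = Path(movie_dir) / name
    path.write_text(imdb_url(imdb_id) + "\n", encoding="utf-8")
    return path


# Demo in a throwaway directory; "tt0000000" is a placeholder, not a real ID.
with tempfile.TemporaryDirectory() as tmp:
    nfo = write_url_nfo(tmp, "tt0000000")
    content = nfo.read_text(encoding="utf-8")
print(content)
```

Writing the whole URL rather than a bare ID follows the advice above to "drop in the whole URL to be sure".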
(2012-11-18, 01:57)edhen Wrote: Oh, and on top of that, there is an advanced search feature which can exclude video games.

Try to search for 'Ice Age Meltdown' with advanced search...