k_zeon · (This post was last modified: 2011-10-17, 22:30 by k_zeon.)

Hey Eldorado
Just done a test and my addon works a lot faster now.

One thing I did find was that a movie I passed, i.e. A Christmas Carol 1910, was not found, but A Christmas Carol 2009 was, and it used that.
If a movie is not found, will it default to the last movie of that name, or can it be made to return nothing if no movie is found?

It could be that this info is in the db and for some reason it is using this. Not too sure.
I will try to open the db, find the entry, delete it, then try again and see if it gets the wrong data again...

what do you think....

Also just tried
http://api.themoviedb.org/2.1/Movie.sear...11:14+2003
and the info is received correctly in a web browser, but there is no pic in XBMC.. Huh

Do I have to URL-encode the colon, and if so, all other characters that need encoding?
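
On the URL-encoding question, here is a minimal sketch of how a title containing a colon could be escaped before it is appended to a search URL. The base URL and the `build_search_url` helper are placeholders for illustration, not the real TMDB request (which is truncated in the post above), and the thread does not establish that the unencoded colon was what caused the missing picture. `quote_plus` lives in `urllib.parse` on Python 3 and directly in `urllib` on Python 2.

```python
from urllib.parse import quote_plus  # Python 2: from urllib import quote_plus


def build_search_url(base_url, title, year):
    """Append a URL-encoded 'title year' query to base_url (hypothetical helper)."""
    query = '%s %s' % (title, year)
    # quote_plus turns spaces into '+' and reserved characters such as ':' into %XX
    return base_url + quote_plus(query)


# Placeholder base URL, used only for illustration
url = build_search_url('http://example.com/search/', '11:14', 2003)
print(url)
```

Encoding the whole query in one call covers the colon and any other reserved characters, so nothing has to be escaped by hand.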

Eldorado · 2011-10-17, 22:30

I was looking at that one and it is causing me some grief.. there doesn't seem to be much I can do, as IMDB is reporting this as a match, whereas TMDB correctly does not since the years differ. Generally, if no matches are found, it updates the DB with blank information

The only other thing I could do is compare the name of the movie 1 for 1 with what is returned, but that could mean a LOT of skipped movies as most are never perfect.. a nice enhancement would be the ability to refresh meta data for individual movies and show a selection of possible matches, same way the default scrapers do with your local library.. not sure how possible this is though
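
As a rough illustration of the "selection of possible matches" idea, the sketch below ranks candidate titles by similarity instead of demanding an exact 1 for 1 match. It uses `difflib` from the standard library, which is a technique swapped in for illustration and not something the metahandlers code is known to use. The `rank_candidates` helper, the cutoff value and the candidate titles are all assumptions.

```python
import difflib


def rank_candidates(wanted_title, candidate_titles, cutoff=0.6):
    """Return candidate titles ordered from most to least similar (hypothetical helper)."""
    # Compare case-insensitively but hand back the original spelling
    by_lower = {}
    for title in candidate_titles:
        by_lower.setdefault(title.lower(), title)
    matches = difflib.get_close_matches(
        wanted_title.lower(), list(by_lower), n=5, cutoff=cutoff)
    return [by_lower[m] for m in matches]


# Made-up scraper results, used only to exercise the helper
candidates = ['A Christmas Carol',
              "Disney's A Christmas Carol",
              'The Muppet Christmas Carol']
print(rank_candidates('A Christmas Carol', candidates))
```

A list like this could be shown to the user to pick from, which avoids both silently accepting a wrong match and skipping every title that is not letter-perfect.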

Crappy part about this one is each time you load the list it reports it can't find it in the DB - which is correct because 'A Christmas Carol' from 1910 doesn't exist in the DB, so it retries grabbing from TMDB/IMDB.. when it goes to save in the DB it finds an entry exists with the same IMDB ID, deletes it, saves the new entry with the date it found on IMDB (2009).. and the cycle continues the next time

I've corrected a few small issues surrounding the premiered date: if TMDB didn't have one, I wasn't updating the meta data with a date from IMDB.. correcting this fixed a good handful of movies

Also there was a bug hit on one movie, can't remember which one anymore, where it tried to get updates from IMDB but the IMDB page came back with no results, but tried updating anyways.. fixed that..

There are still a few movies in the list that I see getting re-scraped on reloads of the list, so debugging these guys to find out why.. some are valid like the christmas carol one.. others are odd where IMDB is returning results for completely different movies

A good thing to do is check your logs after an initial scrape has been done; you should see each movie blocked out and what was done.. even better, after the list has been scraped, go back and reload the list, then read the log again... see how many movies it picked up from the DB and how many it had to re-scrape

At my count, I've done the list in approx. 25-30mins, so that works out to approx. 2secs per movie, pretty decent!

k_zeon · 2011-10-17, 22:34

Yeah, I was thinking the same thing: if a movie is wrong, be able to select the correct one and use that in future.
As you say, not too sure how to go about that..

Anyway, off early to bed.

Look forward to your updates....

Eldorado · 2011-10-17, 22:54

slyi Wrote:I started playing with it as well and noticed the Unicode issue, but I can work around these for the moment. I'm using ice films as my target, integrating asynchronous meta calls rather than your sample serial calls; later I hope to add paging. One issue I noticed: in the tmdb function update_imdb_meta, the dictionary keys are incorrectly named, e.g. imdb_poster should be thumb_url
Btw your latest code is much cleaner, but for imdb searches I feel you should initialize with imdbapi first, then update with tvdb
Btw2: davilla made a fix for slow python on atv2 so everything is much faster now

Just fixed this, it was actually working properly.. if you noticed in the _format_tmdb() method they then checked for 'imdb_poster', I removed all that and have it assigned to 'cover_url' instead of fooling around the way they had it

Can you give more details on the asynchronous calls and how they are done?

At what point does the data get scraped? When does it stop scraping?

For addons using this library, the user side should generally have very little scraping to do, only when new content is added to the site. I strongly recommend that all devs who will use this pre-scrape the site on their end and distribute a meta data pack that users must install if they choose to enable meta data

slyi · 2011-10-18, 01:39

Thanks eldorado, I had just rewritten that function yesterday but it was not in a state to submit for review. Your lib works really well bar some import optimizations. I fully agree that you should prepopulate with a package first and only then hit the server, but when we do, it should be responsive. Below is a sample I was doing that downloads 100 requests in about 2 sec on atv2.

http://dl.dropbox.com/u/6589941/asyncmeta/addon.py.txt
I was optimizing that for atv2 until Rogerthis found the atv2 python perf issue fix.

I'm currently rewriting the async/batching around your most excellent library and I hope to have a sample to show you later this week.

Eldorado · 2011-10-18, 16:48

Ok, so what it looks like is you are reading pre-scraped data from a json file

You will find the same response from the DB, on my windows PC the data from k_zeon's list of 680 returns in under a second

I'm still not sure why you want to go with json vs sqlite? Saving, retrieving, reading, comparing of the data (to myself at least) seems much easier using sql.. I am starting to really like json over xml for web based responses though!

The real test is when you need to scrape the data from the sites.. this is mainly where I was wondering how the asynchronous updating would work - if scraping takes a couple minutes, and users have exited the addon before scraping is finished.. what happens to the thread?

Eldorado · 2011-10-18, 16:55

I just pushed in some more fixes, getting closer and closer

k_zeon, these fixes are aimed mainly at what I found when scraping your 'A' list and should correct a few movies. You will see the most difference after the data has been scraped and you reload your list - aka the data is pulled from the db instead of scraped online

Delete your metahandlers folder in user_data and give it another go

There do still seem to be some movies in the DB that do not have a 'premiered' date; this is very important to have for correct matching when recalling from the DB by name.. I'm looking to track these down next

k_zeon · (This post was last modified: 2011-10-18, 21:19 by k_zeon.)

Hey Eldorado , thanks for the update...

I just noticed on the Icefilms Master that it has been updated to ver 1.1.0 and the Stable one was updated by you also recently.

Which one should we be using?

Also see Bold text below.

v1.1.0 (Tuesday, 15 March 2011):
Special thanks to westcoast13 for joining the project and making this release possible! This is the LAST RELEASE OF ICEFILMS ADDON. VIDEOFALCON is its superceder, and will provide support for icefilms.
- neatened up add-on file structure
- added TV show metadata support
- IMDB fallback scraper for failed scrapes
- individual item metadata refresh option
- metadata lookups even if item does not have an imdb number
- updated metacontainers
- download in background
- fixed problems with special characters on windows filesystems
- fixed favourites
- added setting to display number of episodes on icefilms for a show

Eldorado · 2011-10-18, 21:19

And even more updates pushed to the repo..

I have realized an issue with searching based on movie/year

Quick rundown: Currently the scraper grabs all data from TMDB/IMDB including the date the movie was released, I then store this in the 'premiered' column in the DB.. if I can't find this data, I use the year that was passed in

When searching the local DB based on the name/year it compares the year you send in (what the site you are scraping says) against this 'premiered' column

Problem is - these won't always match, either the year is wrong on the site you are scraping or the date is wrong on TMDB/IMDB.. this results in re-scraping for the same movie

eg. From k_zeon's addon, TVSource:

A Kiss at Midnight 2008

IMDB returns: 2007-04-27

I'm thinking of this approach, looking for opinions:

Likely the most logical/easiest - create a new column in the DB saving the passed in year. Though there will likely be cases where you don't always send in a year.. either you don't have it or you have the IMDB ID so are using that to search

I was considering just ignoring what I get from TMDB/IMDB and using the passed in year for 'premiered', but don't think that's the correct way to go
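
The mismatch described above can be sketched in a few lines. The row layout, the `passed_year` key and both helper functions are assumptions made for illustration and are not the library's actual schema or code; the title and dates are the ones from the example in this post.

```python
# Cached row as it might look after scraping (values taken from the post)
row = {'title': 'A Kiss at Midnight', 'premiered': '2007-04-27', 'passed_year': 2008}


def hit_by_premiered(row, title, year):
    """Behaviour as described: compare the passed-in year with the 'premiered' year."""
    return row['title'] == title and int(row['premiered'][:4]) == year


def hit_by_passed_year(row, title, year):
    """Proposed behaviour: compare against the year the addon originally sent."""
    return row['title'] == title and row['passed_year'] == year


print(hit_by_premiered(row, 'A Kiss at Midnight', 2008))    # False, so it re-scrapes
print(hit_by_passed_year(row, 'A Kiss at Midnight', 2008))  # True, served from the DB
```

Keeping both values means the scraped date is still available for display, while the passed-in year is only used to decide whether the entry is already cached.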

Eldorado · (This post was last modified: 2011-10-18, 21:32 by Eldorado.)

k_zeon Wrote:Hey Eldorado , I just noticed on the Icefilms Master that it has been updated to ver 1.1.0 and the Stable one was updated by you also recently.

Which one should we be using.

tks

The master branch is basically what is to be v1.1.0 that Anarchintosh and crew were working on before he stopped.. it's very close to release ready, including all those items

I've done a small bit of work on it to help get it finished, need help though

The stable branch is what is currently released and where all the fixes are being done on, v1.0.14.. soon the two will need to be merged

Edit - ooh, read closer, I need to go dig for that individual item metadata refresh code :)

k_zeon · (This post was last modified: 2011-10-18, 21:39 by k_zeon.)

Quote:Likely the most logical/easiest - create a new column in the DB saving the passed in year. Though there will likely be cases where you don't always send in a year.. either you don't have it or you have the IMDB ID so are using that to search

Yes, I agree. Even if another addon uses the same db and passes the same movie name but a slightly different year, the info can still be added to the DB.

i.e.
you would have 2 movies in the DB with the same name but different passed years;
however, when the info is retrieved, you display the scraped year.

Code:
NAME  , PASSED YEAR , SCRAPED YEAR

Movie , 2010        , 2010

Movie , 2009        , 2010

Eldorado · 2011-10-18, 21:52

Hmm.. ok, so to be able to do this and not get duplicate key errors, I would need to add this new year column to the unique list.. currently it is:

Code:
UNIQUE(imdb_id, tmdb_id, title)

I think this is ok and should satisfy most needs
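
A minimal sketch of what extending the unique list could look like, using an in-memory SQLite database. The table name `movie_meta`, the column set, the `save` helper and the IDs are assumptions made for illustration; only the existing `UNIQUE(imdb_id, tmdb_id, title)` list comes from the post.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE movie_meta (
        imdb_id   TEXT,
        tmdb_id   TEXT,
        title     TEXT,
        year      INTEGER,  -- year passed in by the addon (the proposed new column)
        premiered TEXT,     -- date scraped from TMDB/IMDB
        UNIQUE(imdb_id, tmdb_id, title, year)
    )
""")


def save(imdb_id, tmdb_id, title, year, premiered):
    """Insert a row, replacing any row with the same unique key (hypothetical helper)."""
    conn.execute("INSERT OR REPLACE INTO movie_meta VALUES (?, ?, ?, ?, ?)",
                 (imdb_id, tmdb_id, title, year, premiered))


# Made-up IDs; same movie passed with two different years, then one repeat
save('tt0000001', '1', 'Movie', 2010, '2010-01-01')
save('tt0000001', '1', 'Movie', 2009, '2010-01-01')
save('tt0000001', '1', 'Movie', 2009, '2010-01-01')  # same key, replaces the row

rows = conn.execute(
    "SELECT title, year, premiered FROM movie_meta ORDER BY year").fetchall()
print(rows)
```

One thing to watch: SQLite treats NULLs as distinct in a UNIQUE constraint, so rows saved without a year would not replace each other. For the no-year case mentioned earlier, a fixed placeholder value may be needed.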

k_zeon · (This post was last modified: 2011-10-18, 23:03 by k_zeon.)

Eldorado, did you find the individual item metadata refresh code?
I had a quick look myself, but there is quite a bit of code.

Looks like they make use of Contextmenus

Code:
contextMenuItems.append(('Refresh Info', 'XBMC.RunPlugin(%s?mode=999&name=%s&url=%s&imdbnum=%s&dirmode=%s)' % (sys.argv[0], sysname, sysurl, urllib.quote_plus(str(imdb)), dirmode)))

P.S. How is the '%s' used? I've seen it a lot, e.g. sql_select = "SELECT * FROM " + table + " WHERE title = '%s'" % name

Eldorado · 2011-10-19, 01:53

I know basically how to do it in the addon, it's pretty much just what you posted, shouldn't be an issue.. what I need to do for this module is build a method that returns the full list of possible matches, I don't think it would be much extra work

I noticed they also put in a 'search for trailer' function.. cool idea

The %s is pretty handy as it is basically a string replacement. Instead of doing stuff like:

"blah blah " + some_string + "blah blah"

You can do "blah blah %s blah blah" % some_string

If you have multiple variables to replace: "blah %s blah %s blah %s" % (string1, string2, string3)

I find it really comes in handy when you have to define a string where a portion of it can change quite often

eg.
url = 'http://www.google.com/%s/whatever'
# ... later, once the changing part is known ...
location = 'somewhere'
new_url = url % location  # 'http://www.google.com/somewhere/whatever'
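
One caveat for the SQL case in the question: building `WHERE title = '%s'` with plain string replacement breaks as soon as a title contains an apostrophe. The sketch below shows the `?` placeholder style that Python's `sqlite3` module supports for values. The table name and sample row are made up for illustration, and this is not taken from the library's code.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE movie_meta (title TEXT, year INTEGER)")
conn.execute("INSERT INTO movie_meta VALUES (?, ?)", ("Schindler's List", 1993))

name = "Schindler's List"
table = 'movie_meta'

# Table names cannot be bound as parameters, so %s is still used for that part;
# the title is passed separately and sqlite3 handles the quoting
sql_select = "SELECT * FROM %s WHERE title = ?" % table
rows = conn.execute(sql_select, (name,)).fetchall()
print(rows)
```

The value goes in as a one-element tuple, `(name,)`, since `execute` expects a sequence of parameters.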

slyi · 2011-10-20, 17:19

Eldorado Wrote:Ok, so what it looks like is you are reading pre-scraped data from a json file

You will find the same response from the DB, on my windows PC the data from k_zeon's list of 680 returns in under a second

I'm still not sure why you want to go with json vs sqlite? Saving, retrieving, reading, comparing of the data (to myself at least) seems much easier using sql.. I am starting to really like json over xml for web based responses though!

The real test is when you need to scrape the data from the sites.. this is mainly where I was wondering how the asynchronous updating would work - if scraping takes a couple minutes, and users have exited the addon before scraping is finished.. what happens to the thread?

I was just using json for simplicity in my own testing; sql is definitely the best option. I was playing with t0mmo's test sample to enable metahandler async. I needed to add a couple of helper queries to metahandlers, but otherwise it works well and loads unscraped content faster.

http://dl.dropbox.com/u/6589941/asyncmet...ult.py.txt
http://dl.dropbox.com/u/6589941/asyncmet...ers.py.txt

There are still some bugs in it that I need to figure out, and I need to add back updating the UI as each thread completes.

Threads run in the background even if the user exits the app, and they will still save the requested data to the DB for the next load.
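
A minimal sketch of the threaded pattern described here, using only the standard `threading` module. The `scrape` function is a stand-in that just sleeps, and a dictionary stands in for the DB; neither comes from the linked samples. In plain Python, threads are non-daemon by default, so the interpreter waits for them before exiting; how XBMC treats addon threads on exit is an assumption that would need checking.

```python
import threading
import time

results = {}
results_lock = threading.Lock()


def scrape(title):
    """Stand-in for a slow network scrape (hypothetical; it only sleeps)."""
    time.sleep(0.1)
    return {'title': title, 'plot': 'placeholder'}


def worker(title):
    meta = scrape(title)
    # Guard the shared store; a real addon would write to the DB here instead
    with results_lock:
        results[title] = meta


titles = ['Movie A', 'Movie B', 'Movie C']
threads = [threading.Thread(target=worker, args=(t,)) for t in titles]
for t in threads:
    t.start()

# join() is only needed if the caller wants to wait, e.g. to refresh the UI
for t in threads:
    t.join()
print(sorted(results))
```

Since the three workers sleep in parallel, the whole batch takes roughly as long as one scrape rather than three.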