Kodi Community Forum

Full Version: themoviedb scraper now partly broken
WOW!

This was my first chance to check back in, and I was expecting one or two comments, not a solution!

Installed and all working correctly now with the titles that had been a problem last night.

Many thanks all, have a cold one on me!

Cheers

Edz
BugBoy Wrote:The latest XML doesn't fix the problem on my system. The file is also a lot bigger than the original file. Is that correct?

The file only changed by a handful of characters. Are you sure you only changed the version in the common TMDB folder, not the main TMDB folder? Also, what version of XBMC are you running - something recent? Maybe an older XBMC version has an older and smaller file?
My TMDB scraper is also broken. I'm having trouble locating where the tmdb.xml file is on XBMC Live. Anybody know the location? Thanks.
*********IMPORTANT - - - PLEASE READ***********



I just talked to Travis, who runs tmdb. He wasn't aware of the problem we were having. The change in their API was not supposed to break the scrapers. He will update the API so that the original (unmodified) tmdb scrapers work again after the update.

Keep in mind that whatever is currently cached in their system will take about 24 hours to update.

If your scraper is currently not working, do not do anything. Just wait a day and it will start working as before.

If you already modified your scraper, you are probably going to have to revert back to the original one by tomorrow.
Hi aptalca,

Maybe he shouldn't be too rash. I found that the regexp that I thought was matching against the image ID (which is now a GUID) was actually matching against the show ID, which is still a number. So I was wrong. In other words, the GUID with alphas in the image ID is not a problem!

Your original post, with the addition of the width parameter, is the one that fixes the images. The only thing I found was a typo in your tmdb.xml - you had an extra '>' character at least in the one in pastebin.

But you're right - it is the changes to the api that have broken things here. I suspect that if he moved the width parameter to the end of the image element, or at least after the size parameter which is the last one that the scraper matches on, that this would not break the XBMC scraper either.

Maybe you could direct him to this post?

Cheers
Aaron
Hi Aaron

That is exactly what he's gonna do, he's gonna change the order so that size comes right after the url, so the original scraper's regex should still pick that up. Like you said, whatever comes after the size element gets ignored anyway.
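For anyone following along, here is a rough Python sketch of why the attribute order matters to a regex-based scraper. The pattern and the sample XML are made up for illustration (the real scraper regex and the real tmdb output may look different); the only thing taken from this thread is that a width value ended up between url and size.

```python
import re

# Hypothetical pattern in the spirit of the original scraper:
# it expects size to follow url immediately.
STRICT = re.compile(r'<image[^>]*url="([^"]*)" size="original"')

# Made-up sample responses; the real tmdb output may differ.
OLD_ORDER = '<image type="backdrop" url="http://example.invalid/a.jpg" size="original" id="123"/>'
BROKEN_ORDER = '<image type="backdrop" url="http://example.invalid/a.jpg" width="1920" size="original" id="123"/>'
FIXED_ORDER = '<image type="backdrop" url="http://example.invalid/a.jpg" size="original" width="1920" id="123"/>'


def first_url(xml):
    """Return the first image URL the strict pattern can find, or None."""
    match = STRICT.search(xml)
    return match.group(1) if match else None


if __name__ == "__main__":
    for name, xml in [("old", OLD_ORDER), ("broken", BROKEN_ORDER), ("fixed", FIXED_ORDER)]:
        print(name, first_url(xml))
```

With width sitting between url and size the strict pattern finds nothing, and once size comes right after url again the same pattern matches without any change to the scraper.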

Travis didn't think xbmc was still using regex for xml scraping. That's why he didn't think it would break our scraper. And he is using ruby, which orders the elements itself, so it wasn't Travis who put the width element between the url and size.

I am by no means an expert at scrapers, I just went through the xml and scraper code and saw that regex wasn't matching. :-)

FYI, my conversation with Travis is here.


PS. About the extra ">", it was probably left over from when I was trying to modify them so I could use the tmdb scraper for retrieving imdb ratings, I later gave up on that as I wasn't skilled enough and switched to imdb scraper altogether :-)
As I wrote in the other thread, XBMC shouldn't be assuming the order of elements if it can avoid it. Unfortunately, we have to in some ways. We can, however, get away with just assuming that url comes before size and ignore anything in between. This seems like a reasonable solution in the meantime. A patch for this would be great.

Cheers,
Jonathan
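A minimal sketch of the "url comes before size, ignore anything in between" idea, written in Python just to show the regex behaviour. The pattern and sample strings are assumptions for illustration, not the actual scraper expression or real tmdb output.

```python
import re

# Hypothetical tolerant pattern: url must come before size,
# but any other attributes may sit in between.
TOLERANT = re.compile(r'<image[^>]*?\burl="([^"]*)"[^>]*?\bsize="([^"]*)"')

# Made-up samples; the real tmdb output may differ.
WITH_WIDTH = '<image type="backdrop" url="http://example.invalid/a.jpg" width="1920" size="original" id="123"/>'
WITHOUT_WIDTH = '<image type="backdrop" url="http://example.invalid/a.jpg" size="original" id="123"/>'
SIZE_FIRST = '<image size="original" url="http://example.invalid/a.jpg"/>'


def url_and_size(xml):
    """Return (url, size) for the first image element, or None if no match."""
    match = TOLERANT.search(xml)
    return match.groups() if match else None


if __name__ == "__main__":
    for sample in (WITH_WIDTH, WITHOUT_WIDTH, SIZE_FIRST):
        print(url_and_size(sample))
```

This still assumes url appears before size, so the SIZE_FIRST sample is not matched. That is the trade-off of this stopgap.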
Like Travis said, can we use a XML parser instead of RegExp to parse the output?
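To illustrate what an XML parser buys you here, a small Python sketch using the standard library's ElementTree. XBMC scrapers themselves are regex definitions in XML, so this is not something you can drop into a scraper; the document layout and attribute names below are assumed for the example.

```python
import xml.etree.ElementTree as ET

# Made-up response; attribute order is deliberately inconsistent.
SAMPLE = """<movie>
  <images>
    <image type="backdrop" width="1920" url="http://example.invalid/a.jpg" size="original" id="abc"/>
    <image type="backdrop" size="thumb" url="http://example.invalid/a_t.jpg" id="abc"/>
  </images>
</movie>"""


def image_urls(xml_text, size="original"):
    """Collect url attributes of image elements with the requested size."""
    root = ET.fromstring(xml_text)
    # Attributes are looked up by name, so their order in the source is irrelevant.
    return [img.get("url") for img in root.iter("image") if img.get("size") == size]


if __name__ == "__main__":
    print(image_urls(SAMPLE))
    print(image_urls(SAMPLE, size="thumb"))
```

Because the lookup is by attribute name, adding or reordering attributes such as width would not affect the result.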
jmarshall Wrote:We can, however, get away with just assuming that url comes before size and ignore anything in between. This seems like a reasonable solution in the meantime. A patch for this would be great.

Doh!! Yeah, that would have been a good idea, wouldn't it? (Sheepish grin) I'll provide a patch if no one beats me to it.
aptalca Wrote:here's a working version of common tmdb.xml

http://pastebin.com/f1Um9i2Y


Replace your system\scrapers\video\common\tmdb.xml

I repeat, do not replace your main tmdb.xml, but the one in the common folder with this one


This will fix both tmdb and imdb scrapers as long as you have imdb set up to download fanart from tmdb.

The reason it broke is that tmdb changed their API: search results for fanart now have an added width="xxx" in there, which prevents the scraper from parsing the link correctly.

hi there, this works for me thank you ! ^^

lad

ps os:macosx10.5.8 xbmc: SVN-28256 skin:Transparency! v2.14
I'm also using xbmc live and can't find the tmdb.xml file.

The scraper is still not working for me and I need the pc finished for a project tomorrow. Can someone point me to where the file is located?
Benjie, I believe it should be working again in the near future. They said they put in an update yesterday that should go into effect in 24 hours, and we're around that point, so I'd just give it another hour or two and see if the problem resolves itself.
i can confirm it is indeed working once again ...
So I'm still having issues. Not sure why. I'll give it some time. AaronD's file worked for a while but stopped pulling cast/crew information. I'll check to see if things are up and running again tomorrow.
Sorry for the double post, but I'm learning what these scrapers do.

The Cast/Crew scraping issue is still a problem, but I realize now that these portions get scraped with the "non-common" tmdb.xml, so I'm still going to have those issues. Other than that, I'll hope that things get propagated through soon, because I'm still having issues.

I'll try to dig through the other xml to see what's happening with the Cast/Crew query, but I'm not great at Regex.