Posts: 1,970
Joined: Nov 2008
Reputation:
1
azido
Posting Freak
this one sounds very promising, cheers for that.
one (personal) question:
as you modified an existing scraper with good results, any chance you could do me a favor and spend some of your time adding the functionality to look up artist thumbs on my site?
i have started a resource of specially prepared artist thumbs for use in aeon (showmix). so far 777 thumbs are present, with a bunch of users willing to add more in the future. we have also started collecting fanart, but that's maybe another topic.
the gallery is organised in a pretty basic way: thumbs are categorised in folders by artist name, so it should be pretty easy for people with the skills to write a lookup that downloads them. every artist thumb has a thumbnail and a full picture. unfortunately there is no api that can be used, so it would be simple html scraping; but as the structure is simple and we generally don't hold additional info, it should be easy to scrape. we also have a search feature that returns matches for given keywords across the whole gallery (thumbs AND fanart).
i would be glad if you would consider trying that.
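To illustrate the kind of plain HTML scraping described above, here is a minimal Python sketch. The gallery's actual markup is not shown in this thread, so the layout is an assumption: each thumbnail is taken to be an `<img>` nested inside an `<a>` whose `href` points at the full picture. The names `ThumbLinkParser` and `extract_thumb_pairs` are hypothetical, and a real XBMC scraper would be written in its own scraper format rather than Python.

```python
from html.parser import HTMLParser


class ThumbLinkParser(HTMLParser):
    """Collects (thumbnail_src, full_picture_href) pairs.

    Assumption (not confirmed by the gallery itself): every thumbnail
    <img> sits inside an <a> that links to the full-size picture.
    """

    def __init__(self):
        super().__init__()
        self._current_href = None  # href of the <a> we are currently inside
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._current_href = attrs.get("href")
        elif tag == "img" and self._current_href and attrs.get("src"):
            # An image inside a link: treat it as thumbnail -> full picture.
            self.pairs.append((attrs["src"], self._current_href))

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


def extract_thumb_pairs(html):
    """Return all (thumbnail, full picture) pairs found in an HTML string."""
    parser = ThumbLinkParser()
    parser.feed(html)
    return parser.pairs
```

Images outside a link (logos, icons) and links without an image are skipped, so the result would only contain candidates that look like gallery entries under the stated assumption.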
Posts: 47
Joined: Oct 2009
Reputation:
0
Hello guys,
I have had problems with scraping lately. I am running the latest SVN build (24059) on Windows 7 x64 and I can scrape almost no info for music. Discogs, Freebase and Last.fm are not working, and Allmusic, even with the updated script from here, downloads almost no artist thumbs at all, let alone any fanart. Since I have a huge library (1200+ artists) of mostly underground music, this is a huge problem for me.
Another problem with the latest build is that when I try to scrape every artist, XBMC crashes.
Where is the problem? In the build? In the scrapers? Or did Discogs change its layout?
What would be the best automated way to collect artist thumbs and fanart outside of XBMC? I tried media info, but it is also buggy and gives me very strange results.
Posts: 47
Joined: Oct 2009
Reputation:
0
Sorry, but this question was aimed at talisto and the others in this thread who are trying to tweak the scrapers. Since the scrapers stopped working this week, I am asking whether they know what happened.
Posts: 2,288
Joined: Nov 2005
Reputation:
5
I already posted some bug reports on Trac about the scraper issues.
Posts: 47
Joined: Oct 2009
Reputation:
0
Also, it seems that freebase.org is down or gone. The webpage is empty, without any explanation. It just seems strange that (in my opinion) such an important part of XBMC stops working and there is no mention of it on the forums.
Posts: 40
Joined: Jun 2005
Reputation:
0
2009-10-28, 07:00
(This post was last modified: 2009-10-28, 07:14 by talisto.)
Ok, so it seems that Discogs is checking the user agent of the request, and is rejecting XBMC's user agent. It seems that they're specifically targeting XBMC, since if I enter a random user agent, it works fine, yet any user agent starting with "XBMC" is rejected. This is fairly troubling since Discogs is the default scraper for XBMC, so probably 99% of the userbase now has a failing music scraper.
Clearly they don't want XBMC scraping their pages anymore. So I suppose the question now is, do we start spoofing a browser's user agent to get around the problem, or do we merely skip Discogs and move to other sources allowing more legitimate scraping (e.g. the last.fm API)?
(edit: the really unfortunate thing is that they're even blocking XBMC's usage of the API, so we can't even use that as an alternative to the HTML scraping.)
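For anyone who wants to reproduce the user agent check, a rough Python sketch follows. It requests the same URL with several User-Agent strings and records the HTTP status code for each. The function names, the example agent strings and the URL in the usage note are placeholders of mine, not anything taken from XBMC or Discogs, and the status code a rejected request actually returns is not something I have confirmed.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent):
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(request, timeout=10):
    """Send the request and return the HTTP status code.

    HTTP error responses (4xx/5xx) are returned as codes, not raised.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def compare_user_agents(url, user_agents, fetch=fetch_status):
    """Map each User-Agent string to the status code it receives.

    `fetch` can be swapped out so the comparison can be tried offline.
    """
    return {ua: fetch(build_request(url, ua)) for ua in user_agents}
```

Calling something like `compare_user_agents(page_url, ["XBMC/test", "SomethingElse/1.0"])` against a page that is currently failing should show whether only the agents starting with "XBMC" get an error status.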