Release Universal Scraper for Music Artists
My experience of MusicBrainz is that relying on MBIDs is not really a workable strategy at this point, since the MB database is woefully incomplete for many, many artists. I've tried using Picard to tag albums in my collection, but I get a hit rate of maybe 20%, i.e. it completely fails to recognize about 80%, and checking on the MB site confirms that there is no entry for those albums. This compares very poorly with tools that rely on freedb/CDDB entries, which almost always identify the album correctly.
Reply
An update to my earlier post. The duplicate MBID issue seems to have gone away after a reinstall, but I'm still having problems. I will post a link to my debug log when I get a chance, but it appears that the MBID is initially picked up correctly during the first check on the MusicBrainz site, while the subsequent URLs for both MusicBrainz and TheAudioDB have the IDs missing and so cannot be parsed. Allmusic searches seem to be OK though: if I change the settings to use only allmusic where possible and set everything else to none, I get the allmusic data returned. Setting any one option to either MusicBrainz or TheAudioDB seems to break everything.

Not sure if it's relevant, but I have Kodi running on a Raspberry Pi 2 and have tested with the same results on both OpenELEC and OSMC.
Reply
(2016-06-01, 01:18)Hallucyn8 Wrote: An update to my earlier post. The duplicate MBID issue seems to have gone away after a reinstall, but I'm still having problems. I will post a link to my debug log when I get a chance, but it appears that the MBID is initially picked up correctly during the first check on the MusicBrainz site, while the subsequent URLs for both MusicBrainz and TheAudioDB have the IDs missing and so cannot be parsed. Allmusic searches seem to be OK though: if I change the settings to use only allmusic where possible and set everything else to none, I get the allmusic data returned. Setting any one option to either MusicBrainz or TheAudioDB seems to break everything.

Not sure if it's relevant, but I have Kodi running on a Raspberry Pi 2 and have tested with the same results on both OpenELEC and OSMC.

I face a similar problem on an RPi2 and a B+ with both LibreELEC and OSMC. The scraper fails on any provider but allmusic, and fanart/thumbs are not working either. It seems to be hit and miss: sometimes it does work for a single artist, but 95% of the time it does not. It used to work fine when I initially filled my library; all artists got picked up. I have now refreshed my setup, and currently both the universal artist scraper and the audiodb scraper fail. Using a mirror, as suggested, is not possible yet in version 3.6.2, is that correct? At least not in OSMC or LibreELEC.
Reply
Ok, so I've put an extract of my log file onto Pastebin: http://pastebin.com/UQyQ6cui

This shows an attempt to use the scraper with the default settings to get data for Massive Attack.

Initially all seems well and the ResolveIDToUrl line returns a working URL, in this case: http://musicbrainz.org/ws/2/artist/10adb...c=url-rels

Things seem to start going wrong during GetArtistDetails. The ID for allmusic is returned but MusicBrainz and TheAudioDB are blank:

GetArtistDetails returned <details><chain function="GetAMGData">mn0000378288</chain><chain function="GetMBDiscographyByMBID"></chain><chain function="GetTADBBiographyByMBID"></chain><chain function="GetTADBArtistGenresByMBID"></chain><chain function="GetTADBArtistStylesByMBID"></chain><chain function="GetTADBArtistMoodsByMBID"></chain><chain function="GetTADBLifeSpanByMBID"></chain></details>

The years active data is returned from allmusic, presumably because this is first in the list, but the URLs then created for MusicBrainz and TheAudioDB contain no IDs, e.g.

<url function="ParseMBDiscography" cache="mb--discog.xml">/ws/2/release-group?artist=&amp;limit=100&amp;type=album</url>
<url function="ParseTADBBiography" cache="tadb--artist.json">http://www.theaudiodb.com/api/v1/json/58424d43204d6564696120/artist-mb.php?i=</url>


Each of which generates an "Unable to parse web site" error.

Any ideas what is going wrong? Could it be something Raspberry Pi specific?

For information, TheAudioDb Artist Scraper works fine for me, but I would like to get to the bottom of why Universal Artist Scraper is having these issues.

Hopefully this information helps. I can try to provide more details if needed.
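
For anyone wanting to check their own log for the same symptom, here is a minimal sketch in plain Python. It is not part of the scraper; it only parses the GetArtistDetails output quoted above and lists the chained functions that were handed an empty ID. The function name empty_chains is made up for illustration.

```python
import xml.etree.ElementTree as ET

# GetArtistDetails output copied from the log extract in this post.
DETAILS = (
    '<details>'
    '<chain function="GetAMGData">mn0000378288</chain>'
    '<chain function="GetMBDiscographyByMBID"></chain>'
    '<chain function="GetTADBBiographyByMBID"></chain>'
    '<chain function="GetTADBArtistGenresByMBID"></chain>'
    '<chain function="GetTADBArtistStylesByMBID"></chain>'
    '<chain function="GetTADBArtistMoodsByMBID"></chain>'
    '<chain function="GetTADBLifeSpanByMBID"></chain>'
    '</details>'
)


def empty_chains(details_xml):
    """Return the names of chain functions whose ID text is empty.

    Hypothetical helper for reading a debug log; it does not reproduce
    how Kodi itself processes the <details> block.
    """
    root = ET.fromstring(details_xml)
    return [
        chain.get("function")
        for chain in root.iter("chain")
        if not (chain.text or "").strip()
    ]


for name in empty_chains(DETAILS):
    print("no ID passed to", name)
```

With the extract above, every MusicBrainz and TheAudioDB function is reported and only GetAMGData is not, which matches the blank IDs in the generated URLs.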
Reply
This is a backend issue. MusicBrainz is having problems serving all the requests, as discussed here a couple of times (hence the mirrors are configurable now), and their rate limiter also blocks you after a couple of requests. We can't change this situation; MusicBrainz is working on it. Read their blog.

Best thing to do for now (although I appreciate not everyone can do this) is to run your own MB mirror:
https://wiki.musicbrainz.org/MusicBrainz_Server/Setup

I know it's not an answer anyone would like, but what could we do on our side?
The other option is to use the theaudiodb.com scraper.
Reply
Thanks for the reply and I now appreciate that the issue is with Musicbrainz and not the scraper.

I will look into the mirror options but have already started adding some missing artists to TheAudioDb so that may be the solution for me.
Reply
I know this is a thread for the Universal Artist Scraper but can anyone confirm if the TheAudioDb scraper uses MBIDs or does it just rely on a text search?
Reply
(2016-06-02, 16:51)Hallucyn8 Wrote: I know this is a thread for the Universal Artist Scraper but can anyone confirm if the TheAudioDb scraper uses MBIDs or does it just rely on a text search?

It relies on a text search by name, I believe (is that right, Olympia?). So an easy way to get accurate results is to use something like Picard to correct the artist tag using the MBID.
Reply
This seems to be an old issue, but is there a way to get "Query Info for all artists" working to get the thumbnails and art so one doesn't have to refresh every artist separately? The scraper was set in the settings before scanning the library but that workaround doesn't seem to work.
Reply
Elra, there is a bug in Jarvis that means "Query Info for all artists" quietly does nothing unless you have done a library update since the last power-up.

But the current MB server overload also means that scraping may not get anything, and so you need to retry multiple times.

Maybe try with debug enabled and check the log to see what happened.
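
To illustrate what "retry multiple times" could look like outside Kodi, here is a rough sketch using only the Python standard library. It is not how the scraper behaves; fetch_with_retry and its parameters are made up for illustration, and the delays are arbitrary assumptions rather than anything MusicBrainz recommends.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=4, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL, retrying with a growing delay while the server answers 503.

    Hypothetical helper: `opener` and `sleep` are injectable so the retry
    logic can be exercised without touching the network.
    """
    for attempt in range(attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on anything other than 503, or once attempts run out.
            if err.code != 503 or attempt == attempts - 1:
                raise
            # Wait 1x, 2x, 4x ... the base delay before the next try.
            sleep(base_delay * 2 ** attempt)
```

Backing off between tries matters here because hammering an overloaded server with immediate retries is likely to keep tripping the rate limiter mentioned earlier in the thread.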
Reply
Thanks a lot for the info, DaveBlake! Indeed it does more when I update the library first, but it still doesn't update any art, and I get this in the log for the several artists it tries before it stops:

21:29:59 T:6076 ERROR: ADDON::CScraper::Run: Unable to parse web site
21:29:59 T:2884 ERROR: CCurlFile::FillBuffer - Failed: HTTP returned error 503
21:29:59 T:2884 ERROR: CCurlFile::Open failed with code 503 for
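
If the log is long, a small script can tally which HTTP errors show up. This is only a sketch in plain Python, assuming the error lines look like the three quoted above; count_http_errors is a made-up name and other Kodi versions may word these lines differently.

```python
import re
from collections import Counter

# The three log lines quoted in this post.
LOG_LINES = [
    "21:29:59 T:6076 ERROR: ADDON::CScraper::Run: Unable to parse web site",
    "21:29:59 T:2884 ERROR: CCurlFile::FillBuffer - Failed: HTTP returned error 503",
    "21:29:59 T:2884 ERROR: CCurlFile::Open failed with code 503 for",
]

# Match only the FillBuffer wording so each failed request is counted once;
# the CCurlFile::Open line reports the same failure a second time.
HTTP_ERROR_RE = re.compile(r"HTTP returned error (\d{3})")


def count_http_errors(lines):
    """Count HTTP status codes reported in 'HTTP returned error NNN' lines."""
    codes = Counter()
    for line in lines:
        match = HTTP_ERROR_RE.search(line)
        if match:
            codes[int(match.group(1))] += 1
    return codes


print(count_http_errors(LOG_LINES))
```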
Reply
A 503 error means the MusicBrainz server is overloaded or unavailable and is not serving the request. Try the mirrors as outlined here:

http://forum.kodi.tv/showthread.php?tid=...pid2340856
Reply
(2016-06-02, 11:33)olympia Wrote: This is a backend issue. MusicBrainz is having problems serving all the requests, as discussed here a couple of times (hence the mirrors are configurable now), and their rate limiter also blocks you after a couple of requests. We can't change this situation; MusicBrainz is working on it. Read their blog.

Best thing to do for now (although I appreciate not everyone can do this) is to run your own MB mirror:
https://wiki.musicbrainz.org/MusicBrainz_Server/Setup

I know it's not an answer anyone would like, but what could we do on our side?
The other option is to use the theaudiodb.com scraper.

Ok, so after a few false starts I now have a MusicBrainz mirror running on a VM that I was planning to use for my scraping. The mirror itself seems to be working OK: I can access the website, and it is replicating hourly with the main MusicBrainz site/DB. So far so good.

I've then configured the scraper to look at my local mirror, but I'm seeing exactly the same thing as I described in post #544. The first URL is created correctly with the IP address of my local mirror and the MBID of the artist, but the MBID is not carried through to the subsequent MusicBrainz and TheAudioDb URLs, so they are looking for pages with no MBID, which obviously fails.

Could this be something that needs changing on my mirror server? I'm assuming any throttling/limiting wouldn't be done at this level.

I've also seen a recurrence of the issue where the MBID is repeated twice within the generated URLs. This seems to be artist-specific and reproducible, so I'm trying to work out what makes certain artists/MBIDs behave differently. The pattern so far seems to be artists that have few details added. At first I thought it was caused by the absence of an AllMusic link, but I have added these to the artists I have been testing with and still see the same issue.
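
To sort generated URLs from the log into "MBID missing", "MBID present once" and "MBID duplicated", something like the following sketch might help. It is plain Python and assumes an MBID always has the usual 36-character UUID form; classify_url is a made-up helper, and the all-zero ID in the usage note is a placeholder, not a real artist.

```python
import re

# An MBID is assumed here to be a standard hyphenated UUID.
MBID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def classify_url(url):
    """Report whether a scraper URL carries zero, one, or several MBIDs."""
    count = len(MBID_RE.findall(url))
    if count == 0:
        return "missing"
    if count == 1:
        return "ok"
    return "duplicated"


# URL taken from the earlier log extract: the artist parameter is empty.
print(classify_url("/ws/2/release-group?artist=&limit=100&type=album"))
```

For example, with the placeholder ID 00000000-0000-0000-0000-000000000000, a URL containing it once is classed as "ok", and one where it appears twice back to back is classed as "duplicated".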
Reply
Can you try without artist.nfo?
Reply
(2016-06-08, 12:53)olympia Wrote: Can you try without artist.nfo?

I'm not using artist.nfo, just the tags that have been written using Picard.
Reply