The logic and future of Music scrapers?
#5
Ah, just spotted this thread, thanks @ronie for starting the conversation. I would like more insight into music scraping too, because what I see seems to need changing, but I don't like to do that without some understanding of why it is the way it is. But I do have a good understanding now of the music db and tag processing, so that should help!

(2017-02-07, 17:45)ronie Wrote: 1) if the 'prefer online info' setting is disabled, we pass the artistname to the artist scraper.
if the setting is enabled, we pass the artist mbid to the scraper.

why don't we always pass the mbid (if available) regardless of this setting?
I thought that we did use mbids when we had them. Not doing so is a mistake for sure. The other effect of this setting is to decide whether or not the data derived from music file tags gets overwritten. That is the only thing I think it should do.
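To make that concrete, here is a rough Python sketch of the behaviour I have in mind. It is not Kodi's actual code: `Artist`, `scraper_lookup_key` and `merge_artist_info` are hypothetical names made up for illustration. The lookup key does not depend on the setting at all, and the setting only decides which source wins when both have a value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Artist:
    """Hypothetical stand-in for an artist row in the music library."""
    name: str
    mbid: Optional[str] = None  # MusicBrainz ID read from file tags, if any


def scraper_lookup_key(artist: Artist) -> Tuple[str, str]:
    """Pick what to hand to the scraper: the mbid whenever we have one."""
    if artist.mbid:
        return ("mbid", artist.mbid)
    return ("name", artist.name)


def merge_artist_info(tag_info: dict, online_info: dict, prefer_online: bool) -> dict:
    """Combine tag data and scraped data; the setting only picks the winner."""
    if prefer_online:
        # Online values overwrite tag values where both exist.
        return {**tag_info, **online_info}
    # Tag values are kept; online data only fills the gaps.
    return {**online_info, **tag_info}
```

With this split an artist tagged with an mbid is always looked up by mbid, and "prefer online info" becomes purely a merge policy.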

Quote:2) if the album scraper returns no results, we completely skip the artist scraper. why?
That must be in the automated scraping, because albums and artists can be scraped separately via "query info for all". I would guess the thinking is that if the album isn't known then the artists are unlikely to be known either. There is some sense to that. Or is it that successful album scraping could return the artist mbids?

(2017-02-08, 00:14)ronie Wrote: when reading http://forum.kodi.tv/showthread.php?tid=...pid1612266
i get the impression 'prefer online info' was meant to be a temporary setting for testing musicbrainz support when it was introduced
(2014-01-27, 17:57)night199uk Wrote: Essentially the option is pretty redundant but it should allow us to bring the feature in more smoothly.
Interesting idea. As best I can tell, whatever ambitions @night199uk had for MusicBrainz use replacing the need for music file tagging, they never came to fruition. It needed albums and artists to be uniquely identifiable by name alone, and far too often they just aren't. So possibly the temporary nature of this setting meant that eventually the online info would always overwrite.

Quote:now i'm kind of curious what magic code we have in kodi to accomplish this:
(2014-01-27, 17:57)night199uk Wrote: once you've MusicBrainz tagged your files once, your library and metadata is built dynamically from up-to-date MusicBrainz data, and later KEPT up-to-date and refreshed automatically so you always have good tags.
For MusicBrainz tagged music the idea would have been to have all of "Prefer online info", "Fetch info on update", and "update lib on start up" enabled. Then, by magic, any changes in the cloud to artist biogs, dates etc., or album reviews etc. would be propagated into your music lib every time you turn it on.

But there are reasons this does not work in practice, let alone that many users are control freaks who don't actually want their precious music lib details shifting underneath them! They are thrilled the first time it appears, but don't like it to change afterwards.

The first big hurdle was that Kodi made a mess of the default tagging that came out of Picard for multiple artists on songs or albums, and this led to many wanting to turn off mbid tags. Correct mbids are essential for the above magic. Krypton is greatly improved in this respect, but I still would not advise fetching online data by default; the user needs to look at the lib their tagging has created first. There is an overhang too of upgraded databases that have not rescanned the tags (we encourage a rescan but the user can cancel it). The mbid tag mess will haunt Kodi a while yet, and we need to manage that as best we can.

Another issue is server traffic: we hammer MusicBrainz and TADB, and end up with server timeouts. We really need to be more efficient with scraping and try to avoid scraping the same info over and over again. If you have a big music collection then just scanning the hash table to identify changed music files takes a long time, let alone fetching online info etc. Even as an asynchronous process it can be a problem for smaller processors and "respectful" users that see the progress bar and don't like to switch off midway. "What the hell is Kodi doing?" they ask.
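As a rough illustration of "avoid scraping the same info over and over", something like the cache below would do. This is only a Python sketch under my own assumptions, not anything Kodi has: `ScrapeCache` and the `fetch` callback are hypothetical, and a real version would need to persist results in the db rather than in memory.

```python
import time
from typing import Callable, Dict, Tuple


class ScrapeCache:
    """Remember scraped results per mbid so repeat scans skip the server."""

    def __init__(self, max_age_seconds: float,
                 clock: Callable[[], float] = time.time):
        self.max_age = max_age_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, mbid: str, fetch: Callable[[str], dict]) -> dict:
        """Return cached info for mbid, calling fetch only if missing or stale."""
        now = self.clock()
        cached = self._store.get(mbid)
        if cached is not None and now - cached[0] < self.max_age:
            return cached[1]
        info = fetch(mbid)  # the only place a server request would happen
        self._store[mbid] = (now, info)
        return info
```

The point is simply that a second library update within `max_age_seconds` costs no requests at all for artists and albums we already have.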

I believe that kind of sync of artist and album info with the "wisdom of crowds" data needs to be an elective rather than a default automatic process, i.e. the user can do so when and if they want, but it does not happen automatically.

One thing I would like to start doing is storing the scraped mbid for albums and artists that did not have them in tags. We need to flag them as scraped rather than from tags because they could be wrong, but it would be more efficient to use the value once we have it. We could also do with offering the user the disambiguation data, e.g. with 23 artists called "Eclipse", which one do they mean? At least when doing a manual scrape of a single artist the user has some chance of picking the correct one from the list they are shown.
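Roughly what I mean, as a Python sketch. None of this is the real db schema: `ArtistRecord`, `mbid_from_scraper`, `Candidate` and the helper functions are hypothetical names for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArtistRecord:
    """Hypothetical library row: an mbid plus a note of where it came from."""
    name: str
    mbid: Optional[str] = None
    mbid_from_scraper: bool = False  # True means scraped, so less trusted


def apply_tag_mbid(record: ArtistRecord, mbid: str) -> None:
    """An mbid read from file tags always wins and clears the scraped flag."""
    record.mbid = mbid
    record.mbid_from_scraper = False


def apply_scraped_mbid(record: ArtistRecord, mbid: str) -> bool:
    """Store a scraped mbid unless a tag-derived one exists. True if stored."""
    if record.mbid and not record.mbid_from_scraper:
        return False
    record.mbid = mbid
    record.mbid_from_scraper = True
    return True


@dataclass
class Candidate:
    """One search hit offered to the user during a manual scrape."""
    mbid: str
    name: str
    disambiguation: str = ""


def describe_candidates(candidates: List[Candidate]) -> List[str]:
    """Build list labels so same-named artists can be told apart."""
    return [f"{c.name} ({c.disambiguation})" if c.disambiguation else c.name
            for c in candidates]
```

The flag lets a later rescan of tags replace a wrong scraped mbid, while the scraper never overwrites what the user put in their tags.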

