The logic and future of Music scrapers?
#30
Lots of points to keep up with!

Info from Musicbrainz
Obviously, if we are getting more than just the mbid from there, we can't drop the Musicbrainz lookup for those albums and artists where we already have the mbid from music file tags. But if the scraper settings are such that all we want from Musicbrainz is an ID, then it would be sensible to optimise that out and go straight to the other sources using that ID, as you suggested @ronie.
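
To make that optimisation concrete, here is a rough sketch of the decision only. It is not how the scrapers or Kodi core are written; `ScraperSettings`, `want_mb_fields` and `needs_musicbrainz_lookup` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScraperSettings:
    # Hypothetical: the set of fields the user wants taken from Musicbrainz
    # beyond the ID itself (e.g. "gender", "sortname", "disambiguation").
    want_mb_fields: frozenset = frozenset()


def needs_musicbrainz_lookup(tag_mbid: Optional[str], settings: ScraperSettings) -> bool:
    """Return True if a Musicbrainz call is still required for this item."""
    if not tag_mbid:
        # No mbid in the file tags: we have to look it up by name.
        return True
    # We already have the ID; only call Musicbrainz if more than the ID is wanted.
    return bool(settings.want_mb_fields)


if __name__ == "__main__":
    id_only = ScraperSettings()
    with_extras = ScraperSettings(frozenset({"gender", "sortname"}))
    print(needs_musicbrainz_lookup("some-mbid", id_only))      # lookup can be skipped
    print(needs_musicbrainz_lookup("some-mbid", with_extras))  # lookup still needed
    print(needs_musicbrainz_lookup(None, id_only))             # no tag mbid, lookup needed
```

The point of keeping it as a single predicate is that the call to Musicbrainz is skipped only when both conditions hold: the mbid came from tags, and nothing else is wanted from that source.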

Fetching artist gender and sortname would be useful for some v18 improvements I have in progress, along with disambiguation comment.

As you comment, they don't return genre, but to be honest genre is so subjective, and it also needs a different level of granularity depending on your collection. For example, if you have only a handful of heavy metal in your lib then "heavy metal" is enough, but if you have 1000 artists or albums then you probably want to break that into "Alt metal", "Funk metal", "Doom metal" etc. Also, artist genre is not fully integrated into the db: it is just added as a text string, not linked to the genre table, and not used in filtering for playlist rules.

Not sure what to do with the "tags" they return for albumname searches; they aren't always genres, more a hashtag comment. I am looking at a facility for users to give artists and albums custom properties, and these would fit there.

Quote: while testing things i found out there is a very popular skin helper addon in our repo that is also making a lot of calls to musicbrainz when you are scraping your music collection.
if you have this addon installed (it's a dependency of many skins) you're very likely to end up with many items that failed to scrape (due to throttling), as the addon is basically doubling up the number of calls to musicbrainz.
Oh dear! What is it fetching? Does it do it even when we have mbid from tags? Could we look at optimising that too?

Bugs/Questions from 1st Post
Quote:#1.
If the 'prefer online info' setting is disabled, we pass the artistname to the artist scraper.
If the setting is enabled, we pass the artist mbid to the scraper.
Why don't we always pass the mbid (if available) regardless of this setting?
Not sure why you pointed at the code that you did; I think you may have misread it? That code is about what we do with the data we have scraped: (correctly) managing which data derived from tags gets overwritten.
When calling the scraper here https://github.com/xbmc/xbmc/blob/99c25f....cpp#L1097
and https://github.com/xbmc/xbmc/blob/99c25f....cpp#L1319
we use the mbid from tags when we have it.

Can #1 go?

#2. Will check out your PR.

#3. There are other variations on the point you make here. Scraping can make a mess, even delete artists but leave their names in the song artist desc. I have been looking at reworking this in relation to also storing the mbids that we scrape. They need to be flagged as coming from online rather than from embedded tags, because we also need a mechanism to replace them if inaccurate (the assumption is always that embedded mbid tags are correct). I will pick this up.
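
As a rough illustration of the flagging idea in #3 (not the actual db schema or core code; `MbidSource`, `StoredMbid` and `merge_mbid` are hypothetical names), the rule could look something like this: an mbid from embedded tags is always trusted, while a scraped one can be replaced later.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MbidSource(Enum):
    TAG = "tag"        # read from embedded music file tags, assumed correct
    ONLINE = "online"  # obtained by scraping, may be inaccurate


@dataclass
class StoredMbid:
    value: str
    source: MbidSource


def merge_mbid(current: Optional[StoredMbid], incoming: StoredMbid) -> StoredMbid:
    """Decide which mbid to keep when a new one arrives."""
    if current is None:
        return incoming
    # A scraped mbid never overwrites one that came from tags.
    if current.source is MbidSource.TAG and incoming.source is MbidSource.ONLINE:
        return current
    # Otherwise the newer value wins: tags replace anything,
    # and a fresh scrape can replace an earlier scraped value.
    return incoming


if __name__ == "__main__":
    from_tag = StoredMbid("tag-id", MbidSource.TAG)
    scraped = StoredMbid("scraped-id", MbidSource.ONLINE)
    print(merge_mbid(from_tag, scraped).value)  # tag-id
    print(merge_mbid(scraped, from_tag).value)  # tag-id
```

Storing the source alongside the ID is what makes the "replace if inaccurate" mechanism possible without ever touching IDs that came from the files themselves.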


Messages In This Thread
RE: The logic of Music scrapers? - by ronie - 2017-02-08, 00:14
RE: The logic of Music scrapers? - by jjd-uk - 2017-02-08, 11:23
RE: The logic of Music scrapers? - by ronie - 2017-02-13, 03:04
RE: The logic and future of Music scrapers? - by DaveBlake - 2017-02-15, 16:12
RE: The logic of Music scrapers? - by ronie - 2017-02-13, 03:12
RE: The logic of Music scrapers? - by ronie - 2017-02-13, 03:28