2016-02-17, 16:57
I'm looking at changes to the music library to ensure that all songs, and albums, have at least one artist. The idea came up in the discussion of PR9081.
At the moment if you scan a music file without an ARTIST/TPE1 tag into the library then a record is created in the song table but not a song_artist entry. The result is songs without any artist. Many users will not have this at all, but older and larger libraries do tend to have at least a few like it. This brings at least 2 issues:
1) All queries wanting song or album and their artist(s) have to be left joins. Unfortunately there is a weakness in the SQLite optimiser that means that left joins onto views are not optimised and run horribly slowly. One workaround is to replace all the views in left joins with the explicit fields and tables, but that is not as easy to maintain or read.
2) Songs without artists are hidden in the UI. They do not get listed in a Genre -> Artist -> Album -> song approach. Only the songs node, listing all the songs, would show them.
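To illustrate both issues, here is a minimal sketch in Python using the standard sqlite3 module. The schema is heavily simplified: the song and song_artist tables and song.strArtists come from the description above, but the other column names (idSong, idArtist, strTitle, strArtist) and the sample rows are assumptions made up for the example, not the real Kodi schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory library with one tagged and one untagged song."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE artist (idArtist INTEGER PRIMARY KEY, strArtist TEXT);
        CREATE TABLE song (idSong INTEGER PRIMARY KEY, strTitle TEXT, strArtists TEXT);
        CREATE TABLE song_artist (idArtist INTEGER, idSong INTEGER);
        INSERT INTO artist VALUES (1, 'Artist A');
        INSERT INTO song VALUES (1, 'Track A', 'Artist A');
        -- Scanned without an ARTIST/TPE1 tag: a song row but no song_artist row.
        INSERT INTO song VALUES (2, 'Untagged Track', '');
        INSERT INTO song_artist VALUES (1, 1);
    """)
    return conn


def songs_with_artists(conn, join):
    """List (title, artist) pairs; join must be 'JOIN' or 'LEFT JOIN'."""
    if join not in ("JOIN", "LEFT JOIN"):
        raise ValueError("join must be 'JOIN' or 'LEFT JOIN'")
    sql = f"""
        SELECT song.strTitle, artist.strArtist
        FROM song
        {join} song_artist ON song_artist.idSong = song.idSong
        {join} artist ON artist.idArtist = song_artist.idArtist
        ORDER BY song.idSong
    """
    return conn.execute(sql).fetchall()
```

With this data an inner join returns only the tagged song, which is the same effect as the untagged song vanishing from artist-based navigation. The left join returns both, with None as the artist of the untagged one.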
From a data viewpoint it makes a lot of sense for Kodi to collect all songs without an artist tag under a single "unknown" artist. This would mean that we can use more efficient inner joins in queries, and also that the "unknown" artist would appear in the artist node and reveal those previously lost songs to the user.
On music database update from older versions, existing songs without artists will need to be updated (adding song_artist entries). In testing I have learned that this needs temp indices on the tables or it is unbelievably slow.
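Roughly, the update step could look like the sketch below, again in Python with sqlite3 against a simplified schema. The function name link_orphan_songs, the index name idxTmpSongArtist and the column names are hypothetical, and a single index on song_artist.idSong is my assumption about what the lookup needs. The real update may need different or additional temp indices.

```python
import sqlite3


def link_orphan_songs(conn, unknown_artist_id):
    """Add a song_artist row for every song that has none; return how many were added."""
    # Temporary index so the NOT EXISTS lookup below does not scan
    # song_artist once per song.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idxTmpSongArtist ON song_artist (idSong)"
    )
    before = conn.total_changes
    conn.execute(
        """
        INSERT INTO song_artist (idArtist, idSong)
        SELECT ?, song.idSong
        FROM song
        WHERE NOT EXISTS (
            SELECT 1 FROM song_artist WHERE song_artist.idSong = song.idSong
        )
        """,
        (unknown_artist_id,),
    )
    added = conn.total_changes - before
    # The index was only needed for the migration, so drop it again.
    conn.execute("DROP INDEX idxTmpSongArtist")
    conn.commit()
    return added
```

Running it a second time adds nothing, because every song then has at least one song_artist row.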
It also happens that MusicBrainz has an "[unknown]" musicbrainzartistid entry, so some user libraries could already have an "unknown" artist if their music files were tagged with that. It makes sense to me that Kodi treats both the untagged songs and those tagged as "unknown" in the same way internally.
That a song did not have an artist tag is still recorded in the database by the song.strArtists field being empty. This also means that when the song is listed or played you do not see an unwanted "unknown" in place of the artist name, but a blank as you do currently. This is particularly important if the artist name has been included in the song title (a historic practice).
With "various artists" the entry is based only on name, so you can end up with multiple artist entries in different languages, depending on your language settings when you scanned. By contrast, I propose that we ensure there is only ever one "unknown" artist entry. This uniqueness can be provided internally by using the musicbrainzid.
The internationalisation of the "unknown" artist then becomes the same as that for all artist names. If you tag a music file with the "[unknown]" musicbrainzartistid and an artist/TPE1 tag of "onbekend" or "?", and that is the last file scanned with that mbid, then "onbekend" etc. is what appears in the artist node and becomes your unknown artist label.
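A minimal sketch of that "one entry per mbid, last scanned name wins" behaviour, in Python with sqlite3. The function name upsert_artist_by_mbid and the column names are hypothetical, and the UNKNOWN_MBID value is, as far as I know, the ID MusicBrainz uses for "[unknown]", so please check it before relying on it.

```python
import sqlite3

# Assumed to be the MusicBrainz artist ID for "[unknown]"; verify before use.
UNKNOWN_MBID = "125ec42a-7229-4250-afc5-e057484327fe"


def upsert_artist_by_mbid(conn, mbid, name):
    """Return the artist id for mbid, creating the row or updating its name."""
    row = conn.execute(
        "SELECT idArtist FROM artist WHERE strMusicBrainzArtistID = ?",
        (mbid,),
    ).fetchone()
    if row is None:
        cur = conn.execute(
            "INSERT INTO artist (strArtist, strMusicBrainzArtistID) VALUES (?, ?)",
            (name, mbid),
        )
        return cur.lastrowid
    # Same mbid seen again: keep the single row, let the latest name win.
    conn.execute(
        "UPDATE artist SET strArtist = ? WHERE idArtist = ?",
        (name, row[0]),
    )
    return row[0]
```

Scanning one file tagged "[unknown]" and then another tagged "onbekend", both with that mbid, leaves a single artist row named "onbekend". Untagged songs would be linked to the same row, with whatever initial name is chosen.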
The only question is what name is used for "unknown" initially?
So what have I overlooked?