HeresJohnny · (This post was last modified: 2018-10-12, 20:59 by HeresJohnny.)

I have a bit of a mystery here which I hope you can help me solve. Basically, I use two different Kodi installations to scrape my files. One runs CoreELEC (codebase 2018-10-04 at the moment); that is the box connected to my home theatre. The other is the current Leia nightly for Windows x64, installed on my notebook for couch surfing.

Both installations use a MySQL connection, use the same sources through NFS (I even copied the sources.xml over), and use the same add-ons, information providers/scrapers and settings. And yet, when scraping is done on one machine and I then start scraping on the other, all albums are read again as if they had not been scraped before and are added afresh to "recently added". On the other hand, if I scrape twice on the same machine, the scraper flies through all the files on the second run because it has already processed them.

I am scratching my head over how this could be possible. Of course, this makes using MySQL kinda moot since there is no true syncing. The data is replicated if I don't scrape again, yes, but somehow the files are not being recognized as being the same.

Is there some kind of difference in how NFS shares are accessed by Windows and Linux, or in how hashes are calculated? Frankly, I am at a loss; any ideas are welcome.

Edit: My NFS server software is Hanewin NFS Server 1.2.27

BatterPudding · 2018-10-13, 13:27

Thinking sideways - are both computers in the same Time Zone with synchronised clocks?

HeresJohnny · 2018-10-13, 19:50

I think I am getting closer to the solution. Although I've configured the Hanewin NFS server software to use the UTF8 codepage, there still seem to be differences in how the files are seen by Linux and by Windows. It's really astonishing how many non-ascii characters creep into a large music collection. There are your German umlauts, your Spanish, Portuguese and Swedish quirks and of course the much hated "literary" quotes used by musicbrainz.

I've taken it upon myself to "normalize" diacritics across all filenames, not a small undertaking. Thereby I hope to diminish this problem. I'll report back with my findings.
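For anyone attempting the same clean-up, here is a minimal Python sketch of one way the normalizing could be done. It is not what Kodi or any scraper does internally. The function name `simplify_name` and the replacement table are made up for illustration, and the table only covers a handful of quote and dash characters. Letters without a Unicode decomposition (ß, ø and the like) pass through unchanged.

```python
import unicodedata

# Hypothetical replacement table for typographic punctuation (not exhaustive).
# Curly double quotes are mapped to a single quote because a straight
# double quote is not allowed in Windows file names.
PUNCTUATION = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": "'",  # left double quotation mark
    "\u201d": "'",  # right double quotation mark
    "\u2010": "-",  # hyphen
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})


def simplify_name(name: str) -> str:
    """Replace typographic punctuation and strip combining accents."""
    name = name.translate(PUNCTUATION)
    # NFKD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks and keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(simplify_name("Mot\u00f6rhead \u2013 \u201cAce of Spades\u201d"))
```

Renaming the actual files (for example with `os.rename`) is deliberately left out; a dry run that only prints old and new names seems the safer first step.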

BatterPudding · 2018-10-14, 13:53

(2018-10-13, 19:50)HeresJohnny Wrote: and of course the much hated "literary" quotes used by musicbrainz.

ARGH!! And those pesky hyphens of MusicBrainz! The first time they appeared in my folder names I could not work out why I had two folders side by side both called "Nu-clear Sounds". Those pesky dashes look identical.

I rip with EAC and that has the ability to do some character swaps. Annoyingly Picard is useless for this as it wants to take all the umlauted characters out too.
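A quick way to tell such look-alike folder names apart is to list the code point of every non-ASCII character in them. A minimal Python sketch follows; the helper name `describe_non_ascii` is made up for illustration, and the example assumes the second folder used U+2010 HYPHEN, which is only a guess at what the tagger wrote.

```python
import unicodedata


def describe_non_ascii(name: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each non-ASCII character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in name
        if ord(ch) > 127
    ]


# The first name uses the plain ASCII hyphen-minus, the second U+2010.
print(describe_non_ascii("Nu-clear Sounds"))
print(describe_non_ascii("Nu\u2010clear Sounds"))
```

The first call reports nothing, the second reports the offending character, so the two folders are no longer indistinguishable.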

docwra · 2018-10-18, 13:06

Yep this is a well known problem in scraping, that specifically relates to the policies of MusicBrainz. Kodi is doing what it should.

Just a tip, but if you use theaudiodb scraper we normalise all those silly quotation marks and hyphens ;)

HeresJohnny · (This post was last modified: 2018-10-18, 20:26 by HeresJohnny.)

Anyway, coming back to the original problem. It is definitely related to diacritics and other non-ascii characters, even though UTF8 is set as the character set. It might be a problem of the specific combination Hanewin NFS server/Kodi NFS, but I lack the info on how this would play out with another kind of NFS server.

Removing all non-ascii characters from the filenames definitely sped things up and helped some, but the problem with files being re-scraped on ?Elec and Windows Kodi respectively persists. So it's also the non-ascii characters which are saved as tags. Now, I wonder... is the representation of UTF-8 characters different in Windows Kodi and Linux-based Kodi? There are certainly several ways to programmatically write the same UTF-8 character. Maybe the difference is such that objects/paths are not recognized as already existing? Someone smarter than me will have to look into that, though.
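To illustrate what "several ways to write the same character" can mean: Unicode allows an accented letter to be stored either precomposed (NFC) or as a base letter plus a combining mark (NFD). The two look identical on screen but differ byte for byte, so a plain string or path comparison treats them as different. The Python sketch below only demonstrates that general effect. Whether Kodi, the NFS client or Hanewin actually produce different forms here is an assumption I have not verified.

```python
import unicodedata

composed = "\u00f6"      # "ö" as one precomposed code point (NFC form)
decomposed = "o\u0308"   # "o" followed by a combining diaeresis (NFD form)

# Both render the same, yet they are different strings and different bytes.
print(composed == decomposed)
print(composed.encode("utf-8"), decomposed.encode("utf-8"))


def same_text(a: str, b: str) -> bool:
    """Compare two strings after bringing both to NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_text(composed, decomposed))
```

If the two installations stored paths or tags in different forms, a lookup by exact path would miss the existing entry, which would match the re-scraping seen above.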