Kodi Community Forum

Full Version: XBMC UTF-8 problems
Hi all!

I don't know how to solve this, so that's why I'm starting this thread.

XBMC imports my music with no visible problems, but when I use Yatse, and synchronize my music, there is a problem:

Code:
2013-02-13 14:29:02.865 Error/Thumbnail Parse error : null: [email protected]: Invalid UTF-8 middle byte 0x25

This happens on several files. I don't know how many, since I removed the first album that had a failing song, rescanned, and then Yatse reported a UTF-8 error on another song, and so on.
I have no desire to do this on 500 albums.

I have cleaned the music db, rescanned, and the problem is still there. I have deleted the music.db, rescanned, and the problem is still there.

All media is located on an external USB disk (Drobo, formatted with HFS+), connected to the Mac Mini which XBMC runs on.

I've looked at the tags in the songs complained about, and they looked ok. At least no visible strange characters, and I don't know how to find out if they are Latin1 or UTF-8.
99% of my songs are in m4a format (alac) converted from flac.

Any help please?
First step is finding out exactly what the error is. Given it's a "Thumbnail Parse error" that might suggest an issue with image URLs or some such.

Figure out exactly which string is failing first.

Note that it may depend somewhat on when you scanned your library - at some point there were some UTF-8 import issues with tags (basically they weren't converted to UTF-8 correctly). So one option may be to completely remove your music database and rescan from scratch.
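To narrow down which string is failing, one option is to save the raw bytes of the response and look for the first position that does not decode as UTF-8. The following is only a sketch: `find_invalid_utf8` and the sample bytes are made up for illustration and are not part of XBMC or Yatse.

```python
def find_invalid_utf8(raw: bytes, context: int = 20):
    """Return (offset, bad_byte, surrounding_bytes) for the first invalid
    UTF-8 sequence in raw, or None if raw decodes cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        start = max(0, err.start - context)
        end = min(len(raw), err.start + context)
        return err.start, raw[err.start], raw[start:end]
    return None


# Hypothetical sample: a lone 0xE2 lead byte followed by plain ASCII.
sample = b'{"thumbnail": "image://(\xe2%84%a2 MBK).MP3/"}'
hit = find_invalid_utf8(sample)
if hit is not None:
    offset, bad, around = hit
    print(f"invalid byte 0x{bad:02x} at offset {offset}: {around!r}")
```

The bytes printed around the offset usually make it obvious which field (a title, a path, a thumbnail URL) contains the bad sequence.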
As I wrote:
Quote:I have cleaned the music db, rescanned, and the problem is still there. I have deleted the music.db, rescanned, and the problem is still there.

I think it was a combination of release .nfo files and characters such as , ' # ? ! / in filenames.
After removing all .nfos, and renaming all folder/filenames containing those (took like 3 hours), no more errors.

It's a shame that UTF8 bugs won't get fixed in XBMC. At least it seems that way :/
We don't know of any and can't fix things we can't reproduce.
But it's XBMC sending "shit" that makes the parser in Yatse crash.

To quote the Yatse developer:
Quote:JSON is a strict standard that says UTF-8. XBMC does not respect the standard, so a high-speed parser can't recover from that.
If the error is on the last char of a string, the parser can't catch the end of the string and then can't continue the parsing :)
The problem is clearly on the XBMC side, but they choose not to handle it on the right side.
Only the sender should skip sending bad data that the client can't recover from.
XBMC is sending information from its utf-8 database. Thus, the shit is what is fed into said database. While we could check for utf8 every single time we sent the same string, that wouldn't really be efficient.

Thus, we check on insertion into the database. However, in some cases we miss it. Unless we know about it and can reproduce it, we cannot fix it.
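As a rough picture of what a check on insertion can look like, here is a minimal sketch. It is not XBMC's actual code: `to_utf8_text` is a hypothetical helper, and the Latin-1 fallback is an assumption about where the non-UTF-8 tags come from.

```python
def to_utf8_text(raw: bytes) -> str:
    """Decode tag bytes as UTF-8, falling back to Latin-1 if that fails.

    Assumption: anything that is not valid UTF-8 is Latin-1. Latin-1 maps
    every byte to a character, so the fallback never raises, but it can
    guess wrong for other legacy encodings.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


# The same artist name stored in two different encodings.
print(to_utf8_text("Beyoncé".encode("utf-8")))
print(to_utf8_text("Beyoncé".encode("latin-1")))
```

Doing this once per string at import time means the stored text is always valid, so nothing needs re-checking when it is sent. It only helps for strings that actually pass through the check, which is the "in some cases we miss it" part.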
Precisely, "..in some cases we miss it". I would prefer 100% UTF8 on import, so these problems wouldn't occur.. even if it would take a longer time to import.
After said import you wouldn't have to check "every single time we sent the same string", because every string would be 100% UTF8.

How about this ticket:
http://trac.xbmc.org/ticket/13878

I think that's the bug that bit me.
Sadly, the last post was 6 weeks ago. Is anything happening there? Or is that one of the cases that you can't reproduce?

It's perhaps a bug on OSX, as that's my OS too.
Correct, I can't reproduce. If I could it would be fixed already.
Come on, Willie,

post a few problematic songs with a problematic NFO somewhere, so the devs can test and reproduce.
I was testing my Android app and ran into an "Invalid UTF-8 middle byte 0x25" error when querying one of my test songs with JSON.

What would you like me to provide you to help you reproduce on your side?

I tried exporting the music database to a single file, but when I looked at the contents it looked empty.

This is the json output I get that shows the issue in the thumbnail:

{
  "id": "1",
  "jsonrpc": "2.0",
  "result": {
    "albums": [
      {
        "albumid": 1,
        "artist": [
          "Alicia Keys"
        ],
        "artistid": [
          12
        ],
        "genre": [
          "R&B"
        ],
        "label": "As I Am",
        "thumbnail": "image://[email protected]%2fVolumes%2fWAGNER%20HD%2fMusic%20Test%2f058%20-%20Alicia%20Keys%20-%20Like%20You%27ll%20Never%20See%20Me%20Again%20%20%20%5bTorrent%20Tatty%5d%20%20%20(�%84%a2%20MBK-J%20RMG).MP3/",
        "title": "As I Am"
      }
    ],
    "limits": {
      "end": 1,
      "start": 0,
      "total": 1
    }
  }
}
The filename looks like it has the issue in it, right? Thus, moving that file to a different folder and scanning that in should allow it to be reproduced. Does it? If so, zipping it up into a zip that's named with ascii chars should allow it (once extracted to a different folder) to reproduce. Does it? If so, provide the zip for us to reproduce.
(2013-03-15, 03:00) jmarshall wrote: The filename looks like it has the issue in it, right? Thus, moving that file to a different folder and scanning that in should allow it to be reproduced. Does it? If so, zipping it up into a zip that's named with ascii chars should allow it (once extracted to a different folder) to reproduce. Does it? If so, provide the zip for us to reproduce.

Yes. The issue is caused by the trademark symbol (™) in the filename.

I'll zip it up and send it to you.
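For anyone wondering where "middle byte 0x25" comes from: the trademark symbol is U+2122, which is three bytes in UTF-8 (E2 84 A2), and 0x25 is the ASCII code for "%". The thumbnail URL in the JSON output ends in an unreadable byte followed by %84%a2, which would be consistent with the first byte being sent raw while the other two were percent-encoded. That is a guess from the output, not something confirmed in XBMC's code. The sketch below reproduces that pattern; `first_bad_pair` and the `broken` bytes are made up for illustration.

```python
from urllib.parse import quote

tm = "™".encode("utf-8")
print(tm.hex())      # e284a2
print(quote("™"))    # %E2%84%A2 (fully escaped, plain ASCII)


def first_bad_pair(data: bytes):
    """Return (offending byte, byte after it) at the first UTF-8 decode
    error, or None if data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        nxt = data[err.start + 1] if err.start + 1 < len(data) else None
        return data[err.start], nxt
    return None


# Assumption: lead byte left raw, continuation bytes percent-encoded.
broken = b"(\xe2%84%a2 MBK-J RMG).MP3"
lead, following = first_bad_pair(broken)
print(f"0x{lead:02x} followed by 0x{following:02x}")  # 0xe2 followed by 0x25
```

A strict parser that reads 0xE2 expects the next byte to be a continuation byte (0x80 to 0xBF), finds "%" instead, and gives up, which matches the error message Yatse logs.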