Kodi Community Forum

Full Version: What is the encoding I must use for characters with accents
Hi,

French text contains accented characters (è, é, à, ö, î, etc.), and I need your help to display these characters in Kodi.

I parse an XML file to get a summary of my video. This is my XML file:
Quote:
    <movie>
        <title>Railroad Tigers (2018)</title>
        <genre>Action</genre>
        <link>Link</link>
        <thumb>2tCeUBBZbhD7FhoGv4GuNawHgkE.jpg</thumb>
        <description>En 1941, le Japon fait avancer la guerre jusqu’à l’Asie du Sud. La ligne de chemin de fer entre Tianjin et Nanjing devient stratégique. Le cheminot Ma Yuan dirige une équipe de résistants, mettant à profit leur connaissance du réseau ferroviaire pour faire dérailler les machines de guerre japonaises. Les Chinois nomment ces héros hors du commun les « Railroad Tigers ». Quand les forces japonaises envoient des renforts à Shandong, Ma Yuan se lance dans sa plus périlleuse mission : faire sauter un pont ultra-sécurisé, ce qui ralentirait considérablement la progression japonaise…</description>
    </movie>
As you can see, the description contains accented characters.

Here is my code to parse the title, link, description, etc. from the XML tags:

python:

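import xml.etree.ElementTree as ET

# Open in binary mode so the parser decodes the bytes itself,
# using the encoding declared in the XML prolog.
with open(xmlfile, "rb") as f:
    tree = ET.parse(f)
movies = tree.getroot()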

for node in movies.findall('movie'):
    title = node.find('title').text
    genre = node.find('genre').text
    link = node.find('link').text
    thumb = node.find('thumb').text
    description = node.find('description').text

The type of description is unicode.

To show the description in Kodi when I open the information dialog for my video, I use setInfo like this:

python:
li = xbmcgui.ListItem(stream['name'], iconImage=iconPath, thumbnailImage=iconPath)
li.setInfo('video', {'title': stream['name'],
                     'genre': stream['genre'],
                     'mediatype': "video",
                     'plot': stream['description']})

But when I launch Kodi, I get this result:

Image

Could you please help me display accented characters correctly in Kodi?
Unicode is just a standard that maps the symbols of various writing systems to numbers (code points), usually written in hexadecimal, such as U+00E9 for é. It does not tell you anything about the concrete binary representation of those symbols, which is what an encoding defines. ListItem.setInfo() accepts either Python Unicode strings (unicode type) or byte strings (str type) encoded in UTF-8. The biggest question here: what is the encoding of your initial XML files that you parse? Looks like it is not UTF-8. In that case you need to decode them to unicode first using the appropriate encoding (win-1252 maybe?).
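To make the difference between Unicode text and its encoded bytes concrete, here is a minimal sketch. The sample word is taken from the description in the question; the variable names are only for illustration. It shows that the same text gives different bytes in UTF-8 and in cp1252, and what happens when UTF-8 bytes are decoded with the wrong codec.

```python
# -*- coding: utf-8 -*-
# Illustrative only: one accented word, written with an escape so the
# snippet does not depend on how this source file itself is saved.
text = u"strat\u00e9gique"  # "stratégique" as a Unicode string

# The same text has different byte representations per encoding.
utf8_bytes = text.encode("utf-8")      # é becomes two bytes: 0xC3 0xA9
cp1252_bytes = text.encode("cp1252")   # é becomes one byte: 0xE9

# Decoding with the codec that was used for encoding round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")

# Decoding UTF-8 bytes as cp1252 does not fail, it silently produces
# two wrong characters (Ã and ©) in place of é.
garbled = utf8_bytes.decode("cp1252")

print(repr(utf8_bytes))
print(repr(cp1252_bytes))
print(repr(garbled))
```

If the text shown in Kodi has pairs like "Ã©" where "é" should be, that pattern usually points to UTF-8 bytes being read as a single-byte encoding such as cp1252.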
(2018-05-06, 14:32) Roman_V_M wrote: The biggest question here: what is the encoding of your initial XML files that you parse? Looks like it is not UTF-8. In that case you need to decode them to unicode first using appropriate encoding (win-1252 maybe?).
In my XML file, the declared encoding is:

xml:
<?xml version="1.0" encoding="cp1252"?>

I have changed it to:

xml:
<?xml version="1.0" encoding="utf-8"?>

Now everything works!

Thank you so much for your answer!

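For anyone who finds this thread later, here is a small sketch of why changing the declaration can fix the display. It assumes the file was actually saved as UTF-8 while declaring cp1252, which is a guess based on the result above. The XML fragment is a shortened, made-up version of the one in the first post.

```python
import xml.etree.ElementTree as ET

# Assumption: the file content is really UTF-8 on disk.
body = u"<movie><description>strat\u00e9gique</description></movie>".encode("utf-8")

# Same bytes, two different encoding declarations in the prolog.
declared_cp1252 = b'<?xml version="1.0" encoding="cp1252"?>' + body
declared_utf8 = b'<?xml version="1.0" encoding="utf-8"?>' + body

# The parser trusts the declaration when decoding the bytes.
garbled = ET.fromstring(declared_cp1252).find("description").text
correct = ET.fromstring(declared_utf8).find("description").text

print(repr(garbled))  # é arrives as two wrong characters
print(repr(correct))  # é arrives intact
```

The declaration has to match how the file is really saved. If the file had been saved as cp1252, the original declaration would have been the right one.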