2023-11-14, 05:16
Hello.
There is a bug in the way Metadata Editor writes the .nfo files.
In the XML declaration line, the encoding shows
Code:
<?xml version='1.0' encoding='UTF8'?>
This is not quite correct. The proper encoding name is 'UTF-8'. With UTF8 (no dash) in the declaration, Python's ElementTree.parse(filename) fails when the file actually contains non-ASCII characters, with the error
"not well-formed (invalid token): line x, column y"
This encoding error is documented in footnote 1 at the bottom of the page: https://docs.python.org/3/library/xml.etree.elementtree.html
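For anyone who wants to reproduce this outside Kodi, here is a minimal sketch. The sample title and the helper name parses_ok are made up for illustration; only the two declaration lines differ between the inputs.

```python
import xml.etree.ElementTree as ET


def parses_ok(xml_bytes):
    """Return True if ElementTree can parse the given bytes (hypothetical helper)."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Made-up .nfo content with one non-ASCII character in the title.
body = "<movie><title>Amélie</title></movie>".encode("utf-8")

bad = b"<?xml version='1.0' encoding='UTF8'?>\n" + body    # no dash
good = b"<?xml version='1.0' encoding='UTF-8'?>\n" + body  # standard name

print("UTF8 declaration parses: ", parses_ok(bad))
print("UTF-8 declaration parses:", parses_ok(good))
```

The bytes of the document body are identical in both cases, so any difference comes from the encoding name in the declaration.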
The fix is a one-liner, but it may also be problematic: it relies on the xml_declaration argument of ET.tostring(), which is only available in Python 3.8+. I know it will work for Kodi Nexus 20.x, but I am not sure about previous versions.
In the file "script.metadata.editor\resources\lib\nfo_updater.py" in method "write_file()" change the line (currently line 111):
Code:
content = ET.tostring(self.root, encoding='UTF8', method='xml').decode()
to:
Code:
content = ET.tostring(self.root, encoding='UTF-8', method='xml', xml_declaration=True).decode('UTF-8')
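If Python versions before 3.8 have to be supported, one possible fallback is to avoid the xml_declaration argument and add the declaration by hand when it is missing. This is only a sketch: to_xml_text is a hypothetical helper, not something that exists in the add-on.

```python
import xml.etree.ElementTree as ET

XML_DECLARATION = b"<?xml version='1.0' encoding='UTF-8'?>\n"


def to_xml_text(root):
    """Serialize root to a str that starts with a UTF-8 XML declaration."""
    data = ET.tostring(root, encoding="UTF-8", method="xml")
    # Only prepend the declaration if the serializer did not write one itself.
    if not data.startswith(b"<?xml"):
        data = XML_DECLARATION + data
    return data.decode("UTF-8")


# Made-up sample document.
root = ET.Element("movie")
ET.SubElement(root, "title").text = "Amélie"
print(to_xml_text(root))
```

Checking for an existing declaration instead of checking the Python version keeps the helper from writing the declaration twice.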
So why is xml_declaration required? Because if you just change the encoding to UTF-8, the XML declaration line is no longer written to the output .nfo. This is due to a "feature" of Python's ElementTree serializer: by default it omits the XML declaration when the encoding is US-ASCII, UTF-8 or unicode. And if the XML declaration line is not written, the encoding is left implicit, and I am not sure what ET.parse(file) then assumes (ascii, or maybe the system default codepage).
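The declaration behaviour can be checked with a few lines on Python 3.8+. The sample element is made up; the three calls differ only in the encoding name and the xml_declaration argument.

```python
import xml.etree.ElementTree as ET

# Made-up sample document.
root = ET.Element("movie")
ET.SubElement(root, "title").text = "Amélie"

# Non-standard name: a declaration is written, carrying the bad name.
as_utf8_nodash = ET.tostring(root, encoding="UTF8")
# Standard name, default settings: no declaration at all.
as_utf8 = ET.tostring(root, encoding="UTF-8")
# Standard name with the declaration requested explicitly (Python 3.8+).
as_utf8_decl = ET.tostring(root, encoding="UTF-8", xml_declaration=True)

for label, data in [
    ("UTF8", as_utf8_nodash),
    ("UTF-8", as_utf8),
    ("UTF-8 + xml_declaration", as_utf8_decl),
]:
    # Show only the first line of each result.
    print(label, "->", data.splitlines()[0])
```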
There is another possible fix, but xbmcvfs.File() would have to be enhanced to accept an encoding= parameter (e.g. xbmcvfs.File(filename, 'w', encoding="UTF-8")). I think the current file encoding is the system default (I am en-US and the .nfo output files show OEM-US as their encoding).
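To illustrate the idea of keeping the bytes on disk and the declaration in agreement, here is a sketch that uses Python's built-in open() as a stand-in, since xbmcvfs only exists inside Kodi. Whether xbmcvfs.File can be used the same way is an assumption I have not verified; write_nfo and the sample data are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_nfo(path, root):
    """Write root to path as UTF-8 bytes with a matching XML declaration."""
    data = ET.tostring(root, encoding="UTF-8", method="xml", xml_declaration=True)
    # Binary mode: the file holds exactly the bytes tostring() produced,
    # so the platform's default code page never gets involved.
    with open(path, "wb") as handle:
        handle.write(data)


# Made-up sample document written to a temporary directory.
root = ET.Element("movie")
ET.SubElement(root, "title").text = "Amélie"
path = os.path.join(tempfile.mkdtemp(), "movie.nfo")

write_nfo(path, root)
print(ET.parse(path).getroot().findtext("title"))
```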
Sorry for the long-winded explanation, but I hope this can be fixed.
Thank you!