Bug in xbmc.getLanguage() / Adding ISO 3166-1 Capabilities
#1
The following description refers to v21.

Problem Statement

Under certain conditions, the 'xbmc.getLanguage()' function returns incorrect values.

Details

The function 'xbmc.getLanguage()' can be used in an addon to obtain Kodi's current language configuration in one of three formats (ENGLISH_NAME, ISO_639_1 or ISO_639_2), each either with or without additional regional information, giving 6 formatting options in total.
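To see all six variants side by side, something like the following could be used. This is only a sketch: 'all_language_variants' and 'fake_xbmc' are names made up for illustration, and the stand-in module exists purely so the code runs outside Kodi. Inside an addon you would 'import xbmc' and pass the real module instead.

```python
from types import SimpleNamespace

FORMAT_NAMES = ("ENGLISH_NAME", "ISO_639_1", "ISO_639_2")


def all_language_variants(xbmc_module):
    """Call getLanguage() for every format, with and without the region."""
    variants = {}
    for format_name in FORMAT_NAMES:
        format_id = getattr(xbmc_module, format_name)
        for with_region in (False, True):
            variants[(format_name, with_region)] = xbmc_module.getLanguage(
                format_id, with_region
            )
    return variants


# Stand-in for the real module so the sketch runs outside Kodi.
# The constant values and the echoed return value are placeholders, not Kodi's.
fake_xbmc = SimpleNamespace(
    ENGLISH_NAME="name",
    ISO_639_1="639-1",
    ISO_639_2="639-2",
    getLanguage=lambda format_id, region: "{}|region={}".format(format_id, region),
)

for key, value in all_language_variants(fake_xbmc).items():
    print(key, value)
```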

ENGLISH_NAME (Unformatted string)

The language name returned comes from the 'name' property in the active language addon's addon.xml file.

If the region is also required, it is supplied from the 'name' property of the selected 'region' node in the language addon's langinfo.xml file.  {Issue #1}

ISO_639_1 (Two lower-case character ISO language code.  eg: en, de, fr, ja, etc)

The language code is derived by performing a lookup on the 'English Name' of the language that returns the required 2 character code.  {Issue #2}

If the region code is also required, it is obtained by using the 'locale' property for that region and performing a lookup against an ISO-639-1 table.  {Issue #3}

ISO_639_2 (Three lower-case character ISO language code.  eg: eng, deu [or ger], fra [or fre], jpn, etc)

The language code is derived by performing a lookup on the 'name' of the language that returns the required 3 character code.  {Issue #4}

If the region code is also required, it is obtained by using the 'locale' property for that region and performing a lookup against an ISO 639-2B table.  {Issue #5}

Issues

Issue #1

Although mostly cosmetic, the returned language/region string will look something like "English (Australian)-Australia (24h)" instead of "English-Australia".

Proposed solution: The 'English Name' for the language should be based on a lookup of the 'locale' property [already an ISO 639-1 2 character language code] of the 'language' node in the langinfo.xml file.

Issue #2

Using the name of the language as defined in the 'name' property of the language (as provided in the addon.xml) results in a mismatch when extra details are added to a language, eg: 'English' will match whereas 'English (New Zealand)' will not match.

Proposed solution: This option should return the 2 character ISO-639-1 language code that is already present in the langinfo.xml file.
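The mismatch in Issue #2 and the proposed fix can be illustrated with a small Python model. Everything here is assumed for illustration: the name table is a three-entry stand-in for Kodi's real lookup, and the langinfo.xml fragment is a minimal hand-written example containing only the 'locale' and 'name' properties described above.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a name -> ISO 639-1 lookup table (illustrative only).
NAME_TO_ISO_639_1 = {"english": "en", "german": "de", "french": "fr"}

# Hand-written fragment; a real langinfo.xml contains much more.
LANGINFO_XML = """
<language locale="en">
  <regions>
    <region name="New Zealand (12h)" locale="NZ"/>
  </regions>
</language>
"""


def code_from_name(language_name):
    """Current approach: look the code up from the addon's display name."""
    return NAME_TO_ISO_639_1.get(language_name.lower())


def code_from_langinfo(xml_text):
    """Proposed approach: read the code straight from the 'language' node."""
    return ET.fromstring(xml_text).get("locale")


print(code_from_name("English"))                # en
print(code_from_name("English (New Zealand)"))  # None, the name has no table entry
print(code_from_langinfo(LANGINFO_XML))         # en
```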

Issue #3

The 'locale' property for the region actually appears to be an 'ISO 3166-1 alpha-2' country code, because it contains two UPPER-CASE characters, unlike an ISO-639-1 language code, which is conventionally written in lower-case.

Sometimes a case-insensitive match almost works: for example, country code 'DE' (Germany) will match language code 'de' and return 'German' rather than 'Germany'.

Unfortunately, region code 'CA' (Canada) will match the language code 'ca' which is 'Catalan, Valencian' and return the 3 character code 'cat'.

Proposed solution:  A new lookup table needs to be introduced to return a country name from the provided 'ISO 3166-1 alpha-2' country code.
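The collision behind Issue #3 is easy to reproduce in a few lines of Python. The table below is a two-entry stand-in rather than Kodi's actual data, and 'lookup_language_case_insensitive' is a hypothetical name for the behaviour described above.

```python
# Two-entry stand-in for an ISO 639-1 table: code -> (ISO 639-2 code, English name).
LANGUAGES = {
    "de": ("deu", "German"),
    "ca": ("cat", "Catalan, Valencian"),
}


def lookup_language_case_insensitive(code):
    """Treat any two-letter code as a language code, ignoring case."""
    return LANGUAGES.get(code.lower())


# 'DE' is meant as the country Germany, but is resolved as the language German.
print(lookup_language_case_insensitive("DE"))
# 'CA' is meant as the country Canada, but is resolved as the language Catalan.
print(lookup_language_case_insensitive("CA"))
```

Because country codes and language codes share the same two-letter space, no amount of case handling can separate them; only a dedicated country table can.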

Issue #4

Similar to Issue #2.

Proposed solution:  This option should return the 3 character ISO-639-2 language code based on a lookup of the 2 character ISO-639-1 language code already present in the langinfo.xml file.

Issue #5

If a 3 character region code is required (to match the 3 character language code), then 'ISO 3166-1 alpha-3' would appear to be the most relevant code to use.

Proposed solution:  A lookup of an 'ISO 3166-1 alpha-3' country code based on the 'ISO 3166-1 alpha-2' country code would provide the desired result.

Findings

A number of functions already exist within Kodi to return some of the required values:

The existing function 'CLangInfo.GetLanguageCode()' already returns the 3 character ISO-639-2 language code ('eng').

The existing function 'CLangCodeExpander.ConvertToISO6391()' can be used to convert the 3 character ISO-639-2 language code to the 2 character ISO-639-1 language code ('eng' -> 'en').

The existing function 'CLangCodeExpander.Lookup()' can be used to obtain the language name (in English) from the 3 character ISO-639-2 language code ('eng' -> 'English').

Recommendation

Return values for the 'ENGLISH_NAME' formatting option should remain unchanged for backwards-compatibility reasons.  Addons may already exist that expect and compensate for the erroneous return values.

A new function needs to be created to perform ISO 3166-1 country code lookups.  Options are required to return the full name or the 3 character code from the 2 character code available from the region node in the langinfo.xml.

A new formatting option, ISO_NAME, should be introduced to return the language name and optionally the country name based on the ISO codes already present in the langinfo.xml file.

Conclusion

I have experimented with the proposed changes on a development fork and they appear to work as expected.

Part of these experiments involved writing a new function 'CLangCodeExpander.LookupISO31661()' that provides lookups between ISO 3166-1 2 character code, 3 character code and full name values.  ('IE' vs 'IRL' vs 'Ireland')
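For readers who want a feel for what such a lookup does, here is a rough Python model. It is not the code from the fork: the function name 'lookup_iso_3166_1', the 'Country' type and the five-row table are all assumptions made for illustration, and a real table would cover every ISO 3166-1 entry.

```python
from typing import NamedTuple, Optional


class Country(NamedTuple):
    alpha2: str
    alpha3: str
    name: str


# Small illustrative subset of ISO 3166-1.
_COUNTRIES = (
    Country("AU", "AUS", "Australia"),
    Country("CA", "CAN", "Canada"),
    Country("DE", "DEU", "Germany"),
    Country("IE", "IRL", "Ireland"),
    Country("NZ", "NZL", "New Zealand"),
)

# Index every row by alpha-2 code, alpha-3 code and upper-cased name.
_INDEX = {}
for _country in _COUNTRIES:
    for _key in _country:
        _INDEX[_key.upper()] = _country


def lookup_iso_3166_1(value: str) -> Optional[Country]:
    """Return the matching country for a 2 or 3 character code or a full name."""
    return _INDEX.get(value.strip().upper())


print(lookup_iso_3166_1("IE"))       # alpha-2 in, full record out
print(lookup_iso_3166_1("IRL"))      # alpha-3 in
print(lookup_iso_3166_1("Ireland"))  # name in
```

Keeping this index separate from the language tables is what prevents 'CA' from ever being resolved as Catalan.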

I would appreciate feedback on my findings/recommendations before finalising the changes and submitting them for inclusion in the next release.
#2
I have also seen problems with xbmc.getLanguage(), but in my testing the problem is in Kodi itself, and I am not sure these proposed fixes correct that.  When I set the language to English US in Kodi, switch to some other language and then switch back, Kodi's regional settings always go to AU 24h.

But I never really liked getLanguage anyway, because I think BCP 47 language codes are the real way forward.  There was/is a WIP PR, IIRC by fbacher, to implement the ICU language libraries, which, if implemented, might solve some of these problems as a side effect.

scott s.
#3
@scott967 - The problem with xbmc.getLanguage() is two-fold:  1) It is not using the appropriate functions that are already available; and 2) some of the required information and functions to return it do not actually exist within Kodi.

If implemented correctly, xbmc.getLanguage() is capable of returning what appears to be a BCP47 language code already.
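As a rough illustration of what I mean, a language code plus a region code already gives the simplest form of a BCP 47 tag. The helper name 'bcp47_tag' is made up, and the sketch only handles the language and region subtags; script, variant and extension subtags are out of scope here.

```python
def bcp47_tag(language_code, region_code=None):
    """Join an ISO 639-1 language code and an optional ISO 3166-1 alpha-2
    region code using the conventional BCP 47 casing (lower-case language,
    upper-case region)."""
    tag = language_code.strip().lower()
    if region_code:
        tag += "-" + region_code.strip().upper()
    return tag


print(bcp47_tag("en", "AU"))  # en-AU
print(bcp47_tag("de"))        # de
```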
#4
After a little more consideration, I am of the opinion that the solution should be implemented at the point where the language and region are loaded, with the appropriate values cached for other areas of Kodi to use, just as language/region-specific information such as date formats currently is.

'CLangInfo::Load' and 'CLangInfo::SetCurrentRegion' would be appropriate places to look up the correct ISO names and codes and then store them as properties of their respective objects.
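The idea of caching at load time can be modelled roughly as follows. This is Python rather than Kodi's C++, and the class 'LangInfoSketch', its attributes and the XML fragment are all invented for illustration; only the 'locale' and 'name' properties come from the langinfo.xml layout discussed earlier.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment; a real langinfo.xml contains much more.
LANGINFO_XML = """
<language locale="en">
  <regions>
    <region name="Australia (12h)" locale="AU"/>
    <region name="Australia (24h)" locale="AU"/>
  </regions>
</language>
"""


class LangInfoSketch:
    """Toy model: resolve the codes once, then serve them from the cache."""

    def __init__(self):
        self.language_code = None  # ISO 639-1, cached by load()
        self.region_code = None    # ISO 3166-1 alpha-2, cached by set_current_region()
        self._regions = {}

    def load(self, langinfo_xml):
        root = ET.fromstring(langinfo_xml)
        self.language_code = root.get("locale")
        self._regions = {
            region.get("name"): region.get("locale")
            for region in root.iter("region")
        }
        self.region_code = None  # no region selected yet

    def set_current_region(self, region_name):
        self.region_code = self._regions[region_name]


info = LangInfoSketch()
info.load(LANGINFO_XML)
info.set_current_region("Australia (24h)")
print(info.language_code, info.region_code)  # en AU
```

Any caller, including xbmc.getLanguage(), could then read the cached values instead of repeating name-based lookups.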