Kodi Community Forum

Full Version: Incomplete movie titles and genres
Hi! I'm using Universal Movie Scraper and have recently run into this problem:
Some movies have incomplete title or genre information.

The title ends with </title> and the genre ends with </genre>.

I checked the log and found that English data from TMDb is parsed correctly, while some Chinese data is not.

Code:
15:40:03 T:4677595136   DEBUG: CurlFile::Open(0x112c2a950) http://api.tmdb.org/3/movie/tt0896872?api_key=***omitted***&language=zh
15:40:04 T:4677595136   DEBUG: Get: Using "UTF-8" charset for "http://api.tmdb.org/3/movie/tt0896872?api_key=***omitted***&language=zh"
15:40:04 T:4677595136   DEBUG: scraper: ParseTMDBTitle returned <details><title>告密?/title></details>

*** irrelevant log content omitted ***

15:40:06 T:4677595136   DEBUG: scraper: GetTMDBLangGenresByIdChain returned <details><url function="ParseTMDBGenres" cache="tmdb-zh-tt0896872.json">http://api.tmdb.org/3/movie/tt0896872?api_key=***omitted***&language=zh</url></details>
15:40:06 T:4677595136   DEBUG: scraper: ParseTMDBGenres returned <details><genre>犯罪</genre><genre>剧?/genre><genre>惊悚</genre></details>

I believe the question marks in the data returned by the scraper are causing the problem: each one appears where the last Chinese character and the "<" of the closing tag should be.
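One way to confirm that the returned fragments are actually malformed, rather than just displayed oddly, is to run them through an XML parser. The following is a minimal sketch using Python's standard library; the helper name `is_well_formed` is made up for illustration, and the two sample strings are taken from the log lines above (the well-formed one is the genre line with the broken entry left out).

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the scraper fragment parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # e.g. a closing tag that lost its "<", so the element is never closed
        return False


# Intact fragment: every <genre> has a matching </genre>.
good = "<details><genre>犯罪</genre><genre>惊悚</genre></details>"

# Fragment as logged by ParseTMDBTitle: "?/title>" instead of "</title>".
bad = "<details><title>告密?/title></details>"

if __name__ == "__main__":
    for sample in (good, bad):
        print(is_well_formed(sample), sample)
```

The second fragment fails because the "<" of the closing tag is missing, so the parser reaches </details> while <title> is still open.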

I manually downloaded the JSON file from the same URL:
Code:
http://api.tmdb.org/3/movie/tt0896872?api_key=***omitted***&language=zh
The file is correctly encoded and can be successfully parsed by various JSON viewers.
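For anyone who wants to repeat this check without a JSON viewer, here is a rough sketch of the same test in Python: decode the downloaded bytes strictly as UTF-8, then parse them as JSON. The function name `parses_as_utf8_json` and the sample payload are assumptions made for illustration; the payload is not the real TMDb response.

```python
import json


def parses_as_utf8_json(raw: bytes) -> bool:
    """Return True if raw is valid UTF-8 and valid JSON (hypothetical helper)."""
    try:
        text = raw.decode("utf-8")  # strict: raises on any invalid byte sequence
        json.loads(text)
        return True
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False


# Illustrative payload only; the field names are assumptions, not the real response.
sample = '{"title": "犯罪", "genres": [{"name": "惊悚"}]}'.encode("utf-8")

# A three-byte character cut off after two bytes is not valid UTF-8.
broken = "犯".encode("utf-8")[:2]

if __name__ == "__main__":
    print(parses_as_utf8_json(sample))
    print(parses_as_utf8_json(broken))
```

If the saved file passes a check like this, the bytes served by the API are fine and the damage happens later, somewhere between download and the scraper's output.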

So I assume there is a bug in how Universal Movie Scraper handles Chinese characters.
Please look into this issue.
I have asked our charset/regexp guy to take a look.
Thanks.
BTW I'm running XBMC 13.1 on OS X 10.9.4.
whatUwant, could you provide a link to a full debug log?
I installed 13.2 beta 3 yesterday, erased the old media library, and started scraping again. So far everything seems normal.