
michelb2 · (This post was last modified: 2022-05-20, 17:16 by michelb2.)

When I scrape
"evasion (2013) [tt1211956].mkv", the result is "evasion fiscale : le hold up du siecle" (wrong answer).
"évasion (2013) [tt1211956].mkv", the result is "evasion" (original title: Escape Plan, in English) (right answer).

So I have 2 questions:
- Is it normal that (e)vasion and (é)vasion give 2 different results?
- Is it possible to tell the scraper to search only with the IMDb number when it exists in the file name (whatever the rest of the name is)?

Of course, I am French and my Kodi settings are "fr" everywhere.
By the way, the TVDB v4 (movies) found the right title with both evasion and évasion.

This is a log from Kodi 2.0, but the result is the same with 1.94.
All the mkv files are 0 bytes, not real movies (just the title).
https://pastebin.com/Sk00qafg

**Klojum** · 2022-05-20, 17:18

1) Files named "évasion" and "evasion" are considered two different files in Kodi, in Linux and most likely also in Windows file storage. It is not uncommon for the scraper to react differently to diacritic characters as well.

2) I don't think the scraper is up to that level of artificial intelligence already, but others probably have more info on that.
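
To illustrate point 1, here is a minimal Python sketch. It is not Kodi or scraper code, and `strip_diacritics` is a hypothetical helper written for this example. It only shows that the two names are different strings unless something explicitly folds the accent; whether a given search backend does that is up to the backend.

```python
import unicodedata

def strip_diacritics(text):
    """Hypothetical helper: return text with combining accents removed."""
    # NFD splits "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

plain = "evasion"
accented = "évasion"

# An exact comparison treats the two names as different entries.
print(plain == accented)                    # False
# They only compare equal after the accent is folded explicitly.
print(plain == strip_diacritics(accented))  # True
```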

**Karellen** · 2022-05-20, 20:17

(2022-05-20, 16:43)michelb2 Wrote: - Is it normal that (e)vasion and (é)vasion give 2 different results?

Yes it is possible. If the scraper site has two movies, each having one of those titles, then both movies will be found in the search.
You can test this yourself: search for both titles at TMDB with y:2013 appended (évasion y:2013 and evasion y:2013) and you will see which movies are found.
Same list of movies, but in a different order. So it does make a difference.

(2022-05-20, 16:43)michelb2 Wrote: - Is it possible to tell the scraper to search only with the IMDb number when it exists in the file name (whatever the rest of the name is)?

Yes, but you need to do it manually. See section 3... https://kodi.wiki/view/Add-on:The_Movie_...hon#Search

You can also tell the scraper to only search in French by changing the Search language. Change this on your Source.

michelb2 · (This post was last modified: 2022-05-20, 22:29 by michelb2.)

Thanks for the answer.
First, as I already said, all my Kodi settings are set to French, including the search language of the scraper.

A) I already searched TMDB with evasion y:2013 and évasion y:2013, and yes, I get several results in a different order.

evasion y:2013 =>
1 Evasion fiscale - Le hold-up du siècle 2013
2 La Grande Évasion 1963
3 Évasion 2013

évasion y:2013 =>
1 La Grande Évasion 1963
2 Evasion fiscale - Le hold-up du siècle 2013
3 Évasion 2013
So the choice made by TMDB Python is not consistent with the order of the TMDB results.

It seems more natural for the scraper to choose the exact name first, then the date, and only last a partial name with a different date.

B) I tried the manual search and saw that tt1211956 gives the right result,
but my question was:
is it possible to automate the process (in advancedsettings.xml with <cleandatetime> and/or <cleanstrings>),
or directly in the code of the scraper: metadata.themoviedb.org.python\scraper.py?

PS
I just looked in addons\metadata.themoviedb.org.python\python\lib\tmdbscraper\tmdb.py
and it seems to me that this function (line 134 and following)

def _parse_media_id(title):
    if title.startswith('tt') and title[2:].isdigit():
        return {'type': 'imdb', 'id': title}  # IMDB ID works alone because it is clear
    .....

is used to check whether an IMDb or TMDB number exists in the title, BUT only if it is the whole file name (as you can see in the manual search).
My knowledge of Python is not good enough to change the code, but all you have to do is return the type and the IMDb number as soon as "tt\d+" exists in the file name.
Can someone help me?

**Karellen** · 2022-05-20, 22:20

(2022-05-20, 22:05)michelb2 Wrote: is it possible to automate the process (in advancedsettings.xml with <cleandatetime> and/or <cleanstrings>),
or directly in the code of the scraper: metadata.themoviedb.org.python\scraper.py?

The scraper sends a request to the API, the API returns the list. If they are in the wrong order, then it is a problem with the API. It is best reported to TMDB.
I don't see how the advancedsettings.xml could change which movie is selected.
The request that is sent is correct; it's the results that TMDB returns that seem to be wrong. That is generally speaking. You did not enable Debug mode in your log, so I can't confirm the search request.

Now that I look at the entry at TMDB, I am surprised the movie is found with the way you have named it. There is no entry for evasion or évasion so it is a mystery to me how it is found.
I imagine that if you add the French name in the list, the scraper will find the movie correctly.

michelb2 · (This post was last modified: 2022-05-20, 22:33 by michelb2.)

Sorry,
I edited my previous post while you were answering me.

If I can upload images to pastebin, I can show you my TMDB results if you don't believe me.

**Karellen** · 2022-05-20, 22:36

(2022-05-20, 22:31)michelb2 Wrote: If I can upload images to pastebin

Button 18

michelb2 · 2022-05-20, 23:16

**Karellen** · 2022-05-20, 23:29

@michelb2

I already know what your images show. I told you about those differences in one of my above posts, and it confirms the fix needs to come from TMDB.

As for editing the scraper code, @rmrector would need to advise.

**rmrector** · 2022-05-22, 19:18

Ya, the scraper code can likely be changed to match the IMDB number from anywhere in the string - with a precise enough pattern we're unlikely to pull up false positives, like a movie with 'tt12345' legitimately in the title.

For this filename pattern specifically you'd also need to change "cleandatetime" to not match the year in the middle - that happens in Kodi before the name is passed to scrapers.
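
As a rough sketch of what a "precise enough pattern" could look like (not the scraper's actual code): `find_imdb_id` is a hypothetical name, and the 7-to-8-digit length is an assumption about IMDb IDs. Only the returned dict shape is taken from the `_parse_media_id` snippet quoted earlier in this thread.

```python
import re

# Assumption: an IMDb title ID is "tt" followed by 7 or 8 digits.
# The word boundaries keep it from matching inside a longer word or number.
IMDB_ID = re.compile(r"\btt\d{7,8}\b")

def find_imdb_id(title):
    """Hypothetical helper: return an IMDb lookup dict if the title contains an ID."""
    match = IMDB_ID.search(title)
    if match:
        return {'type': 'imdb', 'id': match.group()}
    return None

print(find_imdb_id("nimporteqoui [tt6063090]"))  # {'type': 'imdb', 'id': 'tt6063090'}
print(find_imdb_id("tt12345"))                   # None (too short to count as an ID)
```

The stricter length check is what keeps a title that legitimately contains something like 'tt12345' from being treated as an ID.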

michelb2 · (This post was last modified: 2022-05-22, 21:11 by michelb2.)

Thanks for the answer.
Could you tell me exactly where to make the change?
I spent a lot of time this weekend searching in
metadata.themoviedb.org.python\python\scraper.py,
metadata.themoviedb.org.python\python\lib\tmdbscraper\tmdb.py
but did not find where to act.
Everywhere I find the title in the functions, but never the file name.
It seems to me that I have to act earlier, but I don't know where.

I don't understand why you want to change "cleandatetime": if you can replace "name_of_file_or_not (date) [tt123465].extension" with "tt123465.extension" before applying the search, the result should be correct?
And if you clean the date, you can miss the right movie when there is no tt number in the file name.

michelb2 · (This post was last modified: 2022-05-28, 12:01 by Karellen.)

@rmrector any hint ?

**rmrector** · 2022-05-28, 16:16

Filename cleaning is done by Kodi before sending the title and year to the scraper. Scrapers have no access to the filename.

To make this work with your particular naming scheme, you would have to edit the line in the scraper that you have identified, and also change the cleandatetime as noted.

michelb2 · (This post was last modified: 2022-05-29, 17:51 by michelb2.)

Just for the record,
in addons\metadata.themoviedb.org.python\python\lib\tmdbscraper\tmdb.py
I added:

# needs "import re" at the top of the module
def _parse_media_id(title):
    m = re.search(r"(tt\d+)", title)  # first "tt" + digits anywhere in the title
    if m:
        return {'type': 'imdb', 'id': m.group()}

WITHOUT ANYTHING ELSE,
and if the IMDb number is BEFORE the date, the IMDb number is enough to scrape:
"nimporteqoui (2019) [tt6063090] (4k,vostfr).iso"    not OK
"nimporteqoui (2019) (4k,vostfr) [tt6063090].iso"    not OK
"nimporteqoui [tt6063090] (2019) (4k,vostfr).iso"     OK

As you say, sadly Kodi removes the tt number before I can do anything else (except maybe with cleandatetime, but I did not try a movie without an IMDb number in that case).

My mistake:
"nimporteqoui [tt6063090] (4k,vostfr) (2019).iso" is not OK, but I don't understand why.

OK, I found it. With
<cleanstrings> <regexp>\(.*\)?</regexp> </cleanstrings>
"nimporteqoui [tt6063090] (4k,vostfr) (2019).iso" is OK.
So
title [imdb] (date) (misc).ext
title [imdb] (misc) (date).ext     are both OK
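
For anyone wondering what that regexp actually matches, here is a small sketch using Python's `re` module. It does not reproduce how Kodi applies <cleanstrings> or in which order it runs relative to <cleandatetime>; `strip_from_first_paren` is a hypothetical helper that simply deletes the match. It shows that the pattern is greedy and covers everything from the first opening parenthesis to the end of the name.

```python
import re

# The <cleanstrings> pattern from the post above, tried outside Kodi.
pattern = re.compile(r"\(.*\)?")

def strip_from_first_paren(name):
    """Hypothetical helper: delete what the pattern matches, trim trailing spaces."""
    return pattern.sub("", name).rstrip()

# ID before the first parenthesis: it survives.
print(strip_from_first_paren("nimporteqoui [tt6063090] (4k,vostfr) (2019)"))
# -> nimporteqoui [tt6063090]

# ID after the first parenthesis: it is removed with everything else.
print(strip_from_first_paren("nimporteqoui (2019) [tt6063090] (4k,vostfr)"))
# -> nimporteqoui
```

In this sketch the ID is only kept when it comes before the first parenthesis, which matches the two naming schemes reported as OK.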