Kodi Community Forum

TMDB - Scraper don't like "period" in file name
Hi all,

In my movie collection, many files have their spaces replaced with periods, such as:

Once.upon.a.time.in.mexico.avi

The TMDB scraper doesn't recognise this movie. I have to rename the file, removing all periods, before TMDB recognises it.

I also have many movies with brackets and slashes in their names, and again they don't get recognised at all.

(Dvix - EN) - Once upon a time in mexico.avi

or, as I've seen in many torrents:

(Dvix - EN) - Once upon a time in mexico - SILENT - TELESYNC - CREW BLA BLA BLA.avi

Is there any way to improve the regular expression to at least solve the first problem (periods instead of spaces)? I have to say that when I tried MediaPortal, all three of those formats were correctly recognised.

up
I use a program called TheRenamer to rename all my movies and TV shows; XBMC picks them all up fine after that. I know it might not be your ideal solution, but it's a possibility.
Have a look at this.

You can't expect one program to behave exactly like another. "I have to say that if tried with Mediaportal all those three formats were correctly recognised."
So?

The suggestion to use TheRenamer is a good one. You can also try the bulk renaming tool in Ember Media Manager, but I've had less than consistent results with that. It mostly works.
There is also a little program called Bulk Rename Utility that is very powerful and can make short work of renaming large batches, once you learn how to use it properly. It is a bit confusing at first.

And finally, you can mess around with RegExp and (theoretically) bypass renaming altogether.
If one acquires movies via torrents (with no standard file naming convention), one shouldn't mind spending a bit of time renaming the files.
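As a rough illustration of the renaming route suggested above, here is a minimal Python sketch that turns period-separated names into space-separated ones while keeping the file extension. The function names `periods_to_spaces` and `plan_renames` are made up for this example, and nothing is renamed on disk; it only reports what would change.

```python
import os
import re


def periods_to_spaces(filename: str) -> str:
    """Replace runs of dots/underscores in the name part with a single space.

    The extension (the part after the last dot) is kept untouched.
    """
    stem, ext = os.path.splitext(filename)
    cleaned = re.sub(r"[._]+", " ", stem)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned + ext


def plan_renames(filenames):
    """Return (old, new) pairs for names that would change (dry run only)."""
    plan = []
    for old in filenames:
        new = periods_to_spaces(old)
        if new != old:
            plan.append((old, new))
    return plan


if __name__ == "__main__":
    for old, new in plan_renames(["Once.upon.a.time.in.mexico.avi"]):
        print(f"{old} -> {new}")
```

Reviewing the planned pairs before passing them to something like `os.rename` is the safer way to use a sketch like this on a large collection.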
Well, the thing is that with other scrapers such as MyMovies or Movieplayer, all files are correctly recognized. It's just weird that the default and main scraper is not as sophisticated as the minor ones.

I feel that all scrapers should implement the best regular expressions to spare the end user worry and complication. At the end of the day, isn't that what a scraper is meant to do? Recognize as many files as possible in the easiest way possible.
(2012-08-23, 09:16) kiwi wrote: Well, the thing is that with other scrapers such as MyMovies or Movieplayer, all files are correctly recognized. It's just weird that the default and main scraper is not as sophisticated as the minor ones.

I feel that all scrapers should implement the best regular expressions to spare the end user worry and complication. At the end of the day, isn't that what a scraper is meant to do? Recognize as many files as possible in the easiest way possible.

Work is being done on this in a GSoC project by top2fs to improve the whole scraper engine.
(2012-08-23, 09:16) kiwi wrote: Well, the thing is that with other scrapers such as MyMovies or Movieplayer, all files are correctly recognized. It's just weird that the default and main scraper is not as sophisticated as the minor ones.

I feel that all scrapers should implement the best regular expressions to spare the end user worry and complication. At the end of the day, isn't that what a scraper is meant to do? Recognize as many files as possible in the easiest way possible.

The issue is that you don't understand the thing you are talking about.

Scrapers most often use the search engine of the site they scrape from.
If a scraper gives better results, it is most probably because the search engine of the site it scrapes from returns something even for '(Dvix - EN) - Once upon a time in mexico - SILENT - TELESYNC - CREW BLA BLA BLA.avi', which is pure madness by the way.

It might be that tmdb doesn't give any results for this string while MyMovies does... but you blame the scraper? Ehh...

...and by the way 'Once.upon.a.time.in.mexico.avi' works perfectly for me.
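The point above, that the site's search engine decides whether a noisy string matches anything, can be illustrated with a toy search. This is only a sketch: it uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure and a made-up cutoff of 0.6, and it says nothing about how TMDb or MyMovies actually rank results. The names `similarity` and `search` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(query: str, title: str) -> float:
    """Case-insensitive similarity in [0, 1] between a query and a title."""
    return SequenceMatcher(None, query.lower(), title.lower()).ratio()


def search(query, titles, threshold=0.6):
    """Return titles whose similarity to the query reaches the threshold.

    The threshold is an assumption for illustration; a stricter site
    returns nothing for noisy queries, a more lenient one still answers.
    """
    scored = [(similarity(query, title), title) for title in titles]
    return [title for score, title in sorted(scored, reverse=True)
            if score >= threshold]


if __name__ == "__main__":
    catalogue = ["Once Upon a Time in Mexico"]
    clean = "Once upon a time in mexico"
    noisy = ("(Dvix - EN) - Once upon a time in mexico"
             " - SILENT - TELESYNC - CREW BLA BLA BLA")
    print(search(clean, catalogue))                  # strict search, clean query
    print(search(noisy, catalogue))                  # strict search, noisy query
    print(search(noisy, catalogue, threshold=0.4))   # lenient search, noisy query
```

With the same query string, only the cutoff differs between the last two calls, which is the sense in which one site can appear "better" than another without the scraper itself doing anything smarter.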
(2012-08-26, 16:30) olympia wrote:
The issue is that you don't understand the thing you are talking about.

Scrapers most often use the search engine of the site they scrape from.
If a scraper gives better results, it is most probably because the search engine of the site it scrapes from returns something even for '(Dvix - EN) - Once upon a time in mexico - SILENT - TELESYNC - CREW BLA BLA BLA.avi', which is pure madness by the way.

It might be that tmdb doesn't give any results for this string while MyMovies does... but you blame the scraper? Ehh...

...and by the way 'Once.upon.a.time.in.mexico.avi' works perfectly for me.

Thanks for your clarification, but I thought that the scraper removes all special characters and then searches on the specific website (or am I wrong?). If so, the scraper should "clean" the movie name by removing all "non-title" tags. I then assume that TMDB doesn't remove DVIX, TELESYNC etc. tags (or am I wrong again?).
(2012-08-26, 22:45) kiwi wrote: Thanks for your clarification, but I thought that the scraper removes all special characters and then searches on the specific website (or am I wrong?). If so, the scraper should "clean" the movie name by removing all "non-title" tags.
Yes, you are wrong. It's not the scraper that cleans the file name but the XBMC core, which then passes the result as a search string to the scraper add-ons. This means that all scrapers get the same search string, so one cannot be better than another in that regard.


(2012-08-26, 22:45) kiwi wrote: I then assume that TMDB doesn't remove DVIX, TELESYNC etc. tags (or am I wrong again?).
You're right here, but why would it remove anything? Maybe it should provide the closest matches, though. However, the closest match for your file names is probably no better than 50-60%, and the TMDb search engine decides not to return such matches (I'm not sure how their search engine works, though).

Anyway, XBMC does a pretty good job of cleaning the filenames (especially if the naming follows some written and unwritten rules). The biggest issue is having something in the filename before the movie title.
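To make the division of labour described above more concrete, here is a minimal Python sketch of a filename cleaner that runs before any scraper sees the string. It assumes a "cut at the first recognised tag" strategy and a small, made-up tag list (`RELEASE_TAGS`); it is not XBMC's actual cleaning code or rule set. It also shows why text placed before the title is the hard case: cutting at the first tag then throws the title away.

```python
import os
import re

# Hypothetical tag list for illustration only; not XBMC's real clean-string rules.
RELEASE_TAGS = re.compile(
    r"\b(divx|dvix|xvid|dvdrip|telesync|bluray|720p|1080p)\b",
    re.IGNORECASE,
)


def clean_title(filename: str) -> str:
    """Derive a search string from a filename (a sketch, not XBMC's algorithm)."""
    stem, _ext = os.path.splitext(filename)
    # Assumption: everything from the first recognised tag onwards is noise.
    match = RELEASE_TAGS.search(stem)
    if match:
        stem = stem[:match.start()]
    # Turn dot/underscore separators into spaces and collapse whitespace.
    stem = re.sub(r"[._]+", " ", stem)
    stem = re.sub(r"\s+", " ", stem)
    # Trim leftover separators and opening brackets at either end.
    return stem.strip(" -([")


if __name__ == "__main__":
    for name in (
        "Once.upon.a.time.in.mexico.avi",
        "Once.upon.a.time.in.mexico.DVDRip.XviD.avi",
        "(Dvix - EN) - Once upon a time in mexico.avi",
    ):
        print(repr(clean_title(name)))
```

The first two names yield the plain title, while the third yields an empty string because the tag comes first. Every scraper would receive that same cleaned string, so under these assumptions the cleaner, not the scraper, determines what gets searched for.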
(2012-08-27, 07:53) olympia wrote:
(2012-08-26, 22:45) kiwi wrote: Thanks for your clarification, but I thought that the scraper removes all special characters and then searches on the specific website (or am I wrong?). If so, the scraper should "clean" the movie name by removing all "non-title" tags.
Yes, you are wrong. It's not the scraper that cleans the file name but the XBMC core, which then passes the result as a search string to the scraper add-ons. This means that all scrapers get the same search string, so one cannot be better than another in that regard.


(2012-08-26, 22:45) kiwi wrote: I then assume that TMDB doesn't remove DVIX, TELESYNC etc. tags (or am I wrong again?).
You're right here, but why would it remove anything? Maybe it should provide the closest matches, though. However, the closest match for your file names is probably no better than 50-60%, and the TMDb search engine decides not to return such matches (I'm not sure how their search engine works, though).

Anyway, XBMC does a pretty good job of cleaning the filenames (especially if the naming follows some written and unwritten rules). The biggest issue is having something in the filename before the movie title.


Thanks for your reply! Finally I understand what happens under the hood. My only remaining doubt is this: if everything is done by the XBMC core, what does the scraper do? Why does it use regular expressions if the "cleaning" process is done by XBMC?