IMDB scraper issue with some Italian titles
#1
Good morning to everybody,

I'd like to report an issue with the IMDB scraper. When I set "Italy" in the scraper's preferences, it sometimes retrieves the international title instead of the Italian one, even when the Italian title is present on the IMDB page. Two examples: "Le salaire de la peur" (1953) has two Italian titles, "Vite vendute" and "Il salario della paura", but neither is retrieved; "Das weiße Band" (2009) has the Italian title "Il nastro bianco", but the scraper keeps downloading the German one.

If I set Brazil as the preferred country, for instance, the scraper correctly retrieves the Portuguese titles.
#2
Delete lines 192-197 from imdb.xml.
#3
Which version? In my version, lines 192-197 span two subsections, and if I delete them nothing works.
#4
Lines 228-233 in the latest imdb scraper.
#5
I have version 2.2.2 but it only has 205 lines...

EDIT: I was looking in the wrong directory. Thanks indeed, this works!
#6
It worked but some Italian movies are then retrieved with their international title: for instance "La prima cosa bella" is retrieved as "The first beautiful thing"... ?!?
#7
Well, that's because there is no Italian title listed on imdb...

The lines you deleted were forcing the imdb display title if the movie was filmed in Italy.
#8
I think I'll revert to the official scraper. Now every Italian movie is retrieved with its international title...

What I do not understand about the official scraper is that some foreign movies are correctly retrieved with their Italian titles while others come back with a title in another language, even though both have an Italian alias on IMDB...
#9
How about reading my explanation above?

Edit:
Try to replace line 229 in the official scraper with this one and check if it brings better results

Code:
<expression>&gt;&lt;a href=&quot;/country/[^&gt;]+&gt;($INFO[akatitles]&lt;/a&gt;&lt;/div&gt;)</expression>
#10
olympia Wrote: How about reading my explanation above?
I read it but it does not explain. Some examples:
- French movie "Le salaire de la peur". Film has an Italian alias "Vite vendute". It was not filmed in Italy. Retrieved as "Le salaire de la peur".
- American movie "The ten commandments" (1923). Film has an Italian alias "I dieci comandamenti". It was not filmed in Italy. Retrieved as "I dieci comandamenti".

So it does not explain why, starting from the same initial conditions, results are different.

Quote:Edit:
Try to replace line 229 in the official scraper with this one and check if it brings better results

Code:
<expression>&gt;&lt;a href=&quot;/country/[^&gt;]+&gt;($INFO[akatitles]&lt;/a&gt;&lt;/div&gt;)</expression>

I'll give it a try and let you know.
#11
It still does not work for Italian movies like "La prima cosa bella" (retrieved as "The first beautiful thing")
#12
gspinoza Wrote: I read it but it does not explain. Some examples:
- French movie "Le salaire de la peur". Film has an Italian alias "Vite vendute". It was not filmed in Italy. Retrieved as "Le salaire de la peur".

Probably I should have written 'Filmed by Italy'. What is taken into consideration is the Country field on this page:
http://akas.imdb.com/title/tt0046268/combined
Italy is among the countries.

The modification I posted above forces the imdb display title only if Italy is the only country listed.

...and 'La prima cosa bella' does come up as 'La prima cosa bella' for me with the above mod.
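For readers who want to see what that expression checks, here is a minimal sketch in Python rather than in the scraper's XML. It assumes that $INFO[akatitles] is replaced by the country name from the settings (e.g. "Italy") and that the country links sit back to back inside one div. The sample markup and the helper name only_country_is are hypothetical and only illustrate the idea; the real IMDb page may differ.

```python
import re


def only_country_is(html: str, country: str) -> bool:
    """Return True if `country` is the only country link in the block.

    The pattern is the unescaped form of the <expression> above, with
    $INFO[akatitles] replaced by the configured country (an assumption).
    """
    pattern = (
        r'><a href="/country/[^>]+>('
        + re.escape(country)
        + r'</a></div>)'
    )
    return re.search(pattern, html) is not None


# Hypothetical, simplified markup for the country block of a title page.
single = '<div class="info-content"><a href="/country/it">Italy</a></div>'
multi = (
    '<div class="info-content"><a href="/country/fr">France</a> | '
    '<a href="/country/it">Italy</a></div>'
)

print(only_country_is(single, "Italy"))
print(only_country_is(multi, "Italy"))
```

With this sample markup the link has to follow the opening tag directly and be followed by the closing div, so a co-production such as France | Italy does not match and the forced display title is skipped.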
#13
You're right, it works. I cut and pasted over another line.. man, am I dumb.

Thanks for your help, olympia!
#14
I cheered too soon.

Some Italian titles like "la prima cosa bella" and "La vita è bella" are correctly retrieved in Italian, some others like "Vallanzasca - Gli angeli del male" or "Il mulino delle donne di pietra" are retrieved with their English names.

I don't know what to think anymore...
#15
If there is no title found from Italy, then the scraper will return the USA/International title.

...and this is the case here. I have to think about how to work around that.
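The selection order described in this thread can be sketched roughly as follows. This is only an illustration in Python, not the scraper's actual XML logic: the function pick_title, its parameters, and the third example's input values are hypothetical, and the order of the checks is my reading of the posts above.

```python
from typing import Dict, List


def pick_title(
    display_title: str,
    countries: List[str],
    akas: Dict[str, str],
    preferred: str,
    international_title: str,
) -> str:
    """Choose a title following the rules discussed in the thread."""
    # Modified rule: force the display title only when the preferred
    # country is the only country listed for the movie.
    if countries == [preferred]:
        return display_title
    # Otherwise use the AKA title listed for the preferred country.
    if preferred in akas:
        return akas[preferred]
    # No title found for that country: fall back to USA/International.
    return international_title


# Italian-only movie with no Italian AKA entry on imdb.
print(pick_title("La prima cosa bella", ["Italy"], {},
                 "Italy", "The first beautiful thing"))

# Co-production with an Italian AKA entry.
print(pick_title("Le salaire de la peur", ["France", "Italy"],
                 {"Italy": "Vite vendute"}, "Italy", "The Wages of Fear"))

# Hypothetical co-production without an Italian AKA entry.
print(pick_title("Titolo originale", ["Italy", "France"], {},
                 "Italy", "International title"))
```

The last call shows the remaining gap: a movie that lists Italy together with other countries and has no Italian AKA entry still ends up with the international title.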