Is there a way to clear duplicated search result entities?
#1
As the topic says: is there a way to clear duplicated search result entities?
I'm getting entities with the same name, id and url. Is it possible to show just one result instead?

Code:
GetSearchResults returned <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<results>
<entity>
<title>Dexter (2006)</title>
<url>http://www.movieplayer.it/serietv/733/dexter/</url>
<id>733</id>
</entity>
<entity>
<title>Dexter (2006)</title>
<url>http://www.movieplayer.it/serietv/733/dexter/</url>
<id>733</id>
</entity>
</results>
XBMC then shows me:

Dexter (2006)

Dexter (2006)

and they both link to the same URL.
I can't just switch "repeat" off, because I need it for series that share a name but differ in year, like Flash-Gordon (1954) and Flash-Gordon (2006), which also have different "id" values.
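
One possible workaround, sketched in Python purely for illustration (a real XBMC scraper is regex-driven XML, and `dedupe_entities` is a hypothetical helper, not part of XBMC): drop an entity only when its title, id and URLs all match one already seen, so that same-named series with a different year or id are kept.

```python
import xml.etree.ElementTree as ET

# Sample taken from the output above (XML declaration omitted for brevity).
SAMPLE = """<results>
<entity>
<title>Dexter (2006)</title>
<url>http://www.movieplayer.it/serietv/733/dexter/</url>
<id>733</id>
</entity>
<entity>
<title>Dexter (2006)</title>
<url>http://www.movieplayer.it/serietv/733/dexter/</url>
<id>733</id>
</entity>
</results>"""


def dedupe_entities(xml_text):
    """Return the <results> XML with exact-duplicate <entity> elements removed."""
    root = ET.fromstring(xml_text)
    seen = set()
    # Iterate over a copy, because entities are removed from root in the loop.
    for entity in list(root.findall("entity")):
        key = (
            entity.findtext("title"),
            entity.findtext("id"),
            tuple(u.text for u in entity.findall("url")),
        )
        if key in seen:
            root.remove(entity)  # same title, id and URLs: a true duplicate
        else:
            seen.add(key)
    return ET.tostring(root, encoding="unicode")


print(dedupe_entities(SAMPLE))
```

Because the key includes the title and the id, Flash-Gordon (1954) and Flash-Gordon (2006) would both survive.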
Reply
#2
OK, I really need this sorted out!
I'm pasting here what I already discussed in the other movieplayer.it thread:

KoTiX Wrote:OK, let's talk about how the search works. There are some facts I'd like to discuss with you guys:

1. A direct search on the movieplayer site is not possible because it uses some kind of encrypted code. This is the link that looks for "Ronin":
http://www.movieplayer.it/ricerca/cm9uaW4=/1/
Even looking at the HTML, I don't really understand how it works.

2. So I'm using a Google custom search that searches the web for every page matching:
http://www.movieplayer.it/film/*/*/
because every movie on mp.it has a main page like (e.g. for Ronin) http://www.movieplayer.it/film/664/ronin/

3. The problems start with the secondary links of each movie:

http://www.movieplayer.it/film/*/*/gallery-e-trailer/
http://www.movieplayer.it/film/*/*/homevideo/
http://www.movieplayer.it/film/*/*/in-sala/
http://www.movieplayer.it/film/*/*/rassegna-stampa/
http://www.movieplayer.it/film/*/*/suggerimenti/
http://www.movieplayer.it/film/*/*/statistiche/
http://www.movieplayer.it/film/*/*/extra/
http://www.movieplayer.it/film/*/*/articoli/
http://www.movieplayer.it/film/*/*/cast/

If I retrieve these links too, I get 10 entities with the same name as search results in XBMC, all pointing to the same link. So I decided to exclude most of them from the search, leaving just the "in-sala" one, which occurs most often.

4. The last fact is that Google doesn't find all the movies via the main link alone, but it does via the secondary links.
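
Points 3 and 4 suggest one way out: accept every secondary link as a hit, then reduce each one to its main film page and keep a single entry per film ID. The following is a minimal Python sketch of that idea, not scraper XML; `primary_links` and `FILM_URL` are hypothetical names, and the pattern assumes every film URL has the `/film/<id>/<slug>/` shape shown above.

```python
import re

# Captures the main page (group 1) and the numeric film ID (group 2).
FILM_URL = re.compile(r"^(http://www\.movieplayer\.it/film/(\d+)/[^/]+/)")


def primary_links(urls):
    """Map each hit to its main film page and keep one entry per film ID."""
    seen = {}
    for url in urls:
        match = FILM_URL.match(url)
        if match is None:
            continue  # not a film page at all
        main_page, film_id = match.group(1), match.group(2)
        seen.setdefault(film_id, main_page)  # first hit for an ID wins
    return list(seen.values())


hits = [
    "http://www.movieplayer.it/film/664/ronin/in-sala/",
    "http://www.movieplayer.it/film/664/ronin/cast/",
    "http://www.movieplayer.it/film/664/ronin/",
]
print(primary_links(hits))
```

With this, a film that Google only indexes through a sublink still yields exactly one result.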


So my conclusions are:

1. The XBMC developers change the XBMC code to automatically exclude duplicate results, so I can use the secondary links as a source too (best solution IMO).

2. Enable some more secondary links and accept that this will cause some duplicate results.

3. My last resort: don't use movieplayer.it and switch over to mymovies.it if it gives more accurate results. I could ask Muttley whether I can help him develop his scraper, but before that I'd like to know whether mymovies is really more reliable than mp.it.

Let me know what you think, or send me a PM if you'd like to talk in Italian.
Cheers :)

Furthermore, I want to show you an example of what I get when searching for "the millionaire" with some sublinks enabled:

Code:
DEBUG: scraper: GetSearchResults returned <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<results>
<entity>
<title>The Millionaire(2008)</title>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/foto-di-scena/1/</url>
<id>21346</id>
</entity>
<entity>
<title>The Millionaire(2008)</title>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/foto-di-scena/1/</url>
<id>21346</id>
</entity>
<entity>
<title>The Millionaire(2008)</title>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/foto-di-scena/1/</url>
<id>21346</id>
</entity>
<entity>
<title>The Millionaire(2008)</title>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/foto-di-scena/1/</url>
<id>21346</id>
</entity>
<entity>
<title>The Millionaire(2008)</title>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/foto-di-scena/1/</url>
<id>21346</id>
</entity>
<entity>
<title>The Millionaire(2008)</title>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/21346/slumdog-millionaire/gallery-e-trailer/foto-di-scena/1/</url>
<id>21346</id>
</entity>
<entity>
<title>Milk(2008)</title>
<url>http://www.movieplayer.it/film/15997/milk/</url>
<url>http://www.movieplayer.it/film/15997/milk/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/15997/milk/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/15997/milk/gallery-e-trailer/foto-di-scena/1/</url>
<id>15997</id>
</entity>
<entity>
<title>Il curioso caso di Benjamin Button(2008)</title>
<url>http://www.movieplayer.it/film/4526/il-curioso-caso-di-benjamin-button/</url>
<url>http://www.movieplayer.it/film/4526/il-curioso-caso-di-benjamin-button/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/4526/il-curioso-caso-di-benjamin-button/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/4526/il-curioso-caso-di-benjamin-button/gallery-e-trailer/foto-di-scena/1/</url>
<id>4526</id>
</entity>
<entity>
<title>La felicità porta fortuna - Happy Go-Lucky(2008)</title>
<url>http://www.movieplayer.it/film/16330/happy-go-lucky/</url>
<url>http://www.movieplayer.it/film/16330/happy-go-lucky/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/16330/happy-go-lucky/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/16330/happy-go-lucky/gallery-e-trailer/foto-di-scena/1/</url>
<id>16330</id>
</entity>
<entity>
<title>WALL·E(2008)</title>
<url>http://www.movieplayer.it/film/14486/wall-e/</url>
<url>http://www.movieplayer.it/film/14486/wall-e/gallery-e-trailer/wallpaper/1/</url>
<url>http://www.movieplayer.it/film/14486/wall-e/gallery-e-trailer/promozionali/1/</url>
<url>http://www.movieplayer.it/film/14486/wall-e/gallery-e-trailer/foto-di-scena/1/</url>
<id>14486</id>
</entity>
</results>

I don't understand why XBMC keeps results that have the same ID, the same title and the same URLs. Shouldn't it remove them by itself?

My suggestion would be a conditional to enable/disable duplicate results.


I hope this is understandable enough and that a developer can take a look and maybe get it fixed before the Camelot final version.

Thanks in advance :)
Reply
#3
The only reason we're not doing it is that it points to a crappy scraper. You should be able to separate out dupes yourself ;)
Reply
#4
Man, I know how to separate them. The problem is that the Google search doesn't find all the movies when I take just the primary link; sometimes it finds only one sublink. But if I accept all the sublinks as valid results, XBMC accepts all of them too, even though they are all the same.
If I were able to use http://www.movieplayer.it's own search, I would already have used it, but it uses a POST method with a base64_encode algorithm that encodes the search term.
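
For what it's worth, the token in the "Ronin" search link quoted earlier (`cm9uaW4=`) is exactly the standard base64 encoding of the lowercase string "ronin". A minimal Python sketch that reproduces that link follows; `search_url` is a hypothetical helper, and the lowercasing, the trailing `/1/` segment, and whether the site would answer a plain GET on this URL are all assumptions, not confirmed behavior.

```python
import base64


def search_url(term):
    """Build a movieplayer.it-style search link from a search term."""
    # Assumption: the site base64-encodes the lowercased term as-is.
    token = base64.b64encode(term.lower().encode("utf-8")).decode("ascii")
    # Assumption: the trailing "1" is fixed (possibly a page number).
    return "http://www.movieplayer.it/ricerca/%s/1/" % token


print(search_url("Ronin"))
```

Note that standard base64 can emit `+` and `/`, which are awkward inside a URL path; how the site handles those characters is unknown.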

Any suggestion?
Reply
#5
heh i feel your pain. my bluntness is just faster than my atom ;)

r24674
Reply
#6
Damn you!! ;)
Always the same old Spiff!! Thanks, really :)
Reply
