[RELEASE] FilmAffinity (Spanish) scraper
#46
updated scraper is now in svn, r15969
Reply
#47
oh, and the search string encoding worked fine for me. i made a directory named cariño, set content, did the lookup. got the list your url pointed to.
Reply
#48
And where is the SVN? Can you provide a link to download the scraper, or attach it here?

regards,

Fido
Reply
#49
SVN: https://xbmc.svn.sourceforge.net/svnroot...ers/video/


@HectorziN:
It is possible to get the IMDB Link with a google search.
site:imdb.com +original title +year

I'm using a Google wrapper to get the IMDB ID for fanart in my moviemaze scraper.

Code:
<!--URL to Google and Fanart-->
<!-- Nested RegExp elements are evaluated before the element that contains them -->
<RegExp conditional="fanart" input="$$8" output="&lt;url function=&quot;GoogleToIMDB&quot;&gt;http://www.google.com/search?q=site:imdb.com+moviemaze\1&lt;/url&gt;" dest="5+">
    <!-- Copy the parenthesised text that follows the opening h2 tag into buffer 7 -->
    <RegExp input="$$1" output="\1" dest="7">
        <expression>&lt;h2&gt;\((.*)\)&lt;</expression>
    </RegExp>
    <!-- Append each word of buffer 7 to buffer 8, prefixed with a plus sign -->
    <RegExp input="$$7" output="+\1" dest="8+">
        <expression repeat="yes">([^ ,]+)</expression>
    </RegExp>
    <expression></expression>
</RegExp>

<!--GoogleToIMDB-->
<GoogleToIMDB dest="5">
    <RegExp input="$$2" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;&gt;&lt;details&gt;\1&lt;/details&gt;" dest="5">
        <!-- Take the first IMDb ID found in the Google results and chain to GetFanart -->
        <RegExp input="$$1" output="&lt;url function=&quot;GetFanart&quot;&gt;http://api.themoviedb.org/backdrop.php?imdb=\1&lt;/url&gt;" dest="2+">
            <expression>/title/([t0-9]*)</expression>
        </RegExp>
        <expression noclean="1"/>
    </RegExp>
</GoogleToIMDB>

<!-- Fanart -->
<GetFanart dest="5">
    <RegExp input="$$2" output="&lt;details&gt;&lt;fanart url=&quot;http://themoviedb.org/image/backdrops&quot;&gt;\1&lt;/fanart&gt;&lt;/details&gt;" dest="5">
        <!-- One thumb entry per backdrop URL in the response; the dot before jpg is escaped -->
        <RegExp input="$$1" output="&lt;thumb preview=&quot;/\1/\2_poster.jpg&quot;&gt;/\1/\2.jpg&lt;/thumb&gt;" dest="2">
            <expression repeat="yes">/([0-9]*)/([t0-9-]*)\.jpg&lt;/URL</expression>
        </RegExp>
        <expression noclean="1">(.+)</expression>
    </RegExp>
</GetFanart>
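
For anyone who wants to try the matching outside XBMC, here is a rough Python sketch of what the first two steps above do with their regular expressions. It is only an approximation of the scraper logic, not XBMC's own code; the function names and the sample strings in the usage note are made up for illustration.

```python
import re

# Hypothetical helper names; the patterns are copied from the scraper XML above.
GOOGLE_PREFIX = "http://www.google.com/search?q=site:imdb.com+moviemaze"


def build_google_query(page_html):
    """Mimic the first block: take the text in parentheses after <h2>,
    split it on spaces and commas, and append each word as '+word'."""
    match = re.search(r"<h2>\((.*)\)<", page_html)
    if match is None:
        return None
    words = re.findall(r"[^ ,]+", match.group(1))
    return GOOGLE_PREFIX + "".join("+" + word for word in words)


def extract_imdb_id(results_html):
    """Mimic GoogleToIMDB: return the first /title/<id> found, or None."""
    match = re.search(r"/title/([t0-9]*)", results_html)
    return match.group(1) if match else None


def backdrop_url(imdb_id):
    """Build the lookup URL that the scraper hands on to GetFanart."""
    return "http://api.themoviedb.org/backdrop.php?imdb=" + imdb_id
```

With a made-up page fragment such as `<h2>(The Dark Knight, 2008)</h2>`, `build_google_query` gives the Google URL ending in `+moviemaze+The+Dark+Knight+2008`, and `extract_imdb_id` pulls `tt0468569` out of a link like `http://www.imdb.com/title/tt0468569/`.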
Reply
#50
w00dst0ck Wrote:SVN: https://xbmc.svn.sourceforge.net/svnroot...ers/video/


@HectorziN:
It is possible to get the IMDB Link with a google search.
site:imdb.com +original title +year

I'm using a google wrapper to get the IMDB ID for fanart at my moviemaze scraper.


Thanks! It's a great idea, but does it always return the same movie? It could return the wrong one, right?
HectorziN
Reply
#51
spiff Wrote:oh, and the search string encoding worked fine for me. i made a directory named cariño, set content, did the lookup. got the list your url pointed to.

Not a directory: the movie itself must be called cariño, or be another movie with a title containing ñ.

If you search for a movie with the ñ character, the scraper cannot find it because of the encoding. Using the web browser on filmaffinity.com, it works.

Could you test it? And do you know the value for searchstringencoding that I need to use?

many thanks!
HectorziN
Reply
#52
Huh

i repeat;
i made a directory named cariño, set content (including scan by dir name obviously), did the lookup. got the list your url pointed to.
Reply
#53
Hi,

The encoding for the ñ character is %F1. Anyway, here is the complete list (accents, etc.):

http://www.jairoblanco.com/guia-rapida/h...ificacion/

greets,
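
A quick way to check the %F1 value is to percent-encode the word in Python. This is just a sketch with the standard library; the helper name `encode_term` is made up. It also shows why the charset matters: the same character comes out differently in UTF-8.

```python
from urllib.parse import quote


def encode_term(term, charset):
    """Percent-encode a search term using the given character set."""
    return quote(term, encoding=charset)


latin1 = encode_term("cariño", "iso-8859-1")
utf8 = encode_term("cariño", "utf-8")

print(latin1)  # cari%F1o
print(utf8)    # cari%C3%B1o
```

So a site that expects ISO-8859-1 wants `%F1`, while a UTF-8 request would send `%C3%B1` for the same letter.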
Reply
#54
HectorziN Wrote:Thanks! It's a great idea, but does it always return the same movie? It could return the wrong one, right?

I've included moviemaze in my search string. If the movie is listed in the external reviews list on imdb.com [example], I can be sure it's the same movie.
Reply
#55
spiff Wrote:Huh

i repeat;
i made a directory named cariño, set content (including scan by dir name obviously), did the lookup. got the list your url pointed to.

OK, but the problem I have is this one:
A folder called Movies
In this folder, a lot of movies
One of them is called "Cariño estoy hecho un perro"
I search for information on this movie using the filmaffinity scraper
and no results are found. If I change Cariño to Carino, it works.

The problem is that the search is not done with ISO encoding, and I don't know the value to set in searchstringencoding
HectorziN
Reply
#56
Hectorzin, have you read my answer? You must encode your string: you should replace "cariño" with "cari%F1o" in your URL...

regards,
Reply
#57
that will be done by the URL encoding applied prior to passing the argument to the scraper function...
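
If the URL encoding is applied automatically, as described above, then a term that was already encoded by hand would be encoded a second time. A minimal Python sketch of that effect, using the standard library rather than XBMC's own code (the variable names are illustrative only):

```python
from urllib.parse import quote, unquote

raw = "cariño"

# Encoding once gives the form the site expects.
once = quote(raw, encoding="iso-8859-1")

# Encoding the already-encoded text escapes the percent sign itself.
twice = quote(once, encoding="iso-8859-1")

# A server that decodes one time would see the literal text 'cari%F1o'.
decoded = unquote(twice, encoding="iso-8859-1")

print(once)     # cari%F1o
print(twice)    # cari%25F1o
print(decoded)  # cari%F1o
```

That is why replacing the ñ by hand in the movie name should not be needed, and could even make the search worse.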
Reply
#58
My scraper is quite complex. Is there any application to help debug it?
I want to include impawards posters and I can't get it to work.

Thanks
HectorziN
Reply
#59
I use xbmc for windows and watch the xbmc.log

There are also some online RegEx testers.
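
If you would rather test offline, a few lines of Python can stand in for an online tester. This is only a sketch: Python's `re` is close to, but not identical to, the regex engine XBMC uses, and the helper name `try_expression` and the sample text are made up.

```python
import re


def try_expression(pattern, sample, repeat=False):
    """Return the captured groups for a scraper expression.

    With repeat=True every match is returned, similar to repeat="yes"
    in the scraper XML; otherwise only the first match is returned.
    """
    regex = re.compile(pattern)
    if repeat:
        return [m.groups() for m in regex.finditer(sample)]
    first = regex.search(sample)
    return [first.groups()] if first else []


# Hypothetical response text, shaped only to exercise the fanart expression.
sample = (
    "<URL>http://example.invalid/backdrops/123/tt0468569-1.jpg</URL>"
    "<URL>http://example.invalid/backdrops/456/tt0468569-2.jpg</URL>"
)

hits = try_expression(r"/([0-9]*)/([t0-9-]*).jpg</URL", sample, repeat=True)
for number, name in hits:
    print(number, name)
```

Paste the real page source into `sample` and the unescaped expression from the scraper into `pattern` (with `&lt;` written as `<`) to see what each `\1` and `\2` would contain.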
Reply
#60
w00dst0ck Wrote:I use xbmc for windows and watch the xbmc.log

There are also some online RegEx testers.

Where is the log file stored in the Windows Atlantis version?

thanks
HectorziN
Reply