2010-03-20, 03:40
2010-03-20, 16:17
I wrote a minimal scraper for the site, but sadly I can't get the search function working at the moment.
2010-03-21, 01:55
I have added support for what I believe you are trying to do, which is to return search results as XML. It is not complete yet, as I had to head off to work, so I only managed to add a few values and the thumbnail image. Search results are returned in the following format (example search for "die hard"):
Code:
<movies>
<movie>
<title>Live Free or Die Hard</title>
<url>http://www.pixelgrafters.co.uk/movie/tt0337978</url>
<released>2007</released>
<thumb>http://images.themoviedb.org/posters/6981/Live_Free_Or_Die_Hard_thumb.jpg</thumb>
<imdb-id>tt0337978</imdb-id>
<overview>Die Hard 4.0 is the fourth film in the Die Hard series starring Bruce Willis as police officer John McClane. It's 2007 and the age of virtual terrorism as McClane must save and protect a computer nerd/hacker who may know the inner workings of a nation wide hack that could take society back into the middle ages.</overview>
</movie>
<movie>
<title>Die Hard</title>
<url>http://www.pixelgrafters.co.uk/movie/tt0095016</url>
<released>1988</released>
<thumb>http://images.themoviedb.org/posters/6997/Die_Hard_thumb.jpg</thumb>
<imdb-id>tt0095016</imdb-id>
<overview>New York cop John McClane gives terrorists a dose of their own medicine as they hold hostages in an LA office building.</overview>
</movie>
<movie>
<title>Die Hard with a Vengeance</title>
<url>http://www.pixelgrafters.co.uk/movie/tt0112864</url>
<released>1995</released>
<thumb>http://images.themoviedb.org/posters/6989/Die_Hard_With_A_Vengence_2_thumb.jpg</thumb>
<imdb-id>tt0112864</imdb-id>
<overview>John McClane and a store owner must play a bomber's deadly game as they race around New York while trying to stop him.</overview>
</movie>
<movie>
<title>Die Hard 2</title>
<url>http://www.pixelgrafters.co.uk/movie/tt0099423</url>
<released>1990</released>
<thumb>http://images.themoviedb.org/posters/6985/Die_Hard_2_thumb.jpg</thumb>
<imdb-id>tt0099423</imdb-id>
<overview>John McClane is forced to battle mercenaries who seize control of an airport's communications and threaten to cause plane crashes if their demands are not met.</overview>
</movie>
</movies>
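For anyone who wants to consume this format outside XBMC, here is a minimal sketch in Python, assuming the response is well-formed XML shaped like the example above. The function name `parse_search_results` is made up for illustration, and the sample is a trimmed copy of one entry from the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of one <movie> entry from the example search result above.
SAMPLE = """<movies>
<movie>
<title>Die Hard</title>
<url>http://www.pixelgrafters.co.uk/movie/tt0095016</url>
<released>1988</released>
<thumb>http://images.themoviedb.org/posters/6997/Die_Hard_thumb.jpg</thumb>
<imdb-id>tt0095016</imdb-id>
<overview>New York cop John McClane gives terrorists a dose of their own medicine as they hold hostages in an LA office building.</overview>
</movie>
</movies>"""


def parse_search_results(xml_text):
    """Return one dict per <movie> element, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in movie}
        for movie in root.findall("movie")
    ]


results = parse_search_results(SAMPLE)
for movie in results:
    print(movie["title"], movie["released"], movie["imdb-id"])
```

Keying each dict by tag name means any fields added to the API later should show up without changing the parser.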
Enjoy
----
EoD
2010-03-21, 03:09
flobbes Wrote:I wrote a minimal scraper for the site, but sadly I can't get the search function working at the moment.
I have added some support for what you are trying to do: results can now be returned as pure XML.
I have only had the chance to add a few fields so far, as I had to head off to work this evening. These are the ones added so far; more will come:
- title
- url (the GUI page for the movie, which also shows the high-quality fanart)
- released
- thumb (url of small thumbnail image)
- imdb-id (used to search imdb for info if needed)
- overview
Enjoy
----
EoD
2010-03-21, 12:11
Here is a small scraper I wrote, feel free to test it.
Code:
<?xml version="1.0" encoding="utf-8"?>
<scraper framework="10" date="2010-03-22" name="PixelGrafters" content="movies" thumb="eb.jpg" language="en">
  <!-- Pick up a PixelGrafters URL from an .nfo file -->
  <NfoUrl dest="3">
    <RegExp input="$$1" output="&lt;url&gt;\1&lt;/url&gt;" dest="3">
      <expression noclean="1">(http://www.pixelgrafters.co.uk/.+)</expression>
    </RegExp>
  </NfoUrl>
  <!-- Build the search request from the title in $$1 -->
  <CreateSearchUrl clearbuffers="no" dest="3">
    <RegExp input="$$1" output="&lt;url spoof=&quot;http://www.pixelgrafters.co.uk&quot; post=&quot;true&quot;&gt;http://www.pixelgrafters.co.uk/api/search/\1&lt;/url&gt;" dest="3">
      <expression noclean="1"/>
    </RegExp>
  </CreateSearchUrl>
  <!-- Turn each title/api pair in the response into a result entity -->
  <GetSearchResults dest="6">
    <RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="6">
      <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\1&lt;/title&gt;&lt;url&gt;\2&lt;/url&gt;&lt;/entity&gt;" dest="5+">
        <expression repeat="yes" trim="3">&lt;title&gt;([^&lt;]*)&lt;/title&gt;.+?&lt;api&gt;([^&lt;]*)&lt;/api&gt;</expression>
      </RegExp>
      <expression noclean="1"/>
    </RegExp>
  </GetSearchResults>
  <!-- Collect posters, title, year, plot, rating, fanart and runtime into $$5 -->
  <GetDetails dest="3">
    <RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
      <RegExp input="$$1" output="&lt;thumb&gt;\1&lt;/thumb&gt;" dest="5+">
        <expression repeat="yes" noclean="1">&lt;image&gt;(http://images.themoviedb.org/posters/[^&lt;]*)&lt;/image&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;" dest="5+">
        <expression noclean="1">&lt;title&gt;([^&lt;]*)&lt;/title&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;year&gt;\1&lt;/year&gt;" dest="5+">
        <expression>&lt;released&gt;(\d{4})[^&lt;]*&lt;/released&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;plot&gt;\1&lt;/plot&gt;" dest="5+">
        <expression>&lt;overview&gt;([^&lt;]*)&lt;/overview&gt;</expression>
      </RegExp>
      <RegExp input="$$1" output="&lt;rating&gt;\1&lt;/rating&gt;" dest="5+">
        <expression>&lt;rating&gt;([^&lt;]*)&lt;/rating&gt;</expression>
      </RegExp>
      <RegExp input="$$3" output="&lt;fanart&gt;\1&lt;/fanart&gt;" dest="5+">
        <RegExp input="$$1" output="&lt;thumb&gt;\1&lt;/thumb&gt;" dest="3">
          <expression repeat="yes" noclean="1">&lt;image&gt;(http://images.themoviedb.org/backdrops/[^&lt;]*)&lt;/image&gt;</expression>
        </RegExp>
        <expression repeat="yes" noclean="1"/>
      </RegExp>
      <RegExp input="$$1" output="&lt;runtime&gt;\1&lt;/runtime&gt;" dest="5+">
        <expression>&lt;runtime&gt;([^&lt;]*)&lt;/runtime&gt;</expression>
      </RegExp>
      <expression noclean="1"/>
    </RegExp>
  </GetDetails>
</scraper>
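To check what the GetDetails expressions pick up without loading the scraper into XBMC, the same patterns can be tried in plain Python. This is only a rough sketch: the tag names come from the expressions above, but the sample response is hypothetical (the rating, runtime, release date and backdrop URL are placeholders, not real API output), and the dots in the image host are escaped here although the scraper leaves them unescaped.

```python
import re

# Hypothetical details response. Tag names follow the scraper expressions;
# rating, runtime, release date and the backdrop URL are placeholder values.
SAMPLE_DETAILS = """<movie>
<title>Die Hard</title>
<released>1988-07-15</released>
<overview>New York cop John McClane gives terrorists a dose of their own medicine as they hold hostages in an LA office building.</overview>
<rating>8.0</rating>
<runtime>131</runtime>
<image>http://images.themoviedb.org/posters/6997/Die_Hard_thumb.jpg</image>
<image>http://images.themoviedb.org/backdrops/0000/example_backdrop.jpg</image>
</movie>"""

# Single-value fields, named after the output tags the scraper writes.
SINGLE_FIELDS = {
    "title": r"<title>([^<]*)</title>",
    "year": r"<released>(\d{4})[^<]*</released>",
    "plot": r"<overview>([^<]*)</overview>",
    "rating": r"<rating>([^<]*)</rating>",
    "runtime": r"<runtime>([^<]*)</runtime>",
}
# Repeating fields (repeat="yes" in the scraper).
POSTER = r"<image>(http://images\.themoviedb\.org/posters/[^<]*)</image>"
BACKDROP = r"<image>(http://images\.themoviedb\.org/backdrops/[^<]*)</image>"


def extract_details(xml_text):
    """Apply each pattern and collect whatever matched."""
    details = {}
    for name, pattern in SINGLE_FIELDS.items():
        match = re.search(pattern, xml_text)
        if match:
            details[name] = match.group(1)
    details["thumbs"] = re.findall(POSTER, xml_text)
    details["fanart"] = re.findall(BACKDROP, xml_text)
    return details


details = extract_details(SAMPLE_DETAILS)
print(details["title"], details["year"], len(details["thumbs"]), len(details["fanart"]))
```

Note that the year pattern keeps only the first four digits of `released`, so a full date and a bare year both end up as the same `<year>` value.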
2010-03-22, 06:43
Cool, looks good. I'll give it a blast in a bit. In the meantime I have written a few extra features into the site.
-- More data added to the API search function.
-- Detailed information and high-res images for a specific movie can now be returned using the ID number or the api-url that comes back in the search data.
-- Search queries are now saved in a database so the top 5 searches can be displayed on the front page.
That's it for a few days, I think. My coding mind has melted, hehe. I will still keep it updated and gradually add new features, and I will post here when I make any changes, so keep an eye out.
----
EoD
2010-03-22, 09:46
EoDTris Wrote:-- to return detailed information and high res images on a specified movie, now you can using the id number or api-url which is returned in the data when searched. for example http://www.pixelgrafters.co.uk/api/movie/tt0462538 will return high res images and detailed info of The Simpsons Movie
Yes, that's nice. I already use that in the scraper to get the details.
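For reference, the two URL shapes mentioned in this thread (the search path used in the scraper and the movie path quoted above) could be built like this. A minimal sketch: the helper names `search_url` and `movie_url` are made up here, and percent-encoding the query is an assumption about what the site accepts.

```python
import re
from urllib.parse import quote

# Base path taken from the URLs quoted in this thread.
API_BASE = "http://www.pixelgrafters.co.uk/api"


def search_url(query):
    """Search endpoint as used in the scraper's CreateSearchUrl.

    Percent-encoding the query is an assumption, not confirmed site behaviour.
    """
    return "%s/search/%s" % (API_BASE, quote(query))


def movie_url(imdb_id):
    """Details endpoint for a single movie, keyed by its IMDb id."""
    if not re.fullmatch(r"tt\d+", imdb_id):
        raise ValueError("expected an IMDb id such as tt0462538")
    return "%s/movie/%s" % (API_BASE, imdb_id)


print(search_url("die hard"))
print(movie_url("tt0462538"))
```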
2010-03-22, 19:54
MaDDoGo Wrote:But all photos are from themoviedb or there are photos stored in their server?
Yes, all images are scraped from themoviedb, which is fine if you simply want to scrape in the background and grab info through XBMC.
The point of this is to create a quick and easy way to see art and images from movies, whether on a mobile phone, laptop or PC.
It does a lot of filtering to strip the junk from the returned results, as themoviedb is an open database and contains a lot of it (compare a search on my site with one on theirs).
It focuses on the art, so you can very quickly find a nice wallpaper, or show your mates an image on your phone from a film you are about to see at the cinema.
Also, today I changed the CSS style of the site. It looks much cleaner and nicer now, and the top 5 searches are displayed very nicely on the front page.
----
EoD
2010-03-22, 21:23
OK,
I asked because I want to write a common scraper for this, and I wanted to know whether there were thumbnails or not.
I think this does what you would expect:
Code:
<scraperfunctions>
  <!-- Collect every backdrop URL on the page as fanart -->
  <PixelGraftersFanart dest="3">
    <RegExp input="$$5" output="&lt;details&gt;&lt;fanart&gt;\1&lt;/fanart&gt;&lt;/details&gt;" dest="3">
      <RegExp input="$$1" output="&lt;thumb&gt;http://images.themoviedb.org/backdrops\1.jpg&lt;/thumb&gt;" dest="5">
        <expression noclean="1" repeat="yes">backdrops([^\.]*)\.jpg</expression>
      </RegExp>
      <expression noclean="1"/>
    </RegExp>
  </PixelGraftersFanart>
  <!-- Collect every poster URL on the page as thumbs -->
  <PixelGraftersPoster dest="3">
    <RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
      <RegExp input="$$1" output="&lt;thumb&gt;http://images.themoviedb.org/posters\1.jpg&lt;/thumb&gt;" dest="5">
        <expression noclean="1" repeat="yes">posters([^\.]*)\.jpg</expression>
      </RegExp>
      <expression noclean="1"/>
    </RegExp>
  </PixelGraftersPoster>
</scraperfunctions>
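The two functions above capture whatever sits between `backdrops` or `posters` and `.jpg`, then rebuild the full image URL around it. A rough Python equivalent, for trying the pattern against a page by hand; `rebuild_image_urls` is a made-up name, and the sample text reuses poster URLs from the search example earlier in the thread.

```python
import re


def rebuild_image_urls(text, kind):
    """Find '<kind>...jpg' fragments and rebuild full themoviedb URLs.

    kind is "posters" or "backdrops", matching the two scraper functions.
    """
    pattern = re.compile(re.escape(kind) + r"([^.]*)\.jpg")
    return [
        "http://images.themoviedb.org/%s%s.jpg" % (kind, tail)
        for tail in pattern.findall(text)
    ]


# Poster URLs taken from the search example earlier in the thread.
SAMPLE = (
    "<thumb>http://images.themoviedb.org/posters/6997/Die_Hard_thumb.jpg</thumb>"
    "<thumb>http://images.themoviedb.org/posters/6985/Die_Hard_2_thumb.jpg</thumb>"
)

posters = rebuild_image_urls(SAMPLE, "posters")
backdrops = rebuild_image_urls(SAMPLE, "backdrops")
print(posters)
print(backdrops)
```

Because `[^.]*` stops at the first dot, a file name containing an extra dot would not match.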
All you have to do is give them the pixelgrafters.co.uk/api/tt######### page.
I think it works...
2010-03-23, 06:09
Update,
Visits to a movie page are now logged, to allow for stats on the front page showing the top 5 most viewed movies.
As mentioned before, search queries are also now logged, to allow for stats on the top 5 searches.
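The general idea of logging queries and reading back a top 5 can be sketched in a few lines of Python. This is not the site's actual code, just an in-memory illustration: the class name `SearchStats` and the normalisation step (lowercasing and collapsing whitespace) are assumptions, and the real site stores its queries in a database.

```python
from collections import Counter


class SearchStats:
    """In-memory stand-in for a query log; a real site would use a database."""

    def __init__(self):
        self._counts = Counter()

    def log(self, query):
        # Assumed normalisation so "Die Hard" and "die  hard " count as one query.
        normalized = " ".join(query.lower().split())
        if normalized:
            self._counts[normalized] += 1

    def top(self, n=5):
        """Return the n most frequent queries as (query, count) pairs."""
        return self._counts.most_common(n)


stats = SearchStats()
for q in ["Die Hard", "die hard ", "Avatar", "die  hard", "avatar", "Up"]:
    stats.log(q)

print(stats.top(5))
```

The same counter would work for page views by logging the movie id instead of the query text.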
I'm not sure there is much more I can do with this, other than gradually refine and tweak the filtering of results over time.
It's also great to see you people trying out your own scrapers with it. I will be trying them out myself tomorrow.
Anyway, I hope you like it, and if you have any ideas for me to add to PixelGrafters, I would love to hear them.
Cheers
----
EoD