TV Show screenshots
#1
I was wondering if anyone would be interested in getting TV show episode screenshots from the web. I looked into it some time ago and discovered that Wikipedia already has them for each episode.

For instance for Stargate SG-1 the link would be:
http://en.wikipedia.org/wiki/List_of..._SG-1_episodes

For Smallville it would be:
http://en.wikipedia.org/wiki/List_of...ville_episodes

The icons posted on those pages are around:
125x94 for 4x3
125x69 for 16x9
I download the slightly bigger icons by clicking through to the individual episode page:
250x188 for 4x3
250x138 for 16x9
For example, the page for Smallville season 1, episode 1
http://en.wikipedia.org/wiki/Pilot_%...lle_episode%29
gives me a slightly bigger icon.
I don't know if anyone would be interested in getting XBMC to grab these automatically.
Just thought I would throw the info out there.
Reply
#2
Interesting point, but I think it could be difficult to get them automatically, couldn't it?
Reply
#3
El Piranna Wrote: Interesting point, but I think it could be difficult to get them automatically, couldn't it?

Couldn't you use the information gathered from, say, a tv.com scraper to find the image by a pattern such as the episode name and/or number?
Reply
#4
That should work because the image names on each of the sites contain either the episode number and name or just the episode name. By the way, the links I posted don't work.

They're supposed to be:
http://en.wikipedia.org/wiki/List_of_Sta...1_episodes
and
http://en.wikipedia.org/wiki/List_of_Sma...e_episodes
Reply
#5
The problem is that almost everybody uploads images to Wikipedia without any care for naming...
Reply
#6
Could we base the pattern on the surrounding code instead of the image name itself? For instance:

<td rowspan="2"><a href="/wiki/Image:Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg" class="image" title=""><img src="http://upload.wikimedia.org/wikipedia/en/thumb/9/98/Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg/150px-Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg" alt="" width="150" height="84" longdesc="/wiki/Image:Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg" /></a></td>
<td><b><a href="/wiki/Thirty-Eight_Minutes_%28Stargate_Atlantis%29" title="Thirty-Eight Minutes (Stargate Atlantis)">Thirty-Eight Minutes</a></b></td>
<td><a href="/wiki/July_30" title="July 30">July 30</a>, <a href="/wiki/2004" title="2004">2004</a></td>
<td>1x04</td>

The image link is:
/wiki/Image:Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg
The title is:
title="Thirty-Eight Minutes (Stargate Atlantis)">
The episode number is:
<td>1x04</td>
(this is the production code)
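
As a rough sketch of that idea in Python, the row above can be picked apart by its surrounding markup rather than by the image file name. `ROW` is a slightly shortened copy of the row quoted above, `parse_row` is a made-up helper name, and the three patterns assume every row follows exactly this layout, which may well not hold for other shows.

```python
import re

# Slightly shortened copy of the table row quoted above (assumed layout).
ROW = '''<td rowspan="2"><a href="/wiki/Image:Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg" class="image" title=""><img src="http://upload.wikimedia.org/wikipedia/en/thumb/9/98/Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg/150px-Thirty_eight_minutes_%28Stargate_Atlantis%29.jpg" alt="" width="150" height="84" /></a></td>
<td><b><a href="/wiki/Thirty-Eight_Minutes_%28Stargate_Atlantis%29" title="Thirty-Eight Minutes (Stargate Atlantis)">Thirty-Eight Minutes</a></b></td>
<td><a href="/wiki/July_30" title="July 30">July 30</a>, <a href="/wiki/2004" title="2004">2004</a></td>
<td>1x04</td>'''


def parse_row(row):
    """Return image link, title and episode number for one row, or None."""
    # The image link is the href of the anchor carrying class="image".
    image = re.search(r'<a href="(/wiki/Image:[^"]+)" class="image"', row)
    # The title is the title attribute of the bold episode link.
    title = re.search(r'<b><a href="[^"]*" title="([^"]*)">', row)
    # The episode number is the cell shaped like 1x04.
    number = re.search(r'<td>(\d+)x(\d+)</td>', row)
    if not (image and title and number):
        return None
    return {
        "image_page": image.group(1),
        "title": title.group(1),
        "season": int(number.group(1)),
        "episode": int(number.group(2)),
    }


info = parse_row(ROW)
```

Here `info["season"]` and `info["episode"]` come out as 1 and 4, so the result could be matched against whatever episode numbering a scraper already knows.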
Reply
#7
gzusrawx Wrote: Could we base the pattern on the surrounding code instead of the image name itself? For instance

If you can do it, you will have made the most powerful A.I. program ever :D Maybe it is possible, but right now I can't think it through. If you don't know how to program, please write me a "recipe" for how you would do it and I'll try to build something (I repeat: I'll try; I don't think I have the programming skills to do anything serious...)
Reply
#8
What programming language would this need to be in?
Reply
#9
A regular expression for the following might work, depending on the page layout.
Code:
<tr>.*?src="(.*?)".*?<td>\d+x\d+</td>\s+</tr>
If there are additional images and tables on the page, you may have to add more to the regex or split the page by <tr> first. My regex may be wrong as well since I only spent about 20 seconds on it, but hopefully it will help you guys get started. :)
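
Splitting the page into rows first might look roughly like this in Python. It is only a sketch: `PAGE` is an invented two-row fragment with an example.org URL, and `images_by_episode` is a hypothetical helper name. Working row by row avoids one pitfall of a single page-wide pattern, which can start in a row that has no image and run on into the next row.

```python
import re

# Hypothetical page fragment; only the second row has a screenshot.
PAGE = """<table>
<tr>
<td>No screenshot</td>
<td>1x01</td>
</tr>
<tr>
<td><img src="http://example.org/thumb/150px-Second.jpg" /></td>
<td>1x02</td>
</tr>
</table>"""

# DOTALL lets "." cross line breaks, so one match covers a whole row.
ROW_RE = re.compile(r'<tr>(.*?)</tr>', re.DOTALL)
SRC_RE = re.compile(r'src="(.*?)"')
NUM_RE = re.compile(r'<td>(\d+x\d+)</td>')


def images_by_episode(page):
    """Map episode numbers such as '1x02' to the image URL in the same row."""
    found = {}
    for row in ROW_RE.findall(page):
        src = SRC_RE.search(row)
        num = NUM_RE.search(row)
        # Skip rows that lack either an image or an episode number.
        if src and num:
            found[num.group(1)] = src.group(1)
    return found


result = images_by_episode(PAGE)
```

With this fragment, `result` holds a single entry for 1x02, and the image-less 1x01 row is skipped instead of borrowing the next row's image.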

Your other option would be to have me add episode images to my site, which some people have already requested. Not sure when I'd have time to get to it, though. http://tvdb.zsori.com/ (interfaces at http://tvdb.zsori.com/?tab=xml)
Reply
#10
For starters, the scraper engine does not support laziness, i.e. (.*?) doesn't work.
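
A common way around missing lazy quantifiers is a negated character class, sketched here with Python's `re` module purely to show the equivalence. Whether the scraper engine accepts this syntax is an assumption, and `TAG` is a made-up example tag.

```python
import re

# Made-up image tag used only for this comparison.
TAG = '<img src="http://example.org/a.jpg" alt="" width="150" />'

# Lazy version: stops at the first closing quote.
lazy = re.search(r'src="(.*?)"', TAG).group(1)

# Plain greedy version: runs on to the last quote in the tag.
greedy = re.search(r'src="(.*)"', TAG).group(1)

# Negated class: "anything but a quote", no laziness needed.
negated = re.search(r'src="([^"]*)"', TAG).group(1)
```

Here `lazy` and `negated` capture the same URL, while `greedy` swallows the following attributes as well.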

A scraper is designed to work with all TV shows. This means that one regular expression has to work with every list of episodes. AFAIK Wikipedia is not consistent enough for this to work.
Reply
#11
If you want screenshots of episodes generated quickly, you could use a program called frame shots. It's what I use to generate thumbs on my server; it runs alongside an AutoIt script that keeps the server up to date without regenerating all the old stuff. It took only a few minutes to set up. AutoIt is free, and frame shots beta 3.0 is free and doesn't watermark the images.
Reply
#12
With regexp, is there any way to specify, say, the 5th image in a list of images?
If I do a regexp for all the images in a page, will it give them to me in the order they are coded?
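
For what it's worth, in Python's `re` module `findall` scans left to right and returns matches in the order they appear in the text, so the 5th image is just index 4 of the result. A small sketch, where `PAGE` is an invented snippet and other regex engines may behave differently:

```python
import re

# Invented snippet with six images and some text in between.
PAGE = ('<img src="a.jpg"><p>text</p><img src="b.jpg"><img src="c.jpg">'
        '<img src="d.jpg"><img src="e.jpg"><img src="f.jpg">')

# findall returns the captured src values in document order.
sources = re.findall(r'<img[^>]*src="([^"]*)"', PAGE)

# Lists are zero-based, so the 5th image sits at index 4.
fifth = sources[4] if len(sources) >= 5 else None
```

In this snippet `fifth` ends up as "e.jpg".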
Reply
#13
There are already two scripts that gather & display information from tv.com.
My own, TV.com, does have a Gallery feature for either Shows or People.

It's available on xbmcscript.com. Worth a try?
Retired from Add-on dev
Reply
#14
You really need a nice organized database-driven site for a scraper to really shine. The content on Wikipedia does not conform to machine-readable standards.
Reply
#15
Anyone willing to host one?
I can do some work on posting the current images and organizing them on the site.
Reply
