2012-08-08, 14:41
can someone look at this doing my head in just f5 it and AAARGGGGGGHHHHHHH
this is how it should come out
Code:
import urllib2,urllib,re
url='http://www.imdb.com/title/tt0844441/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match = re.compile('.+?/rg/tt-episodes/season.+?/images/.+?season.+? href="(.+?)" >(.+?)</a>').findall(link)
for url1, name in match:
url= str(url)+url1
print url
this is how it should come out
Code:
http://www.imdb.com/title/tt0844441/episodes?season=6
http://www.imdb.com/title/tt0844441/episodes?season=5
http://www.imdb.com/title/tt0844441/episodes?season=4
http://www.imdb.com/title/tt0844441/episodes?season=3
http://www.imdb.com/title/tt0844441/episodes?season=2
http://www.imdb.com/title/tt0844441/episodes?season=1