Need help in coding plugin for flimicity.in
#16
On the site http://www.rajshri.com, under any section like TV shows, movies, etc., when I move my mouse over page 1, 2, etc., I see a JavaScript call. In the source I see it as

Code:
<a id="ctlPagingTop_lnkBtn4" title="Page 208" href="javascript:__doPostBack('ctlPagingTop$lnkBtn4','')" style="cursor:pointer;cursor:hand;padding:3px;margin-right:0px;border:1px solid #CCCCCC;">208</a>

So is there any way in Python to handle these JavaScript links?

I googled it, but many links say there is no support for JavaScript in Python, so I just wanted your opinion on whether it's worth the effort to make a plugin for http://www.rajshri.com.
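From what I can tell, a link like that triggers an ASP.NET WebForms postback, which can sometimes be reproduced by POSTing the page's hidden form fields back to it. Below is a rough, untested sketch with urllib2; the hidden field names are the standard ASP.NET ones and are only an assumption for rajshri.com, which may require more of them.

Code:
import urllib, urllib2, re

url = 'http://www.rajshri.com/'
html = urllib2.urlopen(url).read()

# grab the hidden state fields that ASP.NET expects to be posted back
viewstate = re.findall('name="__VIEWSTATE".*?value="(.*?)"', html)
eventvalidation = re.findall('name="__EVENTVALIDATION".*?value="(.*?)"', html)

# simulate __doPostBack('ctlPagingTop$lnkBtn4','') by posting the form back
data = {'__EVENTTARGET': 'ctlPagingTop$lnkBtn4', '__EVENTARGUMENT': ''}
if viewstate:
    data['__VIEWSTATE'] = viewstate[0]
if eventvalidation:
    data['__EVENTVALIDATION'] = eventvalidation[0]

page208 = urllib2.urlopen(url, urllib.urlencode(data)).read()
print len(page208)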

Please let me know
Reply
#17
Stacked,

I am trying to get only the id from filmicity so that I can add +1 to it and display each part in the for loop.

Please guide me on the code below, as I want only the id instead of videos.php?id=14093.

Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<a href="videos.php?id=(.+?)"><img src="(.+?)" border="0" height="115" width="155" alt="(.+?)" title="(.+?)" class="reflect rheight20 ropacity50"/></a>').findall(link)

Thanks
Reply
#18
And once the above script works, will the code below then give me the id?

Code:
part1 = match[0]
print part1

FYI :) I am actually not a coder but am trying it out.
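For reference, when the pattern has several groups, findall returns a list of tuples, so match[0] will be the whole tuple rather than just the id. A standalone sketch against a made-up line of HTML (note that the . and ? in the pattern are escaped, which the pattern above will also need):

Code:
import re

# one thumbnail link, made up here to look like the front page source
link = '<a href="videos.php?id=14093"><img src="thumb.jpg" border="0" height="115" width="155" alt="Some Movie" title="Some Movie" class="reflect rheight20 ropacity50"/></a>'
match = re.compile('<a href="videos\.php\?id=(.+?)"><img src="(.+?)" border="0" height="115" width="155" alt="(.+?)" title="(.+?)" class="reflect rheight20 ropacity50"/></a>').findall(link)

print match[0]               # ('14093', 'thumb.jpg', 'Some Movie', 'Some Movie')
print match[0][0]            # 14093 - just the id
print int(match[0][0]) + 1   # 14094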
Reply
#19
try this

Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/videos.php?id=14169'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
parts=re.compile('1 of (\d+)').findall(link)    # total number of parts, from the "Part 1 of n" text
match=re.compile('<a href="videos\.php\?id=(.+?)" ><img src="templates/Photine/images/next_video.png" border').findall(link)    # id of the "next video" link
print parts[0]
print match[0]
save=int(match[0])
# starting from the next video's id, count upwards and print the following ids
for x in range(int(parts[0])):
    save=save+1
    print save
Reply
#20
Thanks.

Do you see anything wrong with the code below? I am not getting any value; it gives [] instead.

Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<a href="/index.php?next=(.+?)">[(.+?)]</a>').findall(link)
print match

Thanks for all your guidance.
Reply
#21
Some good news: I have been able to make the parts work; now only the paging is left and the plugin will be complete.

Please let me know why I am not able to get any values from the two code snippets below.

This code lists the pages for videos (1, 2, 3, etc.):

Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<a href="/index.php?next=(.+?)">[(.+?)]</a>').findall(link)
print match


This code should list the movie pages (A, B, C, etc.):

Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<a href='list.php?alpha=(.+?)'>(.+?)</a>').findall(link)
print match

Thanks
Reply
#22
Quote:There are 11 characters with special meanings: the opening square bracket [, the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening round bracket ( and the closing round bracket ). These special characters are often called "metacharacters".

If you want to use any of these characters as a literal in a regex, you need to escape them with a backslash.

so

Code:
match=re.compile('<a href='list.php?alpha=(.+?)'>(.+?)</a>').findall(link)

should be
Code:
match=re.compile('<a href='list\.php\?alpha=(.+?)'>(.+?)</a>').findall(link)
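As a quick standalone check of why the escaping matters (the HTML string here is just an example):

Code:
import re

html = '<a href="videos.php?id=14093">208</a>'

print re.findall('videos.php?id=(.+?)"', html)     # [] - the unescaped . and ? break the match
print re.findall('videos\.php\?id=(.+?)"', html)   # ['14093'] - escaped, so they match literally
print re.escape('videos.php?id=')                  # videos\.php\?id\= - re.escape() can do the escaping for you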
Reply
#23
When I run the code below, it still gives me no value, just [].

Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<a href="/index\.php\?next=(.+?)">[(.+?)]</a>').findall(link)
print match


And the code below gives me a syntax error:


Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<a href='list\.php\?alpha=(.+?)'>(.+?)</a>').findall(link)
print match

please help
Reply
#24
OK, the first code, shown below, is working now after I removed the [] around the second group.

Code:
import urllib2,urllib,re

#Filmicity
url = 'http://www.filmicity.in/'
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<a href="/index\.php\?next=(.+?)">(.+?)</a>').findall(link)
print match
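For reference, a quick standalone check of that pattern against a made-up pagination link:

Code:
import re

html = '<a href="/index.php?next=2">2</a>'
print re.compile('<a href="/index\.php\?next=(.+?)">(.+?)</a>').findall(html)
# [('2', '2')]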

I will work on the next one.
Reply
#25
Code:
match=re.compile('<a href=\'list\.php\?alpha=(.+?)\'>(.+?)</a>').findall(link)

you forgot to escape the single quotes
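Alternatively, using double quotes for the outer Python string avoids having to escape the single quotes at all (equivalent to the line above):

Code:
match=re.compile("<a href='list\.php\?alpha=(.+?)'>(.+?)</a>").findall(link)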
Reply
#26
Thanks.

Below is my code. It displays the movie pages, the TV show pages and the parts under each show, and it also plays some videos, but some it does not, even when they are from Dailymotion. The video plays on the website, but in XBMC nothing is displayed under parts; normally under the part 1 folder it would show "play part 1".

http://pastebin.com/f449bc4b4

If you look at page 1 of TV shows, most of the videos in the parts play, but from page 2 onwards it is very random. For example, on page 2 the "1 Aug" entry does not have any videos under its parts. Any idea why it is behaving like this?

Below is the log where I am able to play one video, while for the other it does not display any video to play; the parts folder is empty.

http://pastebin.com/m5f6dd8cf


Also, is it possible to pass multiple URLs to one definition of parts? For example, the video page http://www.filmicity.in/videos.php?id=9865 does not show the parts in its source as "part 1 of", but its next video page, http://www.filmicity.in/videos.php?id=9866, displays "part 2 of 7". So I was wondering whether to first check if the main URL has the parts and, if not, change the URL to the next page and scrape the parts from there.

I am trying to write the code shown below to handle the different parts conditions, but the last elif section is where I am trying to figure out a way to scrape multiple URLs:

http://pastebin.com/f3d45ab0f

In the above code you will get results when you use

url='http://www.filmicity.in/videos.php?id=14169'
url='http://www.filmicity.in/videos.php?id=14159'

but no results with

url='http://www.filmicity.in/videos.php?id=9865' - that's because I am not sure if it's possible to do it.

Please let me know.

Thanks
Reply
#27
sansat Wrote:Also, is it possible to pass multiple URLs to one definition of parts? For example, the video page http://www.filmicity.in/videos.php?id=9865 does not show the parts in its source as "part 1 of", but its next video page, http://www.filmicity.in/videos.php?id=9866, displays "part 2 of 7". So I was wondering whether to first check if the main URL has the parts and, if not, change the URL to the next page and scrape the parts from there.

I am trying to write the code shown below to handle the different parts conditions, but the last elif section is where I am trying to figure out a way to scrape multiple URLs:

http://pastebin.com/f3d45ab0f

In the above code you will get results when you use

url='http://www.filmicity.in/videos.php?id=14169'
url='http://www.filmicity.in/videos.php?id=14159'

but no results with

url='http://www.filmicity.in/videos.php?id=9865' - that's because I am not sure if it's possible to do it.

Please let me know.

Thanks

You could pass multiple urls by:

url='http://www.filmicity.in/videos.php?id=14169'+';'+'http://www.filmicity.in/videos.php?id=14169'+';'+'http://www.filmicity.in/videos.php?id=14169'

then once it gets passed, you can split them by:

urls=url.split(';')

So the first URL will be urls[0], the second urls[1], and so on.
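A rough sketch of how that could look inside the parts definition, trying each URL in turn and stopping at the first page that actually lists its parts (the function name and the 'of (\d+)' pattern here are only illustrative, not taken from the plugin):

Code:
import urllib2, re

def PARTS(url):
    # url can hold several addresses joined with ';'
    for u in url.split(';'):
        req = urllib2.Request(u)
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
        response = urllib2.urlopen(req)
        link = response.read()
        response.close()
        parts = re.compile('of (\d+)').findall(link)
        if parts:
            # this page does show "part x of y", so stop here
            return int(parts[0])
    return 0

print PARTS('http://www.filmicity.in/videos.php?id=9865;http://www.filmicity.in/videos.php?id=9866')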


btw, this site looks a little too unorganized to make a plugin for. There are too many random situations to deal with. You're gonna have a hard time trying to get everything to work smoothly. Good luck...
Reply
#28
Thanks,

Can you please let me know about the query below?

Below is my code. It displays the movie pages, the TV show pages and the parts under each show, and it also plays some videos, but some it does not, even when they are from Dailymotion. The video plays on the website, but in XBMC nothing is displayed under parts; normally under the part 1 folder it would show "play part 1".

http://pastebin.com/f449bc4b4

If you look at page 1 of TV shows, most of the videos in the parts play, but from page 2 onwards it is very random. For example, on page 2 the "1 Aug" entry does not have any videos under its parts. Any idea why it is behaving like this?

Below is the log where I am able to play one video, while for the other it does not display any video to play; the parts folder is empty.

http://pastebin.com/m5f6dd8cf

Thanks
Reply
#29
sansat Wrote:Thanks,

Can you please let me know about the query below?

Below is my code. It displays the movie pages, the TV show pages and the parts under each show, and it also plays some videos, but some it does not, even when they are from Dailymotion. The video plays on the website, but in XBMC nothing is displayed under parts; normally under the part 1 folder it would show "play part 1".

http://pastebin.com/f449bc4b4

Quote:If you look at page 1 of TV shows, most of the videos in the parts play, but from page 2 onwards it is very random. For example, on page 2 the "1 Aug" entry does not have any videos under its parts. Any idea why it is behaving like this?

Below is the log where I am able to play one video, while for the other it does not display any video to play; the parts folder is empty.

http://pastebin.com/m5f6dd8cf

Thanks

C'mon, it's not magic that lets me find where the problem is. You can do it too. You just need to break down the code and find which parts aren't working. Usually the problem is a regex not matching.

My thought process for finding the problem...
Since I know that everything works until I click on one of the parts links (e.g. Part 1 for 1 Aug MTV Connected), I know the problem is somewhere under the VIDEOLINKS function. Because the video is from Dailymotion, I can start testing the regex code for Dailymotion at line 92. Once there, I test out the first regex
Code:
p=re.compile('<param name="movie" value="http://www.dailymotion.com/swf/(.+?)">')
and it works fine. When I test out the second regex
Code:
comp=re.compile('<a href="(.+?)" title="Click to Download"><font color=red>')
, and I notice nothing is matched. So I open the same URL in my browser and look at the HTML source (link). There I can see why the function wasn't returning a video: flashvideodownloader.org shows "Download temporarily unavailable." for some videos.
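Once you know which regex is coming back empty, the plugin could at least skip those videos instead of showing an empty parts folder. A small sketch of the idea, assuming link already holds the fetched page and re is imported as in the scripts above:

Code:
comp=re.compile('<a href="(.+?)" title="Click to Download"><font color=red>').findall(link)
if not comp:
    # no download link on flashvideodownloader.org for this one,
    # so skip it instead of adding an empty part
    print 'download temporarily unavailable for this video'
else:
    print comp[0]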
Reply
#30
Thanks Stacked. Actually I was looking only at the first regex, and it was matching in both the working and non-working videos; I did not look into the second one. Do we need the second one at all, given that I am not downloading the videos? Maybe I will remove it and see if that helps.

Thanks for your guidance, and also regarding multiple URLs: my requirement was to first scrape one URL and, if it does not return a value, then scrape a second URL. When I add the code below once for each URL it does not work, so I was not sure if we could scrape several URLs in one definition (for example in def PARTS(url)) by using the code below twice:

Code:
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
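One way around that is to wrap the repeated request code in a small helper so one definition can call it as many times as needed; a sketch (the fetch name and the "of" check below are only illustrative):

Code:
import urllib2, re

def fetch(u):
    # download one page and return its HTML
    req = urllib2.Request(u)
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
    response = urllib2.urlopen(req)
    link = response.read()
    response.close()
    return link

# first try the page itself, then fall back to the next video's page
link = fetch('http://www.filmicity.in/videos.php?id=9865')
if not re.findall('of (\d+)', link):
    # this page does not show "part x of y", so scrape the next one instead
    link = fetch('http://www.filmicity.in/videos.php?id=9866')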

Thanks for your guidance and tips, as they really point me in the right direction. As you said, this site is very inconsistent, but since it's the first site I am trying, working through various scenarios should help me create other plugins faster, hopefully :)
Reply
