
~~mikey1234~~ · (This post was last modified: 2012-07-12, 14:42 by mikey1234.)

How can I return only 12 items instead of the whole lot?

Code:
url = 'http://deturl.com/www.youtube.com/results?search_query=adele+karaoke'         

req = urllib2.Request(url)

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

response = urllib2.urlopen(req)

link=response.read()

response.close()

match=re.compile('src="http://i1.ytimg.com/vi/(.+?)/1.jpg').findall(link)

for url in match:

  url1='http://www.flipbooth.com/yt/%s/' % url

  req = urllib2.Request(url1)

  req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

  response = urllib2.urlopen(req)

  link=response.read()

  response.close()

  match1 = re.compile('property="og:title" content="(.+?)"/>\r\n<meta').findall(link)

oneadvent · (This post was last modified: 2012-07-12, 14:16 by oneadvent.)

Do you need to limit the search or the for statement?

For instance, is it OK for match1 to obtain all matches if the for loop only deals with the first 12?

If it is a limit on the for statement, you could just add a counter, i.e.:

Code:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]  # mock match1 list

count = 0

for i in nums:  # same as your "for name in match"

    count = count + 1

    if count > 5:  # hard limit of 5 items

        break  # this breaks out of the for statement

    else:

        print str(i)  # whatever code you need goes here
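
A slice can do the same job without a counter. This is only a minimal sketch, assuming match is an ordinary list like the one findall() returns; the IDs below are placeholders, not real results, and the snippet runs on Python 2 or 3:

```python
# Placeholder IDs standing in for the list that findall() returns.
match = ['id%02d' % n for n in range(1, 26)]  # 25 fake entries

limit = 12
first_items = match[:limit]  # slicing is safe even if match has fewer entries

for video_id in first_items:
    # The per-item work (the second request) would go here.
    print(video_id)
```

The loop body never sees more than 12 entries, so there is no counter to get off by one.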

~~mikey1234~~ · (This post was last modified: 2012-07-12, 14:43 by mikey1234.)

I changed the code in the first post to show all of it.

Basically, when people do a search, match spits out about 25 to 30 results, but it takes forever,

so I only want it to return about 12.

oneadvent · (This post was last modified: 2012-07-12, 15:10 by oneadvent.)

Looks to me like this is the slow one:

Code:
match1 = re.compile('property="og:title" content="(.+?)"/>\r\n<meta').findall(link)

which is being called as many times as

Code:
match=re.compile('src="http://i1.ytimg.com/vi/(.+?)/1.jpg').findall(link)

matches. So doing what I was saying should work:

Code:
url = 'http://deturl.com/www.youtube.com/results?search_query=adele+karaoke'         

req = urllib2.Request(url)

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

response = urllib2.urlopen(req)

link=response.read()

response.close()

match=re.compile('src="http://i1.ytimg.com/vi/(.+?)/1.jpg').findall(link)

i = 0

for url in match:

  i +=1

  if i > 11:

    break

  else:

    url1='http://www.flipbooth.com/yt/%s/' % url

    req = urllib2.Request(url1)

    req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

    response = urllib2.urlopen(req)

    link=response.read()

    response.close()

    match1 = re.compile('property="og:title" content="(.+?)"/>\r\n<meta').findall(link)

I suppose it is "sloppy", but I do not see a way to limit the number of matches through findall itself.
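
For what it is worth, findall() takes no limit argument, but finditer() returns matches lazily, so itertools.islice() can stop it early. A rough sketch, assuming made-up HTML in place of the real results page (the vidNN IDs are invented); it runs on Python 2 or 3:

```python
import re
from itertools import islice

# Made-up HTML standing in for the search results page.
html = ''.join('<img src="http://i1.ytimg.com/vi/vid%02d/1.jpg">' % n
               for n in range(30))

pattern = re.compile(r'src="http://i1\.ytimg\.com/vi/(.+?)/1\.jpg')

# finditer() yields match objects one at a time; islice() stops after 12.
video_ids = [m.group(1) for m in islice(pattern.finditer(html), 12)]

print(len(video_ids))
```

This only shortens the regex step. The per-item requests are a separate cost, so the loop over the results still needs to work from the shortened list.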

~~mikey1234~~ · 2012-07-12, 15:39

Lol, not sloppy...

But match still returns 24

and match1 only returns 1.

match is actually pretty quick,

but match1 uses the URL from match and scrapes a different website 24 times to get the name.

oneadvent · (This post was last modified: 2012-07-12, 15:43 by oneadvent.)

Hmm, there has to be a typo. Yeah, it is OK for match to have 24, and as you say, match1 is the slowpoke because of all the websites. That's why we want the loop to stop after 12...

Really has to be a typo somewhere... either with me or you?

Want to repaste the code? Oh, and "if i > 11:" really should be "if i >= 11:".

oneadvent · 2012-07-12, 15:59

OK I ran this:

Code:
# This test program is for finding the correct regular expressions on a page to insert into the plugin template.

# Enter the URL between the quotes in url='here' (use Ctrl-V to paste).

# Copy the info from the source HTML and put it between the quotes in match=re.compile('here').

# Press F5 to run. If match is blank, close and try again.

import urllib2,urllib,re

url = 'http://deturl.com/www.youtube.com/results?search_query=adele+karaoke'

req = urllib2.Request(url)

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

response = urllib2.urlopen(req)

link=response.read()

response.close()

match=re.compile('src="http://i1.ytimg.com/vi/(.+?)/1.jpg').findall(link)

print match

i = 0

for url in match:

    i += 1

    if i >= 11:

        break

    else:

        url1 = 'http://www.flipbooth.com/yt/%s/' % url

        req = urllib2.Request(url1)

        req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

        response = urllib2.urlopen(req)

        link = response.read()

        response.close()

        match1 = re.compile('property="og:title" content="(.+?)"/>\r\n<meta').findall(link)

        print match1

and I get the result:

Code:
['TGVhQ61C6IU', 'sTE9s43rdT8', 'v5pJMwgKQcw', 'XUQVk4oG8LM', 'x-nTWktLBL8', 'Cgdes6lFjzM', 'vWaXO1wnOKU', 'L7nmEjjrKGc', 'fgfJLw9Zug4', '1l0drm-lM5M', 'uqhVqsPI5v8', 'KMOmkQ_LhYA', 'AX-kWqAwH1I', 'HHL0q-z7CyM', '59CPfYsIKc4', '2QC8DfcSe_0', 'SFRJpyNQ1K0', '3lrBve6xLw8', 'MgEyI3EFPP4', '_4fNgNCxQWk', '7D26e6l_5iI', 'JFwd16raBKU', 'LrW7Umr_5wM', 'SRn4Fh47kok', 'li9W-yEjK2g', 'uQCaVs5FGpc', '3bxsKcbKVa0', 'buG0HCAFy3s', '2AHVUH_bGBY', 'LtMb_fGHUY0', '4W5TO-woLmg', '0jF6XyW3QY4']

['Turning Tables ~ Adele Karaoke/Intrumental']

['Make You Feel My Love - Adele [Karaoke/Instrumental]']

['Someone Like You - ADELE (Karaoke)']

['Adele - Set Fire To The Rain (Karaoke)']

['One And Only -- Adele (karaoke - full version)']

['Adele - Someone Like You Karaoke']

[]

[]

['Set Fire To The Rain - Adele - Karaoke']

["I'll Be Waiting - Adele Karaoke/Instrumental + Lyrics"]

So it is definitely limiting: with "if i >= 11:" the loop handles 10 items, so use "if i > 12:" if you want 12. The two blanks were removed, by the way; you might want to check for those before adding 1 to i.
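
One way to check for blanks before counting, sketched under assumptions: fetch_titles is a hypothetical stand-in for the request plus the og:title regex (it returns a list of strings, like findall() does), and fake_fetch and the IDs are invented so the snippet runs without touching the network, on Python 2 or 3:

```python
def first_titles(video_ids, fetch_titles, limit=12):
    """Collect up to `limit` titles, skipping IDs that yield no title."""
    titles = []
    for video_id in video_ids:
        if len(titles) >= limit:
            break  # enough titles collected, stop making requests
        found = fetch_titles(video_id)  # list of strings, possibly empty
        if not found:
            continue  # blank result: do not count it toward the limit
        titles.append(found[0])
    return titles


# Hypothetical stand-in for the real request + regex; every third ID is blank.
def fake_fetch(video_id):
    n = int(video_id[2:])
    return [] if n % 3 == 0 else ['Title %d' % n]


ids = ['id%d' % n for n in range(1, 31)]
titles = first_titles(ids, fake_fetch, limit=12)
print(len(titles))
```

Counting collected titles rather than loop passes means the limit is the number of names you actually get back. A blank still costs one request, so a page with many removed videos will take a little longer.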

~~mikey1234~~ · 2012-07-12, 16:29

That's brilliant, thank you.

You're a star.