Question: problem about encoding (i think utf-8)
#16
Scratch that, I misread the trace. What is the code in your show_names function?
Reply
#17
(2013-06-03, 18:35)Bstrdsmkr Wrote: Scratch that, I misread the trace. What is the code in your show_names function?

Code:
@plugin.route('/series/<stype>/<soption>')
def show_names(stype, soption):
    page = 1
    if soption == 'latest':
        page = int(plugin.request.args.get('page', ['1'])[0])
        names = scraper.get_latest(page, stype)
    elif soption == 'orderby':
        page = int(plugin.request.args.get('page', ['1'])[0])
        names = scraper.get_orderby(page, stype)

    items = []
    
    if page > 1:
        previous_page = str(page - 1)
        items.append({
            'label': '<< %s %s <<' % (_('page'), previous_page),
            'thumbnail': scraper.PREV_IMG,
            'path': plugin.url_for(
                endpoint='show_names',
                stype=stype,
                soption=soption,
                page=previous_page,
                update='true'
            )
        })
    next_page = str(page + 1)
    items.append({
        'label': '>> %s %s >>' % (_('page'), next_page),
        'thumbnail': scraper.NEXT_IMG,
        'path': plugin.url_for(
            endpoint='show_names',
            stype=stype,
            soption=soption,
            page=next_page,
            update='true'
        )
    })
    
    items.extend([{
        'label': name['title']+' '+name['statusname']+' Update('+name['lastupdated']+')',
        'thumbnail': name['thumb'],
        'info': {
            'count': i,
            'genre': name['genre'],
            'year': name['year'],
            'episode': name['numberofep'],
            'director': name['director'],
            'plot': name['plot'],
            'plotoutline': name['prodcom'],
            'title': name['title'],
            'studio': name['studio'],
            'writer': name['writer'],
            'tvshowtitle': name['title'],
            'premiered': name['year'],
            'status': name['statusname'],
            'trailer': name['trailer']
        },
        'path': plugin.url_for(
            endpoint='show_episodes',
            stype=stype,
            soption=soption,
            name_id=name['id']
        )
    } for i, name in enumerate(names)])
    plugin.set_content('tvshows')
    return plugin.finish(items, update_listing=True)
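As an aside, xbmcswift hands query parameters to the route as a dict of lists of strings, which is why the code above reads the page with plugin.request.args.get('page', ['1'])[0]. That pattern can be sketched as a standalone helper (get_page is a hypothetical name; the plain dict stands in for plugin.request.args):

```python
def get_page(args, default=1):
    # args maps parameter names to lists of strings,
    # the shape xbmcswift uses for query arguments.
    try:
        return int(args.get('page', [str(default)])[0])
    except (ValueError, IndexError):
        # Malformed or empty value: fall back to the default page.
        return default

print(get_page({'page': ['3']}))  # 3
print(get_page({}))               # 1
```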

My scraper:
Code:
def get_latest(page, stype):
    usefull = __get_usefull(stype)
    url = usefull['surl'] + '?type=serie&pid=%d&ps=20' % int(page-1)
    return _get_names(url)

def _get_names(url):
    tree, html = __get_tree(url)
    
    names = []
    for item in tree.findAll('item'):
        if item.status.string == '0':
            item.status.string = '[Soon]'
        elif item.status.string == '1':
            item.status.string = '[Airing]'
        elif item.status.string == '2':
            item.status.string = '[Completed]'
        names.append({
            'id': item.id.string,
            'title': item.entitle.string,
            'thumb': usefull['poster'] + item.picturename.string,
            'genre': item.type.string,
            'year': int(item.dateshowtime.string),
            'numberofep': int(item.numberofpart.string),
            'director': item.director.string,
            'plot': item.abstract.string,
            'prodcom': item.productioncompany.string,
            'studio': item.stationonair.string,
            'writer': item.written.string,
            'statusname': item.status.string,
            'trailer':item.trailer.string,
            'lastupdated': item.lastupdated.string
        })
    log('_get_names got %d item' % len(names))
    return names

def __get_tree(url):
    log('__get_tree opening url: %s' % url)
    req = urllib2.Request(url)
    req.add_header('User-Agent', usefull['uagent'])
    try:
        html = urllib2.urlopen(req).read()
    except urllib2.HTTPError, error:
        raise NetworkError('HTTPError: %s' % error)
    log('__get_tree got %d bytes' % len(html))
    tree = BeautifulSoup(html, convertEntities=BeautifulSoup.XML_ENTITIES)
    return tree, html
Reply
#18
Found it, it was in xbmcswift. Line 64 in urls.py: urllib.quote_plus() isn't unicode friendly, so each value needs to be encoded to UTF-8 (or almost any other valid encoding) first; then the dict can be passed to urllib.urlencode() to get back a valid URL query string. This is what I use:
Code:
def build_plugin_url(self, queries):
        '''
        Returns a ``plugin://`` URL which can be used to call the addon with
        the specified queries.
        
        Example:
        
        >>> addon.build_plugin_url({'name': 'test', 'type': 'basic'})
        'plugin://your.plugin.id/?name=test&type=basic'
        
        
        Args:
            queries (dict): A dictionary of keys/values to be added to the
            ``plugin://`` URL.
            
        Returns:
            A string containing a fully formed ``plugin://`` URL.
        '''
        out_dict = {}
        for k, v in queries.iteritems():
            if isinstance(v, unicode):
                v = v.encode('utf8')
            elif isinstance(v, str):
                # Must be encoded in UTF-8, otherwise, force an error
                v.decode('utf8')
            out_dict[k] = v
        return self.url + '?' + urllib.urlencode(out_dict)

@iClosedz, when you pass something in plugin.url_for(), make sure it's encoded to utf-8 and you should be good
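The same idea as a self-contained sketch. This uses the Python 3 spelling (urllib.parse.urlencode); in the Python 2 code above the equivalents are urllib.urlencode and the unicode type. Text values are UTF-8-encoded first so the percent-encoder only ever sees bytes:

```python
from urllib.parse import urlencode

def build_query(queries):
    # Mirror build_plugin_url above: encode text values to UTF-8 bytes,
    # then let urlencode percent-encode the raw bytes.
    out = {}
    for key, value in queries.items():
        if isinstance(value, str):
            value = value.encode('utf-8')
        out[key] = value
    return urlencode(out)

# Thai 'จบ' (U+0E08 U+0E1A) becomes its percent-encoded UTF-8 bytes:
print(build_query({'name': 'จบ'}))  # name=%E0%B8%88%E0%B8%9A
```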
Reply
#19
(2013-06-04, 01:01)Bstrdsmkr Wrote: Found it, it was in xbmcswift. Line 64 in urls.py: urllib.quote_plus() isn't unicode friendly, so each value needs to be encoded to UTF-8 (or almost any other valid encoding) first; then the dict can be passed to urllib.urlencode() to get back a valid URL query string. This is what I use:
Code:
def build_plugin_url(self, queries):
        '''
        Returns a ``plugin://`` URL which can be used to call the addon with
        the specified queries.
        
        Example:
        
        >>> addon.build_plugin_url({'name': 'test', 'type': 'basic'})
        'plugin://your.plugin.id/?name=test&type=basic'
        
        
        Args:
            queries (dict): A dictionary of keys/values to be added to the
            ``plugin://`` URL.
            
        Returns:
            A string containing a fully formed ``plugin://`` URL.
        '''
        out_dict = {}
        for k, v in queries.iteritems():
            if isinstance(v, unicode):
                v = v.encode('utf8')
            elif isinstance(v, str):
                # Must be encoded in UTF-8, otherwise, force an error
                v.decode('utf8')
            out_dict[k] = v
        return self.url + '?' + urllib.urlencode(out_dict)

@iClosedz, when you pass something in plugin.url_for(), make sure it's encoded to utf-8 and you should be good
Thank you very much for your advice, but I don't understand how to use it.

When can I use this function
Code:
def build_plugin_url(self, queries):
        out_dict = {}
        for k, v in queries.iteritems():
            if isinstance(v, unicode):
                v = v.encode('utf8')
            elif isinstance(v, str):
                # Must be encoded in UTF-8, otherwise, force an error
                v.decode('utf8')
            out_dict[k] = v
        return self.url + '?' + urllib.urlencode(out_dict)

once I have the data?
Code:
def _get_names(url):
    tree, html = __get_tree(url)
    
    names = []
    for item in tree.findAll('item'):
        if item.status.string == '0':
            item.status.string = '[Soon]'
        elif item.status.string == '1':
            item.status.string = '[Airing]'
        elif item.status.string == '2':
            item.status.string = '[Completed]'
        names.append({
            'id': item.id.string,
            'title': item.entitle.string,
            'thumb': usefull['poster'] + item.picturename.string,
            'genre': item.type.string,
            'year': int(item.dateshowtime.string),
            'numberofep': int(item.numberofpart.string),
            'director': item.director.string,
            'plot': item.abstract.string,
            'prodcom': item.productioncompany.string,
            'studio': item.stationonair.string,
            'writer': item.written.string,
            'statusname': item.status.string,
            'trailer':item.trailer.string,
            'lastupdated': item.lastupdated.string
        })
    log('_get_names got %d item' % len(names))
    return names
Before I return names, do I have to add this?

name = build_plugin_url(names)
return names

What I don't understand is: when your "build_plugin_url" function returns part of the plugin URL, what do I have to do with it next?
Reply
#20
Sorry, that function was for Sphere so I could explain what I was getting at. Just go through your code and make sure that any time you call plugin.url_for(), you add .encode('utf-8') to the end of each parameter. This is a workaround for a bug in Python.
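To make that concrete, here is a hedged sketch of a small helper (encoded_kwargs is a made-up name, not part of xbmcswift) that UTF-8-encodes every text argument before it reaches plugin.url_for. It is written with Python 3's str type; in the thread's Python 2 code the check would be against unicode:

```python
def encoded_kwargs(**kwargs):
    # UTF-8-encode text values; pass everything else through untouched.
    return {
        key: (value.encode('utf-8') if isinstance(value, str) else value)
        for key, value in kwargs.items()
    }

# Intended use (the url_for call is illustrative):
# plugin.url_for('show_names', **encoded_kwargs(stype=stype, soption=soption))
print(encoded_kwargs(stype='latest', page=2))  # {'stype': b'latest', 'page': 2}
```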
Reply
#21
(2013-06-04, 14:07)Bstrdsmkr Wrote: Sorry, that function was for Sphere so I could explain what I was getting at. Just go through your code and make sure that any time you call plugin.url_for(), you add .encode('utf-8') to the end of each parameter. This is a workaround for a bug in Python.

Oh! I get it now.

Thank you very much for your advice. Angel
Reply
#22
I have a problem again.
I tried the UTF-8 encoding you suggested; some of it works, but some of it doesn't.
Code:
@plugin.route('/series/<stype>/<soption>')
def show_names(stype, soption):
    page = 1
    if soption == 'latest':
        page = int(plugin.request.args.get('page', ['1'])[0])
        names = scraper.get_latest(page, stype)
    elif soption == 'orderby':
        page = int(plugin.request.args.get('page', ['1'])[0])
        names = scraper.get_orderby(page, stype)

    items = []
    
    if page > 1:
        previous_page = str(page - 1)
        items.append({
            'label': '<< %s %s <<' % (_('page'), previous_page),
            'thumbnail': scraper.PREV_IMG,
            'path': plugin.url_for(
                endpoint='show_names',
                stype=stype,
                soption=soption,
                page=previous_page,
                update='true'
            ).encode('utf-8')
        })
    next_page = str(page + 1)
    items.append({
        'label': '>> %s %s >>' % (_('page'), next_page),
        'thumbnail': scraper.NEXT_IMG,
        'path': plugin.url_for(
            endpoint='show_names',
            stype=stype,
            soption=soption,
            page=next_page,
            update='true'
        ).encode('utf-8')
    })
    
    items.extend([{
        'label': name['title']+' '+name['statusname']+' Update('+name['lastupdated']+')',
        'thumbnail': name['thumb'],
        'info': {
            'count': i,
            'genre': name['genre'],
            'year': name['year'],
            'episode': name['numberofep'],
            'director': name['director'],
            'plot': name['plot'],
            'plotoutline': name['prodcom'],
            'title': name['title'],
            'studio': name['studio'],
            'writer': name['writer'],
            'tvshowtitle': name['title'],
            'premiered': name['year'],
            'status': name['statusname'],
            'trailer': name['trailer']
        },
        'path': plugin.url_for(
            endpoint='show_episodes',
            stype=stype,
            soption=soption,
            name_id=name['id']
        ).encode('utf-8')
    } for i, name in enumerate(names)])
    plugin.set_content('tvshows')
    return plugin.finish(items, update_listing=True)
On this part:
Code:
'path': plugin.url_for(
            endpoint='show_episodes',
            stype=stype,
            soption=soption,
            name_id=name['id']
        ).encode('utf-8')
I tried encoding the result of plugin.url_for.

And this is my 'show_episodes':
Code:
def get_episodes(name_id, stype):
    usefull = __get_usefull(stype)
    url = usefull['surl'] + '?type=serielink&ps=100&id=%d' % int(name_id)
    tree, html = __get_tree(url)
    pattern = re.compile("\<Link\>(?P<link>[^\<]*)\<\/Link\>\s*")
    total = re.finditer(pattern,html)
    bufferr = []
    for i in total:
        bufferr.append({'source' : i.group('link')})
    episodes = []
    for item in tree.findAll('item'):        
        episodes.append({
            'title': item.seriename.string,
            'part': item.partnumber.string,
            'date': item.lastupdated.string
        })
    count = 0
    for i in episodes:
        i.update(bufferr[count])
        count+=1
    log('_get_episodes got %d item' % len(episodes))
    print episodes
    return episodes
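The counter loop at the end of get_episodes pairs each episode with its link by index; the same pairing can be done with zip. A self-contained sketch (the html and episodes values here are stand-ins, not the site's real data):

```python
import re

html = "<Link>http://a</Link><Link>http://b</Link>"  # stand-in sample
episodes = [{'title': 'ep1'}, {'title': 'ep2'}]      # stand-in sample

# Extract every <Link> value, then pair links with episodes positionally.
links = re.findall(r"<Link>([^<]*)</Link>", html)
for episode, link in zip(episodes, links):
    episode['source'] = link

print(episodes[0]['source'])  # http://a
```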

And my scraper:
Code:
def __get_tree(url):
    log('__get_tree opening url: %s' % url)
    req = urllib2.Request(url)
    req.add_header('User-Agent', usefull['uagent'])
    try:
        html = urllib2.urlopen(req).read()
    except urllib2.HTTPError, error:
        raise NetworkError('HTTPError: %s' % error)
    log('__get_tree got %d bytes' % len(html))
    tree = BeautifulSoup(html, convertEntities=BeautifulSoup.XML_ENTITIES)
    return tree, html

There is no error, but it doesn't work.



Edit
------------------------
Now that code works, but I don't understand why some of it still doesn't.

picture 1
Image

picture 2
Image

That is the same word.
Reply
#23
The individual items need to be encoded, like this:
Code:
if page > 1:
        previous_page = str(page - 1)
        items.append({
            'label': '<< %s %s <<' % (_('page'), previous_page),
            'thumbnail': scraper.PREV_IMG,
            'path': plugin.url_for(
                endpoint='show_names',
                stype=stype.encode('utf-8'),
                soption=soption.encode('utf-8'),
                page=previous_page.encode('utf-8'),
                update='true'.encode('utf-8')
            )
        })

That's assuming that all those parameters are unicode objects
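One caveat with blanket .encode('utf-8') calls: if a value is already a byte string, Python 2 first decodes it as ASCII before re-encoding, which blows up on Thai text. A type-checked sketch (shown with Python 3's str/bytes split; ensure_utf8 is a made-up helper name):

```python
def ensure_utf8(value):
    # Bytes pass through unchanged; text gets encoded; non-string
    # values (ints, None, ...) are returned as-is.
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode('utf-8')
    return value

print(ensure_utf8('จบ'))      # b'\xe0\xb8\x88\xe0\xb8\x9a'
print(ensure_utf8(b'done'))   # b'done'
```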
Reply
#24
(2013-06-04, 20:05)Bstrdsmkr Wrote: The individual items need to be encoded, like this:
Code:
if page > 1:
        previous_page = str(page - 1)
        items.append({
            'label': '<< %s %s <<' % (_('page'), previous_page),
            'thumbnail': scraper.PREV_IMG,
            'path': plugin.url_for(
                endpoint='show_names',
                stype=stype.encode('utf-8'),
                soption=soption.encode('utf-8'),
                page=previous_page.encode('utf-8'),
                update='true'.encode('utf-8')
            )
        })

That's assuming that all those parameters are unicode objects

It didn't work Sad

I don't know why.

But thank you for the help!
Reply
#25
Every time you try something that doesn't work, you need to post a link to a log of it not working so we can see why.
Reply
#26
(2013-06-05, 14:10)Bstrdsmkr Wrote: Every time you try something that doesn't work, you need to post a link to a log of it not working so we can see why.

I don't know why it didn't work; there is no error at all.

Can I give you my script so you can test it?

Sorry about this, I'm a new dev.
Reply
#27
Sure, if you've got a script or github etc, I'll be glad to take a look. Otherwise, grab your log file from the session where the suggested solution didn't work, upload it somewhere, and post a link to it. More info: http://wiki.xbmc.org/index.php?title=Log_file/Advanced
Reply
#28
This is my repo:
https://github.com/iClosedz/

And this is my log file:
http://pastebin.com/Ng6SyZD8

The incorrect text starts at line 795 of the log, when I try to show the "last episode". I think you need Thai language support installed to see it.

Thank you very much for your help.
Reply
#29
I looked into the code and into the XML file that you load.
I think it's not a problem with your addon; it's a problem with the XML file,
because sometimes the first item has special characters in the field <PartNumber>จบ</PartNumber>,
while the second and all following items have numbers as the part number.

And just for a joke: the จบ is not very happy Smile
(I don't know Thai, but I searched Google for a translation.)
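Given that diagnosis, the addon could parse the field defensively instead of assuming a number. A sketch under the assumption that <PartNumber> is usually numeric but occasionally the Thai word จบ ("finished"); parse_part_number is a hypothetical helper:

```python
def parse_part_number(raw):
    # Return an int when the feed supplies digits; otherwise keep the
    # literal marker (e.g. Thai 'จบ', marking the end of the series).
    try:
        return int(raw)
    except (TypeError, ValueError):
        return raw

print(parse_part_number('12'))   # 12
print(parse_part_number('จบ'))   # จบ
```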
Reply
#30
(2013-06-05, 22:15)olivaar Wrote: I looked into the code and into the XML file that you load.
I think it's not a problem with your addon; it's a problem with the XML file,
because sometimes the first item has special characters in the field <PartNumber>จบ</PartNumber>,
while the second and all following items have numbers as the part number.

And just for a joke: the จบ is not very happy Smile
(I don't know Thai, but I searched Google for a translation.)

Haha, thank you for the advice. จบ in Thai means the end of the series: "end episode".
Reply
