2011-08-29, 22:15
Hi.
In this code segment for parsing http://www.mako.co.il/mako-vod-index:
Code:
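from BeautifulSoup import BeautifulStoneSoup

# link is the page markup passed in; name and getLetterValue() are defined elsewhere
soup = BeautifulStoneSoup(link, convertEntities=BeautifulStoneSoup.XML_ENTITIES)
programs = soup('ul')
for i, prog in enumerate(programs):
    # only the <ul> that matches the requested letter is processed
    if i == (4 + getLetterValue(name)):
        for li in prog('li'):
            link = li('a')[0]      # first <a> inside the list item
            url = link['href']
            text = link.contents   # a list of child nodes, not a single string
            print ''.join(text)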
The result is Unicode escape codes and not Hebrew characters. When reading the log file, or when adding a string I generated with join to a directory, I see
[u'\u05ea\u05d0\u05de\u05d9\u05df \u05dc\u05d9']
and not the Hebrew text I want.
How can I fix this?
Thanks
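For reference, the output shown above is how Python 2 displays a list that contains a unicode string, which suggests the list itself (link.contents) is reaching the log rather than the joined string. Below is a minimal sketch of one way to handle it, assuming the log is a plain text file meant to be UTF-8. The helper names join_text and write_log_line, the file name hebrew.log, and the sample list are made up for illustration and are not part of the original code.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def join_text(parts):
    """Join the text fragments of a tag into one unicode string.

    Assumes every fragment is a string; a nested tag would need
    to be converted to text first.
    """
    return u''.join(parts)


def write_log_line(path, text):
    """Append one line to a log file, encoding it as UTF-8 (hypothetical helper)."""
    with io.open(path, 'a', encoding='utf-8') as log:
        log.write(text + u'\n')


# Stand-in for link.contents from the loop in the question
contents = [u'\u05ea\u05d0\u05de\u05d9\u05df \u05dc\u05d9']

# Work with the joined string, never with the list itself
title = join_text(contents)

# Hypothetical log location, created in a temporary directory
log_path = os.path.join(tempfile.mkdtemp(), 'hebrew.log')
write_log_line(log_path, title)

# Reading back with the same encoding returns the original text
with io.open(log_path, encoding='utf-8') as log:
    stored = log.read()
```

Opened in an editor that reads UTF-8, the log file should then show the Hebrew text instead of the escapes. For console output in Python 2, something like print title.encode('utf-8') may be needed, assuming the terminal itself uses UTF-8.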