**bossanova808** · (This post was last modified: 2014-01-28, 01:42 by bossanova808.)

https://github.com/HenrikDK/xbmc-common-...ons/pull/3

(pull request to stop fetchpage choking on latin-1 pages)

**nektahiti** · (This post was last modified: 2014-03-16, 06:10 by nektahiti.)

Does anyone have any hint on this error?

Code:
NOTICE: [CommonFunctions-2.5.1] parseDOM : 'Couldn't decode html binary string. Data length: 285810'

I am doing a parse like usual and this comes up. The website I am parsing has French accents, could this be related?

Any hint would be greatly appreciated, thanks :)

I tried bossanova808's patch thinking it might help, no dice :(
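
For reference, the log message suggests that the raw page bytes could not be decoded with the encoding the parser expected, which would fit a page with French accents served as Latin-1. A minimal workaround sketch, assuming the page is either UTF-8 or Latin-1, is to decode the bytes yourself and hand the resulting text to the parser. The helper name `decode_html` is hypothetical and not part of CommonFunctions, and the code is written for Python 3:

```python
def decode_html(raw, encodings=("utf-8", "latin-1")):
    """Decode raw page bytes, trying each candidate encoding in order.

    Hypothetical helper for illustration; not part of CommonFunctions.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This encoding does not fit the bytes; try the next one.
            continue
    # Only reached if every candidate failed (latin-1 itself never fails,
    # because it maps every possible byte to a character).
    return raw.decode(encodings[0], errors="replace")


# Example: a Latin-1 encoded fragment with French accents
# ("Météo à Montréal"), written with escapes to keep the source ASCII.
page = "<p>M\u00e9t\u00e9o \u00e0 Montr\u00e9al</p>".encode("latin-1")
text = decode_html(page)
```

The order matters: UTF-8 is tried first because it rejects most byte sequences that are not valid UTF-8, whereas Latin-1 accepts anything and would silently produce wrong characters for a UTF-8 page.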

**takoi** · 2014-03-18, 18:29

(2014-01-21, 00:44)bossanova808 Wrote: parsedom has known bugs (e.g. with non ascii chars) and I haven't seen the author for quite some time....I had to re-write one of the functions locally to work around it and plan to eventually abandon it for something better supported.

Ok, I agree now. Author left and project is dead. Somebody should take over. At least split out parseDOM and maintain that, as this is probably the most useful function here (all the others look out of date and not very relevant anymore to me). Btw bossanova, why are you even using fetchpage? Doesn't the requests module already do all it can do and much much more, and better (like automatic decoding)?
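
To illustrate what "automatic decoding" roughly means here: the charset declared in the HTTP `Content-Type` header is used to turn the response bytes into text. The sketch below shows that idea with the standard library only. It is not how the requests module is implemented, and the function names `charset_from_content_type` and `decode_body` are hypothetical (Python 3):

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header value, or a default."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset in lower case, or None.
    return msg.get_content_charset() or default


def decode_body(raw, content_type, default="utf-8"):
    """Decode response bytes using the charset declared by the server."""
    encoding = charset_from_content_type(content_type, default)
    try:
        return raw.decode(encoding, errors="replace")
    except LookupError:
        # The server named an encoding Python does not know about.
        return raw.decode(default, errors="replace")


body = decode_body(b"caf\xe9", "text/html; charset=ISO-8859-1")
```

With requests itself, the decoded text is available as the `text` attribute of the response, and the encoding it picked as `encoding`.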

**bossanova808** · 2014-03-19, 00:32

The basic answer is that it was the first Python and the first add-on I wrote, and I was a bit useless at it. Kinda pains me looking back at it now, but given it works alright I am not sure I can be bothered re-writing it properly.

That said, which requests module do you mean?

**takoi** · 2014-03-19, 13:00

(2014-03-19, 00:32)bossanova808 Wrote: The basic answer is that it was the first Python and the first add-on I wrote, and I was a bit useless at it. Kinda pains me looking back at it now, but given it works alright I am not sure I can be bothered re-writing it properly.

That said, which requests module do you mean?

That's understandable. Requests probably wasn't even packaged for xbmc until 1-2 years ago anyway. (talking about Add-on:Requests (wiki))

**bossanova808** · 2014-03-20, 03:44

Hmm, weird, have never seen that lib. Looks good!

It just replaces the fetch bit though, doesn't really parse the body content at all. parsedom was easy and apparently fast, that's why I used that over say Beautiful Soup.

I may well get back into refreshing that Addon at some point, if so will use requests then - any tips on something similarly pythonic for the actual scraping/parsing side of things?

**takoi** · 2014-03-20, 20:31

(2014-03-20, 03:44)bossanova808 Wrote: Hmm, weird, have never seen that lib. Looks good!

It just replaces the fetch bit though, doesn't really parse the body content at all. parsedom was easy and apparently fast, that's why I used that over say Beautiful Soup.

I may well get back into refreshing that Addon at some point, if so will use requests then - any tips on something similarly pythonic for the actual scraping/parsing side of things?

I agree on parsedom. It's really great, I don't know of any alternative other than Beautiful Soup. It still works great though, the only bugs I know of are some minor issues with tabs and with passing regex in arguments, not any real issue for myself atm. I still use it. Actually, I've been using it for other non-xbmc things, which is why I've been thinking about forking it and making a PyPI package for a long time.

**bossanova808** · 2014-04-08, 14:30

My pull request got merged....