I haven't played with minidom myself, but here is an example:
http://sebsauvage.net/python/snyppets/in...#parse_rss
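For what it's worth, a minimal minidom sketch could look something like the following. I am assuming an XML layout with <entry> elements containing <title> and <version> children; the SAMPLE string and the child_text helper are made up for illustration, so adapt them to your real file.

```python
from xml.dom import minidom

# Hypothetical sample data; the real file layout is an assumption.
SAMPLE = """<root><scrapers>
<entry><title>Foo</title><version>1.0</version></entry>
<entry><title>Bar</title></entry>
</scrapers></root>"""


def child_text(node, tag):
    """Return the text of the first <tag> descendant of node, or None."""
    matches = node.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return None
    return matches[0].firstChild.data


dom = minidom.parseString(SAMPLE)
# One (title, version) tuple per <entry>; missing tags come back as None.
entries = [(child_text(e, "title"), child_text(e, "version"))
           for e in dom.getElementsByTagName("entry")]
```

minidom is more verbose than the other two options because you have to walk down to the text node yourself, which is why a small helper like this is handy.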
As for ElementTree and BeautifulSoup, I have used both on the same XML, and from my point of view:
- BeautifulSoup copes better with errors in the XML file but is a little slower.
Here is a quick example (I have not tested this one specifically, but it should work):
Code:
import os
from BeautifulSoup import BeautifulStoneSoup, Tag, NavigableString

# CACHEDIR and XMLFile are assumed to be defined elsewhere
soup = BeautifulStoneSoup(open(os.path.join(CACHEDIR, XMLFile), 'r').read())
cat_scrapers = soup.find("scrapers")
if cat_scrapers is not None:
    for item in cat_scrapers.findAll("entry"):
        # item.<tag> is None when the tag is missing, so check before reading .string
        if hasattr(item.title, 'string'):
            if item.title.string is not None:
                title = item.title.string.encode("cp1252")
        if hasattr(item.version, 'string'):
            if item.version.string is not None:
                version = item.version.string.encode("utf-8")
        if hasattr(item.lang, 'string'):
            if item.lang.string is not None:
                language = item.lang.string.encode("utf-8")
        if hasattr(item.date, 'string'):
            if item.date.string is not None:
                date = item.date.string.encode("cp1252")
        if hasattr(item.previewvideourl, 'string'):
            if item.previewvideourl.string is not None:
                previewVideoURL = item.previewvideourl.string.encode("utf-8")
- ElementTree does not tolerate errors (which means you have more to write in order to cover those cases) but is very fast at parsing XML; I have seen a huge difference in speed.
Here is the same example with ElementTree (not tested):
Code:
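import os
try:
    import xml.etree.ElementTree as ET  # standard library since Python 2.5
except ImportError:
    import elementtree.ElementTree as ET  # standalone package on older Pythons

# CACHEDIR and XMLFile are assumed to be defined elsewhere
elems = ET.parse( os.path.join( CACHEDIR, XMLFile ) ).getroot()
# find() returns None if <scrapers> is missing, so this line can raise AttributeError
cat_scrapers = elems.find( "scrapers" ).findall( "entry" )
for item in cat_scrapers:
    # findtext() returns None when the child element is missing
    title = item.findtext( "title" )
    version = item.findtext( "version" )
    language = item.findtext( "lang" )
    date = item.findtext( "date" )
    added = item.findtext( "added" )
    previewVideoURL = item.findtext( "previewVideoURL" )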
In both cases you will of course need to handle exceptions (using a try/except block) and edge cases.
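For the ElementTree version, the error handling could be sketched roughly like this. The function name parse_entries and the FIELDS tuple are mine, the XML layout (a root element containing <scrapers>) is assumed from the example above, and ET.ParseError needs Python 2.7 or newer, so treat it as a starting point and not as tested code for your setup.

```python
import xml.etree.ElementTree as ET

# Child tags read from each <entry>; taken from the example above.
FIELDS = ("title", "version", "lang", "date", "added", "previewVideoURL")


def parse_entries(xml_text):
    """Return one dict per <entry>, or an empty list if the XML is unusable."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Malformed XML: ElementTree refuses to parse it at all.
        return []
    scrapers = root.find("scrapers")
    if scrapers is None:
        # Well-formed XML, but without the section we are looking for.
        return []
    # findtext() returns None for a missing child, so no exception here.
    return [dict((field, entry.findtext(field)) for field in FIELDS)
            for entry in scrapers.findall("entry")]
```

Returning an empty list keeps the calling code simple, but you may prefer to log the error or fall back to BeautifulSoup when the file is malformed.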
Here it is. I hope it helps.