The first step is to learn how to parse and print a simple XML document using both DOM and SAX. This will help you grasp the basic concepts of parsing and see how the DOM API differs from SAX.

6.1.1 Using DOM

This program parses an XML file and prints it on the console. This is a two-stage process; first it parses the XML file and creates a tree structure in memory.
a) Construction of the DOM tree
b) Traversing the DOM tree

It does a depth-first, pre-order traversal. As it goes through each node, it prints its contents.
This snippet explains how to handle an element node as the program parses an XML document and prints it on the console.
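The two stages described above can be sketched as follows. This is a minimal illustration using the standard JAXP classes, not the tutorial's original program: the class name `PrintUsingDomSketch`, the helper names, and the sample XML string are assumptions made for the example, and the input is read from a string rather than a file to keep it self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; illustrates DOM parsing followed by a pre-order walk.
final class PrintUsingDomSketch {

    // Stage a): parse the XML text and build the DOM tree in memory.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Stage b): depth-first, pre-order traversal; a node is written before its children.
    // Note: this sketch does not escape special characters in text or attribute values.
    static void print(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                print(((Document) node).getDocumentElement(), out);
                break;
            }
            case Node.ELEMENT_NODE: {
                out.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(' ').append(attr.getNodeName())
                       .append("=\"").append(attr.getNodeValue()).append('"');
                }
                out.append('>');
                for (Node child = node.getFirstChild(); child != null;
                         child = child.getNextSibling()) {
                    print(child, out);
                }
                out.append("</").append(node.getNodeName()).append('>');
                break;
            }
            case Node.TEXT_NODE: {
                out.append(node.getNodeValue());
                break;
            }
            default:
                break; // comments, processing instructions, etc. are skipped here
        }
    }

    // Convenience wrapper: parse, then walk the tree and return the printed form.
    static String toXml(String xml) {
        StringBuilder out = new StringBuilder();
        print(parse(xml), out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml("<book id=\"1\"><title>XML</title></book>"));
    }
}
```

The element case is where most of the work happens: the start tag and attributes are written first, then each child is visited recursively, and the end tag is written only after all children have been printed.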
When a SAX parser parses an XML document, it calls the corresponding handler method every time it encounters a tag.

When it encounters a start tag, it calls this method:
public void startElement(String name, AttributeList attrs)

When it encounters an end tag, it calls this method:
public void endElement(String name)

This program also parses an XML file and prints it on the console. In this example, the PrintUsingSax class extends the HandlerBase class and implements the callback methods to handle the printing. The steps involved are: get an instance of the SAX parser, ...

The previous programs illustrated how to parse an existing XML file using both the SAX and DOM parsers.
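For reference, the SAX callback style described above might look like the following sketch. It swaps in the SAX2 `DefaultHandler` class in place of `HandlerBase`, which is deprecated in current JDKs, so the callbacks take extra namespace parameters compared with the signatures quoted in the text. The class name `PrintUsingSaxSketch` and the sample input are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; uses SAX2 DefaultHandler rather than the SAX1 HandlerBase.
final class PrintUsingSaxSketch extends DefaultHandler {

    private final StringBuilder out = new StringBuilder();

    // Called by the parser each time it encounters a start tag.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        out.append('<').append(qName);
        for (int i = 0; i < attrs.getLength(); i++) {
            out.append(' ').append(attrs.getQName(i))
               .append("=\"").append(attrs.getValue(i)).append('"');
        }
        out.append('>');
    }

    // Called by the parser each time it encounters an end tag.
    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }

    // Called for character data; the parser may deliver one text run in several chunks.
    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(ch, start, length);
    }

    // Get a SAX parser instance, register the handler, and parse the input.
    static String toXml(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            PrintUsingSaxSketch handler = new PrintUsingSaxSketch();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("<book id=\"1\"><title>XML</title></book>"));
    }
}
```

Unlike the DOM approach, no tree is kept in memory here: the handler sees each tag once, as the parser reaches it, and must record anything it needs at that moment.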
Generating an XML file from scratch is a different story; for instance, you might want to generate an XML file from data extracted from a database.
To keep the example simple, this program generates an XML file from a Vector preloaded with hard-coded data.
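A minimal sketch of such a generator is shown here, assuming the standard JAXP classes. The class name `GenerateXmlSketch`, the element names `books` and `book`, and the sample titles are all hypothetical, chosen only for illustration. Printing is done with the JDK's identity `Transformer`, which may differ from how the original program printed its tree.

```java
import java.io.StringWriter;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name; builds a DOM tree from a Vector and serializes it.
final class GenerateXmlSketch {

    // Load the data: hard-coded titles stand in for rows read from a database.
    static Vector<String> loadData() {
        Vector<String> titles = new Vector<>();
        titles.add("Learning XML");
        titles.add("Java and XML");
        return titles;
    }

    // Get a DOM parser (DocumentBuilder), create a new Document,
    // and build the tree with one <book> element per entry.
    static Document buildTree(Vector<String> titles) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("books"); // assumed root element name
            doc.appendChild(root);
            for (String title : titles) {
                Element book = doc.createElement("book"); // assumed element name
                book.appendChild(doc.createTextNode(title));
                root.appendChild(book);
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not build DOM tree", e);
        }
    }

    // Print the DOM tree; the resulting text is the XML file content.
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize DOM tree", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildTree(loadData())));
    }
}
```

Building the tree through `createElement` and `createTextNode` leaves escaping of special characters to the serializer, which is one reason to prefer this route over concatenating tags by hand.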
The steps involved are:
a) Get an instance of the DOM parser
b) Create a new Document
c) Load the data
d) Create a DOM tree with this data
e) Print the DOM tree, which will be the XML file

In detail:
a) Get an instance of the DOM parser ... to the same directory where you have downloaded these programs.

Once the parser object has been created, various attributes of the object can be set to handler functions. When an XML document is then fed to the parser, the handler functions are called for the character data and markup in the XML document.

This module uses the ... The encoding, if specified, must be a string naming the encoding used by the XML data. Expat doesn’t support as many encodings as Python does, and its repertoire of encodings can’t be extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII.

When namespace processing is enabled, element type names and attribute names that belong to a namespace will be expanded. The element name passed to the element handlers ...

Returns the input data that generated the current event as a string.