SAX
A parser for XML Documents
XML Parsers
• What is an XML parser?
– Software that reads and parses XML
– Passes data to the invoking application
– The application does something useful with the
data
XML Parsers
• Why is this a good thing?
– Since XML is a standard, we can write generic
programs to parse XML data
– Frees the programmer from writing a new
parser each time a new data format comes along
XML Parsers
• Two types of parser
– SAX (Simple API for XML)
• Event driven API
• Sends events to the application as the document is
read
– DOM (Document Object Model)
• Reads the entire document into memory in a tree
structure
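A minimal sketch of the two styles, assuming the JAXP parser bundled with the JDK and the SAX 2 DefaultHandler class (neither is named on the slide). The class and method names are hypothetical. Both methods count the elements in the same document: the SAX version keeps only a running counter, while the DOM version builds the whole tree before querying it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
class SaxVersusDom {

    // SAX: the parser pushes events at the handler; only a counter is kept.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    // DOM: the whole document is loaded into a tree, then queried.
    static int countWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><item/><item/></order>";
        System.out.println("SAX count: " + countWithSax(xml));
        System.out.println("DOM count: " + countWithDom(xml));
    }
}
```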
Simple API for XML
SAX Parser
• When should I use it?
– Large documents
– Memory constrained devices
• When should I use something else?
– If you need to modify the document
– SAX doesn’t remember previous events unless
you write explicit code to do so.
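The "explicit code" can be as small as a stack of open element names. The sketch below assumes the JDK's bundled JAXP parser and the SAX 2 DefaultHandler; PathTrackingHandler and pathsOf are hypothetical names. The handler remembers its ancestors itself so that each element can be reported with its full path.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: the parser forgets ancestors, so the handler keeps them.
class PathTrackingHandler extends DefaultHandler {
    private final Deque<String> openElements = new ArrayDeque<>();
    private final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        openElements.addLast(qName);                      // remember where we are
        paths.add("/" + String.join("/", openElements)); // e.g. /order/item
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.removeLast();                        // leave the element
    }

    // Parses the given XML text and returns one path per element, in document order.
    static List<String> pathsOf(String xml) throws Exception {
        PathTrackingHandler handler = new PathTrackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.paths;
    }

    public static void main(String[] args) throws Exception {
        pathsOf("<order><item><sku/></item><note/></order>")
                .forEach(System.out::println);
    }
}
```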
SAX Parser
• Which languages are supported?
– Java
– Perl
– C++
– Python
SAX Parser
• Versions
– SAX 1.0 was released in May 1998
– SAX 2.0 was released in May 2000 and added support for
• namespaces
• filter chains
• querying and setting properties in the parser
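A rough sketch of the three SAX 2.0 additions, assuming the JAXP parser bundled with the JDK. Sax2Sketch, UpperCaseFilter and expandedNames are hypothetical names. The standard SAX 2 classes XMLReader and XMLFilterImpl are used here; in SAX 2 terms the namespace switch queried below is a boolean "feature", read with getFeature.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical class name, used only for this illustration.
class Sax2Sketch {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // Filter chain: sits between the parser and the handler and may rewrite events.
    static class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) throws SAXException {
            super.startElement(uri, localName.toUpperCase(Locale.ROOT), qName, atts);
        }
    }

    // Returns "{namespace-uri}LOCALNAME" for every element in the document.
    static List<String> expandedNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);                  // namespaces
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Querying the parser configuration.
        if (!reader.getFeature(NAMESPACES)) {
            throw new IllegalStateException("namespace processing is off");
        }

        List<String> names = new ArrayList<>();
        XMLReader chain = new UpperCaseFilter(reader);    // parser -> filter -> handler
        chain.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add("{" + uri + "}" + localName);
            }
        });
        chain.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(expandedNames("<p:a xmlns:p='urn:x'><p:b/></p:a>"));
    }
}
```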
SAX Parser
• Some popular SAX parsers
– Apache XML Project Xerces Java Parser
http://xml.apache.org/xerces-j/index.html
– IBM’s XML for Java (XML4J)
http://www.alphaworks.ibm.com/formula/xml
– For a complete list, see
http://www.megginson.com/SAX
SAX Implementation in Java
• Create a class which extends the SAX event handler
import org.xml.sax.*;
import org.xml.sax.helpers.ParserFactory;

public class SaxApplication extends HandlerBase {
    public static void main(String args[]) {
        // Parser creation goes here
    }
}
SAX Implementation in Java
• Create a SAX Parser
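public static void main(String args[]) {
    // Class name of the SAX parser implementation to load
    String parserName = "org.apache.xerces.parsers.SAXParser";
    try {
        SaxApplication app = new SaxApplication();
        Parser parser = ParserFactory.makeParser(parserName);
        // Register the application as both document and error handler
        parser.setDocumentHandler(app);
        parser.setErrorHandler(app);
        // args[0] is the system identifier (URI) of the document to parse
        parser.parse(new InputSource(args[0]));
    } catch (Exception e) {
        // Handle exceptions
    }
}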
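The makeParser call above needs the Xerces jar on the classpath. As a sketch under a different assumption, the JAXP SAXParserFactory bundled with the JDK can hand back a SAX 1 Parser without naming an implementation class. JaxpSax1Sketch and elementNames are hypothetical names, and the SAX 1 types used here (Parser, HandlerBase, AttributeList) are deprecated in favour of their SAX 2 counterparts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical class; SAX 1 types are deprecated but still ship with the JDK.
@SuppressWarnings("deprecation")
class JaxpSax1Sketch extends HandlerBase {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String name, AttributeList attrs) {
        seen.add(name);
    }

    // Same three steps as the slide: create the parser, register handlers, parse.
    static List<String> elementNames(InputSource source) throws Exception {
        JaxpSax1Sketch app = new JaxpSax1Sketch();
        // Assumption: JAXP chooses the implementation, so no class name is needed.
        Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
        parser.setDocumentHandler(app);
        parser.setErrorHandler(app);
        parser.parse(source);
        return app.seen;
    }

    public static void main(String[] args) throws Exception {
        InputSource source = new InputSource(new StringReader("<a><b/></a>"));
        System.out.println(elementNames(source));
    }
}
```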
SAX Implementation in Java
• Override event handlers of interest
public class SaxApplication extends HandlerBase {
    public static void main(String args[]) {
        // stuff missing
    }

    // Called by the parser at each opening tag
    public void startElement(String name, AttributeList attrs) {
        // Process this element
    }
}
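One way the "Process this element" step might look: a sketch that reads each element's attributes from the AttributeList. AttributeDump and dump are hypothetical names, and the parser is assumed to be the JDK's bundled JAXP implementation rather than the Xerces class named earlier. HandlerBase and AttributeList are SAX 1 types, deprecated since SAX 2.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

// Hypothetical handler that records each element with its attributes.
@SuppressWarnings("deprecation")
class AttributeDump extends HandlerBase {
    final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String name, AttributeList attrs) {
        StringBuilder line = new StringBuilder(name);
        // AttributeList is indexed; walk it to read each name/value pair.
        for (int i = 0; i < attrs.getLength(); i++) {
            line.append(' ').append(attrs.getName(i))
                .append('=').append(attrs.getValue(i));
        }
        lines.add(line.toString());
    }

    // Parses the given XML text and returns one line per element.
    static List<String> dump(String xml) throws Exception {
        AttributeDump handler = new AttributeDump();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.lines;
    }

    public static void main(String[] args) throws Exception {
        dump("<order id='7'><item sku='x1'/></order>").forEach(System.out::println);
    }
}
```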
SAX Implementation in Java
• Other events generated by the parser
– startDocument()
– endDocument()
– startElement()
– endElement()
– error()
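To see the order in which these events arrive, the sketch below records each callback in a list. EventLog and trace are hypothetical names, and the parser is assumed to be the JDK's bundled JAXP implementation. error() reports recoverable problems such as validity errors, so the second call in main turns on DTD validation to provoke one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical handler that logs every event named on the slide.
@SuppressWarnings("deprecation")
class EventLog extends HandlerBase {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() { events.add("startDocument"); }

    @Override
    public void endDocument() { events.add("endDocument"); }

    @Override
    public void startElement(String name, AttributeList attrs) {
        events.add("startElement " + name);
    }

    @Override
    public void endElement(String name) { events.add("endElement " + name); }

    @Override
    public void error(SAXParseException e) { events.add("error"); }

    // Parses the XML text, optionally validating against its DTD, and returns the log.
    static List<String> trace(String xml, boolean validating) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        EventLog log = new EventLog();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), log);
        return log.events;
    }

    public static void main(String[] args) throws Exception {
        trace("<a><b/></a>", false).forEach(System.out::println);
        // The DTD declares <a> as EMPTY, so the <b/> child is a validity error.
        String invalid = "<!DOCTYPE a [<!ELEMENT a EMPTY>]><a><b/></a>";
        trace(invalid, true).forEach(System.out::println);
    }
}
```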
For more information...
• java.sun.com/xml
• www.megginson.com/SAX
• www.xml.com
• www.ibm.com/developer/xml