XMLFilter - Andrew.cmu.edu

95-733 Week 5
Basic SAX Example From Chapter 5 of XML and Java
Working with XML SAX Filters as described in Chapter 5
Finding a Pattern using SAX
department.xml
<?xml version="1.0" encoding="utf-8"?>
<department>
<employee id="J.D">
<name>John Doe</name>
<email>[email protected]</email>
</employee>
<employee id="B.S">
<name>Bob Smith </name>
<email>[email protected]</email>
</employee>
</department>
TextMatch.java
import java.io.IOException;
import java.util.Stack;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;
public class TextMatch extends DefaultHandler {
StringBuffer buffer;
String pattern;
Stack context;
public TextMatch(String pattern) {
this.buffer = new StringBuffer();
this.pattern = pattern;
this.context = new Stack();
}
protected void flushText() {
if (this.buffer.length() > 0) {
String text = new String(this.buffer);
if (pattern.equals(text)) {
System.out.print("Pattern '"+this.pattern
+"' has been found around ");
for (int i = 0; i < this.context.size(); i++) {
System.out.print("/"+this.context.elementAt(i));
}
System.out.println("");
}
}
this.buffer.setLength(0);
}
public void characters(char[] ch, int start, int len)
throws SAXException {
this.buffer.append(ch, start, len);
}
public void ignorableWhitespace(char[] ch, int start, int len)
throws SAXException {
this.buffer.append(ch, start, len);
}
public void processingInstruction(String target, String data)
throws SAXException {
// Nothing to do because PI does not affect the meaning
// of a document.
}
public void startElement(String uri, String local,
String qname, Attributes atts)
throws SAXException {
this.flushText();
this.context.push(local);
}
public void endElement(String uri, String local, String qname)
throws SAXException {
this.flushText();
this.context.pop();
}
public static void main(String[] argv) {
if (argv.length != 2) {
System.out.println("TextMatch <pattern> <document>");
System.exit(1);
}
try {
XMLReader xreader = XMLReaderFactory.createXMLReader(
"org.apache.xerces.parsers.SAXParser");
xreader.setContentHandler(new TextMatch(argv[0]));
xreader.parse(argv[1]);
} catch (IOException ioe) {
ioe.printStackTrace();
} catch (SAXException se) {
se.printStackTrace();
}
}
}
The XMLReader interface declares setContentHandler and parse.
Looking for
[email protected]
D:\McCarthy\www\95-733\examples\chap05>java TextMatch "[email protected]" Department.xml
Pattern '[email protected]' has been found around /department/employee/email
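The same pattern search can be tried without Xerces on the classpath. The sketch below is not the book's code: it assumes the JDK's built-in JAXP parser (SAXParserFactory) in place of the Xerces class name, parses an in-memory string instead of a file, and collects the matching paths in a list. The names PathMatchSketch and find are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of TextMatch that returns the matching paths.
public class PathMatchSketch extends DefaultHandler {
    private final String pattern;
    private final StringBuilder buffer = new StringBuilder();
    private final Deque<String> context = new ArrayDeque<>();
    private final List<String> matches = new ArrayList<>();

    public PathMatchSketch(String pattern) {
        this.pattern = pattern;
    }

    // Compare the buffered text with the pattern, then reset the buffer.
    private void flushText() {
        if (pattern.equals(buffer.toString())) {
            StringBuilder path = new StringBuilder();
            // push() adds at the head, so walk from the oldest entry.
            Iterator<String> it = context.descendingIterator();
            while (it.hasNext()) {
                path.append('/').append(it.next());
            }
            matches.add(path.toString());
        }
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int len) {
        buffer.append(ch, start, len);
    }

    @Override
    public void startElement(String uri, String local, String qname, Attributes atts) {
        flushText();
        context.push(local);
    }

    @Override
    public void endElement(String uri, String local, String qname) {
        flushText();
        context.pop();
    }

    public static List<String> find(String pattern, String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Assumption: namespace awareness is switched on so that the
        // local-name argument of startElement is populated.
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        PathMatchSketch handler = new PathMatchSketch(pattern);
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.matches;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<department><employee id=\"J.D\">"
                + "<name>John Doe</name></employee></department>";
        System.out.println(find("John Doe", xml)); // prints [/department/employee/name]
    }
}
```

As in TextMatch, the text must equal the pattern exactly, so "Bob Smith" would not match the content "Bob Smith " with its trailing space.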
Filtering XML
Perhaps we would like to modify an existing XML
document.
Or, perhaps we would like to generate an XML document
from a flat file or database.
We’ll look at six examples that will make the filtering process
clear.
XMLReader
Notes from JDK 1.4 Documentation
org.xml.sax
Interface XMLReader
XMLReader is the interface that an XML parser's SAX2
driver must implement. This interface allows an application
to set and query features and properties in the parser, to
register event handlers for document processing, and to
initiate a document parse.
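A minimal sketch of the "set and query features" and "register event handlers" parts of that description. It assumes the JDK's built-in JAXP parser rather than Xerces; the class name ReaderFeaturesSketch and the helper newReader are hypothetical. The feature URI is the standard SAX2 namespaces feature.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ReaderFeaturesSketch {
    // Standard SAX2 feature name for namespace processing.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // Hypothetical helper: obtain an XMLReader from the JDK's JAXP factory.
    public static XMLReader newReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newReader();

        // Query a feature.
        System.out.println("namespaces: " + reader.getFeature(NAMESPACES));

        // Set a feature (here to the value it already has).
        reader.setFeature(NAMESPACES, true);

        // Register event handlers; DefaultHandler implements both interfaces.
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        System.out.println("handler registered: "
                + (reader.getContentHandler() == handler));
    }
}
```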
Two example methods declared in this interface are:
void setDTDHandler(DTDHandler handler)
Allow an application to register a DTD event handler.
void parse(InputSource input)
Parse an XML document.
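A small sketch that exercises both of these methods together. It assumes the JDK's built-in JAXP parser with default settings (DOCTYPE declarations allowed), and an illustrative document whose internal DTD subset declares one notation. The class name DtdHandlerSketch and the method notationNames are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class DtdHandlerSketch {
    // Parse the given XML text and return the names of all declared notations.
    public static List<String> notationNames(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        List<String> names = new ArrayList<>();

        // Register a DTD event handler.
        reader.setDTDHandler(new DTDHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                names.add(name);
            }

            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                // Not needed for this sketch.
            }
        });

        // An InputSource can wrap a character stream as well as a file or URL.
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc ["
                + "<!NOTATION gif PUBLIC \"-//EXAMPLE//NOTATION GIF//EN\">"
                + "]><doc/>";
        System.out.println(notationNames(xml));
    }
}
```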
[Figure: the XMLReader reads the XML source when parse is called and reports events to the contentHandler registered with setContentHandler.]
Create XMLReader.
Tell it what to parse.
Tell it where its contentHandler is.
Tell it to parse.
XMLFilter
Notes from JDK 1.4 Documentation
org.xml.sax.XMLFilter Interface
An XML filter is like an XML reader, except that it obtains its
events from another XML reader rather than a primary source
like an XML document or database. Filters can modify a stream
of events as they pass on to the final application.
For example, the Filter might set its own contentHandler. The
parser will call that one. This intervening handler can be
programmed to call the application’s handler. Thus, the calls
from the parser to the handler are filtered.
package org.xml.sax;
public interface XMLFilter extends XMLReader {
// This method allows the application to link
// the filter to a parent reader (which may
// be another filter). The argument may not be null.
public void setParent(XMLReader parent);
// This method allows the application to query the
// parent reader (which may be another filter).
// It is generally a bad idea to perform any
// operations on the parent reader directly:
// they should all pass through this filter.
public XMLReader getParent();
}
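Because a parent reader may itself be a filter, filters can be chained with setParent. The sketch below is one way this might look, assuming the JDK's built-in JAXP parser and two pass-through XMLFilterImpl objects; the class name FilterChainSketch and the method countElements are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class FilterChainSketch {
    // Count start-element events after they pass through two filters.
    public static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();

        // parser <- inner <- outer: each filter's parent is the reader before it.
        XMLFilter inner = new XMLFilterImpl();
        inner.setParent(parser);
        XMLFilter outer = new XMLFilterImpl();
        outer.setParent(inner);

        int[] count = {0};
        // The application talks only to the outermost filter.
        outer.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local,
                                     String qname, Attributes atts) {
                count[0]++;
            }
        });
        outer.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><c/></a>")); // prints 3
    }
}
```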
[Figure: the XMLReader interface declares 14 methods; the XMLFilter interface has those 14 XMLReader methods plus 2 of its own. An XMLFilter object sits in front of an XMLReader object.]
All methods of XMLReader are here. They may block, pass on, or modify the calls to the parent.
org.xml.sax.helpers
Class XMLFilterImpl
All Implemented Interfaces:
ContentHandler, DTDHandler, EntityResolver, ErrorHandler,
XMLFilter, XMLReader
All XMLReader methods are defined.
These methods, by default, pass calls to the parent XMLReader.
By default, the XMLReader is set to call methods defined
here, in XMLFilterImpl, for XML content.
org.xml.sax.helpers
Class XMLFilterImpl
This class is designed to sit between an XMLReader and the
client application's event handlers. By default, it does nothing
but pass requests up to the reader and events on to the handlers
unmodified, but subclasses can override specific methods to
modify the event stream or the configuration requests as they
pass through.
A Constructor –
XMLFilterImpl(XMLReader parent)
Construct an XML filter with the specified parent.
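As an illustration of overriding specific methods to modify the event stream, here is a minimal sketch of a filter that blocks one element and everything inside it. It assumes the JDK's built-in JAXP parser; the names DropElementSketch, DropElementFilter and elementsSeen are hypothetical and not part of the course examples.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class DropElementSketch {

    // Suppresses every element with the given local name, plus its content.
    static class DropElementFilter extends XMLFilterImpl {
        private final String dropped;
        private int skipDepth = 0; // > 0 while inside a dropped element

        DropElementFilter(XMLReader parent, String dropped) {
            super(parent);
            this.dropped = dropped;
        }

        @Override
        public void startElement(String uri, String local, String qname,
                                 Attributes atts) throws SAXException {
            if (skipDepth > 0 || dropped.equals(local)) {
                skipDepth++;
                return; // block the event
            }
            super.startElement(uri, local, qname, atts); // pass it on
        }

        @Override
        public void endElement(String uri, String local, String qname)
                throws SAXException {
            if (skipDepth > 0) {
                skipDepth--;
                return;
            }
            super.endElement(uri, local, qname);
        }

        @Override
        public void characters(char[] ch, int start, int len) throws SAXException {
            if (skipDepth == 0) {
                super.characters(ch, start, len);
            }
        }
    }

    // Return the element names the application handler gets to see.
    public static List<String> elementsSeen(String xml, String dropped) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();
        XMLFilterImpl filter = new DropElementFilter(parser, dropped);
        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local,
                                     String qname, Attributes atts) {
                seen.add(local);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employee><name>A</name><email>e</email></employee>";
        System.out.println(elementsSeen(xml, "email")); // prints [employee, name]
    }
}
```

The handler never learns that an email element existed, which is the sense in which the calls from the parser are filtered.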
Some Examples Using Filters
// Filter demo 1
// A very simple SAX program
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import java.io.IOException;
import org.xml.sax.SAXException;
public class MainDriver {
public static void main(String[] argv) throws SAXException,
IOException {
// Get a parser
XMLReader parser =
XMLReaderFactory.createXMLReader(
"org.apache.xerces.parsers.SAXParser");
// Get a handler
MyHandler myHandler = new MyHandler();
// Tell the parser about the handler
parser.setContentHandler(myHandler);
// Parse the input document
parser.parse(argv[0]);
}
}
class MyHandler extends DefaultHandler {
// Handle events from the parser
public void startDocument() throws SAXException {
System.out.println("startDocument is called:");
}
public void endDocument() throws SAXException {
System.out.println("endDocument is called:");
}
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver department.xml
startDocument is called:
endDocument is called:
Filter Demo 2
// Filter demo 2
// Adding an XMLFilterImpl that does nothing but supply
// an object that acts as an intermediary.
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.IOException;
import org.xml.sax.SAXException;
public class MainDriver2 {
public static void main(String[] argv) throws SAXException,
IOException {
// Get a parser
XMLReader parser =
XMLReaderFactory.createXMLReader(
"org.apache.xerces.parsers.SAXParser");
// Get a handler
MyHandler myHandler = new MyHandler();
// Get a filter and pass it a reference to the parser
XMLFilterImpl myFilter = new XMLFilterImpl(parser);
// After we create the XMLFilterImpl, all of the calls we make
// on the parser will go through the filter. For example, we will
// call setContentHandler on the filter and not the parser.
// When we create the filter (it implements many interfaces),
// the parser will call filter methods first. These methods will,
// in turn, call our methods.
// Tell the XMLFilterImpl about the handler
myFilter.setContentHandler(myHandler);
// Parse the input document
myFilter.parse(argv[0]);
}
}
class MyHandler extends DefaultHandler {
// Handle events from the parser
public void startDocument() throws SAXException {
System.out.println("startDocument is called:");
}
public void endDocument() throws SAXException {
System.out.println("endDocument is called:");
}
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver2 department.xml
startDocument is called:
endDocument is called:
Filter Demo 3
// Filter demo 3
// Adding an XMLFilterImpl
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.IOException;
import org.xml.sax.SAXException;
class MyCoolFilterImpl extends XMLFilterImpl {
public MyCoolFilterImpl(XMLReader parser) {
super(parser);
}
// There are two startDocument methods in this
// class. This one overrides the inherited method.
// The inherited method calls the outside
// contentHandler.
// The parser calls this method, this method calls
// the base class method which calls the outside handler.
public void startDocument() throws SAXException {
System.out.println("Inside filter");
super.startDocument();
System.out.println("Leaving filter");
}
public void endDocument() throws SAXException {
System.out.println("Inside filter");
// Forward the end-of-document event (not startDocument) to the handler
super.endDocument();
System.out.println("Leaving filter");
}
}
public class MainDriver3 {
public static void main(String[] argv) throws SAXException,
IOException {
// Get a parser
XMLReader parser =
XMLReaderFactory.createXMLReader(
"org.apache.xerces.parsers.SAXParser");
// Get a handler
MyHandler myHandler = new MyHandler();
// Get a filter that we will treat as a parser
XMLFilterImpl myFilter = new MyCoolFilterImpl(parser);
// Tell the XMLFilterImpl about the handler
myFilter.setContentHandler(myHandler);
// Parse the input document
myFilter.parse(argv[0]);
}
}
class MyHandler extends DefaultHandler {
// Handle events from the parser
public void startDocument() throws SAXException {
System.out.println("startDocument is called:");
}
public void endDocument() throws SAXException {
System.out.println("endDocument is called:");
}
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver3 department.xml
Inside filter
startDocument is called:
Leaving filter
Inside filter
endDocument is called:
Leaving filter
Filter Demo 4
// Filter demo 4
// Passing xml to an XMLSerializer
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer; // not standard
import org.apache.xml.serialize.OutputFormat; // not standard
public class MainDriver4 {
public static void main(String[] argv) throws
SAXException, IOException {
// Get a parser
XMLReader parser =
XMLReaderFactory.createXMLReader(
"org.apache.xerces.parsers.SAXParser");
// we need to write to a file
FileOutputStream fos =
new FileOutputStream("Filtered.xml");
// An XMLSerializer can collect SAX events
XMLSerializer xmlWriter = new XMLSerializer(fos, null);
// Tell the parser about the handler (XMLSerializer)
parser.setContentHandler(xmlWriter);
// Parse the input document
// The parser sends events to the XMLSerializer
parser.parse(argv[0]);
}
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver4 department.xml
D:\McCarthy\www\95-733\examples\xmlfilter>type filtered.xml
<?xml version="1.0"?>
<department>
<employee id="J.D">
<name>John Doe</name>
<email>[email protected]</email>
</employee>
<employee id="B.S">
<name>Bob Smith</name>
<email>[email protected]</email>
</employee>
<employee id="A.M">
<name>Alice Miller</name>
<url href="http://www.foo.com/~amiller/"/>
</employee>
</department>
Filter Demo 5
// Filter demo 5
// Placing a filter between the parser and the
// XMLSerializer
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer; // not standard
import org.apache.xml.serialize.OutputFormat; // not standard
public class MainDriver5 {
public static void main(String[] argv) throws SAXException,
IOException {
// Get a parser
XMLReader parser =
XMLReaderFactory.createXMLReader(
"org.apache.xerces.parsers.SAXParser");
// we need to write to a file
FileOutputStream fos = new FileOutputStream("Filtered.xml");
// An XMLSerializer can collect SAX events
XMLSerializer xmlWriter = new XMLSerializer(fos, null);
// Get a filter
XMLFilterImpl myFilter = new AnotherCoolFilterImpl(parser);
// Tell the XMLFilterImpl about the handler (XMLSerializer)
myFilter.setContentHandler(xmlWriter);
// Parse the input document
myFilter.parse(argv[0]);
}
}
class AnotherCoolFilterImpl extends XMLFilterImpl {
public AnotherCoolFilterImpl(XMLReader parser) {
super(parser);
}
public void startDocument() throws SAXException {
System.out.println("Inside filter");
super.startDocument();
System.out.println("Leaving filter");
}
public void endDocument() throws SAXException {
System.out.println("Inside filter");
super.endDocument();
System.out.println("Leaving filter");
}
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver5 department.xml
Inside filter
Leaving filter
Inside filter
Leaving filter
Filtered.xml is as before.
Filter Demo 6
// Filter demo 6
// Writing our own parser and passing calls to a filter
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer; // not standard
import org.apache.xml.serialize.OutputFormat; // not standard
public class MainDriver6 {
public static void main(String[] argv) throws SAXException,
IOException {
// Get a parser
XMLReader parser = new MyCoolParser();
// we need to write to a file
FileOutputStream fos = new FileOutputStream("Filtered.xml");
// An XMLSerializer can collect SAX events
XMLSerializer xmlWriter = new XMLSerializer(fos, null);
// Tell the parser about the handler (XMLSerializer)
parser.setContentHandler(xmlWriter);
// Parse the input document
parser.parse("Some query or file name or ...");
}
}
class MyCoolParser extends XMLFilterImpl {
public MyCoolParser() {
}
public void parse(String aFileNameOrSQLQuery)
throws IOException, SAXException {
char[] ch = new char[10];
ch[0] = 'H';
ch[1] = 'i';
// go to a file or go to a DBMS with a query
// make calls to call back methods when this
// code feels it's appropriate
startDocument();
startElement("", "MyNewTag", "", null);
characters(ch, 0, 2);
endElement("", "MyNewTag", "");
endDocument();
}
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver6
D:\McCarthy\www\95-733\examples\xmlfilter>type filtered.xml
<?xml version="1.0"?>
<MyNewTag>Hi</MyNewTag>
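Since XMLSerializer is marked "not standard" above, here is a sketch of the same idea using only JDK classes. It swaps in a TransformerHandler from javax.xml.transform.sax as the ContentHandler that serializes the events; this is an assumption of this note, not what the course example uses. It also passes an empty AttributesImpl and a non-empty qualified name rather than null and "", which is the safer choice for handlers that dereference those arguments. The names GeneratedXmlSketch, GreetingReader and generate are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

public class GeneratedXmlSketch {

    // A "parser" with no parent: it invents the events itself.
    static class GreetingReader extends XMLFilterImpl {
        @Override
        public void parse(String ignoredSource) throws SAXException {
            char[] text = "Hi".toCharArray();
            // These inherited methods forward to the registered ContentHandler.
            startDocument();
            startElement("", "MyNewTag", "MyNewTag", new AttributesImpl());
            characters(text, 0, text.length);
            endElement("", "MyNewTag", "MyNewTag");
            endDocument();
        }
    }

    // Generate the events and return them serialized as XML text.
    public static String generate() throws Exception {
        // Assumption: the default TransformerFactory supports SAX handlers,
        // as the one bundled with the JDK does.
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer()
                .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        GreetingReader reader = new GreetingReader();
        reader.setContentHandler(serializer);
        reader.parse("Some query or file name or ...");
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(generate());
    }
}
```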