Friday, 26 December 2014

Simple SAX parser in Java

The SAX ContentHandler interface (called DocumentHandler in SAX 1) specifies a number
of “callbacks” that your code must provide. In one sense, this is similar to the
Listener interfaces in AWT and Swing, as covered briefly in Recipe 14.4. The most
commonly used methods are startElement(), endElement(), and characters(). The first
two, obviously, are called at the start and end of an element, and characters() is
called when there is character data. The characters are stored in a large array, and
you are passed the array along with the offset and length of the characters that make
up your text. Conveniently, there is a String constructor that takes exactly these
arguments. Hmmm, I wonder if they thought of that....
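As a small illustration of that constructor, here is a minimal sketch. The class name CharSliceDemo, the helper slice(), and the sample buffer are made up for this example; only the String(char[], int, int) constructor comes from the JDK.

```java
class CharSliceDemo {
    /** Copies `length` chars, beginning at index `start`, out of a larger buffer. */
    static String slice(char[] buffer, int start, int length) {
        // Same argument order that characters(char[] ch, int start, int length) receives.
        return new String(buffer, start, length);
    }

    public static void main(String[] args) {
        // Hypothetical parser buffer: the text we want is only a slice of it.
        char[] buffer = "<name>Ian Darwin</name>".toCharArray();
        // "<name>" occupies indices 0..5, so the text starts at 6 and is 10 chars long.
        System.out.println(slice(buffer, 6, 10));
    }
}
```

The handler never has to copy or scan the whole array; it only looks at the slice it was told about.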
To demonstrate this, I wrote a simple program using SAX to extract the name and
children elements from an XML file. The program itself is reasonably simple and is
shown below:
import java.io.IOException;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

import com.darwinsys.util.Debug;

/**
 * Simple lister - extract name and children tags from a user file. Version for SAX 2.0
 * @version $Id: ch21,v 1.5 2004/05/04 20:13:38 ian Exp $
 */
public class SAXLister {

    public static void main(String[] args) throws Exception {
        new SAXLister(args);
    }

    public SAXLister(String[] args) throws SAXException, IOException {
        // Note: XMLReaderFactory is deprecated as of Java 9.
        XMLReader parser = XMLReaderFactory
            .createXMLReader("org.apache.xerces.parsers.SAXParser");
        // should load properties rather than hardcoding class name
        parser.setContentHandler(new PeopleHandler());
        // Parse the file named on the command line, or parents.xml by default.
        parser.parse(args.length == 1 ? args[0] : "parents.xml");
    }

    /** Inner class provides the ContentHandler callbacks
     */
    class PeopleHandler extends DefaultHandler {

        // Set when the element just opened is one whose text we want to print.
        boolean parent = false;
        boolean kids = false;

        public void startElement(String nsURI, String localName,
                String rawName, Attributes attributes) throws SAXException {
            Debug.println("docEvents", "startElement: " + localName + ","
                + rawName);
            // Consult rawName since we aren't using xmlns prefixes here.
            if (rawName.equalsIgnoreCase("name"))
                parent = true;
            if (rawName.equalsIgnoreCase("children"))
                kids = true;
        }

        public void characters(char[] ch, int start, int length) {
            // Print the first run of text after a flagged element, then clear the flag.
            if (parent) {
                System.out.println("Parent: " + new String(ch, start, length));
                parent = false;
            } else if (kids) {
                System.out.println("Children: " + new String(ch, start, length));
                kids = false;
            }
        }

        /** Needed for parent constructor */
        public PeopleHandler() throws org.xml.sax.SAXException {
            super();
        }
    }
}
$ java -classpath .:../jars/darwinsys.jar:../jars/xerces.jar SAXLister people.xml
Parent: Ian Darwin
Parent: Another Darwin
$
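The comment in the listing about not hardcoding the parser class name can be addressed with the JDK's own javax.xml.parsers.SAXParserFactory, which picks a parser implementation for you. What follows is a rough sketch of that alternative, not the original program: the class JaxpNameLister, its listNames() helper, and the inline sample XML are all assumptions made for illustration (the real people.xml is not shown here). It also gathers text in a StringBuilder until endElement(), since a parser is allowed to deliver one element's text over several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class JaxpNameLister {

    /** Collects the text content of every name element. */
    static class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder text; // non-null only while inside a name element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            // The default factory is not namespace-aware, so consult qName.
            if (qName.equalsIgnoreCase("name")) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per element, so append rather than print.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (text != null && qName.equalsIgnoreCase("name")) {
                names.add(text.toString().trim());
                text = null;
            }
        }
    }

    /** Parses an XML string and returns the text of its name elements, in document order. */
    static List<String> listNames(String xml) {
        NameHandler handler = new NameHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Sketch only: wrap parser configuration, SAX, and I/O errors alike.
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.names;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for people.xml.
        String xml = "<people><person><name>Ian Darwin</name></person>"
                   + "<person><name>Another Darwin</name></person></people>";
        for (String name : listNames(xml)) {
            System.out.println("Parent: " + name);
        }
    }
}
```

Because this version needs neither xerces.jar nor darwinsys.jar, it should compile and run with a plain JDK.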
