Chapter 13. Output from DOM

Table of Contents

Xerces Serialization
OutputFormat
DOM Level 3
Creating DOMWriters
Serialization Features
Filtering Output
Summary

Traditionally, one area in which DOM has been quite weak is serialization, the outputting of an in-memory DOM Document object into a text file. In fact, it’s even possible to use DOM to create Document objects that cannot be serialized as well-formed XML files. (For example, DOM allows an element to have attributes with the same namespace prefix but different namespace URIs.) Serialization has been left as a task for vendor-specific classes such as Xerces’s XMLSerializer. However, DOM Level 3 adds several classes for writing XML documents into files, onto the network, or onto anything else you can hook an OutputStream to.
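The namespace problem mentioned in parentheses above is easy to reproduce with nothing but the standard DOM interfaces. The following is a minimal sketch, not code from this book: the class name PrefixClash, the attribute names, and the example.com URIs are all made up for illustration. It builds one element carrying two attributes that share the prefix p but belong to different namespaces, and DOM accepts both without complaint.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

class PrefixClash {

  // Builds an element whose two attributes share the prefix "p"
  // but are in different namespaces. The URIs are hypothetical.
  public static Element build() {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder().newDocument();
      Element root = doc.createElement("root");
      doc.appendChild(root);
      // Same prefix, different namespace URIs; DOM does not object
      root.setAttributeNS("http://example.com/a", "p:first", "1");
      root.setAttributeNS("http://example.com/b", "p:second", "2");
      return root;
    }
    catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    NamedNodeMap attributes = build().getAttributes();
    // List each attribute's qualified name next to its namespace URI
    for (int i = 0; i < attributes.getLength(); i++) {
      Attr attribute = (Attr) attributes.item(i);
      System.out.println(
          attribute.getName() + " -> " + attribute.getNamespaceURI());
    }
  }

}
```

A serializer that wrote these attributes out unchanged would have to bind the prefix p to two different URIs on the same element, which a well-formed, namespace-aware document cannot do.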

Xerces Serialization

The Apache XML Project’s Xerces-J includes the org.apache.xml.serialize package for writing DOM Document objects onto output streams. Although this package is bundled with Xerces, it works with any DOM Level 2 implementation. It does not depend on the details of the Xerces implementation classes, only on the standard DOM interfaces.

The basic technique for serializing documents with org.apache.xml.serialize is as follows:

  1. Configure an OutputFormat object with the serialization options you want.

  2. Connect an OutputStream to the location where you want to store the data.

  3. Use the OutputStream and the OutputFormat to construct a new XMLSerializer object.

  4. Pass the Document object you want to serialize to the XMLSerializer’s serialize() method.

For example, the following code could replace the JAXP identity transform used in several examples in the last few chapters:

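try {
  // Derive the output method and DOCTYPE identifiers from the document
  OutputFormat format = new OutputFormat(document);
  // Write to System.out using that format
  XMLSerializer output = new XMLSerializer(System.out, format);
  output.serialize(document);
}
catch (IOException e) {
  System.err.println(e);
}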

XMLSerializer has a number of advantages over JAXP, including maintaining the document type declaration. Furthermore, it’s much more configurable.
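For comparison, the JAXP identity transform that XMLSerializer stands in for can be sketched roughly as follows. This is an illustration rather than the exact code from the earlier chapters: the class name IdentityTransformDemo, the toXml() and sample() helpers, and the one-element greeting document are assumptions made so that the example is self-contained.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class IdentityTransformDemo {

  // Serializes a DOM Document by copying it through a JAXP transform
  public static String toXml(Document document) {
    try {
      // newTransformer() with no stylesheet gives the identity transform
      Transformer idTransform
          = TransformerFactory.newInstance().newTransformer();
      StringWriter out = new StringWriter();
      idTransform.transform(new DOMSource(document), new StreamResult(out));
      return out.toString();
    }
    catch (TransformerException e) {
      throw new IllegalStateException(e);
    }
  }

  // Hypothetical sample document: <greeting>Hello</greeting>
  public static Document sample() {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder().newDocument();
      Element root = doc.createElement("greeting");
      root.appendChild(doc.createTextNode("Hello"));
      doc.appendChild(root);
      return doc;
    }
    catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toXml(sample()));
  }

}
```

The transform needs a Source and a Result wrapped around the document and the output, whereas XMLSerializer takes the stream and the document directly.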

XMLSerializer has several constructors. Which to choose depends on whether you want to write to an OutputStream or a Writer and whether or not you want to provide an OutputFormat when you create the serializer:

public XMLSerializer();
public XMLSerializer(OutputFormat format);
public XMLSerializer(Writer out, OutputFormat format);
public XMLSerializer(OutputStream out, OutputFormat format);

Generally I recommend you specify both the format and the stream or writer when you construct the XMLSerializer. However, if you don’t want to, you can set them or change them later with these methods:

public void setOutputFormat(OutputFormat format);
public void setOutputByteStream(OutputStream out);
public void setOutputCharStream(Writer out);

You must specify either the byte stream or the char stream before you can serialize.

If you don’t want to serialize the entire document, you can pass just an Element or a DocumentFragment to the serialize() method:

public void serialize(DocumentFragment fragment)
    throws IOException;

public void serialize(Document doc)
    throws IOException;

public void serialize(Element element)
    throws IOException;

There are many other methods in the XMLSerializer class. However, you can mostly ignore these unless you’re subclassing it.


Copyright 2001, 2002 Elliotte Rusty Harold (elharo@metalab.unc.edu). Last modified February 10, 2002.