JDOM documents are composed of ten basic classes representing the different kinds of nodes in an XML document:
Document
Element
Attribute
Text
ProcessingInstruction
Namespace
Comment
DocType
EntityRef
CDATA
In JDOM, node objects can belong to a Document, or they can exist independently. A complete JDOM Document contains a list of its children, which may include Comments, ProcessingInstructions, and possibly a single DocType, and which always includes exactly one Element object for the root element. Most of the rest of the structure of the document is built from Element objects, each of which contains a list of its contents, as well as separate lists of attributes and namespaces for that element.
JDOM’s Document class, summarized in Example 15.1, represents a complete well-formed XML document. As a node, it is the root of the document tree. It enforces most of the rules XML imposes on documents. For instance, it is not possible to assign two DocType objects to a single Document object because an XML document cannot have two document type declarations. The one exception is that a Document built by the no-args constructor can be temporarily rootless. However, until you set a root element, almost anything else you do with the document will throw an IllegalStateException, so this is just a temporary convenience, not a loophole that allows you to break the rules of XML.
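To make the temporarily rootless state concrete, here is a minimal sketch, assuming a JDOM build in the org.jdom package is on the classpath. The class name RootlessDocumentDemo and its helper methods are hypothetical, made up for this illustration. It asks a freshly constructed Document for its root element before and after one has been set.

```java
import org.jdom.Document;
import org.jdom.Element;

// Hypothetical demo class; only Document and Element come from JDOM.
public class RootlessDocumentDemo {

    // Returns true if a Document built with the no-args constructor
    // refuses to hand back a root element it does not have yet.
    static boolean rootlessGetRootThrows() {
        Document doc = new Document(); // no root element so far
        try {
            doc.getRootElement();
            return false;
        } catch (IllegalStateException ex) {
            return true;
        }
    }

    // Builds a rootless Document, then immediately gives it a root.
    static Document buildDocument(String rootName) {
        Document doc = new Document();
        doc.setRootElement(new Element(rootName));
        return doc;
    }

    public static void main(String[] args) {
        System.out.println("Rootless getRootElement() threw: "
            + rootlessGetRootThrows());
        Document doc = buildDocument("greeting");
        System.out.println("Has root now: " + doc.hasRootElement());
        System.out.println("Root name: " + doc.getRootElement().getName());
    }
}
```

Once setRootElement() has been called, the document behaves like any other, which is why the rootless state is best treated as a brief construction step rather than something to hold on to.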
Example 15.1. The JDOM Document class
package org.jdom;

public class Document implements Serializable, Cloneable {

  protected ContentList content;
  protected DocType     docType;

  public Document()
  public Document(Element root, DocType docType)
  public Document(Element root)
  public Document(List newContent, DocType docType)
  public Document(List content)

  public boolean  hasRootElement()
  public Element  getRootElement()
  public Document setRootElement(Element rootElement)
  public Element  detachRootElement()

  public DocType  getDocType()
  public Document setDocType(DocType docType)

  public Document addContent(ProcessingInstruction pi)
  public Document addContent(Comment comment)
  public List     getContent()
  public List     getContent(Filter filter)
  public Document setContent(List newContent)
  public boolean  removeContent(ProcessingInstruction pi)
  public boolean  removeContent(Comment comment)

  // Java utility methods
  public       String  toString()
  public final boolean equals(Object o)
  public final int     hashCode()
  public       Object  clone()

}
As you can probably guess from the setter and getter methods, each Document object has the following three properties:
The root Element
A DocType object representing the document type declaration
A List containing the root Element and any processing instructions or comments in the prolog and epilog, in document order
The first two may occasionally be null. A document that does not have a document type declaration will have a null DocType. The root Element may also be null, but only temporarily. You can’t do much of anything to a Document until you’ve set its root Element.
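These properties can be inspected directly. The following is a rough sketch, again assuming JDOM in the org.jdom package is on the classpath. The class DocumentContentDemo, the sample stylesheet name style.css, and the comment text are all invented for illustration. It builds a Document from a list holding a processing instruction, a root element, and a trailing comment, then walks getContent() to show that the children come back in document order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.jdom.Comment;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.ProcessingInstruction;

// Hypothetical demo class; the sample content is invented.
public class DocumentContentDemo {

    // Builds a document with one prolog node, the root, and one epilog node.
    static Document buildSample() {
        List content = new ArrayList();
        content.add(new ProcessingInstruction("xml-stylesheet",
            "type=\"text/css\" href=\"style.css\""));
        content.add(new Element("root"));
        content.add(new Comment(" epilog comment "));
        return new Document(content);
    }

    // Lists the class name of each top-level child, in document order.
    static String describe(Document doc) {
        StringBuilder names = new StringBuilder();
        Iterator iterator = doc.getContent().iterator();
        while (iterator.hasNext()) {
            if (names.length() > 0) names.append(", ");
            names.append(iterator.next().getClass().getSimpleName());
        }
        return names.toString();
    }

    public static void main(String[] args) {
        Document doc = buildSample();
        System.out.println("Content: " + describe(doc));
        System.out.println("Root:    " + doc.getRootElement().getName());
        // No document type declaration was supplied, so this is null.
        System.out.println("DocType: " + doc.getDocType());
    }
}
```

The sample deliberately leaves out a document type declaration, so the sketch also shows the null DocType case described above.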
Most of the time all you do with a Document is get the root element. From that point forward, you work exclusively with the root element and its descendants until you're ready to serialize the Document. For example,
Document doc = builder.build("file.xml");
Element root = doc.getRootElement();
// work with root…
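A fuller version of that workflow might look like the sketch below. It assumes JDOM's SAXBuilder and XMLOutputter are available and parses from an in-memory string rather than file.xml so that it is self-contained. The class name RootElementDemo, the order and item elements, and the status attribute are all hypothetical.

```java
import java.io.StringReader;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;

// Hypothetical demo class; the XML vocabulary is invented.
public class RootElementDemo {

    // Parses a string into a Document. Declared as throws Exception
    // because the checked exceptions of build() vary between JDOM versions.
    static Document parse(String xml) throws Exception {
        SAXBuilder builder = new SAXBuilder();
        return builder.build(new StringReader(xml));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<order><item>widget</item></order>");

        // From here on, all the work goes through the root element.
        Element root = doc.getRootElement();
        root.setAttribute("status", "shipped");
        root.addContent(new Element("item").setText("gadget"));

        // Come back to the Document only when it is time to serialize.
        XMLOutputter outputter = new XMLOutputter();
        System.out.println(outputter.outputString(doc));
    }
}
```

The exact whitespace of the serialized output depends on the JDOM version and the outputter's settings, so I have not shown it here.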
Occasionally, you may need to hunt for a particular processing instruction in the prolog or epilog. For instance, if you were writing a browser, then you would want to look for an xml-stylesheet processing instruction in the prolog by iterating through the list returned by getContent() like so:
List content = doc.getContent();
Iterator iterator = content.iterator();
while (iterator.hasNext()) {
  Object next = iterator.next();
  if (next instanceof Element) {
    // This is the root element. Thus the prolog is
    // finished.
    break;
  }
  else if (next instanceof ProcessingInstruction) {
    ProcessingInstruction candidate = (ProcessingInstruction) next;
    if (candidate.getTarget().equals("xml-stylesheet")) {
      // Load stylesheet…
    }
  }
}
…
However, it’s uncommon to pay that much attention to the contents of the prolog (and even rarer to pay attention to the epilog). Consequently, since the addContent(), setContent(), getContent(), and removeContent() methods work pretty much the same for Document as they do for Element, I’m going to defer further discussion of these to the next section.
Copyright 2001, 2002 Elliotte Rusty Harold | elharo@metalab.unc.edu | Last Modified May 05, 2002 |