Chapter 15. The JDOM Model

Table of Contents

The Document Class
The Element Class
Navigation and Search
The Attribute Class
The Text Class
The CDATA Class
The ProcessingInstruction Class
The Comment Class
The DocType Class
The EntityRef Class

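JDOM documents are composed of ten basic classes representing the different kinds of nodes in an XML document:

  • Document

  • Element

  • Attribute

  • Text

  • ProcessingInstruction

  • Namespace

  • Comment

  • DocType

  • EntityRef

  • CDATA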

In JDOM, node objects can belong to a Document, or they can exist independently. A complete JDOM Document contains a list of its children, which normally includes Comments, ProcessingInstructions, possibly a single DocType, and one Element object for the root element. Most of the rest of the document's structure is built from Element objects, each of which contains a list of its contents, as well as separate lists of attributes and namespaces for that element.

The Document Class

JDOM’s Document class, summarized in Example 15.1, represents a complete well-formed XML document. As a node, it is the root of the document tree. It enforces most of the rules XML imposes on documents. For instance, it is not possible to assign two DocType objects to a single Document object because an XML document cannot have two document type declarations. The single exception to this rule is that it is possible for a Document to be temporarily rootless if it’s built by the no-args constructor. However, in that case, almost anything you do with the document other than setting a root element will throw an IllegalStateException, so this is just a temporary convenience, not a loophole that allows you to break the rules of XML.

Example 15.1. The JDOM Document class

package org.jdom;

public class Document implements Serializable, Cloneable {

  protected ContentList content;
  protected DocType     docType;

  public Document()
  public Document(Element root, DocType docType)
  public Document(Element root) 
  public Document(List newContent, DocType docType) 
  public Document(List content)

  public boolean  hasRootElement()
  public Element  getRootElement()
  public Document setRootElement(Element rootElement)
  public Element  detachRootElement() 
  public DocType  getDocType()
  public Document setDocType(DocType docType)
  public Document addContent(ProcessingInstruction pi)
  public Document addContent(Comment comment) 
  public List     getContent()
  public List     getContent(Filter filter)
  public Document setContent(List newContent)
  public boolean  removeContent(ProcessingInstruction pi)
  public boolean  removeContent(Comment comment) 
  // Java utility methods
  public       String  toString()
  public final boolean equals(Object o)
  public final int     hashCode()
  public       Object  clone()

}

As you can probably guess from the setter and getter methods, each Document object has the following three properties:

  • The root Element

  • A DocType object representing the document type declaration

  • A List containing the root Element and any processing instructions or comments in the prolog and epilog, in document order.

The first two may occasionally be null. A document that does not have a document type declaration will have a null DocType. The root Element may also be null, but only temporarily. You can’t do much of anything to a Document until you’ve set its root Element.
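
To make the "null, but only temporarily" rule concrete, here is a small stand-alone model of it. This is a sketch, not JDOM's source: the class name RootlessSketch is invented for illustration, and plain strings stand in for the Element and DocType objects. The setters return the object itself, mirroring the Document return types shown in Example 15.1.

```java
/** Toy model of a document's root and DocType properties; NOT the JDOM class. */
final class RootlessSketch {

    private String root;     // stands in for the root Element; null while rootless
    private String docType;  // stands in for the DocType; null if there is no DOCTYPE

    boolean hasRootElement() {
        return root != null;
    }

    String getRootElement() {
        // A rootless document refuses to hand out a root it does not have.
        if (root == null) {
            throw new IllegalStateException("Root element not set");
        }
        return root;
    }

    RootlessSketch setRootElement(String name) {
        if (name == null) {
            throw new IllegalArgumentException("Root element may not be null");
        }
        root = name;
        return this;
    }

    String getDocType() {
        return docType;  // null is a legitimate answer here
    }

    RootlessSketch setDocType(String newDocType) {
        // Setting replaces the old value, so there are never two DocTypes.
        docType = newDocType;
        return this;
    }

    public static void main(String[] args) {
        RootlessSketch doc = new RootlessSketch();
        System.out.println(doc.hasRootElement());   // false: still rootless
        doc.setRootElement("html").setDocType("html");
        System.out.println(doc.getRootElement());   // html
    }
}
```

The point of the design is that the rootless state is detectable (hasRootElement()) and loud (an IllegalStateException) rather than a silent null that surfaces later as a NullPointerException.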

Most of the time all you do with a Document is get the root element. From that point forward, you work exclusively with the root element and its descendants until you're ready to serialize the Document. For example,

SAXBuilder builder = new SAXBuilder(); // org.jdom.input.SAXBuilder
Document doc = builder.build("file.xml");
Element root = doc.getRootElement();
// work with root…

Occasionally, you may need to hunt for a particular processing instruction in the prolog or epilog. For instance, if you were writing a browser, then you would want to look for an xml-stylesheet processing instruction in the prolog by iterating through the list returned by getContent() like so:

      List content = doc.getContent();
      Iterator iterator = content.iterator();
      while (iterator.hasNext()) {
        Object next = iterator.next();
        if (next instanceof Element) {
          // This is the root element. Thus the prolog is
          // finished.
          break;
        }
        else if (next instanceof ProcessingInstruction) {
          ProcessingInstruction candidate 
           = (ProcessingInstruction) next;
          if (candidate.getTarget().equals("xml-stylesheet")) {
            // Load stylesheet… 
          }
        }
      }
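
Once the loop has found an xml-stylesheet processing instruction, a browser still has to dig the stylesheet's location out of the instruction's data, which is just a string of pseudo-attributes such as type="text/xsl" href="style.xsl". The following is a minimal sketch of that step using only the JDK; the PseudoAttributes class and its parse() method are hypothetical helpers, not part of JDOM, and the string passed in is assumed to be whatever ProcessingInstruction.getData() returns. It does not resolve character or entity references inside the values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that splits processing instruction data into pseudo-attributes. */
final class PseudoAttributes {

    // Matches name="value" or name='value', allowing white space around the equals sign.
    private static final Pattern PAIR =
        Pattern.compile("([A-Za-z_][\\w.-]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    /** Returns the pseudo-attributes in the order they appear in the data. */
    static Map<String, String> parse(String data) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(data);
        while (matcher.find()) {
            // Group 2 holds a double-quoted value, group 3 a single-quoted one.
            String value = matcher.group(2) != null ? matcher.group(2) : matcher.group(3);
            result.put(matcher.group(1), value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = parse("type=\"text/xsl\" href='style.xsl'");
        System.out.println(attrs.get("href"));  // style.xsl
    }
}
```

Inside the loop, the result of parse(candidate.getData()).get("href") would then be the URL to load, relative to the document's own location.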

However, it’s uncommon to pay that much attention to the contents of the prolog (and even rarer to pay attention to the epilog). Consequently, since the addContent(), setContent(), getContent(), and removeContent() methods work pretty much the same for Document as they do for Element, I’m going to defer further discussion of these to the next section.

Copyright 2001, 2002 Elliotte Rusty Harold (elharo@metalab.unc.edu). Last Modified May 05, 2002.