The CDATA Class

The CDATA class shown in Example 15.13 is a subclass of Text with almost no functionality of its own. The only difference between CDATA and Text is that when an XMLOutputter serializes a CDATA object, it places its contents in a CDATA section rather than escaping reserved characters such as < with character or entity references.

Example 15.13. The JDOM CDATA class

package org.jdom;

public class CDATA extends Text {

  protected CDATA() { }
  public    CDATA(String s) throws IllegalDataException;
  
  public Text setText(String s) throws IllegalDataException;
  public void append(String s) throws IllegalDataException;

  public String toString();
  
}
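
The serialization difference described above is easy to see in miniature. What follows is only a sketch of the two output strategies, not JDOM's actual XMLOutputter code; the class TextVersusCdata and both helper methods are hypothetical names invented for this illustration. The one hard rule it encodes comes from XML itself: a CDATA section cannot contain the three-character sequence ]]>.

```java
public class TextVersusCdata {

    // Hypothetical helper: write a string as ordinary element content,
    // replacing the reserved characters with entity references.
    static String escapeText(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;");  break;
                case '>': sb.append("&gt;");  break;
                case '&': sb.append("&amp;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: write the same string as a CDATA section.
    // Nothing is escaped, so the end delimiter must not appear in the data.
    static String wrapInCdata(String s) {
        if (s.contains("]]>")) {
            throw new IllegalArgumentException(
                "A CDATA section cannot contain ]]>");
        }
        return "<![CDATA[" + s + "]]>";
    }

    public static void main(String[] args) {
        String data = "x < y && y > z";
        System.out.println(escapeText(data));   // x &lt; y &amp;&amp; y &gt; z
        System.out.println(wrapInCdata(data));  // <![CDATA[x < y && y > z]]>
    }
}
```

Both lines of output represent exactly the same character data to a parser. Only the markup a human author sees is different.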

In my opinion, you really shouldn’t use this class at all. The builder may (or may not) create CDATA objects when it parses a document that contains CDATA sections. However, you should not create them yourself. CDATA sections are purely a convenience for human authors. They are not part of the document’s infoset. They should not be exposed as a separate item in the logical model of a document, and indeed not all parsers and APIs will report them to the client program. Even APIs like JDOM and DOM that support them do not necessarily guarantee that they’ll be used where possible.
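
To see how little you can rely on CDATA sections being reported, here is a small sketch that uses the JAXP DOM parser bundled with the JDK rather than JDOM. The class CoalescingDemo and the method describeChildren are hypothetical names made up for this example. The only real API switch involved is DocumentBuilderFactory.setCoalescing(), which tells the parser to convert CDATA nodes into text nodes and merge them with their neighbors.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CoalescingDemo {

    // Hypothetical helper: parse the document and list the node types of
    // the root element's children, for example "TEXT CDATA TEXT".
    static String describeChildren(String xml, boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(coalescing);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < children.getLength(); i++) {
                if (i > 0) sb.append(' ');
                short type = children.item(i).getNodeType();
                if (type == Node.CDATA_SECTION_NODE) sb.append("CDATA");
                else if (type == Node.TEXT_NODE) sb.append("TEXT");
                else sb.append("OTHER");
            }
            return sb.toString();
        } catch (Exception ex) {
            // Wrapped so callers of this sketch need no checked exceptions
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<p>if <![CDATA[a < b]]> then</p>";
        System.out.println(describeChildren(xml, false));
        System.out.println(describeChildren(xml, true));
    }
}
```

With the parser that ships in the JDK, the first call should report three children (text, CDATA section, text) and the second a single merged text node. The document is the same in both cases, which is exactly why code that depends on seeing the CDATA section is fragile.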

Chapter 11 already warned you not to use CDATA sections as a sort of pseudo-element to hide HTML in your XML documents. That warning bears repeating now. CDATA sections let you add non-well-formed text to a document, but their contents are just text like any other text. They are not a special kind of element, and a parser more likely than not won’t distinguish between the contents of the CDATA section and the surrounding text. If you have a legitimate reason for embedding such text, you still need to enclose the CDATA section in an actual element to provide structure that programs can detect. For example, an HTML tutorial might enclose HTML code fragments or complete documents in example elements like this:

<example>
<![CDATA[<html>
<body>
  <h1>My First Web Page</h1>
      
  HTML is cool!<P>
  <hr>
  &copy; 2002 John Smith
  </body>
</html>]]>
</example>

This is much more flexible and much more robust than relying on CDATA sections to distinguish the examples from the main body text.
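
A program that reads such a tutorial can then find the examples by element name and never needs to know whether a CDATA section was used. The following is a minimal sketch using the JDK's own DOM parser. The class ExampleExtractor, the method extractExamples, and the tutorial and p elements in the sample document are all assumptions introduced for this illustration; only the example element comes from the fragment above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ExampleExtractor {

    // Hypothetical helper: return the text of every example element,
    // however that text happened to be written in the source document.
    static List<String> extractExamples(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList examples = doc.getElementsByTagName("example");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < examples.getLength(); i++) {
                // getTextContent() concatenates text and CDATA children alike
                result.add(examples.item(i).getTextContent());
            }
            return result;
        } catch (Exception ex) {
            // Wrapped so callers of this sketch need no checked exceptions
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<tutorial>"
            + "<p>Headings look like this:</p>"
            + "<example><![CDATA[<h1>My First Web Page</h1>]]></example>"
            + "<p>Rules look like this:</p>"
            + "<example>&lt;hr&gt;</example>"
            + "</tutorial>";
        for (String example : extractExamples(xml)) {
            System.out.println(example);
        }
    }
}
```

The first example uses a CDATA section and the second uses entity references, yet the extractor treats them identically. The element, not the CDATA section, carries the structure.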


Copyright 2001, 2002 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified June 04, 2002