|
|
|
All legal text
characters that can appear anywhere in an XML document can appear in
#PCDATA. This includes characters like < and & that may have to be
escaped with character or entity references. When an API presents the
content of a node containing such a character to your code, it will give you
the actual character, not the escaping text. Similarly, when you create such
a node, the string you use should contain the actual character, not the
entity or character reference.
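
A minimal sketch of this round trip, assuming Python's standard `xml.etree.ElementTree` module as the API (the text does not name a particular library, and the `note` element and its contents are made up for illustration):

```python
import xml.etree.ElementTree as ET

# In the serialized document, & and < must be escaped because they are
# reserved in markup. &#38; is a character reference for &.
source = "<note>AT&amp;T stock &lt; 20 &#38; falling</note>"

note = ET.fromstring(source)

# The API hands back the actual characters, not the escaping text.
parsed_text = note.text
print(parsed_text)

# Going the other way, assign the actual characters and let the
# serializer take care of escaping them.
note.text = "R&D costs < revenue"
serialized = ET.tostring(note, encoding="unicode")
print(serialized)
```

The string assigned to `note.text` contains a bare `&` and `<`; the escaping only appears in the serialized output.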
|
|
Consider this
DocBook programlisting element. A CDATA section is used to embed a literal
sequence of XML text:
|
|
<programlisting><![CDATA[<value>
|
|
<double>28657</double>
|
|
</value>]]></programlisting>
|
|
Everything
inside the CDATA section is content, not markup. The content of this
programlisting element is this text:
|
|
<value>
|
|
<double>28657</double>
|
|
</value>
|
|
A CDATA section
is not required for this trick to work either. For instance, consider this
variation of the above element:
|
|
<programlisting>&lt;value&gt;
|
|
&lt;double&gt;28657&lt;/double&gt;
|
|
&lt;/value&gt;</programlisting>
|
|
The content of
this element is exactly the same:
|
|
<value>
|
|
<double>28657</double>
|
|
</value>
|
|
In this case
the entity references &lt; and &gt; are resolved to
produce the text < and >. However, that's just syntactic sugar. It does
not affect the content in any way.
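
The equivalence can be checked with a short sketch, assuming Python's standard `xml.etree.ElementTree` parser stands in for whatever API you use. One string wraps the listing in a CDATA section and the other escapes it with entity references; the variable names are illustrative only:

```python
import xml.etree.ElementTree as ET

# Form 1: the literal text is wrapped in a CDATA section.
with_cdata = (
    "<programlisting><![CDATA[<value>\n"
    "  <double>28657</double>\n"
    "</value>]]></programlisting>"
)

# Form 2: the same text escaped with entity references instead.
with_entities = (
    "<programlisting>&lt;value&gt;\n"
    "  &lt;double&gt;28657&lt;/double&gt;\n"
    "&lt;/value&gt;</programlisting>"
)

a = ET.fromstring(with_cdata)
b = ET.fromstring(with_entities)

# Both forms yield exactly the same text content.
same_content = a.text == b.text

# Neither programlisting has child elements: the tags are text, not markup.
child_counts = (len(a), len(b))

print(same_content, child_counts)
print(a.text)
```

Once the document is parsed, nothing in the element's content reveals which of the two escaping mechanisms was used.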
|
|
Now consider
the reverse problem. Suppose you're creating an XML document in something at
least a little more XML-aware than a text editor. Possibilities include:
|
|
A tree-based
editor like <oXygen/> or XML Spy
|
|
A WYSIWYG
application like OpenOffice Writer or Apple's Keynote that saves its data
into XML
|
|
A programming
API such as DOM, JDOM, or XOM
|
|
In all cases,
the creating tool will provide separate means to insert markup and text. The
tool is responsible for escaping any reserved characters like <, >, or
& when it saves the document. You do not need to do this. Indeed, if you
try to do so by passing something like
&lt;double&gt;28657&lt;/double&gt; into a method that
expects plain text, what it will actually save is something like
&amp;lt;double&amp;gt;28657&amp;lt;/double&amp;gt;.
|
|
Similarly, you
cannot type <double>28657</double> into a user interface widget
that creates text and expect it to create an element. What you will
get in the serialized document is something like
&lt;double&gt;28657&lt;/double&gt;. Instead, you should use
the UI widget or method call designed for creating a new element.
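
The three cases can be sketched in code. This example assumes Python's standard `xml.etree.ElementTree` as a stand-in for DOM, JDOM, or XOM; those APIs differ in detail, but the separation between text-creating and element-creating calls is the same idea. The variable names are illustrative only:

```python
import xml.etree.ElementTree as ET

# Case 1: tag-like characters passed to a text property stay text.
# The serializer escapes them on output.
as_text = ET.Element("value")
as_text.text = "<double>28657</double>"
text_out = ET.tostring(as_text, encoding="unicode")
print(text_out)    # <value>&lt;double&gt;28657&lt;/double&gt;</value>

# Case 2: escaping by hand before handing the string to the API
# gets the ampersands escaped a second time.
pre_escaped = ET.Element("value")
pre_escaped.text = "&lt;double&gt;28657&lt;/double&gt;"
double_out = ET.tostring(pre_escaped, encoding="unicode")
print(double_out)  # <value>&amp;lt;double&amp;gt;28657&amp;lt;/double&amp;gt;</value>

# Case 3: to get a real element, use the call that creates elements.
as_markup = ET.Element("value")
child = ET.SubElement(as_markup, "double")
child.text = "28657"
markup_out = ET.tostring(as_markup, encoding="unicode")
print(markup_out)  # <value><double>28657</double></value>
```

Only the third case produces a `double` element in the saved document; the first two produce text that merely looks like one.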
|
|
The key thing
to remember is this: just because something looks like an XML tag does not
always mean it is an XML tag. Context matters. XML documents are made of
markup that sometimes surrounds PCDATA, but that's the limit of the nesting.
You can put PCDATA inside markup, and you can put markup inside markup, but
you can't put markup inside PCDATA. CDATA sections are just an alternative
means of escaping text. They are not a way to embed markup inside PCDATA.
|