[ Cafe con Leche | XML Bible Home Page | Order from amazon.com | Publisher Page ]

In This Chapter


XPointer, the XML Pointer Language, defines an addressing scheme for individual parts of an XML document. These addresses can be used by any application that needs to identify parts of or locations in an XML document. For example, an XML editor could use an XPointer to identify the current position of the insertion point or the range of the selection. An XInclude processor can use an XPointer to determine what part of a document to include. You can also add an XPointer fragment identifier to the URI in an XLink to change it into a URI reference that locates one particular element in the targeted document. XPointers use the same XPath syntax that you’re familiar with from XSLT to identify the parts of the document they point to, along with a few additional pieces.


No mainstream browsers have any support for XPointers. You can use URLs with XPointer fragment identifiers in web pages, but browsers will mostly ignore them.

Why XPointers?

Traditional URLs are simple and easy to use, but they’re also quite limited. For one thing, a URL only points at a single, complete document. More granularity than that, such as linking to the third sentence of the seventeenth paragraph in a document, requires the author of the targeted document to manually insert named anchors at the targeted location. The author of the document doing the linking can’t do this unless he or she also has write access to the document being linked to. Even if the author doing the linking can insert named anchors into the targeted document, it’s almost always inconvenient.

It would be more useful to be able to link to a particular element or group of elements on a page without having to change the document you’re linking to. For example, given a large document such as the television listings of Chapters 4 and 5, you might want to link to only one station or one show. There are several parts to this problem. The first part is addressing the individual elements. This is the part that XPointers solve. XPointers enable you to target a given element by number, name, type, or relation to other elements in the document.

The second part of the problem is the protocol by which a browser asks a web server to send only part of a document rather than the whole thing. This is an area of active research. More work is needed. XPointers do little to solve this problem, except for providing a foundation on which such systems can build. For example, the best efforts to date are the so-called byte range extensions to HTTP available in HTTP 1.1. So far, these have not achieved widespread adoption, mostly because web authors aren’t comfortable specifying a byte range in a document. Furthermore, byte ranges are extremely fragile. Trivial edits to a document, even simple reformatting, can destroy byte range links. HTTP 1.1 does allow other range units besides raw bytes (for example, XML elements), but does not require web servers or browsers to support such units.

For the moment, therefore, an XPointer can be used as an index into a complete document, the whole of which is loaded and then positioned at the location identified by the XPointer, and even this is more than most browsers can handle. In the long term, extensions to XML, XLink, HTTP, and other protocols may allow more sophisticated uses of XPointers. For example, XInclude will let you quote a remote document by using an XPointer to tell browsers where to copy the quote in the original document, rather than retyping the text of the quote. You could include cross-references inside a document that automatically update themselves as the document is revised. These uses, however, will have to wait for the development of several next-generation technologies. For now, you must be content with precisely identifying the part of a document you want to jump to when following an XLink.

XPointer Examples

HTML links generally point to one particular document. Additional granularity—that is, pointing to a particular section, chapter, or paragraph of a particular document—isn’t well supported. Provided you control both the linking and the linked document, you can insert a named anchor into an HTML file at the position to which you want to link, as in the following example:

<H2><A NAME="xtocid20.2">XPointer Examples</A></H2>

You can then link to this position in the file by adding a # and the name of the anchor to the URL. The piece of the URL after the # is called the fragment identifier. For example, in this link the fragment identifier is xtocid20.2:

<A HREF="http://www.cafeconleche.org/bible/20.html#xtocid20.2">
  XPointer Examples


A URL with a fragment identifier is technically a URL reference, not a URL; but outside of specification documents, the distinction is almost never made, nor does it matter.

However, this solution is a kludge. It’s not always possible to modify the target document so that the source document can link to it. The target document might be on a different server controlled by someone other than the author of the source document. And the author of the target document might change or move it without notifying the author of the source.

Furthermore, named anchors violate the principle of separating markup from content. Placing a named anchor in a document says nothing about the document or its content. It’s just a marker for other documents to refer to. It adds nothing to the document’s own content.

XPointers allow much more sophisticated connections between parts of documents. An XPointer can refer to any element of a document; to the first, second, or seventeenth element; to the seventh element named P; to the first element that’s a child of the second DIV element; and so on. XPointers provide precisely targeted addresses of particular parts of documents. They do not require the targeted document to contain additional markup just so its individual pieces can be linked to.

Furthermore, unlike HTML anchors, XPointers don’t point to just a single point in a document. They can point to entire elements, to noncontiguous sets of elements, or to a range of text between two points. Thus, you can use an XPointer to select a particular part of a document, perhaps so it can be copied or loaded into a program.

Here are a few examples of XPointers:


Each of these seven XPointers selects a particular element in a document. The first finds the element with the ID ebnf. The second finds the second language element in the document. The third is a shorthand form of finding the element with the ID ebnf. The fourth and fifth both specify the second language child element of any child element of the body child elements of the spec child of the root node. The sixth finds the second child element of the fourteenth child element of the root element. The final one also points to the element with the ID ebnf. However, if no such element is present, it then finds the element with the ID EBNF.

The document is not specified by the XPointer; rather, the URI that precedes the XPointer specifies the document. This URI may be contained in an XLink linking element or in anything else that contains a URI pointing at an XML document. The XLinks and URIs you saw in Chapter 17 did not contain XPointers, but it isn’t hard to add XPointers to them. Most of the time, you simply append the XPointer to the URI separated by a #, just as you do with named anchors in HTML. For example, the preceding list of XPointers could be suffixed to URLs and come out looking similar to the following:


In fact, these URIs are just six different ways of pointing to the same element of the document at http://www.w3.org/TR/1998/REC-xml-19980210.xml. Often, such URIs are values of the xlink:href attribute of a linking element, as in the following example:

<SPECIFICATION xmlns:xlink="http://www.w3.org/1999/xlink"
 xlink:type="simple" xlink:href=
 xlink:actuate="onRequest" xlink:show="replace">
  Extensible Markup Language (XML) 1.0

XPointers don’t have any special exemptions from the rules of URIs. In particular, if the XPointer contains characters that are not allowed in URLs such as @@OM or ^, these characters must be encoded in UTF-8, and the bytes of the UTF-8 encoding must be hex-escaped using a percent sign. For example, the capital Greek letter omega is Unicode character 3A9 in hexadecimal. When encoded in UTF-8, this character is the two bytes 206 and 169. In hexadecimal, that’s CE and A9. Therefore, the XPointer xpointer(id("@@OM")) would be encoded in a URL as xpointer(id("%CE%A9")).

Because these three characters are part of the ASCII character set, their UTF-8 encodings are simply their values. Therefore, xpointer(descendant::*[.='^']) would be encoded in a URL as xpointer(descendant%3A%3A*[.%3D'%5E']). Modern web browsers allow the square brackets [ and ] in URLs. However, some older browsers do not, so for maximum compatibility you should escape these characters as %5B and %5D, respectively. Thus, the preceding XPointer would become xpointer(descendant%3A%3A*%5B.%3D'%5E'%5D).

A Concrete Example

To demonstrate the different types of XPointers, it’s useful to have a concrete example in mind. Listing 18-1 is a simple, valid document that should be self-explanatory. It contains information about two related families and their members. The root element is FAMILYTREE. A FAMILYTREE can contain PERSON and FAMILY elements. Each PERSON and FAMILY element has a required ID attribute. Persons contain a name, birth date, death date, and spouse. Families contain a husband, a wife, and zero or more children. The individual persons are referred to from the family by reference to their IDs.


This XML application is revisited in Chapter 25.

Listing 18-1: A Family Tree

<?xml version="1.0"?>


  <!-- PERSON elements -->
    ID      ID     #REQUIRED




  <PERSON ID="p1">
    <NAME>Domeniquette Celeste Baudean</NAME>
    <BORN>21 Apr 1836</BORN>
    <SPOUSE IDREF="p2"/>

  <PERSON ID="p2">
    <NAME>Jean Francois Bellau</NAME>
    <SPOUSE IDREF="p1"/>

  <PERSON ID="p3" FATHER="p2" MOTHER="p1">
    <NAME>Elodie Bellau</NAME>
    <BORN>11 Feb 1858</BORN>
    <DIED>12 Apr 1898</DIED>
    <SPOUSE IDREF="p4"/>

  <PERSON ID="p4">
    <NAME>John P. Muller</NAME>
    <SPOUSE IDREF="p3"/>

  <PERSON ID="p7">
    <NAME>Adolf Eno</NAME>
    <SPOUSE IDREF="p6"/>

  <PERSON ID="p6" FATHER="p2" MOTHER="p1">
    <NAME>Maria Bellau</NAME>
    <SPOUSE IDREF="p7"/>

  <PERSON ID="p5" FATHER="p2" MOTHER="p1">
    <NAME>Eugene Bellau</NAME>

  <PERSON ID="p8" FATHER="p2" MOTHER="p1">
    <NAME>Louise Pauline Bellau</NAME>
    <BORN>29 Oct 1868</BORN>
    <DIED>3 May 1938</DIED>
    <SPOUSE IDREF="p9"/>

  <PERSON ID="p9">
    <NAME>Charles Walter Harold</NAME>
    <BORN>about 1861</BORN>
    <DIED>about 1938</DIED>
    <SPOUSE IDREF="p8"/>

  <PERSON ID="p10" FATHER="p2" MOTHER="p1">
    <NAME>Victor Joseph Bellau</NAME>
    <SPOUSE IDREF="p11"/>

  <PERSON ID="p11">
    <NAME>Ellen Gilmore</NAME>
    <SPOUSE IDREF="p10"/>

  <PERSON ID="p12" FATHER="p2" MOTHER="p1">
    <NAME>Honore Bellau</NAME>

  <FAMILY ID="f1">
    <HUSBAND IDREF="p2"/>
    <WIFE IDREF="p1"/>
    <CHILD IDREF="p3"/>
    <CHILD IDREF="p5"/>
    <CHILD IDREF="p6"/>
    <CHILD IDREF="p8"/>
    <CHILD IDREF="p10"/>
    <CHILD IDREF="p12"/>

  <FAMILY ID="f2">
    <HUSBAND IDREF="p7"/>
    <WIFE IDREF="p6"/>


In the sections that follow, this document is assumed to be present at the URL http://www.theharolds.com/genealogy.xml. This isn’t a real URL, but the emphasis here is on selecting individual parts of a document rather than a document as a whole.

Location Paths, Steps, and Sets

Many (though not all) XPointers are location paths. These are the same location paths used by XSLT and discussed in Chapter 15. Consequently, much of the syntax should already be familiar to you.

Location paths are built from location steps. Each location step specifies a point in the targeted document, always relative to some other well-known point such as the start of the document or the previous location step. This well-known point is called the context node. In general, a location step has three parts: the axis, the node test, and an optional predicate. These are combined in this form:


For example, in the location step child::PERSON[position()=2], the axis is child, the node-test is PERSON, and the predicate is [position()=2]. This location step selects the second PERSON element along the child axis, starting from the context node or, less formally, the second PERSON child element of the context node. Of course, which element this actually is depends on what the context node is. Consequently, this is what’s referred to as a relative location step. It’s relative to the context node. There are also absolute location steps that do not depend on the context node.

The axis specifies the direction to search from the context node. For example, an axis can say to look at things that follow the context node, things that precede the context node, things that are children of the context node, things that are attributes of the context node, and so forth.

The node test indicates which nodes to consider along the axis. The most common node test is simply an element name. However, the node test can also be the asterisk (*) wildcard to indicate that any element is to be matched, or one of several functions for selecting comments, text, attributes, processing instructions, points, and ranges. The group of nodes along the given axis that satisfy the node test forms a location set.

The predicate is a boolean XPath expression (exactly like the XPath expressions you learned about in XSLT) that tests each node in that set. If that expression returns false, the node is removed from the set.

Often, after the entire location step—axis, node test, and predicate—has been evaluated, what’s left is a single, unique node. However, not all location steps select exactly one node. In some cases, there may be multiple nodes in the final location set. On occasion, there might be no nodes in the location set; in other words, the location set is the empty set.

A single location step is often not enough to identify the node you want. Commonly, location steps are strung together, separated by slashes, to form a location path. Each location step’s location set becomes the context node-set for the next step in the path. For example, consider this XPointer:


The location path of this XPointer is /child::FAMILYTREE/child::PERSON[position()=3]. It is built from two location steps:

The first location step is an absolute step that selects all child elements of the root node whose name is FAMILYTREE. When applied to Listing 18-1, there’s exactly one such element. The second location step is then applied relative to the FAMILYTREE element returned by the first location step. All of its child nodes are considered. Those that satisfy the node test—that is, elements whose name is PERSON—are returned. There are 12 of these nodes. Each of these 12 nodes is then compared against the predicate to see if its position is equal to 3. This turns out to be true for only one node, Elodie Bellau’s PERSON element, so that is the single node this XPointer points to.

It is not always the case, however, that an XPointer points to exactly one node. For example, consider this XPointer:


This is exactly the same as before except that the equals sign has been changed to a greater than sign. Now when each of the 12 PERSON elements are compared, the predicate returns true for 9 of them. Each of these nine is included in the location set that this XPointer returns. This XPointer points to nine nodes, not to one.

The Root Node

Although Listing 18-1 includes ID attributes for most elements, and although they are convenient, they are not required for linking into the document. You can select any element in the document simply by working your way down from the root node. An initial / indicates the root node.

The root node is not the same as the root element. Rather, it is an abstract node that contains the entire document, including any comments or processing instructions that come before or after the root element, such as xml-stylesheet, and the root element itself. For example, to select the root node of the XML 1.0 specification at http://www.w3.org/TR/REC-xml you can use this URI:


For another example, Domeniquette Celeste Baudean is the first person in Listing 18-1. Therefore, to point at her name, you can get the first element child of the root node (that is, the root element of the document, FAMILYTREE), then count one PERSON down from the root element, and then count one NAME down from that, like this:


This location path says to find the root node, then find all element children of the root node (which, in a well-formed XML document, will be exactly the root element), then find the first PERSON element that’s an immediate child of that element, and then find its NAME child elements.


XPath defines 13 axes along which an XPointer can search for nodes, all from the same XPath syntax used for XSLT. These depend on context to determine exactly what they point to. For example, consider this location path:


It begins with the id() function that returns a node-set containing the element with the ID type attribute whose value is p6. This provides a context node for the following location step along the relative child axis. Other axes include ancestor, descendant, self, ancestor-or-self, descendant-or-self, attribute, and more. Each serves to select a particular subset of the elements in the document. For example, the following axis selects from nodes that come after the context node. The preceding axis selects from nodes that come before the context node. Table 18-1 summarizes the 13 axes.

Table 18-1

Location Step Axes


Selects From


All nodes contained in the context node, but not contained in any other nodes the context node contains


The unique node that contains the context node but that does not contain any other nodes that also contain the context node


The context node


The parent of the context node, the parent of the parent of the context node, the parent of the parent of the parent of the context node, and so forth, back to the root node


The ancestors of the context node and the context node itself


The attributes of the context node


The children of the context node, the children of the children of the context node, and so forth


The context node and its descendants


All nodes that start after the end of the context node, excluding attribute and namespace nodes


All nodes that start after the end of the context node and have the same parent as the context node, excluding attribute and namespace nodes


All nodes that finish before the beginning of the context node, excluding attribute and namespace nodes


All nodes that start before the beginning of the context node and have the same parent as the context node, excluding attribute and namespace nodes

The child axis

The child axis selects from the children of the context node. For example, consider this XPointer:


Reading from right to left, it selects the NAME child elements of the third PERSON element that’s a child of the FAMILYTREE element that’s a child of the root of the document. In this example, there’s only one such element; but if there are more than one, all are returned. For instance, consider this XPointer:


This selects all NAME children of PERSON elements that are children of FAMILYTREE elements that are children of the root. There are a dozen of these in Listing 18-1.

It’s important to note that the child axis only selects from the immediate children of the context node. For example, consider this URI:


This points nowhere because there are no NAME elements in the document that are direct, immediate children of the root node. There are a dozen NAME elements that are indirect children. If you’d like to refer to these, you should use the descendant axis instead of child.

As in XSLT, the child axis is implied if no explicit axis name is present. For example, the preceding three XPointers would more likely be written in this abbreviated form:


The descendant axis

The descendant axis searches through all the descendants of the context node, not just the immediate children. For example, /descendant::BORN selects all the BORN elements in the document. /descendant::BORN[position()=3] selects the third BORN element encountered in a depth-first search of the document tree. (Depth first is the order you get if you simply read through the XML document from beginning to end.) In Listing 18-1, that selects Louise Pauline Bellau’s birthday, <BORN>29 Oct 1868</BORN>. There is no abbreviation for descendant axis.

The descendant-or-self axis

The descendant-or-self axis searches through all the descendants of the context node and the context node itself. For example, id("p11")/descendant-or-self::PERSON refers to all PERSON children of the element with ID p11 as well as that element itself, because it is of type PERSON. There is no abbreviation for descendant-or-self.

The descendant-or-self axis can be abbreviated by using a double slash in place of a single slash. For example, //BORN[position()=3] also selects the third BORN element encountered in a depth-first search of the document tree. //NAME selects all NAME elements in the document. //PERSON/NAME selects all NAME children of PERSON elements.

The parent axis

The parent axis refers to the node that’s the immediate parent of the context node. For example, /descendant::HUSBAND[position()=1]/parent::* refers to the parent element of the first HUSBAND element in the document. In Listing 18-1, this is the FAMILY element with ID f1.

Without a node test, the parent axis can be abbreviated by a .. as in //HUSBAND[position()=1]/...

The self axis

The self axis selects the context node. It’s sometimes useful when making relative links. For example, /self::node() selects the root node of the document (which is not the same as the root element of the document; that would be selected by /child::* or, in this example, /child::FAMILYTREE.) It can be abbreviated by a single period. However, this axis is rarely used in XPointers. It’s more useful for XSLT select expressions.

The ancestor axis

The ancestor axis selects all nodes that contain the context node, starting with its parent. For example, /descendant::BORN[position()=2]/ancestor::*[position()=1] selects the element that contains the second BORN element. Applied to Listing 18-1, it selects Elodie Bellau's PERSON element. There’s no abbreviation for the ancestor axis.

The ancestor-or-self axis

The ancestor-or-self axis selects the context node and all nodes that contain it. For example, id("p1")/ancestor-or-self::* identifies a node-set that includes Domeniquette Celeste Baudean’s PERSON element, that has ID p1, and its parent, the FAMILYTREE element, and its parent, the root node. There’s also no abbreviation for the ancestor-or-self axis.

The preceding axis

The preceding axis selects all nodes that finish before the context node. The first time it encounters an element’s start-tag or empty-element tag, moving backwards from the start of the context node, it counts that element. For example, consider this rule:


This says go to the third BORN element from the root, Louise Pauline Bellau’s birthday, <BORN>29 Oct 1868</BORN>, and then move back six elements. This lands on Maria Bellau’s NAME element. There’s no abbreviation for the preceding axis.

The following axis

The following axis selects all elements that occur after the context node’s closing tag. The first time it encounters an element’s start-tag or empty-element tag, it counts that element. For example, consider this location path:


This says go to Elodie Bellau’s birthday, <BORN>11 Feb 1858</BORN>, and then move forward five elements. This lands on John P. Muller’s SPOUSE element, <SPOUSE IDREF="p3" />, after passing through Elodie Bellau's DIED element, Elodie Bellau’s SPOUSE element, John P. Muller’s PERSON element, and John P. Muller’s NAME element, in this order. There’s no abbreviation for the following axis.

The preceding-sibling axis

The preceding-sibling axis selects elements that precede the context node in the same parent element. For example, /descendant::BORN[position()=2]/preceding-sibling::*[position()=1] selects Elodie Bellau’s NAME element, <NAME>Elodie Bellau</NAME>. /descendant::BORN[position()=2]/preceding-sibling::*[position()=2] doesn’t point to anything, because there’s only one sibling of Elodie Bellau’s BORN element before it. There’s no abbreviation for the preceding-sibling axis.

The following-sibling axis

The following-sibling axis selects elements that follow the context node in the same parent element. For example, /descendant::BORN[position()=2]/following-sibling::*[position()=1] selects Elodie Bellau’s DIED element, <DIED>12 Apr 1898</DIED>. /descendant::BORN[position()=2]/following-sibling::*[position()=3] doesn’t point to anything, because there are only two sibling elements following Elodie Bellau’s BORN element. There’s no abbreviation for the following-sibling axis.

The attribute axis

The attribute axis selects attributes of the context node. For example, the location path /descendant::SPOUSE/attribute::IDREF selects all IDREF attributes of all SPOUSE elements in the document. The attribute axis can be abbreviated by an @ sign. Thus, //SPOUSE/@IDREF also selects all IDREF attributes of all SPOUSE elements in the document. @* is a general abbreviation for an attribute with any name. So, //SPOUSE/@* indicates all attributes of all SPOUSE elements.

For another example, to find all PERSON elements in the document http://www.theharolds.com/genealogy.xml whose FATHER attribute is Jean Francois Bellau (ID p2), you could write //PERSON[@FATHER="p2"].

The xmlns and xmlns:prefix attributes used to declare namespaces are not attribute nodes. To get information about namespaces, you have to use the namespace axis instead.

The namespace axis

The namespace axis contains the namespaces in scope on the context node. It only applies to element nodes. There is one namespace node for each prefix that is mapped to a URI on that element (whether the prefix is used or not, and whether the xmlns:prefix attribute that created the mapping is on the element itself or one of its ancestors). Furthermore, if the element is in a default, nonprefixed namespace, there is also a namespace node for the default namespace.

Although the element is the parent of the namespace node, the namespace node is not the child of the element. A simple walk of the tree or asking for the children of the element will not find namespaces. Instead, you have to walk the namespace axis explicitly. The only node tests that apply to namespace nodes are node() and *.

Fortunately, there’s very little reason to point to a namespace node with an XPointer. This axis is more useful for XSLT and not much used in XPointer.

Node Tests

Most of the time, the node test part of a location step is simply an element or attribute name such as PERSON or IDREF. However, there are nine other possibilities:

An asterisk stands for any element, except on the attribute axis, where it stands for any attribute, and along the namespace axis, where it stands for any namespace. For example, id("p1")/child::* selects all the child elements of the element with the ID p1, regardless of their type. This does, however, select only element nodes. It omits comment nodes, text nodes, processing instruction nodes, and attribute nodes. If you want to select absolutely any kind of node, use the node() node test instead.

A prefix followed by an asterisk selects all elements in the namespace that match the prefix. For example, if the svg prefix is mapped to the http://www.w3.org/2000/svg URI, svg:* matches all SVG elements. Similarly, @prefix:* matches all attributes in the specified namespace. For example, if xlink is mapped to the URI http://www.w3.org/1999/xlink, @xlink:* matches all XLink attributes in the document, such as xlink:type, xlink:show, xlink:actuate, xlink:href, xlink:role, and so forth.

The xmlns:prefix attributes in the document where the XPointer is found do not apply to XPointers in that document (if indeed, the XPointer is even in an XML document and not in an HTML document, painted on the side of a building, or something else). Instead, you prefix the xpointer() part with one or more xmlns(prefix=URI) parts that establish a prefix mapping.

For example, suppose you want to point at the MathML math element in the document at http://www.example.com/equations.xml. You know that this element is in the http://www.w3.org/1998/Math/MathML namespace, but you don’t know what prefix is used in the document. Regardless of what prefix the target document uses, you can use the prefix mml as long as you use an xmlns(mml=http://www.w3.org/1998/Math/MathML) part to associate it with the right URI, as in the following example:


The text() node test specifically refers to the parsed character data content of an element. It’s most commonly used with mixed content. Despite the parentheses, the text() node test does not actually take any arguments. For example, /descendant::text() refers to all of the text but none of the markup of a document. For another example, consider this CITATION element:

    <AUTHOR>Turing, Alan M.</AUTHOR>
    "<TITLE>On Computable Numbers,
      With an Application to the Entscheidungs-problem</TITLE>"
      Proceedings of the London Mathematical Society</JOURNAL>,
    <SERIES>Series 2</SERIES>,

The following location path refers to the quotation mark before the TITLE element:


The first text node in this fragment is the white space between <CITATION CLASS="TURING" ID="C2"> and <AUTHOR>. Technically, this location path refers to all text between </AUTHOR> and <TITLE>, including the white space and not just the quotation mark.

Because character data does not contain any child nodes, you cannot add an additional child, descendant, or attribute relative location step after the first term that selects a text node.

The comment() node test specifically refers to comments. For example, this XPointer points to the third comment in the document:


Because comments do not contain attributes or elements, you cannot add an additional child, descendant, or attribute relative location step after the first term that selects a comment. Despite the parentheses, the comment() node test does not actually take any arguments.

Finally, the processing-instruction() node test selects any processing instructions that occur along the chosen axis. You can use it without any arguments to select all processing instructions, or with an argument to specify the targets of the particular processing instructions you want to select. For example, /descendant::processing-instruction() selects all processing instructions in the document, whereas /descendant::processing-instruction('xml-stylesheet') only finds processing instructions that begin <?xml-stylesheet . /descendant::processing-instruction("php") only finds processing instructions intended for PHP. As with comments, because processing instructions do not contain attributes or elements, you cannot add an additional child, descendant, or attribute relative location step after the first step that selects a processing instruction.

The point() and range() node tests refer to new ways of dividing an XML document that only work in XPointer, not in other standards that use XPath, such as XSLT. They are discussed later in the chapter.


Each location step can contain zero or more predicates that further restrict which nodes an XPointer points to. In many cases, a predicate is necessary to pick the one node from a node-set that you want. This uses the same syntax you learned from XSLT in Chapter 15. Each predicate contains an expression in square brackets ([]). This allows an XPointer to select nodes according to many different criteria, such as the following:

These are just a small sampling of the selections that predicates make possible.

The result of a predicate expression is ultimately converted to a boolean after all calculations are finished. Nonboolean results are converted as follows:

The predicate expression is evaluated for each node in the context node list. Each node for which the expression ultimately evaluates to false is removed from the list. Thus, only those nodes that satisfy the predicate remain. I will not repeat the discussion of the operators and functions available to use expressions here. However, I will show you a few examples of predicates using the expression syntax as it’s likely to be used in XPointers.


Expression syntax is covered in Chapter 15.

Probably the most frequently used function in XPointer predicates is position(). This returns the index of the node in the context node list. This enables you to find the first, second, third, or other indexed node. You can compare positions using the relational operators <, >, =, !=, >=, and <=.

For example, in Listing 18-1 the root FAMILYTREE element has 14 immediate children, 12 PERSON elements, and 2 FAMILY elements. In order, they are:


In fact, this test is so common that XPath offers a shorthand notation for it. Instead of writing [position=X] where X is a number, you can simply enclose the number or an XPath expression that returns the number in the square brackets, like this:


Greater numbers, such as /child::FAMILYTREE/child::*[15], don’t point to anything..

To count all elements in the document, not just the immediate children of the root, you can use the descendant axis instead of child. Table 18-2 shows the first four descendant XPointers for the document element FAMILYTREE of Listing 18-1, and what they point to. Note especially that /child::FAMILYTREE/descendant::*[position()=1] points to the entire first PERSON element, including its children, and not just the <PERSON> start-tag.

Table 18-2

The First Four Descendants of the Document Element


Points To


<PERSON ID="p1">

<NAME>Domeniquette Celeste Baudean</NAME>

<BORN>11 Feb 1858</BORN>

<DIED>12 Apr 1898</DIED>




<NAME>Domeniquette Celeste Baudean</NAME>


<BORN>21 Apr 1836</BORN>



Functions That Return Node-Sets

XPointers are not limited to location paths. In fact, they can use any expression that returns a node-set. In particular, they can use functions that return node-sets. There are three of these:

The last two, here() and origin(), are XPointer extensions to XPath that are not available in XSLT.


The id() function is one of the simplest and most robust means of identifying an element node. It selects the element in the document that has an ID type attribute with a specified value. For example, consider this URI:


If you look at Listing 18-1, you find this element:

<PERSON ID="p12" FATHER="p2" MOTHER="p1">
  <NAME>Honore Bellau</NAME>

Because ID type attributes are unique in valid documents, there shouldn’t be any other elements that match this XPointer. Therefore, http://www.theharolds.com/genealogy.xml#xpointer(id("p12")) should refer to Honore Bellau's PERSON element. Note that the XPointer points to the entire element to which it refers, including all its children, not just the start-tag.


If the document is invalid and more than one element has the same ID, only the first one in the document is selected.

Because ID pointers are so common and so useful, there’s also a shortcut for this. If all you want to do is point to a particular element with a particular ID, you can skip all the xpointer(id("")) frou-frou and just use the bare ID after the #, like this:


This is called a shorthand pointer. You can only use a shorthand pointer if all you want is the particular element with the particular ID. You cannot add additional relative location steps to a URI that uses this shortcut to select children of the element with ID p12 or the third attribute of the element with ID p12. If you want to do that, you have to use the full xpointer(id("p12")) syntax.

The disadvantage of the id() function and shorthand pointers is that they require assistance from the targeted document. If the element you want to point to does not have an ID type attribute, you’re out of luck. If other elements in the document have ID type attributes, you might be able to point to one of them and use a relative location step to point to the one you really want. Nonetheless, ID type attributes work best when you control both the targeted document and the linking document, so that you can ensure that the IDs match the links even as the documents evolve and change over time.

If the document does not have a DTD, it cannot have any ID type attributes, although it may have attributes named ID. In this case, you can’t point at anything using the id() function or a shorthand pointer.

One possibility is to first use an id()-based XPointer, but back it up with an XPointer that looks for the attribute with the specific name anywhere in the document, ID in this example. Simply append the second XPointer to the first, like this:


XPointers are evaluated from left to right. The first match found is returned, so the backup is only used if an ID type attribute with the value p12 can't be found.


The second node-set returning function is here(). However, it’s only useful when used in conjunction with one or more relative location steps. In intradocument links, that is, links from one point in a document to another point in the same document, it’s often necessary to refer to "the next element after this one," or "the parent element of this element." The here() function refers to the node that contains the XPointer so that such references are possible.

Consider Listing 18-2, a simple slide show. In this example, here()/../following::SLIDE[1] refers to the next slide in the show. here()/../preceding::SLIDE[1] refers to the previous slide in the show. Presumably, this would be used in conjunction with a style sheet that showed one slide at a time.

Listing 18-2: A Slide Show

<?xml version="1.0"?>
<SLIDESHOW xmlns:xlink="http://www.w3.org/1999/xlink">
    <H1>Welcome to the slide show!</H1>
    <BUTTON xlink:type="simple"
    <H1>This is the second slide</H1>
    <BUTTON xlink:type="simple"
    <BUTTON xlink:type="simple"
    <H1>This is the third slide</H1>
    <BUTTON xlink:type="simple"
    <BUTTON xlink:type="simple"
    <H1>This is the last slide</H1>
    <BUTTON xlink:type="simple"


Generally, the here() function is only used in XLinks where the href attribute contains a relative URI pointing to the same document. If any URI part is included, it must be the same as the URI of the current document.


The origin() function is much the same as here(); that is, it refers to the source of a link. However, origin() is used in out-of-line links where the link is not actually present in the source document. It points to the element in the source document from which the user activated the link.


Selecting a particular element or node is almost always good enough for pointing into well-formed XML documents. However, on occasion, you might need to point into XML data in which large chunks of non-XML text is embedded via CDATA sections, comments, processing instructions, or some other means. In these cases, you might need to refer to particular ranges of text in the document that don't map onto any particular markup element. Or, you might need to point into non-XML substructure in the text content of particular elements; for example, the month in a BORN element that looks like this:

<BORN>11 Feb 1858</BORN>

An XPath expression can identify an element node, an attribute node, a text node, a comment node, or a processing instruction node. However, it can’t indicate the first two characters of the BORN element (the date) or the substring of text between the first space and the last space in the BORN element (the month).

XPointer generalizes XPath to allow identifiers like this. An XPointer can address points in the document and ranges between points. These may not correspond to any one node. For example, the place between the X and the P in the word XPointer at the beginning of this paragraph is a point. The place between the t and the h in the word this at the end of the first sentence of this paragraph is another point. The text fragment "Pointer generalizes XPath to allow pointers like t" between those two points is a range.

Every point is either between two nodes or between two characters in the parsed character data of a document. To make sense of this, you have to remember that parsed character data is part of a text node. For example, consider this very simple but well-formed XML document:


There are exactly 3 nodes and 14 distinct points in this document. The nodes are the root node, which contains the GREETING element node, which contains a text node. In order, the points are as follows:

  1. The point before the root node
  2. The point before the GREETING element node
  3. The point before the text node containing the text "Hello" (as well as assorted white space)
  4. The point before the line break between <GREETING> and Hello
  5. The point before the first H in Hello
  6. The point between the H and the e in Hello
  7. The point between the e and the l in Hello
  8. The point between the l and the l in Hello
  9. The point between the l and the o in Hello
  10. The point after the o in Hello
  11. The point after the line break between Hello and </GREETING>
  12. The point after the text node containing the text "Hello"
  13. The point after the GREETING element
  14. The point after the root node

Points allow XPointers to indicate arbitrary positions in the parsed character data of a document. They do not, however, enable pointing at a position in the middle of a tag. In essence, what points add is the ability to break up the text content into smaller nodes, one for each character.

A point is selected by using the string-range() function to select a range, then using the start-point () or end-point() function to extract the first or last point from the range. For example, this XPointer selects the point immediately before the D in Domeniquette Celeste Baudean’s NAME element:

xpointer(start-point(string-range (id('p1')/NAME,"Domeniquette")))

This XPointer selects the point after the last e in Domeniquette:


You can also take the start-point () or end-point () of an element, text, comment, processing instruction, or root node to get the first or last point in that node.


Some applications need to specify a range across a document rather than a particular point in the document. For example, the selection a user makes with a mouse is not necessarily going to match up with any one element or node. It might start in the middle of one paragraph, extend across a heading and a picture, and then end in the middle of another paragraph two pages down.

Any such contiguous area of a document can be described with a range. A range begins at one point and continues until another point. The start point and endpoint are each identified by a location path. If the starting path points to a node-set rather than a point, range-to() will return multiple ranges, one starting from the first point of each node in the set.

To specify a range, you append /range-to(end-point) to a location path specifying the start point of the range. The parentheses contain a location path specifying the endpoint of the range. For example, suppose you want to select everything between the first <PERSON> start-tag and the last</PERSON> end-tag in Listing 18-1. The following XPointer accomplishes that:


Range functions

XPointer includes several functions specifically for working with ranges. Most of these operate on location sets. A location set is just a node-set that can also contain points and ranges, as well as nodes.

The range(location-set) function returns a location set containing one range for each location in the argument. The range is the minimum range necessary to cover the entire location. In essence, this function converts locations to ranges.

The range-inside(location-set) function returns a location set containing the interiors of each of the locations in the input. That is, if one of the locations is an element, the location returned is the content of the element (but not including the start- and end-tags). However, if the input location is a range or point, the interior of the location is just the same as the range or point.

The start-point(location-set) function returns a location set that contains the first point of each location in the input location set. For example, start-point(//PERSON[1]) returns the point immediately after the first <PERSON> start-tag in the document. start-point(//PERSON) returns the set of points immediately after each <PERSON> start-tag.

The end-point(location-set) function acts the same as start-point() except that it returns the points immediately after each location in its input.

String ranges

XPointer provides some very basic string-matching capabilities through the string-range() function. This function takes as an argument a location set to search and a substring to search for. It returns a location set containing one range for each nonoverlapping, matching substring. You can also provide optional index and length arguments indicating how many characters after the match the range should start and how many characters after the start the range should continue. The basic syntax is as follows:

string-range(location-set, substring, index, length)

The first argument is an XPath expression that returns a location set specifying which part of the document to search for a matching string. The second substring, argument is the actual string to search for. By default, the range returned starts before the first matched character and encompasses all the matched characters. However, the index argument can give a positive number to start after the beginning of the match. For example, setting it to 2 indicates that the range starts with the second character after the first matched character. The length argument can specify how many characters to include in the range.

A string range points to an occurrence of a specified string, or a substring of a given string in the text (not markup) of the document. For example, this XPointer finds all occurrences of the string "Harold":


You can change the first argument to specify what nodes you want to look in. For example, this XPointer finds all occurrences of the string "Harold" in NAME elements:


String ranges can have predicates. For example, this XPointer finds only the first occurrence of the string "Harold" in the document:


This targets the position immediately preceding the word Harold in Charles Walter Harold’s NAME element. This is not the same as pointing at the entire NAME element as an element-based selector would do.

A third numeric argument targets a particular position in the string. For example, this targets the point between the l and d in the first occurrence of the string "Harold" because d is the sixth letter:


An optional fourth argument specifies the number of characters to select. For example, this URI selects the old from the first occurrence of the entire string "Harold":


If the first string argument in the node test is the empty string, matching positions in the context node’s text contents are selected. For example, the following XPointer targets the first six characters of the document’s parsed character data:


For another example, suppose that you want to find the year of birth for all people born in the nineteenth century. The following will accomplish that:

xpointer(string-range(//BORN, " 18", 2, 4))

This says to look in all BORN elements for the string " 18". (The initial space is important to avoid accidentally matching someone born in 1918 or on the 18th day of the month.) When it’s found, move one character ahead (to skip the space) and return a range covering the next four characters.

When you are matching strings, case is considered. Markup characters are ignored.

Child Sequences

The two most common ways to identify an element in an XML document are by ID and by location. Identifying an element by ID is accomplished through the id() function or a shorthand pointer. Identifying an element by location is generally accomplished by counting children down from the root. For example, the following XPointers both point to John P. Muller’s PERSON element when applied to Listing 18-1:


A child sequence is a shortcut for XPointers like the second example above—that is, an XPointer that consists of nothing but a series of child location steps counting down from the root node, each of which selects a particular child by position only. The shortcut is to use only the position number and the slashes that separate individual elements from each other, like this:


As with the xpointer() scheme, this becomes the fragment identifier in a URI reference:


/1/4 is a child sequence that selects the fourth child element of the first child element of the root. This syntax can be extended for any depth of child elements. For example, these two URIs point to John P. Muller’s NAME and SPOUSE elements, respectively:


Child sequences might include an initial ID. In that case, the counting begins from the element with that ID rather than from the root. For example, John P. Muller’s PERSON element has an ID attribute with the value p4. Consequently, element(p4/1) points to his NAME element and element(p4/2) points to his SPOUSE element.

Each child sequence always points to a single element. You cannot use child sequences with any other relative location steps. You cannot use them to select elements of a particular type. You cannot use them to select attributes or strings. You can only use them to select a single element by its relative location in the tree.


In this chapter, you learned about XPointers. In particular, you learned the following:

In this chapter, you saw XPointers used in XLinks. In Chapter 19, you’ll see them used in XInclude, the third leg in the XML hypertext tripod. XInclude is an element-based syntax for building large XML documents out of smaller XML documents that are themselves complete, well-formed, possibly valid XML documents. The individual pieces out of which the complete document is built are located via URLs. These URLs can have XPointer parts to indicate that only part of a targeted document should be included in the master document.

From XML 1.1 Bible, by Elliotte Rusty Harold. © 2004 by Wiley Publishing, Inc. All rights reserved. Reproduced here by permission of the publisher.
[ Cafe con Leche | XML Bible Home Page | Order from amazon.com | Publisher Page ]