Hands-On XSLT

Elliotte Rusty Harold

Software Development 2003 West

March 24, 2003

elharo@metalab.unc.edu

http://www.cafeconleche.org/


What Is XSL?


Versions


Part I: Basic XSLT


The Process of an XSL Transformation

  1. The XML parser reads an XML document and forms a tree

  2. The tree is passed to the XSLT processor

  3. The XSLT processor compares the nodes in the tree to the instructions in the style sheet

  4. When the XSLT processor finds a match, it outputs a tree fragment

  5. (Optional) The complete output tree is serialized to some other format such as text, HTML, or an XML file


XSLT Software


Two Example XML Documents


An XSLT Style Sheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Templates will go here... -->

</xsl:stylesheet>


Exercise 1: Running the XSLT Processor

Apply the empty stylesheet to both input documents.

What do you see?


Template Rules

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="NAME">
    <h2><xsl:value-of select="."/></h2>
  </xsl:template>

</xsl:stylesheet>

Exercise 2: Template Rules

Add a template to the stylesheet that puts the ATOMIC_NUMBER in a paragraph (an HTML P element).

If time permits put the other information in paragraphs too.


Literal text

  <xsl:template match="ATOMIC_WEIGHT">
    <p>Atomic Weight: <xsl:value-of select="."/></p>
  </xsl:template>

Exercise 3: Literal Character Data and Markup

Easy

  1. Add human readable labels (e.g. "Atomic weight") to each paragraph

  2. Italicize the atomic symbol.

Medium

Same problem, but do it with CSS instead of an i element.


Adding the root

  <xsl:template match="/">
    <html>
      <body>
        <xsl:value-of select="."/>
      </body>
    </html>
  </xsl:template>

Applying Templates

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>


The select attribute

Select only the NAME of each ATOM:

  <xsl:template match="ATOM">
    <xsl:apply-templates select="NAME"/>
  </xsl:template>

Exercise 4: Applying Templates

Make each atom produce the following format (modulo white space):

<h2>Name [Symbol]</h2>
    <ul>
      <li>Atomic Weight: Atomic Weight</li>
      <li>Atomic Number: Atomic Number</li>
    </ul>

Attributes

  <xsl:template match="DENSITY">
    <li>Density: <xsl:value-of select="."/><xsl:text> </xsl:text><xsl:value-of select="@UNITS"/></li>
  </xsl:template>


Exercise 5: Attributes

Add density, atomic volume, and atomic radius to the information displayed for each atom. Include the units.


Attribute Value Templates
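A minimal sketch of an attribute value template: curly braces inside an attribute of a literal result element are evaluated as an XPath expression and replaced by its string-value. The id attribute on the h2 is an illustrative choice, and the NAME and SYMBOL children are assumed to be those of the periodic table example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="ATOM">
    <!-- {SYMBOL} is replaced by the string-value of the SYMBOL child,
         so an atom whose SYMBOL is H would produce <h2 id="H">...</h2> -->
    <h2 id="{SYMBOL}"><xsl:value-of select="NAME"/></h2>
  </xsl:template>

</xsl:stylesheet>
```

Attribute value templates only work in attributes of literal result elements and in a few XSLT attributes; in select and match attributes the braces are not needed because the value is already an expression or pattern.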



Iteration with xsl:for-each
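A minimal sketch of iteration, assuming the periodic table example with ATOM children under PERIODIC_TABLE. xsl:for-each instantiates its body once for each selected node, and inside the body that node becomes the context node.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="PERIODIC_TABLE">
    <ul>
      <xsl:for-each select="ATOM">
        <!-- The context node is now the current ATOM,
             so NAME means the NAME child of that atom -->
        <li><xsl:value-of select="NAME"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```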



Exercise 6: Iteration

Create hypertext links from the element names in the initial list to the full description of that element later in the document.


Matching Elements in Namespaces


Exercise 7: Matching Elements in Namespaces

Add an xmlns="http://www.example.com/" attribute to the root PERIODIC_TABLE element. Then fix the last stylesheet to work with it.


Part II: XPath


XPath Explorer

screenshot of XPath Explorer

Expressions to Try Out


XPath Data Model


Node Properties


Location steps

Examples:


Axes

There are thirteen axes along which a location step can move. Each selects a different subset of the nodes in the document, depending on the context node. These are:

self

The node itself.

child

All child nodes of the context node. (Attributes and namespaces are not considered to be children of the node they belong to.)

descendant

All nodes completely contained inside the context node (between the end of its start-tag and the beginning of its end-tag); that is, all child nodes, plus all children of the child nodes, and all children of the children’s children, and so forth. This axis is empty if the context node is not an element node or a root node.

descendant-or-self

All descendants of the context node and the context node itself.

parent

The node which most immediately contains the context node. The root node has no parent. The parent of the root element and comments and processing instructions in the document’s prolog and epilog is the root node. The parent of every other node is an element node. The parent of a namespace or attribute node is the element node that contains it, even though namespaces and attributes aren’t children of their parent elements.

ancestor

The root node and all element nodes that contain the context node.

ancestor-or-self

All ancestors of the context node and the context node itself.

preceding

All non-attribute, non-namespace nodes which come before the context node in document order and which are not ancestors of the context node.

preceding-sibling

All non-attribute, non-namespace nodes which come before the context node in document order and have the same parent node.

following

All non-attribute, non-namespace nodes which follow the context node in document order and which are not descendants of the context node.

following-sibling

All non-attribute, non-namespace nodes which follow the context node in document order and have the same parent node.

attribute

Attributes of the context node. This axis is empty if the context node is not an element node.

namespace

Namespaces in scope on the context node. This axis is empty if the context node is not an element node.
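A short sketch of a few of these axes written in unabbreviated form, assuming the periodic table example in which ATOM elements are siblings inside PERIODIC_TABLE. The output wording is purely illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="ATOM">
    <ul>
      <!-- child axis: the NAME child of the context ATOM -->
      <li>Name: <xsl:value-of select="child::NAME"/></li>
      <!-- preceding-sibling axis: ATOM elements earlier under the same parent -->
      <li>Atoms before: <xsl:value-of select="count(preceding-sibling::ATOM)"/></li>
      <!-- following-sibling axis: ATOM elements later under the same parent -->
      <li>Atoms after: <xsl:value-of select="count(following-sibling::ATOM)"/></li>
      <!-- ancestor axis: every containing element plus the root node -->
      <li>Ancestors: <xsl:value-of select="count(ancestor::node())"/></li>
      <!-- attribute axis: all attributes of the context ATOM -->
      <li>Attributes: <xsl:value-of select="count(attribute::*)"/></li>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```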


Node Tests

The axis chooses the direction to move from the context node. The node test determines what kinds of nodes will be selected along that axis. The node tests are:

Name

Any element or attribute with the specified name. If the name is prefixed, then the local name and namespace URI are compared, not the qualified names. If the name is not prefixed, then the element must be in no namespace at all. An unprefixed name in an XPath expression never matches an element in a namespace, even in the default namespace. When using XPath to search for an unprefixed element like ATOM that is in a namespace, you have to use a prefixed name instead such as chem:ATOM. Exactly how the prefix is mapped to the namespace depends on the environment in which the XPath expression is used.
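In XSLT, the mapping comes from the namespace declarations in scope in the stylesheet. A minimal sketch, assuming a hypothetical namespace URI http://namespaces.example.org/chem; the prefix chem is arbitrary and does not have to match whatever prefix (or default namespace declaration) the input document uses.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:chem="http://namespaces.example.org/chem"
                exclude-result-prefixes="chem">

  <!-- Matches ATOM elements in the assumed namespace; only the
       namespace URI and local name are compared, not the prefix -->
  <xsl:template match="chem:ATOM">
    <h2><xsl:value-of select="chem:NAME"/></h2>
  </xsl:template>

</xsl:stylesheet>
```

The exclude-result-prefixes attribute keeps the chem declaration from being copied onto the h2 elements in the output.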

*

Along the attribute axis the asterisk matches all attribute nodes. Along the namespace axis the asterisk matches all namespace nodes. Along all other axes, this matches all element nodes.

prefix:*

Any element or attribute in the namespace mapped to the prefix.

comment()

Any comment

text()

Any text node

node()

Any node

processing-instruction()

Any processing instruction

processing-instruction('target')

Any processing instruction with the specified target
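A short sketch of how different node tests pick different nodes along the same child axis, assuming the periodic table example. The labels in the output are illustrative only.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="ATOM">
    <ul>
      <!-- *: element children only -->
      <li>Child elements: <xsl:value-of select="count(*)"/></li>
      <!-- @*: all attributes (the asterisk on the attribute axis) -->
      <li>Attributes: <xsl:value-of select="count(@*)"/></li>
      <!-- text(): text-node children, which normally include the
           white space between child elements -->
      <li>Text nodes: <xsl:value-of select="count(text())"/></li>
      <!-- comment(): comment children -->
      <li>Comments: <xsl:value-of select="count(comment())"/></li>
      <!-- node(): children of every kind -->
      <li>All child nodes: <xsl:value-of select="count(node())"/></li>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```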


Exercise 8: Axes and Node Tests

Write a stylesheet that lists all the titles in xpath.xml.


Predicates

Examples:
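A few predicates in context, as a sketch that assumes the periodic table example with ATOMIC_NUMBER and DENSITY children. A predicate in square brackets filters the nodes selected by the step it follows.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="PERIODIC_TABLE">
    <ul>
      <!-- A numeric predicate is a position test: the first ATOM child -->
      <li><xsl:value-of select="ATOM[1]/NAME"/></li>
      <!-- The last ATOM child -->
      <li><xsl:value-of select="ATOM[last()]/NAME"/></li>
      <!-- A node-set predicate is true when the set is non-empty:
           atoms whose DENSITY has a UNITS attribute -->
      <li><xsl:value-of select="count(ATOM[DENSITY/@UNITS])"/></li>
      <!-- A comparison: atoms with an atomic number below 10 -->
      <li><xsl:value-of select="count(ATOM[ATOMIC_NUMBER &lt; 10])"/></li>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```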


Exercise 9: Predicates

Easy

The atomic number of uranium, the heaviest naturally occurring element, is 92. Write a stylesheet that lists the names of all the transuranium elements.

Medium

Write a stylesheet that divides the periodic table into two sections, the first containing all the elements with atomic numbers less than or equal to 92, the second all the elements with atomic numbers greater than 92. Put an H2 header in front of each section.


Multistep Location Paths

Examples


Absolute location paths

Examples


Abbreviated location paths

Abbreviation    Expanded form
Name            child::Name
@Name           attribute::Name
//              /descendant-or-self::node()/
.               self::node()
..              parent::node()
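To see that each abbreviation and its expanded form select the same nodes, here is a sketch that evaluates both side by side. It assumes the periodic table example with a UNITS attribute on DENSITY; each pair of paragraphs should come out identical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <body>
        <!-- //ATOM abbreviates /descendant-or-self::node()/child::ATOM -->
        <p><xsl:value-of select="count(//ATOM)"/></p>
        <p><xsl:value-of select="count(/descendant-or-self::node()/child::ATOM)"/></p>
        <xsl:apply-templates select="//DENSITY"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="DENSITY">
    <!-- ../NAME abbreviates parent::node()/child::NAME -->
    <p><xsl:value-of select="../NAME"/></p>
    <p><xsl:value-of select="parent::node()/child::NAME"/></p>
    <!-- @UNITS abbreviates attribute::UNITS -->
    <p><xsl:value-of select="@UNITS"/></p>
    <p><xsl:value-of select="attribute::UNITS"/></p>
    <!-- . abbreviates self::node() -->
    <p><xsl:value-of select="."/></p>
    <p><xsl:value-of select="self::node()"/></p>
  </xsl:template>

</xsl:stylesheet>
```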

Examples


Combining location paths with |

Examples


General Expressions


Data Types


Literals


Number Operators
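XPath 1.0 provides +, -, *, div, and mod, all operating on double-precision floating point numbers. A small sketch with constant operands, so it can be run against any input document:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <ul>
      <!-- 14: * binds more tightly than + -->
      <li><xsl:value-of select="2 + 3 * 4"/></li>
      <!-- 3.5: division is spelled div because / separates location steps -->
      <li><xsl:value-of select="7 div 2"/></li>
      <!-- 1: remainder after division -->
      <li><xsl:value-of select="7 mod 2"/></li>
      <!-- 6: subtraction -->
      <li><xsl:value-of select="10 - 4"/></li>
      <!-- 2: unary minus -->
      <li><xsl:value-of select="-(3 - 5)"/></li>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

Because a hyphen is a legal name character, it is safest to put white space around the minus sign: A-B is a single element name, while A - B is a subtraction.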


Exercise 10: Arithmetic

The boiling points in the periodic table document are given in kelvins.

Write a stylesheet which produces a table of boiling points for each element in degrees Celsius and degrees Fahrenheit.

Celsius = Kelvin - 273.15.

Fahrenheit = (9/5 * Celsius) + 32


Boolean Operators

<       less than
>       greater than
<=      less than or equal to
>=      greater than or equal to
=       boolean equals (not an assignment statement as in Java)
!=      not equal to
or      Boolean or
and     Boolean and

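In an XSLT stylesheet these expressions appear inside attribute values, so < must always be escaped as &lt;. The > character may be written literally, though escaping it as &gt; is also legal.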
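A minimal sketch of escaped comparison operators inside a test attribute, assuming the periodic table example with an ATOMIC_NUMBER child on each ATOM. The range 3 to 10 is an arbitrary illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="ATOM">
    <!-- The less-than sign has to be written as &lt; in an attribute value;
         the greater-than sign is escaped here only for symmetry -->
    <xsl:if test="ATOMIC_NUMBER &gt;= 3 and ATOMIC_NUMBER &lt;= 10">
      <p><xsl:value-of select="NAME"/></p>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```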


Exercise 11: Boolean Operators

Write a stylesheet that lists the names of all elements with a boiling point somewhere around room temperature (0 degrees Fahrenheit to 100 degrees Fahrenheit).

Fahrenheit = (9/5 * Celsius) + 32

Celsius = (5/9) * (Fahrenheit - 32)


Functions


Node-set Functions

number last()

Returns the number of nodes in the context node list. This is the same as the position of the last node in the list.

number position()

Returns the position of the context node in the context node list. The first node has position 1, not 0.

number count(node-set)

Returns the number of nodes in the argument node-set.

node-set id(object)

Returns a node-set containing the single element node with the specified id as determined by an ID-type attribute. If no node has the specified ID, then this function returns an empty node-set. If the argument is a node-set, then it returns a node-set containing all the element nodes whose ID matches the string-value of any of the nodes in the argument node-set.

string local-name(node-set?)

Returns the local name of the first node in the argument node-set, or the local name of the context node if the argument is omitted. It returns an empty string if the relevant node does not have a local name (i.e. it’s a comment, root, or text node.)

string namespace-uri(node-set?)

Returns the namespace name of the first node in the argument node-set, or the namespace name of the context node if the argument is omitted. It returns an empty string if the node is an element or attribute that is not in a namespace. It also returns an empty string if namespace names don’t apply to this node (i.e. it’s a comment, processing instruction, root, or text node.)

string name(node-set?)

Returns the full, prefixed name of the first node in the argument node-set, or the name of the context node if the argument is omitted. It returns the empty string if the relevant node does not have a name (e.g. it’s a comment or text node.)
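A sketch of several node-set functions working together, assuming the periodic table example. Inside xsl:for-each, the context node list is the list of selected ATOM elements, so position() and last() count within that list.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="PERIODIC_TABLE">
    <!-- count() takes an explicit node-set argument -->
    <p>Total atoms: <xsl:value-of select="count(ATOM)"/></p>
    <xsl:for-each select="ATOM">
      <p>
        <!-- position() runs from 1 up to last() -->
        <xsl:value-of select="position()"/> of <xsl:value-of select="last()"/>:
        <xsl:value-of select="NAME"/>
        <!-- name() with no argument names the context node, here ATOM -->
        (<xsl:value-of select="name()"/>)
      </p>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```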


Exercise 12: Node-set Functions

A paragraph is represented by a para element.

Easy

Write a stylesheet that extracts the first paragraph of each sect1 in xpath.xml and outputs it in an HTML document.

Medium

Write a stylesheet that extracts the first paragraph of the document and each sect1 in xpath.xml.

Hard

Write a stylesheet that extracts the first paragraph of the document itself and each separate section of the document xpath.xml. Furthermore the title of each section should be replaced by the equivalent HTML header element. That is, sect1/title --> h1, sect2/title --> h2, etc.


Number Functions

XPath includes five functions that operate on numbers:
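In XPath 1.0 these are number(), sum(), floor(), ceiling(), and round(). A small sketch of each; only the sum() line depends on the input, and it assumes the periodic table example with ATOMIC_NUMBER children.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <ul>
      <!-- number() converts its argument to a number: 5.5 -->
      <li><xsl:value-of select="number('4.5') + 1"/></li>
      <!-- sum() adds the numeric values of every node in a node-set -->
      <li><xsl:value-of select="sum(//ATOM/ATOMIC_NUMBER)"/></li>
      <!-- floor() rounds down: 2 -->
      <li><xsl:value-of select="floor(2.7)"/></li>
      <!-- ceiling() rounds up: 3 -->
      <li><xsl:value-of select="ceiling(2.2)"/></li>
      <!-- round() picks the nearest integer; ties go toward
           positive infinity, so this is 3 -->
      <li><xsl:value-of select="round(2.5)"/></li>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```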


Exercise 13: Number Formatting

Celsius = Kelvin - 273.15.

Fahrenheit = (9/5 * Celsius) + 32

Easy

Rewrite the boiling points table of exercise 10 to use two decimal digits of precision for each value in the table.

Hard

Rewrite the boiling points table to use the same number of significant digits in the output as are provided in the input.


String functions

string string(object?)

This function returns the string-value of the argument. If the argument is a node-set, then it returns the string-value of the first node in the set. If the argument is omitted, it returns the string-value of the context node.

string concat(string, string, string...)

This function returns a string containing the concatenation of all its arguments.

boolean starts-with(string, string)

This function returns true if the first string starts with the second string. Otherwise it returns false.

boolean contains(string, string)

This function returns true if the first string contains the second string. Otherwise it returns false.

string substring-before(string, string)

This returns that part of the first string that precedes the second string. It returns the empty string if the second string is not a substring of the first string. If the second string appears multiple times in the first string, then this returns the portion of the first string before the first appearance of the second string.

string substring-after(string, string)

This returns that part of the first string that follows the second string. It returns the empty string if the second string is not a substring of the first string. If the second string appears multiple times in the first string, then this returns the portion of the first string after the initial appearance of the second string.

string substring(string, number, number?)

This returns the substring of the first argument beginning at the second argument and continuing for the number of characters specified by the third argument (or until the end of the string if the third argument is omitted.) Unlike Java, the first character is at position 1, not 0.

number string-length(string?)

Returns the number of Unicode characters in the string, or in the string-value of the context node if the argument is omitted. This may not be the same as the number returned by the length() method in Java’s String class because XSLT counts characters and Java counts UTF-16 code units.

string normalize-space(string?)

This function strips all leading and trailing white-space from its argument, or the string-value of the context node if the argument is omitted, and condenses all other runs of whitespace to a single space. It’s very useful in XML documents where whitespace is used primarily for formatting.

string translate(string, string, string)

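This function replaces all characters in the first string that are found in the second string with the corresponding character from the third string. A character in the second string that has no counterpart in the third string (because the third string is shorter) is deleted from the result.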
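A sketch that exercises most of these functions on string literals, so it can be run against any input document. The sample strings are arbitrary; the comments give the value each call produces.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <ul>
      <!-- Atomic weight -->
      <li><xsl:value-of select="concat('Atomic', ' ', 'weight')"/></li>
      <!-- true -->
      <li><xsl:value-of select="starts-with('Hydrogen', 'Hy')"/></li>
      <!-- true -->
      <li><xsl:value-of select="contains('Hydrogen', 'gen')"/></li>
      <!-- 1999: the text before the first hyphen -->
      <li><xsl:value-of select="substring-before('1999-12-31', '-')"/></li>
      <!-- 12-31: the text after the first hyphen -->
      <li><xsl:value-of select="substring-after('1999-12-31', '-')"/></li>
      <!-- Hydro: five characters starting at position 1 -->
      <li><xsl:value-of select="substring('Hydrogen', 1, 5)"/></li>
      <!-- 8 -->
      <li><xsl:value-of select="string-length('Hydrogen')"/></li>
      <!-- Atomic weight: outer space removed, inner runs collapsed -->
      <li><xsl:value-of select="normalize-space('  Atomic   weight ')"/></li>
      <!-- HYDROGEN: a common idiom for upper-casing ASCII letters -->
      <li><xsl:value-of select="translate('hydrogen',
            'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/></li>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```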


Exercise 14: String Manipulation

Write a stylesheet that extracts the first sentence of each paragraph in xpath.xml.

A paragraph is represented by a para element.

Each sentence ends with a period followed by at least one white space character or the end of the paragraph.


Boolean Functions

boolean boolean(object)

Converts the argument to a boolean in a mostly sensible way. NaN and 0 are false. All other numbers are true. Empty strings are false. All other strings are true. Empty node-sets are false. All other node-sets are true.

boolean not(boolean)

This function turns true into false and false into true.

boolean true()

This function always returns true. It’s necessary because XPath does not have any boolean literals.

boolean false()

This function always returns false. It’s necessary because XPath does not have any boolean literals.

boolean lang(string)

This function returns true if the context node is written in the language specified by the argument. The language of the context node is determined by the currently in-scope xml:lang attribute. If there is no such attribute, this function returns false.
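A brief sketch of not() and lang() in xsl:if tests, assuming the periodic table example. Whether any atom lacks a DENSITY child or carries an xml:lang attribute depends on the input, so both conditions are illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="ATOM">
    <!-- A node-set converts to true when it is non-empty, so
         not(DENSITY) is true for atoms with no DENSITY child -->
    <xsl:if test="not(DENSITY)">
      <p><xsl:value-of select="NAME"/>: no density recorded</p>
    </xsl:if>
    <!-- lang('en') is true only when an xml:lang attribute in scope
         on the context node is en or a sublanguage such as en-US -->
    <xsl:if test="lang('en')">
      <p><xsl:value-of select="NAME"/>: name given in English</p>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```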


Match patterns vs. XSLT Expressions


Part III: More XSLT


The Default Template Rules

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

<xsl:template match="processing-instruction()|comment()"/>

Testing conditions with xsl:if

  <xsl:if test="expression">
    <!-- Instantiated only when the expression evaluates to true -->
  </xsl:if>

Exercise 15: Conditional Output

Revise the boiling point table of exercise 10 so that elements whose boiling point is unknown are not included in the table.

Harder: Rewrite the boiling points table to use the same number of significant digits in the output as are provided in the input.

Celsius = Kelvin - 273.15.

Fahrenheit = (9/5 * Celsius) + 32


xsl:choose
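xsl:choose picks the first xsl:when whose test is true and falls back to xsl:otherwise if none is. A minimal sketch, assuming the periodic table example with an ATOMIC_NUMBER child; the period labels are just an illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="ATOM">
    <p>
      <xsl:value-of select="NAME"/>
      <xsl:text>: </xsl:text>
      <xsl:choose>
        <!-- Tests are tried in order; only the first true one is used -->
        <xsl:when test="ATOMIC_NUMBER &lt;= 2">first period</xsl:when>
        <xsl:when test="ATOMIC_NUMBER &lt;= 10">second period</xsl:when>
        <!-- Used when no xsl:when matched -->
        <xsl:otherwise>later period</xsl:otherwise>
      </xsl:choose>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Unlike xsl:if, xsl:choose provides an else branch, which is why it is the usual choice for anything beyond a single condition.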


Exercise 16: Selection

Easy

List the names of the elements in four different colors: red for liquids, blue for gases, black for solids, and green for unknown.

Medium

Same problem, but do it with CSS instead of a font element.

Hard

Divide the list of elements into four sections, one each for gases, liquids, solids, and unknown states.


Sorting with xsl:sort

       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
       </xsl:apply-templates>

Multiple Key Sorts

       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
         <xsl:sort select="name/first_name"/>
         <xsl:sort select="name/middle_name"/>
       </xsl:apply-templates>

Sort Options

       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name" order="ascending" lang="en" data-type="text"/>
         <xsl:sort select="name/first_name"/>
         <xsl:sort select="name/middle_name"/>
       </xsl:apply-templates>

Exercise 17: Sorting

Starting with the complete information document from Exercise 6:

  1. Sort the table of contents alphabetically

  2. Sort the element sections by atomic number


Numbering Output

  <xsl:template match="PERIODIC_TABLE">
    <body>
      <xsl:for-each select="ATOM">
        <xsl:sort select="NAME"/>
        <xsl:number value="position()"/>. <a href="#{SYMBOL}">
        <xsl:value-of select="NAME"/></a><br />
      </xsl:for-each>
      <xsl:apply-templates select="ATOM">
        <xsl:sort select="ATOMIC_NUMBER" data-type="number"/>
      </xsl:apply-templates>
    </body>
  </xsl:template>


Number Options
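The format, grouping-separator, and grouping-size attributes of xsl:number control how the number is written. A sketch of a few common settings, assuming the periodic table example; the comments show the sequence each format produces for positions 1, 2, 3.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="PERIODIC_TABLE">
    <xsl:for-each select="ATOM">
      <p>
        <!-- 01, 02, 03: zero-padded to two digits -->
        <xsl:number value="position()" format="01"/>
        <xsl:text> </xsl:text>
        <!-- a, b, c: lowercase letters -->
        <xsl:number value="position()" format="a"/>
        <xsl:text> </xsl:text>
        <!-- I, II, III: uppercase Roman numerals -->
        <xsl:number value="position()" format="I"/>
        <xsl:text> </xsl:text>
        <xsl:value-of select="NAME"/>
      </p>
    </xsl:for-each>
    <!-- 1,234,567: digits grouped in threes with a comma -->
    <p><xsl:number value="1234567" grouping-separator="," grouping-size="3"/></p>
  </xsl:template>

</xsl:stylesheet>
```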


Part IV: XSLT in Practice


Where Does the Transformation Happen?

There are three primary ways XML documents are transformed into other formats, such as HTML, with an XSLT style sheet:


Client Side Processing

<?xml version="1.0"?>
<?xml-stylesheet type="text/xml" href="compositions.xsl"?>

Exercise 18: Client Side Transformation

Easy

Load a styled periodic table into Mozilla

Medium

Load a styled periodic table document into Internet Explorer 6.0 and Mozilla

Hard

Load a styled periodic table document into Internet Explorer 5.0, Internet Explorer 6.0, Mozilla

What do you see?


What else does XSLT have?


What does XSLT not have?


Summary


To Learn More


Copyright 2002, 2003 Elliotte Rusty Harold
Elliotte Rusty Harold
Last Modified March 25, 2003