Intro to XML



Elliotte Rusty Harold

Software Development 2001 West

Sunday, April 8, 2001

elharo@metalab.unc.edu

http://www.ibiblio.org/xml/


Outline


Part I: XML Basics


What is XML?


Extensible Markup Language

Language
Markup Language
Extensible

XML is a Meta Markup Language


XML Applications


Some XML Applications


XML describes structure and semantics, not formatting


A Song Description in HTML

<dt>Hot Cop
<dd> by Jacques Morali, Henri Belolo, and Victor Willis
<ul>
<li>Producer: Jacques Morali
<li>Publisher: PolyGram Records
<li>Length: 6:20
<li>Written: 1978
<li>Artist: Village People
</ul>

A Song Description in XML

<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
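
Because the markup names each piece of data, a program can pull fields out by name instead of scraping formatted text. A minimal sketch in Python using the standard library's xml.etree.ElementTree; the variable names are illustrative and not part of the original slides:

```python
import xml.etree.ElementTree as ET

# The song document from the slide above, held in a string for convenience.
SONG_XML = """<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>"""

song = ET.fromstring(SONG_XML)

# Fields are addressed by element name, not by position or formatting.
title = song.findtext("TITLE")
year = int(song.findtext("YEAR"))
composers = [c.text for c in song.findall("COMPOSER")]

print(title, year, composers)
```

The HTML version offers no comparable handle: the composers are buried in one line of prose after the dd tag.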

Editing and Saving XML Files


Style Sheets provide formatting

SONG {display: block; font-family: "New York", "Times New Roman", serif}
TITLE {display: block; font-size: 24pt; 
       font-weight: bold; font-family: Helvetica, sans-serif}
COMPOSER {display: block}
PRODUCER {display: block}
YEAR {display: block}
PUBLISHER {display: block}
LENGTH {display: block}
ARTIST {display: block; font-style: italic}

Attaching style sheets to documents

<?xml-stylesheet type="text/css" href="song.css"?>
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>


Style Sheet Languages


An XSLT stylesheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <head><title>Song</title></head>
      <body>
        <xsl:apply-templates select="SONG"/>    
      </body>
    </html>
  </xsl:template>
  
  <xsl:template match="SONG">
    <h1>
      <xsl:value-of select="TITLE"/> 
      by the 
      <xsl:value-of select="ARTIST"/>
    </h1>
    
    <ul>
      <li>Length: <xsl:value-of select="LENGTH"/></li>
      <li>Producer: <xsl:value-of select="PRODUCER"/></li>
      <li>Publisher: <xsl:value-of select="PUBLISHER"/></li>
      <li>Year: <xsl:value-of select="YEAR"/></li>
      <xsl:apply-templates select="COMPOSER"/>
    </ul>
  </xsl:template>

  <xsl:template match="COMPOSER">
    <li>Composer: <xsl:value-of select="."/></li>
  </xsl:template>

</xsl:stylesheet>

Transforming the Document

D:\fundamentals\examples> saxon hotcop.xml song3.xsl
<html>
<head>
<title>Song</title>
</head>
<body>
<h1>Hot Cop
      by the
      Village People</h1>
<ul>
<li>Length: 6:20</li>
<li>Producer: Jacques Morali</li>
<li>Publisher: PolyGram Records</li>
<li>Year: 1978</li>
<li>Composer: Jacques Morali</li>
<li>Composer: Henri Belolo</li>
<li>Composer: Victor Willis</li>
</ul>
</body>
</html>

Or alternatively:

% java com.icl.saxon.StyleSheet -x org.apache.xerces.parsers.SAXParser hotcop.xml song3.xsl
<html>
...



CSS or XSL?


Well-formedness

Rules:


Validity

To be valid, an XML document must

  1. Be well-formed

  2. Have a Document Type Definition (DTD)

  3. Comply with the constraints specified in the DTD


A DTD for Songs

<!ELEMENT SONG (TITLE, COMPOSER+, PRODUCER*, PUBLISHER*, 
                 LENGTH?, YEAR?, ARTIST+)>

<!ELEMENT TITLE (#PCDATA)>

<!ELEMENT COMPOSER (#PCDATA)>
<!ELEMENT PRODUCER (#PCDATA)>
<!ELEMENT PUBLISHER (#PCDATA)>
<!ELEMENT LENGTH (#PCDATA)>
<!-- This should be a four digit year like "1999",
     not a two-digit year like "99" -->
<!ELEMENT YEAR (#PCDATA)>

<!ELEMENT ARTIST (#PCDATA)>
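
The SONG content model reads much like a regular expression over child element names: a comma means "followed by", + means one or more, * means zero or more, and ? means optional. A rough sketch of that idea in Python follows. It is a hand translation for illustration only, not a validating parser; it checks child order and ignores everything else a DTD can say, and the function name is made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the SONG content model from the DTD above:
# (TITLE, COMPOSER+, PRODUCER*, PUBLISHER*, LENGTH?, YEAR?, ARTIST+)
# Each child name is followed by a comma so names cannot run together.
SONG_MODEL = re.compile(
    r"TITLE,(COMPOSER,)+(PRODUCER,)*(PUBLISHER,)*(LENGTH,)?(YEAR,)?(ARTIST,)+"
)

def matches_song_model(xml_text):
    """Return True if the root is SONG and its children fit the content model."""
    root = ET.fromstring(xml_text)
    if root.tag != "SONG":
        return False
    child_names = "".join(child.tag + "," for child in root)
    return SONG_MODEL.fullmatch(child_names) is not None

good = ("<SONG><TITLE>Hot Cop</TITLE><COMPOSER>Jacques Morali</COMPOSER>"
        "<ARTIST>Village People</ARTIST></SONG>")
# No COMPOSER, so COMPOSER+ cannot be satisfied.
bad = "<SONG><TITLE>Hot Cop</TITLE><ARTIST>Village People</ARTIST></SONG>"

print(matches_song_model(good))
print(matches_song_model(bad))
```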

A Valid Song Document

<?xml version="1.0"?>
<!DOCTYPE SONG SYSTEM "song.dtd">
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

Checking Validity

To check validity, pass the document through a validating parser, which reports any errors it finds. For example:

% java dom.DOMCount -v validhotcop.xml
[Error] validhotcop.xml:13:9: The content of element type "SONG" must match "(TI
TLE,COMPOSER+,PRODUCER*,PUBLISHER*,LENGTH?,YEAR?)".
validhotcop.xml: 550 ms (10 elems, 0 attrs, 28 spaces, 98 chars)

A valid document:

% java dom.DOMCount -v validhotcop.xml
validhotcop.xml: 291 ms (10 elems, 0 attrs, 28 spaces, 98 chars)

A More Complex Example

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/css" href="song.css"?>
<!DOCTYPE SONG SYSTEM "expanded_song.dtd">
<SONG xmlns="http://www.ibiblio.org/xml/namespace/song"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <TITLE>Hot Cop</TITLE>
  <PHOTO 
    xlink:type="simple" xlink:show="embed" xlink:href="hotcop.jpg"
    ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <!-- The publisher is actually Polygram but I needed 
       an example of a general entity reference. -->
  <PUBLISHER xlink:type="simple" xlink:href="http://www.amrecords.com/">
    A &amp; M Records
  </PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
<!-- You can tell what album I was 
     listening to when I wrote this example -->

The XML Declaration

<?xml version="1.0" encoding="UTF-8" standalone="no"?>


Attributes

<PHOTO xlink:type="simple" xlink:show="embed" xlink:href="hotcop.jpg" ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200" />


Empty Element Tags

<PHOTO xlink:type="simple" xlink:show="embed" xlink:href="hotcop.jpg" ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200" />


Comments

<!-- You can tell what album I was listening to when I wrote this example -->


Namespaces

<SONG xmlns="http://www.ibiblio.org/xml/namespace/song"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <TITLE>Hot Cop</TITLE>
  <PHOTO 
    xlink:type="simple" xlink:show="embed" xlink:href="hotcop.jpg"
    ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>Jacques Morali</COMPOSER>
  <PUBLISHER xlink:type="simple" xlink:href="http://www.amrecords.com/">
    A &amp; M Records
  </PUBLISHER>
  <ARTIST>Village People</ARTIST>
</SONG>
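
To a namespace-aware parser, a name is the pair of namespace URI and local name; the prefix is only shorthand. A small sketch using Python's xml.etree.ElementTree, which writes that pair as {URI}local-name. The document is a shortened copy of the one above and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

SONG_NS = "http://www.ibiblio.org/xml/namespace/song"
XLINK_NS = "http://www.w3.org/1999/xlink"

doc = """<SONG xmlns="http://www.ibiblio.org/xml/namespace/song"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <TITLE>Hot Cop</TITLE>
  <PUBLISHER xlink:type="simple" xlink:href="http://www.amrecords.com/">
    A &amp; M Records
  </PUBLISHER>
</SONG>"""

root = ET.fromstring(doc)

# ElementTree expands every element name to "{namespace-URI}local-name".
print(root.tag)

# Unprefixed child elements inherit the default namespace.
publisher = root.find(f"{{{SONG_NS}}}PUBLISHER")

# Prefixed attributes carry the URI bound to their prefix.
href = publisher.get(f"{{{XLINK_NS}}}href")
print(href)

# The parser has already replaced &amp; with a plain ampersand.
print(publisher.text.strip())
```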

Entity References

A &amp; M Records


A More Complex DTD

<!ELEMENT SONG (TITLE, PHOTO?, COMPOSER+, PRODUCER*, 
 PUBLISHER*, LENGTH?, YEAR?, ARTIST+)>
<!ATTLIST SONG xmlns       CDATA #REQUIRED
               xmlns:xlink CDATA #REQUIRED>
<!ELEMENT TITLE (#PCDATA)>

<!ELEMENT PHOTO EMPTY>
<!ATTLIST PHOTO xlink:type CDATA #FIXED "simple"
                xlink:href CDATA #REQUIRED
                xlink:show CDATA #IMPLIED
                ALT        CDATA #REQUIRED
                WIDTH      CDATA #REQUIRED
                HEIGHT     CDATA #REQUIRED
>

<!ELEMENT COMPOSER (#PCDATA)>
<!ELEMENT PRODUCER (#PCDATA)>
<!ELEMENT PUBLISHER (#PCDATA)>
<!ATTLIST PUBLISHER xlink:type CDATA #IMPLIED
                    xlink:href CDATA #IMPLIED
>

<!ELEMENT LENGTH (#PCDATA)>
<!-- This should be a four digit year like "1999",
     not a two-digit year like "99" -->
<!ELEMENT YEAR (#PCDATA)>

<!ELEMENT ARTIST (#PCDATA)>

What is XML used for?


Domain-Specific Markup Languages


Self-Describing Data


An XML Fragment

<PERSON ID="p1100" SEX="M">
  <NAME>
    <GIVEN>Judson</GIVEN>
    <SURNAME>McDaniel</SURNAME>
  </NAME>
  <BIRTH>
    <DATE>21 Feb 1834</DATE>
  </BIRTH>
  <DEATH>
    <DATE>9 Dec 1905</DATE>
  </DEATH>
</PERSON>

Interchange of Data Among Applications


Example XML Applications


Mathematical Markup Language

<?xml version="1.0"?>
<html xmlns="http://www.w3.org/TR/REC-html40"
      xmlns:m="http://www.w3.org/TR/REC-MathML/"
>
<head>
<title>Fiat Lux</title>
<meta name="GENERATOR" content="amaya V1.3b" />
</head>
<body>

<P>
And God said,
</P>

<m:math>
  <m:mrow>
    <m:msub>
      <m:mi>&delta;</m:mi>
      <m:mi>&alpha;</m:mi>
    </m:msub>
    <m:msup>
      <m:mi>F</m:mi>
      <m:mi>&alpha;&beta;</m:mi>
    </m:msup>
    <m:mi></m:mi>
    <m:mo>=</m:mo>
    <m:mi></m:mi>
    <m:mfrac>
      <m:mrow>
        <m:mn>4</m:mn>
        <m:mi>&pi;</m:mi>
      </m:mrow>
      <m:mi>c</m:mi>
    </m:mfrac>
    <m:mi></m:mi>
    <m:msup>
      <m:mi>J</m:mi>
      <m:mrow>
        <m:mi>&beta;</m:mi>
        <m:mo></m:mo>
      </m:mrow>
    </m:msup>
  </m:mrow>
</m:math>

<P>
and there was light
</P>
</body>
</html>

Channel Definition Format

<?xml version="1.0"?>
<CHANNEL HREF="http://www.ibiblio.org/xml/index.html">
  <TITLE>Cafe con Leche</TITLE>
  <ITEM HREF="http://www.ibiblio.org/xml/books.html">
    <TITLE>Books about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://www.ibiblio.org/xml/tradeshows.html">
    <TITLE>Trade shows and conferences about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://www.ibiblio.org/xml/lists.htm">
    <TITLE>Mailing Lists dedicated to XML</TITLE>
  </ITEM>
</CHANNEL>

Classic Literature


Vector Graphics

A VML document

The Resource Description Framework (RDF)


An Example of RDF

<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/DC/">
  <rdf:Description about="http://www.ibiblio.org/xml/">
    <dc:CREATOR>Elliotte Rusty Harold</dc:CREATOR>
    <dc:TITLE>Cafe con Leche</dc:TITLE>
  </rdf:Description>
</rdf:RDF>

File Formats, in-house applications, and other behind the scenes uses


XML for XML


XSL: The Extensible Stylesheet Language


An XML document

<?xml version="1.0"?>
<?xml-stylesheet type="text/xml" href="17-2.xsl"?>
<PERIODIC_TABLE>

  <ATOM STATE="GAS">
    <NAME>Hydrogen</NAME>
    <SYMBOL>H</SYMBOL>
    <ATOMIC_NUMBER>1</ATOMIC_NUMBER>
    <ATOMIC_WEIGHT>1.00794</ATOMIC_WEIGHT>
    <BOILING_POINT UNITS="Kelvin">20.28</BOILING_POINT>
    <MELTING_POINT UNITS="Kelvin">13.81</MELTING_POINT>
    <DENSITY UNITS="grams/cubic centimeter">
      <!-- At 300K, 1 atm -->
      0.0000899
    </DENSITY>
  </ATOM>

  <ATOM STATE="GAS">
    <NAME>Helium</NAME>
    <SYMBOL>He</SYMBOL>
    <ATOMIC_NUMBER>2</ATOMIC_NUMBER>
    <ATOMIC_WEIGHT>4.0026</ATOMIC_WEIGHT>
    <BOILING_POINT UNITS="Kelvin">4.216</BOILING_POINT>
    <MELTING_POINT UNITS="Kelvin">0.95</MELTING_POINT>
    <DENSITY UNITS="grams/cubic centimeter"><!-- At 300K -->
      0.0001785
    </DENSITY>
  </ATOM>

</PERIODIC_TABLE>

An XSLT style sheet that converts to XSL-FO

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:output indent="yes"/>

  <xsl:template match="/">
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

      <fo:layout-master-set>
        <fo:simple-page-master master-name="only">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>

      <fo:page-sequence master-name="only">

        <fo:flow flow-name="xsl-region-body">
          <xsl:apply-templates select="//ATOM"/>
        </fo:flow>

      </fo:page-sequence>

    </fo:root>
  </xsl:template>

  <xsl:template match="ATOM">
    <fo:block font-size="20pt" font-family="serif"
              line-height="30pt">
      <xsl:value-of select="NAME"/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>

The XSL-FO Output

<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <fo:layout-master-set>
    <fo:simple-page-master master-name="only">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <fo:page-sequence master-name="only">

    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="20pt" font-family="serif"
                line-height="30pt">
        Hydrogen
      </fo:block>
      <fo:block font-size="20pt" font-family="serif"
                line-height="30pt" >
        Helium
      </fo:block>
    </fo:flow>

  </fo:page-sequence>

</fo:root>

The PDF Result

W3C XML Schemas


XML Hypertext

Linking in XML is divided into multiple parts:


XML Hypertext Example

<?xml version="1.0"?>
<story date="January 9, 2001"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xmlns:xinclude="http://www.w3.org/1999/XML/xinclude"
       xml:base="http://www.cafeaulait.org/">

  <p>
    The W3C XML Linking Working Group has pushed the 
    <link xlink:type="simple"
      xlink:href="http://www.w3.org/TR/2001/WD-xptr-20010108">
      XPointer specification
    </link> 
    back to working draft status. The specific issue that was 
    uncovered during Candidate Recommendation was some 
    <link xlink:type="simple"
      xlink:href="http://www.w3.org/TR/xptr#xpointer(//div[@class='div3'][7])">
      confusion
    </link> 
    over how to integrate XPointers, particularly those in non-XML documents, 
    with namespaces. 
   </p>

   <p>
     It's also come to light in this draft that Sun has 
     <link xlink:type="simple"
      xlink:href=
      "http://lists.w3.org/Archives/Public/www-xml-linking-comments/2000OctDec/0092.html"
      >
      claimed a patent</link> on some of the technologies needed to 
      implement XPointer. I think this is particularly offensive because Eve 
      L. Maler, a Sun employee, served as co-chair of the XML Linking 
      Working Group and a co-editor of the XPointer specification. As usual 
      Sun wants to use this as a club to lock implementers and users into a 
      licensing agreement that goes beyond what Sun and the W3C could 
      otherwise demand. The specific patent is <cite>United States Patent 
      No. 5,659,729, Method and system for implementing hypertext scroll 
      attributes</cite>, issued to Jakob Nielsen in 1997. The patent was 
      filed on February 1, 1996. It claims:
  </p>
  <blockquote>
    <xinclude:include 
      href=
      "http://www.delphion.com/details?&amp;pn=US05659729__#xpointer(//abstract)"
      >
    </xinclude:include>
  </blockquote>
  
</story>

XLinks: The Extensible Linking Language

<footnote xlink:type="simple" xlink:href="footnote7.xml">7</footnote>

Extended Links


Extended Link Example

<?xml version="1.0"?>
<WEBSITE xmlns:xlink="http://www.w3.org/1999/xlink" 
         xlink:type="extended" xlink:title="Cafe au Lait">
         
  <NAME xlink:type="resource" xlink:label="source">
    Cafe au Lait
  </NAME>

  <HOMESITE xlink:type="locator" 
            xlink:href="http://ibiblio.org/javafaq/"
            xlink:label="us"/>
  
  <MIRROR xlink:type="locator" 
          xlink:title="Cafe au Lait Swedish Mirror"
          xlink:label="se"
          xlink:href="http://sunsite.kth.se/javafaq"/>
  
  <MIRROR xlink:type="locator" 
          xlink:title="Cafe au Lait German Mirror"
          xlink:label="de"
          xlink:href="http://sunsite.informatik.rwth-aachen.de/javafaq/"/>
  
  <MIRROR xlink:type="locator" 
          xlink:title="Cafe au Lait Swiss Mirror"
          xlink:label="ch"
          xlink:href="http://sunsite.cnlab-switch.ch/javafaq/"/>
  
  <CONNECTION xlink:type="arc" xlink:from="source" 
              xlink:to="ch"    xlink:show="replace" 
              xlink:actuate="onRequest"/>
  <CONNECTION xlink:type="arc" xlink:from="source" 
              xlink:to="us"    xlink:show="replace" 
              xlink:actuate="onRequest"/>
  <CONNECTION xlink:type="arc" xlink:from="source" 
              xlink:to="se"    xlink:show="replace" 
              xlink:actuate="onRequest"/>
  <CONNECTION xlink:type="arc" xlink:from="source" 
              xlink:to="de"    xlink:show="replace" 
              xlink:actuate="onRequest"/>
  
</WEBSITE>
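
An extended link is just data: locators and resources declare labels, and arcs connect one label to another. The sketch below, in Python with the standard library, reads a cut-down version of the link above, lists where each arc leads, and reports any arc that names a label no participant declares. The variable names are illustrative, and this is not a full XLink processor.

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"

# A cut-down version of the extended link above (two locators, two arcs).
doc = """<WEBSITE xmlns:xlink="http://www.w3.org/1999/xlink"
         xlink:type="extended" xlink:title="Cafe au Lait">
  <NAME xlink:type="resource" xlink:label="source">Cafe au Lait</NAME>
  <HOMESITE xlink:type="locator" xlink:label="us"
            xlink:href="http://ibiblio.org/javafaq/"/>
  <MIRROR xlink:type="locator" xlink:label="se"
          xlink:href="http://sunsite.kth.se/javafaq"/>
  <CONNECTION xlink:type="arc" xlink:from="source" xlink:to="us"/>
  <CONNECTION xlink:type="arc" xlink:from="source" xlink:to="se"/>
</WEBSITE>"""

link = ET.fromstring(doc)

# Collect every declared label; map locator labels to their URIs.
labels = set()
targets = {}
for child in link:
    kind = child.get(XLINK + "type")
    if kind in ("locator", "resource"):
        label = child.get(XLINK + "label")
        labels.add(label)
        if kind == "locator":
            targets[label] = child.get(XLINK + "href")

# Each arc is a (from, to) pair of labels.
arcs = [(c.get(XLINK + "from"), c.get(XLINK + "to"))
        for c in link if c.get(XLINK + "type") == "arc"]

# Arcs whose from or to label is not declared by any participant.
dangling = [arc for arc in arcs
            if arc[0] not in labels or arc[1] not in labels]

for source, dest in arcs:
    print(source, "->", targets.get(dest))
print("dangling arcs:", dangling)
```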

Diagram of an Extended Link

An extended link with arcs

XInclude

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE book SYSTEM "book.dtd" >
<book xmlns:xlink="http://www.w3.org/1999/xlink"
 xmlns:xinclude="http://www.w3.org/1999/XML/xinclude">
  <title>The Java Developer's Resource</title>
    <last_modified>December 3, 2000</last_modified>
    
<xinclude:include href="getting_started.xml"/>
<xinclude:include href="procedural_java.xml"/>
</book>

Non-XML for XML


XPath

descendant::language[position()=2]
/child::spec/child::body/child::*/child::language[2]
/spec/body/*/language[2]
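
The first expression above is not equivalent to the other two. language[2] on the child axis picks the second language child of each parent, while descendant::language[position()=2] picks the second language descendant in document order. A sketch in Python follows; the miniature spec document is invented for this example, and ElementTree supports only an abbreviated subset of XPath, so the descendant case is done by iteration.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of a spec document, made up for this example.
doc = """<spec>
  <body>
    <div1><language>EBNF</language></div1>
    <div1><language>SGML</language><language>XML</language></div1>
  </body>
</spec>"""

spec = ET.fromstring(doc)

# /spec/body/*/language[2]: ElementTree paths are relative to the root
# element, so the leading /spec step is dropped.
second_children = spec.findall("body/*/language[2]")
print([e.text for e in second_children])

# descendant::language[position()=2] counts along the whole descendant
# axis, which ElementTree's path syntax cannot express directly.
# Iterating in document order gives the same node.
second_descendant = list(spec.iter("language"))[1]
print(second_descendant.text)
```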

XPointers

xpointer(id("ebnf"))
xpointer(descendant::language[position()=2])
ebnf
xpointer(/child::spec/child::body/child::*/child::language[2])
xpointer(/spec/body/*/language[2])
/1/14/2
xpointer(id("ebnf"))xpointer(id("EBNF"))

XPointers and URIs

http://www.w3.org/TR/1998/REC-xml-19980210.xml#xpointer(id("ebnf"))
http://www.w3.org/TR/1998/REC-xml-19980210.xml#xpointer(descendant::language[position()=2])
http://www.w3.org/TR/1998/REC-xml-19980210.xml#ebnf
http://www.w3.org/TR/1998/REC-xml-19980210.xml#xpointer(/child::spec/child::body/child::*/child::language[2])
http://www.w3.org/TR/1998/REC-xml-19980210.xml#xpointer(/spec/body/*/language[2])
http://www.w3.org/TR/1998/REC-xml-19980210.xml#/1/14/2
http://www.w3.org/TR/1998/REC-xml-19980210.xml#xpointer(id("ebnf"))xpointer(id("EBNF"))


A larger example: Music Catalog


Sample Catalog

http://www.ibiblio.org/nywc

Organizing the Data


What is the Root Element?


The Root Element

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  Everything else will go here...
</catalog>

What are the Immediate Children of the Root?


Child Elements

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

</catalog>

White space in XML is not especially significant

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog><category>Small chamber ensembles 
- 2-4 Players by New York Women Composers</category></catalog>
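
The parser still reports white space exactly as written; it is the application that decides it does not matter. A short sketch in Python (standard library only, function name invented here) showing that the indented and compact catalogs differ as raw text but agree once runs of white space are collapsed:

```python
import xml.etree.ElementTree as ET

indented = """<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

</catalog>"""

compact = """<catalog><category>Small chamber ensembles
- 2-4 Players by New York Women Composers</category></catalog>"""

def category_text(xml_text):
    """Return the category element's text exactly as the parser reports it."""
    return ET.fromstring(xml_text).find("category").text

def normalized(text):
    """Collapse every run of white space to a single space and trim the ends."""
    return " ".join(text.split())

# The raw character data differs between the two documents...
raw_equal = category_text(indented) == category_text(compact)
# ...but the normalized text is the same.
normalized_equal = (normalized(category_text(indented))
                    == normalized(category_text(compact)))

print(raw_equal, normalized_equal)
```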

Composers

Each composer has a name

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <composer>
    <name>Julie Mandel</name>
  </composer>

  <composer>
    <name>Margaret De Wys</name>
  </composer>  
    
  <composer>
    <name>Beth Anderson</name>
  </composer>
    
  <composer>
    <name>Linda Bouchard</name>
  </composer>

</catalog>

Grandchildren

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <composer>
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>

  <composer>
    <name>
      <first_name>Margaret</first_name> 
      <middle_name>De</middle_name> 
      <last_name>Wys</last_name>
    </name>
  </composer>  
    
  <composer>
    <name>
      <first_name>Beth</first_name> 
      <middle_name></middle_name> 
      <last_name>Anderson</last_name>
    </name>
  </composer>
    
  <composer>
    <name>
      <first_name>Linda</first_name> 
      <middle_name></middle_name> 
      <last_name>Bouchard</last_name>
    </name>
  </composer>

</catalog>

Attributes

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <composer id="c1">
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>

  <composer id="c2">
    <name>
      <first_name>Margaret</first_name> 
      <middle_name>De</middle_name> 
      <last_name>Wys</last_name>
    </name>
  </composer>  
    
  <composer id="c3">
    <name>
      <first_name>Beth</first_name> 
      <middle_name></middle_name> 
      <last_name>Anderson</last_name>
    </name>
  </composer>
    
  <composer id="c4">
    <name>
      <first_name>Linda</first_name> 
      <middle_name></middle_name> 
      <last_name>Bouchard</last_name>
    </name>
  </composer>

</catalog>

Attributes vs. Elements


When not to use attributes


Compositions

Let's look at an example of what we want:

Rendered HTML:

Brass Swale (1988) 5", tbn, 2 Bfl tpts, bar. hn

Tonal. Commissioned/Premiered by the Redlands' New Music Ensemble. (A swale is a meadow or a marsh where a lot of wild plants grow together. The composer discovered the word when a horse named Swale won the Kentucky Derby several years ago. Since her work is primarily collage of newly composed musical swatches, she has used the name extensively.) ACA - American Composers Alliance

Or in HTML:

<dt><cite>Brass Swale</cite> (1988) 5", tbn, 2 Bfl tpts, bar. hn</dt>
<dd><p>
Tonal. Commissioned/Premiered by the Redlands' New Music 
Ensemble. (A swale is a meadow or a marsh where a lot of 
wild plants grow together. The composer discovered the word 
when a horse named Swale won the Kentucky Derby several 
years ago. Since her work is primarily collage of newly 
composed musical swatches, she has used the name 
extensively.)  ACA - American Composers 
Alliance</p>
</dd>

Each composition has a title, date, length, list of instruments, description, and publisher


Composition Example in XML

  <composition>
    <title>Brass Swale</title>
    <date>1988</date> 
    <length>5"</length>
    <instruments>tbn, 2 Bfl tpts, bar. hn</instruments>
    <description>
      Tonal. Commissioned/Premiered by the Redlands' New Music
      Ensemble. (A swale is a meadow or a marsh where a lot of
      wild plants grow together. The composer discovered the word
      when a horse named Swale won the Kentucky Derby several
      years ago. Since her work is primarily collage of newly
      composed musical swatches, she has used the name
      extensively.)
    </description>
    <publisher>ACA - American Composers Alliance</publisher>
  </composition>

Further Divisions

  <composition>
    <title>Trio for Flute, Viola and Harp</title>
    <date><year>1994</year></date> 
    <length>13'38"</length>
    <instruments>fl, hp, vla</instruments>
    <description>
      <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
      Christine Ims, and Susan Jolles. In 3 movements :</p>
      <ul>
        <li>mvt. 1: 5:01</li>
        <li>mvt. 2: 4:11</li>
        <li>mvt. 3: 4:26</li>
      </ul>  
    </description>
    <publisher>Theodore Presser</publisher>
  </composition>

Attaching the Composer to the Composition

  <composition composer="c3">
    <title>Trio: Dream in D</title>
    <date><year>1980</year></date> 
    <length>10'</length>
    <instruments>fl, pn, vc, or vn, pn, vc</instruments>
    <description>
      Rhapsodic. Passionate. Available on CD 
      <cite><a href=
       "http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr%3D1-2/">
       Two by Three
      </a></cite> from North/South Consonance (1998).
    </description> 
    <publisher></publisher>
  </composition>
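
The composer attribute only pays off when something follows it back to the composer element with the matching id. A sketch of that lookup in Python with the standard library, on a cut-down catalog taken from the slides; the function and variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# A cut-down catalog: one composer and one composition from the slides.
doc = """<catalog>
  <composer id="c3">
    <name>
      <first_name>Beth</first_name>
      <middle_name></middle_name>
      <last_name>Anderson</last_name>
    </name>
  </composer>
  <composition composer="c3">
    <title>Trio: Dream in D</title>
  </composition>
</catalog>"""

catalog = ET.fromstring(doc)

# Index composers by their id attribute.
composers_by_id = {c.get("id"): c for c in catalog.findall("composer")}

def full_name(composer):
    """Join the non-empty name parts with single spaces."""
    parts = [part.text for part in composer.find("name") if part.text]
    return " ".join(parts)

# Follow each composition's composer attribute back to the composer.
credits = []
for composition in catalog.findall("composition"):
    composer = composers_by_id[composition.get("composer")]
    credits.append((composition.findtext("title"), full_name(composer)))

print(credits)
```

A validating parser can enforce the same relationship if the DTD declares id as type ID and composer as type IDREF.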

Some Keywords For the Search Engines

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <cataloging_info>
    <abstract>Compositions by the members of New York Women Composers</abstract>
    <keyword>music publishing</keyword>
    <keyword>scores</keyword>
    <keyword>women composers</keyword>
    <keyword>New York</keyword>
  </cataloging_info>

  <composer id="c1">
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>
  
  ...
  
</catalog>

Standard Signature

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
...
  <last_updated>July 28, 1999</last_updated>
  <copyright>1999 New York Women Composers</copyright>
  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>

</catalog>

Cascading Style Sheets


A Blank Style Sheet

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="compositions1.css"?>
<catalog>
...
</catalog>

The Default Rule

catalog { font-family: "New York", "Times New Roman", serif; 
          font-size: 14pt; 
          background-color: white; 
          color: black; 
          display: block }


A style rule for the category element

category { display: block; 
          font-family: Helvetica, Arial, sans-serif;
          font-size: 32pt; 
          font-weight: bold; 
          text-align: center
         }
       
catalog { font-family: "New York", "Times New Roman", serif; 
          font-size: 14pt; 
          background-color: white; 
          color: black; 
          display: block 
        }


A style rule for the composer element

composer { display: block; 
          font-family: Helvetica, Arial, sans-serif;
          font-size: 24pt; 
          font-weight: bold; 
          text-align: left
         }     


A style rule for the title element

composition title { display: block; 
       font-family: Helvetica, Arial, sans-serif;
       font-size: 18pt; 
       font-weight: bold; 
       text-align: left
      }                


A style rule for the catalog info

/* cataloging_info is only for search engines */
cataloging_info { display: none;
       color: white}                

display: none requires CSS2:



Style rules for the signature

  <last_updated>July 28, 1999</last_updated>
  <copyright>1999 New York Women Composers</copyright>
  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>            

last_updated, copyright, maintainer {display: block;
       font-size: small}
       
copyright:before {content: "Copyright " }

last_updated:before {content: "Last Modified " }

last_updated {margin-top: 2ex }

Again, some of this requires CSS2



Style Rules for composition children

composition * {display:list-item}
       
description {display: block}


Finished Style Sheet

category { display: block; 
          font-family: Helvetica, Arial, sans-serif;
          font-size: 32pt; 
          font-weight: bold; 
          text-align: center
         }
       
catalog { font-family: "New York", "Times New Roman", serif; 
          font-size: 14pt; 
          background-color: white; 
          color: black; 
          display: block
        }
      
composer { display: block; 
           font-family: Helvetica, Arial, sans-serif;
           font-size: 24pt; 
           font-weight: bold; 
           text-align: left
         }  
       
composition title { display: block; 
       font-family: Helvetica, Arial, sans-serif;
       font-size: 18pt; 
       font-weight: bold; 
       text-align: left}
       
composition * {display:list-item}
       
description {display: block}
              
/* cataloging_info is only for search engines */
cataloging_info { display: none;
       color: #FFFFFF}
       
last_updated, copyright, maintainer {display: block;
       font-size: small}
       
copyright:before {content: "Copyright " }

last_updated:before {content: "Last Modified " }

last_updated {margin-top: 2ex }

Possible Extensions


Possible Solutions


CSS or XSL?


Part II: Well-formedness, DTDs, and Validity


Well-formedness vs validity


Well-formedness Rules


Open and close all tags


Empty tags end with />


There is a unique root element


Elements may not overlap


Attribute values are quoted


< and & are only used to start tags and entity references


Only the five predefined entity references are used


Character References
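
The well-formedness rules above can be checked mechanically: any XML parser, validating or not, must reject a document that breaks one of them. A sketch in Python using the standard library's non-validating parser; the sample strings and the function name are invented for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if a non-validating parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One sample per rule; only the first and the last two are well-formed.
samples = {
    "well-formed":          "<SONG><TITLE>Hot Cop</TITLE></SONG>",
    "unclosed tag":         "<SONG><TITLE>Hot Cop</SONG>",
    "two root elements":    "<SONG/><SONG/>",
    "overlapping elements": "<SONG><b><i>Hot Cop</b></i></SONG>",
    "unquoted attribute":   "<SONG YEAR=1978/>",
    "raw ampersand":        "<PUBLISHER>A & M Records</PUBLISHER>",
    "undeclared entity":    "<TITLE>&copy; 1978</TITLE>",
    "predefined entity":    "<PUBLISHER>A &amp; M Records</PUBLISHER>",
    "character reference":  "<TITLE>&#169; 1978</TITLE>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}

for label, ok in results.items():
    print(f"{label:22}", ok)
```

The undeclared entity fails because no DTD declares it, while the character reference for the same copyright sign needs no declaration.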


XML is a Meta Markup Language


A Song Description in XML

<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

A DTD for Songs

<!ELEMENT SONG (TITLE, COMPOSER+, PRODUCER*, PUBLISHER*, 
                 LENGTH?, YEAR?, ARTIST+)>

<!ELEMENT TITLE (#PCDATA)>

<!ELEMENT COMPOSER (#PCDATA)>
<!ELEMENT PRODUCER (#PCDATA)>
<!ELEMENT PUBLISHER (#PCDATA)>
<!ELEMENT LENGTH (#PCDATA)>
<!-- This should be a four digit year like "1999",
     not a two-digit year like "99" -->
<!ELEMENT YEAR (#PCDATA)>

<!ELEMENT ARTIST (#PCDATA)>

A Valid Song Document

<?xml version="1.0"?>
<!DOCTYPE SONG SYSTEM "song.dtd">
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

Validity

To be valid, an XML document must

  1. Be well-formed

  2. Have a document type declaration

  3. Comply with the constraints specified in the DTD


Checking Validity

To check validity, pass the document through a validating parser, which reports any errors it finds. For example:

% java sax.SAXCount -v invalidhotcop.xml
Error at (file file:/D:/speaking/SD99EAST/dtds/invalidhotcop.xml, line 10, char
8): Element "<SONG>" is not valid because it does not follow the rule, "(TITLE,C
OMPOSER+,PRODUCER*,PUBLISHER*,LENGTH?,YEAR?,ARTIST+)".
invalidhotcop.xml: 281 ms

A valid document:

% java sax.SAXCount -v validhotcop.xml
validhotcop.xml: 170 ms

Internal DTDs

<?xml version="1.0"?>
<!DOCTYPE SONG [
  <!ELEMENT SONG (TITLE, COMPOSER+, PRODUCER*, PUBLISHER*, 
                  LENGTH?, YEAR?, ARTIST+)>

  <!ELEMENT TITLE (#PCDATA)>

  <!ELEMENT COMPOSER (#PCDATA)>
  <!ELEMENT PRODUCER (#PCDATA)>
  <!ELEMENT PUBLISHER (#PCDATA)>
  <!ELEMENT LENGTH (#PCDATA)>
  <!-- This should be a four digit year like "1999",
     not a two-digit year like "99" -->
  <!ELEMENT YEAR (#PCDATA)>

  <!ELEMENT ARTIST (#PCDATA)>
]>
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

What is XML used for?


Domain-Specific Markup Languages


Self-Describing Data

<!-- This should be a four digit year like "1999",
     not a two-digit year like "99" -->
<!ELEMENT YEAR (#PCDATA)>

Interchange of Data Among Applications


Structured and Integrated Data


XML Applications


DTDs and Validity


What is a DTD?


The importance of validation


An Example Document

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE catalog SYSTEM "compositions.dtd">
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <cataloging_info>
    <abstract>Compositions by the members of New York Women Composers</abstract>
    <keyword>music publishing</keyword>
    <keyword>scores</keyword>
    <keyword>women composers</keyword>
    <keyword>New York</keyword>
  </cataloging_info>

  <last_updated>July 28, 1999</last_updated>
  <copyright>1999 New York Women Composers</copyright>
  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>

  <composer id="c1">
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>

  <composer id="c2">
    <name>
      <first_name>Margaret</first_name> 
      <middle_name>De</middle_name> 
      <last_name>Wys</last_name>
    </name>
  </composer>  
    
  <composer id="c3">
    <name>
      <first_name>Beth</first_name> 
      <middle_name></middle_name> 
      <last_name>Anderson</last_name>
    </name>
  </composer>
    
  <composer id="c4">
    <name>
      <first_name>Linda</first_name> 
      <middle_name></middle_name> 
      <last_name>Bouchard</last_name>
    </name>
  </composer>
    
  <composition composer="c1">
    <title>Trio for Flute, Viola and Harp</title>
    <date><year>1994</year></date> 
    <length>13'38"</length>
    <instruments>fl, hp, vla</instruments>
    <description>
      <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
      Christine Ims, and Susan Jolles. In 3 movements :</p>
      <ul>
        <li>mvt. 1: 5:01</li>
        <li>mvt. 2: 4:11</li>
        <li>mvt. 3: 4:26</li>
      </ul>  
    </description>
    <publisher>Theodore Presser</publisher>
  </composition>

  <composition composer="c2">
    <title>Charmonium</title>
    <date><year>1991</year></date> 
    <length>9'</length>
    <instruments>2 vln, vla, vc</instruments>
    <description>
      <p>Commissioned as quartet for the Meridian String Quartet. 
      Sonorous, bold. Moderate difficulty. Tape available.</p>
    </description> 
  </composition>

  <composition composer="c1">
    <title>Invention for Flute and Piano</title>
    <date><year>1994</year></date> 
    <instruments>fl, pn</instruments>
    <description><p>3 movements</p></description> 
  </composition>

  <composition composer="c3">
    <title>Little Trio</title>
    <date><year>1984</year></date> 
    <length>4'</length>
    <instruments>fl, guit, va</instruments>
    <publisher>ACA</publisher>
  </composition>

  <composition composer="c3">
    <title>Dr. Blood's Mermaid Lullaby</title>
    <date><year>1980</year></date> 
    <length>3'</length>
    <instruments>fl or ob, or vn, or vc, pn</instruments>
    <publisher>ACA</publisher>
  </composition>

  <composition composer="c3">
    <title>Trio: Dream in D</title>
    <date><year>1980</year></date> 
    <length>10'</length>
    <instruments>fl, pn, vc, or vn, pn, vc</instruments>
    <description>
      <p>Rhapsodic. Passionate. Available on CD 
      <cite>
      <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/">
      Two by Three</a></cite> 
      from North/South Consonance (1998).</p>
    </description> 
  </composition>

  <composition composer="c4">
    <title>Propos II</title>
    <date><year>1985</year></date> 
    <length>11'</length>
    <instruments>2 tpt</instruments>
    <description><p>Arrangement from Propos</p></description> 
  </composition>

  <composition composer="c4">
    <title>Rictus En Mirroir</title>
    <date><year>1985</year></date> 
    <length>14'</length>
    <instruments>fl, ob, hpschd, vc</instruments>
  </composition>

</catalog>

Element Declarations


Content Specifications


ANY

<!ELEMENT catalog ANY>

#PCDATA

  <year>1984</year>

<!ELEMENT year (#PCDATA)>

#PCDATA

<year>1999</year>
<year>99</year>
<year>1999 C.E.</year>
<year>
 The year of our Lord one thousand, nine hundred, and ninety-nine
</year>
All of the above are valid. The following is not, because #PCDATA content may not contain child elements:
<year>
<month>January</month>
<month>February</month>
<month>March</month>
<month>April</month>
<month>May</month>
<month>June</month>
<month>July</month>
<month>August</month>
<month>September</month>
<month>October</month>
<month>November</month>
<month>December</month>
</year>

#PCDATA

There are a number of elements in the example document that only contain PCDATA:

<!ELEMENT category (#PCDATA)>
<!ELEMENT abstract (#PCDATA)>
<!ELEMENT keyword (#PCDATA)>
<!ELEMENT last_updated (#PCDATA)>
<!ELEMENT copyright (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT middle_name (#PCDATA)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT instruments (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT length (#PCDATA)>

Comments in DTDs

<!-- e.g. "1999 New York Women Composers", 
     not "Copyright 1999 New York Women Composers" -->
<!ELEMENT copyright (#PCDATA)>

Child Elements

    <date><year>1994</year></date> 
<!ELEMENT date (year)>

Child Elements

You only have to declare the immediate children

   <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>

  <composer id="c1">
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer> 
<!ELEMENT maintainer (name)>
<!ELEMENT composer (name)>

Sequences

    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
<!ELEMENT name (first_name, middle_name, last_name)>

More Sequences

ELEMENT

One or More Children +

  <cataloging_info>
    <abstract>Compositions by the members of New York Women Composers</abstract>
    <keyword>music publishing</keyword>
    <keyword>scores</keyword>
    <keyword>women composers</keyword>
    <keyword>New York</keyword>
  </cataloging_info>
<!ELEMENT cataloging_info (abstract, keyword+)>

Zero or More Children *

<!ELEMENT catalog (category, cataloging_info, last_updated, copyright, 
                   maintainer, composer*, composition*)>

Zero or One Children ?

  <composition composer="c1">
    <title>Trio for Flute, Viola and Harp</title>
    <date><year>1994</year></date> 
    <length>13'38"</length>
    <instruments>fl, hp, vla</instruments>
    <description>
      <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
      Christine Ims, and Susan Jolles. In 3 movements :</p>
      <ul>
        <li>mvt. 1: 5:01</li>
        <li>mvt. 2: 4:11</li>
        <li>mvt. 3: 4:26</li>
      </ul>  
    </description>
    <publisher>Theodore Presser</publisher>
  </composition>
<!ELEMENT composition 
   (title, date, length?, instruments, description?, publisher?)>
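
A content model behaves much like a regular expression over the names of an element's children. The following is a minimal sketch of that idea in Python, assuming a hand-written translation of the composition model above; it is not a validator, and the helper name is made up for this example:

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of
#   (title, date, length?, instruments, description?, publisher?)
# into a regular expression over the child names, each followed by a space.
COMPOSITION_MODEL = re.compile(
    r"title date (length )?instruments (description )?(publisher )?"
)


def matches_composition_model(element):
    """True if the element's children appear in an order the model allows."""
    child_names = "".join(child.tag + " " for child in element)
    return COMPOSITION_MODEL.fullmatch(child_names) is not None


ok = ET.fromstring(
    "<composition><title>Little Trio</title><date><year>1984</year></date>"
    "<length>4'</length><instruments>fl, guit, va</instruments>"
    "<publisher>ACA</publisher></composition>"
)
# Same children, but date comes before title, which the sequence forbids.
bad = ET.fromstring(
    "<composition><date><year>1984</year></date>"
    "<title>Little Trio</title><instruments>fl, guit, va</instruments>"
    "</composition>"
)

print(matches_composition_model(ok))   # True
print(matches_composition_model(bad))  # False
```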

Choices

<!ELEMENT date (year | ISODate)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT ISODate (#PCDATA)>

Grouping With Parentheses

<!ELEMENT dl (dt, dd)*>
<!ELEMENT ARTICLE (TITLE, (P | PHOTO | GRAPH | SIDEBAR | PULLQUOTE | SUBHEAD)*, BYLINE?)>

<!ELEMENT catalog (category, cataloging_info, last_updated, 
   copyright, maintainer, (composer | composition)*)>

Mixed Content

<!ELEMENT description (#PCDATA | ul | a | cite | p)*>
<!ELEMENT cite (#PCDATA | a)*>
<!ELEMENT ul (li*)>
<!ELEMENT li (#PCDATA)>
<!ELEMENT a (#PCDATA)>
<!ELEMENT p (#PCDATA)>

Empty elements

<!ELEMENT BR EMPTY>
<!ELEMENT IMG EMPTY>
<!ELEMENT HR EMPTY>

Content Models You Can't Declare


Attribute Declarations

Recall this element:

  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>

It is declared like this:

<!ELEMENT maintainer (name)>
<!ATTLIST maintainer email CDATA "webmaster@nywc.org">
<!ATTLIST maintainer url CDATA "http://www.ibiblio.org/nywc">

The general format of an <!ATTLIST> declaration is:

<!ATTLIST Element_name Attribute_name Type Default_value>
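
When an attribute with a declared default is omitted from the instance, a parser that reads the DTD supplies the default. As a small sketch, assuming Python's standard ElementTree module (which is built on expat and reads the internal DTD subset), with the maintainer element simplified for illustration:

```python
import xml.etree.ElementTree as ET

# The instance omits the email attribute; the ATTLIST supplies a default.
DOC = """<!DOCTYPE maintainer [
  <!ELEMENT maintainer (name)>
  <!ELEMENT name (#PCDATA)>
  <!ATTLIST maintainer email CDATA "webmaster@nywc.org">
]>
<maintainer><name>Elliotte Rusty Harold</name></maintainer>"""

maintainer = ET.fromstring(DOC)

# The attribute is reported even though it never appears in the instance.
print(maintainer.get("email"))
```

This should print the default address from the declaration. Defaults declared only in an external DTD are a different matter, since a non-validating parser is not required to read it.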

Multiple Attribute Declarations

  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>

It is declared like this:

<!ELEMENT maintainer (name)>
<!ATTLIST maintainer email CDATA "webmaster@nywc.org">
<!ATTLIST maintainer url CDATA "http://www.ibiblio.org/nywc">

But it can also be declared in a single <!ATTLIST> declaration like this:

<!ATTLIST maintainer email CDATA "webmaster@nywc.org" url CDATA "http://www.ibiblio.org/nywc">

This is more obvious with better indentation:

<!ATTLIST maintainer email CDATA "webmaster@nywc.org" 
                     url   CDATA "http://www.ibiblio.org/nywc">
                     

Attribute Default Values


#REQUIRED

<!ELEMENT maintainer (name)>
<!ATTLIST maintainer email CDATA #REQUIRED 
                     url   CDATA #REQUIRED>

#IMPLIED

<!ELEMENT a (#PCDATA)>
<!ATTLIST a href CDATA #IMPLIED>

#FIXED

<!ELEMENT maintainer (name)>
<!ATTLIST maintainer email CDATA #FIXED "webmaster@nywc.org" 
                     url   CDATA #REQUIRED>

Attribute Types


CDATA

<!ATTLIST maintainer email CDATA #REQUIRED 
                     url   CDATA #IMPLIED>

ID

<!ELEMENT composer (name)>
<!ATTLIST composer id ID #REQUIRED>

IDREF

<!ELEMENT composition (title, date, length?, 
   instruments, description?, publisher?)>
<!ATTLIST composition composer IDREF #REQUIRED>

IDREFS

<!ELEMENT composition (title, date, length?, 
   instruments, description?, publisher?)>
<!ATTLIST composition composer IDREFS #REQUIRED>
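
An IDREFS value is a whitespace-separated list of IDs, so an application can split it and look each one up. The sketch below, in Python with the standard ElementTree module, assumes a cut-down catalog; the second composition is hypothetical and exists only to show a value with two references:

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <composer id="c1"><name><first_name>Julie</first_name>
    <middle_name></middle_name><last_name>Mandel</last_name></name></composer>
  <composer id="c3"><name><first_name>Beth</first_name>
    <middle_name></middle_name><last_name>Anderson</last_name></name></composer>
  <composition composer="c1"><title>Invention for Flute and Piano</title></composition>
  <!-- Hypothetical joint work, added only to show a multi-valued IDREFS -->
  <composition composer="c1 c3"><title>Example Duo</title></composition>
</catalog>"""


def composers_by_title(catalog):
    """Map each composition title to the last names its IDREFS point at."""
    by_id = {
        composer.get("id"): composer.findtext("name/last_name")
        for composer in catalog.findall("composer")
    }
    return {
        comp.findtext("title"): [by_id[ref] for ref in comp.get("composer").split()]
        for comp in catalog.findall("composition")
    }


print(composers_by_title(ET.fromstring(CATALOG)))
```

A validating parser would additionally reject any reference that does not match a declared ID; this sketch would simply raise a KeyError.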

Finished DTD

<!ELEMENT category (#PCDATA)>
<!ELEMENT abstract (#PCDATA)>
<!ELEMENT keyword (#PCDATA)>
<!ELEMENT last_updated (#PCDATA)>
<!ELEMENT copyright (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT instruments (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT length (#PCDATA)>

<!ELEMENT date (year | ISODate)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT ISODate (#PCDATA)>

<!ELEMENT catalog (category, cataloging_info, last_updated, 
   copyright, maintainer, (composer | composition)*)>

<!ELEMENT cataloging_info (abstract, keyword+)>

<!ELEMENT description (#PCDATA | ul | a | cite | p)*>
<!ELEMENT cite (#PCDATA | a)*>
<!ELEMENT ul (li*)>
<!ELEMENT li (#PCDATA)>
<!ELEMENT a (#PCDATA)>
<!ELEMENT p (#PCDATA)>

<!ELEMENT maintainer (name)>
<!ELEMENT name (first_name, middle_name, last_name)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT middle_name (#PCDATA)>
<!ELEMENT last_name (#PCDATA)>
<!ATTLIST maintainer email CDATA #REQUIRED 
                     url   CDATA #IMPLIED>
                     
<!ELEMENT composer (name)>
<!ATTLIST composer id ID #REQUIRED>

<!ELEMENT composition (title, date, length?, instruments, description?, publisher?)>
<!ATTLIST composition composer IDREFS #REQUIRED>

<!ATTLIST a href CDATA #REQUIRED>

ENTITY


ENTITIES


NOTATION


NMTOKEN


NMTOKENS


Enumerated

<!ATTLIST P VISIBLE (TRUE | FALSE) "TRUE">

General Entity References


General Entity Reference Example

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [

   <!ENTITY ERH "Elliotte Rusty Harold">

   <!ELEMENT DOCUMENT (TITLE, SIGNATURE)>
   <!ELEMENT TITLE (#PCDATA)>
   <!ELEMENT COPYRIGHT (#PCDATA)>
   <!ELEMENT EMAIL (#PCDATA)>
   <!ELEMENT LAST_MODIFIED (#PCDATA)>
   <!ELEMENT SIGNATURE (COPYRIGHT, EMAIL, LAST_MODIFIED)>
]>
<DOCUMENT>
  <TITLE>&ERH;</TITLE>
  <SIGNATURE>
    <COPYRIGHT>1999 &ERH;</COPYRIGHT>
    <EMAIL>elharo@metalab.unc.edu</EMAIL>
    <LAST_MODIFIED>March 10, 1999</LAST_MODIFIED>
   </SIGNATURE>
</DOCUMENT>
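
The parser replaces each &ERH; reference before the application ever sees the text. A minimal sketch, assuming Python's standard ElementTree module (an assumption of this example, not a tool used in the slides):

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
   <!ENTITY ERH "Elliotte Rusty Harold">
   <!ELEMENT DOCUMENT (TITLE, SIGNATURE)>
   <!ELEMENT TITLE (#PCDATA)>
   <!ELEMENT COPYRIGHT (#PCDATA)>
   <!ELEMENT EMAIL (#PCDATA)>
   <!ELEMENT LAST_MODIFIED (#PCDATA)>
   <!ELEMENT SIGNATURE (COPYRIGHT, EMAIL, LAST_MODIFIED)>
]>
<DOCUMENT>
  <TITLE>&ERH;</TITLE>
  <SIGNATURE>
    <COPYRIGHT>1999 &ERH;</COPYRIGHT>
    <EMAIL>elharo@metalab.unc.edu</EMAIL>
    <LAST_MODIFIED>March 10, 1999</LAST_MODIFIED>
  </SIGNATURE>
</DOCUMENT>"""

document = ET.fromstring(DOC)

# Both values arrive with the entity's replacement text already in place.
print(document.findtext("TITLE"))                # Elliotte Rusty Harold
print(document.findtext("SIGNATURE/COPYRIGHT"))  # 1999 Elliotte Rusty Harold
```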

External General Entity References


A DTD that declares External General Entities

<!ENTITY AlLeiter         SYSTEM "mets/AlLeiter.xml">
<!ENTITY ArmandoReynoso   SYSTEM "mets/ArmandoReynoso.xml">
<!ENTITY BobbyJones       SYSTEM "mets/BobbyJones.xml">
<!ENTITY BradClontz       SYSTEM "mets/BradClontz.xml">
<!ENTITY DennisCook       SYSTEM "mets/DennisCook.xml">
<!ENTITY GregMcmichael    SYSTEM "mets/GregMcmichael.xml">
<!ENTITY HideoNomo        SYSTEM "mets/HideoNomo.xml">
<!ENTITY JohnFranco       SYSTEM "mets/JohnFranco.xml">
<!ENTITY JosiasManzanillo SYSTEM "mets/JosiasManzanillo.xml">
<!ENTITY OctavioDotel     SYSTEM "mets/OctavioDotel.xml">
<!ENTITY RickReed         SYSTEM "mets/RickReed.xml">
<!ENTITY RigoBeltran      SYSTEM "mets/RigoBeltran.xml">
<!ENTITY WillieBlair      SYSTEM "mets/WillieBlair.xml">

An External Parsed General Entity

<?xml version="1.0" encoding="UTF-8"?>
<PLAYER>
  <GIVEN_NAME>Al</GIVEN_NAME>
  <SURNAME>Leiter</SURNAME>
  <P>Starting Pitcher</P>
  <G>28</G>
  <GS>28</GS>
  <W>17</W>
  <L>6</L>
  <SV>0</SV>
  <CG>4</CG>
  <SO>2</SO>
  <ERA>2.47</ERA>
  <IP>193</IP>
  <HRA>8</HRA>
  <RA>55</RA>
  <ER>53</ER>
  <HB>11</HB>
  <WP>4</WP>
  <B>1</B>
  <WB>71</WB>
  <K>174</K>
</PLAYER>

A Document that uses External General Entity References

<?xml version="1.0" standalone="no"?>
<!DOCTYPE TEAM SYSTEM "team.dtd" [
  <!ENTITY % players SYSTEM "mets.dtd"> 
  %players;
  ]
>
<TEAM>
  <TEAM_CITY>New York</TEAM_CITY>
  <TEAM_NAME>Mets</TEAM_NAME>
  &AlLeiter;
  &ArmandoReynoso;
  &BobbyJones;
  &BradClontz;
  &DennisCook;
  &GregMcmichael;
  &HideoNomo;
  &JohnFranco;
  &JosiasManzanillo;
  &OctavioDotel;
  &RickReed;
  &RigoBeltran;
  &WillieBlair;
</TEAM>

Parameter Entity References


External Parameter Entity References


The description element

    <description>
      <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
      Christine Ims, and Susan Jolles. In 3 movements :</p>
      <ul>
        <li>mvt. 1: 5:01</li>
        <li>mvt. 2: 4:11</li>
        <li>mvt. 3: 4:26</li>
      </ul>  
    </description>

<!ELEMENT description (#PCDATA | ul | a | cite | p)*>
<!ELEMENT cite (#PCDATA | a)*>
<!ELEMENT ul (li*)>
<!ELEMENT li (#PCDATA)>
<!ELEMENT p (#PCDATA)>
<!ELEMENT a (#PCDATA)>
<!ATTLIST a href CDATA #IMPLIED>


XHTML


External Parameter Entity References

<!ENTITY % xhtml1 SYSTEM "http://www.w3.org/TR/xhtml1/DTD/strict.dtd">
%xhtml1;

Finished DTD

<!ENTITY % xhtml1 SYSTEM "http://www.w3.org/TR/xhtml1/DTD/strict.dtd">
%xhtml1;

<!ELEMENT category (#PCDATA)>
<!ELEMENT abstract (#PCDATA)>
<!ELEMENT keyword (#PCDATA)>
<!ELEMENT last_updated (#PCDATA)>

<!-- e.g. "1999 New York Women Composers", 
     not "Copyright 1999 New York Women Composers" -->
<!ELEMENT copyright (#PCDATA)>


<!ELEMENT instruments (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT length (#PCDATA)>

<!ELEMENT date (year | ISODate)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT ISODate (#PCDATA)>

<!ELEMENT catalog (category, cataloging_info, last_updated, 
   copyright, maintainer, (composer | composition)*)>

<!ELEMENT cataloging_info (abstract, keyword+)>

<!ELEMENT description %Block;>

<!ELEMENT maintainer (name)>
<!ELEMENT name (first_name, middle_name, last_name)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT middle_name (#PCDATA)>
<!ELEMENT last_name (#PCDATA)>
<!ATTLIST maintainer email CDATA #REQUIRED 
                     url   CDATA #IMPLIED>
                     
<!ELEMENT composer (name)>
<!ATTLIST composer id ID #REQUIRED>

<!ELEMENT composition (title, date, length?, instruments, description?, publisher?)>
<!ATTLIST composition composer IDREFS #REQUIRED>

<!ATTLIST a href CDATA #IMPLIED>



Internal DTD Subsets

<?xml version="1.0"?>
<!DOCTYPE document SYSTEM "http://www.w3.org/TR/xhtml1/DTD/transitional.dtd" [
  <!ELEMENT document %Block; >
]>
<document>
<p>Hello There!</p>
</document>

Schemas


Part III: Namespaces


Raison d'être

  1. To distinguish between elements and attributes from different vocabularies with different meanings.

  2. To group all related elements and attributes together so that a parser can easily recognize them.


The Need for Namespaces


Namespaces disambiguate elements


Namespace Syntax


Namespace URIs


Binding Prefixes to Namespace URIs


Binding Prefixes to Namespace URIs Example

<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml"
            xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
  <xhtml:head><xhtml:title>Three Namespaces</xhtml:title></xhtml:head>
  <xhtml:body>
    <xhtml:h1 align="center">An Ellipse and a Rectangle</xhtml:h1>
    <svg:svg xmlns:svg="http://www.w3.org/Graphics/SVG/SVG-19991203.dtd" 
             width="12cm" height="10cm">
      <svg:ellipse rx="110" ry="130" />
      <svg:rect x="4cm" y="1cm" width="3cm" height="6cm" />
    </svg:svg>
    <xhtml:p xlink:type="simple" 
      xlink:href="ellipses.html">
      More about ellipses
    </xhtml:p>
    <xhtml:p xlink:type="simple" xlink:href="rectangles.html">
      More about rectangles
    </xhtml:p>
    <xhtml:hr/>
    <xhtml:p>Last Modified February 13, 2000</xhtml:p>    
  </xhtml:body>
</xhtml:html>
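
A namespace-aware parser reports each name together with the URI its prefix is bound to. As a sketch, assuming Python's standard ElementTree module, which writes such names as {URI}local-name, and a shortened version of the document above:

```python
import xml.etree.ElementTree as ET

DOC = """<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml"
            xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
  <xhtml:body>
    <xhtml:h1 align="center">An Ellipse and a Rectangle</xhtml:h1>
    <xhtml:p xlink:type="simple" xlink:href="ellipses.html">
      More about ellipses
    </xhtml:p>
  </xhtml:body>
</xhtml:html>"""

root = ET.fromstring(DOC)
h1, p = root[0]  # the two children of xhtml:body

# Element names carry the namespace URI, not the prefix.
print(root.tag)   # {http://www.w3.org/1999/xhtml}html
# The unprefixed align attribute is in no namespace at all.
print(h1.attrib)  # {'align': 'center'}
# Prefixed attributes are looked up by URI plus local name.
print(p.get("{http://www.w3.org/XML/XLink/0.9}href"))  # ellipses.html
```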

Binding namespaces to prefixes with #FIXED ATTLIST declarations

<!ATTLIST svg:svg xmlns:svg CDATA #FIXED "http://www.w3.org/Graphics/SVG/SVG-19991203.dtd">


<svg:svg width="12cm" height="10cm">
  <svg:ellipse rx="110" ry="130" />
  <svg:rect x="4cm" y="1cm" width="3cm" height="6cm" />
</svg:svg>

Unprefixed attributes are never in any namespace


URIs matter; not prefixes


Namespace URIs do not necessarily point to a document, page, or schema


The Default Namespace

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
  <head><title>Three Namespaces</title></head>
  <body>
    <h1 align="center">An Ellipse and a Rectangle</h1>
    <svg xmlns="http://www.w3.org/Graphics/SVG/SVG-19991203.dtd" 
         width="12cm" height="10cm">
      <ellipse rx="110" ry="130" />
      <rect x="4cm" y="1cm" width="3cm" height="6cm" />
    </svg>
    <p xlink:type="simple" xlink:href="ellipses.html">
      More about ellipses
    </p>
    <p xlink:type="simple" xlink:href="rectangles.html">
      More about rectangles
    </p>
    <hr/>
    <p>Last Modified February 13, 2000</p>    
  </body>
</html>
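
To a namespace-aware parser, the prefixed form and the default-namespace form are the same document. A quick check, sketched in Python with the standard ElementTree module on shortened versions of the two examples (the helper name is made up for this illustration):

```python
import xml.etree.ElementTree as ET

PREFIXED = """<xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <xhtml:head><xhtml:title>Three Namespaces</xhtml:title></xhtml:head>
</xhtml:html>"""

DEFAULTED = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Three Namespaces</title></head>
</html>"""


def expanded_names(text):
    """List every element name in document order, as {URI}local-name."""
    return [element.tag for element in ET.fromstring(text).iter()]


print(expanded_names(PREFIXED) == expanded_names(DEFAULTED))  # True
print(expanded_names(DEFAULTED)[0])  # {http://www.w3.org/1999/xhtml}html
```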

Binding default namespaces to elements with #FIXED ATTLISTs

<!ATTLIST svg xmlns CDATA #FIXED "http://www.w3.org/Graphics/SVG/SVG-19991203.dtd">


<svg width="12cm" height="10cm">
  <ellipse rx="110" ry="130" />
  <rect x="4cm" y="1cm" width="3cm" height="6cm" />
</svg>

How Parsers Handle Namespaces


Namespaces and DTDs


Part IV: XSL, the Extensible Stylesheet Language


What Is XSL?


Versions


The Process of an XSL Transformation

  1. The XML parser reads an XML document and forms a tree

  2. The tree is passed to the XSLT processor

  3. The XSLT processor compares the nodes in the tree to the instructions in the style sheet

  4. When the XSLT processor finds a match, it outputs a tree fragment

  5. (Optional) The complete output tree is serialized to some other format such as text, HTML, or an XML file
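
The steps above can be mimicked with a toy processor. This is only a sketch in Python, not XSLT: the templates are plain functions keyed by element name, the output is a string rather than a tree, and every name in it is made up for this illustration. It does show why text from unmatched elements still reaches the output:

```python
import xml.etree.ElementTree as ET


def apply_templates(node, templates, output):
    """Steps 3-4: emit a template's result, or fall back to the built-in rule."""
    template = templates.get(node.tag)
    if template is not None:
        output.append(template(node))
        return
    # Built-in rule: copy text through and keep descending into children.
    if node.text:
        output.append(node.text)
    for child in node:
        apply_templates(child, templates, output)
        if child.tail:
            output.append(child.tail)


def transform(xml_text, templates):
    tree = ET.fromstring(xml_text)            # step 1: parse into a tree
    output = []
    apply_templates(tree, templates, output)  # steps 2-4: walk and match
    return "".join(output)                    # step 5: serialize


# Cut-down catalog; only composition has a template.
SOURCE = (
    "<catalog><last_updated>July 28, 1999</last_updated>"
    "<composition><title>Charmonium</title>"
    "<date><year>1991</year></date></composition></catalog>"
)
TEMPLATES = {"composition": lambda node: "<h3>" + node.findtext("title") + "</h3>"}

print(transform(SOURCE, TEMPLATES))  # July 28, 1999<h3>Charmonium</h3>
```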


An Example XML Document

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE catalog SYSTEM "compositions.dtd">
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <cataloging_info>
    <abstract>Compositions by the members of New York Women Composers</abstract>
    <keyword>music publishing</keyword>
    <keyword>scores</keyword>
    <keyword>women composers</keyword>
    <keyword>New York</keyword>
  </cataloging_info>

  <last_updated>July 28, 1999</last_updated>
  <copyright>1999 New York Women Composers</copyright>
  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>

  <composer id="c1">
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>

  <composer id="c2">
    <name>
      <first_name>Margaret</first_name> 
      <middle_name>De</middle_name> 
      <last_name>Wys</last_name>
    </name>
  </composer>  
    
  <composer id="c3">
    <name>
      <first_name>Beth</first_name> 
      <middle_name></middle_name> 
      <last_name>Anderson</last_name>
    </name>
  </composer>
    
  <composer id="c4">
    <name>
      <first_name>Linda</first_name> 
      <middle_name></middle_name> 
      <last_name>Bouchard</last_name>
    </name>
  </composer>
    
  <composition composer="c1">
    <title>Trio for Flute, Viola and Harp</title>
    <date><year>1994</year></date> 
    <length>13'38"</length>
    <instruments>fl, hp, vla</instruments>
    <description>
      <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
      Christine Ims, and Susan Jolles. In 3 movements :</p>
      <ul>
        <li>mvt. 1: 5:01</li>
        <li>mvt. 2: 4:11</li>
        <li>mvt. 3: 4:26</li>
      </ul>  
    </description>
    <publisher>Theodore Presser</publisher>
  </composition>

  <composition composer="c2">
    <title>Charmonium</title>
    <date><year>1991</year></date> 
    <length>9'</length>
    <instruments>2 vln, vla, vc</instruments>
    <description>
      <p>Commissioned as quartet for the Meridian String Quartet. 
      Sonorous, bold. Moderate difficulty. Tape available.</p>
    </description> 
  </composition>

  <composition composer="c1">
    <title>Invention for Flute and Piano</title>
    <date><year>1994</year></date> 
    <instruments>fl, pn</instruments>
    <description><p>3 movements</p></description> 
  </composition>

  <composition composer="c3">
    <title>Little Trio</title>
    <date><year>1984</year></date> 
    <length>4'</length>
    <instruments>fl, guit, va</instruments>
    <publisher>ACA</publisher>
  </composition>

  <composition composer="c3">
    <title>Dr. Blood's Mermaid Lullaby</title>
    <date><year>1980</year></date> 
    <length>3'</length>
    <instruments>fl or ob, or vn, or vc, pn</instruments>
    <publisher>ACA</publisher>
  </composition>

  <composition composer="c3">
    <title>Trio: Dream in D</title>
    <date><year>1980</year></date> 
    <length>10'</length>
    <instruments>fl, pn, vc, or vn, pn, vc</instruments>
    <description>
      <p>Rhapsodic. Passionate. Available on CD 
      <cite>
      <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/">
      Two by Three</a></cite> 
      from North/South Consonance (1998).</p>
    </description> 
  </composition>

  <composition composer="c4">
    <title>Propos II</title>
    <date><year>1985</year></date> 
    <length>11'</length>
    <instruments>2 tpt</instruments>
    <description><p>Arrangement from Propos</p></description> 
  </composition>

  <composition composer="c4">
    <title>Rictus En Mirroir</title>
    <date><year>1985</year></date> 
    <length>14'</length>
    <instruments>fl, ob, hpschd, vc</instruments>
  </composition>

</catalog>

An XSLT Style Sheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">



</xsl:stylesheet>

Let's use xt to apply this stylesheet to compositions.xml.

Windows executable:

C:> xt compositions.xml sheet1.xsl

Java executable:

C:> java -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.CommentDriver
com.jclark.xsl.sax.Driver compositions.xml sheet1.xsl output1.html 

Output

<?xml version="1.0" encoding="utf-8"?>

  
    Small chamber ensembles - 2-4 Players by New York Women Composers
  

  
    Compositions by the members of New York Women Composers
    music publishing
    scores
    women composers
    New York
  

  July 28, 1999
  1999 New York Women Composers
  
    
      Elliotte 
      Rusty 
      Harold
    
  

  
    
      Julie 
       
      Mandel
    
  

  
    
      Margaret 
      De 
      Wys
    
    
    
  
    
      Beth 
       
      Anderson
    
  
    
  
    
      Linda 
       
      Bouchard
    
  
    
  
    Trio for Flute, Viola and Harp
    1994 
    13'38"
    fl, hp, vla
    
      Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
      Christine Ims, and Susan Jolles. In 3 movements :
      
        mvt. 1: 5:01
        mvt. 2: 4:11
        mvt. 3: 4:26
        
    
    Theodore Presser
  

  
    Charmonium
    1991 
    9'
    2 vln, vla, vc
    
      Commissioned as quartet for the Meridian String Quartet. 
      Sonorous, bold. Moderate difficulty. Tape available.
     
  

  
    Invention for Flute and Piano
    1994 
    fl, pn
    3 movements 
  

  
    Little Trio
    1984 
    4'
    fl, guit, va
    ACA
  

  
    Dr. Blood's Mermaid Lullaby
    1980 
    3'
    fl or ob, or vn, or vc, pn
    ACA
  

  
    Trio: Dream in D
    1980 
    10'
    fl, pn, vc, or vn, pn, vc
    
      Rhapsodic. Passionate. Available on CD 
      
      
      Two by Three 
      from North/South Consonance (1998).
     
  

  
    Propos II
    1985 
    11'
    2 tpt
    Arrangement from Propos 
  

  
    Rictus En Mirroir
    1985 
    14'
    fl, ob, hpschd, vc
  


Templates

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="composition">
    <h3><xsl:value-of select="title"/></h3>
  </xsl:template>

</xsl:stylesheet>

The actual HTML output

<?xml version="1.0" encoding="utf-8"?>

  
    Small chamber ensembles - 2-4 Players by New York Women Composers
  

  
    Compositions by the members of New York Women Composers
    music publishing
    scores
    women composers
    New York
  

  July 28, 1999
  1999 New York Women Composers
  
    
      Elliotte 
      Rusty 
      Harold
    
  

  
    
      Julie 
       
      Mandel
    
  

  
    
      Margaret 
      De 
      Wys
    
    
    
  
    
      Beth 
       
      Anderson
    
  
    
  
    
      Linda 
       
      Bouchard
    
  
    
  <h3>Trio for Flute, Viola and Harp</h3>

  <h3>Charmonium</h3>

  <h3>Invention for Flute and Piano</h3>

  <h3>Little Trio</h3>

  <h3>Dr. Blood's Mermaid Lullaby</h3>

  <h3>Trio: Dream in D</h3>

  <h3>Propos II</h3>

  <h3>Rictus En Mirroir</h3>


Adding the root

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <body>        
        
      </body>
    </html>
  </xsl:template>

  <xsl:template match="composition">
    <h3><xsl:value-of select="title"/></h3>
  </xsl:template>

</xsl:stylesheet>

The actual HTML output

<html>
   <body></body>
</html>

Applying Templates

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="composition">
    <h3><xsl:value-of select="title"/></h3>
  </xsl:template>

</xsl:stylesheet>


The actual HTML output

<html>
   <body>
      
      
      Small chamber ensembles - 2-4 Players by New York Women Composers
      
      
      
      Compositions by the members of New York Women Composers
      music publishing
      scores
      women composers
      New York
      
      
      July 28, 1999
      1999 New York Women Composers
      
      
      Elliotte 
      Rusty 
      Harold
      
      
      
      
      
      Julie 
       
      Mandel
      
      
      
      
      
      Margaret 
      De 
      Wys
      
        
      
      
      
      Beth 
       
      Anderson
      
      
      
      
      
      Linda 
       
      Bouchard
      
      
      
      
      <h3>Trio for Flute, Viola and Harp</h3>
      
      
      <h3>Charmonium</h3>
      
      
      <h3>Invention for Flute and Piano</h3>
      
      
      <h3>Little Trio</h3>
      
      
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      
      
      <h3>Trio: Dream in D</h3>
      
      
      <h3>Propos II</h3>
      
      
      <h3>Rictus En Mirroir</h3>
      
      
   </body>
</html>

The select attribute

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
       <xsl:apply-templates select="catalog"/>
    </html>
  </xsl:template>

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <h1><xsl:value-of select="category"/></h1>      
       <xsl:apply-templates select="composition"/>
    </body>
  </xsl:template>

  <xsl:template match="composition">
    <h3><xsl:value-of select="title"/></h3>
  </xsl:template>

</xsl:stylesheet>

The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h3>Trio for Flute, Viola and Harp</h3>
      <h3>Charmonium</h3>
      <h3>Invention for Flute and Piano</h3>
      <h3>Little Trio</h3>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <h3>Trio: Dream in D</h3>
      <h3>Propos II</h3>
      <h3>Rictus En Mirroir</h3>
   </body>
</html>
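
The composer names vanish from this output because select="composition" visits only the composition children of catalog. The same selection can be sketched in Python with the standard ElementTree module, whose findall accepts a small subset of XPath; the catalog below is cut down for illustration:

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <category>Small chamber ensembles</category>
  <composer id="c2"><name><last_name>Wys</last_name></name></composer>
  <composition composer="c2"><title>Charmonium</title></composition>
  <composition composer="c3"><title>Little Trio</title></composition>
</catalog>"""

catalog = ET.fromstring(CATALOG)

# Like select="composition": only the composition children are visited,
# so the composer's text never reaches the output.
titles = [comp.findtext("title") for comp in catalog.findall("composition")]
print(titles)  # ['Charmonium', 'Little Trio']
```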

Expanding the coverage

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
       <xsl:apply-templates select="catalog"/>
    </html>
  </xsl:template>

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <h1><xsl:value-of select="category"/></h1>      
       <xsl:apply-templates select="composition"/>
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/>
    </body>
  </xsl:template>

  <xsl:template match="composition">
    <h3><xsl:value-of select="title"/></h3>

    <ul>
     <li><xsl:value-of select="date"/></li>
     <li><xsl:value-of select="length"/></li>
     <li><xsl:value-of select="instruments"/></li>
     <li><xsl:value-of select="publisher"/></li>    
    </ul>

    <p><xsl:value-of select="description"/></p>    
    
  </xsl:template>

</xsl:stylesheet>

The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
         Christine Ims, and Susan Jolles. In 3 movements :
         
         mvt. 1: 5:01
         mvt. 2: 4:11
         mvt. 3: 4:26
         
         
      </p>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
         <li></li>
      </ul>
      <p>
         Commissioned as quartet for the Meridian String Quartet. 
         Sonorous, bold. Moderate difficulty. Tape available.
         
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li></li>
         <li>fl, pn</li>
         <li></li>
      </ul>
      <p>3 movements</p>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
         <li></li>
      </ul>
      <p>
         Rhapsodic. Passionate. Available on CD 
         
         
         Two by Three 
         from North/South Consonance (1998).
         
      </p>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
         <li></li>
      </ul>
      <p>Arrangement from Propos</p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
         <li></li>
      </ul>
      <p></p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999
   </body>
</html>

Attributes
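
Attribute values are read with the @ sign in a select expression. A minimal sketch, assuming the maintainer element has a name child and carries its e-mail address in an email attribute (both names are assumptions for illustration, not confirmed by the input document):

  <xsl:template match="maintainer">
    <!-- string value of the (assumed) name child element -->
    <xsl:value-of select="name"/>
    <!-- @email selects the (assumed) email attribute of maintainer -->
    <br/><xsl:value-of select="@email"/>
  </xsl:template>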



The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
         Christine Ims, and Susan Jolles. In 3 movements :
         
         mvt. 1: 5:01
         mvt. 2: 4:11
         mvt. 3: 4:26
         
         
      </p>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
         <li></li>
      </ul>
      <p>
         Commissioned as quartet for the Meridian String Quartet. 
         Sonorous, bold. Moderate difficulty. Tape available.
         
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li></li>
         <li>fl, pn</li>
         <li></li>
      </ul>
      <p>3 movements</p>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
         <li></li>
      </ul>
      <p>
         Rhapsodic. Passionate. Available on CD 
         
         
         Two by Three 
         from North/South Consonance (1998).
         
      </p>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
         <li></li>
      </ul>
      <p>Arrangement from Propos</p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
         <li></li>
      </ul>
      <p></p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br>
      Elliotte 
      Rusty 
      Harold
      <br>elharo@metalab.unc.edu
   </body>
</html>

Attribute Value Templates
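
An xsl:value-of element cannot appear inside an attribute value, because the stylesheet would no longer be well-formed XML. Instead, an XPath expression in curly braces inside the attribute value of a literal result element is replaced by its string value. A minimal sketch, assuming the maintainer element has url and email attributes (hypothetical names):

  <xsl:template match="maintainer">
    <!-- {@url} and {@email} are evaluated as XPath expressions -->
    <a href="{@url}"><xsl:value-of select="name"/></a>
    <br/>
    <a href="mailto:{@email}"><xsl:value-of select="@email"/></a>
  </xsl:template>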



The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
         Christine Ims, and Susan Jolles. In 3 movements :
         
         mvt. 1: 5:01
         mvt. 2: 4:11
         mvt. 3: 4:26
         
         
      </p>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
         <li></li>
      </ul>
      <p>
         Commissioned as quartet for the Meridian String Quartet. 
         Sonorous, bold. Moderate difficulty. Tape available.
         
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li></li>
         <li>fl, pn</li>
         <li></li>
      </ul>
      <p>3 movements</p>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
         <li></li>
      </ul>
      <p>
         Rhapsodic. Passionate. Available on CD 
         
         
         Two by Three 
         from North/South Consonance (1998).
         
      </p>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
         <li></li>
      </ul>
      <p>Arrangement from Propos</p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
         <li></li>
      </ul>
      <p></p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

xsl:copy
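
xsl:copy copies the current node to the output, without its attributes or children; the nested xsl:apply-templates then processes the children. A sketch of one way to pass the p, ul, li, and cite elements through unchanged, with one template rule per element:

  <xsl:template match="p">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <xsl:template match="ul">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <xsl:template match="li">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <xsl:template match="cite">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>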



The generated HTML

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         
         <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
            Christine Ims, and Susan Jolles. In 3 movements :
         </p>
         
         <ul>
            
            <li>mvt. 1: 5:01</li>
            
            <li>mvt. 2: 4:11</li>
            
            <li>mvt. 3: 4:26</li>
            
         </ul>  
         
      </p>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
         <li></li>
      </ul>
      <p>
         
         <p>Commissioned as quartet for the Meridian String Quartet. 
            Sonorous, bold. Moderate difficulty. Tape available.
         </p>
         
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li></li>
         <li>fl, pn</li>
         <li></li>
      </ul>
      <p>
         <p>3 movements</p>
      </p>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
         <li></li>
      </ul>
      <p>
         
         <p>Rhapsodic. Passionate. Available on CD 
            <cite>
               
               Two by Three
            </cite> 
            from North/South Consonance (1998).
         </p>
         
      </p>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
         <li></li>
      </ul>
      <p>
         <p>Arrangement from Propos</p>
      </p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
         <li></li>
      </ul>
      <p></p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

The Or Operator in Match Patterns

Since all four template rules for the HTML elements have the same content, we can combine them into a single rule that applies to each of the four, using the or operator |:

  <xsl:template match="p|ul|li|cite">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>  

The output is the same

The * wildcard

Right now the descriptions in the input document only use a few HTML tags, but potentially they could use full HTML up to and including tables, images, styles, and more. You could include separate template rules for each of these, but it's easier to specify a rule that applies to all elements.

  <!-- pass unrecognized tags along unchanged -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>  

The * matches all elements that are not matched by a more specific rule. It only matches element nodes, though. It does not match text, attribute, comment, or processing instruction nodes.


The output is the same in this case, though for a document that used more HTML it might be different.
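
A rule that names an element has a higher default priority than the * wildcard, so the two can coexist in one stylesheet. A minimal sketch, in which mapping cite to i is purely a hypothetical choice for illustration:

  <!-- specific rule: wins over the wildcard for cite elements -->
  <xsl:template match="cite">
    <i><xsl:apply-templates/></i>
  </xsl:template>

  <!-- wildcard rule: handles every other element -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>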


The node() and @* wildcards

To copy everything, including attributes, comments, and processing instructions, we have to use greedier wildcards:

  <!-- pass unrecognized nodes along unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>  

xt doesn't yet recognize node() in match patterns


The output is the same in this case, though for a document that used more HTML it might be different.


Are we being too greedy?

Perhaps this is too greedy. Do we really only want to recognize HTML in the description element? What if somebody puts HTML in a different element, like instruments? What if somebody makes a mistake and adds an element that shouldn't be there?

I don't think so, but modes or other techniques could restrict this default rule so that it applies only inside the description element.
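
A minimal sketch of the modes approach, assuming the unmoded node()|@* rule is replaced by the moded one below. The mode name copy is a hypothetical choice; any name would do.

  <!-- switch into copy mode only for the contents of description -->
  <xsl:template match="description">
    <xsl:apply-templates mode="copy"/>
  </xsl:template>

  <!-- pass nodes along unchanged, but only while in copy mode -->
  <xsl:template match="node()|@*" mode="copy">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" mode="copy"/>
    </xsl:copy>
  </xsl:template>

With this arrangement, markup that turns up in other elements such as instruments is handled by the ordinary rules rather than copied through. The description element itself is no longer copied to the output, only its contents.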


Testing conditions with xsl:if

  <xsl:template match="composition">
    <h3><xsl:value-of select="title"/></h3>

    <ul>
     <xsl:if test="string(date)">
       <li><xsl:value-of select="date"/></li>
     </xsl:if>
     <xsl:if test="string(length)">
       <li><xsl:value-of select="length"/></li>
     </xsl:if>
     <xsl:if test="string(instruments)">
       <li><xsl:value-of select="instruments"/></li>
     </xsl:if>
     <xsl:if test="string(publisher)">
       <li><xsl:value-of select="publisher"/></li>
     </xsl:if>    
    </ul>

    <p><xsl:apply-templates select="description"/></p>    
    
  </xsl:template>
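
The test attribute is converted to a boolean. A node-set is true whenever it is non-empty, so test="publisher" alone would succeed even for an empty publisher element; wrapping it in string() makes the test true only for a non-empty string value. A sketch of a slightly stricter variant, shown here as an assumption about what one might want rather than as part of the stylesheet above:

  <xsl:template match="composition">
    <ul>
      <!-- test="publisher" is true whenever a publisher child exists,
           even an empty <publisher></publisher> -->
      <!-- test="string(publisher)" is true only for a non-empty value -->
      <!-- normalize-space() additionally rejects white-space-only content -->
      <xsl:if test="normalize-space(publisher)">
        <li><xsl:value-of select="publisher"/></li>
      </xsl:if>
    </ul>
  </xsl:template>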


The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         <description>
            
            <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
               Christine Ims, and Susan Jolles. In 3 movements :
            </p>
            
            <ul>
               
               <li>mvt. 1: 5:01</li>
               
               <li>mvt. 2: 4:11</li>
               
               <li>mvt. 3: 4:26</li>
               
            </ul>  
            
         </description>
      </p>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Commissioned as quartet for the Meridian String Quartet. 
               Sonorous, bold. Moderate difficulty. Tape available.
            </p>
            
         </description>
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li>fl, pn</li>
      </ul>
      <p>
         <description>
            <p>3 movements</p>
         </description>
      </p>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Rhapsodic. Passionate. Available on CD 
               <cite>
                  <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/" shape="rect">
                     Two by Three
                  </a>
               </cite> 
               from North/South Consonance (1998).
            </p>
            
         </description>
      </p>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
      </ul>
      <p>
         <description>
            <p>Arrangement from Propos</p>
         </description>
      </p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
      </ul>
      <p></p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

Matching composers with compositions

The composers and their compositions are linked through the id attribute of the composer element and the composer attribute of the composition element.

  <composer id="c3">
    <name>
      <first_name>Beth</first_name> 
      <middle_name></middle_name> 
      <last_name>Anderson</last_name>
    </name>
  </composer>
    
  <composition composer="c3">
    <title>Trio: Dream in D</title>
    <date><year>(1980)</year></date> 
    <length>10'</length>
    <instruments>fl, pn, vc, or vn, pn, vc</instruments>
    <description>
      Rhapsodic. Passionate. Available on CD 
      <cite><a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid%3D913265342/sr%3D1-2/">Two by Three</a></cite> 
      from North/South Consonance (1998).
    </description> 
    <publisher></publisher>
  </composition>

Matching composers with compositions

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <h1><xsl:value-of select="category"/></h1>      
       <xsl:apply-templates select="composer"/>
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/><br/>
       <xsl:apply-templates select="maintainer"/>
    </body>
  </xsl:template>
  
  <xsl:template match="composer">
    <h2><xsl:value-of select="name"/></h2>
    <xsl:apply-templates select="../composition[@composer=current()/@id]"/>
  </xsl:template>
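
Inside the predicate, the context node is each composition being tested, so a bare @id would refer to the composition rather than the composer. The current() function reaches back to the node the template is processing. A sketch of an equivalent formulation that saves the composer's id in a variable first (the variable name id is arbitrary):

  <xsl:template match="composer">
    <h2><xsl:value-of select="name"/></h2>
    <!-- here the context node is still the composer -->
    <xsl:variable name="id" select="@id"/>
    <!-- inside the predicate, @composer belongs to each composition -->
    <xsl:apply-templates select="../composition[@composer=$id]"/>
  </xsl:template>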


The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h2>
         Julie 
         
         Mandel
         
      </h2>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         <description>
            
            <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
               Christine Ims, and Susan Jolles. In 3 movements :
            </p>
            
            <ul>
               
               <li>mvt. 1: 5:01</li>
               
               <li>mvt. 2: 4:11</li>
               
               <li>mvt. 3: 4:26</li>
               
            </ul>  
            
         </description>
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li>fl, pn</li>
      </ul>
      <p>
         <description>
            <p>3 movements</p>
         </description>
      </p>
      <h2>
         Margaret 
         De 
         Wys
         
      </h2>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Commissioned as quartet for the Meridian String Quartet. 
               Sonorous, bold. Moderate difficulty. Tape available.
            </p>
            
         </description>
      </p>
      <h2>
         Beth 
         
         Anderson
         
      </h2>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Rhapsodic. Passionate. Available on CD 
               <cite>
                  <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/" shape="rect">
                     Two by Three
                  </a>
               </cite> 
               from North/South Consonance (1998).
            </p>
            
         </description>
      </p>
      <h2>
         Linda 
         
         Bouchard
         
      </h2>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
      </ul>
      <p>
         <description>
            <p>Arrangement from Propos</p>
         </description>
      </p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
      </ul>
      <p></p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

Sorting with xsl:sort

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <h1><xsl:value-of select="category"/></h1>      
       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
       </xsl:apply-templates>
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/><br/>
       <xsl:apply-templates select="maintainer"/>
    </body>
  </xsl:template>
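
By default xsl:sort compares the keys as text in ascending order. A sketch of reversing the order with the order attribute, as a fragment to be used in place of the xsl:apply-templates element above:

  <!-- composers from Z to A by last name -->
  <xsl:apply-templates select="composer">
    <xsl:sort select="name/last_name" order="descending"/>
  </xsl:apply-templates>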



The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h2>
         Beth 
         
         Anderson
         
      </h2>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Rhapsodic. Passionate. Available on CD 
               <cite>
                  <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/" shape="rect">
                     Two by Three
                  </a>
               </cite> 
               from North/South Consonance (1998).
            </p>
            
         </description>
      </p>
      <h2>
         Linda 
         
         Bouchard
         
      </h2>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
      </ul>
      <p>
         <description>
            <p>Arrangement from Propos</p>
         </description>
      </p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
      </ul>
      <p></p>
      <h2>
         Julie 
         
         Mandel
         
      </h2>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         <description>
            
            <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
               Christine Ims, and Susan Jolles. In 3 movements :
            </p>
            
            <ul>
               
               <li>mvt. 1: 5:01</li>
               
               <li>mvt. 2: 4:11</li>
               
               <li>mvt. 3: 4:26</li>
               
            </ul>  
            
         </description>
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li>fl, pn</li>
      </ul>
      <p>
         <description>
            <p>3 movements</p>
         </description>
      </p>
      <h2>
         Margaret 
         De 
         Wys
         
      </h2>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Commissioned as quartet for the Meridian String Quartet. 
               Sonorous, bold. Moderate difficulty. Tape available.
            </p>
            
         </description>
      </p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

Multiple Key Sorts

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <h1><xsl:value-of select="category"/></h1>      
       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
         <xsl:sort select="name/first_name"/>
         <xsl:sort select="name/middle_name"/>
       </xsl:apply-templates>
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/><br/>
       <xsl:apply-templates select="maintainer"/>
    </body>
  </xsl:template>


The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h2>
         Beth 
         
         Anderson
         
      </h2>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Rhapsodic. Passionate. Available on CD 
               <cite>
                  <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/" shape="rect">
                     Two by Three
                  </a>
               </cite> 
               from North/South Consonance (1998).
            </p>
            
         </description>
      </p>
      <h2>
         Linda 
         
         Bouchard
         
      </h2>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
      </ul>
      <p>
         <description>
            <p>Arrangement from Propos</p>
         </description>
      </p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
      </ul>
      <p></p>
      <h2>
         Julie 
         
         Mandel
         
      </h2>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         <description>
            
            <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
               Christine Ims, and Susan Jolles. In 3 movements :
            </p>
            
            <ul>
               
               <li>mvt. 1: 5:01</li>
               
               <li>mvt. 2: 4:11</li>
               
               <li>mvt. 3: 4:26</li>
               
            </ul>  
            
         </description>
      </p>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li>fl, pn</li>
      </ul>
      <p>
         <description>
            <p>3 movements</p>
         </description>
      </p>
      <h2>
         Margaret 
         De 
         Wys
         
      </h2>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Commissioned as quartet for the Meridian String Quartet. 
               Sonorous, bold. Moderate difficulty. Tape available.
            </p>
            
         </description>
      </p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

Sorting by Composition Title

Sorting by composition title is equally straightforward, but we have to do it in a separate apply-templates element:

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <h1><xsl:value-of select="category"/></h1>      
       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
         <xsl:sort select="name/first_name"/>
         <xsl:sort select="name/middle_name"/>
       </xsl:apply-templates>
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/><br/>
       <xsl:apply-templates select="maintainer"/>
    </body>
  </xsl:template>
  
  <xsl:template match="composer">
    <h2><xsl:value-of select="name"/></h2>
    <xsl:apply-templates select="../composition[@composer=current()/@id]">
       <xsl:sort select="title"/>      
    </xsl:apply-templates>
  </xsl:template>


The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <h2>
         Beth 
         
         Anderson
         
      </h2>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Rhapsodic. Passionate. Available on CD 
               <cite>
                  <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/" shape="rect">
                     Two by Three
                  </a>
               </cite> 
               from North/South Consonance (1998).
            </p>
            
         </description>
      </p>
      <h2>
         Linda 
         
         Bouchard
         
      </h2>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
      </ul>
      <p>
         <description>
            <p>Arrangement from Propos</p>
         </description>
      </p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
      </ul>
      <p></p>
      <h2>
         Julie 
         
         Mandel
         
      </h2>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li>fl, pn</li>
      </ul>
      <p>
         <description>
            <p>3 movements</p>
         </description>
      </p>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         <description>
            
            <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
               Christine Ims, and Susan Jolles. In 3 movements :
            </p>
            
            <ul>
               
               <li>mvt. 1: 5:01</li>
               
               <li>mvt. 2: 4:11</li>
               
               <li>mvt. 3: 4:26</li>
               
            </ul>  
            
         </description>
      </p>
      <h2>
         Margaret 
         De 
         Wys
         
      </h2>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Commissioned as quartet for the Meridian String Quartet. 
               Sonorous, bold. Moderate difficulty. Tape available.
            </p>
            
         </description>
      </p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

xsl:for-each

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <!-- Header -->            
       <h1><xsl:value-of select="category"/></h1>
       <ul>
         <xsl:for-each select="composition">
           <li><xsl:value-of select="title"/></li>
         </xsl:for-each>
       </ul>
       
       <!-- Body -->            
       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
         <xsl:sort select="name/first_name"/>
         <xsl:sort select="name/middle_name"/>
       </xsl:apply-templates>
       
       <!-- Signature -->      
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/><br/>
       <xsl:apply-templates select="maintainer"/>
    </body>
  </xsl:template>

The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <ul>
         <li>Trio for Flute, Viola and Harp</li>
         <li>Charmonium</li>
         <li>Invention for Flute and Piano</li>
         <li>Little Trio</li>
         <li>Dr. Blood's Mermaid Lullaby</li>
         <li>Trio: Dream in D</li>
         <li>Propos II</li>
         <li>Rictus En Mirroir</li>
      </ul>
      <h2>
         Beth 
         
         Anderson
         
      </h2>
      <h3>Dr. Blood's Mermaid Lullaby</h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Little Trio</h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3>Trio: Dream in D</h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Rhapsodic. Passionate. Available on CD 
               <cite>
                  <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/" shape="rect">
                     Two by Three
                  </a>
               </cite> 
               from North/South Consonance (1998).
            </p>
            
         </description>
      </p>
      <h2>
         Linda 
         
         Bouchard
         
      </h2>
      <h3>Propos II</h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
      </ul>
      <p>
         <description>
            <p>Arrangement from Propos</p>
         </description>
      </p>
      <h3>Rictus En Mirroir</h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
      </ul>
      <p></p>
      <h2>
         Julie 
         
         Mandel
         
      </h2>
      <h3>Invention for Flute and Piano</h3>
      <ul>
         <li>1994</li>
         <li>fl, pn</li>
      </ul>
      <p>
         <description>
            <p>3 movements</p>
         </description>
      </p>
      <h3>Trio for Flute, Viola and Harp</h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         <description>
            
            <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
               Christine Ims, and Susan Jolles. In 3 movements :
            </p>
            
            <ul>
               
               <li>mvt. 1: 5:01</li>
               
               <li>mvt. 2: 4:11</li>
               
               <li>mvt. 3: 4:26</li>
               
            </ul>  
            
         </description>
      </p>
      <h2>
         Margaret 
         De 
         Wys
         
      </h2>
      <h3>Charmonium</h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Commissioned as quartet for the Meridian String Quartet. 
               Sonorous, bold. Moderate difficulty. Tape available.
            </p>
            
         </description>
      </p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

Sorting the Table of Contents

xsl:for-each can have an xsl:sort child just like xsl:apply-templates

  <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <!-- Header -->            
       <h1><xsl:value-of select="category"/></h1>
       <ul>
         <xsl:for-each select="composition">
           <xsl:sort select="title"/>
           <li><xsl:value-of select="title"/></li>
         </xsl:for-each>
       </ul>
       
       <!-- Body -->            
       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
         <xsl:sort select="name/first_name"/>
         <xsl:sort select="name/middle_name"/>
       </xsl:apply-templates>
       
       <!-- Signature -->      
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/><br/>
       <xsl:apply-templates select="maintainer"/>
    </body>
  </xsl:template>

Adding Links from the Table of Contents

Here's the new template for the composition

  <xsl:template match="composition">
    <h3>
      <a name="{generate-id()}">
        <xsl:value-of select="title"/>
      </a>
    </h3>

    <ul>
     <xsl:if test="string(date)">
       <li><xsl:value-of select="date"/></li>
     </xsl:if>
     <xsl:if test="string(length)">
       <li><xsl:value-of select="length"/></li>
     </xsl:if>
     <xsl:if test="string(instruments)">
       <li><xsl:value-of select="instruments"/></li>
     </xsl:if>
     <xsl:if test="string(publisher)">
       <li><xsl:value-of select="publisher"/></li>
     </xsl:if>    
    </ul>

    <p><xsl:apply-templates select="description"/></p>    
    
  </xsl:template>

Here's the new template for the table of contents links

   <xsl:template match="catalog">
    <head>
       <title><xsl:value-of select="category"/></title>      
    </head>
    <body>
       <!-- Header -->            
       <h1><xsl:value-of select="category"/></h1>
       <ul>
         <xsl:for-each select="composition">
           <xsl:sort select="title"/>
           <li>
             <a href="#{generate-id()}">
               <xsl:value-of select="title"/>
             </a>
           </li>
         </xsl:for-each>
       </ul>
       
       <!-- Body -->            
       <xsl:apply-templates select="composer">
         <xsl:sort select="name/last_name"/>
         <xsl:sort select="name/first_name"/>
         <xsl:sort select="name/middle_name"/>
       </xsl:apply-templates>
       
       <!-- Signature -->      
       <hr/>
       Copyright <xsl:value-of select="copyright"/><br/>
       Last Modified: <xsl:value-of select="last_updated"/><br/>
       <xsl:apply-templates select="maintainer"/>
    </body>
  </xsl:template> 

Although generate-id() is called in two separate templates, both calls are made with the same composition node as the context node. Consequently, they return the same ID, and each link matches its anchor.
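
This can be checked outside a browser. What follows is a minimal sketch, assuming the JAXP transformation API (javax.xml.transform) is available. It runs a cut-down version of the two templates against a one-composition catalog. The class name GenerateIdDemo and the inline document are made up for illustration, and the ID string itself varies from processor to processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class GenerateIdDemo {

  // Hypothetical one-composition catalog, not the full document from the slides
  public static final String XML =
      "<catalog><composition><title>Charmonium</title></composition></catalog>";

  // Cut-down versions of the catalog and composition templates
  public static final String XSL =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
    + "<xsl:template match='catalog'><body>"
    + "<xsl:for-each select='composition'>"
    + "<a href='#{generate-id()}'><xsl:value-of select='title'/></a>"
    + "</xsl:for-each>"
    + "<xsl:apply-templates select='composition'/>"
    + "</body></xsl:template>"
    + "<xsl:template match='composition'>"
    + "<a name='{generate-id()}'><xsl:value-of select='title'/></a>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  // Applies the style sheet to the document and returns the result as a string
  public static String transform(String xml, String xsl)
      throws TransformerException {
    Transformer transformer = TransformerFactory.newInstance()
        .newTransformer(new StreamSource(new StringReader(xsl)));
    StringWriter out = new StringWriter();
    transformer.transform(new StreamSource(new StringReader(xml)),
        new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws TransformerException {
    // The href value (after the #) and the name value should be identical
    System.out.println(transform(XML, XSL));
  }

}
```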


The actual HTML output

<html>
   <head>
      <meta http-equiv="Content-Type" content="application/xml; charset=utf-8">
   
      <title>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </title>
   </head>
   <body>
      <h1>
         Small chamber ensembles - 2-4 Players by New York Women Composers
         
      </h1>
      <ul>
         <li><a href="#b1ac21">Charmonium</a></li>
         <li><a href="#b1ac27">Dr. Blood's Mermaid Lullaby</a></li>
         <li><a href="#b1ac23">Invention for Flute and Piano</a></li>
         <li><a href="#b1ac25">Little Trio</a></li>
         <li><a href="#b1ac31">Propos II</a></li>
         <li><a href="#b1ac33">Rictus En Mirroir</a></li>
         <li><a href="#b1ac19">Trio for Flute, Viola and Harp</a></li>
         <li><a href="#b1ac29">Trio: Dream in D</a></li>
      </ul>
      <h2>
         Beth 
         
         Anderson
         
      </h2>
      <h3><a name="b1ac27">Dr. Blood's Mermaid Lullaby</a></h3>
      <ul>
         <li>1980</li>
         <li>3'</li>
         <li>fl or ob, or vn, or vc, pn</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3><a name="b1ac25">Little Trio</a></h3>
      <ul>
         <li>1984</li>
         <li>4'</li>
         <li>fl, guit, va</li>
         <li>ACA</li>
      </ul>
      <p></p>
      <h3><a name="b1ac29">Trio: Dream in D</a></h3>
      <ul>
         <li>1980</li>
         <li>10'</li>
         <li>fl, pn, vc, or vn, pn, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Rhapsodic. Passionate. Available on CD 
               <cite>
                  <a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr=1-2/" shape="rect">
                     Two by Three
                  </a>
               </cite> 
               from North/South Consonance (1998).
            </p>
            
         </description>
      </p>
      <h2>
         Linda 
         
         Bouchard
         
      </h2>
      <h3><a name="b1ac31">Propos II</a></h3>
      <ul>
         <li>1985</li>
         <li>11'</li>
         <li>2 tpt</li>
      </ul>
      <p>
         <description>
            <p>Arrangement from Propos</p>
         </description>
      </p>
      <h3><a name="b1ac33">Rictus En Mirroir</a></h3>
      <ul>
         <li>1985</li>
         <li>14'</li>
         <li>fl, ob, hpschd, vc</li>
      </ul>
      <p></p>
      <h2>
         Julie 
         
         Mandel
         
      </h2>
      <h3><a name="b1ac23">Invention for Flute and Piano</a></h3>
      <ul>
         <li>1994</li>
         <li>fl, pn</li>
      </ul>
      <p>
         <description>
            <p>3 movements</p>
         </description>
      </p>
      <h3><a name="b1ac19">Trio for Flute, Viola and Harp</a></h3>
      <ul>
         <li>1994</li>
         <li>13'38"</li>
         <li>fl, hp, vla</li>
         <li>Theodore Presser</li>
      </ul>
      <p>
         <description>
            
            <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
               Christine Ims, and Susan Jolles. In 3 movements :
            </p>
            
            <ul>
               
               <li>mvt. 1: 5:01</li>
               
               <li>mvt. 2: 4:11</li>
               
               <li>mvt. 3: 4:26</li>
               
            </ul>  
            
         </description>
      </p>
      <h2>
         Margaret 
         De 
         Wys
         
      </h2>
      <h3><a name="b1ac21">Charmonium</a></h3>
      <ul>
         <li>1991</li>
         <li>9'</li>
         <li>2 vln, vla, vc</li>
      </ul>
      <p>
         <description>
            
            <p>Commissioned as quartet for the Meridian String Quartet. 
               Sonorous, bold. Moderate difficulty. Tape available.
            </p>
            
         </description>
      </p>
      <hr>
      Copyright 1999 New York Women Composers<br>
      Last Modified: July 28, 1999<br><a href="http://www.macfaq.com/personal.html">
         Elliotte 
         Rusty 
         Harold
         
      </a><br><a href="mailto:elharo@metalab.unc.edu">elharo@metalab.unc.edu</a></body>
</html>

Numbering Output

  <xsl:template match="composition">
    <h3><xsl:number value="position()"/>.
      <a name="{generate-id()}">
        <xsl:value-of select="title"/>
      </a>
    </h3>

    <ul>
     <xsl:if test="string(date)">
       <li><xsl:value-of select="date"/></li>
     </xsl:if>
     <xsl:if test="string(length)">
       <li><xsl:value-of select="length"/></li>
     </xsl:if>
     <xsl:if test="string(instruments)">
       <li><xsl:value-of select="instruments"/></li>
     </xsl:if>
     <xsl:if test="string(publisher)">
       <li><xsl:value-of select="publisher"/></li>
     </xsl:if>    
    </ul>

    <p><xsl:apply-templates select="description"/></p>    
    
  </xsl:template>

String functions


substring()

  <xsl:template match="composition">
    <h3><xsl:number value="position()"/>.
      <a name="{generate-id()}">
        <xsl:value-of select="title"/>
      </a>
    </h3>

    <ul>
     <xsl:if test="string(date)">
       <!--not Y10K safe! -->
       <li><xsl:value-of select="substring(date,2,4)"/></li>
     </xsl:if>
     <xsl:if test="string(length)">
       <li><xsl:value-of select="length"/></li>
     </xsl:if>
     <xsl:if test="string(instruments)">
       <li><xsl:value-of select="instruments"/></li>
     </xsl:if>
     <xsl:if test="string(publisher)">
       <li><xsl:value-of select="publisher"/></li>
     </xsl:if>    
    </ul>

    <p><xsl:apply-templates select="description"/></p>    
    
  </xsl:template>

The actual HTML output

<html></html>
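
XPath's substring(string, start, length) counts characters from 1, not 0, so a start of 2 skips the first character. A small sketch of this, assuming the javax.xml.xpath API is available. The class name SubstringDemo and the date value are hypothetical and chosen only to make the indexing visible.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class SubstringDemo {

  // Hypothetical composition with a full date; the real catalog may differ
  public static final String XML =
      "<composition><date>1994-04-17</date></composition>";

  // Evaluates an XPath expression against XML and returns its string value
  public static String evaluate(String expression)
      throws XPathExpressionException {
    XPath xpath = XPathFactory.newInstance().newXPath();
    return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
  }

  public static void main(String[] args) throws XPathExpressionException {
    // Characters 1 through 4
    System.out.println(evaluate("substring(/composition/date, 1, 4)")); // 1994
    // Characters 2 through 5
    System.out.println(evaluate("substring(/composition/date, 2, 4)")); // 994-
  }

}
```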

Number Operators


Number Functions

XPath includes five functions that operate on numbers: number(), sum(), floor(), ceiling(), and round().
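
A brief sketch of the XPath 1.0 number functions (number, sum, floor, ceiling, round) together with the div and mod operators, assuming the javax.xml.xpath API is available. The movements document is hypothetical: it restates the three movement timings of the Trio for Flute, Viola and Harp in seconds so that they can be summed. The class name XPathNumberDemo is also made up.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathNumberDemo {

  // Hypothetical input: 5:01, 4:11 and 4:26 expressed in seconds
  public static final String XML =
      "<movements><seconds>301</seconds><seconds>251</seconds>"
    + "<seconds>266</seconds></movements>";

  // Evaluates an XPath expression against XML and returns it as a number
  public static double evaluate(String expression)
      throws XPathExpressionException {
    XPath xpath = XPathFactory.newInstance().newXPath();
    InputSource source = new InputSource(new StringReader(XML));
    Double result =
        (Double) xpath.evaluate(expression, source, XPathConstants.NUMBER);
    return result.doubleValue();
  }

  public static void main(String[] args) throws XPathExpressionException {
    String[] expressions = {
      "sum(/movements/seconds)",               // total length in seconds
      "floor(sum(/movements/seconds) div 60)", // whole minutes
      "sum(/movements/seconds) mod 60",        // leftover seconds
      "ceiling(4.11)",
      "round(2.5)",
      "number('1994') + 1"
    };
    for (int i = 0; i < expressions.length; i++) {
      System.out.println(expressions[i] + " = " + evaluate(expressions[i]));
    }
  }

}
```

The first three expressions come to 818 seconds, 13 minutes and 38 seconds, which agrees with the 13'38" listed for the piece.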


Where Does the Transformation Happen?

There are three primary ways XML documents are transformed into other formats, such as HTML, with an XSLT style sheet:


Client Side Processing

<?xml version="1.0"?>
<?xml-stylesheet type="text/xml" href="compositions.xsl"?>

A style sheet that works with Internet Explorer 5.0 and 5.5

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

  <xsl:template match="/">
    <html>
      <xsl:apply-templates select="catalog"/>
    </html>
  </xsl:template>

  <xsl:template match="catalog">
    <body>
      <xsl:apply-templates select="composition"/>
    </body>
  </xsl:template>

  <xsl:template match="composition">
    <h3><xsl:value-of select="name"/></h3>
  </xsl:template>

</xsl:stylesheet>

What else does XSLT have?


What does XSLT not have?


Summary


To Learn More


Part V: Programming with XML


Several APIs to choose from


SAX


SAX2


The SAX Process


Parsing a Document with XMLReader

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;


public class SAX2Checker {

  public static void main(String[] args) {
    
    if (args.length == 0) {
      System.out.println("Usage: java SAX2Checker URL1 URL2..."); 
      return;
    } 
    
    // set up the parser 
    XMLReader parser;
    try {
      parser = XMLReaderFactory.createXMLReader();
    } 
    catch (SAXException e) {
      try {
        parser = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
      }
      catch (SAXException e2) {
        System.err.println("Error: could not locate a parser.");
        return;
      }
    }
     
    // start parsing... 
    for (int i = 0; i < args.length; i++) {
      
      // command line should offer URIs or file names
      try {
        parser.parse(args[i]);
        // If there are no well-formedness errors
        // then no exception is thrown
        System.out.println(args[i] + " is well formed.");
      }
      catch (SAXParseException e) { // well-formedness error
        System.out.println(args[i] + " is not well formed.");
        System.out.println(e.getMessage()
         + " at line " + e.getLineNumber() 
         + ", column " + e.getColumnNumber());
      }
      catch (SAXException e) { // some other kind of error
        System.out.println(e.getMessage());
      }
      catch (IOException e) {
        System.out.println("Could not check " + args[i] 
         + " because of the IOException " + e);
      }
      
    }  
  
  }

}

The ContentHandler interface

package org.xml.sax;


public interface ContentHandler {

    public void setDocumentLocator(Locator locator);
    
    public void startDocument() throws SAXException;
    
    public void endDocument() throws SAXException;
    
    public void startPrefixMapping(String prefix, String uri) 
     throws SAXException;

    public void endPrefixMapping(String prefix) throws SAXException;

    public void startElement(String namespaceURI, String localName,
		 String rawName, Attributes atts) throws SAXException;

    public void endElement(String namespaceURI, String localName,
     String rawName) throws SAXException;

    public void characters(char[] ch, int start, int length) 
     throws SAXException;

    public void ignorableWhitespace(char ch[], int start, int length)
     throws SAXException;

    public void processingInstruction(String target, String data)
     throws SAXException;

    public void skippedEntity(String name) throws SAXException;
     
}

SAX Example

import org.apache.xerces.parsers.*;
import org.xml.sax.*;
import java.io.IOException;
import java.util.StringTokenizer;


public class SAXWordCount implements ContentHandler {

  private int numWords;
    
  public void startDocument() throws SAXException {
    this.numWords = 0; 
  }

  public void endDocument() throws SAXException {
    System.out.println(numWords + " words");
    System.out.flush();
  }
  
  private StringBuffer sb = new StringBuffer();
  
  public void characters(char[] text, int start, int length) 
   throws SAXException {
    
    sb.append(text, start, length);
    
  }
  
  private void flush() {
    numWords += countWords(sb.toString());
    sb = new StringBuffer();    
  }
  
  // methods that signify a word break
  public void startElement(String namespaceURI, String localName,
	 String rawName, Attributes atts) throws SAXException {
    this.flush(); 
  }
  
  public void endElement(String namespaceURI, String localName,
	 String rawName) throws SAXException {
    this.flush(); 
  }
  
  public void processingInstruction(String target, String data)
   throws SAXException {
    this.flush(); 
  }

  // methods that aren't necessary in this example
  public void startPrefixMapping(String prefix, String uri) 
   throws SAXException {
    // ignore; 
  }

  public void ignorableWhitespace(char[] text, int start, int length)
   throws SAXException {
    // ignore; 
  }
  
  public void endPrefixMapping(String prefix) throws SAXException {
    // ignore; 
  }

  public void skippedEntity(String name) throws SAXException {
    // ignore; 
  }   
  
  public void setDocumentLocator(Locator locator) {}

  private static int countWords(String s) {
    
    if (s == null) return 0;
    s = s.trim();
    if (s.length() == 0) return 0;
    
    StringTokenizer st = new StringTokenizer(s);
    return st.countTokens();
    
  } 

  public static void main(String[] args) {
     
    SAXParser parser = new SAXParser();
    SAXWordCount counter = new SAXWordCount();
    parser.setContentHandler(counter);
    
    for (int i = 0; i < args.length; i++) {
      try {
        parser.parse(args[i]); 
      }
      catch (SAXException e) {
        System.err.println(e); 
      }
      catch (IOException e) {
        System.err.println(e); 
      }
      
    }
  
  } // end main

}
% java SAXWordCount hotcop.xml
16 words
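
The example above implements every ContentHandler method, including the ones it ignores. A possible alternative, sketched here, extends org.xml.sax.helpers.DefaultHandler, which supplies do-nothing versions of all the callbacks, and gets its parser from JAXP's SAXParserFactory instead of naming the Xerces class directly. The class name DefaultHandlerWordCount and the count() helper are illustrative, not part of any library.

```java
import java.io.IOException;
import java.util.StringTokenizer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class DefaultHandlerWordCount extends DefaultHandler {

  private int numWords;
  private final StringBuilder text = new StringBuilder();

  public int getNumWords() {
    return numWords;
  }

  @Override
  public void startDocument() {
    numWords = 0;
    text.setLength(0);
  }

  // Text may arrive in several chunks, so buffer it until a word break
  @Override
  public void characters(char[] ch, int start, int length) {
    text.append(ch, start, length);
  }

  // Tags and processing instructions are treated as word breaks
  @Override
  public void startElement(String namespaceURI, String localName,
      String qName, Attributes atts) {
    flush();
  }

  @Override
  public void endElement(String namespaceURI, String localName, String qName) {
    flush();
  }

  @Override
  public void processingInstruction(String target, String data) {
    flush();
  }

  // Counts the buffered words and empties the buffer
  private void flush() {
    numWords += new StringTokenizer(text.toString()).countTokens();
    text.setLength(0);
  }

  // Parses one document and returns its word count
  public static int count(InputSource in)
      throws SAXException, IOException, ParserConfigurationException {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setNamespaceAware(true);
    XMLReader reader = factory.newSAXParser().getXMLReader();
    DefaultHandlerWordCount handler = new DefaultHandlerWordCount();
    reader.setContentHandler(handler);
    reader.parse(in);
    return handler.getNumWords();
  }

  public static void main(String[] args) {
    for (int i = 0; i < args.length; i++) {
      try {
        System.out.println(count(new InputSource(args[i])) + " words");
      }
      catch (SAXException | IOException | ParserConfigurationException e) {
        System.err.println(e);
      }
    }
  }

}
```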

Event Based API Caveats


Document Object Model


The Design of the DOM API


DOM Evolution


Eight Modules:


DOM Trees


org.w3c.dom


The DOM Process


Parsing documents with a DOM Parser Example

import org.apache.xerces.parsers.DOMParser;
import org.xml.sax.SAXException;
import java.io.IOException;
import org.w3c.dom.*;


public class DOMChecker {

  public static void main(String[] args) {
     
    // This is simpler but less flexible than the SAX approach.
    // Perhaps a good creational design pattern is needed here?   
  
    DOMParser parser = new DOMParser();
    
    for (int i = 0; i < args.length; i++) {
      try {
        // Read the entire document into memory
        parser.parse(args[i]); 
       
        Document d = parser.getDocument();
        // work with the document...
      }
      catch (SAXException e) {
        System.err.println(e); 
      }
      catch (IOException e) {
        System.err.println(e); 
      }
      
    }
  
  }

}
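
One possible answer to the question in the comment above is a factory. The sketch below assumes JAXP's DocumentBuilderFactory is available, which hides the choice of parser class so that the code does not name org.apache.xerces.parsers.DOMParser. The class name JAXPDOMChecker and the parse() helper are illustrative.

```java
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class JAXPDOMChecker {

  // Reads the entire document into memory using whichever parser
  // the factory locates
  public static Document parse(InputSource in)
      throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    DocumentBuilder builder = factory.newDocumentBuilder();
    return builder.parse(in);
  }

  public static void main(String[] args) {

    for (int i = 0; i < args.length; i++) {
      try {
        Document d = parse(new InputSource(args[i]));
        System.out.println(args[i] + " is well formed; root element is "
         + d.getDocumentElement().getTagName());
      }
      catch (SAXException e) { // well-formedness or other parse error
        System.err.println(args[i] + " is not well formed: " + e.getMessage());
      }
      catch (IOException e) {
        System.err.println(e);
      }
      catch (ParserConfigurationException e) {
        System.err.println("Could not configure a parser: " + e);
      }
    }

  }

}
```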

DOM Example

import org.apache.xerces.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;
import java.io.IOException;
import java.util.StringTokenizer;


public class DOMWordCount {

  public static void main(String[] args) {
     
    DOMParser parser = new DOMParser();
    
    for (int i = 0; i < args.length; i++) {
      try {
        // Read the entire document into memory
        parser.parse(args[i]); 
       
        Document d = parser.getDocument();
        int numWords = countWordsInNode(d);
        System.out.println(numWords + " words");

      }
      catch (SAXException e) {
        System.err.println(e); 
      }
      catch (IOException e) {
        System.err.println(e); 
      }
      
    }
  
  } // end main

  // note use of recursion
  public static int countWordsInNode(Node node) {
    
    int numWords = 0;
    
    if (node.hasChildNodes()) {
      NodeList children = node.getChildNodes();
      for (int i = 0; i < children.getLength(); i++) {
        numWords += countWordsInNode(children.item(i));
      } 
    }  

    int type = node.getNodeType();
    if (type == Node.TEXT_NODE) {
      String s = node.getNodeValue();
      numWords += countWordsInString(s);
    }
    
    return numWords;  
    
  }
  
  private static int countWordsInString(String s) {
    
    if (s == null) return 0;
    s = s.trim();
    if (s.length() == 0) return 0;
    
    StringTokenizer st = new StringTokenizer(s);
    return st.countTokens();
    
  } 

}
% java DOMWordCount hotcop.xml
16 words

JDOM


The JDOM Process


Parsing a Document with JDOM

import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;


public class JDOMChecker {

  public static void main(String[] args) {
  
    if (args.length == 0) {
      System.out.println("Usage: java JDOMChecker URL1 URL2..."); 
      return;
    } 
      
    SAXBuilder builder = new SAXBuilder();
     
    // start parsing... 
    for (int i = 0; i < args.length; i++) {
      
      // command line should offer URIs or file names
      try {
        builder.build(args[i]);
        // If there are no well-formedness errors, 
        // then no exception is thrown
        System.out.println(args[i] + " is well formed.");
      }
      catch (JDOMException e) { // indicates a well-formedness or other error
        System.out.println(args[i] + " is not well formed.");
        System.out.println(e.getMessage());
      }
      
    }   
  
  }

}

Parser Results

% java JDOMChecker shortlogs.xml HelloJDOM.java
shortlogs.xml is well formed.
HelloJDOM.java is not well formed.
The markup in the document preceding the root element must be well-formed.: 
Error on line 1 of XML document: The markup in the document preceding the 
root element must be well-formed.

JDOM Example

import org.jdom.*;
import org.jdom.input.SAXBuilder;
import java.util.*;


public class JDOMWordCount {

  public static void main(String[] args) {
  
    if (args.length == 0) {
      System.out.println("Usage: java JDOMWordCount URL1 URL2..."); 
      return;
    } 
      
    SAXBuilder builder = new SAXBuilder();
     
    // start parsing... 
    for (int i = 0; i < args.length; i++) {
      
      // command line should offer URIs or file names
      try {
        Document doc = builder.build(args[i]);
        Element root = doc.getRootElement();
        int numWords = countWordsInElement(root);
        System.out.println(numWords + " words");

      }
      catch (JDOMException e) { // indicates a well-formedness or other error
        System.out.println(args[i] + " is not well formed.");
        System.out.println(e.getMessage());
      }
      
    }   
  
  }

  public static int countWordsInElement(Element element) {
    
    int numWords = 0;
    
    List children = element.getMixedContent();
    Iterator iterator = children.iterator();
    while (iterator.hasNext()) {
      Object o = iterator.next();
      if (o instanceof String) {
        numWords += countWordsInString((String) o);
      } 
      else if (o instanceof Element) {
        // note use of recursion
        numWords += countWordsInElement((Element) o); 
      } 
    }
    
    return numWords;  
    
  }

  private static int countWordsInString(String s) {
    
    if (s == null) return 0;
    s = s.trim();
    if (s.length() == 0) return 0;
    
    StringTokenizer st = new StringTokenizer(s);
    return st.countTokens();
    
  }

}
% java JDOMWordCount hotcop.xml
16 words

XML and Databases


Integrating XML with Databases


Middleware


Database Exchange and Integration


To Learn More


Copyright 2000, 2001 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified February 18, 2001