XML Basics

XML Basics

Elliotte Rusty Harold

SD2000 East October 31, 2000

elharo@metalab.unc.edu

http://metalab.unc.edu/xml/


What is XML?


XML is a Meta Markup Language


XML describes structure and semantics, not formatting


A Song Description in HTML

<dt>Hot Cop
<dd> by Jacques Morali, Henri Belolo, and Victor Willis
<ul>
<li>Producer: Jacques Morali
<li>Publisher: PolyGram Records
<li>Length: 6:20
<li>Written: 1978
<li>Artist: Village People
</ul>
View Document in Browser

A Song Description in XML

<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
View Document in Browser

Editing and Saving XML Files


The XML Declaration

<?xml version="1.0" standalone="yes" encoding="UTF-8"?>
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

The TITLE element

<TITLE>Hot Cop</TITLE>


The SONG element

<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

Style Sheets provide formatting

SONG {display: block; font-family: New York, Times New Roman, serif}
TITLE {display: block; font-size: 24pt; 
       font-weight: bold; font-family: Helvetica, sans}
COMPOSER {display: block}
PRODUCER {display: block}
YEAR {display: block}
PUBLISHER {display: block}
LENGTH {display: block}
ARTIST {display: block; font-style: italic}

Attaching style sheets to documents

<?xml-stylesheet type="text/css" href="song.css"?>
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

View Document in Browser

Style Sheet Languages


What is XML used for?


Domain-Specific Markup Languages


Self-Describing Data


An XML Fragment

<PERSON ID="p1100" SEX="M">
  <NAME>
    <GIVEN>Judson</GIVEN>
    <SURNAME>McDaniel</SURNAME>
  </NAME>
  <BIRTH>
    <DATE>21 Feb 1834</DATE>
  </BIRTH>
  <DEATH>
    <DATE>9 Dec 1905</DATE>
  </DEATH>
</PERSON>

Interchange of Data Among Applications


XML Applications


Example XML Applications


Mathematical Markup Language

<?xml version="1.0"?>
<html xmlns="http://www.w3.org/TR/REC-html40"
      xmlns:m="http://www.w3.org/TR/REC-MathML/"
>
<head>
<title>Fiat Lux</title>
<meta name="GENERATOR" content="amaya V1.3b" />
</head>
<body>

<P>
And God said,
</P>

<math>
  <m:mrow>
    <m:msub>
      <m:mi>&delta;</m:mi>
      <m:mi>&alpha;</m:mi>
    </m:msub>
    <m:msup>
      <m:mi>F</m:mi>
      <m:mi>&alpha;&beta;</m:mi>
    </m:msup>
    <m:mi></m:mi>
    <m:mo>=</m:mo>
    <m:mi></m:mi>
    <m:mfrac>
      <m:mrow>
        <m:mn>4</m:mn>
        <m:mi>&pi;</m:mi>
      </m:mrow>
      <m:mi>c</m:mi>
    </m:mfrac>
    <m:mi></m:mi>
    <m:msup>
      <m:mi>J</m:mi>
      <m:mrow>
        <m:mi>&beta;</m:mi>
        <m:mo></m:mo>
      </m:mrow>
    </m:msup>
  </m:mrow>
</math>

<P>
and there was light
</P>
</body>
</html>

Channel Definition Format

<?xml version="1.0"?>
<CHANNEL HREF="http://metalab.unc.edu/xml/index.html">
  <TITLE>Cafe con Leche</TITLE>
  <ITEM HREF="http://metalab.unc.edu/xml/books.html">
    <TITLE>Books about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://metalab.unc.edu/xml/tradeshows.html">
    <TITLE>Trade shows and conferences about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://metalab.unc.edu/xml/lists.htm">
    <TITLE>Mailing Lists dedicated to XML</TITLE>
  </ITEM>
</CHANNEL>

Classic Literature


Vector Graphics

A VML document

The Resource Description Framework (RDF)


An Example of RDF

<rdf:RDF 
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/DC/>
  <rdf:Description about="http://metalab.unc.edu/xml/>
    <dc:Creator>Elliotte Rusty Harold</dc:Creator>
    <dc:Title>Cafe con Leche</dc:Title>
  </rdf:Description>
</rdf:RDF>

XML for XML


XSL: The Extensible Stylesheet Language


XLinks: The Extensible Linking Language

<footnote xlink:type="simple" xlink:href="footnote7.xml">7</footnote>

File Formats, in-house applications, and other behind the scenes uses


A larger example: Music Catalog


Sample Catalog

http://metalab.unc.edu/nywc/

Organizing the Data


What is the Root Element


The Root Element

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  Everything else will go here...
</catalog>
View in Browser

What are the Immediate Children of the Root?


Child Elements

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

</catalog>
View in Browser

White space in XML is not especially significant

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog><category>Small chamber ensembles 
- 2-4 Players by New York Women Composers</category></catalog>
View in Browser

Composers

Each composer has a name

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <composer>
    <name>Julie Mandel</name>
  </composer>

  <composer>
    <name>Margaret De Wys</name>
  </composer>  
    
  <composer>
    <name>Beth Anderson</name>
  </composer>
    
  <composer>
    <name>Linda Bouchard</name>
  </composer>

</catalog>
View in Browser

Grand Children

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <composer>
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>

  <composer>
    <name>
      <first_name>Margaret</first_name> 
      <middle_name>De</middle_name> 
      <last_name>Wys</last_name>
    </name>
  </composer>  
    
  <composer>
    <name>
      <first_name>Beth</first_name> 
      <middle_name></middle_name> 
      <last_name>Anderson</last_name>
    </name>
  </composer>
    
  <composer>
    <name>
      <first_name>Linda</first_name> 
      <middle_name></middle_name> 
      <last_name>Bouchard</last_name>
    </name>
  </composer>

</catalog>
View in Browser

Attributes

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <composer id="c1">
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>

  <composer id="c2">
    <name>
      <first_name>Margaret</first_name> 
      <middle_name>De</middle_name> 
      <last_name>Wys</last_name>
    </name>
  </composer>  
    
  <composer id="c3">
    <name>
      <first_name>Beth</first_name> 
      <middle_name></middle_name> 
      <last_name>Anderson</last_name>
    </name>
  </composer>
    
  <composer id="c4">
    <name>
      <first_name>Linda</first_name> 
      <middle_name></middle_name> 
      <last_name>Bouchard</last_name>
    </name>
  </composer>

</catalog>
View in Browser

Attributes vs. Elements


When not to use attributes


Compositions

Let's look at an example of what we want:

Rendered HTML:

Brass Swale (1988) 5", tbn, 2 Bfl tpts, bar. hn

Tonal. Commissioned/Premiered by the Redlands' New Music Ensemble. (A swale is a meadow or a marsh where a lot of wild plants grow together. The composer discovered the word when a horse named Swale won the Kentucky Derby several years ago. Since her work is primarily collage of newly composed musical swatches, she has used the name extensively.) ACA - American Composers Alliance


Composition Example in HTML

Or in HTML:

<dt><cite>Brass Swale</cite> (1988) 5", tbn, 2 Bfl tpts, bar. hn</dt>
<dd><p>
Tonal. Commissioned/Premiered by the Redlands' New Music 
Ensemble. (A swale is a meadow or a marsh where a lot of 
wild plants grow together. The composer discovered the word 
when a horse named Swale won the Kentucky Derby several 
years ago. Since her work is primarily collage of newly 
composed musical swatches, she has used the name 
extensively.)  ACA - American Composers 
Alliance</p>
</dd>

Each composition has a


Composition Example in XML

  <composition>
    <title>Brass Swale</title>
    <date>1988</date> 
    <length>5"</length>
    <instruments>tbn, 2 Bfl tpts, bar, hn</instruments>
    <description>
	  Tonal. Commissioned/Premiered by the Redlands' New Music
	  Ensemble. (A swale is a meadow or a marsh where a lot of
	  wild plants grow together. The composer discovered the word
	  when a horse named Swale won the Kentucky Derby several
	  years ago. Since her work is primarily collage of newly
	  composed musical swatches, she has used the name
	  extensively.)
    </description>
    <publisher>ACA - American Composers Alliance</publisher>
  </composition>
View in Browser

Further Divisions

  <composition>
    <title>Trio for Flute, Viola and Harp</title>
    <date><year>1994</year></date> 
    <length>13'38"</length>
    <instruments>fl, hp, vla</instruments>
    <description>
      <p>Premiered at Queens College in April, 1996 by Sue Ann Kahn, 
      Christine Ims, and Susan Jolles. In 3 movements :</p>
      <ul>
        <li>mvt. 1: 5:01</li>
        <li>mvt. 2: 4:11</li>
        <li>mvt. 3: 4:26</li>
      </ul>  
    </description>
    <publisher>Theodore Presser</publisher>
  </composition>
View in Browser

Attaching the Composer to the Composition

  <composition composer="c3">
    <title>Trio: Dream in D</title>
    <date><year>1980</year></date> 
    <length>10'</length>
    <instruments>fl, pn, vc, or vn, pn, vc</instruments>
    <description>
      Rhapsodic. Passionate. Available on CD 
      <cite><a href="http://www.amazon.com/exec/obidos/ASIN/B000007NMH/qid=913265342/sr%3D1-2/">
       Two by Three</a></cite> from North/South Consonance (1998).
    </description> 
    <publisher></publisher>
  </composition>
View in Browser

Some Keywords For the Search Engines

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>

  <category>
    Small chamber ensembles - 2-4 Players by New York Women Composers
  </category>

  <cataloging_info>
    <abstract>Compositions by the members of New York Women Composers</abstract>
    <keyword>music publishing</keyword>
    <keyword>scores</keyword>
    <keyword>women composers</keyword>
    <keyword>New York</keyword>
  </cataloging_info>

  <composer id="c1">
    <name>
      <first_name>Julie</first_name> 
      <middle_name></middle_name> 
      <last_name>Mandel</last_name>
    </name>
  </composer>
  
  ...
  
</catalog>
View in Browser

Standard Signature

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
...
  <last_updated>July 28, 1999</last_updated>
  <copyright>1999 New York Women Composers</copyright>
  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>

</catalog>
View in Browser

Cascading Style Sheets


A Blank Style Sheet

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="compositions1.css"?>
<catalog>
...
</catalog>
View in Browser

The Default Rule

catalog { font-family: New York, Times New Roman, serif; 
       font-size: 14pt; 
       background-color: white; 
       color: black; 
       display: block }

View in Browser

A style rule for the category element

category { display: block; 
       font-family: Helvetica, Arial, sans;
       font-size: 32pt; 
       font-weight: bold; 
       text-align: center}
       
catalog { font-family: New York, Times New Roman, serif; 
       font-size: 14pt; 
       background-color: white; 
       color: black; 
       display: block }

View in Browser

A style rule for the composer element

composer { display: block; 
       font-family: Helvetica, Arial, sans;
       font-size: 24pt; 
       font-weight: bold; 
       text-align: left}     

View in Browser

A style rule for the title element

composition title { display: block; 
       font-family: Helvetica, Arial, sans;
       font-size: 18pt; 
       font-weight: bold; 
       text-align: left}                

View in Browser

A style rule for the catalog info

// cataloging_info is only for search engines
cataloging_info { display: none;
       color: white}                

display: none requires CSS2:


View in Browser

Style rules for the signature

  <last_updated>July 28, 1999</last_updated>
  <copyright>1999 New York Women Composers</copyright>
  <maintainer email="elharo@metalab.unc.edu" 
              url="http://www.macfaq.com/personal.html">
    <name>
      <first_name>Elliotte</first_name> 
      <middle_name>Rusty</middle_name> 
      <last_name>Harold</last_name>
    </name>
  </maintainer>            

last_updated, copyright, maintainer {display: block;
       font-size: small}
       
copyright:before {content: "Copyright " }

last_updated:before {content: "Last Modified " }

last_updated {margin-top: 2ex }

Again, some of this requires CSS2


View in Browser

Style Rules for composition children

composition * {display:list-item}
       
description {display: block}

View in Browser

Finished Style Sheet

category { display: block; 
       font-family: Helvetica, Arial, sans;
       font-size: 32pt; 
       font-weight: bold; 
       text-align: center}
       
catalog { font-family: New York, Times New Roman, serif; 
       font-size: 14pt; 
       background-color: white; 
       color: black; 
       display: block }
      
composer { display: block; 
       font-family: Helvetica, Arial, sans;
       font-size: 24pt; 
       font-weight: bold; 
       text-align: left}  
       
composition title { display: block; 
       font-family: Helvetica, Arial, sans;
       font-size: 18pt; 
       font-weight: bold; 
       text-align: left}
       
composition * {display:list-item}
       
description {display: block}
              
// cataloging_info is only for search engines
cataloging_info { display: none;
       color: #FFFFFF}
       
last_updated, copyright, maintainer {display: block;
       font-size: small}
       
copyright:before {content: "Copyright " }

last_updated:before {content: "Last Modified " }

last_updated {margin-top: 2ex }

Possible Extensions


Possible Solutions


CSS or XSL?


Well-formedness Rules


Open and close all tags


Empty tags end with />


There is a unique root element


Elements may not overlap


Attribute values are quoted


< and & are only used to start tags and entities


Only the five predefined entity references are used


Programming with XML


SAX, the Simple API for XML


The Document Object Model (DOM)


To Learn More



Questions?


Index | Cafe con Leche

Copyright 2000 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified September 4, 2000