Advanced XML
Outline
Part I: XML Infoset
A normal XML document
A canonical XML document
An org.w3c.dom.Document object formed by reading hotcop.xml
Are these three the same thing or not?
What is the XML Infoset?
The Infoset defines 15 kinds of Information Items
The Document Information Item
Elements
Element Information Items
Attributes
Comments
A Processing Instruction Information Item Includes:
Characters
Namespace Declarations
Document Type Declaration
Document Type Definition
Entities
Entity Marker Information Items
Entity Declaration Information Items
The Infoset Omits:
What is Canonical XML?
How are documents canonicalized?
Digital Signatures
To Learn More
Questions?
Part II: JDOM
Where we're going
Trees
Processing XML with JDOM is easy
What is JDOM?
About JDOM
JDOM versions
Four packages:
The org.jdom package
The org.jdom.input package
The org.jdom.output package
The org.jdom.adapters package
Writing XML Documents with JDOM
A JDOM program that writes this XML document
Hello JDOM
Actual Output
Hello DOM
A Java program that writes Fibonacci numbers into a text file
fibonacci.txt
fibonacci.xml
A JDOM program that writes Fibonacci numbers into an XML file
Output
A DOM program that writes Fibonacci numbers into an XML file
Suppose you want to include a DTD
ValidFibonacci
validfibonacci.xml
Using Namespaces
With Namespace Prefixes
The Default, Unprefixed Namespace
With Default Namespace
Converting data to XML
Sample Tab Delimited Data: Baseball Statistics
A Program to convert tab delimited data to XML
Baseball Stats in XML
A Shortcut
Converting data to XML while Processing it
Batting Averages in XML
Advantages of JDOM for Writing Documents
Reading XML with JDOM
Parser APIs
JDOM Compatible Parsers for Java
SAX
SAX2
The SAX Process
Event Based API Caveats
Document Object Model
The Design of the DOM API
DOM Evolution
Eight Modules:
DOM Trees
org.w3c.dom
The DOM Process
The JDOM Process
Parsing a Document with JDOM
Parser Results
Turning on Validation in JDOM
JDOM Validator
Validation Output
Building with DOM instead of SAX
DOMBuilder Example
Reading XML Documents
UserLand's RSS based list of Web logs
Goal: Return a list of all the URLs in this list as java.net.URL objects
The SAX ContentHandler interface
SAX Design
User Interface Class
ContentHandler Class
Weblogs Output
Weblogs with DOM
DOM Design
The Node Interface
The NodeIterator Interface
The NodeFilter Interface
Weblogs with DOM
Weblogs Output
Weblogs with JDOM
JDOM Design
Weblogs with JDOM
Weblogs Output
The org.jdom Package
The Document Node
The Document Class
Document Example
Output from XMLPrinter
Element Nodes
Element Class Implementation
The Element Class
Element Example: XCount
XCount Output
Handling Attributes in JDOM
The Attribute Class
IDTagger
Before IDTagger
After IDTagger
Handling Entities in JDOM
The Entity Class
Handling Comments in JDOM
The Comment Class
Comment Example
CommentReader Output
ProcessingInstruction Nodes
The ProcessingInstruction Class
XLinkSpider that Respects the robots Processing Instruction
Handling Namespaces
The Namespace Class
The Namespace Class
DocType Nodes
The DocType class
Example of the DocType Class
XHTMLValidator
Using the XHTMLValidator
The Verifier Class
The Verifier Class
JDOMException
JDOMException Class
The org.jdom.output Package
Serialization
XMLOutputter
Using the XMLOutputter Class Directly
Using the XMLOutputter Class Indirectly
JDOM based TagStripper
Output from a JDOM based TagStripper
Talking to DOM Programs
Talking to SAX Programs
What JDOM doesn't do
To Learn More
Questions?
Part III: XML Base and XInclude
What is XML Base?
The xml:base attribute
XML Base Example
Open Issues
What is XInclude?
Alternatives (and why they don't work)
The include element
The parse attribute
Implementation as a SAX filter
Implementation as JDOM
Implementation as DOM
To Learn More
Questions?
Part IV: Schemas
What are Schemas?
About Schemas
What's Wrong with DTDs?
DTDs vs. Schemas
Schema versions
greeting.xml
greeting.xsd according to the April 7 Working Draft
Attaching the schema to the document without namespaces
Validating the document with Xerces-J 1.2.0
An Invalid Document
Checking the Invalid Document
greeting.xsd in the Candidate Recommendation
New XSI namespace
Validating the document with XSV
An Invalid Document
Checking the Invalid Document
A More Complex Document
Complex vs. Simple Types
A More Complex Schema
Three main schema elements:
Validating the Song Document
Fixed Hot Cop
Primitive Data Types for Schemas
Numeric Data Types for Schemas
Time Data Types for Schemas
XML Data Types for Schemas
Assorted Data Types for Schemas
A Document with Attributes
Declaring Attributes
Element Content
Declaring Complex Types
Sharing Content Models
Mixed Content
Declaring Mixed Content
When Order Doesn't Matter
The xsd:all Group
Choices
Sequences
Adding a Price
Derived Types
Regular Expressions
The xsd:simpleType element
The Price Schema
Default Namespace
The targetNamespace attribute
A Song with a Namespace
A Schema for a Document that Uses the Default Namespace
Multiple Namespaces, Multiple Schemas
XLink Schema
Song Schema with XLink Support
Annotations
What Schemas don't do
Schema Alternatives
Schematron
RELAX
DTDs aren't Dead!
To Learn More
Questions?
Part V: XLinks
Three Technologies
Versions
HTML Links are Limited
XLinks are More Powerful
Application Support
Linking Elements
For example
Declaring XLink Attributes in DTDs
Fixed Attributes
Other Attributes
Descriptions of the Remote Resource
Link Behavior
xlink:show
xlink:actuate
Parameter Entities for Link Attributes
Extended Links
Extended Links
Resources
Resource Example
Resource Example Diagram
Roles and Titles for Resources
DTD for Extended Links
Another Shortcut for the DTD
Arcs
Arc Example
Arc Example Diagram
Arc Example
Arc Example Diagram
Arc Example with omitted to attribute
Arc Example Diagram
Arc DTD Fragment
Out-of-Line Links
Out-of-Line Link Example
Out-of-Line Link Example
Out-of-Line Link Example
Another Out-of-Line Link Example
Linkbases
XLink Summary
To Learn More
Questions?
Part VI: XPointers
XPointers
What are XPointers?
Why Use XPointers?
XPointer Examples
A Concrete Example
Location Paths, Steps, and Sets
Location Steps
Location Paths
Location Paths that Identify Multiple Nodes
Axes
Location Step Axes
The child Axis
The descendant Axis
The descendant-or-self Axis
The parent Axis
The self Axis
The ancestor Axis
The ancestor-or-self Axis
The preceding Axis
The following Axis
The preceding-sibling Axis
The following-sibling Axis
The attribute Axis
Node Tests
Predicates
Boolean Conversion
The position() function
Identifying an element by its position
Functions that Return Node Sets
id()
here()
origin()
Points
Point Expressions
Ranges
Range Expressions
Range Functions
String Ranges
Child Sequences
Summary
To Learn More
Questions?
Part VII: The Oracle Speaks, Predictions for the Future
XInclude succeeds once parsers support it
JDOM succeeds, much to the consternation of the W3C
Schemas, a partial success
XLinks
XPointers; the same story
Stuff we didn't talk about
XSLT 1.1
XSL-FO
DOM Level 3
XHTML Fails
XML Query Languages
Schema Repositories all fail
MathML succeeds
SVG Takes Off in 2001
Browser Support
Invent the Future!
To Learn More
Questions?
Cafe con Leche
Copyright 2000
Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified November 7, 2000