Effective XML
Please turn off all
Speaker’s Qualifications
Part I: XML Overview
What is XML?
XML is a Meta Markup Language
Rule 1: Focus on Meaning, Not Appearance
A Song Description in HTML
A Song Description in XML
Elements
Rule 2: Subdivide Elements Into the Smallest Units of Information
Cascading Style Sheets
CSS Stylesheet for Songs
Attaching style sheets to documents
song.xsl
Applying an XSLT Style Sheet
Output
CSS or XSL?
Editing and Saving XML Documents
Rule 3: Prefer UTF-8
Questions?
Well-formedness
Well-formedness Rules
Open and close all tags
Empty-element tags end with />
There is a unique root element
Elements may not overlap
Attribute values are quoted
< and & are only used to start tags and entities
Only the five predefined entity references are used
Numeric Character References
Rule 4: Never compromise on well-formedness
Questions?
Validity
A DTD for Songs
A Valid Song Document
Checking Validity
Well-formedness vs. validity
DTDs and Validity
What is a DTD?
Internal DTD Subsets
The importance of validation
Comments in DTDs
Rule 5: Do compromise on validity
A More Complex Example
The XML Declaration
Rule 6: Include an XML Declaration
Empty-element Tags
Rule 7: Don't assign meaning to Empty-element Tags
Attributes
Rule 8: Use Attributes for Metadata; Elements for Content
Rule 8.5: When not to use attributes
Comments
Rule 9: Don't Rely on Comments
Processing Instructions
Rule 10: Use processing instructions for process-specific content
Mixed Content
Rule 11: Provide for Mixed Content
A More Complex DTD
Questions?
Namespaces
Raison d'être
The Need for Namespaces
Namespaces disambiguate elements
Namespace Syntax
Namespace URIs
Binding Prefixes to Namespace URIs
Binding Prefixes to Namespace URIs Example
The Default Namespace
Unprefixed attributes are never in any namespace
Rule 13: URIs matter; not prefixes
Namespace URIs do not necessarily point to a document, page, or schema
Namespaces and DTDs
Questions?
Rule 12: Prefer Numeric Character References
What is XML used for?
Domain-Specific Markup Languages
Self-Describing Data
An XML Fragment
Interchange of Data Among Applications
Can assemble data from multiple sources
XML Applications
Example XML Applications
Mathematical Markup Language
RSS
Books
Vector Graphics
SOAP
WSDL
Database interchange and export
Narrative vs. Record-Like
XML is not:
XML for XML
XSL: The Extensible Stylesheet Language
Schemas
W3C XML Schema Language Example
XLinks
File Formats, in-house applications, and other behind the scenes uses
When should you not use XML?
Questions?
Processing XML
SAX
Processing XML with Java is easy
XML API Styles
Parser APIs
XML Infoset
Markup and Character Data
Markup and Character Data Example
Entities
Parsed Character Data
CDATA sections
How Parsers Handle Namespaces
The Five Layers of XML Processing
Questions?
Writing XML Documents with Java
You don't always need a new API
Unicode
Readers and Writers
A Java program that writes Fibonacci numbers into a text file
fibonacci.txt
A Java program that writes Fibonacci numbers into an XML file
fibonacci.xml
Single quoted attribute values are a little cleaner
Suppose you want to include a DTD
valid_fibonacci.xml
Questions?
To Learn More
Reading XML Documents with SAX
Reading XML Documents
SAX
SAX Parsers for Java
The Horrors of the CLASSPATH
The SAX2 Process
Making an XMLReader
Parsing a Document with XMLReader
Sample Output from SAX2Checker
JAXP Brain Damage
The ContentHandler interface
A Sample Application
Goal: Return a list of all the URLs in this list as java.net.URL objects
SAX Design
User Interface Class
ContentHandler Class
Weblogs Output
Event Based API Caveats
To Learn More
Questions?
DOM, The Document Object Model
Trees
Document Object Model
DOM Evolution
DOM Implementations for Java
Eight Modules:
DOM Trees
org.w3c.dom
The DOM Process
Parsing documents with a DOM Parser Example
The JAXP Process
Parsing documents with a JAXP DocumentBuilder
The Node Interface
The NodeList Interface
Node Reporter
Node Reporter Output
Node Values as returned by getNodeValue()
The Document Node
The Document Interface
A Sample Application
DOM Design
Weblogs with DOM
Weblogs Output
Element Nodes
The Element Interface
IDTagger
Output from IDTagger
Namespaces in DOM
Writing XML Documents with DOM
The DOMImplementation interface
org.apache.xerces.dom.DOMImplementationImpl
A Xerces/DOM program that writes Fibonacci numbers into an XML document
A JAXP/DOM program that writes Fibonacci numbers into an XML document
Serialization
A DOM program that writes Fibonacci numbers onto System.out
fibonacci.xml
OutputFormat
Better formatted output
formatted_fibonacci.xml
DOM based XMLPrettyPrinter
Output from a DOM based XMLPrettyPrinter
The point is this:
Questions?
To Learn More
To Learn More
Questions?
Copyright 2002-2005 Elliotte Rusty Harold
Last Modified December 10, 2005