Processing XML with SAX and DOM
Please turn off all
Where we're going
Processing XML with Java is easy
Prerequisites
XML API Styles
Parser APIs
Part I: XML Infoset
A simple example
Markup and Character Data
Markup and Character Data Example
Elements and Tags
Entities
Parsed Character Data
CDATA sections
Comments
Processing Instructions
The XML Declaration
Document Type Declaration
Document Type Definition (DTD)
XML Names
Questions?
XML Namespaces
Namespace Syntax
Namespace URIs
Binding Prefixes to Namespace URIs
The Default Namespace
How Parsers Handle Namespaces
Questions?
The Five Layers of XML Processing
To Learn More
Questions?
Part II: Writing XML Documents with Java
You don't always need a new API
Unicode
Readers and Writers
A Java program that writes Fibonacci numbers into a text file
fibonacci.txt
A Java program that writes Fibonacci numbers into an XML file
fibonacci.xml
Single-quoted attribute values are a little cleaner
Suppose we want to use a different encoding than UTF-8
fibonacci_Latin_1.xml
Suppose you want to include a DTD
valid_fibonacci.xml
Questions?
To Learn More
Part III: Reading XML Documents with SAX
Reading XML Documents
SAX
SAX Parsers for Java
The Horrors of the CLASSPATH
SAX1
SAX2
The SAX2 Process
Making an XMLReader
Parsing a Document with XMLReader
Sample Output from SAX2Checker
JAXP Brain Damage
The ContentHandler interface
SAX2 Event Reporter
Event Reporter Output
Questions?
A Sample Application
Goal: Return a list of all the URLs in this list as java.net.URL objects
SAX Design
User Interface Class
ContentHandler Class
Weblogs Output
Questions?
Features and Properties
Feature/Property SAXExceptions
Required Features
Core Features
Turning on Validation
Three Levels of Errors
The ErrorHandler interface
An ErrorHandler for Reporting Validity Errors
Validating
Core Properties
Nonstandard Features in Xerces
Nonstandard Properties in Xerces
Properties for Extension Handlers
Questions?
Handling Attributes in SAX2
Attributes Example
Resolving Entities
EntityResolver Example
Questions?
The NamespaceSupport class
Filtering XML
XMLFilter Example
TextMerger
InputSource
The InputSource interface
Example of InputSource
What SAX2 doesn't do
Event-Based API Caveats
To Learn More
Questions?
Part IV: DOM, The Document Object Model
Where we're going
Trees
Document Object Model
DOM Evolution
DOM Implementations for Java
Eight Modules:
DOM Trees
org.w3c.dom
The DOM Process
Parsing documents with a DOM Parser Example
The JAXP Process
Parsing documents with a JAXP DocumentBuilder
Questions?
The Node Interface
The NodeList Interface
Node Reporter
Node Reporter Output
Node Values as returned by getNodeValue()
The Document Node
The Document Interface
A Sample Application
DOM Design
Weblogs with DOM
Weblogs Output
Questions?
Element Nodes
The Element Interface
IDTagger
Output from IDTagger
CharacterData interface
The CharacterData Interface
ROT13 XML Text
ROT13 XML Output
Text Nodes
The Text Interface
CDATA section Nodes
The CDATASection Interface
DocumentType Nodes
The DocumentType Interface
Example of the DocumentType Interface
XHTMLValidator
Attr Nodes
The Attr Interface
XLinkSpider with DOM
ProcessingInstruction Nodes
The ProcessingInstruction Interface
XLinkSpider that Respects robots processing instruction
Comment Nodes
The Comment Interface
Comment Example
DOMCommentReader Output
DOMException
Questions?
The org.w3c.dom.traversal Package
NodeIterator
ValueReporter
ValueReporter Output
NodeFilter
DOM-based TagStripper
Output from a DOM-based TagStripper
TreeWalker
Questions?
Writing XML Documents with DOM
The DOMImplementation interface
org.apache.xerces.dom.DOMImplementationImpl
A Xerces/DOM program that writes Fibonacci numbers into an XML document
A JAXP/DOM program that writes Fibonacci numbers into an XML document
Serialization
A DOM program that writes Fibonacci numbers onto System.out
fibonacci.xml
OutputFormat
Better formatted output
formatted_fibonacci.xml
DOM-based XMLPrettyPrinter
Output from a DOM-based XMLPrettyPrinter
The point is this:
Questions?
To Learn More
Questions?
Entire Presentation as Single File
Cafe con Leche
Copyright 2000-2004 Elliotte Rusty Harold
Last Modified February 23, 2004