Intro to Processing XML with Java using SAX and DOM
Please turn off all
Where we're going
Processing XML with Java is easy
Prerequisites
Parser APIs
Part I: Writing XML Documents with Java
Unicode
Readers and Writers
A Java program that writes Fibonacci numbers into a text file
fibonacci.txt
A Java program that writes Fibonacci numbers into an XML file
fibonacci.xml
Suppose we want to use a different encoding than UTF-8
fibonacci_8859_1.xml
Suppose you want to include a DTD
valid_fibonacci.xml
Questions?
Converting data to XML
Sample Tab-Delimited Data: Baseball Statistics
A Program to Convert Tab-Delimited Data to XML
Baseball Stats in XML
Converting data to XML while Processing it
Batting Averages in XML
The point is this:
Questions?
To learn more about streams and readers and writers
Part II: Reading XML Documents with SAX
SAX
SAX Parsers for Java
SAX1
SAX2
The SAX2 Process
Making an XMLReader
Parsing a Document with XMLReader
Sample Output from SAX2Checker
The ContentHandler interface
SAX2 Event Reporter
Event Reporter Output
Questions?
A Sample Application
Goal: Return a list of all the URLs in this list as java.net.URL objects
SAX Design
User Interface Class
ContentHandler Class
Weblogs Output
Features and Properties
Feature/Property SAXExceptions
Required Features
Core Features
Turning on Validation
Three Levels of Errors
The ErrorHandler interface
An ErrorHandler for Reporting Validity Errors
Validating
Core Properties
Nonstandard Features in Xerces
Nonstandard Properties in Xerces
Properties for Extension Handlers
Handling Attributes in SAX2
Attributes Example
InputSource
The InputSource interface
Example of InputSource
What SAX2 doesn't do
Event-Based API Caveats
To Learn More
Questions?
Part III: DOM, The Document Object Model
Trees
Document Object Model
DOM Evolution
DOM Parsers for Java
Eight Modules:
Which modules and features are supported?
Which modules are supported?
Which modules are supported? Results
DOM Trees
org.w3c.dom
The DOM Process
Parsing documents with a DOM Parser Example
The Node Interface
The NodeList Interface
Node Reporter
Node Reporter Output
Node Values as returned by getNodeValue()
The Document Node
The Document Interface
A Sample Application
DOM Design
Weblogs with DOM
Weblogs Output
Element Nodes
The Element Interface
IDTagger
Output from IDTagger
CharacterData interface
The CharacterData Interface
ROT13 XML Text
ROT13 XML Output
Text Nodes
The Text Interface
CDATA section Nodes
The CDATASection Interface
DocumentType Nodes
The DocumentType Interface
Example of the DocumentType Interface
XHTMLValidator
EntityReference Nodes
The EntityReference Interface
Attr Nodes
The Attr Interface
XLinkSpider with DOM
ProcessingInstruction Nodes
The ProcessingInstruction Interface
XLinkSpider that Respects robots processing instruction
Comment Nodes
The Comment Interface
Comment Example
DOMCommentReader Output
Entity Nodes
The Entity Interface
DOMException
Questions?
The org.w3c.dom.traversal Package
NodeIterator
ValueReporter
ValueReporter Output
NodeFilter
DOM-based TagStripper
Output from a DOM-based TagStripper
Questions?
Writing XML Documents with DOM
org.apache.xerces.dom.DOMImplementationImpl
A DOM program that writes Fibonacci numbers into an XML document
Serialization
A DOM program that writes Fibonacci numbers onto System.out
fibonacci.xml
OutputFormat
Better formatted output
formatted_fibonacci.xml
DOM-based XMLPrettyPrinter
Output from a DOM-based XMLPrettyPrinter
The point is this:
Questions?
To Learn More
Copyright 2000 Elliotte Rusty Harold
Last Modified March 6, 2001