Processing XML with Java
Where we're going
Relevant Standards
Processing XML with Java is easy
Prerequisites
A simple example
Markup and Character Data
Markup and Character Data Example
Entities
Parsed Character Data
CDATA sections
Comments
Processing Instructions
The XML Declaration
Document Type Declaration
Document Type Definition
XML Names
XML Namespaces
Namespace Syntax
Namespace URIs
Binding Prefixes to Namespace URIs
The Default Namespace
How Parsers Handle Namespaces
Canonical XML
Trees
Part II: Writing XML Documents with Java
Unicode
Readers and Writers
A Java program that writes Fibonacci numbers into a text file
fibonacci.txt
A Java program that writes Fibonacci numbers into an XML file
fibonacci.xml
Suppose we want to use a different encoding than UTF-8
fibonacci_8859_1.xml
Suppose you want to include a DTD
valid_fibonacci.xml
Converting data to XML
Sample Tab Delimited Data: Baseball Statistics
A Program to convert tab delimited data to XML
Baseball Stats in XML
Converting data to XML while Processing it
Batting Averages in XML
The point is this:
Part III: Reading XML Documents with SAX1
Parser APIs
SAX
SAX Parsers for Java
The SAX1 Process
Making a Parser
Making a Parser Example
Parsing a Document
SAXParseException
Using SAXParseException
DocumentHandler.java
DocumentHandler interface
Event Reporter
Event Reporter Output
SAX based TagStripper
Output from a SAX based TagStripper
An Improved TagStripper
Output from XMLToText
SAX based XMLPrettyPrinter
Output from a SAX based XMLPrettyPrinter
Event Based API Caveats
SAX based XMLPrettyPrinter
Output from a SAX based XMLPrettyPrinter
The AttributeList Interface
XMLPrettyPrinter3
Output from a SAX based XMLPrettyPrinter
The HandlerBase class
An XLink Spider Utility
Converting XML to other Formats
Converting slashdot.xml to Tab Delimited Text
InputSource
The InputSource class
Example of InputSource
What SAX1 doesn't do
Part IV: Reading XML Documents with SAX2
SAX2
The SAX2 Process
Making an XMLReader
Parsing a Document with XMLReader
ContentHandler.java
The ContentHandler interface
SAX2 Event Reporter
Event Reporter Output
ContentHandler Example
Design
User Interface Class
ContentHandler Class
Weblogs Output
Features and Properties
Feature/Property SAXExceptions
Required Features
Core Features
Turning on Validation
Three Levels of Errors
The ErrorHandler interface
An ErrorHandler for Reporting Validity Errors
Validating
Post DTD Validation Example
Schemas
Schema for Songs
Core Properties
Nonstandard Features in Xerces 1.0.2
Nonstandard Properties in Xerces 1.0.2
Properties for Extension Handlers
Handling Attributes in SAX2
Attributes Example
Resolving Entities
EntityResolver Example
Handling DTDs
DTDHandler Example
TextEntityReplacer
Handling Declarations
The DeclHandler interface
Handling Lexical Events
The LexicalHandler interface
LexicalHandler Example
The Locator interface
Locator Example
The DefaultHandler class
The NamespaceSupport class
Filtering XML
XMLFilter Example
TextMerger
What SAX2 doesn't do
Part V: Document Object Model
Document Object Model
DOM Evolution
Eight Modules:
Which modules and features are supported?
Which modules are supported?
Which modules are supported? Results
DOM Trees
org.w3c.dom
DOM Parsers for Java
The DOM Process
Parsing documents with a DOM Parser Example
The Node Interface
The NodeList Interface
Node Iterator
Node Reporter
Node Reporter Output
Node Values as returned by getNodeValue()
DOM based TagStripper
Output from a DOM based TagStripper
The Document Node
The Document Interface
Grab HeadLines
Element Nodes
The Element Interface
IDTagger
CharacterData interface
The CharacterData Interface
ROT13 XML Text
Text Nodes
The Text Interface
Document Fragment Nodes
The DocumentFragment Interface
DocumentType Nodes
The DocumentType Interface
Example of the DocumentType Interface
XHTMLValidator
EntityReference Nodes
The EntityReference Interface
Attr Nodes
The Attr Interface
XLinkSpider with DOM
ProcessingInstruction Nodes
The ProcessingInstruction Interface
XLinkSpider that Respects robots processing instruction
Notation Nodes
The Notation Interface
Comment Nodes
The Comment Interface
Comment Example
CommentReader Output
CDATA section Nodes
The CDATASection Interface
Entity Nodes
The Entity Interface
The NamedNodeMap Interface
DOMException
Part V.5: Writing XML Documents with DOM
org.apache.xerces.dom.DOMImplementationImpl
A DOM program that writes Fibonacci numbers into an XML document
Serialization
A DOM program that writes Fibonacci numbers onto System.out
fibonacci.xml
OutputFormat
Better formatted output
formatted_fibonacci.xml
DOM based XMLPrettyPrinter
Output from a DOM based XMLPrettyPrinter
The point is this:
Part VI: Database Integration
The Advantages of XML for Database Integration
The Advantages of XML for Database Heavy Web Servers
Database Technologies
Database-Java Integration
Reading XML from Databases
Fixed XML Vocabulary
Looser Vocabulary
Customizable Vocabulary
If the database serves XML directly, you're in luck
Middleware to convert to XML
PHP to XML
Loose Coupling is Better
NYWC database through JDBC
NYWC database through JDBC to XML
NYWC database through JDBC to DOM
NYWC database through JDBC to Piped Reader
NYWC database through JDBC to InputSource
Generic database access
Generic database grammar
Generic database through JDBC to InputSource
Database Dependent Grammar
Generic database access through JDBC to InputSource
Putting XML into Databases
XML to SQL
Part VII: XML-RPC
XML-RPC
Uses
Implemented in Various Languages
Sample Invocation with HTTP Headers
The HTTP Header
The XML Payload
Scalar Values
Struct Values
Array Values
Sample XML-RPC Response with HTTP Header
XML-RPC Response HTTP Header
XML-RPC Response XML Payload
XML-RPC Fault Response
Communicating with XML-RPC Servers
Fibonacci Server
Fibonacci Thread
Fibonacci Client
To Learn More
Questions?
Cafe con Leche
Copyright 2000
Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified March 18, 2000