The ContentHandler interface is designed to provide everything most applications need to know about an XML instance document. What it leaves out is information you rarely care about, and most of that is still available through other callback interfaces discussed in upcoming chapters. The omitted items include:
Comments, the boundaries of CDATA sections, and the boundaries of parsed entities that were not skipped, all of which are available through the LexicalHandler interface
The names, public IDs, system IDs and notations for unparsed entities and the names, public IDs, and system IDs for notations, all of which are available through the DTDHandler interface
ELEMENT, ATTLIST, and parsed ENTITY declarations from the DTD, all of which are reported through the DeclHandler interface
Validity errors and other non-fatal errors, which are reported through the ErrorHandler interface
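To make the list above concrete, here is a minimal sketch of how these extra handlers might be attached to a parser. It assumes a JAXP-supplied XMLReader that recognizes the standard lexical-handler and declaration-handler property URIs, and it uses the DefaultHandler2 convenience class from later SAX releases so that a single object can play every role. The class names ExtraHandlersDemo and Recorder, the SAMPLE document, and the event strings are all made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class ExtraHandlersDemo {

    // Hypothetical sample document, invented for this sketch.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE doc [\n"
      + "  <!ELEMENT doc (#PCDATA)>\n"
      + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
      + "  <!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
      + "]>\n"
      + "<doc><!-- note --><![CDATA[<raw>]]></doc>";

    // Records only the events that ContentHandler never sees.
    static class Recorder extends DefaultHandler2 {
        final List<String> events = new ArrayList<>();

        // LexicalHandler callbacks
        @Override public void comment(char[] ch, int start, int length) {
            events.add("comment:" + new String(ch, start, length).trim());
        }
        @Override public void startCDATA() { events.add("startCDATA"); }
        @Override public void endCDATA() { events.add("endCDATA"); }

        // DTDHandler callbacks
        @Override public void notationDecl(String name, String publicId, String systemId) {
            events.add("notation:" + name);
        }
        @Override public void unparsedEntityDecl(String name, String publicId,
                                                 String systemId, String notationName) {
            events.add("unparsed:" + name + ":" + notationName);
        }

        // DeclHandler callback
        @Override public void elementDecl(String name, String model) {
            events.add("element:" + name);
        }

        // ErrorHandler callback; a non-validating parser may never call it
        @Override public void error(SAXParseException e) {
            events.add("error:" + e.getMessage());
        }
    }

    static List<String> record(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            Recorder recorder = new Recorder();
            reader.setContentHandler(recorder);
            reader.setDTDHandler(recorder);
            reader.setErrorHandler(recorder);
            // LexicalHandler and DeclHandler are optional extensions,
            // so they are installed as properties rather than with setters.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", recorder);
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", recorder);
            reader.parse(new InputSource(new StringReader(xml)));
            return recorder.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String event : record(SAMPLE)) {
            System.out.println(event);
        }
    }
}
```

Because LexicalHandler and DeclHandler are optional, a parser that does not support them would throw an exception from setProperty(). Production code would likely catch SAXNotRecognizedException and SAXNotSupportedException separately and carry on without those events.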
The only things that are truly not available in SAX2, even after all optional extensions are included, are:
The version, encoding, and standalone attributes from the XML declaration (scheduled to be added in SAX 2.1)
Insignificant white space in tags and before and after the root element
The order of attributes
The type of quotes that surround attributes
Character references
Pre-normalized attribute values
Whether an attribute was specified in the instance document or defaulted in from the DTD or schema
Whether empty elements are represented as <name></name> or <name />
Skipped entities in attribute values
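A small experiment can illustrate several of the items above. The sketch below feeds a parser two documents that differ only in quote style, white space inside a tag, empty-element syntax, and the use of a character reference, then compares the ContentHandler events each one produces. It assumes the default JAXP parser; the class names LexicalDetailDemo and EventLog and both sample documents are invented for this example. Text is buffered between tags because a parser is free to split character data across several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LexicalDetailDemo {

    // Hypothetical documents: same content, different lexical details.
    static final String DOC_A = "<doc a='1'><e></e>&#65;</doc>";
    static final String DOC_B = "<doc  a=\"1\" ><e />A</doc>";

    // Logs the content events a typical application would see.
    static class EventLog extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        // Emit buffered text as a single event at each tag boundary.
        private void flushText() {
            if (text.length() > 0) {
                events.add("text:" + text);
                text.setLength(0);
            }
        }

        @Override public void startElement(String uri, String localName,
                                           String qName, Attributes atts) {
            flushText();
            StringBuilder sb = new StringBuilder("start:" + qName);
            for (int i = 0; i < atts.getLength(); i++) {
                sb.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
            }
            events.add(sb.toString());
        }

        @Override public void endElement(String uri, String localName, String qName) {
            flushText();
            events.add("end:" + qName);
        }

        @Override public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    static List<String> contentEvents(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            EventLog log = new EventLog();
            reader.setContentHandler(log);
            reader.parse(new InputSource(new StringReader(xml)));
            return log.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> a = contentEvents(DOC_A);
        List<String> b = contentEvents(DOC_B);
        System.out.println(a);
        System.out.println("Same events: " + a.equals(b));
    }
}
```

Both documents should yield the same event list, which is the point: nothing in the callbacks lets the application tell which quotes, which empty-element form, or which spelling of the letter A the author actually typed.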
The only common use case for most of this information is an XML editor. Editors are actually quite strange beasts compared to most client applications and really require a custom parser and API. None of the standard APIs or parsers provide all the information an editor needs.
Copyright 2001, 2002 Elliotte Rusty Harold | elharo@metalab.unc.edu | Last Modified October 16, 2001