What's New in XML in Java 5 and 6

Elliotte Rusty Harold

Thursday, March 22, 2007

elharo@metalab.unc.edu

http://www.cafeconleche.org/


Where We're Going


JAXP, the Java API for XML Processing


JAXP Versions


DOM Level 3


DOM Level 3

of all of the things the W3C has given us, the DOM is probably the one with the least value.

--Michael Brennan on the xml-dev mailing list


DOM Level 3 Core Changes


New methods in the Node interface


User Data

package org.w3c.dom;

public interface UserDataHandler {

  // OperationType
  public static final short NODE_CLONED   = 1;
  public static final short NODE_IMPORTED = 2;
  public static final short NODE_DELETED  = 3;
  public static final short NODE_RENAMED  = 4;
  public static final short NODE_ADOPTED  = 5;

  public void handle(short operation, String key, Object data, Node src, Node dst);

}
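
A handler of this kind is registered through Node.setUserData(), one of the methods DOM Level 3 adds to Node. The sketch below assumes the DOM implementation bundled with the JDK and uses a made-up key name (lineNumber). It attaches a value to an element and records which operation code the handler receives when the element is cloned.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.UserDataHandler;

public class UserDataDemo {

  // Clones an element that carries user data and returns the
  // operation code that was passed to the handler.
  public static short cloneAndReport() throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
    Element item = doc.createElement("item");

    final short[] seen = new short[1];
    UserDataHandler handler = new UserDataHandler() {
      public void handle(short operation, String key, Object data,
                         Node src, Node dst) {
        seen[0] = operation; // remember what the DOM told us
      }
    };

    // "lineNumber" is a hypothetical key chosen for this example
    item.setUserData("lineNumber", Integer.valueOf(42), handler);
    item.cloneNode(true); // should notify the handler with NODE_CLONED
    return seen[0];
  }

}
```

User data is not copied to the clone automatically; a handler that wants the clone to carry the value would call setUserData() on dst itself.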

DOMConfiguration


New methods in Text


New methods in Element


New methods in Attr


Bootstrapping


DOM3 Bootstrapping


Bootstrapping Example


DOM Error Handler Interfaces


Load and Save


Parsing documents with DOM3

import org.w3c.dom.*;
import org.w3c.dom.ls.*;
import org.w3c.dom.bootstrap.*;


public class DOM3ParserMaker {

  public static void main(String[] args) 
    throws ClassNotFoundException, InstantiationException, IllegalAccessException {

    System.setProperty(DOMImplementationRegistry.PROPERTY,
      "org.apache.xerces.dom.DOMImplementationSourceImpl");
    DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
 
    DOMImplementation impl = registry.getDOMImplementation("LS");
    if (impl == null) {
        System.err.println("Could not locate a DOM3 Parser");
        return;    
    }
    
    DOMImplementationLS implls = (DOMImplementationLS) impl;
    LSParser parser = implls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS , null);

    for (int i = 0; i < args.length; i++) {
      try {
        Document d = parser.parseURI(args[i]);
      }
      catch (DOMException ex) {
        System.err.println(ex);
      }
      
    }

  }

}

The Load and Save Package: org.w3c.dom.ls

DOMImplementationLS
A sub-interface of DOMImplementation that provides the factory methods for creating the objects required for loading and saving.
LSParser
A parser interface
LSInput
Encapsulates information about the source of the XML to be loaded, like SAX's InputSource
LSResourceResolver
During loading, provides a way for applications to redirect references to external entities.
LSParserFilter
Provides the ability to examine and optionally remove Element nodes as they are being processed during the parsing of a document, like SAX filters.
LSSerializer
An interface for serializing DOM documents onto a stream or string.
LSSerializerFilter
Provide the ability to examine and optionally remove or modify nodes as they are being output.
LSLoadEvent
A document has been completely loaded
LSProgressEvent
A document has been partially loaded

DOMImplementationLS


Creating DOMImplementationLS Objects

  1. Use the feature "LS" or "LS-Async" to find a DOMImplementation object that supports Load and Save.

  2. Cast the DOMImplementation object to DOMImplementationLS.

System.setProperty(DOMImplementationRegistry.PROPERTY,
  "org.apache.xerces.dom.DOMImplementationSourceImpl");
DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementation impl = registry.getDOMImplementation("XML 1.0 LS 3.0");
  if (impl != null) {
    DOMImplementationLS implls = (DOMImplementationLS) impl;
    // ...
  }
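
The two steps can also be tried without any files on disk. The following sketch is an illustration under assumptions: it does not set the DOMImplementationRegistry system property and relies on the JDK's default lookup, and the class and method names are invented for the example. It parses a string through an LSInput.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

public class LSParseString {

  // Parses the XML text with a DOM3 LSParser and returns the root element name.
  public static String rootName(String xml) throws Exception {
    // Step 1: find an implementation that supports Load and Save
    DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
    DOMImplementation impl = registry.getDOMImplementation("LS");
    if (impl == null) {
      throw new IllegalStateException("No Load and Save implementation found");
    }

    // Step 2: cast it to DOMImplementationLS
    DOMImplementationLS implls = (DOMImplementationLS) impl;

    LSParser parser = implls.createLSParser(
        DOMImplementationLS.MODE_SYNCHRONOUS, null);
    LSInput input = implls.createLSInput();
    input.setStringData(xml); // the document comes from a String, not a URI
    Document doc = parser.parse(input);
    return doc.getDocumentElement().getTagName();
  }

}
```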

LSParser


LSInput


LSOutput


LSResourceResolver


LSSerializer


Fibonacci with DOM3

import java.math.*;
import java.io.*;
import org.w3c.dom.*;
import org.w3c.dom.bootstrap.*;
import org.w3c.dom.ls.*;


public class FibonacciDOM3 {

  public static void main(String[] args) throws Exception {

      System.setProperty(DOMImplementationRegistry.PROPERTY,
        "org.apache.xerces.dom.DOMImplementationSourceImpl");
      DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
      DOMImplementation impl = registry.getDOMImplementation("XML 1.0 LS");
      if (impl == null) {
         System.err.println("Oops! Couldn't find DOM3 implementation");   
         return;
      }
      Document fibonacci = impl.createDocument(null, "Fibonacci_Numbers", null );

      BigInteger low  = BigInteger.ZERO;
      BigInteger high = BigInteger.ONE;

      Element root = fibonacci.getDocumentElement();

      for (int i = 0; i <= 25; i++) {
        Element number = fibonacci.createElement("fibonacci");
        number.setAttribute("index", Integer.toString(i));
        Text text = fibonacci.createTextNode(low.toString());
        number.appendChild(text);
        root.appendChild(number);

        BigInteger temp = high;
        high = high.add(low);
        low = temp;
      }

      // Now that the document is created we need to *serialize* it
      DOMImplementationLS implls = (DOMImplementationLS) impl;
      LSSerializer serializer = implls.createLSSerializer();
      LSOutput output = implls.createLSOutput();
      output.setByteStream(new FileOutputStream("fibonacci_dom.xml"));
      
      serializer.write(fibonacci, output);

  }

}
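
When the output is wanted in memory rather than in a file, LSSerializer.writeToString() is an alternative to LSOutput. This is a minimal sketch that assumes the JDK's bundled implementation (no system property is set) and builds only a single fibonacci element. It also shows one serializer option, xml-declaration, being switched off through the DOMConfiguration.

```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class LSSerializeToString {

  // Builds a tiny document and returns it serialized as a String.
  public static String serialize() throws Exception {
    DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
    DOMImplementation impl = registry.getDOMImplementation("XML 1.0 LS");
    if (impl == null) {
      throw new IllegalStateException("No Load and Save implementation found");
    }

    Document doc = impl.createDocument(null, "Fibonacci_Numbers", null);
    Element number = doc.createElement("fibonacci");
    number.setAttribute("index", "0");
    number.appendChild(doc.createTextNode("0"));
    doc.getDocumentElement().appendChild(number);

    LSSerializer serializer = ((DOMImplementationLS) impl).createLSSerializer();
    DOMConfiguration config = serializer.getDomConfig();
    // Omit the XML declaration if this serializer allows it
    if (config.canSetParameter("xml-declaration", Boolean.FALSE)) {
      config.setParameter("xml-declaration", Boolean.FALSE);
    }
    return serializer.writeToString(doc);
  }

}
```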

LSParserFilter


LSSerializerFilter


XPath


Java XPath API


Using the XPath API

import java.io.IOException;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import javax.xml.parsers.*;
import javax.xml.xpath.*;

public class XPathExample {

  public static void main(String[] args) 
   throws ParserConfigurationException, SAXException, 
          IOException, XPathExpressionException {

    // 1. Parse a document with JAXP
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true); // never forget this!
    DocumentBuilder builder = domFactory.newDocumentBuilder();
    Document doc = builder.parse("books.xml");

    // 2. Compile the expression
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    XPathExpression expr 
     = xpath.compile("//book[author='Neal Stephenson']/title/text()");

    // 3. Make the query
    Object result = expr.evaluate(doc, XPathConstants.NODESET);

    // 4. Get the result.
    NodeList nodes = (NodeList) result;
    for (int i = 0; i < nodes.getLength(); i++) {
        System.out.println(nodes.item(i).getNodeValue()); 
    }

  }

}
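
NODESET is only one of the result types in XPathConstants. The sketch below runs against a small inline inventory, which is invented sample data and not the books.xml used above. It asks the same kind of query for a number and for a string instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathReturnTypes {

  // Hypothetical sample data, used only for this illustration
  private static final String BOOKS =
      "<inventory>"
    + "<book><title>Snow Crash</title><author>Neal Stephenson</author></book>"
    + "<book><title>Burning Tower</title><author>Larry Niven</author></book>"
    + "</inventory>";

  private static Document parse() throws Exception {
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    return domFactory.newDocumentBuilder()
        .parse(new InputSource(new StringReader(BOOKS)));
  }

  // XPathConstants.NUMBER yields a java.lang.Double
  public static double countBooks() throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    Double count = (Double) xpath.evaluate(
        "count(//book)", parse(), XPathConstants.NUMBER);
    return count.doubleValue();
  }

  // The two-argument evaluate() returns the string value of the result
  public static String titleBy(String author) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    return xpath.evaluate("//book[author='" + author + "']/title", parse());
  }

}
```

Splicing the author name into the expression is acceptable for fixed test data; for untrusted input an XPathVariableResolver would be the safer route.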

XPath Data Model


Namespaces

//pre:book[pre:author='Neal Stephenson']/pre:title/text()

NamespaceContext


NamespaceContext Example

import java.util.Iterator;
import javax.xml.*;
import javax.xml.namespace.NamespaceContext;

public class PersonalNamespaceContext implements NamespaceContext {

    public String getNamespaceURI(String prefix) {
        if (prefix == null) throw new NullPointerException("Null prefix");
        else if ("pre".equals(prefix)) return "http://www.example.org/books";
        else if ("xml".equals(prefix)) return XMLConstants.XML_NS_URI;
        return XMLConstants.NULL_NS_URI;
    }

    // This method isn't necessary for XPath processing.
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    // This method isn't necessary for XPath processing either.
    public Iterator getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }

}

XPath query that uses namespaces

  XPathFactory factory = XPathFactory.newInstance();
  XPath xpath = factory.newXPath();
  xpath.setNamespaceContext(new PersonalNamespaceContext());
  XPathExpression expr = xpath.compile(
    "//pre:book[pre:author='Neal Stephenson']/pre:title/text()"
  );

  Object result = expr.evaluate(doc, XPathConstants.NODESET);
  NodeList nodes = (NodeList) result;
  for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue()); 
  }
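
The effect of the namespace context is easiest to see side by side. The following sketch uses an inline document in the http://www.example.org/books namespace (sample data invented for the example) and an anonymous NamespaceContext in place of PersonalNamespaceContext. It evaluates the same query once with prefixes and once without.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceQueryDemo {

  // Hypothetical sample data: every element is in the default namespace
  private static final String BOOKS =
      "<inventory xmlns='http://www.example.org/books'>"
    + "<book><title>Snow Crash</title><author>Neal Stephenson</author></book>"
    + "</inventory>";

  // Evaluates the expression and returns the string value of the result.
  public static String query(String expression) throws Exception {
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    Document doc = domFactory.newDocumentBuilder()
        .parse(new InputSource(new StringReader(BOOKS)));

    XPath xpath = XPathFactory.newInstance().newXPath();
    xpath.setNamespaceContext(new NamespaceContext() {
      public String getNamespaceURI(String prefix) {
        if (prefix == null) throw new NullPointerException("Null prefix");
        if ("pre".equals(prefix)) return "http://www.example.org/books";
        if ("xml".equals(prefix)) return XMLConstants.XML_NS_URI;
        return XMLConstants.NULL_NS_URI;
      }
      public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
      }
      public Iterator<String> getPrefixes(String uri) {
        throw new UnsupportedOperationException();
      }
    });
    return xpath.evaluate(expression, doc);
  }

}
```

With this data, the prefixed query //pre:book[pre:author='Neal Stephenson']/pre:title finds the title, while the unprefixed //book/title matches nothing and evaluates to the empty string, because unprefixed names in XPath 1.0 always mean "no namespace".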

XPath Extension Functions


Extension Function that verifies ISBN checksums

import java.util.List;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class ISBNValidator implements XPathFunction {
    
  public Object evaluate(List args) throws XPathFunctionException {

    if (args.size() != 1) {
      throw new XPathFunctionException(
       "Wrong number of arguments to valid-isbn()");
    }

    String isbn;
    Object o = args.get(0);

    // perform conversions
    if (o instanceof String) isbn = (String) args.get(0);
    else if (o instanceof Boolean) isbn = o.toString();
    else if (o instanceof Double) isbn = o.toString();
    else if (o instanceof NodeList) {
        NodeList list = (NodeList) o;
        Node node = list.item(0);
        // getTextContent is available in Java 5 and DOM 3.
        // In Java 1.4 and DOM 2, you'd need to recursively 
        // accumulate the content.
        isbn= node.getTextContent();
    }
    else {
      throw new XPathFunctionException("Could not convert argument type");
    }

    char[] data = isbn.toCharArray();
    if (data.length != 10) return Boolean.FALSE;
    int checksum = 0;
    for (int i = 0; i < 9; i++) {
        checksum += (i+1) * (data[i]-'0');
    }
    int checkdigit = checksum % 11;

    if (checkdigit + '0' == data[9] 
     || (data[9] == 'X' && checkdigit == 10)) {
        return Boolean.TRUE;
    }
    return Boolean.FALSE;

  }

}
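
The checksum arithmetic can be exercised on its own, outside of XPath. This is a sketch of the same ISBN-10 rule (weights 1 through 9, modulo 11, with X standing for 10), plus a digit check that the function above does not perform. The class name and the sample ISBNs are this example's own.

```java
public class Isbn10 {

  // Returns true if the 10-character string has a correct ISBN-10 check character.
  public static boolean isValid(String isbn) {
    if (isbn == null || isbn.length() != 10) return false;

    int checksum = 0;
    for (int i = 0; i < 9; i++) {
      char c = isbn.charAt(i);
      if (c < '0' || c > '9') return false; // only digits may be weighted
      checksum += (i + 1) * (c - '0');
    }
    int checkdigit = checksum % 11;

    char last = isbn.charAt(9);
    if (checkdigit == 10) return last == 'X';
    return last == (char) ('0' + checkdigit);
  }

}
```

For 0596002920 the weighted sum is 165, which is divisible by 11, so the check digit is 0. For 080442957X the sum is 230, which leaves a remainder of 10, hence the X.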

Java Validation API


Schema Validation Example

import java.io.*;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.*;
import org.xml.sax.SAXException;

public class DocbookXSDCheck {

  public static void main(String[] args) throws SAXException, IOException {

    // 1. Lookup a factory for the W3C XML Schema language
    SchemaFactory factory = 
      SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
    
    // 2. Compile the schema. 
    // Here the schema is loaded from a java.io.File, but you could use 
    // a java.net.URL or a javax.xml.transform.Source instead.
    File schemaLocation = new File("/opt/xml/docbook/xsd/docbook.xsd");
    Schema schema = factory.newSchema(schemaLocation);
  
    // 3. Get a validator from the schema.
    Validator validator = schema.newValidator();
    
    // 4. Parse the document you want to check.
    Source source = new StreamSource(args[0]);
    
    // 5. Check the document
    try {
      validator.validate(source);
      System.out.println(args[0] + " is valid.");
    }
    catch (SAXException ex) {
      System.out.println(args[0] + " is not valid because ");
      System.out.println(ex.getMessage());
    }  
    
  }

}
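
The same five steps work with sources held in memory, which is convenient for experiments. The sketch below assumes a one-element schema written for this example (a year element of type xs:gYear) instead of the DocBook schema, and reports validity as a boolean.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class InMemorySchemaCheck {

  // Hypothetical schema used only for this illustration
  private static final String XSD =
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
    + "<xs:element name='year' type='xs:gYear'/>"
    + "</xs:schema>";

  // Returns true if the document text is valid against the schema above.
  public static boolean isValid(String xml) throws SAXException, IOException {
    SchemaFactory factory =
        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
    Validator validator = schema.newValidator();
    try {
      validator.validate(new StreamSource(new StringReader(xml)));
      return true;
    }
    catch (SAXException ex) {
      // With no ErrorHandler installed, validity errors arrive as exceptions
      return false;
    }
  }

}
```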

Validate against a document-specified schema

SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
Schema schema = factory.newSchema();

Other schema languages


Schema Augmentation

import java.io.*;
import javax.xml.transform.dom.*;
import javax.xml.validation.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.SAXException;

public class DocbookXSDAugmenter {

    public static void main(String[] args) 
      throws SAXException, IOException, ParserConfigurationException {

        SchemaFactory factory 
         = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
        File schemaLocation = new File("/opt/xml/docbook/xsd/docbook.xsd");
        Schema schema = factory.newSchema(schemaLocation);
        Validator validator = schema.newValidator();
        
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true); // never forget this
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse(new File(args[0]));
        
        DOMSource source = new DOMSource(doc);
        DOMResult result = new DOMResult();
        
        try {
            validator.validate(source, result);
            Document augmented = (Document) result.getNode();
            // do whatever you need to do with the augmented document...
        }
        catch (SAXException ex) {
            System.out.println(args[0] + " is not valid because ");
            System.out.println(ex.getMessage());
        }  
        
    }

}

Reporting Type Information


DOM 3 TypeInfo interface

package org.w3c.dom;
  
public interface TypeInfo {

 public String  getTypeName();
 public String  getTypeNamespace();
 public boolean isDerivedFrom(String typeNamespace, 
   String typeName, int derivationMethod);
 
 public static final int DERIVATION_RESTRICTION = 0x00000001;
 public static final int DERIVATION_EXTENSION   = 0x00000002;
 public static final int DERIVATION_UNION       = 0x00000004;
 public static final int DERIVATION_LIST        = 0x00000008;

}

JAXP TypeInfo provider class

package javax.xml.validation;

public abstract class TypeInfoProvider {

  public abstract TypeInfo getElementTypeInfo();
  public abstract TypeInfo getAttributeTypeInfo(int index);
  public abstract boolean  isIdAttribute(int index);
  public abstract boolean  isSpecified(int index);

}

Program to Report All Types

import java.io.*;
import javax.xml.validation.*;

import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class TypeLister extends DefaultHandler {

    private TypeInfoProvider provider;
    
    public TypeLister(TypeInfoProvider provider) {
        this.provider = provider;
    }

    public static void main(String[] args) throws SAXException, IOException {

        SchemaFactory factory 
         = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
        File schemaLocation = new File("/opt/xml/docbook/xsd/docbook.xsd");
        Schema schema = factory.newSchema(schemaLocation);
    
        ValidatorHandler vHandler = schema.newValidatorHandler();
        TypeInfoProvider provider = vHandler.getTypeInfoProvider();
        ContentHandler   cHandler = new TypeLister(provider);
        vHandler.setContentHandler(cHandler);
        
        XMLReader parser = XMLReaderFactory.createXMLReader();
        parser.setContentHandler(vHandler);
        parser.parse(args[0]);
        
    }
    
    public void startElement(String namespace, String localName,
      String qualifiedName, Attributes atts) throws SAXException {
        String type = provider.getElementTypeInfo().getTypeName();
        System.out.println(qualifiedName + ": " + type);
    }

}

Output

book: #AnonType_book
title: #AnonType_title
subtitle: #AnonType_subtitle
info: #AnonType_info
copyright: #AnonType_copyright
year: #AnonType_year
holder: #AnonType_holder
author: #AnonType_author
personname: #AnonType_personname
firstname: #AnonType_firstname
othername: #AnonType_othername
surname: #AnonType_surname
personblurb: #AnonType_personblurb
para: #AnonType_para
link: #AnonType_link

Java XML Digital Signatures API


Digital Signatures


Not Just for Signing XML


Generic Digital Signature Process

  1. The signature processor calculates a hash code for some data using a strong, one-way hash function.

  2. The processor encrypts the hash code using a private key.

  3. The verifier calculates the hash code for the data it's received.

  4. It then decrypts the encrypted hash code using the public key to see if the hash codes match.
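
In Java the hash-then-encrypt steps are normally bundled into java.security.Signature. The following sketch illustrates the generic process only, not XML Signature; the choice of SHA256withRSA, the freshly generated 2048-bit key pair, and the sample message are all assumptions of the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

public class HashThenSign {

  // Signs a message, then verifies both the original and a tampered copy.
  // Returns { originalVerifies, tamperedVerifies }.
  public static boolean[] demo() throws GeneralSecurityException {
    KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
    generator.initialize(2048);
    KeyPair pair = generator.generateKeyPair();

    byte[] data = "Hot Cop".getBytes(StandardCharsets.UTF_8);

    // Steps 1 and 2: hash the data and sign the hash with the private key
    Signature signer = Signature.getInstance("SHA256withRSA");
    signer.initSign(pair.getPrivate());
    signer.update(data);
    byte[] signature = signer.sign();

    byte[] tampered = data.clone();
    tampered[0] ^= 1; // flip one bit

    return new boolean[] {
      verify(pair.getPublic(), data, signature),
      verify(pair.getPublic(), tampered, signature)
    };
  }

  // Steps 3 and 4: rehash the received data and compare using the public key
  private static boolean verify(PublicKey key, byte[] data, byte[] signature)
      throws GeneralSecurityException {
    Signature verifier = Signature.getInstance("SHA256withRSA");
    verifier.initVerify(key);
    verifier.update(data);
    return verifier.verify(signature);
  }

}
```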


XML Signature Process

  1. The signature processor digests (calculates the hash code for) a data object.

  2. The processor places the digest value in a Signature element.

  3. The processor digests the Signature element.

  4. The processor cryptographically signs the Signature element.
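
Step 1 on its own can be approximated with MessageDigest and Base64. This is a deliberately simplified sketch: it hashes raw bytes with SHA-1 and skips the transforms and canonicalization a real XML Signature processor applies first. The class name is invented, and java.util.Base64 requires Java 8 or later.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class DigestValueSketch {

  // Returns the Base64-encoded SHA-1 digest of the bytes, which is
  // the form used for the content of a DigestValue element.
  public static String digestValue(byte[] octets)
      throws NoSuchAlgorithmException {
    MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
    return Base64.getEncoder().encodeToString(sha1.digest(octets));
  }

}
```

A SHA-1 digest is 20 bytes long, so the encoded value is always 28 characters ending in a single = sign.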


A Detached Signature

<?xml version='1.0' encoding='UTF-8'?>
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <SignedInfo>
    <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2000/WD-xml-c14n-20000119"/>
    <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
    <Reference URI="http://www.cafeconleche.org/slides/hoffman/fundamentals/examples/hotcop.xml">
      <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
      <DigestValue>nvfYilfgN/rICyzhGmjidKCFoC8=</DigestValue>
    </Reference>
  </SignedInfo>
  <SignatureValue>
    hfowa4qdbuMkoZfX1/VXd4UBpIpZMM5+6CElmY7jOIKFqvXq5A5VKw==
  </SignatureValue>
  <KeyInfo>
    <KeyValue>
      <DSAKeyValue>
        <P>
          /X9TgR11EilS30qcLuzk5/YRt1I870QAwx4/gLZRJmlFXUAiUftZPY1Y+r/F9bow9s
          ubVWzXgTuAHTRv8mZgt2uZUKWkn5/oBHsQIsJPu6nX/rfGG/g7V+fGqKYVDwT7g/bT
          xR7DAjVUE1oWkTL2dfOuK2HXKu/yIgMZndFIAcc=
        </P>
        <Q>l2BQjxUjC8yykrmCouuEC/BYHPU=</Q>
        <G>
          9+GghdabPd7LvKtcNrhXuXmUr7v6OuqC+VdMCz0HgmdRWVeOutRZT+ZxBxCBgLRJFn
          Ej6EwoFhO3zwkyjMim4TwWeotUfI0o4KOuHiuzpnWRbqN/C/ohNWLx+2J6ASQ7zKTx
          vqhRkImog9/hWuWfBpKLZl6Ae1UlZAFMO/7PSSo=
        </G>
        <Y>
          6jKpNnmkkWeArsn5Oeeg2njcz+nXdk0f9kZI892ddlR8Lg1aMhPeFTYuoq3I6neFlb
          BjWzuktNZKiXYBfKsSTB8U09dTiJo2ir3HJuY7eW/p89osKMfixPQsp9vQMgzph6Qa
          lY7j4MB7y5ROJYsTr1/fFwmj/yhkHwpbpzed1LE=
        </Y>
      </DSAKeyValue>
    </KeyValue>
    <X509Data>
      <X509IssuerSerial>
        <X509IssuerName>CN=Elliotte Rusty Harold, OU=Metrotech, O=Polytechnic, L=Brooklyn, ST=New York, C=US</X509IssuerName>
        <X509SerialNumber>983556890</X509SerialNumber></X509IssuerSerial>
      <X509SubjectName>CN=Elliotte Rusty Harold, OU=Metrotech, O=Polytechnic, L=Brooklyn, ST=New York, C=US</X509SubjectName>
      <X509Certificate>
MIIDLzCCAu0CBDqf4xowCwYHKoZIzjgEAwUAMH0xCzAJBgNVBAYTAlVTMREwDwYDVQQIEwhOZXcg
WW9yazERMA8GA1UEBxMIQnJvb2tseW4xFDASBgNVBAoTC1BvbHl0ZWNobmljMRIwEAYDVQQLEwlN
ZXRyb3RlY2gxHjAcBgNVBAMTFUVsbGlvdHRlIFJ1c3R5IEhhcm9sZDAeFw0wMTAzMDIxODE0NTBa
Fw0wMTA1MzExODE0NTBaMH0xCzAJBgNVBAYTAlVTMREwDwYDVQQIEwhOZXcgWW9yazERMA8GA1UE
BxMIQnJvb2tseW4xFDASBgNVBAoTC1BvbHl0ZWNobmljMRIwEAYDVQQLEwlNZXRyb3RlY2gxHjAc
BgNVBAMTFUVsbGlvdHRlIFJ1c3R5IEhhcm9sZDCCAbgwggEsBgcqhkjOOAQBMIIBHwKBgQD9f1OB
HXUSKVLfSpwu7OTn9hG3UjzvRADDHj+AtlEmaUVdQCJR+1k9jVj6v8X1ujD2y5tVbNeBO4AdNG/y
ZmC3a5lQpaSfn+gEexAiwk+7qdf+t8Yb+DtX58aophUPBPuD9tPFHsMCNVQTWhaRMvZ1864rYdcq
7/IiAxmd0UgBxwIVAJdgUI8VIwvMspK5gqLrhAvwWBz1AoGBAPfhoIXWmz3ey7yrXDa4V7l5lK+7
+jrqgvlXTAs9B4JnUVlXjrrUWU/mcQcQgYC0SRZxI+hMKBYTt88JMozIpuE8FnqLVHyNKOCjrh4r
s6Z1kW6jfwv6ITVi8ftiegEkO8yk8b6oUZCJqIPf4VrlnwaSi2ZegHtVJWQBTDv+z0kqA4GFAAKB
gQDqMqk2eaSRZ4Cuyfk556DaeNzP6dd2TR/2Rkjz3Z12VHwuDVoyE94VNi6ircjqd4WVsGNbO6S0
1kqJdgF8qxJMHxTT11OImjaKvccm5jt5b+nz2iwox+LE9Cyn29AyDOmHpBqVjuPgwHvLlE4lixOv
X98XCaP/KGQfClunN53UsTALBgcqhkjOOAQDBQADLwAwLAIUODqxsFzS96BjrVA4LVo5FzuWBRMC
FC0xfXxbaJaCJuVqtcBv4bqwV0EX
      </X509Certificate>
    </X509Data>
  </KeyInfo>
</Signature>


Signing a Document: Step 1. Create the Signer Object

XMLSignatureFactory factory = XMLSignatureFactory.getInstance("DOM");

DigestMethod sha1 = factory.newDigestMethod(DigestMethod.SHA1, null);
CanonicalizationMethod inclusive = factory.newCanonicalizationMethod
 (CanonicalizationMethod.INCLUSIVE, (C14NMethodParameterSpec) null);
SignatureMethod rsasha1 
 = factory.newSignatureMethod(SignatureMethod.RSA_SHA1, null);

Transform enveloped = factory.newTransform
  (Transform.ENVELOPED, (TransformParameterSpec) null);
List transforms = Collections.singletonList(enveloped);

// empty string means sign the current, complete document
Reference ref = factory.newReference("", sha1, transforms);
List references = Collections.singletonList(ref);

SignedInfo signer = factory.newSignedInfo(inclusive, rsasha1, references);

Signing a Document: Step 2. Create the Key

char[] password = "secret".toCharArray();
KeyStore store = KeyStore.getInstance("JKS");
InputStream keys = new FileInputStream("keys.jks");
store.load(keys, password);
KeyStore.PrivateKeyEntry entry = (KeyStore.PrivateKeyEntry) store.getEntry
  ("theKey", new KeyStore.PasswordProtection(password));
X509Certificate cert = (X509Certificate) entry.getCertificate();

KeyInfoFactory keyFactory = factory.getKeyInfoFactory();
List certs = new ArrayList();
certs.add(cert.getSubjectX500Principal().getName());
certs.add(cert);
X509Data data = keyFactory.newX509Data(certs);
List dataList = Collections.singletonList(data);
KeyInfo key = keyFactory.newKeyInfo(dataList);

Signing a Document: Step 3. Sign the Document

Document doc = getDOMDocument( /* wherever you like */ );

DOMSignContext context 
  = new DOMSignContext(entry.getPrivateKey(), doc.getDocumentElement());

XMLSignature signature = factory.newXMLSignature(signer, key);
signature.sign(context);
// The Signature element has now been added to the Document.

Verifying a Signature

NodeList nodes 
  = doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature");
// X509KeySelector is an application-supplied KeySelector subclass,
// not a class that ships with the JDK.
DOMValidateContext dvc 
  = new DOMValidateContext(new X509KeySelector(), nodes.item(0));
XMLSignature signature = factory.unmarshalXMLSignature(dvc);
if (!signature.validate(dvc)) {
  System.err.println(
    "Signature failed! Document may have been tampered with.");
} 
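
Signing and verification can be tried together without a keystore. The sketch below is a condensed round trip under several assumptions of its own: a throwaway RSA key pair, SHA-256 based algorithms instead of the SHA-1 ones shown earlier, no KeyInfo element, and KeySelector.singletonKeySelector() standing in for a real key selector.

```java
import java.io.StringReader;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Collections;
import javax.xml.crypto.KeySelector;
import javax.xml.crypto.dsig.CanonicalizationMethod;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.SignedInfo;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignature;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.dom.DOMSignContext;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import javax.xml.crypto.dsig.spec.C14NMethodParameterSpec;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EnvelopedSignatureDemo {

  // Signs the document with an enveloped signature, then validates it.
  public static boolean signAndValidate(String xml) throws Exception {
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    Document doc = domFactory.newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));

    // Throwaway key pair; a real application would load keys from a keystore
    KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
    generator.initialize(2048);
    KeyPair pair = generator.generateKeyPair();

    XMLSignatureFactory factory = XMLSignatureFactory.getInstance("DOM");
    Transform enveloped = factory.newTransform(
        Transform.ENVELOPED, (TransformParameterSpec) null);
    Reference ref = factory.newReference("",
        factory.newDigestMethod(DigestMethod.SHA256, null),
        Collections.singletonList(enveloped), null, null);
    SignedInfo signer = factory.newSignedInfo(
        factory.newCanonicalizationMethod(
            CanonicalizationMethod.INCLUSIVE, (C14NMethodParameterSpec) null),
        factory.newSignatureMethod(
            "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256", null),
        Collections.singletonList(ref));

    // No KeyInfo: the verifier below already knows the public key
    XMLSignature signature = factory.newXMLSignature(signer, null);
    signature.sign(
        new DOMSignContext(pair.getPrivate(), doc.getDocumentElement()));

    NodeList nodes
      = doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature");
    DOMValidateContext dvc = new DOMValidateContext(
        KeySelector.singletonKeySelector(pair.getPublic()), nodes.item(0));
    return factory.unmarshalXMLSignature(dvc).validate(dvc);
  }

}
```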

StAX


XML API Styles


Pull Parsing

pull parsing is the way to go in the future. The first 3 XML parsers (Lark, NXP, and expat) all were event-driven because... er well that was 1996, can't exactly remember, seemed like a good idea at the time.

--Tim Bray on the xml-dev mailing list, Wednesday, September 18, 2002


Pull Parsing is


StAX


Major Classes and Interfaces

XMLStreamReader:
an interface that represents the parser
XMLInputFactory:
the factory class that instantiates an implementation-dependent implementation of XMLStreamReader
XMLStreamException:
the generic class for everything other than an IOException that might go wrong when parsing an XML document, particularly well-formedness errors

Simple Wellformedness Checker

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class StAXChecker {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java StAXChecker url" );
      return;   
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }
      
      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
        
      while (true) {
           int event = parser.next();
           if (event == XMLStreamConstants.END_DOCUMENT) {
                break;
           }
      }
      parser.close();
            
      // If we get here there are no exceptions
      System.out.println(args[0] + " is well-formed");      
    }
    catch (XMLStreamException ex) {
       System.out.println(args[0] + " is not well-formed"); 
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println(args[0] + " could not be checked due to an " 
       + ex.getClass().getName());   
      ex.printStackTrace();      
    }
        
  }

}

Output from a Simple Wellformedness Checker

$ java -classpath stax.jar:.:bea.jar StAXChecker http://www.cafeconleche.org/
http://www.cafeconleche.org/ is well-formed
$ java -classpath stax.jar:.:bea.jar StAXChecker http://www.xml.com/
http://www.xml.com/ is not well-formed
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[44,7]
Message: could not resolve entity named 'nbsp'


Event Codes


Listening to Events

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class EventLister {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java EventLister url" );
     return;    
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }

      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
        
      while (true) {
         int event = parser.next();
         if (event == XMLStreamConstants.START_ELEMENT) {
             System.out.println("Start tag");
         }
         else if (event == XMLStreamConstants.END_ELEMENT) {
             System.out.println("End tag");
         }
         else if (event == XMLStreamConstants.START_DOCUMENT) {
             System.out.println("Start document");
         }
         else if (event == XMLStreamConstants.CHARACTERS) {
             System.out.println("Text");
         }
         else if (event == XMLStreamConstants.CDATA) {
             System.out.println("CDATA Section");
         }
         else if (event == XMLStreamConstants.COMMENT) {
             System.out.println("Comment");
         }
         else if (event == XMLStreamConstants.DTD) {
             System.out.println("Document type declaration");
         }
         else if (event == XMLStreamConstants.ENTITY_REFERENCE) {
             System.out.println("Entity Reference");
         }
         else if (event == XMLStreamConstants.SPACE) {
             System.out.println("Ignorable white space");
         }
         else if (event == XMLStreamConstants.NOTATION_DECLARATION) {
             System.out.println("Notation Declaration");
         }
         else if (event == XMLStreamConstants.ENTITY_DECLARATION) {
             System.out.println("Entity Declaration");
         }
         else if (event == XMLStreamConstants.PROCESSING_INSTRUCTION) {
             System.out.println("Processing Instruction");
         }
         else if (event == XMLStreamConstants.END_DOCUMENT) {
             System.out.println("End Document");
             break;
         }
      }           
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
    }
        
  }

}

Output from EventLister

% java -classpath stax.jar:.:bea.jar EventLister hotcop.xml
Ignorable white space
Start tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
Start tag
Text
End tag
Text
End tag
Ignorable white space
End Document

XMLStreamReader getter methods depend on the current state

Valid methods for each event type

START_ELEMENT: next(), getName(), getLocalName(), hasName(), getPrefix(), getAttributeCount(), getAttributeName(int index), getAttributeNamespace(int index), getAttributePrefix(int index), getAttributeQName(int index), getAttributeType(int index), getAttributeValue(int index), getAttributeValue(String namespaceURI, String localName), isAttributeSpecified(), getNamespaceContext(), getNamespaceCount(), getNamespacePrefix(int index), getNamespaceURI(), getNamespaceURI(int index), getNamespaceURI(String prefix), getElementText(), nextTag()
ATTRIBUTE: next(), nextTag(), getAttributeCount(), getAttributeName(int index), getAttributeNamespace(int index), getAttributePrefix(int index), getAttributeQName(int index), getAttributeType(int index), getAttributeValue(int index), getAttributeValue(String namespaceURI, String localName), isAttributeSpecified()
NAMESPACE: next(), nextTag(), getNamespaceContext(), getNamespaceCount(), getNamespacePrefix(int index), getNamespaceURI(), getNamespaceURI(int index), getNamespaceURI(String prefix)
END_ELEMENT: next(), getName(), getLocalName(), hasName(), getPrefix(), getNamespaceContext(), getNamespaceCount(), getNamespacePrefix(int index), getNamespaceURI(), getNamespaceURI(int index), getNamespaceURI(String prefix), nextTag()
CHARACTERS: next(), getText(), getTextCharacters(), getTextCharacters(int sourceStart, char[] target, int targetStart, int length), getTextLength(), nextTag()
CDATA: next(), getText(), getTextCharacters(), getTextCharacters(int sourceStart, char[] target, int targetStart, int length), getTextLength(), nextTag()
COMMENT: next(), getText(), getTextCharacters(), getTextCharacters(int sourceStart, char[] target, int targetStart, int length), getTextLength(), nextTag()
SPACE: next(), getText(), getTextCharacters(), getTextCharacters(int sourceStart, char[] target, int targetStart, int length), getTextLength(), nextTag()
START_DOCUMENT: next(), getEncoding(), getPrefix(), getVersion(), isStandalone(), standaloneSet(), getCharacterEncodingScheme(), nextTag()
END_DOCUMENT: close()
PROCESSING_INSTRUCTION: next(), getPITarget(), getPIData(), nextTag()
ENTITY_REFERENCE: next(), getLocalName(), getText(), nextTag()
DTD: next(), getText(), nextTag()

getText()


getText() Example

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class EventText {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java EventText url" );
      return;    
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }

      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
        
      for (int event = parser.next(); 
           event != XMLStreamConstants.END_DOCUMENT; 
           event = parser.next()) {
         if (event == XMLStreamConstants.CHARACTERS 
           || event == XMLStreamConstants.SPACE 
           || event == XMLStreamConstants.CDATA) {
             System.out.println(parser.getText());
         }
         else if (event == XMLStreamConstants.COMMENT) {
             System.out.println("<!-- " + parser.getText() + "-->");
         }
      }           
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
    }
        
  }

}

Output

$ java -classpath stax.jar:.:bea.jar EventText hotcop.xml




Hot Cop


Jacques Morali


Henri Belolo


Victor Willis


Jacques Morali


PolyGram Records


6:20


1978


Village People


isFoo() and hasFoo()

Rather than testing for type, it's sometimes useful to ask if the current event can be queried for a certain characteristic:
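
For instance, XMLStreamReader offers isStartElement(), isEndElement(), isCharacters(), isWhiteSpace(), hasText(), and hasName(). The sketch below reads from an in-memory string with invented sample data. It uses two of these methods to count start-tags and to collect text that is not pure white space.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StateQueries {

  // Returns "<number of start-tags>:<concatenated non-whitespace text>".
  public static String summarize(String xml) throws XMLStreamException {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader parser
      = factory.createXMLStreamReader(new StringReader(xml));

    int startTags = 0;
    StringBuilder text = new StringBuilder();
    while (parser.hasNext()) {
      parser.next();
      if (parser.isStartElement()) {
        startTags++;
      }
      else if (parser.isCharacters() && !parser.isWhiteSpace()) {
        text.append(parser.getText());
      }
    }
    parser.close();
    return startTags + ":" + text;
  }

}
```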


hasText() Example

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class SimplerEventText {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java SimplerEventText url" );
      return;    
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }

      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
        
      for (int event = parser.next(); 
           parser.hasNext(); 
           event = parser.next()) {
         if (parser.hasText()) {
             System.out.println(parser.getText());
         }
      }           
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
    }
        
  }

}

More efficient way of getting text

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class EfficientEventText {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java EfficientEventText url" );
      return;    
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }

      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
        
      Writer out = new OutputStreamWriter(System.out);
      for (int event = parser.next(); 
           event != XMLStreamConstants.END_DOCUMENT; 
           event = parser.next()) {
         if (event == XMLStreamConstants.CHARACTERS 
           || event == XMLStreamConstants.SPACE 
           || event == XMLStreamConstants.CDATA) {
             out.write(parser.getTextCharacters(), 
              parser.getTextStart(), parser.getTextLength());
         }
      }          
      out.flush();
      out.close();
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
    }
        
  }

}

Reusable Text Arrays

public int getTextCharacters(int sourceStart, char[] target, int targetStart, int length)
  throws XMLStreamException, IndexOutOfBoundsException, 
         UnsupportedOperationException, IllegalStateException
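
As a rough illustration of how this method might be used (the class name BufferedEventText and the deliberately tiny buffer size are assumptions made for the example), the sketch below copies each text event into one caller-owned array in chunks, instead of asking the parser for a new String per event.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of the original examples
public class BufferedEventText {

  public static String text(String xml) throws XMLStreamException {
    XMLStreamReader parser
     = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    StringBuilder result = new StringBuilder();
    // One array reused for every event; small here only to force chunking
    char[] buffer = new char[8];
    while (parser.hasNext()) {
      int event = parser.next();
      if (event == XMLStreamConstants.CHARACTERS
        || event == XMLStreamConstants.SPACE
        || event == XMLStreamConstants.CDATA) {
        int length = parser.getTextLength();
        int sourceStart = 0;
        while (sourceStart < length) {
          // Copies up to buffer.length characters and reports how many
          int copied
           = parser.getTextCharacters(sourceStart, buffer, 0, buffer.length);
          if (copied <= 0) break;
          result.append(buffer, 0, copied);
          sourceStart += copied;
        }
      }
    }
    parser.close();
    return result.toString();
  }

  public static void main(String[] args) throws XMLStreamException {
    System.out.println(text("<SONG><TITLE>Hot Cop</TITLE></SONG>"));
  }

}
```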

Names

If the event is START_ELEMENT or END_ELEMENT, then the following methods in XMLStreamReader also work:

public String getLocalName()
public String getPrefix()
public QName getName()

QName Class

package javax.xml.namespace;

public class QName {

    public QName(String localPart);
    public QName(String namespaceURI, String localPart);
    public QName(String namespaceURI, String localPart, String prefix);
    
    public String getLocalPart();
    public String getPrefix();
    public String getNamespaceURI();
    
    public static QName valueOf(String qNameAsString);

    public int     hashCode();
    public boolean equals(Object object);
    public String  toString();

}
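
A short sketch of how these methods behave; the class name QNameDemo is invented for the example. Two QName objects are equal when their namespace URIs and local parts match, whatever the prefix, and valueOf() parses the {namespaceURI}localPart form that toString() produces.

```java
import javax.xml.namespace.QName;

// Hypothetical example class; not part of the original examples
public class QNameDemo {

  public static void main(String[] args) {
    QName a = new QName("http://www.w3.org/1999/xlink", "href", "xlink");
    QName b = new QName("http://www.w3.org/1999/xlink", "href", "xl");

    // equals() and hashCode() ignore the prefix
    System.out.println(a.equals(b));

    // toString() and valueOf() round-trip the URI and local part;
    // the prefix is not part of the string form
    QName parsed = QName.valueOf(a.toString());
    System.out.println(parsed.equals(a));

    // A name built from a local part alone has an empty namespace URI
    QName title = new QName("TITLE");
    System.out.println(title.getNamespaceURI().length() == 0);
  }

}
```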

Names Example

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class NamePrinter {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java NamePrinter url" );
      return;   
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }
      
      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
              
      while (true) {
         int event = parser.next();
         if (event == XMLStreamConstants.START_ELEMENT) {
             System.out.println("Start tag: ");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.END_ELEMENT) {
             System.out.println("End tag");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.START_DOCUMENT) {
             System.out.println("Start document");
         }
         else if (event == XMLStreamConstants.CHARACTERS) {
             System.out.println("Text");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.CDATA) {
             System.out.println("CDATA Section");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.COMMENT) {
             System.out.println("Comment");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.DTD) {
             System.out.println("Document type declaration");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.ENTITY_REFERENCE) {
             System.out.println("Entity Reference");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.SPACE) {
             System.out.println("Ignorable white space");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.PROCESSING_INSTRUCTION) {
             System.out.println("Processing Instruction");
             printEvent(parser);
         }
         else if (event == XMLStreamConstants.END_DOCUMENT) {
             System.out.println("End Document");
             break;
         } // end else if
      }  // end while
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
      ex.printStackTrace();
    }
        
  }
  
  private static void printEvent(XMLStreamReader parser) {
      // Only start-tags and end-tags have names. Some parsers throw an
      // IllegalStateException if getLocalName() is called on other events.
      if (parser.hasName()) {
          String localName = parser.getLocalName();
          String prefix = parser.getPrefix();
          String uri = parser.getNamespaceURI();
      
          if (localName != null) System.out.println("\tLocal name: " + localName);
          if (prefix != null) System.out.println("\tPrefix: " + prefix);
          if (uri != null) System.out.println("\tNamespace URI: " + uri);
      }
      System.out.println();
  }

}

Names Example Output

$ java -classpath .:bea.jar:stax.jar NamePrinter hotcop.xml
Ignorable white space

Start tag: 
        Local name: SONG
        Namespace URI: 

Text

Start tag: 
        Local name: TITLE
        Namespace URI: 

Text

End tag
        Local name: TITLE
        Namespace URI: 

Text

Start tag: 
        Local name: COMPOSER
        Namespace URI: 

Text

End tag
        Local name: COMPOSER
        Namespace URI: 

Text

Start tag: 
        Local name: COMPOSER
        Namespace URI: 

Text

End tag
        Local name: COMPOSER
        Namespace URI: 

Text

Start tag: 
        Local name: COMPOSER
        Namespace URI: 

Text

End tag
        Local name: COMPOSER
        Namespace URI: 

Text

Start tag: 
        Local name: PRODUCER
        Namespace URI: 

Text

End tag
        Local name: PRODUCER
        Namespace URI: 

Text

Start tag: 
        Local name: PUBLISHER
        Namespace URI: 

Text

End tag
        Local name: PUBLISHER
        Namespace URI: 

Text

Start tag: 
        Local name: LENGTH
        Namespace URI: 

Text

End tag
        Local name: LENGTH
        Namespace URI: 

Text

Start tag: 
        Local name: YEAR
        Namespace URI: 

Text

End tag
        Local name: YEAR
        Namespace URI: 

Text

Start tag: 
        Local name: ARTIST
        Namespace URI: 

Text

End tag
        Local name: ARTIST
        Namespace URI: 

Text

End tag
        Local name: SONG
        Namespace URI: 

Ignorable white space

End Document

RSSLister

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class RSSLister {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java RSSLister url" );
      return;    
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }

      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
        
      boolean printing = false;
      for (int event = parser.next(); 
           parser.hasNext(); 
           event = parser.next()) {
         if (event == XMLStreamConstants.START_ELEMENT) {
             String name = parser.getLocalName();
             if (name.equals("title")) printing = true;
         }
         else if (event == XMLStreamConstants.END_ELEMENT) {
             String name = parser.getLocalName();
             if (name.equals("title")) printing = false;
         }
         else if (parser.hasText() && event != XMLStreamConstants.COMMENT) {
             if (printing) System.out.println(parser.getText());
         }
      }  
      parser.close();
         
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
    }
        
  }

}

RSSLister Output

$ java -classpath stax.jar:.:bea.jar RSSLister ananova.rss
Ananova:
Archeology
Powered by News Is Free
Britain's earliest leprosy victim may have been found
20th anniversary of Mary Rose recovery
'Proof of Jesus' burial box damaged on way to Canada
Remains of four woolly rhinos give new insight into Ice Age
Experts solve crop lines mystery

Improved RSSLister

Print only item titles:

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class BetterRSSLister {

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java BetterRSSLister url" );
      return;    
    }
        
    try {

      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
          // Maybe it's a file name
          in = new FileInputStream(args[0]);
      }

      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLStreamReader parser = factory.createXMLStreamReader(in);
        
      boolean inItem = false;
      boolean inTitle = false;
      // I am relying on no recursion here. To fix this
      // just keep an int count rather than a boolean
      for (int event = parser.nextTag(); 
           parser.hasNext(); 
           event = parser.next()) {
         if (event == XMLStreamConstants.START_ELEMENT) {
             String name = parser.getLocalName();
             if (name.equals("title")) inTitle = true;
             else if (name.equals("item")) inItem = true;
         }
         else if (event == XMLStreamConstants.END_ELEMENT) {
             String name = parser.getLocalName();
             if (name.equals("title")) inTitle = false;
             else if (name.equals("item")) inItem = false;
          }
         else if (parser.hasText() && event != XMLStreamConstants.COMMENT) {
             if (inItem && inTitle) System.out.println(parser.getText());
         }
      }  
      parser.close();
      
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
    }
        
  }

}
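
The comment in BetterRSSLister suggests keeping an int count rather than a boolean. One way that might look is sketched below; the class name NestedTitleLister and the choice to return a list instead of printing are assumptions made for the example. Text is also accumulated until the title's end-tag, since a parser may deliver one title as several text events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of the original examples
public class NestedTitleLister {

  public static List<String> itemTitles(String xml) throws XMLStreamException {
    XMLStreamReader parser
     = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    List<String> titles = new ArrayList<String>();
    int itemDepth = 0;   // number of item elements currently open
    int titleDepth = 0;  // number of title elements currently open
    StringBuilder current = new StringBuilder();
    while (parser.hasNext()) {
      int event = parser.next();
      if (event == XMLStreamConstants.START_ELEMENT) {
        String name = parser.getLocalName();
        if (name.equals("title")) titleDepth++;
        else if (name.equals("item")) itemDepth++;
      }
      else if (event == XMLStreamConstants.END_ELEMENT) {
        String name = parser.getLocalName();
        if (name.equals("item")) itemDepth--;
        else if (name.equals("title")) {
          titleDepth--;
          // The outermost title inside an item has just closed
          if (itemDepth > 0 && titleDepth == 0) {
            titles.add(current.toString());
            current.setLength(0);
          }
        }
      }
      else if (event == XMLStreamConstants.CHARACTERS
        || event == XMLStreamConstants.CDATA) {
        if (itemDepth > 0 && titleDepth > 0) current.append(parser.getText());
      }
    }
    parser.close();
    return titles;
  }

  public static void main(String[] args) throws XMLStreamException {
    String feed = "<rss><channel><title>Feed</title>"
      + "<item><title>First story</title></item></channel></rss>";
    System.out.println(itemTitles(feed));
  }

}
```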

BetterRSSLister Output

$ java -classpath stax.jar:.:bea.jar BetterRSSLister ananova.rss
Archeology
Powered by News Is Free
Britain's earliest leprosy victim may have been found
20th anniversary of Mary Rose recovery
'Proof of Jesus' burial box damaged on way to Canada
Remains of four woolly rhinos give new insight into Ice Age
Experts solve crop lines mystery

The nextTag() method
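
As a hedged sketch of where nextTag() helps: it skips white space, comments, and processing instructions, and stops at the next start-tag or end-tag, throwing an XMLStreamException if it meets anything else. The example assumes a document with no DOCTYPE whose root element contains only text-only children; the class name ComposerLister is invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of the original examples
public class ComposerLister {

  public static List<String> composers(String xml) throws XMLStreamException {
    XMLStreamReader parser
     = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    List<String> result = new ArrayList<String>();

    parser.nextTag();  // moves to the root element's start-tag
    // Each call skips ignorable content between the child elements;
    // the loop ends when nextTag() reaches the root's end-tag
    while (parser.nextTag() == XMLStreamConstants.START_ELEMENT) {
      String name = parser.getLocalName();
      // getElementText() reads a text-only element and leaves
      // the parser on that element's end-tag
      String text = parser.getElementText();
      if (name.equals("COMPOSER")) result.add(text);
    }
    parser.close();
    return result;
  }

  public static void main(String[] args) throws XMLStreamException {
    String song = "<SONG>\n  <TITLE>Hot Cop</TITLE>\n"
      + "  <COMPOSER>Jacques Morali</COMPOSER>\n</SONG>";
    System.out.println(composers(song));
  }

}
```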


Attributes
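
A minimal sketch of reading attributes by index when the parser is positioned on a start-tag. The class name AttributeLister is invented for the example. By default the parser is namespace aware, so xmlns declarations are not reported as attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of the original examples
public class AttributeLister {

  // Collects name=value for every attribute in the document
  public static List<String> attributes(String xml) throws XMLStreamException {
    XMLStreamReader parser
     = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    List<String> result = new ArrayList<String>();
    while (parser.hasNext()) {
      if (parser.next() == XMLStreamConstants.START_ELEMENT) {
        for (int i = 0; i < parser.getAttributeCount(); i++) {
          // getAttributeName() returns a QName; its string form is
          // {namespaceURI}localPart, or just localPart with no namespace
          QName name = parser.getAttributeName(i);
          result.add(name + "=" + parser.getAttributeValue(i));
        }
      }
    }
    parser.close();
    return result;
  }

  public static void main(String[] args) throws XMLStreamException {
    System.out.println(attributes("<SONG YEAR='1978' LENGTH='6:20'/>"));
  }

}
```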


Attributes Example: PullSpider

import javax.xml.stream.*;
import java.net.*;
import java.io.*;
import java.util.*;

public class PullSpider {

  // Need to keep track of where we've been 
  // so we don't get stuck in an infinite loop
  private List spideredURIs = new Vector();

  // This linked list keeps track of where we're going.
  // Although the LinkedList class does not guarantee queue like
  // access, I always access it in a first-in/first-out fashion.
  private LinkedList queue = new LinkedList();
  
  private URL currentURL;
  private XMLInputFactory factory;
  
  public PullSpider() {
      this.factory = XMLInputFactory.newInstance();
  }

  private void processStartTag(XMLStreamReader parser) {
    
    String type 
     = parser.getAttributeValue("http://www.w3.org/1999/xlink", "type");
    if (type != null) {
      String href 
       = parser.getAttributeValue("http://www.w3.org/1999/xlink", "href");
          if (href != null) {
            try {
              URL foundURL = new URL(currentURL, href);
              // Skip URLs already visited or already waiting in the queue
              if (!spideredURIs.contains(foundURL) 
               && !queue.contains(foundURL)) {
                queue.addFirst(foundURL);
              }
            }
           catch (MalformedURLException ex) {
             // skip this URL  
           }
        }
    }
  }
  
  public void spider(URL url) {
      
    System.out.println("Spidering " + url);
    currentURL = url;
    try {
      XMLStreamReader parser = factory.createXMLStreamReader(currentURL.openStream());
      spideredURIs.add(currentURL);
      
      for (int event = parser.next(); 
           parser.hasNext(); 
           event = parser.next()) {
         if (event == XMLStreamConstants.START_ELEMENT) {
             processStartTag(parser);
         }
       }  // end for
       parser.close();
       
       while (!queue.isEmpty()) {
         URL nextURL = (URL) queue.removeLast();
         spider(nextURL);
       }
      
    }
    catch (Exception ex) {
       // skip this document
    }
    
  }

  public static void main(String[] args) throws Exception {
        
    if (args.length == 0) {
      System.err.println("Usage: java PullSpider url" );
       return;  
    }
        
    PullSpider spider = new PullSpider();
    spider.spider(new URL(args[0]));
        
  } // end main

} // end PullSpider


Output from the PullSpider

$ java -classpath stax.jar:.:bea.jar PullSpider http://www.rddl.org
Spidering http://www.rddl.org
Spidering http://www.rddl.org/natures
Spidering http://www.rddl.org/purposes
Spidering http://www.rddl.org/xrd.css
Spidering http://www.rddl.org/rddl-xhtml.dtd
Spidering http://www.rddl.org/rddl-qname-1.mod
Spidering http://www.rddl.org/rddl-resource-1.mod
Spidering http://www.rddl.org/xhtml-arch-1.mod
Spidering http://www.rddl.org/xhtml-attribs-1.mod
Spidering http://www.rddl.org/xhtml-base-1.mod
Spidering http://www.rddl.org/xhtml-basic-form-1.mod
Spidering http://www.rddl.org/xhtml-basic-table-1.mod
Spidering http://www.rddl.org/xhtml-blkphras-1.mod
Spidering http://www.rddl.org/xhtml-blkstruct-1.mod
Spidering http://www.rddl.org/xhtml-charent-1.mod
Spidering http://www.rddl.org/xhtml-datatypes-1.mod
Spidering http://www.rddl.org/xhtml-framework-1.mod
Spidering http://www.rddl.org/xhtml-hypertext-1.mod
Spidering http://www.rddl.org/xhtml-image-1.mod
Spidering http://www.rddl.org/xhtml-inlphras-1.mod
Spidering http://www.rddl.org/xhtml-inlstruct-1.mod
Spidering http://www.rddl.org/xhtml-lat1.ent
Spidering http://www.rddl.org/xhtml-link-1.mod
Spidering http://www.rddl.org/xhtml-meta-1.mod
Spidering http://www.rddl.org/xhtml-notations-1.mod
Spidering http://www.rddl.org/xhtml-object-1.mod
Spidering http://www.rddl.org/xhtml-param-1.mod
Spidering http://www.rddl.org/xhtml-qname-1.mod
Spidering http://www.rddl.org/xhtml-rddl-model-1.mod
Spidering http://www.rddl.org/xhtml-special.ent
Spidering http://www.rddl.org/xhtml-struct-1.mod
Spidering http://www.rddl.org/xhtml-symbol.ent
Spidering http://www.rddl.org/xhtml-text-1.mod
Spidering http://www.rddl.org/xlink-module-1.mod
Spidering http://www.rddl.org/rddl.rdfs
Spidering http://www.rddl.org/rddl-integration.rxg
Spidering http://www.rddl.org/modules/rddl-1.rxm

Namespaces


The NamespaceContext Interface

package javax.xml.namespace;

import java.util.Iterator;

public interface NamespaceContext {

  public String   getNamespaceURI(String prefix);
  public String   getPrefix(String namespaceURI);
  public Iterator getPrefixes(String namespaceURI);
  
}
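
Because NamespaceContext is an interface, an application can supply its own bindings, for instance to XMLStreamWriter.setNamespaceContext(). The following is one possible map-backed sketch; the class name MapNamespaceContext and its bind() method are invented for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical example class; not part of the original examples
public class MapNamespaceContext implements NamespaceContext {

  private final Map<String, String> bindings
   = new LinkedHashMap<String, String>();

  public MapNamespaceContext() {
    // These two bindings are fixed by the namespaces specification
    bindings.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
    bindings.put(XMLConstants.XMLNS_ATTRIBUTE,
      XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
  }

  // Returns this object so that calls can be chained
  public MapNamespaceContext bind(String prefix, String namespaceURI) {
    bindings.put(prefix, namespaceURI);
    return this;
  }

  public String getNamespaceURI(String prefix) {
    if (prefix == null) throw new IllegalArgumentException("null prefix");
    String uri = bindings.get(prefix);
    // An unbound prefix maps to the empty string, not null
    return uri == null ? XMLConstants.NULL_NS_URI : uri;
  }

  public String getPrefix(String namespaceURI) {
    Iterator<String> prefixes = getPrefixes(namespaceURI);
    return prefixes.hasNext() ? prefixes.next() : null;
  }

  public Iterator<String> getPrefixes(String namespaceURI) {
    if (namespaceURI == null) {
      throw new IllegalArgumentException("null namespace URI");
    }
    List<String> result = new ArrayList<String>();
    for (Map.Entry<String, String> entry : bindings.entrySet()) {
      if (entry.getValue().equals(namespaceURI)) result.add(entry.getKey());
    }
    return result.iterator();
  }

}
```

Usage might look like new MapNamespaceContext().bind("xlink", "http://www.w3.org/1999/xlink").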


XMLStreamWriter

package javax.xml.stream;

import javax.xml.namespace.NamespaceContext;

public interface XMLStreamWriter {
  
  public void writeStartElement(String localName) 
    throws XMLStreamException;
  public void writeStartElement(String namespaceURI, String localName) 
    throws XMLStreamException;
  public void writeStartElement(String prefix,
                                String localName,
                                String namespaceURI) 
    throws XMLStreamException;

  public void writeEmptyElement(String namespaceURI, String localName) 
    throws XMLStreamException;
  public void writeEmptyElement(String prefix, String localName, String namespaceURI) 
    throws XMLStreamException;
  public void writeEmptyElement(String localName) 
    throws XMLStreamException;
    
  public void writeEndElement() 
    throws XMLStreamException;
    
  public void writeEndDocument() 
    throws XMLStreamException;

  public void writeAttribute(String localName, String value) 
    throws XMLStreamException;
  public void writeAttribute(String prefix,
                             String namespaceURI,
                             String localName,
                             String value) 
    throws XMLStreamException;
  public void writeAttribute(String namespaceURI,
                             String localName,
                             String value) 
    throws XMLStreamException;

  public void writeNamespace(String prefix, String namespaceURI) 
    throws XMLStreamException;
  public void writeDefaultNamespace(String namespaceURI)
    throws XMLStreamException;

  public void writeComment(String data) 
    throws XMLStreamException;
  public void writeProcessingInstruction(String target) 
    throws XMLStreamException;
  public void writeProcessingInstruction(String target,
                                         String data) 
    throws XMLStreamException;
  public void writeCData(String data) 
    throws XMLStreamException;
  public void writeDTD(String dtd) 
    throws XMLStreamException;
  public void writeEntityRef(String name) 
    throws XMLStreamException;
  public void writeStartDocument() 
    throws XMLStreamException;
  public void writeStartDocument(String version) 
    throws XMLStreamException;
  public void writeStartDocument(String encoding,
                                 String version) 
    throws XMLStreamException;
  public void writeCharacters(String text) 
    throws XMLStreamException;
    
  public void writeCharacters(char[] text, int start, int len) 
    throws XMLStreamException;

  public String getPrefix(String uri) 
    throws XMLStreamException;
  public void setPrefix(String prefix, String uri) 
    throws XMLStreamException;
  public void setDefaultNamespace(String uri) 
    throws XMLStreamException;
  public void setNamespaceContext(NamespaceContext context)
    throws XMLStreamException;
  public NamespaceContext getNamespaceContext();

  public void close() throws XMLStreamException;
  public void flush() throws XMLStreamException;  
  
  public Object getProperty(java.lang.String name) throws IllegalArgumentException;

}
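
Before the larger example, a small sketch of the basic call sequence may help. It writes a two-element document into a StringWriter; the class name SongWriter and the element names are chosen only for illustration. The exact form of the XML declaration depends on the implementation.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class; not part of the original examples
public class SongWriter {

  public static String write(String title, String artist)
    throws XMLStreamException {
    StringWriter out = new StringWriter();
    XMLStreamWriter serializer
     = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

    serializer.writeStartDocument();
    serializer.writeStartElement("SONG");

    serializer.writeStartElement("TITLE");
    // writeCharacters() escapes & and < as needed
    serializer.writeCharacters(title);
    serializer.writeEndElement();   // closes TITLE

    serializer.writeStartElement("ARTIST");
    serializer.writeCharacters(artist);
    serializer.writeEndElement();   // closes ARTIST

    serializer.writeEndElement();   // closes SONG
    serializer.writeEndDocument();
    serializer.close();
    return out.toString();
  }

  public static void main(String[] args) throws XMLStreamException {
    System.out.println(write("Hot Cop", "Village People"));
  }

}
```

The writer tracks open elements itself, which is why writeEndElement() takes no arguments.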




XMLStreamWriter Example: Convert RDDL to XHTML


Example: RDDLStripper

import javax.xml.stream.*;
import java.net.*;
import java.io.*;

 
public class RDDLStripper {
    
  public final static String RDDL_NS = "http://www.rddl.org/";

  public static void main(String[] args) {
        
    if (args.length == 0) {
      System.err.println("Usage: java RDDLStripper url" );
      return;    
    }
        
    try {      
      InputStream in;
      try {
        URL u = new URL(args[0]);
        in = u.openStream();
      }
      catch (MalformedURLException ex) {
        // Maybe it's a file name
        in = new FileInputStream(args[0]);
      }
      
      XMLStreamReader parser 
       = XMLInputFactory.newInstance().createXMLStreamReader(in);
      XMLStreamWriter serializer 
       = XMLOutputFactory.newInstance().createXMLStreamWriter(System.out);
        
      while (true) {
         int event = parser.next();
         if (parser.isStartElement()) {
             String namespaceURI = parser.getNamespaceURI();
             // getNamespaceURI() may return null for an element in no namespace
             if (!RDDL_NS.equals(namespaceURI)) {
                 serializer.writeStartElement(namespaceURI, parser.getLocalName());
                 // add attributes
                 for (int i = 0; i < parser.getAttributeCount(); i++) {
                     serializer.writeAttribute(
                       parser.getAttributeNamespace(i),
                       parser.getAttributeLocalName(i),
                       parser.getAttributeValue(i)
                     );
                 }
                 // add namespace declarations
                 for (int i = 0; i < parser.getNamespaceCount(); i++) {
                     String uri = parser.getNamespaceURI(i);
                     if (!RDDL_NS.equals(uri)) {
                       serializer.writeNamespace(parser.getNamespacePrefix(i), uri);
                     }
                 }
             }
         }
         else if (parser.isEndElement()) {
             String namespaceURI = parser.getNamespaceURI();
             if (!RDDL_NS.equals(namespaceURI)) {
                 serializer.writeEndElement();
             }
         }
         else if (event == XMLStreamConstants.CHARACTERS
           || event == XMLStreamConstants.SPACE) {
             serializer.writeCharacters(parser.getText());
         }
         else if (event == XMLStreamConstants.CDATA) {
             serializer.writeCData(parser.getText());
         }
         else if (event == XMLStreamConstants.COMMENT) {
             serializer.writeComment(parser.getText());
         }
         else if (event == XMLStreamConstants.DTD) {
             serializer.writeDTD(parser.getText());
         }
         else if (event == XMLStreamConstants.ENTITY_REFERENCE) {
             serializer.writeEntityRef(parser.getLocalName());
         }
         else if (event == XMLStreamConstants.PROCESSING_INSTRUCTION) {
             serializer.writeProcessingInstruction(parser.getPITarget(), parser.getPIData());
         }
         else if (event == XMLStreamConstants.END_DOCUMENT) {
            serializer.flush();
            break;
         }
      }  
      serializer.close();         
      parser.close();
      
    }
    catch (XMLStreamException ex) {
       System.out.println(ex);  
    }
    catch (IOException ex) {
      System.out.println("IOException while parsing " + args[0]);   
    }
        
  }

}

One of my favorite features


Future Work


To Learn More


Copyright 2007 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified March 12, 2007