Testing XML

Elliotte Rusty Harold

Software Development 2006 West

Friday, November 17, 2006

elharo@metalab.unc.edu

http://www.cafeaulait.org/


More and More Applications Are Generating XML


XML is not Just a Text File


Different ways of representing the same syntax


Content can change but still be OK


Key Question


An XML document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" 
                       "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>CFBundleDevelopmentRegion</key>
    <string>English</string>
    <key>CFBundleExecutable</key>
    <string>thunderbird-bin</string>
    <key>CFBundleGetInfoString</key>
    <string>Thunderbird 1.0.2, © 2005 The Mozilla Organization</string>
    <key>CFBundleIconFile</key>
    <string>thunderbird</string>
    <key>CFBundleIdentifier</key>
    <string>org.mozilla.thunderbird</string>
    <key>CFBundleInfoDictionaryVersion</key>
    <string>6.0</string>
    <key>CFBundleName</key>
    <string>Thunderbird</string>
    <key>CFBundlePackageType</key>
    <string>APPL</string>
    <key>CFBundleShortVersionString</key>
    <string>1.0.2</string>
    <key>CFBundleSignature</key>
    <string>MOZM</string>
    <key>CFBundleVersion</key>
    <string>1.0.2</string>
    <key>NSAppleScriptEnabled</key>
    <true/>
</dict>
</plist>

An Alternate Representation of the Same Document

<?xml version="1.0" encoding="MacRoman"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd" [
  <!ENTITY version "1.0.2">
]>
<plist version = '1.0'>
<dict>
    <key>CFBundleDevelopmentRegion</key>
    <string>English</string>
    <key>CFBundleExecutable</key>
    <string>thunderbird-bin</string>
    <key>CFBundleGetInfoString</key>
    <string>Thunderbird &version;, &#xA9; 2005 The Mozilla Organization</string>
    <key>CFBundleIconFile</key>
    <string>thunderbird</string>
    <key>CFBundleIdentifier</key>
    <string>org.mozilla.thunderbird</string>
    <key>CFBundleInfoDictionaryVersion</key>
    <string>6.0</string>
    <key>CFBundleName</key>
    <string>Thunderbird</string>
    <key>CFBundlePackageType</key>
    <string>APPL</string>
    <key>CFBundleShortVersionString</key>
    <string>&version;</string>
    <key>CFBundleSignature</key>
    <string>MOZM</string>
    <key>CFBundleVersion</key>
    <string>&version;</string>
    <key>NSAppleScriptEnabled</key>
    <true/>
</dict>
</plist>

A Representation of the Same Information in a Different Document

<?xml version="1.0" encoding="MacRoman"?>
<?xml-stylesheet href="plist.css" type="text/css"?>
<!-- Removing all the white space may not make this document as easy to
     read, but it could make it faster to parse since there are fewer nodes
     to handle. -->
<!DOCTYPE plist SYSTEM "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0"><dict>
<key>CFBundleVersion</key><string>1.0.2</string>
<key>NSAppleScriptEnabled</key><true/>
<key>CFBundleIdentifier</key><string>org.mozilla.thunderbird</string>
<key>CFBundleInfoDictionaryVersion</key><string>6.0</string>
<key>CFBundleName</key><string>Thunderbird</string>
<key>CFBundlePackageType</key><string>APPL</string>
<key>CFBundleShortVersionString</key><string>1.0.2</string>
<key>CFBundleSignature</key><string>MOZM</string>
<key>CFBundleDevelopmentRegion</key><string>English</string>
<key>CFBundleExecutable</key><string>thunderbird-bin</string>
<key>CFBundleGetInfoString</key><string>Thunderbird 1.0.2, © 2005 The Mozilla Organization</string>
<key>CFBundleIconFile</key><string>thunderbird</string>
</dict></plist>
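The point of the three listings can be checked mechanically. The following is a minimal sketch, assuming only the JDK's built-in DOM parser; the class name SameContentDemo and the shortened sample strings are illustrative and are not part of the original slides. It shows that two strings which differ as text (quoting, entity references, character references) yield the same content once parsed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for this sketch
class SameContentDemo {

    // Parses the string and returns the text content of the root element
    static String rootText(String xml)
      throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String literal = "<string>Thunderbird 1.0.2, \u00A9 2005</string>";
        String withReferences =
            "<!DOCTYPE string [<!ENTITY version \"1.0.2\">]>"
          + "<string>Thunderbird &version;, &#xA9; 2005</string>";

        // Different byte for byte...
        System.out.println(literal.equals(withReferences));   // false
        // ...but the same content after parsing
        System.out.println(
          rootText(literal).equals(rootText(withReferences))); // true
    }
}
```

A plain string comparison would report a difference here, which is why the rest of the techniques compare parsed or normalized forms instead.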

The XML Infoset

The InfoSet defines 11 Kinds of Information Items

Not everyone agrees that this is a good thing, or that this is the right list!


Element Information Items

An Element Information Item includes:


The InfoSet Omits:


Infoset-significant details that sometimes should be ignored anyway

You may well wish to ignore more details when comparing:


Direct Testing with DOM


DOM Example

    private Document plist;
    
    protected void setUp() 
      throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // NEVER FORGET THIS!
        DocumentBuilder builder = factory.newDocumentBuilder();
        
        plist = builder.parse(new File("thunderbirdplist.xml"));
    }

    public void testNoTwoKeyElementsAreAdjacentDOM() {
       
        Element root = plist.getDocumentElement();
        Element dict = (Element) root.getElementsByTagName("dict").item(0);
        NodeList children = dict.getElementsByTagName("*");
        for (int i = 0; i < children.getLength(); i++) {
            Node element = children.item(i);
            if (element.getNodeName().equals("key")) {
                // item() returns null past the end of the list, so check
                // explicitly that every key is followed by something
                Node next = children.item(i + 1);
                assertNotNull("key is not followed by a value", next);
                assertFalse(next.getNodeName().equals("key"));
            }
        }
        
    }

Canonical XML


A Canonicalized Document

<plist version="1.0">
<dict>
    <key>CFBundleDevelopmentRegion</key>
    <string>English</string>
    <key>CFBundleExecutable</key>
    <string>thunderbird-bin</string>
    <key>CFBundleGetInfoString</key>
    <string>Thunderbird 1.0.2, © 2005 The Mozilla Organization</string>
    <key>CFBundleIconFile</key>
    <string>thunderbird</string>
    <key>CFBundleIdentifier</key>
    <string>org.mozilla.thunderbird</string>
    <key>CFBundleInfoDictionaryVersion</key>
    <string>6.0</string>
    <key>CFBundleName</key>
    <string>Thunderbird</string>
    <key>CFBundlePackageType</key>
    <string>APPL</string>
    <key>CFBundleShortVersionString</key>
    <string>1.0.2</string>
    <key>CFBundleSignature</key>
    <string>MOZM</string>
    <key>CFBundleVersion</key>
    <string>1.0.2</string>
    <key>NSAppleScriptEnabled</key>
    <true></true>
</dict>
</plist>

JUnit Canonicalization (using XOM)

import java.io.*;
import nu.xom.*;
import nu.xom.canonical.*;
import junit.framework.Assert;

public class CanonicalAssert extends Assert {

    public static void assertCanonicalEquals(Document expected, Document actual) {
        
        ByteArrayOutputStream expectedBytes = new ByteArrayOutputStream();
        ByteArrayOutputStream actualBytes = new ByteArrayOutputStream();
        
        try {
            Canonicalizer expectedCanonicalizer 
              = new Canonicalizer(expectedBytes);
            expectedCanonicalizer.write(expected);
            byte[] expectedArray = expectedBytes.toByteArray();
        
            Canonicalizer actualCanonicalizer 
              = new Canonicalizer(actualBytes);
            actualCanonicalizer.write(actual);
            byte[] actualArray = actualBytes.toByteArray();
            
            assertEquals(expectedArray.length, actualArray.length);
            for (int i = 0; i < expectedArray.length; i++) {
                assertEquals(expectedArray[i], actualArray[i]);
            }
        }
        catch (IOException ex) {
            fail("IOException while canonicalizing");
        }        
        
    }    

}

Document Subset Canonicalization


JUnit Document Subset Canonicalization (using XOM 1.1)

public void assertCanonicalEquals(Document expected, Document actual, String xpath) {
        
  ByteArrayOutputStream expectedBytes = new ByteArrayOutputStream();
  ByteArrayOutputStream actualBytes = new ByteArrayOutputStream();
  
  try {
    Canonicalizer expectedCanonicalizer = new Canonicalizer(expectedBytes);
    Nodes expectedNodes = expected.query(xpath);
    expectedCanonicalizer.write(expectedNodes);
    byte[] expectedArray = expectedBytes.toByteArray();
  
    Canonicalizer actualCanonicalizer = new Canonicalizer(actualBytes);
    Nodes actualNodes = actual.query(xpath);
    actualCanonicalizer.write(actualNodes);
    byte[] actualArray = actualBytes.toByteArray();
  
    assertEquals(expectedArray.length, actualArray.length);
    for (int i = 0; i < expectedArray.length; i++) {
      assertEquals(expectedArray[i], actualArray[i]);
    }
  }
  catch (IOException ex) {
    fail("IOException while canonicalizing");
  }        
        
} 

Exclusive XML Canonicalization


JUnit Exclusive Document Subset Canonicalization

public void assertCanonicalEquals(Document expected, Document actual, String xpath) {
    
    ByteArrayOutputStream expectedBytes = new ByteArrayOutputStream();
    ByteArrayOutputStream actualBytes = new ByteArrayOutputStream();
    
    try {
        Canonicalizer expectedCanonicalizer = new Canonicalizer(
          expectedBytes, Canonicalizer.EXCLUSIVE_XML_CANONICALIZATION);
        Nodes expectedNodes = expected.query(xpath);
        expectedCanonicalizer.write(expectedNodes);
        byte[] expectedArray = expectedBytes.toByteArray();
    
        Canonicalizer actualCanonicalizer = new Canonicalizer(
          actualBytes, Canonicalizer.EXCLUSIVE_XML_CANONICALIZATION);
        Nodes actualNodes = actual.query(xpath);
        actualCanonicalizer.write(actualNodes);
        byte[] actualArray = actualBytes.toByteArray();
    
        assertEquals(expectedArray.length, actualArray.length);
        for (int i = 0; i < expectedArray.length; i++) {
            assertEquals(expectedArray[i], actualArray[i]);
        }
    }
    catch (IOException ex) {
        fail("IOException while canonicalizing");
    }        
    
} 

Canonicalization tools and libraries


Validity


DTDs

<!ENTITY % plistObject 
  "(array | data | date | dict | real | integer | string | true | false )" >
<!ELEMENT plist %plistObject;>
<!ATTLIST plist version CDATA "1.0" >

<!-- Collections -->
<!ELEMENT array (%plistObject;)*>
<!ELEMENT dict (key, %plistObject;)*>
<!ELEMENT key (#PCDATA)>

<!--- Primitive types -->
<!ELEMENT string (#PCDATA)>
<!ELEMENT data (#PCDATA)> <!-- Contents interpreted as Base-64 encoded -->
<!ELEMENT date (#PCDATA)> <!-- Contents should conform to a subset of ISO 8601 
                               (in particular, YYYY '-' MM '-' DD 'T' HH ':' MM ':' SS 'Z'.  
                               Smaller units may be omitted with a loss of precision) -->

<!-- Numerical primitives -->
<!ELEMENT true EMPTY>  <!-- Boolean constant true -->
<!ELEMENT false EMPTY> <!-- Boolean constant false -->
<!ELEMENT real (#PCDATA)> <!-- Contents should represent a 
                               floating point number matching 
                               ("+" | "-")? d+ ("."d*)? ("E" ("+" | "-") d+)? 
                               where d is a digit 0-9.  -->
<!ELEMENT integer (#PCDATA)> <!-- Contents should represent a (possibly signed) 
                                  integer number in base 10 -->

Validating via JUnit

    public void testValidOutput() throws SAXException, IOException {
    
      File f = new File("filename.xml");
      InputSource in = new InputSource(new FileInputStream(f));
      
      XMLReader parser = XMLReaderFactory.createXMLReader(); 
      parser.setFeature("http://xml.org/sax/features/validation", true);
      parser.setErrorHandler(new ErrorHandler() {

        public void warning(SAXParseException exception) {
          // skip
        }

        public void error(SAXParseException exception) throws SAXException {
          throw exception;      
        }

        public void fatalError(SAXParseException exception) throws SAXException {
          throw exception;       
        }
          
      });
      parser.parse(in);
    
    } 

Changing the DTD

import org.xml.sax.*;
import java.io.*;

public class LocalResolver implements EntityResolver {
 
   public InputSource resolveEntity (String publicID, String systemID)
     throws IOException
   {
     // publicID is null when the DOCTYPE has only a system identifier
     if ("-//Apple Computer//DTD PLIST 1.0//EN".equals(publicID)) {
       InputStream in = new FileInputStream("plist.dtd");
       return new InputSource(in);
     } 
     else {
       return null;
     }
   }
 }
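An entity resolver only takes effect once it is installed on the parser with setEntityResolver(). The following is a minimal sketch of that wiring, assuming the JDK's built-in SAX parser obtained through SAXParserFactory. The class name InMemoryDtdDemo and the tiny stand-in DTD are hypothetical, and the DTD is served from a string rather than from plist.dtd so the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for this sketch
class InMemoryDtdDemo {

    // Stand-in DTD (an assumption); far smaller than Apple's real one
    static final String LOCAL_DTD =
        "<!ELEMENT plist (dict)>"
      + "<!ATTLIST plist version CDATA \"1.0\">"
      + "<!ELEMENT dict (key, string)*>"
      + "<!ELEMENT key (#PCDATA)>"
      + "<!ELEMENT string (#PCDATA)>";

    static boolean isValid(String xml)
      throws IOException, ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();

        // Serve the DTD from memory whenever the plist public ID is requested
        parser.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicID, String systemID) {
                if ("-//Apple Computer//DTD PLIST 1.0//EN".equals(publicID)) {
                    return new InputSource(new StringReader(LOCAL_DTD));
                }
                return null; // fall back to the parser's default lookup
            }
        });

        // Validity errors are ignored unless the handler rethrows them
        parser.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException ex) throws SAXException {
                throw ex;
            }
        });

        try {
            parser.parse(new InputSource(new StringReader(xml)));
            return true;
        }
        catch (SAXParseException ex) {
            return false; // validity or well-formedness error
        }
    }
}
```

Because the resolver answers the request itself, the test neither touches the network nor depends on the remote DTD staying unchanged.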

Mock DOCTYPE


W3C Schemas

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="plist">
    <xsd:complexType>
      <xsd:sequence minOccurs="1" maxOccurs="1">
        <xsd:element name="dict">
          <xsd:complexType>
            <xsd:sequence minOccurs="1" maxOccurs="123">
              <xsd:element name="key"    type="xsd:token"/>
              <xsd:choice>
                <xsd:element name="string" type="xsd:string"/>
                <xsd:element name="true"></xsd:element>
                <xsd:element name="false"></xsd:element>
              </xsd:choice>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
      <xsd:attribute name="version" type="xsd:string" fixed="1.0"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>

JUnit test for W3C Schemas

    public void testSchemaValidOutput() throws SAXException, IOException {
    
      File f = new File("filename.xml");
      InputSource in = new InputSource(new FileInputStream(f));

      XMLReader parser = XMLReaderFactory.createXMLReader(
        "org.apache.xerces.parsers.SAXParser"); 
      parser.setFeature("http://xml.org/sax/features/validation", true);
      parser.setFeature("http://apache.org/xml/features/validation/schema", true);
      parser.setProperty(
       "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation", 
       "examples/plist.xsd");
      // also http://apache.org/xml/properties/schema/external-schemaLocation
      parser.setErrorHandler(new ErrorHandler() {

        public void warning(SAXParseException exception) throws SAXException {
          // skip
        }

        public void error(SAXParseException exception) throws SAXException {
          throw exception;      
        }

        public void fatalError(SAXParseException exception) throws SAXException {
          throw exception;       
        }
          
      });
      parser.parse(in);
      
    } 
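The test above relies on Xerces-specific feature and property names. A parser-independent alternative is the standard javax.xml.validation API. The following is a minimal sketch under that assumption; the class name JaxpSchemaDemo and the reduced inline schema (string values only, any number of pairs) are illustrative and not taken from the slides.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class for this sketch
class JaxpSchemaDemo {

    // Reduced stand-in schema (an assumption): key/string pairs only
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='plist'><xsd:complexType>"
      + "<xsd:sequence><xsd:element name='dict'><xsd:complexType>"
      + "<xsd:sequence minOccurs='0' maxOccurs='unbounded'>"
      + "<xsd:element name='key' type='xsd:token'/>"
      + "<xsd:element name='string' type='xsd:string'/>"
      + "</xsd:sequence></xsd:complexType></xsd:element></xsd:sequence>"
      + "<xsd:attribute name='version' type='xsd:string' fixed='1.0'/>"
      + "</xsd:complexType></xsd:element>"
      + "</xsd:schema>";

    static boolean isSchemaValid(String xml) throws IOException, SAXException {
        SchemaFactory factory
          = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // A broken schema is a test setup problem, so let that propagate
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, the first error is thrown
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        }
        catch (SAXException ex) {
            return false;
        }
    }
}
```

In a JUnit test the return value would typically be wrapped in assertTrue(), or the exception simply allowed to fail the test so its message shows up in the report.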

RELAX NG schema

namespace a = "http://relaxng.org/ns/compatibility/annotations/1.0"

plistObject =
  array | data | date | dict | real | integer | \string | true | false
plist = element plist { attlist.plist, plistObject }
attlist.plist &= [ a:defaultValue = "1.0" ] attribute version { text }?

# Collections
array = element array { attlist.array, plistObject* }
attlist.array &= empty
dict = element dict { attlist.dict, (key, plistObject)* }
attlist.dict &= empty
key = element key { attlist.key, text }
attlist.key &= empty

# - Primitive types
\string = element string { attlist.string, text }
attlist.string &= empty
data = element data { attlist.data, text }
attlist.data &= empty

# Contents interpreted as Base-64 encoded
date = element date { attlist.date, text }
attlist.date &= empty
# Contents should conform to a subset of ISO 8601 (in particular, YYYY '-' MM '-' DD 'T' HH ':' MM ':' SS 'Z'.  Smaller units may be omitted with a loss of precision)

# Numerical primitives
true = element true { attlist.true, empty }
attlist.true &= empty

# Boolean constant true
false = element false { attlist.false, empty }
attlist.false &= empty

# Boolean constant false
real = element real { attlist.real, text }
attlist.real &= empty

# Contents should represent a floating point number matching ("+" | "-")? d+ ("."d*)? ("E" ("+" | "-") d+)? where d is a digit 0-9.
integer = element integer { attlist.integer, text }
attlist.integer &= empty
start = plist

JUnit test for RELAX NG Schemas

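public void testRELAXNGValid() 
  throws IOException, ParserConfigurationException, SAXException {

  // some of this might be moved into fixtures; the SchemaFactory lookup
  // below needs a RELAX NG-capable JAXP implementation on the classpath,
  // since the JDK's built-in factory supports only W3C XML Schema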
  DocumentBuilder parser 
    = DocumentBuilderFactory.newInstance().newDocumentBuilder();
  Document document = parser.parse(new File("filename.xml"));

  SchemaFactory factory 
    = SchemaFactory.newInstance(XMLConstants.RELAXNG_NS_URI);
  Source source = new StreamSource(new File("plist.rnc"));
  Schema schema = factory.newSchema(source);
  Validator validator = schema.newValidator();
 
  validator.validate(new DOMSource(document));
  // throws exception if document is invalid

}

XPath


Some XPath Tests for the plist


XPath in a JUnit test

import org.xml.sax.InputSource;
import javax.xml.xpath.*;

import junit.framework.*;
import java.io.*;

public class PListXPathTest extends TestCase {

    private InputSource plist;
    private XPath query;
    
    protected void setUp() throws IOException {
        plist = new InputSource(new FileInputStream("thunderbirdplist.xml"));      
        query = XPathFactory.newInstance().newXPath();
    }
    
    public void testNoTwoKeyElementsAreAdjacent() 
      throws XPathExpressionException {
       // //key/following-sibling::*[1]/self::key is empty
       
       Boolean result = (Boolean) query.evaluate(
         "//key/following-sibling::*[1]/self::key", 
         plist, XPathConstants.BOOLEAN);
       assertFalse(result.booleanValue());
        
    }

}

Writing as XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">

    <xsl:if test="not(//key[. = 'CFBundleExecutable'])">
      No CFBundleExecutable
    </xsl:if> 

    <xsl:if test="count(//key[. = 'CFBundleIconFile']) = 0">
      There is no CFBundleIconFile
    </xsl:if>

    <xsl:if test="count(//key[. = 'CFBundleIconFile']) &gt; 1">
      There is more than one CFBundleIconFile
    </xsl:if>
    
    <xsl:if test="not(contains(//key[. = 'CFBundleGetInfoString']/following-sibling::string, '© 2005 The Mozilla Organization'))">
      Missing copyright
    </xsl:if>

    <xsl:if test="string-length(//key[. = 'CFBundleSignature']/following-sibling::string) != 4">
      The CFBundleSignature is not four letters
    </xsl:if>

    <xsl:if test="count(//key/following-sibling::*[1]/self::key) != 0">
      Adjacent key elements
    </xsl:if>

  </xsl:template>

</xsl:stylesheet>
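A stylesheet like this one can be run from a Java test through the standard javax.xml.transform API, treating any non-empty output as a failure report. The following is a minimal sketch under that assumption; the class name XsltCheckDemo is hypothetical, and the inline stylesheet keeps only the adjacent-key check so the example stays short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class for this sketch
class XsltCheckDemo {

    // Cut-down version of the stylesheet above: one check only
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:if test='count(//key/following-sibling::*[1]/self::key) != 0'>"
      + "Adjacent key elements"
      + "</xsl:if>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns the stylesheet's complaints; empty means all checks passed
    static String report(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(
          new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString().trim();
    }
}
```

A test method could then call assertEquals("", XsltCheckDemo.report(document)), so that the stylesheet's own messages appear in the failure output.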

Combining XPath with Java


XPath+Java Test

public void testCFBundleVersionFormat() 
  throws XPathExpressionException {
       
    String regex = "\\d+\\.\\d+(\\.\\d+)?";
    String xpath = "//key[. = 'CFBundleVersion']/following-sibling::string[1]";
      
    String version = (String) query.evaluate(xpath, plist, XPathConstants.STRING);
      
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(version);
    assertTrue(matcher.matches());
      
}

Schematron


Schematron Syntax


Schematron schema for plists

Includes all the constraints previously listed as XPaths:

<?xml version="1.0"?>
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>A Schematron Schema for the Thunderbird PLists</title>

  <pattern>
    <rule context="plist">
      <assert test="//key[. = 'CFBundleExecutable']">
        There is a CFBundleExecutable.
      </assert>
      <assert test="count(//key[. = 'CFBundleIconFile']) = 1">
        There is exactly one CFBundleIconFile
      </assert>
      <assert test="contains(//key[. = 'CFBundleGetInfoString']/following-sibling::string, '© 2005 The Mozilla Organization')">
       The software is copyrighted by Mozilla.
      </assert>
      <assert test="string-length(//key[. = 'CFBundleSignature']/following-sibling::string) = 4">
        The CFBundleSignature is four letters
      </assert>
    </rule>
  </pattern>

  <!-- some tests are simpler here -->
  <pattern>
    <rule context="key">
      <assert test="name(following-sibling::*[1]) != 'key'">
        No two key elements are adjacent.
      </assert>
    </rule>
  </pattern>

</schema>
  

Running Schematron

  1. Use skeleton1-5.xsl to generate an XSLT stylesheet:

    $ xsltproc skeleton1-5.xsl plist.sct
    <?xml version="1.0" standalone="yes"?>
    <axsl:stylesheet xmlns:axsl="http://www.w3.org/1999/XSL/Transform" xmlns:sch="http://www.ascc.net/xml/schematron" version="1.0">
      <axsl:template match="*|@*" mode="schematron-get-full-path">
        <axsl:apply-templates select="parent::*" mode="schematron-get-full-path"/>
        <axsl:text>/</axsl:text>
        <axsl:if test="count(. | ../@*) = count(../@*)">@</axsl:if>
        <axsl:value-of select="name()"/>
        ...

  2. Apply the stylesheet to the input documents:

    $ xsltproc plist.xsl thunderbirdplist.xml
    $

  3. Or after deliberately invalidating the input:

    $ xsltproc plist.xsl thunderbirdplist.xml
    <?xml version="1.0"?>
    The CFBundleSignature is four letters
    $

  4. You can customize the skeleton to produce different output.


JUnit test for Schematron Schema


XMLUnit

From the XMLUnit web page:

For those of you who've got into it you'll know that test driven development is great. It gives you the confidence to change code safe in the knowledge that if something breaks you'll know about it. Except for those bits you don't know how to test. Until now XML has been one of them. Oh sure you can use "<stuff></stuff>".equals("<stuff></stuff>"); but is that really gonna work when some joker decides to output a <stuff/>? -- damned right it's not ;-)

XML can be used for just about anything so deciding if two documents are equal to each other isn't as easy as a character for character match. Sometimes

<stuff-doc>
<stuff>
Stuff Stuff Stuff
</stuff>
<more-stuff>
Some More Stuff
</more-stuff>
</stuff-doc>
equals
<stuff-doc>
<more-stuff>
Some More Stuff</more-stuff>
<stuff>Stuff Stuff Stuff</stuff>
</stuff-doc>

and sometimes it doesn't... With XMLUnit you get the control, and you get to decide.


XMLUnit


A simple test case

import java.io.*;
import javax.xml.parsers.*;
import org.custommonkey.xmlunit.*;
import org.xml.sax.*;

public class SimpleTest extends XMLTestCase {

    public void testHelloWorld() 
      throws SAXException, IOException, ParserConfigurationException {
     
        String expected = "<GREETING>Hello World!</GREETING>";
        String actual = "<GREETING>Hello World!</GREETING>";
        assertXMLEqual(expected, actual);
        
    }

}

But this is not just a String comparison!

public void testHelloWorld2() 
  throws SAXException, IOException, ParserConfigurationException {
     
      String expected = "<?xml version='1.0'?><GREETING >Hello World!</GREETING>";
      String actual = "<GREETING>Hello World!</GREETING>";
      assertXMLEqual(expected, actual);
        
}

Readers and Documents


Assertion messages

    public void testHelloWorld5() 
      throws SAXException, IOException, ParserConfigurationException {
        
        String expected = "<GREETING>Hello World!</GREETING>";
        String actual = "<GREETING>\nHello World!\n</GREETING>";
        assertXMLEqual("White space seems to count", expected, actual);
        
    }

Concepts of Equality


Some documents that are equal but not identical

These tests all pass:

    public void testSiblingOrder() 
      throws SAXException, IOException, ParserConfigurationException {
        
        String expected = "<a><x/><y/></a>";
        String actual = "<a><y/><x/></a>";
        assertXMLEqual("Sibling order seems to count", expected, actual);
        
    }

    public void testNamespacePrefix() 
      throws SAXException, IOException, ParserConfigurationException {
        
        String expected = "<a xmlns='http://www.example.org'><x/></a>";
        String actual = "<pre:a xmlns:pre='http://www.example.org'><pre:x/></pre:a>";
        assertXMLEqual(expected, actual);
        
    }
    
    public void testDOCTYPE() 
      throws SAXException, IOException, ParserConfigurationException {
        
        String expected = "<!DOCTYPE a [<!ATTLIST a b CDATA 'test'>]>\n" +
          "<a><x/></a>";
        String actual = "<a b='test'><x/></a>";
        assertXMLEqual(expected, actual);
        
    }
    
    public void testCommentInProlog() 
      throws SAXException, IOException, ParserConfigurationException {
        
        String expected = "<!-- test -->" +
          "<a><x/></a>";
        String actual = "<a><x/></a>";
        assertXMLEqual(expected, actual);
        
    }

    public void testProcessingInstructionInProlog() 
      throws SAXException, IOException, ParserConfigurationException {
        
        String expected = "<?xml-stylesheet type='text/css' href='file.css'?>" +
          "<a><x/></a>";
        String actual = "<a><x/></a>";
        assertXMLEqual(expected, actual);
        
    }

CDATA Handling is Broken

This test fails

    public void testCDATA() 
      throws SAXException, IOException, ParserConfigurationException {
        
        String expected = "<a>Hello</a>";
        String actual = "<a><![CDATA[Hello]]></a>";
        assertXMLEqual(expected, actual);
        
    }
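One way to sidestep the problem, independent of XMLUnit, is to have the parser merge CDATA sections into ordinary text before any comparison takes place. The following is a minimal sketch, assuming the JDK's DOM parser and DOM Level 3 isEqualNode(); the class name CoalescingDemo is hypothetical, and this is a swapped-in technique rather than anything from the XMLUnit API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for this sketch
class CoalescingDemo {

    static Element parseRoot(String xml, boolean coalescing)
      throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // When true, CDATA sections are reported as plain text nodes
        factory.setCoalescing(coalescing);
        return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    }

    // Deep structural comparison of the two root elements
    static boolean sameTree(String expected, String actual, boolean coalescing)
      throws IOException, ParserConfigurationException, SAXException {
        return parseRoot(expected, coalescing)
          .isEqualNode(parseRoot(actual, coalescing));
    }
}
```

With coalescing off, "<a>Hello</a>" and "<a><![CDATA[Hello]]></a>" compare as different because the child node types differ; with coalescing on, both contain a single text node and compare as equal. Documents parsed this way can also be handed to the Document-based assertions.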

The Diff class


Testing for identity


XPath Based Tests


XPath Example

    private Document plist;
    
    protected void setUp() 
      throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // NEVER FORGET THIS!
        DocumentBuilder builder = factory.newDocumentBuilder();
        
        plist = builder.parse(new File("thunderbirdplist.xml"));
    }
    
    public void testNoTwoKeyElementsAreAdjacent() 
      throws TransformerException {
       
        assertXpathNotExists(
         "//key/following-sibling::*[1]/self::key", 
         plist);
        
    }

    public void testCreatorCodeIsMOZM() throws TransformerException {
       
        assertXpathEvaluatesTo("MOZM",
         "//key[. = 'CFBundleSignature']/following-sibling::string",
         plist);
      
    }    
    
    
    public void testThereIsAnIcon() throws TransformerException {
       
        assertXpathExists(
         "//key[. = 'CFBundleIconFile']", 
         plist);
        assertXpathExists(
         "//key[. = 'CFBundleIconFile']/following-sibling::string", 
         plist);
      
    }

DifferenceListener

package org.custommonkey.xmlunit;

public interface DifferenceListener {

    public final int RETURN_ACCEPT_DIFFERENCE = 0;
    public final int RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL = 1;
    public final int RETURN_IGNORE_DIFFERENCE_NODES_SIMILAR = 2;

    public int differenceFound(Difference difference);
    public void skippedComparison(Node control, Node test);

}

A DifferenceListener That Considers text nodes and CDATA sections to be equal

import org.custommonkey.xmlunit.*;
import org.w3c.dom.Node;

public class CDATAEqualsText implements DifferenceListener {

    public int differenceFound(Difference diff) {

        Node expected = diff.getControlNodeDetail().getNode();
        Node actual = diff.getTestNodeDetail().getNode();
        
        if ((expected.getNodeType() == Node.CDATA_SECTION_NODE 
             && actual.getNodeType() == Node.TEXT_NODE)
             ||
             (actual.getNodeType() == Node.CDATA_SECTION_NODE 
             && expected.getNodeType() == Node.TEXT_NODE)) {
         
            if (expected.getNodeValue().equals(actual.getNodeValue())) {
                return RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL;
            }
            
        }
        
        return RETURN_ACCEPT_DIFFERENCE;
        
        // We could really use something like DOM's NodeFilter
        // to indicate whether to process or skip the children
        
    }


    public void skippedComparison(Node expected, Node actual) {}

}

Comparing two documents with the custom DifferenceListener

        String expected = "<root>Hello</root>";
        String actual = "<root><![CDATA[Hello]]></root>";
        DifferenceListener listener = new CDATAEqualsText();
        Diff myDiff = new Diff(expected, actual);
        myDiff.overrideDifferenceListener(listener);
        assertTrue(myDiff.identical());

ElementQualifier


XMLUnit


To Learn More



Copyright 2005, 2006 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified January 4, 2006