Developed by Jeff Martin and Tim Bacon
Open Source: BSD license
Addresses many of the issues we've been discussing today, wrapped in a nice JUnit-based framework
Based on JAXP and DOM
Can use XPath
Can parse malformed HTML (or just use TagSoup)
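
To make the "JAXP and DOM" and "XPath" points concrete, here is a minimal sketch of the plain-JDK building blocks such a framework sits on. It does not use the framework's own API: the class name `XPathSketch`, the helper `evaluate`, and the sample markup are all hypothetical, chosen only for illustration. It parses well-formed markup into a DOM tree with JAXP and then evaluates an XPath expression against it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class: illustrates JAXP + DOM + XPath only,
// not the API of the framework described above.
final class XPathSketch {

    // Parses the XML string into a DOM Document via JAXP, then returns
    // the string value of the XPath expression evaluated on that document.
    public static String evaluate(String xml, String expression) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document =
                    builder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, document);
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            // Malformed input or a bad expression ends up here.
            throw new IllegalArgumentException(
                    "Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Sample markup is an assumption made for this sketch.
        String page = "<html><head><title>Orders</title></head>"
                + "<body><p id=\"total\">42</p></body></html>";
        System.out.println(evaluate(page, "/html/head/title"));   // prints Orders
        System.out.println(evaluate(page, "//p[@id='total']"));   // prints 42
    }
}
```

Note that the standard JAXP parser rejects malformed markup, which is why a tolerant parser such as TagSoup is needed for real-world HTML. In a JUnit test, the value returned by `evaluate` would simply be passed to an equality assertion.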