John Cowan has posted the fifth release candidate of TagSoup, an open source SAX parser written in Java for nasty, ugly HTML. I use TagSoup to convert Javadoc to well-formed XHTML. This release candidate fixes bugs and adds a --nocolons command-line option that "translates colons in elements and attribute names to underscores." TagSoup is dual-licensed under the Academic Free License and the GPL.
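
To make the quoted description of --nocolons concrete, here is a minimal sketch of that kind of name translation in plain Java. It does not use TagSoup and is not TagSoup's implementation; the class name NoColonsSketch and the method translateName are hypothetical names chosen for illustration only.

```java
public class NoColonsSketch {
    // Hypothetical helper, not part of TagSoup's API: replaces every colon
    // in an element or attribute name with an underscore.
    static String translateName(String name) {
        return name.replace(':', '_');
    }

    public static void main(String[] args) {
        // Names of the sort found in messy, tool-generated HTML.
        System.out.println(translateName("o:p"));      // prints o_p
        System.out.println(translateName("xml:lang")); // prints xml_lang
        System.out.println(translateName("div"));      // prints div (unchanged)
    }
}
```

A translation like this matters because a colon in an XML name is read as a namespace prefix separator, so a stray prefix with no matching declaration can trip up namespace-aware tools further down the pipeline.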