2000 XML News

Sunday, December 31, 2000

I've updated Chapter 15 of the XML Bible, XSL Formatting Objects, to cover the November 21, 2000 Candidate Recommendation and FOP 0.16. The changes were quite minor overall. No elements were added or deleted. A few properties were added and another batch deleted. The largest change was the renaming of font-height-override-after to text-depth and font-height-override-before to text-altitude.


I've got an urgent need for a Motorola Timeport P7389 cell phone. This is a tri-band, WAP-enabled, GSM world phone. I need it to replace the Ericsson I888 VoiceStream mistakenly told me would support WAP. However, stores in New York don't seem to carry this model yet; and the only place I've found to buy one on the Web says I'll have to show the truck driver who delivers it my photo ID. Since this particular store is located in London, I doubt they'll ship across the Atlantic. If anybody does know where I can buy this particular model without a bundled service plan, would you please drop me a line? Thanks.

Saturday, December 30, 2000

The Apache XML Project has released version 0.16 of FOP, their XSL-FO to PDF converter. This is still far from a complete implementation of the XSL-FO Candidate Recommendation, but it's starting to become useful.


The Mozilla Project has posted a new roadmap outlining the path from the current 0.6 to 1.0. 1.0 is planned for sometime late in the second quarter of 2001, but you know what they say about the best-laid plans of mice and open source developers.

Friday, December 29, 2000

Zvon's published a hypertext XHTML 1.0 reference mostly derived from the HTML 4.01 standard and XHTML 1.0 DTDs.

Thursday, December 28, 2000

I've posted a preliminary version of DOMXIncluder, a Java class that will resolve documents that use XInclude to build a single document from multiple parts. Modulo bugs, it supports all of the October 26 Working Draft of XInclude except URLs that include XPointers or other fragment identifiers. A JDOM version is also available.
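For readers who want to see what resolving includes looks like in code, here is a minimal sketch. It does not use DOMXIncluder; it assumes a much later JAXP parser whose DocumentBuilderFactory offers setXIncludeAware, and it uses the http://www.w3.org/2001/XInclude namespace from the final specification rather than the one in the October 26 Working Draft. The class name, file names, and sample content are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class XIncludeSketch {

    // Parses the master file with XInclude processing switched on
    // and returns the merged DOM tree.
    static Document resolve(Path master) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // xi:include is recognized by its namespace
            factory.setXIncludeAware(true);  // ask the parser to replace xi:include elements
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(master.toFile());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not resolve includes in " + master, e);
        }
    }

    // Writes a hypothetical two-file book to a temporary directory
    // and returns the path of the master document.
    static Path writeSample() {
        try {
            Path dir = Files.createTempDirectory("xinclude-sketch");
            Files.write(dir.resolve("chapter1.xml"),
                "<chapter>Hello</chapter>".getBytes(StandardCharsets.UTF_8));
            Path master = dir.resolve("book.xml");
            Files.write(master,
                ("<book xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                    + "<xi:include href=\"chapter1.xml\"/></book>")
                    .getBytes(StandardCharsets.UTF_8));
            return master;
        } catch (IOException e) {
            throw new IllegalStateException("Could not write sample files", e);
        }
    }

    public static void main(String[] args) {
        Document merged = resolve(writeSample());
        // After inclusion the chapter element should sit directly inside book.
        System.out.println(merged.getDocumentElement().getFirstChild().getNodeName());
    }
}
```

Like the preliminary DOMXIncluder, this sketch only includes whole documents; it makes no attempt at XPointers or other fragment identifiers.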

Wednesday, December 27, 2000

Antenna House has updated their XSL Formatter, an XSL Formatting Objects to PDF converter for Windows, to version 0.1.0.1227. This release fixes assorted bugs. It supports most of the November 21 XSL-FO Candidate Recommendation. Its most notable feature is the ability to handle Japanese and other Asian text, including mixed writing-modes such as "tb-rl". MSXML 3.0 is required.

Saturday, December 23, 2000

The W3C seems to be in their usual pre-Christmas rush to push new drafts out the door. They've published 15 separate notes, proposed recommendations, candidate recommendations, and working drafts since December 4, about half of them in the last week, as well as a new version, 4.2.1, of the open source Amaya web browser for Windows and Unix.

Most recently the XML Core Working Group has reverted the XML InfoSet spec back to working draft status.

Several editors of the XLink specification have also published a semi-official note on "XLink Markup Name Control". This proposes a possible XML Schema-based solution to address the question of how to integrate XLink syntax like xlink:href into existing non-XLink aware hypertext vocabularies like XHTML 1.0.

I'm leaving for Christmas vacation myself today. I'll have Internet access while I'm away. Nonetheless things are likely to be a little slow here for the next week, and you probably have better things to do than think about XML right now yourself. But I do promise to have some big announcements and a major shift in direction for this site come the New Year.

Friday, December 22, 2000

The Apache XML Project has just posted Xalan-Java 2.0.D05. This is a beta developer's release. It should fully implement the XSLT and XPath 1.0 specifications, as well as the new Java Transformation API for XML (TrAX). The API has been changed to mirror changes in TrAX and the Java API for XML Processing (JAXP). This release also includes "greatly expanded support for custom stylesheet control over output properties, and numerous other upgrades and bug fixes."
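To give a feel for the TrAX style of API, here is a minimal sketch of a transformation written against the javax.xml.transform package as it eventually shipped in JAXP. It is not taken from the Xalan distribution, the beta API described above may differ in details, and the class name, input document, and stylesheet are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TraxSketch {

    // Applies the stylesheet text to the document text and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            // The factory locates whichever XSLT processor is installed.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<greeting>Hello, XSLT</greeting>";
        // xsl:output is where a stylesheet controls output properties;
        // here it asks for plain text instead of XML.
        String xslt =
            "<xsl:stylesheet version='1.0' "
            + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'><xsl:value-of select='greeting'/></xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xml, xslt));
    }
}
```

The application code never names Xalan directly, which is the point of the factory: a different processor can be swapped in without recompiling.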


The W3C has elevated XML Linking Language (XLink) Version 1.0 and XML Base to Proposed Recommendation status. The changes since the candidate recommendation seem mostly editorial. I didn't spot any changes to syntax or functionality. Comments are due by January 31, 2001.


Xsldoc 0.91 is a tool for generating documentation of your XSLT files from embedded comments in the XSLT files. The output and comment format are similar to Javadoc. Java 1.2 or later is required. Version 0.91 fixes a few bugs uncovered in version 0.90.

Thursday, December 21, 2000

Sun's posted the final draft proposal of the Java API for XML Processing 1.1 in PDF and PostScript format. The changes from the last draft seem pretty minor overall.


IAIK has posted the first alpha of the IAIK XML Signature Library (IXSIL) 1.0. IXSIL is a Java toolkit for creating and verifying XML digital signatures in the syntax defined by the IETF/W3C XML Signature Syntax and Processing Candidate Recommendation.


Xsldoc 0.9 is a tool for generating documentation of your XSLT files from embedded comments in the XSLT files. The output and comment format are similar to Javadoc. Java 1.2 or later is required.


Version 0.5.0 of the Sablotron XSLT processor for Windows and Linux has been released. This release adds support for xsl:sort as well as assorted bug fixes and optimizations.

Wednesday, December 20, 2000

The W3C has released XHTML Basic as an official Recommendation (i.e. a standard to anyone who isn't trying to stay clear of antitrust actions). The XHTML Basic document type includes the minimal set of modules from modular XHTML (not yet a recommendation itself) required to be an XHTML host language document type plus the images, forms, basic tables, and object support modules. It is designed for Web clients that do not support full HTML such as mobile phones, PDAs, pagers, and settop boxes.


The W3C XML Protocol Working Group has published the first public working draft of XML Protocol Requirements. This group is looking at technologies like XML-RPC and SOAP.


The W3C has also published a new working draft of XForms 1.0. XForms are a new kind of Web forms that "can be used with a wide variety of platforms of varying capabilities, for instance, desktop computers, television sets, personal digital assistants, cell phones, computer peripherals and even paper."


IBM's alphaWorks has released a new version of Data Descriptors by Example (DDbE) with "Enhanced support for XML Schemas that follows the XML Schema CR of 12/24/00". That's very interesting since that date is still a little bit in the future. Does IBM know something I don't know? Can we expect an early Christmas present from the W3C? Or did somebody's fingers slip on a keyboard? They probably meant the 10/24/00 draft from two months ago. As any reader of Cafe con Leche knows I make mistakes like that all the time. Still, it wouldn't be the first time IBM let a cat out of the bag a little earlier than they planned.

In either case, DDbE is a Java component library for inferring an XML DTD or Schema from a set of well-formed XML instances. DDbE offers parameters which permit the user to control the structure of the content models and the types used for attribute declarations. The goal of DDbE is to give users a good start at creating DTDs for their own applications.

AlphaWorks has also updated the XML Bridge for SAP. This product provides XML integration between SAP R/3 systems on the one side and arbitrary SAP R/3 or non-SAP systems on the other side. This runs inside WebSphere AE 3.5. This release adds support for AIX and Linux as well as load-balanced SAP R/3 systems for inbound calls.

Tuesday, December 19, 2000

Idoox has released IdooXoap for Java version 1.0, an implementation of the Simple Object Access Protocol. The major improvements since pre-betas include:

  • WSDL
  • Arrays
  • SOAP Headers
  • Selective Java to WSDL compilation
  • EJB
  • Improved interoperability

Emmanuel Pietriga has released a new version of his MathML Content2Presentation Transformation (MathMLc2p), XSLT stylesheets that can translate MathML content markup into MathML presentation markup expressions automatically. This release supports the November 13, 2000 MathML 2.0 Candidate Recommendation.

Monday, December 18, 2000

Paul Brown's published a simple flat file (fixed-width fields) to XML converter under the GPL. It's written in Java and requires JAXP.


Antenna House has updated the English version of their XSL Formatter, an XSL Formatting Objects to PDF converter for Windows. This release supports most of the November 21 XSL-FO Candidate Recommendation. Its most notable feature is the ability to handle Japanese and other Asian text, including mixed writing-modes such as "tb-rl". MSXML 3.0 is required.

Sunday, December 17, 2000

The W3C Privacy Working Group has elevated the Platform for Privacy Preferences 1.0 (P3P1.0) to Candidate Recommendation. P3P is an XML-based syntax for documents browsers and other user agents can automatically retrieve from web sites to determine the site's privacy policies and act on that information, for instance by blocking cookies. The Candidate Recommendation phase ends March 15, 2001.


The W3C DOM Working Group has published a new working draft of Document Object Model (DOM) Requirements.

Saturday, December 16, 2000

Lots of Unicode news today. First off, the Unicode Consortium has published the first beta data files for Unicode 3.1. Version 3.1 adds many new characters and new blocks including:

  • Old Italic
  • Gothic
  • Deseret
  • Byzantine Musical Symbols
  • Musical Symbols
  • Mathematical Alphanumeric Symbols
  • CJK Unified Ideographs Extension B
  • CJK Compatibility Ideographs Supplement
  • Tags

Most importantly, even for developers who don't have a specific need for any of these characters, all of these have code points greater than 65535 (0xFFFF). Characters beyond the Basic Multilingual Plane are no longer just a theoretical possibility. And production 2 of the XML 1.0 specification does allow these characters to appear in XML documents, though it specifically forbids their representation as surrogate pairs. Surrogate pairs could probably still be used in the actual documents provided the parsers resolve them into the actual characters they represent before interpreting them as XML. That is, a document that used surrogate pairs would just be another encoding that the parser would convert to full Unicode.

Nonetheless, this is going to break a lot of poorly written software that assumed a single Unicode character could be represented in only two bytes. I can't speak for other programming languages, but Java is going to have big trouble with this.
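To make the Java problem concrete, here is a small sketch. It relies on code point methods (Character.toChars, String.codePointCount, and friends) that were only added to Java in later releases, so treat it as an illustration of the issue rather than something that compiles on today's JDKs. The character used is U+10400, the first letter of the Deseret block; the class name is hypothetical.

```java
final class SupplementaryChars {

    // U+10400 DESERET CAPITAL LETTER LONG I lies beyond 0xFFFF,
    // so it cannot fit in a single 16-bit Java char.
    static final int DESERET_LONG_I = 0x10400;

    // Builds a string holding exactly one Unicode character.
    static String oneCharacter(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = oneCharacter(DESERET_LONG_I);

        // length() counts 16-bit chars, not characters: one character, two chars.
        System.out.println("chars:       " + s.length());
        System.out.println("code points: " + s.codePointCount(0, s.length()));

        // The two chars are a surrogate pair, meaningless on their own.
        System.out.println("high surrogate first: " + Character.isHighSurrogate(s.charAt(0)));
        System.out.println("low surrogate second: " + Character.isLowSurrogate(s.charAt(1)));

        // Reading the pair back as a single code point recovers the character.
        System.out.println("code point:  0x" + Integer.toHexString(s.codePointAt(0)));
    }
}
```

Any code that loops over a string with charAt() and treats each char as a complete character will split this letter in half.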


The W3C and the Unicode Consortium have published a joint technical report/note on Unicode in XML and other Markup Languages which addresses many conflicts between XML markup and Unicode characters. However, it seems to be silent on the practical issue of how to handle supplemental characters that occupy more than two bytes with existing software and tools.


Finally, IBM has released version 1.7 of their International Components for Unicode (ICU 1.7), an open source, locale sensitive C and C++ class library for Unicode savvy text formatting including:

  • Calendar support
  • Character set conversions
  • Language-sensitive collation
  • Date and time formatting
  • Message catalogs
  • Message formatting
  • Normalization
  • Number and currency formatting
  • Time zones
  • Transliteration
  • Word, line and sentence breaking

Changes in version 1.7 include:

  • Improved collation performance
  • Conversion support for ISO-2022-JP/CN/KR with extensions, GB 18030, and HZ
  • Partial UTF-32 support
  • Code/data library names contain version numbers
  • Debian packaging
  • Improved API documentation

Friday, December 15, 2000

The Hackerlab has released Rx-XML, an open source C library for regular expression pattern matching using the regular expression language specified in "XML Schema Part 2: Datatypes".

Thursday, December 14, 2000

Hewlett Packard and Microsoft have submitted a note to the W3C on SOAP Messages with Attachments. This note defines a means of including a SOAP 1.1 message inside a MIME multipart/related message.


Bare Bones Software has released version 6.0.2 of BBEdit, the payware Macintosh text editor I use to produce most of this site. The major new feature is support for the Compact HTML 2.0 format used in iMode phones. Many bugs are fixed besides. The upgrade is free for 6.0 owners.

Wednesday, December 13, 2000

The W3C XML Signature Working Group has posted a new Candidate Recommendation of the Canonical XML specification. Canonical XML is a series of rules for converting XML documents into a sequence of bytes such that two documents with the same canonical form are in some sense the same. This version just cleans up a few editorial issues.


The W3C has published the first working draft of XSLT 1.1. Some of the most important changes and new features include:

  • The value of xsl:stylesheet's version attribute is now 1.1, at least for stylesheets that take advantage of the new elements and attributes in XSLT 1.1. The value 1.0 is still allowed.
  • There's a standard xsl:document element for producing multiple output documents from a single input document and style sheet. The syntax is almost identical to the existing xsl:output element with the addition of an href attribute to specify where the output document will be stored. The functionality is roughly equivalent to the xt:document and saxon:document extension functions in XT and SAXON respectively, and much simpler than the Redirect extension function in Xalan.

  • xsl:apply-imports element can have xsl:with-param children to pass parameters to the applied imported templates, so it's now on a par with xsl:apply-templates parameters. In other words, xsl:apply-imports is now exactly like xsl:apply-templates except that it only applies templates in imported style sheets.

  • A new xsl:script element lets you define extension functions inside the stylesheet.

  • The result tree fragment data-type has been eliminated. Variable-binding elements with content now construct node-sets.

  • XML Base is supported

  • Extension functions are allowed to return "external" objects, which do not have any of the builtin XPath types.

There are also a lot of clarifications, additional examples, and bug fixes. On the other hand, I note that my personal pet peeve in section 13 has not been fixed. This states, "The xsl:message is instantiated by instantiating the content to create an XML fragment. This XML fragment is the content of the message." The problem is the phrase "XML fragment" is nowhere defined, although an added example seems to suggest that what's meant is a string containing XML markup and text, though perhaps not a complete XML document. So far I don't see anything in here that is likely to break existing style sheets.

Tuesday, December 12, 2000

Takuki Kamiya has released SAX Navigator 0.1, a forward-only, pull-model Java API for reading XML.
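For anyone unfamiliar with the pull model, here is a rough sketch of what forward-only pull parsing looks like. It does not use SAX Navigator, whose API I'm not reproducing here; it assumes the javax.xml.stream (StAX) API that was standardized for Java much later. The class name and sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class PullSketch {

    // Collects element names in document order. The application drives the
    // parser by calling next(); the parser never calls back into the
    // application, and there is no way to move backwards.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read the document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c>text</c></a>"));
    }
}
```

Contrast this with SAX, where the parser owns the loop and pushes events at a handler.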

Monday, December 11, 2000

CSIRO has released a new version of the open source CSIRO SVG Toolkit with support for the November 2, 2000 Candidate Recommendation of Scalable Vector Graphics. However, CSIRO is now directing most of its efforts in this area towards the Apache XML Project's Batik, so this is likely to be the last release for a while; and you might want to transition your own projects from CSIRO to Batik. Java 2 is required.

Saturday, December 9, 2000

Unicorn Enterprises has released new versions of the Unicorn XSLT processor, both standard and database editions. This release adds the ECMAScript interpreter to the standard edition, as well as adding regular expressions to the ECMAScript extension mechanism.


Emmanuel Pietriga has released MathML Content2Presentation Transformation (MathMLc2p), XSLT stylesheets that can translate MathML content markup into MathML presentation markup expressions automatically. This release supports the November 13, 2000 MathML 2.0 Candidate Recommendation.

Friday, December 8, 2000

The Apache XML Project has posted the first alpha release of Xerces-J 2.0.0, an XML parser written in Java. The 2.0 branch is a major rearchitecting of the internal structure of the parser. However, the functionality is much the same as in the current 1.2 branch; that is, XML 1.0, DTDs, namespaces, SAX, SAX2, DOM1 and DOM2. Schemas are not yet supported in this branch. According to the announcement, "Since it is an alpha release, defects are expected in the source code, documentation and packaging." I'd say that at this point you should stick to Xerces-J 1.2.x unless you're planning on hacking on the code.


The Apache XML Project has released the first beta of Batik, a Java program/class library for browsing Scalable Vector Graphics (SVG) documents and converting them to GIF or PNG. This is based on the Java 2D API, and supports most of the static features of the November 2nd, 2000 Candidate Recommendation of SVG. Animation and other dynamic features are planned.

The current beta 1 release is definitely a little rough, and I recommend it for extreme early adopters only at this point. Clearly a lot of thought has gone into the initial documentation, and there were copious step-by-step instructions for installing this and getting it to run, including such often omitted details as setting particular environment variables and setting up your CLASSPATH. Unfortunately that's no substitute for a more intuitive, less complicated process. I had a great deal of trouble getting it to display the sample documents, problems which I eventually tracked down to an old version of w3c.jar in my ext directory. Less motivated/experienced users would never have gotten past this. Clearly the problem isn't with Batik itself, which is a very nice piece of work. It's with Java's insanely complex, difficult and unintuitive installation system. Here are three changes I propose to make Java software much more competitive with compiled software:

  • Break the connection between package structure and directories. This single problem causes more difficulty for new users of Java than almost everything else combined. Both .class and .java files contain sufficient internal information for the compiler to tell what package the class is in. Why does this need to be duplicated in the file system?

  • Add enforceable version info to the manifest in a JAR archive. Allow other programs to simply and easily specify which version of a standard JAR archive like w3c.jar they must load. Allow multiple versions to coexist in the CLASSPATH.

  • Make JAR archives truly runnable. That is, let me download a JAR archive, double-click it, and have the program embedded in that archive run like any other standalone application. Microsoft's VM does let developers build standalone executables for Windows, and these are so much easier to install and use that I invariably pick them when given a choice between standalone versions and pure Java versions. This should be available on all platforms in all VMs. Indeed it should be the standard way to distribute a Java program to end users.

In what is perhaps a related note, IBM's alphaWorks has updated their Install Toolkit for Java with some minor enhancements and bug fixes.


The latest milestone release of Mozilla has been posted. This one's numbered 0.6, an apparent pun on Netscape 6.0. I installed it on my Windows NT system and it seemed to work fine in my initial tests. As it has for a while now, Mozilla supports XML and CSS.


IBM's alphaWorks has posted a new release of XML Master, a tool for designing and generating JavaBeans customized for working with particular XML documents. This version fixes a couple of bugs and adds some documentation about localizing popups used in the bean suite.

Thursday, December 7, 2000

The Apache XML Project has released version 1.2.3 of their Xerces XML parser for Java. This is mostly a bug fix release. It should use substantially less memory in certain relatively common situations. It also adds a new load-external-dtd SAX feature so that applications that don't require validation can also ask the parser not to load the external DTD subset.
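Here is a minimal sketch of how an application might switch that feature off. The feature URI is the one documented for Xerces; going through SAXParserFactory and relying on the underlying parser being Xerces-derived are assumptions on my part (any other parser would reject the feature as unrecognized). The class name and the missing.dtd reference are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SkipExternalDtd {

    // Xerces-specific feature URI; other parsers are not required to know it.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses without validation and without fetching the external DTD subset,
    // and returns the name of the root element.
    static String rootName(String xml) {
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName; // remember only the first element seen
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);
            factory.setFeature(LOAD_EXTERNAL_DTD, false);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        // missing.dtd does not exist; with the feature off the parser never asks for it.
        String xml = "<!DOCTYPE doc SYSTEM \"missing.dtd\"><doc/>";
        System.out.println(rootName(xml));
    }
}
```

One caveat: skipping the external subset also skips any default attribute values and entity declarations it contains, so this only suits documents that don't depend on them.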


The W3C XML Query Working Group has released the first public working draft of the XML Query Algebra that will underlie a future XML Query Language (XQL). Think of it as SQL for XML documents instead of relational databases.


The W3C Voice Browser Working Group has revised the Introduction and Overview of W3C Speech Interface Framework. "This document describes a model architecture for speech processing in voice browsers. It also briefly describes markup languages for dialog, speech recognition grammar, speech synthesis, natural language semantics, and a collection of reusable dialog components. This document is being released as a working draft, but is not intended to become a proposed recommendation."


Opera Software is releasing a free-beer, ad-supported version of their namesake web browser, Opera. Information about what ads have been seen, user demographics, and the like is sent to Opera's servers using an XML format. For $39 you can turn off the ads. Opera 5.0 is currently available for Windows only and supports XML, Java (separate download required), 128-bit encryption, TLS 1.0, SSL 2 and 3, CSS1 and CSS2, HTML 4.01, HTTP 1.1, ECMAScript, JavaScript 1.3, and provides experimental support for DOM and WAP/WML.

Wednesday, December 6, 2000

I've posted the notes from Monday's talk to ACM D.C. on Processing XML with Java. Topics covered include the XML InfoSet, SAX2, DOM2, JDOM, and XInclude.


Sun's posted the second early access release of the Java API for XML Processing (JAXP) 1.1 on the Java Developer Connection (registration required). JAXP 1.1 is based on SAX2 and DOM2, the TRAX API for XSLT processing, and a few factory classes for locating a parser. It includes Sun's Crimson XML parser and a version of the Apache XML Project's Xalan XSLT processor.

Tuesday, December 5, 2000

I've updated my XInclude processor based on JDOM to support the latest XInclude working draft. This version also substantially improves the JavaDoc. I'm still thinking about how to roll in XPointer support.

Sunday, December 3, 2000

IBM's alphaWorks has released XDRtoXSD, a Java program that reads an XML Schema written in the XDR schema language (used by Microsoft Internet Explorer and BizTalk) and translates it into the W3C XML Schema language.


IBM's alphaWorks has updated their XML Security Suite for Linux and Windows to conform to the XML-Signature Candidate Recommendation.


AlphaWorks has also updated Regex for Java, their regular expression Java class library. This release adds new substitute and split functions as well as support for W3C XML Schemas style regular expressions. The primary difference between XML schema regular expressions and most traditional regular expression languages is the addition of regular expressions for Unicode character classes. For instance, it's easy to match any character that's a currency sign or a digit, regardless of language or script.
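Here is a small sketch of that currency-sign-or-digit example. It does not use the alphaWorks library; it assumes the java.util.regex package from later versions of Java, which is not an XML Schema regular expression engine but happens to accept the same \p{...} Unicode category escapes. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class UnicodeClasses {

    // Sc is the Unicode "currency symbol" category and Nd is "decimal digit
    // number", so this character class matches either kind of character
    // in any script.
    static final Pattern CURRENCY_OR_DIGIT = Pattern.compile("[\\p{Sc}\\p{Nd}]");

    // True if the whole string is a single currency sign or digit.
    static boolean isCurrencyOrDigit(String s) {
        return CURRENCY_OR_DIGIT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCurrencyOrDigit("$"));      // dollar sign
        System.out.println(isCurrencyOrDigit("\u20AC")); // euro sign
        System.out.println(isCurrencyOrDigit("7"));      // ASCII digit
        System.out.println(isCurrencyOrDigit("\u0663")); // Arabic-Indic digit three
        System.out.println(isCurrencyOrDigit("x"));      // a letter, so no match
    }
}
```

A traditional character class like [0-9$] would need every digit and currency sign from every script listed by hand.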

Saturday, December 2, 2000

The second beta of the XML Spy 3.5 tree-based XML editor has been released. New features in this release include:

  • Generation of complete XML Schema documentation in HTML and Microsoft Word format
  • Printing from the XML Schema design view
  • Many bug fixes
  • A graphical schema design view
  • Support for the 10/24/2000 W3C XML Schema Candidate Recommendation
  • The ability to automatically convert schemas from the April 7 working draft to the Oct 24 candidate recommendation
  • The ability to access files from a password protected FTP or HTTP URL
  • The ability to save files directly to an FTP server
  • A new COM-based API for integrating XML Spy into custom solutions

The download is 6.3 megabytes. XML Spy is $199 payware. The beta is free, but beta testers will likely not receive the customary free copy of the release version for doing unpaid work for the software vendor.


Zvon's upgraded XSLTracer. New features include:

  • Viewing the processing of multiple XML sources included via document()
  • The output file can be displayed too
  • The result tree fragment corresponding to the XSLT instruction is dynamically highlighted in the output document
  • The XSLT code is much prettier
The second beta of Enhydra 3.1, an open source application server written in Java, has been released. Version 3.1 focuses on XMLC, adding compile time includes, updated XML and HTML parsers (Xerces v1.2 & HTML Tidy), and improved performance through a lazy DOM.


The Apache XML project has released Xalan 1.2.2, an open source XSLT stylesheet processor written in Java. This is a minor bug fix release.

Friday, December 1, 2000

The Apache XML Project has released version 0.15.0 of FOP, their XSL-FO to PDF converter. Version 0.15 adds:

  • region-name support
  • a new package for image support and analysis
  • Vastly improved support for embedded SVG pictures
  • outline and fill, switch, images (including external svg images)
  • underlining
  • Able to use arbitrary Type 1 fonts
  • Partial borders
  • Much better leader support

This is still far from a complete implementation of the XSL-FO Candidate Recommendation, but it's starting to become useful.


Fourthought, Inc has posted version 0.10.2 of 4Suite, a collection of open source tools for XML, DOM, XPath, XSLT, RDF, XPointer, XLink and object-database development in Python. Changes in this release include assorted optimizations and bug-fixes.

Thursday, November 30, 2000

Sebastian Rahtz has updated PassiveTeX, his XSL-FO-to-TeX converter. This is a bug fix release.

Rahtz has also updated his XSLT stylesheets for Text Encoding Initiative (TEI) documents. These convert TEI documents to either HTML or XSL-FO.


The Zvon Glossary is a new, Web-based searchable list of XML-related terms and acronyms. The back-end software was written by Jakub Vrana.

Wednesday, November 29, 2000

Michael Kay's released SAXON 6.0.1, an open source XSLT processor written in Java. This release fixes a few conformance bugs in version 6.0. However, the main focus of this release is on substantially faster processing of documents. Kay still recommends version 5.5.1 for production use at this time until 6.0.1 has been tested further and more bugs have been shaken out. The 6.x series also adds two new extension attributes (Is that legal?) to xsl:output:

  • When the output method is html, saxon:omit-meta-declaration determines whether or not a META element is automatically added to the HEAD element.

  • The saxon:character-representation attribute specifies whether characters like & and <, as well as characters not available in the chosen encoding, that appear in parsed character data and attribute values should be output with decimal or hexadecimal character references. In HTML output, the predefined Latin-1 entity references can also be used.


The W3C has released version 4.1 of Amaya, their experimental web browser and editor for Windows and Unix. This is mostly a bug fix release. Amaya is open source and source code is available. Version 4.1 still doesn't support direct display of XML documents in the browser, but it does support:

  • HTML 4.0
  • XHTML 1.0
  • HTTP 1.1
  • MathML 2.0
  • Parts of CSS 2
  • Parts of SVG

Tuesday, November 28, 2000

The Apache XML Project has released version 1.2.2 of the Xerces-J XML parser for Java. This is mostly a bug fix release. It does update the DOM support to be fully conformant with the new DOM Level 2 Recommendation from the W3C, though this didn't require major changes.

Monday, November 27, 2000

Antenna House has posted a beta of the Antenna House XSL Formatter, an XSL-FO formatter for Windows with support for the October 18th working draft of XSL Formatting Objects. It displays the XSL-FO documents in a Windows GUI. MSXML 3.0 is required.

Wednesday, November 22, 2000

The W3C has promoted XSL Formatting Objects to a candidate recommendation. I'll try to read and comment on this sometime next week, after I return from my annual Thanksgiving trip to the family farm in Kentucky. The Working Group has proposed the following criteria for moving to Proposed Recommendation:

  • Sufficient reports of implementation experience have been gathered to demonstrate that XSL processors based on the specification are implementable and have compatible behavior.
  • An implementation report shows that there is at least one implementation for each basic formatting object and property.
  • Formal responses have been provided to all comments received.

As usual, I think these requirements are not strong enough. I'd like to propose my own requirements:

  • There are at least two independent, complete implementations of the entire specification.
  • At least one of the implementations generates printed pages.
  • At least one of the implementations generates an onscreen display.

Tuesday, November 21, 2000

The W3C Voice Browser Working Group has posted the first public working draft of the Natural Language Semantics Markup Language for the Speech Interface Framework. This is an XML application used to specify the meanings of "natural language utterances. It is expected to be automatically generated by semantic interpreters for use by components that act on the user's utterances, such as dialog managers."

Monday, November 20, 2000

The Unicorn XML Processor, version X-2.00.00, has been released. It is written in C++ for Windows. It bundles a stand-alone ECMAScript (JavaScript) interpreter. Features include:

  • non-validating XML parser
  • XML writer supporting XML, HTML and text output methods
  • SAX2 (Simple API for XML version 2.0) interfaces
  • DOM (Document Object Model) Level 1
  • XPath expressions and node-sets
  • XSLT processor and XSLT pattern matching facility
  • regular expressions
  • I/O operations for text files
  • An XSLT processor

LiveDTD is a free Perl script that turns the text of a DTD into a hypertext document. It parses the DTD files and generates a copy with HTML markup inserted. The markup enables you to quickly browse the DTD by following links. It works with both XML and SGML DTDs. An example, marked-up version of Eve Maler's DTD for the XML Specification, has been published.


IBM's alphaWorks has released version 3.3.1 of the XML4C parser written in C++. This is derived from the Apache Xerces XML C++ Parser Version 1.3.0. New features include "experimental versions of DOM Level 2 - Range, Traversal and SAX2, improved support for Entities in the DOM, bug fixes and performance improvements."

Sunday, November 19, 2000

Fourthought, Inc has posted version 0.9.2 of 4Suite, a collection of open source tools for XML, DOM, XPath, XSLT, RDF, XPointer, XLink and object-database development in Python. 4Suite now works with Python 2.0 as well as Python 1.5.2. Python 1.5.2 support will probably be dropped in the next release of 4Suite. Changes in this release include:

  • Introduced 4XLink: A processor to expand XLink attributes
  • Introduced DbDom: An alpha DOM implementation on top of 4ODS
  • ODS: Improved the test suites to handle more cases and conform to protocol
  • cDomlette: added support for methods
  • 4RDF: Fixes and improvements to serialization, the back end and the API
  • Standardized reader interfaces across DOM implementations
  • Assorted optimizations and bug-fixes

IDOOX has posted a pre-release of XDB, an XML document repository providing structured storage of XML data, "using an RDBMS mapping over PostgreSQL."


Paul Tchistopolskii has posted version 0.1 of Hiawatha, an open source "Webserver for those, who are fluent in XSL." Hiawatha is a standalone webserver with built-in support for XML, XSL, XSLScript, HTTP, CGI (GET and POST). Hiawatha is written in Java. Features include:

  • URL rewriting and action binding based on regular expressions
  • Configurable transparent binding of XSL stylesheets to XML files without processing instructions
  • Prefix-based support for 1 XML -> N media rendering
  • Simple caching of XSL transformations a-la AxKit
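
To make the idea of binding stylesheets to XML files by URL pattern concrete, here is a minimal Python sketch. It is not Hiawatha's configuration format or code; the rule table, the stylesheet names, and the bind() function are all hypothetical and only illustrate prefix-based "1 XML -> N media" selection with regular expressions.

```python
import re

# Hypothetical rules: (URL pattern, stylesheet to apply). The first match wins,
# so the more specific prefixes come before the catch-all rule.
RULES = [
    (re.compile(r"^/wap/(?P<doc>.+)\.xml$"), "wml.xsl"),
    (re.compile(r"^/print/(?P<doc>.+)\.xml$"), "print.xsl"),
    (re.compile(r"^/(?P<doc>.+)\.xml$"), "html.xsl"),
]


def bind(url):
    """Return (source document, stylesheet) for a request URL, or None
    when no rule applies and the file should be served untransformed."""
    for pattern, stylesheet in RULES:
        match = pattern.match(url)
        if match:
            return match.group("doc") + ".xml", stylesheet
    return None


print(bind("/wap/news.xml"))
print(bind("/news.xml"))
print(bind("/logo.png"))
```

The same source document is reached through several prefixes, and the prefix alone decides which stylesheet renders it, so the XML file itself needs no xml-stylesheet processing instruction.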

Friday, November 17, 2000

The W3C DOM Working Group has released a Working Draft of the DOM Level 3 Views and Formatting Specification. This draft proposes two ways to give a DOM application access to a view's computed layout and presentation. This working group has also published a new Working Draft of the Document Object Model (DOM) Level 2 HTML Specification.


The Apache XML Project has released version 1.2.1 of their Xalan-J XSLT processor written in Java. This release works with Xerces-J 1.2.1. There are no major new features in this release. A few bugs are fixed, and performance should be a little better.


ICANN's selected seven new top-level domains:

  • .biz
  • .info
  • .name
  • .pro
  • .museum
  • .aero
  • .coop

None of this will do much to change the fact that the whole process is rigged in favor of large multinational corporations.


Oracle's posted a beta of the Oracle XML SQL Utility 2.1.0 on the Oracle Technet web site (registration required). New features include:

  • SAX2 output from any SQL query for handling arbitrarily large XML query output
  • Any JDBC driver can be used, not just Oracle's
  • Preliminary XML Schema support
  • Can retrieve data as XML attributes instead of elements

Thursday, November 16, 2000

I've returned from San Jose and XML DevCon 2000. It was a fun conference. I wish I had had more time to spend there. I've posted the notes from the talk I gave Sunday on The Bleeding Edge of XML. I'll have more news from the show soon.

One down note from the show: after years of being a loyal Dell customer, I probably won't be buying from them again. As sometimes happens, in the rush to get out on Thursday I neglected to pack the power adapter for my Dell Latitude LS notebook. Apparently Dell uses numerous non-standard parts on this model, and consequently in the heart of Silicon Valley I was unable to locate a replacement power adapter, battery charger, or anything else that would let me run the laptop once its battery ran down on the flight over. Not to worry. I had shown up a few days early so I called Dell and, after waiting on hold for almost an hour, ordered a replacement power adapter to be delivered to my hotel on Saturday, the day before I was scheduled to present. Dell wanted $50 extra for overnight delivery, but since it was my fault for leaving the adapter at home in the first place I didn't really object. Then Dell promptly lost my order. When I called them Saturday to find out where my adapter was they had no record of my order, even though I'd been given an order confirmation number, told the adapter was in stock, and promised it would definitely be there on Saturday. Nonetheless, although they claimed to have no record of my order, they managed to charge my credit card $98.90 just the same. (Needless to say, I'll be disputing this charge.)

I depend on Dell hardware to get my job done. I need to know that when my computer goes down, whether through hardware failure or just because of my own forgetfulness, I can get replacement parts quickly and with no hassle. Dell failed this test miserably. Replacement parts were completely unavailable in the most wired city on the planet, though local store shelves were full of batteries and adapters for IBM, Toshiba, and Compaq. When I tried to get Dell to fix the problem, they charged my credit card but did not do what they promised to do. Fortunately fellow-presenter Mark Volkmann was kind enough to loan me his laptop Sunday, so I could deliver the scheduled talk. Nonetheless, he shouldn't have had to. In the future when I'm looking for new hardware, I won't be looking to Dell.


Opera Software has released the second beta of their port of their namesake Opera Web Browser to Linux on the PowerPC platform.

Wednesday, November 15, 2000

The first public beta of XML Spy 3.5 is now available. XML Spy is a $199 payware XML editor for Windows. Beta testers should not expect to receive the customary free copy of the release version in exchange for their time, effort, and assistance. New features include:

  • A graphical schema design view for editing XML Schemas
  • Schema conversion from the April 7 working draft to the October 24 Candidate Recommendation
  • The ability to load files from a password-protected HTTP or FTP URL. Files can be saved to FTP URLs too.
  • A COM-based API for integrating XML Spy into custom solutions

The W3C has updated the roadmap for future XHTML specifications. The new schedule looks like this (excerpted from the W3C document):

Deliverable                     | 1st draft   | 2nd draft   | Last Call   | ends        | CR          | PR          | REC
HTML 4.01                       | -           | -           | -           | -           | -           | 24 Aug 1999 | 24 Dec 1999
XHTML 1.0                       | 5 Dec 1998  | 24 Feb 1999 | 5 May 1999  | 1 Jun 1999  | -           | 10 Dec 1999 | 26 Jan 2000
Modularization                  | 6 Apr 1999  | 10 Sep 1999 | 5 Jan 2000  | 1 Feb 2000  | 20 Oct 2000 | Nov 2000    | TBD
XHTML 1.1                       | 10 Sep 1999 | -           | 5 Jan 2000  | 1 Feb 2000  | Dec 2000    | TBD         | TBD
XHTML Basic                     | 21 Dec 1999 | -           | 10 Feb 2000 | 15 Mar 2000 | -           | 3 Nov 2000  | TBD
Profile reqs                    | 6 Sep 1999  | TBD         | TBD         | TBD         | TBD         | TBD         | TBD
Profile vocab                   | Mar 2001    | Jun 2001    | TBD         | TBD         | TBD         | TBD         | TBD
Events                          | 21 Dec 1999 | 28 Aug 2000 | Dec 2000    | TBD         | TBD         | TBD         | TBD
Schemas for XHTML 1.1 and Basic | Nov 2000    | -           | -           | -           | -           | -           | -
XHTML 2.0                       | Mar 2001    | TBD         | TBD         | TBD         | TBD         | TBD         | TBD
Mime type                       | Dec 2000    |             |             |             |             |             |

Ronald Bourret's updated his XML database products list with about 20 new products, a new category for native XML databases, and rewritten product descriptions, especially in the areas of XML-Enabled Databases and XML Servers.

Tuesday, November 14, 2000

Netscape 6.0 has been released, for Linux, Windows 95 and later, and MacOS 8.5 and later, though the FTP servers seem to be more than a little overloaded right now. This is the first release version of Netscape based on the Mozilla source code. Among other features, it supports direct display of XML documents with attached CSS style sheets.


The W3C has released version 4.0 of Amaya, their experimental web browser and editor for Windows and Unix. Amaya is open source and source code is available. Version 4.0 still doesn't support direct display of XML documents in the browser, but it does support:

  • HTML 4.0
  • XHTML 1.0
  • HTTP 1.1
  • MathML 2.0
  • Parts of CSS 2
  • Parts of SVG

The W3C has promoted DOM Level 2 to official recommendation. This is divided into five specifications:


The W3C has promoted MathML 2.0 to Candidate Recommendation. The major change is a specification for a MathML DOM. There are a number of minor and editorial changes as well. The Candidate Recommendation review period ends on December 14. Once again I consider this to be far too short given the limited implementation experience with MathML that's already available.

Monday, November 13, 2000

Insider Software has released SpaceAgent 2.5, a $250 payware Web page optimizer. This version adds support for XML, JavaScript, and ASP.

Thursday, November 9, 2000

I've posted corrected versions of the notes from all the talks I gave at SDExpo East last week including:

The Schemas and Advanced XML notes in particular contain information that's not easily available elsewhere, and hasn't been covered in any books yet. I'll be reprising Advanced XML this Sunday under the title The Bleeding Edge of XML at XML DevCon San Jose. Updates may be a little slow here until I return.

Wednesday, November 8, 2000

The dbXML Group has released version 0.3 of the source code for the dbXML Core Edition, an open source (LGPL) native XML database application server. It is a data management system designed specifically for collections of XML documents. The software can be embedded into existing applications.


Zvon has released XSLTracer, a tool for visualizing the processing of an XML file by an XSLT stylesheet by single stepping through the transformation, much as in a debugger.


Kevin Jones has posted a test version of his Napa XSLT processor for Windows.

Tuesday, November 7, 2000

David Flanagan's posted a call for Netscape to refrain from shipping Netscape 6.0 until they're actually standards compliant. Most of the bugs he complains about strike me as fairly minor and relatively obscure, and shouldn't affect any web site that uses solid design principles. However one in particular is pretty nasty. To be specific, "The DOM methods removeChild() and replaceChild() can crash the browser. There is a patch available for this bug, but it will not be applied for Navigator 6.0." The effect of this bug is that a web page can easily choose to crash Netscape 6 (and possibly the OS on systems without memory protection) by executing a simple piece of JavaScript when the page loads. This is not acceptable and should be considered a show-stopper. Applications should never crash for any reason. When a crashing bug is discovered, no matter how obscure, it must be fixed.

Monday, November 6, 2000

The W3C released several new drafts of various XML applications while I was away in D.C. last week. First, the Mobile Subgroup of the W3C HTML Working Group has elevated XHTML Basic to Proposed Recommendation status. From the abstract:

The XHTML Basic document type includes the minimal set of modules required to be an XHTML Host Language document type, and in addition it includes images, forms, basic tables, and object support. It is designed for Web clients that do not support the full set of XHTML features; for example, Web clients such as mobile phones, PDAs, pagers, and settop boxes. The document type is rich enough for content authoring.

XHTML Basic is designed as a common base that may be extended. For example, an event module that is more generic than the traditional HTML 4 event system could be added or it could be extended by additional modules from XHTML Modularization such as the Script Module. The point is that XHTML Basic always is the common language that user agents support.

The document type definition is implemented using XHTML modules as defined in "Modularization of XHTML".

Furthermore, the W3C SVG Working Group has promoted Scalable Vector Graphics to Candidate Recommendation. This is a minor update to the spec that mostly makes editorial fixes as well as clarifying a few points about the SVG DOM. The most visible change is that the suggested MIME media type is now "image/svg+xml" instead of "image/svg-xml".

Meanwhile, the IETF/W3C XML Signature Working Group has released a Candidate Recommendation of XML-Signature Syntax and Processing. The CR phase ends January 31, 2001.

Finally, the W3C has posted the first public working draft of Document Object Model (DOM) Level 3 Content Models and Load and Save Specification Version 1.0. This specification plans to define a means for programs to parse and serialize XML documents without relying on parser-dependent classes. This effectively chops the legs out from under Sun's JAXP (the Java API for XML Parsing), since this functionality is all that JAXP provides that DOM2 and SAX2 don't. This spec also defines a preliminary API for accessing the grammar (DTD or schema-based) for an XML document, something that's been more or less ignored until now.
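
For readers who haven't used a DOM with built-in loading and saving, the following Python sketch shows the general shape of a load, modify, and serialize round trip. It uses Python's xml.dom.minidom as a stand-in; it is not the DOM Level 3 Load and Save API, and the sample document is invented for illustration.

```python
from xml.dom import minidom

# Invented sample document, for illustration only.
source = '<person id="p1"><name><first>Ada</first></name></person>'

# "Load": turn text into a DOM tree.
document = minidom.parseString(source)

# Work with the tree through standard DOM methods and properties.
first = document.getElementsByTagName("first")[0]
first.firstChild.data = "Grace"

# "Save": turn the tree back into text.
serialized = document.documentElement.toxml()
print(serialized)
```

Only the two calls at either end, parseString() and toxml(), belong to this particular library; everything in between is ordinary DOM. A standard load and save interface would let those two calls be written portably as well.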

Sunday, November 5, 2000

Dieter Kohler has released the Extended Document Object Model v.2.2.12 for Delphi. This library implements "DOM Core Level 2" and "Level 2 Document Traversal" as well as a self-developed DTD model.


Microsoft's released the final version of its MSXML 3 XML parser and XSLT processor for IE5.0 and later on Windows. This is supposed to be fully compliant with XML/XSLT/XPath 1.0 recommendations. However a couple of bugs have already surfaced. No word yet on when this might be bundled with a release version of Internet Explorer.


Howard Katz has posted version 0.89 of his XML Query Engine. This is a major update that fixes a number of outstanding bugs and adds several new features and optimizations including:

  • Logical subquery operators 'and' and 'or'
  • Set operators 'union' and 'intersect'
  • Namespace support
  • A setDoFullText() method for turning off element text indexing
  • A showDocTree() API for quick visualization of element hierarchy
  • Simple compound-word matching in element content
  • Attribute content moved into index for improved speed and precision
  • A number of other optimizations to improve performance and reduce memory footprint during full-text queries

Friday, November 3, 2000

I'm back from SDExpo. The broadcast of the various sessions fell through at the last minute. Sorry. However, I did do an interview with Dr. Dobbs TechNetCast that will be broadcast in the next couple of weeks. I'll announce it here when it's released, and I'll start posting updated notes from the show this weekend.

When I returned from D.C. yesterday one thing I found in my mail box was the German translation of The XML Bible. This is the most up-to-date version yet, with coverage of the finished versions of XSLT and XPath 1.0, and much more up-to-date coverage of XLink and XPointer than in the English edition. If you read German I recommend checking it out. Die XML Bibel costs 89 deutschmarks (49.50 Euros). The ISBN is 3-8266-0627-2, and the book is published by mitp. It's available from amazon.de and other retailers of German books.

In related news, Amazon has opened a store in Japan. The Japanese translation of the XML Bible isn't finished yet, but when it is done, it should be the most current version to date. They do have the Japanese translations of JavaBeans and what I suspect is Java Network Programming, though having even less familiarity with Japanese than with German I'm not 100% sure of this.

Saturday, October 28, 2000

I leave today for the SDExpo East show in Washington D.C. I'll be talking more or less non-stop for the next week so updates may be a little slow during that time. Some of my sessions will be broadcast on Dr. Dobbs TechNetCast. The tentative schedule is:

  • XML DTDs, Tuesday 10/31/2000, 10:30 AM - 12:00 PM
  • Interview with Elliotte Rusty Harold, Tuesday, 10/31/2000, 4:00-4:30 PM
  • XML Schemas, Wednesday 11/01/2000, 8:00 AM - 9:30 AM
  • Java I/O, Wednesday 11/01/2000 3:30 PM - 5:00 PM
  • Java Network Programming, Part 1, Thursday 11/02/2000 8:30 AM - 10:00 AM
  • Java Network Programming, Part 2, Thursday 11/02/2000, 10:30 AM - 12:00 PM

All times are EST. (Don't forget to set your clocks back tonight.) I don't yet know whether these are going to be broadcast live or only after the fact. I'll post more details as I find out.

Friday, October 27, 2000

The W3C has released a candidate recommendation for Canonical XML Version 1.0. The review ends November 24.

Opera Software has posted the second beta of the Opera Web Browser for Linux X86. This browser supports XML with attached CSS style sheets.

Thursday, October 26, 2000

The W3C has posted a new working draft of XInclude. The big change in this draft is that it's no longer possible to make any element an XInclude element. Instead, only elements with the name include in the http://www.w3.org/1999/XML/xinclude namespace are replaced. Furthermore the attributes are no longer prefixed. Personally, I don't see what's gained by this. I preferred the attribute-only syntax. A typical XInclude element now looks like this:

  <chapter>
      <xinclude:include xmlns:xinclude="http://www.w3.org/1999/XML/xinclude"
         href="variables.xml" />
  </chapter>
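
As a rough illustration of what processing such an element involves, here is a minimal Python sketch that replaces each include element in the working draft's namespace with the root element of the referenced document. It is not a conforming XInclude processor: the load callback and the in-memory documents dictionary are hypothetical stand-ins for fetching URLs, and there is no XPointer support, no base URI handling, and no detection of circular includes.

```python
import xml.etree.ElementTree as ET

# Namespace used by the October 26, 2000 working draft.
XINCLUDE_NS = "http://www.w3.org/1999/XML/xinclude"
INCLUDE_TAG = "{%s}include" % XINCLUDE_NS


def resolve_includes(element, load):
    """Replace each include child with the root element of the document
    named by its href attribute. `load` is a hypothetical callback that
    maps an href string to XML text."""
    for index, child in enumerate(list(element)):
        if child.tag == INCLUDE_TAG:
            included = ET.fromstring(load(child.get("href")))
            resolve_includes(included, load)  # included documents may include others
            included.tail = child.tail        # keep the surrounding white space
            element[index] = included         # one-for-one replacement
        else:
            resolve_includes(child, load)
    return element


# In-memory stand-in for the file system, for illustration only.
documents = {"variables.xml": "<variables><var>x</var></variables>"}

chapter = ET.fromstring(
    "<chapter>"
    '<xinclude:include xmlns:xinclude="%s" href="variables.xml" />'
    "</chapter>" % XINCLUDE_NS
)
resolve_includes(chapter, documents.__getitem__)
result = ET.tostring(chapter, encoding="unicode")
print(result)
```

Because only the element name and namespace are checked, this sketch follows the new element-based syntax; the earlier attribute-only syntax would have required inspecting the attributes of every element instead.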

The first beta of Enhydra 3.1 is now available. This release of the open source application server focuses on XMLC, adding compile time includes, updated XML and HTML parsers (Xerces v1.20 & HTML Tidy), and a lazy DOM implementation.

Wednesday, October 25, 2000

Bare Bones Software has released version 6.0.1 of BBEdit, the primary text editor I use to produce this site (Macintosh only). This is a minor update that improves link checker performance, lets you save FTP passwords without using the key chain, and fixes many bugs.


Netscape's released Communicator 4.76. This is a minor bug fix update. There's no new XML functionality here. Download from the usual FTP sites.


The W3C XML Schema working group has released a Candidate Recommendation of XML Schema. As usual, there are three parts:

Most of the changes in these drafts appear to be editorial rather than substantive. The Candidate Recommendation phase is scheduled to end December 15, 2000. Given the complexity of the spec, this is not nearly enough time to gather any real implementation experience, thus leading once again to the conclusion that the W3C simply doesn't want to hear any outside opinions about this. The CR phase may be extended if necessary, but I think a month-and-a-half is so ridiculously little time for real experience that extension is almost inevitable. I just wish the W3C would admit that and post a real time frame. A month and a half is just going to discourage potential implementors.

In related news, Henry S. Thompson has released a stand-alone version of his XSV schema validator for Windows. A Web based version is also available. So far I haven't been able to get either of these to work at all. The web based validator crashes, and the local Windows version simply exits without producing any output.

Tuesday, October 24, 2000

DataPower has released XSLTMark 1.1.0, an XSLT benchmark and a small compliance testing suite. There are about 40 different testcases in this release. It covers a variety of Java and C/C++ processors including:

  • XT
  • Saxon
  • Transformiix
  • Xalan-J
  • Xalan-C++

Sunday, October 22, 2000

I've updated Chapter 15 of the XML Bible, XSL Formatting Objects, to cover the October 18 Working Draft of the XSL specification. The changes were mostly fairly minor. In no particular order:

The fo:simple-link element was renamed fo:basic-link, probably to avoid confusion with XLink simple links. Along the way it picked up three new properties used to control the appearance and behavior of the targeted document:

target-presentation-context
A URI that generally indicates some subset of the external destination that should actually be presented to the user. For instance, an XPointer could be used here to say that although an entire book is loaded only the seventh chapter would be shown.
target-processing-context
A URI that serves as a base URI in the event that the external destination contains a relative URI. Otherwise, that would be considered relative to the current document. (I'm not 100% confident that I've interpreted this one correctly, but this seems the most natural interpretation.)
target-stylesheet
A URI that points to a style sheet that should be used when the targeted document is rendered. This will override any style sheet that the targeted document itself specifies, whether through an xml-stylesheet processing instruction, a LINK element in HTML, or an HTTP header.

border-after-precedence, border-before-precedence, border-start-precedence, and border-end-precedence attributes were added to the table properties to specify what happens when, for example, one cell's bottom border conflicts with the next cell's top border.

Finally, a font-selection-strategy property was added to the font properties. This property lets you specify whether surrounding characters are considered when a font is chosen that best matches the actual characters. This is a fairly obscure property that only really matters if you're using multiple scripts (e.g. English and Cyrillic) in the same document.

Also, Sebastian Rahtz noticed that scale-to-fit is now a legal value for content-width and content-height so that you can make pictures fit exactly on the page.

In related news, RenderX has updated their DTD for XSL-FO to reflect the latest working draft.

Saturday, October 21, 2000

The W3C HTML Working Group has released the Candidate Recommendation of Modularization of XHTML. Comments are due by November 17, 2000, which is ludicrously soon. Clearly the W3C isn't truly interested in real implementation experience with this. You can't even use the DTDs they've provided here without several other pieces that haven't been released in compatible form yet, including Ruby and XHTML Basic. The schema implementations are still just placeholders. There's simply way too much missing for this to make even a good last call working draft, much less a candidate recommendation.

Friday, October 20, 2000

The XML Apache Project has released Xerces-J 1.2.1. This is a bug fix release. It still only supports the April 7, 2000 working draft of XML schemas. It does not support the more recent release.


The XML Apache Project has released the first development version of Xalan 2.0. For now this is primarily of interest to those using Xalan as a component of their own programs through its API. In particular it uses SAX2 and DOM2, and it implements a preliminary version of the TraX (Transformations for XML) interfaces. It doesn't offer a lot new if you're merely using it to feed in an XML file and get a transformed document using the command line interface or a servlet.

Thursday, October 19, 2000

The W3C Working Group has posted a new Last Call Working Draft of the Platform for Privacy Preferences (P3P) 1.0 Specification. P3P enables Web sites to express their privacy practices in a standard XML format that can be retrieved automatically and interpreted easily by browsers. Thus users can be informed of site practices (in both machine- and human-readable formats) so they don't have to read the privacy policies of every site they visit. Comments are due by October 31.

Wednesday, October 18, 2000

The W3C XSL Working Group has released a new working draft of XSL Formatting Objects.


The W3C has published errata for the XPath 1.0 specification. There's nothing too earth-shattering here. Mostly the exact behavior of some mathematical operators and functions like *, div, and floor() is more closely specified. A couple of incorrect examples are fixed too. The one thing I think will require processor vendors to rev their software is that white space is now allowed within expressions. For instance, whereas previously you had to write /child::person/child::name/child::first/attribute::id you can now write / child :: person / child :: name / child :: first / attribute :: id. This is mostly useful to us writers who have to find ways to fit long XPaths on a single line. Not all XSLT processors accept the form with the extra white space yet.
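
Until processors catch up, a spaced-out location path can be collapsed back into the compact form before it is handed to a processor. The Python sketch below does this with a single regular expression. The function name is hypothetical, the example uses the child axis (the axis name XPath 1.0 defines), and the approach is only safe for simple location paths: it assumes no string literals, predicates, or operators such as and or div, where white space can be significant.

```python
import re


def strip_path_whitespace(path):
    """Remove white space around '/' and '::' in a simple location path.

    Assumes the path contains no string literals, predicates, or named
    operators; those would need a real XPath tokenizer."""
    return re.sub(r"\s*(::|/)\s*", r"\1", path.strip())


spaced = "/ child :: person / child :: name / child :: first / attribute :: id"
compact = strip_path_whitespace(spaced)
print(compact)  # /child::person/child::name/child::first/attribute::id
```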

Tuesday, October 17, 2000

IBM's alphaWorks has released version 3.3.0 of their XML Parser for C++. New features include experimental versions of DOM Level 2 - Range, Traversal and SAX2, improved support for Entities in the DOM, bug fixes and performance improvements. This is based on Xerces-C 1.3.0, which is in turn based on IBM's earlier work on XML4C.

AlphaWorks has also released a new version of the XML Lightweight Extractor (XLE). XLE allows a user to associate the various components of a DTD with underlying data sources, then extract data from the data sources and assemble it into XML documents conforming to that DTD. This release adds methods for controlling out of memory problems.


PHP 4.0.3 has been released. New features include an extension to the Sablotron XSLT processor and important fixes for some security bugs.

Monday, October 16, 2000

Version 2.2.5 of libxml, the Gnome XML library written in ANSI standard C, has been posted. This release adds XPointer support. Other features include:

  • Libxml exports Push and Pull type parser interfaces for both XML and HTML.
  • Libxml can do DTD validation at parse time, using a parsed document instance, or with an arbitrary DTD.
  • Libxml now includes nearly complete XPath and XPointer implementations.
  • It is written in plain C, making as few assumptions as possible, and sticking closely to ANSI C/POSIX for easy embedding. Works on Linux/Unix/Windows.
  • Basic support for HTTP and FTP clients, allowing it to fetch remote resources.
  • The design is modular, most of the extensions can be compiled out.
  • The internal document representation is as close as possible to the DOM interfaces.
  • Libxml also has a SAX like interface; the interface is designed to be compatible with Expat.
  • This library is released both under the W3C IPR and the GNU LGPL.

I've posted an updated version of XLinks, Chapter 16 of the XML Bible. This brings the chapter completely up-to-date with the July 3, 2000 Candidate Recommendation of XLink. The major changes in this draft were:

  • Requiring xlink:role and xlink:arcrole attributes to contain a URI
  • Using xlink:label attributes to draw arcs between resources in an extended link instead of xlink:role attributes

This is also more or less what's going to be in the second edition of the XML Bible next year, so I'd greatly appreciate it if you let me know of any errors or mistakes you might notice when reading it. Anything I get in the next month or so I should be able to correct before publication. Thanks!
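
To illustrate the xlink:label change listed above, here is a small Python sketch that reads an extended link and expands each arc's xlink:from and xlink:to labels into pairs of resources. The element names, URLs, and the arcs() function are invented for the example. It handles only locator-type resources and arcs that specify both labels, which is a deliberate simplification of the Candidate Recommendation.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"


def q(name):
    """Build the namespace-qualified attribute name ElementTree expects."""
    return "{%s}%s" % (XLINK, name)


def arcs(extended_link):
    """Return (start href, arcrole, end href) for every arc in the link."""
    by_label = {}
    for child in extended_link:
        if child.get(q("type")) == "locator":
            by_label.setdefault(child.get(q("label")), []).append(child.get(q("href")))
    found = []
    for child in extended_link:
        if child.get(q("type")) == "arc":
            # Several locators may share a label, so an arc can expand to many pairs.
            for start in by_label.get(child.get(q("from")), []):
                for end in by_label.get(child.get(q("to")), []):
                    found.append((start, child.get(q("arcrole")), end))
    return found


# Invented example document; element names and URLs are placeholders.
link = ET.fromstring("""
<course xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <page xlink:type="locator" xlink:label="syllabus"
        xlink:href="http://example.com/syllabus.xml"/>
  <page xlink:type="locator" xlink:label="week1"
        xlink:href="http://example.com/week1.xml"/>
  <go xlink:type="arc" xlink:from="syllabus" xlink:to="week1"
      xlink:arcrole="http://example.com/roles/next"/>
</course>
""")

connections = arcs(link)
print(connections)
```

The labels are plain names used only to wire arcs to resources, which is why xlink:role and xlink:arcrole can be restricted to URIs.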

Sunday, October 15, 2000

Sun's released the StarOffice source code under both the GPL and the Sun Industry Standards Source License (your choice). The file formats for this release of StarOffice are all XML. This is a major plus for Linux, StarOffice, XML, and open source. Kudos to Sun for doing the right thing with this.


Paul Tchistopolskii posted XSLScript v 0.7, a more terse notation for writing complex XSLT stylesheets.

Saturday, October 14, 2000

The XML Apache Project has released version 1.2 of Xalan-J, an XSLT processor written in Java. As well as fixing a few bugs, this release uses Xerces-J 1.2.


The W3C has released the first public working draft of CSS Mobile Profile 1.0. This document specifies a profile of the Cascading Style Sheets, level 2 (CSS2) specification appropriate for mobile devices such as wireless phones.

This comes at an opportune time. I'm just starting on the WML chapter for the second edition of the XML Bible. Yesterday I ordered an Ericsson cell phone from VoiceStream, and it should be here in a few days. I had earlier ordered from Sprint, but they decided I wasn't a good credit risk, apparently because my driver's license still has my old address.

My travails in getting a cell phone, as well as the absolutely ridiculous agreements all the cell phone companies tried to foist on me, convinced me that the cell phone industry is begging for regulation. Buying a cell phone should be as straight-forward as getting a land-line, in fact considerably easier because no service call is required. My travails over the last month in trying to sign up for cellular service proved that it is anything but.

One of the things that bothered me about almost all the cell phone agreements was a clause saying that the company, not me, owns my phone number. That's a little like an ISP or Network Solutions claiming they own your domain name. I see no reason why you shouldn't be allowed to keep your phone number if you switch companies, states, or phones. But then I realized that I can't keep my land line number if I move. And then I thought, why not? Area codes and exchanges and even phone numbers themselves are relics of outdated technology that we simply accept because we grew up with it. But if we were starting over today no one would design so openly user-hostile a system. I mean, how many phone numbers can you actually remember? I don't think I've even got ten phone numbers memorized. I know far more email addresses and URLs. Why can't you call me by dialing elharo@ibiblio.org? I suspect eventually this will happen when we stop thinking of phone lines as separate devices from our data lines, and IP telephony becomes real. But in the meantime, why can't a phone company run a server that maps email addresses, domain names, and URLs to phone numbers? The actual numbers could vanish into the background and become as irrelevant as IP addresses are today. Business plan anyone?

Friday, October 13, 2000

I love days when the Cafe au Lait and Cafe con Leche news are essentially the same. It makes my job a lot easier. First off, Sun's posted the first early access release of the Java API for XML Parsing (JAXP) 1.1 on the Java Developer Connection (Registration required). Mostly this updates JAXP to support SAX2 and DOM2.


Next, Mozilla Milestone 18 has been released for Windows, MacOS, Linux, and VMS. (That's a new one!) The big new feature in this release is Java support for Linux. Many bugs are cleaned up as well and probably a few new ones introduced. XML support is on a par with the last several releases; that is, no major new support like XSLT or XLinks. However there's now an optional extra package you can install with XSLT, SVG and MathML support. Initial reports are that MathML works, SVG doesn't, and XSLT works but only on odd-numbered days. This stuff still seems to be pretty crash-prone and bleeding edge, which is probably why it wasn't included in the default install.

Thursday, October 12, 2000

I will once again be chairing the XML Track for the SD2001 West Conference in San Jose, next April 8-12. That's the week before Easter. We're looking for solid technical presentations on XML in both 90-minute and full-day formats (though if you haven't presented before, it's probably best to start with a couple of 90-minute seminars rather than proposing a full-day tutorial). What we really want is in-depth, tutorial material; for example, how to use JDOM, how to use WML, how to use schemas, etc. This is a developer conference, and we find that our audience is fairly bored with managerial presentations like "How to manage your web site by integrating XML with E-Commerce" or political presentations like "Why RELAX is better than W3C schemas". We're looking for as much information about how to use the technology as you can cram into 90 minutes. For more details or to submit a proposal please go to the SDExpo speaker web site. The deadline is officially tomorrow, Friday the 13th, but we'll probably let that slide a bit for interesting proposals.


Fourthought, Inc. has released 4Suite 0.9.1, an open source library for XML, DOM, XPath, XSLT, RDF and object-database development in Python. Parts were previously available as the separate 4DOM, 4XPath and 4XSLT. This release fixes a few bugs.

Wednesday, October 11, 2000

The Apache XML Project has released Xalan-C++ 1.0, a mostly conformant (modulo bugs) XSLT 1.0 processor written in reasonably portable C++.

Tuesday, October 10, 2000

I've posted the preliminary version of com.macfaq.xml.XIncluder, an XInclude processor I've been working on. This is written in Java and uses JDOM. This is still very rough but may be of interest to anyone working with or thinking about XInclude. And if anybody knows of an open source XPath or XPointer implementation in Java, please drop me a line so I can merge that in with this. I know there are such things in various XSLT programs like SAXON and Xalan. The question is just how easy it is to pull out the XPath parts without pulling in all the XSLT parts.


Version 2.30 of XML::Parser should now be available from your local CPAN mirror. This is a bug-fix release. Furthermore, it no longer bundles the expat source code, so to build this you first have to build and install the expat *library* 1.95.0 or later.


CenterPoint/XML 1.7.0, an open-source C++ class library for reading and writing XML streams has been released. CenterPoint/XML is based upon Expat 1.2 and features SAX (including limited SAX2 support) and DOM Level 1 interfaces. This release fixes some bugs, improves the CXMLWriter class, and adds support for serializing DOM trees back into XML streams.


Repat is a callback-based RDF parser built on James Clark's expat. It's written in more-or-less portable C that should be usable in most environments.


IBM's alphaWorks has released version 3.3.0 of XML4C, their XML parser for C++. According to IBM, "XML4C 3.3.0 is based on the Apache Xerces XML C++ Parser Version 1.3.0. New features include experimental versions of DOM Level 2 - Range, Traversal and SAX2, improved support for Entities in the DOM, bug fixes and performance improvements."

Monday, October 9, 2000

I've now had a chance to read over the first public review draft specification for JAXP 1.1, the Java API for XML parsing. Mostly it just incorporates other specs like SAX2, DOM2, and XML 1.0. However, it incorporates obsolete versions of all of these. The one thing it adds to the existing specs is a pluggability layer for finding and configuring a parser. However, this functionality is already available through XMLReaderFactory in SAX, and will be added in DOM3 soon. I don't see any reason to add a separate way of doing this through JAXP. If you want to read the spec yourself, it's available as a PDF file only. Comments are due by November 6.


The W3C has published two notes about the use of RDF, the Resource Description Framework:

  • Harvesting RDF Statements from XLinks. According to the abstract:

    Both XLink and RDF provide a way of asserting relations between resources. RDF is primarily for describing resources and their relations, while XLink is primarily for specifying and traversing hyperlinks. However, the overlap between the two is sufficient that a mapping from XLink links to statements in an RDF model can be defined. Such a mapping allows XLink elements to be harvested as a source of RDF statements. XLink links (hereafter, "links") thus provide an alternate syntax for RDF information that may be useful in some situations.

    This Note specifies such a mapping, so that links can be harvested and RDF statements generated. The purpose of this harvesting is to create RDF models that, in some sense, represent the intent of the XML document. The purpose is not to represent the XLink structure in enough detail that a set of links could be round-tripped through an RDF model.

  • Describing and retrieving photos using RDF and HTTP. From the abstract:

    This note describes a project for describing & retrieving (digitized) photos with (RDF) metadata. It describes the RDF schemas, a data-entry program for quickly entering metadata for large numbers of photos, a way to serve the photos and the metadata over HTTP, and some suggestions for search methods to retrieve photos based on their descriptions.

    The data-entry program has been implemented in Java, a specific Jigsaw frame has been done to retrieve the RDF from the image through HTTP. The RDF schema uses the Dublin Core schema as well as additional schemas for technical data.

    We already have a demo site, and, in a few weeks, we have sample source code available for download.

    The system can be useful for collections of holiday snapshots as well as for more ambitious photo collections.


Sebastian Rahtz has revised and expanded his XSLT stylesheets for transforming TEI XML documents to HTML and XSL-FO. He's also released StyleBear, a new web-based program that asks you lots of questions about how you want your HTML to look, and then generates a TEI-to-HTML XSLT stylesheet.

Sunday, October 8, 2000

James Clark has released version 1.2 of his popular expat parser, written in C. The major new feature is the ability to parse external DTDs and parameter entities. The license has been changed to the less restrictive MIT license rather than the MPL.

Clark has also handed over future development and maintenance of expat to a team led by Clark Cooper, hosted on SourceForge. Goals for expat 2.0 include better support for use as a shared library under Linux and other Unix variants. A beta of expat 2.0 has already been posted.


Beta 5 of JDOM has been released. There are many, many changes since beta 4 including a redesigned API, better though still imperfect namespace support, more customizable serialization, and a lot more. All users should upgrade. However, note that the API is still far from final and will change again in the future.


Netscape has released the third preview of Netscape 6. This browser supports direct display of XML with attached CSS style sheets.

Saturday, October 7, 2000

The W3C has published the second edition of the XML 1.0 specification. They've gone straight to Recommendation status with this, completely skipping the Proposed and Candidate Recommendation stages. The changes from XML 1.0 are mostly editorial, with some merging of errata. The clarifications do affect a few parsers that were checking things they shouldn't have been, but mostly this is backward compatible. All documents that were well-formed and valid before are still well-formed and valid now. However, a few documents that were not previously well-formed or valid, at least according to some parsers, are now well-formed and/or valid.


Opera Software has released the first beta (after several preview releases) of Opera 4.0 for Linux. This browser supports direct display of XML documents with attached CSS style sheets.

I still don't have X running on my Linux box, so I can't test this release myself. I did do a lot of CSS work with the Windows version of Opera recently, and my experience was that the XML+CSS support was on a par with the latest Mozillas, but no better. There were quite a number of bugs and holes and things that didn't work quite right when you started digging past the most basic styles and formats. I reported the first couple of bugs I found through Opera's online form, but since I didn't receive any feedback or acknowledgement of those bugs, I didn't feel like wasting my time reporting the next dozen or so I uncovered.

There's a moral in this for vendors: If you want users to help you debug your software on their time, you've got to make it easy for them to do so. Accept bug reports in email or on the Web. Let users choose what's more convenient for them, not what's more convenient for you. Have a genuine human being respond to all bug reports. Don't ask users to waste their time providing you with every last irrelevant detail of their system. If you need more information, ask them for it; but don't expect a user to spend 15 minutes filling out a form giving you details of every third party extension they've installed, just because one time in a hundred you might need to know that. In other words, don't try to save your own time at the expense of your users' time! You've sold the user a buggy product that does not do what you advertised. They're doing you a big favor by reporting the problem rather than deleting it, demanding a refund, and detailing to everybody on Usenet just how bad it is. You should be grateful for the opportunity to fix the problem.


Sun has posted the first public review draft specification for JAXP 1.1, the Java API for XML parsing. It's available as a PDF file only. Comments are due by November 6.


Unicorn Enterprises has released their Unicorn Formatting Objects 1.00.01, a Windows program that converts XSL-FO documents to TeX or HTML. (HTML is the big new feature of this release.) Windows 95 or later is required.


Sebastian Rahtz has released a new version of his PassiveTeX XSL Formatting Objects processor. This release adds support for fo:marker and fo:retrieve-marker as well as fixing various bugs. A working TeX system is a prerequisite.


The W3C has released the Candidate Recommendation of Modularization of XHTML. This "specifies an abstract modularization of XHTML and an implementation of the abstraction using XML Document Type Definitions (DTDs). This modularization provides a means for subsetting and extending XHTML, a feature needed for extending XHTML's reach onto emerging platforms." In other words, it lets you mix and match parts of XHTML like tables or forms with other XML applications like MathML and SVG to create new XML applications.

Friday, October 6, 2000

I've posted the notes for yesterday's Advanced XML tutorial at XMLOne. Topics covered include

  • The XML Infoset, Canonical XML, and Digital Signatures
  • JDOM
  • XML Base and XInclude
  • Schemas
  • XLinks
  • XPointers
  • Future predictions

Tuesday, October 3, 2000

Kevin Jones has posted an unscientific but interesting benchmark of some of the main XSLT processors. The results do not match my own even less scientific experiences.

Monday, October 2, 2000

The Apache XML Project has released version 1.3.0 of Xerces-C, their XML parser written in cross-platform C++. Source and binary versions are available for AIX, HP/UX 11, Linux, Solaris, and Windows. New features in this release include:

  • Experimental versions of DOM Level 2 Range and Traversal
  • Experimental version of SAX2
  • Improved support for Entities in the DOM
  • Bug fixes and speed-ups

Sunday, October 1, 2000

Microsoft has posted a new beta of their MSXML parser for Windows and Internet Explorer. For the first time, they claim this is a complete implementation of XSLT/XPath 1.0. They also support the optional SAX2-ext package.

It's important to note that the MSXML installer does not actually install MSXML. It just places the necessary DLLs on your hard drive. To actually use the new version, you'll need to download another program called xmlinst.exe and run it to replace the older parser with the newer one. Previous versions of MSXML 3.0 bundled xmlinst.exe with MSXML, but this release does not. You have to download it separately.

Saturday, September 30, 2000

I've posted a comprehensive update for Chapter 15 of the XML Bible, XSL Formatting Objects. This is completely up-to-date with the March 27, 2000 Last Call working draft of the XSL-FO specification and FOP 0.14.0. This is the entire chapter, approximately 70 pages with many full examples of XSL-FO.

To the best of my knowledge, this is the only comprehensive tutorial covering the current version of XSL-FO. Doubtless there are some errors since I was breaking new ground here and had to work from an incomplete and sometimes contradictory spec document, as well as using unfinished pre-alpha software. If you spot any mistakes, please do let me know so I can fix them. Otherwise, I hope you like it; and I hope you're now able to start using XSL-FO for real work!

Friday, September 29, 2000

IBM has released version 3.1.0 of their XML Parser for Java. This is based on Xerces-J, which is in turn based on IBM's earlier work on XML4J. According to IBM, "This release contains public and stable support of the DOM Level 1, and SAX Level 1 specifications. It also contains implementations of the DOM Level 2, SAX Level 2 implementations, and partial April 7 W3C Schema implementations but these are considered experimental, as the specifications themselves are still subject to change."

Thursday, September 28, 2000

The W3C has elevated the Document Object Model (DOM) Level 2 to Proposed Recommendation status. All that remains is Tim Berners-Lee's personal approval. DOM Level 2 is now divided into six modules.

DOM is a platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents.

Wednesday, September 27, 2000

Sean Palmer has published an interesting example of XHTML Modularization called XSML. XSML differs from XHTML 1.1 in the following ways:

  • The base element (module) has been excluded because it's redundant with xml:base.
  • The tables module has been replaced by the basic tables module. This means that the col, colgroup, tbody, thead, and tfoot elements have been removed.
  • A util:comment attribute has been added to the common attributes.
  • An xsml:anchor element has been added for use with CSS2.

It's important to note that this has all been done by using the standard features of XHTML; that is, XHTML 1.1 has been designed specifically to make tasks like this possible. This is an example of the X in XHTML.


The W3C has posted several Web Content Accessibility Guidelines Notes.

These all describe techniques for authoring content that is more accessible to a broad range of users with varying abilities.

Tuesday, September 26, 2000

David Megginson's posted a new beta of the SAX2-ext package that includes the optional LexicalHandler and DeclHandler interfaces. LexicalHandler reports purely lexical information like comments, CDATA start/end, etc. DeclHandler reports non-mandatory DTD declarations such as <!ELEMENT> and <!ATTLIST> declarations. The changes in this release are all editorial. The API has not changed at all.
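To make the kind of information LexicalHandler reports concrete, here is a minimal sketch using Python's xml.sax rather than the Java package above. It assumes the expat-based reader that ships with Python, which accepts a lexical handler through the property_lexical_handler property. The LexicalRecorder class and the sample document are hypothetical, and DeclHandler is not covered.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler

class LexicalRecorder:
    """Collects lexical events; method names follow SAX2's LexicalHandler."""

    def __init__(self):
        self.events = []

    def comment(self, content):
        self.events.append(("comment", content))

    def startCDATA(self):
        self.events.append(("startCDATA",))

    def endCDATA(self):
        self.events.append(("endCDATA",))

    def startDTD(self, name, public_id, system_id):
        self.events.append(("startDTD", name))

    def endDTD(self):
        self.events.append(("endDTD",))

def lexical_events(xml_bytes):
    """Parse xml_bytes and return the lexical events the parser reported."""
    recorder = LexicalRecorder()
    parser = xml.sax.make_parser()
    # Ordinary content events go to a do-nothing handler.
    parser.setContentHandler(ContentHandler())
    # Lexical events are opt-in, registered as a parser property.
    parser.setProperty(property_lexical_handler, recorder)
    parser.parse(io.BytesIO(xml_bytes))
    return recorder.events

events = lexical_events(b"<doc><!-- note --><![CDATA[x < y]]></doc>")
print(events)
```

The text inside the CDATA section still arrives through the ordinary characters() callback of the content handler; the lexical handler only marks where the section starts and ends.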


X4Dialog is a Java based framework to generate dynamic web pages using XML/XSLT. Details are on the web site in German.

Monday, September 25, 2000

The Unicorn XSLT Processor, Professional Edition, has been released. This free-beer program is written in C++ for Windows 95 and NT. An ECMAScript interpreter is integrated with the XSLT processor for convenient writing of extension functions. Built-in extensions include database access, report generation extensions, and text input.

Sunday, September 24, 2000

The Apache Cocoon Project has released version 1.8 of Apache Cocoon, their XML/XSL Web publishing framework. New features include:

  • Database connection pooling
  • Numerous bugfixes, including better Servlet 2.0 compatibility
  • Increased performance for heavy load conditions
  • Increased speed for some XSP pages which include complex Nodes
  • New ESQL logicsheet allowing easy post-processing of resultsets in XSP
  • Improved documentation, with new expanded FAQ
  • More installation instructions for more different servlet engines
  • Experimental FP taglib for form-handling and other tasks
  • Jikes support for faster XSP page compilation

Fourthought, Inc. has released 4Suite 0.9.0, an open source library for XML, DOM, XPath, XSLT, RDF and object-database development in Python. Parts were previously available as the separate 4DOM, 4XPath and 4XSLT. This release adds 4RDF to work with Resource Description Framework documents and 4ODS to work with object databases.

Saturday, September 23, 2000

The W3C XML Schema Working Group has made a number of changes to the syntax of the XML Schema language in response to comments received during Last Call, and has released new working drafts reflecting these changes.

The most obvious change is that the namespace URIs have changed. There are also a number of other syntactic changes as well as major rewriting of various sections. The working group has promised an XSLT style sheet that converts the old syntax into the new syntax shortly.


I've just returned from a brief trip to Las Vegas to do a bit of XML training. Las Vegas is a city that never fails to upset me, and it's never the same thing twice. This time I discovered that my hotel, the MGM Grand, thought that $9.90 a minute was an appropriate rate for phone calls. They don't actually tell you that of course. It just shows up on your bill when you leave. No wonder they were so insistent on getting a credit card before they'd turn on the room phone. No one in their right mind would agree to pay such extortionate fees. Downloading your email and chatting with your kids could cost you more than a week's stay. One wonders how they can expect any repeat business with such blatant attempts to rip off their customers. I've encountered some ridiculous hotel phone charges before, but this was by far the worst I've ever seen. Then again this is the city that was built by the Mob to take advantage of Americans' ignorance of basic math.

Now that I'm back I'll try and catch up on all the news this weekend. I think I've finally got the Quote of the Day working again, after the upgrade to BBEdit 6.0 broke it.

Tuesday, September 19, 2000

David Megginson's posted a new version of his SAX2/Java-based XML Writer. This is a SAX2 filter that serializes its events to an XML document. It also includes a special class for writing data-oriented XML with automatic newlines and indentation.
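The idea of serializing SAX events back into an XML document can be sketched with Python's standard library, where xml.sax.saxutils.XMLGenerator plays a roughly similar role. This is an analogy, not Megginson's Java class, and it does not reproduce the automatic newlines and indentation of his data-oriented writer. The round_trip function and the sample input are hypothetical.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

def round_trip(xml_bytes):
    """Parse xml_bytes with SAX and write the events back out as XML text."""
    out = io.StringIO()
    # XMLGenerator is a ContentHandler that writes each event it receives.
    writer = XMLGenerator(out, encoding="utf-8")
    xml.sax.parseString(xml_bytes, writer)
    return out.getvalue()

result = round_trip(b"<list><item>a &amp; b</item><item/></list>")
print(result)
```

Because the writer is just another handler, it can sit at the end of a chain of filters that modify the event stream before it is serialized.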


The dbXML Group has posted version 0.2 of the dbXML Core Edition, an open source native XML database written in Java. The dbXML Core Edition is designed specifically for collections of XML documents. The dbXML Core Edition is published under the GNU Lesser General Public License (LGPL).

Monday, September 18, 2000

The W3C has posted a new working draft of the Platform for Privacy Preferences (P3P) specification. Most of the changes are fairly minor.


A new XSL-FO-to-TeX converter, Unicorn Formatting Objects 1.0, has been released. It supports a subset of the March 27, 2000 Working Draft of XSL-FO. It runs on Windows NT 4.0 and Windows 95. UFO can generate output in PostScript, PDF and all other formats supported by TeX DVI drivers. However, a modern TeX distribution for Windows is required. In particular, MikTeX is recommended.

Sunday, September 17, 2000

DTDexplorer is a Java 1.1 applet and application that loads a DTD and displays the parent/child relationships, allowing you to visualize the structure defined by the DTD.

Friday, September 15, 2000

The Ginger Alliance has posted version 0.44 of their Sablotron XSLT processor written in C++. Sablotron is available under the MPL and GPL licenses. This release adds support for the xsl:namespace-alias and xsl:message elements, as well as fixing assorted bugs. This release also adds support for OpenBSD. It's also available on Linux, Windows, FreeBSD, and Solaris.


I've been getting fed up with $1.00 a minute rip-off rates for hotel phones, so I promised myself that before my next trip I'd finally buy a cell phone. I'm also interested in learning about and experimenting with WAP and WML. Even though I'm 90% convinced this is a fundamentally bad idea, I still think I should know more about it; and maybe even add a chapter on it to the next edition of the XML Bible.

Thus I'm looking for advice as to what phone I should buy and with which provider I should sign up. So far my research seems to say that the Motorola TimePort 7389 is the only commonly available phone that supports WAP and WML. Is that accurate? Are there any others? This phone isn't offered with the service plans of any of the major companies I've looked at. How do I know whose networks it will work on?

The final question is how open is WAP? I've been hearing stories that phone companies only let you access WAP enabled sites that their partners provide. I very much want an open service that lets me look at any WAP/WML site I want to look at, most especially including my own. If I'm going to write about this, I need to try it out of course, and I don't want to have to sign a business deal with a phone company to do that. I want a system that's just as open as the wired Web; where I buy access from an ISP and then look at any site I want to look at without any hassles. In reverse, I want to buy server space from any ISP and then serve pages (or WML cards) to any client I want to serve to without making sure that my server has a contract with the client's phone company.

If you've got opinions or comments about any of these matters, please drop me a line. I'll let you know what I eventually do, and then report here on how well the phone and the service work.

Thursday, September 14, 2000

Yesterday, I modified the XSLT stylesheet for a book I'm writing to account for a slight change in markup. This meant that for the first time I had to use the preceding and following axes as well as level="any" on xsl:number. Something in these changes (I'm still not sure exactly what) hung my first-resort XSLT processor, XT, midway through the first chapter, so I decided to switch processors and see if that helped.

The first one I tried was Xalan from the Apache XML Project. Xalan could handle the document and style sheet (and noticed a couple of problems with the style sheet that xt wasn't flagging as well). However, it used very different syntax and semantics for some extension functions I needed; and I didn't feel like rewriting them so I decided to give Michael Kay's SAXON a try.

Wow! The difference was like night and day. SAXON blew through in seconds some documents that xt and Xalan were taking several minutes to process. I haven't done exact timings, but it really wasn't even close. It was like racing a Ferrari against a frog. And most importantly SAXON was able to use all the extension functions I'd written for xt without even a recompile. It was overly aggressive about indenting the output, which caused some problems for preformatted content, but I was able to turn that off with an <xsl:output indent="no"/> element. Overall SAXON is now far and away my XSLT processor of choice. It's more conformant to the XSLT 1.0 spec than xt and a lot faster and easier to use than Xalan.

Wednesday, September 13, 2000

Mulberry Tech's posted Revision 0.2 Beta 1 of xslide, an Emacs major mode for editing XSL stylesheets.


Sebastian Rahtz has updated PassiveTeX, his XSL-FO-to-TeX translator. PassiveTeX is a partial implementation of the March 2000 XSL-FO draft specification. It also partially supports MathML.


Version 1.0 of EditML Pro has been released. This XML editor for Windows is based on Microsoft's MSXML parser and presumably shares all the bugs in that parser. EditML is payware, but a price hasn't been set yet. A 30-day evaluation version is available.

Tuesday, September 12, 2000

I'm still working on fixing the problems with bouncing pages caused by the unexpected domain name change to ibiblio.org. All links to the old domain name should still work, though I have found some unexpected problems in obscure services. Most pages should work, but I haven't tested the entire site. At this point, if you see a page that won't load because of the redirection, please drop me a line with the complete URL you were trying to reach so I can fix it manually.


Bare Bones Software has released version 6.0 of BBEdit, the Macintosh text editor I'm using to type this message right now. Version 6.0 adds support for Unicode and other multi-byte character sets. It has special tools for HTML 4.01, XHTML 1.0, and WML 1.1. There are various other improvements including better file filters for multi-file search and replace, customizable syntax highlighting, Keychain support and multiple clipboards. BBEdit is $119 payware. However, cross-grades from the free BBEdit Lite are $79. Upgrades are free for anyone who bought BBEdit 5.x after June 1, 2000; $39 for everyone else. Now if they'd only port this to Windows, I'd be a happy camper.

Monday, September 11, 2000

You may get infinite loops on some of the pages inside this site and at Cafe au Lait. I'm working on fixing it, but it may take a little while; especially for pages deep inside the site. If you desperately need to get to something, try turning off JavaScript. You don't need to report any problems right now. I'm aware of them, but there's not a lot I can do about them. The webmasters changed the domain name out from under me last night and installed a new redirector without notice. I'm trying to get them to turn it off.


IBM's alphaWorks has updated their XML Diff and Merge Tool with various bug fixes and support for Java 1.2 and XML4J 2.0.15.

AlphaWorks has also updated their AS/400 XML Interface for RPG, C and COBOL with a new API naming convention, SAX APIs, improved translation support, new output stream functions, and better documentation and samples.

Sunday, September 10, 2000

The W3C XML Linking Working Group has released the candidate recommendation of XML Base. This proposes a mechanism, "similar to that of HTML BASE, for defining base URIs for parts of XML documents" via an xml:base attribute that may be placed on any element. Comments are due by December 8, 2000.
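As an illustration of how xml:base is meant to work, here is a minimal sketch that computes the effective base URI of each element by resolving every xml:base value against the base inherited from its parent, then resolves relative references against that. The element names, the href attribute, and the example.com URLs are hypothetical; only the xml:base attribute comes from the specification.

```python
from urllib.parse import urljoin
from xml.etree import ElementTree as ET

# ElementTree spells the predeclared xml: prefix as its namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def base_uris(element, inherited_base=""):
    """Yield (element, effective base URI) for element and its descendants."""
    # An xml:base value may itself be relative to the inherited base.
    base = urljoin(inherited_base, element.get(XML_BASE, ""))
    yield element, base
    for child in element:
        yield from base_uris(child, base)

def resolve_hrefs(xml_text, document_uri=""):
    """Return every href attribute resolved against its element's base URI."""
    root = ET.fromstring(xml_text)
    return [
        urljoin(base, element.get("href"))
        for element, base in base_uris(root, document_uri)
        if element.get("href") is not None
    ]

DOC = (
    '<doc xml:base="http://example.com/a/">'
    '<section xml:base="b/"><image href="c.png"/></section>'
    '<image href="d.png"/>'
    '</doc>'
)

resolved = resolve_hrefs(DOC)
print(resolved)
```

The nested xml:base="b/" only affects the section element and its children; the second image falls back to the base declared on the root.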

The W3C XML Signature Working Group has also published a new working draft of Canonical XML. This describes a form to which any XML document can be reduced. In this process, attribute values are normalized, character references are resolved, and so forth. When this process is complete all logically equivalent documents should be byte-for-byte equal. The Working Group plans to advance this to Candidate Recommendation in about a week.
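The "logically equivalent documents become identical" property can be tried out with Python's standard library, assuming Python 3.8 or later. Note the assumption: xml.etree.ElementTree.canonicalize implements the later Canonical XML 2.0 specification, not the working draft discussed here, so this is only a sketch of the idea. The two sample documents are hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Same logical content, different surface syntax: XML declaration,
# attribute order, quote style, a character reference, an empty-element tag.
FIRST = """<?xml version="1.0"?><doc b='2' a="1"><e>&#65;</e><empty/></doc>"""
SECOND = """<doc   a="1"
     b="2"><e>A</e><empty></empty></doc>"""

def same_canonical_form(first, second):
    """True if both documents reduce to the same canonical text."""
    return canonicalize(first) == canonicalize(second)

canonical = canonicalize(FIRST)
print(canonical)
print(same_canonical_form(FIRST, SECOND))
```

This is the property that makes canonical forms useful for digital signatures: a signature computed over the canonical text survives harmless rewrites of the document.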

Friday, September 8, 2000

Version 0.4 of XSLTC, a C++ stylesheet compiler based on TransforMiix, has been released. XSLTC converts XSLT stylesheets to C++ code that can be compiled into an executable transformation. This release is closer to a full implementation of the XSLT 1.0 spec.

Wednesday, September 6, 2000

Python 2.0 beta 1 has been released. It includes much better support for XML and internationalization. Paul Prescod contributed pyexpat, an interface to James Clark's Expat XML parser. Even more interesting is xml, a new package with XML support code organized into three subpackages: xml.dom, xml.sax, and xml.parsers. There's a special feature whereby a user-installed package named _xmlplus overrides the standard xml package; this is intended to give the XML SIG a hook to distribute backwards-compatible updates to the standard xml package.
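Here is a minimal sketch of the three subpackages side by side, each counting the same elements in a small document. It is written for a current Python 3 rather than the 2.0 beta, so details such as super() and nonlocal are modern conveniences. The sample document and function names are hypothetical.

```python
import xml.dom.minidom
import xml.parsers.expat
import xml.sax

DATA = b"<news><item>expat</item><item>JDOM</item></news>"

def count_with_dom(data):
    """xml.dom: build a tree in memory, then query it."""
    document = xml.dom.minidom.parseString(data)
    return len(document.getElementsByTagName("item"))

class ItemCounter(xml.sax.ContentHandler):
    """xml.sax: receive callbacks as the parser walks the document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1

def count_with_sax(data):
    handler = ItemCounter()
    xml.sax.parseString(data, handler)
    return handler.count

def count_with_expat(data):
    """xml.parsers.expat: the low-level interface to the expat parser."""
    count = 0

    def start(name, attrs):
        nonlocal count
        if name == "item":
            count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(data, True)  # True marks this as the final chunk
    return count

print(count_with_dom(DATA), count_with_sax(DATA), count_with_expat(DATA))
```

The DOM version is the shortest to write but holds the whole document in memory; the SAX and expat versions stream through it.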

Monday, September 4, 2000

Bill la Forge has released version 3.2 of Quick. Quick converts XML files into a Java object structure as specified by a QDML schema. This release adds support for CDATA sections. Version 3.1, which I never got around to mentioning here, made some other changes, including adding support for inheritance to the Quick binding schema, QJML. Furthermore, the license has been changed to the LGPL.


I have an unusual question today directed at any readers who are familiar with Armenian. The CSS Level 2 specification from the W3C makes reference to "Traditional Armenian numbering", but Unicode doesn't seem to include any Armenian numbers, at least as such. Is this another language like Hebrew where the letters of the alphabet double as digits? Or are there some uniquely Armenian digits that Unicode is missing? If you happen to know the answer to this question, please drop me a line. Thanks.

Saturday, September 2, 2000

The Apache XML Project has released version 1.2.0 of the Xerces-J XML parser written in Java. This is mostly a bug fix release, plus a little additional schema support.


Unicode 3.0.1 is now available as a list of changes to Unicode 3.0. This is a pretty minor update. The changes won't affect most users.

Friday, September 1, 2000

The W3C has published the first public working draft of the Document Object Model Level 3 (DOM3) specification. DOM is a language independent API defined in IDL for representing HTML, XML, SVG, MathML, and similar documents as tree-structured object hierarchies. The big new feature of DOM3 is content model support (e.g. DTDs and Schemas). There are also numerous minor additions to existing interfaces like Node and Text as well as a standard means for building and saving DOM Document objects. The DOM3 specification is divided into several documents, one module (or more if they are strongly related) per document. Several modules have been published.

This is all still in the very early, very rough stages. The Load/Save and Views/Formatting modules are not yet available.


Simon St.Laurent has started a XHTML Tip of the Day mailing list for Web developers interested in XHTML and XML. He's starting with fairly basic issues in the changeover from HTML to XHTML and XML, and will be moving through more advanced material as time goes on. If you're already on the XHTML-L mailing list, you don't need to subscribe to this list, as tips will appear on XHTML-L. Otherwise you can subscribe at the web site. This list is a one-way announce list, not a discussion list, so traffic should be low.


I have a question today: Does anybody out there know a font that contains the standard GUI icons? e.g. a hand cursor, an arrow, a stopwatch, etc. I can use either Mac or Windows icons; either PostScript or TrueType fonts. I'd naturally prefer a free font I can download quickly, but if I have to pay for it I will. If you know where I can find such a thing, please drop me a line.

Thursday, August 31, 2000

As if there weren't already enough reasons not to use Microsoft Word or distribute Word documents, The Privacy Foundation has discovered that Web bugs can be embedded in Word documents so that the author of the document can tell when and where the document is being read. This could be used to monitor leaks of confidential documents from an organization (like the Halloween documents) or to enforce content licensing agreements. Furthermore, Word documents can set and get cookies. It's even possible to "place Web bugs in individual paragraphs and detect when the text is copied from one Word document to another."


Amazon has launched a new French bookstore. This is good news for those of us nerds with an abnormal affection for les bandes dessinées. (Everyone knows all the cool kids read Japanese comic books. :-) ) It also means that I can finally get a copy of the French translation of the XML Bible, as can any of my Francophone readers! It's called XML guide de l'utilisateur and it's actually more current than the English edition because the translator worked from the updated chapters posted here on Cafe con Leche. I looked for this when I was in Montreal a couple of weeks ago, but couldn't find it there. The discounts are lower than in Amazon's American store (only 5% on most products) but the strong dollar helps make up for the lower discount in any case.



The Object Management Group (OMG) has formed the xmlpc-corba mailing list to discuss the integration of CORBA with SOAP. To subscribe, send a message to xmlpc-corba-subscribe@egroups.com from the account from which you wish to subscribe.

Wednesday, August 30, 2000

The most eagerly anticipated book of the summer is finally here. No, it's not Harry Potter and the Goblet of Fire, at least not at this web site. It's Java Network Programming, 2nd Edition. Yes, the real second edition of the bestselling Java Network Programming is now available from Amazon and fine book stores everywhere. It covers the state of the art in network programming with Java right up through the final version of Java 2 Standard Edition 1.3 (a.k.a. Java 1.3). More details on today's Cafe au Lait.

Tuesday, August 29, 2000

The W3C has posted a new working draft of the XHTML Events specification. This defines the XHTML Event Module, which provides XHTML host languages the ability to integrate behaviors with Document Object Model (DOM) Level 2 event interfaces. It also defines "the XHTML Basic Event Module, a module which subsets the XHTML Event Module for simpler applications and simpler client devices, and the XHTML Event Types Module, a module defining XHTML language event types."

Monday, August 28, 2000

The W3C has posted XSL Transformations Requirements Version 1.1. This is not a specification or even a draft of a specification. Rather, it's a list of what they want to achieve with XSLT 1.1. From the spec:

The primary goal of the XSLT 1.1 specification is to improve stylesheet portability. This goal will be achieved by standardizing the mechanism for implementing extension functions, and by including into the core XSLT specification two of the built-in extensions that many existing vendors' XSLT processors have added due to user demand:

  • Support for multiple output documents from a transformation
  • Support for converting a result tree fragment to a nodeset for further processing

By standardizing these extension-related aspects which multiple vendor implementations already provide, the ability to create stylesheets that work across multiple XSLT processors should improve dramatically.

A secondary goal of the XSLT 1.1 specification is to support the new XML Base (See [XMLBase]) specification.


Version 4.1.2 of DocBook XML has been released. This is a minor bug fix release.


Macromedia has released Flash 5 with a new interface, Freehand import, new Bezier tools, and more. Flash 5 is available for Windows and late-model Macs only. It's $399 payware. Upgrades are $149.


Bill LaForge has released version 3.0 of Quick. Quick converts XML files into a Java object structure as specified by a QDML schema. Version 3 now generates Java code directly and includes a new binding schema generator.


Mark Wutka's released version 1.7 of JOX to fix a bug. JOX is a Java library for reading XML into Java beans and writing XML from Java beans. According to Wutka, "It is similar to, but not as powerful as, Bill La Forge's Quick library. The only thing JOX has going for it is that it doesn't require a separate schema. It uses the bean and optionally a DTD." JOX is licensed under the LGPL.

Wutka has also released version 1.2 of BeanMaker, a set of command-line utilities for creating Java Beans using XML. "BeanMaker reads an XML file containing a description of a bean - really just the bean name, package, properties and indexed properties - and generates a Java class implementing that bean. It mainly saves you some typing time when you need to implement a lot of properties. BeanMaker also includes a class called XMLIntrospector that looks at an XML file and generates a set of XML bean definitions that BeanMaker can then turn into beans. The introspector tries to parse data items as various Java data types and makes a list of the types that could be used for every occurrence of a tag."

Saturday, August 26, 2000

Oracle's released XSQL Pages 1.0.1.0, a SQL+XML+XSLT publishing framework, on the Oracle Technology Network (registration required). XSQL Pages are server-side XML templates that use SQL, XML and XSLT to prototype and deploy dynamic, data-heavy web sites and services. The publishing framework can be extended via user-defined "action handlers". A Java servlet capable web server is required. Full functionality is available only against an Oracle database, though parts work with other JDBC compatible databases.


Lutris is forming the Enhydra Documentation Working Group to write developer documentation for Enhydra 3.0 and Enhydra 4.0 (Enhydra Enterprise) application server. Participation is solicited.

Friday, August 25, 2000

Jonathan Eisenzopf's released XML::RSS 0.9, a Perl module for processing RSS 0.9, 0.91, and 1.0. Depending on the version, RSS stands for RDF Site Summary or Rich Site Summary. RSS is a light-weight syndication format currently used by many Web sites. The module is available at your local CPAN mirror.
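
To give a feel for what processing RSS involves, here is a minimal sketch in Python rather than Perl. It does not use XML::RSS; the sample feed, the example.com links, and the headlines function are all made up for illustration, and it only handles the namespace-free 0.91 style of RSS.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 0.91 feed; real feeds carry more channel metadata.
RSS_091 = """<rss version="0.91">
  <channel>
    <title>Example News</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

def headlines(rss_text):
    """Return (title, link) pairs; raises ET.ParseError on malformed input."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

stories = headlines(RSS_091)
for title, link in stories:
    print(title, link)
```

RSS 1.0 puts its elements in namespaces, so the unqualified names used here would not match a 1.0 feed without changes.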


Luca Padovani's released GtkMathView, a GTK widget written in C++ for rendering MathML documents on Linux. It is intended as a standalone, light-weight component and not a full browser. GTK applications can use the widget as a window for displaying mathematical formulas and doing simple interactions. The current early alpha version (0.10) recognizes most MathML 2 elements and attributes. It's all under the GPL. Contributions are solicited.


Version 4.1.1 of DocBook XML has been released. This is a minor bug fix release.


The Apache XML Project has released Xalan-J 1.2.D02. This release fixes a few bugs, automates the build process using ANT, and supports Xerces 1.1.3.


The Mozilla Project has announced plans to dual-license their code under both the Mozilla Public License and the GNU General Public License. This is a good thing for GPL fans like myself. Still, it's going to be a big honking mess, because now they have to go get permission from everyone who submitted a patch or line of code under the MPL to also release it under the GPL. GPL'd Mozilla may not include quite everything MPL'd Mozilla does. Still and all, this is good news. I've been working with Mozilla a lot lately and its XML support in the latest Milestone 17 is a little better than Opera 4.0's and light-years ahead of Internet Explorer 5.5.

Thursday, August 24, 2000

One of the more interesting things I learned in Montreal last week was just what a grove is. Groves allow addressing and linking into data for which a property set has been defined. The property set describes how to represent a particular data format as an XML-like tree. Imagine using the DOM to process RTF to give yourself an idea of the power of this idea. XSet is an XML property set description of XML 1.0 and XML namespaces "to enable grove based processing of formats defined in EBNF" such as XML itself.


XLink2HTML is a set of XSLT stylesheets for creating HTML representations of XLink elements. It is particularly useful for representing one-to-many arcs from local resources to locators. However, other relationships possible in an inline XLink element are implemented as well (e.g. many resources to many locators).


4xt has published a set of XSLT stylesheets that translate Rich Site Summary (RSS) 0.9 and 0.91 documents into the recently announced RSS 1.0 format, together with Java classes aimed at facilitating the transformation.

Wednesday, August 23, 2000

I'm still catching up on news that piled up while I was in Montreal. Today, I wanted to check in with the W3C. The big news of the last week was the draft of XML 1.0, second edition. This isn't a change to XML itself for the most part, just a clarification and update of the specification document that corrects various errata that have been discovered over the last two years. In a few cases, such as the requirements for language codes, parsers have implemented behavior based on erroneous interpretations of unclear sections of the spec, so there may be some minor incompatibilities between parsers implementing the first and second editions. These should be rare, though, and I think the W3C would argue these were bugs in the parser in the first case.

The W3C has also posted new working drafts of three other specifications:

XForms are an XML Schema based replacement for and upgrade to standard HTML 4.0 forms. XML Query can be thought of as SQL for XML, that is a language for extracting data from XML documents and collections of XML documents that's more powerful and general than XPath.

As I mentioned on Monday, the W3C announced in Montreal that they have officially decided not to decide what to do about relative namespace URIs for the time being (though that decision doesn't seem to have propagated to the main mailing lists yet). They will deprecate them, that is recommend that no one use them, but they will not forbid their use, at least not yet; and they will not venture to say what they do or do not mean or how they are or are not to be compared. This issue may be revisited in the future.

Meanwhile XFragment seems to be stuck in perpetual limbo, having gone nowhere in the last year, and the W3C has announced that they aren't really working on it anymore. If there's a popular outcry, it may get unstuck. Otherwise, it's probably dead. Rick Jelliffe has released a very early alpha of Fragserver to serve XML fragments in response to XPath expressions. He's looking for people who might have time to adopt and develop it.

Outside the W3C, the IESG is considering XML Media Types as a Proposed Standard. The IESG plans to make a decision in the next few weeks. Please send any comments to the iesg@ietf.org or ietf@ietf.org mailing lists by September 15, 2000.

Tuesday, August 22, 2000

Dave Beckett's posted a beta of Redland, an RDF Application Framework that allows plugging in of various modules to support different parsers, storage mechanisms or models. According to Beckett, Redland provides:

  • A modular, object based library (or application framework) written in C
  • APIs for manipulating the RDF Model and parts - Statements, Resources and Literals
  • Parsers for importing the model from RDF/XML syntax (both parsers external)
  • Storage for models in memory and persistent, via Berkeley DB
  • Query APIs for the model by Statement or by Nodes and Arcs
  • Statement Streams can be used as well as static APIs for construction, parsing, (de-)serialisation of models
  • Internal utility classes such as Lists, Hashes, Digests, URI, and Iterators
  • The parsers, storage, hashes and digests are all modules that can be removed or added to at run time.

The Infozone Group has released Prowler 0.3, a Java-based XML-content management system.


Various parties have put together a proposal for RSS 1.0. RSS ("RDF Site Summary") is a lightweight multipurpose extensible metadata description and syndication format. RSS is an XML application, conforms to the W3C's RDF Specification and is extensible via XML-namespace and/or RDF based modularization. RSS is also known as the Rich Site Summary, and in version 0.91 it more or less abandoned its roots in RDF. (It isn't so careful about XML either. I've found a lot of malformed RSS out there in the wild, far more so than I've seen with any other XML application.) RSS 1.0 is an effort to drag RSS back to its RDF-metadata roots. Given the personalities and philosophies involved, I'm frankly skeptical that this effort can succeed. Interested parties are invited to join a working group on the RSS-DEV mailing list by sending email to rss-dev-subscribe@egroups.com.


The XML Apache Project has released version 0.14.0 of FOP, an XSL Formatting Objects to PDF converter. The major changes are support for the March 27, 2000 Working Draft of XSL-FO and the latest SVG spec. Other new features include:

  • fo:page-number-citation
  • fo:instream-foreign-object
  • support for 06/29/2000 SVG WD + W3C Java DOM bindings
  • Image support using Jimi
  • External and internal simple-links
  • The Symbol and ZapfDingbats fonts
  • MessageHandler
  • XalanCommandLine

FOP is open source and written in Java.


Chris Bayes has posted a beta of the "Change Stylesheet Power Toy", a JavaScript for IE5 that allows the user to select the stylesheet for the current document from a list of alternate stylesheets in the document given in xml-stylesheet processing instructions.


Rick Jelliffe's released a very early alpha of Fragserver. No, this isn't a Quake Bot. Rather it's a Java Web Application that sits on top of Tomcat, Xalan, and Xerces and serves XML fragments in response to XPath expressions. Jelliffe says he probably won't have time to maintain this so if anyone wants to adopt it and run with it, they're welcome to it.


IDOOX has released IdooXoap, a partial SOAP implementation for Java 1.3 that's compatible with the Microsoft and Apache SOAP implementations. IDOOX says that this will eventually be open sourced.

Monday, August 21, 2000

I spent the last week in Montreal, sightseeing and attending the Extreme Markup Languages 2000 Conference where I delivered a talk about JDOM. I've written up my impressions of the city elsewhere. Now I wanted to say a few words about what took place at the conference.

Overall, this was a relatively academic conference. Papers were very high level but mostly practical. Topic maps were the main topic of the conference, probably occupying a third of the sessions. However, I found it difficult to muster a lot of interest in them. Groves were also popular, and seemed somewhat more interesting to me. Imagine using the DOM to access any kind of data at all, not just XML but also relational databases, Microsoft Word files, plain text, and more; all through a common tree-based API. The architectural forms partisans either passed on this conference or were amazingly quiet for a change.

There was one major announcement at the conference: The W3C has officially decided not to decide what to do about relative namespace URIs for the time being. They will deprecate them, that is recommend that no one use them, but they will not forbid their use, at least not yet; and they will not venture to say what they do or do not mean or how they are or are not to be compared. This issue may be revisited in the future.


Mark Wutka's released version 1.11 of his Java DTDParser. This release makes a few minor improvements.


Netscape's released Communicator 4.75 for Macintosh, Windows, and Unix to fix the "brown orifice" security hole. I strongly recommend that all users upgrade to this release.

Sunday, August 20, 2000

I'm back from Montreal. The net access from my hotel was poor, but otherwise I absolutely adored the city. It struck me as the best parts of New York, Boston, New Orleans, and San Francisco all rolled into one city with all the bad parts removed. I'd be very tempted to pick up shop and move there, but my wife's a lot less enamored of French than I am. Then again, I did arrive during what one cabbie told me was the first decent week of weather this century, and I'm afraid he may have understood that 2000 is the last year of the 20th century rather than the first of the 21st. I'm going to spend today writing up some thoughts from the trip, most of which have nothing to do with computers. Tomorrow, I'll try to catch up with all the news items that have piled up over the week while I was away.

Friday, August 11, 2000

I'm leaving for Montreal this afternoon to attend the Extreme Markup Languages 2000 conference. I'm giving a brief talk on JDOM Wednesday afternoon, but mostly I'll just be listening to everyone else talk and doing a bunch of sight-seeing and shopping. My wife Beth tells me that Montreal has lots of wonderful French bookstores.

Depending on what sort of Net access I have while I'm there, updates may be a little slow here for the next week. When I was in Atlanta last month, the Courtyard Marriott I stayed at had wonderful Net access, essentially a T-1 to my room for about $10.00 a day. The staff told me they're rolling this out all over the country, and are about half done. They also had a genuine laundromat on the premises and no ridiculously overpriced minibar. It was definitely a step up from the San Jose/Santa Clara hotels I've stayed at lately, where they think that Net access means an extra phone jack for your modem, and $1.00 a minute phone calls. In the future when I have a choice I'm definitely looking for Courtyard Marriott again. However, in Montreal I'm staying at the conference hotel (The Hotel Wyndham Montreal) so I don't yet know what I'm likely to find when I arrive. My wife's more concerned about whether or not the pool's heated. :-)


JDOM nightly builds are now available. There've been a lot of changes in the last month since 1.0b4 was produced. If you've been avoiding upgrading from 1.0b4 because of CVS, I strongly recommend you upgrade now. The latest nightly releases are quite stable and much improved over 1.0b4.


Kevin Jones has released xmlgrep 0.2, a small utility for progressively evaluating XPath expressions against an XML file. This release adds:

  • More complete XPath implementation
  • Multi-threaded expression evaluation
  • Support for the document() function on local files
  • Loading of expressions from files
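
As a rough illustration of evaluating several path expressions against one document, here is a small Python sketch. It is not how xmlgrep works internally; the grep function and the sample document are invented for this example, and Python's ElementTree understands only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# A made-up document to search.
doc = ET.fromstring(
    "<news>"
    "<item date='2000-08-11'><title>xmlgrep 0.2</title></item>"
    "<item date='2000-08-10'><title>XSLT Test Tool</title></item>"
    "</news>"
)

def grep(root, expressions):
    """Map each path expression to the text of the elements it selects."""
    return {expr: [el.text for el in root.findall(expr)]
            for expr in expressions}

hits = grep(doc, ["item/title", "item[@date='2000-08-10']/title"])
for expr, matches in hits.items():
    print(expr, matches)
```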

Mark Wutka's released version 1.10 of his Java DTDParser. This release adds support for unparsed external entities and get/set methods for Java Bean style property access.


XML Spy 3.0.7, a $149 payware XML editor for Windows, has been released. Upgrades are free for previous 3.x owners. Version 3.0.7 is a bug fix release with no major new features.

Thursday, August 10, 2000

Joshua Allen's written an XSLT Test Tool for writing XSLT transforms and comparing the output of various processors side by side. Current processors supported include MSXML, Saxon, Xalan, Xt, Instant Saxon, and Oracle XSL. It only runs on Windows.


CenterPoint/XML 1.6.5 is a mostly undocumented C++ class library for XML based on expat that supports DOM and SAX.

Wednesday, August 9, 2000

I've posted the notes from last night's talk about JDOM to the XML SIG of the New York Object Developers' Group. This is one of the most complete introductions to JDOM available anywhere and is completely up-to-date as of Friday's CVS build. Now I've got to figure out how to cut this down to just the bare essentials for a startlingly short half-hour presentation next week in Montreal at the Extreme conference.


IBM's alphaWorks has released a new version of the XML Lightweight Extractor with a new user interface that defines extra functions, a GUI editor for creating/modifying DTDs, and a Java package named xle. XLE allows a user to annotate a DTD to associate its various components with underlying data sources, and when requested, extracts data from the data sources and assembles the data into XML documents conforming to that DTD.


Mark Wutka's released version 1.9 of his Java DTDParser with some minor fixes. The primary new feature of this release is that line and column numbers are now given with parse errors.

Tuesday, August 8, 2000

Milestone 17 of Mozilla and Netscape 6.0 Prerelease 2 have both been posted on their respective web sites. Mozilla M17 is available for Windows, Linux, MacOS, and Solaris. Both support direct display of XML in the browser using CSS style sheets. Some support for simple XLinks is also included, but it's buggy.

Mozilla is now officially feature complete. Future milestones will focus on bug fixes and performance improvements.


Tonight, Tuesday August 8, I'm giving an introductory talk about JDOM to the XML SIG of the Object Developers Group in New York. JDOM is a tree-based, Java API for processing XML documents that's been designed with simplicity and convenience as its foremost purposes. JDOM programs can use any commonly available SAX or DOM parser. At a high level, it's similar to the DOM, but since JDOM was designed specifically for Java rather than for multiple languages, it feels much more natural and "right" to Java programmers. This talk teaches you how to use the JDOM to read and write XML documents from your Java programs while automatically maintaining well-formedness and checking validity.

The meeting takes place from 7-9 p.m., at Goldman Sachs, 125 Broad Street, 19th Floor, Room B in Manhattan in the Wall Street area. Admission is free. However, you must register for this session by sending an email to wperry@fiduciary.com. You will receive a confirmation by return email. You need to register in advance to get into the building. As usual, I'll post the notes from the talk here on Wednesday.


Oracle's released betas of XML Schema processors for Java, C, and C++. They've also posted version 2.0.4 of their XML/XSLT Parsers for C++, C and Java. Free registration is required.

Monday, August 7, 2000

Tomorrow evening, Tuesday August 8, I'll be giving an introductory talk about JDOM to the XML SIG of the Object Developers Group in New York. JDOM is a tree-based, Java API for processing XML documents that's been designed with simplicity and convenience as its foremost purposes. JDOM programs can use any commonly available SAX or DOM parser. At a high level, it's similar to the DOM, but since JDOM was designed specifically for Java rather than for multiple languages, it feels much more natural and "right" to Java programmers. This talk teaches you how to use the JDOM to read and write XML documents from your Java programs while automatically maintaining well-formedness and checking validity.

The meeting takes place from 7-9 p.m., at Goldman Sachs, 125 Broad Street, 19th Floor, Room B in Manhattan in the Wall Street area. To register for this session, please send a request by email to wperry@fiduciary.com. You will receive a confirmation by return email. You need to register in advance to get into the building. As usual, I'll post the notes from the talk here on Wednesday.


Sun's posted the first beta of the SVG Slide Toolkit, a program that transforms an XML file that uses a specific DTD into an SVG (Scalable Vector Graphics) slide presentation.


Mark Wutka's released version 1.8 of his Java DTDParser.

Saturday, August 5, 2000

Mark Wutka's released version 1.7 of his Java DTDParser. This release now keeps track of the order of the items read in from the DTD.

Friday, August 4, 2000

Opera 4.0.2 for Windows is now available. This is mostly a bug-fix release but also adds third party cookie scripting, e-mail client enhancements, new JavaScript methods, and better import of Eudora files. Opera supports direct display of XML documents with CSS stylesheets. In this area, Opera's a lot more reliable than Internet Explorer but not quite up to the standard set by Mozilla yet.

Thursday, August 3, 2000

Extensibility has released XML Instance 1.1, its $99.95 payware schema-driven data editor for data-oriented XML documents. New features in version 1.1 include:

  • XSLT stylesheet transformation facilities
  • Data importing from an ODBC data source
  • Support for all encoding types supported by Java
  • Improved namespace support
  • Additional customizable document display options

XYZFind.com is seeking beta sites for a new search engine that provides users with keyword search access to XML documents. XYZFind is a stand-alone server-side solution.

Wednesday, August 2, 2000

The W3C has promoted Scalable Vector Graphics (SVG) to a Candidate Recommendation. This draft makes primarily editorial changes. The changes to SVG itself are relatively minor. Most of those are in various parts of the DOM.

Tuesday, August 1, 2000

The W3C Synchronized Multimedia Working Group has posted a new working draft for SMIL Animation. This describes an animation framework and a "set of base XML animation elements suitable for integration with XML documents. It is based upon the SMIL 1.0 timing model, with some extensions." Previously SMIL animation had hit "Last Call", but this draft pulls back from that a little. There'll probably be at least one more draft before this goes to Candidate Recommendation status.


The W3C has published a note about SAC, the Simple API for CSS. This is similar in broad pattern to SAX; that is, it's a push-based API for parsing stylesheets. An implementation of a Parser interface reads a CSS stylesheet and, based on what it finds there, passes arguments to callback methods in an implementation of a DocumentHandler interface in real time. SAC is defined for Java and C. An implementation called Flute has been written in Java.
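
For readers who haven't used a push-based API, here's a minimal sketch of the pattern. It uses Python's standard xml.sax module on a tiny XML document, not SAC and not CSS; the ElementCounter class and the sample markup are made up for illustration. The point is only that the parser drives the program by calling handler methods as it reads.

```python
import xml.sax

class ElementCounter(xml.sax.ContentHandler):
    """Counts start tags as the parser pushes events to it."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # The parser calls this each time it encounters a start tag.
        self.counts[name] = self.counts.get(name, 0) + 1

handler = ElementCounter()
xml.sax.parseString(b"<style><rule/><rule/><import/></style>", handler)
print(handler.counts)
```

The handler never asks the parser for anything. It just accumulates whatever state it needs while the callbacks arrive, which is the same division of labor SAC draws between its Parser and DocumentHandler interfaces.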


Deadlines are slipping at the W3C. The HTML Working Group has published a new roadmap that sets the following dates for finishing (or starting) work on various HTML-related specs:

Deliverable 1st draft 2nd draft last call ends CR/PR Rec
HTML 4.01 26 Mar 99 - 28 Jun 99   10 Dec 99 26 Jan 00
XHTML 1.0 5 Dec 98 24 Feb 99 5 May 99   10 Dec 99 26 Jan 00
Modularisation 6 Apr 99 10 Sep 99 31 Jan 00   July 00 Sep 00
XHTML 1.1 6 Apr 99 10 Sep 99 31 Jan 00   July 00 Sep 00
XHTML Basic 21 Dec 99 - 28 Feb 00 15 Mar 00 July 00 Sep 00
Profile requirements 6 Sep 99 - -   - -
Profile vocabulary wait for cc/pp ?   ? ?
Events 21 Dec 99 Aug 00 31 Dec 00 TBD TBD  
Schemas for XHTML 1.1 and Basic Aug 00 - - - - -
XHTML 2.0 31 Dec 00 TBD TBD TBD TBD  

kXML is a lean Common XML API with WAP support that is intended to fit into the KVM on memory-challenged devices like the Palm Pilot.


Bill LaForge has released version 2.1 of Quick. Quick converts XML files into a Java object structure as specified by a QDML schema. Version 2.1 adds the encoding declaration and lets you use file names instead of URLs if you like. Also, the QIML internal binding schema is now exposed (though not yet documented). Java 1.3 is required.


Mark Wutka's released version 1.6 of his DTDParser. Mostly this is a bug fix release.

Monday, July 31, 2000

XML Engine is a REALbasic class library for parsing, rendering, and working with XML documents. XML Engine is $125 payware.


The unofficial "Last Call" working draft of Common XML has been posted. Review ends September 1, 2000.


The XML Apache Project has posted the first development release of Xalan-J 1.2, an XSLT processor written in Java. This release adds a new nodeset extension function that casts a result tree fragment to a node-set as well as fixing various bugs. Xalan-J 1.2.D01 has only been tested with Xerces-J 1.1.2. It probably requires that specific release of Xerces, though it may work with close versions.


Microsoft's posted a new beta of their MSXML XML parser and XSLT processor for IE5 and later. This release adds support for:

  • xsl:key
  • xsl:message
  • xsl:fallback
  • xsl:namespace-alias
  • document()
  • key()
  • SAX2 in Visual Basic

Sunday, July 30, 2000

David Brownell, Curt Arnold, Joe Polastre, and Richard Tobin have launched the xmlconf project at SourceForge to focus on conformance testing for XML and related technologies. This currently includes:

  • Test Software including the test harnesses used in Brownell's xml.com articles.
  • Results from the tests
  • Some preliminary XML Schema test cases.
  • A modular DTD validator
  • A pipeline framework
  • An enhanced version of the AElfred XML parser

Everything's under the GPL.

Saturday, July 29, 2000

UserActive has posted modxslt, a simple Apache module that transforms documents with XSL using Sablotron. It is free to whoever wants it.


Paul Tchistopolskii has released XSLScript, a simpler input language that can be compiled into XSLT. According to Mr. Tchistopolskii:

I got tired typing 3 (or more) lines to invoke foo(bar). I doubt I'll now write too much plain XSLT code - I found it much easier to write and read XSLTScript. In the case I have to write XSLT code (for example - on client side), I can debug it in XSLScript and then I can generate the XSLT stylesheet.

The current version is 0.5. XSLScript is written in


Nick Semenov has ported XML-DBMS to Perl. XML-DBMS is middleware for transferring data between XML documents and relational databases and is now available in both Java and Perl forms.

Friday, July 28, 2000

The W3C XML Core Working Group has released a new draft of the XML Infoset Specification. This document defines what information a namespace-aware XML parser should reveal to applications, and precisely defines those information items. This spec is necessary because until now many other specs like XSLT and DOM have had subtle but potentially dangerous inconsistencies in how they define key terms like the value of a node. This is required reading for anybody working on XML APIs or software or on other XML specifications. However, for the rest of the community it's pretty technical.

Thursday, July 27, 2000

The Apache XML Project has released version 1.1.3 of the Xerces-J XML parser for Java. This release focuses on fixing a number of severe bugs in validation that reflected misunderstandings of the XML 1.0 spec. I recommend all users of Xerces 1.1.2 upgrade. A number of other bugs are fixed and schema support is improved as well with anyAttribute, xsi:type and equivClass.

Wednesday, July 26, 2000

I've posted the slides from this week's presentations in Atlanta. These include:


IBM's alphaWorks has updated their XML Security Suite. This Java program offers digital signatures for XML documents, element-wise encryption, and access control to Internet business-to-business transactions. This release adds support for Xerces-J 1.1 and the DOMHASH implementation of RFC 2803.


XSLTC is an open source XSLT stylesheet compiler written in C++ and based on TransforMiix. XSLTC reads an XSLT stylesheet, and generates C++ code. The executable that is built from this code should have the same behaviour as running the source stylesheet through an XSLT processor, only faster.


FourThought LLC has posted version 0.9.2 of 4XPath and 4XSLT, Python implementations of XSLT/XPath. Overall these releases add a variety of features and allegedly provide full feature support for XSLT 1.0 and XPath 1.0.


IBM's posted a preview release of Tech Explorer 3.0, a browser plug-in that supports MathML 1.0.1, TeX and LaTeX. Version 3.0 supports Macintosh browsers for the first time. It also works on AIX, IRIX, Linux, Solaris, Windows 95, 98 and NT. The Professional version of Techexplorer is $29.95 payware. The Introductory edition is free, but not yet available as a version 3.0 preview release. A free 30-day trial is available.


Digital Applications' SVG2PDF is a Windows and Mac program for converting Scalable Vector Graphics documents into Acrobat PDF documents.


Netscape has released Netscape Communicator 4.74 for the usual list of platforms. This is primarily a bug fix release plus a few minor new features. XML support is not changed. Navigator still can't display XML documents in any reasonable way.

Monday, July 24, 2000

The W3C Working Group has released four new working drafts covering Composite Capabilities/Preference Profiles, a way of specifying how client devices tell servers their capabilities and preferences. The server uses this information to deliver content appropriate for the client device. As well as computer-like devices, particular attention is being paid to mobile phones. Devices are described using RDF. The drafts are:


If you're in Atlanta tonight (Monday), come hear me talk about Java 1.4 and Beyond at the Atlanta Java User's Group. Now that Java 1.3 has been officially released, it's time to explore both where Java is going in the future and where it should go (which may or may not be the same place). What's ahead for the Java language and libraries? Will Java ever get templates, operator overloading, and multiple inheritance? Perhaps more to the point, should it get these constructs? Will Java I/O ever be fast enough for data intensive work? Will your network programs be allowed access to the raw IP layer so ping or traceroute can be written in Java? The meeting runs from 7 P.M. to 9 P.M. and is held at the Ravinia Conference Center near Perimeter Mall. Directions are on the AJUG web site. As usual, I'll be posting the notes from my talk here on Cafe au Lait after the talk. However, the notes are quite skimpy compared to what I'll be saying, so you definitely want to attend if you can. Tomorrow I'll be at the Georgia Web Developers Conference where I'll be talking about XML Fundamentals and Web Client Programming in Java.

Saturday, July 22, 2000

The IESG of the IETF has approved the Internet-Draft Internet Open Trading Protocol (IOTP) HTTP Supplement as a Proposed Standard. Internet Open Trading Protocol (IOTP) messages will be carried as XML documents. IOTP, described in RFC 2801, is a payment system independent framework for e-commerce. It encapsulates existing payment systems such as SET, Secure Channel Credit/Debit, Mondex, CyberCoin, GeldKarte, etc.

Friday, July 21, 2000

Edd Dumbill's posted the seventh beta of his XML-RPC client for PHP. This release adds support for BASIC HTTP authentication as well as fixing various bugs.


Josh Nutzman has published mod_xslt 1.0, an XSL transformation module for Apache based on Sablotron.

Thursday, July 20, 2000

Doug Holton's released an XML parser plug-in for REALbasic based on James Clark's Expat XML parser.


The Ginger Alliance has released version 0.42 of Sablotron, a GPL'd XSL Transformations Processor written in C++ for Windows NT and Linux. This version is now thread safe and supports multiple output code pages. It also fixes a few bugs.

Wednesday, July 19, 2000

The GNOME project has released libxml 2.2.0, an XML parsing library written in C used heavily by GNOME applications like Gnumeric. libxml includes validating SAX and DOM parsers as well as an HTML parser. This release handles internationalization and unusual character sets a lot better.

Tuesday, July 18, 2000

The W3C XML Core Working Group has published a new working draft of XInclude. XInclude defines an <xinclude:include href="URI"/> element whose effect is to insert the referenced XML document into the document in place of the xinclude:include element after validation. You can specify whether you want the included content to be parsed as XML or inserted as raw text with all illegal characters like < escaped as necessary.
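
Here's a rough sketch of the two inclusion modes using Python's standard ElementInclude module. Be aware of the assumptions: that module follows the XInclude namespace and parse attribute as they were eventually standardized, which may not match this working draft's syntax; the xi prefix, the in-memory RESOURCES table, and the loader function are made up so the example runs without touching the network or the file system.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" standing in for real URIs.
RESOURCES = {
    "chapter.xml": "<chapter><title>Groves</title></chapter>",
    "note.txt": "1 < 2",
}

def loader(href, parse, encoding=None):
    # ElementInclude calls this once for each include element it finds.
    text = RESOURCES[href]
    if parse == "xml":
        return ET.fromstring(text)  # included as parsed XML
    return text                     # included as raw text

book = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter.xml"/>'
    '<p><xi:include href="note.txt" parse="text"/></p>'
    '</book>'
)
ElementInclude.include(book, loader=loader)
result = ET.tostring(book, encoding="unicode")
print(result)
```

The chapter arrives as an element tree, while the text inclusion arrives as character data, so its < is escaped when the merged document is serialized.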

Monday, July 17, 2000

FourThought LLC has posted version 0.9.2 of 4XSLT, a Python implementation of XSLT/XPath. This release adds extension elements and fallback support as well as a lot more encodings. The API has been cleaned up somewhat too. Of course it also fixes assorted bugs. This is the first release to be fully feature complete with respect to XSLT 1.0.

FourThought has also released version 0.10.2 of 4DOM, a Python library for XML and HTML processing and manipulation using the W3C's Document Object Model interfaces. 4DOM implements DOM Core level 2, HTML level 2 and Level 2 Document Traversal. This release updates support to cover the May 10 candidate recommendation of DOM Level 2 and adds support for many more output encodings as well as fixing various bugs.


I've got jury duty and MacWorld Expo this week so updates may be a little slow until the end of the week.

Saturday, July 15, 2000

Version 4.0.1 of Opera for Windows has been released. This is primarily a bug fix release with some minor new features. Opera is a $39 payware web browser that supports direct display of XML documents with associated CSS stylesheets.

Friday, July 14, 2000

The XML Apache Project has posted version 0.40.0 of Xalan-C++, an XSLT processor written in C++. This release fixes many bugs, and brings XSLT support closer to full compliance with the specification. All axes except namespaces are now supported, and the document() function has been added. This release was built with Xerces-C 1.2.0a, and may not work with any other release of Xerces. The Xerces binaries from that release are included.


The Ginger Alliance has released version 0.41 of Sablotron, a GPL'd XSL Transformations Processor written in C++ for Windows NT and Linux. This version fixes a few bugs and rearchitects some of the internals to make them cleaner.


Raghavendra Gowkanapalli has founded an egroups mailing list for people preparing for IBM's XML certification exam. To subscribe send a blank email to xml-cert-subscribe@egroups.com from the account you wish to subscribe from.

Thursday, July 13, 2000

The W3C has released four new working drafts. First up is a last call draft for Canonical XML 1.0. Canonical XML describes how to tell whether two documents are "the same" once excess white space has been stripped, default attributes have been inserted, entities replaced and so forth. Last Call Ends 28 July 2000.
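To get a feel for what "the same" means here, Python 3.8 and later ship a canonicalize() function in xml.etree.ElementTree. Note the assumption: it implements the later Canonical XML 2.0, not this 1.0 draft, so treat this as an illustration of the idea rather than of the draft's exact rules. The same_document helper and the sample documents are made up for the example.

```python
from xml.etree.ElementTree import canonicalize

def same_document(a, b):
    """Compare two XML strings by their canonical (C14N 2.0) serializations."""
    return canonicalize(a) == canonicalize(b)

# Hypothetical samples: B reorders attributes, changes quote style and
# adds white space inside the start-tag; C changes an attribute value.
DOC_A = '<item id="7" lang="en">tea</item>'
DOC_B = "<item  lang='en'   id='7' >tea</item>"
DOC_C = '<item id="8" lang="en">tea</item>'

if __name__ == "__main__":
    print(same_document(DOC_A, DOC_B))
    print(same_document(DOC_A, DOC_C))
```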

The IETF/W3C XML Signature Working Group has posted a new post-last-call draft of XML-Signature Syntax and Processing. This spec describes a method for calculating digital signatures for and attaching digital signatures to XML documents. Signatures are calculated with respect to the canonical form of an XML document.

The W3C Voice Browser Working Group has released the first public version of the Speech Recognition Grammar Specification for the W3C Speech Interface Framework. This spec defines BNF and XML syntaxes for grammars used in speech recognition so that developers can specify the words and patterns of words to be listened for by a speech recognizer. Among other issues, the working group is seeking input on whether the final specification should include both XML and BNF forms or simply choose a single syntax.

The W3C Voice Browser working group has also posted the first public draft of Multimodal Requirements for Voice Markup Languages. According to the abstract:

Multimodal browsers allow users to interact via a combination of modalities, for instance, speech recognition and synthesis, displays, keypads and pointing devices. The Voice Browser working group is interested in adding multimodal capabilities to voice browsers. This document sets out a prioritized list of requirements for multimodal dialog interaction, which any proposed markup language (or extension thereof) should address.

I've opened a special report on Internet Explorer 5.5. Comments are welcome.

Wednesday, July 12, 2000

Microsoft's released Internet Explorer 5.5 for Windows. This release includes a lot of XML support. CSS support is expanded. A print preview function has been added. I don't yet know whether or not this supports real XSLT as opposed to the pseudo-XSLT IE5 used. More news as I try it out.

OK, I've installed it and run some simple test files through it, and I am sad to report that IE 5.5 failed almost all my tests quickly, obviously, and immediately. It does a worse job on XML pages with XSLT stylesheets than the latest beta of MSXML3 that I had running with IE5. Many other known bugs and annoyances have yet to be fixed. In particular:

  • The open file dialog still doesn't let you select XML files. To open an XML file you have to select All Files.
  • Some long-standing bugs in handling of entity declarations in DTDs remain. This means IE5.5 is still unable to parse the XML 1.0 specification and many of Jon Bosak's marked up Shakespeare plays, All's Well that Ends Well, for just one example.
  • While IE5.5 finally recognizes the correct namespace URI for XSLT, it does not recognize the correct MIME type (text/xml, not text/xsl). Furthermore, it has some major problems with XSLT 1.0 stylesheets.

I cut Microsoft a lot of slack for IE5, probably more than I should have. What they provided in IE5 was a reasonable implementation of what was then the current draft specification. However, the XSLT 1.0 spec preceded IE5.5 by more than half a year, and Microsoft still hasn't gotten it right! At this point, I can only attribute their failure to incompetence or malice. Take your pick.


The Apache XML Project has released Xalan-J 1.1, their XSLT processor written in Java. Most importantly Xalan-J now works with the current version of Xerces-J, 1.1.2 (.jar included). Also the ANT build script should finally work. The major API change is that a Xerces-specific URI class now replaces the java.net.URL class when resolving system IDs for external entities. Because of this, Xalan now requires you to use real file URLs like file:///c:/foo/bar.xml instead of c:\foo\bar.xml. However, the command line processor continues to work as before. Java 1.1.8 or later is required.
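If you have a pile of Windows paths to convert, the mapping is mechanical. Here is a small Python sketch of it; to_file_url is a hypothetical helper, and it assumes an absolute path that starts with a drive letter.

```python
from urllib.parse import quote

def to_file_url(windows_path):
    """Turn an absolute drive-letter path such as c:\\foo\\bar.xml into a file URL.

    Assumes the path is absolute and begins with a drive letter; characters
    that are not legal in URLs (spaces, for instance) are percent-escaped.
    """
    forward = windows_path.replace("\\", "/")
    return "file:///" + quote(forward, safe="/:")

if __name__ == "__main__":
    print(to_file_url(r"c:\foo\bar.xml"))
```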

Tuesday, July 11, 2000

Oliver Becker has written three interesting XSLT stylesheets:


Roger Costello's posted a fairly comprehensive XML Schema tutorial.


FileMaker Inc. has released Web Companion 5.0v5 to fix assorted bugs including providing the correct MIME types for CSS and XSL style sheets.

Monday, July 10, 2000

Michael Smith has founded the XML-DOC mailing list to discuss applying XML to technical documentation. You can subscribe to the list by sending a blank message to xml-doc-subscribe@egroups.com. From the announcement:

XML-DOC is for technical writers, markup authors, XML developers, XML product vendors and systems integrators, and anyone else interested in discussing the application of XML tools and techniques to the creation of documentation, particularly documentation for computer software and hardware and for e-business infrastructure. Topics that are highly encouraged on the list:

  • XML-based content/document management systems
  • Developing and refining XML tools for the manipulation and delivery of technical documentation
  • "Single-sourcing" with XML (write once, reuse/republish anywhere)
  • DocBook, the SGML/XML DTD for documentation
  • Transitioning to XML authoring from a FrameMaker environment
  • Converting legacy documents to XML
  • Publishing (transforming and delivering XML content)
  • XML training and certification
  • Document analysis and data modeling
  • Profiling and customizing for multiple audiences and delivery formats

Saturday, July 8, 2000

The W3C has released Amaya 3.2, their experimental open source web browser and HTML editor for Windows and Unix. Italian and German versions are now available. Most of the new features are fairly minor. Amaya still doesn't support full display of XML in the web browser using either CSS or XSL stylesheets. However, this release does cover a little more of MathML, and does add support for simple XLinks.


The W3C has also released version 2.0.5 of the Jigsaw web server, their reference web server written in Java that supports HTTP 1.1. Jigsaw supports the Java Servlet API 2.2 (with the exception of .war because "it is too 'file' oriented") and Java Server Pages. This is a bug fix release of the "stable" version of Jigsaw. A more experimental, possibly less robust version 2.1.1 is also available. Jigsaw is completely free and open source software.


I already mentioned the release of Opera 4.0 for Windows with full support for XML and CSS a couple of weeks ago, but I just found out about one new feature in this release that's so important I thought I'd bring it up again. Opera 4.0 has customizable cookie filtering that lets you disallow third party cookies, any cookie from a particular website, or write your own cookie filter that lets you eliminate cookies you don't want. This sort of privacy enhancing technology ought to be built into all browsers. The current choices between all cookies and no cookies are simply too limiting.

By way of contrast, Netscape is now being sued in New York over its use of cookies in the Smart Download feature. The suit alleges that Netscape is violating federal law by secretly transmitting file download information back to Netscape and AOL.

Friday, July 7, 2000

The W3C has posted a new "public review draft" of the Scalable Vector Graphics (SVG) specification. SVG is an XML application describing vector graphics such as might be produced by Illustrator or Corel Draw. The big change in this draft is the merging of Stylable SVG and Exchange SVG into a single application. The object model for SVG was also cleaned up a lot, and many elements and attributes were renamed. There are many editorial corrections as well. The complete list of changes in Appendix M has details.

Wednesday, July 5, 2000

John Wilson has released the first production version of MinML, a parser of an XML-like language intended for embedded Java systems. The big omission is mixed content. The API is SAX 1.0. While this might be useful in some applications, I really wish the word "XML" wasn't used anywhere in reference to it. This is not an XML parser, and it will fail to parse many well-formed XML documents. I think products like this will only confuse most developers who really do want and need a fully XML compliant parser. Not all languages have to be XML and not all parsers have to parse XML; but parsers that parse some other language shouldn't call themselves "a minimal XML parser" when they don't even try to meet the minimal requirements of the XML 1.0 spec.

Tuesday, July 4, 2000

XLink has now been promoted to candidate recommendation stage. The major addition to this draft is the xlink:arcrole attribute. This can be attached to simple XLinks and arcs. It serves the same purpose for arc resources as the xlink:role does for extended, simple, locator, and resource elements. (I haven't yet figured out why arcs deserve special treatment.) Furthermore, the specification now requires that the value of both xlink:role and xlink:arcrole attributes must be a URI that identifies a resource describing the intended property rather than the description itself.

The undefined value for the xlink:show and xlink:actuate attributes has been renamed other. Furthermore, xlink:show and xlink:actuate can no longer have arbitrary values but are now limited to the values defined in the spec: onLoad, onRequest, other, and none for xlink:actuate and new, replace, embed, other, and none for xlink:show.
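Since the legal values are now a closed list, checking them is easy to automate. The following Python sketch does that under some assumptions: invalid_xlink_values and the sample document are hypothetical, and the namespace URI http://www.w3.org/1999/xlink is assumed to be the one your documents use.

```python
import xml.etree.ElementTree as ET

# Assumed XLink namespace URI; adjust if your documents use another.
XLINK = "http://www.w3.org/1999/xlink"

# The value lists given above for the candidate recommendation.
ALLOWED = {
    "show": {"new", "replace", "embed", "other", "none"},
    "actuate": {"onLoad", "onRequest", "other", "none"},
}

def invalid_xlink_values(xml_text):
    """Return (element, attribute, value) for each xlink:show or
    xlink:actuate attribute whose value is not in the permitted list."""
    problems = []
    for element in ET.fromstring(xml_text).iter():
        for name, allowed in ALLOWED.items():
            value = element.get("{%s}%s" % (XLINK, name))
            if value is not None and value not in allowed:
                problems.append((element.tag, name, value))
    return problems

# Hypothetical test document: the second link uses an arbitrary value.
SAMPLE = (
    '<doc xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<a xlink:type="simple" xlink:href="x.xml" '
    'xlink:show="new" xlink:actuate="onRequest"/>'
    '<b xlink:type="simple" xlink:href="y.xml" xlink:show="popup"/>'
    '</doc>'
)

if __name__ == "__main__":
    print(invalid_xlink_values(SAMPLE))
```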

Linkbases are more fully described as is the behavior of applications processing linkbases. Otherwise, I haven't found any major changes in this draft. I'll update Chapter 16 of the XML Bible again as soon as I get a free minute.

Monday, July 3, 2000

Bill LaForge has released version 1.4 of Quick. Quick converts XML files into a Java object structure as specified by a QDML schema. Version 1.4 adds ANY and MIXED support to QDML, includes Mark Wutka's DTD Parser mentioned here yesterday, the com.jxml.quick.qdml package that provides an object model for QDML, and DTD2QDML, a program for converting DTDs to QDML. Java 1.3 is required.

Sunday, July 2, 2000

Mark Wutka has posted a standalone DTDParser written in Java. This allows you to read and respond to the various ELEMENT, ATTLIST, and other declarations in an XML Document Type Definition. Up till now this functionality has been a major omission in most XML parsers and APIs. DTDParser is licensed under the Lesser GPL (LGPL) license.
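To show what responding to declarations looks like in practice, here is a Python sketch that uses the declaration callbacks of the standard xml.parsers.expat module. It is not DTDParser's API, which I have not reproduced; list_declarations and the sample document are hypothetical, and only the internal DTD subset is read.

```python
import xml.parsers.expat

def list_declarations(xml_text):
    """Collect ELEMENT and ATTLIST declarations from a document's internal DTD subset."""
    elements = []
    attributes = []
    parser = xml.parsers.expat.ParserCreate()

    def on_element(name, model):
        # Called once per <!ELEMENT ...> declaration.
        elements.append(name)

    def on_attlist(element_name, attribute_name, attribute_type, default, required):
        # Called once per attribute declared in an <!ATTLIST ...>.
        attributes.append((element_name, attribute_name, attribute_type))

    parser.ElementDeclHandler = on_element
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(xml_text, True)
    return elements, attributes

# Hypothetical sample document with an internal DTD subset.
SAMPLE = (
    '<!DOCTYPE note ['
    '<!ELEMENT note (#PCDATA)>'
    '<!ATTLIST note lang CDATA #IMPLIED>'
    ']>'
    '<note lang="en">hi</note>'
)

if __name__ == "__main__":
    print(list_declarations(SAMPLE))
```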

Saturday, July 1, 2000

The Ginger Alliance has released version 0.4 of their Sablotron XSLT processor written in portable C++. Version 0.4 changes the license to the MPL or GPL (user choice). It also makes some API changes.

Friday, June 30, 2000

Infinity-loop has released upCast 1.2.6, a Java application that converts RTF documents to XHTML or XML. Pictures in the RTF documents are written to external files, WMF-files are converted into JPEGs. upCast is payware (99 euros) but free for non-commercial use.

Thursday, June 29, 2000

Sun's posted the first alpha version of XSLTC, an XSLT-to-Java compiler. The tool converts an XSLT stylesheet into a Java .class file called a translet that can perform the same transformation. It is of course written in Java, and should run on any Java 1.2 compatible platform.


Opera Software has released version 4.0 of their namesake Opera web browser for Windows. Opera provides full support for direct display of XML styled with CSS.


The answer to yesterday's question about how to convert tag names to normal strings is the XPath name() function. We probably could have figured this out in about five minutes if we had had an Internet connection or a copy of my XML Bible at the show on Tuesday.
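For comparison, the same job outside XSLT is a short tree walk. This Python sketch is an analogue of what name() gives you, not an XSLT solution; element_names is a hypothetical helper, and for namespaced documents ElementTree reports names in {uri}local form rather than as prefixed names.

```python
import xml.etree.ElementTree as ET

def element_names(xml_text):
    """Return each distinct element name once, in document order, as plain strings."""
    names = []
    for element in ET.fromstring(xml_text).iter():
        if element.tag not in names:
            names.append(element.tag)
    return names

if __name__ == "__main__":
    for name in element_names("<html><ul><li>a</li><li><p>b</p></li></ul></html>"):
        print(name)
```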

Wednesday, June 28, 2000

Ronald Bourret has released version 1.0.1 of XML-DBMS, an open source package "for transferring data between XML documents and relational databases. It views the XML document as a tree of document-specific objects (not the DOM) in which element types are generally viewed as classes and attributes and PCDATA as properties of those classes. It then uses an object-relational mapping to map these objects to the database....An XML-based mapping language is used to define the view and map it to the database. This language is very flexible, allowing users to do basic tasks such as renaming elements and attributes for storage in the database, as well as providing more advanced capabilities, such as handling namespaces, defining how null data is treated, and defining date formats." This release fixes a few bugs and adds some new and revised samples, as well as better documentation for the beginning user.


Yesterday, Alex Chaffee challenged me to find a way to list the tag names in an arbitrary XML document using only standard XSLT. For example,

  • p
  • li
  • html
  • ul

What I need to do is to convert the tag names to normal character data. I do not want to output the tags themselves complete with angle brackets. So far the only thing I've thought of is an extension function. Can anyone help me out here? I do not know which tags are in the document in advance.

Tuesday, June 27, 2000

I've posted the complete notes for the introductory talk about JDOM I gave last night at the New York C++ & Java SIG of the New York PC Users' Group meeting at XMLDevCon. The talk will be reprised in a slightly longer version at the New York City XML User's Group in August. (Date and location still to be determined. I'll post more details here when I know them.) I may also give a shorter talk at Extreme in Montreal in August if the conference board picks up my "breaking news" submission, but that's still uncertain.

Monday, June 26, 2000

I'll be at XMLDevCon today and most of this week so updates are likely to be a little slow. Tonight I'm giving an introductory talk about JDOM at the meeting of the New York C++ & Java SIG of the New York PC Users' Group. The meeting starts at 6:45 in the Gramercy B Room of the New York Hilton, 1335 Sixth Avenue, (where the XMLDevCon show is being held) and runs till roughly 8:00. I'll be showing off JDOM basics and comparing and contrasting it to SAX2 and DOM. Admission is free. You do not need to be registered at XMLDevCon to attend, though you may want to register for a free XMLDevCon special events pass if you want to see the keynotes or wander the show floor as well.

Sunday, June 25, 2000

Mark Wutka's released JOX, an open source Java library for mapping XML documents to Java Beans and vice-versa using introspection. No separate mapping file is required. JOX can map nested objects and indexed properties, and can use a DTD to tell whether to write a property out as an attribute or an element. JOX is published under the LGPL (Lesser General Public License).
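The introspection idea translates readily to other languages. Below is a rough Python sketch using dataclasses in place of Java Beans. It is not JOX's API; the Book class, from_xml, and to_xml are hypothetical, and it handles only flat, element-valued properties (no nesting, indexed properties, or DTD-driven attributes).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

@dataclass
class Book:
    """Hypothetical bean used only for this example."""
    title: str = ""
    pages: int = 0

def from_xml(cls, xml_text):
    """Fill a bean by matching child element names to field names."""
    root = ET.fromstring(xml_text)
    bean = cls()
    for field in fields(cls):
        child = root.find(field.name)
        if child is not None:
            # field.type is the annotated class (str, int), used as a converter.
            setattr(bean, field.name, field.type(child.text or ""))
    return bean

def to_xml(bean):
    """Write each field of the bean out as a child element."""
    root = ET.Element(type(bean).__name__)
    for field in fields(bean):
        ET.SubElement(root, field.name).text = str(getattr(bean, field.name))
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    book = from_xml(Book, "<Book><title>Example</title><pages>42</pages></Book>")
    print(book)
    print(to_xml(book))
```

No mapping file is involved: the field names discovered at run time double as the element names.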

Saturday, June 24, 2000

The W3C and the Unicode Consortium have released a joint draft technical report on using Unicode 3.0 with XML. This report focuses on conflicts between specific kinds of Unicode characters (e.g. bidirectional controls) and XML.


The Apache XML Project has released version 1.1.2 of the Xerces-J XML parser for Java. This release focuses on expanding support for W3C XML schemas. In addition the optional SAX2 DeclHandler and LexicalHandler interfaces have been added. Namespace support is now on by default in this release.


The Apache XML Project has also posted a developer's release of the Xalan-J XSLT processor written in Java. This is version 1.1.D01 and adds some small new features and several bugfixes. It has not yet been extensively tested. Xalan-J 1.1.D01 requires Xerces-J 1.1.1 and may or may not work with other versions including the most recent 1.1.2.

Friday, June 23, 2000

Unidex has released XML Convert 2.0, a ridiculously expensive $1,000 payware Java program for converting flat text files to XML and vice versa. Version 2.0 adds the ability to convert from one flat file format to another as well as better integration with SAX and James Clark's XT. All of this is accomplished by writing an XFlat schema that describes the transformation.

I took a look at the last beta of XML Convert for a project I was working on that required me to convert a few megabytes of tab delimited data to XML. However, I quickly decided that it would be a lot easier to hack together a solution in Java, AppleScript, and FileMaker rather than learning yet another language (XFlat schema). It probably would have been even easier to do the job if my Perl wasn't so rusty. It might be more worthwhile if you had a more frequent need to do this sort of job, but I suspect that for most of us, our occasional needs to do this sort of thing are better met by using Perl to go from flat files to XML and XSLT to go from XML to flat files.

Thursday, June 22, 2000

Simon St. Laurent, Dan Kohn, and Murata Makoto have posted a revised Internet draft for XML media types. It proposes the text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd media types as well as using the '+xml' suffix for media types for specific XML applications like SVG and RDF. This is an "informal last call" working draft.

Wednesday, June 21, 2000

OASIS has opened the XML.ORG Registry to the public. This is allegedly a directory of XML specifications and vocabularies. However it doesn't appear to be very useful, at least not yet. I tried four searches for "math", "legal", "chemistry", and "health"; all areas in which I know there's a lot of important XML work going on. Every one of them reported "No matches found." I suggest, only half facetiously, that OASIS take whatever budget they've allocated for this project and just give it to Robin Cover. His pages are still the best place to find details about particular XML applications despite half a dozen or so industry consortium efforts that seem dedicated mostly to producing press releases.


IBM's alphaWorks has released the first public version of Extensible Types, a Java component library for specifying constraints and determining whether objects satisfy those constraints. In particular it can verify many datatypes defined by the W3C XML Schema working draft as well as various ISO datatypes. The distribution includes a command line tool for deducing XML Schema text-only types from a set of XML documents or sets of example strings.


The W3C XML Schema Working Group has released a list of 204 open issues in the schema spec.


IBM's alphaWorks has updated VoiceXML for DirectTalk, a prototype implementation of a VoiceXML rendering application that runs on the IBM DirectTalk family of voice response products. This release adds support for "the VoiceXML browser Voicelet, ViaVoice speech recognition and synthesis, Lernaut & Hauspie voice synthesis, and FirstByte voice synthesis."

Tuesday, June 20, 2000

EXML is an open source, expat-based XML parser for Eiffel.


XYZFind have released a public beta of their payware namesake XML search engine for Linux and Solaris.

Monday, June 19, 2000

Next Monday, June 26, one week from today, I'll be giving an introductory talk about JDOM at the meeting of the New York C++ & Java SIG of the New York PC Users' Group at XMLDevCon. The meeting starts at 6:45 in the Gramercy B Room of the New York Hilton (where the XMLDevCon show is being held) and runs till roughly 8:00. I'll be showing off JDOM basics and comparing and contrasting it to SAX2 and DOM. Admission is free. You do not need to be registered at XMLDevCon to attend, though you may want to register for a free XMLDevCon special events pass if you want to see the keynotes or wander the show floor as well. Make your plans now.

Sunday, June 18, 2000

Opera Software has released the sixth beta of their namesake web browser for Windows. Opera 4.0 supports direct display of XML documents with attached CSS stylesheets in the browser. Opera 4.0 is $39 payware. The beta is free for 30 days.

Saturday, June 17, 2000

XIOP is an open source project to produce a Corba ESIOP (Environment-Specific Inter-ORB Protocol) and GIOP compliant mapping for OMG Corba 2.3 / 3, using HTTP 1.1 as the communication protocol and XML 1.0 as the content encoding. The encoding framework is based on GREN, an Information Model with strong ISO Groves ties. Support for XML signatures will be added before the first release of the specification.

Friday, June 16, 2000

I've updated Chapter 17 of the XML Bible, XPointers, to reflect the latest candidate recommendation.

Thursday, June 15, 2000

The first beta of Apache Jetspeed 1.2 has been released. Jetspeed is an open source Enterprise Information Portal; i.e. Web Based Groupware. Version 1.2 adds iCalendar support and the Portal Structure Markup Language as well as fixing various bugs.

Wednesday, June 14, 2000

IBM's alphaWorks has posted an alpha of Xeena 1.2. Xeena is a tree-structured XML editor for valid documents. Version 1.2 adds partial XML Schema support and other new features, including DTD to XML-Schema import, search and print capabilities, and keyboard shortcuts.

AlphaWorks has also posted a new version, 1.2.19, of the Task Guide Viewer that works with JDK 1.3 and provides more examples. The Task Guide Viewer is "an XML-based tool for creating wizards."

Tuesday, June 13, 2000

The IETF has posted a new draft of XML Media Types. This proposes five new media types:

  • text/xml
  • application/xml
  • text/xml-external-parsed-entity
  • application/xml-external-parsed-entity
  • application/xml-dtd

It also suggests using the suffix "+xml" for media types for specific XML applications such as RDF and XHTML.
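The practical payoff of the suffix is that generic software can recognize XML without knowing every application's media type in advance. Here is a minimal Python sketch of such a check; is_xml_related is a hypothetical helper, and application/rdf+xml in the usage lines is just an illustrative name following the proposed convention.

```python
# The five media types proposed in the draft.
XML_MEDIA_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
    "application/xml-dtd",
}

def is_xml_related(content_type):
    """True if a Content-Type value names one of the proposed XML media
    types or any media type that uses the "+xml" suffix convention."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in XML_MEDIA_TYPES or media_type.endswith("+xml")

if __name__ == "__main__":
    for value in ("text/xml; charset=utf-8", "application/rdf+xml", "text/html"):
        print(value, is_xml_related(value))
```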


The W3C has posted a new working draft of Canonical XML 1.0. Canonical XML describes how to tell whether two documents are "the same" once excess white space has been stripped, default attributes have been inserted, entities replaced and so forth. This is no longer a last call working draft. It now uses the XPath data model instead of the XML Infoset data model.


John Cowan's released a more XHTML compatible version of his Itsy Bitsy Teeny Weeny Simple Hypertext DTD (IBTWSH). This is useful for various XML applications that need a little bit of free-form text; for example, a description element in a catalog DTD.

Monday, June 12, 2000

Maas is developing a payware XML parser for Cobol85. An evaluation version is currently available in exchange for your contact info. They're also publishing a freeware DTD editor written in Java. From the DTD it generates:

  1. Java classes
  2. A SAX document handler for instantiating and filling data to the objects
  3. The XML document writer

Java 1.1.7 or later is required.

Sunday, June 11, 2000

IBM's alphaWorks has released the P3P Policy Editor, a GUI tool written in Java for creating and updating Web site privacy policies written in the W3C Platform for Privacy Preferences XML based language.


FourThought LLC has posted version 0.9.1 of 4XPath and 4XSLT, Python implementations of XSLT/XPath. 4XPath implements almost all of XPath except for the lang() core function. 4XSLT is a little farther away from full conformance, but it's still pretty close. Overall these releases improve performance and fix a lot of bugs.

FourThought has also released version 0.10.1 of 4DOM, a Python library for XML and HTML processing and manipulation using the W3C's Document Object Model interfaces. 4DOM implements DOM Core level 2, HTML level 2 and Level 2 Document Traversal. This release fixes assorted bugs.

Saturday, June 10, 2000

IBM's alphaWorks has updated their XML Lightweight Extractor to add support for entities, comments, multiple simultaneous data sources, and hooks for using SQL directly in the DTD mapping file. XLE enables a user to annotate a DTD, then extract XML documents conforming to that DTD from data sources like relational databases.


dc-extract.xsl is an XSLT stylesheet (also available as an online form) that extracts Dublin Core data encoded in XHTML pages using META tags, and converts it to RDF.
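The extraction half of that job is simple enough to sketch in a few lines of Python. This is not the stylesheet's logic, just an illustration; dublin_core is a hypothetical helper, it assumes the common convention of META names that begin with "DC.", and it stops short of producing RDF.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

def dublin_core(xhtml_text):
    """Return (property, value) pairs for META elements whose name starts with "DC."."""
    pairs = []
    for meta in ET.fromstring(xhtml_text).iter(XHTML + "meta"):
        name = meta.get("name", "")
        if name.upper().startswith("DC."):
            pairs.append((name[3:].lower(), meta.get("content", "")))
    return pairs

# Hypothetical page used only for this example.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title>'
    '<meta name="DC.Title" content="XML News"/>'
    '<meta name="keywords" content="xml"/>'
    '<meta name="DC.Creator" content="A. Author"/>'
    '</head><body/></html>'
)

if __name__ == "__main__":
    print(dublin_core(PAGE))
```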


Michael Mealling's submitted a proposal to the IETF covering A URN Namespace of Object Identifiers. This proposal is the first step in registering a Uniform Resource Name (URN) namespace for OID object identifiers. This would produce URIs that look like URN:OID:0.9.2342.19200300.100.4.

Friday, June 9, 2000

Howard Katz has posted an alpha of the XML Query Engine JavaBean that searches XML documents using a version of XQL. The engine uses a SAX parser to build an inverted index of the documents' full-text content. Java programming is required to connect the XML Query Engine to your own front-end. There's also a useful list of XML documents currently published as XML on the Web. This is payware, but the alpha release is free so far.
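The indexing technique itself is easy to demonstrate. This Python sketch builds an inverted index from SAX character events; it says nothing about how the XML Query Engine is actually implemented, and IndexBuilder, build_index, and the sample documents are hypothetical. One simplification: a word split across two characters() calls would be indexed as two fragments.

```python
import re
import xml.sax
from collections import defaultdict

class IndexBuilder(xml.sax.ContentHandler):
    """SAX handler that records which document each word of text appears in."""

    def __init__(self, doc_id, index):
        super().__init__()
        self.doc_id = doc_id
        self.index = index

    def characters(self, content):
        # Lower-case the text and add every word to the inverted index.
        for word in re.findall(r"\w+", content.lower()):
            self.index[word].add(self.doc_id)

def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        xml.sax.parseString(text.encode("utf-8"), IndexBuilder(doc_id, index))
    return index

# Hypothetical documents used only for this example.
DOCUMENTS = {
    "a.xml": "<p>XML parsers parse XML</p>",
    "b.xml": "<p>SAX parsers</p>",
}

if __name__ == "__main__":
    index = build_index(DOCUMENTS)
    print(sorted(index["parsers"]))
```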


XMLPatterns.com is a new website cataloging XML Design Patterns. 25 patterns are currently listed. Some of the patterns seem a little trivial (Metadata First, for example), and the writing is on the dry side, but overall it looks like a good start.

Thursday, June 8, 2000

The W3C XML Linking Working Group has promoted XPointer to candidate recommendation. XPointer is an XPath based, non-XML syntax for identifying particular nodes and node sets in XML documents. It is not yet implemented outside of experimental software.

One major new development in this release is that "The W3C Director has been informed of a Patent (U.S. Patent 5659729, issued 19 Aug 1997 to Sun Microsystems) that may cover parts of XPointer. As of June 7 2000, W3C did not receive an official answer from the patent owner on this issue." The specific patent in question describes a "Method and system for implementing hypertext scroll attributes". The author is Jakob Nielsen. This is particularly galling since Sun employee Eve Maler is one of the primary editors of the specification. I hope this is simply an accident, and not another example of a company trying to get their patented technology adopted by a standards body so they can extort royalties from anybody who implements the standard.

Other changes to the specification since the last draft seem to mostly reflect editorial decisions and clarifications. For instance, the rules for escaping XPointers in both URIs and XML attributes are laid out in much greater detail. The actual XPointer syntax has not changed very much at all. There are two new functions. The range-to() function creates a range that starts from the context node and extends to the argument given to the function. The unique() function returns true if and only if there's exactly one node in the context node list. Otherwise it returns false. That is, it tells you whether or not your XPath points to exactly one thing.


The W3C XML Linking Working Group has also published a second last call working draft of XBase. This document proposes a facility, similar to that of HTML BASE, for defining base URIs for parts of XML documents. The new last call ends June 28.
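To see how scoped base URIs behave, here is a Python sketch that resolves relative references against the nearest base. Some assumptions: it uses the xml:base attribute name from the final XML Base Recommendation, the unqualified href attribute is just an illustrative link attribute, and resolved_hrefs and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree's spelling of the xml:base attribute name.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def resolved_hrefs(element, base):
    """Resolve every href in the tree against the base URI in scope at that point."""
    # An xml:base value may itself be relative to the base inherited so far.
    base = urljoin(base, element.get(XML_BASE, ""))
    hrefs = []
    href = element.get("href")
    if href is not None:
        hrefs.append(urljoin(base, href))
    for child in element:
        hrefs.extend(resolved_hrefs(child, base))
    return hrefs

# Hypothetical document: the part element narrows the base for its children.
SAMPLE = (
    '<doc xml:base="http://example.com/docs/">'
    '<link href="intro.xml"/>'
    '<part xml:base="part2/"><link href="ch1.xml"/></part>'
    '</doc>'
)

if __name__ == "__main__":
    for url in resolved_hrefs(ET.fromstring(SAMPLE), "http://example.com/"):
        print(url)
```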

Wednesday, June 7, 2000

The W3C has posted two notes from Sun for the JSpeech Grammar Format and the JSpeech Markup Language. The JSpeech Grammar Format defines a non-XML syntax for encoding the valid responses at given moments in a program's life. For instance, a very simple grammar may say that in response to a particular question, the user must answer "Yes" or "No". It makes speech recognition more accurate by limiting the words that may be spoken and commands that may be given at any one time. The JSpeech Markup Language is an XML application for annotating text that will be spoken by a computer to improve its quality. For instance, JSML lets you specify the pronunciation of a word or say that a particular phrase should be spoken louder to emphasize it.

Aside from name changes, these are the same as Sun's previously published Java Speech API Grammar Format 1.0 and Java Speech Markup Language 0.6. The names have been changed to protect Sun's Java trademark while these are developed under the W3C process. The W3C is obligated to post notes submitted by member organizations like Sun, but they aren't obligated to endorse or do anything with them. These two specs will probably be developed within the Voice Browser Working Group, however.

Tuesday, June 6, 2000

The Apache XML Project has released version 1.1.1 of Xerces-J, their open source XML parser written in Java. This release adds Sun's Java API for XML Parsing, schema include and import, and fixes many bugs.


The Ginger Alliance has released version 0.36 of Sablotron, a GPL'd XSL Transformations Processor written in C++ for Windows NT and Linux. This version is much faster than 0.35 and adds support for xsl:copy.

Saturday, June 3, 2000

The W3C XML Signature Working Group has posted a new working draft of XML-Signature Syntax and Processing and a new working draft of Canonical XML Version 1.0. These are two closely related standards. Canonical XML defines when two XML documents are effectively the same although they're not byte-for-byte identical. For instance, changing the document encoding, adding white space to attribute values, or reordering attributes would not create a canonically different XML document. XML-Signature Syntax and Processing describes a method for calculating digital signatures for and attaching digital signatures to XML documents. Signatures are calculated with respect to the canonical form of an XML document.
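The reason signatures work from the canonical form is that superficial differences then don't change the bytes being hashed. The Python sketch below shows only that digest step and is in no way an XML-Signature implementation. The assumptions: canonicalize() (Python 3.8 and later) implements the later Canonical XML 2.0, SHA-256 is picked arbitrarily, and canonical_digest is a hypothetical helper.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

def canonical_digest(xml_text):
    """Hash the canonical form of a document rather than its raw bytes.

    with_comments=True keeps comments, so documents that differ only in
    their comments produce different digests.
    """
    canonical = canonicalize(xml_text, with_comments=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

if __name__ == "__main__":
    first = canonical_digest('<order id="1" rush="yes"><qty>2</qty></order>')
    second = canonical_digest("<order rush='yes' id='1'><qty>2</qty></order>")
    print(first == second)
```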

The big change in the Canonical XML specification is the use of the XPath data model instead of the XML InfoSet data model. The biggest changes to the definition of canonical XML are

  • Comments are no longer stripped. Two documents that differ only in their comments are no longer considered the same.
  • Namespace prefixes are no longer rewritten and the default namespace is allowed.

Friday, June 2, 2000

4xt has started an effort to update and maintain James Clark's open source xt XSLT processor. No deliverables are available yet, but let them know if you'd like to participate.

Thursday, June 1, 2000

A. Sundararajan posted a proposal for Dynamic XML without DOM and Scripts. In concept at least it seems very similar to Microsoft's HTML+TIME.

Tuesday, May 30, 2000

Sebastian Rahtz has updated PassiveTeX, his TeX-based XSL Formatting Object renderer that implements large parts of the March 2000 XSL-FO working draft. He's also updated his XSLT stylesheets for Text Encoding Initiative documents that transform into HTML and XSL-FO.


Tim McCune's released the first beta of Merlot 1.0, an open-source, Java-based visual XML editor. Merlot can be extended via custom editor plugins that can be added for individual XML applications. Java 2 is required.

Monday, May 29, 2000

Henry S. Thompson posted an alpha of his and Richard Tobin's online XML Schema Validator (XSV) with full Unicode support and XML-formatted output.

Saturday, May 27, 2000

IBM's alphaWorks Trust Establishment uses XML as its native file format. A Visual Policy Editor (VPE) edits Trust Policy Language (TPL) documents. Trust Establishment lets you use public key certificates to decide who you trust.

Friday, May 26, 2000

IBM's alphaWorks has posted a new version of their SOAP for Java to support services implemented in scripting languages, a new service manager client and 1-dimensional arrays.

Thursday, May 25, 2000

FourThought LLC has posted version 0.9.0 of 4XPath and 4XSLT, Python implementations of XSLT/XPath. 4XPath implements almost all of XPath except for the lang() core function. 4XSLT is a little farther away from full conformance, but it's getting pretty close. It does not yet handle extension elements and fallbacks. This release adds the xsl:attribute-set and xsl:decimal-format elements. Overall these releases improve performance and fix a lot of bugs.

FourThought has also released version 0.10.0 of 4DOM, a Python library for XML and HTML processing and manipulation using the W3C's Document Object Model interfaces. 4DOM implements DOM Core level 2, HTML level 2 and Level 2 Document Traversal. This release fixes assorted bugs and rearchitects some of the internals.


Minoru Development Corporation has released MDC-XSL 0.1, an open source XSLT processor written in C++. It's primarily intended for inclusion in other applications, but a command line converter is included as well.

Wednesday, May 24, 2000

Eric van der Vlist has released CSVDocument, a SAX parser for comma-separated-value files. The SAX client thinks it's reading an XML document that uses a simple XHTML table/tr/td vocabulary. XSLT can transform this basic vocabulary into other vocabularies.
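The trick of presenting non-XML data as a stream of SAX events can be sketched in a few lines of Python. This is not CSVDocument's code or API; it's an illustration using the standard `csv` and `xml.sax` modules, and the function names `csv_to_sax` and `csv_to_xml` are my own inventions.

```python
import csv
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def csv_to_sax(csv_text, handler):
    """Fire table/tr/td SAX events at handler for each CSV row and cell."""
    no_attrs = AttributesImpl({})
    handler.startDocument()
    handler.startElement("table", no_attrs)
    for row in csv.reader(io.StringIO(csv_text)):
        handler.startElement("tr", no_attrs)
        for cell in row:
            handler.startElement("td", no_attrs)
            handler.characters(cell)
            handler.endElement("td")
        handler.endElement("tr")
    handler.endElement("table")
    handler.endDocument()


def csv_to_xml(csv_text):
    """Serialize the event stream; any other ContentHandler would work too."""
    out = io.StringIO()
    csv_to_sax(csv_text, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


if __name__ == "__main__":
    print(csv_to_xml("name,price\nwidget,4.95\n"))
```

Because the consumer only sees ContentHandler callbacks, it has no way of knowing that the input was never an XML document.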

Tuesday, May 23, 2000

RenderX has posted the first public version of XEP, their XSL Formatting Object to PDF converter. XEP only supports the April 21, 1999 working draft of XSL-FO (coincidentally the draft covered in Chapter 15 of my XML Bible). This is an evaluation version that places a small gray label on every page produced, with a hyperlink to the RenderX site.


Version 2.29 of XML::Parser is available from your local CPAN mirror. This release fixes assorted bugs.


PHP 4.0 has been released. Version 4.0 adds the Zend engine for more speed and includes built-in XML support and better session tracking. And of course many bugs are fixed and probably a few new ones introduced. PHP4 is open source and available in source form as well as pre-compiled binaries for Windows. (If you're running Unix, you'll have to compile it yourself.)

Monday, May 22, 2000

Version 4 of DocBook has been released. Version 4 brings much-anticipated XML compatibility to DocBook. DocBook is an SGML (and now XML) application for technical documentation. It's been used for some O'Reilly books (indeed, version 1.0 was invented by O'Reilly's production department) as well as for large parts of the Linux Documentation Project.

DocBook, in either SGML or XML, is actually quite large. Norm Walsh has written a simplified version of the DocBook DTD that's suitable for small articles, essays, HowTos and the like that don't need tables of contents, cover pages, indexes, and the other effluvia of a full-fledged book. Documents written against this DTD should still validate against the full DTD.


The Apache XML Project has released the binary version of Xerces-J 1.1.0. This release focuses on updating schema support to the April 19th working draft, though much work remains to be done.

Sunday, May 21, 2000

Opera Software has released the fourth beta of Opera 4.0, their namesake Web browser for Windows. Beta 4 adds experimental support for WAP-WML (Wireless Application Protocol-Wireless Mark-up Language). Opera 4.0 can directly display XML files associated with CSS style sheets in the browser.


The Apache XML Project has released version 1.1.0 of the Xerces-J XML parser for Java in source form only. Binaries should follow shortly. This release focuses on updating schema support to the April 19th working draft, though much work remains to be done.

Saturday, May 20, 2000

The XML Apache Project has released version 1.7.4 of the Cocoon XML Publishing Framework with assorted minor new features, performance enhancements, and bug fixes.


Matt Sergeant's released version 0.65 of AxKit, the Apache XML Delivery Toolkit. AxKit is a suite of tools for the Apache httpd server running mod_perl, based on the Apache-Perl integration project, and Perl's XML processing tools, for delivering XML to clients. This release adds preliminary XSP support. Currently AxKit's XSP only supports Perl.

Friday, May 19, 2000

The Ginger Alliance has released version 0.35 of Sablotron, a GPL'd XSL Transformations Processor written in C++ for Windows NT and Linux.


Version 3.0 of XML Spy has been released. XML Spy is a $149 payware XML editor for Windows that supports validation and XSLT processing. Version 3.0 adds an incremental validating parser. A free 30-day evaluation version is available.

Thursday, May 18, 2000

Microsoft's posted an updated version of their MSXML XML parser/XSLT processor for Windows and Internet Explorer 5. This release supports SAX2 and most of XSLT/XPath 1.0. Newly supported constructs include:

  • xsl:import
  • xsl:number
  • Named attribute sets
  • The preceding axis
  • The preceding-sibling axis
  • The following axis
  • The following-sibling axis

Wednesday, May 17, 2000

A new working draft of the Common XML specification has been posted. Changes include requiring namespace declarations to be absolute URIs, and a warning that white space may be normalized or discarded.

Tuesday, May 16, 2000

IBM's alphaWorks has posted a release of the XML Lightweight Extractor (XLE) to add support for multiple simultaneous data sources, allow using SQL directly in the DTD mapping file, more efficient management of JDBC resources, better handling of null values, and assorted bug fixes. XLE builds XML documents conforming to a specially annotated DTD from other data sources like relational databases.

Monday, May 15, 2000

The W3C has released version 3.1 of their Amaya testbed Web browser. This release expands support for MathML and XHTML but still doesn't handle arbitrary XML. Amaya is available for X86 Linux, Sparc Solaris, AIX, OSF1, and Windows.


Freelancer has released version 0.76 of xmlBlaster, an open source message-oriented middleware server written in Java for exchanging messages between publishers and subscribers. The message is described with XML-encoded meta information. Messages may contain GIF images, Java objects, Python scripts, XML data, Word documents, plain text, or any other form of computer data. Communication with the server is based on CORBA (using JacORB 1.1), and subscribers can use XPath expressions to filter the messages they wish to receive. Other protocols like email or HTTP may be plugged in. Version 0.76 adds customizable log files, SSL support, a rainbow fish, and many bug fixes and API improvements.

Sunday, May 14, 2000

Robert C. Lyons has posted a beta of XML Convert 2.0, a Java application that converts flat files to XML and vice versa based on user-supplied schemas. The beta is free, but the release version will be approximately $300 payware.

Saturday, May 13, 2000

The XML Apache Project has released version 1.7.3 of the Cocoon XML Publishing Framework with many usability improvements, new examples, and bug fixes.


Henry S. Thompson has posted the first (as far as I know) online W3C XML Schema validator. This will validate document instances using schemas.

Friday, May 12, 2000

There's lots of news from the W3C today. First, the Document Object Model Level 2 has been promoted to Candidate Recommendation status.

The W3C has posted the first public working draft of the XForms Data Model. XForms are an XML Schema based replacement for and upgrade to standard HTML 4.0 forms.


The W3C XML Query Working Group has published the first public working draft of the XML Query Data Model, the first half of the XML Query Language. The second half, the XML Query Algebra, remains to be published. An instance of this model represents one or more complete XML documents or document parts and tells you what information in the documents is accessible. However, it does not specify the programming-language or API used to reach that data. From the draft:

The XML Query Data Model defines formally the information contained in the input to an XML Query processor; in other words, an XML Query processor evaluates a query on an instance of the XML Query Data Model. Our model is based on the XML Information Set, but it requires the following new features to meet the XML Query Working Group's requirements:

A new working draft of The Platform for Privacy Preferences 1.0 (P3P 1.0) Specification has been published as well. P3P is an XML application which Web sites can use to express their privacy practices in a standard format that browsers can download and interpret. P3P-enabled Web browsers will inform users of site policies and can make decisions based on those policies so that human users don't need to read the privacy policies at every site they visit.

The IETF/W3C XML Signature Working Group has published what they hope is the last public working draft of the XML-Signature Syntax and Processing Specification. This spec defines a way to calculate and represent digital signatures for XML and other documents, which are then represented in XML. This XMLized signature may be either internal or external to the documents it signs.

Thursday, May 11, 2000

Simon St.Laurent has updated the spec for XPDL, the XML Processing Description Language, to improve interoperability. XPDL describes the set of XML features used by different documents.


The proposed Internet Draft for XML MIME Types has been updated. Changes include:

  • Fragment identifiers for text/xml and application/xml are escaped XPointers
  • The base URI may be embedded in text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity.
  • MIME types like application/rdf-xml are now application/rdf|xml. That is, the hyphen was changed to a vertical bar.

IBM's alphaWorks has released the XML Interface for RPG for the AS/400. This library provides access to DOM Level 1 in RPG programs so RPG applications can create or parse XML documents.


Matt Sergeant's released version 0.60 of AxKit, the Apache XML Delivery Toolkit. AxKit is a suite of tools for the Apache httpd server running mod_perl, based on the Apache-Perl integration project, and Perl's XML processing tools, for delivering XML to clients. This release improves the caching code, updates the languages, adds the ability to invalidate the cache when external parsed entities change, adds a few other features, and fixes many bugs.


A new draft of the Common XML specification has been posted. This release adds the requirement that all namespaces be declared on the root element.


FileMaker Inc. has updated all versions of FileMaker 5 to 5.0v4 to fix the exposure of password protected fields through XML.

Wednesday, May 10, 2000

David Megginson's released the final version of SAX 2.0, the Simple API for XML. SAX 2.0 is a cross-language, cross-platform (though Java is best supported), event-based API for processing XML documents. To use SAX, you implement the ContentHandler interface and register your implementation with an XML parser like Xerces. As the parser reads the document, it calls back to methods in your ContentHandler such as startElement() and characters() to tell you what it sees in the document as it sees it.
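Here's roughly what that callback style looks like, sketched with Python's standard `xml.sax` binding rather than the Java original. The `ElementLogger` class and `log_events` function are hypothetical names of mine. Note that Python's plain `startElement` takes only a name and attributes; the namespace-aware variant there is `startElementNS`, which is closer to the four-argument Java method.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementLogger(ContentHandler):
    """Record each callback the parser makes, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # Skip chunks that are only white space to keep the log readable.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


def log_events(xml_text):
    """Parse xml_text and return the list of events the handler saw."""
    handler = ElementLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events("<doc><title>SAX</title></doc>"):
        print(event)
```

The parser drives the program here; your code never asks for the next node, it just reacts as each piece of the document goes by.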

SAX 2.0 is a major change from SAX 1.0. While SAX 1.0 classes and interfaces are still supported, almost all of them have been deprecated. The primary cause of this was namespace support which required changes to many method signatures. Changes from the last beta of SAX2 are relatively minor however.


The Apache XML Project has released version 1.0.4 of the Xerces-J XML parser for Java. The major new feature of this release is full support for the release version of SAX 2.0. (Most other parsers only support earlier betas at this point.) Schema support is improved in this release, but hardly complete yet. There are also lots of other bug fixes and performance enhancements.

Tuesday, May 9, 2000

Keith Moore of the University of Tennessee has written an important Internet Draft "On the use of HTTP as a Substrate for Other Protocols". Questions addressed include:

for a given application that is layered on top of HTTP:
  • Should the application use a different port than the HTTP default of 80?
  • Should the application use traditional HTTP methods (GET, POST, etc.) or should it define new methods?
  • Should the application use http: URLs or define its own prefix?
  • Should the application define its own MIME-types, or use something that already exists (like registering a new type of MIME-directory structure)?

This draft throws some cold water on current proposals for HTTP-based operations like SOAP and XML-RPC.

In a bit of synchronicity, the W3C has posted a Note on SOAP 1.1 submitted by Don Box, Microsoft, and IBM. This is essentially the same as the SOAP specification IBM published last month. However, it does represent a formal request for the W3C to take this under its wing (which it has not committed to do). SOAP does almost everything exactly wrong from the standpoint of Moore's recommendations.

Monday, May 8, 2000

I've posted the notes from the Intro to XML course I'm teaching today to the Washington D.C. Chapter of the ACM. This is the latest version of a course I've given several times before. However, as usual I've updated, revised, and expanded the notes.

Sunday, May 7, 2000

Matt Sergeant's released version 0.55 of AxKit, the Apache XML Delivery Toolkit. AxKit is a suite of tools for the Apache httpd server running mod_perl, based on the Apache-Perl integration project, and Perl's XML processing tools, for delivering XML to clients. This release adds an XSLT processor based on the Ginger Alliance's Sablotron XSLT processor, and various other patches and updates.

Saturday, May 6, 2000

Netscape's released version 4.73 of Navigator/Communicator for the usual list of platforms to fix a couple of nasty security holes. This release doesn't add any significant XML functionality.

Friday, May 5, 2000

Sean McGrath's posted a new version of RAX (Record API for XML) with new writing capabilities so RAX can be used as a "pull API" for read/write XML applications. Simply put, you can read records, modify them and write them out again as XML. RAX is implemented in Python.

Thursday, May 4, 2000

When looking at the TEI SGML DTD, I encountered this ELEMENT declaration:

<!ELEMENT %n.row;       - O  ((%n.cell; | %n.table;), 
                             (%m.Incl;)*)+                      >

I know this isn't legal XML, and I know it isn't supposed to be. What I was wondering was whether anybody could tell me in a paragraph or less what the "- O" between the element name and the content model is. I don't need to know every detail of the syntax, just what this feature is called and what it does. If you happen to know this, could you please drop a line to me at elharo@ibiblio.org? Thanks.


The Ginger Alliance has released version 0.33 of Sablotron, a GPL'd XSL Transformations Processor written in C++ for Windows NT and Linux.


Sebastian Rahtz has updated his PassiveTeX XSL Formatting-Object-to-TeX converter with a version that partially supports the March 2000 XSL FO working draft. The TEI XSL stylesheets at http://users.ox.ac.uk/~rahtz/tei/ have also been updated, in both their HTML and XSL FO incarnations.


The W3C has published a note on Describing and retrieving photos using RDF and HTTP. This describes a Java project as well as the Dublin Core based RDF schemas, data-entry program, means of serving the photos and the metadata over HTTP, and suggestions for search methods to retrieve photos based on their descriptions.

Wednesday, May 3, 2000

Sun's upgraded the Java API for XML Parsing (JAXP) reference implementation to version 1.0.1. This is a bug fix release to help with its use in applets and non-English locales.


Matt Sergeant's released AxKit, the Apache XML Delivery Toolkit. This is a suite of tools for the Apache httpd server running mod_perl, based on the Apache-Perl integration project, and Perl's XML processing tools, for delivering XML to clients. Style sheet languages supported include XPathScript, NotXSLT, and XSLT as implemented by Perl's XML::XSLT.

Tuesday, May 2, 2000

A nasty security hole in the FileMaker Pro 5 Web Companion lets surfers bypass password protection for individual fields using XML. This bug affects all versions of FileMaker Pro: regular, unlimited, developer, and server. I recommend you disable the FileMaker Web Companion or revert to FileMaker 4 until this bug is fixed.

Monday, May 1, 2000

Beta 3 of the payware XML Spy Integrated Development Environment for XML/XSLT/DTD/Schemas is now available. Beta 3 adds support for the April 7 W3C XML Schema working draft.

Sunday, April 30, 2000

Sun has submitted JSR 67, Java APIs for XML Messaging 1.0, to the Java Community Process. According to the proposal,

"This specification will describe Java API's designed specifically for the exchange of XML business documents such as, invoices, purchase orders, and order confirmations....This JSR does not aim to define either XML messaging standards or XML schemas for particular tasks. These networking and formatting standards belong in networking standards bodies such as Oasis or IETF. Instead this JSR aims to define standard Java APIs to allow convenient access from Java to emerging XML messaging standards, such as the emerging ebXML Transport/Packaging & Routing standard."

Review closes on May 26, 2000.


The W3C Voice Browser working group has posted the first public working draft of Reusable Dialog Requirements for Voice Markup Language. This document lists the data types (Address, PhoneNumber, integer, date, etc.) that an eventual VoiceML API is going to need to provide.


The W3C has also posted a working draft of Techniques For Accessibility Evaluation And Repair Tools. According to the abstract,

This document describes techniques that Web accessibility validation tools may use to evaluate the conformance of HTML documents to the Web Content Accessibility Guidelines 1.0 (WCAG 1.0). This document also describes techniques that Web authoring tools may use to help authors modify HTML documents to conform to WCAG 1.0. We anticipate that tool developers may develop accessibility validation and/or repair modules to be incorporated into commercial authoring tools, validation tools, and perhaps user agents.

Saturday, April 29, 2000

Version 1.1 of the SOAP specification has been published. SOAP defines an XML-over-HTTP protocol for communication between programs running on different systems. SOAP is programming language and platform neutral. IBM's alphaWorks has released a Java reference implementation of the SOAP v1.1 specification under the IBM Public License with full source. As well as HTTP, this release supports SOAP over SMTP (email).
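For a feel of what goes over the wire, here's a minimal sketch that builds a SOAP 1.1 envelope with Python's standard `xml.etree.ElementTree`. The envelope namespace is the one SOAP 1.1 defines; the `build_envelope` helper, the `urn:example:stock-quotes` namespace, and the method and parameter names are assumptions of mine for illustration. Actually sending the message (an HTTP POST carrying a SOAPAction header) is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 Envelope and Body elements.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_envelope(method, namespace, params):
    """Wrap a method call and its parameters in an Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical service namespace and method name.
    print(build_envelope("GetLastTradePrice",
                         "urn:example:stock-quotes",
                         {"symbol": "IBM"}))
```

The receiving side only has to parse this document and dispatch on the first child of Body, which is what makes the protocol language and platform neutral.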

Friday, April 28, 2000

IBM's alphaWorks has released a new version of the SVGView Scalable Vector Graphics viewer program which supports the March 3rd SVG specification. New features include an interactive DOM viewer and JPEG export.


The Unicode Consortium and the W3C have released a draft technical report on Unicode in XML and other Markup Languages. Comments should be submitted by July 1, 2000 for consideration.


Murata Makoto and Nanba Ryosuke have released DTD2RELAX, a DTD to Relax converter. Relax is yet another schema language for XML.


The Apache XML Delivery Toolkit (AXDTK) is a suite of tools for the Apache web server running mod_perl, that lets web servers apply multiple server side stylesheets in a variety of languages (not just XSLT) to a single XML document before sending it to a client browser.

Thursday, April 27, 2000

My fellow O'Reilly authors Brett McLaughlin and Jason Hunter are collaborating on JDOM, an alternative tree-based API for accessing XML documents from Java that is allegedly simpler and more natural to Java developers than SAX or DOM. From my initial peek at the API documentation, I suspect that may be true. However, I did note some problems with namespace support. Adapters are included for SAX and DOM parsers in general as well as Xerces in particular. Current version is 1.0b3. Java 2 (or at least Java 1.1 plus the Collections API) is required.


Meanwhile, on the Python side of the XML fence, Sean McGrath has published RAX: An XML Database API for Python. This API is designed to make parsing data-oriented XML documents much simpler than they would be when parsed with SAX or DOM. However, this API is relatively limited in terms of the documents it can handle.

Wednesday, April 26, 2000

Various XSLT implementors are developing two vendor/XML parser/XSLT processor-independent APIs that attempt to do for XSLT what SAX and DOM do for XML (that is, let code written to the common API run with any underlying processor). These are:

  • TRaX, TRansformations for Xml, a generic API for all forms of transformations including but not limited to XSLT
  • The Serialize API for serializing DOM trees and SAX events to a stream contributed by Assaf Arkin from ExOffice

Both are written in Java and depend on SAX2, though this may change. Nothing is written in stone yet. The participation of the broader XML community in helping to define these APIs is actively solicited.


IBM's launched a Japanese AlphaWorks.


IBM's alphaWorks has posted a new version of the XMI Toolkit, a Windows tool for converting designs and code between Java, UML, and Rational Rose. This release requires Java 1.2.2 and includes a framework adapter sample that uses XMI to serialize and deserialize Java objects.

Tuesday, April 25, 2000

Michael Kay has released version 5.3.1 of Saxon, his XSLT 1.0 processor written in Java. This version is faster and uses less memory than version 5.2. This release also adds extension functions for calling a template whose name is determined at run-time and disabling output-escaping in attribute values (e.g. for "&" in HTML URLs).


The W3C has posted the last call working draft of The Platform for Privacy Preferences 1.0 (P3P1.0) Specification. P3P defines a standard XML format in which Web sites can express their privacy practices. Web browsers can retrieve and interpret these policy statements automatically so that human users don't have to read the privacy policies of every site they visit. Last call ends April 30, 2000 (a little quickly in my opinion). Feedback is particularly solicited about the extension mechanism.

Sunday, April 23, 2000

The W3C has posted the first public working draft of APPEL, A P3P Preference Exchange Language. APPEL specifies a language for describing collections of P3P (Platform for Privacy Preferences) preferences and policies between P3P agents. Using this language, users can express their preferences in a set of preference-rules which the user agent refers to before deciding whether to accept the privacy policies of P3P enabled Web sites.

Saturday, April 22, 2000

IBM's alphaWorks has released a new version of the XML Lightweight Extractor, a tool for annotating a DTD and then extracting documents that adhere to the DTD directly from databases and other data sources. This release adds support for simultaneous multiple data sources and hooks for using SQL directly in the DTD mapping file, as well as fixing various bugs.

Friday, April 21, 2000

A new draft of Common XML has been posted. This draft cleans up the handling of some obscure namespace issues.

Thursday, April 20, 2000

Unicorn Enterprises has posted the first alpha of the Unicorn XSLT processor, an XSLT 1.0 processor written in C++ for Windows based on a non-validating parser that does not process external DTD subsets and external parsed entities. Furthermore, only UTF-8 and UTF-16 encodings are supported.


David Megginson posted XMLWriter 0.1 for Java, a SAX2 filter class that can both serialize its events to an XML document and pass them on for further processing. XMLWriter is public domain and requires the SAX2/Java prerelease.

Wednesday, April 19, 2000

David Megginson has posted the final, feature-frozen prerelease version of SAX2/Java. Bug reports are actively solicited. Barring discovery of major problems, the final version will be released on Friday 5 May 2000.


Mozilla Milestone 15 for Mac, Linux, and Windows is on the FTP site.

Tuesday, April 18, 2000

The call for proposals for SD2000 East, to be held in Washington D.C. from October 29th to November 2nd, 2000, is up. I chair the XML track for this conference and we're looking for proposals that cover the broad range of XML technologies.

The audience for this show is very developer heavy. Attendees tend to be programmers working in the trenches to implement solutions. The evaluation forms tell us they want meaty, technical talks that show by example how to use XML/Java/C++/etc. Thus in the XML track we're looking for talks that pick one aspect of XML (e.g. schemas, XLinks, DOM, etc.) and explain and demonstrate it in as much detail as is feasible in 90 minutes.

Our audience is not very patient with vague presentations that hype technologies without actually showing them in action. In particular, we really don't need more broad, overview introductions to XML and why it's cool. What we do need are presentations on specific topics like SOAP or Processing XML with Python that spend maybe 1/4 to 1/3 of the time explaining why that particular technology is useful, (not why XML in general is useful, assume the audience already knows that) and the rest of the time actually showing the audience how to use the technology you're presenting.

As well as XML, SD2000 East will offer tracks covering Linux Servers, C++ Programming, Component Based Development, COM, Enterprise Application Integration, Java, Methods & Modeling, Internet Application Development, and E-Commerce. There's also some interest in offering a few seminars about extreme programming and/or wireless applications. We haven't quite decided which track to put those in yet--it may well depend on exactly what the abstracts propose--but if you do submit abstracts in those areas, they'll be considered quite seriously. Aside from giving a few Java talks, I don't have direct involvement with these other tracks, but proposals are solicited for these as well. Deadline for abstracts is May 1, and they may be submitted online at http://www.sdexpo.com/speakers/abstracts.html.


IBM's alphaWorks has released a new version of VoiceXML for Windows 95/98/NT that supports most of the VoiceXML 1.0 specification, as well as persistent caching of network resources.


Hannes Wallnoefer has released version 1.0beta of his XML-RPC Java class library. This release fixes assorted bugs and supports the <nil/> element.
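The `<nil/>` element is an extension to XML-RPC for passing null values. As an illustration, and using Python's standard `xmlrpc.client` module rather than Hannes's Java library, this sketch shows a null argument being marshalled and read back. The wrapper functions `encode_call` and `decode_call` and the `setNote` method name are made up for the example.

```python
import xmlrpc.client


def encode_call(method, *args):
    """Marshal a method call; allow_none turns None into <nil/>."""
    return xmlrpc.client.dumps(args, methodname=method, allow_none=True)


def decode_call(payload):
    """Unmarshal a call into (method name, tuple of parameters)."""
    params, method = xmlrpc.client.loads(payload)
    return method, params


if __name__ == "__main__":
    payload = encode_call("setNote", 7, None)  # hypothetical method name
    print(payload)
    print(decode_call(payload))
```

Without the extension a library has to reject null arguments outright, since the base XML-RPC type list has no way to say "no value".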

Monday, April 17, 2000

I've updated the XML Books List, though I'm sure there are still a lot that are missing.


Kyle Downey has posted a preview release of OpenLib/J 0.1 with the proposed SAX/T interfaces. SAX/T is a proposed standard interface for XSLT processors. This is quite rough at the moment, but it seems like an idea whose time has come. There's also a discussion forum for this package, though personally I prefer to discuss such matters on xsl-list.


John Wilson has posted the first alpha of MinML, a MinXML parser written in Java for the Dallas Semiconductor TINI, a $50 embedded Java system. The parser is about 7K.

Sunday, April 16, 2000

Sjoerd Visscher expanded his JavaScript MinXML parser and renamed it JaSMin. This release fixes a bug, adds a MinXML pretty printer, and a textnodeParser that can expand character entity references or collapse white space. It's now 50 lines long.

Saturday, April 15, 2000

The W3C has published a note from the Japanese Standards Association about using Japanese characters in XML documents. Mostly, this discusses ambiguities that arise when converting documents encoded in legacy Japanese character sets to Unicode.


The W3C has published a CSS Level 3 roadmap. CSS3 will be divided into modules that can be worked on and perhaps implemented more or less independently. These include:

  • Syntax / grammar / etc
  • Selectors
  • Values, Units
  • Value assignment / cascade / inheritance
  • Box model / vertical layout
  • Positioning
  • Color / gamma / color profiles
  • Background
  • Line box model
  • Text, font / bidi / vertical alignment
  • Ruby
  • WebFonts
  • Generated content / markers
  • Replaced content
  • Paged media
  • User interface
  • Tables
  • Columns
  • SVG
  • Math
  • BECSS
  • ACSS
  • SMIL
  • Test Suite

David Megginson's split the org.xml.sax.ext handlers package containing LexicalHandler and DeclHandler from the main SAX2 distribution into SAX2-ext.


Late Night Software's released version 2.0 of XML Tools, an expat-based XML parser for AppleScript. Version 2.0 adds support for comments, processing instructions, namespaces, and external entities. MacOS 8.5 or later is required.

Friday, April 14, 2000

Bill la Forge has posted Quick 1.1.5 to fix a few bugs. Quick is an open source Java library for converting XML files into objects, using the classes of your choice.


Eve L. Maler has posted version 2.1 of the XMLspec DTD used for W3C XML-related specifications and associated documentation. This DTD is backwards compatible with version 2.0. Changes include:

  • A proto element for function prototypes like those in XPath.
  • Phrase-level elements for function, var, sub, sup, phrase, el, att, and attval.
  • Status and abstract can appear in the W3C-dictated order.
  • More current XLink usage.
  • A common diff attribute.
  • A div5 element.
  • Issue markup handles issues lists better.
  • An example element.

The XML Apache Group has released version 1.0.1 of the Xalan XSLT processor. Version 1.0.1 fixes about half a dozen bugs in version 1.0. This release fully supports XSLT 1.0, modulo any undiscovered bugs.


Shawn Silverman has written a single class MinXML parser in Java. The parser supports SAX 1.0, and is free for non-commercial use.


IBM's alphaWorks has released version 1.0.1 of the LotusXSL XSLT processor, based on Xalan 1.0.1 (or vice versa). This release supports XSLT 1.0 completely, modulo any undiscovered bugs.

Thursday, April 13, 2000

David Megginson has released version 0.4 of his XML::Writer Perl module. This release adds some extra functionality to make life easier for people creating data-oriented XML (i.e. no mixed content). In particular a new 'data mode' starts each element on a new line and indents it. There is also a new dataElement(name, data, atts...) method for producing elements that contain only character data content.


Common XML is a document describing what can be reliably counted on from non-validating parsers. In essence, it defines a subset of XML such that a document which conforms to this subset will present the same information set to all parsers.


Arbortext has released a public domain Java class library for resolving PUBLIC IDs in DTDs. These classes can be plugged into any SAX Parser and support the full semantics of OASIS Entity Management. The distribution includes examples of Catalog support with both Xerces and XT.
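The core idea of plugging a PUBLIC ID resolver into a SAX parser can be sketched in Python with the standard `xml.sax.handler.EntityResolver` hook. This is not Arbortext's code, and it ignores nearly everything in the OASIS catalog format (SYSTEM entries, delegation, and so on); the `CatalogResolver` class, the public identifier, and the file path are all hypothetical.

```python
from xml.sax.handler import EntityResolver


class CatalogResolver(EntityResolver):
    """Map PUBLIC identifiers to local copies, else keep the system ID."""

    def __init__(self, catalog):
        # catalog: dict of public identifier -> local system identifier
        self.catalog = dict(catalog)

    def resolveEntity(self, publicId, systemId):
        # Returning a string tells the parser which system ID to read.
        return self.catalog.get(publicId, systemId)


if __name__ == "__main__":
    resolver = CatalogResolver({
        "-//Example//DTD Memo 1.0//EN": "dtds/memo.dtd",  # made-up entry
    })
    print(resolver.resolveEntity("-//Example//DTD Memo 1.0//EN",
                                 "http://example.com/memo.dtd"))
```

A parser would pick this up through `setEntityResolver()`, after which documents naming a known PUBLIC ID load the local copy instead of going out to the network.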

It's hard to believe that on the very day Norm Walsh announced this on the xml-dev mailing list, I was in London telling an audience that PUBLIC ID support was purely theoretical at this point in time and having a long discussion with them about what such a resolver would have to look like. At least I warned them that I had been sight-seeing for the last few days instead of checking my email, so I couldn't say about anything that had been released in the previous week. I thought I was joking. Such is life on the bleeding edge.
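For readers wondering what "plugging into a SAX parser" amounts to, here is a minimal sketch of the general idea. It is not the Arbortext library and does not implement OASIS catalog semantics; the class name PublicIdResolver and its map() method are hypothetical, and the lookup table stands in for a real catalog file.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a catalog: maps PUBLIC IDs to replacement system IDs.
class PublicIdResolver implements EntityResolver {

    private final Map<String, String> catalog = new HashMap<>();

    // Register a replacement system ID (for instance a local file URL) for a PUBLIC ID.
    void map(String publicId, String systemId) {
        catalog.put(publicId, systemId);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String replacement = (publicId == null) ? null : catalog.get(publicId);
        if (replacement == null) {
            return null; // unknown PUBLIC ID: let the parser use the original system ID
        }
        InputSource source = new InputSource(replacement);
        source.setPublicId(publicId);
        return source;
    }
}
```

An instance would be installed with the standard SAX setEntityResolver() method before parsing, after which the parser consults it for each external entity, including the external DTD subset.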


The final release of Enhydra 3.0 is now available. Version 3.0 supports the Java Servlet API 2.2 and Java Server Pages 1.1. It also adds load balancing, a WML DTD, dynamic recompilation of XMLC, a more robust DODS, cookie-less sessions, enhanced multiserver administration, better JBuilder support, and many bug fixes.


The W3C Document Object Model (DOM) Working Group has updated the "Document Object Model Requirements" Working Draft to include the requirements for DOM Level 3.


From the "What goes around comes around" department, we note that Amazon is now being sued by Intouch for infringing Intouch's patents on allowing consumers to preview music over the Internet. One stupid patent lawsuit deserves another. Read more on news.com.


Sjoerd Visscher's written a MinXML parser in 28 lines of JavaScript. This is actually interesting. I can see a use for very small parsers for use in Web pages, even if they aren't significantly faster than full-XML parsers like xt.


CSIRO's posted a new version of their open source SVG Toolkit. This contains a viewer, an implementation of the SVG DOM, and a utility for converting an SVG document into various image formats. Java 1.2 or later is required. This release supports the March 3 working draft of Scalable Vector Graphics.

Wednesday, April 12, 2000

Today, Wednesday April 12, I'm giving a brief introductory seminar on XML for the Department of Medical Informatics at the Columbia-Presbyterian Medical Center in upper Manhattan from 4:00 to 5:00 P.M. (refreshments at 3:45). The seminar is free and open to the public. Directions are on the web site. Everyone is invited.


The Apache XML Group is starting a new project called Xang. Xang is designed to help developers quickly build data-driven, cross-platform Web applications that integrate disparate data sources. The Xang architecture cleanly separates data, logic and presentation. It is based on open industry standards such as HTTP, XML, XSL, DOM and ECMAScript (JavaScript). The current version is 0.0.1.


The Apache XML Project has released version 1.72 of the Cocoon XML Publishing Framework. This release fixes assorted bugs.


Swift, Inc. has released version 1.00 of the Athens XML editor for Windows that allows you to edit XML either by typing raw XML source or by filling in forms. Athens can also be configured as an embeddable ActiveX control. It requires IE 5.0 or later. Athens is $80 payware. A 30-day free-trial is available.


Sun's contributed the Java Project X parser to the Apache XML Project where it's been renamed Crimson. It's not part of Xerces yet, and exactly how much of it will be merged into Xerces remains to be determined. However, it can be browsed via CVS or the Web in the xml-contrib/crimson module.


Kevin A. Burton has released version 0.03 of Alexandria, a CVS/JavaDoc/Source code/Documentation management system meant for use within open source projects. The backend is implemented in XML driven by Ant, Xerces, and Xalan. Definition of source code is done within XML and then transformed through XSLT into various HTML files and a master build.xml file which is used to drive Ant. Ant attempts to download various CVS repositories and then builds Javadoc. Projects currently available from the system include:

  • Alexandria
  • Ant
  • Apache 1.3.x
  • Apache 2.0
  • Avalon
  • Cocoon
  • ECS
  • FOP
  • Jakarta Tools
  • James
  • JDK 1.1.8
  • JDK 1.2.2
  • Jetspeed
  • Stylebook
  • Tomcat
  • Turbine
  • Xalan
  • Xerces

Apache Jetspeed 1.0 is an open source Enterprise Information Portal (i.e. Web based groupware). Jetspeed attempts to consume information from multiple resources on the Internet and combine them. Information can come from multiple content types from XML to SMTP to newer protocols such as iCalendar.

Tuesday, April 11, 2000

Sebastian Rahtz has relaunched PassiveTeX, his TeX-based system for rendering XSL formatting objects to PDF. This version rewrites the internals but doesn't add a lot of new functionality.


Vivid Creations has posted the first betas of their ActiveSAX 2.0.0 and ActiveDOM 2.0.0 COM components. Version 2.0 adds support for the DOM level 2 core interfaces as well as SAX2 beta. These are each $149 payware.


Freelancer has released version 0.75 of xmlBlaster. xmlBlaster is an open source message-oriented middleware server written in Java that exchanges messages between publishers and subscribers. The message is described with XML-encoded meta information. Messages may contain GIF images, Java objects, Python scripts, XML data, Word documents, plain text, or any other form of computer data. Communication with the server is based on CORBA (using JacORB), and subscribers can use XPath expressions to filter the messages they wish to receive. Other protocols like email or HTTP may be plugged in.


Rick Jelliffe's released Schematron-message, an implementation of Schematron which generates error messages in the format expected by emacs, XED, etc. so you can use it as part of an edit-validate-check cycle.


Swift, Inc. has released the RELAX Verifier for Java, an open source Java package that validates XML documents against RELAX grammars. RELAX is yet another schema language for XML that is written in XML syntax.


IBM's alphaWorks has posted a minor update to Visual XML Tools to make the installer check for the IBM JDK 1.2.2 as well as the Sun JDK/JRE 1.2.2.

Monday, April 10, 2000

The Apache Jakarta Project Management Committee has formed a new sub-project called JAKARTA-TAGLIBS. The purpose of this sub-project is to be a repository of custom tag libraries, contributed to the project under the Apache Source License. These custom tag libraries can be utilized "as is" in your web applications, or you can contribute patches and other improvements to the libraries, or offer your own library as well.

People interested in developing custom tag libraries that are part of the sub-project should subscribe to the TAGLIBS-DEV mailing list by sending mail to TAGLIBS-DEV-SUBSCRIBE@JAKARTA.APACHE.ORG. Anonymous CVS can be used to check out the source code of this sub-project by executing the command

cvs -d :pserver:anoncvs@jakarta.apache.org:/home/cvspublic checkout jakarta-taglibs

The source code can also be browsed online at http://jakarta.apache.org/cvsweb/index.cgi/jakarta-taglibs.


Bill la Forge has posted Quick 1.1.4 to fix some bugs involving IDREFs and repeating selections and sequences. Quick is an open source Java library for converting XML files into objects, using the classes of your choice.

Sunday, April 9, 2000

Don Park has released Min 1.0A2, a Java Minimal XML parser. This parser parses a restricted subset of XML known as Minimal XML or SML. It supports SAX 1.0 and JAXP (Java API for XML Parsing). The distribution ZIP file includes a command line program for converting XML files into MinXML files.

Min is allegedly twice as fast as at least one full-blown XML parser, Sun's JAXP parser, though the time savings is only about 50%. In my opinion, this is not worth the non-standardness. I'd want to see at least an order of magnitude improvement to consider this worthwhile. Minimal XML leaves out a LOT including:

  • Attributes
  • CDATA Sections
  • Comments
  • Document Type Declarations
  • Empty-Element Tags
  • Entity References
  • Mixed Content
  • Predefined Entities
  • Processing Instructions
  • Prolog
  • XML Declaration

Basically all that's left are elements that contain other elements and elements that contain #PCDATA as well as the well-formedness rules. I don't think enough documents meet these constraints to make it worth a separate parser. I'd also like to see Min compared to other parsers, xp and Xerces in particular.


IBM's alphaWorks has released Voice XML for DirectTalk, a "prototype implementation of a VoiceXML rendering application that runs on the IBM DirectTalk family of voice response products" on AIX, Windows 95, 98, and NT.

Saturday, April 8, 2000

Lutris has posted the first alpha release of Enhydra Enterprise 4.0. This release combines

  • Enhydra MultiServer (EMS)
  • Enhydra Name Service (ENS)
  • Enhydra Web Container Service (EWS)
  • Enhydra JOnAS Service (EJS)

As an alpha, this release is not feature-complete and is "intended only for use by experimental service builders, for initial quality engineering work, and as a concrete example for architectural discussions."


The W3C has posted new working drafts of the XML Schema Specifications. Part 1, Structures, and Part 2, Datatypes, are now in Last Call, ending May 12, 2000. Part 0, the Primer, has also been updated, but isn't quite as far along.


Microsoft's posted a preview release of Internet Explorer 5.5 for Windows. New features include element behaviors, colored scrollbars, editable regions, HTML+TIME and zooming. CSS support is improved (though still incomplete) with border, padding, and margin properties on inline elements, the first-letter and first-line pseudo-selectors, and vertical text layout. Microsoft hasn't said anything about XSLT support, but I suspect it's at about the level of the MSXML 3 DLL released a few weeks ago; that is, incomplete support for the final XSLT 1.0 specification. As usual, Microsoft preview software should not be trusted. Run this only on machines dedicated to beta testing where you won't mind if you have to wipe the hard drive and reinstall everything from scratch.

Friday, April 7, 2000

There were a number of new and revised recommendations and working drafts from the W3C while I was away. These include:


IBM's alphaWorks has updated the XML Lightweight Extractor by adding support for multiple simultaneous data sources and hooks for using SQL directly in the DTD mapping file, in addition to bug fixes.

AlphaWorks has also posted a very minor update to Visual XML Tools.

Thursday, April 6, 2000

Sun's released version 1.0 of the Java API for XML Parsing (JAXP), a standard extension for Java 1.1 and later. This spec is mostly composed of three parts, two of which are obsolete. The first part is SAX1. The second part is DOM Level 1. The third part comprises classes for locating and instantiating parsers. SAX1 and DOM1 are both obsolete (for instance SAX1 doesn't have any explicit namespace support) so I recommend that most programmers using Java to process XML simply jump straight to SAX2 or DOM2 via Xerces-J, and ignore this release. In fact, Sun has already started a new Java Specification Request for the Java API for XML Parsing 1.1 that updates JAXP to SAX2 and DOM2. Why they couldn't wait and do it right the first time, I'll never know.

As usual the spec is only available in PostScript and PDF formats, not HTML. Scott McNealy did a good thing when he banned PowerPoint from Sun. Now if we can just get him to ban FrameMaker as well, maybe we can get some specs written in HTML that people can actually read.
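The third part of JAXP, the classes for locating and instantiating parsers, is easiest to see in a few lines of code. The sketch below uses the javax.xml.parsers API as it ships in current JDKs, together with the SAX2 DefaultHandler; that combination postdates the 1.0 release discussed above, so treat it as an illustration of the factory idea rather than of JAXP 1.0 itself. The class name JaxpElementCounter is made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class JaxpElementCounter {

    // Counts start-tags using whatever parser implementation the factory locates.
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();

        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><c>text</c></a>")); // prints 3
    }
}
```

The point of the factory is that no parser class is named anywhere in the code; swapping parsers is a deployment decision rather than a source change.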


Netscape's posted Netscape 6 Preview 1, a bleeding-edge pre-beta that supports direct display of XML+CSS in the browser. This is based on the Mozilla source code, and is available for Windows, Linux and the Mac. Lots of bugs have been reported up to and including crashing systems and deletion of old bookmark files, especially with the Macintosh version. Exercise extreme caution before installing. I don't recommend this for production systems, only for non-mission critical CPUs used for testing beta software.


I'm finally finished with three weeks of speaking at various venues around the Northern hemisphere, most recently in London. I'll be spending most of today catching up on the email and news that's piled up while I've been away.

Thursday, March 30, 2000

Version 2.28 of XML::Parser, the XML parsing module for Perl, is now available from your local CPAN mirror. This release fixes a lot of bugs and makes some modifications to the API.

Wednesday, March 29, 2000

Lutris has posted the second beta of Enhydra 3.0. Beta 2 adds new documentation including a "Getting Started Guide" as well as the usual bug fixes and speed-ups.


The W3C XSL Working Group has posted a new working draft of XSL formatting objects. XSL-FO is now in last call until April 30, 2000. Given the unstable and incomplete nature of the very few XSL-FO renderers that currently exist, I hope the W3C exhibits an unusual level of caution with this and does not set a definite date for the end of the Candidate Recommendation phase, but instead waits to see whether or not the full spec is actually implementable.


The W3C Math Working Group has published a new working draft of MathML 2.0. This draft uses the provisional UTC math allocations in Plane 1.

Monday, March 27, 2000

Microsoft's released Internet Explorer 5.0 for the Mac. This browser supports direct display of XML, and has much better support of CSS Level 1 than does IE5 for Windows. However, XSLT support is limited to the very old December, 1998 working draft that IE5 for Windows supports. IE5 Mac also supports XHTML without any compromises, the first browser to do so. MacOS 7.6.1 or later, a PowerMac, OpenTransport 1.2, twelve megabytes of disk space and QuickTime 3 or later are required.


FileMaker's launched the FileMaker XML Central web site discussing the XML support in the FileMaker 5 family of payware databases for Mac and Windows. It includes a white paper I wrote showing FileMaker developers why they might want to and how to use XML to serve web data. The full paper is unfortunately only available in PDF format.

Friday, March 24, 2000

The W3C XML Core Working Group has posted the first public Working Draft of XInclude. XInclude defines an <xinclude:include href="URI"/> element whose effect is to insert the referenced XML document into the document in place of the xinclude:include element after validation. (Whether to allow replacement to take place before validation is being discussed.) You can specify whether you want the included content to be parsed as XML or inserted as raw text with all illegal characters like < escaped as necessary.

Interestingly, this proposal also suggests that the correct interpretation of the xlink:show="embed" attribute of an XLink element is to graphically embed the content referenced in the XLink, but not to consider it part of the including document. The XLink specification itself is silent on exactly how xlink:show="embed" should be interpreted.

The intention seems to be that XML parsers like Xerces will implement XInclude, perhaps as a settable property. However, it should be straightforward to write a SAX filter that adds XInclude behavior in the meantime, or alternately a simple static method that takes a DOM Document object as an argument and returns a new Document in which all the referenced content has been included. For example,

public static Document mergeIncludes(Document doc) throws SAXException, IOException

I do something very much like this for my various presentation notes, though so far I've implemented it as an XSLT extension function rather than inside the XML processor.
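A rough sketch of what such a static method might look like with the JAXP DOM classes follows. Several things here are assumptions rather than anything the working draft dictates: the namespace URI is the one from the later XInclude Recommendation, only whole-document XML inclusion is handled (no raw text, no fragment identifiers), each href must be an absolute URI the parser can fetch, and included documents are not themselves scanned for further includes. The class name IncludeMerger is hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class IncludeMerger {

    // Assumption: namespace URI of the later XInclude Recommendation.
    static final String XINCLUDE_NS = "http://www.w3.org/2001/XInclude";

    static Document mergeIncludes(Document doc) throws SAXException, IOException {
        DocumentBuilder builder = newBuilder();

        // Copy the root element into a fresh Document so the argument is left untouched.
        // (Anything outside the root element, such as a DOCTYPE, is not copied.)
        Document result = builder.newDocument();
        result.appendChild(result.importNode(doc.getDocumentElement(), true));

        // getElementsByTagNameNS returns a live list, so snapshot it before editing the tree.
        NodeList live = result.getElementsByTagNameNS(XINCLUDE_NS, "include");
        List<Element> includes = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            includes.add((Element) live.item(i));
        }

        for (Element include : includes) {
            Document included = builder.parse(include.getAttribute("href"));
            Node imported = result.importNode(included.getDocumentElement(), true);
            include.getParentNode().replaceChild(imported, include);
        }
        return result;
    }

    private static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to recognize the include element
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No namespace-aware DOM parser available", e);
        }
    }
}
```

The input Document must have been built by a namespace-aware parser, or the include elements will not be found. A production version would also need loop detection and some policy for missing resources.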


IBM's alphaWorks has released LotusXSL 1.0.0. This XSLT processor is essentially a repackaged version of Xalan 1.0.0 (and vice versa).


Bill la Forge has posted Quick 1.1.1 to fix some bugs. Quick is an open source Java library for converting XML files into objects, using the classes of your choice.

Thursday, March 23, 2000

Opera 4.0 public beta 1 for Windows has been released. Most notable in this release is that Opera can now directly display XML pages formatted with CSS. The CSS support is incomplete however.


The Apache XML Project has released version 1.0.0 of the Xalan-J XSLT processor. This is possibly the first XSLT processor to implement the full XSLT 1.0 recommendation (modulo the inevitable bugs).


Morus Walter's released xmlparse 0.9.0, a BSD-licensed C++ validating XML parser library. It should compile under gcc on any reasonably Unix-like platform.


Sean McGrath's released HTML2PYX, a Pyxie extension for creating PYX from HTML documents, and PYX2XML, a Pyxie extension for creating XML from PYX documents.


Sebastian Rahtz has officially released passivetex, a TeX-based system for formatting XSL formatting objects.


Bill la Forge has released version 1.1 of Quick. Version 1.1 simplifies the API and adds the concept of unique documents that can be cached and reused rather than reparsed. Hyper-links are now supported. New element content and attribute types include base64, char, BigInteger, and url. FIXED attribute support has been added as well.

Tuesday, March 21, 2000

I'm going to be travelling a lot over the next few weeks. Right now I'm in San Jose for SDExpo 2000 West. Next week, I'll drive up the road to Santa Clara for the O'Reilly Enterprise Java Conference. Then I'll fly over the Atlantic to the Xephon XML in the Large Organization show in London. Consequently updates are likely to be a little sporadic here over the next few weeks. If you're sending in news, I'd really appreciate it if you could write up quick, no-fluff one-paragraph summaries in HTML including all relevant points and links. I'll have limited time and net access to distill press releases and web sites down to the bare facts.

I have posted the notes from yesterday's Processing XML with Java tutorial at SD2000 West. This full-day talk covered SAX, DOM, and various other technologies related to writing Java programs that read and write XML documents. I hope to eventually turn this material into a book.

On a related note, I'm going to have a weekend's worth of down time in the Bay Area next weekend between the SD and O'Reilly conferences. If anybody knows of anything interesting going on in the Bay Area that weekend, I'd love to hear about it. I was thinking it might be nice to go hiking or bird watching somewhere. Suggestions for locations that aren't too far off the beaten path are appreciated. Drop a line to me at elharo@ibiblio.org if you have any ideas.

Sunday, March 19, 2000

Oracle's released version 0.9.9.1 of Oracle XSQL Pages and the XSQL Servlet for connecting SQL databases to XML/XSLT-based web servers. As usual you'll have to register and remember your password to get in.

Saturday, March 18, 2000

Oracle's posted version 0.9.0.0 of their XML Schema Processor for Java on the Oracle Technology Network (registration required). This adds XML Schema support to Oracle's XML Parser for Java. A lot, but not all, of the current schema working draft is supported.


We just had a speaker drop out at the last minute for SD2000 West next week. If anyone here would like to talk on the subject of "Implementing E-Commerce into Enterprise Applications with XML" for 90 minutes on Tuesday in San Jose, please drop a line to Shabnam Malek at smalek@cmp.com or call her at 415-905-2711. Thanks!

Friday, March 17, 2000

FourThought LLC has posted version 0.8.3 of 4XPath and 4XSLT, Python implementations of XSLT/XPath. 4XPath implements almost all of XPath except for the lang() core function. 4XSLT is a little farther away from full conformance, but it's getting pretty close. xsl:number support is new in this release. Overall these releases improve performance and fix a lot of bugs.

FourThought has also released version 0.9.3 of 4DOM, a Python library for XML and HTML processing and manipulation using the W3C's Document Object Model interfaces. 4DOM implements DOM Core level 2, HTML level 2 and Level 2 Document Traversal. This release fixes assorted bugs.

Thursday, March 16, 2000

I've received email from a couple of Microsoft developers informing me that nested templates are in fact not allowed by the preview release of MSXML3.DLL as I erroneously claimed here yesterday, at least in the standard namespace. They may be allowed in the old namespace for compatibility with previous versions of IE5, but that's not such a big deal. The claim that they were supported was based on incorrect documentation on the Microsoft web site.

On the other hand, I still hold to my claim that Microsoft pays insufficient obeisance to standards, and that this is a cultural issue at Microsoft, even if it turns out not to be the problem in this particular case. For one thing, the Microsoft developers who communicated with me over this both seemed a little bemused by the strong customer demand for standard XSLT. They clearly felt this was an external requirement imposed on them from the outside rather than something that was important for its own sake and for Microsoft. For another, Bruce Schneier just pointed out to me an unrelated problem in Windows 2000 support for Kerberos that clearly demonstrates the same cultural imperatives at work.

Wednesday, March 15, 2000

Microsoft has posted a new preview release of MSXML3.DLL, their XML/XSLT processor used inside Internet Explorer 5 for Windows. This release gets closer to full XSLT 1.0 support, but isn't there yet. Missing features include:

  • The document() and key() functions
  • The xsl:decimal-format, xsl:strip-space, and xsl:preserve-space elements
  • The preceding, following, preceding-sibling, and following-sibling axes in XPath
  • The xsl:param, xsl:with-param, xsl:number, xsl:import, and xsl:apply-imports elements.

Furthermore, Microsoft has added some decidedly non-standard features such as nested templates so that style sheets written for IE5 may well break when used with other, fully standards-compliant XSLT processors. This seems like more of Microsoft's usual arrogance. The programmers working on this software had what they thought was a good idea, but simply didn't consider whether it was more important to be compliant with the standard. In this case it was even a standard they participated in developing. Microsoft has a cultural aversion to standards, and this is just one more example.

Tuesday, March 14, 2000

IBM's alphaWorks has updated their XML Parser for Java to version 3.0.1. This parser supports DOM1 and SAX1 as well as the latest drafts of SAX2 and DOM2. It's based on the Apache XML Project's Xerces 1.0.3.

Monday, March 13, 2000

David Megginson's released RDFFilter 1.0alpha, a Java- and SAX2(beta)-based package for processing RDF documents. RDFFilter uses a relatively simple callback-based interface, and doesn't build any in-memory trees, so it can be used with very large documents.

Sunday, March 12, 2000

The first beta version of Enhydra 3.0, an open source application server, has been released. This release adds:

  • The Java Servlet API 2.2
  • Java Server Pages 1.1
  • Load balancing
  • Dynamic recompilation of XMLC
  • More robust DODS
  • Cookie-less sessions
  • Better multiple server administration
  • Better JBuilder support
  • A DTD for WML
  • Many bug fixes

Simon Langford has released the first public version of Exemel, an XML browser written in Java.


Ken MacLeod's released SOAP-RPC 0.00, a Perl module implementing a minimal set of SOAP functionality. SOAP-RPC can marshal hashes (structures), arrays, and scalar values. SOAP-RPC follows SOAP convention in method calls: all methods accept and return named parameters (a single hash). Version 0.00 doesn't handle namespace processing correctly. SOAP-RPC also includes a command line program, soap-client.pl, that can send SOAP requests directly from the command line.


The Unicode Consortium has published Version 3.1 of Technical Report 6, A Standard Compression Scheme for Unicode (SCSU). This release adds a recommendation to use a signature sequence.

Saturday, March 11, 2000

The first alpha version of Apache 2.0 has been released. Version 2.0 will add multithreading, the Apache Portable Runtime (APR) library, an autoconf based build system, and support for multiple protocols.

Friday, March 10, 2000

Frank Boumphrey, the HTML Writers Guild, OASIS, and Project Gutenberg are beginning to mark up Project Gutenberg etexts in XML. About 100 books have already been marked up. Work is now commencing on a set of formal DTDs for Gutenberg and other e-texts. If you're interested in participating, you should join the hwg-gutenberg-dtds@hwg.org mailing list.


<XML>fr is a new XML portal in French.


A temporary mailing list has been set up to discuss a possible Apache project to create a library of useful XSLT transforms. If you're interested, you can subscribe by sending email to majordomo@ejbdevelopers.com with the words "subscribe xsllib" in the body of your message.


IBM's alphaWorks has released a new version of VoiceXML with support for some (not all) of the just released VoiceXML 1.0 specification including subdialogs, scripting, and more.


Bill la Forge has posted Quick 1.1 Alpha. Quick is an open source Java library for converting XML files into objects, using the classes of your choice. Version 1.1 simplifies the API, recognizes unique documents, breaks lines more attractively, and adds base64, char, BigInteger, and url types.

Thursday, March 9, 2000

IBM's alphaWorks has added a new graphical user interface to their XML Generator, fixed some bugs, and sped up the program overall. The XML Generator produces instances of valid XML documents given a DTD to start from.

AlphaWorks has also released a new version of Data Descriptors by Example with support for the February 2000 World Wide Web Consortium (W3C) XML Schema working draft and improved documentation. Data Descriptors by Example (DDbE) goes in the opposite direction of XML Generator. DDbE starts with instances of XML documents and deduces an appropriate DTD or schema from them.


Matt Sergeant has posted version 0.17 of the XML::XPath Perl module. This release adds a command line program for making XPath queries against a file. It's available from your local CPAN mirror.


Paul A. Tchistopolskii has released PXSLServlet 0.3, a Java servlet that produces XML from SQL queries and other sources, then transforms them to HTML with XSLT before sending them to the browser.


The VoiceXML Forum (Motorola et al.) has published the VoiceXML 1.0 specification in PDF format only. VoiceXML is used to describe those annoying call trees you get when calling most major companies. (Press 1 if you want to wait on hold for 20 minutes and then be hung up on; press 2 if you want to wait indefinitely; press 3 if you'd rather we just hung up on you now.)


Version 1.0.3 of the Apache XML Project's Xerces-J XML parser in Java has been released in source and binary form. Among other bugs fixed, this release uses the correct defaults for namespace handling.


IBM's alphaWorks has added AS/400 support to their XML for C++ parser.


FatBrain is offering $16 off any purchase of $64 or more. Use the discount code "sweet16". It's a good time to pick up the XML Bible, the Unicode 3.0 specification, or any other XML books you've been wanting.

Wednesday, March 8, 2000

The W3C has extended the candidate status of the Document Object Model (DOM) Level 2 through March 20, 2000 to solicit further feedback. Based on previous feedback, the internal subset string added in the previous draft has been dropped. In addition, a few minor changes were made to method names in the Range interface. A new draft that reflects these changes is available.


Oracle's posted version 2.0.2.7 of the Oracle XML Parser for Java which includes the Oracle XSLT Processor. This release fixes assorted bugs and adds an XSLT debug mode for error info on file/line/col and the oraxml command-line utility to check well-formedness and validity. Personally I recommend using the open source Xerces-J and Xalan instead, if for no other reason than to avoid yet another annoying registration form and username/password combination.


IBM's alphaWorks has posted version 0.20 of the LotusXSL XSLT processor. This is essentially a rebranded version of the Xalan 0.20 release I mentioned yesterday.

Tuesday, March 7, 2000

The Apache XML Project has posted version 0.20 of the Xalan XSLT processor written in Java. This is the first release candidate for Xalan 1.0.0. Xerces-J 1.0.2 is required. It is not compatible with Xerces 1.0.1. Java 1.1.x or later is required.


Don Box has written an XSLT style sheet that converts XML-RPC to SOAP. It's also available as an online form and CGI.

Monday, March 6, 2000

Oracle's released some XML Transviewer Java Beans that can connect directly to any JDBC compatible database to retrieve and store XML and XSL files. Registration is required.

Sunday, March 5, 2000

M. Thomas Carrasco Benitez has posted a new IETF Internet-Draft describing Xdossier. The draft proposes organizing files into a data object called an Xdossier. This is a directory structure that contains multiple copies of more or less the same document but stored in different formats for different access schemes; e.g. one formatted for direct access from the file system while another is prepared to be delivered from a web server.

Saturday, March 4, 2000

The W3C has posted a new working draft of the Scalable Vector Graphics 1.0 specification. This release splits SVG into two variants:

  • Stylable SVG with rendering properties assigned via a style sheet language
  • Exchange SVG with rendering properties assigned via XML attributes

Furthermore, the SVG Document Object Model (DOM) has been cleaned up a lot. There are lots of other minor changes in various areas.


IBM's alphaWorks has released version 3.10 of their XML for C++ parser. The big addition in this release is Unicode 3.0 compliance.

Friday, March 3, 2000

Jasc Software has posted a "development prototype" of Trajectory Pro, an SVG editor for Windows.

Thursday, March 2, 2000

Milestone 14 of Mozilla has been released for the usual list of platforms. Crypto isn't included yet, and Java is completely broken. An applet with an align tag will even crash the browser, as will simply installing the Java Plug-In with JRE 1.2. On the plus side, XML support including namespaces, CSS1, and simple XLinks is essentially done. Still this isn't a beta or ready to serve as your day-to-day browser yet.


The W3C working group has posted a new working draft of XSL Formatting Objects.


Ron Bourret's posted an XML Namespaces FAQ List.

Wednesday, March 1, 2000

The W3C has posted a new working draft of XML-Signature Syntax and Processing. Signatures use cryptographic algorithms to provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere. Last Call is March 27, 2000.

The W3C has also posted the first public working draft of Composite Capabilities/Preference Profiles: Requirements and Architecture, a proposal to develop a standard for specifying how client devices (e.g. Web browsers) express their capabilities and preferences to servers. This would allow, for example, a web server to send differently formatted pages to clients with black and white screens than it sends to clients with color screens.


Confluent Technologies has released Fluition 1.1, a visual SMIL editor for PowerMacs. Pricing isn't on their web site yet, but they do have a free-beer Non-Commercial Edition.


Apple has released version 4.5 of WebObjects, a payware application server inherited from NeXT that costs up to $50,000. This release integrates XML support using the IBM alphaWorks XML parser. It also adds support for LDAP and automatic creation of Java clients for accessing databases over the Internet.


I've updated the online versions of Chapters 14, 16, and 17 of the XML Bible:

Overall the changes are quite minor. The biggest is that Chapters 16 and 17 now use the correct namespace for the February 21, 2000 working draft of XLink.

Tuesday, February 29, 2000

DecisionSoft Ltd. has posted a new beta of X-Tract 1.1, a free-beer Windows and Linux tool for transforming XML documents using the non-standard XMLScript language.


David Brownell's posted a new release of his open source SAX2 utilities. This release adds

  • Some non-javadoc documentation, with the docbook 3.1.7 XML source included.
  • SAX2beta2 support
  • A sample XML/XHTML validation/checker tool
  • Command line exit status reflects failure if you configured a validating parse and there were errors.

Previously enabled features of this package include:

  • Six SAX2beta2 parsers and wrappers including a faster, more conformant AElfred (+ validation)
  • A pipeline framework for transformations via SAX2 events
  • Some pipeline stages including a validator, link grabber, message exchanger, namespace fixer, and xml writer

IBM's alphaWorks has updated their XML for C++ class library for Windows 98/NT and assorted Unixes. This release adds version 1.3.1 of the IBM Classes for Unicode, and is now compatible with XML4C 3.0.1.

Monday, February 28, 2000

Sean McGrath's Pyxie is an open source XML processing library and tool suite for Python. Pyxie is based on PYX, a line-oriented notation for the information emitted by an XML parser. This is a form of ESIS (Element Structure Information Set) from SGML.
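To give a feel for what a line-oriented notation looks like, here's a rough sketch that turns parser events into PYX-style lines using Python's standard xml.sax module. This is not Pyxie's own code. The line prefixes ("(" for a start-tag, "A" for an attribute, "-" for character data, ")" for an end-tag) and the backslash escaping follow the convention as I understand it; treat them, and the names PyxHandler and to_pyx, as assumptions made for illustration.

```python
import xml.sax


def escape(text):
    # Assumed escaping: keep every event on one line by escaping
    # backslashes, newlines, and tabs.
    return text.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")


class PyxHandler(xml.sax.ContentHandler):
    """Collects one PYX-style line per parser event."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attrs):
        self.lines.append("(" + name)
        for attr_name in attrs.getNames():
            self.lines.append("A" + attr_name + " " + escape(attrs.getValue(attr_name)))

    def endElement(self, name):
        self.lines.append(")" + name)

    def characters(self, content):
        self.lines.append("-" + escape(content))


def to_pyx(xml_text):
    # Parse the document and return the list of PYX-style lines.
    handler = PyxHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines


if __name__ == "__main__":
    print("\n".join(to_pyx('<a href="x">hi</a>')))
```

The appeal of a format like this is that each line stands alone, so ordinary line-based tools such as grep and sed can work on XML content without a parser of their own.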

Pyxie includes two command line utilities for generating PYX notation:

  • xmln: A stand-alone PYX generating utility built on top of James Clark's non-validating XML parser, expat.
  • xmlv: A stand-alone PYX generating utility built on top of Richard Tobin's validating XML parser, rxp.

Saturday, February 26, 2000

The Apache XML Project has released version 1.10b2 of Xerces-C, the XML parser for C++, in binary and source forms for Windows NT 4.0, Linux, and various Unixes.


The Apache XML Project is now publishing their bug list for Xerces, Cocoon, Xalan, and FOP via BugZilla.


The W3C has updated the XML schema working drafts Part 1: Structures and Part 2: Datatypes. There are very few syntax changes in these drafts. The biggest is probably that IDREF, IDREFS, ENTITY, ENTITIES, and NOTATION are now primitive types instead of subtypes of string. However, most of the changes focus on cleaning up the specification itself as opposed to the syntax and semantics the specification documents.

More importantly, for the first time there's a new "Part 0" introducing schema concepts in a more tutorial-like fashion. This should help clear up a lot of initial confusion about schemas that has been promulgated by the more technical Parts 1 and 2.


Sun has posted version 1.0 of their Java API for XML Parsing developed in Sun's closed, proprietary community process. They have not yet released an actual implementation of this specification. The spec is only available in PostScript and PDF format.

I remain opposed to this API on principle, and I'm happy to note that there are realistic alternatives like SAX, DOM, and Xerces that have a lot more traction in actual use. Some of these form part of Sun's proprietary API. What Sun is really trying to do with this API is hijack control of APIs for XML parsing in Java from more open processes. Sun is simply incapable of accepting a process for Java standardization that they don't control.

This isn't just a theoretical problem either. For example, here's a direct quote from the spec:

This specification subsumes the XML Namespaces 1.0 Recommendation in its entirety. This specification reserves the XML namespace prefixes beginning with java and javax (case insensitive) for future usage by the Java Platform.

This is completely wrong! The whole point of namespaces is that nobody reserves prefixes. You only get to reserve URLs which are mapped to prefixes at run time. However, Sun gets apoplectic at the mere thought of anybody else potentially using the name "Java" so they have to try to grab every possible use of the word they can imagine, even if they don't have a clue what they're going to do with it.
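Here's a small sketch of the point, using Python's standard xml.etree.ElementTree. The namespace URI and element name are made up for the example. Two documents pick different prefixes, one of them java, for the same URI, and a namespace-aware parser reports the same expanded name for both.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element name, invented for this example.
doc_a = '<java:widget xmlns:java="http://example.com/ns/widgets"/>'
doc_b = '<w:widget xmlns:w="http://example.com/ns/widgets"/>'


def expanded_name(xml_text):
    # ElementTree reports names as "{namespace-uri}local-name";
    # the prefix chosen in the document does not appear at all.
    return ET.fromstring(xml_text).tag


if __name__ == "__main__":
    print(expanded_name(doc_a))
    print(expanded_name(doc_b))
```

Both calls return the same string, so a prefix carries no identity of its own. Only the URI does.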

Leaving aside for the moment the questions of whether or not Java APIs for XML should be developed in open, democratic forums or in Sun's closed, proprietary process, there's another question of whether or not now is the right time to standardize a Java API for XML. I think it's way too early on the adoption curve to write anything in stone that will only have to be deprecated next week. Right now I'm working hard on my presentation about Java and XML at SD2000 West in a few weeks. In the last week, there were three separate releases of software and changed specs that necessitated rewriting my notes! And I expect that to continue for the next few weeks and beyond. This is very bleeding edge stuff. Sun's Java API for XML relies on SAX1 which isn't really namespace aware. SAX2 changes lots of things and deprecates a lot that Sun's writing into their API. Their support of DOM1 instead of DOM2 has similar problems. If Sun wants to add SAX and DOM to the core Java APIs, then they should at least wait until those APIs are stable.

Friday, February 25, 2000

David Megginson has released the second beta of SAX2 for Java, the Simple Java API for XML. Changes include

  • Java 1.1 compatibility

  • An XMLReaderFactory class, similar to the old ParserFactory in SAX1 (but with subtle improvements, especially in the exceptions raised)

  • Default constructors for ParserAdapter and XMLReaderAdapter, that use system properties to locate the embedded Parser/XMLReader

  • ParserAdapter can be reused even if a previous parse threw an exception

  • Assorted documentation fixes and clarifications


The W3C has posted two revised working drafts for SMIL, the XML application for multimedia:


The Apache XML Project has released version 0.19.5 of the Xalan XSLT processor. This release fixes assorted bugs that turned up in 0.19.4 after it was posted. Xalan is now compiled against Xerces-J 1.0.1 and probably doesn't work with the Xerces 1.0.2 release because of the major changes to the SAX2 API in that release. Of course Xerces 1.0.2 isn't yet up to date with the latest SAX2 beta. Thus we're now stuck trying to work with three different versions of SAX2. Yuck. The situation should clear up over time as SAX2 is finalized.


To further complicate matters, the Cocoon project has released version 1.7 of Cocoon, which uses Xalan 0.19.4 and FOP 0.12.1. This release fixes various bugs and improves memory management, installation, and caching, but presumably it inherits some of Xalan 0.19.4's bugs. It may work with Xalan 0.19.5 since that release didn't involve any API changes, but I haven't tested it yet.

Thursday, February 24, 2000

Version 0.73 of the open source xmlBlaster has been released. xmlBlaster is Message Oriented Middleware (MOM) that exchanges messages in XML format between publishers and subscribers. Clients talk to the server using CORBA, HTTP, or email and can use XPath expressions to select the messages they want to receive.

Wednesday, February 23, 2000

Would you believe that after I spent four hours yesterday finding a GUI CVS client for my NT box, installing it, discovering that it was too confusing to use, deinstalling it, finding a command line client, installing it, discovering that it depended on CYGWIN, finding CYGWIN, downloading it, installing it, downloading the Xerces source tree, trying to build it, discovering that the Borland make tool wouldn't handle the Xerces Makefile, discovering that the version of CYGWIN I had installed didn't include make, deinstalling CYGWIN, downloading and installing the developer's version of CYGWIN that did include make, and finally making the entire Java source tree, all so I could work with SAX2 beta instead of alpha--that after all this, the Apache XML Project went ahead and released the Xerces 1.0.2 binary build with SAX2 and DOM Level 2 support about 4:00 P.M. last night just as my build was finishing? Sometimes life just isn't fair.

The Apache XML Project has also released version 0.19.4 of the Xalan XSLT processor. This release fixes assorted bugs and is now compiled against Xerces-J 1.0.1. It probably doesn't work with the Xerces 1.0.2 release because of the major changes to the SAX2 API in that release.


Lutris has released the first alpha version of the Enhydra 3.0 open source application server. This release supports version 2.2 of the Java Servlet API and version 1.1 of Java Server Pages and adds load balancing, dynamic recompilation of XMLC, cookie-less sessions, better multiserver administration, better support for JBuilder, and many bug fixes.



Nils Klarlund, Anders Moller, and Michael I. Schwartzbach have been working on the Document Structure Description (DSD) language, yet another XML schema language. They claim it's simpler than the W3C XML schemas but also more expressive and "guarantees linear time processing in the size of the application document". An open source implementation is available for Windows and Unix if you care to evaluate these claims for yourself.


Donald Ball has released XMLForm, an HTTP servlet that outputs XML documents in response to HTTP post data, typically coming from HTML forms. It does so by interpreting the names of the HTTP post parameters according to an XPath-like syntax. The name is broken apart by / characters and each part is taken to be the name of a node. The first part is the root node, and every part after that is the child of the preceding part's node. All names must begin with the same root node, at least for now.
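The name-splitting rule is easy to sketch. What follows is not XMLForm's code (XMLForm is a Java servlet); it's a minimal Python illustration of the rule as described above. Storing the parameter value as the text of the last node, and reusing an existing child when two names share a path, are my assumptions, as is the function name params_to_xml.

```python
import xml.etree.ElementTree as ET


def params_to_xml(params):
    """Build an element tree from a dict of slash-separated names to values."""
    root = None
    for name, value in params.items():
        parts = name.split("/")
        if root is None:
            root = ET.Element(parts[0])
        elif parts[0] != root.tag:
            # All names must begin with the same root node.
            raise ValueError("all names must share the root " + root.tag)
        node = root
        for part in parts[1:]:
            child = node.find(part)
            if child is None:
                # Each later part becomes a child of the preceding part's node.
                child = ET.SubElement(node, part)
            node = child
        node.text = value  # assumption: the value becomes the node's text
    return root


if __name__ == "__main__":
    form = {"order/customer/name": "Ann", "order/item": "pen"}
    print(ET.tostring(params_to_xml(form), encoding="unicode"))
```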

Tuesday, February 22, 2000

Michel Rodriguez has released XML::Twig 1.9, a Perl XML module for processing XML documents too large to load into memory at one time by breaking them up into sub-trees that can be processed sequentially. It's available from your local CPAN mirror.
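The same general idea, handling one sub-tree at a time and then throwing it away, can be sketched in Python with the standard library's iterparse. This is only an analogy, not XML::Twig's API. The function name process_subtrees and the handle callback are names I made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def process_subtrees(source, tag, handle):
    """Call handle() on each completed <tag> sub-tree, then discard its content."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            handle(elem)
            # Drop the sub-tree's children and text so memory use stays
            # roughly flat. The emptied element itself remains attached
            # to its parent.
            elem.clear()
            count += 1
    return count


if __name__ == "__main__":
    data = io.BytesIO(b"<log><entry>a</entry><entry>b</entry></log>")
    process_subtrees(data, "entry", lambda e: print(e.text))
```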


Version 3.0 of the Unicode Specification is now available in printed book form. This is a major upgrade to Unicode that adds thousands of new characters in many languages. The most significant new character set supported is Ethiopic. You can order a copy from amazon.com or any other book store that carries computer books. Various information is available on the Unicode web site, but due to the poor support for Unicode in most existing software, this is still one case where a printed book is preferable to electronic documentation.


Monday, February 21, 2000

The W3C's XML Linking Working Group has released a new working draft of the XLink specification. This draft changes the namespace again, from http://www.w3.org/1999/xlink/namespace/ to http://www.w3.org/1999/xlink. Otherwise the syntax seems unchanged. XLink is now in last call with a last call date of March 20, 2000.
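A namespace change matters more than it might seem, because namespace-aware software matches on the URI, not the xlink prefix. Here's a minimal Python sketch, using the standard xml.etree.ElementTree, that looks for simple links in the new namespace. The function name simple_links and the sample documents are mine, made up for illustration.

```python
import xml.etree.ElementTree as ET

# The namespace URI given in the February 21, 2000 working draft.
XLINK_NS = "http://www.w3.org/1999/xlink"


def simple_links(xml_text):
    """Return the xlink:href of every element whose xlink:type is 'simple'."""
    links = []
    for elem in ET.fromstring(xml_text).iter():
        if elem.get("{%s}type" % XLINK_NS) == "simple":
            links.append(elem.get("{%s}href" % XLINK_NS))
    return links


if __name__ == "__main__":
    doc = ('<doc xmlns:xlink="http://www.w3.org/1999/xlink">'
           '<a xlink:type="simple" xlink:href="http://www.w3.org/"/></doc>')
    print(simple_links(doc))
```

A document still using the older http://www.w3.org/1999/xlink/namespace/ URI produces no matches with this code, even though its markup looks identical apart from the xmlns:xlink declaration.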

The same group has also released a new working draft of XBase, a proposal to add an xml:base attribute to XML with the same functionality as the BASE element in HTML. Last call on this is also March 20, 2000.


The PHP Group has released beta 4 of PHP4. PHP4 incorporates the Zend engine for more speed and includes built-in XML support and better session tracking. And of course many bugs are fixed and probably a few new ones introduced. PHP4 is open source and available in source form as well as a pre-compiled binary for Windows.

Sunday, February 20, 2000

Object Tools has designed a DTD for Eiffel source code.

Saturday, February 19, 2000

I've posted the notes from yesterday's XML Basics talk I gave at the Bell Atlantic E-Business symposium. This was mostly a highly condensed version of talks I've given elsewhere.

Thursday, February 17, 2000

The W3C has posted a new working draft of User Interface for CSS3. CSS1 and CSS2 provide several user interface related pseudo-classes, properties and values like :link and cursor. This working draft proposes to extend these to allow elements to be styled based on their user interface related states, and to let an arbitrary element look like a standard user interface widget. For instance, you could make an XML PO_BOX element look like an INPUT type=text form element (normally a text field) or a STATES element look like a SELECT menu (normally a popup menu). It also suggests a CSS mechanism for turning on full-screen kiosk mode.


Fabio Arciniegas has founded a mailing list for discussing XML development in C++. Topics include

  • SAX-C++ SAX2-C++
  • Questions about C++ parsers, toolkits, tools, etc.
  • Integration and portability issues
  • Weekly admin update on XML-C++ products and techniques

To subscribe, send email to subscribe-xml-cpp@lists.viaduct.com from the account at which you wish to receive mail.

Wednesday, February 16, 2000

The W3C has posted an XHTML Road Map, outlining where XHTML is going in the medium-term future. Highlights include gradually breaking compatibility with existing browsers and adding support for user agent profiling so that servers can tell which modules of XHTML a given browser/renderer supports.

Tuesday, February 15, 2000

I'm working on my notes for the upcoming XML and Java tutorial at SDExpo 2000 West on March 20 in San Jose. This will be a full day tutorial covering writing XML from Java programs, reading XML documents with SAX and DOM, connecting XML documents to databases, and so forth. I'd love to hear from anyone who's got questions about this material. I'm trying to find out what people would want to see in a presentation of this nature. In particular, I'm looking for examples of the "How do I write a program to do X?" category. Even if you can't attend the conference, I will post the notes here when I'm done for everyone to learn from, and I may eventually turn this into a book. Please email me suggestions at elharo@ibiblio.org. Thanks!


MathML is now available in Mozilla for Windows in a special build of Milestone 13. Look for mozilla-win32-M13-MathML.exe.


Ken Macleod has released libxml-perl 0.06, a collection of Perl modules, scripts, and documents for processing XML. You can download it from the usual CPAN mirrors.

Monday, February 14, 2000

Michael Kay's released version 5.2 of his SAXON XSLT processor. This release uses the same extension function calling conventions as XT and Xalan. New extension functions in this release include:

  • saxon:distinct() removes nodes with the same values from a nodeset
  • saxon:evaluate() returns a string built from an XPath expression constructed at runtime
  • saxon:range() creates a nodeset containing a list of numbers, allowing xsl:for-each to be used as a traditional for-loop.
  • saxon:tokenize() converts a string into a nodeset containing one node for each token value in the string

This release also adds support for the xml-stylesheet processing instruction, allows programmers to read the input document or style sheet from a DOM tree, and fixes assorted bugs. Complete details are on the changes page.

Sunday, February 13, 2000

The W3C has posted several new working drafts.

First off, the W3C HTML Working Group has posted a last call working draft of XHTML Basic, a list of modules from XHTML 1.1 (modularized XHTML) that form a small subset of HTML intended for memory- and display-challenged devices like cell phones, PDAs, set top boxes, and so forth. Last call is March 15, 2000.

Next, the P3P Working Group has released a new last call draft of the Platform for Privacy Preferences 1.0 (P3P1.0) Specification. Most of the changes are syntactic rather than structural. This replaces the previous last call draft and extends the Last Call period to April 30, 2000.

Finally, there's a new working draft of Mathematical Markup Language (MathML) Version 2.0. Changes are mostly clean-ups to the specification rather than major changes to the functionality.

Saturday, February 12, 2000

Adobe's released some free beer SVG plug-ins and converters. The plug-ins allow Netscape or Internet Explorer 4.x and later to display SVG pictures. The converters allow Adobe Illustrator 8.0.1 to import or export SVG files. Both are available for Macintosh and Windows but not Unix.

Friday, February 11, 2000

Bill la Forge has released Quick 1.0. Quick builds on SAX, providing a simple API for transforming XML documents into Java objects. Quick aims to minimize the coupling between XML markup languages and an application's classes, relying mostly on an XML binding schema to effect the transformation at run time. Quick is distributed under a modified BSD license.

Thursday, February 10, 2000

Digital Applications has released a technology preview of SVG2PDF, a tool for converting Scalable Vector Graphics (SVG) documents into Acrobat PDF files. It runs on Mac and Windows.


I've updated the XML Conferences page.


Extensibility has released version 1.1e of its $99.95 payware XML Authority schema editor. This release supports the latest working draft of XML-Schema, adds an importer for COBOL copybooks, and introduces MetaSchema, a tool for further annotating schemas. Upgrades are free for registered users. A trial version is available.


Geert Josten has posted version 0.19 of XML::XSLT, a Perl module supporting XSL Transformations. It can be retrieved from the usual CPAN sites.

Wednesday, February 9, 2000

The W3C has posted a new working draft for XML-Signature Syntax and Processing. XML Signatures provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere.

The W3C has also posted a new DOM (Document Object Model) requirements working draft. This updates the document for DOM Level 2.

Tuesday, February 8, 2000

I've posted the slides from my XML Extensible Markup Language talk at the Summit 2000 conference that started yesterday in Mississippi. My talk was a basic introduction to XML for a mixed audience of technical and non-technical people.


IBM's alphaWorks has posted a revised version of their XML Parser for C++ (XML4C) to fix bugs with Unix threading, user-exception handling, and Japanese backslashes. XML4C is a validating XML parser written in portable C++ that supports DOM Level 1 and SAX 1.0.

Monday, February 7, 2000

Samuel R. Blackburn has posted release 43 of his freeware Win32 Foundation Classes, a C++ class library for processing XML. This release removes dependencies on the Microsoft Foundation Classes. The XML parts should be able to be ported to any STL-compliant platform. It includes a stand-alone, non-validating parser.

Friday, February 4, 2000

IBM's alphaWorks has released version 1.1 of the Xeena XML Editor. Version 1.1 adds assorted new features including URL support and access to document attributes.

Thursday, February 3, 2000

David Brownell's updated his open source XML Utilities to support the SAX2 beta APIs. These include an enhanced version and a validating version of the AElfred XML parser, a DOM Parser that converts DOM to SAX2 calls, and SAX2 adapters for the Swing HTML parser, Oracle's latest XML parser and Sun's XML parser.


LinuxWorld was fun. I'll probably go back tomorrow and bring my digital camera to take some pictures. There wasn't a lot of XML. Enhydra was there, as usual failing to explain exactly what it was their product did or why I would want one. (I asked. The guy at the booth literally could not tell me, and admitted as much.) At least it's open source, unlike all the other application server vendors who want to charge me $100,000 or more for a product they can't explain. Am I just dense or is there really "no there, there" in the application server market? BlueStone sends me at least three press releases a week, but still hasn't managed to explain what it is they actually do.

The one XML topic people were talking about was XHTML. The W3C has royally screwed the pooch in marketing this. Everybody had heard of it, but nobody knew what it actually was. Opinions were equally divided between XHTML being a replacement for XML and being a new version of HTML with new tags and features. Nobody understood what it really is, a simple reformulation of standard HTML 4.0 as well-formed XML.

Monday, January 31, 2000

The W3C XML Query Working Group has released the first working draft of XML Query Requirements. This document specifies "goals, usage scenarios, and requirements for the W3C XML Query data model, algebra, and query language." The objective is to "produce a data model for XML documents, a set of query operators on that data model, and a query language based on these query operators. The data model will be based on the W3C XML Information Set, and will include support for Namespaces."


This Wednesday, February 2, I will be signing copies of the XML Bible and generally kibitzing with anyone who shows up in the IDG Books booth at LinuxWorld. We'll have plenty of copies of the XML Bible for sale, and I'll be happy to sign any copies of my books (or for that matter anyone else's :-) ) that you care to bring to the show.

I'm not scheduled to speak at LinuxWorld, but I do have a number of upcoming speaking engagements. Over the next several months you'll have the opportunity to hear me talk in:

Check the conference web sites for exact registration details. I'll also be in Washington D.C. in mid-May to give some day-long tutorials under the auspices of the local chapter of the ACM. However, the exact date hasn't yet been fixed. I'll post it here when I know it. If you'd like to meet me one-on-one to demo a product or interview me for a local newsletter or just to talk about XML, drop me an email and we'll see what we can set up. If you run a conference, a user group, or a training facility, and would like me to speak to your organization, then drop me an email as well. I look forward to meeting some of the Cafe con Leche readers over the next few months!

Sunday, January 30, 2000

Version 1.4 of X-Tract, a non-standard, non-XSLT based tool for transforming XML documents, has been released. X-Tract uses the non-standard XML Script 1.1 syntax to describe transformations. X-Tract is closed source but free for non-commercial use.


IBM's alphaWorks has released version 1.1 of Data Descriptors by Example (DDbE), a Java class library for guessing a DTD or schema from a set of well-formed XML instances. Version 1.1 adds a Java application for translating DTDs to XML Schemas and the ability to generate external DTDs.

Saturday, January 29, 2000

The W3C has posted a candidate recommendation for User Agent Accessibility Guidelines 1.0. The candidate recommendation phase is a new W3C step for testing implementations. However, since the Candidate Recommendation phase ends in less than a month on 18 February 2000 I doubt any significant testing can be done.

The W3C has also posted two new working drafts. Techniques for User Agent Accessibility Guidelines 1.0 provides techniques for implementing the checkpoints defined in "User Agent Accessibility Guidelines 1.0". SMIL Animation describes a possible animation vocabulary for XML documents that is based upon the SMIL 1.0 timing model. Last call for SMIL Animation is February 27, 2000.

Friday, January 28, 2000

David Megginson has posted the first beta of SAX2/Java. SAX is the Simple API for XML, a very widely implemented event-based interface for processing XML documents. SAX2 adds:

  • Namespace support.
  • Configurability and extensibility through features and properties.
  • A new interface and base class for SAX filters.
  • Adapters for using SAX1 parsers with SAX2 and vice-versa.
  • Lots of JavaDoc documentation.

SAX2 is explicitly placed in the public domain.
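To illustrate the first two items in that list, namespace support and configuration through features, here's a minimal sketch using Python's standard xml.sax module, which is modeled on SAX2. It is not the Java API itself. The class name NameCollector and the function name collect_names are made up for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Records the (namespace URI, local name) pair of every element."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With namespace processing on, name is a (uri, localname) tuple.
        self.names.append(name)


def collect_names(xml_bytes):
    parser = xml.sax.make_parser()
    # Features are switched on and off by identifier; this one turns on
    # namespace processing so startElementNS is called.
    parser.setFeature(feature_namespaces, True)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


if __name__ == "__main__":
    print(collect_names(b'<p:a xmlns:p="urn:x"><p:b/></p:a>'))
```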


FourThought LLC has released version 0.9.2 of 4DOM, a Document Object Model (DOM) Library for Python. 4DOM implements W3C DOM Core level 2, HTML level 2 and Level 2 Document Traversal. This release fixes assorted bugs, particularly with regard to namespaces.


FourThought LLC has also released version 0.8.2 of 4XSLT and 4XPath, XSLT and XPath implementations in Python. These support most, though not quite all, of the latest W3C XPath and XSLT specifications. This release mostly fixes assorted bugs.

Thursday, January 27, 2000

The Mozilla project has posted their first official alpha release of Mozilla. This is Milestone 13, and is now available in binary and source form for MacOS 8.5, Windows, and assorted Unixes. Mozilla has pretty good XML+CSS support, as well as using XML internally for a lot of configuration information. The XML support is built on top of James Clark's expat XML parser.


Microsoft has released a beta of the next version of their MSXML parser. This is a pre-release parser that fixes a lot of bugs in the old XML parser and adds support for some of the XSLT 1.0 Recommendation. However it's still missing a lot, including xsl:transform, xsl:import, xsl:strip-space, xsl:preserve-space, xsl:apply-imports, xsl:call-template, xsl:namespace-alias and xsl:number. Furthermore, xsl:output is almost totally broken. xsl:copy and xsl:element do not support the use-attribute-sets attribute. xsl:sort does not support the lang, data-type or case-order attributes. And extension elements seem to be seriously non-conformant. Furthermore the XPath support (used inside select and match attributes) omits the following, following-sibling, preceding and preceding-sibling axes. In short this release handles the basic templates, but falls apart in a lot of the trickier, special case scenarios. While it's an improvement over the XSLT support built-in to IE5, it's still missing way too much to be considered for real work.

Microsoft has also released several supporting tools including Microsoft XSL ISAPI Extension 1.1 for server-side XSL in Internet Information Server, an XSLT Style Sheet for XML Schemas, and a converter for turning old IE5 XSL stylesheets into new W3C XSLT stylesheets.

Wednesday, January 26, 2000

Tetra Six's Majix Light 1.1 is a free (free beer, not free speech) Java program for converting Word generated RTF files to XML.


The W3C has posted the official, final XHTML 1.0 recommendation. XHTML is a reformulation of HTML as well-formed and possibly valid XML. This uses the single http://www.w3.org/1999/xhtml namespace.


Geert Josten and Egon Willighagen have released XML::XSLT 0.17, a Perl module for performing XSL Transformations. XSLT support is quite incomplete in this version, but perhaps useful for simpler tasks.

Tuesday, January 25, 2000

Bowerbird Consulting's XDBM 1.0 is an open source embedded XML database for Linux. It stores pre-parsed XML. It's not a standalone program. You link it with your own code. It's available under the GPL as well as a custom license.

Monday, January 24, 2000

The Qub Group has released version 0.2 of PXSLServlet, a Java servlet for using XSLT to transform SQL result sets.


IBM's alphaWorks has updated their XML Security Suite to include an ASN.1 to XML Translator.

Saturday, January 22, 2000

I've updated Chapter 16, XLinks, and Chapter 17, XPointers, of the XML Bible to reflect the latest XLink working draft. The main change in the XLink chapter was removing all references to the element based syntax that was eliminated in the last draft. I also added information about the new semantics of xlink:show and xlink:actuate. The XPointer chapter didn't change very much. Mostly I just cleaned up some detritus that had been removed in earlier working drafts but that I had missed in previous revisions, like the vertical bar | as an alternative to # in URIs.

Friday, January 21, 2000

Icon Information-Systems has released the second beta of XML Spy 3.0, yet another tree-based XML editor. (My opinion of such products should be well known enough that I won't repeat it here.) Version 3.0 adds editing based on the DTD, database views, improved file-type support, XSL transformations, and DTD and schema editing. Most beta testers will not be given free copies of the shipping software.

Thursday, January 20, 2000

The W3C has posted the last call working draft of Canonical XML Version 1.0. Unless major problems are found by February 22, this will become a new standard. Canonical XML describes how to tell whether two documents are "the same" once excess white space has been stripped, default attributes have been inserted, entities replaced and so forth.
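As a rough illustration of the idea, here's a Python sketch that compares two documents by canonicalizing both. One caveat: the standard library's xml.etree.ElementTree.canonicalize (Python 3.8 and later) implements the later C14N 2.0 specification, not the 1.0 draft discussed here, so take it as a demonstration of the concept rather than of this draft's exact rules. The function name same_document is mine.

```python
import xml.etree.ElementTree as ET


def same_document(xml_a, xml_b):
    """True if both documents serialize to the same canonical form."""
    return ET.canonicalize(xml_a) == ET.canonicalize(xml_b)


if __name__ == "__main__":
    # Attribute order, quote style, white space inside tags, and
    # empty-element syntax all differ between these two documents.
    first = '<a b="1" c="2"/>'
    second = "<a   c='2'  b='1'></a>"
    print(same_document(first, second))
```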

The W3C has also released a new working draft specification for the XML Linking Language (XLink). The major change is that the xlink elements (xlink:simple, xlink:extended, xlink:arc, etc.) have been removed. All linking is now asserted by attributes like xlink:type and xlink:href. The xlink:type attribute now has the added possible value none. The xlink:show and xlink:actuate attributes are now allowed to have the value undefined. In addition, XLink processors are allowed to recognize other values for these attributes to indicate application specific behavior. Also new are title type elements (as opposed to title attributes) so that titles are no longer limited to one string of text without any markup. (Actually this was in the last working draft too, but I managed to miss it then.) Finally, external linksets and linkbases and the inline methods of linking to them are more clearly described.

Wednesday, January 19, 2000

Jun Fujisawa has ported the AElfred XML parser to the Palm Pilot using Sun's KVM Java virtual machine (required).

Tuesday, January 18, 2000

The Apache XML Project has released version 1.6 of the open source Cocoon XML Publishing Framework. New features include: support for Apache Xerces-j and Sun ProjectX for XML parsing, support for Apache Xalan and James Clark's XT for XSLT processing, support for Apache FOP for PDF generation, complete support for XSP compiled server pages, documentation generated from XML using Apache Stylebook, an LDAP processor for direct LDAP -> XML generation, and many more examples. Cocoon is written in pure Java and runs on top of a Java servlet engine such as Jakarta Tomcat.


Version 0.2 of the XML Media Types Internet Draft has been posted. Most significantly, this draft adds text/xml-external-parsed-entity and application/xml-external-parsed-entity media types.

Monday, January 17, 2000

Xalan 0.19.2, the open source XSLT processor written in Java, has been released with assorted bug fixes.

Sunday, January 16, 2000

The W3C has updated the errata to the XML 1.0 specification.

Saturday, January 15, 2000

IBM's alphaWorks has released XML Master which was probably named simply to justify the abbreviation "XMas". XML Master is a tool for designing and generating custom JavaBeans for working with a particular XML document.

Friday, January 14, 2000

IBM's alphaWorks has released version 2 of their XSL Editor. This release features syntax highlighting, better installation across different platforms, and the ability to run a transform without tracing.

Thursday, January 13, 2000

Dan Connolly's launched the xml-dist-app mailing list for discussing XML protocols for distributed applications such as XML-RPC and SOAP. To subscribe, send email to xml-dist-app-request@w3.org with the Subject "subscribe".

Wednesday, January 12, 2000

The W3C has published a new working draft of XSL Formatting Objects, though apparently it's now called simply "XSL". (That's going to be confusing and contentious.) This is the first draft in quite some time. It cleans up a lot and fills in a lot of holes. However, there's still a huge amount of work to be done. This document is already over a megabyte long, not including pictures.


Here's one tidbit I missed in the Christmas rush: Milestone 12 of Mozilla is now available for Windows, MacOS 8.5, i386 Linux, OpenVMS and FreeBSD. The next milestone is due in about a week. This release includes James Clark's expat XML parser. It can render XML documents with CSS Level 1 style sheets and supports simple XLinks and HTML namespaces.

Tuesday, January 11, 2000

I have updated the online versions of Chapter 16, XLinks, and Chapter 17, XPointers of the XML Bible to reflect the December 1999 working drafts of these specifications. These are major rewrites.

The XPointer specification hasn't changed so much from its previous working draft but I did use this opportunity to clean up and substantially improve my presentation of the technology.

The XLink specification did change a lot from the previous working draft. xlink:document and xlink:group were removed. Local resources are now enclosed in an xlink:resource element. And out-of-line links finally make sense. Thus the changes to Chapter 16 were quite major. Anyone who's working with or thinking about working with XLinks should definitely read the new version. Overall I think Chapter 16 is a lot clearer than it used to be, partially reflecting some of my editing but mostly a result of a much improved XLink specification from the W3C that finally began to explain how extended and out-of-line links are meant to be used.

Monday, January 10, 2000

Lunatech Research has released beta 4 of their Morphon XML editor. Morphon is DTD aware with some support for XSL. Morphon is written in Java and should run on any Java 1.1/Swing compatible platform. This beta expires on March 31, 2000.

Morphon is perhaps a little less focused on the tree structure of the XML document than some of its competitors. However, it's still way too tree focused for a typical user. Personally, I think that almost all the XML editors out there are completely missing the boat. Writers don't think in trees. A good XML editor (of which I know exactly zero) would not show the tree to the author at all. Instead it would have a standard word-processorish user interface with named styles that the user could modify.

I think programmers are really wasting a lot of time, money, and effort producing one tree structured editor after another. They're modeling their user interfaces to match the underlying data structure when in fact these should be completely decoupled. The user interface should reflect the tasks the user needs to accomplish, not the underlying file format. For instance, the user doesn't give a damn about adding a STRONG child element to a PARAGRAPH element. What the user wants to do is mark a run of text as strongly emphasized, and the user interface should reflect that. Actually, even this is a stretch since it relies on semantic styles rather than visual formatting, but I do think that's a jump we can make.

This is not to say the interface shouldn't be reasonably WYSIWYG. It should be. When the user designates a block of text as strongly emphasized, it should be made bold on the screen. Until XML editors provide a real interface a writer can use, they offer nothing compared to a simple text editor like BBEdit or emacs. In fact, they're a lot harder and less convenient to use than a simple text editor.

Sunday, January 9, 2000

Paul Miller has released version 0.6 of XMLIO, an XML input/output library for C++ applications. XMLIO is a simple, nestable, streaming, XML parser for C++ application data. Version 0.6 adds expat-style push handlers for data, CDATA, and comments to XML::Input. This allows you to decide at each subelement whether to pull the data inside your element handler or to have it pushed to you at a different handler. XMLIO is free under the MIT X11 license.
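To make "expat-style push handlers" concrete, here is a minimal sketch in Python rather than C++. It does not use XMLIO or its XML::Input API at all; it relies on Python's standard xml.parsers.expat module, which wraps the same expat parser. The function name collect_events and the event tuples are my own assumptions for illustration. The parser pushes character data, CDATA section boundaries, and comments to whichever callbacks have been registered:

```python
import xml.parsers.expat


def collect_events(xml_text):
    """Parse xml_text and return the events expat pushed to our handlers.

    Hypothetical helper for illustration; not part of XMLIO.
    """
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Ask expat to coalesce adjacent character data into larger chunks.
    parser.buffer_text = True

    # Each handler is a plain callback; expat calls ("pushes to") it
    # as soon as the corresponding construct is recognized.
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda data: events.append(("data", data))
    parser.StartCdataSectionHandler = lambda: events.append(("cdata-start",))
    parser.EndCdataSectionHandler = lambda: events.append(("cdata-end",))
    parser.CommentHandler = lambda text: events.append(("comment", text))

    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events


if __name__ == "__main__":
    for event in collect_events("<a>hi<!--note--><![CDATA[x<y]]></a>"):
        print(event)
```

In a pull design the element handler would instead ask the parser for the next piece of content itself. The push style shown here hands control to the parser, which is the trade-off the release notes describe choosing per subelement.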

Saturday, January 8, 2000

The W3C has updated several working drafts including


Oracle has released version 0.9.8.6 of XSQL Pages and the XSQL Servlet. XSQL Pages are server-side XML templates for building web pages that load data from Oracle databases. Version 0.9.8.6 adds more documentation, demos, APIs, support for stored procedures, parameterized XSLT style sheets, and more.

Friday, January 7, 2000

ICEsoft has released the first demo of ICE Browser 5.01, a pure Java web browser component which supports HTML, XML, CSS, DOM, JavaScript, MathML, and so forth. Version 5.01 adds PDF rendering, XML namespace support, and fixes assorted bugs. ICE Browser 5 is $1500 per product you use it in, and is not license compatible with open source projects. The demo expires February 1, 2000.

Thursday, January 6, 2000

The Apache XML Project has posted version 1.0.1 of the Xerces-J open source XML Parser in Java. This release incorporates some "major new code contributions" by Assaf Arkin at ExOffice, as well as fixing assorted bugs.


Quark has posted a pre-release of avenue.quark, an XTension for saving Quark XPress documents to XML. No word yet on what this will cost once released.

Wednesday, January 5, 2000

Daniel Potter's posted a Java applet that parses regular expressions and matches strings as specified by the current XML Schema working draft.

IBM's alphaWorks has updated their XML Security Suite for Windows and Linux to include an element-wise encryption library and a new version of the XML-Signature implementation based on the latest W3C XML-Signature working draft.

Tuesday, January 4, 2000

FourThought LLC has released version 0.9.1 of 4DOM, a Document Object Model (DOM) Library for Python. 4DOM implements W3C DOM Core level 2, HTML level 2 and Level 2 Document Traversal. This release fixes assorted bugs.


FourThought LLC has also released version 0.8.1 of 4XSLT and 4XPath, XSLT and XPath implementations in Python. These support most, though not quite all, of the latest W3C XPath and XSLT specifications. This is also a bug fix release.

Monday, January 3, 2000

IBM's alphaWorks has released a new version of Data Descriptors by Example (DDbE). This release supports Windows style file names with spaces, as well as the ability to utilize XML example files which raise exceptions during parsing. DDbE is a Java class library for generating a DTD from well-formed XML documents.
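For a rough idea of what generating a DTD from a well-formed example involves, here is a minimal sketch in Python. It is not DDbE's algorithm or API (DDbE is a Java library and I'm not reproducing its internals). The function name infer_dtd is hypothetical, and the sketch assumes a single example document with no namespaces. It declares every observed child as optional and repeatable, every attribute as CDATA #IMPLIED, and any element seen with neither text nor children as EMPTY:

```python
import xml.etree.ElementTree as ET


def infer_dtd(xml_text):
    """Return DTD declarations guessed from one well-formed example.

    Hypothetical helper for illustration; not DDbE's actual method.
    """
    root = ET.fromstring(xml_text)
    children = {}    # element name -> child element names, first-seen order
    attributes = {}  # element name -> attribute names, first-seen order
    has_text = {}    # element name -> True if any instance holds non-blank text

    for elem in root.iter():
        kids = children.setdefault(elem.tag, [])
        attrs = attributes.setdefault(elem.tag, [])
        text_seen = bool(elem.text and elem.text.strip())
        for child in elem:
            if child.tag not in kids:
                kids.append(child.tag)
            # Text following a child element is part of the parent's content.
            if child.tail and child.tail.strip():
                text_seen = True
        for name in elem.attrib:
            if name not in attrs:
                attrs.append(name)
        has_text[elem.tag] = has_text.get(elem.tag, False) or text_seen

    lines = []
    for tag, kids in children.items():
        if has_text[tag]:
            # Mixed content: text plus any observed children, in any order.
            model = "(#PCDATA" + "".join("|" + k for k in kids) + ")*"
        elif kids:
            # Element content: a deliberately loose "any of these, repeated".
            model = "(" + "|".join(kids) + ")*"
        else:
            model = "EMPTY"
        lines.append("<!ELEMENT %s %s>" % (tag, model))
        for name in attributes[tag]:
            lines.append("<!ATTLIST %s %s CDATA #IMPLIED>" % (tag, name))
    return "\n".join(lines)


if __name__ == "__main__":
    print(infer_dtd('<doc id="1"><title>Hi</title><p>text <b>bold</b> more</p><br/></doc>'))
```

The content models are intentionally permissive. A single example can't tell you whether a child is required or in what order children must appear, so a real tool would need more examples or some hand editing of the result.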

Sunday, January 2, 2000

Paul Miller has released version 0.5 of XMLIO, an XML input/output library for C++ applications. XMLIO is a simple, nestable, streaming, XML parser for C++ application data. Version 0.5 is completely rewritten in C, with the C++ API layered over that (but using exceptions, references, and in the XML:: namespace). Unicode is not yet supported, however, so much work remains to be done. The underlying parser is James Clark's expat. An ANSI C++ compiler with namespaces, exceptions, and the standard library is required. XMLIO is free under the MIT X11 license.



Copyright 2000 Elliotte Rusty Harold
elharo@ibiblio.org
Last Modified December 6, 2000