The Free Software Foundation is calling for the W3C to make their patent policy GPL compatible, which it apparently still is not because it allows patent holders to limit their royalty-free licenses to spec implementations only. Comments to the W3C are due by midnight today, but in practice the W3C often listens to good comments past the official date, so don't hesitate to contribute just because you're not reading this until the New Year.
Eddie Robertsson has written an open source implementation of Schematron 1.5 in Java based on JAXP's TrAX. The distribution also supports embedded Schematron in RELAX NG schemas, using James Clark's Jing package.
Morphon has released version 1.3.3 of the Morphon CSS-Editor, a $39 payware Cascading Stylesheet editor written in Java that supports CSS1, CSS2, SVG, and the CSS Mobile Profile.
Morphon has also posted the third beta of the Morphon XML Editor 3.0. This editor provides WYSIWYG, source, and tree views, supports Unicode 3.0, and can print preview and spell check XML documents. The Morphon XML Editor is $150 payware.
I'll be visiting family for the Christmas week. I should have Internet access, but updates will probably be a little slow here until next weekend. Merry Christmas, Joyeux Noël, Happy Hanukkah, a Festive Kwanzaa, Season's Greetings, and Happy New Year to you all.
As part of my continuing end-of-the-year InBox purging (Only 1400 messages to go!), yesterday I went through all the mail announcing conferences and updated the XML Conferences page. So far, I know that next year I'll be speaking at OOP 2003 in Munich in January, XML & Web Services 2003 in London in March, Software Development 2003 West in Santa Clara in March, and Software Development 2003 East in Boston in September.
Benoît Marchal's HC (short for Handler Compiler) "takes some drudgery out of event-based XML parsing by automatically generating the SAX ContentHandler for a list of XPaths."
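To give a sense of the drudgery a generator like HC is meant to remove, here is a minimal hand-written sketch of a SAX handler that collects the text of elements at one fixed path. It uses Python's standard xml.sax module rather than HC or Java, and the class name PathHandler, the function titles, and the path /catalog/book/title are all hypothetical names chosen for illustration.

```python
import xml.sax


class PathHandler(xml.sax.ContentHandler):
    """Collects the text of every element whose path equals `target`."""

    def __init__(self, target):
        super().__init__()
        self.target = target   # hypothetical path, e.g. "/catalog/book/title"
        self.stack = []        # names of the currently open elements
        self.buffer = None     # text chunks gathered while inside a match
        self.matches = []

    def _path(self):
        return "/" + "/".join(self.stack)

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self._path() == self.target:
            self.buffer = []

    def characters(self, content):
        # A parser may deliver one text node in several chunks.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if self.buffer is not None and self._path() == self.target:
            self.matches.append("".join(self.buffer))
            self.buffer = None
        self.stack.pop()


def titles(xml_bytes):
    """Return the text of each /catalog/book/title element, in document order."""
    handler = PathHandler("/catalog/book/title")
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches
```

Every additional path needs more of the same stack bookkeeping, which is presumably the part a tool that generates the handler from a list of XPaths would write for you.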
Lucid'i.t. has released the Lucid XML Toolkit 1.0. After peeling away several layers of press-release hype, this appears to just be another schema validating XML parser written in Java that supports the usual batch of API acronyms: SAX1, SAX2, DOM Level 1 and 2, JAXP, and so forth. So far only the free-beer personal edition is available.
Adaptinet has released XML-Serializer, an XML-To-Java data binding tool that supports DTDs and XML schemas (but apparently not simple, well-formed XML). XML-Serializer costs $49.95.
Roger L. Costello and David B. Jacobs have begun work on the collaborative development of a distributed, decentralized Web service registry. They seem to be thoroughly annoying the ebXML community, which wants to have a centralized monopoly on web services registries. "The purpose of this effort is to develop a concrete, implementable architecture for a highly distributed registry. The notion is that each Web service defines their own registry - comprised of the collection of documents that describes the service."
As part of my continuing end-of-the-year InBox purging (Only 1500 messages to go!), yesterday I went through all the mail announcing conferences and updated the Java Conferences page. So far, I know that next year I'll be speaking at OOP 2003 in Munich in January, XML & Web Services 2003 in London in March, Software Development 2003 West in Santa Clara in March, and Software Development 2003 East in Boston in September.
The W3C XML Linking Working Group has posted a new working draft of the XPointer xpointer() Scheme. This draft cleans up the specification a lot, but does not appear to make any significant changes to the language syntax or semantics.
The working group's charter expires at the end of the year, and there seems to be little will in the W3C for continuing with this. Since XPointer hasn't even gotten to last call working draft yet, it seems unlikely that this will ever be finished. This may be a good thing. The full syntax just seems too ugly and verbose to achieve widespread adoption. Even within the W3C, there's significant dissent. Outside the W3C, reactions to XPointer range from open hostility to apathy, but nobody seems to actually like it.
The W3C XHTML working group has published a new working draft of XHTML 2.0, just one week after the last working draft. According to the abstract, "It was released very soon after the second public Working Draft because of production errors, and does not reflect any major changes from that draft." XHTML 2.0 is the next, backwards incompatible version of HTML that incorporates XFrames, XForms, and lots of other crunchy XML goodness. However, XLink is not yet included and may never be. (The HTML Working Group are extreme XLink skeptics.)
The W3C Math Working Group has published the first working draft of the second edition of the MathML 2.0 specification. Like the second edition of XML 1.0, this focuses on incorporating errata and cleaning up the language of the spec, rather than on introducing new features or changing existing ones. Among other improvements, this draft now includes a schema for MathML 2.0 as well as a DTD. The MathML 2.0 described here should be completely identical to the one described by the first edition.
The W3C Quality Assurance (QA) Activity has posted the first public working draft of the QA Framework: Test Guidelines. "This document defines a set of common guidelines for conformance test materials for W3C specifications."
The W3C Web Services Internationalization Task Force has published the first public working draft of Web Services Internationalization Usage Scenarios. This describes various issues that arise when using SOAP services in multi-language environments. For example, is it possible to send error messages in both English and Japanese?
And speaking of Japanese, the Center for Global Communications, International University of Japan (GLOCOM), Infoteria Corporation, and Media Fusion Co., Ltd have submitted a note to the W3C describing ongoing work within Japan on Embedding Glyph Identifiers in XML Documents. This describes a means to specify exactly which glyph should be used for a single Unicode character, something which is apparently a lot more significant in ideographic languages than alphabetic ones. Glyph selection would use a special glyph:name attribute that references a glyph by its position in the ISO/IEC 10036:1996, Information Technology -- Font information interchange -- Procedures for registration of font-related identifiers standard. This note will not be put on the W3C standards track. Instead, it's being moved through the Japanese Standards Association (JSA).
The W3C XML Protocol Working Group has released the candidate recommendations of the three SOAP 1.2 specifications:
The namespace URIs are now http://www.w3.org/2002/12/soap-envelope and http://www.w3.org/2002/12/soap-encoding. Otherwise, it's not obvious to me what the changes are in this draft. Since I think there are multiple fairly deep and fundamental flaws in SOAP, and they don't seem likely to be fixed, I haven't been following this work too closely. Comments are due by January 24.
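As a rough illustration of where the envelope namespace URI shows up, the sketch below builds an empty SOAP envelope with Python's standard ElementTree module. The Envelope and Body element names are the usual SOAP ones; the env prefix, the function name build_envelope, and the urn:example:quotes payload are assumptions made up for this example, not anything taken from the candidate recommendations.

```python
import xml.etree.ElementTree as ET

# Envelope namespace URI quoted in the item above.
SOAP_ENV = "http://www.w3.org/2002/12/soap-envelope"


def build_envelope(payload):
    """Wrap a payload element in env:Envelope/env:Body and serialize it."""
    ET.register_namespace("env", SOAP_ENV)  # prefix choice is arbitrary
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    body.append(payload)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical application payload in a made-up namespace.
request = ET.Element("{urn:example:quotes}getQuote")
message = build_envelope(request)
```

A receiver that still expects an older envelope namespace would not recognize this message as the same vocabulary, which is why a change of URI matters even when little else does.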
Robert C. Lyons has released the Turing Machine Markup Language (TMML). This is an XML application for describing Turing machines. He also wrote a TMML interpreter in XSLT that executes the Turing machine described in the TMML source document. This is another proof by construction that XSLT 1.0 is Turing-complete.
John Cowan's posted TagSoup 0.8, a "parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML."
The W3C XML Core Working Group has posted the candidate recommendation of Namespaces in XML 1.1. There don't appear to be any functional changes since the last call working draft. Changes since 1.0 are:
- xmlns is by definition bound to the namespace name http://www.w3.org/2000/xmlns/.
Mark Hale's released version 0.881 of JSci, a Java class library containing many useful mathematical and scientific functions such as complex arithmetic. The major new feature in this release is that all the periodic table classes have been replaced by XML documents (packaged in the jar file). "Element objects can be instantiated using the new factory method PeriodicTable.getElement(). This makes the data easier to update and more accessible for processing. As a consequence, a periodic table reference in HTML is now included with the documentation - generated by applying an xsl stylesheet to the xml files."
The Mozilla Project has posted the first alpha of Mozilla 1.3. This release focuses on mail. Message Views is a new feature "which will help users locate, organize and prioritize their mail messages. A View is used to filter and display only those messages matching a given set of criteria. Mozilla ships with a set of pre-defined Views, but users can also create their own."
Mail filter actions have also been improved, as has compatibility with some mail servers. The biggest new feature is junk mail classification so you can train your client to distinguish between good mail and junk mail. I've recently installed SpamAssassin for this purpose and it seems to be catching about 60% of the spam I receive, including almost all of the most egregious examples. So far I haven't seen any false positives, and I'm going to try turning down the number of hits required to identify e-mail as spam.
Finally, Mac OS 9 support has been removed from this release. Third parties may release their own OS 9 builds.
IBM's alphaWorks has released version 3.3.1 of their Web Services Toolkit. The "basic software components needed to create a Web services environment are provided with Web Services Tool Kit. Included is an architectural blueprint (Web Services Architecture), sample programs, Utility services, and some tools that are helpful in developing and deploying Web services. Extensive documentation is included to assist developers with the basic concepts of Web services. The tool kit also includes a fully-functioning Web services client API that can be used to directly access a UDDI registry." New features in 3.3.1 include WS-Policy and WSRP. Java 1.3 or later is required.
Opera Software has posted the second public beta of Opera 7.0 for Windows, a $39 payware web browser that supports XML and CSS. Beta 2 adds some user interface improvements including fast forward access to the pages you will want to visit next, a one-click log-in password manager, a links panel that displays all links in the current page, one-click skin install, and multiple user style sheets.
The World Wide Web Consortium (W3C) User Agent Accessibility Guidelines Working Group (UAWG) has released the final recommendation of User Agent Accessibility Guidelines 1.0. The abstract states, "This document provides guidelines for designing user agents that lower barriers to Web accessibility for people with disabilities (visual, hearing, physical, cognitive, and neurological). User agents include HTML browsers and other types of software that retrieve and render Web content. A user agent that conforms to these guidelines will promote accessibility through its own user interface and through other internal facilities, including its ability to communicate with other technologies (especially assistive technologies). Furthermore, all users, not just users with disabilities, should find conforming user agents to be more usable."
The UAWG has also published a complementary note covering Techniques for User Agent Accessibility Guidelines 1.0. "These techniques address key aspects of the accessibility of user interfaces, content rendering, application programming interfaces (APIs), and languages such as the Hypertext Markup Language (HTML), Cascading Style Sheets (CSS) and the Synchronized Multimedia Integration Language (SMIL)."
The W3C Device Independence Working Group has posted the first public Working Draft of Delivery Context Overview for Device Independence. "Delivery context information is typically used to provide an appropriate format, styling or other aspect of some web content that will make it suitable for the capabilities of a presentation device. The selection or adaptation required to achieve this may be performed by an origin server, by an intermediary in the delivery path, or by a user agent."
Andrzej Jan Taramina has posted GPSml, a "standard, comprehensive and functional markup language that can express and encode the full gamut of data generated by GPS (Global Position System) devices, including real time position information and collections of points (waypoints, routes, tracks, etc.)." The latest version of his GPL'd Java GPS Access Library generates GPSml output.
Kohsuke Kawaguchi has released JARV, a vendor-neutral, implementation-independent, and schema-language-independent Java API for XML validation. With the right validation engines installed, it can support RELAX NG, TREX, the W3C XML Schema Language, and DTDs. JARV is open source under an MIT license.
Alex Chaffee has updated his XPath Explorer to add an Expand/Collapse All Nodes menu item and an Open Location menu item. This is an open source, graphical tool for displaying XML documents as trees and evaluating XPath expressions with respect to those documents. This release is also faster when opening large files and uses a progress bar when loading or expanding so it shouldn't appear hung just because an operation takes a while.
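For readers who have not tried a tool like this, the following sketch shows the basic idea of evaluating an expression against a parsed document. It uses Python's ElementTree, which supports only a limited XPath subset, and is not how XPath Explorer itself is implemented. The sample document and the evaluate function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, loosely modeled on a periodic table file.
SAMPLE = """<periodic_table>
  <atom><name>Hydrogen</name><symbol>H</symbol></atom>
  <atom><name>Helium</name><symbol>He</symbol></atom>
</periodic_table>"""


def evaluate(xml_text, expression):
    """Return the text of every element selected by `expression`.

    The expression is evaluated relative to the root element and must stay
    within the XPath subset that ElementTree understands.
    """
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall(expression)]


all_symbols = evaluate(SAMPLE, "./atom/symbol")
helium_only = evaluate(SAMPLE, "./atom[name='Helium']/symbol")
```

A graphical explorer adds the tree display and the interactive expression field on top of this kind of call.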
Syncro Soft has released version 1.2.4 of <oXygen/>, a $65 payware XML editor written in Java that can run as an applet. <oXygen/> 1.2 supports XSLT and XSL-FO, among other features. Version 1.2.4 adds about a dozen small new features including document templates, auto-completion of end-tags, and smart indenting.
RenderX has released a COM wrapper for the XEP XSL Formatting Objects to PDF converter. XEP is written in Java. The converter uses the Java Native Interface (JNI) to translate COM methods to Java virtual machine calls. Java 1.2.2 or later is required. This component only works with the $999.95 Developer and $4999.95 Server versions of XEP (and crippled variations thereof), not the cheaper client version.
Eric van der Vlist, the author of the O'Reilly XML Schema book, has begun writing a book about RELAX NG that will be published both on paper and online under the GNU Free Documentation License.
W. Eliot Kimber has founded the EXSLFO project as "a community effort to define functional extensions to the XSL Formatting Objects specification in advance of development of new versions of the XSL FO specification by the W3C. It is intended to be an adjunct to the formal W3C specification development process. It is modeled on the existing EXSLT activity (http://exlst.org/)." Possible extensions might include PDF bookmarks and metadata, capturing of page-to-object mappings, non-rectangular pages, etc. All interested parties, vendors and users, are invited to participate.
Danny Vint's published two Quick Reference Cards for XML Schemas. The PDF files are set up for 11" x 17" paper, but can be shrunk to smaller page sizes.
eSVG 1.4 is an implementation of subsets of the SVG 1.1 and SVG Mobile specifications designed for integration into embedded systems. The eSVG project additionally provides multithreaded eSVG scripting according to the SVG DOM 2 interface specification. eSVG scripting is based on the SpiderMonkey (JavaScript-C) engine and ORMIDE. Version 1.4 supports most SVG Tiny profile features, SVG Basic profile features, and SVG DOM interface entries. eSVG currently runs on Windows 98/NT/2000/ME/XP, Windows CE, and UniOP MMI. Pricing is deliberately hidden so it can't be good.
Roger L. Costello of The MITRE Corporation has published a W3C XSD Schema containing simpleType definitions that enumerate various units of temperature, length, weight, volume, area and so forth.
Vasil I. Yaroshevich has released xsltml 2.0, an XSLT MathML Library that provides a set of XSLT templates for MathML 2.0 to LaTeX translation implemented in pure XSLT. No extension functions are used.
The Simple XML Data Manipulation Language (SiXDML) is a SQL-like (query/update/insert/delete) language for working with XML documents. There is currently an implementation in Java for the Xindice native XML database.
David Rosenborg has released three utilities for working with RELAX NG schemas written in the Compact Syntax:
All three utilities are available under a BSD license.
Fraunhofer IPSI has released IPSI-XQ 1.2.2, a prototype XQuery processor written in Java. Pricing has not yet been announced.
eXchaNGeR 0.9 is an open source XML-Browser/Editor. By default, it uses a tree view, but can be configured with special viewers and editors for different XML applications. eXchaNGeR is written in Java and published under the Mozilla Public License.
I'm doing my end-of-year mailbox cleansing over the next couple of weeks. As I try to reduce my inbox from almost 4000 messages down to something more manageable, those of you who've written to me in the last year or two may be getting some belated replies. So far I've reached July of 2001, and I'm down to 3200 messages. :-)
The W3C XML Schema Working Group is soliciting comments on version 1.1 of the W3C XML Schema language. 1.1 "is intended to be mostly compatible with XML Schema 1.0 and to have approximately the same scope, but also to fix bugs and make whatever improvements we can, consistent with the constraints on scope and compatibility." Comments should be directed to www-xml-schema-comments@w3.org.
Anders Møller at the BRICS research center of the University of Aarhus has released Document Structure Description 2.0 (DSD2), yet another schema language for XML. This one is based on boolean logic and regular expressions, and does not support typing. It reminds me a little of Schematron, though it's not based on XPath. It does allow the sort of context-sensitive constraints that Schematron allows.
I'm doing my end-of-year mailbox cleansing over the next couple of weeks. As I try to reduce my inbox from almost 4000 messages down to something more manageable, those of you who've written to me in the last year or two may be getting some belated replies. :-)
Anthony B. Coates has released mtxslt, an Ant task that allows multiple different XSLT engines to be used during the same build.
Henry S. Thompson has released a new version of his XSV W3C XML Schema Language Validator. This release features "improved conformance by computing and using values where required, e.g. for enumeration checks and 'fixed' element/attributes, and some support for date, time and dateTime." XSV can be run as a web form or a Windows executable.
Microsoft has released version 1.0 of the Microsoft XSD Inference utility, a tool for creating a W3C XML Schema Language (XSD) schema from an XML instance document.
The W3C XHTML working group has published a new working draft of XHTML 2.0, the next, backwards incompatible version of HTML that incorporates XFrames, XForms, and lots of other crunchy XML goodness. However, XLink is not yet included and may never be. (The HTML Working Group are extreme XLink skeptics.)
The W3C has released version 7.1 of Amaya, their open source, test bed web browser and editor for Windows and Linux that supports HTML 4.01, XHTML 1.0, XHTML Basic, XHTML 1.1, HTTP 1.1, and CSS, as well as providing partial support for MathML 2.0 and SVG. This is a bug fix release.
James Clark's Trang translates schemas written in RELAX NG into different formats. In particular, it can
Clark says, "Trang aims to produce human-understandable schemas; it tries for a translation that preserves all aspects of the input schema that may be significant to a human reader, including the definitions, the way the schema is divided into files, annotations and comments." Trang is written in Java and published under a very open license.
Christian Neumann's posted LibXMLight 0.1.1, "a non-validating, lightweight XML Parser Library written in C++. The API is similar to SAX." Version 0.1.1 adds API documentation. LibXMLight is published under the GPL.
Daniel Veillard's released version 2.4.30 of libxml2, the GNOME XML parser for Linux. Version 2.4.30 restores the Python support accidentally broken in 2.4.29.
Sun's posted the proposed final draft (version 0.9.0) of the Java Architecture for XML Binding 1.0 (JAXB) on the Java Developer Connection (registration required). This includes a spec, API docs, and a reference implementation. JAXB compiles an XML schema into one or more Java classes. (First mistake: JAXB assumes there's a schema. Second mistake: It assumes the schema is written in the W3C XML Schema Language. Third mistake: It assumes documents actually adhere to the schema.) JAXB can unmarshal schema-valid XML into Java objects; read, update and validate the Java objects against the schema; and write the result back out as XML.
Netscape has released version 7.0.1 of their namesake web browser. Netscape 7.0.1 is based on Mozilla 1.0.2 and supports XML, HTML, XHTML, CSS, XSLT, RDF, DOM, and assorted other cool acronyms. The big new feature in this release is pop-up blocking (though Mozilla users have had this power for a while).
Devsphere has released XML Tag Library, an open source Java Server Page (JSP) tag library for processing XML. It complements the JSP Standard Tag Library by adding SAX parsing and DOM serialization.
The World Wide Web Consortium (W3C) XML Encryption Working Group has released XML Encryption Syntax and Processing and Decryption Transform for XML Signature as final Recommendations. XML Encryption is a syntax for encrypting documents, elements, or other data and embedding or pointing to the encrypted text in XML documents using Base-64 encoding. A variety of algorithms are supported. The XML Signature decryption transform "enables XML Signature applications to distinguish between those XML Encryption structures that were encrypted before signing (and must not be decrypted) and those that were encrypted after signing (and must be decrypted) for the signature to validate."
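The embedding step can be sketched roughly as follows. This Python fragment performs no encryption at all: it only shows already-encrypted bytes being Base-64 encoded into an EncryptedData/CipherData/CipherValue structure in the XML Encryption namespace. The function names are hypothetical, the input bytes stand in for real ciphertext, and a real document would also carry details such as the encryption method and key information that are left out here.

```python
import base64
import xml.etree.ElementTree as ET

XENC = "http://www.w3.org/2001/04/xmlenc#"


def wrap_ciphertext(cipher_bytes):
    """Embed opaque ciphertext bytes as Base-64 text inside EncryptedData."""
    ET.register_namespace("xenc", XENC)  # prefix choice is arbitrary
    data = ET.Element("{%s}EncryptedData" % XENC)
    cipher_data = ET.SubElement(data, "{%s}CipherData" % XENC)
    value = ET.SubElement(cipher_data, "{%s}CipherValue" % XENC)
    value.text = base64.b64encode(cipher_bytes).decode("ascii")
    return data


def unwrap_ciphertext(data):
    """Recover the raw ciphertext bytes from an EncryptedData element."""
    value = data.find("{%s}CipherData/{%s}CipherValue" % (XENC, XENC))
    return base64.b64decode(value.text)


# Placeholder bytes standing in for the output of a real cipher.
fake_ciphertext = bytes(range(16))
element = wrap_ciphertext(fake_ciphertext)
```

Base-64 is what lets arbitrary binary cipher output sit inside a text-based XML document without breaking well-formedness.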
The W3C HTML Working Group has released the Last Call Working Draft of Modularization of XHTML in XML Schema. This spec provides a complete set of W3C XML Schema Language modules for XHTML, and allows document authors to modify and extend XHTML to build new, non-strictly conforming XHTML documents. Comments are due by January 31, 2003.
The W3C Amaya browser team has opened a contest to design the new Amaya Welcome page (the default page that the Amaya browser displays when launched). Entries must be valid, accessible, and showcase multiple Amaya capabilities. The winner gets bragging rights. Submissions are due by February 3, 2003.
I'm continuing to experiment with XHTML 1.1 for this page. In particular, I'm using the internal DTD subset to extend and replace some of the normal content models and attribute lists. You can check out your browser's support for that here. So far, here's my score card:
The big question mark seems to be the MIME type I serve the page with. If I serve it with text/html everything pretty much works, except that all the browsers display the "]>" at the end of the internal DTD subset. If I serve it as application/xhtml+xml, then Mozilla derived browsers and Opera 6 and later work perfectly, but everything else works not at all. Given this, I really can't see using XHTML for web sites for several years at least.
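One workaround sometimes used for this kind of split is to pick the MIME type per request from the browser's Accept header. The sketch below is a hypothetical helper, not something this site does, and it assumes that browsers which cannot handle application/xhtml+xml do not list it explicitly. It deliberately ignores wildcards such as */*.

```python
def choose_content_type(accept_header):
    """Return application/xhtml+xml only if the client lists it explicitly.

    Falls back to text/html for everything else, including clients that
    send only wildcards or no Accept header at all.
    """
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # default when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"
```

This does nothing about the stray "]>" that text/html clients display, so it only addresses half of the problem described above.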
What I'm doing turns out not to be strictly conforming XHTML (redefining parameter entity references in the internal DTD subset and adding new elements in the XHTML namespace are big no-no's) but the browsers that support XHTML don't seem to have any trouble with this. It is possible to create valid documents based on XHTML without going through the hassle of creating a new profile using the modularization framework. I'm not sure there's any point to doing this, though.
Speaking of Opera, Opera Software has posted the third beta of Opera 6.0 for MacOS and Mac OS X. This release adds shared library support and enables Java in the classic MacOS, but offers no major new features in the XML space. It still supports direct display of XML with CSS stylesheets. XSL is still missing in action. Opera is normally $39 payware or free-beer adware, but right now, there's a sale so you can buy it for $29.
The Mozilla Project has released version 0.5 of Phoenix, a light-weight browser for Windows and Linux based on Mozilla's Gecko engine. It supports all the yummy XML features, but doesn't include the e-mail program, news reader, or nose hair trimmer. Phoenix differs from similar efforts like Galeon in that it's based on XUL and is designed for cross-platform release on Linux and Windows. (Mac OS X users should check out Chimera instead.) This is mostly a bug-fix and speed-up release.
Christian Neumann's posted LibXMLight 0.1.0, "a non-validating, lightweight XML Parser Library written in C++. The API is similar to SAX." LibXMLight is published under the GPL.
The W3C Evaluation and Repair Tools Working Group has posted the first public working draft of Evaluation and Report Language (EARL) 1.0. EARL is:
a language to express test results. Test results include bug reports, test suite evaluations, and conformance claims. The test subject might be a Web site, an authoring tool, a user agent or some other entity. Thus, EARL is flexible. It enables any person, entity, or organization to state test results for any thing tested against any set of criteria.
Stating test results in EARL creates a variety of opportunities. The data can be--
- exchanged between tools;
- used to create reports;
- combined to compare how different test subjects fared on the same test.
EARL is based on RDF.
The Apache XML Project has released version 2.0.4 of the Cocoon application server. "Apache Cocoon is an XML framework that raises the usage of XML and XSLT technologies for server applications to a new level. Designed for performance and scalability around pipelined SAX processing, Cocoon offers a flexible environment based on the separation of concerns between content, logic and style. A centralized configuration system and sophisticated caching top this all off and help you to create, deploy and maintain rock-solid XML server applications." Version 2.0.4 "is a maintenance release focusing on improved performance and robustness. In addition some bugs were fixed and new features were added." New features include:
Andy Clark has posted a new release of his CyberNeko Tools for the Xerces Native Interface (NekoXNI). This release fixes a few bugs in the HTML parser.
IBM's alphaWorks has released version 1.2 of the XML Wrapper Generator, a graphical tool that integrates XML data sources into a DB2 database. The tool loads XML schema files, "shreds" them to a relational schema, and generates appropriate NICKNAME and VIEW statements. This is a bug fix release.
I've fleshed out the XOM design principles document. It now explains the reasoning behind the various choices made in XOM.
Malcolm Wallace and Colin Runciman have released HaXml 1.08, an XML processing library for the Haskell language. According to the web page,
HaXml is a collection of utilities for using Haskell and XML together. Its basic facilities include:
- a parser for XML,
- a separate error-correcting parser for HTML,
- an XML validator,
- pretty-printers for XML and HTML.
For processing XML documents, the following components are provided:
- Combinators is a combinator library for generic XML document processing, including transformation, editing, and generation.
- Haskell2Xml is a replacement class for Haskell's Show/Read classes: it allows you to read and write ordinary Haskell data as XML documents. The DrIFT tool (available from http://repetae.net/~john/computer/haskell/DrIFT/) can automatically derive this class for you.
- DtdToHaskell is a tool for translating any valid XML DTD into equivalent Haskell types.
- In conjunction with the Xml2Haskell class framework, this allows you to generate, edit, and transform documents as normal typed values in programs, and to read and write them as human-readable XML documents.
- Finally, Xtract is a grep-like tool for XML documents, loosely based on the XPath and XQL query languages. It can be used either from the command-line, or within your own code as part of the library.
HaXml is distributed under the Artistic License.
Alex Chaffee has updated his XPath Explorer to add "pop-down history lists, sample files menu, snazzier widget alignment, hierarchical tree document display," an expanded XPath field, and NetBeans and Eclipse plug-ins. This is an open source, graphical tool for displaying XML documents as trees and evaluating XPath expressions with respect to those documents.
This is a nice little toy for simple documents, but when I tried to use it in the Hands-On XSLT class at Software Development East last month it proved ungodly slow for real world documents like the periodic table example from the XML Bible and one DocBook chapter from the Processing XML with Java source.
The W3C Multimodal Interaction Working Group have published a note on the Multimodal Interaction Framework. This note is a very high-level description of how different inputs such as speech, handwriting, keyboards, and so forth can be connected up with different outputs such as audio, video, and screens within the same system. The goal is to allow content and processing to be decoupled from the specific input and output methods.
The W3C has released version 7.0 of Amaya, their open source, test bed web browser and editor for Windows and Linux that supports XML, XHTML, and CSS, as well as providing partial support for MathML and SVG. New features in this release include a history menu, the Raptor RDF parser, better support for XML, SVG, and CSS, an OpenGL version with support for SVG opacity and PNG transparency, and font anti-aliasing under Unix.
IBM's alphaWorks has released version 3.3 of their Web Services Toolkit. The "basic software components needed to create a Web services environment are provided with Web Services Tool Kit. Included is an architectural blueprint (Web Services Architecture), sample programs, Utility services, and some tools that are helpful in developing and deploying Web services. Extensive documentation is included to assist developers with the basic concepts of Web services. The tool kit also includes a fully-functioning Web services client API that can be used to directly access a UDDI registry." New features in 3.3 include the "Tivoli Management Web Services and Common Event Format, Federated Identity demo, Wide Spectrum Stress Tool, Reputation Protocol, WS-Inspection crawler utility, Pluggable Discovery Framework, Privacy Authorization Director, and Updated Utility Services." Java 1.3 or later is required.
The Mozilla Project has released Mozilla 1.2.1 in order to fix a "DHTML bug in Mozilla 1.2 which broke dynamically writing into a dynamically created element." There are no new features in this release. All 1.2 users should upgrade.
IBM's alphaWorks has released XincaML (eXtensible Inter-Nodes Constraint Mark-up Language). XincaML is a schema language that "can describe the presence or value dependencies amongst nodes located on different branches of an XML tree. It can specify constraints that can't be expressed by the XML schema language and therefore supplement the existing XML schema language to insure the integrity of data. XincaML is a declarative language and its syntax is based on XML format." I haven't looked at the language yet, but the general description of what it can do reminds me of Schematron. The XincaML validator is written in Java.
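To make "dependencies amongst nodes located on different branches" concrete, here is a small sketch of one such rule checked by hand in Python. It is not XincaML syntax, which this item does not show. The order document, its element names, and the check_order function are all invented for illustration.

```python
import xml.etree.ElementTree as ET


def check_order(xml_text):
    """Return violation messages for a hypothetical <order> document.

    Rule: billing/creditCard must be present exactly when the text of
    payment/method is "card". The two elements live on different branches.
    """
    root = ET.fromstring(xml_text)
    errors = []
    method = root.findtext("payment/method")
    card = root.find("billing/creditCard")
    if method == "card" and card is None:
        errors.append("payment/method is 'card' but billing/creditCard is missing")
    if method != "card" and card is not None:
        errors.append("billing/creditCard is present but payment/method is not 'card'")
    return errors


GOOD = "<order><payment><method>card</method></payment><billing><creditCard/></billing></order>"
BAD = "<order><payment><method>card</method></payment><billing/></order>"
```

A grammar-based schema can say which children billing may have, but not that the answer depends on a value elsewhere in the tree; a declarative constraint language would let you state the rule instead of coding it.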
The W3C Voice Browser Working Group has posted the last call working draft of the Speech Synthesis Markup Language Specification. According to the abstract:
The Voice Browser Working Group has sought to develop standards to enable access to the Web using spoken interaction. The Speech Synthesis Markup Language Specification is part of this set of new markup specifications for voice browsers, and is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications. The essential role of the markup language is to provide authors of synthesizable content a standard way to control aspects of speech such as pronunciation, volume, pitch, rate, etc. across different synthesis-capable platforms.
Michael Fitzgerald's written a nice introductory article about XOM for XML.com.
Daniel Veillard's released version 1.0.23 of libxslt, the GNOME XSLT library and version 2.4.28 of libxml2, the GNOME XML parser for Linux. The new version of libxslt fixes bugs and cleans up the code and docs. The new version of libxml fixes a few bugs and adds support for alternate encodings when processing XIncludes with parse="text".
Andy Clark has posted a new release
of
his CyberNeko
Tools for the Xerces Native Interface
(NekoXNI). This is a collection of XML tools
written specifically to take advantage of the XNI API
in Xerces2 including the
NekoHTML parser and the NekoDTD parser.
This release makes a number of improvements and experiments in the
HTML parser and corrects the naming conventions in the GeneralEntityEvent
class.
Version 1.3.2 of the OpenJade DSSSL processor has been released. DSSSL is a style language for SGML and XML documents. OpenJade contains backends for various formats (RTF, HTML, TeX, MIF, SGML2SGML, and FOT). This is a maintenance release that "supports and is intended to be used with the latest version of OpenSP, currently 1.5. This means that openjade takes advantage of the features available in OpenSP 1.5. It also means that distributors can provide separate and independent 'packages' for OpenSP and OpenJade."
Weather permitting (and it may not) I'll be visiting family over the Thanksgiving break. Updates will probably be a little slow until next week.
The Mozilla Project has released Mozilla 1.2, the open source web browser for Windows, Mac, and Linux that natively supports XML, CSS, XSLT, XUL and lots of other cool acronyms. The big new XML feature in this release is a pretty printed source view for raw, unstyled XML, much like that used in Internet Explorer. Other new features in 1.2 include:
Type Ahead Find, a new feature that allows quick navigation when you type a succession of characters in the browser, matching the text in one or more links on the page. To give it a spin just go to a web page, start typing, watch the typed characters highlight as they find a match in a link and hit enter to load the selected link. You can also use it to search for any text on the page by typing / before your search text.
You can now show toolbars as text/icons/both in the default Classic theme. There are also a few other usability improvements such as image selection visualization (image highlights with system selection color when selected) and the removal of the confusing toolbar grippies.
Improvements to native look and feel in both the browser interface and the browser content area. Mozilla now supports most native GTK themes in Mozilla and the native look and feel for web content like form controls under Windows XP.
You can launch the browser with a group of bookmarks as your start page. This loads several pages into tabs at startup.
Keyboard access is improved with additional accesskeys for menus, other UI elements and page elements.
Document prefetching based on hints included in the page's link elements
Java compatibility with Mac OS 10.2 (Jaguar) has been repaired.
The W3C Quality Assurance (QA) Activity has updated three working draft specifications on quality assurance:
These describe "a common framework for enhancing the quality practices of the W3C Working Groups in the areas of specification editing, production of test materials, and coordination efforts with internal and external groups."
The W3C Device Independence Working Group has published a second last call working draft of Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies. This is an RDF vocabulary for describing user agent (browser) and proxy capabilities and preferences. Topics include:
The CC/PP vocabulary uses URIs to refer to specific capabilities and preferences. It covers:
The W3C Cascading Style Sheets working group has published working drafts of two new modules for CSS Level 3:
New properties for borders since CSS2 include
border-image,
border-fit,
border-image-transform,
border-break, and
box-shadow.
Changes in lists since CSS2 include:
display:marker has been replaced by the ::marker pseudo-element
The marker-offset property is obsolete.
The W3C Web Services Architecture Working Group has updated the working draft of Web Services Architecture Requirements and published two new working drafts on Web Services Architecture and Web Services Glossary.
The Web Services Architecture "document describes the Web Service Architecture. The Web services reference architecture identifies the functional components, defines the relationships among those components, and establishes a set of constraints upon each to effect the desired properties of the overall architecture."
The Web Services Glossary "is a glossary of Web services terms intended to be used to describe the Web services architecture [WS Arch], and across the Web Services Activity."
IBM's alphaWorks has released
XML Processing Plus Plus,
a new "typed and stream-based XML processing language" that
extends Java with
new XML stream APIs: XmlIn and XmlOut. XmlIn
retrieves data from XML input streams, and XmlOut
inserts data into XML output streams.
"XML Processing Plus Plus includes the xpppc compiler, which converts
programs written in XML Processing Plus Plus syntax into standard Java
byte code. The compiler supports type checking based on DTDs (document
type definitions). The type checker reports semantic errors of XML
manipulation against DTDs".
IBM has also updated their XML Parser for Java to version 4.1.2. This release is based on Xerces-J 2.2.0 and supports the W3C XML Schema Recommendation 1.0, SAX 1.0 and 2.0, DOM Level 1, DOM Level 2, and some experimental features of DOM Level 3 Core and Load/Save Working Drafts, JAXP 1.2, and XNI.
I've extracted out all the examples from Processing XML with Java into individual files. You can download them as a zip archive if you like. I actually wrote an XSLT stylesheet to pull all the examples out of the chapters, dump them into individual files, and then generate the index files from the titles used in the book. I used ant to automatically apply the stylesheet and zip all the examples, so this is just a single part of the book's build process. (Question: it seems that Ant is missing one key feature that make has, the ability to detect whether a file on the disk has changed since the last build, and thus whether dependent files need to be regenerated or not. Ant seems to rebuild the entire project from scratch every time. Is there any way to avoid this?)
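The make-style check described in that question comes down to comparing modification timestamps. Here is a minimal Java sketch of the idea. The class and method names (StaleCheck, needsRebuild) are hypothetical, and this is not Ant's API, only an illustration of the test a build tool would have to perform before regenerating a file.

```java
import java.io.File;

// Hypothetical helper, not part of Ant: decides whether a generated file
// is out of date with respect to the files it was generated from.
class StaleCheck {

    /** Returns true if the target is missing or older than any source. */
    static boolean needsRebuild(File target, File... sources) {
        if (!target.exists()) {
            return true; // never built, so it must be generated
        }
        long built = target.lastModified();
        for (File source : sources) {
            if (source.lastModified() > built) {
                return true; // a source changed after the last build
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Usage: java StaleCheck <target> <source>...
        File target = new File(args[0]);
        File[] sources = new File[args.length - 1];
        for (int i = 1; i < args.length; i++) {
            sources[i - 1] = new File(args[i]);
        }
        System.out.println(needsRebuild(target, sources) ? "rebuild" : "up to date");
    }
}
```

A build step would call needsRebuild with the zip archive as the target and the chapter files plus the stylesheet as the sources, and skip the transformation when it returns false.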
Several of the examples communicate with web services running on http://www.elharo.com. Unfortunately, between the time the book went to press and now, an upgrade to that server necessitated by security concerns broke a number of URLs published in the book. The services are still running, just not at quite the same URLs. I think in all cases you can access them by changing /fibonacci to /fibonacci/servlet. For instance, Example 3-10 and most of the examples in Chapter 5 attempt to communicate with a servlet running at http://www.elharo.com/fibonacci/XML-RPC. Instead you can connect to http://www.elharo.com/fibonacci/servlet/XML-RPC. In Example 3-11, you would change http://www.elharo.com/fibonacci/SOAP to http://www.elharo.com/fibonacci/servlet/SOAP and so forth.
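If you would rather patch the URLs in code than edit each example by hand, the substitution can be sketched as below. The class and method names (FibonacciUrls, relocate) are hypothetical and do not appear in the book. The sketch only assumes the /fibonacci to /fibonacci/servlet rule described above.

```java
// Hypothetical helper: rewrites a URL as printed in the book to the
// relocated servlet URL, following the /fibonacci -> /fibonacci/servlet rule.
class FibonacciUrls {

    private static final String OLD_PREFIX = "http://www.elharo.com/fibonacci/";
    private static final String NEW_PREFIX = OLD_PREFIX + "servlet/";

    static String relocate(String url) {
        // Leave already-relocated and unrelated URLs alone.
        if (url.startsWith(OLD_PREFIX) && !url.startsWith(NEW_PREFIX)) {
            return NEW_PREFIX + url.substring(OLD_PREFIX.length());
        }
        return url;
    }

    public static void main(String[] args) {
        System.out.println(relocate("http://www.elharo.com/fibonacci/XML-RPC"));
        System.out.println(relocate("http://www.elharo.com/fibonacci/SOAP"));
    }
}
```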
I do not know why the new version of the Java Development Kit for the Cobalt Qube will not let me map the servlets to the shorter URLs. I just know that it won't. If anyone has a supposition as to how I might fix this so that the shorter URLs work again, please let me know. I've been tearing my hair out trying to fix this. This is using a special version of Tomcat 3.2.1 for Sun's Cobalt Qube. As near as I can tell it just doesn't pay any attention to the servlet mappings defined in the web.xml file like the old version did.
The W3C Web Ontology Working Group has published three updated working drafts about the Web Ontology Language (OWL):
According to the Guide abstract,
The World Wide Web as it is currently constituted resembles a poorly mapped geography. Our insight into the documents and capabilities available are based on keyword searches, abetted by clever use of document connectivity and usage patterns. The sheer mass of this data is unmanageable without powerful tool support. In order to map this terrain more precisely, computational agents require machine-readable descriptions of the content and capabilities of web accessible resources. These descriptions must be in addition to the human-readable versions of that information.
The Web Ontology Language (OWL) is intended to provide a language that can be used to describe the classes and relations between them that are inherent in Web documents and applications.
The reference abstract further elucidates:
OWL is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is derived from the DAML+OIL Web Ontology Language [DAML+OIL] and builds upon the Resource Description Framework [RDF/XML Syntax].
I've posted updated notes from my final two talks at Software Development 2002 East this past week, XML Pull Parsing and DOM. The XML Pull Parsing Talk was particularly fun. It includes a lot of new material on Andy Clark's NekoPull that hasn't gotten a lot of notice yet. Unlike XMLPULL, NekoPull is fully conformant to the XML specification in both interface and implementation. The API is also much more sensible than XMLPULL. It doesn't make all the compromises XMLPULL makes for J2ME environments in the name of size and speed.
However, the NekoPull API is far from perfect. There are definitely some weird spots. The two most obvious are short type constants and the use of public fields instead of getter and setter methods. There are also several dependencies on the Xerces Native Interface (XNI) that prevent it from being a truly generic API for other parsers. In the long term, I have high hopes for StAX, the Streaming API for XML, being developed in the Java Community Process, mostly because James Clark is on the expert group. I'll be revisiting and updating all of this in just about four months at Software Development West 2003 in Santa Clara. Pull parsing is an exciting space to be watching right now.
Jonathan Borden has posted a draft of RDDL 2.0.
RDDL is an XHTML-based vocabulary for human readable
and machine processable documents placed at the end of namespace URLs.
The major change in version 2.0 is the option to use RDF instead of XLinks.
For example, in the old XLink style, a rddl:resource element might look like this:
<rddl:resource xlink:href="baseball.dtd"
xlink:role="http://www.isi.edu/in-notes/iana/assignments/media-types/application/xml-dtd"
xlink:arcrole="http://www.rddl.org/purposes#validation">
<div id="DTD" class="resource">
<h3>DTD</h3>
<p>A <a href="baseball.dtd">DTD</a> for baseball statistics</p>
</div>
</rddl:resource>
In the new RDF style, it might look like this:
<rdf:Description rdf:ID="DTD" rddl:title="DTD">
<purpose:validation>
<rddl:resource rdf:about="baseball.dtd" >
<rddl:nature
rdf:resource="http://www.isi.edu/in-notes/iana/assignments/media-types/application/xml-dtd"
/>
</rddl:resource>
<div id="DTD" class="resource">
<h3>DTD</h3>
<p>A <a href="baseball.dtd">DTD</a> for baseball statistics</p>
</div>
</purpose:validation>
</rdf:Description>
A second, slightly less ugly RDF option looks like this:
<rddl:resource ID="DTD">
<rddl:title>DTD</rddl:title>
<rddl:nature
resource="http://www.isi.edu/in-notes/iana/assignments/media-types/application/xml-dtd"/>
<rddl:purpose
resource="http://www.rddl.org/purposes#validation"/>
<rddl:related resource="baseball.dtd"/>
<rddl:prose>
<div id="DTD" class="resource">
<h3>DTD</h3>
<p>A <a href="baseball.dtd">DTD</a> for baseball statistics</p>
</div>
</rddl:prose>
</rddl:resource>
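To see what a consumer does with the XLink style shown first, here is a minimal sketch that reads the nature (xlink:role) and purpose (xlink:arcrole) of each rddl:resource element with a namespace-aware JAXP DOM parser. The class name RddlResources, the describe method, and the trimmed-down SAMPLE document are assumptions made for illustration. The namespace URIs are the published RDDL and XLink ones. The sketch does not cover either of the RDF variants.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical reader for XLink-style RDDL resources.
class RddlResources {

    static final String RDDL = "http://www.rddl.org/";
    static final String XLINK = "http://www.w3.org/1999/xlink";

    // A trimmed-down version of the baseball example, wrapped in a root
    // element that declares the namespaces.
    static final String SAMPLE =
        "<body xmlns='http://www.w3.org/1999/xhtml'"
      + " xmlns:rddl='http://www.rddl.org/'"
      + " xmlns:xlink='http://www.w3.org/1999/xlink'>"
      + "<rddl:resource xlink:href='baseball.dtd'"
      + " xlink:role='http://www.isi.edu/in-notes/iana/assignments/media-types/application/xml-dtd'"
      + " xlink:arcrole='http://www.rddl.org/purposes#validation'>"
      + "<div id='DTD' class='resource'><h3>DTD</h3></div>"
      + "</rddl:resource></body>";

    /** Returns one "href | nature=... | purpose=..." line per resource. */
    static String describe(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the NS-aware lookups below
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList resources = doc.getElementsByTagNameNS(RDDL, "resource");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < resources.getLength(); i++) {
                Element resource = (Element) resources.item(i);
                lines.add(resource.getAttributeNS(XLINK, "href")
                    + " | nature=" + resource.getAttributeNS(XLINK, "role")
                    + " | purpose=" + resource.getAttributeNS(XLINK, "arcrole"));
            }
            return String.join("\n", lines);
        } catch (Exception e) {
            throw new IllegalStateException("Could not read RDDL document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(SAMPLE));
    }
}
```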
The W3C RDF Core Working Group has updated six working drafts covering various aspects of the Resource Description Framework. According to the primer draft, "The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource. However, by generalizing the concept of a 'Web resource', RDF can also be used to represent information about things that can be identified on the Web, even when they can't be directly retrieved on the Web. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning." The updated specs, in roughly the order you might want to read them, are:
RDF Primer "is designed to provide the reader with the basic knowledge required to effectively use RDF. It introduces the basic concepts of RDF and describes its XML syntax. It describes how to define RDF vocabularies using the RDF Vocabulary Description Language, and gives an overview of some deployed RDF applications. It also describes the content and purpose of other RDF specification documents."
RDF Semantics "is a specification of a precise semantics for RDF and RDFS, with some entailment results. It is intended to be readable by a general technical audience."
Resource Description Framework (RDF): Concepts and Abstract Syntax "defines an XML syntax for the Resource Description Framework (RDF) as amended and clarified by the RDF Core Working Group from that originally described in RDF Model & Syntax. The syntax is updated to be specified in terms of XML, XML Namespaces, the XML Information Set with new support for XML Base. The parts of the RDF/XML syntax are explained along with examples of how they work. The formal grammar is annotated with actions for generating the arcs that form the RDF graph as defined in the RDF Concepts and Abstract Syntax Working Draft. This is done using the N-Triples RDF Graph serializing format which enables more precise recording of the mapping in a machine processable form. These tests are gathered and published in the RDF Test Cases Working Draft."
RDF Vocabulary Description Language 1.0: RDF Schema "describes how to use RDF to describe RDF vocabularies. This specification also defines a basic vocabulary for this purpose, as well as conventions that can be used by Semantic Web applications to support more sophisticated RDF vocabulary description."
RDF Test Cases describes a set of machine-processable test cases for RDF which are available from a separate web page.
RDF/XML Syntax Specification (Revised) "defines an XML syntax for the Resource Description Framework (RDF) as amended and clarified by the RDF Core Working Group from that originally described in RDF Model & Syntax. The syntax is updated to be specified in terms of XML, XML Namespaces, the XML Information Set with new support for XML Base. The parts of the RDF/XML syntax are explained along with examples of how they work. The formal grammar is annotated with actions for generating the arcs that form the RDF graph as defined in the RDF Concepts and Abstract Syntax Working Draft. This is done using the N-Triples RDF Graph serializing format which enables more precise recording of the mapping in a machine processable form."
The W3C XForms Working Group has posted the candidate recommendation of XForms 1.0. According to the abstract,
XForms is an XML application that represents the next generation of forms for the Web. By splitting traditional XHTML forms into three parts-XForms model, instance data, and user interface-it separates presentation from content, allows reuse, gives strong typing-reducing the number of round-trips to the server, as well as offering device independence and a reduced need for scripting.
One of the editors, Micah Dubinko, will be talking about this at SD Expo here in Boston tomorrow.
I've posted updated notes for yesterday's XLinks and Schemas seminars at Software Development 2002 East.
Michael Kay has released Saxon 7.3, a partial and experimental implementation of XSLT 2.0 written in Java. Changes include:
This is for experimenters only. Most users should continue to use Saxon 6.5.2.
I've posted updated notes for yesterday's half-day Hands-On XSLT class at Software Development 2002 East. This covers basic XSLT 1.0 and XPath. This was the first time I've taught this as a hands-on session. It was fun, but a full day might have been more helpful. This class will probably next be offered at Software Development 2003 West in March.
The W3C has released version 6.4 of Amaya, their open source, test bed web browser and editor for Windows and Linux that supports XML, XHTML, and CSS, as well as providing partial support for MathML and SVG. Improvements in this release include improved Finnish and German localizations, HTTP Location header support, and many bug fixes.
Amaya's actually becoming a pretty nice browser. However, it's still ugly as sin. I know Amaya isn't really supposed to compete with Mozilla and IE, but it would be nice if some experienced screen designer felt like donating some time to cleaning up the icons and general GUI appearance.
I've posted updated notes for yesterday's half-day XML Fundamentals class at Software Development 2002 East. This covers basic XML, well-formedness, DTDs, validity, a little CSS, and namespaces. This class will probably next be offered at XML Web Services One London 2003 and then at Software Development 2003 West, both in March.
The W3C Technical Architecture Group (TAG) has published an updated working draft of Architecture of the World Wide Web. This is still incomplete, but makes for interesting reading. According to the introduction:
The World Wide Web (or, Web) is a networked information system consisting of agents (programs acting on behalf of another person, entity, or process) that exchange information.
This document organizes Web architecture into:
- Identification. Agents identify objects in the system (called "resources") with Uniform Resource Identifiers (URIs), defined in [RFC2396].
- Representation. Agents represent resources using a nonexclusive set of data formats, separately or in combination (e.g., XHTML, CSS, PNG, XLink, RDF/XML, SMIL animation). This section also discusses technologies for building new data formats (XML, XML Namespaces).
- Interaction. Agents exchange representations via protocols, including HTTP [RFC2616], FTP, and SMTP1. Several of these protocols share a reliance on the Multipurpose Internet Mail Extensions (MIME) standards for the format of message bodies [RFC2045] and for Internet Media Types [RFC2046].
The W3C XSLT and XQuery Working Groups have updated seven working drafts:
For XQuery, you should start by reading XML Query Use Cases. For XPath and XSLT 2.0, you should start with the XQuery 1.0 and XPath 2.0 Data Model assuming you're already familiar with XSLT 1.0. Otherwise, you should begin by learning XSLT 1.0.
As Murphy's Law requires, they did this just a couple of days before I have to leave for Boston to talk about exactly this at SD Expo. So much for the notes I had prepared. I guess I know what I'm reading on the plane now. From a first glance, here are the major changes in these releases:
This document has been rewritten very heavily. You should probably just read it. It's much cleaner and more consistent. The key paragraph is the following:
Every value handled by the data model is a sequence of zero or more items. An item is either a node or an atomic value. A node is defined in 4 Nodes and is one of seven node kinds. An atomic value encapsulates an XML Schema atomic type and a corresponding value of that type. They are defined in 5 Atomic Values. A sequence is an ordered collection of nodes, atomic values, or any mixture of nodes and atomic values. A sequence cannot be a member of a sequence. A single item appearing on its own is modeled as a sequence containing one item. Sequences are defined in 6 Sequences.
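The rules in that paragraph (every value is a flat sequence, a sequence is never a member of another sequence, and a lone item is a sequence of one) can be sketched in a few lines of Java. This is only an illustration, not code from the spec: the class name Sequences is hypothetical, and plain Java objects stand in for nodes and atomic values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of data-model sequences using java.util.List.
class Sequences {

    /**
     * Builds a flat sequence from the given parts. Any part that is itself
     * a list is spliced in item by item, so a sequence never ends up as a
     * member of another sequence.
     */
    static List<Object> sequence(Object... parts) {
        List<Object> items = new ArrayList<>();
        for (Object part : parts) {
            if (part instanceof List) {
                // Recurse so that deeper nesting is flattened as well.
                items.addAll(sequence(((List<?>) part).toArray()));
            } else {
                items.add(part);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // A single item is modeled as a sequence containing one item.
        System.out.println(sequence(42));
        // Nested sequences are flattened.
        System.out.println(sequence(1, sequence(2, 3), "x"));
    }
}
```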
However, data models are tricky things, and I really need to read this more carefully. At first glance, it does appear to require some basic notion of well-formedness, unlike the infoset. For instance, it does require that each attribute of an element have a unique name, which the Infoset does not.
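The unique-attribute-name rule is the same one XML 1.0 imposes as a well-formedness constraint, so any conforming parser shows it in action. The sketch below uses the JAXP parser bundled with Java. The class and method names (DuplicateAttributes, isWellFormed) are hypothetical, and the code demonstrates the parser's behavior rather than anything in the data model draft itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical check: does a document survive a well-formedness parse?
class DuplicateAttributes {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors, including duplicate attributes, land here.
            return false;
        } catch (Exception e) {
            // I/O or parser configuration problems are not the document's fault.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a x='1' y='2'/>"));
        System.out.println(isWellFormed("<a x='1' x='2'/>"));
    }
}
```

The default JAXP error handler may also print a fatal error message to standard error for the second document.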
In addition to the changes listed above for XPath 2.0,
The primary change here is that all the examples now use the new XQuery syntax with order by clauses instead of sortby(). For example,
<bib>
{
for $b in document("http://www.bn.com/bib.xml")//book
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
order by $b/title
return
<book>
{ $b/@year }
{ $b/title }
</book>
}
</bib>
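For Java programmers, the clauses of that query map loosely onto a stream pipeline: where becomes a filter, order by becomes a sort, and return becomes a mapping step. The sketch below is only an analogy under that assumption. The Book class and any data passed to it are hypothetical, and the result is a list of strings rather than XML elements.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// A rough Java analogy for the FLWOR query above, not an XQuery engine.
class FlworSketch {

    // Hypothetical stand-in for a book element.
    static final class Book {
        final String title;
        final String publisher;
        final int year;

        Book(String title, String publisher, int year) {
            this.title = title;
            this.publisher = publisher;
            this.year = year;
        }
    }

    static List<String> addisonWesleyAfter1991(List<Book> books) {
        return books.stream()                                         // for $b in ...//book
            .filter(b -> b.publisher.equals("Addison-Wesley")
                      && b.year > 1991)                               // where
            .sorted(Comparator.comparing((Book b) -> b.title))        // order by $b/title
            .map(b -> b.year + " " + b.title)                         // return
            .collect(Collectors.toList());
    }
}
```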
The changes here are fairly minor and technical. The
big one is that there is no longer a principal result tree. All
"result trees now have the same status, though there is still an
initial result tree created implicitly if the stylesheet does
not create one using xsl:result-document." In
addition, xsl:output supports Unicode
normalization, and the type attribute of several
elements has been renamed as.
The W3C Patent Policy Working Group has posted the last call working draft of the W3C Royalty-Free Patent Policy. Bottom line: "In order to promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented on a Royalty-Free (RF) basis. Under this policy, W3C will not approve a Recommendation if it is aware that Essential Claims exist which are not available on Royalty-Free terms." The word "Royalty-Free" is actually being used in a slightly unusual way here. To better understand the implications, replace "Royalty-Free" with "Zero-Cost".
Amazon has corrected their mispricing of Processing XML with Java. They now show the correct list price of $54.99, and they're selling it for 30% off at $38.49.
The W3C SVG Working Group has posted proposed recommendations of the Scalable Vector Graphics (SVG) 1.1 Specification. The abstract says, "SVG 1.1 serves two purposes: to provide a modularization of SVG based on SVG 1.0 and to include the errata found so far in SVG 1.0." A test suite is now provided for SVG.
The SVG Working Group has simultaneously posted Mobile SVG Profiles: SVG Tiny and SVG Basic. SVG Tiny is a stripped down version of SVG for cell phones. SVG Basic is a slightly larger version of SVG for PDAs. Comments on both specs are due by December 20th.
The same working group has also posted the first public working draft of Scalable Vector Graphics (SVG) 1.2. This is very preliminary, but possible new features for this version include:
flowText element for wrapping text inside shapes
solidColor element is a paint server that provides a single color with opacity. It can be referenced like the other paint servers (gradients and patterns).
SVGWindow interface
The W3C DOM Working Group has released the proposed recommendation of Document Object Model (DOM) Level 2 HTML Specification. The abstract states, "This specification defines the Document Object Model Level 2 HTML, a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content and structure of [HTML 4.01] and [XHTML 1.0] documents. The Document Object Model Level 2 HTML builds on the Document Object Model Level 2 Core [DOM Level 2 Core] and is not backward compatible with DOM Level 1 HTML [DOM Level 1]." Comments are due by October 16.
The W3C/IETF joint XML Signature Working Group has released the XML-Signature XPath Filter 2.0 Recommendation. According to the abstract, "XML Signature [XML-DSig] recommends a standard means for specifying information content to be digitally signed and for representing the resulting digital signatures in XML. Some applications require the ability to specify a subset of a given XML document as the information content to be signed. The XML Signature specification meets this requirement with the XPath transform. However, this transform can be difficult to implement efficiently with existing technologies. This specification defines a new XML Signature transform to facilitate the development of efficient document subsetting implementations that interoperate under similar performance profiles."
IBM's alphaWorks has released the Web Services Tool Kit for Mobile Devices. It enables the development of Java applications that talk to SOAP services on the PocketPC, Palm, and BlackBerry and C applications on the Palm.
The W3C XML Linking Working Group has posted proposed recommendations of three XPointer specifications:
The biggest change in these drafts is that non-W3C-defined scheme names must be namespace qualified, and those namespaces must be defined by xmlns parts. Simon St. Laurent has been pointing out on the xml-dev mailing list that this has stretched namespaces and qualified names pretty much well past the breaking point. The syntax proposed is so horribly unusable that it effectively dooms any hope of extending XPointer from outside the W3C. It's becoming obvious that instead of going to recommendation, XPointer should be tabled, and work on an XML addressing syntax for URI fragment identifiers should begin again from scratch. Unfortunately, the working group's charter expires at the end of the year, and they seem to prefer putting out a deeply flawed spec to putting out nothing at all.
Opera Software has posted the first public beta of Opera 7.0 for Windows, a $39 payware web browser that supports XML and CSS. This version is allegedly faster and more standards compliant. There's also a new e-mail client. There don't seem to be a lot of other impressive new features.
The Mozilla Project has posted version 0.6 of Chimera, a small footprint, native Cocoa Mac OS X web browser based on Mozilla's Gecko layout engine that includes lots of XML support. Unlike Mozilla, this is only a browser: no e-mail client, news reader, chat program, or dog walker. Changes in 0.6 include "improvements to plugins, keychain support, better cookie management, and Talkback support."
IBM's alphaWorks has updated their XSL Formatting Objects Composer, "a typesetting and display engine that implements a substantial portion of XSL Formatting Objects", to improve memory usage, speed, and performance, and to handle larger input documents. However, it doesn't handle quite as much of the XSL spec as earlier versions.

I am pleased to announce the official publication of Processing XML with Java. This is the most comprehensive and up-to-date book about integrating XML with Java (and vice versa) you can buy. It contains over 1000 pages of detailed information on SAX, DOM, JDOM, JAXP, TrAX, XPath, XSLT, SOAP, and lots of other juicy acronyms. This book is written for Java programmers who want to learn how to read and write XML documents from their code.
Normally, this is the point where I'd spend a few paragraphs describing
just what's in the book and how important it is to your education, your
career, and your love life; but this time I've done something a little
different. The entire book is available online.
You can read every chapter and every page so you can see for yourself
how well this book answers your questions such as, "Why
does SAX truncate the text in my documents after a few thousand
characters?", "How do I
serialize a DOM Document object in an
implementation-independent way?", or, "Why doesn't my
significant other understand the importance of building a life-size
Millennium Falcon in our backyard?". Consequently, I'll forego the
usual hype. Check the book out for yourself; and if you like it, please
buy a copy. I promise it's cheaper than printing all 1100+ pages on your
laser printer.
I received my copy yesterday, and Amazon is reporting that it will be in stock tomorrow. They do have the wrong list price. It's $54.99, not $69.50, but their actual price is a quite reasonable $48.65. Barnes & Noble shows the list price as $49.99 (also wrong) and is selling it at $39.99. You may want to pre-order your copy today, because their initial shipments of my books tend to sell out very quickly once I announce them here. If you missed the first batch, don't worry. Addison Wesley will ship more very quickly. It does not normally take the advertised "2-3 weeks". Brick and mortar stores should have their copies very soon as well.
I'm heading off to Europe later today for a much needed vacation. (Well, a working vacation anyway. I'll be stopping by the Javapolis conference in Antwerp next Wednesday, November 13, to talk about Refactoring Java and the Top 10 Myths About Java I/O. It's only €150; and looks to be a really fun show. If you're not too far from Antwerp, check it out.) Consequently, updates will be slow to non-existent here for the next week. If you think I planned this just so my book hype would be front page news here for a week, well, OK. You caught me. :-) In the meantime, why don't you check out Processing XML with Java. If you like what you see, you can buy it at Amazon or any other purveyor of fine computer books.
In an unrelated note, elharo.com is down for the time being due to either Covad, Verizon, or Speakeasy problems (whose fault it is exactly has not been determined yet) and may not be fixed until I return late next week. Cafe au Lait and Cafe con Leche are both hosted by the friendly and much more skilled folks at IBiblio and should be fine.
SpaceMapper - DataStore is an open source "document repository server for storing, querying and fetching XML based documents. It is built on practical needs allowing the storage of semi-structured (well formatted, maybe validated, XML, XHTML and HTML) documents and un-structured documents (TXT)." The documents are stored in conventional relational databases such as PostgreSQL, MySQL, or DB2. Space Mapper is written in Java on top of the Avalon Phoenix framework. The documents are managed through BEEP and/or XML-RPC interface using a subset of the Simple Exchange Profile (SEP) protocol.
SpaceMapper includes MN8, "an experimental object oriented scripting
language, tightly integrated with the net, which emulates the concepts
at the core of XML in order to simplify and make as transparent as
possible information extraction and manipulation from the WWW and XML
documents." To give you a bit of the flavor of the language, here's the
first example from the
tutorial that defines a "concept" (an MN8 programming structure that
would probably be called a class in most languages) for a
Person:
# -------- Person.mn8 ---------
define Person label "PersonalPersonDefinition" [
@firstName
@lastName typeof String label "lastName"
email label "Email.Address"
address [
street
city
state
country [
@code
value typeof String
]
]
: getMyPerson [
@firstName = "Remus"
@lastName = "Pereni"
/email = "remus@nolimits.ro"
/address/street = "Some street"
/address/city = "Satu Mare"
/address/state = "no state J"
/address/country@code = "RO"
/address/country/value = "Romania"
]
static : main ( $args typeof Series ) [
$me typeof Person
$me.getMyPerson
print $me.toXML
]
]
MN8 is written in Java and includes concepts for HTML, HTML-Forms, Cookies, RSS, OPML, HTTP, FTP, POP3, SMTP, Jabber, BEEP, XML-RPC, SOAP, MBox.
Jens Låås has released xmlclitools 1.42, four Linux command-line tools for searching, modifying, and formatting XML data. The tools are designed to work in conjunction with standard utilities such as grep, sort, and shell scripts. Version 1.42 adds sorting to xmlmod, new special property names to xmlfmt, and an RPM spec file. All four tools are published under the LGPL.
Opera Software has posted the second beta of Opera 6.0 for MacOS and Mac OS X. This release adds shared library support and enables Java in the classic MacOS, but offers no major new features in the XML space. It still supports direct display of XML with CSS stylesheets. XSL is still missing in action. Opera is normally $39 payware or free-beer adware, but right now, there's a sale so you can buy it for $29.
I've posted version 1.0d8 of XOM, my open source,
tree-based Java API for processing XML that strives for strict
compliance to the XML specs. There are no breaking changes in
this release. The big new feature is that XSLT works, modulo
some obscure bugs in handling the undeclaration of the default
namespace. (I need to get some clarification on the proper
behavior of SAX processors to fix this.) As part of supporting
XSLT, I discovered a need to undeclare the default namespace on
a prefixed element. That is, <pre:name
xmlns:pre="http://www.example.com" xmlns="">. You can
now do this by passing an empty string for both the prefix and
URI to declareNamespace().
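To see what undeclaring the default namespace on a prefixed element means in practice, here is a minimal sketch in Python using the standard library's ElementTree rather than XOM. The surrounding root element, its http://www.example.org/default namespace URI, and the inherits and child elements are all assumptions added for illustration; only the pre:name start tag comes from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: root, inherits, and child are made up for
# illustration; the pre:name start tag is the one from the text.
DOC = """<root xmlns="http://www.example.org/default">
  <inherits/>
  <pre:name xmlns:pre="http://www.example.com" xmlns="">
    <child/>
  </pre:name>
</root>"""

root = ET.fromstring(DOC)

# ElementTree reports each name as {namespace-uri}local-name, or as a
# bare local name when the element is in no namespace.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The unprefixed child element is reported with a bare name, because xmlns="" on pre:name cancels the default namespace it would otherwise inherit from root, while inherits stays in the default namespace.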
I'm travelling quite a bit in November so this is probably the last release until next month. The API, however, is starting to feel quite stable to me. Most of the things on my to do list involve implementation details, testing, documentation, benchmarking, optimization, and the like. One of the places I'm travelling this month is the SD Expo show in Boston. I'll be hosting a Birds of a Feather session there on "What's Wrong with XML APIs and How to Fix Them." I expect this to be more interactive than the XML SIG session in September.
The Apache XML Project has released version 2.4.1 of Xalan-J, an open source XSLT processor written in Java that supports XSLT 1.0 and TrAX. New features in this release include:
Gal Binyamini has released JXV, an open source library that allows Java objects to be given "XML Views", and for those views to be read back into objects. (This strikes me as a little more plausible than the other direction in which you start with an XML document and build a custom Java object around it.) Essentially, this is another variation of object serialization using XML. JXV supports SAX input and output and DOM output. According to Binyamini,
JXV uses a pluggable architecture which allows XML view factories to be configured and loaded at runtime. The JXV configuration mechanisms also leverage XML namespaces to allow the configurations for those different view factories to be inlined within the JXV configuration file. In this release, JXV comes pre-configured with view factories for JavaBeans, collections, array, and "flat objects" such as Strings, primitives, etc. These factories support a wide variety of configuration options, and are sufficient for most object models. Future versions of JXV will include pre-configured support for additional factories. JXV may also release special-purpose factories (such as ones providing views for RowSets, ResultSets and other JDBC structures) as extension packages.
Andrew Watt reports that, "After thinking about the discussions we have had in the last few days about the difficulties of newcomers to XML getting a grasp on the important concepts of XML I have decided to set up a mailing list for newbies called, very imaginatively, 'XMLNewbies'. So if you know people who are new to XML you might want to point them in that direction." You can subscribe by sending an email to XMLNewbies-subscribe@yahoogroups.com.
IBM's alphaWorks has released the Multimodal Browser Extension, a plug-in that allows Internet Explorer to render multimodal applications written according to the W3C XHTML+Voice (X+V) note. "This technology, which includes IBM's automatic speech recognition and text-to-speech engines, allows testing of voice-enabled Web applications written in the X+V language." Windows 2000 or XP is required.
Sun's posted a beta of the Java Architecture for XML Binding 1.0 (JAXB) on the Java Developer Connection (registration required). JAXB compiles an XML schema into one or more Java classes. (First mistake: JAXB assumes there's a schema. Second mistake: It assumes the schema is written in the W3C XML Schema Language. Third mistake: It assumes documents actually adhere to the schema.) JAXB can unmarshal schema-valid XML into Java objects; read, update, and validate the Java objects against the schema; and write the result back out as XML.
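For readers who haven't seen data binding before, here is a rough sketch of the unmarshal, update, and marshal cycle in Python. This is not JAXB's API, and nothing here is generated from a schema. The Person class, the person, firstName, and lastName element names, and the unmarshal and marshal functions are all hypothetical stand-ins written by hand. Note what happens to a document that doesn't match the expected structure.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Person:
    """Hand-written stand-in for a class a binding compiler might generate."""
    first_name: str
    last_name: str


def unmarshal(xml_text: str) -> Person:
    """Build a Person from XML, failing if the expected structure is absent."""
    root = ET.fromstring(xml_text)
    first = root.findtext("firstName")
    last = root.findtext("lastName")
    if root.tag != "person" or first is None or last is None:
        raise ValueError("document does not match the expected structure")
    return Person(first_name=first, last_name=last)


def marshal(person: Person) -> str:
    """Write a Person back out as an XML string."""
    root = ET.Element("person")
    ET.SubElement(root, "firstName").text = person.first_name
    ET.SubElement(root, "lastName").text = person.last_name
    return ET.tostring(root, encoding="unicode")


# Unmarshal, change a field on the object, and marshal the result.
person = unmarshal(
    "<person><firstName>Ada</firstName><lastName>Lovelace</lastName></person>"
)
person.last_name = "Byron"
print(marshal(person))
```

A document with a missing or unexpected element never becomes an object at all in this sketch, which is the practical consequence of assuming that documents adhere to the schema.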
The Mozilla Project has released version 0.4 (Oceano) of Phoenix, a light-weight browser for Windows and Linux based on Mozilla's Gecko engine. It supports all the yummy XML features, but doesn't include the e-mail program, news reader, or nose hair trimmer. Phoenix differs from similar efforts like Galeon in that it's based on XUL and is designed for cross-platform release on Linux and Windows. (Mac OS X users should check out Chimera instead.) Improvements in 0.4 include themes support, type ahead find, better pop-up blocking, toolbar customization, and tabbed browsing, as well as assorted bug fixes.
The W3C Web Services Description Working Group has posted the Last Call Working Draft of Web Service Description Requirements. According to the W3C web page, this "document describes definitions and requirements for specifying application to application communication." Comments are due by the end of the year.
Lucid'i.t. has released version 1.1 of their Lucid XML Editor, a web-based document editor for Windows and Internet Explorer. I think this is free-beer.
Al Byers has released AG101 0.3.2.2, an open source, visual XSL editor/debugger written in Java. It is based on the Pollo XML editor. AG101 allows the user to visually set breakpoints in the XSL code and step through the source code. At breakpoints the values of variables, selects, etc. can be inspected.
Xerlin 1.2.1 is an open source XML Editor written in Java. Users can extend the application via custom editor interfaces for specific DTDs. Java 1.2 or later is required.
Simon St. Laurent has published The XPointer xpath1() Scheme, an
IETF Internet-Draft that defines an xpath1() scheme
for use inside the W3C's XPointer Framework. In essence this is
the same as the existing xpointer() scheme after
subtracting points and ranges. For example, today's news on Cafe
con Leche could be identified as
http://www.cafeconleche.org/#xpath1(//today). The
news for Monday, October 28, could be
http://www.cafeconleche.org/#xpath1(//*%5B@id='news2002October28'%5D).
This is nothing that can't be
done now with the xpointer() scheme, but it is a
lot simpler to implement without the points or ranges.
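As a rough illustration of how such a pointer is put together and taken apart, here is a sketch in Python. The function names make_xpath1_url and read_xpath1_url and the sample PAGE document are hypothetical. The sketch only percent-escapes the square brackets, as in the URLs above; it does not attempt the XPointer Framework's own escaping rules, and it evaluates the expression with ElementTree's limited XPath subset, not a full XPath 1.0 engine.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote, urldefrag


def make_xpath1_url(base: str, xpath: str) -> str:
    """Percent-escape an XPath expression and wrap it in xpath1(...)."""
    # Square brackets become %5B and %5D; the characters listed in `safe`
    # are left as they are.
    return base + "#xpath1(" + quote(xpath, safe="/*@='") + ")"


def read_xpath1_url(url: str) -> str:
    """Recover the XPath expression from a URL built by make_xpath1_url."""
    fragment = urldefrag(url).fragment
    match = re.fullmatch(r"xpath1\((.*)\)", fragment)
    if match is None:
        raise ValueError("not an xpath1() pointer")
    return unquote(match.group(1))


# Hypothetical stand-in for the page the pointer refers to.
PAGE = """<html><body>
<div id="news2002October28"><p>Monday's news</p></div>
<div id="news2002October29"><p>Tuesday's news</p></div>
</body></html>"""

url = make_xpath1_url("http://www.cafeconleche.org/", "//*[@id='news2002October28']")
xpath = read_xpath1_url(url)

# ElementTree only accepts relative paths, so prefix the expression with ".".
matches = ET.fromstring(PAGE).findall("." + xpath)
print(url)
print([element.get("id") for element in matches])
```

With no points or ranges to worry about, the whole job reduces to unescaping a string and handing it to whatever XPath evaluator is available.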
Open Wave has released version 6.1 of the OpenWave Software Development Kit, a Windows cell phone emulator for WAP 2.0 phones based on XHTML and CSS. For older phones, it also supports Wireless Markup Language (WML) 1.1, WML 1.3 with GUI Extensions for M-Services, WMLScript, WAP Push, cHTML, and HDML 3.0.
Johannes Dobler's released version 1.2.9 of jd.xslt, an open
source XSLT processor written in Java that supports most of the
now defunct XSLT 1.1 working draft. This release fixes tail
recursion and adds support for fragment identifiers in URIs for
the document() function.
Michael Fuchs has posted version 0.3.1 of his DocBook Doclet that creates DocBook SGML and XML documents from JavaDoc. This release uses graphviz 1.8.9 to generate UML class diagrams.
Max Kellermann's LeanEdit 1.8.6 is an open source XML editor written in Java. Judging by the screen shots it appears to be form-based, and needs to be customized for different DTDs and schemas. LeanEdit is published under the GPL.
Pekka Enberg's posted version 0.1.9 of XML Indent, an open source (GPL) "XML stream reformatter written in ANSI C" that "is analogous to GNU indent." This release fixes some bugs.
The W3C Web Ontology Working Group has published the first public Working Draft of Web Ontology Language (OWL) Test Cases. "The draft illustrates correct OWL usage, the formal meaning of OWL constructs, and resolution of issues considered by the Web Ontology Working Group. OWL is used to publish and share sets of terms called ontologies, providing accurate Web search, intelligent software agents, and knowledge management."
The W3C Device Independence Working Group has posted the first public Working Draft of Authoring Challenges for Device Independence. "The document provides a discussion of several challenges that web site authors commonly face when making content and applications available to users with devices of various capabilities. The document examines the effects on authors and the implications for authoring techniques that assist in the preparation of sites that can support a wide variety of devices."
The W3C CSS working group published three new working drafts of CSS3 modules:
This document proposes a set of CSS properties for Ruby text
used in Japanese to annotate other text, often for purposes of
pronunciation. These properties include
ruby-position, ruby-align,
ruby-overhang, and ruby-span. This
draft is in last call. Comments are due by November 27.
This document describes the basic text formatting properties for
CSS3 including writing-mode,
direction, glyph-orientation-vertical,
glyph-orientation-horizontal,
unicode-bidi, text-script,
text-align, text-justify,
text-align-last, min-font-size,
max-font-size, text-justify-trim,
text-kashida-space, text-indent,
line-break, word-break-CJK,
word-break-inside, word-break,
wrap-option, linefeed-treatment,
white-space-treatment,
all-space-treatment, white-space,
text-overflow-mode,
text-overflow-ellipsis, text-overflow,
letter-spacing, word-spacing,
punctuation-trim, text-autospace,
kerning-mode, kerning-pair-threshold,
text-underline-style,
text-line-through-style,
text-overline-style,
text-underline-color,
text-line-through-color,
text-overline-color,
text-underline-mode,
text-line-through-mode,
text-overline-mode,
text-underline-position, text-blink,
text-underline, text-line-through,
text-overline, text-decoration,
text-shadow, line-grid-mode,
line-grid-progression, line-grid,
text-transform, hanging-punctuation, and
text-combine. Many of these should be familiar from
CSS2. The new ones mostly address the needs of East Asian and
bidirectional text. This draft is also in last call.
Comments are due by November 27.
According to the abstract:
CSS (Cascading Style Sheets) describe the rendering of documents on various media. When textual documents (e.g., HTML, WML) are laid out on visual media (e.g., screen, paper), CSS represents the elements of the document by rectangular boxes that are laid out one after the other or nested inside each other in an ordering that is called a flow. This module describes the characteristics of the flow and of the various kinds of boxes.
The flow includes "floating" boxes, but tables [CSS3TBL] and "absolute" and "fixed" positioning [CSS3POS] are described in other modules. Also, the rules for partitioning a flow into pages (for paged media) is described elsewhere [CSS3PAGE], as are the special boxes for ruby annotations [CSS3RUBY] and the multicolumn layouts [CSS3COL].
The box model builds on the inline text modules ([CSS3TEXT] and [CSS3LINE]), that describe how text is laid out on a line, including treatment of superscripts, bidirectional ("bidi") and vertical text.
The flow can be horizontal (typical for most languages), but in level 3 of CSS, flows can also be vertical (typical for the Uighur script and often used for ideographic scripts).
The W3C Voice Browser Working Group has published the second working draft of Voice Browser Call Control: CCXML Version 1.0. According to the spec abstract, "CCXML is designed to provide telephony call control support for VoiceXML or other dialog systems. CCXML has been designed to complement and integrate with a VoiceXML system. Because of this you will find many references to VoiceXML's capabilities and limitations. You will also find details on how VoiceXML and CCXML can be integrated. However it should be noted that the two languages are separate and are not required in an implementation of either language. For example CCXML could be integrated with a more traditional IVR system and VoiceXML or other dialog systems could be integrated with some other call control systems."
TM4J 0.7.1 has been released. This is an open source topic map processing toolkit for Java as well as a set of topic map processing tools. Topic maps are an ISO standard for the interchange of information structures which can be used to represent ontologies, business data and processes, individual knowledge and opinions, and more. This engine processes files conforming to the XML Topic Maps (XTM) specification and stores them either in memory or in a persistent store, providing access via a Java API. This is a bug fix release.
The W3C DOM Working Group has a new working draft of Document Object Model (DOM) Level 3 Core Specification. The biggest new feature in this release seems to be support for providing type information for attributes and elements. DTD types are provided for attributes. Schema types are provided for both elements and attributes. Unlike the recently killed abstract schemas effort, the approach taken (just provide a type name and URI for each node) seems much more extensible and much less tied to particular schema languages. I think this is a clear case of doing something better by doing less.
Jens Låås has released xmlclitools 1.41, four Linux command-line tools for searching, modifying, and formatting XML data. The tools are designed to work in conjunction with standard utilities such as grep, sort, and shell scripts. Version 1.41 adds wildcard matching for xmlgrep, whitespace stripping from the ends of output strings, Makefile improvements, and some basic manpages. All four tools are published under the LGPL.
Version 1.1.2 of the XmlPull API has been released. I'll be talking about this (and other pull APIs for XML parsing)
at SD2002 East in Boston in November.
The major improvement in this release is that the XML declaration is no longer treated as a processing instruction.
Version 1.1.2 also improves Java 2 Micro Edition (J2ME)
compatibility and enhances XmlSerializer.
The BulTreeBank Project has released an XPath Implementation Engine for Java. This is free-beer for non-commercial use.
The sixth beta of Luxor, a GPL'd XML User Interface Language (XUL) toolkit for Java, has been posted. Luxor includes a web server, a portal engine that supports RSS, the Velocity template engine, a Python interpreter, and more. Beta 6 adds:
The W3C User Agent Accessibility Guidelines Working Group has published the proposed recommendation of User Agent Accessibility Guidelines 1.0. According to the abstract, "This document provides guidelines for designing user agents that lower barriers to Web accessibility for people with disabilities (visual, hearing, physical, cognitive, and neurological). User agents include HTML browsers and other types of software that retrieve and render Web content. A user agent that conforms to these guidelines will promote accessibility through its own user interface and through other internal facilities, including its ability to communicate with other technologies (especially assistive technologies). Furthermore, all users, not just users with disabilities, are expected to find conforming user agents to be more usable."
IBM's alphaWorks has released the XML Wrapper Generator, a graphical tool that integrates XML data sources into a DB2 database. The tool loads XML schema files, "shreds" them to a relational schema, and generates appropriate NICKNAME and VIEW statements.
alphaWorks has also released WSDL Explorer, a Windows application that displays Web Services Description Language (WSDL) documents, "generates views of operations, allows invocation of operations, and allows viewing of sample message flow."
Daniel Veillard's released version 1.0.22 of libxslt, the GNOME XSLT library, and version 2.4.26 of libxml2, the GNOME XML parser for Linux. The new version of libxslt updates the Windows makefiles, adds a security module, supports a few new options to xsltproc, adds a per transformation error handler, and fixes a few bugs. The new version of libxml2 works better with Windows CE and fixes some bugs with validation for both DTDs and schemas.
The Mozilla Project has posted the first beta of Mozilla 1.2, an open source web browser that supports XML, simple XLinks, MathML, CSS, XSLT, XHTML, XUL, SVG, and many other cool acronyms. Most importantly it lets you turn off pop-up ads and block web bugs and cookies in a sensible way. (IE claims to let you manage your cookies, but it only works about 80%. Mozilla's cookie management is much smoother.) New features in this beta include:
Most importantly for me personally, this release finally fixes a long standing AppleScript bug that prevented me from switching over to Mozilla on the Mac. I can finally remove IE from my work chain completely. Update: I spoke too soon. That bug is indeed fixed, but as often happens in software development, fixing one bug reveals another. This one isn't as bad as the last one—it only affects the quote of the day, not the recommended reading—but it still means Mozilla can't do quite everything I need it to do.
Other features added since 1.1 include "Type Ahead Find" and a pretty printed raw XML view, like that found in Internet Explorer. XML pretty printing is only available in the .zip distribution and is turned off by default because it affects the DOM for unstyled XML-pages. To turn it on, add user_pref("layout.xml.prettyprint", true); to your user.js file.
However, Mozilla is still a fairly large and monolithic web browser/e-mail program/news reader/chat client/application platform/child minder/dog washer/nose hair trimmer and probably always will be. If you'd like to try a leaner, meaner browser-only application, you should check out the recently released Phoenix 0.3 instead. This browser is based on Mozilla's Gecko engine so it supports all the yummy XML features, but doesn't include the e-mail program, news reader, or nose hair trimmer. Phoenix differs from similar efforts like Galeon in that it's based on XUL and is designed for cross-platform release on Linux and Windows. (Mac OS X users should check out Chimera instead.)
The W3C Web Services Architecture Working Group has published the third public working draft of Web Services Architecture Requirements. According to the abstract,
The use of Web services on the World Wide Web is expanding rapidly as the need for application-to-application communication and interoperability grows. These services provide a standard means of communication among different software applications involved in presenting dynamic context-driven information to the user. In order to promote interoperability and extensibility among these applications, as well as to allow them to be combined in order to perform more complex operations, a standard reference architecture is needed. The Web Services Architecture Working Group at W3C is tasked with producing this reference architecture.
This document describes a set of requirements for a standard reference architecture for Web services developed by the Web Services Architecture Working Group. These requirements are intended to guide the development of the reference architecture and provide a set of measurable constraints on Web services implementations by which conformance can be determined.
My favorite part of this document is that it actually defines what the heck a web service is:
Definition: A Web service is a software application identified by a URI, whose interfaces and bindings are capable of being defined, described, and discovered as XML artifacts. A Web service supports direct interactions with other software agents using XML based messages exchanged via internet-based protocols.
In the past, I've noticed that how a web service is defined often depends on what a vendor is trying to sell me. Notably absent from this definition is any requirement to use HTTP, SOAP, WSDL, UDDI, or similar FLAs (four-letter acronyms).
On a related note for all the developers who keep asking me to announce their web services products (and you are legion): if you want your product announced here you need to be able to explain in one paragraph what the product is and what it does. (Also required: what platforms it runs on and what it costs, though those requirements don't seem to be causing people as much trouble as explaining what their products actually do.) Do not use adjectives or adverbs, especially comparatives or superlatives (better, faster, more robust, efficiently, etc.). Do not define your product by comparison to some other product since I probably don't know what that product does either. Do not use buzzwords like "web services" or "application server". (I once walked across an Internet World show floor asking each and every booth selling an "application server" what an application server was. Most frequent answer: "If you come back later, the right person to answer your question will be here.") Explain in plain language what your product does and why a developer might need such a thing.
An example of what not to send me, adapted from a recent e-mail (names changed to protect the guilty, and because they're hardly the only group in this space that can't seem to explain what they're doing):
Cherokee Allies 1.0, the Open Source product of the XML Cherokee Group is now released. Allies, which Cherokee's John Doe likes to call "Cherokee SOAP 3.0", is quite a bit more powerful than Cherokee SOAP 2.0. Like SOAP 2.0, Allies supports the latest SOAP 1.1 spec. However Allies also supports WSDL 1.1. Allies includes implementations of both the JAX-RPC and SOAP API with Attachments for Java (SAAJ) specifications. Importantly, Allies is making some significant contributions with highly useful features that promise to improve interoperability and capabilities of future Web services as a whole.
An example of what to do:
Cherokee Allies 1.0 is a generic server written in Java that communicates with remote clients by sending and receiving XML documents over HTTP. These documents adhere to the SOAP 1.1, JAX-RPC, and SOAP API with Attachments for Java (SAAJ) specifications. Out of the box, Allies doesn't do much of anything. Sites customize Allies by writing small Java programs called "foolets" that respond to particular kinds of SOAP messages. The foolets are written in Java, and can do essentially anything a Java program running on that server can do; for example, talk to a database with JDBC, read a file, send data to a printer connected to the parallel port, invert a matrix, etc. The results of the foolet's work are then transmitted back to the requesting client as another XML document. The messages the server understands and responds to are described by a WSDL document that client programmers can retrieve and inspect. Allies handles all the generic services involved in sending and receiving HTTP, marshalling and unmarshalling arguments to XML documents, and, optionally, supporting transactions. The foolet programmer can focus on the unique local logic of their system. Client programs can be written in any language capable of generating and receiving XML documents over HTTP. Allies is published under the Cherokee license.
From the beast that wouldn't die department, the W3C XML Core Working Group has brought forth the candidate recommendation of XML 1.1; and, surprise, surprise, it's even worse than the last draft. This release has a few big new features:
C0 control characters such as form feed, vertical tab, BEL, and DC1 through DC4 (whatever those are) are now allowed in XML text. However, they must be escaped as character references. They cannot be included literally in data. Nulls, thankfully, are still forbidden.
The C1 control characters such as BPH, IND, NBH, and PU1 are no longer allowed as literals in XML text. They too must now be escaped as character references. For the first time this means that some well-formed XML 1.0 documents are not well-formed XML 1.1 documents. The exception, of course, is IBM's holy grail of NEL, which will be allowed in literal XML text, just to make life difficult for every text editor on the planet except those from IBM mainframes.
Unicode character normalization should be performed on XML documents, unless you don't feel like it, in which case you can ignore it. This almost makes sense. Basically it says that parsers may report an e followed by a combining accent acute instead of the single character é as an error of unspecified type if they want to or the client asks for it. The details are quite complicated, but at least it's optional. However, I still worry that this is a source of interoperability problems, especially when it comes to names of elements and attributes. For instance, a normalizing validator might accept documents a non-normalizing validator would reject.
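To make the normalization and control character points concrete, here is a small sketch in Python. Python's standard library only ships an XML 1.0 parser, so the last two checks show the XML 1.0 behavior that the XML 1.1 rules above change; they don't demonstrate XML 1.1 itself. The helper name accepted_by_xml10_parser and the sample documents are made up for illustration.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Two spellings of the same visible text.
composed = "caf\u00e9"      # é as the single character U+00E9
decomposed = "cafe\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# Compared code point by code point, the two strings are different names.
same_code_points = composed == decomposed

# After normalization to NFC they are identical.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed


def accepted_by_xml10_parser(document: str) -> bool:
    """Report whether the XML 1.0 parser behind ElementTree accepts document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A form feed (a C0 control) written as a character reference.
form_feed_reference = accepted_by_xml10_parser("<doc>&#x0C;</doc>")

# IND (U+0084, a C1 control) included literally in the text.
literal_c1_control = accepted_by_xml10_parser("<doc>\u0084</doc>")

print(same_code_points, same_after_nfc)
print(form_feed_reference, literal_c1_control)
```

A normalizing processor would treat composed and decomposed as the same name, while a non-normalizing one would not, which is where the interoperability worry comes from. The last two results are the XML 1.0 side of the control character changes: the form feed reference is rejected and the literal C1 control is accepted.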
And of course all the other problems previous drafts have had are still present. I've already calumniated these sufficiently in the past. Let me just reprint my criticisms now. What follows was originally posted on June 21, 2001. Regrettably, it's just as relevant today:
This is a proposal for a new backwards incompatible version of XML. The specific goal is to address some shortcomings of the XML 1.0 character model relative to Unicode 3.1, as well as throwing a sop to IBM.
The concern with respect to IBM is that one of the world's largest corporations, with thousands of patents, legions of programmers, billions of dollars in revenue, and resources pouring out of every orifice is somehow unable to handle documents where lines end with carriage returns and line feeds, as documents do on every non-IBM system on the planet. The only reason there's a problem here at all is because IBM tried to go it alone as a monopoly and set standards by fiat for years rather than working with the rest of the industry. Consequently their mainframe character sets don't really interoperate well with everybody else's character sets. In XML this arises as a problem with line endings when someone edits an XML document with an IBM mainframe text editor. IBM mostly grew out of their anti-competitive monopolistic tendencies over the last thirty years (with a large dose of assistance from the U.S. government). However, there are still some legacy issues relating to their attempt to dictate standards to the rest of the industry, and this is one of them. Now rather than fixing their own broken mainframe text editing software, they want everyone else on the planet to change their software so IBM doesn't have to. (If this reminds anybody of the current mess with Oracle and UTF-8, you're not alone.) This proposal was laughed out of the W3C a few months ago when IBM made it, or at least it seemed to be. However, it's now risen from the dead as part of XML Blueberry; but it doesn't make any more sense now than it did then; and it still deserves to be laughed off the table with whooping cries of derision.
The second proposal for breaking backwards compatibility with existing parsers is much more serious, and requires a more thoughtful response. Starting in Unicode 3.0 a number of new characters have been added both for new scripts that were previously unencoded such as Amharic and Cherokee as well as for old scripts that were incomplete such as Chinese. The concern is that since XML 1.0 is based on Unicode 2.0, "fully native-language XML markup is not possible in at least the following languages: Amharic, Burmese, Canadian aboriginal languages, Cantonese (Bopomofo script), Cherokee, Dhivehi, Khmer, Mongolian (traditional script), Oromo, Syriac, Tigre, Yi. In addition, Chinese, Japanese, Korean (Hangul script), and Vietnamese can make use of only a limited subset of their complete character repertoires."
If this were true, it would be a very serious criticism of XML 1.0. Fortunately, however, the claim is not nearly as dire as the proposal makes out. Indeed the proposal substantially overstates the need for any changes. The XML 1.0 BNF productions do not allow these newly defined characters to be used in element, attribute, and entity names. However, they can be used in the text of element content and attribute values. This means that XML is fully adequate for literature and data in Amharic, Burmese, Canadian aboriginal languages, Cantonese, Cherokee, Dhivehi, Khmer, Mongolian, Oromo, Syriac, Tigre, Yi, Mandarin, Japanese, Korean, and Vietnamese. Only the markup, that is, the tags, would have to be written in another script. Given that there aren't even localized operating systems in m