2009 XML News

Wednesday, December 30, 2009 (Permalink)

The W3C XQuery working group has published updated working drafts for XQuery 1.1 including:

New requirements in 1.1 include:

  • Value-based grouping
  • Error processing and recovery
  • Numeric formatting
  • Date formatting
  • Default values for external variables
  • Non-deterministic external functions
  • Positional grouping
  • Windowing
  • Output numbering
  • Dynamic creation of namespace bindings
  • Declaration of serialization parameters
  • Node references
  • Outer for
  • Type declaration for context item
  • Support for new data types
  • Higher order functions
  • Conditional constructors
  • Validation
  • Type expressions
  • Module composition restrictions
  • Invocation of external functionality
  • Static typing extensions

Data model changes include adding the W3C Schema 1.1 dateTimeStamp datatype and a function item datatype to support Higher Order Functions.

Thursday, December 24, 2009 (Permalink)

Todd Ditchendorf has released Cocoatron 1.0, "a collection of Automator actions for graphically creating XML Processing Pipelines on Mac OS X." Cocoatron consists of six Automator Actions (composable, configurable pipeline stages):

  1. Load XML Documents
  2. Process XIncludes of XML Documents
  3. Validate XML Documents (DTD, XSD, RNG)
  4. Transform XML Documents (XSLT)
  5. Query XML Documents (XQuery)
  6. Serialize XML Documents

Cocoatron is based on libxml2 and supports XInclude, XSLT 1.0, EXSLT, XQuery, DTDs, W3C XML Schemas, and RELAX NG.
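
The idea of composable pipeline stages is easy to sketch outside Automator. What follows is a minimal Python sketch, not Cocoatron's implementation: it uses the standard library's ElementTree rather than libxml2, the stage names (`load`, `select_titles`, `serialize`, `pipeline`) are hypothetical, and ElementTree's small XPath subset merely stands in for a real XQuery stage.

```python
from functools import reduce
import xml.etree.ElementTree as ET


def load(xml_text):
    """Load stage: parse a string into an element tree."""
    return ET.fromstring(xml_text)


def select_titles(root):
    """Query stage: copy every <title> descendant into a new <titles> element.

    ElementTree supports only a small XPath subset; this is a stand-in for
    an XQuery stage, not an equivalent of one.
    """
    result = ET.Element("titles")
    for title in root.findall(".//title"):
        ET.SubElement(result, "title").text = title.text
    return result


def serialize(root):
    """Serialize stage: turn the tree back into text."""
    return ET.tostring(root, encoding="unicode")


def pipeline(*stages):
    """Compose stages left to right; each stage consumes the previous output."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


run = pipeline(load, select_titles, serialize)
doc = ("<feed><entry><title>XOM 1.2.4</title></entry>"
       "<entry><title>BBEdit 9.3.1</title></entry></feed>")
print(run(doc))
```

Because every stage takes one value and returns one value, stages can be reordered or dropped without touching the others, which is roughly what dragging Automator actions around amounts to.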

Wednesday, December 23, 2009 (Permalink)

Bare Bones Software has released version 9.3.1 of BBEdit, my preferred text editor on the Mac, my favorite XML editor on any platform, and what I'm using to type these very words. This release fixes bugs including some issues in handling UTF-8 with byte-order marks.

Tuesday, December 22, 2009 (Permalink)

The Eclipse Project has released PsychoPath 1.1M4, an XPath 2.0 Schema Aware engine written in Java that is conformant to 99.8% of the W3C XPath 2.0 test suite and improving.

Monday, December 21, 2009 (Permalink)

Automattic has released WordPress 2.9.0, an open source (GPL) blog engine based on PHP and MySQL. New features in 2.9 include:

  • Global undo/trash feature and no more "are you sure" messages when deleting. Apple's known since 1984 that Undo beats confirm every time. Nice to see the rest of the world finally getting with the program.
  • Built-in image editor allows you to crop, edit, rotate, flip, and scale your images
  • Batch plugin update and compatibility checking, which means you can update 10 plugins at once, versus having to do multiple clicks for each one
  • Easier video embeds

I suspect I'll wait for the inevitable 2.9.1 release before upgrading.

Friday, December 18, 2009 (Permalink)

SyncroSoft has released <Oxygen/> 11.1, a $349 payware XML editor written in Java. Oxygen supports XML, XSL, DTDs, XQuery, SVG, Relax NG, Schematron, and the W3C XML Schema Language. According to the announcement:

The visual XML authoring now uses schema information to provide intelligent editing actions that help keep the document valid and provide a better editing experience. The new compact representation of tags and the quick up/down navigation features improve the ergonomics and the usability. <oXygen/> XML Editor can use any XQJ compliant XQuery processor for XQuery transformations, different error levels and external references can be specified for Schematron messages and the XProc support was improved with better editing and execution. The XML format and indent operation can use DTD/schema information to provide better formatting and the find and replace is now XML-aware and can accept XPath filtering to delimit the search scope.

Some of the changes, including schema-aware line wrapping and multiple tags per line, sound like they may actually make this a better text editor, though I'll have to try it out to be sure.

Thursday, December 17, 2009 (Permalink)

The Apache XML Project has released XML Commons External Components 1.4.01.

xml-commons provides an Apache-hosted set of DOM, SAX, and JAXP interfaces for use in other xml-based projects. Our hope is that we can standardize on both a common version and packaging scheme for these critical XML standards interfaces to make the lives of both our developers and users easier.

The External Components portion of xml-commons contains interfaces that are defined by external standards organizations. For DOM, that's the W3C; for SAX it's David Megginson and sax.sourceforge.net; for JAXP it's Sun. While we could send users to each of the primary sources for these deliverables, keeping our own versions of these in the xml-commons repository gives us a number of advantages:

  • Simplicity of downloads: users get the whole product from one place.
  • Better version control: we can only take fixes we want, and add Apache-specific changes.
  • Better overview documentation of how these interfaces fit into the XML processing world.
  • More chance for cross-project community building within Apache projects.

New features in this release include:

  • Classes and method signatures introduced by JAXP 1.4.
  • The StAX 1.0 API (JSR 173).
  • The Java Language Binding for the W3C DOM Element Traversal API (org.w3c.dom.ElementTraversal).
  • Default implementations to SchemaFactory for the XML Schema 1.0 and 1.1 language identifiers.

Various bugs are fixed as well. This will all be bundled with the next version of Xerces in a couple of weeks.

They've also posted bug fix releases of XML Commons External 1.3.05 and XML Commons External 1.2.06 for older versions of Xerces and the JDK without all the new APIs.

Wednesday, December 16, 2009 (Permalink)

Just a few days left to submit papers for XML Prague 2010. Topics include:

  • XML lifecycle (diffing, merging, change tracking, etc.)
  • Efficiency and performance in XML (verbosity, processing, overuse)
  • Hypermedia in XML (SMIL, SVG animations)
  • Spatial data and XML (WGS84, microformats)
  • XML all the time (XRX, XQuery web applications)

Initial abstracts/papers are due by December 21. I won't make Prague this year, though just maybe I'll get to Montreal. Since I'm working a day job right now, it's harder to just skip town for a fun conference than when I was writing full time; and I'm trying to save most of my vacation for birding expeditions. (I want to visit Puerto Rico, Southeastern Arizona, and the BugGuide Gathering next year.) I wish I could go though. From all reports, it's a fun time for markup geeks.

Tuesday, December 15, 2009 (Permalink)

The W3C XML Core Working Group has released Namespaces in XML 1.0 (Third Edition). "There are several editorial changes, including a number of terminology changes and additions intended to produce greater consistency. The non-normative appendix 'The Internal Structure of XML Namespaces' has been removed. The BNF has been adjusted to interconnect properly with all editions of XML 1.0, including the fifth edition."

Monday, December 14, 2009 (Permalink)

I've released XOM 1.2.4, my Java library for processing XML. The major change in this release is a hopefully user-transparent one-button release process. There's one small optimization in XPath evaluation, and the maven POM that was broken in 1.2.3 should now be fixed.

Thursday, December 10, 2009 (Permalink)

The W3C XML Core Working Group has posted a proposed edited recommendation of Associating Style Sheets with XML Documents 1.0, i.e. the <?xml-stylesheet?> spec. "This second edition incorporates all known errata as of the publication date, clarifies several areas left unspecified in the earlier edition, and has been restructured to allow other specifications to reuse the rules for parsing pseudo-attributes from a string. This edition, once it becomes a Recommendation, will supersede the previous edition of 29 June 1999." After a quick read, this looks like an improvement over the existing spec though a lot of details are still left unsaid.
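
To make "parsing pseudo-attributes from a string" concrete, here is a rough Python sketch. It is a simplification and not the spec's grammar: the function name `parse_pseudo_attributes` and the regular expression are my own, it does not expand character references or predefined entity references, and it silently skips anything malformed instead of reporting an error.

```python
import re

# Simplified pattern: a name, optional whitespace around "=", then a value
# in double or single quotes.
_PSEUDO_ATTR = re.compile(
    r"""([A-Za-z_][\w.\-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')"""
)


def parse_pseudo_attributes(data):
    """Return the pseudo-attributes found in a processing instruction's data."""
    attrs = {}
    for match in _PSEUDO_ATTR.finditer(data):
        name = match.group(1)
        # Exactly one of the two quoted groups participates in a match.
        value = match.group(2) if match.group(2) is not None else match.group(3)
        attrs[name] = value
    return attrs


pi_data = 'href="style.css" type="text/css" title=\'Plain\''
print(parse_pseudo_attributes(pi_data))
```

The input is the data of the processing instruction, that is, everything between the `xml-stylesheet` target and the closing `?>`.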

Tuesday, December 8, 2009 (Permalink)

I've posted a beta of XOM 1.2.4, my Java library for processing XML. The major change in this release is a hopefully user-transparent one-button release process. I'd appreciate it if someone could check it out and make sure nothing's broken. There's one small optimization in XPath evaluation, and the maven POM that was broken in 1.2.3 should now be fixed, though again I need external verification of that before release. If anyone using XOM could plug in the new jars into their system and let me know if they work, I'd greatly appreciate it. Thanks.

Monday, December 7, 2009 (Permalink)

XMLmind has released the XMLmind FO Converter 4.3.2, a $200 payware XSL-Formatting Objects to RTF/OOXML/OpenDoc converter written in Java. This release can now "convert DITA 1.1 documents to production-quality XHTML 1.0, XHTML 1.1, HTML 4.1, Java[tm] Help, HTML Help, Eclipse Help, PDF, PostScript®, RTF (can be opened in Word 2000+), WordprocessingML (can be opened in Word 2003+), Office Open XML (.docx, can be opened in Word 2007+), OpenOffice (.odt, can be opened in OpenOffice.org 2+)."

Thursday, December 3, 2009 (Permalink)

The W3C XML Schema Working Group has sent XML Schema 1.1 Part 1: Structures and XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes back to last call. According to the data types draft,

The major changes since version 1.0 include:

  • Support for XML 1.1 has been added. It is now implementation defined whether datatypes dependent on definitions in [XML] and [Namespaces in XML] use the definitions as found in version 1.1 or version 1.0 of those specifications.
  • A new primitive decimal type has been defined, which retains information about the precision of the value. This type is aligned with the floating-point decimal types which are included in [IEEE 754-2008].
  • In order to align this specification with those being prepared by the XSL and XML Query Working Groups, a new datatype named anyAtomicType which serves as the base type definition for all primitive atomic datatypes has been introduced.
  • The conceptual model of the date- and time-related types has been defined more formally.
  • A more formal treatment of the fundamental facets of the primitive datatypes has been adopted.
  • More formal definitions of the lexical space of most types have been provided, with detailed descriptions of the mappings from lexical representation to value and from value to ·canonical representation·.
  • The validation rule Datatype Valid (§4.1.4) has been recast in more declarative form. A paraphrase of the constraint in procedural terms, which corrects some errors in the previous versions of this document, has been added as a note.
  • The rules governing partial implementations of infinite datatypes have been clarified.
  • Various changes have been made in order to align the relevant parts of this specification more closely with other relevant specifications, including especially the corresponding sections of [XSD 1.1 Part 1: Structures].
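
The notion of a decimal type that "retains information about the precision of the value" can be illustrated by analogy with Python's standard decimal module, which also keeps trailing zeros. This is only an analogy under that assumption; it is not the XSD datatype, and the lexical and canonical forms defined in the draft may differ.

```python
from decimal import Decimal

a = Decimal("1.0")
b = Decimal("1.00")

# The two values compare equal numerically...
print(a == b)

# ...but each remembers how many fractional digits it was written with.
print(str(a), str(b))
print(a.as_tuple().exponent, b.as_tuple().exponent)

# Addition keeps the finer of the two precisions.
print(a + b)
```

A plain binary floating-point type would collapse both literals to the same value with no record of the original precision.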

Friday, November 27, 2009 (Permalink)

The Mozilla Project has posted the fourth beta of Firefox 3.6 for Mac OS X, Linux, and Windows. The most interesting feature in this release is a new plug-in that allows you to reenable other plugins that claim to be incompatible with this version. More often than not, they're just marked conservatively and work just fine with the new 3.6.

Monday, November 16, 2009 (Permalink)

Wolfgang Meier et al have released eXist-db 1.4, an open source XQuery database:

eXist-db supports many (web) technology standards making it an excellent platform for developing web based applications:

The 1.4 version adds a new full text index based on Apache Lucene, a lightweight URL rewriting and MVC framework as well as support for XProc. Most important, the XQuery engine has seen a major redesign, resulting in improved performance.

eXist-db is highly compliant with the XQuery standard (current XQTS score is 99.4%). The query engine is extensible and features a large collection of XQuery Function Modules.

eXist is probably the most popular pure XML open source (LGPL) native XML database. I've been wanting to spend some serious time with this product for a while now, but haven't had the opportunity. From what I hear, these days it scales to a few hundred megabytes of content. Perhaps this release can extend the range of scalability.

Friday, November 13, 2009 (Permalink)

The OpenOffice Project has released OpenOffice 3.1.1, an open source office suite for Linux, Solaris, and Windows that saves all its files as zipped XML and uses XForms. This release fixes assorted bugs, including several security issues. All users should upgrade.

Thursday, November 12, 2009 (Permalink)

Bare Bones Software has released version 9.3 of BBEdit, my preferred text editor on the Mac, my favorite XML editor on any platform, and what I'm using to type these very words. This release fixes bugs and adds a new bbfind command line tool. On the downside, it makes an annoying and incorrect though not crippling change in handling UTF-8 encoded documents. UTF-8 with a byte-order mark is not a different encoding than UTF-8 without. I'll have to report the bug. The BBEdit developers have never really quite understood character sets and encodings, I'm afraid. They're 99% of the way there, but the 1% keeps itching from release to release.

Despite that minor quibble, I really, really wish BBEdit were available on Linux, and a Windows port wouldn't be unwelcome either. One of my co-workers turned me onto kate recently, and it's a little better than gEdit, even if it does insist on dirtying my X-term with pointless log messages; but it's still no BBEdit or TextMate. New copies cost $125. Upgrades from 9.x are free. Mac OS X 10.4 or later is required.
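
On the byte-order mark point above, a short Python sketch shows why the two are the same encoding: the BOM is just three extra bytes in front of otherwise identical UTF-8 data. The sample document is made up for illustration.

```python
import codecs

text = "<?xml version='1.0'?><doc>π</doc>"
plain = text.encode("utf-8")
with_bom = codecs.BOM_UTF8 + plain

# The only difference between the two byte strings is the three-byte prefix.
print(with_bom[:3], with_bom[3:] == plain)

# "utf-8-sig" drops a leading BOM if present and otherwise acts like "utf-8",
# so both byte strings decode to the same text.
print(with_bom.decode("utf-8-sig") == plain.decode("utf-8-sig") == text)

# Decoding as plain "utf-8" keeps the BOM as a leading U+FEFF character.
print(repr(with_bom.decode("utf-8")[0]))
```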

Wednesday, November 11, 2009 (Permalink)

The Eclipse Project has released PsychoPath 1.1M3, an open source XPath 2.0 Schema Aware engine written in Java that is conformant to 99.6% of the W3C XPath 2.0 test suite and improving. "It is fully schema aware including the ability to handle user defined data types. There are about 24 tests left to go to bring it into full compliance with the test suite."

Wednesday, November 4, 2009 (Permalink)

The W3C XML Security Working Group has posted the first public working draft of XML Signature Syntax and Processing Version 2.0. "This version of the XML Signature specification introduces a new, simpler transform model. While this model is less generic than the one in the 1.x versions of this specification, we anticipate gains in terms of simplicity, lower attack surface, and streamability."

Tuesday, November 3, 2009 (Permalink)

The W3C XML Security Working Group has posted the first public working draft of Canonical XML Version 2.0. "Canonical XML Version 2.0 is a major rewrite of Canonical XML Version 1.1 to address issues around performance, streaming, hardware implementation, robustness, minimizing attack surface, determining what is signed and more. It also incorporates an update to Exclusive Canonicalization, effectively a 2.0 version, as well." Version 2.0 uses "a very limited form of the generic XPath Nodeset that was the input model for Canonical XML 1.x. It is designed to be simple and allow a high performance algorithm, while still allowing the essential use cases." Furthermore, "Instead of separate algorithms for each variant of canonicalization, this specification goes with the approach of a single algorithm, which does slightly different things depending on the parameters" such as trimTextNodes and ignoreComments.
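
For a feel of the "single algorithm with parameters" approach, Python 3.8 and later ship a C14N 2.0 implementation in `xml.etree.ElementTree.canonicalize`. It postdates this draft, so treat the mapping as an assumption: its `strip_text` and `with_comments` keyword arguments appear to play the roles of trimTextNodes and ignoreComments, but the names and defaults are Python's, not the draft's.

```python
from xml.etree.ElementTree import canonicalize

xml_data = '<doc z="1" a="2">  <item/> <!-- note -->  </doc>'

# One function, different behavior depending on keyword parameters.
# Comments are dropped by default (with_comments=False); strip_text=True
# additionally trims whitespace-only text.
trimmed = canonicalize(xml_data, strip_text=True)
print(trimmed)

# Keeping comments is a parameter change, not a different algorithm.
print(canonicalize(xml_data, strip_text=True, with_comments=True))
```

Note that the canonical form also sorts the attributes and expands the empty-element tag into a start tag and end tag pair.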

Monday, November 2, 2009 (Permalink)

The Mozilla Project has posted the first beta of Firefox 3.6 for Mac OS X, Linux, and Windows. I finally managed to make this version show me my scrollbars again. Apparently there was an incompatibility with the theme I was using. Overall 3.6 looks like the strongest release since the last of the 2.0 series, though its AppleScript support is still incomplete. On Windows, though, I find myself gravitating to Chrome more and more.

On the "What are they smoking?" front, I notice that this release adds a choose() method to XPath which "should be used instead of if(), which has been deprecated." Where the hell did they get that idea from? I'm not up to date with every last detail of XPath 2, but I don't think it has a choose() function, and I didn't think Firefox was implementing XPath 2.0 either. XPath 1.0 doesn't have an if() function. Did I miss something? OK. Found it. This is an XForms 1.1 extension function. I hope they properly namespace this outside of XForms, but it doesn't look like they will. Time to file a bug. Maybe it's just a documentation problem.

Friday, October 30, 2009 (Permalink)

XML Prague 2010 has posted a call for papers on several topics:

  • XML lifecycle (diffing, merging, change tracking, etc.)
  • Efficiency and performance in XML (verbosity, processing, overuse)
  • Hypermedia in XML (SMIL, SVG animations)
  • Spatial data and XML (WGS84, microformats)
  • XML all the time (XRX, XQuery web applications)

Initial abstracts/papers are due by December 21.

Thursday, October 29, 2009 (Permalink)

Ian Hickson has posted the "last call" working draft of HTML 5:

For a brief period today, there were no outstanding e-mails or bugs on the specs, and so I took that opportunity to transition us here at the WHATWG to the next stage of HTML5's development: Last Call! This affects three specs at the WHATWG:

There's also a version of the spec called Web Applications 1.0 (for nostalgic reasons) that has all of the above as well as a number of other specs, namely Web Storage, Web Database, Server-sent Events, and the Web Sockets API and protocol, all together in one document. With the exception of the Web Database spec, they're all now in last call at the WHATWG:

http://www.whatwg.org/specs/web-apps/current-work/complete.html (5MB)

So if you've been waiting to see if someone else would report the problem that you had seen, well, if it's not fixed, they didn't! So you should now send that feedback in yourself.

Wednesday, October 21, 2009 (Permalink)

Automattic has released WordPress 2.8.5, an open source (GPL) blog engine based on PHP and MySQL. This release fixes yet more security holes, and attempts to harden WordPress somewhat against as yet undiscovered vulnerabilities. All users should upgrade.

Monday, October 19, 2009 (Permalink)

The OpenOffice Project has posted a beta of OpenOffice 3.2, an open source office suite for Linux, Solaris, and Windows that saves all its files as zipped XML. This version makes a large number of small UI and feature improvements (though sadly they haven't yet addressed my two blockers: outline mode and normal view). Final release is expected in December.

Wednesday, October 14, 2009 (Permalink)

The W3C has redesigned their web site. It's somewhat prettier, but feels less simple, functional, and useful overall. They seem to have reformatted a lot of the specs, including XML, in violation of the advertised in-place modification policy. At least they don't seem to have broken most of the links.

Tuesday, October 13, 2009 (Permalink)

SyncroSoft has released <Oxygen/> 11.0, a $349 payware XML editor written in Java. Oxygen supports XML, XSL, DTDs, XQuery, SVG, Relax NG, Schematron, and the W3C XML Schema Language. According to the announcement:

One of the major additions in oXygen 11 is the integrated XSLT documentation support. The main points of this support are:

  • documentation for any stylesheet based on its structure
  • support for XML comments and a number of languages for annotations
  • actions in the XSLT editor to easily add documentation stubs to stylesheet components

oXygen XML editor 11 ships with Saxon 9.2 Enterprise Edition from Saxonica and supports also Saxon 9.2 Professional and Home Editions providing editing, transformation, debugging and profiling support.

Other new additions in version 11 include XProc support, a new XQuery debugger (for Oracle Berkeley XML DB), extended large documents editing support, updates on the visual editing mode, etc.

If you must have an integrated development environment for XML, then Oxygen is the one to buy, though personally I still prefer using plain vanilla text editors and the command line. At the end of the day, XML is just text; and an excellent text editor does a better job of it than a text editor that's an afterthought in a product designed to shield users from raw XML. At most, I want some extra features on the side that don't get in my way when I'm just typing; for instance, a menu item to check the document for well-formedness or a spell checker that's smart enough to ignore tags. I don't want anything that gets in the way of my typing text like auto-tag closing or tree views.

Monday, October 12, 2009 (Permalink)

I've released XOM 1.2.3, my free-as-in-speech (LGPL) dual streaming/tree-based API for processing XML with Java. 1.2.3 fixes a few minor bugs:

  • Closed a loophole by which a buggy parser could create a malformed internal DTD subset.
  • Unit tests no longer load the XHTML DTD from the W3C web site (A series of bad decisions in several specs caused the W3C to initiate a DDOS attack on their own servers, and they finally collapsed under the load.)
  • Fixed bug that allowed nulls to be inserted into a document.
  • The JAR file and build process should now be more OSGI friendly.

Please upgrade when you get a chance. I'll post maven packages soon.

Friday, October 9, 2009 (Permalink)

Sonic Software has released Stylus Studio 2010, a $99-$595 payware XML editor for Windows. Features include:

  • XML differencing
  • XSLT debugging
  • XSLT mapping
  • XSLT profiling
  • XSL:FO
  • XQuery editing, mapping, and debugging.
  • XML Schema Editor
  • Document Type Definition (DTD) Editor
  • XPath Evaluator
  • XPath Expression Generator
  • Web Service Call Composer
  • UDDI Registry Browser
  • Tools for mapping to and from XML documents, Web service data, relational data, and flat files
  • Import/export utilities for RDBMS, XML, CSV, ADO, and flat files
  • JSP Editor
  • RenderX XEP Personal Edition XSL-FO processor bundled
  • An XPath Query Editor
  • Java APIs for accessing EDI, X12, EDIFACT and other legacy data formats
  • XML Pipeline support
  • An XML report designer
  • Data Conversion APIs

New features in this release include:

  • Visual Mapping for Updating Relational Data Sources
  • XQuery Code Refactoring
  • New Data Sources: ZIP, JAR, MS Open Office, and OpenDocument
  • IATA Cargo-IMP to XML support
  • SOAP 1.2

Wednesday, September 30, 2009 (Permalink)

Mulberry Technologies has posted the complete proceedings of Balisage 2009 and the International Symposium on Processing XML Efficiently. It's all browsable HTML, no registration required. Kudos to Balisage for making this so straightforward. Would that more conferences followed its lead.

Friday, September 25, 2009 (Permalink)

XMLmind has released version 4.5 of their XML Editor. This $300 payware product features word processor and spreadsheet like views of XML documents. This release enhances DITA support. A free-beer hobbled version is also available.

Wednesday, September 23, 2009 (Permalink)

The W3C Internationalization Core Working Group has published a note on Requirements for String Identity Matching and String Indexing. "This document is being published as a Working Group note in order to capture and preserve historical information. It contains requirements elaborated in 1998 for aspects of the character model for W3C specifications. It was developed and extensively reviewed by the Internationalization Working Group, and is being published by its successor, the Internationalization Core Working Group, part of the W3C Internationalization Activity. The wording of the 1998 version remains unchanged (except for correction of a small number of typographic errors), but the links to references have been updated prior to this publication." At least a couple of sections are out of date, and I wouldn't recommend it for a modern implementation.

Tuesday, September 22, 2009 (Permalink)

The W3C has published a note on Publishing Open Government Data:

Step 1: The quickest and easiest way to make data available on the Internet is to publish the data in its raw form (e.g., an XML file of polling data from past elections). However, the data should be well-structured. Structure allows others to successfully make automated use of the data. Well-known formats or structures include XML, RDF and CSV. Formats that only allow the data to be seen, rather than extracted (for example, pictures of the data), are not useful and should be avoided.

Step 2: Create an online catalog of the raw data (complete with documentation) so people can discover what has been posted.

These raw datasets should be reliably structured and documented, otherwise their usefulness is negligible.  Most governments already have mechanisms in place to create and store data (e.g., Excel, Word, and other software-specific file formats).

Posting raw data, with an online catalog, is a great starting point, and reflects the next-step evolution of the Internet - "website as fileserver".

Step 3: Make the data both human- and machine-readable:

  • enrich your existing (X)HTML resources with semantics, metadata, and identifiers;
  • encode the data using open and industry standards - especially XML - or create your own standards based on your vocabulary;
  • make your data human-readable by either converting to (X)HTML, or by using real-time transformations through CSS or XSLT.  Remember to follow accessibility requirements;
  • use permanent patterned and/or discoverable "Cool URIs";
  • allow for electronic citations in the form of standardized (anchor/id links or XLINKs/XPointers) hyperlinks.

These steps will help the public to easily find, use, cite and understand the data. The data catalog should explain any rules or regulations that must be followed in the use of the dataset. Also, the data catalog itself is considered "data" and should be published as structured data, so that third parties can extract data about the datasets. Thoroughly document the parts of the web page, using valid XHTML, and choose easily patterned and discoverable URLs for the pages. Also syndicate the data for the catalog (using formats such as RSS) to quickly and easily advertise new datasets upon publication.

Actually, that sounds like good advice for more than just government data.
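
As a small illustration of moving raw tabular data into a structured, machine-readable form, here is a minimal Python sketch that turns CSV into XML. The data, column names, and element names (`results`, `result`) are invented for the example, and it assumes the column headers are legal XML names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical raw data; in practice this would come from a published file.
raw_csv = "district,year,turnout\nNorth,2008,61.2\nSouth,2008,58.7\n"

root = ET.Element("results")
for row in csv.DictReader(io.StringIO(raw_csv)):
    record = ET.SubElement(root, "result")
    # One child element per column, named after the CSV header.
    for column, value in row.items():
        ET.SubElement(record, column).text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```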

Tuesday, September 15, 2009 (Permalink)

The Mozilla Project has posted the first alpha of Firefox 3.6 for Mac OS X, Linux, and Windows. Most notably, this release finally has a functioning AppleScript dictionary for the first time in years. Sadly, the AppleScript is incompatible with older scripts, and still doesn't have a way to get the text of the selection, but at least someone is finally looking at this.

Monday, September 14, 2009 (Permalink)

Snow Leopard seems to have broken most of the AppleScripts I use to manage this site. That's going to make getting back on a regular update schedule even tougher until I can figure out how to fix them.

Thursday, September 10, 2009 (Permalink)

XML in a Nutshell is due for a reprint in a week or so. If you've noticed any errors that need to be corrected, please send them in now.

Tuesday, September 8, 2009 (Permalink)

The OpenOffice Project has released OpenOffice 2.4.3, an open source office suite for Linux, Solaris, and Windows that saves all its files as zipped XML. This release fixes bugs and plugs security holes. All 2.x users should upgrade.

Monday, September 7, 2009 (Permalink)

Real life is interfering with blogging lately. The last couple of weeks have been occupied with moving to New York, and starting work with a new team. At home, my life is still in boxes. I just got my monitor and keyboard set up today. My stove still isn't working. It will probably be late September before I resume a more regular update schedule here. Despite the temporary hiatus, getting back to Brooklyn should result in a long term increase in my productivity if for no other reason than that my Brooklyn office is so much better than the loft I had in Irvine. (Never again will I rent an apartment with thick, wall-to-wall carpet in every room.)

Thursday, August 27, 2009 (Permalink)

The Eclipse Project has released PsychoPath 1.1M1, an XPath 2.0 Schema Aware engine written in Java that is conformant to 96% of the W3C XPath 2.0 test suite and improving.

Wednesday, August 26, 2009 (Permalink)

Bare Bones Software has released TextWrangler 3.0, a free-as-in-beer text editor for Mac OS X. New features in 3.0 seem to focus on assorted small user interface improvements. One of my favorites is that "the text views in browsing windows (disk browsers, search results, and similar) are now editable; rather than having to open a file into a new window from such a browser, you can just edit it right in the window."

Thursday, August 20, 2009 (Permalink)

The W3C XForms working group has posted the proposed recommendation of XForms 1.1. Changes since 1.0 include:

  • A new namespace URI, http://www.w3.org/2004/xforms/
  • power, luhn, current, choose, id and property XPath extension functions
  • An email address datatype
  • An ID card number datatype
  • A prompt action element
  • An xforms-close event
  • An xforms-submit-serialize event
  • Inline rendering of non-text media types

Comments are due by April 5.
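
The luhn function in the list above presumably applies the Luhn checksum used for card numbers. As a rough Python sketch of that checksum (the function name `luhn_valid` is mine, and skipping non-digit characters is my assumption, not behavior taken from the XForms spec):

```python
def luhn_valid(number):
    """Return True if the digits in the string pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit;
    # a doubled value above 9 has its two digits added (same as minus 9).
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("79927398713"))  # a commonly cited valid test number
print(luhn_valid("79927398710"))  # same number with the check digit changed
```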

Monday, August 17, 2009 (Permalink)

The DBIS Group at University of Konstanz has released BaseX 5.7, an open source native XML database with a GUI frontend that supports XQuery 1.0 and XQuery Full-Text. It seems to be written in Java so one presumes it's platform independent. Changes in this release include a revised GUI with enhanced XQuery editing facilities and extended database statistics.

Saturday, August 15, 2009 (Permalink)

The Mozilla Project has released Camino 1.6.8, an open source Mac OS X web browser based on the Gecko 1.8 rendering engine and the Quartz GUI toolkit. It supports pretty much all the technologies that Mozilla does: HTML, XHTML, CSS, XML, XSLT, etc. Version 1.6.8 fixes security bugs, and all users should upgrade. Mac OS X 10.3 or later is required.

Friday, August 14, 2009 (Permalink)

The Modis Team has released Sedna 3.2, an open source native XML database for Windows and Linux written in C++ and Scheme and published under the Apache License 2.0. Sedna supports XQuery and its own declarative update language. Changes in this release include:

  • The persistent heap was discarded. All metadata is stored in the same way as data.
  • The XQuery parser was rewritten in Flex/Bison for improved XQuery conformance.
  • Rewritten Deadlock detection algorithm.

I haven't had the opportunity to use Sedna myself, but I've heard good things about it from some folks who have. Likely worth a look.

Thursday, August 13, 2009 (Permalink)

Wolfgang Meier et al have released eXist DB 1.2.6, an open source XQuery database:

It stores XML data according to the XML data model and features efficient, index-based XQuery processing.

eXist-db supports many (web) technology standards making it an excellent application platform:

eXist-db is highly compliant with the XQuery standard (current XQTS score is 99.4%). The query engine is highly extensible and features a large collection of XQuery Function Modules.

eXist-db provides a powerful environment for the development of web applications based on XQuery and related standards. Entire web applications can be written in XQuery, using XSLT, XHTML, CSS and Javascript (for AJAX functionality). XQuery server pages can be executed from the filesystem or stored in the database.

eXist is probably the major pure XML open source (LGPL) native XML database. I've been wanting to spend some serious time with this product for a while now, but haven't had the opportunity. From what I hear, these days it's adequate for single server systems. I'd be interested to hear what folks have built on top of it.

Wednesday, August 12, 2009 (Permalink)

Planamesa Software has released NeoOffice/J 3.0 patch 7, a Mac port of OpenOffice 3.0.1 using a Java-based GUI. NeoOffice is considerably more Mac-like than the official Mac version of OpenOffice. However, it's still somewhat limited by bad design choices made in, and features missing from, the underlying source code base.

Tuesday, August 11, 2009 (Permalink)

I'm hearing some pushback on CERT-FI's XML parser vulnerability story. According to Xerces committer Michael Glavassevich, "The specific problem reported to Apache only applied to Apache Xerces C++. Xerces-J does not have the bug that was fixed in the C++ impl." This directly contradicts the original CERT-FI report. Possibly the claimed Xerces-J bug is a separate one that was fixed in CVS a couple of months ago, but not yet released. This is exactly why we should insist on full and immediate disclosure of vulnerability information. Otherwise, there's no way to tell whether the problem is real, and just how bad it is.

Thursday, August 6, 2009 (Permalink)

The W3C Cascading Style Sheets Working Group has posted a new working draft of CSSOM View Module. "The APIs introduced by this specification provide authors with a way to inspect and manipulate the visual view of a document. This includes getting the position of element layout boxes, obtaining the width of the viewport through script, and also scrolling an element. The CSSOM View Module also supercedes DOM Level 2 Views and therefore defines the AbstractView and DocumentView interfaces and extensions for them."

Tuesday, August 4, 2009 (Permalink)

The W3C User Agent Accessibility Guidelines Working Group (UAWG) has posted a working draft of User Agent Accessibility Guidelines 2.0. "This document provides guidelines for designing user agents that lower barriers to Web accessibility for people with disabilities. User agents include browsers and other types of software that retrieve and render Web content. A user agent that conforms to these guidelines will promote accessibility through its own user interface and through other internal facilities, including its ability to communicate with other technologies (especially assistive technologies). Furthermore, all users, not just users with disabilities, should find conforming user agents to be more usable. In addition to helping developers of browsers and media players, this document will also benefit developers of assistive technologies because it explains what types of information and control an assistive technology may expect from a conforming user agent. Technologies not addressed directly by this document (e.g., technologies for braille rendering) will be essential to ensuring Web access for some users with disabilities."

Monday, August 3, 2009 (Permalink)

Automattic has released WordPress 2.8.3, an open source (GPL) blog engine based on PHP and MySQL. This release fixes some bugs that were supposed to be fixed in 2.8.1 but weren't. :-(

Friday, July 31, 2009 (Permalink)

The W3C XML Security Working Group published six working drafts about XML signatures and encryption:

  • XML Signature Best Practices "describes best practices related to improving security and mitigating attacks, yet others are for best practices in the practical use of XML Signature, such as signing XML that doesn't use namespaces, for example."
  • XML Signature Syntax and Processing Version 1.1. "Conformance-affecting changes against this previous recommendation mainly affect the set of mandatory to implement cryptographic algorithms, including Elliptic Curve DSA (and mark-up for corresponding key material), and additional hash algorithms. There is currently no consensus about the inclusion of the ECDSA algorithm as mandatory to implement, and the Working Group seeks early community input into what algorithms should be supported. Arguments for and against specific approaches are called out in an editorial note in section 6.1 Algorithm Identifiers and Implementation Requirements."
  • XML Signature Transform Simplification: Requirements and Design "outlines a proposed simplification of the XML Signature Transform mechanism, intended to enhance security, performance, streamability and to ease adoption."
  • XML Encryption Syntax and Processing Version 1.1. "Conformance-affecting changes against this previous recommendation mainly affect the set of mandatory to implement cryptographic algorithms, by adding Elliptic Curve Diffie-Hellman Key Agreement. There is currently no consensus about the inclusion of this algorithm as mandatory to implement, and the Working Group seeks early community input into what algorithms should be supported. Arguments for and against specific approaches are called out in an editorial note in section 5.1 Algorithm Identifiers and Implementation Requirements."
  • XML Security Generic Hybrid Ciphers "augments XML Encryption Version 1.1 by defining algorithms, XML types and elements necessary to enable use of generic hybrid ciphers in XML Security applications."
  • XML Security Algorithm Cross-Reference "collects the various known URIs for encryption algorithms (at the time of its publication) and indicates which specifications define them."

I've thought about adding XML encryption and/or digital signatures to XOM one of these days, but frankly there's just never been any demand for it. It's not clear that anyone's actually using this stuff.

Tuesday, July 28, 2009 (Permalink)

The W3C CSS working group has posted a working draft of the Flexible Box Layout Module. "The draft describes a CSS box model optimized for interface design. It provides an additional layout system alongside the ones already in CSS. [CSS21] In this new box model, the children of a box are laid out either horizontally or vertically, and unused space can be assigned to a particular child or distributed among the children by assignment of “flex” to the children that should expand. Nesting of these boxes (horizontal inside vertical, or vertical inside horizontal) can be used to build layouts in two dimensions. This model is based on the box model in the XUL user-interface language used for the user interface of many Mozilla-based applications (such as Firefox)." New properties introduced in this draft include:

  • box-align
  • box-direction
  • box-flex
  • box-flex-group
  • box-lines
  • box-ordinal-group
  • box-orient
  • box-pack
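
The "flex" idea in the quoted abstract is easy to sketch. Here's a minimal Java illustration, assuming the unused space is simply split in proportion to each child's box-flex value. The class and method names are hypothetical, and this deliberately ignores box-flex-group, minimum and maximum sizes, and shrinking, all of which the real draft has to deal with.

```java
/** Hypothetical sketch of proportional flex distribution; not the draft's full algorithm. */
final class FlexBoxSketch {

    /**
     * Returns the final size of each child along the box's orientation.
     *
     * @param containerSize size of the parent box along its orientation
     * @param preferred     preferred size of each child
     * @param flex          flex value of each child (0 means the child never grows)
     */
    static double[] layout(double containerSize, double[] preferred, double[] flex) {
        double used = 0.0;
        double totalFlex = 0.0;
        for (int i = 0; i < preferred.length; i++) {
            used += preferred[i];
            totalFlex += flex[i];
        }

        double unused = containerSize - used;
        double[] sizes = preferred.clone();

        // Nothing to hand out, or nobody asked for extra space.
        if (unused <= 0.0 || totalFlex <= 0.0) {
            return sizes;
        }

        // Each flexible child gets a share of the unused space
        // proportional to its flex value.
        for (int i = 0; i < sizes.length; i++) {
            sizes[i] += unused * flex[i] / totalFlex;
        }
        return sizes;
    }
}
```

Under those assumptions, a 300 pixel box holding three 50 pixel children with flex values 0, 1, and 2 has 150 pixels left over, so the children end up 50, 100, and 150 pixels wide.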

Monday, July 27, 2009 (Permalink)

Jason Hunter has released JDOM 1.1.1, a library for processing XML with Java using a tree metaphor. 1.1.1 is a bug fix release that fixes a nasty thread safety problem with namespaces, adds a new SAXBuilder flag for faster parsing, supports Unicode surrogate pairs, and is compatible with the Android Dalvik VM. It also upgrades Jaxen to the much improved version 1.1.1 so XPath evaluation should now be much more spec compliant. All users should upgrade.

Friday, July 24, 2009 (Permalink)

The W3C CSS Working Group has posted a working draft of CSS Image Values Module Level 3. "In CSS Levels 1 and 2, image values, such as those used in the ‘background-image’ property, could only be given by a single URI value. This module introduces additional notations that allow a 2D image to be given as a list of URIs denoting fallbacks, as a slice of a larger image (sprite), and as a gradient."

Wednesday, July 22, 2009 (Permalink)

Automattic has released WordPress 2.8.2, an open source (GPL) blog engine based on PHP and MySQL. "Comment author URLs were not fully sanitized when displayed in the admin. This could be exploited to redirect you away from the admin to another site." Anyone know if this issue is present in 2.7.1?

Thursday, July 16, 2009 (Permalink)

Apple has released Safari 4.0.2 for Mac and Windows, and I have made the switch. Bye Bye Firefox. It's been nice knowing you, but you denied and ignored my critical bugs for too long. Safari supports XML 1.0, XSLT 1.0, CSS, XHTML, SVG 1.1, and RSS. The final missing pieces that enabled me to switch were DeliciousSafari, Safari AdBlock, View Selection Source, and this bit of AppleScript:

tell application "Safari"
	set selecTxt to (do JavaScript "(getSelection())" in document 1)	
	set theURL to URL of tab 1 of front window
	set title to name of tab 1 of front window
end tell

Calling JavaScript from AppleScript to get the selection is definitely a hack; but it works. I do wish web browsers would provide easier programmatic access to the DOM from AppleScript and also the standard text suite though. I'd also love more metadata about the page--author, date, and so forth--but this at least moves Safari to parity with Firefox 2, and way beyond Firefox 3.

I still need replacements for CookieSafe and CustomizeGoogle, but I can live without those for now. Oh, and one more thing, could someone please make Safari restore its state after quit/crash and relaunch? "History/Reopen All Windows from Last Session" is close, but not quite right.

Wednesday, July 15, 2009 (Permalink)

Bare Bones Software has released version 9.2.1 of BBEdit, my preferred text editor on the Mac, my favorite XML editor on any platform, and what I'm using to type these very words. This is a bug fix release. New copies cost $125. Upgrades from 9.x are free. Mac OS X 10.4 or later is required.

Monday, July 13, 2009 (Permalink)

I've posted the fourth and hopefully final beta of XOM 1.2.2, my free-as-in-speech (LGPL) library for processing XML with Java. 1.2.2 improves compatibility with Maven and jaxen, and (tentatively) supports OSGI. However, the OSGI support is at risk since it doesn't appear to be testable in an automated way. :-( If no one notices any major problems in this release, I'll probably declare it 1.2.2 final in a couple of weeks.

Tuesday, July 7, 2009 (Permalink)

The W3C XQuery working group has posted a new candidate recommendation of XQuery and XPath Full Text 1.0 as well as a new working draft of XQuery and XPath Full Text 1.0 Use Cases.

1.1 Full-Text Search and XML

As XML becomes mainstream, users expect to be able to search their XML documents. This requires a standard way to do full-text search, as well as structured searches, against XML documents. A similar requirement for full-text search led ISO to define the SQL/MM-FT [SQL/MM] standard. SQL/MM-FT defines extensions to SQL to express full-text searches providing functionality similar to that defined in this full-text language extension to XQuery 1.0 and XPath 2.0.

XML documents may contain highly structured data (fixed schemas, known types such as numbers, dates), semi-structured data (flexible schemas and types), markup data (text with embedded tags), and unstructured data (untagged free-flowing text). Where a document contains unstructured or semi-structured data, it is important to be able to search using Information Retrieval techniques such as scoring and weighting.

Full-text search is different from substring search in many ways:

  1. A full-text search searches for tokens and phrases rather than substrings. A substring search for news items that contain the string "lease" will return a news item that contains "Foobar Corporation releases the 20.9 version ...". A full-text search for the token "lease" will not.

  2. There is an expectation that a full-text search will support language-based searches which substring search cannot. An example of a language-based search is "find me all the news items that contain a token with the same linguistic stem as 'mouse'" (finds "mouse" and "mice"). Another example based on token proximity is "find me all the news items that contain the tokens 'XML' and 'Query' allowing up to 3 intervening tokens".

  3. Full-text search must address the vagaries and nuances of language. Search results are often of varying usefulness. When you search a web site for cameras that cost less than $100, this is an exact search. There is a set of cameras that matches this search, and a set that does not. Similarly, when you do a string search across news items for "mouse", there is only 1 expected result set. When you do a full-text search for all the news items that contain the token "mouse", you probably expect to find news items containing the token "mice", and possibly "rodents", or possibly "computers". Not all results are equal. Some results are more "mousey" than others. Because full-text search may be inexact, we have the notion of score or relevance. We generally expect to see the most relevant results at the top of the results list.
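
The first point above is the easiest to see in code. Here's a minimal Java sketch, assuming a deliberately naive tokenizer that lower-cases the text and splits on anything that isn't a letter or digit. The spec leaves tokenization implementation-defined, so the class name, method names, and splitting rule here are purely illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical sketch contrasting substring search with token search. */
final class TokenVsSubstring {

    /** Naive tokenizer (an assumption): lower-case, then split on non-letter, non-digit runs. */
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    /** Substring search: true if the needle occurs anywhere, even inside a longer word. */
    static boolean substringMatch(String text, String needle) {
        return text.toLowerCase(Locale.ROOT).contains(needle.toLowerCase(Locale.ROOT));
    }

    /** Token search: true only if some whole token equals the search token. */
    static boolean tokenMatch(String text, String token) {
        return tokenize(text).contains(token.toLowerCase(Locale.ROOT));
    }
}
```

With this tokenizer, a substring search for "lease" matches "Foobar Corporation releases the 20.9 version" because "lease" is buried inside "releases", while the token search does not. Stemming, thesauri, and scoring are beyond what this sketch attempts.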

Note:

As XQuery and XPath evolve, they may apply the notion of score to querying structured data. For example, when making travel plans or shopping for cameras, it is sometimes useful to get an ordered list of near matches in addition to exact matches. If XQuery and XPath define a generalized inexact match, we expect XQuery and XPath to utilize the scoring framework provided by XQuery and XPath Full Text.

[Definition: Full-text queries are performed on tokens and phrases. Tokens and phrases are produced via tokenization.] Informally, tokenization breaks a character string into a sequence of tokens, units of punctuation, and spaces.

Tokenization, in general terms, is the process of converting a text string into smaller units that are used in query processing. Those units, called tokens, are the most basic text units that a full-text search can refer to. Full-text operators typically work on sequences of tokens found in the target text of a search. These tokens are characterized by integers that capture the relative position(s) of the token inside the string, the relative position(s) of the sentence containing the token, and the relative position(s) of the paragraph containing the token. The positions typically comprise a start and an end position.

Tokenization, including the definition of the term "tokens", SHOULD be implementation-defined. Implementations SHOULD expose the rules and sample results of tokenization as much as possible to enable users to predict and interpret the results of tokenization. Tokenization operates on the string value of an item; for element nodes this does not include the content of attribute nodes, but for attribute nodes it does. Tokenization is defined more formally in 4.1 Tokenization.
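
The point about string values can be illustrated with the JDK's built-in DOM. This is only a rough sketch: it assumes that DOM's getTextContent() is a close enough stand-in for the XPath string value of element and attribute nodes, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical sketch: what text a tokenizer would see for an element vs. an attribute. */
final class StringValueSketch {

    /** Parses a small XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    /** Text of the root element: descendant text only, no attribute values. */
    static String elementStringValue(String xml) {
        return parseRoot(xml).getTextContent();
    }

    /** Text of one attribute of the root element: the attribute's own value. */
    static String attributeStringValue(String xml, String attributeName) {
        Attr attr = parseRoot(xml).getAttributeNode(attributeName);
        return attr.getTextContent();
    }
}
```

For a sample like <item id="a1">red <b>mouse</b></item>, the element's text is "red mouse" and never includes "a1"; only tokenizing the id attribute node itself would expose that value.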

[Definition: A token is a non-empty sequence of characters returned by a tokenizer as a basic unit to be searched. Beyond that, tokens are implementation-defined.] [Definition: A phrase is an ordered sequence of any number of tokens. Beyond that, phrases are implementation-defined.]

Note:

Consecutive tokens need not be separated by either punctuation or space, and tokens may overlap.

Note:

In some natural languages, tokens and words can be used interchangeably.

[Definition: A sentence is an ordered sequence of any number of tokens. Beyond that, sentences are implementation-defined. A tokenizer is not required to support sentences.]

[Definition: A paragraph is an ordered sequence of any number of tokens. Beyond that, paragraphs are implementation-defined. A tokenizer is not required to support paragraphs.]

Some XML elements represent semantic markup, e.g., <title>. Others represent formatting markup, e.g., <b> to indicate bold. Semantic markup serves well as token boundaries. Some formatting markup serves well as token boundaries, for example, paragraphs are most commonly delimited by formatting markup. Other formatting markup may not serve well as token boundaries. Implementations are free to provide implementation-defined ways to differentiate between the markup's effect on token boundaries during tokenization. In the absence of an implementation-defined way to differentiate, element markup (start tags, end tags, and empty-element tags) creates token boundaries.

A sample tokenization is used for the examples in this document. The results might be different for other tokenizations.

Tokenization enables functions and operators that operate on a part or the root of the token (e.g., wildcards, stemming).

Tokenization enables functions and operators which work with the relative positions of tokens (e.g., proximity operators).
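
Once tokens carry positions, a proximity test like the earlier "'XML' and 'Query' allowing up to 3 intervening tokens" example reduces to integer arithmetic. Here's a minimal Java sketch, assuming a naive tokenizer, whole-token equality, and an order-insensitive match; none of those choices come from the spec, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a proximity check over token positions. */
final class ProximitySketch {

    /** Naive tokenizer (an assumption): the list index of each token is its relative position. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    /** True if both tokens occur with at most maxIntervening tokens between them, in either order. */
    static boolean near(String text, String first, String second, int maxIntervening) {
        List<String> tokens = tokenize(text);
        String a = first.toLowerCase(Locale.ROOT);
        String b = second.toLowerCase(Locale.ROOT);
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(a)) {
                continue;
            }
            for (int j = 0; j < tokens.size(); j++) {
                // Tokens strictly between positions i and j: |i - j| - 1.
                if (i != j && tokens.get(j).equals(b) && Math.abs(i - j) - 1 <= maxIntervening) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

In "XML and the related Query languages" there are exactly three tokens between "XML" and "Query", so the check succeeds with a limit of 3 and fails with a limit of 2.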

This specification focuses on functionality that serves all languages. It also selectively includes functionalities useful within specific families of languages. For example, searching within sentences and paragraphs is useful to many western languages and to some non-western languages, so that functionality is incorporated into this specification.

Monday, July 6, 2009 (Permalink)

The W3C has published a candidate recommendation of SKOS Simple Knowledge Organization System Reference and a new working draft of SKOS Simple Knowledge Organization System Primer. According to the primer:

SKOS — Simple Knowledge Organisation System — provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary. As an application of the Resource Description Framework (RDF), SKOS allows concepts to be composed and published on the World Wide Web, linked with data on the Web and integrated into other concept schemes.

This document is a user guide for those who would like to represent their concept scheme using SKOS.

In basic SKOS, conceptual resources (concepts) are identified with URIs, labeled with strings in one or more natural languages, documented with various types of note, semantically related to each other in informal hierarchies and association networks, and aggregated into concept schemes.

In advanced SKOS, conceptual resources can be mapped across concept schemes and grouped into labeled or ordered collections. Relationships can be specified between concept labels. Finally, the SKOS vocabulary itself can be extended to suit the needs of particular communities of practice or combined with other modeling vocabularies.

This document is a companion to the SKOS Reference, which gives the normative reference on SKOS.

Friday, July 3, 2009 (Permalink)

YesLogic has posted the first beta of Prince 7.0, a $495-$3900 payware batch formatter for Linux, Windows, and Mac OS X that produces PDF and PostScript from XML documents with CSS stylesheets. Prince passes the Acid2 test. Version 7.0 adds support for Arabic, Hebrew, and Hindi, as well as kerning and ligatures.

Thursday, July 2, 2009 (Permalink)

SyncroSoft has released <Oxygen/> 10.3, a $349 payware XML editor written in Java. Oxygen supports XML, XSL, DTDs, XQuery, SVG, Relax NG, Schematron, and the W3C XML Schema Language. According to the announcement:

Version 10.3 of Oxygen XML Editor improves both the XML Authoring and the XML Development capabilities. As a result of user feedback the Oxygen XML Author API was reorganized and extended with additional functionality. There are various improvements to the existing frameworks (DITA, DocBook, TEI, etc.) like automatic ID generation or DITA aware search and replace. An important new XML development feature is the Component Dependencies View that presents a tree of component dependencies starting with a specified component for XSLT, XML Schema, Relax NG and NVDL. The new version also integrates the Saxon SA XQuery Update functionality and updates a number of components to their latest versions.

If you must have a specialized XML development environment, then Oxygen is the one to buy, though personally I still prefer using plain vanilla text editors and the command line. At the end of the day, XML is just text; and an excellent text editor does a better job of it than a text editor that's an afterthought in a product designed to shield users from raw XML. At most, I want some extra features on the side that don't get in my way when I'm just typing; for instance, a menu item to check the document for well-formedness or a spell checker that's smart enough to ignore tags. I don't want anything that gets in the way of my typing like auto-tag closing or tree views.

Tuesday, June 30, 2009 (Permalink)

Mark Logic has released version 4.1 of their namesake XML database for Linux, Solaris, and Windows. New features in 4.1 include:

  • SSL and HTTPS Support
  • URL Rewriting
  • xdmp:to-json and xdmp:from-json extension functions that serialize XQuery datatypes as a JSON string and vice-versa
  • Japanese Language Support
  • Read-Only Forests
  • Built-in scheduler (like cron)
  • Search API With Support for Facets and Snippets
  • W3C XML Schema Validation
  • Function Values
  • xdmp:pretty-print extension function
  • xdmp:elapsed-time extension function

Pricing's hidden, but seems to be in the ballpark of $60,000 as best I can tell.

Friday, June 26, 2009 (Permalink)

Norm Walsh has posted version 0.9.12 of Calabash, an open source XProc implementation written in Java. This release fixes bugs and adds a non-standard “general values extension”. Java 5 or later is required. Calabash is published under the GNU General Public License Version 2.0.

Thursday, June 25, 2009 (Permalink)

Oracle has released the final version of Java Specification Request (JSR) 225, XQuery API for Java™ (XQJ). There's also a reference implementation and technology compatibility kit. As JDBC is to SQL, XQJ is to XQuery.

The following sample Java code is meant to convey a first look and feel of the style and usage of the XQJ API. It is by no means exhaustive or complete; e.g., no error handling is shown and it is assumed that xqds is an XQDataSource object representing a given data source. It illustrates the basic steps that an application would perform to execute an XQuery expression at a given XQuery implementation.

// establish a connection to the XQuery engine 
XQConnection conn = xqds.getConnection(); 
 
// create an expression object that is later used 
// to execute an XQuery expression 
XQExpression expr = conn.createExpression(); 
 
// the XQuery expression to be executed 
String es = "for $n in fn:doc('catalog.xml')//item " + 
  "return fn:data($n/name)"; 
 
// execute the XQuery expression 
XQResultSequence result = expr.executeQuery(es); 
 
// process the result (sequence) iteratively  
while (result.next()) {  
  // retrieve the current item of the sequence as a String 
  String str  = result.getAtomicValue(); 
  System.out.println("Product name: " + str); 
} 
 
// free all resources allocated for the result 
result.close(); 
 
// free all resources allocated for the expression 
expr.close(); 
 
// free all resources allocated for the connection 
conn.close(); 

On a side note, kudos to the spec authors for putting this simple example in the spec right up front. Something like this would help a lot of other JSRs.

Wednesday, June 24, 2009 (Permalink)

The W3C Voice Browser Working Group has published the second working draft of the VoiceXML 3.0 specification. VoiceXML is used to describe those annoying call trees you hear when calling most major companies. "Press 1 if you want to wait on hold for 20 minutes and then be hung up on; press 2 if you want to wait indefinitely; press 3 if you'd rather we just hung up on you now."

How does one build a successor to VoiceXML 2.0/2.1? Requests for improvements to VoiceXML fell into two main categories: extensibility and new functionality.

To accommodate both, the Voice Browser Working Group

  1. Developed the detailed semantic descriptions of VoiceXML functionality that versions 2.0 and 2.1 lacked. The semantic descriptions clarify the meaning of the VoiceXML 2.0 and 2.1 functionalities and how they relate to each other. The semantic descriptions are represented in this document as English text, UML state chart visual diagrams [ref] and/or textual SCXML representations [ref]. Figure 1 illustrates the VoiceXML 3.0 framework which contains some abstract UML state chart visual diagrams representing some existing VoiceXML functionality.
  2. Described the detailed semantics for new functionality. New functions include, for example, speaker identification and verification, video capture and replay, and a more powerful prompt queue. These semantic descriptions for these new functions are also represented in this document as English text, UML state chart visual diagrams [ref] and/or textual SCXML representations [ref]. Figure 2 contains some abstract UML state chart visual diagrams representing new functionality.
  3. Organized the functionality into modules, with each module implementing different functions. One reason for the introduction of a more rigorous semantic definition is that it allows us to assign semantics to individual modules. This makes it easier to understand what happens when modules are combined or new ones are defined. In contrast, VoiceXML 2.0 and 2.1 had a single global semantic definition (the FIA), which made it difficult to understand what would happen if certain elements were removed from the language or if new ones were added. Figure 3 contains some modules, each containing VoiceXML 3.0 functionality. Vendors may extend VoiceXML functionality by creating additional modules with additional functionality not described in this document. For example, a vendor might create a new GPS input module. Application developers should be cautious about using vendor-specific modules because the resulting application may not be portable.
  4. Restructured and revised the syntax of each module to incorporate any new functionality. Application developers use the syntax of each module as an API to invoke the module’s functions. Figure 4 illustrates some simplified syntax associated with modules.
  5. Introduced the concept of a profile (language) which incorporates the syntax of several modules. Figure 5 illustrates two profiles. For example, a VoiceXML 2.1 profile incorporates the syntax of most of the modules corresponding to the VoiceXML 2.1 functionality which will support most existing VoiceXML 2.1 applications. Thus most VoiceXML 2.1 applications can be easily ported to VoiceXML 3.0 using the VoiceXML 2.1 profile. Another profile omits the VoiceXML 2.1 Form Interpretation Algorithm (FIA). This profile may be used by developers who want to define their own flow control rather than using the FIA. Profiles enable platform developers to select just the functionality that application developers need for a platform or class of application. Multiple profiles enable developers to use just the profile (language) needed for a platform or class of applications. For example, a lean profile for portable devices, or a full-function profile for server-based applications using all of the new functionality of VoiceXML 3.0.

One of the benefits of detailed semantic descriptions is improving portability within VoiceXML. Two vendors may implement the same functionality differently; however, the functionality must be consistent with the semantic meanings described in this document so that application authors are isolated from the different implementations. This increases portability among platforms that support the same syntax. Note that there are many other factors that affect portability that are outside the scope of this document (e.g. speech recognition capabilities, telephony).

Tuesday, June 23, 2009 (Permalink)

Wolfgang Meier has released eXist 1.2.6:

an open source database management system entirely built on XML technology. It stores XML data according to the XML data model and features efficient, index-based XQuery processing.

eXist-db supports many (web) technology standards making it an excellent application platform:

eXist-db is highly compliant with the XQuery standard (current XQTS score is 99.4%). The query engine is highly extensible and features a large collection of XQuery Function Modules.

1.2.6 fixes several scary database corruption issues.

Monday, June 22, 2009 (Permalink)

The W3C XQuery working group has posted a new candidate recommendation of XQuery Update Facility. XQuery as it currently exists is basically just SELECT in SQL terms. XQuery Update adds INSERT, UPDATE, and DELETE. More specifically, it defines these update operations:

  • upd:mergeUpdates
  • upd:revalidate
  • upd:applyUpdates
  • upd:insertBefore
  • upd:insertAfter
  • upd:insertInto
  • upd:insertIntoAsFirst
  • upd:insertIntoAsLast
  • upd:insertAttributes
  • upd:delete
  • upd:replaceNode
  • upd:replaceValue
  • upd:replaceElementContent
  • upd:rename
  • upd:removeType
  • upd:setToUntyped

The following features are considered to be at risk:

They may be removed if implementations of them do not exist at the end of the Candidate Recommendation period.

Comments are due by August 31.

Friday, June 19, 2009 (Permalink)

Could it really be 7 years? Yes, it could. Back from the dead after 7 years as a last call working draft, the W3C CSS Working Group has posted a new working draft of CSS Fonts Module Level 3. Described properties include:

  • font-family
  • font-weight
  • font-stretch
  • font-style
  • font-variant
  • font-size
  • font-size-adjust
  • font
  • @font-face

"This draft consolidates material previously divided between the CSS3 Fonts and CSS3 Web Fonts modules."

Thursday, June 18, 2009 (Permalink)

The first release candidate of Firefox 3.5 is out; though you'll need to get it by auto-updating 3.5 beta 4. It's ugly as sin, breaks the back button, breaks the scrollbars, and still hasn't fixed this AppleScript bug. To add insult to injury the feedback page uses a pointless, illegible CAPTCHA:

No one can read this

I think I'm giving up on Firefox. I just need to get del.icio.us integrated into Safari and I'll be done.

They've also released Firefox 3.0.11 to fix a security vulnerability.

Wednesday, June 17, 2009 (Permalink)

Michael Kay has released version 9.1.0.7 of Saxon, his XSLT 2.0 and XQuery processor for Java and .NET. This is a bug fix release.

Saxon is published in two versions, both of which require Java 1.4 or later (or .NET). Saxon 9.1B is an open source product published under the Mozilla Public License 1.0 that "implements the 'basic' conformance level for XSLT 2.0 and XQuery." Saxon 9.1 SA is £300.00 payware. According to Kay,

The most obvious difference between Saxon-SA and Saxon-B is that Saxon-SA is schema-aware: it allows stylesheets and queries to import an XML Schema, to validate input and output trees against a schema, and to select elements and attributes based on their schema-defined type. Saxon-SA also incorporates a free-standing XML Schema validator.

In addition Saxon-SA incorporates some advanced extensions and optimizations not available in the Saxon-B product:

  • Saxon-SA is able to compile XQuery code directly into Java classes.

  • Saxon-SA has an advanced optimizer which recognizes joins in XPath expressions, XQuery FLWOR expressions, and in XSLT templates (nested xsl:for-each instructions). Whereas Saxon-B always implements these as nested loops, Saxon-SA uses a variety of strategies including indexes and hash joins. This can give dramatic improvements in execution time for large documents: some of the queries in the XMark benchmark improve by a factor of 300 (from 16 seconds to 45 milliseconds) to process a 10Mbyte source file.

  • Saxon-SA has a facility to process large documents in streaming mode. This enables documents to be handled that are too large to hold in memory (it has been tested up to 20Gb).

  • Additional extensions available in Saxon-SA include a try/catch capability for catching dynamic errors, improved error diagnostics, support for higher-order functions, and additional facilities in XQuery including support for grouping, advanced regular expression analysis, and formatting of dates and numbers.
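
To see why the join optimization in the second bullet matters, here's a generic Java sketch of the two strategies. It is not Saxon's code, just the textbook nested-loop join next to a hash join over hypothetical {key, value} pairs. Both return the same answer, but the nested loop does work proportional to the product of the two input sizes, while the hash join makes roughly one pass over each input.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: each row is a two-element array {joinKey, value}. */
final class JoinSketch {

    /** Nested-loop join: compares every left row with every right row. */
    static List<String> nestedLoopJoin(List<String[]> left, List<String[]> right) {
        List<String> out = new ArrayList<>();
        for (String[] l : left) {
            for (String[] r : right) {
                if (l[0].equals(r[0])) {
                    out.add(l[1] + "=" + r[1]);
                }
            }
        }
        return out;
    }

    /** Hash join: index the right side by key once, then probe it for each left row. */
    static List<String> hashJoin(List<String[]> left, List<String[]> right) {
        Map<String, List<String>> index = new HashMap<>();
        for (String[] r : right) {
            index.computeIfAbsent(r[0], key -> new ArrayList<>()).add(r[1]);
        }
        List<String> out = new ArrayList<>();
        for (String[] l : left) {
            List<String> matches = index.getOrDefault(l[0], Collections.<String>emptyList());
            for (String value : matches) {
                out.add(l[1] + "=" + value);
            }
        }
        return out;
    }
}
```

On a handful of rows the difference is invisible; on a 10 MB document with tens of thousands of nodes on each side, it's the kind of change that could plausibly account for speedups like the one quoted.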

Tuesday, June 16, 2009 (Permalink)

M/Gateway Developments has released M/DB:X, an XPath-based XML Database with a non-RESTful HTTP interface based on DOM. There doesn't seem to be any support for queries across documents--you have to know the ID of the document you're querying--so it doesn't really replace a good XQuery database. There might be some use cases for this I'm not seeing right now, but I suspect most projects would be better served by a traditional XQuery database such as eXist instead. M/DB:X is licensed under the GNU Affero General Public License Version 3.

Monday, June 15, 2009 (Permalink)

XMLMind has released Qizx/db 3.0, a $600 closed source, embeddable native XML database engine and/or database server written in Java that supports XQuery 1.0. Version 3.0 adds XQuery full text support. The query interpreter part is available under an open source license.

Friday, June 12, 2009 (Permalink)

Andy Clark's updated his NekoHTML open source HTML parser to version 1.9.12. "NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces. The parser can scan HTML files and "fix up" many common mistakes that human (and computer) authors make in writing HTML documents. NekoHTML adds missing parent elements; automatically closes elements with optional end tags; and can handle mismatched inline element tags. NekoHTML is written using the Xerces Native Interface (XNI) that is the foundation of the Xerces2 implementation. This enables you to use the NekoHTML parser with existing XNI tools without modification or rewriting code." This release fixes assorted bugs.

Thursday, June 11, 2009 (Permalink)

Automattic has released WordPress 2.8.0, an open source (GPL) blog engine based on PHP and MySQL. "2.8 represents a nice fit and finish release for WordPress with improvements to themes, widgets, taxonomies, and overall speed. We also fixed over 790 bugs." And here I just finished upgrading to 2.7.1. None of the new features seem especially compelling to me, so I'll probably just stay there for a while.

Wednesday, June 10, 2009 (Permalink)

The W3C POWDER Working Group has posted proposed recommendations of three specs:

According to the formal semantics,

The Protocol for Web Description Resources, POWDER, offers a simple method of associating RDF data with groups of resources. Its primary 'unit of information' is the Description Resource (DR). This comprises three elements:

  • attribution (who is providing the description)
  • scope (defined as a set of IRIs over which the description applies to the resources de-referenced from those IRIs)
  • the description itself (the 'descriptor set').

To some extent, this approach is in tension with the core semantics of RDF and OWL. To resolve that tension, it is necessary to extend RDF semantics as described below. In order to minimize the required extension, while at the same time preserving the relatively simple encoding of POWDER in XML which is generally readable by humans, we define a multi-layered approach. The operational semantics, i.e. the encoding of POWDER in XML, is first transformed into a more restricted XML encoding that is less easily understood by humans and depends on matching IRIs against regular expressions to determine whether or not they are within the scope of the DR. This latter encoding is, in its own turn, transformed into the extended-RDF encoding.

Still sounds like turtles to me. :-)

Tuesday, June 9, 2009 (Permalink)

Apple has released Safari 4.0 for Windows and the Mac. Safari supports XML 1.0, XSLT 1.0, CSS, XHTML, SVG 1.1, and RSS. 4.0 adds cover flow for web pages and faster JavaScript execution. I've been thinking about switching to Safari, since Firefox seems to get slower with each release, and has multiple Mac-specific bugs that have remained unfixed for years. If only Safari had a decent plug-in ecosystem. :-( Mac OS X 10.4, 10.5.7, or Windows XP or later is required.

Monday, June 8, 2009 (Permalink)

The W3C Math Working Group has posted an updated working draft of Mathematical Markup Language (MathML) Version 3.0:

The present draft is an incremental one making public some of the results of Math Working Group work in recent months. The biggest difference this time is in Chapter 4, although there have been smaller ameliorations throughout the specification. A more detailed description of changes from the previous Recommendation follows.

  • With the second Working Draft, much of the non-normative explication that formerly was found in Chapters 1 and 2, and many examples from elsewhere in the previous MathML specifications, were removed from the MathML3 specification and planned to be incorporated into a MathML Primer being prepared as a separate document. It is expected this will help the use of this formal MathML3 specification as a reference document in implementations, and offer the new user better help in understanding MathML's deployment. The remaining content of Chapters 1 and 2 is being edited to reflect the changes elsewhere in the document, and in the rapidly evolving Web environment. Some of their text used to go back to early days of the Web and XML, and its explanations are now commonplace.

  • Chapter 3, on presentation-oriented markup, in this draft adds new material on linebreaking and on markup for elementary math notations. Material introduced in the last draft revising the mpadded and maction elements has been further revised as a result of active discussion. In addition, the layout of schemata such as that for long division have been carefully revised with an eye to the demands of mathematics as an international language. This has resulted in the introduction of new mstack, mlongdiv and other associated elements. Earlier work, as recorded in the W3C Note Arabic mathematical notation, has allowed clarification of the relationship with bidirectional text and examples with RTL text have been added.

  • Chapter 4, on content-oriented markup, contains major changes and additions in this Working Draft. The meaning of the actual content remains as before in principle, but a lot of work has been done on expressing it better. A few new elements have been added.

  • Chapter 5 is being refined as its purpose has been further clarified to deal with the mixing of markup languages. This chapter deals, in particular, with interrelations of parts of the MathML specification, especially with presentation and content markup.

  • Chapter 6 is a new addition which deals with the issues of interaction of MathML with a host environment. This chapter deals with interrelations of the MathML specification with XML and HTML, but in the context of deployment on the Web. In particular there is a discussion of the interaction of CSS with MathML.

  • Chapter 7 replaces the previous Chapter 6, and has been rewritten and reorganized to reflect the new situation in regard to Unicode, and the changed W3C context with regard to named character entities. The new W3C specification of Entity Definitions for Characters in XML, which incorporates those used for mathematics is becoming a public working draft [Entities]. It is expected that some new ancillary tables will be provided that reflect requests the Math WG has received.

  • The Appendices, of which there are eight shown, have been reworked. Appendix A now contains the new RelaxNG schema for MathML3 as well as discussion of MathML3 DTD issues. Appendix B addresses media types associated with MathML and implicitly constitutes a request for the registration of three new ones, as is now standard for work from the W3C. Appendix C contains a new simplified and reconsidered Operator Dictionary. Appendices D, E, F, G and H contain similar non-normative material to that in the previous specification, now appropriately updated.

Friday, June 5, 2009 (Permalink)

Google has posted betas of Chrome for Mac OS X and Linux. According to Google,

whatever you do, please DON'T DOWNLOAD THEM! Unless of course you are a developer or take great pleasure in incomplete, unpredictable, and potentially crashing software.

How incomplete? So incomplete that, among other things, you won't yet be able to view YouTube videos, change your privacy settings, set your default search provider, or even print.

Sounds like fun to me. :-) I tried out the Mac build, and overall it feels quite nice, much nicer than the Windows version I've never really been able to accept. The biggest difference is that the Mac version, like all good Mac programs--in fact, like all good GUI programs period--has a real menu bar. OK, I know that neither Windows nor Linux has yet been able to get menu bars right 25 years after they first had a Mac to copy from, but is that any excuse for eliminating them completely? I think not.

Thursday, June 4, 2009 (Permalink)

The call for late-breaking news for Balisage this August in Montréal has been posted. "Balisage is the leading international conference on markup theory and practice. If you have interesting markup applications, difficult markup problems, or intriguing solutions to problems related to the design and use of markup, markup languages, or markup tools; if you want to know what the leading theorists of markup are thinking; if you are the house markup expert and want to spend time with your kind, then you should plan to be in Montréal, in August 2009." Sounds like fun, though I probably won't be able to attend this year myself. Submissions are due by June 19.

Sunday, May 31, 2009 (Permalink)

The W3C XML Processing Model Working Group has published the candidate recommendation of XProc: An XML Pipeline Language. "This specification describes the syntax and semantics of XProc: An XML Pipeline Language, a language for describing operations to be performed on XML documents. An XML Pipeline specifies a sequence of operations to be performed on zero or more XML documents. Pipelines generally accept zero or more XML documents as input and produce zero or more XML documents as output. Pipelines are made up of simple steps which perform atomic operations on XML documents and constructs similar to conditionals, iteration, and exception handlers which control which steps are executed."

Friday, May 29, 2009 (Permalink)

The detailed program has been posted for Balisage 2009 in Montreal in August. Looks like a good program. Maybe if I'm lucky I can sneak up there for a couple of days.

Tuesday, May 26, 2009 (Permalink)

Google has released Chrome 2.0, its Windows-only web browser. This release adds a full-screen mode and form autofill (a must-have feature before I'll even consider a new browser these days). JavaScript is also supposed to be faster though I haven't tested that myself yet. (I do know Firefox is sucking wind badly these days, so it shouldn't be hard to beat.) Annoyingly the site won't let you download Chrome from a Mac, just sign up to be notified when a Mac version is available. Surely someone over there has heard of Bootcamp, Parallels, VMware Fusion, and perhaps even network drives? One more time: sites that tailor content depending on user agent strings are broken. URLs identify the resource, not the HTTP headers.

Friday, May 22, 2009 (Permalink)

Monkfish XML Software has released version 7.0 of XMLBlueprint XML Editor for the Windows platform, priced at US $85 per license. Version 7 supports Code Folding, Schematron, and MSXML6 for better error messages. (And special kudos to Monkfish for being pretty much the first vendor to send me a news item I could post verbatim. Admirable conciseness, clarity, and completeness. Would that all news items were this easy. Thanks guys!)

Thursday, May 21, 2009 (Permalink)

The W3C Media Fragments Working Group has posted the first public working draft of Use cases and requirements for Media Fragments:

Audio and video resources on the World Wide Web are currently treated as "foreign" objects, which can only be embedded using a plugin that is capable of decoding and interacting with the media resource. Specific media servers are generally required to provide for server-side features such as direct access to time offsets into a video without the need to retrieve the entire resource. Support for such media fragment access varies between different media formats and inhibits standard means of dealing with such content on the Web.

This specification provides for a media-format independent, standard means of addressing media fragments on the Web using Uniform Resource Identifiers (URI). In the context of this document, media fragments are regarded along three different dimensions: temporal, spatial, and tracks. Further, a fragment can be marked with a name and then addressed through a URI using that name. The specified addressing schemes apply mainly to audio and video resources - the spatial fragment addressing may also be used on images.

The aim of this specification is to enhance the Web infrastructure for supporting the addressing and retrieval of subparts of time-based Web resources, as well as the automated processing of such subparts for reuse. Example uses are the sharing of such fragment URIs with friends via email, the automated creation of such fragment URIs in a search engine interface, or the annotation of media fragments with RDF. This specification will help make video a first-class citizen of the World Wide Web.

The media fragment URIs specified in this document have been implemented and demonstrated to work with media resources over the HTTP and RTP/RTSP protocols. Existing media formats in their current representations and implementations provide varying degrees of support for this specification. It is expected that over the time, media formats, media players, Web Browsers, media and Web servers, as well as Web proxies will be extended to adhere to the full requirements given in this specification.

Monday, May 18, 2009 (Permalink)

The W3C XML Schema Working Group has posted candidate recommendations of XML Schema 1.1 Part 1: Structures and XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. According to the structures draft,

The major changes since version 1.0 include:

  • Support for XML 1.1 has been added. It is now implementation defined whether datatypes dependent on definitions in [XML] and [Namespaces in XML] use the definitions as found in version 1.1 or version 1.0 of those specifications.
  • A new primitive decimal type has been defined, which retains information about the precision of the value. This type is aligned with the floating-point decimal types which are included in [IEEE 754-2008].
  • In order to align this specification with those being prepared by the XSL and XML Query Working Groups, a new datatype named anyAtomicType which serves as the base type definition for all primitive atomic datatypes has been introduced.
  • The conceptual model of the date- and time-related types has been defined more formally.
  • A more formal treatment of the fundamental facets of the primitive datatypes has been adopted.
  • More formal definitions of the lexical space of most types have been provided, with detailed descriptions of the mappings from lexical representation to value and from value to ·canonical representation·.
  • The validation rule Datatype Valid (§4.1.4) has been recast in more declarative form. A paraphrase of the constraint in procedural terms, which corrects some errors in the previous versions of this document, has been added as a note.
  • The rules governing partial implementations of infinite datatypes have been clarified.
  • Various changes have been made in order to align the relevant parts of this specification more closely with other relevant specifications, including especially the corresponding sections of [XSD 1.1 Part 1: Structures].

Changes since the previous public Working Draft include the following:

  • To reduce confusion and avert a widespread misunderstanding, the normative references to various W3C specifications now state explicitly that while the reference describes the particular edition of a specification current at the time this specification is published, conforming implementations of this specification are not required to ignore later editions of the other specification but instead may support later editions, thus allowing users of this specification to benefit from corrections to other specifications on which this one depends.
  • Schema Component Constraint enumeration facet value required for NOTATION (§3.3.20), which restricts the use of NOTATION to validate ·literals· without first enumerating a set of values, has been clarified.
  • The use of the namespace whose URI is http://www.w3.org/2001/XMLSchema-datatypes continues to be defined. An earlier draft of this specification had introduced text deprecating that use; that text has been deleted. This change resolves issue 6522 Please un-deprecate the namespace http://www.w3.org/2001/XMLSchema-datatypes, raised by John Cowan.
  • The discussion of whitespace handling in whiteSpace (§4.3.6) makes clearer that when the value is collapse, ·literals· consisting solely of whitespace characters are reduced to the empty string; the earlier formulation has been misunderstood by some implementors.
  • The value space of anyURI is now explicitly identified; this resolves issue 3264 xs:anyURI definition, raised by the W3C XML Query and XSL working groups.
  • References to IEEE 754-1985 have been updated to refer to 754-2008 (resolves issue 6664).
  • The description of the value space of precisionDecimal has been revised for better clarity; this resolves issue 3248.
  • In the discussions of the built-in list datatypes, the display of facets which have a value for the datatype in question has been corrected; this resolves issue 6734 NMTOKENS IDREFS and ENTITIES should all have a "whiteSpace" facet. The wording used to introduce facets with values has also been revised to try to reduce confusion.
  • The historical list of leap seconds given in earlier versions of this document has been removed (see issue 6554).
  • The publication form of this document now includes a detailed prose description of the type hierarchy diagram in section Built-in Datatypes and Their Definitions (§3). We thank the W3C Web Accessibility Initiative's Protocols and Formats Working Group for their comments and assistance in this connection.
  • Several other editorial corrections and improvements have been made.
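The whiteSpace clarification in the list above is easy to see in a few lines of code. This is a minimal Python sketch, assuming the three whiteSpace values defined by XML Schema (preserve, replace, collapse); the function name apply_whitespace is made up for illustration and the sketch is not taken from any particular validator.

```python
import re

def apply_whitespace(literal: str, value: str) -> str:
    """Normalize a literal according to an XSD whiteSpace facet value."""
    if value == "preserve":
        return literal
    # "replace": each tab, line feed, and carriage return becomes a space
    replaced = re.sub(r"[\t\n\r]", " ", literal)
    if value == "replace":
        return replaced
    if value == "collapse":
        # After replacement, runs of spaces shrink to one space and
        # leading/trailing spaces are dropped.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError("unknown whiteSpace value: " + value)

# A literal made up solely of whitespace collapses to the empty string.
assert apply_whitespace(" \t\n ", "collapse") == ""
assert apply_whitespace("  a \t b  ", "collapse") == "a b"
```

The first assertion is the case the draft says some implementors misunderstood: nothing is left over, not even a single space.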

In the datatypes spec,

The major revisions since the previous public working draft include the following:

Comments are due by February 20.

Wednesday, May 13, 2009 (Permalink)

The W3C XHTML working group has published proposed edited recommendations of 4 XHTML 1.1 specs:

  • XHTML™ Basic 1.1 - Second Edition: "In this revision, an XML Schema implementation and the lang attribute have been added." (A schema is fine, but a new attribute is really beyond what a merely edited recommendation should offer. The problem is the W3C process is too heavyweight and hidebound, and mixes experimentation with standardization, so even the W3C doesn't really follow it except in name any more.)
  • XHTML-Print - Second Edition: "The only substantive changes in this version are the addition of an implementation of the markup language using XML Schema."
  • XHTML™ 1.0: The Extensible HyperText Markup Language (Third Edition): "The only change of note in this version is that the contents of the informative Appendix A have been removed as they are being kept up to date in a separate document"
  • XHTML™ 1.1 - Module-based XHTML - Second Edition: "It also includes an XML Schema implementation of the language, and integrates the lang attribute to increase compatibility with User Agents and Assistive Technologies."

Friday, May 8, 2009 (Permalink)

Bare Bones Software has released version 9.2 of BBEdit, my preferred text editor on the Mac, my favorite XML editor on any platform, what I'm using to type these very words. New features include a BBEdit Sleep command. "When BBEdit gets quit because of a system shutdown, restart, or user logout, it now does a 'Sleep BBEdit' in order to restore as much of the application state as possible when starting back up. (This means that you may not be prompted to save new or unsaved documents, since they will be autosaved when the application sleeps.)" New copies cost $125. Upgrades from 9.x are free. Mac OS X 10.4 or later is required.

Thursday, May 7, 2009 (Permalink)

The OpenOffice Project has released OpenOffice 3.1, an open source office suite for Linux, Solaris, and Windows that saves all its files as zipped XML and uses XForms. New features include:

  • Improvements to comments: reply feature now supports 'conversations'
  • Outline levels within paragraphs
  • Hot hints for formulae, with new and improved formulae available
  • Improved sorting
  • Flexible positioning of axes
  • Flexible handling of "missing" data points
  • SQL syntax highlighting
  • Easier deployment of macro applications
  • Improved support for bidirectional scripts
  • New locale support

I wanted to try this out, but the download site appears a bit swamped (or broken) at the moment. The English Mac version is out of commission, so I'm downloading the French version instead.

Tuesday, May 5, 2009 (Permalink)

The W3C Scalable Vector Graphics Working Group published three new public drafts for SVG Compositing Specification, SVG Referenced Parameter Variables 1.0, Part 1: Primer, and SVG Referenced Parameter Variables 1.0, Part 2: Language. SVG Compositing "adds support for raster and vector objects to be combined to produce eye catching effects via advanced alpha compositing, masks, and clipping paths." "The Referenced Parameter Variables specification provides a declarative way to incorporate parameter values into SVG content. Often, users may wish to create a single resource, and reuse it several times with specified variations, and this specification provides a means to do so without the use of script."

Monday, May 4, 2009 (Permalink)

The W3C XML Security Working Group has posted the first public draft of XML Signature Properties:

The SignatureProperties element defined by XML Signature [XMLDSIG2nd] offers a means to associate property values with an XML Signature. This document defines specific properties that may be used by various applications of XML Signature, without requiring those applications to define such properties on a per case basis. This document defines how these properties are to be specified and processed when used but does not require their use - specifications that reference this document may or may not require their use.

The changes proposed in this document would not be a breaking change to XML Signature, but warrant a new namespace for the properties themselves so that they can be used in various versions of XML Signature.

Friday, May 1, 2009 (Permalink)

The W3C Cascading Style Sheets working group has posted a second candidate recommendation of Media Queries:

HTML4 and CSS2 currently support media-dependent style sheets tailored for different media types. For example, a document may use sans-serif fonts when displayed on a screen and serif fonts when printed. ‘screen’ and ‘print’ are two media types that have been defined. Media queries extend the functionality of media types by allowing more precise labeling of style sheets.

A media query consists of a media type and zero or more expressions to limit the scope of style sheets. Among the media features that can be used in media queries are ‘width’, ‘height’, and ‘color’. By using media queries, presentations can be tailored to a specific range of output devices without changing the content itself.
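As a rough illustration of "a media type and zero or more expressions," here is a minimal Python sketch of how a media query might be evaluated against a device. The dictionary layout, the matches function, and the sample devices are all assumptions made for this example; it does not parse real CSS syntax, and it handles only numeric features with the min-/max- prefixes.

```python
def matches(query, device):
    """Return True if the device satisfies the media type and every expression."""
    if query["type"] not in ("all", device["type"]):
        return False
    for feature, value in query["expressions"]:
        if feature.startswith("min-"):
            if device.get(feature[4:], 0) < value:
                return False
        elif feature.startswith("max-"):
            if device.get(feature[4:], 0) > value:
                return False
        elif value is None:
            # A bare feature such as (color) holds when the device value is non-zero
            if not device.get(feature, 0):
                return False
        elif device.get(feature) != value:
            return False
    return True

# Hypothetical devices: widths and heights in pixels, color in bits per component
screen = {"type": "screen", "width": 1024, "height": 768, "color": 8}
printer = {"type": "print", "width": 800, "height": 1100, "color": 0}

# Roughly: screen and (min-width: 800px) and (color)
wide_color_screen = {"type": "screen", "expressions": [("min-width", 800), ("color", None)]}

assert matches(wide_color_screen, screen)
assert not matches(wide_color_screen, printer)
```

A query with no expressions, such as plain print, reduces to the old media-type check, which is why media queries extend media types rather than replace them.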

Wednesday, April 29, 2009 (Permalink)

The fourth beta of Firefox 3.5 is out and, at least on the Mac, it's sucking wind. It's ugly as sin. Scroll bars are broken. You're bombarded with password pop-ups as soon as you restart (one for each web site you have open instead of just one for the entire browser). Finally, they still haven't fixed this AppleScript bug. (Well, seems they fixed part of it; but there are at least three more related AppleScript bugs they haven't fixed so scripts that worked in Firefox 1 and 2 still don't work.) Is it time to give up on Firefox? If only Safari had a decent plug-in ecosystem.

They've also released Firefox 3.0.10 to fix a security vulnerability.

Tuesday, April 28, 2009 (Permalink)

The CSS Working Group has published the candidate recommendation of Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification:

CSS 2.1 builds on CSS2 [CSS2] which builds on CSS1 [CSS1]. It supports media-specific style sheets so that authors may tailor the presentation of their documents to visual browsers, aural devices, printers, braille devices, handheld devices, etc. It also supports content positioning, table layout, features for internationalization and some properties related to user interface.

CSS 2.1 corrects a few errors in CSS2 (the most important being a new definition of the height/width of absolutely positioned elements, more influence for HTML's "style" attribute and a new calculation of the 'clip' property), and adds a few highly requested features which have already been widely implemented. But most of all CSS 2.1 represents a "snapshot" of CSS usage: it consists of all CSS features that are implemented interoperably at the date of publication of the Recommendation.

CSS 2.1 is derived from and is intended to replace CSS2. Some parts of CSS2 are unchanged in CSS 2.1, some parts have been altered, and some parts removed. The removed portions may be used in a future CSS3 specification. Future specs should refer to CSS 2.1 (unless they need features from CSS2 which have been dropped in CSS 2.1, and then they should only reference CSS2 for those features, or preferably reference such feature(s) in the respective CSS3 Module that includes those feature(s)).

Monday, April 27, 2009 (Permalink)

The W3C XQuery Working Group has posted proposed edited recommendations of the various XQuery/XPath 2 specs:

A "second edition is not a new version of this specification; its purpose is to clarify a number of issues that have become apparent since the first edition was published. All of these clarifications (excepting trivial editorial fixes) have been published in a separate errata document."

Friday, April 24, 2009 (Permalink)

The W3C Web Applications Working Group has published first public working drafts of specifications for APIs that enhance the open Web platform as a runtime environment for full-featured applications:

  • Web Storage provides APIs for persistent client-side data storage by Web applications.
  • Web Workers defines an API for enabling thread-like operations (using message-passing) in Web applications, so that certain application tasks can run in parallel.
  • Web Sockets API provides an API for full-duplex communication between a Web application and a remote host.
  • Server-Sent Events defines an API for opening an HTTP connection for receiving push notifications from a server in the form of DOM events.

The Web Storage, Web Sockets API, and Server-Sent Events specifications were previously published as parts of the HTML 5 specification, but will now each become Recommendation-track deliverables within the Web Applications Working Group.

If we ultimately succeed in recreating the full desktop experience on the Web, is it still the Web? Or have we just finally succeeded in abstracting out the operating system and the platform?

I think the answer is yes if, and only if, URL addressability is maintained. If each individual screen of an application is an independently URL addressable resource, and if the back button still works, then it's the Web. (I need to flesh this out more with some examples. That's not quite correct.) Otherwise it's just standard application development with different APIs, a different language, and a somewhat improved delivery mechanism. Not that this is a bad thing in all cases, but it's not the Web. More than HTML, more even than HTTP, the Web is about URLs. For example a Web 2.0 slideshow app should have separate URLs for each and every slide in each and every slide show. Just going to http://slideshow.example.org/, logging in to choose a user, and running an RIA in a browser window is not the Web. Too many Web 2.0 apps today are just weak desktop apps that run in the browser. Web 2.0 needs to be a superset of Web 1.0, not something completely different.

Wednesday, April 22, 2009 (Permalink)

The Mozilla Project has released Firefox 3.0.9 and SeaMonkey 1.1.16 to fix various security and stability bugs including one affecting XSLT (actually fixed in 3.0.8). All users should upgrade.

Tuesday, April 21, 2009 (Permalink)

The W3C Technical Architecture Group has published a note on Usage Patterns For Client-Side URI parameters:

Designers of URIs have traditionally used ? to encode server-side parameters. At its inception, the Web also introduced fragment identifiers (preceded by #) as a means of addressing specific locations in a document. As highly interactive applications get built using Web parts (HTML, CSS and JavaScript component resources that are themselves Web addressable - see [tvr-cacm2009]), there is an increasing need for encoding interaction state as part of the URI. The Web is beginning to discover and codify design patterns based on fragment identifiers for many of these use cases.

This draft finding is being prepared in response to TAG issue #60. This document explores the issues that arise in this context, and attempts to define best practices that help:

  • Create URIs for intermediate pages in a Web application so that the back button does the right thing

  • Enable clients to address into specific points in a stream of content, e.g., video.

The goal of this finding is to initially collect the various usage scenarios that are leading to innovative uses of client-side URI parameters, along with the solutions that have been developed by the Web community. When this exercise is complete, this finding will conclude by ensuring that these design patterns are mutually compatible. If some of these usage patterns are identified as being in conflict, we will recommend best practices that help side-step such conflicts. We encourage the wider Web community to point us at emerging usage scenarios and design patterns so that we maximize our chances of arriving at a final finding that helps move forward the architecture of the Web in a self-consistent manner.
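To show what "encoding interaction state as part of the URI" can look like, here is a minimal Python sketch using only the standard library. The key=value layout inside the fragment is an assumed convention for this example, not something the TAG note prescribes, and the function names and example.org URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

def read_state(uri):
    """Split a URI into server-side (?) and client-side (#) parameters."""
    parts = urlsplit(uri)
    server = {k: v[0] for k, v in parse_qs(parts.query).items()}
    client = {k: v[0] for k, v in parse_qs(parts.fragment).items()}
    return server, client

def with_client_state(uri, **state):
    """Return the same URI with its fragment replaced by the given state."""
    parts = urlsplit(uri)
    return urlunsplit(parts._replace(fragment=urlencode(state)))

# The server only ever sees deck=42; slide=7 stays in the browser.
server, client = read_state("http://slideshow.example.org/show?deck=42#slide=7")
assert server == {"deck": "42"}
assert client == {"slide": "7"}

# Addressing a point in a stream of content, e.g. 120 seconds into a video
assert with_client_state("http://video.example.org/talk?id=5", t=120) == \
    "http://video.example.org/talk?id=5#t=120"
```

Because the fragment is never sent in the HTTP request, changing it gives each intermediate page its own URI for the back button without a round trip to the server.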

Monday, April 20, 2009 (Permalink)

SyncroSoft has released <oXygen/> 10.2, a $366 payware XML editor written in Java. <oXygen/> supports XML, XSL, DTDs, XQuery, SVG, Relax NG, Schematron, and the W3C XML Schema Language. According to the announcement:

The visual editing in <oXygen/> 10.2 includes content change tracking support, image scaling, table width and column width support including visual resizing, a more compact representation for included content and tag markers as well as a number of additions to the Author SDK to allow building powerful applications on top of <oXygen/>.

The XML developers will benefit in <oXygen/> XML Editor 10.2 of a redesigned and more ergonomic XML Schema diagram with a number of additional editing capabilities, will be able to see and understand the include/import relations between XML Schema documents, Relax NG schemas and XSLT stylesheets with the help of the new Hierarchy/Dependencies views. The addition of the Author page for all XML schema editors creates a completely new perspective on schemas and schema documentation editing.

Tuesday, April 14, 2009 (Permalink)

The W3C CSS Working group has posted a working draft of CSS 2D Transforms Module Level 3:

The CSS visual formatting model describes a coordinate system within which each element is positioned. Positions and sizes in this coordinate space can be thought of as being expressed in pixels, starting in the upper left corner of the parent with positive values proceeding to the right and down.

This coordinate space can be modified with the transform property. Using transform, elements can be translated, rotated and scaled in two dimensional space. The coordinate space behaves as described in the coordinate system transformations section of the SVG 1.1 specification. This is a coordinate system with two axes: the X axis increases horizontally to the right; the Y axis increases vertically downwards.

Specifying a value other than ‘none’ for the transform property establishes a new local coordinate system at the element that it is applied to. Transformations are cumulative. That is, elements establish their local coordinate system within the coordinate system of their parent. In this way, a transform property effectively accumulates all the transform properties of its ancestors. The accumulation of these transforms defines a current transformation matrix (CTM) for the element.

The transform property does not affect the flow of the content surrounding the transformed element. However, the value of the overflow area takes into account transformed elements. This behavior is similar to what happens when elements are translated via relative positioning. Therefore, if the value of the overflow property is scroll or auto, scrollbars will appear as needed to see content that is transformed outside the visible area.
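
The accumulation the draft describes is ordinary matrix multiplication. Here is a minimal sketch in Python, assuming 3x3 homogeneous matrices stored as nested lists. The helper names (translate, scale, rotate, ctm, apply_matrix) are made up for illustration and do not come from the draft:

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scale(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def rotate(degrees):
    # With the Y axis pointing down, a positive angle appears clockwise on screen.
    r = math.radians(degrees)
    c, s = math.cos(r), math.sin(r)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def ctm(transforms):
    """Compose transforms listed from the outermost ancestor down to the element."""
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for t in transforms:
        result = matmul(result, t)
    return result

def apply_matrix(m, x, y):
    """Map a point in the element's local coordinates into the root coordinate space."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# A parent translated by (100, 50) containing a child scaled by 2:
# the child's local point (10, 10) ends up at (120, 70).
point = apply_matrix(ctm([translate(100, 50), scale(2, 2)]), 10, 10)
```

Order matters: composing scale(2, 2) before translate(100, 50) would double the translation as well.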

They've also published a working draft of CSS 3D Transforms Module Level 3:

This is an enhancement to the CSS Transforms specification which provides transforms in three dimensions. It should be read in conjunction with that specification.

This coordinate space can be modified with the transform property. Using transform, elements can be translated, rotated and scaled in two or three dimensional space. A perspective transform can also be applied to give a sense of depth to the way elements are displayed. In three dimensions, a Z axis is added, with positive z values conceptually rising perpendicularly out of the window toward the user and negative z values falling into the window away from the user.

Any value other than ‘none’ for the transform results in the creation of both a stacking context and a containing block. The object acts as though position: relative has been specified, but also acts as a containing block for fixed positioned descendants. The position on the Z axis of a transformed element does not affect the order within a stacking context. With elements at the same z-index, objects are drawn in order of increasing z position.

Note that while transform uses a three-dimensional coordinate system, the elements themselves are not three-dimensional objects. Instead, they exist on a two-dimensional plane (a flat surface) and have no depth.
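
The sense of depth comes from a perspective divide. The sketch below uses the usual formulation, in which a point at depth z seen from distance d is scaled by d / (d - z). This is an assumption about the math, not text quoted from the draft, and the function name project is hypothetical:

```python
def project(x, y, z, d):
    """Project a 3D point onto the z=0 plane for a viewer at distance d.

    Positive z is toward the user, so points with z > 0 appear larger.
    """
    if z >= d:
        raise ValueError("point is at or behind the viewer")
    w = 1 - z / d          # homogeneous w after the perspective matrix
    return (x / w, y / w)

flat = project(100, 50, 0, 500)       # on the plane: unchanged
closer = project(100, 50, 250, 500)   # halfway to the viewer: doubled
farther = project(100, 50, -500, 500) # behind the plane: halved
```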

Friday, April 10, 2009 (Permalink)

The W3C POWDER Working Group has posted second last call working drafts of three specs:

According to the formal semantics,

The Protocol for Web Description Resources, POWDER, offers a simple method of associating RDF data with groups of resources. Its primary 'unit of information' is the Description Resource (DR). This comprises three elements:

  • attribution (who is providing the description)
  • scope (defined as a set of IRIs over which the description applies to the resources de-referenced from those IRIs)
  • the description itself (the 'descriptor set').

To some extent, this approach is in tension with the core semantics of RDF and OWL. To resolve that tension, it is necessary to extend RDF semantics as described below. In order to minimize the required extension, while at the same time preserving the relatively simple encoding of POWDER in XML which is generally readable by humans, we define a multi-layered approach. The operational semantics, i.e. the encoding of POWDER in XML, is first transformed into a more restricted XML encoding that is less easily understood by humans and depends on matching IRIs against regular expressions to determine whether or not they are within the scope of the DR. This latter encoding is, in its own turn, transformed into the extended-RDF encoding.
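
The scope test in that restricted encoding comes down to matching an IRI against regular expressions. Here is a rough sketch in Python. The DescriptionResource class, its field names, the patterns, and the rule that every include pattern must match are illustrative assumptions, not definitions taken from the POWDER drafts:

```python
import re
from dataclasses import dataclass, field

@dataclass
class DescriptionResource:
    attribution: str                 # who is providing the description
    include_patterns: list           # assumed: the IRI must match all of these
    exclude_patterns: list = field(default_factory=list)  # and none of these
    descriptors: dict = field(default_factory=dict)       # the descriptor set

    def in_scope(self, iri):
        """Return True if the IRI falls within this DR's scope."""
        if not all(re.search(p, iri) for p in self.include_patterns):
            return False
        return not any(re.search(p, iri) for p in self.exclude_patterns)

def describe(dr, iri):
    """Return the descriptor set if the IRI is in scope, otherwise None."""
    return dr.descriptors if dr.in_scope(iri) else None

dr = DescriptionResource(
    attribution="http://authority.example/foaf#me",
    include_patterns=[r"^https?://([^/]+\.)?example\.com/"],
    exclude_patterns=[r"/private/"],
    descriptors={"color": "red"},
)
```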

Thursday, April 9, 2009 (Permalink)

The W3C Cascading Stylesheets Working Group has published a new working draft of CSS Template Layout Module. According to the draft,

The styling of a Web page, a form or a graphical user interface can roughly be divided in two parts: (1) defining the overall “grid” of the page or window and (2) specifying the fonts, indents, colors, etc., of the text and other objects. The two are not completely separate, of course, because indenting or coloring a text influences the perceived grid as well. Nevertheless, when one separates the parts of a style that should change when the window gets bigger from the parts that stay the same, one often finds that the grid changes (room for a sidebar, extra navigation bar, big margins, larger images…), while fonts, colors, indents, numbering styles, and many other things don't have to change, until the size of the window becomes extreme.

The properties in this specification work by associating a layout policy with an element. Rather than letting an element lay out its descendants in their normal order as inline text or as blocks of text (the policies available in CSS level 1), the policy defined in this module, called template-based positioning, gives an element an invisible grid for aligning descendant elements.

Because layouts on the Web have to adapt to different window and paper sizes, the rows and columns of the grid can be made fixed or flexible in size.

The typical use cases for these properties include:

  • Complex Web pages, with multiple navigation bars in fixed positions, areas for advertisements, etc.
  • Complex forms, where the alignment of labels and form fields may be easier with the properties of this module than with the properties for tables and margins.
  • GUIs, where buttons, toolbars, labels, icons, etc., are aligned in complex ways and have to stay aligned (and not wrap, for example) when the window is resized.
  • Paged displays (e.g., printed media) where each page is divided in fixed areas for different kinds of content.

Template-based positioning is an alternative to absolute positioning, which, like absolute positioning, is especially useful for aligning elements that don't have simple relationships in the source (parent-child, ancestor-descendant, immediate sibling). But in contrast to absolute positioning, the elements are not positioned with the help of horizontal and vertical coordinates, but by mapping them into slots in a table-like template. The relative size and alignment of elements is thus governed implicitly by the rows and columns of the template. A template doesn't allow elements to overlap, but it provides layouts that adapt better to different widths.

The mapping is done with the ‘position’ property, which specifies in this case into which slot of the template the element goes. The template itself is specified on the ‘display’ property of some ancestor of the elements to remap.

Wednesday, April 8, 2009 (Permalink)

The W3C EXI working group has posted a new working draft of Efficient XML Interchange Evaluation. They report some impressive numbers. However it's not immediately apparent to what extent the most impressive ones cheat by hiding some of the information in the schema, rather than producing self-contained documents as one finds in actual XML. They also appear to be throwing away information in some cases. This doesn't seem to always be a lossless compression.

Tuesday, April 7, 2009 (Permalink)

Just noticed that Microsoft has released Internet Explorer 8 for Windows. It's supposed to be more standards compliant, but it still can't handle application/xhtml+xml. I haven't tested it yet, but I'd be astonished if application/xslt+xml isn't similarly broken. When Microsoft can't even get the easy things right, what hope is there for supporting genuinely hard standards?

Friday, April 3, 2009 (Permalink)

The W3C OWL Working Group has published a working draft of OWL 2 Web Ontology Language Document Overview:

The OWL 2 Web Ontology Language, informally OWL 2, is an ontology language for the Semantic Web with formally defined meaning. OWL 2 ontologies provide classes, properties, individuals, and data values and are stored as Semantic Web documents. OWL 2 ontologies can be used along with information written in RDF, and OWL 2 ontologies themselves are primarily exchanged as RDF documents.

This document, part 1 of 13 in the OWL 2 document set, serves as an introduction to OWL 2 and the various other OWL 2 documents. It describes the various syntaxes for OWL 2, the different kinds of semantics, the defined profiles (sub-languages), and the differences between OWL 1 and OWL 2.

Thursday, April 2, 2009 (Permalink)

Quark has released Quark XML Author 3. Version 3 integrates with MathType for creating equations and improves table formatting. As near as I can figure out, it costs as much as Quark's salespeople think you can afford to pay.

Tuesday, March 31, 2009 (Permalink)

Norm Walsh has posted version 0.9.7 of Calabash, an open source XProc implementation written in Java. This release fixes bugs and adds <cx:uri-info> and <cx:eval> extension steps. Java 5 or later is required. Calabash is published under the GNU General Public License Version 2.0.

Wednesday, March 25, 2009 (Permalink)

Apologies that things have been a little slow here lately. Personal life intervenes. This seems likely to continue for another month or two at least. If you need to reach me please start using ibiblio.org instead of metalab.unc.edu in the domain. It may be a while before I have all the links updated. The old e-mail address should work for the foreseeable future (sunsite.unc.edu still does after almost 15 years) but the best laid schemes o' Mice an' Men, gang aft agley and all that. Thanks.


The DBIS Group at University of Konstanz has released BaseX 5.2, an open source native XML database with a GUI frontend that supports most of XQuery 1.0 and some of XQuery Full-Text. It seems to be written in Java so one presumes it's platform independent. Changes in this release include:

  • XPath 1.0 support has been removed.
  • XQuery has been optimized
  • Improved, sequential and index-based XQuery Full-Text 1.0 support.
  • Enhanced TreeMap visualization with new layout algorithms and text thumbnails

Friday, March 20, 2009 (Permalink)

The W3C Voice Browser, Web APIs, and Web Application Formats (WAF) Working Groups have posted a new working draft of Cross-Origin Resource Sharing (formerly Access Control for Cross-site Requests). According to the draft, "This document defines a mechanism to enable client-side cross-origin requests. Specifications that want to enable cross-origin requests in an API they define can use the algorithms defined by this specification. If such an API is used on http://example.org resources, a resource on http://hello-world.example can opt in using the mechanism described by this specification (e.g., specifying Access-Control-Allow-Origin: http://example.org as response header), which would allow that resource to be fetched cross-origin from http://example.org." In other words, Bob's browser can download a JavaScript program from Alice's server. The JavaScript served from Alice's computer but running in Bob's browser can now download more data from Eve's server if Eve allows it.
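
The server side of that opt-in can be sketched in a few lines of Python. This is a simplified illustration, not the draft's algorithm. The function name and the allow-list are hypothetical, and it assumes the browser announces the requesting site in an Origin request header:

```python
def cors_headers(request_headers, allowed_origins):
    """Return the extra response headers for a simple cross-origin request.

    request_headers: dict of request header names to values
    allowed_origins: set of origins this server chooses to opt in
    """
    origin = request_headers.get("Origin")
    if origin is None:
        return {}  # no Origin header: not treated as a cross-origin request here
    if origin in allowed_origins:
        # Opt in for exactly this origin.
        return {"Access-Control-Allow-Origin": origin}
    return {}      # no opt-in: the browser keeps the response from the script

granted = cors_headers({"Origin": "http://example.org"}, {"http://example.org"})
denied = cors_headers({"Origin": "http://evil.example"}, {"http://example.org"})
```

Note that the check only controls what the browser hands to the script. The request itself still reaches the server either way.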

I finally see how this works, but there are some major flaws here. There are several reasons cross-server requests have been banned in Java and JavaScript for the last 15 years: distributed denial of service attacks, behind-the-firewall scanning, and information distribution to 3rd parties. This approach ameliorates these problems, but does not eliminate them. Each of those possible attacks is less powerful and effective when this regime is in place, but is still possible. It also makes it dramatically easier for 3rd parties to write and install software to track user behavior across web sites. User privacy concerns do not appear to play any part in the design of this spec.

Friday, March 13, 2009 (Permalink)

The W3C Evaluation and Repair Tools Working Group has posted the first public working draft of Pointer Methods in RDF. "This specification contains a framework for representing pointers - entities that permit identifying a portion or segment of a piece of content - making use of the Resource Description Framework (RDF). It will also describe a number of specific types of pointers that permit portions of a document to be referred to in different ways. When referring to a specific part of, say, a piece of Web content, it is useful to be able to have a consistent manner by which to refer to a particular segment of a Web document, to have a variety of ways by which to refer to that same segment, and to make the reference robust in the face of changes to that document." Briefly, this defines a meta-pointer representation that can encompass XPointers, CSS selectors, Byte offset pointers, line-char pointers, and potentially other forms of pointers into documents. Here's one of the simpler examples from the spec:

<ptr:ByteOffsetCompoundPointer rdf:about="#byteOffsetCompoundPointer"> 
  <ptr:startPointer rdf:resource="#byteOffsetPointer"/>   
  <ptr:byteOffset>255</ptr:byteOffset>   
</ptr:ByteOffsetCompoundPointer>

All that just to point at the position 255 bytes in. Somehow I think we'll be using #foo for some time to come.

Wednesday, March 11, 2009 (Permalink)

The W3C CSS Working group has posted the last call working draft of Selectors Level 3. "Selectors are patterns that match against elements in a tree, and as such form one of several technologies that can be used to select nodes in an XML document. Selectors have been optimized for use with HTML and XML, and are designed to be usable in performance-critical code. CSS (Cascading Style Sheets) is a language for describing the rendering of HTML and XML documents on screen, on paper, in speech, etc. CSS uses Selectors for binding style properties to elements in the document. This document describes the selectors that already exist in CSS1 and CSS2, and further introduces new selectors for CSS3 and other languages that may need them." New pseudo-classes include :root, :nth-child(), :nth-last-child(), :nth-of-type(), :nth-last-of-type(), :first-child, :last-child, :first-of-type, :last-of-type, :only-child, :only-of-type and :empty. There's also a negation pseudo-class and a general sibling combinator.
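
The :nth-child() family takes an argument of the form an+b and matches an element whose 1-based position among its siblings equals a*n+b for some non-negative integer n. Here is a small Python sketch of that test. The function name nth_matches is made up for illustration:

```python
def nth_matches(a, b, index):
    """Return True if a 1-based sibling index matches the pattern an+b."""
    if a == 0:
        return index == b          # plain :nth-child(b)
    diff = index - b
    # index = a*n + b must hold for some integer n >= 0
    return diff % a == 0 and diff // a >= 0

odd = [i for i in range(1, 9) if nth_matches(2, 1, i)]          # 2n+1
first_three = [i for i in range(1, 9) if nth_matches(-1, 3, i)] # -n+3
```

Here 2n+1 picks positions 1, 3, 5 and 7 out of the first eight siblings, and -n+3 picks the first three.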

Monday, March 9, 2009 (Permalink)

I've released XOM 1.2.1, my free-as-in-speech (LGPL) dual streaming/tree-based API for processing XML with Java. 1.2.1 fixes one very minor glitch in 1.2.0. (The version number was still printing 1.0 on System.out.) If you already upgraded to 1.2.0 and this glitch doesn't bother you, there's no real need to upgrade. If you're using 1.1 or earlier, please upgrade as some significant bugs have been fixed since then.

Friday, March 6, 2009 (Permalink)

Norm Walsh has posted version 0.9.5 of Calabash, an open source XProc implementation written in Java. Calabash currently passes all the tests in the XProc test suite. Java 5 or later is required. Calabash is published under the GNU General Public License Version 2.0.

Thursday, March 5, 2009 (Permalink)

The Mozilla Project has released Firefox 3.0.7 and SeaMonkey 2.0 alpha 3 to fix various security bugs. All users should upgrade. In addition, Estonian, Kannada, and Telugu localized versions are now available.

Wednesday, March 4, 2009 (Permalink)

Opera Software has released version 9.64 of their namesake free-beer web browser for Windows, Mac, and Linux. "Opera 9.64 is a recommended security and stability upgrade, incorporating the Opera Presto 2.1.1 user agent engine." This fixes at least three major security issues that could lead to arbitrary code execution, cross domain scripting, and something they're too scared to even talk about. All users should upgrade.

Tuesday, March 3, 2009 (Permalink)

I am pleased to announce the release of XOM 1.2, my free-as-in-speech (LGPL) dual streaming/tree-based API for processing XML with Java. Version 1.2 adds several minor new features including:

  • Latest Unicode normalization tables.
  • A lookup method in XPathContext that finds the namespace URI for a prefix
  • xml:id attributes no longer checked for NCNames
  • Upgraded to Xerces 2.8.0, DTD-only version
  • DOMConverter can accept a NodeFactory to be used in creating the XOM document

In addition, various bugs were fixed, especially in canonicalization and XPath. However there's at least one bug fix in the core so I recommend all users upgrade. XOM 1.2 should be fully backwards compatible with code written to 1.0 and 1.1 APIs. 1.2 should also be somewhat easier to compile and edit due to various changes with UnicodeUtil and Jaxen.

My apologies that this one took so long. There should have been a release in 2007, but late 2007-2008 was a very challenging time. (Moved across the country; new jobs, etc.) On the plus side, several additional bugs did get fixed over the last year.

At this point XOM is essentially feature-complete. I have no plans for major additions to the API though I will continue to fix any bugs that are discovered and improve performance as opportunities suggest themselves.

Friday, February 27, 2009 (Permalink)

The Omni Group has released OmniWeb 5.9 as free beer. OmniWeb is a Mac OS X web browser that supports the core parts of XML on the Web including XSLT and CSS. As a product, I suspect it was doomed: nice enough but fundamentally a niche browser on a niche platform. Good niche products can succeed. Niche squared rarely does.

Thursday, February 26, 2009 (Permalink)

The W3C Authoring Tool Accessibility Guidelines Working Group has posted a new working draft of Authoring Tool Accessibility Guidelines 2.0. "An authoring tool that conforms to these guidelines will promote accessibility by providing an accessible user interface to authors with disabilities as well as enabling, supporting, and promoting the production of accessible Web content by all authors."

Wednesday, February 25, 2009 (Permalink)

SyncroSoft has released <Oxygen/> 10.1, a $366 payware XML editor written in Java. Oxygen supports XML, XSL, DTDs, XQuery, SVG, Relax NG, Schematron, and the W3C XML Schema Language. According to the announcement:

Version 10.1 adds a new, redesigned XML Schema diagram that allows visual XML Schema editing, a new XML Schema documentation engine supporting multiple output formats, new features in the Author visual XML editing mode, improvements in the Outliner and content completion and updates the database support.

The new version adds a new simplified and improved visual editing page for XML Schema, in addition to the hybrid text/diagram page. The structure of the diagram was redesigned in order to be more intuitive and easy to use. The XML Schema diagram helps both the content authors who want to understand a schema and schema designers who develop complex schemas.

The XML Schema Documentation tool was redesigned to deliver the best available performance for generating XML schema documentation. Even for the largest XML schemas (thousands of components), <oXygen/> generates the complete documentation within a couple of minutes. The documentation output is carefully designed with a balanced selection of colors and a neat layout.

Oxygen has become pretty much the editor of choice among XML experts who don't use a raw text editor. If you're going to buy an XML editor, this is the one to buy. (I'm still a BBEdit/jEdit/vi user myself though.)

I also notice that at some point while I wasn't looking they added a $179 author edition that rips out the XQuery and XSLT development support to provide a cleaner, simpler interface for just writing. That sounds like a good idea. Developers and writers don't really need the same tools or interfaces. I'll have to check that out when I get a minute. In fact, I'm working on an article in raw XML right now that would make a good test case. Maybe I'll load it up and see how it goes.

Tuesday, February 24, 2009 (Permalink)

Apple has posted a beta of Safari 4.0 for Windows and the Mac. Safari supports XML, XSLT, CSS, XHTML, and RSS. 4.0 adds cover flow for web pages and faster JavaScript execution. However this beta seems to be causing a lot of crashes and problems. I recommend waiting for the release version. Mac OS X 10.4 or Windows XP or later is required.

Monday, February 23, 2009 (Permalink)

Bare Bones Software has released version 9.1.1 of BBEdit, my preferred text editor on the Mac, my favorite XML editor on any platform, what I'm using to type these very words. This is a bug fix release. New copies cost $125. Upgrades from 9.0 are free. Mac OS X 10.4 or later is required.

Sunday, February 22, 2009 (Permalink)

The W3C Multimodal Interaction working group has posted the finished recommendation of EMMA: Extensible MultiModal Annotation markup language. According to the abstract, this spec "provides details of an XML markup language for containing and annotating the interpretation of user input. Examples of interpretation of user input are a transcription into words of a raw signal, for instance derived from speech, pen or keystroke input, a set of attribute/value pairs describing their meaning, or a set of attribute/value pairs describing a gesture. The interpretation of the user's input is expected to be generated by signal interpretation processes, such as speech and ink recognition, semantic interpreters, and other types of processors for use by components that act on the user's inputs such as interaction managers."

Friday, February 20, 2009 (Permalink)

agenceXML has posted a beta of XSLTForms, an XSLT-based XForms engine. "XSLTForms can convert XForms to XHTML+Javascript (AJAX) on any browser or web-server (PHP, ASP, J2EE,...) with an XSLT 1.0 engine." Internet Explorer, Firefox and Opera are supported.

Thursday, February 19, 2009 (Permalink)

The Apache XML Project has released Xerces-C++ 3.0.1, an open source schema validating XML parser written in reasonably cross-platform C++. Bug fixes and enhancements include:

  • More robust handling of leading/trailing whitespace
  • --disable-rpath configure option.
  • Fix a bug in the socket and winsock net accessors when a redirection response does not contain a "Location" header.
  • Make XMLMutex use the correct MemoryManager when it closes its mutex.
  • Documentation fixes
  • Force Borland projects to use CRLF EOL.
  • Memory leaks plugged

Wednesday, February 18, 2009 (Permalink)

The W3C has posted a new working draft of HTML 5. "This specification defines the 5th major revision of the core language of the World Wide Web: the Hypertext Markup Language (HTML). In this version, new features are introduced to help Web application authors, new elements are introduced based on research into prevailing authoring practices, and special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability." There's also a new draft of HTML 5 differences from HTML 4. The latter contains a convenient list of changes since the January 22 draft:

  • The data member of ImageData objects has been changed from an array to a CanvasPixelArray object.
  • Shadows are now required from implementations of the canvas element and its API.
  • Security model for canvas is clarified.
  • Various changes to the processing model of canvas have been made in response to implementation and author feedback. E.g. clarifying what happens when NaN and Infinity are passed and fixing the definitions of arc() and arcTo().
  • innerHTML in XML was slightly changed to improve round-tripping.
  • The toDataURL() method on the canvas element now supports setting a quality level when the media type argument is image/jpeg.
  • The poster attribute of the video element now affects its intrinsic dimensions.
  • The behavior of the type attribute of the link element has been clarified.
  • Sniffing is now allowed for link when the expected type is an image.
  • A section on URLs is introduced dealing with how URL values are to be interpreted and what exactly authors are required to do. Every feature of the specification that uses URLs has been reworded to take the new URL section into account.
  • It is now explicit that the href attribute of the base element does not depend on xml:base.
  • It is now defined what the behavior should be when the base URL changes.
  • URL decomposition DOM attributes are now more aligned with Internet Explorer.
  • The xmlns attribute with the value http://www.w3.org/1999/xhtml is now allowed on all HTML elements.
  • data-* attributes and custom attributes on the embed element now have to match the XML Name production and cannot contain a colon.
  • Web Socket API is introduced for bidirectional communication with a server. It is currently limited to text messages.
  • The default value of volume on media elements is now 1.0 rather than 0.5.
  • event-source was renamed to eventsource because no other HTML element uses a hyphen.
  • A message channel API has been introduced augmenting postMessage().
  • A new element named bb has been added. It represents a user agent command that the user can invoke.
  • The addCueRange() method on media elements has been modified to take an identifier which is exposed in the callbacks.
  • It is now defined how to mutate a DOM into an infoset.
  • The parent attribute of the Window object is now defined.
  • The embed element is defined to do extension sniffing for compatibility with servers that deliver Flash as text/plain. (This is marked as an issue in the specification to figure out if there is a better way to make this work.)
  • The embed can now be used without its src attribute.
  • getElementsByClassName() is defined to be ASCII case-insensitive in quirks mode for consistency with CSS.
  • In HTML documents localName no longer returns the node name in uppercase.
  • data-* attributes are defined to be always lowercase.
  • The opener attribute of the Window object is not to be present when the page was opened from a link with target="_blank" and rel="noreferrer".
  • The top attribute of the Window object is now defined.
  • The a element now allows nested flow content, but not nested interactive content.
  • It is now defined what the header element means to document summaries and table of contents.
  • What it means to fetch a resource is now defined.
  • Patterns are now required for the canvas element.
  • The autosubmit attribute has been removed from the menu element.
  • Support for outerHTML and insertAdjacentHTML() has been added.
  • xml:lang is now allowed in HTML when lang is also specified and they have the same value. In XML lang is allowed if xml:lang is also specified and they have the same value.
  • The frameElement attribute of the Window object is now defined.
  • An event loop and task queue is now defined detailing script execution and events. All features have been updated to be defined in terms of this mechanism.
  • If the alt attribute is omitted a title attribute, an enclosing figure element with a legend element descendant, or an enclosing section with an associated heading must be present.
  • The irrelevant attribute has been renamed to hidden.
  • The definitionURL attribute of MathML is now properly supported. Previously it would have ended up being all lowercase during parsing.
  • User agents must treat US-ASCII as Windows-1252 for compatibility reasons.
  • An alternative syntax for the DOCTYPE is allowed for compatibility with some XML tools.
  • Data templates have been removed (consisted of the datatemplate, rule and nest elements).
  • The media elements now support just a single loop attribute.
  • The load() method on media elements has been redefined as asynchronous. It also tries out files in turn now rather than just looking at the type attribute of the source element.
  • A new member called canPlayType() has been added to the media elements.
  • The totalBytes and bufferedBytes attributes have been removed from the media elements.
  • The Location object gained a resolveURL() method.
  • The q element has changed again. Punctuation is to be provided by the user agent again.
  • Various changes were made to the HTML parser algorithm to be more in line with the behavior Web sites require.
  • The unload and beforeunload events are now defined.
  • The IDL blocks in the specification have been revamped to be in line with the upcoming Web IDL specification.
  • Table headers can now have headers. User agents are required to support a headers attribute pointing to a td or th element, but authors are required to only let them point to th elements.
  • Interested parties can now register new http-equiv values.
  • When the meta element has a charset attribute it must occur within the first 512 bytes.
  • The StorageEvent object now has a storageArea attribute.
  • It is now defined how HTML is to be used within the SVG foreignObject element.
  • The notification API has been dropped.
  • How [[Get]] works for the HTMLDocument and Window objects is now defined.
  • The Window object gained the locationbar, menubar, personalbar, scrollbars, statusbar and toolbar attributes giving information about the user interface.
  • The application cache section has been significantly revised and updated.
  • document.domain now relies on the Public Suffix List. [PSL]
  • A non-normative rendering section has been added that describes user agent rendering rules for both obsolete and conforming elements.
  • A normative section has been added that defines when certain selectors as defined in the Selectors and the CSS3 Basic User Interface Module match HTML elements. [SELECTORS] [CSS-UI]
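
Two of the items above, the 512-byte limit on the meta charset declaration and the treatment of US-ASCII as Windows-1252, can be illustrated together. The Python sketch below is deliberately simplified: the regular expression and the function name sniff_charset are assumptions for illustration, and the draft's actual encoding detection algorithm is considerably more involved:

```python
import re

# Simplified pattern: finds charset=... inside a <meta ...> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)

def sniff_charset(document_bytes):
    """Look for a meta charset declaration in the first 512 bytes only."""
    match = META_CHARSET.search(document_bytes[:512])
    if match is None:
        return None
    label = match.group(1).decode("ascii").lower()
    # Per the change list, US-ASCII is treated as Windows-1252.
    if label == "us-ascii":
        return "windows-1252"
    return label

early = sniff_charset(b'<!DOCTYPE html><html><head><meta charset="utf-8">')
ascii_label = sniff_charset(b'<meta charset=US-ASCII>')
too_late = sniff_charset(b' ' * 600 + b'<meta charset="utf-8">')
```

A declaration that starts after the first 512 bytes is simply not seen by this sketch, which is why the change list requires it to appear early.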

Web Forms 2.0, previously a standalone specification, has been fully integrated into HTML 5 since last publication. The following changes were made to the forms chapter:

  • Support for XML submission has been removed.
  • Support for form filling has been removed.
  • Support for filling of the select and datalist elements through the data attribute has been removed.
  • Support for associating a field with multiple forms has been removed. A field can still be associated with a form it is not nested in through the form attribute.
  • The dispatchFormInput() and dispatchFormChange() methods have been removed.
  • Repetition templates have been removed.
  • The inputmode attribute has been removed.
  • The input element in the File Upload state no longer supports the min and max attributes.
  • The allow attribute on input elements in the File Upload state is no longer authoritative.
  • The pattern and accept attributes for textarea have been removed.
  • RFC 3106 is no longer explicitly supported.
  • The submit() method now just submits, it no longer ensures the form controls are valid.
  • The input element in the Range state now defaults to the middle, rather than the minimum value.
  • The size attribute on the input element is now conforming (rather than deprecated).
  • object elements now partake in form submission.
  • The type attribute of the input element gained the values color and search.
  • The input element gained a multiple attribute which allows for either multiple e-mails or multiple files to be uploaded depending on the value of the type attribute.
  • The input, button and form elements now have a novalidate attribute to indicate that the form fields should not be required to have valid values upon submission.
  • When the label element contains an input it may still have a for attribute as long as it points to the input element it contains.
  • The input element now has an indeterminate DOM attribute.
  • The input element gained a placeholder attribute.

Friday, February 13, 2009 (Permalink)

xmlHelpline.com has released Xml Schema Lightener 2.0, a $29 tool for subsetting and flattening W3C XML Schemas. Xml Schema Lightener 2.0 is based on XSLT 2.0.

Wednesday, February 11, 2009 (Permalink)

The W3C Web Applications Working Group has posted the first public working draft of Widgets 1.0: APIs and Events. "This specification defines a set of APIs and events for widgets that enable baseline functionality for widgets. Widgets are full-fledged client-side applications that are authored using Web standards. They are typically downloaded and installed on a client machine or device where they typically run as stand-alone applications outside of a Web browser. Examples range from simple clocks, stock tickers, news casters, games and weather forecasters, to complex applications that pull data from multiple sources to be "mashed-up" and presented to a user in some interesting and useful way (see [Widgets-Landscape] for more information). The runtime environment in which a widget runs is referred to as a widget user agent."

Tuesday, February 10, 2009 (Permalink)

The W3C WebCGM Working Group has posted a working draft of WebCGM 2.1, an updated version of the ISO Computer Graphics Metafile standard (ISO/IEC 8632:1999).

WebCGM 2.1 builds upon WebCGM 2.0 and adds these new features:

  1. Geometric transform definition and inquiry for objects is defined via DOM facilities and XCF facilities.
  2. getStyleProperty() inquiry method added to DOM WebCGMPicture interface and WebCGMAppStructure interface.
  3. a number of new Style Properties are added to the list of those settable via the setStyleProperty() method.
  4. Application configurable items are added, to improve predictability of WebCGM viewing: font mapping; defaults setting.
  5. g-zip compression of whole metafiles.
  6. new setRedraw() method on WebCGMMetafile interface allows postponement / control of when redraws happen.
  7. new setView() method and createWebCGMRect() convenience method on WebCGMPicture interface enable the picture view to be defined via the DOM.
  8. new getObjectExtent() method on the WebCGMAppStructure interface facilitates getting the bounding extent of one or more objects, for example to use with new setView()

In addition, WebCGM 2.1 makes these clarifications and other changes to WebCGM 2.0:

  1. Transparency clarifications, detailing how the various WebCGM 2.0 transparency facilities interact.
  2. A number of 2.0-deprecated items were made obsolete for 2.1.

Saturday, February 7, 2009 (Permalink)

Freelancer has released version 1.64 of xmlBlaster, an open source message-oriented middleware server written in Java for exchanging messages between publishers and subscribers.

XmlBlaster is a publish/subscribe and point to point 100% Java based MOM server (message-oriented middleware) which exchanges messages between publishers and subscribers. The message is described with XML-encoded meta information. Messages may contain everything, GIF images, Java objects, Python scripts, XML data, a word document, plain text - just anything.

Communication with the server is based on socket, CORBA (using JacORB), RMI, XmlRpc, HTTP or email, clients are free to choose their preferred protocol. Other protocols like SOAP may be plugged in.

Subscribers can use XPath expressions to filter the messages they wish to receive.

This is the publish/subscribe middleware server you have needed many times before, now available for free to glue together your distributed client/server application.

The xmlBlaster server is pure Java and under LGPL. PHP, Perl, Python, C, C++, C#, Visual Basic.net, Flash, J2ME, Java (applications, servlets, applets) client samples are delivered in the xmlBlaster distribution.

Thursday, February 5, 2009 (Permalink)

Code Synthesis has released XSD/e 3.0.0, a free-as-in-speech (GPL) C++ schema validating XML parser for embedded environments. According to Boris Kolpackov, "This release adds the new Embedded C++/Hybrid XML Schema to C++ mapping. C++/Hybrid provides a light-weight, tree-like object model with support for fully in-memory as well as hybrid, partially event-driven, partially in-memory XML processing modes." In addition, 3.0 adds precise reproduction of the XML vocabulary structure and element order, as well as filtering during parsing and serialization.

Wednesday, February 4, 2009 (Permalink)

The W3C XML Schema Working Group has posted possibly the third last call working drafts of XML Schema 1.1 Part 1: Structures and XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. According to the structures draft,

The major revisions since the previous public working draft include the following:

In the datatypes spec,

The major changes since version 1.0 include:

  • Support for XML 1.1 has been added. It is now implementation defined whether datatypes dependent on definitions in [XML] and [Namespaces in XML] use the definitions as found in version 1.1 or version 1.0 of those specifications.
  • A new primitive decimal type has been defined, which retains information about the precision of the value. This type is aligned with the floating-point decimal types which will be part of the next edition of IEEE 754.
  • In order to align this specification with those being prepared by the XSL and XML Query Working Groups, a new datatype named anyAtomicType which serves as the base type definition for all primitive atomic datatypes has been introduced.
  • The conceptual model of the date- and time-related types has been defined more formally.
  • A more formal treatment of the fundamental facets of the primitive datatypes has been adopted.
  • More formal definitions of the lexical space of most types have been provided, with detailed descriptions of the mappings from lexical representation to value and from value to ·canonical representation·.
  • The validation rule Datatype Valid (§4.1.4) has been recast in more declarative form. A paraphrase of the constraint in procedural terms, which corrects some errors in the previous versions of this document, has been added as a note.
  • The rules governing partial implementations of infinite datatypes have been clarified.
  • Various changes have been made in order to align the relevant parts of this specification more closely with other relevant specifications, including especially the corresponding sections of [XSD 1.1 Part 1: Structures].

Changes since the previous public Working Draft include the following:

Comments are due by February 20.

Monday, February 2, 2009 (Permalink)

Google has released Chrome 1.0.154.46. This release fixes two security bugs and incompatibilities with Yahoo! Mail and Windows Live Hotmail (the latter at least caused and fixed by bad browser sniffing.) One more time: sites that tailor content depending on user agent strings can and will be broken.

Sunday, February 1, 2009 (Permalink)

The W3C XHTML2 Working Group has published the second edition of XHTML Media Types. Briefly:

  1. If the Accept header explicitly contains application/xhtml+xml (with either no "q" parameter or a positive "q" value) deliver the document using that media type.
  2. If the Accept header explicitly contains text/html (with either no "q" parameter or a positive "q" value) deliver the document using that media type.
  3. If the accept header contains "*/*" (a convention some user agents use to indicate that they will accept anything), deliver the document using text/html.

In other words, requestors that advertise they support XHTML family documents will receive the document in the XHTML media type, and all other requestors that (at least claim to) support HTML or "everything" will receive the document using the HTML media type. Dealing with user agents that satisfy none of these criteria is outside the scope of this document.
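
The three rules above can be sketched in a few lines of JavaScript. This is only an illustration under simplifying assumptions: the names parseAccept and chooseMediaType are hypothetical, the Accept parsing is deliberately naive (it splits on commas and semicolons and ignores quoted parameters), and the function returns null for user agents that satisfy none of the rules, since the document leaves that case out of scope.

```javascript
// Hypothetical helper, not taken from the XHTML Media Types document.
// Parses an Accept header into a map from media range to q value.
function parseAccept(header) {
  const ranges = new Map();
  for (const part of (header || "").split(",")) {
    const [range, ...params] = part.split(";").map((s) => s.trim());
    if (!range) continue;
    let q = 1; // no "q" parameter counts as full preference
    for (const param of params) {
      const [name, value] = param.split("=").map((s) => s.trim());
      if (name.toLowerCase() === "q") q = parseFloat(value);
    }
    ranges.set(range.toLowerCase(), q);
  }
  return ranges;
}

// Applies the three rules in order and returns the media type to serve,
// or null when none of the rules match.
function chooseMediaType(acceptHeader) {
  const ranges = parseAccept(acceptHeader);
  const accepts = (range) => ranges.has(range) && ranges.get(range) > 0;
  if (accepts("application/xhtml+xml")) return "application/xhtml+xml"; // rule 1
  if (accepts("text/html")) return "text/html"; // rule 2
  if (ranges.has("*/*")) return "text/html"; // rule 3
  return null; // outside the scope of the three rules
}
```

Note that the rules are checked in order, so a browser that lists both application/xhtml+xml and text/html gets the XHTML media type even if it gives text/html a higher q value.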

Friday, January 30, 2009 (Permalink)

The W3C has published XML Base (Second Edition). Changes since the first edition include:

  1. The published errata (see http://www.w3.org/2001/06/xmlbase-errata) have been incorporated;

  2. The definition of URI reference has been switched from RFC2396 to 3986;

  3. The xml:base attribute has been redescribed as a Legacy Extended IRI, but this does not change its syntax (the December 2006 PER used the term "XML Resource Identifier" which was to be defined in an XLink revision, but that plan has been superseded by the definition of LEIRI in the W3C Note);

  4. Implementations are now encouraged to return base “URIs” without escaping non-URI characters;

  5. The meanings of xml:base="" and xml:base="#frag" have been clarified;

  6. The expected reference to XML Base in the forthcoming XML Media Types RFC (“son of 3023”) has been noted;

  7. It has been clarified that normal validity rules apply to the xml:base attribute;

  8. The out-of-date appendix describing effects on other standards has been removed;

  9. Various minor editorial changes have been made.
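
For anyone who hasn't used xml:base, here is a rough JavaScript sketch of how nested xml:base attributes combine. It leans on the standard URL constructor, which follows the WHATWG URL rules rather than RFC 3986 to the letter and knows nothing about LEIRIs, so treat it as an approximation for simple cases. The resolveBase helper and the example.com and example.org addresses are made up for illustration.

```javascript
// Hypothetical helper: resolves a chain of xml:base values, outermost first,
// against the URI the document was loaded from.
function resolveBase(documentUri, xmlBaseChain) {
  return xmlBaseChain.reduce(
    (base, xmlBase) => new URL(xmlBase, base).href,
    documentUri
  );
}

// Models this (made-up) document, loaded from http://example.com/docs/index.xml:
//
//   <doc xml:base="http://example.org/today/">
//     <list xml:base="/hotpicks/">
//       <link href="pick1.xml"/>
//     </list>
//   </doc>
const base = resolveBase("http://example.com/docs/index.xml", [
  "http://example.org/today/",
  "/hotpicks/",
]);
const target = new URL("pick1.xml", base).href;

console.log(base);   // http://example.org/hotpicks/
console.log(target); // http://example.org/hotpicks/pick1.xml
```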

Wednesday, January 28, 2009 (Permalink)

The OpenOffice Project has released OpenOffice 3.0.1, an open source office suite for Linux, Solaris, and Windows that saves all its files as zipped XML and uses XForms. Besides fixing bugs, 3.0.1 enhances support for grammar checkers and increases the number of words held in personal word lists to 30,000.

Monday, January 26, 2009 (Permalink)

Microsoft has posted the first release candidate of Internet Explorer 8 for Windows. It's supposed to be more standards compliant, though I haven't yet tested the XML and XSLT support. Eighth time's the charm! New features in IE8 include InPrivate Browsing. "Now you can shop for that special gift with confidence knowing your family won't accidentally find out or use a shared computer without leaving a trace."

Sunday, January 25, 2009 (Permalink)

XMLMind has released the XMLmind FO Converter 4.3, an XSL-Formatting Objects to RTF/OOXML/OpenDoc converter written in Java. Version 2.0 adds the ability to convert XSL-FO documents to Microsoft's WordprocessingML format. Version 4.3 now includes a $200 Desktop Edition and a more liberal Developer License.

Monday, January 19, 2009 (Permalink)

I've posted the third beta of XOM 1.2, my free-as-in-speech (LGPL) library for processing XML with Java. Compared to the 1.0-->1.1 transition, this is a very minor upgrade. There are just a couple of additional methods, a few bug fixes, and maybe a small optimization or two. All code written to the 1.1 or 1.0 APIs should run unchanged with 1.2. This beta upgrades jaxen to 1.1.2 and optimizes UnicodeUtil, which should be somewhat easier to compile now. If no one notices any major problems in this release, I'll probably declare it 1.2 final in a couple of weeks.

Sunday, January 18, 2009 (Permalink)

The W3C XHTML 2 Working Group has posted the candidate recommendation of CURIE Syntax 1.0: A syntax for expressing Compact URIs. This is modeled after namespace URIs and qualified names. In brief, it defines a prefix for a known base IRI (a URI that can contain non-ASCII characters like é), then appends a colon and a local part. For example, the CURIE cafe:tradeshows.xml could be shorthand for http://www.cafeaulait.org/tradeshows.xml if the prefix cafe were mapped to the URL http://www.cafeaulait.org/. Exactly how prefixes are mapped to base IRIs is left to the specification of the documents in which the CURIEs appear. However if the CURIEs are in an XML document, then the namespaces in scope define the prefix mappings. The default namespace can be used for prefix-less CURIEs.
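
The expansion itself is just string concatenation, as this small JavaScript sketch shows. The expandCurie function is hypothetical, and the plain object of prefixes stands in for whatever mechanism the host language really uses to bind prefixes. Using the empty string as the key for the default mapping is an assumption of this sketch, not something the specification dictates.

```javascript
// Hypothetical helper: expands a CURIE by looking up its prefix and
// appending the part after the colon to the mapped base IRI.
function expandCurie(curie, prefixes) {
  const colon = curie.indexOf(":");
  // No colon means a prefix-less CURIE; this sketch files its mapping under "".
  const prefix = colon === -1 ? "" : curie.slice(0, colon);
  const reference = curie.slice(colon + 1);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
    throw new Error(`No mapping for CURIE prefix "${prefix}"`);
  }
  return prefixes[prefix] + reference;
}

const prefixes = { cafe: "http://www.cafeaulait.org/" };
console.log(expandCurie("cafe:tradeshows.xml", prefixes));
// http://www.cafeaulait.org/tradeshows.xml
```

Because the result depends entirely on the prefix map, the same CURIE copied into a document with different bindings either fails to expand or silently expands to a different IRI.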

Frankly I'm surprised to see this. Namespaces and the namespace syntax are one of the notable failures of the XML ecosystem. Why someone would choose to imitate this now that we know better is beyond me. Based on experience with namespaces, I predict that moving CURIEs from one context to another is going to be especially problematic. Well, we've learned to live with (if not exactly like) namespaces. I guess we can get used to this.

Thursday, January 15, 2009 (Permalink)

The W3C CSS Working Group has published a new Candidate Recommendation of CSS Mobile Profile 2.0:

This specification defines in general a subset of CSS 2.1 [CSS21] that is to be considered a baseline for interoperability between implementations of CSS on constrained devices (e.g. mobile phones). Its intent is not to produce a profile of CSS incompatible with the complete specification, but rather to ensure that implementations that due to platform limitations cannot support the entire specification implement a common subset that is interoperable not only amongst constrained implementations but also with complete ones. Additionally, this specification aligns itself as much as possible with the OMA Wireless CSS 1.1 [WCSS11] specification. At the same time, OMA is doing alignment work in OMA Wireless CSS 1.2 [WCSS12]. It is aimed at aligning the mandatory compliance items between CSS Mobile Profile 2.0 and OMA Wireless CSS 1.2 [WCSS12].

...

In order to exit the Candidate Recommendation phase, the following criteria must be satisfied:

  1. At least two implementations of all the required features of this specification.
  2. A minimum of six months of the CR period must be elapsed to ensure that enough time is given for providing implementation feedback.

The marquee-* and overflow-style properties are at risk.

A test suite for this specification will be created during the CR period. Detailed implementation requirements and the invitation for participation in the implementation report will be provided in the preliminary implementation report. Implementers are encouraged to provide feedback by 1 February 2009.

Tuesday, January 13, 2009 (Permalink)

The W3C Geolocation Working Group has posted the first working draft of the Geolocation API Specification, a JavaScript API for exposing location information available from mobile devices like smart phones.

The Geolocation API defines a high-level interface to location information associated with the hosting device, such as latitude and longitude. The API itself is agnostic of the underlying location information sources. Common sources of location information include Global Positioning System (GPS) and location inferred from network signals such as IP address, RFID, WiFi and Bluetooth MAC addresses, and GSM/CDMA cell IDs.

The API is designed to enable both "one-shot" position requests and repeated position updates, as well as the ability to explicitly query the cached positions. Location information is represented by latitude and longitude coordinates. The Geolocation API in this specification builds upon earlier work in the industry, including [AZALOC], [GEARSLOC], and [LOCATIONAWARE].

The following code extract illustrates how to obtain basic location information:

Example of a "one-shot" position request.

    function showMap(position) {
      // Show a map centered at (position.coords.latitude, position.coords.longitude).
    }

    // One-shot position request.
    navigator.geolocation.getCurrentPosition(showMap);
    

Example of requesting repeated position updates.

    function scrollMap(position) {
      // Scrolls the map so that it is centered at (position.coords.latitude, position.coords.longitude).
    }

    // Request repeated updates.
    var watchId = navigator.geolocation.watchPosition(scrollMap);

    function buttonClickHandler() {
      // Cancel the updates when the user clicks a button.
      navigator.geolocation.clearWatch(watchId);
    }
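
The extracts above only show the success path. The following is my own sketch, not part of the draft, of how a page might also cope with a missing API or a failed request. It assumes getCurrentPosition accepts an optional error callback as its second argument and that the error object carries a message property. The requestPosition wrapper is hypothetical, and the geolocation object is passed in as a parameter so the function can be tried outside a browser.

```javascript
// Hypothetical wrapper around a one-shot position request.
// In a web page, "geolocation" would be navigator.geolocation.
function requestPosition(geolocation, onPosition, onFailure) {
  if (!geolocation) {
    // The user agent does not expose the API at all.
    onFailure("Geolocation is not available");
    return;
  }
  geolocation.getCurrentPosition(
    // Success: hand the coordinates to the caller.
    (position) =>
      onPosition(position.coords.latitude, position.coords.longitude),
    // Failure (assumed second argument): report the error message.
    (error) => onFailure(error.message)
  );
}
```

In a browser this might be called as requestPosition(navigator.geolocation, showOnMap, showWarning), where showOnMap and showWarning are whatever functions the page supplies.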
    
Monday, January 12, 2009 (Permalink)

Google has posted the first alpha of Chrome 2.0, an open source WebKit-based browser for Windows. 2.0 adds full page zoom, autoscrolling, profiles, and GreaseMonkey support.

Thursday, January 8, 2009 (Permalink)

The W3C Voice Browser Working Group has published the first working draft of the VoiceXML 3.0 specification. VoiceXML is used to describe those annoying call trees you hear when calling most major companies. (Press 1 if you want to wait on hold for 20 minutes and then be hung up on; press 2 if you want to wait indefinitely; press 3 if you'd rather we just hung up on you now.)

How does one build a successor to VoiceXML 2.0/2.1? Requests for improvements to VoiceXML fell into two main categories: extensibility and new functionality.

To accommodate both, the Voice Browser Working Group first developed the detailed semantic descriptions of VoiceXML that versions 2.0 and 2.1 lacked. From there it was possible to describe semantics for new functionality and to restructure the language syntactically to improve extensibility.

One of the other benefits of detailed semantic descriptions is improving portability within VoiceXML. However there are many factors that contribute to portability that are outside the scope of this document (e.g. speech recognition capabilities, telephony).

Wednesday, January 7, 2009 (Permalink)

The W3C Forms Working Group has published a working draft of XForms for HTML. "XForms for HTML provides a set of attributes and script methods that can be used by the tags or elements of an HTML or XHTML web page to simplify the integration of data-intensive interactive processing capabilities from XForms. The semantics of the attributes are mapped to the rich XForms model-view-controller-connector architecture, thereby allowing web application authors a smoother, selective migration path to the higher-order behaviors available from the full element markup available in modules of XForms." I think the goal here is to hide XForms submission within standard HTML markup, but I haven't fully grokked it yet.

Tuesday, January 6, 2009 (Permalink)

Version 2.2 of XQilla, an open source XQuery 1.0 and XPath 2.0 library and command line utility written in C++, has been released. XQilla is implemented on top of Xerces-C++ and derives from Pathan. Version 2.0 implements the DOM 3 XPath API, and conforms to both the XQuery and XPath 2.0 recommendations. "This release adds a debugger and debugging API, partial support for XSLT 2.0, and a prototype implementation of higher order functions for XQuery....Although it is possible to run a number of complex transformations there are significant features missing from the implementation, including xsl:include, xsl:import, xsl:output, xsl:result-document, and xsl:sort. If there are any developers interested in helping complete the XSLT 2.0 support, please get in touch with us at xqilla-devel@lists.sourceforge.net." XQilla is published under the Apache 2.0 licence.

Monday, January 5, 2009 (Permalink)

The Apache Commons Team has released Digester 2.0, a SAX-based XML to object mapper, designed primarily for parsing XML configuration files though it has other uses too. Digester is configured through an XML to Java object mapping module, which triggers actions whenever a pattern of nested XML elements is recognized. Version 2.0 now supports W3C XML Schema validation and XInclude. However it is backwards incompatible with Digester 1.x due to other internal cleanups. The Commons Project has also released version 1.8.1, a compatible bug fix release.

Friday, January 2, 2009 (Permalink)

In the unlikely event anyone needs yet another way to listen to me blather, I now have a Twitter feed.


News from 2008 | News from 2007 | News from 2006 | News from 2005 | News from 2004 | News from 2003 | News from 2002 | News from 2001 | News from 2000 | News from 1998 | News from 1999
[ XML Books | XML Trade Shows | XML Mailing Lists | XML Quotes ]

Copyright 2009 Elliotte Rusty Harold
elharo@ibiblio.org