XML News from Tuesday, October 3, 2006

The W3C XML Processing Model Working Group has posted the first public working draft of XProc: An XML Pipeline Language. According to the introduction,

An XML Pipeline specifies a sequence of operations to be performed on a collection of input documents. Pipelines take zero or more XML documents as their input and produce zero or more XML documents as their output. Steps in the pipeline may read or write non-XML resources as well.

A pipeline consists of components. Like pipelines, components take zero or more XML documents as their input and produce zero or more XML documents as their output. The inputs to a component come from the web, from the pipeline document, from the inputs to the pipeline itself, or from the outputs of other components in the pipeline. The outputs from a component are consumed by other components, are outputs of the pipeline as a whole, or are discarded.

There are two kinds of components: steps and (language) constructs. Steps carry out single operations and have no substructure as far as the pipeline is concerned, whereas constructs can include components within themselves.

This specification defines a standard library, Appendix D, Standard Component Library, of steps. Pipeline implementations may support additional steps as well.
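
To make that model concrete, here is a rough sketch in Python. It is not XProc syntax and nothing in it comes from the draft: the names `rename_root`, `drop_empty`, and `sequence` are hypothetical, invented only for illustration. It assumes a component can be modeled as a function from a list of XML documents to a list of XML documents, with a step as a single atomic function and a construct as something that contains other components.

```python
from typing import Callable, List
import xml.etree.ElementTree as ET

# Assumption: a component maps zero or more documents to zero or more documents.
Document = ET.Element
Component = Callable[[List[Document]], List[Document]]


def rename_root(new_name: str) -> Component:
    """Hypothetical step: a single operation with no substructure."""
    def step(docs: List[Document]) -> List[Document]:
        out = []
        for doc in docs:
            copy = ET.fromstring(ET.tostring(doc))  # leave the input untouched
            copy.tag = new_name
            out.append(copy)
        return out
    return step


def drop_empty() -> Component:
    """Hypothetical step: discard documents whose root has no children and no text."""
    def step(docs: List[Document]) -> List[Document]:
        return [d for d in docs if len(d) > 0 or (d.text or "").strip()]
    return step


def sequence(*components: Component) -> Component:
    """Hypothetical construct: contains components and chains them in order."""
    def construct(docs: List[Document]) -> List[Document]:
        for component in components:
            docs = component(docs)  # outputs of one feed the next
        return docs
    return construct


# The pipeline is itself a component built from other components.
pipeline = sequence(drop_empty(), rename_root("item"))

inputs = [
    ET.fromstring("<a>one</a>"),
    ET.fromstring("<b/>"),
    ET.fromstring("<c><d/></c>"),
]
outputs = pipeline(inputs)

for doc in outputs:
    print(ET.tostring(doc, encoding="unicode"))
```

Three documents go in and two come out, since the empty `<b/>` is discarded along the way. The draft's own components also read from the web and from the pipeline document, which this sketch ignores.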

The goals look laudable, but I'm not sure I like the proposed syntax.