XML News from Wednesday, May 5, 2004

At Bob DuCharme's suggestion, I've added id attributes to most of the elements in the permalink versions of these pages starting with yesterday's news. This should allow for reasonably stable, long-term linking to individual news items and parts thereof. There's no particular scheme for how the ID values are chosen. You'll need to view source on the page for the date's news (not the main page at http://www.cafeconleche.org/) to find them. If a page is edited during the day, the IDs might change, but they should stabilize within 24 hours of posting, and normally much faster than that. Even if the IDs do change when I add a new item or edit a preexisting one, the old IDs should still point to something close to what they originally pointed to. This is very much a "worse-is-better" solution, just like the one that generates the permalink pages. It will probably solve 95-99% of the problem at a cost well below what a full solution would require. (In other words, I hacked this together in about 15 minutes instead of the days that would have been required to make the links immediately stable.)
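If viewing source by hand gets tedious, a few lines of script can list the IDs on a permalink page. What follows is only a rough sketch using Python's standard html.parser module. The sample markup and its ID values are made up for illustration; the real pages use whatever IDs the generator happened to assign.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects (element name, id value) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append((tag, value))


def list_ids(markup):
    """Return every id attribute found in the given page source."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    return collector.ids


# Hypothetical fragment, not copied from a real permalink page
sample = '<div id="news2004May5"><p id="item1">First item</p><p>No id here</p></div>'
ids = list_ids(sample)
for tag, value in ids:
    print(f"{tag}: #{value}")
```

Appending #item1 (or whichever ID turns up) to the permalink page's URL then gives a link to that element.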

The W3C XML Protocol Working Group has published the first public working draft of SOAP Resource Representation Header. In brief, this proposes encoding resources such as JPEG images or other XML documents in a SOAP header. The spec provides this example:

<soap:Envelope xmlns:soap='http://www.w3.org/2002/12/soap-envelope' 
               xmlns:rep='http://www.w3.org/2004/02/representation' 
               xmlns:xmime='@@@@'>
  <soap:Header>
    <rep:Representation resource='http://example.org/me.png'>
      <rep:Data xmime:media-type='image/png'>
        /aWKKapGGyQ=
      </rep:Data>
    </rep:Representation>
  </soap:Header>
  <soap:Body>
    <x:MyData xmlns:x='http://example.org/mystuff'>
      <x:name>John Q. Public</x:name>
      <x:img src='http://example.org/me.png'/>
    </x:MyData>
  </soap:Body>
</soap:Envelope>

A processor that was decoding the document could load the PNG image referenced by the img element by decoding the header rather than making a second trip to the server. There might be a reason to do this, though I'm very nervous that as soon as I say that, people are going to start suggesting that we change APIs like DOM and SAX to not provide the real XML. And then they'll want to stop shipping around real XML, and instead send the binary data itself, because that's what everyone's going to use anyway. I say this because I've seen every bit of this before. The relatively reasonable XOP proposal, which achieves essentially the same goals but by bundling everything in a MIME envelope rather than an XML document, is now proposing exactly this. This may be the first step down a very slippery slope that leads right over a cliff; and at the bottom of the cliff XML will be shattered into a confusing mess of uninteroperable, inefficient, vendor-locked-in, patented, DRM-encumbered binary data.
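To make the header lookup concrete, here is a minimal sketch of how a receiver might use the rep:Representation header as a local cache. It assumes the data is Base64 encoded, as it appears to be in the spec's example, and it keeps the draft's '@@@@' placeholder namespace untouched. The function name and the dictionary-as-cache are my own inventions for illustration; the draft does not define any processing API.

```python
import base64
import xml.etree.ElementTree as ET

SOAP = "http://www.w3.org/2002/12/soap-envelope"
REP = "http://www.w3.org/2004/02/representation"

ENVELOPE = """<soap:Envelope xmlns:soap='http://www.w3.org/2002/12/soap-envelope'
               xmlns:rep='http://www.w3.org/2004/02/representation'
               xmlns:xmime='@@@@'>
  <soap:Header>
    <rep:Representation resource='http://example.org/me.png'>
      <rep:Data xmime:media-type='image/png'>
        /aWKKapGGyQ=
      </rep:Data>
    </rep:Representation>
  </soap:Header>
  <soap:Body>
    <x:MyData xmlns:x='http://example.org/mystuff'>
      <x:name>John Q. Public</x:name>
      <x:img src='http://example.org/me.png'/>
    </x:MyData>
  </soap:Body>
</soap:Envelope>"""


def cached_representations(envelope_xml):
    """Map each resource URL in the SOAP header to its decoded bytes."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP}}}Header")
    cache = {}
    if header is None:
        return cache
    for rep in header.iter(f"{{{REP}}}Representation"):
        url = rep.get("resource")
        data = rep.find(f"{{{REP}}}Data")
        if url is None or data is None or data.text is None:
            continue
        # Strip the indentation whitespace before Base64 decoding
        cache[url] = base64.b64decode("".join(data.text.split()))
    return cache


cache = cached_representations(ENVELOPE)
root = ET.fromstring(ENVELOPE)
img = root.find(".//{http://example.org/mystuff}img")
src = img.get("src")

# Use the bundled bytes if present; otherwise a real client would fetch src
image_bytes = cache.get(src)
print(src, "served from header" if image_bytes is not None else "needs a fetch")
```

Note that the application still sees ordinary XML throughout; the header is consulted only when the img element's URL is about to be resolved.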

Personally, I wonder if bundling everything up in a zip file wouldn't be architecturally cleaner, not to mention smaller. The big issue with that approach is that resolving the URLs (especially absolute URLs) becomes tricky, and there's no convenient place to store the URLs of the cached resources. But perhaps we could do this with a manifest file as in Java's JAR archive, which is really just a zip file anyway?
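A rough sketch of what that might look like, using Python's standard zipfile module: the manifest maps each original absolute URL to an entry inside the archive, so a receiver can resolve a URL locally before going to the network. The manifest name, the JSON format, and the resources/N entry naming are all assumptions made for illustration. A real JAR manifest uses a different, line-oriented text format.

```python
import io
import json
import zipfile

# Hypothetical manifest location, loosely modeled on JAR's META-INF directory
MANIFEST_NAME = "META-INF/url-manifest.json"


def build_bundle(document_xml, resources):
    """Zip an XML document with its resources (dict of absolute URL -> bytes)."""
    buffer = io.BytesIO()
    manifest = {}
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("document.xml", document_xml)
        for index, (url, payload) in enumerate(sorted(resources.items())):
            entry = f"resources/{index}"
            archive.writestr(entry, payload)
            manifest[url] = entry  # remember which URL this entry stands in for
        archive.writestr(MANIFEST_NAME, json.dumps(manifest, indent=2))
    return buffer.getvalue()


def resolve(bundle_bytes, url):
    """Return the cached bytes for url, or None if the bundle lacks it."""
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as archive:
        manifest = json.loads(archive.read(MANIFEST_NAME))
        entry = manifest.get(url)
        return archive.read(entry) if entry is not None else None


bundle = build_bundle(
    "<x:img xmlns:x='http://example.org/mystuff' src='http://example.org/me.png'/>",
    {"http://example.org/me.png": b"\x89PNG placeholder bytes"},
)
print(resolve(bundle, "http://example.org/me.png"))
print(resolve(bundle, "http://example.org/missing.png"))
```

The document inside the archive stays plain XML with its original URLs; only the resolver needs to know about the manifest.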