XML News from Friday, September 24, 2004

Do me a favor. Please take a look at the experimental version of this page that uses CSS instead of table layouts, and let me know if anything looks too funky to live with. If so, please tell me which browser and platform you're using. It seems to work well on all the browsers I have conveniently available, but I haven't tested IE5 for Windows yet. Netscape 4 isn't great, but I was able to hack that enough so it isn't unreadable. One thing I still haven't figured out how to do is center the h1 header ("Cafe con Leche XML News and Resources") within the left-hand panel. I can center it relative to the page, but that's not quite the same thing, especially in a wide window.

Amazon has reduced the price of XML in a Nutshell, 3rd edition to $27.17, a 32% savings off the cover price. Be the first on your block to get one!

I've been spending a lot of time reviewing RSS readers lately, and overall they're a pretty poor lot. The latest example: yesterday's Cafe con Leche feed contained this completely legal title element:

<title>I'm very pleased to announce the publication of XML in a Nutshell, 3rd edition by myself and W.
          Scott Means, soon to be arriving at a fine bookseller near you.</title>

Note the line break in the middle of the title content. This confused at least two RSS readers even though there's nothing wrong with it according to the RSS 0.92 spec. Other features from my RSS feeds that have caused problems in the past include long titles, a single URL that points to several stories, and not including more than one day's worth of news in a feed.
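For what it's worth, coping with a wrapped title takes a reader very little effort. Here is a minimal sketch in Python, assuming the reader only needs a one-line string for display. The `FEED` fragment and the `display_title` helper are made up for illustration; they are not taken from any actual RSS reader or from the real feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 0.92 fragment. The item title wraps across two lines,
# like the title element quoted above.
FEED = """<rss version="0.92">
  <channel>
    <title>Cafe con Leche</title>
    <item>
      <title>I'm very pleased to announce the publication of XML in a Nutshell, 3rd edition by myself and W.
          Scott Means, soon to be arriving at a fine bookseller near you.</title>
    </item>
  </channel>
</rss>"""


def display_title(item):
    """Return the item's title with every run of whitespace
    (spaces, tabs, line breaks) collapsed to a single space."""
    raw = item.findtext("title", default="")
    return " ".join(raw.split())


root = ET.fromstring(FEED)
titles = [display_title(item) for item in root.iter("item")]

for title in titles:
    print(title)
```

The parser hands back the line break and the indentation exactly as they appear in the document. Collapsing them is a display decision for the reader to make; nothing about the feed needs fixing.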

Cafe con Leche and Cafe au Lait use XSLT to generate their RSS feeds, so they're always completely well-formed. The home pages are edited by hand, and may not always be well-formed; but when they aren't, the XSLT processor reports an error and does not generate a new RSS document. I really wish RSS vendors would focus on implementing the actual specs reasonably before they waste time on supporting brain damage like malformed feeds and double-escaped HTML. It's well-known that supporting non-conformant documents poisons the well for everyone. What's less well-known is that adding support for non-conformant documents tends to break the support for sites that actually follow the specifications. Everyone gets sucked into a race to the bottom, and we end up back in the world where everyone's browser handles sites just a little bit differently from everyone else's, and vendors compete based on how many broken sites they can make sense out of instead of how well they can present genuinely good data. This is the HTML hell XML was supposed to save us from. Those who forget the past are condemned to repeat it.
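The "malformed page in, no new feed out" rule is easy to sketch. What follows is an illustration in Python, not the actual build process for these sites. The Python standard library has no XSLT processor, so `toy_transform` is a hypothetical stand-in for the stylesheet, and `regenerate_feed` is likewise an invented name. I'm also assuming, purely for the example, that each news item on the page is marked by an `h2` element.

```python
import xml.etree.ElementTree as ET


def regenerate_feed(page_source, previous_feed, transform):
    """Return a freshly generated feed, or previous_feed unchanged
    if page_source is not well-formed XML."""
    try:
        page = ET.fromstring(page_source)
    except ET.ParseError as err:
        # Report the problem and refuse to publish anything new.
        print(f"Page is not well-formed; feed left unchanged: {err}")
        return previous_feed
    return transform(page)


def toy_transform(page):
    """Hypothetical stand-in for the XSLT stylesheet: turn each
    h2 heading on the page into an RSS item title."""
    rss = ET.Element("rss", version="0.92")
    channel = ET.SubElement(rss, "channel")
    for heading in page.iter("h2"):
        item = ET.SubElement(channel, "item")
        text = "".join(heading.itertext())
        ET.SubElement(item, "title").text = " ".join(text.split())
    # Serializing from a tree guarantees the output is well-formed.
    return ET.tostring(rss, encoding="unicode")


good_page = "<html><body><h2>News</h2></body></html>"
bad_page = "<html><body><h2>News</body></html>"  # h2 is never closed

new_feed = regenerate_feed(good_page, "OLD FEED", toy_transform)
kept_feed = regenerate_feed(bad_page, "OLD FEED", toy_transform)
```

Because the feed is serialized from a tree and never pasted together from strings, it cannot come out malformed. A broken source page leaves the feed stale, not corrupt.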