XML News from Wednesday, December 5, 2007

IBM gives a 7:30 A.M. sponsored breakfast that is essentially content free. I do wonder what companies think they get out of these things. Given the inflated price of hotel catering, this talk probably cost them a few thousand dollars, maybe more after travel and staff time are included.

Yesterday DataDirect used their slot to play a quiz game and give away a half dozen iPod Nanos. Amusing, but I doubt anyone really learned anything and I for one actually wanted to hear them talk about their product at a technical level. Oh well, at least there's eggs.

OK, he's finally talking about something new: a product called MashupHub that lets so-called business users build web apps by gluing together widgets and feeds. You can only publish to the server, not save the page you build. This would be more interesting if it weren't server-locked; that is, if it were an interchangeable spoke rather than a hub, but even then I'm not sure I would see any use for it. Basically this is another in-browser web page editor.

IBM thinks this tool is for business users so they can build web 2.0 apps rather than having the developers do it. This is a common refrain we've been hearing since before I was born. Cobol was probably the first one to make that particular promise. We shouldn't forget that a few of the technologies that promised this actually delivered--HyperCard and the spreadsheet to name two--but far more of these technologies failed. The best ones like Cobol and Access became the province of professional developers. Most of them simply sank under the waves and are now forgotten. MashupHub looks likely to sink before it leaves the harbor.

Just Systems is apparently partnering with IBM on this MashupHub thingie so now they're going to talk. They've got a mildly interesting demo but it's not clear what their tool actually does. They're playing videos of the result and talking about what they do rather than actually building something.

OK. They finally started demoing the product. It's a compound document editor for different XML vocabularies. The internal representation is DOM. This is sort of interesting, but I'm not sure if there's a need for it. Compound editors have been tried before and never really taken the world by storm. Microsoft Office/OLE is probably the most successful but most people just use it as a collection of single document type editors rather than a compound editor.

I suspect the user interface is the problem. Different document types need different user interfaces. A text editor interface doesn't work for a table. A table editor interface doesn't work for an equation. An equation editor interface doesn't work for a photo. Trying to shove an indefinite number of different UIs into one application causes interface overload. Furthermore, if one group does all the interfaces, then they don't have the time or skills to do each one well. If different groups do them, the interfaces are inconsistent.

Miguel de Icaza demoes Moonlight/Silverlight and XAML. He at least knows how to show code in a large font size and do Hello World examples. Unlike the more corporate presenters he doesn't waste time on PowerPointless slides.

Miguel de Icaza at console

He's written a Unix SDK for all this, and is demoing on Linux. Off the top of my head I think he's the only Linux presenter I've seen this week. The other presenters are split about 50/50 between Mac and Windows. A few have been using PowerBooks with Parallels to demo their Windows-only software.

He isn't actually sure Silverlight will be successful (how refreshingly honest) but wants to make sure Linux isn't left out if it is. "I thought Linux was going to win nine years ago, and we're a little behind schedule."

More Microsoft stuff on LINQ now with Microsoft's Shyam Pather on "Linq to XML: Visual Studio 2008, Silverlight, and Beyond". This mixes declarative SQL-like programming with traditional imperative C# programming. There's some danger at the seams though because C# is neither functional nor declarative. The optimizer could really muck things up if you aren't careful to avoid side effects in the wrong places.
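To make the side-effect worry concrete, here is a minimal sketch in Python rather than C#. It is only an analogy: a lazy generator stands in for a deferred query, and the names (build_query, is_expensive, log) are made up for illustration. The point is that a side effect buried in a filter runs whenever the query is consumed, and only as far as it is consumed, not when the query is written.

```python
# Python analogy only: a lazy generator stands in for a deferred query.
# All names here are hypothetical and chosen for illustration.
def build_query(items, log):
    """Return a lazy 'query' whose filter has a hidden side effect."""
    def is_expensive(item):
        log.append(item["name"])   # side effect inside the predicate
        return item["price"] > 10
    return (item["name"] for item in items if is_expensive(item))

items = [
    {"name": "pen", "price": 2},
    {"name": "book", "price": 25},
    {"name": "lamp", "price": 40},
]

log = []
query = build_query(items, log)
calls_before = len(log)        # building the query runs no predicate at all

first = next(query)            # pulls items only until the first match
calls_after_first = len(log)   # "pen" and "book" have been examined so far

rest = list(query)             # draining the query examines "lamp" as well
calls_after_all = len(log)

print(calls_before, first, calls_after_first, rest, calls_after_all)
```

If the logging call were instead something that mattered, such as incrementing a counter or writing to a database, its timing would depend entirely on who iterates the query and how far.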

Bang! In an offhand comment, Pather just put his finger on something I've seen a thousand times but never realized: DOM's use of the Document interface as a factory, and its requirement that each node be created by its containing document, forces coders to keep passing the Document object around from method to method when it would not otherwise be required. This bloats method signatures, makes thread safety harder, tightens coupling, and in general encourages ugly, spaghetti, procedural code instead of clean OOP design. Both XOM and JDOM cure this particular problem. I always knew the abstract factory/factory method patterns were overused and usually ugly, but I hadn't realized just how much worse DOM's factory was than all other factories. I have to add this to the growing list of DOM's sins.
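A rough sketch of the difference, in Python since its standard library happens to ship both styles. xml.dom.minidom follows the DOM factory model; xml.etree.ElementTree is used here only as a stand-in for the standalone-node style that XOM and JDOM take in Java, not as either of those libraries. The helper names and the order/item vocabulary are invented for the example.

```python
from xml.dom.minidom import getDOMImplementation
import xml.etree.ElementTree as ET

# DOM style: every node comes from a Document, so each helper needs it.
def dom_make_item(doc, name, price):
    item = doc.createElement("item")
    item.setAttribute("price", str(price))
    item.appendChild(doc.createTextNode(name))
    return item

def dom_make_order(doc, entries):
    order = doc.createElement("order")
    for name, price in entries:
        # The document has to be threaded through to the next helper.
        order.appendChild(dom_make_item(doc, name, price))
    return order

# Standalone-node style: elements are built on their own, attached later.
def et_make_item(name, price):
    item = ET.Element("item", {"price": str(price)})
    item.text = name
    return item

def et_make_order(entries):
    order = ET.Element("order")
    for name, price in entries:
        order.append(et_make_item(name, price))
    return order

entries = [("book", 25), ("lamp", 40)]

# DOM: the document must exist before any node can be created.
doc = getDOMImplementation().createDocument(None, "orders", None)
doc.documentElement.appendChild(dom_make_order(doc, entries))
dom_xml = doc.documentElement.toxml()

# Standalone: the order is built first and attached to a root afterwards.
root = ET.Element("orders")
root.append(et_make_order(entries))
et_xml = ET.tostring(root, encoding="unicode")

print(dom_xml)
print(et_xml)
```

Both halves build the same tree. The only difference is the extra doc parameter that every DOM helper has to carry, which is exactly the signature bloat Pather was pointing at.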

LINQ doesn't use prefixes in the API, just local names and namespace URIs. That's smart.
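Here is a small illustration of why that matters, again in Python with ElementTree rather than LINQ to XML itself. ElementTree also identifies elements by namespace URI plus local name, so the same lookup works no matter which prefix a document happens to use. The two sample documents and the feed_title helper are made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Same vocabulary, different prefix choices (sample documents are invented).
doc_a = '<feed xmlns="http://www.w3.org/2005/Atom"><title>One</title></feed>'
doc_b = ('<a:feed xmlns:a="http://www.w3.org/2005/Atom">'
         '<a:title>Two</a:title></a:feed>')

def feed_title(xml_text):
    """Look up the title by namespace URI and local name; no prefix involved."""
    root = ET.fromstring(xml_text)
    return root.find(f"{{{ATOM}}}title").text

titles = [feed_title(doc_a), feed_title(doc_b)]
print(titles)
```

Code written against prefixes would have needed a separate case for each document, or a prefix map passed around everywhere.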

Performance is superior to .NET's DOM implementation.

Jason Hunter from Mark Logic gives the closing keynote with the title "You're Darn Right XML has a Future on the Web", but the wireless network is having troubles, so I may not be able to report very much.

Jason Hunter keynote at XML 2007

He says MarkMail is one of the most XML-centric sites on the Web. He doesn't need servlets, Perl, relational DBs, etc. E-mail is half content and half data. "Semi-structured" is the dirty word this year. Jason's about the third person I've heard complain about it, because XML is actually more structured than tables, not less. It's just not repeating structures.

Case studies:

We usually want answers, not links. (That's a good explanation of why I often go to Wikipedia before Google these days. I use Google to find specific pages or a group of opinions about a subject. I use Wikipedia to get answers about a subject.)

Google gives subpar answers to programmers compared to O'Reilly books, but it gives these results much faster than paging through books.

(How big is MarkMail? How big could/would it be if it indexed all public mailing lists? Or all Usenet groups? Would it hit the petabytes? What does his crawler look like?) He wrote the e-mail XML converter in Java in 3 hours (plus months handling bad mailers that didn't conform to spec).

Jason brings his own EVDO network connection rather than relying on the crappy hotel wireless. All presenters should be this prepared.

"Michael Kay is the number one human talking about XML."

The biggest Mark Logic deployment is 200 terabytes. This is the U.S. government doing their own search engine to monitor sites they don't like. I wonder if Cafe con Leche is included? (And if not, what am I doing wrong?) Their crawler tries to avoid being noticed.

XML 2008 will take place December 8-10 in Crystal City, Virginia (near D.C.)

Is anyone else blogging from here? I haven't noticed anyone yet. Let's see what Google can find: