XML Quotes in 2008

Quotes about XML in 2008

Wednesday, December 31, 2008

if a specification requires recipients to accept *any* kind of broken input (by specifying how to parse it anyway), it essentially takes away all future extensibility with respect to syntax.

--Julian Reschke on the whatwg mailing list, Tuesday, 30 Dec 2008 10:21:06

Tuesday, December 30, 2008

One reason I haven't fully bought in to the "UTF-16 everywhere" philosophy (see UTN #12) is that UTF-16 is the only mainstream UTF that requires special handling for, and only for, the code points beyond U+FFFF. It just seems too much of a temptation to take the UCS-2 shortcut and ignore them altogether, especially with the reassurance that "supplementary code points are very rare."

--Doug Ewell on the Unicode mailing list, Monday, 29 Dec 2008 20:00:07
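
The special handling Ewell describes can be seen in a minimal Python sketch. The helper names `utf16_code_units` and `ucs2_shortcut_length` are illustrative only and do not come from the quoted post; the example character is simply one code point chosen from beyond U+FFFF.

```python
# One code point beyond U+FFFF (outside the Basic Multilingual Plane).
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF


def utf16_code_units(text: str) -> list[int]:
    """Return the 16-bit code units of text encoded as big-endian UTF-16."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def ucs2_shortcut_length(text: str) -> int:
    """Count 'characters' the UCS-2 way: one per 16-bit code unit."""
    return len(utf16_code_units(text))


units = utf16_code_units(ch)
print([hex(u) for u in units])            # a high surrogate followed by a low surrogate
print(len(ch), ucs2_shortcut_length(ch))  # code points vs. code units
```

Code that takes the UCS-2 shortcut sees two "characters" here, and splitting the string between them leaves an unpaired surrogate.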

Monday, December 22, 2008

one of the great strengths of web applications is that they aren't bound by the crusty old conventions of desktop applications. They're free to do things differently -- and hopefully better. Web applications should play to their strengths, instead of attempting to clone desktop applications.

--Jeff Atwood
Read the rest in Coding Horror: Avoiding The Uncanny Valley of User Interface

Sunday, December 21, 2008

Twitter works like people do. If I'm interested in someone, I don't have to ask their permission to follow them. I don't have to ask if they will be my friend: that is something that evolves naturally over time. If you're a public figure like I am, the metaphor of mutual "friending" is truly broken. I get tens of thousands of friend requests from people I don't know. Accepting would make it impossible for me to use a social tool to keep in touch with my real friends. Friend groups don't really help.

Twitter's brilliant social architecture means that anyone can follow me, and I can follow anyone else (unless they want to keep their updates private.) Gradually, through repeated contact, we become friends. @ replies that can only be seen by people followed by both parties to a conversation create a natural kind of social grouping, as well as social group extensibility, as I gradually get more and more visibility into new people that my friends already know. Meanwhile, truly private direct messages are also supported.

I don't know who first used the term "ambient intimacy" but it's a great description of what begins to happen on Twitter. I know not just what people are thinking about or reading, but enough about what they are doing that our relationship deepens, just like real-world friendships. People who follow me on Twitter learn that I'm making jam or pies, or gardening or riding my bike or feeding the horses, things that I'd never (or rarely, since I'm doing it here) share on my blog. I know a lot more about many of my professional contacts that makes them more into friends. And in the case of my family, who keep their updates private and visible only to a limited group of real friends, we can keep in touch in small ways that mean a lot. I get special moments of my wife or daughters' day that we might not have shared otherwise. It's truly lovely.

--Tim O'Reilly
Read the rest in Why I Love Twitter

Friday, December 19, 2008

paraphrasing/borrowing from Churchill -- 'yes XML is the worst form of data format, except all those other forms that have been tried.'

--James Fuller on the xml-dev mailing list, Friday, 5 Dec 2008 09:47:57

Thursday, December 18, 2008

Opening new browser windows is highly confusing for most users. Although many users can cope with extra windows that they've opened themselves, few understand why the Back button suddenly stops working in a new window that the computer initiated. Opening new windows was #2 on my list of top-10 Web design mistakes of 1999; that this design approach continues to hurt users exemplifies both the longevity of usability guidelines and the limited improvement in user skills.

--Jakob Nielsen
Read the rest in User Skills Improving, But Only Slightly (Jakob Nielsen's Alertbox)

Tuesday, December 16, 2008

XQuery is interesting in that in several places it allows implementations to fail (unless it has changed) if they cannot for example figure out how to convert an XQuery into their native query capabilities.

So this is a very different view of what standards are supposed to do: it is a standard as a range of syntaxes rather than as anything that allows product substitutability. That is a quite practical, but allowing, in effect, vendors to choose which features they implement is not what many people expect from a standard: if it involves the features you are interested in, you think you have been sold a pup.

--Rick Jelliffe on the xml-dev mailing list, Wednesday, 26 Nov 2008 17:56:35 +1100

Monday, December 15, 2008

What you really want is a hierarchical data store optimized for loose tree structures: in a word, an XML database. At the time I was designing Wyvern's persistence strategy (1998-ish), XML databases were pure vaporware, and even after a few years they were still fairly exploratory and unstable.

Today things are different, and there are many interesting options for XML databases, ranging from 100% free (e.g. Berkeley DBs) through 100% expensive (e.g. Oracle XML).

You might also look into Object databases, but I've never heard of anyone coming through that experience with anything but battle scars to show for it.

--Steve Yegge
Read the rest in Stevey's Blog Rants: The Universal Design Pattern

Friday, December 12, 2008

Google has their eyes on the long haul with Android. Which is why reactions to a somewhat scattered UI in the very first implementation is not something they're worried about too much. This is a platform about further reducing the mobile carriers to raw pipes of data, and giving full control to the consumer. It’s about creating a critical-mass open-source ecosystem. And even if they fail to sell a ton of handsets, they've already put pressure on all the carriers and phone makers by the fact that they've created a free alternative that does not have to win to impact the players in this industry.

--John Mahoney
Read the rest in Android: Why Android Will Soon Kick Ass

Thursday, December 11, 2008

Where some see "the power of RDF", others see "the RDF tax". There's a tradeoff between making the common case simple and making things powerful for the less common and more complex cases. The simple case is finding out what license a document is under. Compared to looking up a string value by unstructured opaque string key from within the file, it's very different to extract an RDF graph from a file, dereference all namespace URIs using a network connection relying on hosts being reachable, load data describing equivalence and subclass relations--perhaps recursively--and simplify until the application sees a value connected to a property it is programmed to know about.

--Henri Sivonen on the whatwg mailing list, Sunday, 24 Aug 2008 22:17:23

Tuesday, December 9, 2008

When it comes to state documents, using the same format for both document creation and document publication is a very bad idea. The reasons are somewhat technical, but it is the obligation of everyone involved in this decision to understand them.

The key distinction that needs to be grasped here is between editable formats (authoring formats) and final formats (publishing formats). This distinction has nothing to do with ODF vs. OOXML.

Editable document formats are designed to allow quick display and easy modification of content. ODF and OOXML are both editable document formats. Publishing formats, on the other hand, are designed to precisely express the location of text and graphics on the finished page. Adobe's PDF (Portable Document Format) is an example of a publishing format. Editable formats like ODF and OOXML are inappropriate for the publication of state documents because they cannot guarantee line and page fidelity.

--Jon Bosak
Read the rest in NYS Open Records Discussion Must Recognize Technical Requirements

Monday, December 8, 2008

XML Namespaces 1.0 defines NCNameStartChar in terms of the XML 1.0 Letter production, which is still defined in the 5th edition as it was in the 4th Edition. This implies that, upon publication of XML 1.0 5th Edition, conformance to XML Namespaces 1.0 will require the first character of names to follow the 4th edition rules and the following characters to follow the 5th edition rules! Since most specs and parsers these days require documents to conform to both XML 1.0 and XML Namespaces 1.0, the net result in practice of the 5th edition will be that names in documents cannot take advantage of the 5th edition's expanded character repertoire. (Of course, using XML Namespaces 1.1 is not an option, because that references XML 1.1.)

This second point seems to me to be illustrative of a more fundamental problem with the 5th edition. Whilst in theory people writing specs that reference XML 1.0 should have given careful consideration to whether to use a dated or a non-dated reference, and should have consistently used one or the other with a full appreciation of the potential consequences of this, in practice I do not believe this has happened. Before you guys dreamed up the 5th edition, I don't think anybody would have anticipated the possibility of a change to the fundamental philosophy behind the selection of allowed name characters in XML without changing the version number. The result is that many specs that reference XML 1.0 aren't prepared for such a change. When you look at XML 1.0 by itself, I think there's a good case that the benefits of the 5th edition are greater than its costs, but when you consider the impact on XML 1.0 together with the whole universe of specs that are built on top of XML 1.0, I think the scales clearly tip the other way.

--James Clark
Read the rest in Comments on XML 1.0 5th edition from James Clark on 2008-10-17 (xml

Thursday, December 4, 2008

there is a real disconnect between XML schema and programming language data structures, and most of the people in the SOA community who are pushing schema for web services don't seem to understand that. I personally think it's better to start from code and use data binding tools that allow clean schema generation, since the schema generated by one data binding tool will generally be usable with other tools of the same type. But too many people have been sold on "schema first" approaches, and when they actually start using their schema in real applications they're shocked to discover that, e.g., pretty much all of their lovingly-crafted simpleType restrictions are ignored in the programming API and the base datatype is just used instead.

--Dennis Sosnoski on the xml-dev mailing list, Friday, 28 Nov 2008 23:29:08

Wednesday, December 3, 2008

What I think you are seeing is problems with data binding tools. These are performing a mapping from constructs in XML Schema to constructs in some other type system. These mappings are far less governed by standards, and it's generally not surprising if there are some constructs in XML Schema that don't cleanly map to some other chosen type system, and that therefore can't be used with the tool in question.

The best answer is to do everything using XML-based processing languages (XSLT and XQuery) rather than spending 75% of your programming effort converting your data between one type system and another.

--Michael Kay on the xml-dev mailing list, Tuesday, 25 Nov 2008 22:57:55

Tuesday, December 2, 2008

I am very wary of duplicating information that is already available as it tends to become out of date and thus ends up being even more of a pain than if the information isn't there in the first place.

--Ian Hickson on the WHATWG mailing list, Tuesday, 25 Nov 2008 22:35:52 +0000

Monday, December 1, 2008

So what went wrong with MySQL 5.1 ?

This is surprisingly not because our developers don't do a good job. On the contrary we have an excellent dedicated team of developers that are very good in what they are doing. However, even an excellent team can't work if the conditions are not right.
Here follows some of the main reasons why MySQL development department again got a quality problem with a GA release:

MySQL 5.1 was declared beta and RC way too early. The reason MySQL 5.1 was declared RC was not because we thought it was close to being GA, but because the MySQL manager in charge *wanted to get more people testing MySQL 5.1*. This didn't however help much, which is proved by the fact that it has taken us 14 months and 7 RC's before we could do the current "GA". This caused problems for developers as MySQL developers have not been able to do any larger changes in the source code since February 2006!
We have changed the release model so that instead of focusing on quality and features our release is now defined by timeliness and features. Quality is not regarded to be that important. To quote Mårten Mickos: "MySQL 5.1 will be release as GA in or before December because I say so". Mårten's reasons for this is that he needs something he can sell and a release marked "GA" is much easier to sell than a release marked "RC".
The MySQL core developers have been split into too many teams and only a small part of the core developers have been working on MySQL 5.1 to get the bugs fixed. Some of the core developers have also recently left the MySQL organization which is a serious issue as there are not many of them.
Too many new developers without a thorough knowledge of the server have been put on the product trying to fix bugs. This, combined with a failing review process, has introduced a lot of new bugs while trying to fix old bugs.
Bug fixing and development processes are not systematic and not persistent.
We have not been giving the MySQL community enough opportunities to test MySQL 5.1 (too few releases). The reason few releases was made was that if we would have done a release every month, as we have done in the past, we would have got 14 RC releases which would have looked silly and proved that the first RC was made too early. In addition, the MySQL current development model doesn't in practice allow the MySQL community to participate in the development of the MySQL server.
The MySQL organization doesn't have a release criteria for the MySQL server that is followed; Both the external one and the internal one have not been followed when it comes to declaring MySQL 5.1 as GA. You can read more about our release policy in Kaj's blog.
Internal QA on the MySQL server was started very late in the process. Now that the process has started to show results, the bugs found have largely been ignored, as fixing them would have delayed the MySQL 5.1 GA date.
The MySQL server team have a bug fixing policy where a bug that has existed a long time has a lower priority 'because people know about them'. This is supposedly one of the reasons why the Bug#989 mentioned above has not been fixed.

One would have thought that MySQL AB (now the MySQL department at Sun) should have learned something from our too early release of MySQL 5.0 but unfortunately this is not the case. The main argument I have heard for why MySQL 5.1 was declared as GA now is that it's better than MySQL 5.0 was when it was declared as GA. In my opinion, this is not a good reason to declare something GA, especially as 5.0 GA was in terrible shape when it was released. What is worse is that the new features in MySQL 5.1 are of no better quality than new features in MySQL 5.0 was at the time MySQL 5.0 was declared GA.

--Monty Widenius
Read the rest in Monty says: Oops, we did it again (MySQL 5.1 released as GA with crashing bugs)

Thursday, November 27, 2008

The parts of the spec that are only relevant to specific implementations belong in specifications about those implementations, not in the specification of HTML. Moreover, the specification of HTML should be in terms of a declarative language that is produced by generators and consumed by browsers, not in terms of how it impacts the internal memory structure of some browser implementations. Finally, the parts of the spec that have nothing to do with HTML, such as SQL storage for web applications, should be kicked out.

The rationale for all of that is because HTML is a declarative language that has been designed to be portable across a very wide range of platforms and accessibility constraints, and for the most part the compliant implementations of HTML are not browsers and do not behave like browsers.

--Roy Fielding
Read the rest in Re: An HTML language specification vs. a browser specification from Roy T. Fielding on 2008-11-20 (public

Wednesday, November 26, 2008

The iPhone has brainwashed us into thinking everything that's revolutionary and exciting in the gadget world needs to be a sex object. Now, I enjoy an Apple hardware brainfuck as much as the next, but a phone is never a better phone because of hardware alone, and Google knows this too. And it will be entering the market at a time when iPhone’s software strategy is starting to show wobbly legs. If you’ve been reading our weekly app roundups, you’ve probably noticed the story shifting away from clever developers doing cool things no one previously thought possible to more about what Apple won't let iPhone developers and users do with their phones. As a result, we’ve seen tons of variety, but not a lot of depth. There are a ton of clever calculator apps of various kinds. There are a ton of games, a ton of flashlights. But in the groundbreaking and unexpected functionality department, all anyone can hear lately is crickets. And Apple's lawyers trying to get the crickets to sign an NDA.

--John Mahoney
Read the rest in Android: Why Android Will Soon Kick Ass

Tuesday, November 25, 2008

The Washington Post's Ombudsman, Deborah Howell, today wrote a column claiming that one reason that The Post and other papers are losing money is because they are "too liberal"; have had "more favorable stories about Barack Obama than John McCain," and "conservatives are right that they often don't see their views reflected enough in the news pages." To mitigate newspapers' financial problems, Howell decrees: "the imbalance still needs to be corrected." She adds: "Neither the hard-core right nor left will ever be satisfied by Post coverage -- and that's as it should be."
What if the actual facts -- i.e., "reality" -- are consistent with the views of "the hard-core left" and contrary to the views of the "hard-core right"? What if, as has plainly been the case, the conservatives' views are wrong, false, inaccurate? What if the McCain campaign was failing and relying on pure falsehoods and sleazy attacks, and The Post's coverage simply reflected that reality? It doesn't matter. In order to sell more newspapers, according to Howell, The Post's news coverage must shape itself to the Right and ensure that "their views [are] reflected enough in the news pages" (I don't recall Howell complaining when her newspaper -- according to its own media critic -- systematically suppressed anti-war viewpoints in its news pages and loudly amplified pro-Bush and pro-war views).
In Howell's view, The Post shouldn't determine its news reporting based on what is factually true. Instead, it should shape its coverage to please this discredited, failed political movement -- in order to sell more papers. That corrupt formula is, of course, what is now meant by "journalistic balance" -- say what both sides believe and take no position about what is true -- and it is precisely that behavior which propped up this incomparably failed and deceitful presidency for so long. The establishment media bears much of the responsibility for what has happened during the last 8 years, and amazingly enough, the lesson many of them seemed to have learned is that they didn't go far enough ("we're too liberal; we need to accommodate the Right more"). If there is an Obama presidency, watch for them very quickly to re-discover the long-dormant concept of "adversarial behavior."

--Glenn Greenwald
Read the rest in The Post and "the most disliked president since polling began in the 1930s" - Glenn Greenwald

Thursday, November 20, 2008

Objects on the wire in the general case are a Really Bad Idea, for a bunch of reasons. Leaving aside the obvious issues of portably serializing methods and member data, and making sure you have access to anything you might be inheriting, the real deep problem is the fact (most eloquently pointed out by Adam Bosworth) that objects can spend a long time on the wire, and it's really hard to do serialized objects in a way that survives versioning of the software.

--Tim Bray on the xml-dev mailing list, Wednesday, 23 Oct 2002

Wednesday, November 19, 2008

The belief that a document creator's intent is best preserved by using an editable format for final publication rests on the assumption that exactly the same software will be used to consume the document that was used to produce it. This may be true in a monopoly environment where everyone uses the same program running under the same operating system on the same hardware, but as we begin to open that environment to competition, this assumption no longer holds. If the idea is to preserve the original intent of the author in a heterogeneous environment, then publishing in an editable format is exactly the wrong way to do it.

--Jon Bosak
Read the rest in NYS Open Records Discussion Must Recognize Technical Requirements

Tuesday, November 18, 2008

I cringe every time I use a browser for many reasons. The browser people had a chance to make a more integrated UI and functionality, but really did pretty much the opposite in almost all respects. But, because of the attraction, and even some real value of stuff on the internet, there is more pressure to do better. I would expect to see some real alternatives to the typical "bad defacto standard" browsers we've had to put up with.

For example, suppose you actually have "real objects" as they were originally thought about: they will have "ideas" about how they can display themselves and be edited, and can be moderated by various kinds of styles. The PARC UI took this approach and displayed the views from different objects in a 2.5-dimensional media layout space. Boundaries could be put around these views or not. There were no applications; instead there was a highly integrated form of what today are called mashups. This was in the original PARC GUI!

There is much to be done here, and to even get back to a number of important integration and workflow ideas that were part of the PARC UI.

--Alan Kay
Read the rest in The Laptop Celebrates 40 Years | Gadget Lab from Wired.com

Saturday, November 15, 2008

Let’s be honest, Internet Explorer never really offered Firefox much of a challenge. Internet Explorer has only really managed to remain relevant through being included with every copy of Windows. Much the same can be said of Safari on the Mac (it’s nice to see that the WebKit rendering engine has a new home in Chrome). When Google released Chrome, Firefox could finally pick on a kid that was the same size.

--Adrian Kingsley-Hughes
Read the rest in Firefox 3.1 raises the browser bar … again | Hardware 2.0 | ZDNet.com

Friday, November 14, 2008

I obviously believe that the GPL (and v2 in particular) is a great model for working together - letting everybody share the code, but also making sure that nobody can then try to take advantage of other peoples work - you "pay" for the source code by giving source code back. I call it the "tit-for-tat" model, and it works well not only in the software world, but is fairly well known in economics and game theory too.

But the fact that I like it for the kind of endeavor I'm involved with doesn't mean that others can't have other goals. For example, if you're a standards body, and you want to use open source as a way to distribute a reference model, you may not be interested in the "tit-for-tat" part, but you want to just spread the reference code as widely as possible so that people start out with a certain basic proficiency, but you also want to make that reference the base for proprietary code-bases. So in that second situation, you might want to do an Apache or BSD license.

--Linus Torvalds
Read the rest in Linus Torvalds, Geek of the Week

Thursday, November 13, 2008

while it's true that RelaxNG is the simpler, cleaner, and in certain dimensions more powerful language, and with a nicer syntax, one can make the case that XSD has technical advantages for certain scenarios as well. There is a tradeoff between building languages in which the parse is deterministic (XSD) vs. non-deterministic (RelaxNG). RelaxNG, for example, has a very nice story about closure under intersection/union/difference: you can always write a RelaxNG schema that accepts just exactly what is accepted by any two other RNG schemas, what's accepted by either of them, etc. XSD can't in general do the equivalent. On the other hand, I would claim that XSD has a stronger story regarding type assignment, and it has a stronger story about static typing. The XQuery language uses XSD as its type system. You can, for example, declare in XSD a type called "Measurement" that calls for a decimal number as the element content, with an attribute specifying units such as "inches" or "cm". You can use that type in your schema to declare elements named "Width" and "Height", which you can show in advance will always accept the same content. Furthermore, you can define XQuery functions which return as their value such a Measurement.

--Noah Mendelsohn on the xml-dev mailing list, Sunday, 10 Apr 2008 11:23:12

Tuesday, November 11, 2008

In terms of Unicode support, what's vitally important is that any Unicode character is allowed in attribute values and character data. And XML 1.0 has always supported that. This change is just about the Unicode characters allowed in element and attribute names (and entity names and processing instruction targets).

I see relatively little use of non-ASCII characters in element and attribute names. A user who is technical enough to deal with raw XML markup can deal with ASCII element/attribute names. For less technical users who want to see element/attribute names in their native language, using native language markup is not a good solution, because it only allows a document or schema to be localized for a single language. An XML editor can provide a much better solution by supporting schema annotations that allow an element or attribute to be given friendly names in multiple languages. So a Thai user editing a document using the schema can work with Thai element/attribute names, and an English user working with the same document can see English names.

This is just following basic I18N principles of storing/exchanging information in a language neutral form, and then localizing it when you present it to a particular user. (This is the same reason why it's perfectly OK from an I18N perspective for XML Schema Datatypes just to support one specific non-localized format for dates/times.)

Perhaps this is part of the reason why there was so little enthusiasm for XML 1.1, and why there seems to be little interest in doing the 5th edition change as an XML 1.2.

--James Clark
Read the rest in James Clark's Random Thoughts: XML 1.0 5th edition

Monday, November 10, 2008

When your only model's a data table, every problem looks like another column.

--Dave Kellogg, CEO Mark Logic
Read the rest in Dr. Dobb's | XML as a Content Platform | September 11, 2008

Thursday, November 6, 2008

Most APIs don't tell you the encoding, because you don't need to know: by the time your application sees the content, it has been decoded from its original encoding into the native representation of your programming language.

--Michael Kay on the xml-dev mailing list, Tuesday, 20 Dec 2005 23:16:29
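
Kay's point can be illustrated with a small Python sketch using the standard library's `xml.etree.ElementTree`. The sample document and variable names are made up for illustration. The same content is serialized in two encodings, and the application sees identical strings after parsing.

```python
import xml.etree.ElementTree as ET

text = "<name>Mårten</name>"

# The same document serialized in two different encodings (different bytes).
latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + text.encode("iso-8859-1")
utf8_doc = b'<?xml version="1.0" encoding="UTF-8"?>' + text.encode("utf-8")

# The parser reads the encoding declaration and decodes the bytes itself;
# the application only ever receives native Python strings.
latin1_value = ET.fromstring(latin1_doc).text
utf8_value = ET.fromstring(utf8_doc).text

print(latin1_doc == utf8_doc)      # the byte streams differ
print(latin1_value == utf8_value)  # the decoded content does not
```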

Wednesday, November 5, 2008

In many domains, programmers will decide on an XML representation before they've thought much about the class model, because for those domains XML actually offers the most convenient way of thinking about the problem.

The kinds of data that work well with XML modeling tend to be poorly suited for relational modeling, and vice-versa, with the practical result that XML/relational mapping is almost as infamously thorny as O/R mapping.

And as for XML/OO mapping, most of us tend to treat it as a more or less solved problem. However, in practice there are several competing ways of doing XML/OO mapping. The W3C DOM and SAX enjoy the broadest use, but they are both sufficiently cumbersome that alternatives such as JDom and REXML (among others) have gained significant followings.

I mention this not to start a fight, but only to illustrate that XML is a third modeling technique in its own right. It has both natural resonances and surfaces of friction with both relational design and OO design, as one might expect.

--Steve Yegge
Read the rest in Stevey's Blog Rants: The Universal Design Pattern
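
The contrast Yegge draws between the W3C DOM and lighter alternatives can be sketched in Python, whose standard library ships both a DOM implementation (`xml.dom.minidom`) and a simpler tree API (`xml.etree.ElementTree`). This is only an analogy: the quote names JDom and REXML, not these modules, and the sample document and function names are hypothetical.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

doc = "<order><item sku='a1'>Widget</item><item sku='b2'>Gadget</item></order>"


def names_via_dom(source: str) -> list[str]:
    """Collect item text with the W3C DOM: text lives in child text nodes."""
    dom = xml.dom.minidom.parseString(source)
    names = []
    for item in dom.getElementsByTagName("item"):
        parts = [n.data for n in item.childNodes if n.nodeType == n.TEXT_NODE]
        names.append("".join(parts))
    return names


def names_via_etree(source: str) -> list[str]:
    """Collect item text with ElementTree: text is an attribute of the element."""
    return [item.text for item in ET.fromstring(source).iter("item")]


print(names_via_dom(doc))
print(names_via_etree(doc))
```

Both functions return the same list; the difference is how much of the node model the caller has to handle by hand.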

Monday, November 3, 2008

parts of Microsoft certainly are sincere, and other parts are almost certainly not. It's a pretty big and bloated company, and when one hand says it wants to participate in open source, I doubt the other hand knows or cares about it.

--Linus Torvalds
Read the rest in Linus Torvalds, Geek of the Week

Thursday, October 30, 2008

XOM is the most correct and easiest to use XML tree and streaming API I've come across so far.

--Wolfgang Hoschek on the xom-interest mailing list, Wednesday, 5 Jan 2005 12:11:06

Wednesday, October 29, 2008

ENTITY declarations are reasonably sane in the publishing-technology context (although less useful in practice than the theory of SGML held), and XML was invented by a bunch of publishing-tech geeks. In the world of wire protocols, Entities are actively pernicious; among other things, they open the door to the billion-laughs attack. You really don't want a general-purpose recursive macro processor running over high-volume protocol traffic.

--Tim Bray on the xml-dev mailing list, Tuesday, 19 Feb 2008 11:31:24
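
The arithmetic behind the billion-laughs attack Bray mentions can be sketched in Python. The function names and the entity naming scheme (`e0`, `e1`, ...) are assumptions made for this illustration. The code only builds the document text and computes the expanded size; it deliberately does not parse the document.

```python
def build_document(levels: int, fanout: int, base_text: str = "lol") -> str:
    """Build a small XML document whose entities nest `levels` deep.

    Entity e0 holds base_text; each higher entity references the one
    below it `fanout` times.
    """
    decls = [f'<!ENTITY e0 "{base_text}">']
    for n in range(1, levels + 1):
        refs = f"&e{n - 1};" * fanout
        decls.append(f'<!ENTITY e{n} "{refs}">')
    body = "\n".join(decls)
    return f"<!DOCTYPE d [\n{body}\n]>\n<d>&e{levels};</d>"


def expanded_length(levels: int, fanout: int, base_text: str = "lol") -> int:
    """Characters produced when the top-level entity is fully expanded."""
    return len(base_text) * fanout ** levels


doc = build_document(levels=9, fanout=10)
print(len(doc))                # size of the document on the wire
print(expanded_length(9, 10))  # size after entity expansion
```

With nine levels and a fanout of ten, a document of a few hundred characters expands to 3 x 10^9 characters, which is why entity expansion is commonly disabled or capped when parsing untrusted protocol traffic.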

Tuesday, October 28, 2008

You already have a namespace URI and a local name, why add a type? I've only ever seen xsi:type used badly. Document-level casting is silly, really. It's like saying "what this really is is a whale, that's what it is, but I'm going to call it a lemur, just because. I think someone said I could only have lemurs. Yeah it's a lemur; a humongous, blubbery water lemur."

--Robin Berjon on the xml-dev mailing list, Wednesday, 20 Feb 2008 17:01:49

Monday, October 27, 2008

Personally, I HATE xsi:type. In most B2B scenarios I've been involved with it causes more interoperability issues than it solves. I'd much rather get rid of XML type, and force people to validate against the schema as is (whether that be RelaxNG, DTD, XSD, SchemaTron....etc).

--David Carver on the xml-dev mailing list, Wednesday, 20 Feb 2008 15:32:17

Friday, October 24, 2008

here’s the thing: Android, and the G1, are open. Open, open, open, in ways that would make Steve Jobs cringe. You can unlock this phone after 90 days—that is, use any SIM card from any carrier in it. The operating system is free and open-source, meaning that any company can make changes without consulting or paying Google. The App store is completely open, too; T-Mobile and Google say they won’t censor programs that they don’t approve of, as Apple does with the iPhone store. Yes, even if someone writes a Skype-like program that lets people avoid using up T-Mobile cellular voice minutes.

Android is not as beautiful or engaging as the iPhone’s software, but it’s infinitely superior to Windows Mobile—and it’s open. The G1 is only the first phone to use it, the first of many; it’s going to be an exciting ride.

--David Pogue
Read the rest in A First Look at Googles New Phone - Pogue’s Posts - Technology

Thursday, October 23, 2008

Age has finally become an issue for John McCain. But the problem isn't the candidate's 72 years; it's the antediluvian approach of his campaign.

McCain is running a textbook Rovian race: fear-based, smear-based, anything goes. But it isn't working. The glitch in the well-oiled machine? The Internet.

"We are witnessing the end of Rovian politics," Eric Schmidt, CEO of Google told me. And YouTube, which Google bought in 2006 for $1.65 billion, is one of the causes of its demise.

Thanks to YouTube -- and blogging and instant fact-checking and viral emails -- it is getting harder and harder to get away with repeating brazen lies without paying a price, or to run under-the-radar smear campaigns without being exposed.

But the McCain campaign hasn't gotten the message, hence the blizzard of racist, alarmist, xenophobic, innuendo-laden accusations being splattered at Obama.

--Arianna Huffington
Read the rest in Arianna Huffington: The Internet and the Death of Rovian Politics

Wednesday, October 22, 2008

Google basically says that it's up to people to scan Street View themselves, pick out photos that might be private, then notify the company. Staying off of private roads isn't Google's problem; it's the homeowner's. That might sound burdensome, but it's the same argument deployed against rightsholders over video.

This fundamental tension between the opt-in/get-permission/check-first model and the opt-out/seek-forgiveness/fix-later approach is shaping up as a fundamental point of contention on the Internet. NebuAd's opt-out approach to grabbing ISP clickstream data has become such a big deal that Congress has already held multiple hearings on the matter and has ISPs across the country running scared. When it comes to copyright, rightsholders have pushed (with some success) for video-sharing sites to screen uploaded content for possible violations before it goes live. User-generated content sites, which have powered the Web 2.0 revolution, are under attack over uploads of child pornography, regular pornography, and clips of public harassment and abuse of others. And a UK government commission this week recommended that user-generated content sites be forced to screen all uploads with human eyes before pushing them out to the web.

Copyright, privacy, and school bullying videos might not seem to have much in common, but the debate over screening first vs. fixing later could reshape the Internet as we know it. Having to get permission or screen content would hobble useful services like Street View and YouTube, and it would probably put companies like NebuAd out of business, even as it might lead to less objectionable or private content online.

--Nate Anderson
Read the rest in Opt-in or opt-out

Tuesday, October 21, 2008

There is a real problem here, and it's partly my fault. If we had had a bit more foresight ten years ago, we would have made the 1st edition of XML 1.0 say what is now being proposed for the 5th Edition. I know that the XML Core WG are trying to do the right thing, but I really don't think this is a good idea.

I think you've got to look at the impact of the change not just on XML 1.0 but on the whole universe of specs that are built on top of XML 1.0. In an ideal world, all the specs that refer to XML 1.0 would have carefully chosen whether to make a dated or an undated reference to XML 1.0, and would have done so consistently and with a full consideration of the consequences of the choice. In practice, I don't believe this has happened. Indeed, before the 5th edition, I believe very few people would have considered that XML might make a fundamental change to its philosophy about which characters were allowed in names while still keeping the same version number.

Even W3C specs don't get this right. In particular, XML Namespaces 1.0 gets completely broken by this (as my comment explains).

Now you can argue that the breakage and chaos that the 5th edition would cause is due to bugs in the specs that reference XML 1.0. But that doesn't make the breakage any less real.

--James Clark
Read the rest in James Clark's Random Thoughts: XML 1.0 5th edition

Friday, October 17, 2008

Some of the state initiatives to define an official publishing format exhibit a mistaken belief that since ODF and OOXML are based on the XML metalanguage, they must be more “open” and capable than a non-XML (but text-based) page description language like PDF. It is true that an XML format that was semantically rich enough could in theory support the addition of unlimited amounts of metadata that would greatly enhance our ability to search and navigate the document space. But in reality there are at least two factors that render this theoretical benefit irrelevant. First, enabling this functionality across all state documents would require a level of metadata standardization that lies decades in the future, if indeed it can ever be attained; second, and more to the point, the human work needed at the current stage of technology to add even the smallest amount of semantic enrichment to any meaningful number of documents is far beyond what any state agency has the resources to accomplish. And even if we were somehow able to overcome these enormous barriers, the fact is that neither ODF nor OOXML is natively equipped to support semantic enrichment any better than PDF is. In other words, in reality, a PDF file is just as searchable as an ODF file or an OOXML file. And of course, the PDF file specifies unambiguously what the piece was intended to look like, which is something that the editable formats cannot do.

If by “open” the proponents of XML-based formats simply mean that the contents of ODF and OOXML documents are easier to manipulate, they are correct — but as noted above, this capability is the opposite of what we want in electronic documents published by the state.

--Jon Bosak
Read the rest in NYS Open Records Discussion Must Recognize Technical Requirements

Thursday, October 16, 2008

I never thought I’d ever say this again (having lived through the browser wars of the late 90s), but I’m so glad that the browser wars have re-ignited and that developers are concentrating on performance, standards and stability rather than “here today, gone tomorrow” gimmicks. In normal day-to-day use I’m using Firefox more and more, and actually enjoying it, but I have to admit that I also like Chrome because the browser feels snappier under heavy load (not to mention a lot more robust). Firefox and Chrome make Internet Explorer (even IE8, currently in beta) both look and feel like a lumbering dinosaur waiting to be made extinct.

--Adrian Kingsley-Hughes
Read the rest in Firefox 3.1 raises the browser bar … again | Hardware 2.0 | ZDNet.com

Tuesday, October 14, 2008

Here's what bothers me about namespaces:

I need to write namespace URIs several times a day, but the URIs aren't memorable. Mistyping an NS URI would waste even more time as bugs than looking URIs up for copying and pasting, so I look them up for copying and pasting, and it's a huge waste of time.

The indirection layer from prefix to URI confuses people.

Namespaces not inheriting to attributes confuses people. (I have had to give a crash course in how namespaces work on W3C telecons and f2f meetings! Others have had to do it as well. This point is so confusing that people whose job is working on Web specs get it wrong. I've been told about a professor teaching a class about XML who got it wrong.)

Instead of comparing names against a string literal, you have to compare two datums against two literals. That is, instead of doing "foo-bar".equals(name), you have to do "http://www.example.com/2008/08/namespace#".equals(uri) && "bar".equals(localName).

Removing uri,local pairs from XML parsing context makes it hard to write the full name in a compact form. Witness the NSResolver complications with XPath and Selectors DOM APIs.

That the prefix is semantically not important confuses people who go and write uninteroperable software thinking that they should be comparing the prefix instead of the URI.

The design of namespaces considers parsing. It doesn't consider serialization. Writing an XML serializer that doesn't suck isn't trivial, and one will spend most of the development time on dealing with Namespaces. (The prefixes aren't important but people still have aesthetic opinions about how they should be generated...)

Namespaces dropped the HTML ball a decade ago letting the HTML and XML DOMs diverge.

Namespaces stuff their syntax into attributes as opposed to having syntax on their own meaning that certain magic attribute names need blacklisting both in parsing and in serialization.

Namespaces slow down parsing. (By over 20% with Xerces-J and the Wikipedia front page!)

I've spent *a lot* of time writing code that is Namespace-wise excruciatingly correct. Yet, Namespaces have never actually solved a problem for me. My software developer friends complain to me about how Namespaces cause them grief. No one can remember Namespaces solving a real problem. It's like feeding a white elephant.

Qnames in content have further problems: They complicate APIs and the application layer when the mapping context needs to leak to the application instead of being a parser-internal thing. Under scripted DOM scenarios, there's the issue of the mapping context not getting captured at node creation time thereby making the meaning of qnames brittle under tree mutations. Finally, serializing XML that *may* have qnames in content without the serializer knowing which values are qnames (i.e. writing a generic serializer) is complex.

--Henri Sivonen on the whatwg mailing list, Friday, 22 Aug 2008 10:50:59
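
Several of the points above (the prefix carrying no meaning, names being URI plus local-name pairs, and unprefixed attributes not picking up the element's namespace) can be seen in a small sketch. It uses Python's standard `xml.etree.ElementTree`, which reports element names in `{uri}local` form; the helper `qualified_name` and the two sample documents are invented for this illustration, and the URI is the placeholder from the quote.

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.com/2008/08/namespace#"

# Same namespace URI and local name, different prefixes.
doc_a = f'<a:bar xmlns:a="{NS}" id="1"/>'
doc_b = f'<b:bar xmlns:b="{NS}" id="1"/>'


def qualified_name(tag: str) -> tuple:
    """Split ElementTree's '{uri}local' form into a (uri, local) pair."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


elem_a = ET.fromstring(doc_a)
elem_b = ET.fromstring(doc_b)

# Correct comparison: two values against two literals; the prefix is ignored.
same_name = qualified_name(elem_a.tag) == qualified_name(elem_b.tag) == (NS, "bar")

# The unprefixed attribute is reported with no namespace at all.
attribute_names = list(elem_a.attrib)
print(same_name, attribute_names)
```

Software that compared the raw prefixes `a` and `b` would treat these two elements as different, which is the interoperability mistake the quote describes.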

Monday, October 13, 2008

XQuery was the DBMS community's chance to start over and they took it. XQuery is superior to SQL for a number of reasons. It's a full programming language, not just a data manipulation language. It handles XML natively, and XML is indeed becoming more and more pervasive. With adoption of Microsoft Office 2007 and ODF-based documents, every organization will find themselves with an explosion of XML and a desire to do more with it. Furthermore, while content has been seen as static, Web 2.0 applications and user generated content create an environment where documents inherently must evolve over time and XML is widely seen as the right format.

--Dave Kellogg, CEO Mark Logic
Read the rest in Dr. Dobb's | XML as a Content Platform | September 11, 2008

Friday, October 10, 2008

I never have any problems getting XHTML into IE to speak of, and don't really understand the fuss. Even using application/xhtml+xml works (see http://www.w3.org/International/tests/sec-ruby-markup-1.html as an example). I know that there are some differences, but in the vast majority of deployed pages, with a little forethought you're never going to have any major issues.

--Steven Pemberton on the www-tag mailing list, Tuesday, 30 Sep 2008 13:17:15

Thursday, October 9, 2008

The iPhone will always be constrained to hardware that is completely under Apple’s control. At best, there will be as many iPhone models as there are iPod models at any given time, should the company decide to expand into different form factors such as keyboard or flip-phone designs. And at least in the United States, we have a good four years left of an AT&T exclusive on the device. While the Android platform is currently limited to T-Mobile, any carrier or device manufacturer can potentially get into the Android game and make software additions and improvements as well as competing hardware designs that use the Android OS.

--Jason Perlow
Read the rest in Android: It’s not about Sex, Excitement, or Cool | Tech Broiler | ZDNet.com

Wednesday, October 8, 2008

The iPhone and Android phones both ship with Web browsers based on the WebKit rendering engine. That means Web applications designed for one will render almost identically on the other, provided their developers adhere to published standards. Those same applications will also render on WebKit-based desktop browsers, such as Safari and Google Chrome, and on any other browsers that implement the standards correctly.

Based on that, all this talk of SDKs seems almost foolish. The iPhone may have more market share than Android today, but the standards-based Web has far more market share than both combined. So you tell me: If you're a developer, where's the smart money? The iPhone App Store? The Android SDK? Or somewhere else?

--Neil McAllister
Read the rest in SDK shoot-out

Tuesday, October 7, 2008

If you go to your kitchen and look in the 'fridge door, you'll see a screwdriver rack and a device for holding screws. It's there to make life easier for the people building or repairing the fridge.

OK, no, you won't see such a thing...

CDATA sections are primarily there for people writing the XML specification, to make it easier to include examples, and I think of them in much the same way as I'd think of a tool rack in my fridge.

--Liam Quin on the xml-dev mailing list, Monday, 29 Sep 2008 19:42:32
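
As a small illustration of why CDATA sections are an authoring convenience rather than a distinct kind of content, the sketch below parses the same markup-heavy example text written two ways. It uses Python's standard `xml.etree.ElementTree`; the sample strings are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# The same example text, once inside a CDATA section and once with escapes.
with_cdata = "<example><![CDATA[<p>a < b && c</p>]]></example>"
with_escapes = "<example>&lt;p&gt;a &lt; b &amp;&amp; c&lt;/p&gt;</example>"

text_from_cdata = ET.fromstring(with_cdata).text
text_from_escapes = ET.fromstring(with_escapes).text

# Both spellings yield the same character data after parsing.
print(text_from_cdata == text_from_escapes)
```

The CDATA form is easier for a person to type and read; the application on the receiving end sees the same string either way.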

Monday, October 6, 2008

As a consumer, I would almost always be happier if you simply published your information in reasonably well structured (X)HTML with a little CSS styling to establish whatever look and feel you want. (No Flash, no silverlight, no PDF, just the content, please). If you've got your information in some more structured XML, I'd probably like access to that too, but publishing arbitrary XML on the web seems not to have taken off.

The reason is simple: PDF gives all the formatting control to the author. I'm the reader and I want the control. I want the freedom to flow the text differently to fit on my wide or narrow screen or handheld device, or convert it into an audio format, or do something else that will make the information more useful to me.

--Norm Walsh
Read the rest in Text in PDF documents

Friday, October 3, 2008

There are an unlimited number of security vulnerabilities for a given security device, system, or program, most of which will never be discovered (by the good guys or bad guys).

--Roger G. Johnston
Read the rest in Schneier on Security: Security Maxims

Wednesday, October 1, 2008

One of the problems with schemas, in fact, is that they encourage tag abuse, because they limit the freedom of users to invent new element names, encouraging them instead to overload the semantics of existing names. The tighter the schema is defined, the more creative users have to become in getting around them; in the end, they resort to comment-abuse, or even as I've seen in one case, using a namespace declaration as a boolean flag whose presence or absence conveyed information to the recipient.

--Michael Kay on the xml-dev mailing list, Sunday, 13 Dec 2007 00:16:06

Friday, September 26, 2008

We're worried that Chrome will be another giant conveyer belt moving private information about our use of the Web into Google's data vaults. Google already knows far too much about what everybody is thinking at any given moment.

--Peter Eckersley, EFF
Read the rest in EFF: We're concerned about Google's Omnibox | Webware : Cool Web apps for everyone

Monday, September 22, 2008

Once the Yahoo store goes down and the key servers go offline, existing tracks cannot be authorized to play on new computers. Instead, Yahoo recommends the old, lame, and lossy workaround of burning the files to CD, then reripping them onto the computer. Sure, you'll lose a bunch of blank CDs, sound quality, and all the metadata, but that's a small price to pay for the privilege of being able to listen to that music you lawfully acquired. Good thing you didn't download it illegally or just buy it on CD!

No, you were one of the digital pioneers, and in this brave new frontier world, a few people are just going to get malaria. Fact of life. And someone will step in a bear trap, and then it's time for the bite rag, the alcohol, and the saw. Just the price of progress. And yes, some poor group will get trapped in snowfall when crossing the pass, and cannibalism may or may not be involved by the time they stumble barefoot from the mountains next spring.

--Nate Anderson
Read the rest in DRM still sucks: Yahoo Music going dark, taking keys with it

Friday, September 19, 2008

our kids will find any data/content distinction totally arbitrary and a historical artifact of the mainframe/minicomputer era. Second, our kids will think of SQL the way that we think of COBOL. ("Daddy, do you mean you used a database language that assumed all data was stored in tables and didn't natively understand XML?" "Yes, Muffin, and I used to have to sew my own clothes, too!")

--Dave Kellogg, CEO Mark Logic
Read the rest in Dr. Dobb's | XML as a Content Platform | September 11, 2008

Wednesday, September 17, 2008

For the old guys: CALS REDUX. Four billion spent and no systems delivered.

For the younger guys: A few million tossed at a university and now the world wide web.

--Len Bullard on the xml-dev mailing list, Sunday, 28 Aug 2008 10:09:11

Tuesday, September 16, 2008

Chrome and IE8 are far more advanced than Apple’s Safari. Safari is speedy on both Mac and Windows platforms, but lacks many of the key intelligent features of its newer Google and Microsoft rivals.

--Walt Mossberg
Read the rest in First Test of Google’s New Browser | Walt Mossberg | Personal Technology | AllThingsD

Monday, September 15, 2008

Just look at metadata like lang="", character encoding information, Content-Type headers, etc. It's so unreliable that any serious system that processes large amounts of data from multiple Web authors always ends up ignoring the metadata (or at best using it as a hint) and using heuristics to determine the real information.

In controlled environments, e.g. on a single site, or in a single person's media library, or within a small coherent community where all the participants have compatible goals, it is possible to get enough discipline that metadata is both reliable and useful. And for such communities we have a raft of extension mechanisms, and clashes can be avoided easily by simply using names that nobody in the community is already using.

But as soon as this kind of thing is applied to people outside the tight-knit community, the metadata becomes an utter mess, misused, wrong, missing, syntactically incorrect, semantically incorrect, unusable. We have shown time and time again that when metadata mechanisms face the wider Web community, they fail.

--Ian Hickson on the whatwg mailing list, Friday, 22 Aug 2008 01:48:39 +0000
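
The idea of treating declared metadata as a hint rather than as the truth can be sketched for the character-encoding case. This is a deliberately simplified illustration, not how any particular browser or crawler works: the function name `decode_with_hint` and the fallback order (declared label, then UTF-8, then Latin-1) are assumptions made for the example.

```python
from typing import Optional, Tuple


def decode_with_hint(data: bytes, declared: Optional[str] = None) -> Tuple[str, str]:
    """Decode bytes, using the declared charset only as a first guess.

    Returns the decoded text and the encoding that actually worked.
    """
    candidates = []
    if declared:
        candidates.append(declared)
    candidates.append("utf-8")
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown label: fall through to the next guess.
            continue
    # Latin-1 maps every byte to a code point, so this last resort cannot fail.
    return data.decode("latin-1"), "latin-1"


# A page labelled as ASCII that is really UTF-8.
text, used = decode_with_hint("café".encode("utf-8"), declared="ascii")
print(text, used)
```

A check this simple still accepts a wrong label whenever the bytes happen to decode under it, which is why real systems layer statistical heuristics on top.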

Friday, September 12, 2008

there doesn't have to be any sort of consistency mechanism. The web today works without a logical consistency checking mechanism in any of the current publishing languages.

We are not striving for perfect semantic data with RDFa on day one - we are striving for an easy mechanism for expressing semantic data in HTML family languages.

--Manu Sporny on the whatwg mailing list, Wednesday, 27 Aug 2008 18:42:08

Thursday, September 11, 2008

These days a smart site owner thinks about how their web site looks to all browsers, from Internet Explorer to Safari to Opera to an iPhone.

--Matt Cutts
Read the rest in Answers to common Google Chrome objections

Wednesday, September 10, 2008

good XML uses elements for the main document content and attributes for metadata on that content. You can’t always make this distinction cleanly (because you can’t have attributes on attributes, or attributes containing elements), but a good markup language will attempt to do so.

--Jeni Tennison
Read the rest in Bad XML

Tuesday, September 9, 2008

Google is tying browsing, your personal history, your apps and your data together. And Google will couple these parts with Android, which is likely to link your PC and phone together. Is this playbook all that different than Microsoft’s? Simply put, maybe everyone is evil.

--Larry Dignan
Read the rest in Google Chrome: The enterprise playbook | Between the Lines | ZDNet.com

Monday, September 8, 2008

Decision Makers in corporations (usually fanned by external "experts"/ "consultants"/ bloggers(!) ) underestimate the value in web analytics data, even with full privacy settings on. Even with 100% of your website visitors browsing in privacy mode.

Sure Unique Visitors would be imprecise. Yes you won't get much from segmenting New vs Returning Visitors, and such. But your visits number is fine. Your referrers are fine. Your keywords are fine. Initial sessions from campaigns are fine (as are conversions at least for those first sessions). Your page views are fine. And .... well I could keep going.

We should be on a quest to ensure that our data has as little inaccuracy as possible, then use the wealth of insights that are already there rather than bitching about the small % that's missing.

It is important to pause and think of faith based initiatives like ads in magazines or on the telly and how they are measured. On its worst day the web is infinitely more measurable.

--Avinash Kaushik
Read the rest in Lies, Damned Lies...: Internet Explorer 8 beta 2: Privacy vs Monetizability

Thursday, September 4, 2008

Given the organisation’s inability to follow its own rules we are no longer confident that ISO/IEC will be capable of transforming itself into the open and vendor-neutral standards setting organisation which is such an urgent requirement. What is now clear is that we will have to, albeit reluctantly, re-evaluate our assessment of ISO/IEC, particularly in its relevance to our various national government interoperability frameworks. Whereas in the past it has been assumed that an ISO/IEC standard should automatically be considered for use within government, clearly this position no longer stands.

--Aslam Raffee (South Africa)
Chairman, Government IT Officer’s Council Working Group on Open Standards Open Source Software
____________________________
Marcos Vinicius Ferreira Mazoni (Brazil)
Presidente, Servico Federal de Processamento de Dados
____________________________
Carlos Eloy Figueira (Venezuela)
President, Centro Nacional de Tecnologías de Información
____________________________
Eduardo Alvear Simba (Ecuador)
Director de Software Libre, Presidencia de la República
____________________________
Tomas Ariel Duarte C. (Paraguay)
Director de Informática, Presidencia de la República
____________________________
Miriam Valdés Abreu (Cuba)

Read the rest in Vulindlela - Open the Path

Wednesday, September 3, 2008

introducing Namespaces into XML was primarily a requirement stipulated by the RDF community. XML got Namespaces, but then at least notable parts of the RDF community figured that they didn't like RDF/XML all that much and started doing N-triples, N3 and Turtle. The damage was already done, and now the XML community is stuck with Namespaces in XML.

I write software that processes XML, so every time I have to look up a namespace URI, I'm effectively paying a spill-over RDF tax. When my software runs slower because it has to compare two strings instead of one, the users of my software are paying a spill-over RDF tax. I seriously don't like paying the spill-over RDF tax in the form of Namespaces.

--Henri Sivonen on the whatwg mailing list, Friday, 29 Aug 2008 11:02:09

Tuesday, September 2, 2008

The stories are the same except that some of the names are different. A startup gets bought and immediately stops innovating because all their development time is being spent porting the code to a new platform. During that time newer, more agile competitors show up and eat their lunch. Why I find this to be such a conundrum is that when you buy a technology startup, you are primarily buying three things

customers

employees

technology

However the standard operating procedure during Web software acquisitions is to discard the technology and consequentially tick off the employees who made the product a success in the first place thus creating an exodus. The application rewrite plus employee exodus leads to product stagnation which eventually leads to lots of pissed off customers. Thus the entire value from the acquisition is effectively thrown away.

This is the default situation when it comes to acquisitions in the software industry. For every successful acquisition like Google + YouTube there are two or three that are more like Google + Dodgeball. So if there is a startup whose product you love that you hear is being acquired by one of the large Web companies, be happy for the founders and be sad for yourself because the product you love is likely going to become a neglected bride.

--Dare Obasanjo
Read the rest in Dare Obasanjo aka Carnage4Life

Monday, September 1, 2008

I'm afraid when it comes to a choice between conforming to standards and doing what they think is right, Microsoft tend to do what they think is right. Sometimes this is helpful to users, sometimes it causes no end of problems.

--Michael Kay on the xsl-list mailing list, Saturday, 9 Sep 2006 00:10:33

Friday, August 29, 2008

There are quite a few text editors out there for the Mac, and really all it comes down to is preference. While some users may find another more to their liking, others have been using BBEdit for so long, they just can't do what they want to do in any other software. Regardless, if you have never used it before, BBEdit deserves to be checked out.

--Jeff Smykil
Read the rest in Mac text editor BBEdit hits version 9

Thursday, August 28, 2008

Which, by the way, brings up another problem, the same one that plagues CORBA, COM/DCOM, WSDL-based services, and anything that relies on a shared definition file that is used for code-generation purposes, what I often call The Myth of the One True Schema. Assuming a developer creates a working .proto/.idl/.wsdl definition, and two companies agree on it, what happens when one side wants to evolve or change that definition? Who gets to decide the evolutionary progress of that file? Who "owns" that definition, in effect? And this, of course, presumes that we can even get some kind of definition as to what a "Customer" looks like across the various departments of the company in the first place, much less across companies. Granted, the "optional" tag in PBs help with this, but we're still stuck with an inherently unscalable problem as the number of participants in the system grows.

--Ted Neward
Read the rest in Interoperability Happens

Tuesday, August 26, 2008

I'm really really troubled by all my blog URIs breaking, if only for a short time. 6 apart clearly doesn't understand "cool uris don't change" as they break URIs in many different ways each release.

--David Orchard on the www-tag mailing list, Tuesday, 19 Jun 2007 11:35:24

Sunday, August 24, 2008

The Fairness Doctrine will come up again and again over the next few years—mostly invoked by Republicans. Barack Obama says he opposes the Fairness Doctrine. But expect everything that he asks of broadcasters and the Internet to get called the Fairness Doctrine anyway. You can also expect conservatives to see the Doctrine in any telecom proposal that sounds too regulatory: requiring a minimum of local radio fare, restrictions on product placement, caps on how much junk food advertisers can hawk to children on digital TV, etc. All these sinister ideas, plus net neutrality, will be boiled in the same rhetorical broth.

In the final analysis, this debate isn't about an extinct FCC policy. Right now it's about scoring political points. And, most importantly, it's about prolonging the fantasy that our nation's broadcasting/telecommunications infrastructure can effectively serve us without government playing a constructive role.

--Matthew Lasar
Read the rest in Fairness Doctrine Panic hits FCC, spreads through blogosphere

Friday, August 22, 2008

XML can be nested to any level, and good markup languages take advantage of that by grouping similar things together and by using inheritance to scope the applicability of particular attributes. Good markup languages also take advantage of the context in which a particular element or attribute appears to determine its meaning, rather than giving each possibility a distinct name.

--Jeni Tennison
Read the rest in Bad XML

Wednesday, August 20, 2008

Bill Gates used to worry about Microsoft losing its monopoly overnight because of a technical mistake. We all laughed. We laughed because Microsoft had such financial and sales clout and had the executive suite of nearly every customer company so snowed that they seemed unassailable. But on some level Gates was correct and we've seen that proved by Google.

--Mark Stephens
Read the rest in I, Cringely . The Pulpit . The Five Percent Solution | PBS

Tuesday, August 19, 2008

The JSON community is very happy to point out that the mechanisms of XSD and its syntax are inconvenient overkill when all you need to pass around are some data-style lists of fields, types, values. They're right about that. As has been eloquently stated on this list, SQL and relational are cleaner (not to say having decades of deployment and training investment) when all you need is more or less regular data, or things that decompose well into tables.

--Noah Mendelsohn on the xml-dev mailing list, Tuesday, 12 Feb 2008 11:47:22

Monday, August 18, 2008

In any software project with multiple contributors, sometimes they will disagree on a design issue. Where the contributors are employees, usually they’ll continue work even if they disagree with the design. But with volunteers, it’s much more likely that the project maintainer will agree to placate a contributor by adding a configuration setting for the behavior in question. The number, obscurity, and triviality of such preferences ends up confusing ordinary users, while everyone is penalized by the resulting bloat and reduced thoroughness of testing.

Solution: Strong project maintainers and a culture of simplicity.

--Matthew Paul Thomas
Read the rest in Matthew Paul Thomas » Blog Archive » Why Free Software has poor usability, and how to improve it

Saturday, August 16, 2008

OpenID could have been deployed far more effectively if it had simply reused existing information systems directly instead of inventing a duplication of DNS using HTTP and making an incomprehensible mess of its documentation as a result.

--Roy T. Fielding on the www-tag mailing list, Wednesday, 6 Aug 2008 18:06:40

Wednesday, August 13, 2008

It's not that the Internet is broken. It's that browser security is broken. Browser security is really an oxymoron.

--Jeremiah Grossman, chief technology officer with White Hat Security
Read the rest in A photo that can steal your online credentials | InfoWorld | News | 2008-08

Tuesday, August 12, 2008

I'm a big fan of collaboration and collective knowledge production and public good projects. This is one of the reasons that I love Wikipedia. Not only are Wikipedia entries the product of collective contributions, but both the small and large contributions are visible to all. Of course, contributing to Wikipedia needs to be an act of love because there are no traditional structures that reward such contributions. Wikipedia has its faults, but it is fundamentally the collaborative creation of a public good.

Google's Knol takes an entirely opposite approach to knowledge production. Knol's entire structure is built around single authors, control and individualism. There aren't even mechanisms for multiple authors and the tools available for collaboration are extremely limited. "Collaboration" still assumes a primary author. Linking between knols doesn't appear common and so there's no network of information. The key is authorship.

Since Knol launched in beta, folks have been comparing it to Wikipedia (although some argue against this comparison). Structurally, they're different. They value different things and different content emerges because of this. But fundamentally, they're both about making certain bodies of knowledge publicly accessible. They just see two different ways to get there - collaborative anarchy vs. controlled individualism. Because Knol came after Wikipedia, it appears to be a response to the criticisms that Wikipedia is too open to anonymous non-experts. The implication is that Wikipedia is the dribble of the unwashed masses. These same folks praise the control-centric Knol. Yet, I think Doc is right. A knol is quickly becoming a "unit of spam" instead of a unit of knowledge. Y'see - a system that is driven by individualism quickly becomes a tool for self-promoters. (And men...)

--danah boyd
Read the rest in apophenia: knol: content w/out context, collaboration, capital, or coruscation

Monday, August 11, 2008

I've noticed that a lot of people in the field of journalism have a tendency to judge the political proclivities of a publication by the subjective mental states of the staff. The correct way, however, is to look at what's on the pages. Having three socialists doing page layout, two moderate conservatives writing features on politically relevant topics, and one moderately liberal film critic does not a left-of-center publication make. Similarly, if in order to be "interesting" and "provocative" your publication contains some articles in which heterodox liberals challenge liberal conventional wisdom and other articles in which conservatives challenge liberal conventional wisdom, then your publication is mostly publishing conservative content.

--Matthew Yglesias
Read the rest in Matthew Yglesias (August 01, 2008)

Friday, August 8, 2008

The common practice of “release early, release often” can cause poor design to accumulate. When a pre-release version behaves a particular way, and testers get used to it behaving that way, they will naturally complain when a later pre-release version behaves differently — even if the new behavior is better overall. This can discourage programmers from improving the interface, and can contribute to the increase in weird configuration settings.

--Matthew Paul Thomas
Read the rest in Matthew Paul Thomas » Blog Archive » Why Free Software has poor usability, and how to improve it

Thursday, August 7, 2008

XQuery does of course handle mixed content data, and shares the same data model as XPath 2.0 and XSLT 2.0, too. It's true that it does not have apply-templates; XQuery is best when you are grabbing fragments of documents; if you want to transform the fragments, it can also make sense to use XSLT after XQuery.

--Liam Quin on the xsl-list mailing list, Sunday, 22 Jun 2008 08:43:50

Wednesday, August 6, 2008

I regard myself as a centrist. I believe very much that in proper doses copyright is essential for certain classes of works, especially commercial movies, commercial sound recordings, and commercial books, the core copyright industries. I accept that the level of proper doses will vary from person to person and that my recommended dose may be lower (or higher) than others. But in my view, and that of my cherished brother Sir Hugh Laddie, we are well past the healthy dose stage and into the serious illness stage. Much like the U.S. economy, things are getting worse, not better. Copyright law has abandoned its reason for being: to encourage learning and the creation of new works. Instead, its principal functions now are to preserve existing failed business models, to suppress new business models and technologies, and to obtain, if possible, enormous windfall profits from activity that not only causes no harm, but which is beneficial to copyright owners. Like Humpty-Dumpty, the copyright law we used to know can never be put back together again: multilateral and trade agreements have ensured that, and quite deliberately.

--William Patry, Google's Senior Copyright Counsel
Read the rest in The Patry Copyright Blog: End of the Blog

Tuesday, August 5, 2008

I had never used Microsoft search before a couple of weeks ago. I sat down to try it and then realized I had no idea what to type in. I tried Microsoft.com, but the only search bar was to help me search MSFT internal directories. I then remembered seeing some advertisement for 'Live Search,' so I tried Livesearch.com. It was some spam site. Finally, I tried Live.com, and the results were no good. I will never use it again. When I want to Google or find Yahoo, I know what to punch in. Simple as that.

--Eric Jackson, Ironfire Capital
Read the rest in Why Microsoft Will Try Again for Yahoo in 2009

Monday, August 4, 2008

Ars has been one of only many groups banging the anti-DRM drum for years. We're not pro-piracy, we're just not dumb as rocks. DRM makes things harder for legal users; it creates hassles that illegal users won't deal with; it (often) prevents cross-platform compatibility and movement between devices. In what possible world was that a good strategy for building up the nascent digital download market?

The only possible rationales could be 1) to control piracy (which, obviously, it has had no effect on, thanks to the CD and the fact that most DRM is broken) or 2) to nickel-and-dime consumers into accepting a new pay-for-use regime that sees moving tracks from CD to computer to MP3 player as a "privilege" to be monetized. What we really need to do is just—you know what? Why bother. We've been down this road so many times before that everyone knows their lines by heart.

--Nate Anderson
Read the rest in DRM still sucks: Yahoo Music going dark, taking keys with it

Saturday, August 2, 2008

Ogg is not proprietary, DRM is, and DRM-free may be a "non-starter" for Hollywood today, but that was true of music two years ago and today, most of the labels are lining up to release their catalogs without DRM. The Web, and Web-based video, are bigger than Hollywood. The Web is not a place for proprietary technology or systems that take over your computer. For Nokia (and Apple, who also lobbied hard for DRM inclusion) to get the Web this badly wrong, this many years into the game, is really sad: if you haven't figured out that the Web is open by 2007, you just haven't been paying attention.

--Cory Doctorow
Read the rest in Nokia to W3C: Ogg is proprietary, we need DRM on the Web

Friday, August 1, 2008

the best practical way of encoding an XML document as a stream of bytes for transmission over the wire is probably not to encode it at all, i.e. send the XML document as is.

--Michael Kay on the xml-dev mailing list, Saturday, 19 Feb 2005 09:03:53

Thursday, July 31, 2008

We didn't need Google, or didn't think we did before Google came along. I don't recall sitting around complaining about Alta Vista and Excite and the other pre-Google search engines, which seemed to do a pretty good job in their day. But then Google came along and was clearly better -- enough better that we all jumped.

How much better did Google have to be than Alta Vista to replace it in the minds and mice of most users? I argue five percent better is good enough. In a market where products are presented as services and those services are ad supported and don't cost users any cash, there is almost no exit barrier. The system has no friction, no stiction. Five percent better is enough to steal that kind of promiscuous market. And five percent isn't much -- a little better UI or server or just a slightly different idea can be enough.

--Mark Stephens
Read the rest in I, Cringely . The Pulpit . The Five Percent Solution | PBS

Tuesday, July 29, 2008

There must be a clear chain of authority for interpreting the URI. The chain of authority starts with the URI (or IRI) spec itself. Effectively, the URI (or IRI) spec delegates authority to the scheme. In the case of the HTTP scheme, the HTTP scheme effectively delegates authority to the domain name owner, who in turn may choose to delegate authority for some sub-spaces of its URI space. So for example, the owner of foo.example could delegate authority to dbooth for all URIs in http://*.dbooth.foo.example/* or all URIs in http://foo.example/dbooth/* . But Netscape could *not* delegate authority for interpreting all URIs that match http://*/netscape/* because Netscape does not own all of those URIs.

--David Booth on the www-tag mailing list, Tuesday, 29 Jul 2008 14:55:56
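The delegation rule in the quote above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pattern_host` and `may_delegate` are invented here, the check looks at the host part of the pattern only, and `netscape.example` is a hypothetical stand-in for a domain Netscape owns.

```python
def pattern_host(pattern):
    """Return the host part of an http URI pattern such as http://*.a.example/*."""
    prefix = "http://"
    if not pattern.startswith(prefix):
        raise ValueError("only http patterns are handled in this sketch")
    # Everything between the scheme and the first slash is the host.
    return pattern[len(prefix):].split("/", 1)[0].lower()


def may_delegate(owner_domain, pattern):
    """True if every URI matching the pattern lies under the owner's domain.

    A wildcard host such as "*" fails, because it can match hosts that the
    owner does not control.
    """
    host = pattern_host(pattern)
    owner = owner_domain.lower()
    return host == owner or host.endswith("." + owner)


# The two delegations the quote allows, and the one it rules out.
allowed_subdomain = may_delegate("foo.example", "http://*.dbooth.foo.example/*")
allowed_path = may_delegate("foo.example", "http://foo.example/dbooth/*")
refused = may_delegate("netscape.example", "http://*/netscape/*")
```

Checking only the host is a deliberate simplification: in the HTTP scheme the domain name owner is the party that holds authority, so a pattern whose host is not pinned to that domain cannot be delegated by them.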

Monday, July 28, 2008

data and document applications are not always separate. The example I like to use is of insurance companies. A list of policy holders no doubt fits very well into a relational model, and SQL is great for querying and updating it. For each policy holder there is likely to be one or more insurance policy documents, and indeed information like the customer ID, policy number, customer's name and address etc. is likely shared or linked between the documents and the policy holder list. Quite possibly the policy documents are encoded as some kind of smart template, so that a join in a language like XQuery can automatically create the policy documents tailored for each customer. Indeed, there may be logic in the template or that XQuery along the lines of: for any policy in excess of $1m, include paragraph 2 on limitations of liability.

The point is that what makes XML so valuable is not just that it can do documents, but that it can do these combined document/data applications. The xsd:decimal type you use in Schema can be applied to both the amount of coverage in the list of policy holders and to the corresponding number in the policy document. You can query for all policy documents in which the zip code of the document matches the location of responsibility for some of the insurance agents in your (data-style) agent list. It's the combination of data and documents in a uniform model that's powerful.

--Noah Mendelsohn on the xml-dev mailing list, Tuesday, 12 Feb 2008 11:47:22
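The insurance example quoted above, where a data-style list of policy holders drives a document template with a conditional paragraph, might look roughly like this. The sketch uses Python rather than XQuery, and every name, record, and piece of wording in it is made up for illustration; `Decimal` stands in for the xsd:decimal type the quote mentions.

```python
from decimal import Decimal

# Hypothetical policy holder list (the "data" side of the application).
POLICY_HOLDERS = [
    {"customer_id": "C001", "name": "A. Example", "coverage": Decimal("2500000.00")},
    {"customer_id": "C002", "name": "B. Sample", "coverage": Decimal("750000.00")},
]

# Threshold from the quote: policies in excess of $1m get paragraph 2.
LIABILITY_THRESHOLD = Decimal("1000000")


def render_policy(holder):
    """Build the policy document (the "document" side) for one holder."""
    paragraphs = [f"Policy for {holder['name']} (customer {holder['customer_id']})."]
    if holder["coverage"] > LIABILITY_THRESHOLD:
        paragraphs.append("Paragraph 2: limitations of liability apply.")
    paragraphs.append(f"Amount of coverage: {holder['coverage']}")
    return "\n".join(paragraphs)


# One tailored document per customer, keyed by the shared customer ID.
documents = {h["customer_id"]: render_policy(h) for h in POLICY_HOLDERS}
```

The same decimal value appears both in the record and in the rendered text, which is the point the quote makes about a uniform model for data and documents.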

Wednesday, July 23, 2008

Syntax based systems, even as well-thought through as SQL, are simply not the right interface for 'non-programmers'; hence the success of correct-by-construction QBE systems and forms that imitate paper structures.

--Len Bullard on the xml-dev mailing list, Monday, 11 Feb 2008 12:51:56

Tuesday, July 22, 2008

XML users don't want to be constrained to not have duplicate "database data types" as you call them, and they frequently have element and attribute names that are the same but mean different things in different documents and different parts of the same document.

--John Snelson on the xml-dev mailing list, Friday, 15 Feb 2008 12:03:27

Monday, July 21, 2008

Flash Web sites are still teh suck. They're a lame attempt to make your boring site look interesting by distracting users with pretty pictures and moving frames. You should be using your content and product/service offerings to do that.

HTML and Flash may be becoming more equal to the search engines, but most users will still favor an HTML Web site that they can navigate easily. Today's announcement is something worth watching, but I don't think it changes anything yet. I also wonder why Google had to wait on Adobe to "invent" some kind of special Flash player for them to use. It seems to me that if it were that simple Google would have done it years ago, no? Or are the brains at Adobe really smarter than the ones at the Googleplex?

I don't know. My SEO advice: Stick with creating HTML-based Web sites that users will want to interact with. Save the Flash stuff for the movie Web sites no one really cares about anyway.

--Bruce Clay
Read the rest in Bruceclay.com

Friday, July 18, 2008

HTML5 will never fly: It tries to reinvent the Web, if not the Internet. And this is a Bad Thing.

--Frank Ellermann on the uri mailing list, Wednesday, 25 Jun 2008 09:45:24

Wednesday, July 16, 2008

namespaces might not be to everyone’s taste, but they enable one markup language to re-use other markup languages. Reuse helps everyone: it lowers the amount of design you have to do, it prevents authors from having to learn another way of marking something up, it enables programmers to reuse their code. Reusing languages such as XHTML, SVG or MathML should be a no-brainer, and we can do so easily because we have namespaces.

--Jeni Tennison
Read the rest in Bad XML

Monday, July 14, 2008

XML 1.0 (fifth edition) is bad from a standardization and engineering viewpoint, betrays the goals of XML 1.0 which have served well for the last decade, and may hurt the end-users it is intended to support. It sets up a workable versioning mechanism then fails to use it for a significant change. It provides a good foundation for workable minor versioning, then ignores the foundation and builds on sand with its allowing of incompatible names.

I may be wrong, but it looks like a hack to me. However, fortunately it barely impacts anyone in the West, including me nowadays, so who cares? Interoperability, schminteroparibility! Unambiguous labelling of data formats, gedoudahere!

--Rick Jelliffe
Read the rest in XML 1.0 (draft fifth edition) builds a foundation then doesn't use it

Thursday, July 10, 2008

Good XML vocabulary design generally (not always) involves making sure that human-language content is in element content, not in element names and not in attributes. Of course, both HTML and RDF have got this wrong, but it is still a good principle.

--Liam Quin on the xml-dev mailing list, Saturday, 23 Feb 2008 00:30:30

Wednesday, July 9, 2008

Providers of broadband internet access services, including Bell, should be prohibited from throttling lawful applications. The internet is simply too important to allow them to act as such a gatekeeper; the internet's myriad benefits can only be fully realized when Canadian carriers allow end users to choose the applications and content they prefer. Protecting end user choice is the central issue in this proceeding, but also a much larger issue. It goes to the heart of the internet and how it acts as an extraordinary platform for innovation and fair competition.

Bell claims its throttling of peer-to-peer (“P2P”) applications is a reasonable form of network management. Google respectfully disagrees. Network management does not include Canadian carriers’ blocking or degrading lawful applications that consumers wish to use. From consumer, competition and innovation perspectives, throttling applications that consumers choose is inconsistent with a content and application-neutral internet, and a violation of Canadian telecommunications law, which forbids unfair discrimination and undue or unreasonable preferences and requires that regulation be technologically and competitively neutral.

Instead of further expanding its network so that it can honour all end user choices of content and applications, and instead of implementing technologically and competitively neutral measures for addressing instances of congestion, Bell has pre-empted end users’ right to choose – without notice to either end users or Bell’s wholesale GAS service customers. It should not be allowed to continue to do so.

--Google Inc.
Read the rest in Comments concerning CAIP Part VII Application requesting certain orders directing Bell Canada to cease and desist from “throttling” its wholesale ADSL Access Services (annoying PDF)

Tuesday, July 8, 2008

I have no respect for ECMA, and very little for ISO, for setting standards in the fields of software and networking protocols. As far as I can see, both of those organizations have shown that they are susceptible to being heavily co-opted by large companies to screw end-users. ISO is better than ECMA, but that's pretty much damning by faint praise.

--Jeff McAdams on the whatwg mailing list, Sunday, 13 Dec 2007 15:53:48

Monday, July 7, 2008

Everybody who actually touches the technology has known the truth for years, and it’s time to stop sweeping it under the rug. W3C XML Schemas (XSD) suck. They are hard to read, hard to write, hard to understand, have interoperability problems, and are unable to describe lots of things you want to do all the time in XML. Schemas based on Relax NG, also known as ISO Standard 19757, are easy to write, easy to read, are backed by a rigorous formalism for interoperability, and can describe immensely more different XML constructs.

--Tim Bray
Read the rest in ongoing · Choose RELAX Now

Saturday, July 5, 2008

No, it doesn't matter how many different implementations handle invalid data in different ways. You can repeat those imaginary goals of HTML5 til the end of days and it still won't matter. The right way to handle invalid data is to refuse to use it, where "use" is entirely dependent on the context where it occurs. I don't care what MSIE does with invalid URI references. I do care what Firefox, Safari, and WebKit do with invalid URI references, but only because I prefer to have them highlighted/rejected rather than used. The implementations I create refuse or reject invalid data because to do anything else is going to be a security hole to someone, somewhere, and it is simply irresponsible to repeat whatever mistakes were made when hacking Mosaic in 1993.

--Roy T. Fielding on the URI mailing list, Wednesday, 25 Jun 2008 20:19:52

Friday, July 4, 2008

Saxon hit its best ever ranking on Sourceforge at number 36 early in June. It's consistently in the top 200 projects (out of 180,000). This puts it in the same league as TortoiseSVN, FileZilla, and jEdit: that is, mass-market general-purpose developer tools. XSLT is not a niche.

--Michael Kay on the xsl-list mailing list, Monday, 23 Jun 2008 00:40:26

Wednesday, July 2, 2008

One question that I asked a number of people at DocTrain was the role of Microsoft Office and OOXML (similarly asking about Open Office and ODF) in their technical documentation system. The answers were revealing - while a major portion of the documentation that currently exists in most organizations is still in Word files, the ability to work with OOXML is not that big a factor for the typical attendee, because the problems inherent in Word as documentation format have more to do with structural integrity and accessibility than they do with pipeline production. It is possible, as is the case with In.vision Research, to create components that will set up some of these constraints so that the constraining benefits of DITA can be employed there, but Word in particular was not designed to be a constraining editor out of the box, making it fairly useless on its own to provide enough structure consistently to make the resulting XML meaningful.

--Kurt Cagle
Read the rest in Dita, DocBook and the Art of the Document

Tuesday, July 1, 2008

plain text is the only sustainable way to communicate and record language electronically. Rich text formats come and go with dizzying speed, leaving mountains of laboriously crafted documents stranded, soon to be forgotten as the effort of deciphering their quaint outmoded formats becomes less convenient and increasingly costly. What really matters in a document, and what makes it worth saving, is its content, not its form.

--Frank da Cruz on the unicode mailing list, Monday, 30 Jun 2008 9:24:30

Monday, June 30, 2008

Microsoft's success is based on two products and only two products -- Windows and Office. Microsoft is obsessed with the idea that Google will undermine one or both of those monopolies through Google Apps. This is all Steve Ballmer thinks about and is what made him so eager to spend $40+ billion for Yahoo. But what if the real threat isn't Google at all, but Apple?

--Mark Stephens
Read the rest in I, Cringely . The Pulpit . MeMobile, You Kaput | PBS

Sunday, June 29, 2008

It really makes my blood boil when I bookmark a page at a site and later find that the URL I saved is broken, because the webmorons maintaining the site couldn't be bothered to ensure that old URLs remained valid when said morons decided to reorganize.

--Scott Meyers on the scott_meyers mailing list, Sunday, 22 Jun 2008 11:49:09

Friday, June 27, 2008

Now what happens if people start to block ads? First of all, everybody who hates ads and wouldn’t click them anyway now blocks ads. And this can make advertisers really happy because instead of wasting their bandwidth (and money) they now only serve ads to people who are interested in them. They also get better statistics and can see which ads people find more interesting — without having to estimate the number of people who wouldn’t click any ad.

--Wladimir Palant
Read the rest in Adblock Plus and (a little) more: Ads don't generate money

Thursday, June 26, 2008

Advertisements on the Web have been annoying and intrusive since the beginning - from 10KB banner ads in the days of 14.4kbps modems, to the more intrusive pop-up and pop-under ads when 56.6kbps modems came along. Then there came Flash ads that covered up the information we wanted to read, while we were reading it, fake Windows dialog boxes, adware programs, and now even embedded Flash video in banner ads. Every time we got more bandwidth to compensate for the delays caused, in part, by advertising, advertisers took more bandwidth.

The lone exception - and unfortunately, innocent victim - in this was Google, who pioneered a low-bandwidth, highly-relevant way to do online advertising that went against everything other advertising companies had done. They provided more value, rather than less, they stayed out of the way, rather than obstructed. There is a reason that Google is the #1 online advertiser in the world, and it has everything to do with common courtesy and decency.

--Brian Boyko
Read the rest in Adblock: Adapt, or die. | Network Performance Blog

Monday, June 16, 2008

JavaScript is an important language because it is the language of the web browser. Its association with the browser makes it one of the most popular programming languages in the world. At the same time, it is one of the most despised programming languages in the world. The API of the browser, the Document Object Model (DOM) is quite awful, and JavaScript is unfairly blamed. The DOM would be painful to work with in any language. The DOM is poorly specified and inconsistently implemented. This book touches only very lightly on the DOM. I think writing a Good Parts book about the DOM would be extremely challenging.

JavaScript is most despised because it isn't SOME OTHER LANGUAGE. If you are good in SOME OTHER LANGUAGE and you have to program in an environment that only supports JavaScript, then you are forced to use JavaScript, and that is annoying. Most people in that situation don't even bother to learn JavaScript first, and then they are surprised when JavaScript turns out to have significant differences from the SOME OTHER LANGUAGE they would rather be using, and that those differences matter.

The amazing thing about JavaScript is that it is possible to get work done with it without knowing much about the language, or even knowing much about programming. It is a language with enormous expressive power. It is even better when you know what you're doing. Programming is difficult business. It should never be undertaken in ignorance.

--Douglas Crockford
Read the rest in JavaScript: The Good Parts

Friday, June 13, 2008

One of the most frequent criticisms which self-proclaimed journalists voice about bloggers is that journalists (but not bloggers) engage in "real reporting" -- by which they mean that they speak to government officials and then faithfully write down what they say and then include those quotes in the things they write, and often shape what they write based on those quotes. But that's exactly the process that transforms journalists into handmaidens for government propaganda, that makes them fear a loss of access, and renders them dependent on maintaining relationships with the very people whom they're ostensibly scrutinizing.

--Glenn Greenwald
Read the rest in Longtime reporter: "Bloggers have taught me a lesson" about dependency on sources - Glenn Greenwald

Wednesday, June 11, 2008

Indeed, this is a strategy I would recommend highly to both Microsoft and Sun - set up a “constrained” mode that can utilize a DTD, Relax NG or XSDL schema to determine what styles are valid at any given point in a document, then only expose those styles. It would require some other modifications (cut and paste would have to become more intelligent, for instance), but the advantage to this approach is huge - it turns the word processors that most people use into an XML friendly word processor. Not to put a damper on good third party tools such as those provided by In.vision Research or JustSystem’s XMetal, but an overarching theme that I heard at the conference was that for business documents and documentation in particular, structured documentation beats unstructured pretty much universally - it’s easier to author, easier to repurpose, easier to search, easier to integrate into larger systems. Which format works best where is still somewhat debatable, but the documentation community has taken to XML in a huge way, and more than nearly any other sector, they are working with XML in the way that it was originally intended to be worked.

--Kurt Cagle
Read the rest in Dita, DocBook and the Art of the Document

Tuesday, June 10, 2008

5 years on it's inevitable that the smart people will have recognized some of the imperfections. But that's irrelevant. XML is now in the phase where ordinary people will use it because everyone else uses it, not because it's the smartest idea in town. And quite right too.

--Michael Kay on the xml-dev mailing list, Tuesday, 26 Oct 2004

Friday, June 6, 2008

XForms+REST+XQuery = XRX = High ROI for Web Developers

XRX gives developers the luxury of using the same data selection language (XPath) on both the client and server. The same expressions can be used in your MVC bind on the client and in Schematron data validation rules on the server. This, however, is not the motivation for migrating to XRX. Declarative techniques that use XML structures tend to accelerate the creation of domain-specific languages (DSLs). DSLs are easier to manage with forms and graphical user interfaces which makes them more useable by SME’s and BA’s. XRX is the front runner in the declarative revolution and the forces empowering non-programmers. This is not to say that XRX will not have opposition. Vendors selling operating-specific client APIs or SQL products will resist XRX technologies for the foreseeable future. An entire community of AJAX developers has grown up around the lack of declarative technologies in our browsers. But in the long term these opponents will be required to compete against a simpler and superior architecture.

--Dan McCreary
Read the rest in XRX: Simple, Elegant, Disruptive

Thursday, June 5, 2008

Between the ages of 25 and 60, the time users need to complete website tasks increases by 0.8% per year.

In other words, a 40-year-old user will take 8% longer than a 30-year-old user to accomplish the same task. And a 50-year-old user will require an additional 8% more time. (Mathematically inclined readers will note that this increase is linear, not exponential.)
This finding is statistically significant at the 5% level, given the 61 users in our study.
Does this mean that people in their 40s or 50s can't do their jobs? Not at all. There are many other ways in which people get better with age.
Individual differences swamp the tiny age-related difference in the 25- to 60-year-old group. Users are extraordinarily variable in their use of websites and intranets.
I have a 5-5-5 rule for task times while using websites: Across a broad range of studies, our data shows that

the slowest 5% of users are
about 5 times as slow
as the fastest 5% of users,

meaning that the slowest users need 400% more time to perform the same tasks. The 0.8% difference caused by each year of aging pales in comparison.

So, a fast 50-year-old will beat a slow 30-year-old every day — by several hundred percent.

--Jakob Nielsen
Read the rest in Middle
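The arithmetic in the quote above can be checked with a short Python sketch. The function names and the choice of age 30 as the baseline are assumptions made here for illustration; only the 0.8% per year figure, the linear growth, and the "5 times as slow" ratio come from the quote, and the model is only meant for the 25 to 60 age range it describes.

```python
def relative_task_time(age, baseline_age=30, yearly_increase=0.008):
    """Task time relative to a user of baseline_age, growing linearly.

    Linear, not compounding: each year adds 0.8% of the baseline time.
    """
    return 1.0 + yearly_increase * (age - baseline_age)


def percent_more_time(slowdown_factor):
    """Convert 'N times as slow' into 'X% more time'."""
    return (slowdown_factor - 1.0) * 100.0


forty = relative_task_time(40)    # about 8% longer than at 30
fifty = relative_task_time(50)    # about 16% longer than at 30
spread = percent_more_time(5.0)   # slowest 5% versus fastest 5% of users
```

Under this model twenty years of aging accounts for roughly 16% more time, while the 5x spread between individual users corresponds to 400% more time, which is why the quote calls the age effect tiny by comparison.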

Wednesday, June 4, 2008

XML is extensible: you can make up your own names for the elements that you use. Bad markup languages use generic names like table or record, which only mean anything when coupled with a name provided in an attribute or child element. Good markup languages use element names to provide semantics about element content.

--Jeni Tennison
Read the rest in Bad XML
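As a rough illustration of the contrast drawn in the quote above, the Python sketch below reads the same value from a generically named document and from a semantically named one. Both XML snippets, the element names, and the helper functions are invented for this example and do not come from the article.

```python
import xml.etree.ElementTree as ET

# Generic names: the meaning lives in attribute values.
GENERIC = '<record name="book"><field name="title">Example Title</field></record>'

# Semantic names: the element names carry the meaning.
SEMANTIC = "<book><title>Example Title</title></book>"


def title_from_generic(xml_text):
    """Scan every <field> and inspect its name attribute to find the title."""
    root = ET.fromstring(xml_text)
    for field in root.findall("field"):
        if field.get("name") == "title":
            return field.text
    return None


def title_from_semantic(xml_text):
    """Ask for the <title> child directly."""
    return ET.fromstring(xml_text).findtext("title")


generic_title = title_from_generic(GENERIC)
semantic_title = title_from_semantic(SEMANTIC)
```

Both functions return the same string, but the generic form forces the consumer to know an extra convention (which attribute holds the real name), which is the cost the quote attributes to generic element names.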

Tuesday, June 3, 2008

Cisco did a favor to the Chinese government several years ago by selling them the mirroring routers on which the Great Firewall is based, at a time when Chinese authorities could not easily have produced the systems on their own. The likely use of the routers was well understood – and it should be obvious why selling them to a government which intends to monitor its citizens is different from selling them to some company that wants to monitor its employees. But whatever the merits of the argument back then, the entire question is now moot. The Chinese authorities could buy the necessary routers from a variety of sources – notably from the homegrown firm Huawei. So, really few people here spend much time worrying about Cisco’s role anymore.

--James Fallows
Read the rest in Chinese Internet censorship: An inside look

Monday, June 2, 2008

So along comes XML 1.0 (fifth edition) as a draft. It has only a couple of changes of significance. The first is that it finally puts in place a rudimentary versioning system: E10 allows an XML 1.0 processor to parse an XML 1.x document on the understanding that it only reports things in terms of XML 1.0 rules and capabilities.

The second change then makes a mockery of the first. It introduces the lax naming rules from XML 1.1. Now such a change is not required for any reason, because XML 1.1 exists and could be used. So rather than go into a well-managed regime where documents are well-labelled, and XML minor versions chug along, XML 1.0 draft fifth edition just allows a new XML parser to reject documents that an old parser rejected: and remember this is not because of previous bad practice being more consistently exposed, but because some innocent person has created a document with the new name characters and the XML 1.0 processors deployed in the last decade reject it.

Basically, the W3C XML WG is saying that if you get a document that breaks in this way, it is the receiver’s problem. The sender can say “But it is well-formed against the latest version of XML 1.0” and the XML WG washes their hands. It is the triumph of bad engineering practice, of doing what can be guaranteed to fail, of putting the responsibility on the wrong person. It will cause problems first for the nominal beneficiaries of these extra name characters (since they will be unreliable) and second for people using non-UTF-8 encodings who won’t get as many WF errors. So who will benefit: the makers of standards who will have less housekeeping. They are not an unworthy set of stakeholders.

--Rick Jelliffe
Read the rest in XML 1.0 (draft fifth edition) builds a foundation then doesn't use it

Tuesday, May 27, 2008

XRX is a new web development architecture that is a milestone in elegant simplicity. XRX stands for:

XForms on the client
REST interfaces
and XQuery on the server

Because XRX uses a single model for data (XML) it avoids the translation complexity of other architectures. The simplicity and elegance of XRX allows developers to focus on other value-added features of web application development and enables non-programmers to create a rich web interaction experience without the need to use procedural programming languages.

--Dan McCreary
Read the rest in XRX: Simple, Elegant, Disruptive

Monday, May 26, 2008

One thing that becomes obvious when studying the XML landscape is that a significant amount of XML is still being used for purposes of describing narrative, for telling a story, advising people in the use of a product, structuring reports, and doing other things that focus more on documents than they do on data.

In some respects, this is not all that surprising. In general, when you’re dealing with data-centric applications, XML isn’t always the best choice for working with structured content, and indeed there are times where XML is perhaps the worst, most hideously inefficient mechanism for dealing with data. However, the use of XML as a means of writing and marking up narrative has become the standard means of encoding structured content in most organizations. That doesn’t mean that XML is dominant in most organizations for “unstructured” content - that distinction is still very much in favor of Microsoft Word, with XML occupying a considerably inferior position there - but for organizations that recognize the benefit of structured content, XML languages such as DITA and DocBook are very quickly becoming the standard for storing information.

--Kurt Cagle
Read the rest in Dita, DocBook and the Art of the Document

Saturday, May 24, 2008

Most of us will never need to know which characters map to which Unicode numbers, but modern computing could scarcely do without Unicode. In fact, it's what's letting you read this article in your Web browser, right now.

--Neil McAllister
Read the rest in The 10 Most Important Technologies You Never Think About

Friday, May 23, 2008

I am absolutely sick and tired of big companies coming in and throwing their weight around in standards organizations and getting their end-user-screwing technologies embedded into supposedly open and free standards. I've watched it happen in the past with the w3c, I've watched it happen repeatedly in the IETF, I don't think I've ever seen it *not* happen with ISO, ECMA seems *designed* to rubber stamp end-user-screwing technologies. And, yes, Apple, I'm looking at you here too. Your hands are not clean in this from past exercises. No, I don't trust you, yes, I'm going to object loud and long to any move that appears to be moving away from free and open technologies, which is what this is.

--Jeff McAdams on the WHATWG Proposals mailing list, Wednesday, 12 Dec 2007 04:30:48

Wednesday, May 21, 2008

Microsoft has domain email too, but, quick, go to MSN Search and do a search for “email for your domain.” (The result is even worse for Microsoft if you do that search on Google — although Office Live is advertising on Google, but not on MSN, ironically enough). Do you find anything about Microsoft’s hosted email? No. Why? Because Microsoft’s Web site for Office Live (which is where small businesses can get email hosted) is poorly designed for search engines. Its title tag has NOTHING about email. So, how, again, will it get listed on search engines for business or corporate email solutions? It won’t.

Microsoft’s marketers should be forced to study how search engines work. Google’s marketers understand this deeply. Which is why Podtech is using Gmail and not one of Microsoft’s solutions.

--Robert Scoble
Read the rest in Scobleizer: Wanting to use “dead” Office and why we’re not

Tuesday, May 20, 2008

The problems for the media are structural. In an age of 24-hour news channels, they have more hours than they have news. So the shows are really run as a type of soap opera. Campaigns become ongoing stories with a cast of characters and a history that can be referred back to. That requires the daily construction of a story line. Characters need definition and catchphrases and frailties. Above all, they have to be interesting and arouse strong passions -- be they anger or inspiration -- in viewers.

On a technical level, the visual nature of television requires clips that can be easily and endlessly replayed to remind viewers of what they're watching and what happened in past episodes. So it's no surprise that the media hunger for out-of-character gaffes and missteps -- those moments are crucial to the business model.

But politicians increasingly have alternatives. The daily e-mails of the candidates, for instance, reach millions of their supporters (and, for that matter, detractors and people who can't figure out why they ended up on the e-mail list), and for better or worse, lay out the campaign's chosen narrative. When Obama gave his speech on race and the Rev. Jeremiah A. Wright Jr., millions of people took to YouTube to watch the unedited version.

--Ezra Klein
Read the rest in A campaign without the 'gotchas'

Monday, May 19, 2008

Not all XML is created equal, and I think the biggest distinction between a good markup language and a bad one comes down to whether the XML was designed as a markup language or whether it’s a serialisation of a completely different model. Practically all the XML serialisations that I’ve seen of object-oriented models, or relational models, or graph models, have been dreadful as markup languages.

--Jeni Tennison
Read the rest in Bad XML

Wednesday, May 14, 2008

If your data is really simple, maybe just a set of key/value pairs, and if both the key and the value are strings, and if the consequences of bad data are negligible, and if there's no possibility that there will ever be any additional complexity, then sure, maybe a flat text file is all you need.

On the other hand, the difference between:
fruit=pear
vegetable=carrot
topping=wax
and
<doc>
<fruit>pear</fruit>
<vegetable>carrot</vegetable>
<topping>wax</topping>
</doc>
isn't really that large, is it? (Or maybe you think it is, de gustibus non est disputandum.) Except, of course, that in the XML case, you don't have to write or maintain the code for the parser, unit tests for the parser, or documentation for the parser in every language (programming and documentation), and for every platform, supported by your application. Nor do you have to worry about how to parse the file when the data contains spaces or new lines or Chinese characters. And some day, when the data is just a tiny bit more complex, you won't have to devise some clever hack for extending the format. You'll just use XML.

--Norm Walsh
Read the rest in Defending the tax
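
As a rough illustration of the trade-off described above, here is a minimal Python sketch. The names `parse_flat` and `parse_xml` and the sample strings are assumptions made for this example, not anything from the quoted post. The flat format needs a hand-written parser, while the XML version leans on the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Sample data mirroring the two forms in the quote (illustrative only).
FLAT = "fruit=pear\nvegetable=carrot\ntopping=wax\n"
XML = ("<doc><fruit>pear</fruit><vegetable>carrot</vegetable>"
       "<topping>wax</topping></doc>")


def parse_flat(text):
    """Hand-written parser: one key=value pair per line.

    A value that itself contains a newline cannot be represented
    without inventing an escaping convention.
    """
    pairs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def parse_xml(text):
    """Let the stock XML parser do the work: child tag -> text content."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


# A value with a line break and Chinese characters needs no extra code here.
MULTILINE_XML = "<doc><note>第一行\n第二行</note></doc>"
```

For the simple data both functions return the same dictionary. Only the XML route handles the multi-line, non-ASCII value without further design work on the format.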

Tuesday, May 13, 2008

My friend Eamonn Fingleton, says in a new book about China (In the Jaws of the Dragon) that many kinds of government control in China are surprisingly effective precisely because they are so variable and unpredictable in the way they're enforced. Fingleton uses the term "selective enforcement" to describe this process; some Chinese people refer to it by a Chinese saying that boils down to, "One eye open, one eye shut." The idea is that if you're never quite sure when, why and how hard the boom might be lowered on you, you start controlling yourself, rather than being limited strictly by what the government is able to control directly.

When it comes to the Internet, this haziness about just what is and is not permissible has two implications. At a purely technical level, it makes it harder to reverse-engineer the firewall's filters. One day, you can reach all pages at the BBC. The next day they're blocked. If you're trying to game out the system, you're stymied. And at a social level, it makes it hard for people to be sure that they're ever operating in a truly safe zone, since the rules of enforcement might shift tomorrow.

--James Fallows
Read the rest in Chinese Internet censorship: An inside look

Saturday, May 10, 2008

the same is true right now when it comes to projects like Open Office where, again, there are rationalizations for why you have to assign copyrights to some and they may even be valid, but it does undermine the community because it means that there is a first among equals.

There is – Sun ends up having rights that nobody else has – even if they then act perfectly and they really behave well, just the fact that they have special rights makes people legitimately feel like they are second class citizens and that’s not how you build a community.

One of the things I did with Linux from very early on was when somebody sends me changes, they retain all copyright in those changes. Nobody has – I don’t have any more rights than anybody else has except in the sense that I’ve written more code than most people, but – I say most people, not all people, that your rights when it comes to Linux are directly what you put into it.

--Linus Torvalds
Read the rest in Linus Torvalds

Friday, May 9, 2008

Corporations who were market leaders in the 1980s and 1990s for PC applications have a responsibility to make sure that documentation on their old formats are not lost. Especially for document formats before 1990, the benefits of the format as some kind of IP-embodying revenue generator will have lapsed now in 2008. However the responsibility for archiving remains.

So I call on companies in this situation, in particular Microsoft, IBM/Lotus, Corel, Computer Associates, Fujitsu, Philips, as well as the current owners of past names such as Wang, and so on, to submit your legacy binary format documentation for documents (particularly home and office documents) and media, to ISO/IEC JTC1 for acceptance as Technical Specifications.* Handing over the documentation to ISO care can shift the responsibility for archiving and making available old documentation from individual companies, provide good public relations, and allow old projects to be tidied up and closed.

--Rick Jelliffe
Read the rest in An Open Letter to Microsoft, IBM/Lotus, Corel and others on Lodging Old File Formats with ISO

Thursday, May 8, 2008

File formats may sound mundane, but they can give strategic value to those who control them as a gateway to the data held by people and companies.

--Stephen Shankland
Read the rest in Google mapping spec now an industry standard | Tech news blog

Wednesday, May 7, 2008

Don't believe everything your 25-year old Web designers tell you about "what's easy" — especially if your target audience is 50-year-old corporate managers.

--Jakob Nielsen
Read the rest in Middle

Friday, May 2, 2008

While using XML to persist data structures enables some minimal benefits via text encoding, it seems to be harmful to the larger goal of XML being easily understandable and thus processable by many applications. So, while in a small way, dumping data structures to XML is not evil, it also is not recommended. The reason is that the data remains fairly tightly coupled to the program which produced it and thus the semantic value of the data, as a standalone entity, is diminished. In short: Better to design XML documents than dump XML data.

--Michael C. Daconta
Read the rest in XML design: data or documents?

Thursday, May 1, 2008

Clearly in terms of the information represented by a message, content and markup are completely interchangeable. It's purely a tactical distinction to assist the recipient with processing.

--Michael Kay on the xml-dev mailing list, Tuesday, 11 Mar 2008 19:07:22

Tuesday, April 29, 2008

Google isn't really interested in the current enterprise shops - as Scoble says, they are way, way too deep into legacy to flip. They'll be happy to make inroads, but that's not really where they intend to win. Where they intend to win is with newer shops - the places that will be "the enterprise" in 10-20 years. Those shops are small now, and way, way easier to convince that they don't need the weight of the full set of tools from Microsoft, IBM, SAP (etc). They'll happily go along with Google Apps, SalesForce, Amazon EC2... and not look back.

--James Robertson
Read the rest in The Game is Different

Monday, April 28, 2008

GET almost always has side-effects, if only writing a line into the httpd log file. The correct characterization is that using GET is "safe" in the sense that it does not incur any obligations on the part of the agent issuing it.

--Tim Bray on the Atom-Protocol mailing list, Friday, 16 Mar 2007 10:50:24

Friday, April 25, 2008

1. ISPs should be clearer in their marketing (Ofcom can help them). Unlimited broadband should mean unlimited.

2. There should be industry agreement on what you buy is what you get: for a start, an 8Mb/s tariff should deliver "at least" 8Mb/s, not "up to".

--Ashley Highfield
Read the rest in BBC Internet Blog

Thursday, April 24, 2008

AOL really wasted an opportunity when they bought Netscape. They didn’t do enough to leverage the site’s traffic and brand. They basically sat on it. It was a strategic blunder.

--Greg Sterling
Read the rest in Macworld: News: Can AOL keep Netscape.com from fading away?

Tuesday, April 15, 2008

Let's revisit the spectre haunting venture capital. Why aren't there more Googles?

The answer's very simple. Because every company that had the potential to be economically revolutionary over the last five years sold out long before it ever had the chance to revolutionize anything economically.

Think about that for a second. Every single one: Myspace, Skype, Last.fm, del.icio.us, Right Media, the works. All sold out to behemoths who are destroying, with Kafkaesque precision, every ounce of radical innovation within them.

--Umair Haque
Read the rest in How to Fix Venture Capital

Monday, April 14, 2008

it is as old as plato. rdf is just the latest incarnation of the belief that there is a world of ideas, and if we just try hard enough, we will be able to formalize it, and then anything can be formalized based on these formalized ideas. this has been tried numerous times, and rdf just put pointy brackets around that.

--Erik Wilde on the uri-request mailing list, Monday, 10 Dec 2007 14:29:52

Friday, April 11, 2008

Reading an XML configuration file, everyone seems to jump on Commons Digester, which I assume you'd only ever use 10% of the API. The result is one huge dependency to do one small task. The whole thing can be done with some clever DOM coding; it's a case of taking the time.

--Jason Bell
Read the rest in A Christmas Wish List
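
A minimal sketch of the "some DOM coding" approach to a configuration file. The quote is about Java and Commons Digester; this example uses Python's standard `xml.dom.minidom` instead, and the configuration layout (`server`, `feature`) and the function name `load_config` are invented for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical configuration document, made up for this sketch.
CONFIG = """<config>
  <server host="localhost" port="8080"/>
  <feature name="cache" enabled="true"/>
  <feature name="audit" enabled="false"/>
</config>"""


def load_config(xml_text):
    """Read the configuration with plain DOM calls, no mapping library."""
    doc = parseString(xml_text)
    server = doc.getElementsByTagName("server")[0]
    # Map each feature name to a boolean flag.
    features = {
        node.getAttribute("name"): node.getAttribute("enabled") == "true"
        for node in doc.getElementsByTagName("feature")
    }
    return {
        "host": server.getAttribute("host"),
        "port": int(server.getAttribute("port")),
        "features": features,
    }
```

The whole reader is a dozen lines with no third-party dependency, which is the trade the quote argues for. A larger or more irregular format might still justify a mapping library.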

Wednesday, April 9, 2008

I myself feel that it is very important that my ISP supplies internet to my house like the water company supplies water to my house. It supplies connectivity with no strings attached. My ISP doesn't control which websites I go to, it doesn't monitor which websites I go to.

--Tim Berners-Lee
Read the rest in BBC NEWS | Technology | Web creator rejects net tracking

Tuesday, April 8, 2008

Newspapers aren't dead yet, though they are sadly dying by the month. According to the New Yorker piece, American newspapers have lost 42% of their market value over the last 3 years alone.

The problem is that classified ads are one of the few forms of advertising that work better online than in print, and they used to be a huge part of newspaper revenues (33% of all newspaper ad revenues in 1985, the year the Web took off). Display ads work much better in print (since they hardly work at all on the Web due to banner blindness), but display ads plus subscriptions are insufficient to fund a newspaper.

Furthermore, there's a tendency for people to get more of their information online than from print, and since the value of a website reader is vastly less than the value of a print reader, this compounds the problem.

--Jakob Nielsen on the "Alertbox Announcement List", Monday, 31 Mar 2008 07:00:00

Monday, April 7, 2008

A lot of sites which claim to offer unvarnished customer reviews operate the same way. These sites aren't Consumer Reports. Their business model is selling products, services and advertising, the same as the businesses they allegedly watchdog. Worse, you ain't their client. The hotels and their referral fees are what's keeping them in the black. While they might allow benign criticism like "slow room service" I doubt they would want to alienate a client with a truly scathing review.

--Steve Manes on the WWWAC mailing list, Monday, 24 Mar 2008 10:51:20

Sunday, April 6, 2008

Governments have to start asking themselves what the ISO seal of approval really means. As demonstrated by the MPEG standards, it never meant that something qualifies as a meaningful "Open Standard." Now it seems that ISO could be the wrong forum for standards development in information technology in general. It seems to work too slowly or too poorly to make the ISO brand meaningful in the IT world. We'll have to see whether ISO can repair its own processes enough to become a meaningful participant.

--Jonas Öberg, Free Software Foundation Europe
Read the rest in Groklaw

Saturday, April 5, 2008

You'd think that sooner or later even Microsoft would learn there's a right way and a wrong way to do things. But whenever it has a choice it persists in choosing the wrong way. There are lots of smart people who work there. Didn't even one of them think bribery and corruption possibly weren't the best way to get the OOXML document format declared an international standard?

--David Dejean
Read the rest in Microsoft wins this OOXML battle, but loses the war | Computerworld Blogs

Thursday, April 4, 2008

XML is a hierarchic data model but the world is a network. There are lots of possible ways to fill in the missing links [sic], but none of them feels very satisfactory (for example, many of them only work for intra-document relationships).

I'm not even comfortable that the hierarchic relationships should be special. Why can't we have multiple hierarchic views of the same network? Why do all my queries have to change depending on whether my footnotes are inline, out-of-line referenced by IDREFs, or in external documents referenced by URI? What happened to the old doctrine of data independence?

--Michael Kay on the xml-dev mailing list, Friday, 22 Oct 2004

Wednesday, April 2, 2008

Today, the standards-based multi-vendor platform implemented by Gecko, WebKit and Opera not only works for Web users but is the preferred platform for developers. IE is a legacy platform that gets the backport. When Microsoft tries to bring the next IE closer to the standards-based multi-vendor platform, the breakage isn’t caused by defects in the standards-based multi-vendor platform—after all, Gecko, WebKit and Opera work with real sites and Web apps. The problem is that sniffed IEness is tainted by the old IE platform.

--Henri Sivonen
Read the rest in Reality Distortion Fields

Tuesday, April 1, 2008

A few years ago, some of the April Fool's day posts were inventive enough to be funny. Now? It just all seems so forced.

--James Robertson
Read the rest in April Fool's Day

Monday, March 31, 2008

In 1999, I wrote an article "Metcalfe's Law in Reverse" about the problems of so-called walled gardens, where a service cuts itself off from the Internet and tries to add value by being closed.

> http://www.useit.com/alertbox/990725.html

Facebook and the current generation of social networks are trying to replicate the walled garden strategy that failed ten years ago. It'll fail again.

--Jakob Nielsen on the "Alertbox Announcement List", Monday, 31 Mar 2008 07:00:00

Friday, March 28, 2008

Google Adsense: funding Slashdot trolls since 2003.

--Mark Pilgrim
Read the rest in Translation From MS

Thursday, March 27, 2008

Counter-intuitively, Postel's robustness principle ("be conservative in what you send, liberal in what you accept") often leads to deployment problems. Why? When a new implementation is initially fielded, it is likely that it will encounter only a subset of existing implementations. If those implementations follow the robustness principle, then errors in the new implementation will likely go undetected. The new implementation then sees some, but not widespread deployment. This process repeats for several new implementations. Eventually, the not-quite-correct implementations run into other implementations that are less liberal than the initial set of implementations. The reader should be able to figure out what happens next.

--Marshall Rose
Read the rest in RFC 3117

Tuesday, March 25, 2008

The precise problem here is that you’re pretending that there’s one standard, but since nobody has a way to test against the standard, it’s not a real standard: it’s a platonic ideal and a set of misinterpretations, and therefore the standard is not serving the desired goal of reducing the test matrix in a MANY-MANY market.

--Joel Spolsky
Read the rest in Martian Headsets

Monday, March 24, 2008

Don't kid yourself into thinking that anyone other than experts will ever be interested in managing metadata. The 303 and Link solutions are for Semantic Web experts to care about, content management software to implement, and "online librarians" to manipulate. Normal folks who just want a page on the Web do not need any of this nonsense -- they don't care if links are ambiguous and never will.

--Roy T. Fielding on the www-tag mailing list, Sunday, 20 Mar 2008 18:20:54

Saturday, March 22, 2008

Content stored on a DirecTV DVR is locked down tighter than a SuperMax prison after visiting hours, so piracy doesn't even enter the equation here. What the move is intended to do is drive additional rentals and protect DVD and Blu-ray sales. By limiting the PPV window to 24 hours, DirecTV and the studios can prevent customers from indefinitely storing movies for repeated viewing. So if you want to watch that Hollywood blockbuster after the 24-hour PPV window expires, you'll have to reorder it, head down to the video rental shop (brick and mortar or online), or buy physical media. If you want to watch it twice, you'll have to pay twice.

Or, you'll do what millions already do, and go straight to BitTorrent.

--Eric Bangeman
Read the rest in DirecTV DVR clampdown: a sober reminder of DRM suckitude

Friday, March 21, 2008

DRM is dead, even if a lot of broadcasters don't realize it. If it's bad for the consumers, it's bad for the company.

--Guinevere Orvis, CBC producer
Read the rest in Canadian public TV to try out BitTorrent | Surveillance State

Wednesday, March 19, 2008

the writing is on the wall: the days of private proprietary formats, especially binary formats, are numbered and perhaps have already expired.

--Rick Jelliffe
Read the rest in O'Reilly XML Blog

Tuesday, March 18, 2008

Imagine that everything you are typing is being read by the person you are applying to for your first job. Imagine that it's all going to be seen by your parents and your grandparents and your grandchildren as well.

--Tim Berners-Lee
Read the rest in BBC NEWS | Technology | Web creator rejects net tracking

Monday, March 17, 2008

It’s a pity; when XSD came out people thought that since it came from the W3C, same as XML, it must be the way to go, and it got baked into a bunch of other technology before anyone really had a chance to think it over. So now lots of people say “Well, yeah, it sucks, but we’re stuck with it.” Wrong! The time has come to declare it a worthy but failed experiment, tear down the shaky towers with XSD in their foundation, and start using RELAX for all significant XML work.

--Tim Bray
Read the rest in ongoing · Choose RELAX Now

Sunday, March 16, 2008

police departments should be required to post all citizen complaints against individual officers online in a searchable database. Individual officers, their union reps, or their departments could post responses or explanations to frivolous claims. Police officers are public servants. Not only that, they're public servants with the power to arrest, detain, and use lethal force. If certain officers are the subject of repeated complaints and aren't being properly investigated internally, the public ought to be informed of that. This culture of secrecy—and of intimidating anyone who dares question it—isn't healthy.

--Radley Balko
Read the rest in Reason Magazine

Saturday, March 15, 2008

DTDs are not only needed for validation, but also for attribute value defaulting and general entity definitions. That is, unless document is declared to be standalone ('standalone="yes"' in xml declaration); in which case it is not necessary to load dtd except for validation purposes.

--Tatu Saloranta on the xalan-j-users mailing list, Tuesday, 11 Mar 2008 12:20:28 -0700
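
To make the point about attribute defaulting and general entities concrete, here is a small Python sketch. It uses an internal DTD subset, which the standard library's expat-based parser reads; the external-DTD loading that the quote discusses is a separate matter and is not shown. The document, the attribute `priority`, and the entity `co` are all invented for the example.

```python
import xml.etree.ElementTree as ET

# The DTD here does no validation work; it only supplies a default
# attribute value and a general entity definition.
DOC = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ATTLIST memo priority CDATA "normal">
  <!ENTITY co "Example Corp">
]>
<memo>From &co;</memo>"""

memo = ET.fromstring(DOC)

# The start tag carries no attributes, yet the parser reports the default.
priority = memo.get("priority")

# The entity reference is replaced with its declared text.
body = memo.text
```

Neither value appears literally in the `memo` element itself. Both come from the declarations, so a parser that skipped them would see a different document.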

Thursday, March 13, 2008

I've never read a satisfying, compelling paper that explains how we are to differentiate between "content" and "markup" in an arbitrary context (it seems to depend upon perspective -- as similarly for "data" vs "metadata").

--Robin Cover on the xml-dev mailing list, Tuesday, 11 Mar 2008 14:46:46 -0400

Wednesday, March 12, 2008

I have become a fervent supporter of XOM and wish I had discovered it earlier. W3C DOM has wasted at least a year of my life. Thank you to Elliotte for speaking out against it - I thought it was me that was incompetent!

--Peter Murray-Rust on the xml-dev mailing list, Sunday, 29 Sep 2005 18:11:14

Tuesday, March 11, 2008

A hallmark of what should be seen as a reasonable business practice is certainly whether or not the people engaging in that practice are willing to describe it publicly.

--FCC Chairman Kevin Martin
Read the rest in FCC hints at taking action against Comcast | Tech news blog

Thursday, March 6, 2008

Microsoft says many things and they may even mean them, but the fact that the person who says them means them, does that mean anything in the larger Microsoft picture? I don’t know. That seems to always be a bit unclear.

I think there are people inside Microsoft who really want to improve interoperability and I also think there are people inside Microsoft who would much rather just try to stab their competition in the back.

I think the latter class of people have usually been the one who won out in the end, but – so I wouldn’t exactly trust them. But I think they’re sincere; it’s just that a part of them, at least, is sincere.

--Linus Torvalds
Read the rest in Linus Torvalds

Wednesday, March 5, 2008

using a database that does relational data as well as XML has a big downside in terms of complexity. There's a lot of functionality there that you don't need, and you can't ignore it - it keeps popping up in all sorts of places, and it colours the whole flavour of the product. Also, if you're in the position where you need to convince and re-educate your developers to do everything the "pure XML" way, it's a good idea to use technology that doesn't give them any choice.

--Michael Kay on the talk@xquery.com mailing list, Saturday, 15 Dec 2007 18:01:31

Saturday, March 1, 2008

namespaces are one of the arguments people keep pushing when they want to reject XML (as whole or not). That was the first argument Dave Winer threw on RSS 1.0 back in 2000 and last year at XTech 2007, this was again one of the main arguments the WHATWG threw against XHTML 2.0. And people keep doing that because they've noticed that we can't seriously deny that XML namespaces are insane and because we've written it many times in the past

--Eric van der Vlist on the xml-dev mailing list, Tuesday, 12 Feb 2008 20:31:19

Thursday, February 28, 2008

I went to "Office Live" thinking it would be an office suite, but once again, it's something I don't quite understand--maybe it's web marketing. It's not clear. Oh, I can create a web site and buy ads. How, in this or any universe, should this be called "Office" when Microsoft Office has always been productivity tools like a word processor and spread sheet?

Even paying people they are going to have to stop being so freaking clueless when it comes to naming. You can't take a brand name like "Office" (so original in the first place) and suddenly make it about domain names and ads. It makes absolutely no sense and it undermines your brand.

What are they smoking in Redmond? Maybe it's just Ballmer's after shave which is making them too dizzy to think clearly.

--Daniel Will-Harris
Read the rest in Frickingenius: How Microsoft should spend their $44 billion

Wednesday, February 27, 2008

the really big question: Microsoft already has an Internet division: MSN. But they haven't been able to make that go anywhere. Yahoo made a good start at the beginning of the Web revolution, but then got stuck and faltered. Why on earth would combining the talents of Microsoft MSN and Yahoo produce anything other than more stumbling around in the dark? Which company would lead this recovery? At least Yahoo managed to make a strong start on the Web before driving into the weeds. But no, it would be the company that takes over, Microsoft, who brought us both MSN and Windows Vista.

How does combining a failing company (Yahoo) and a failed division of Microsoft (MSN) produce a successful division of Microsoft?

The one thing that the two companies have in common is the very thing that will make them both fail at this Web thing. They both believe that it's about centralization. The companies are built around that philosophy. Yahoo tried to create a central place where people would go on the Internet, which worked in the early wild-west days of the web -- Yahoo was the best thing out there for awhile -- and it's what caused Yahoo to slide as the Web became more decentralized. Yahoo's centralized services are no longer as good as their decentralized competitors. Now we have BBC for news, YouTube for video, etc. And Microsoft, of course, is the "Cathedral" in The Cathedral and the Bazaar.

--Bruce Eckel
Read the rest in Should Microsoft Buy Yahoo?

Saturday, February 23, 2008

While FedEx and UPS have a bit more latency than a TCP/IP connection, you simply can't beat the bandwidth of sticking a hard drive into a box and having it show up somewhere else the next day with all of its data.

--James Duncan Davidson
Read the rest in The Economics of Online Backup

Friday, February 22, 2008

The immense popularity of sites like YouTube has unexpectedly turned Flash Video (FLV) into one of the de facto standards for Internet video. The proliferation of sites using FLV has been a boon for remix culture, as creators made their own versions of posted videos. And thus far there has been no widespread DRM standard for Flash or Flash Video formats; indeed, most sites that use these formats simply serve standalone, unencrypted files via ordinary web servers.

Now Adobe, which controls Flash and Flash Video, is trying to change that with the introduction of DRM restrictions in version 9 of its Flash Player and version 3 of its Flash Media Server software. Instead of an ordinary web download, these programs can use a proprietary, secret Adobe protocol to talk to each other, encrypting the communication and locking out non-Adobe software players and video tools. We imagine that Adobe has no illusions that this will stop copyright infringement -- any more than dozens of other DRM systems have done so -- but the introduction of encryption does give Adobe and its customers a powerful new legal weapon against competitors and ordinary users through the Digital Millennium Copyright Act (DMCA).

--Seth Schoen
Read the rest in Adobe Pushes DRM for Flash | Electronic Frontier Foundation

Thursday, February 21, 2008

If you're arguing against xsi:type, I can certainly understand that. It's the wrong thing in >90% of use cases (probably way more, but I'm just guessing and trying to be conservative), and it makes a mess of instance documents. It was introduced because some members of the Schema WG were very anxious to handle serialization of object graphs, I.e. to do what Java Object Serialization (used by RMI) does. SOAP Encoding is an example of such XML-based object serialization, and it uses xsi:type. Indeed, because of dynamic subtyping, it's generally the case that the serialization formats for such systems must carry for each item both its name and its type. Go read through the format spec. for Java Object Serialization and you'll see that's the core of it. In XML with schema, you can use the tag for the name, and xsi:type for the type.

--Noah Mendelsohn on the xml-dev mailing list, Wednesday, 20 Feb 2008 23:39:20
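
A short sketch of "the tag for the name, and xsi:type for the type". The instance document and the helper `name_and_type` are hypothetical. The code only reads the `xsi:type` attribute as a string with Python's standard `xml.etree.ElementTree`; it performs no schema validation and does not resolve the `xs:` prefix.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes (xsi:type and friends).
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document in the serialized-object style.
DOC = """<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <quantity xsi:type="xs:int">3</quantity>
  <note xsi:type="xs:string">gift wrap</note>
</order>"""


def name_and_type(xml_text):
    """Return (name, declared type, text) for each child element."""
    root = ET.fromstring(xml_text)
    return [
        (child.tag, child.get(f"{{{XSI}}}type"), child.text)
        for child in root
    ]
```

Each item carries both its name and its type, as the quote describes, at the cost of an extra namespaced attribute on every element.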

Wednesday, February 20, 2008

XML is good for reuse. It lets you reuse all your old SGML presentations.

--Eve Maler
Read the rest in Pushing String » XML at X; film at XI

Tuesday, February 19, 2008

Now it is true that countries have their own histories, peculiar circumstances and cultures and that too much preaching can smack of the old missionary zeal to assume that all the things we cherish at home should be universally adopted. But culture, in the sense of custom and tradition, is often nothing but an excuse for political arrangements. Democrats from countries such as China, Pakistan or Burma do not accuse the West of imposing its values. Only authoritarians do.

A few decades ago, it was fashionable, especially in Singapore and Malaysia, to talk about 'Asian values'. Obedience to authority, sacrificing self-interest to what governments defined as national interests, accepting curbs on free speech, all these things were claimed to be specifically Asian, part of ancient traditions, something all Asians had in their cultural DNA. In fact, it was a justification of authoritarian politics inherited by the likes of Prime Ministers Mahathir and Lee Kuan Yew from the British empire.

Even as the Asian values were being touted, South Koreans, Taiwanese, Thais, Chinese and Filipinos were demonstrating in huge numbers against their authoritarian rulers. In South Korea, Taiwan and, more fitfully, Thailand and the Philippines, they succeeded. And what was it that the Burmese were risking their lives for recently, not to mention Kenyans, Zimbabweans, and many others, if not more of what Musharraf dismisses as European cultural impositions? What they want is not our culture, but the kind of freedoms that we take for granted.

--Ian Buruma
Read the rest in Guardian Unlimited | Comment is free | Culture is no excuse for China denying its people democracy

Monday, February 18, 2008

One of the many troubles with XSD-style schema systems is that they encourage a big bang mentality, where you have to try to make all sorts of decisions about structure and type that are unnecessary or which can be left to emerge, while making it difficult to add constraints based on feedback from processes about what the real problems might be.

--Rick Jelliffe on the xml-dev mailing list, Tuesday, 29 Jan 2008 10:07:58

Saturday, February 16, 2008

A web service which requires me to use XSLT is using standards that anyone can look up and implement (if needed), whereas a service feeding me JSON is requiring me to opt-in to a code library to understand how objects are converted into markup, and understand the client-by-client implementation differences. Such a service would have a much steeper learning curve and greater maintenance hassles, than a system which accomplishes the same thing using Atom + XSLT.

--Eric J. Bowman on the rest-discuss mailing list, Saturday, 29 Dec 2007 12:04:33

Wednesday, February 13, 2008

When people say "XML is hard", they usually do not mean "XML 1.0 is hard" but "XML 1.0 + namespaces in XML + XPath + DOM + XSLT + W3C XML Schema + XML Base + xml:id + XInclude + XPointer + ... is hard" and the proportion of criticism that goes to XML 1.0 itself is usually pretty low.

--Eric van der Vlist on the xml-dev mailing list, Tuesday, 12 Feb 2008 08:28:05

Tuesday, February 12, 2008

When we think of phishers, we often guess that they are a group of highly skilled ninja hackers. They have collectively caused billions of dollars in losses, and ruined the lives of many citizens whose identities they have stolen and abused. These people have got to be pretty smart, right? Wrong. Just think about what a typical phisher is really doing: installing pre-coded websites on compromised servers - that hardly takes any skill and it shouldn't impress anyone. Then you have cases where phishers steal information from other phishers by planting backdoors in the phishing kits in the form of elementary obfuscation of scripting code. In other words, phishers are not able to pull off their attacks because they are highly skilled, but because they are abusing a few fundamental flaws such as lack of awareness, lack of standards around browser UI that clearly highlights high assurance websites, and our dependence on static identifiers such as SSN, Credit Card numbers, etc to establish identities and commit financial transactions.

--Nitesh Dhanjani
Read the rest in Interview with Nitesh Dhanjani and Billy Rios, Spies in the Phishing Underground

Sunday, February 10, 2008

we screwed up in letting attribute normalization into XML. It still boggles my mind in retrospect that during the discussions back in 96-97, nobody piped up to say "why are you morons doing this?" Because we probably would have said "D'oh, right, lose it." Sigh.

--Tim Bray on the xml-dev mailing list, Friday, 20 Jun 2003
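
For readers who have not run into attribute normalization, the following small Python sketch shows the behavior being regretted above. It assumes the standard library's expat-backed `xml.etree.ElementTree` parser; the `item` element and `note` attribute are made up for the example. A literal newline or tab inside an attribute value is replaced by a space, while the same characters written as character references survive.

```python
import xml.etree.ElementTree as ET

# Literal newline and tab inside the attribute value (note the Python escapes
# produce real whitespace characters in the XML text).
literal = ET.fromstring('<item note="line one\nline two\tend"/>')

# The same characters written as XML character references.
escaped = ET.fromstring('<item note="line one&#10;line two&#9;end"/>')

literal_note = literal.get("note")
escaped_note = escaped.get("note")

print(repr(literal_note))
print(repr(escaped_note))
```

The practical consequence is that multi-line text cannot round-trip through an attribute unless the writer escapes the line breaks.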

Saturday, February 9, 2008

One of the problems with schemas, in fact, is that they encourage tag abuse, because they limit the freedom of users to invent new element names, encouraging them instead to overload the semantics of existing names. The tighter the schema is defined, the more creative users have to become in getting around them; in the end, they resort to comment-abuse, or even as I've seen in one case, using a namespace declaration as a boolean flag whose presence or absence conveyed information to the recipient.

--Michael Kay on the xml-dev mailing list, Sunday, 13 Dec 2007 00:16:06

Friday, February 8, 2008

In an AJAX application, you explicitly sit on the page load tag to grab an XML (or related) resource from the server, creating an asynchronous call that will update an explicitly defined model, then write more code that will walk through that XML model and either pass it to an XSLT transformation or run it through more DOM manipulation code in order to explicitly populate the fields in question in the web page (any more explicit and it would require an X rating). You can cut this code down considerably with some carefully coded libraries, but these libraries add to the overhead of both loading and maintaining the page, sometimes significantly, and if they are not well written can also add a considerable burden to the debugging department.

On the other hand, an XForms like architecture defines XML data models, constraints, and bindings explicitly as rules given within a declarative XML format. This declarative form works in a manner similar to that of an Excel spreadsheet. In Excel, for instance, you define both blocks that hold initial data then you have functional blocks that perform calculations either upon the initial data or a previously defined functional block. What you end up with in this kind of declarative “program” is an application where the “answer” may in fact be a lot of different pieces of information, where the only action involves the changing of one or more initial values, and where constraints will at a minimum prevent you from entering erroneous data.

--Kurt Cagle
Read the rest in Why XForms Matter, Revisited

Thursday, February 7, 2008

Even now, as people have learned that they’re supposed to say they are not doing RPC (“look, my WSDL says doc/literal therefore I am not doing RPC”), the code is still RPC-ish with the grand-children of the body being serialized into Java (or another language) objects and passed as arguments to an operation inside a machine-generated stub.

--William Vambenepe
Read the rest in William Vambenepe’s blog » Blog Archive » Gutting the SOAP processing model

Saturday, February 2, 2008

RDF is still only used on the periphery of the web and certainly not as part of the mainstream web. And IMO, RDF will probably never make the mainstream because it requires people to be too concise, and people in general are not good at being concise (witness the percentage of HTML files on the web that validate...)

--Mike Schinkel on the www-tag mailing list, Tuesday, 18 Dec 2007 15:57:46

Wednesday, January 30, 2008

leaving Chicago, Homeland Security seized my toothpaste. The tube that was carefully purchased in a size that’s legal to take on planes. Except for, it wasn’t in a plastic bag. You see, 90ml of naked toothpaste is dangerous and can be used in terrorist attacks, but putting it in a plastic bag prevents this.

Can we get some intelligent behavior out of our politicians please? This moronic security theater is damaging our quality-of-life and arguably playing into the actual terrorists’ hands.

--Tim Bray
Read the rest in ongoing · Naked Terrorist Toothpaste

Monday, January 28, 2008

We are not going to win the arms race with the phishers unless we admit to the underlying problem at hand: the reliance on static identifiers to establish identities and execute financial transactions. Let me expand on this a bit. We take care not to blurt out our SSN to anyone on the street, yet it is likely to be stored on hundreds of corporate databases as we progress in our lives. We take care not to expose our Credit Card numbers, but we must hand them over to people we don’t know at retail stores if we want to use it. We aren’t going to solve the problem of online PII (Personally Identifiable Information) and identity theft just by writing even more secure code (although it certainly helps), or by continuing to play whack-a-mole with phishers. The system of relying on static identifiers to commit financial transactions needs to be rethought.

Commercial financial institutions such as credit card companies and banks realize that the cost of implementing a new system that does not merely rely on static identifiers is higher than the fraud committed, so they decide to accept the cost. This is the reason why the system has not changed. Unfortunately, financial institutions only take into account their cost when making this decision, but it also ends up affecting the lives of millions of people who have to pay with their identities when such fraud is committed.

--Nitesh Dhanjani
Read the rest in Interview with Nitesh Dhanjani and Billy Rios, Spies in the Phishing Underground

Friday, January 25, 2008

the quality of XSLT code on my project is very poor at the moment, because I employed two coders for six months who just couldn't grasp functional programming. So I fired them and took on a new partner to re-code the bulk of the project, but he had to learn XSLT first. This only took him one week, as he's solidly grounded in functional programming.

--Eric J. Bowman on the rest-discuss mailing list, Sunday, 30 Dec 2007 01:45:12

Thursday, January 24, 2008

I have forever been amused by sales pitches that a product or service will pay for itself within so many weeks, months, or years. Generally speaking, if "pay for itself" means the product or service will actually increase cash flow and sales to an extreme, then I'm in. But if "pay for itself" means an increase in productivity, then the red light on top of my BS meter immediately goes off.

This is the modern version of a snake-oil cure. Close relatives include the "lost productivity" indexes as well as "sales lost to piracy." If sales were indeed lost to piracy, then why hasn't this become a line item on the books and therefore tax deductible? Seriously, if all this intellectual property theft is theft in any conceivable way, then take the IRS deduction based on the press announcements and see how far you get.

--John C. Dvorak
Read the rest in Promises of Productivity Are Often BS

Wednesday, January 23, 2008

For certain applications — web crawling and log analysis are two that come to mind — the ability to process a huge volume of data quickly is more important than guaranteeing 100% data integrity and completeness. Relational databases dominate data management because they can make those guarantees. But those guarantees have their own costs and limitations. If I need to churn through gigabytes of web server log files to generate some numbers to make a business decision now, I don’t care if a few of the log entries are ignored; I just need to filter and categorize the data fast.

--Greg Jorgensen
Read the rest in Typical Programmer
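
The trade-off in the quote above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log lines are assumed to look roughly like Common Log Format, and `count_statuses` and the sample data are invented for the example. Lines that do not match are counted and skipped instead of stopping the run.

```python
import re
from collections import Counter

# Assumed layout: host ident user [timestamp] "request" status size
LINE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) \S+')

def count_statuses(lines):
    """Tally HTTP status codes, ignoring any line that does not parse."""
    counts = Counter()
    skipped = 0
    for line in lines:
        match = LINE.match(line)
        if match is None:
            skipped += 1  # tolerate bad input; speed matters more than completeness
            continue
        counts[match.group(1)] += 1
    return counts, skipped

sample = [
    '127.0.0.1 - - [23/Jan/2008:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    'garbage line',
    '127.0.0.1 - - [23/Jan/2008:10:00:01 +0000] "GET /x HTTP/1.1" 404 0',
    '127.0.0.1 - - [23/Jan/2008:10:00:02 +0000] "GET / HTTP/1.1" 200 512',
]
counts, skipped = count_statuses(sample)
print(dict(counts), "skipped:", skipped)
```

A relational load of the same data would typically reject or halt on the malformed row; here it simply shows up in the `skipped` count.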

Tuesday, January 22, 2008

The world of email deliverability has changed a lot in recent years. Originally, content was king. As long as you weren’t blacklisted and your email content didn’t include lots of “spammy” words, then you had a good chance of getting delivered. Today however, ISP’s and spam filtering technology has got a whole lot smarter and more aggressive.

Who is sending the email is becoming more important than what the email says. ISPs are doing this by watching their customers reassign habits and use of the “Mark as spam” button, and then tying that back to the sending domain and IP address. If a subscriber is regularly opening your email then you should be in the clear, but if a decent number rarely open your email and mark it as spam then you might not make the inbox much longer. Known as your sender reputation, it’s the most important factor in getting your emails delivered today. Just like the real world, having a good reputation ensures trust, and if ISP’s don’t trust you, you’re in trouble.

This introduces a new challenge to anyone sending reasonable amounts of email. No longer is getting permission enough to assume your email will be delivered. If it’s not relevant and exactly what the subscriber asked for, you run the risk of them marking your email as spam. It’s a delicate but extremely democratic system; your ability to deliver email truly is in your subscriber’s hands, so treat them with the respect they deserve.

--David Greiner
Read the rest in Vitamin Features » Ensuring your HTML emails look great and get delivered

Monday, January 21, 2008

It’s a lot of work getting your head around some of these deeper, darker parts of QuickTime. Not only are they harder to find documentation and sample code for, but they often lack convenient all-in-one API calls, requiring you instead to build up structures of QuickTime “atoms” by yourself.

--Chris Adamson
Read the rest in Rebooting Java Media, Act II: Development

Sunday, January 20, 2008

namespaces as implemented in XML are a pox on the web and software/development in general.

--Mike Schinkel on the rest-discuss mailing list, Friday, 28 Dec 2007 19:31:16

Saturday, January 19, 2008

As a word processor, little has changed in Word 2008. The elimination of VBA and weak support for AppleScript and Automator make the program far less versatile and valuable for users who really need automation. But the program’s streamlined interface and the addition of the new Publishing Layout view make it a compelling choice, especially if you want to stretch Word beyond its current capabilities.

--Jeffery Battersby
Read the rest in Macworld | Microsoft Word 2008

Friday, January 18, 2008

It's been my experience that if you follow the successful practices of the Web, it ends up working quite well internally, if you follow practices only intended to work internally, they don't always work internally.

--Assaf Arkin on the rest-discuss mailing list, Sunday, 17 Jan 2008 19:17:06

Thursday, January 17, 2008

1. Validating data is different from processing data.

2. Just because an application can validate some data doesn't mean it can process the data.

2.1 Just because an application can process some data that it validated doesn't mean that *any* data it validates can be processed.

--Roger L. Costello, on the xml-dev mailing list, Friday, 28 Dec 2007 09:02:51
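
The three points above can be made concrete with a small Python sketch. The validation rule, the `stock` limit, and both function names are hypothetical, chosen only to show data that passes validation yet cannot be processed.

```python
def is_valid(record):
    """Hypothetical validation rule: a dict with a non-negative integer 'qty'."""
    qty = record.get("qty") if isinstance(record, dict) else None
    return isinstance(qty, int) and not isinstance(qty, bool) and qty >= 0

def process(record, stock=10):
    """Hypothetical processing step: it can only ship what is in stock."""
    if record["qty"] > stock:
        raise ValueError("cannot process: quantity exceeds stock")
    return stock - record["qty"]

small = {"qty": 3}
large = {"qty": 500}

# Both records validate, but only one of them can actually be processed.
print(is_valid(small), is_valid(large))
print(process(small))
```

Calling `process(large)` raises `ValueError` even though `is_valid(large)` is true, which is point 2.1 in miniature.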

Wednesday, January 16, 2008

At a certain point, referential integrity has to be given up. Scale a bit further, and even the notion of a relation in the relational database sense of the word starts to break down. To cope, you denormalize a bit, not so much for performance reasons (though that’s important too), but as a self defense mechanism so that the pieces of data that you do have have enough context to be meaningful.

--Sam Ruby
Read the rest in Sam Ruby: Dare Takes a Look at CouchDB

Sunday, January 13, 2008

the coverage focuses on the horse-race and the theater of the campaign and not on the issues. But I don’t really think that’s the fault of reporters. First, they have to file constantly on short deadlines. So even if Obama releases a tax plan one day, and you write a piece about that, that’s still only a one-day story. What do you write about the next day? Why, Obama sniping with Hillary or somesuch. Second of all, consider the imbalance in expertise between a campaign and those who cover it. When Obama releases a tax plan, it’s a product of a team of policy experts, who know the terrain inside and out. But the reporter who has to file the deadline piece about it doesn’t have any expertise on tax policy. So how could their coverage be anything but shallow?

--Christopher Hayes
Read the rest in Is Good Campaign Coverage Possible?

Saturday, January 12, 2008

Decriminalizing all non-commercial file sharing and forcing the market to adapt is not just the best solution. It’s the only solution, unless we want an ever more extensive control of what citizens do on the Internet. Politicians who play for the antipiracy team should be aware that they have allied themselves with a special interest that is never satisfied and that will always demand that we take additional steps toward the ultimate control state. Today they want to transform the Internet Service Providers into an online police force, and the Antipiracy Bureau wants the authority for themselves to extract the identities of file sharers. Then they can drag the 15-year-old girl who downloaded a Britney Spears song to civil court and sue her.

Will the Antipiracy Bureau be satisfied with this? Probably not, because even the harsher laws now proposed will not stop the file sharing. Already there are anonymization services on the market that make the new laws ineffective. For this reason, the Antipiracy Bureau will demand new tools that further intensify the surveillance of the Internet. The simple truth is that almost all communication channels on the Internet can be used to distribute copyrighted information. If you can use a service to send a message you can most likely use the same service to send an mp3-song. Those who want to prevent people from exchanging copyrighted material must control all electronic communication between citizens.

--Karl Sigfrid
Read the rest in Decriminalize File Sharing « Karl Sigfrid

Friday, January 11, 2008

SQL is great when you have highly structured data. The problem is much of the data we generate day to day isn't easily extractable into carefully planned schemas and are challenging to represent and query in a SQL database. That means lots of useful data that could be stored and queried ends up unused or lost because we don't have the time and resources to build schemas to store them.

--Damien Katz
Read the rest in Damien Katz: System overload

Thursday, January 10, 2008

JSON makes very good use of Javascript’s literal object notation. But it’s a consequence of this fact that a JSON message can conveniently be processed by reading it into a variable and then running eval on the variable. (This is where we came in.) The moment you do this, of course, you expose your code to a Javascript injection attack.

To say “You don’t have to use eval — JSON has a very simple syntax and you can parse it yourself, or use an off the shelf parser, and in so doing protect yourself against the security issue,” seems to ignore an important fact about notations: they make some things easier and (necessarily) some things harder. They don’t force you to do things the easy way; they don’t prevent you from doing them the hard way. They don’t have to. The gentle pressure of the notation can be enough. It’s like gravity: it never lets up.

If the notation makes a dangerous or dirty practice easy, then the systems built with it will be spotlessly clean if the users have the self-discipline to keep it clean. For most of us, that means: not very clean.

--Michael Sperberg-McQueen
Read the rest in Messages in a bottle » Blog Archive » Gravity never sleeps (notations that use eval)
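
The quote is about JavaScript, but the same pull exists wherever a data notation overlaps with a language's literal syntax. The sketch below is a Python analogue under that assumption, not the JavaScript case itself: `parse_untrusted` is a hypothetical helper, and the hostile payload is a harmless stand-in for injected code. A real parser rejects the payload, whereas passing the same text to `eval` would run the embedded expression.

```python
import json

def parse_untrusted(text):
    """Parse JSON with a real parser; never hand untrusted text to eval()."""
    return json.loads(text)

ok = parse_untrusted('{"user": "alice", "admin": false}')

# Looks a bit like JSON, but embeds an expression. eval() would execute it;
# json.loads refuses it because it is not a JSON value.
hostile = '{"user": __import__("os").getcwd()}'
try:
    parse_untrusted(hostile)
    rejected = False
except json.JSONDecodeError:
    rejected = True

print(ok, "hostile payload rejected:", rejected)
```

The parser route costs one import and one call, yet as the quote argues, the shortcut stays tempting because the notation makes it look free.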

Wednesday, January 9, 2008

OOXML is nothing more than the preferences of Microsoft Office. Whenever Word changes, OOXML will change. And if you are a user or competitor of Word, you will be the last one to hear about these changes. ISO does not own OOXML. Ecma does not own OOXML. OOXML, in practice, is controlled and determined solely by the Office product teams at Microsoft. No one else matters.

Consider that Microsoft has recently proposed over 1,700 changes to the OOXML specification, including fixes that presumably will be made into a future Office 2007 fixpack. Microsoft knows what these fixes will be. The Office developer teams know what these fixes will be. But if you are a competitor of Microsoft's in this space, do you know what these changes are? No. Microsoft has decided to keep them a secret, claiming that the ISO process allows them to withhold interoperability information from competitors in what they maintain is an "open standard".

Further, the coding of Office 14 a.k.a. Office 2009 is well underway. Beta releases are expected in early 2008. But are file format changes needed to accommodate the new features being discussed in Ecma? No. Are they being discussed in ISO? No. Are they being discussed anywhere publicly? No.

Is this how an open standard is developed?

My prediction is that the first time anyone hears about what is in the next version of OOXML will be when Office 14 Beta 1 is announced at Microsoft's 2008 Office System Developers Conference (ODC). Other vendors will not hear a word about the format changes until after the Beta 1 is already released. Not even Ecma will hear about the changes until after the ODC.

DIS 29500 is already obsolete, has already been embraced and extended. You just don't know about it yet. You weren't meant to know. In fact, pretend you don't know. Give Microsoft a big head start. They need it.

--Rob Weir
Read the rest in An Antic Disposition: Those who forget Santayana...

Tuesday, January 8, 2008

The dominance of Windows and Office have waned. The most important personal computer application by far these days is the Internet, and there's no advantage to having Microsoft Windows when you use the Internet. The other most popular personal computer apps, web-based email and instant messaging, as well as the advent of Google Apps and other web-based Office clones, there's also a lot less reason than there used to be to use the PC platform that most applications are developed for.

In short, the network effect that supported the Windows monopoly--and bolstered the sales of all companies that hitched their wagons to this star--is disintegrating, enabling computer users to simply choose the computers that they like best.

--Henry Blodget
Read the rest in Christmas Present For Apple Bulls: Mac Cleared for Takeoff

Sunday, January 6, 2008

Many back-end developers for some reason I cannot understand seem to be completely dependent on having an IDE or a framework create all front-end code for them. It seems like HTML, CSS, and JavaScript are all way too complicated for most back-end programmers. And that is really unfortunate since the IDEs many of them use create front-end code that is nothing but pathetic.

Look, it’s very simple. You’re a programmer. You should be smart enough to learn HTML, CSS, and JavaScript. If your IDE or framework produces rubbish for front-end code, don’t let it. Twist its arm to make it produce sane markup.

--Roger Johansson
Read the rest in Lame excuses for not being a Web professional | 456 Berea Street

Saturday, January 5, 2008

HTML needs fixing. The HTML 4 recommendation was published in 1999. Since then, the web has grown from a document retrieval system into an application delivery system. We have made significant progress since then, due to the cleverness of the web development community and the surprising expressive power of JavaScript, but we are at the limits. HTML is no longer a driver of innovation. It is now a serious impediment.

There are good ideas in HTML, but many of these were discarded in the XHTML effort. My thinking is that we should take a step back and refocus. The problems with HTML will not be solved by making it bigger and more complicated. I think instead we should generalize what it does well, while excising features that are problematic. HTML can be made into a general application delivery format without disrupting its original role as a document format.

--Douglas Crockford
Read the rest in Fixing HTML

Friday, January 4, 2008

Semantic drift happens naturally in the real world, for example credit card numbers which once identified an account might start to identify a specific card with access to that account. It's not surprising that it happens, because if a system is capable of meeting new requirements without requiring any software changes then people will use it creatively in new ways to meet those requirements. One of the challenges in designing schemas (or database integrity constraints) is knowing whether you should try to resist semantic drift as a menace to information integrity, or whether you should allow your system to ride the waves, thus increasing its flexibility and longevity.

--Michael Kay on the xml-dev mailing list, Sunday, 27 Dec 2007 11:12:50

Thursday, January 3, 2008

Relational databases have failed the software industry in much the same way XML, Java and client-server failed the software industry. In other words, no failure to see here, move along. Those are all excellent technologies for solving a wide range of problems. Just that there are some problems they’re particularly poor at solving.

--Assaf Arkin
Read the rest in Labnotes » Conflicting Reads and Writes

Tuesday, January 1, 2008

If I were world dictator, I'd put a kibosh on binary XML, and I'm quite confident that the people who are pushing for it would find another solution.

--Tim Bray
Read the rest in Putting XML in the fast lane | CNET News.com

Elliotte Rusty Harold

elharo@ibiblio.org

Last Modified at Wednesday, December 31, 2008 5:45:21 AM