XML News from Wednesday, April 18, 2007

The W3C ???? Working Group has posted the first public working draft of RDFa Use Cases: Scenarios for Embedding RDF in HTML. "Current web pages, written in HTML, contain significant inherent structured data. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself, enabling structured search and sharing. RDFa is a syntax for expressing RDF structured data in HTML. This document provides use case scenarios for RDFa. An introduction to implementing RDFa is provided in the RDFa Primer, while the details of the syntax are explained in the RDFa Syntax (to be published)." Here are some of the use cases:

  1. Use Case #1 — Basic Structured Blogging: Paul maintains a blog and wishes to "mark up" his existing page with structure so that tools can pick up his blog post tags, authors, titles, and his blogroll. In particular, his HTML blog should be usable as its own structured feed.
  2. Use Case #2 — Publishing an Event - Overriding Some of the Rendered Data: Paul sometimes gives talks on various topics, and announces them on his blog. He would like to mark up these announcements with proper scheduling information, so that RDFa-enabled agents can automatically obtain the scheduling information and add it to the browsing user's calendar. Importantly, some of the rendered data might be more informal than the machine-readable data required to produce a calendar event. Also of importance: Paul may want to annotate his event with a combination of existing vocabularies and a new vocabulary of his own design.
  3. Use Case #3 — Content Management Metadata: Tod sells an HTML-based content management system, where all documents are processed and edited as HTML, sent from one editor to another, and eventually published and indexed. He would like to build up the editorial metadata within the HTML document itself, so that it is easier to manage and less likely to be lost.
  4. Use Case #4 — Self-Contained HTML Fragments: Tara runs a video sharing web site. When Paul wants to blog about a video, he can paste a fragment of HTML provided by Tara directly into his blog. The video is then available inline, in his blog, along with any licensing information (Creative Commons?) about the video.
  5. Use Case #5 — Web Clipboard: Ursula is looking for a new apartment and some items with which to furnish it. She browses various RDFa-enabled web pages, including apartment listings, furniture stores, kitchen appliances, etc. Every time she finds an item she likes, she can point to it, extract the locally-relevant structured data expressed using RDFa, and transfer it to her apartment-hunting page, where it can be organized, sorted, categorized. Any additional features of the HTML that are not structured, e.g. links to photos, are conserved by the transfer.
  6. Use Case #6 — Semantic Wiki: Tim runs an RDFa-aware Semantic Wiki, where users contribute content in Wiki markup, using a WYSIWYG tool, or using HTML+RDFa. In all cases, the semantic wiki produces HTML+RDFa, so that users like Ursula can transfer the structured content from one semantic wiki (or any other RDFa source) to another semantic wiki (or any other RDFa destination). In particular, Ursula may be pasting her apartment-and-furnishing finds into her own Semantic Wiki.
  7. Use Case #7 — Augmented Browsing for Scientists: Patrick writes a science blog where he discusses proteins, genes, and chemicals. As he has very little control over the layout (he's using a fairly constrained hosting provider), Patrick adds RDFa to indicate the scientific components he's working with. Ulrich, a scientist, can browse Patrick's site with an RDFa-aware browser and automatically cross-reference the proteins and genes that Patrick is talking about.
  8. Use Case #8 — Advanced Data Structures: Patrick keeps a list of his scientific publications on his web site. Using the BibTeX vocabulary, he would like to provide structure within this publications page so that Ulrich, who browses the web with an RDFa-aware client, can automatically extract this information and use it to cite Patrick's papers.
  9. Use Case #9 — Publishing a RDF Vocabulary: Paul wants to publish a large vocabulary in RDFS and/or OWL. Paul also wants to provide a clear, human-readable description of the same vocabulary. Using RDFa, the terms themselves can be mixed with descriptive text in HTML. The RDFa engine can then extract the vocabulary in RDF/XML and/or N3 formats, to be used directly by RDF-aware applications (e.g., reasoners).
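
To make Use Case #2 a little more concrete, here is a rough Python sketch of how an agent might pull machine-readable values out of a page while the reader sees something more informal. Everything in it is an assumption for illustration: the attribute names (about, property, content) follow later RDFa specifications and may not match this draft, the dc: and cal: prefixes and the fragment itself are made up, and the PropertyCollector class is hypothetical. It is nowhere near a conforming RDFa processor; it ignores namespaces, subjects, and nested markup.

```python
from html.parser import HTMLParser

# Hypothetical blog fragment. The reader sees "next Tuesday at 5pm";
# the content attribute carries the machine-readable value instead.
FRAGMENT = """
<div about="#talk">
  <span property="dc:title">RDFa in practice</span>,
  <span property="cal:dtstart" content="2007-05-01T17:00:00">next Tuesday at 5pm</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from flat, non-nested elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []     # collected (property, value) tuples
        self._open = None   # property whose text is being gathered, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if attributes.get("content") is not None:
            # An explicit content attribute overrides the rendered text.
            self.pairs.append((prop, attributes["content"]))
        else:
            # Otherwise the rendered text becomes the value.
            self._open = prop
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumption: property elements contain no child elements,
        # so the next end tag closes the one being collected.
        if self._open is not None:
            self.pairs.append((self._open, "".join(self._text).strip()))
            self._open = None


def extract(html):
    """Return the (property, value) pairs found in an HTML string."""
    collector = PropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.pairs


if __name__ == "__main__":
    for prop, value in extract(FRAGMENT):
        print(prop, "=", value)
```

The title is taken from the visible text, while the start time comes from the content attribute rather than from the informal wording shown to the reader. That is the "overriding some of the rendered data" part of the use case.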

Personally, I'm still skeptical of anything that involves page-author-created metadata for web pages.