This page lists some of the XML talks I've given in the past and am willing to give in the future. Talks range from half an hour to multiple days. The duration of most of these talks can be expanded or contracted as necessary to fit your needs.
If you'd like me to speak at your conference or to your company or user group, just drop an email to me at elharo@metalab.unc.edu and we'll see what we can set up. I'm based in New York, but I do travel quite a bit. I also do a lot of speaking about Java. I'm constantly developing new presentations on a variety of subjects related to XML, Java, and the Web, so if there's a particular topic you'd like to hear me pontificate about, feel free to ask.
Available presentations described in more detail below include:
These sessions all cover more or less the same material. The main difference is the amount of time they take, and thus how deeply they can go into specific topics. The first two, A Brief Intro to XML and An Overview of XML, concentrate on explaining what all the pieces of the XML puzzle are and how they fit together. These are very high-level talks that emphasize breadth over depth. After hearing one of them, I'd expect attendees to know what XML is and what it's used for, what DTDs and schemas are and what they're used for, which APIs are available for processing XML documents, and so forth. However, students would not have a great deal of facility in actually producing these things.
XML Basics is a more technical, syntax-heavy talk that focuses on teaching students how to write well-formed XML documents that use namespaces and CSS style sheets. Less emphasis is placed on more advanced topics like DTDs, schemas, and XSLT, though these can all be covered in separate sessions if desired. XML Fundamentals is an expanded (twice as long) version of XML Basics that has the time to add DTDs and XSLT. Finally, Intro to XML is a full-day, nitty-gritty technical session that has time to cover all of this in much more detail. It's also available as a two-day hands-on session with exercises.
XML, the Extensible Markup Language, has been adopted in realms as diverse as human resources, object serialization, genealogy, and music. This high-level overview shows you how you might use XML in your own areas of interest. Standards discussed include CSS, XSLT, XSL-FO, DTDs, XLinks, SAX, and DOM. It shows you brief examples of each of these and explains both their intended purpose and their practical uses so you can intelligently decide which ones you need to work with and which ones you can safely ignore. After you've attended this seminar, you'll know what all the different XML technologies and tools are, and be able to decide when they are and are not appropriate for your web sites and applications.
This talk assumes basic familiarity with HTML and the Web but no prior knowledge of XML. It's suitable for anyone who wants to know just what the XML hype is really all about including web site designers, programmers, IT Managers, and CEOs.
XML is a simple, flexible format that allows you to design markup languages to fit your documents rather than the one-size-fits-all approach of HTML. XML is suitable not only for Web pages, but also for printed documentation, application file formats, and data interchange between applications. This talk discusses what XML can do for your programs and Web sites. In it, you'll learn how to write well-formed XML documents and display them on Web sites using CSS style sheets. This talk assumes basic familiarity with HTML and the Web but no prior knowledge of XML.
XML is a simple, flexible meta-language for creating markup languages describing particular documents or domains. Fifth generation browsers like Internet Explorer 5 and Mozilla can view XML pages directly, but that's not XML's only use. XML can also be used to define formats for printed documentation, files saved by programs, and data interchange between applications. XML is already being used in areas as diverse as mathematical notation, object serialization, vector graphics, meta-information, and more.
This tutorial will show you how to use a variety of free tools to design XML-based markup languages, write XML documents, validate the documents against DTDs, format those documents with style sheets, and deliver them to end users.
The tutorial gets under way by exploring well-formed XML documents. We'll introduce elements and attributes, discuss when each is appropriate, and show how you can use them to divide data and documents into a tree of XML elements. You'll learn how style sheets allow you to separate the organization of your data from its presentation, and even provide different views of the same data or document to different readers. We'll begin with the simpler CSS style sheets, then move on to the more complex and powerful XSL language.
Then we'll introduce DTDs to impose additional constraints on the form a document takes. You'll learn how to write DTDs that declare and organize the various elements and attributes used in your document, how to plan the structure of a document in advance, and how to use entities declared in the DTD to assemble a document from multiple smaller documents.
Outline:
Prerequisites:
Students should have at least a user's understanding of the Internet,
and be comfortable writing basic HTML.
XML is a simple, flexible meta-language for creating markup languages describing particular documents or domains. Fifth generation browsers like Internet Explorer 5 and Mozilla can view XML pages directly, but that's not XML's only use. XML can also be used to define formats for printed documentation, files saved by programs, and data interchange between applications. XML is already being used in areas as diverse as mathematical notation, object serialization, vector graphics, meta-information, and more.
This tutorial will show you how to use a variety of free tools to design XML-based markup languages, write XML documents, validate the documents against DTDs, format those documents with style sheets, and deliver them to end users.
The tutorial gets under way by exploring well-formed XML documents. We'll introduce elements and attributes, discuss when each is appropriate, and show how you can use them to divide data and documents into a tree of XML elements. You'll learn how style sheets allow you to separate the organization of your data from its presentation, and even provide different views of the same data or document to different readers. We'll begin with the simpler CSS style sheets, then move on to the more complex and powerful XSL language.
Then we'll introduce DTDs to impose additional constraints on the form a document takes. You'll learn how to write DTDs that declare and organize the various elements and attributes used in your document, how to plan the structure of a document in advance, and how to use entities declared in the DTD to assemble a document from multiple smaller documents.
This tutorial assumes basic familiarity with HTML and the Web but no prior knowledge of XML. Some experience programming in Java would be helpful.
Content-wise, this is very similar to the previous Intro to XML tutorial, except that there's a large hands-on component.
XML is a simple, flexible meta-language for creating markup languages describing particular documents or domains. Fifth generation browsers like Internet Explorer 5 and Mozilla can view XML pages directly, but that's not XML's only use. XML can also be used to define formats for printed documentation, files saved by programs, and data interchange between applications. XML is already being used in areas as diverse as mathematical notation, object serialization, vector graphics, meta-information, and more.
This tutorial will show you how to use a variety of free tools to design XML-based markup languages, write XML documents, validate the documents against DTDs, format those documents with style sheets, and deliver them to end users.
The tutorial gets under way by exploring well-formed XML documents. We'll introduce elements and attributes, discuss when each is appropriate, and show how you can use them to divide data and documents into a tree of XML elements. You'll learn how style sheets allow you to separate the organization of your data from its presentation, and even provide different views of the same data or document to different readers. We'll begin with the simpler CSS style sheets, then move on to the more complex and powerful XSL language.
Then we'll introduce DTDs to impose additional constraints on the form a document takes. You'll learn how to write DTDs that declare and organize the various elements and attributes used in your document, how to plan the structure of a document in advance, and how to use entities declared in the DTD to assemble a document from multiple smaller documents. Finally, we'll see how sophisticated hypertext linking can be added to XML documents and applications using XLinks and XPointers.
This tutorial assumes basic familiarity with HTML and the Web but no prior knowledge of XML and no knowledge of any specific programming language.
Attendees must bring a laptop they're comfortable using to develop software, including their choice of text editor and Web browser. Supported operating systems include Windows 95 and later, a reasonably modern build of Linux, and MacOS X. Attendees should have the Java Development Kit 1.4 or later installed prior to class. Other necessary software will be provided on both CD-ROM and a local network server. Open source software is used wherever possible, and no payware is required. The laptop must have an Ethernet card, and be configured to support DHCP. CD-ROM and floppy drives wouldn't hurt either, but are not absolutely required. Power strips will be provided so don't forget your power cord. One or more assistants will be available to help you get connected to the network, and show you how to use the software provided in class; but in general we won't be able to troubleshoot problems related to your specific laptop (e.g. inability to get the JDK installed).
These are all seminars that cover one particular part of XML in depth. Available topics include XQuery, XOM, JDOM, DTDs, Schemas, XSLT, Namespaces, SAX, and DOM.
It is generally recognized that the standard APIs for processing XML have their drawbacks. SAX, though efficient, is hard to use, and DOM can be just plain ugly. XOM is a new XML object model developed by Elliotte Rusty Harold. It is an open source (LGPL), tree-based, streaming API for processing XML that strives for correctness and simplicity while still being able to process documents of arbitrary size and complexity. XOM implementations are available for Java and Python. In this 90-minute presentation, Elliotte will explain the philosophy behind XOM and give some demonstrations of how it boosts productivity and leads to cleaner, more understandable code for XML processing.
XML based syndication is moving from its foundations in weblogs to unexpected arenas: source code control systems, audio narrowcasts, e-mail, bug tracking, stock tickers, and more. News readers like Vienna, NetNewsWire, RSSOwl, and Newsgator are replacing classic web browsers for many uses. This session explores the fundamental technologies underlying this explosion of content: the various versions of RSS, OPML, Atom, and the Atom Publishing Protocol. Learn the tricks and techniques for integrating these XML applications into your products as both clients and servers.
More and more applications are generating XML documents as their primary or secondary outputs. XML is much easier to parse than traditional formats. At the same time, it has a number of syntactic options that make testing output more difficult than testing traditional, less rich formats. Simple string comparison is often too naive to properly test XML. This talk explores the challenges and pitfalls of testing XML documents. It explains what to look for when testing XML documents and even more importantly what to ignore. We'll consider various tools for testing XML including parsers, schemas, DTDs, canonical XML, and XPath. Finally, we'll discuss automating tests by writing JUnit test cases that use various XML APIs such as DOM to compare the actual output to the expected output.
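To make the point about naive string comparison concrete, here is a minimal sketch that compares two documents as DOM trees instead of as strings. The class and method names (XmlEquivalence, sameTree) are hypothetical, and the sketch assumes a JAXP parser that implements DOM Level 3 `isEqualNode`, as the one bundled with the JDK does. It ignores attribute order, quote style, and the choice between an empty-element tag and a start-tag/end-tag pair, but it does not ignore whitespace, which a real test would have to decide about.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XmlEquivalence {

    // Parse a string into a namespace-aware DOM tree.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setIgnoringComments(true); // comments rarely matter to a test
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize(); // merge adjacent text nodes
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse test document", e);
        }
    }

    // Compare the two root elements structurally rather than byte for byte.
    static boolean sameTree(String expected, String actual) {
        return parse(expected).getDocumentElement()
                .isEqualNode(parse(actual).getDocumentElement());
    }

    public static void main(String[] args) {
        String expected = "<item id=\"1\" lang=\"en\">a &lt; b<note/></item>";
        String actual = "<item lang='en' id='1'>a &#60; b<note></note></item>";
        System.out.println(expected.equals(actual));    // false
        System.out.println(sameTree(expected, actual)); // true
    }
}
```

A JUnit test case could wrap a call like `sameTree(expected, actual)` in an assertion, so the test passes or fails on the content of the document rather than on incidental syntax.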
XML processing can be divided into four layers: binary, lexical, syntactic, and semantic. However, these layers are rarely recognized; and many APIs, tools, and programs mix them together, resulting in much needless confusion. This talk explains how to cleanly separate XML processing into distinct layers, such that each layer only talks to the ones below and above it in the stack, and each tool operates only on the information in one layer. It explains which XML constructs belong to which layers and where different kinds of programs need to operate. The results are much cleaner, much simpler, much more robust programs, APIs, and specifications.
The vision of semi-intelligent agents seamlessly creating knowledge by combining diverse sources has driven the W3C for the last six years, but will it actually work? Is there a killer app, or is the SemWeb doomed to be yet another AI fantasy? This talk explores the Semantic Web Vision and specific technologies that underlie it including RDF and OWL. Is RDF the magic bullet for enterprise integration? Is OWL going to make the 80/20 cut for AI that HTTP and HTML made for Hypertext? Or is this all just the product of a bunch of pointy-haired academics who've spent too much time reading Hegel and too little time studying the real world? Find out whether RDF and OWL are right for you (or perhaps whether they're right for anyone).
Despite all the advances made on the Web in the last decade+, forms have changed little since Mosaic 1.0. We now have JavaScript to give them a little intelligence, and browsers can sometimes autofill a few fields, but otherwise they present the same basic interface they did ten years ago. The wait for better user interfaces is over. XForms are a radically more powerful forms technology for the Web that enables separation of presentation and content, data type aware input fields, enhanced accessibility for users, far richer user interface widgets, and much, much more. Learn how you can take advantage of XForms to build the next generation of killer web apps.
Despite all the advances made on the Web in the last decade, forms have changed little since Mosaic 1.0. JavaScript gave them a little intelligence, and browsers can sometimes autofill a few fields, but otherwise forms present the same basic interface they did ten years ago. The wait for better user interfaces is over. Web Forms 2.0 is a more powerful forms technology for the Web that enables separation of presentation and content, data type aware input fields, repeating fields, updateable forms, and more but still works in today's mainstream browsers. Learn how you can take advantage of Web Forms 2.0 to build the next generation of killer web apps.
XQuery is a new fourth generation language being designed at the W3C to extract information from collections of XML documents, including those stored in native XML databases. It is based on XPath 2.0, but features a more familiar, procedural syntax than XSLT. This talk explores XQuery 1.0, shows you when it's appropriate and where you can use it. You'll learn both the syntax and semantics of this powerful new language. You'll also get a personal demo of XQuisitor, a new open source GUI tool for querying XML based on the XQuery language and Michael Kay's Saxon 7.0. This talk assumes familiarity with basic XML.
XML based markup languages are described with Document Type Definitions (DTDs). This seminar shows you how to document your own XML applications using DTDs. You'll learn how to declare elements, attributes, and entities; how to validate documents against DTDs; and how to organize your DTDs in a clear, maintainable, and reusable fashion. This talk assumes familiarity with basic XML.
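As a rough illustration of validating a document against a DTD, the following sketch parses a document that carries its DTD in the internal subset and collects any validity errors. The person/name/email vocabulary and the DtdCheck class are made up for this example. The sketch assumes the default JAXP parser shipped with the JDK, which reports validity problems through the ErrorHandler's error method when validation is switched on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdCheck {

    // Hypothetical DTD: a person has exactly one name and one or more emails.
    static final String DTD =
          "<!DOCTYPE person [\n"
        + "  <!ELEMENT person (name, email+)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "  <!ELEMENT email (#PCDATA)>\n"
        + "]>\n";

    // Returns the validity errors found; an empty list means the document is valid.
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Document is not well-formed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String good = DTD + "<person><name>Ada</name><email>ada@example.com</email></person>";
        String bad = DTD + "<person><name>Ada</name></person>"; // no email element
        System.out.println(validate(good).isEmpty()); // true
        System.out.println(validate(bad));            // one or more error messages
    }
}
```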
One of the most common complaints about DTDs is that they don't allow authors to specify constraints on the element and attribute content. For instance, they do not let you say that a MONTH element contains an integer between 1 and 12. Schemas fill this gap in the XML family. They also fill many other holes left by DTDs; among them inheritance, data typing, and namespace-aware validation. This talk introduces you to the W3C's new XML schema language, and shows you how to use it to make validation far more powerful.
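The MONTH constraint can be written down directly in the W3C XML Schema Language. The sketch below is one possible way to do it, not material from the talk itself: it embeds a small schema as a string and checks documents against it with the standard javax.xml.validation API. The MonthSchema class and isValid method are names invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class MonthSchema {

    // MONTH must be an xs:integer in the inclusive range 1..12.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='MONTH'>"
        + "    <xs:simpleType>"
        + "      <xs:restriction base='xs:integer'>"
        + "        <xs:minInclusive value='1'/>"
        + "        <xs:maxInclusive value='12'/>"
        + "      </xs:restriction>"
        + "    </xs:simpleType>"
        + "  </xs:element>"
        + "</xs:schema>";

    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document violates the schema (or is malformed)
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<MONTH>7</MONTH>"));    // true
        System.out.println(isValid("<MONTH>13</MONTH>"));   // false
        System.out.println(isValid("<MONTH>July</MONTH>")); // false
    }
}
```

A DTD could only declare MONTH as #PCDATA, which would accept all three of these documents.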
Without links, HTML would have been just another document format. Without links it could never have created the World Wide Web. And yet until quite recently XML has been lacking any similar facility. The wait, however, is over. This talk introduces XLinks, XPointers, XInclude, and XML Base, the technologies that enable you to make XML not just text but hypertext. You'll learn how to put these all together to build hypertext systems far more sophisticated than anything that's possible with HTML.
Without links, HTML would have been just another document format. Without links it could never have created the World Wide Web. And yet until quite recently XML has been lacking any similar facility. The wait, however, is over. This talk introduces XLinks, the primary means of making XML not just text but hypertext. You'll learn how to use both simple and extended XLinks to build hypertext systems far more sophisticated than anything that's possible with HTML.
XInclude is a radical new way of building large XML documents out of smaller component parts. Unlike competing technologies such as general entity references and XLinks, all the parts are fully usable by themselves as well as when merged with the master document. Indeed one part can be used in many different documents. XInclude encourages modular and reusable development of web sites, books, and information systems.
This talk completely covers XInclude syntax. Several different implementations are presented using a variety of APIs, and the strengths and weaknesses of each are elaborated.
The Extensible Stylesheet Language (XSL) includes XSLT, a very powerful declarative language for transforming XML documents into new XML documents. XSLT has many applications including electronic commerce and Web page display. In this talk you'll learn how to write XSL stylesheets that can change XML to HTML, sort data, extract subsets of the data, and otherwise manipulate and massage data for display to the end user.
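For a flavor of what such a stylesheet looks like, here is a small sketch that sorts a list of names and emits an HTML list. The people/person vocabulary and the SortToHtml class are invented for this example. It is driven from Java through the standard javax.xml.transform API and whatever XSLT 1.0 processor the JDK provides, rather than through any particular tool used in the talk.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SortToHtml {

    // A tiny XSLT 1.0 stylesheet: sort person elements by their text, output <ul>/<li>.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "  <xsl:output method='html' indent='no'/>"
        + "  <xsl:template match='/people'>"
        + "    <ul>"
        + "      <xsl:apply-templates select='person'>"
        + "        <xsl:sort select='.'/>"
        + "      </xsl:apply-templates>"
        + "    </ul>"
        + "  </xsl:template>"
        + "  <xsl:template match='person'>"
        + "    <li><xsl:value-of select='.'/></li>"
        + "  </xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people><person>Turing</person>"
                   + "<person>Hopper</person><person>Lovelace</person></people>";
        // Prints a <ul> whose items appear in the order Hopper, Lovelace, Turing.
        System.out.println(transform(xml));
    }
}
```

Note that the stylesheet only says what the output should look like; the sorting is requested declaratively with xsl:sort rather than coded as a loop.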
XSLT 1.0 has been one of the biggest success stories in the XML family. It's solved a lot of problems developers didn't even know they had until they saw what XSLT could do. This talk explores the future of querying and transforming XML documents, including new XSLT elements such as xsl:document, extension functions written in XSLT, a new version of XPath that can query remote documents and use functions in location paths, and a whole new language called XQuery, a fourth-generation language for extracting data from XML documents and using that data to produce new XML documents.
Attendees should be reasonably comfortable with XSLT 1.0 before attempting this class.
XQuery is a new fourth generation language being designed at the W3C to extract information from collections of XML documents, including those stored in native XML databases. It is based on XPath 2.0, but features a more familiar, procedural syntax than XSLT. This talk explores XQuery 1.0, shows you when it's appropriate and where you can use it. You'll learn both the syntax and semantics of this powerful new language. You'll also get a personal demo of XQuisitor, a new open source GUI tool for querying XML based on the XQuery language and Michael Kay's Saxon 7.0.
SAX is a straightforward, event-based Java API for reading XML documents. It's parser independent and supported by all major parsers including Xerces, Oracle's XML Parser for Java, and expat. This talk shows you how to use SAX to read XML documents of any size and any complexity. We'll discuss the data structures the event approach necessitates, and see examples of parsing many different kinds of XML documents.
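The event-based approach can be sketched in a few lines. In this minimal example the parser calls startElement once per start-tag, and the handler has to keep its own data structure (here a map of counts) because SAX builds no tree for it. The ElementCounter class is a hypothetical name for this example; the parser is whichever one JAXP's SAXParserFactory hands back.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ElementCounter extends DefaultHandler {

    // State the handler maintains between callbacks.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Called by the parser each time it sees a start-tag.
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        counts.merge(qName, 1, Integer::sum);
    }

    static Map<String, Integer> count(String xml) {
        try {
            ElementCounter handler = new ElementCounter();
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
            return handler.counts;
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<songs><song><title>A</title></song>"
                   + "<song><title>B</title></song></songs>";
        System.out.println(count(xml)); // {songs=1, song=2, title=2}
    }
}
```

Because nothing is retained except the counts, the same handler works unchanged on a document far too large to fit in memory.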
Cookies are an ugly hack that violate the fundamental design of HTTP. They attempt to graft sessions onto a stateless, sessionless protocol. Applications that depend on cookies are thus fragile, insecure, and user-hostile. This talk will explain the problems with cookies, and then demonstrate the proper, RESTful way to build web sites and applications that support user authentication, shopping carts, games, and more without using any cookies anywhere. The resulting applications are far more reliable and user friendly than the cookie-based applications they replace.
Cookies giving you indigestion? Session state overloading your servers? Web Services breaking down your firewalls? It's not you. The Web was never meant to work like this, and the problems you're having are a direct result of working against the design of HTTP rather than with it.
HTTP has an explicit architecture, and that architecture is called REST. Stateless web applications built with REST principles are simpler, faster, more robust, more secure, and much more scalable than the client-server architectures frequently deployed. Come learn how to build systems that play to HTTP's strengths instead of compensating for its weaknesses.
XMLPULL is a new, extremely fast streaming API for reading XML documents that can handle arbitrarily large documents without many of the complexities of SAX. This pull-based API has the potential to be as easy to use as tree-based APIs like DOM and JDOM while being as fast and memory efficient as push-based APIs like SAX. Although its API is very different from both SAX and DOM, XMLPULL is fully conformant with all relevant XML standards. It supports well-formedness checking, validation, namespaces, and even schemas. Learn how you can integrate pull-parsing into your application chains.
StAX, the Streaming API for XML, is a new pull parsing API for processing XML with Java. Unlike traditional, SAX based push parsers, pull parsers put your application in charge of the program. You drive the processing rather than being driven by it. Nonetheless, you get all the speed and size advantages of a streaming API. This class explores the design patterns common in pull parsing. Learn how you can integrate pull-parsing into your application chains, and more importantly whether you should.
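Here is a minimal sketch of what "your application in charge" means in practice: the program asks the parser for the next event in an ordinary loop and decides what to do with it, instead of registering callbacks. It uses the javax.xml.stream API that ships with the JDK. The PullTitles class and the title element are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class PullTitles {

    // Collect the text of every <title> element, pulling events one at a time.
    static List<String> titles(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next(); // the application asks for the next event
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // getElementText reads up to the matching end-tag
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<songs><song><title>A</title></song>"
                   + "<song><title>B</title></song></songs>";
        System.out.println(titles(xml)); // [A, B]
    }
}
```

Since the loop belongs to the application, it can stop early, skip ahead, or hand the reader to another method, which is awkward to do from inside a SAX callback.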
Six years after XML 1.0 was released, the community is finally beginning to understand which practices work and which ones don't. This talk explores patterns (and quite a few anti-patterns) common in XML development, and explains which are which, and why. You'll learn as many specific ways to improve your XML systems as we can cram into 90 minutes.
Do DTDs bore you? Does the W3C XML Schema Language drive you to tears? Rejoice, because an alternative is here. RELAX NG is a new international standard schema language that's based on powerful mathematics but simple enough for math-phobics to use. It provides the 80% of features developers actually need from schemas at about 20% of the complexity. And it can do a few things the W3C XML Schema Language can't. In 90 minutes you can learn to do with RELAX NG what would take a full day of training in the W3C XML Schema Language.
Existing XML APIs range from too complex (DOM, SAX) to too simple (JDOM) or both (ElectricXML, dom4j). Is there a happy medium? Can an API be both easy-to-use for developers who don't spend their weekends memorizing W3C recommendations while being correct and powerful enough to satisfy the experts? How simple can an XML API possibly be? How complex does it have to be? We're going to share war stories about XML APIs like SAX and DOM, and explore some possible solutions for going forward.
The Document Object Model is a language and parser independent standard API for reading and writing XML and HTML documents. It presents the document to the programmer as a tree data structure in which the individual elements, comments, processing instructions, text, and other parts of the document all comprise nodes. This talk teaches you how to use the DOM to read and write XML documents from your Java programs while automatically maintaining well-formedness and checking validity.
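The tree view is easiest to see by walking one. This minimal sketch parses a document with the JDK's JAXP parser and prints each node indented by its depth, labeling elements, text, comments, and processing instructions. The DomOutline class and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomOutline {

    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Depth-first traversal: describe this node, then recurse into its children.
    private static void walk(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                out.append("element ").append(node.getNodeName());
                break;
            case Node.TEXT_NODE:
                out.append("text \"").append(node.getNodeValue()).append('"');
                break;
            case Node.COMMENT_NODE:
                out.append("comment");
                break;
            case Node.PROCESSING_INSTRUCTION_NODE:
                out.append("pi ").append(node.getNodeName());
                break;
            default:
                out.append("other");
        }
        out.append('\n');
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline(
            "<song><!-- demo --><title>Example</title><?render fast?></song>"));
    }
}
```

Unlike a streaming API, the whole document is in memory before the walk starts, so the program can visit nodes in any order or revisit them.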
DOM Level 3 is the next generation of the popular Document Object Model, the language- and parser-independent API for reading and writing XML and HTML documents. DOM3 adds standard means of locating a parser, creating a new document, and accessing DTDs and schemas. It also fills a number of holes in existing DOM interfaces such as Element and Node.
Attendees should already be familiar with the DOM2 Core API.
JDOM is an open source, tree-based, Java API for processing XML documents that's been designed with simplicity and convenience as its foremost purposes. JDOM programs can use any commonly available SAX or DOM parser. At a high level, it's similar to the DOM, but since JDOM was designed specifically for Java rather than for multiple languages, it feels much more natural and "right" to Java programmers. This talk teaches you how to use the JDOM to read and write XML documents from your Java programs while automatically maintaining well-formedness and checking validity. It also compares and contrasts JDOM with the existing SAX and DOM APIs to help you choose which API to use for which projects.
Namespaces are one of the most fundamental yet widely misunderstood factors in the XML equation. They do both more and less than many people expect of them. This talk shows you how to use namespaces in both new and existing XML applications, including how to integrate them with DTDs and schemas.
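One frequently misunderstood point is that the prefix is only a stand-in; what identifies an element is its namespace URI plus its local name. The following sketch shows two documents that spell the same element differently yet match the same namespace-aware DOM query. The NamespaceDemo class is hypothetical, and the SVG namespace is used only as a familiar example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceDemo {

    static final String SVG = "http://www.w3.org/2000/svg";

    // Count elements by namespace URI and local name, ignoring whatever prefix was used.
    static int count(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(namespaceUri, localName).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String prefixed = "<svg:rect xmlns:svg='" + SVG + "'/>";
        String defaulted = "<rect xmlns='" + SVG + "'/>";
        String unqualified = "<rect/>";
        System.out.println(count(prefixed, SVG, "rect"));    // 1
        System.out.println(count(defaulted, SVG, "rect"));   // 1
        System.out.println(count(unqualified, SVG, "rect")); // 0
    }
}
```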
These are longer half- and full-day sessions that demand more of the student. In particular, they all assume a solid grasp of XML basics and extensive competence in Java.
The Extensible Stylesheet Language (XSL) includes XSLT, a very powerful declarative language for transforming XML documents into new XML documents. XSLT has many applications including electronic commerce and Web page display. In this talk you'll write XSL stylesheets that can change XML to HTML, sort data, extract subsets of the data, and otherwise manipulate and massage data for display to the end user.
Participants will be expected to bring a laptop with Java 1.3 or later. Java is necessary to run the software we'll be using. However, no knowledge of Java Programming is required. Students should also download SAXON 6.5.2 to their laptop from http://saxon.sourceforge.net/ before class. (If a later version of SAXON 6.x is published between now and the class, you can use that instead. Do not download SAXON 7.x though. That's an experimental version that doesn't really work yet.) However, it will be made available on CD-ROM if necessary.
XML is a flexible meta-language that allows you to design markup languages to fit your documents rather than the one-size-fits-all approach of HTML. XML is suitable not only for Web pages, but also for application file formats, object serialization, and data interchange between applications. This tutorial will show you how to use Java to process XML documents and show you why you might want to do that.
XML files are text files so you'll explore how Java outputs text using Writers and how to use these classes to write XML. Then you'll see some demonstrations of the conversion of existing data formats to XML. Of course writing data is only half of the I/O equation so next you'll learn how to read XML documents from Java. Input is considerably more complicated than output. Fortunately 90% of the work can be done for you by free parsers like the Apache Project's Xerces. You'll see how to use the standard, event based Simple API for XML (SAX) to drive your own programs through callbacks.
Most parsers also support a more object oriented view of an XML document by implementing the W3C's Document Object Model (DOM). You'll investigate the DOM in depth, see how it compares to SAX, and learn which API is appropriate for which tasks.
Outline:
Prerequisites:
Students should be completely comfortable programming in Java, should
have at least a user's understanding of the Internet, and should be
familiar with XML Basics.
This seminar is not for the faint of heart. We're going to run right up to the bleeding edge of today's XML technology, and then we're going to jump off. This tutorial is for programmers who have already mastered SAX and DOM and are looking for the next level. We'll be exploring the latest W3C and third party APIs and standards for processing XML including:
This session will be taught in Java. However, most of the APIs and ideas presented are available in and port quite naturally to other languages like Python and C++.
XML is a flexible meta-language that allows you to design markup languages to fit your documents rather than the one-size-fits-all approach of HTML. XML is suitable not only for Web pages, but also for application file formats, object serialization, and data interchange between applications. This tutorial will show you how to use Java to process XML documents and show you why you might want to do that.
The session begins with a quick review of XML syntax. However we'll look at it from the perspective of a programmer using XML as an output format. XML files are text files so you'll explore how Java outputs text using Writers and how to use these classes to write XML. Then you'll see some demonstrations of the conversion of existing data formats to XML.
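As a small illustration of XML as an output format, this sketch converts comma-separated records to XML using nothing but a Writer. The record layout (name,price), the element names, and the CsvToXml class are all invented for this example. The one step that is easy to forget is escaping the characters that have special meaning in XML.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class CsvToXml {

    // Escape characters that cannot appear literally in text or attribute values.
    static String escape(String text) {
        return text.replace("&", "&amp;")   // must come first
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Each row is assumed to look like "name,price".
    static void convert(String[] rows, Writer out) throws IOException {
        out.write("<?xml version=\"1.0\"?>\n");
        out.write("<items>\n");
        for (String row : rows) {
            String[] fields = row.split(",", 2);
            out.write("  <item price=\"" + escape(fields[1]) + "\">"
                      + escape(fields[0]) + "</item>\n");
        }
        out.write("</items>\n");
    }

    // Convenience wrapper that collects the output in a string.
    static String convert(String[] rows) {
        StringWriter out = new StringWriter();
        try {
            convert(rows, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(convert(new String[] {"Fish & Chips,9.50", "Tea,2.00"}));
    }
}
```

When writing to a file or socket instead of a StringWriter, the Writer should be created with an explicit encoding such as UTF-8 so that the bytes match what an XML parser will assume.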
Of course writing data is only half of the I/O equation so next you'll learn how to read XML documents from Java. Input is considerably more complicated than output. Fortunately 90% of the work can be done for you by free parsers like the Apache Project's Xerces. You'll see how to use the standard, event based Simple API for XML (SAX) to drive your own programs through callbacks.
Most parsers also support a more object oriented view of an XML document by implementing the W3C's Document Object Model (DOM). You'll investigate the DOM in depth, see how it compares to SAX, and learn which API is appropriate for which tasks.
This talk can be combined with the Intro to XML Tutorial to form a 2-day introduction to XML for Java programmers.
Students should be completely comfortable programming in Java, should have at least a user's understanding of the Internet, and should be familiar with XML Basics.
This one day tutorial is not for the faint of heart. We're going to run right up to the bleeding edge of today's XML technology, and then we're going to keep on going. This tutorial is for developers who have already mastered the basics of XML, DTDs, and namespaces and are looking for the next level. We'll be exploring the latest W3C and third party extensions to XML definitely including:
We'll also take some time to explore some technologies a little further out on and even past the bleeding edge, possibly including:
However, we're going to leave the exact list of topics open right up until the day of the show so we can give you the freshest, most up-to-date coverage we can, very likely including a few things that haven't even been invented yet. This tutorial shows you not only where XML is and what you can do with it today, but also where it's going and what you'll be able to do with it tomorrow. If you want to get a leg up on your competitors by understanding tomorrow's technology today, this is the talk for you.
Students should be comfortable programming in Java, should have at least a user's understanding of the Internet, and should be familiar with standard XML and Namespaces.