The XML Bible, Gold Edition

When the first edition of the XML Bible was published about two years ago, XML was a promising technology with a small but growing niche. In the last two years, it has absolutely exploded. XML no longer needs to be justified as a good idea. In fact, the question developers are asking has changed from "Why XML?" to "Why not XML?" XML has become the data format of choice for fields as diverse as stock trading and graphic design. More new programs today are using XML than aren't. A solid understanding of just what XML is and how to use it has become a sine qua non for the computer literate.

The XML Bible, Gold Edition, is your introduction to the exciting and fast-growing world of XML. With this book, you'll learn how to write documents in XML and how to use style sheets to convert those documents into HTML so that legacy browsers can read them. You'll also learn how to use document type definitions (DTDs) and schemas to describe and validate documents. You'll experience a variety of XML applications in many domains, ranging from push to vector graphics to genealogy. And you'll learn how to take advantage of XML for your own unique projects, programs, and Web sites.

What's New in the Gold Edition

At just barely under 1600 pages, the XML Bible, Gold Edition is my largest book yet, and possibly the largest XML book published to date. In fact, I wrote more pages than the printer could glue between two covers, so this edition is also my first English-language hardback. Why so big? Because I needed to fit in everything that was in the second edition plus oodles of new and updated material including:

In addition, the treatment of several existing topics was expanded and/or updated significantly including schemas, XSL Formatting Objects, and the design of XML applications.

The plan is to keep selling both the second and Gold editions simultaneously, which raises the obvious question of which you should buy. On the downside, the Gold Edition costs $20 more and is about 2 pounds heavier. If you're strapped for cash or bookshelf space, you should probably choose the second edition. On the upside, the Gold Edition is a little more up-to-date. If you have a particular interest in any of the new topics listed above or in schemas, I'd suggest getting the Gold Edition. Otherwise, the second edition will probably serve you well without inducing a hernia.

For this edition, I focused on adding new material, not on rewriting and updating what had already been written. If you already own the second edition, it's probably not worth upgrading unless you have a particular need for one of the new chapters. It's only been about five months since the second edition came out, and pretty much everything in that edition is still current. On the other hand, if you're still using the first edition, it might be time to upgrade. The first edition is now over two years old, and there've been a lot of developments in the XML world since 1999. I'm sorry, but there's no special upgrade price. Unlike software, the incremental cost of printing extra copies of a book is non-trivial. However, if you did like the first edition, I think you'll like the Gold Edition even more.

Who You Are

Unlike most other XML books on the market, The XML Bible covers XML not from the perspective of a software developer but rather that of a Web page author. It doesn't spend a lot of pages talking about BNF grammars or parsing element trees. Instead it shows you how you can use XML and existing tools today to more efficiently and productively produce attractive, exciting, easy-to-use, easy-to-maintain Web sites that will keep your readers coming back for more.

This book is aimed squarely at Web site developers. I assume you want to use XML to produce Web sites that are difficult to impossible to create with raw HTML. You'll be amazed to discover that in conjunction with style sheets and a few free tools, XML lets you do things that previously required either custom software costing hundreds to thousands of dollars per developer or extensive knowledge of programming languages like Perl. None of the software in this book will cost you more than a few minutes of download time. None of the tricks require any programming.

What You Need to Know

XML builds on HTML and the underlying infrastructure of the Internet. To that end, I will assume you know how to ftp files, send email, and load URLs in your Web browser of choice. I will also assume you have a reasonable knowledge of HTML at about the level supported by Netscape 1.1. On the other hand, when this book discusses newer aspects of HTML that are not yet in widespread use, like cascading style sheets or the <SPAN> and <DIV> tags, I will cover them in depth.

To be more specific:

On the other hand, there are a number of things I do not assume you know. In particular:

What You'll Learn

This book has one primary goal: to teach you to write XML documents for the Web. Fortunately, XML has a decidedly unsteep learning curve, much like HTML (and unlike SGML). As you learn a little, you can do a little. As you learn a little more, you can do a little more. Things you'll learn include:

In the final section of this book, you'll see several practical examples of XML being used for real-world applications, including:

What's In The Book

The XML Bible is divided into five parts:

  1. Introducing XML

  2. Document Type Definitions and Validity

  3. Style Languages

  4. Supplemental Technologies

  5. XML Applications

By the time you're finished, you'll be ready to use XML to create compelling Web pages.

Part I: Introducing XML

Part I begins with the history and theory behind XML and the goals XML is trying to achieve. It shows you how the different pieces of the XML equation fit together to create and deliver documents to readers. You'll see several compelling examples of XML applications to give you some idea of the wide applicability of XML, including Scalable Vector Graphics (SVG), the Resource Description Framework (RDF), the Mathematical Markup Language (MathML), the Extensible Forms Description Language (XFDL), and many others. Then you'll learn by example how to write XML documents with tags that you define that make sense for your document. You'll learn how to edit them in a text editor, attach style sheets to them, and load them into a Web browser such as Internet Explorer 5.0 or Mozilla. You'll even learn how you can write XML documents in languages other than English, even languages that are nothing like English, such as Chinese, Hebrew, and Russian.
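To give a feel for what "tags that you define" means, here is a minimal sketch in Python. It is not taken from the book. The element names (greetings, greeting) and the language attribute are made up for this illustration, and the standard library's xml.etree.ElementTree parser is used only to check that the document is well-formed and to read it back.

```python
import xml.etree.ElementTree as ET

# A small document whose tag names are invented for this example.
# The text is deliberately not all English.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<greetings>
  <greeting language="en">Hello</greeting>
  <greeting language="zh">你好</greeting>
  <greeting language="ru">Привет</greeting>
</greetings>
"""


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def list_greetings(xml_text):
    """Return (language, text) pairs for each greeting element."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [(g.get("language"), g.text) for g in root.findall("greeting")]


if __name__ == "__main__":
    print(is_well_formed(DOCUMENT))
    for language, text in list_greetings(DOCUMENT):
        print(language, text)
```

A document with mismatched tags, such as <a><b></a>, fails the same check, which is the difference between well-formed XML and the more forgiving HTML that browsers accept.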

Part II: Document Type Definitions

Part II (Chapters 8 through 13) focuses on document type definitions (DTDs). A DTD specifies which elements are and are not allowed in an XML document, and the exact context and structure of those elements. A validating parser can read a document, compare it to its DTD, and report any mistakes it finds. DTDs enable document authors to ensure that their work meets any necessary criteria.

In Part II, you'll learn how to attach a DTD to a document, how to validate your documents against their DTDs, and how to write your own DTDs that solve your own problems. You'll learn the syntax for declaring elements, attributes, entities, and notations. You'll learn how to use entity declarations and entity references to build both a document and its DTD from multiple, independent pieces. This enables you to make long, hard-to-follow documents much simpler by separating them into related modules and components. You'll learn how to integrate other forms of data like raw text and GIF image files in your XML document. And you'll learn how to use namespaces to mix together different XML vocabularies in one document.
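The following sketch, again in Python and not from the book, shows a document that carries its own DTD in an internal subset, with element declarations and two entity declarations. The element and entity names are assumptions chosen for the example. Note that xml.etree.ElementTree is a non-validating parser: it expands the internal entities but does not check the document against the element declarations, and it does not load external entities, so this illustrates only the entity-reference part of the story.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE's internal subset declares the allowed elements and two
# internal general entities. Names here are illustrative only.
DOCUMENT = """<!DOCTYPE book [
  <!ELEMENT book (title, publisher)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT publisher (#PCDATA)>
  <!ENTITY booktitle "The XML Bible, Gold Edition">
  <!ENTITY pub "Hungry Minds">
]>
<book>
  <title>&booktitle;</title>
  <publisher>&pub;</publisher>
</book>
"""

# The parser replaces each entity reference with its declared text.
root = ET.fromstring(DOCUMENT)
title = root.findtext("title")
publisher = root.findtext("publisher")

if __name__ == "__main__":
    print(title)
    print(publisher)
```

Changing the replacement text in the one entity declaration changes it everywhere the entity is referenced, which is the same idea that lets a long document be assembled from separately maintained pieces.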

Part III: Style Languages

Part III, consisting of Chapters 14 through 19, teaches you everything you need to know about style sheets. XML markup specifies only what's in a document. Unlike HTML, it does not say anything about what that content should look like. Information about an XML document's appearance when printed, viewed in a Web browser, or otherwise displayed is stored in a style sheet. Different style sheets can be used for the same document. You might, for instance, want to use one style sheet that specifies small fonts for printing, another one with larger fonts for on-screen presentation, and a third with absolutely humongous fonts to project the document on a wall at a seminar. You can change the appearance of an XML document by choosing a different style sheet without touching the document itself.

Part III describes in detail the two style sheet languages in broadest use on the Web, Cascading Style Sheets (CSS) and the Extensible Stylesheet Language (XSL). CSS is a simple style-sheet language originally designed for use with HTML. It applies fixed style rules to the contents of particular elements. CSS is specified in three levels: CSS Level 1, CSS Level 2, and CSS Level 3. CSS Level 1 provides basic information about fonts, color, positioning, and text properties and is reasonably well supported by current Web browsers for HTML and XML. CSS Level 2 is a more recent standard that adds support for aural style sheets, user interface styles, international and bidirectional text, and more. CSS Level 3 modularizes the CSS specification and adds still more capabilities including namespace-aware selectors, more colors, multi-column layouts, and much better support for printed pages.

XSL, by contrast, is a more complicated and more powerful style language that can apply styles to the contents of elements as well as rearrange elements, add boilerplate text, and transform documents in almost arbitrary ways. XSL is divided into two parts: a transformation language for converting XML trees to alternative trees, and a formatting language for specifying the appearance of the elements of an XML tree. Currently, many more tools support the transformation language than the formatting language.
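The idea of converting one tree into another can be sketched without any XSL at all. The Python below is not XSLT and is not from the book; it is a hand-written stand-in that does by hand what a transformation style sheet would describe declaratively: it reads a source tree, rearranges the content (author before title), adds boilerplate text (a heading), and emits an HTML tree. The library and book element names and the sample data are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source vocabulary, invented for this sketch.
SOURCE = """<library>
  <book edition="second">
    <title>The XML Bible</title>
    <author>Elliotte Rusty Harold</author>
  </book>
  <book edition="gold">
    <title>The XML Bible</title>
    <author>Elliotte Rusty Harold</author>
  </book>
</library>
"""


def to_html(xml_text):
    """Build an HTML tree from the source tree and return it as text."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    # Boilerplate text that does not appear in the source document.
    ET.SubElement(body, "h1").text = "Books"
    ul = ET.SubElement(body, "ul")
    for book in source.findall("book"):
        li = ET.SubElement(ul, "li")
        # Rearranged: author first, then title, then the edition attribute.
        li.text = "{}: {} ({} edition)".format(
            book.findtext("author"), book.findtext("title"), book.get("edition")
        )
    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(to_html(SOURCE))
```

A real XSL transformation expresses the same mapping as template rules inside an XML style sheet, so the logic lives with the style sheet rather than in a program.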

Part IV: Supplemental Technologies

Part IV consists of Chapters 20 through 25. It introduces some XML-based languages and syntaxes that layer on top of basic XML. XLinks provide multidirectional hypertext links that are far more powerful than the simple HTML <a> tag. XPointers introduce a new syntax you can attach to the end of URLs to link not only to particular documents but also to particular parts of particular documents. XInclude allows you to build one large XML document out of multiple smaller documents or pieces of documents. Schemas are an XML-based syntax for describing the permissible content of XML documents that's considerably more powerful, extensible, and scalable than DTDs. RDF is an XML application used to embed metadata in XML and HTML documents. Metadata is information about a document, such as the author, date, and title of a work, rather than the work itself. All of these can be added to your own XML-based markup languages to extend their power and utility. Finally, canonical XML is a standard serialized form for XML documents that's useful when XML is processed by non-XML-aware tools like encryption and compression programs.
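Why a canonical form matters to tools that know nothing about XML can be shown with a short Python sketch. It is not from the book, and it assumes Python 3.8 or later, whose xml.etree.ElementTree.canonicalize function implements the C14N 2.0 version of canonical XML, which may differ in detail from the version the book covers. The book element and its attributes are made up for the example. The two inputs mean the same thing to an XML parser but differ byte for byte, so a plain hash sees them as different until both are canonicalized.

```python
import hashlib
import xml.etree.ElementTree as ET

# Same information, different serialization: quote style, attribute
# order, extra whitespace inside the tag, and empty-element syntax.
DOC_A = "<book isbn='0-7645-4819-0' edition='gold'/>"
DOC_B = '<book edition="gold"   isbn="0-7645-4819-0"></book>'


def digest(text):
    """Hash the raw bytes, as a non-XML-aware tool would."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


canon_a = ET.canonicalize(DOC_A)
canon_b = ET.canonicalize(DOC_B)

raw_hashes_match = digest(DOC_A) == digest(DOC_B)
canonical_hashes_match = digest(canon_a) == digest(canon_b)

if __name__ == "__main__":
    print(canon_a)
    print("raw hashes match:", raw_hashes_match)
    print("canonical hashes match:", canonical_hashes_match)
```

The same reasoning applies to digital signatures and similar byte-level processing: canonicalize first, then hand the bytes to the tool.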

Part V: XML Applications

Part V, which consists of Chapters 26 to 34, shows you several practical uses of XML in different domains. It begins with a discussion of techniques for DTD modularization in the context of SMIL 2.0. Then it introduces XHTML, a reformulation of HTML 4.0 as valid XML. Both the relatively simple XHTML 1.0 and the more powerful modular form of XHTML 1.1 are discussed. Finally, RDDL, the Resource Directory Description Language, is introduced as an XLink-based application of modular XHTML, used for documents placed at the end of namespace URLs.

Next, several non-HTML languages are introduced, including WML, an HTML-like language for serving Web content to cell phones, PDAs, pagers, and other memory-, display-, and bandwidth-limited devices. Scalable Vector Graphics (SVG) is a standard XML format for drawings recommended by the World Wide Web Consortium (W3C). The Vector Markup Language (VML) is a Microsoft-proprietary XML application for vector graphics used by Office 2000 and Internet Explorer 5.0. Microsoft's Channel Definition Format (CDF) is an XML-based markup language for defining channels that can push updated Web-site content to subscribers. Finally, a completely new application is developed for genealogical data to show you not just how to use XML tags, but why and when to choose them. Combining all of these different applications, you'll develop a good sense of how XML applications are designed, built, and used in the real world.

What You Need

XML is a platform-independent technology. Furthermore, most of the best software for working with XML is written in Java and can run on multiple platforms. Much of this is included on the CD in the back of the book or is freely available on the Internet. To make the best use of this book and XML, you need:


Glued to the inside back cover of the book you'll find a CD-ROM that holds all numbered code listings from this book as well as some longer examples that couldn't fit into this book. The CD-ROM also contains the complete text of various XML specifications in XML and HTML. (Some of the specifications are also available in other formats like PDF.) Finally, you will find an assortment of useful software for working with XML documents. Many (though not all) of these programs are written in Java, so they'll run on any system with a reasonably compatible Java 1.1 or later virtual machine. Most of the programs that aren't written in Java are designed for Windows 95, 98, and NT/2000.

Summing Up

XML is the wave of the future. Writing The XML Bible not only taught me about XML itself; it changed the way I look at the Web. XML is a sea change in the way you look at Web sites and Web site development. And I can think of no better way to learn about it than reading The XML Bible. Why don't you check it out, and let me know what you think?

If I've succeeded in piquing your interest, you'll be able to find The XML Bible at almost any bookstore that carries computer books, including amazon.com. It's $69.99, published by Hungry Minds, and written by me, Elliotte Rusty Harold. The ISBN is 0-7645-4819-0.

Table of Contents

Part I: Introducing XML
Chapter 1: An Eagle's Eye View of XML (no examples in this chapter)
Chapter 2: XML Applications
Chapter 3: Your First XML Document
Chapter 4: Structuring Documents
Chapter 5: Attributes, Empty Tags, and XSL
Chapter 6: Well-Formed XML Documents
Chapter 7: Foreign Languages and Non-Roman Text
Part II: Document Type Definitions
Chapter 8: Document Type Definitions and Validity
Chapter 9: Element Declarations
Chapter 10: Entity Declarations
Chapter 11: Attribute Declarations
Chapter 12: Unparsed Entities, Notations, and Non-XML Data
Chapter 13: Namespaces
Part III: Style Sheets
Chapter 14: CSS Style Sheets
Chapter 15: CSS Layouts
Chapter 16: CSS Text Styles
Chapter 17: CSS Level 3
Chapter 18: XSL Transformations
Chapter 19: XSL Formatting Objects
Part IV: Supplemental Technologies
Chapter 20: XLinks
Chapter 21: XPointers
Chapter 22: XInclude
Chapter 23: Schemas
Chapter 24: The Resource Description Framework
Chapter 25: Canonical XML
Part V: XML Applications
Chapter 26: Reading DTDs
Chapter 27: XHTML
Chapter 28: Modular XHTML
Chapter 29: RDDL
Chapter 30: The Wireless Markup Language
Chapter 31: Scalable Vector Graphics
Chapter 32: The Vector Markup Language
Chapter 33: Pushing Web Sites with CDF
Chapter 34: Designing a New XML Application
XML QuickRef
Appendix A: About the CD-ROM
Appendix B: XML Reference Material
Appendix C: XML 1.0 Specification

Copyright 2001, 2004 Elliotte Rusty Harold
Last Modified February 29, 2004