Generically, a document that describes what a correct document may contain
Specifically, a W3C Recommendation for an XML-document syntax that describes the permissible contents of XML documents
Created by W3C XML Schema Working Group based on many different submissions
No known patent, trademark, or other IP restrictions
XML Schema Part 1: Structures: http://www.w3.org/TR/xmlschema-1/
XML Schema Part 2: Datatypes: http://www.w3.org/TR/xmlschema-2/
Unusual, non-XML-like syntax
No data typing, especially for element content
Limited extensibility
Only marginally compatible with namespaces
Cannot use mixed content and enforce order and number of child elements
Cannot enforce number of child elements without also enforcing order (i.e. no & operator from SGML)
DTDs | Schemas |
---|---|
<!ELEMENT> declaration | xsd:element element |
<!ATTLIST> declaration | xsd:attribute element |
<!NOTATION> declaration | |
<!ENTITY> declaration | |
Data types |
Last call working draft from April 7, 2000
Candidate Recommendation October 24, 2000
Proposed Recommendation March 16, 2001
<?xml version="1.0"?>
<GREETING>
Hello XML!
</GREETING>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="GREETING" type="xsd:string"/>

</xsd:schema>
xsi:noNamespaceSchemaLocation attribute on root element
xsi prefix is mapped to the http://www.w3.org/2001/XMLSchema-instance URI
For example,
<?xml version="1.0"?>
<GREETING xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="greeting.xsd">
Hello XML!
</GREETING>
Other means of connecting schemas to documents are allowed
D:\schemas\examples>java sax.SAX2Count -v greeting2.xml
greeting2.xml: 701 ms (1 elems, 1 attrs, 0 spaces, 12 chars)

<?xml version="1.0"?>
<GREETING xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="greeting.xsd">
<P>Hello XML!</P>
</GREETING>

D:\speaking\XMLOne London\schemas\examples>java sax.SAX2Count -v greeting3.xml
[Error] greeting3.xml:4:6: Element type "P" must be declared.
[Error] greeting3.xml:5:13: Datatype error: In element 'GREETING' : Can not have element children within a simple type content.
greeting3.xml: 781 ms (2 elems, 1 attrs, 0 spaces, 14 chars)
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="simple_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
Complex types can have child elements and attributes
Simple types cannot have children or attributes
xsd:element declares an element and assigns it a type
xsd:attribute declares an attribute and assigns it a type
xsd:complexType defines a new type
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string" minOccurs="1" maxOccurs="1"/>
      <xsd:element name="COMPOSER" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0" maxOccurs="1"/>
      <xsd:element name="LENGTH" type="xsd:timeDuration" minOccurs="1" maxOccurs="1"/>
      <xsd:element name="YEAR" type="xsd:gYear" minOccurs="1" maxOccurs="1"/>
      <xsd:element name="ARTIST" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
D:\speaking\XMLOne London\schemas\examples>java sax.SAX2Count -v hotcop.xml
[Error] hotcop.xml:10:25: Datatype error: java.text.ParseException: Illegal or
misplaced separator.
Here's the problem:
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="song.xsd">
<TITLE>Hot Cop</TITLE>
<COMPOSER>Jacques Morali</COMPOSER>
<COMPOSER>Henri Belolo</COMPOSER>
<COMPOSER>Victor Willis</COMPOSER>
<PRODUCER>Jacques Morali</PRODUCER>
<PUBLISHER>PolyGram Records</PUBLISHER>
<LENGTH>6:20</LENGTH>
<YEAR>1978</YEAR>
<ARTIST>Village People</ARTIST>
</SONG>
The LENGTH value 6:20 is not in the schema time duration format, which is ISO 8601 "PnYnMnDTnHnMnS, where nY represents the number of years, nM the number of months, nD the number of days, 'T' is the date/time separator, nH the number of hours, nM the number of minutes and nS the number of seconds. The number of seconds can include decimal digits to arbitrary precision. An optional preceding minus sign ('-') is allowed, to indicate a negative duration."
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="simple_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>P0YT6M20S</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
Xerces doesn't get this one right yet!
Default value of minOccurs is 1
Default value of maxOccurs is 1
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:timeDuration"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Anonymous type: the complex type is defined inline, so SONG carries no type attribute -->
  <xsd:element name="SONG">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="TITLE" type="xsd:string"/>
        <xsd:element name="COMPOSER" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0" maxOccurs="1"/>
        <xsd:element name="LENGTH" type="xsd:timeDuration"/>
        <xsd:element name="YEAR" type="xsd:string"/>
        <xsd:element name="ARTIST" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
Consider this document:
<foo>
<value>45.67</value>
</foo>
What is the type of value?
A decimal monetary type, as in COBOL
A fixed point number
An arbitrary-precision decimal number such as is represented by the java.math.BigDecimal class
An IEEE 754 double
A Java double
An IEEE 754 float
A VAX Fortran REAL
An imprecisely known decimal number with 4 significant digits that's plus or minus 1 in the last place.
An imprecisely known decimal number with 4 significant digits that's plus or minus 5 in the last place.
Build 67 of version 45 of Microsoft Word
A regular expression matching all strings that begin with the two characters '4' and '5', followed by a single character, followed by the two characters '6' and '7'.
A string of characters a monkey typed on a keyboard
Other interpretations are doubtless possible, and even make sense in particular contexts.
There's no guarantee that the string 45.67 in fact represents any particular type.
A schema assigns an identifiable type to each element
Schema validation produces a Post Schema Validation Infoset, PSVI for short
Schema aware applications using schema aware parsers and APIs can make use of the types of elements
Boolean
String
URIs
Numeric types
Time types
XML types
No money types. However, these can be derived
XML Schema Built-In Simple Types | ||
---|---|---|
Name | Type | Examples |
float | IEEE 754 32-bit floating point number | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN |
double | IEEE 754 64-bit floating point number | -INF, 1.401E-90, -1E4, -0, 0, 12.78E-2, 12, INF, NaN, 3.4E42 |
number | arbitrary precision, decimal numbers | -2.7E400, 5.7E-444, -3.1415292, 0, 7.8, 90200.76, 3.4E1024 |
integer | an arbitrarily large or small integer | -500000000000000000000000, -9223372036854775809, -126789, -1, 0, 1, 5, 23, 42, 126789, 9223372036854775808, 456734987324983264987362495809587095720978 |
nonPositiveInteger | an integer less than or equal to zero | 0, -1, -2, -3, -4, -5, ... |
negativeInteger | an integer strictly less than zero | -1, -2, -3, -4, -5, ... |
long | an eight-byte two's complement integer such as Java's long type | -9223372036854775808, -12678967543233, -1, 9223372036854775807 |
int | an integer that can be represented as a four-byte, two's complement number such as Java's int type | -2147483648, -1, 0, 1, 5, 23, 42, 2147483647 |
short | an integer that can be represented as a two-byte, two's complement number such as Java's short type | -32768, -1, 0, 1, 5, 23, 42, 32767 |
byte | an integer that can be represented as a one-byte, two's complement number such as Java's byte type | -128, -1, 0, 1, 5, 23, 42, 127 |
nonNegativeInteger | an integer greater than or equal to zero | 0, 1, 2, 3, 4, 5, ... |
unsignedLong | an eight-byte unsigned integer | 0, 1, 2, 3, 4, 5, ...18446744073709551614, 18446744073709551615 |
unsignedInt | a four-byte unsigned integer | 0, 1, 2, 3, 4, 5, ...4294967294, 4294967295 |
unsignedShort | a two-byte unsigned integer | 0, 1, 2, 3, 4, 5, ...65534, 65535 |
unsignedByte | a one-byte unsigned integer | 0, 1, 2, 3, 4, 5, ...254, 255 |
positiveInteger | an integer strictly greater than zero | 1, 2, 3, 4, 5, 6, ... |
XML Schema Built-In Simple Types | ||
---|---|---|
Name | Type | Examples |
timeInstant | a particular moment in Coordinated Universal Time; up to an arbitrarily small fraction of a second | 1999-05-31T13:20:00.000-05:00 |
gYearMonth | a given month in a given year | 2000-10 |
gYear | a given year | 2000 |
recurringDate | a date in no particular year, or rather in every year | --10-31 |
recurringDay | a day in no particular month, or rather in every month | ----31 |
timeDuration | a length of time, without fixed endpoints, to an arbitrary fraction of a second | P2000Y10M31DT09H32M7.4312S |
date | a specific day in history | 2000-10-31 |
time | a specific time of day, that recurs every day | 14:30:00.000, 09:30:00.000-05:00 |
XML Schema Built-In Simple Types | ||
---|---|---|
Name | Type | Examples |
ID | XML 1.0 ID attribute type | any XML name that's unique among ID type attributes |
IDREF | XML 1.0 IDREF attribute type | any XML name that's used as an ID type attribute elsewhere in the document |
ENTITY | XML 1.0 ENTITY attribute type | any XML name that's declared as an unparsed entity in the DTD |
NOTATION | XML 1.0 NOTATION attribute type | any XML name that's declared as a notation name in the DTD |
language | valid values for xml:lang as defined in XML 1.0 | en-GB, en-US, fr |
IDREFS | XML 1.0 IDREFS attribute type | a white space separated list of IDREF names |
ENTITIES | XML 1.0 ENTITIES attribute type | a white space separated list of ENTITY names |
NMTOKEN | XML 1.0 NMTOKEN attribute type | 12, are, you, ready |
NMTOKENS | XML 1.0 NMTOKENS attribute type | a white space separated list of name tokens |
Name | An XML 1.0 Name | set, title, rdf, math, math123, href |
QName | a prefixed name | song:title |
NCName | a local name without any colons | title |
XML Schema Built-In Simple Types | ||
---|---|---|
Name | Type | Examples |
string | Parsed Character Data; #PCDATA | Hot Cop |
normalizedString | A string that does not contain any tabs, carriage returns, or linefeeds | PIC1, PIC2, PIC3, cow_movie, MonaLisa, Hello World, Warhol, red green |
token | A string with no leading or trailing white space, no tabs, no linefeeds, and not more than one consecutive space | p1 p2, ss123 45 6789, _92, red, green, NT Decl, seventeen, p1, p2, 123 45 6789, ^*&^*&_92, red green blue, NT-Decl, seventeen; Mary had a little lamb, The love of money is the root of all Evil. |
boolean | C++'s bool type | true, false, 1, 0 |
anyURI | relative or absolute URI | http://www.w3.org/TR/2000/WD-xmlschema-2-20000407/#timeDuration, /javafaq/reports/JCE1.2.1.html |
hexBinary | Arbitrary binary data encoded in hexadecimal form | A4E345EC54CC8D52198000FFEA6C |
base64Binary | Arbitrary binary data encoded in Base64 | 6jKpNnmkkWeArsn5Oeeg2njcz+nXdk0f9kZI892ddlR8Lg1aMhPeFTYuoq3I6neFlb BjWzuktNZKiXYBfKsSTB8U09dTiJo2ir3HJuY7eW/p89osKMfixPQsp9vQMgzph6Qa lY7j4MB7y5ROJYsTr1/fFwmj/yhkHwpbpzed1LE= |
You can derive new simple types from existing types by restricting the type to a subset of its normal values
An xsd:simpleType element defines the restricted type
The name attribute of xsd:simpleType assigns a name to the new type, by which it can be referred to in xsd:element type attributes.
An xsd:restriction child element specifies what type is being restricted via its base attribute.
Facet children of xsd:restriction specify the constraints on the type.
For example, this xsd:simpleType element defines a phonoYear as any year from 1877 (the year Edison invented the phonograph) on:
<xsd:simpleType name="phonoYear">
<xsd:restriction base="xsd:gYear">
<xsd:minInclusive value="1877"/>
</xsd:restriction>
</xsd:simpleType>
Then you declare the year element like this:
<xsd:element name="YEAR" type="phonoYear"/>
Facets include:
length
minLength
maxLength
pattern
enumeration
whiteSpace
maxInclusive
maxExclusive
minInclusive
minExclusive
totalDigits
fractionDigits
period
duration
Not all facets apply to all types.
The number of characters allowed in a string
Must be a non-negative integer
Applies to string, normalizedString, token, hexBinary, base64Binary, QName, NCName, ID, IDREF, IDREFS, language, anyURI, ENTITY, ENTITIES, NOTATION, NOTATIONS, NMTOKEN and NMTOKENS type items
For example, to say that all names and titles must contain between 1 and 255 characters:
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="SONG" type="songType"/> <xsd:simpleType name="Str255"> <xsd:restriction base="xsd:string"> <xsd:minLength value="1"/> <xsd:maxLength value="255"/> </xsd:restriction> </xsd:simpleType> <xsd:complexType name="songType"> <xsd:sequence> <xsd:element name="TITLE" type="Str255"/> <xsd:element name="COMPOSER" type="Str255" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="Str255" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="Str255" minOccurs="0"/> <xsd:element name="LENGTH" type="xsd:timeDuration"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="Str255" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:schema>
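The exact length facet follows the same pattern as minLength and maxLength. As a minimal sketch, assuming a hypothetical COUNTRY element and CountryCode type that are not part of the song examples, a two-character code could be declared like this:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: the value must be exactly two characters long -->
  <xsd:simpleType name="CountryCode">
    <xsd:restriction base="xsd:string">
      <xsd:length value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical element that uses the restricted type -->
  <xsd:element name="COUNTRY" type="CountryCode"/>

</xsd:schema>
```

Under this sketch `<COUNTRY>US</COUNTRY>` would be valid, while `<COUNTRY>USA</COUNTRY>` and an empty `<COUNTRY/>` would not.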
Determines the minimum and maximum allowed values
Applies to ordered simple types including byte, unsignedByte, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, unsignedShort, number, float, double, time, timeInstant, timePeriod, timeDuration, date, gMonth, gYear, recurringDay, and recurringDate.
For example, to say that the year must be between 1877 and 2100:
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="SONG" type="songType"/> <xsd:simpleType name="phonoYear"> <xsd:restriction base="xsd:gYear"> <xsd:minInclusive value="1877"/> <xsd:maxInclusive value="2100"/> </xsd:restriction> </xsd:simpleType> <xsd:complexType name="songType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/> <xsd:element name="LENGTH" type="xsd:timeDuration"/> <xsd:element name="YEAR" type="phonoYear"/> <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:schema>
Determines what the application should do with white space found in the content
Three possible values:
preserve: The white space in the input document is left unchanged
replace: Each tab, carriage return and linefeed is replaced with a single space.
collapse: Each tab, carriage return and linefeed is replaced with a single space. Furthermore, after this replacement is performed, all runs of multiple spaces are condensed to a single space, and leading and trailing white space is deleted.
No effect on validation
Applies to string, normalizedString and token type items
Per XML 1.0, white space in attributes is normalized regardless of the schema
For example, to say that white space should be collapsed in all names and titles:
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="SONG" type="songType"/> <xsd:simpleType name="NormalizedString"> <xsd:restriction base="xsd:string"> <xsd:whiteSpace value="collapse"/> </xsd:restriction> </xsd:simpleType> <xsd:complexType name="songType"> <xsd:sequence> <xsd:element name="TITLE" type="NormalizedString"/> <xsd:element name="COMPOSER" type="NormalizedString" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="NormalizedString" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="NormalizedString" minOccurs="0"/> <xsd:element name="LENGTH" type="xsd:timeDuration"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="NormalizedString" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:schema>
The totalDigits facet specifies the maximum number of decimal digits in a number as a positive integer
The fractionDigits facet specifies the maximum number of decimal digits to the right of the decimal point as a non-negative integer
Applies to all number types except base64Binary and hexBinary including byte, unsignedByte, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, unsignedShort and number
You can specify at most two fractional digits or at most seven decimal digits, but not at least two fractional digits or exactly seven decimal digits
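A minimal sketch of both digit facets together, assuming a hypothetical Amount type and assuming the arbitrary-precision decimal type is spelled xsd:decimal, as it is in the final Recommendation (earlier drafts and some tools may use a different name):

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: at most seven digits in total,
       at most two of them after the decimal point -->
  <xsd:simpleType name="Amount">
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="7"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical element that uses the restricted type -->
  <xsd:element name="AMOUNT" type="Amount"/>

</xsd:schema>
```

With this sketch 1.35 and 12345.67 would be accepted, while 1.355 (three fractional digits) and 123456.78 (eight digits in total) would be rejected. Both facets are upper bounds, so 1 and 1.3 are accepted as well.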
The period facet defines the frequency of recurrence (after what duration it recurs) for time types. Its value is a time duration.
The duration facet defines the length of the duration for time types. Its value is also a time duration.
Applies to time types: time, timeInstant, timePeriod, date, month, year, recurringDay, and recurringDate.
For example, you might use the period and duration facets to define a twoWeek type with a fourteen day period and a fourteen day duration for paychecks.
The enumeration facet lists all allowed values
Applies to all simple types except boolean
For example, to say that the publisher must be one of the oligopoly that controls 90% of U.S. music (Warner-Elektra-Atlantic, Universal Music Group, Sony Music Entertainment, Inc., Capitol Records, Inc., BMG Music):
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="SONG" type="songType"/> <xsd:simpleType name="oligopolyMember"> <xsd:restriction base="xsd:string"> <xsd:enumeration value="Warner-Elektra-Atlantic"/> <xsd:enumeration value="Universal Music Group"/> <xsd:enumeration value="Sony Music Entertainment, Inc."/> <xsd:enumeration value="Capitol Records, Inc."/> <xsd:enumeration value="BMG Music"/> </xsd:restriction> </xsd:simpleType> <xsd:complexType name="songType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="oligopolyMember" minOccurs="0"/> <xsd:element name="LENGTH" type="xsd:timeDuration"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:schema>
<?xml version="1.0"?> <SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="priced_song.xsd"> <TITLE>Hot Cop</TITLE> <COMPOSER>Jacques Morali</COMPOSER> <COMPOSER>Henri Belolo</COMPOSER> <COMPOSER>Victor Willis</COMPOSER> <PRODUCER>Jacques Morali</PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> <PRICE>$1.35</PRICE> </SONG>
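Suppose you want a money type to specify that the PRICE element content must look like $1.35 or ¥11000
Derive this from the xsd:string type by restriction
Use the pattern facet to specify a regular expression instances must match
More or less Perl-like with some Unicode extensions
The money regular expression:
\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?
\p{Sc}: Any Unicode currency indicator; e.g. $, ¥, £
\p{Nd}: A Unicode decimal digit character
\p{Nd}+: One or more Unicode decimal digit characters
\.: The period character
(\.\p{Nd}\p{Nd}): A period followed by two decimal digits
(\.\p{Nd}\p{Nd})?: Zero or one strings of the form .35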
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:simpleType name="money">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?"/>
      <!-- Regular Expression:
           \p{Sc}              Any Unicode currency indicator; e.g. $, ¥, £, &#xA4;, etc.
           \p{Nd}              A Unicode decimal digit character
           \p{Nd}+             One or more Unicode decimal digit characters
           \.                  The period character
           (\.\p{Nd}\p{Nd})    A period followed by two decimal digits
           (\.\p{Nd}\p{Nd})?   Zero or one strings of the form .35

           This works for any decimalized currency.
      -->
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRICE" type="money"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
Elements that contain child elements or have attributes or both
Defined by an xsd:complexType element
<?xml version="1.0"?> <SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="attribute_song.xsd"> <TITLE>Hot Cop</TITLE> <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/> <COMPOSER>Jacques Morali</COMPOSER> <COMPOSER>Henri Belolo</COMPOSER> <COMPOSER>Victor Willis</COMPOSER> <PRODUCER>Jacques Morali</PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> </SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <!-- An empty element -->
  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
<?xml version="1.0"?> <SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="nested_song.xsd"> <TITLE>Hot Cop</TITLE> <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/> <COMPOSER> <NAME> <GIVEN>Jacques</GIVEN> <FAMILY>Morali</FAMILY> </NAME> </COMPOSER> <COMPOSER> <NAME> <GIVEN>Henri</GIVEN> <FAMILY>Belolo</FAMILY> </NAME> </COMPOSER> <COMPOSER> <NAME> <GIVEN>Victor</GIVEN> <FAMILY>Willis</FAMILY> </NAME> </COMPOSER> <PRODUCER> <NAME> <GIVEN>Jacques</GIVEN> <FAMILY>Morali</FAMILY> </NAME> </PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> </SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="SONG" type="SongType"/> <xsd:complexType name="ComposerType"> <xsd:sequence> <xsd:element name="NAME"> <xsd:complexType> <xsd:sequence> <xsd:element name="GIVEN" type="xsd:string"/> <xsd:element name="FAMILY" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <xsd:complexType name="ProducerType"> <xsd:sequence> <xsd:element name="NAME"> <xsd:complexType> <xsd:sequence> <xsd:element name="GIVEN" type="xsd:string"/> <xsd:element name="FAMILY" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <xsd:complexType name="PhotoType"> <xsd:complexContent> <xsd:restriction base="xsd:anyType"> <xsd:attribute name="ALT" type="xsd:string"/> <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/> <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/> </xsd:restriction> </xsd:complexContent> </xsd:complexType> <xsd:complexType name="SongType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/> <xsd:element name="COMPOSER" type="ComposerType" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="ProducerType" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:schema>
PRODUCER and COMPOSER are really the same type.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="SONG" type="SongType"/> <xsd:complexType name="PersonType"> <xsd:sequence> <xsd:element name="NAME"> <xsd:complexType> <xsd:sequence> <xsd:element name="GIVEN" type="xsd:string"/> <xsd:element name="FAMILY" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <xsd:complexType name="SongType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/> <xsd:element name="COMPOSER" type="PersonType" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="PersonType" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> <xsd:complexType name="PhotoType"> <xsd:complexContent> <xsd:restriction base="xsd:anyType"> <xsd:attribute name="ALT" type="xsd:string"/> <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/> <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/> </xsd:restriction> </xsd:complexContent> </xsd:complexType> </xsd:schema>
Schemas let you enforce order and appearance of elements in mixed content.
<?xml version="1.0"?> <SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="mixed_song.xsd"> <TITLE>Hot Cop</TITLE> <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/> <COMPOSER> <NAME>Mr. <GIVEN>Jacques</GIVEN> <FAMILY>Morali</FAMILY> Esq.</NAME> </COMPOSER> <COMPOSER> <NAME>Mr. <GIVEN>Henri</GIVEN> L. <FAMILY>Belolo</FAMILY>, M.D.</NAME> </COMPOSER> <COMPOSER> <NAME>Mr. <GIVEN>Victor</GIVEN> C. <FAMILY>Willis</FAMILY></NAME> </COMPOSER> <PRODUCER> <NAME>Mr. <GIVEN>Jacques</GIVEN> S. <FAMILY>Morali</FAMILY></NAME> </PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> </SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="SONG" type="SongType"/> <xsd:complexType name="PersonType"> <xsd:sequence> <xsd:element name="NAME"> <xsd:complexType mixed="true"> <xsd:sequence> <xsd:element name="GIVEN" type="xsd:string"/> <xsd:element name="FAMILY" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <xsd:complexType name="SongType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/> <xsd:element name="COMPOSER" type="PersonType" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="PersonType" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> <xsd:complexType name="PhotoType"> <xsd:complexContent> <xsd:restriction base="xsd:anyType"> <xsd:attribute name="ALT" type="xsd:string"/> <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/> <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/> </xsd:restriction> </xsd:complexContent> </xsd:complexType> </xsd:schema>
<?xml version="1.0"?> <SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="unordered_song.xsd"> <TITLE>Hot Cop</TITLE> <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/> <COMPOSER> <NAME><FAMILY>Morali</FAMILY> <GIVEN>Jacques</GIVEN></NAME> </COMPOSER> <COMPOSER> <NAME><GIVEN>Henri</GIVEN> <FAMILY>Belolo</FAMILY></NAME> </COMPOSER> <COMPOSER> <NAME><FAMILY>Willis</FAMILY> <GIVEN>Victor</GIVEN></NAME> </COMPOSER> <PRODUCER> <NAME><GIVEN>Jacques</GIVEN> <FAMILY>Morali</FAMILY></NAME> </PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> </SONG>
Each element in the xsd:all group must occur zero or once; that is, minOccurs and maxOccurs must each be 0 or 1
The xsd:all group must be the top level element of its type
The xsd:all group may contain only individual element declarations; no choices or sequences
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="NAME">
        <xsd:complexType>
          <xsd:all>
            <xsd:element name="GIVEN" type="xsd:string"/>
            <xsd:element name="FAMILY" type="xsd:string"/>
          </xsd:all>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="PersonType" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="PersonType" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- An empty element -->
  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
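Because each member of an xsd:all group may occur zero times or once, the group can also hold optional children. A minimal sketch, assuming a hypothetical MIDDLE element that does not appear in the song documents:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="NAME">
    <xsd:complexType>
      <!-- GIVEN and FAMILY are required; MIDDLE (hypothetical) is optional.
           The children may appear in any order. -->
      <xsd:all>
        <xsd:element name="GIVEN" type="xsd:string"/>
        <xsd:element name="MIDDLE" type="xsd:string" minOccurs="0"/>
        <xsd:element name="FAMILY" type="xsd:string"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Under this sketch a NAME holding FAMILY then GIVEN is valid, as is one holding GIVEN, MIDDLE and FAMILY in any order; a NAME with two GIVEN children is not.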
xsd:choice requires exactly one of a group of specified elements to appear
The choice can have minOccurs and maxOccurs attributes that adjust this from zero to any given number.
xsd:sequence requires each child element it specifies to appear in the specified order
The sequence can have minOccurs and maxOccurs attributes that repeat each sequence zero to any given number of times.
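The two compositors can be nested. The following is a minimal sketch only; the CREDIT, BAND and ROLE elements are hypothetical names chosen for illustration and do not come from the song schemas:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="CREDIT">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Exactly one of ARTIST or BAND must appear first -->
        <xsd:choice>
          <xsd:element name="ARTIST" type="xsd:string"/>
          <xsd:element name="BAND" type="xsd:string"/>
        </xsd:choice>
        <!-- The GIVEN, FAMILY, ROLE triple may then repeat
             zero or more times, always in this order -->
        <xsd:sequence minOccurs="0" maxOccurs="unbounded">
          <xsd:element name="GIVEN" type="xsd:string"/>
          <xsd:element name="FAMILY" type="xsd:string"/>
          <xsd:element name="ROLE" type="xsd:string"/>
        </xsd:sequence>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Putting minOccurs and maxOccurs on the inner xsd:sequence repeats the whole group, which is different from putting them on the individual element declarations: here a GIVEN can never appear without its FAMILY and ROLE.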
<?xml version="1.0"?>
<GREETING xmlns="http://ibiblio.org/xml/schemas/greeting/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://ibiblio.org/xml/schemas/greeting/ greeting_defaultNS.xsd">
Hello XML!
</GREETING>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://ibiblio.org/xml/schemas/greeting/">

  <xsd:element name="GREETING" type="xsd:string"/>

</xsd:schema>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<SONG xmlns="http://ibiblio.org/xml/namespace/song"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://ibiblio.org/xml/namespace/song namespace_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://ibiblio.org/xml/namespace/song"
            targetNamespace="http://ibiblio.org/xml/namespace/song"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<SONG xmlns="http://ibiblio.org/xml/namespace/song"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://ibiblio.org/xml/namespace/song xlink_song.xsd
                          http://www.w3.org/1999/xlink xlink.xsd">
  <TITLE>Hot Cop</TITLE>
  <PHOTO xlink:type="simple" xlink:href="hotcop.jpg"
         xlink:actuate="onLoad" xlink:show="embed"
         ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://www.w3.org/1999/xlink"
            targetNamespace="http://www.w3.org/1999/xlink"
            attributeFormDefault="unqualified">
  <xsd:attribute name="type" type="xsd:string" fixed="simple"/>
  <xsd:attribute name="href" type="xsd:anyURI"/>
  <xsd:attribute name="actuate" type="xsd:string" fixed="onLoad"/>
  <xsd:attribute name="show" type="xsd:string" fixed="embed"/>
</xsd:schema>
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://ibiblio.org/xml/namespace/song"
            xmlns:xlink="http://www.w3.org/1999/xlink"
            targetNamespace="http://ibiblio.org/xml/namespace/song"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified">

  <xsd:import namespace="http://www.w3.org/1999/xlink" schemaLocation="xlink.xsd"/>

  <xsd:element name="SONG" type="SongType"/>

  <xsd:complexType name="PhotoType">
    <xsd:attribute name="WIDTH" type="xsd:positiveInteger" use="required"/>
    <xsd:attribute name="HEIGHT" type="xsd:positiveInteger" use="required"/>
    <xsd:attribute name="ALT" type="xsd:string" use="required"/>
    <xsd:attribute ref="xlink:type"/>
    <xsd:attribute ref="xlink:href" use="required"/>
    <xsd:attribute ref="xlink:actuate"/>
    <xsd:attribute ref="xlink:show"/>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:duration" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
The top-level xsd:annotation element describes the schema
Its xsd:documentation child element describes the schema for human readers
Its xsd:appinfo child element describes the schema for computer programs; e.g. stylesheet instructions
<xsd:annotation>
<xsd:documentation>
Song schema for XML and Java Example at XMLOne London
Copyright 2001 Elliotte Rusty Harold.
</xsd:documentation>
</xsd:annotation>
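For comparison, an annotation aimed at software rather than people might be sketched as follows. The display:hint element, its namespace URI, and the song.css file name are all hypothetical; the schema specification does not define what goes inside xsd:appinfo, so each application chooses its own vocabulary.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:annotation>
    <xsd:documentation>
      Human-readable description of the schema goes here.
    </xsd:documentation>
    <xsd:appinfo>
      <!-- Hypothetical application-specific instruction;
           schema validators do not interpret this content -->
      <display:hint xmlns:display="http://example.com/ns/display"
                    stylesheet="song.css"/>
    </xsd:appinfo>
  </xsd:annotation>

  <xsd:element name="SONG" type="xsd:string"/>

</xsd:schema>
```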
Cannot declare entities
Parent models
Extra-document validation
Rick Jelliffe's Schematron
Murata Makoto's RELAX
James Clark's TREX
Rick Jelliffe's Hook
DTDs
According to Schematron inventor Rick Jelliffe:
The Schematron differs in basic concept from other schema languages in that it [is] not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages.
XSLT/XPath based
W3C Schemas are conservative: everything not permitted is forbidden.
Schematron is liberal: everything not forbidden is permitted.
No data typing; validation only
Handles unordered structures very well
Handles descendant constraints very well
Almost self-documenting
A schema contains a title and a pattern
Each pattern contains rules
Each rule contains assert and report elements and has a context attribute
Each assert and report element has a test attribute containing an XPath expression which returns a boolean.
The contents of each assert element are printed if the assertion test fails
The contents of each report element are printed if the report test succeeds
<?xml version="1.0"?>
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>A Schematron Schema for Songs</title>
  <pattern>
    <rule context="SONG">
      <assert test="TITLE">
        A SONG must contain an initial TITLE element.
      </assert>
      <!-- True only when the first child element of SONG is a TITLE -->
      <assert test="*[1][self::TITLE]">
        The TITLE element must be the initial element of the SONG element.
      </assert>
      <assert test="COMPOSER">
        A SONG must contain at least one COMPOSER element.
      </assert>
      <assert test="ARTIST">
        A SONG must contain at least one ARTIST element.
      </assert>
    </rule>
  </pattern>
</schema>
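The report element and descendant constraints can be sketched in the same style. The specific rules below are assumptions chosen for illustration and are not taken from the song examples: the two report elements print their messages when their tests succeed, and the assert uses the XPath descendant axis to forbid a SONG nested at any depth inside another SONG.

```xml
<?xml version="1.0"?>
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>A report-based sketch for songs</title>
  <pattern>
    <rule context="SONG">
      <!-- Printed when the test is true -->
      <report test="count(COMPOSER) &gt; 1">
        This SONG has more than one COMPOSER.
      </report>
      <report test="not(PUBLISHER)">
        This SONG has no PUBLISHER element.
      </report>
      <!-- Printed when the test is false: a descendant constraint -->
      <assert test="not(.//SONG)">
        A SONG must not contain another SONG at any depth.
      </assert>
    </rule>
  </pattern>
</schema>
```

Because nothing here constrains the order of the children, the same rules apply however the child elements of SONG are arranged.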
Murata Makoto
JIS standard/Proposed ISO standard
Uses W3C Schema data types
No derived types
Mostly DTD-like structures
<?xml version="1.0"?>
<module moduleVersion="1.2"
        relaxCoreVersion="1.0"
        targetNamespace=""
        xmlns="http://www.xml.gr.jp/xmlns/relaxCore">

  <!-- Elements allowed as document roots -->
  <interface>
    <export label="SONG"/>
  </interface>

  <elementRule role="SONG">
    <sequence>
      <ref label="TITLE"/>
      <ref label="COMPOSER" occurs="*"/>
      <ref label="PRODUCER" occurs="*"/>
      <ref label="PUBLISHER" occurs="?"/>
      <ref label="YEAR"/>
      <ref label="ARTIST" occurs="+"/>
      <ref label="PRICE" occurs="?"/>
    </sequence>
  </elementRule>

  <elementRule role="TITLE" type="string"/>
  <elementRule role="COMPOSER" type="string"/>
  <elementRule role="PRODUCER" type="string"/>
  <elementRule role="PUBLISHER" type="string"/>
  <elementRule role="YEAR" type="year"/>
  <elementRule role="ARTIST" type="string"/>
  <elementRule role="PRICE" type="string"/>

</module>
Tree Regular Expressions for XML
Invented by James Clark of XSLT fame
Proposed OASIS standard
Uses W3C Schema data types
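A rough idea of what a TREX pattern for the song document could look like is sketched below. This is an assumption-laden sketch rather than a tested schema: it is written from the element names and namespace URI used in James Clark's TREX tutorial, and it uses anyString for every element instead of W3C Schema data types to keep the example small.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a TREX pattern mirrors the shape of the document it matches -->
<element name="SONG" xmlns="http://www.thaiopensource.com/trex">
  <element name="TITLE"><anyString/></element>
  <oneOrMore>
    <element name="COMPOSER"><anyString/></element>
  </oneOrMore>
  <zeroOrMore>
    <element name="PRODUCER"><anyString/></element>
  </zeroOrMore>
  <optional>
    <element name="PUBLISHER"><anyString/></element>
  </optional>
  <element name="YEAR"><anyString/></element>
  <oneOrMore>
    <element name="ARTIST"><anyString/></element>
  </oneOrMore>
</element>
```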
Rick Jelliffe's Hook: A One-Element Language for Validation of XML Documents based on Partial Order
XSLT/XPath based
No data typing; validation only
A Hook schema for XHTML (adapted from Rick Jelliffe):
<hook:order targetNamespace="http://www.w3.org/1999/xhtml" >
html head [ title; meta. link. base. ] body
[ a br. blockquote caption; div dl; h1; h2; h3; h4; h5; h6;
img. ol; p; pre; table; ul; ]
[ tr; dt; dd; li; ] td
[ a br. blockquote div form img. ol; ul; li; ]
[ input; label; select; textarea; ] [ option. ]
[ abbr acronym address cite code dfn em kbd q samp span strong var object; ]
param
</hook:order>
XML Bible, second edition, Chapter 24
Elliotte Rusty Harold
Hungry Minds, 2001
ISBN 0-7645-4760-7
This presentation: http://www.ibiblio.org/xml/slides/xmlonelondon2001/schemas
W3C Schema Primer: http://www.w3.org/TR/xmlschema-0/
XML Schema Part 1: Structures: http://www.w3.org/TR/xmlschema-1/
XML Schema Part 2: Datatypes: http://www.w3.org/TR/xmlschema-2/