Mastering XML Schemas
Elliotte Rusty Harold
XML & Web Services 2002 London
Tuesday, March 12, 2002
elharo@metalab.unc.edu
http://www.cafeconleche.org/
Generically, a document that describes what a correct document may contain
Specifically, a W3C Recommendation for an XML-document syntax that describes the permissible contents of XML documents
Created by W3C XML Schema Working Group based on many different submissions
No known patent, trademark, or other IP restrictions
XML Schema Part 1: Structures: http://www.w3.org/TR/xmlschema-1/
XML Schema Part 2: Datatypes: http://www.w3.org/TR/xmlschema-2/
Unusual, non-XML-like syntax
No data typing, especially for element content
Limited extensibility
Only marginally compatible with namespaces
Cannot use mixed content and still enforce the order and number of child elements
Cannot enforce the number of child elements without also enforcing their order (i.e., no & operator from SGML)
Last call working draft from April 7, 2000
Candidate Recommendation October 24, 2000
Proposed Recommendation March 16, 2001
2nd Proposed Recommendation March 30, 2001
Final Recommendation May 2, 2001
<?xml version="1.0"?>
<GREETING>
Hello XML!
</GREETING>
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="GREETING" type="xsd:string"/>

</xsd:schema>
xsi:noNamespaceSchemaLocation attribute on the root element
The xsi prefix is mapped to the http://www.w3.org/2001/XMLSchema-instance URI
For example,
<?xml version="1.0"?>
<GREETING xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="greeting.xsd">
Hello XML!
</GREETING>
Other means of connecting schemas to documents are allowed
D:\schemas\examples>java sax.SAX2Count -v greeting2.xml
greeting2.xml: 701 ms (1 elems, 1 attrs, 0 spaces, 12 chars)
<?xml version="1.0"?>
<GREETING xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="greeting.xsd">
  <P>Hello XML!</P>
</GREETING>
D:\speaking\XML & Web Services 2002 London\schemas\examples>java sax.SAX2Count -v greeting3.xml
[Error] greeting3.xml:4:6: Element type "P" must be declared.
[Error] greeting3.xml:5:13: Datatype error: In element 'GREETING' : Can not have element children within a simple type content.
greeting3.xml: 781 ms (2 elems, 1 attrs, 0 spaces, 14 chars)
Xerces-J 2.0: http://xml.apache.org/xerces2-j
Xerces-J 1.4.4: http://xml.apache.org/xerces-j
Xerces-C++ 1.7.0: http://xml.apache.org/xerces-c
Oracle XML Parser for Java: http://technet.oracle.com/tech/xml/xdk_java/
Oracle XML Parser for C: http://technet.oracle.com/tech/xml/xdk_c/
Oracle XML Parser for C++: http://technet.oracle.com/tech/xml/xdk_cpp/
Complex types can have child elements and attributes
Simple types cannot have children or attributes
xsd:element declares an element and assigns it a type
xsd:attribute declares an attribute and assigns it a type
xsd:complexType defines a new complex type
xsd:simpleType defines a new simple type
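The four constructs above can be seen together in one small schema. This is only an illustrative sketch: the TRACK element, the NUMBER attribute, and the trackType and trackNumber type names are hypothetical and do not come from the talk's examples.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- xsd:element declares an element and assigns it a type -->
  <xsd:element name="TRACK" type="trackType"/>

  <!-- xsd:simpleType defines a new simple type: text only,
       no child elements and no attributes (hypothetical name) -->
  <xsd:simpleType name="trackNumber">
    <xsd:restriction base="xsd:positiveInteger">
      <xsd:maxInclusive value="99"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- xsd:complexType defines a new complex type, which may have
       child elements and attributes (hypothetical name) -->
  <xsd:complexType name="trackType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
    </xsd:sequence>
    <!-- xsd:attribute declares an attribute and assigns it a type -->
    <xsd:attribute name="NUMBER" type="trackNumber"/>
  </xsd:complexType>

</xsd:schema>
```

Under these assumptions, an instance such as <TRACK NUMBER="3"><TITLE>Hot Cop</TITLE></TRACK> should be accepted, while NUMBER="0" or NUMBER="100" should be rejected.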
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="simple_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string" minOccurs="1" maxOccurs="1"/>
      <xsd:element name="COMPOSER" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0" maxOccurs="1"/>
      <xsd:element name="LENGTH" type="xsd:duration" minOccurs="1" maxOccurs="1"/>
      <xsd:element name="YEAR" type="xsd:gYear" minOccurs="1" maxOccurs="1"/>
      <xsd:element name="ARTIST" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
D:\speaking\XML & Web Services 2002 London\schemas\examples>java sax.SAX2Count -v original_hotcop.xml
[Error] original_hotcop.xml:10:25: Datatype error: In element 'LENGTH' :
Value '6:20' is not legal value for current datatype. null.
original_hotcop.xml: 1583 ms (10 elems, 2 attrs, 30 spaces, 98 chars)
Here's the problem:
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="simple_song.xsd">
<TITLE>Hot Cop</TITLE>
<COMPOSER>Jacques Morali</COMPOSER>
<COMPOSER>Henri Belolo</COMPOSER>
<COMPOSER>Victor Willis</COMPOSER>
<PRODUCER>Jacques Morali</PRODUCER>
<PUBLISHER>PolyGram Records</PUBLISHER>
<LENGTH>6:20</LENGTH>
<YEAR>1978</YEAR>
<ARTIST>Village People</ARTIST>
</SONG>
The LENGTH element is not in the schema duration format, which is ISO 8601:
"PnYnMnDTnHnMnS, where nY represents the number of years, nM the number of months, nD the number of days, 'T' is the date/time separator, nH the number of hours, nM the number of minutes and nS the number of seconds. The number of seconds can include decimal digits to arbitrary precision. An optional preceding minus sign ('-') is allowed, to indicate a negative duration."
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="simple_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>P0YT6M20S</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
Default value of minOccurs is 1
Default value of maxOccurs is 1
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:duration"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="TITLE" type="xsd:string"/>
        <xsd:element name="COMPOSER" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0" maxOccurs="1"/>
        <xsd:element name="LENGTH" type="xsd:duration"/>
        <xsd:element name="YEAR" type="xsd:string"/>
        <xsd:element name="ARTIST" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
Consider this document:
<foo>
<value>45.67</value>
</foo>
What is the type of value?
A longitude or latitude
A decimal monetary type, as in COBOL
A fixed point number
An arbitrary precision decimal number such as is represented by the java.math.BigDecimal class
An IEEE754 double
A Java double
An IEEE 754 float
A VAX Fortran REAL
An imprecisely known decimal number with 4 significant digits that's plus or minus 1 in the last place.
An imprecisely known decimal number with 4 significant digits that's plus or minus 5 in the last place.
Build 67 of version 45 of Microsoft Word
A regular expression matching all strings that begin with the two characters '4' and '5', followed by a single character, followed by the two characters '6' and '7'.
A string of characters a monkey typed on a keyboard
Other interpretations are doubtless possible, and even make sense in particular contexts.
There's no guarantee that the string 45.67
in fact represents any particular type.
A schema assigns an identifiable type to each element
Schema validation produces a Post-Schema-Validation Infoset, PSVI for short
Schema aware applications using schema aware parsers and APIs can make use of the types of elements
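As a sketch of how a schema settles the question, the foo document above could be paired with a schema like the following. The choice of xsd:decimal is an assumption made here for illustration; the talk does not say which type the author of foo had in mind.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="foo">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Assumption: value is meant as an exact decimal number.
             Declaring it as xsd:double or xsd:string instead would
             give the same characters 45.67 a different type. -->
        <xsd:element name="value" type="xsd:decimal"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

With this declaration, a schema aware parser can report value as a decimal, and content such as <value>forty-five</value> should fail validation.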
Boolean
String
URIs
Numeric types
Time types
XML types
No money types. However, these can be derived
XML Schema Built-In Numeric Simple Types

Name | Type | Examples |
---|---|---|
float | IEEE 754 32-bit floating point number | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN |
double | IEEE 754 64-bit floating point number | -INF, 1.401E-90, -1E4, -0, 0, 12.78E-2, 12, INF, NaN, 3.4E42 |
decimal | arbitrary precision decimal number (no exponent notation) | -3.1415292, 0, 7.8, 90200.76 |
integer | an arbitrarily large or small integer | -500000000000000000000000, -9223372036854775809, -126789, -1, 0, 1, 5, 23, 42, 126789, 9223372036854775808, 456734987324983264987362495809587095720978 |
nonPositiveInteger | an integer less than or equal to zero | 0, -1, -2, -3, -4, -5, ... |
negativeInteger | an integer strictly less than zero | -1, -2, -3, -4, -5, ... |
long | an eight-byte two's complement integer such as Java's long type | -9223372036854775808, -12678967543233, -1, 9223372036854775807 |
int | an integer that can be represented as a four-byte, two's complement number such as Java's int type | -2147483648, -1, 0, 1, 5, 23, 42, 2147483647 |
short | an integer that can be represented as a two-byte, two's complement number such as Java's short type | -32768, -1, 0, 1, 5, 23, 42, 32767 |
byte | an integer that can be represented as a one-byte, two's complement number such as Java's byte type | -128, -1, 0, 1, 5, 23, 42, 127 |
nonNegativeInteger | an integer greater than or equal to zero | 0, 1, 2, 3, 4, 5, ... |
unsignedLong | an eight-byte unsigned integer | 0, 1, 2, 3, 4, 5, ... 18446744073709551614, 18446744073709551615 |
unsignedInt | a four-byte unsigned integer | 0, 1, 2, 3, 4, 5, ... 4294967294, 4294967295 |
unsignedShort | a two-byte unsigned integer | 0, 1, 2, 3, 4, 5, ... 65534, 65535 |
unsignedByte | a one-byte unsigned integer | 0, 1, 2, 3, 4, 5, ... 254, 255 |
positiveInteger | an integer strictly greater than zero | 1, 2, 3, 4, 5, 6, ... |
XML Schema Built-In Time Simple Types

Name | Type | Examples |
---|---|---|
dateTime | a particular moment in Coordinated Universal Time, up to an arbitrarily small fraction of a second | 1999-05-31T13:20:00.000-05:00 |
gYearMonth | a given month in a given year | 2000-10 |
gYear | a given year | 2000 |
gMonthDay | a date in no particular year, or rather in every year | --10-31 |
gDay | a day in no particular month, or rather in every month | ---31 |
duration | a length of time, without fixed endpoints, to an arbitrary fraction of a second | P2000Y10M31DT09H32M7.4312S |
date | a specific day in history | 2000-10-31 |
time | a specific time of day, that recurs every day | 14:30:00.000, 09:30:00.000-05:00 |
XML Schema Built-In XML Simple Types

Name | Type | Examples |
---|---|---|
ID | XML 1.0 ID attribute type | any XML name that's unique among ID type attributes |
IDREF | XML 1.0 IDREF attribute type | any XML name that's used as an ID type attribute elsewhere in the document |
ENTITY | XML 1.0 ENTITY attribute type | any XML name that's declared as an unparsed entity in the DTD |
NOTATION | XML 1.0 NOTATION attribute type | any XML name that's declared as a notation name in the DTD |
language | Permissible values for xml:lang as defined in XML 1.0 | en-GB, en-US, fr |
IDREFS | XML 1.0 IDREFS attribute type | a white space separated list of IDREF names |
ENTITIES | XML 1.0 ENTITIES attribute type | a white space separated list of ENTITY names |
NMTOKEN | XML 1.0 NMTOKEN attribute type | 12, are, you, ready |
NMTOKENS | XML 1.0 NMTOKENS attribute type | a white space separated list of name tokens |
Name | An XML 1.0 Name | set, title, rdf, math, math123, href |
QName | an optionally prefixed, namespace qualified name | song:title |
NCName | a local name without any colons | title |
XML Schema Built-In Simple Types

Name | Type | Examples |
---|---|---|
string | Parsed Character Data; #PCDATA | Hot Cop |
normalizedString | A string whose normalized value does not contain any tabs, carriage returns, or linefeeds | PIC1, PIC2, PIC3, cow_movie, MonaLisa, Hello World , Warhol, red green |
token | A string whose normalized value has no leading or trailing white space, no tabs, no linefeeds, and not more than one consecutive space | p1 p2, ss123 45 6789, _92, red, green, NT Decl, seventeen, p1, p2, 123 45 6789, ^*&^*&_92, red green blue, NT-Decl, Mary had a little lamb, The love of money is the root of all Evil. |
boolean | C++'s bool type | true, false, 1, 0 |
anyURI | relative or absolute URI | http://www.w3.org/TR/2000/WD-xmlschema-2-20000407/#duration, /javafaq/reports/JCE1.2.1.html |
hexBinary | Arbitrary binary data encoded in hexadecimal form | A4E345EC54CC8D52198000FFEA6C |
base64Binary | Arbitrary binary data encoded in Base64 | 6jKpNnmkkWeArsn5Oeeg2njcz+nXdk0f9kZI892ddlR8Lg1aMhPeFTYuoq3I6neFlb BjWzuktNZKiXYBfKsSTB8U09dTiJo2ir3HJuY7eW/p89osKMfixPQsp9vQMgzph6Qa lY7j4MB7y5ROJYsTr1/fFwmj/yhkHwpbpzed1LE= |
You can derive new simple types from existing types.
An xsd:simpleType element defines the subtype
The name attribute of xsd:simpleType assigns a name to the new type, by which it can be referred to in xsd:element type attributes.
An xsd:restriction child element derives by restricting the legal values of the base type
An xsd:list child element derives a type as a white space separated list of base type instances
An xsd:union child element derives by combining legal values from multiple base types
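Derivation by list and by union can be sketched as follows. The type names yearList, yearOrUnknown, and unknownType, and the REISSUES element, are hypothetical names chosen for illustration only.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Derivation by list: a white space separated list of years,
       for example "1978 1979 1985" -->
  <xsd:simpleType name="yearList">
    <xsd:list itemType="xsd:gYear"/>
  </xsd:simpleType>

  <!-- Helper type for the union: the single word "unknown" -->
  <xsd:simpleType name="unknownType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="unknown"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Derivation by union: either a year or the word "unknown" -->
  <xsd:simpleType name="yearOrUnknown">
    <xsd:union memberTypes="xsd:gYear unknownType"/>
  </xsd:simpleType>

  <xsd:element name="REISSUES" type="yearList"/>
  <xsd:element name="YEAR" type="yearOrUnknown"/>

</xsd:schema>
```

Under this sketch, <YEAR>1978</YEAR> and <YEAR>unknown</YEAR> should both validate, while <YEAR>last year</YEAR> should not.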
You can derive new simple types from existing types by restricting the type to a subset of its normal values
An xsd:simpleType element defines the restricted type
The name attribute of xsd:simpleType assigns a name to the new type.
An xsd:restriction child element specifies what type is being restricted via its base attribute.
Facet children of xsd:restriction specify the constraints on the type.
For example, this xsd:simpleType element defines a phonoYear as any year from 1877 (the year Edison invented the phonograph) on:
<xsd:simpleType name="phonoYear">
<xsd:restriction base="xsd:gYear">
<xsd:minInclusive value="1877"/>
</xsd:restriction>
</xsd:simpleType>
Then you declare the year element like this:
<xsd:element name="YEAR" type="phonoYear"/>
Facets include:
length
minLength
maxLength
pattern
enumeration
whiteSpace
maxInclusive
maxExclusive
minInclusive
minExclusive
totalDigits
fractionDigits
Not all facets apply to all types.
The number of units allowed in a value
For strings (string, normalizedString, token, QName, NCName, ID, IDREF, language, anyURI, ENTITY, NOTATION, and NMTOKEN) the units are characters
For lists (IDREFS, ENTITIES, and NMTOKENS) the units are tokens
For binary types (hexBinary, base64Binary) the units are bytes after decoding
Must be a non-negative integer
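The different units can be sketched with a binary type and a list type. The type names md5Digest and shortTokenList are hypothetical, invented here only to illustrate how the length facets count.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Binary type: length counts bytes after decoding, so
       16 bytes corresponds to 32 hexadecimal digits in the document -->
  <xsd:simpleType name="md5Digest">
    <xsd:restriction base="xsd:hexBinary">
      <xsd:length value="16"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- List type: maxLength counts tokens, not characters, so
       "red green blue" has length 3 -->
  <xsd:simpleType name="shortTokenList">
    <xsd:restriction base="xsd:NMTOKENS">
      <xsd:maxLength value="3"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```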
For example, to say that all names and titles must contain between 1 and 255 characters:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:simpleType name="Str255">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="255"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="Str255"/>
      <xsd:element name="COMPOSER" type="Str255" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="Str255" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="Str255" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:duration"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="Str255" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
Determines the minimum and maximum allowed values
Applies to ordered simple types including byte, unsignedByte, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, unsignedShort, decimal, float, double, time, dateTime, duration, date, gMonth, gYear, gYearMonth, gDay, and gMonthDay.
For example, to say that the year must be between 1877 and 2100:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:simpleType name="phonoYear">
    <xsd:restriction base="xsd:gYear">
      <xsd:minInclusive value="1877"/>
      <xsd:maxInclusive value="2100"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:duration"/>
      <xsd:element name="YEAR" type="phonoYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
Determines what the validator should do with white space before validating the value
Three possible values:
preserve: The white space in the input document is left unchanged
replace: Each tab, carriage return and linefeed is replaced with a single space.
collapse: Each tab, carriage return and linefeed is replaced with a single space. Furthermore, after this replacement is performed, all runs of multiple spaces are condensed to a single space. Leading and trailing white space is deleted.
Applies to string, normalizedString and token type items
Per XML 1.0, white space in attributes is normalized regardless of the schema
For example, to say that white space should be collapsed in all names and titles:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:simpleType name="CollapsedString">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="collapse"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="CollapsedString"/>
      <xsd:element name="COMPOSER" type="CollapsedString" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="CollapsedString" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="CollapsedString" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:duration"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="CollapsedString" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
The totalDigits facet specifies the maximum number of decimal digits in a number as a positive integer
The fractionDigits facet specifies the maximum number of decimal digits to the right of the decimal point as a non-negative integer
Applies to all types derived from decimal including byte, unsignedByte, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, and unsignedShort.
Does not apply to float and double
You can specify at most two fractional digits or at most seven decimal digits, but not at least two fractional digits or exactly seven decimal digits
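A minimal sketch of the two digit facets working together follows. The price type name and the specific limits (seven total digits, two fractional digits) are assumptions picked to mirror the numbers mentioned above, not a type from the talk's examples.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: at most seven digits overall,
       at most two of them after the decimal point -->
  <xsd:simpleType name="price">
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="7"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="PRICE" type="price"/>

</xsd:schema>
```

Under these limits, 1.35 and 99999.99 should be accepted, while 1.355 (three fractional digits) and 12345678 (eight digits) should be rejected. A value such as 5 is also accepted, since both facets are upper bounds only.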
The enumeration
facet lists all allowed values
Applies to all simple types except boolean
For example, to say that the publisher must be one of the oligopoly that controls 90% of U.S. music (Warner-Elektra-Atlantic, Universal Music Group, Sony Music Entertainment, Inc., Capitol Records, Inc., BMG Music)
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="songType"/>

  <xsd:simpleType name="oligopolyMember">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Warner-Elektra-Atlantic"/>
      <xsd:enumeration value="Universal Music Group"/>
      <xsd:enumeration value="Sony Music Entertainment, Inc."/>
      <xsd:enumeration value="Capitol Records, Inc."/>
      <xsd:enumeration value="BMG Music"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:complexType name="songType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="oligopolyMember" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:duration"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="priced_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
  <PRICE>$1.35</PRICE>
</SONG>
Suppose you want a money type to specify that the PRICE element content must look like $1.35 or ¥11000
Derive this from the xsd:string type by restriction
Use the pattern facet to specify a regular expression that instances must match
The regular expression syntax is more or less Perl-like, including the Unicode extensions introduced in Perl 5.6
The money regular expression:
\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?
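\p{Sc}: any Unicode currency indicator; e.g. $, ¥, £, ¤, etc.
\p{Nd}: a Unicode decimal digit character
\p{Nd}+: one or more Unicode decimal digit characters
\.: the period character
(\.\p{Nd}\p{Nd}): a period followed by exactly two decimal digits
(\.\p{Nd}\p{Nd})?: zero or one strings of the form .35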
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:simpleType name="money">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?"/>
      <!-- Regular Expression:
           \p{Sc}              Any Unicode currency indicator; e.g. $, ¥, £, ¤, etc.
           \p{Nd}              A Unicode decimal digit character
           \p{Nd}+             One or more Unicode decimal digit characters
           \.                  The period character
           (\.\p{Nd}\p{Nd})    A period followed by exactly two decimal digits
           (\.\p{Nd}\p{Nd})?   Zero or one strings of the form .35

           This works for any decimalized currency.
      -->
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRICE" type="money"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
Elements that contain child elements or have attributes or both
Defined by an xsd:complexType element
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="attribute_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <!-- An empty element -->
  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="nested_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>
    <NAME>
      <GIVEN>Jacques</GIVEN>
      <FAMILY>Morali</FAMILY>
    </NAME>
  </COMPOSER>
  <COMPOSER>
    <NAME>
      <GIVEN>Henri</GIVEN>
      <FAMILY>Belolo</FAMILY>
    </NAME>
  </COMPOSER>
  <COMPOSER>
    <NAME>
      <GIVEN>Victor</GIVEN>
      <FAMILY>Willis</FAMILY>
    </NAME>
  </COMPOSER>
  <PRODUCER>
    <NAME>
      <GIVEN>Jacques</GIVEN>
      <FAMILY>Morali</FAMILY>
    </NAME>
  </PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:complexType name="ComposerType">
    <xsd:sequence>
      <xsd:element name="NAME">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="GIVEN" type="xsd:string"/>
            <xsd:element name="FAMILY" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="ProducerType">
    <xsd:sequence>
      <xsd:element name="NAME">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="GIVEN" type="xsd:string"/>
            <xsd:element name="FAMILY" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="ComposerType" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="ProducerType" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
PRODUCER and COMPOSER are really the same type.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="NAME">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="GIVEN" type="xsd:string"/>
            <xsd:element name="FAMILY" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="PersonType" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="PersonType" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
Schemas let you enforce order and appearance of elements in mixed content.
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="mixed_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>
    <NAME>Mr. <GIVEN>Jacques</GIVEN> <FAMILY>Morali</FAMILY> Esq.</NAME>
  </COMPOSER>
  <COMPOSER>
    <NAME>Mr. <GIVEN>Henri</GIVEN> L. <FAMILY>Belolo</FAMILY>, M.D.</NAME>
  </COMPOSER>
  <COMPOSER>
    <NAME>Mr. <GIVEN>Victor</GIVEN> C. <FAMILY>Willis</FAMILY></NAME>
  </COMPOSER>
  <PRODUCER>
    <NAME>Mr. <GIVEN>Jacques</GIVEN> S. <FAMILY>Morali</FAMILY></NAME>
  </PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="NAME">
        <xsd:complexType mixed="true">
          <xsd:sequence>
            <xsd:element name="GIVEN" type="xsd:string"/>
            <xsd:element name="FAMILY" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="PersonType" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="PersonType" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="unordered_song.xsd">
  <TITLE>Hot Cop</TITLE>
  <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/>
  <COMPOSER>
    <NAME><FAMILY>Morali</FAMILY> <GIVEN>Jacques</GIVEN></NAME>
  </COMPOSER>
  <COMPOSER>
    <NAME><GIVEN>Henri</GIVEN> <FAMILY>Belolo</FAMILY></NAME>
  </COMPOSER>
  <COMPOSER>
    <NAME><FAMILY>Willis</FAMILY> <GIVEN>Victor</GIVEN></NAME>
  </COMPOSER>
  <PRODUCER>
    <NAME><GIVEN>Jacques</GIVEN> <FAMILY>Morali</FAMILY></NAME>
  </PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
Each element in the xsd:all group must occur zero times or once; that is, minOccurs and maxOccurs must each be 0 or 1
The xsd:all group must be the top-level content model of its type
The xsd:all group may contain only individual element declarations; no choices or sequences
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="SONG" type="SongType"/>

  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="NAME">
        <xsd:complexType>
          <xsd:all>
            <xsd:element name="GIVEN" type="xsd:string"/>
            <xsd:element name="FAMILY" type="xsd:string"/>
          </xsd:all>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/>
      <xsd:element name="COMPOSER" type="PersonType" maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="PersonType" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- An empty element -->
  <xsd:complexType name="PhotoType">
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="ALT" type="xsd:string"/>
        <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/>
        <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
xsd:choice requires exactly one of a group of specified elements to appear
The choice can have minOccurs and maxOccurs attributes that adjust this from zero to any given number.
<xsd:complexType name="SongType">
<xsd:sequence>
<xsd:element name="TITLE" type="xsd:string"/>
<xsd:element name="COMPOSER" type="PersonType"/>
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element name="PRODUCER" type="PersonType"/>
<xsd:element name="COMPOSER" type="PersonType"/>
<xsd:element name="ARTIST" type="xsd:string"/>
</xsd:choice>
<xsd:element name="ARTIST" type="xsd:string"/>
<xsd:element name="PUBLISHER" type="xsd:string"
minOccurs="0"/>
<xsd:element name="LENGTH" type="xsd:string"/>
<xsd:element name="YEAR" type="xsd:string"/>
<xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
xsd:sequence
requires each child element it specifies
to appear in the specified order
The sequence can have
minOccurs
and maxOccurs
attributes
that repeat each sequence zero to any given number of times.
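A hedged sketch of a repeating sequence: the hypothetical NamesType below (an illustrative name, not from the original examples) permits zero or more GIVEN/FAMILY pairs, and within each pair GIVEN must precede FAMILY:
<xsd:complexType name="NamesType">
  <!-- The whole GIVEN, FAMILY group repeats as a unit -->
  <xsd:sequence minOccurs="0" maxOccurs="unbounded">
    <xsd:element name="GIVEN" type="xsd:string"/>
    <xsd:element name="FAMILY" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>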
Namespaces in XML rules and definitions:
Elements and attributes that are in namespaces are called qualified
All unprefixed attributes are unqualified
All prefixed elements are qualified
Unprefixed elements may or may not be qualified. They are qualified if they are in a default namespace.
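These definitions can be illustrated with a small instance document. This is only a sketch; the http://www.example.com/song namespace URI, the ex prefix, and the ID attribute are assumptions made up for illustration:
<?xml version="1.0"?>
<ex:SONG xmlns:ex="http://www.example.com/song" ID="s1">
  <!-- ex:SONG is prefixed, so it is qualified -->
  <!-- Its ID attribute is unprefixed, so it is unqualified -->
  <!-- TITLE is unprefixed and no default namespace is in scope: unqualified -->
  <TITLE>Hot Cop</TITLE>
  <!-- ARTIST is unprefixed but declares a default namespace: qualified -->
  <ARTIST xmlns="http://www.example.com/song">Village People</ARTIST>
</ex:SONG>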
Each schema has a target namespace
Each schema can define elements and attributes in its target namespace
A schema can also define unqualified attributes of elements in its target namespace.
A schema can also define unqualified child elements of elements in its target namespace. Unqualified child elements are called local elements. This is a very bad idea!
A schema may not define elements and attributes in namespaces other than the target namespace; i.e., for each namespace there must be at least one schema
Schemas can reference global elements and attributes defined in other schemas
by importing the schema with xsd:import
and
referencing the global elements and attributes defined therein.
<?xml version="1.0"?> <GREETING xmlns="http://www.cafeconleche.org/schemas/greeting/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.cafeconleche.org/schemas/greeting/ greeting_defaultNS.xsd"> Hello XML! </GREETING>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.cafeconleche.org/schemas/greeting/" > <xsd:element name="GREETING" type="xsd:string"/> </xsd:schema>
All elements are in the default namespace and are therefore qualified.
All attributes are unprefixed and are therefore unqualified.
<?xml version="1.0" encoding="UTF-8" standalone="no"?> <SONG xmlns="http://www.cafeconleche.org/namespace/song" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation = "http://www.cafeconleche.org/namespace/song namespace_song.xsd" > <TITLE>Hot Cop</TITLE> <PHOTO ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/> <COMPOSER>Jacques Morali</COMPOSER> <COMPOSER>Henri Belolo</COMPOSER> <COMPOSER>Victor Willis</COMPOSER> <PRODUCER>Jacques Morali</PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> </SONG>
The elementFormDefault
attribute must have the value
qualified to indicate that the locally defined elements -- i.e.,
TITLE
,
PHOTO
, COMPOSER
, PRODUCER
, PUBLISHER
,
YEAR
, and ARTIST
but not
SONG
-- are
qualified.
Globally defined elements such as SONG
are always
qualified by the target namespace
The default value of elementFormDefault
is unqualified.
This is exactly backwards from what it should be.
The default value of attributeFormDefault
is unqualified.
This is correct.
This mess was caused by a misguided attempt to make local elements and attributes equivalent.
Simplest approach: either don't use namespaces at all
or make sure all elements are namespace qualified and set
elementFormDefault="qualified"
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.cafeconleche.org/namespace/song" targetNamespace="http://www.cafeconleche.org/namespace/song" elementFormDefault="qualified" attributeFormDefault="unqualified" > <xsd:element name="SONG" type="SongType"/> <xsd:complexType name="SongType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="PHOTO" type="PhotoType" minOccurs="0"/> <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> <xsd:complexType name="PhotoType"> <xsd:complexContent> <xsd:restriction base="xsd:anyType"> <xsd:attribute name="ALT" type="xsd:string"/> <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger"/> <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger"/> </xsd:restriction> </xsd:complexContent> </xsd:complexType> </xsd:schema>
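For contrast, if elementFormDefault were omitted from a schema like this one, the local elements would be unqualified, and a valid instance would have to look roughly like the sketch below. Only the globally declared SONG is in the target namespace; the song prefix is an arbitrary choice made for illustration:
<?xml version="1.0"?>
<song:SONG xmlns:song="http://www.cafeconleche.org/namespace/song">
  <!-- Local elements must now be in no namespace at all -->
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</song:SONG>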
<?xml version="1.0" encoding="UTF-8" standalone="no"?> <SONG xmlns="http://www.cafeconleche.org/namespace/song" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation = "http://www.cafeconleche.org/namespace/song xlink_song.xsd http://www.w3.org/1999/xlink xlink.xsd" > <TITLE>Hot Cop</TITLE> <PHOTO xlink:type="simple" xlink:href="hotcop.jpg" xlink:actuate="onLoad" xlink:show="embed" ALT="Victor Willis in Cop Outfit" WIDTH="100" HEIGHT="200"/> <COMPOSER>Jacques Morali</COMPOSER> <COMPOSER>Henri Belolo</COMPOSER> <COMPOSER>Victor Willis</COMPOSER> <PRODUCER>Jacques Morali</PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> </SONG>
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3.org/1999/xlink" > <xsd:attribute name="type" type="xsd:string" fixed="simple" /> <xsd:attribute name="href" type="xsd:anyURI" /> <xsd:attribute name="actuate" type="xsd:string" fixed="onLoad" /> <xsd:attribute name="show" type="xsd:string" fixed="embed" /> </xsd:schema>
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.cafeconleche.org/namespace/song" xmlns:xlink="http://www.w3.org/1999/xlink" targetNamespace="http://www.cafeconleche.org/namespace/song" elementFormDefault="qualified" > <xsd:import namespace="http://www.w3.org/1999/xlink" schemaLocation="xlink.xsd"/> <xsd:element name="SONG" type="SongType"/> <xsd:complexType name="PhotoType"> <xsd:attribute name="WIDTH" type="xsd:nonNegativeInteger" use="required" /> <xsd:attribute name="HEIGHT" type="xsd:nonNegativeInteger" use="required" /> <xsd:attribute name="ALT" type="xsd:string" use="required" /> <xsd:attribute ref="xlink:type"/> <xsd:attribute ref="xlink:href" use="required"/> <xsd:attribute ref="xlink:actuate"/> <xsd:attribute ref="xlink:show"/> </xsd:complexType> <xsd:complexType name="SongType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="PHOTO" type="PhotoType"/> <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:schema>
An xsd:annotation
element describes its parent
element.
A top-level xsd:annotation
element describes the entire schema.
Its xsd:documentation
child element describes the schema
for human readers.
Its xsd:appinfo
child element describes the schema
for computer programs; e.g. stylesheet instructions.
<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:annotation> <xsd:documentation> Song schema example for XML Schemas tutorial Copyright 2001 Elliotte Rusty Harold. </xsd:documentation> </xsd:annotation> <xsd:element name="SONG" type="SongType"/> <xsd:simpleType name="money"> <xsd:restriction base="xsd:string"> <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?"> <xsd:annotation> <xsd:documentation> \p{Sc} Any Unicode currency indicator; e.g. $, ¥, £, &#xA4;, etc. \p{Nd} A Unicode decimal digit character \p{Nd}+ One or more Unicode decimal digit characters \. The period character (\.\p{Nd}\p{Nd}) (\.\p{Nd}\p{Nd})? Zero or one strings of the form .35 This works for any decimalized currency. </xsd:documentation> </xsd:annotation> </xsd:pattern> </xsd:restriction> </xsd:simpleType> <xsd:complexType name="SongType"> <xsd:sequence> <xsd:element name="TITLE" type="xsd:string"/> <xsd:element name="COMPOSER" type="xsd:string" maxOccurs="unbounded"/> <xsd:element name="PRODUCER" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/> <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/> <xsd:element name="YEAR" type="xsd:gYear"/> <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/> <xsd:element name="PRICE" type="money"/> </xsd:sequence> </xsd:complexType> </xsd:schema>
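The example above uses only the human-readable documentation child. A sketch of the machine-readable child might look like the following, where the style:hint element and its http://www.example.com/style namespace are invented purely for illustration; a schema validator ignores whatever the annotation contains:
<xsd:element name="SONG" type="SongType">
  <xsd:annotation>
    <xsd:documentation>The root element of a song document.</xsd:documentation>
    <xsd:appinfo>
      <!-- Hypothetical hint for some downstream stylesheet tool -->
      <style:hint xmlns:style="http://www.example.com/style" display="block"/>
    </xsd:appinfo>
  </xsd:annotation>
</xsd:element>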
Cannot declare entities
Parent models
Extra-document validation
Rick Jelliffe's Schematron
Murata Makoto's RELAX
James Clark's TREX
RELAX + TREX -> RELAX NG
Rick Jelliffe's Hook
xlinkit
Eric van der Vlist's Examplotron
DTDs
According to Schematron inventor Rick Jelliffe:
The Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages.
Language is based on XPath
Validator is implemented in XSLT
W3C Schemas are conservative: everything not permitted is forbidden.
Schematron is liberal: everything not forbidden is permitted.
No data typing; validation only
Handles unordered structures very well
Handles descendant constraints very well
Almost self-documenting
A schema
contains a title
and a pattern
Each pattern
contains rule
child elements
Each rule
contains assert
and report
elements and has a
context
attribute
Each assert
and report
element
has a test
attribute containing an XPath expression
which returns a boolean.
The contents of each assert
element are printed if the assertion test fails
The contents of each report
element are printed if the report test
succeeds
<?xml version="1.0"?> <schema xmlns="http://www.ascc.net/xml/schematron"> <title>A Schematron Schema for Songs</title> <pattern> <rule context="SONG"> <assert test="TITLE"> A SONG must contain an initial TITLE element. </assert> <assert test="*[1][self::TITLE]"> The TITLE element must be the initial element of the SONG element. </assert> <assert test="COMPOSER"> A SONG must contain at least one COMPOSER element. </assert> <assert test="ARTIST"> A SONG must contain at least one ARTIST element. </assert> </rule> </pattern> </schema>
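The schema above uses only assert elements. As a sketch of the complementary report element, the pattern below would print its message when the test succeeds, that is, when a SONG has more than one PUBLISHER. The constraint itself is an assumption chosen for illustration, and the fragment is meant to sit inside a schema element like the one above:
<pattern>
  <rule context="SONG">
    <!-- A report fires when its test is true, the opposite of assert -->
    <report test="count(PUBLISHER) &gt; 1">
      This SONG has more than one PUBLISHER element.
    </report>
  </rule>
</pattern>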
Invented by Murata Makoto
JIS standard/Proposed ISO standard
Uses W3C Schema data types
No derived types
Mostly DTD-like structures
<?xml version="1.0"?> <module moduleVersion="1.2" relaxCoreVersion="1.0" targetNamespace="" xmlns="http://www.xml.gr.jp/xmlns/relaxCore"> <!-- Elements allowed as document roots --> <interface> <export label="SONG"/> </interface> <elementRule role="SONG"> <sequence> <ref label="TITLE"/> <ref label="COMPOSER" occurs="*"/> <ref label="PRODUCER" occurs="*"/> <ref label="PUBLISHER" occurs="?"/> <ref label="YEAR"/> <ref label="ARTIST" occurs="+"/> <ref label="PRICE" occurs="?"/> </sequence> </elementRule> <elementRule role="TITLE" type="string"/> <elementRule role="COMPOSER" type="string"/> <elementRule role="PRODUCER" type="string"/> <elementRule role="PUBLISHER" type="string"/> <elementRule role="YEAR" type="year"/> <elementRule role="ARTIST" type="string"/> <elementRule role="PRICE" type="string"/> </module>
Tree Regular Expressions for XML
Invented by James Clark of XSLT fame
Uses W3C Schema data types
Combination of RELAX and TREX
OASIS standard
Uses externally defined data types including W3C Schema simple data types + facets
Mostly DTD-like structures
Allows fully unordered content; e.g. a name must contain one FirstName, zero or one LastNames, and any number of MiddleNames in any order.
<?xml version="1.0"?> <element name="SONG" xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"> <interleave> <element name="TITLE"> <text/> </element> <zeroOrMore> <element name="COMPOSER"> <text/> </element> </zeroOrMore> <zeroOrMore> <element name="PRODUCER"> <text/> </element> </zeroOrMore> <optional> <element name="PUBLISHER"> <text/> </element> </optional> <element name="LENGTH"> <data type="duration"/> </element> <element name="YEAR"> <data type="gYear"/> </element> <oneOrMore> <element name="ARTIST"> <text/> </element> </oneOrMore> <optional> <element name="PRICE"> <data type="string"> <param name="pattern">\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?</param> </data> </element> </optional> </interleave> </element>
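The name example mentioned earlier (one FirstName, zero or one LastName, and any number of MiddleNames in any order) could be sketched in RELAX NG roughly as follows. The enclosing Name element is an assumed wrapper chosen for illustration:
<element name="Name" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- interleave lets the children appear in any order -->
  <interleave>
    <element name="FirstName"><text/></element>
    <optional>
      <element name="LastName"><text/></element>
    </optional>
    <zeroOrMore>
      <element name="MiddleName"><text/></element>
    </zeroOrMore>
  </interleave>
</element>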
Rick Jelliffe's Hook: A One-Element Language for Validation of XML Documents based on Partial Order
XSLT/XPath based
No data typing; validation only
A Hook schema for XHTML (adapted from Rick Jelliffe):
<hook:order targetNamespace="http://www.w3.org/1999/xhtml" >
html head [ title; meta. link. base. ] body
[ a br. blockquote caption; div dl; h1; h2; h3; h4; h5; h6;
img. ol; p; pre; table; ul; ]
[ tr; dt; dd; li; ] td
[ a br. blockquote div form img. ol; ul; li; ]
[ input; label; select; textarea; ] [ option. ]
[ abbr acronym address cite code dfn em kbd q samp span strong var object; ]
param
</hook:order>
Invented by Eric van der Vlist
Schemas look like the documents they model
Validator is an XSLT stylesheet
<?xml version="1.0"?> <SONG xmlns:eg="http://examplotron.org/0/"> <TITLE>Put title here</TITLE> <COMPOSER eg:occurs="+">Author of song</COMPOSER> <PRODUCER eg:occurs="*">Put producer here</PRODUCER> <PUBLISHER eg:occurs="?">Put publisher here</PUBLISHER> <LENGTH>Length of song</LENGTH> <YEAR>Year song was published</YEAR> <ARTIST eg:occurs="+">Singers and musicians</ARTIST> </SONG>
XML Bible, Gold edition, Chapter 23
Elliotte Rusty Harold
Hungry Minds, 2001
ISBN 0-7645-4819-0
http://www.cafeconleche.org/books/biblegold/chapters/ch24.html
This presentation: http://www.cafeconleche.org/slides/xmlone/london2002/schemas
W3C Schema Primer: http://www.w3.org/TR/xmlschema-0/
XML Schema Part 1: Structures: http://www.w3.org/TR/xmlschema-1/
XML Schema Part 2: Datatypes: http://www.w3.org/TR/xmlschema-2/