Copyright © [2012]. Roger L. Costello. All Rights Reserved.

XML Schema 1.1
Structure Specification: http://www.w3.org/TR/xmlschema11-1/
Datatype Specification: http://www.w3.org/TR/xmlschema11-2/
Roger L. Costello
http://www.xfront.com
14 December 2012

Acknowledgements
• Special thanks to Michael Kay and Michael Sperberg-McQueen for answering my endless questions.
• Thanks to the following people for their suggestions and identifying typos:
– Noah Mendelsohn
– Mukul Gandhi
– Pete Cordell
– Ken Starks
– Dave Peterson
– Fraser Goffin
– Paul Jones
– Henry Callihan

XSD 1.1 is a Recommendation
• XML Schema 1.1 became a standard on April 5, 2012 (the W3C calls it a "recommendation" rather than a standard, but the two mean the same thing)

The "Big Picture"
• I created another tutorial: XML Schema 1.1 for Managers
• I strongly recommend reading it prior to reading this.
• It will give you the "big picture."

Prerequisite
• This tutorial assumes you already have a thorough understanding of XML Schema 1.0*
• Also, you should have a pretty good understanding of XPath. Ideally you know XPath 2.0.
You don't know XPath? No problem! I've created a Quick Intro to XPath. See quick-intro-to-xpath.ppt
* See my tutorial on XML Schema 1.0: http://www.xfront.com/xml-schema.html

Viewing this Tutorial
• This tutorial is best viewed in slide show mode
– Under the View menu select Slide Show
• Periodically you will see an icon at the bottom right of the slide indicating that it is time to do a lab exercise. I strongly recommend that you stop and do the lab exercise to obtain the maximum benefit from this tutorial.
Your 1.0 Schemas will still work
An instance document conforming to a 1.0 schema can be validated using a 1.1 validator, but an instance document conforming to a 1.1 schema may not validate using a 1.0 validator.
XML Schema 1.1 is a superset of XML Schema 1.0.

Namespaces for 1.1
Same as in 1.0.
Use this namespace in your schema: http://www.w3.org/2001/XMLSchema
Use this namespace in your instance document: http://www.w3.org/2001/XMLSchema-instance
Example: with regard to namespaces this 1.1 schema looks identical to a 1.0 schema:
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="http://www.books.org"
               xmlns="http://www.books.org"
               elementFormDefault="qualified">
        …
    </xs:schema>

New namespace in 1.1
XML Schema 1.1 introduces a new namespace, the version control namespace: http://www.w3.org/2007/XMLSchema-versioning
Note: the convention is to use "vc" as the namespace prefix, e.g. vc:typeAvailable.

XML Schema 1.1 Validators
• The schema-aware version of SAXON (version 9.3 or later) supports the full XML Schema 1.1 specification: http://www.saxonica.com/
• Apache XERCES-J (version 2.11.0 or later) supports XML Schema 1.1: http://xerces.apache.org/xerces2-j/
• You can create XML Schema 1.1 schemas today!

Running SAXON from Oxygen XML
1. Select "Preferences" in this menu. It will open the dialog box shown here.
2. Then select the 1.1 radio button.
3. Then press the OK button.
Now, when you click on Saxon-EE your XML document will be validated using a 1.1 schema validator.
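One way to confirm that the validator really is running in 1.1 mode is to validate against a schema that uses a 1.1-only construct. The sketch below is an illustration only and is not one of the tutorial's lab files: the element name EvenNumber is made up, and it relies on the <assertion> facet that is introduced later in this tutorial. A 1.0 validator would be expected to reject the schema itself, while a 1.1 validator should accept <EvenNumber>42</EvenNumber> and reject <EvenNumber>43</EvenNumber>.

```xml
<?xml version="1.0"?>
<!-- Hypothetical smoke test: the xs:assertion facet exists only in XML Schema 1.1 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="EvenNumber">
        <xs:simpleType>
            <xs:restriction base="xs:integer">
                <!-- $value holds the value being validated -->
                <xs:assertion test="$value mod 2 = 0"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>
</xs:schema>
```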
5 minute intro
Summary of Changes to the Structures Specification: http://www.w3.org/TR/xmlschema11-1/#changes
Summary of Changes to the Datatypes Specification: http://www.w3.org/TR/xmlschema11-2/#changes

The <assert> Element
The <assert> element is used to make assertions about element and attribute values. An assertion may state a relationship, such as "A meeting's end time must be greater than its start time," or an assertion may state a constraint on an element or attribute above and beyond the constraint specified by its declaration.
Example: the Publisher element is declared to be of type string, but the assertion constrains it to just two string values, 'Wrox Press' and 'McMillan Publishing':
    <element name="Book">
        <complexType>
            <sequence>
                <element name="Title" type="string" />
                <element name="Author" type="string" />
                <element name="Date" type="string" />
                <element name="ISBN" type="string" />
                <element name="Publisher" type="string" />
            </sequence>
            <assert test="(Publisher eq 'Wrox Press') or (Publisher eq 'McMillan Publishing')" />
        </complexType>
    </element>

The <assertion> Facet
The <assertion> facet is used to constrain simpleTypes. $value is a built-in variable holding the value of the simpleType.
Example: the Evens simpleType has the even numbers from 0 to 100:
    <simpleType name="Evens">
        <restriction base="integer">
            <minInclusive value="0" />
            <maxInclusive value="100" />
            <assertion test="$value mod 2 = 0" />
        </restriction>
    </simpleType>

The <alternative> Element
The <alternative> element is used to give an element a choice of types; the actual type used in an instance document depends on the value of the element's attributes.
Example: if the value of the kind attribute is 'book' then <Publication>'s type is BookType; if the value of the kind attribute is 'magazine' then its type is MagazineType; otherwise, its type is PublicationType.
    <xs:element name="Publication" type="PublicationType">
        <xs:alternative test="@kind eq 'book'" type="BookType" />
        <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
    </xs:element>

The xs:error Datatype
The error datatype is used to trigger an error. It may be used wherever a type is used.
Example: continuing with the last example, suppose PublicationType is declared in a schema that cannot be modified, and it declares the kind attribute to be of type string. In your schema you want the value of kind to be restricted to 'book' and 'magazine'. Here's how to throw an error if kind does not have the value 'book' or 'magazine':
    <xs:element name="Publication" type="PublicationType">
        <xs:alternative test="@kind eq 'book'" type="BookType" />
        <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
        <xs:alternative test="(@kind ne 'book') and (@kind ne 'magazine')" type="xs:error" />
    </xs:element>

Open Content
The <openContent> and <defaultOpenContent> elements enable instance documents to contain extension elements interleaved among the elements declared by the schema.
Example: the Book element has open content:
    <xs:element name="Book">
        <xs:complexType>
            <xs:openContent mode="interleave">
                <xs:any />
            </xs:openContent>
            <xs:sequence>
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="Author" type="xs:string" />
                <xs:element name="Date" type="xs:gYear"/>
                <xs:element name="ISBN" type="xs:string"/>
                <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
In this instance document extension elements have been interleaved among the <Title>, <Author>, <Date>, <ISBN>, and <Publisher> elements:
    <Book>
        <r:Binding>Hardcover</r:Binding>
        <Title>My Life and Times</Title>
        <r:Size>5 x 7</r:Size>
        <Author>Paul McCartney</Author>
        <r:InStock>true</r:InStock>
        <Date>1998</Date>
        <r:Category>Non-fiction</r:Category>
        <ISBN>1-56592-235-2</ISBN>
        <r:NumPages>299</r:NumPages>
        <Publisher>McMillin Publishing</Publisher>
        <r:AvailableOnTape>false</r:AvailableOnTape>
    </Book>

Schema-wide Attributes
defaultAttributes is used to specify a set of attributes that apply to every complexType in a schema document.
Example: the Book element has a required id attribute and an optional class attribute:
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="http://www.books.org"
               xmlns="http://www.books.org"
               defaultAttributes="myDefaultAttributes"
               elementFormDefault="qualified">
        <xs:element name="Book">
            <xs:complexType>
                <xs:sequence>
                    <xs:element name="Title" type="xs:string"/>
                    <xs:element name="Author" type="xs:string" />
                    <xs:element name="Date" type="xs:gYear"/>
                    <xs:element name="ISBN" type="xs:string"/>
                    <xs:element name="Publisher" type="xs:string"/>
                </xs:sequence>
            </xs:complexType>
        </xs:element>
        <xs:attributeGroup name="myDefaultAttributes">
            <xs:attribute name="id" type="xs:ID" use="required" />
            <xs:attribute name="class" type="xs:NMTOKENS" />
        </xs:attributeGroup>
    </xs:schema>

Vendor Unique Extensions
Vendors of XML Schema 1.1 validators can add their own datatypes and facets.
Example: a vendor creates a new decimal datatype and a facet for specifying the delimiter to be used in the decimal value:
    <xs:simpleType name="money">
        <xs:restriction base="vendor:decimal">
            <vendor:delimiter value="," />
        </xs:restriction>
    </xs:simpleType>

Conditional Inclusion (a.k.a Version Control)
XML Schema 1.1 introduces a new namespace, the version control namespace. By convention vc: is used to prefix items in this namespace.
vc:minVersion and vc:maxVersion may be placed as attributes on an element declaration to indicate which version of the schema specification the declaration was written to.
vc:typeAvailable and vc:typeUnavailable may be placed as attributes on an element declaration to signal to a schema validator that a vendor-unique datatype is being used by the element.
vc:facetAvailable and vc:facetUnavailable may be placed as attributes on an element declaration to signal to a schema validator that a vendor-unique facet is being used by the element.
Example: here are two declarations of an element Book; the first declaration is used by schema validators that implement the XML Schema 3.2 specification (or later); the second declaration is used by schema validators that implement any version from 1.1 up to (but excluding) 3.2:
    <element name="Book" vc:minVersion="3.2">
        declare the Book element
    </element>
    <element name="Book" vc:minVersion="1.1" vc:maxVersion="3.2">
        declare the Book element
    </element>
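The same mechanism can, in principle, keep one schema document usable by both 1.0 and 1.1 validators. The sketch below is an assumption for illustration (the reduced Book content model is made up for brevity): the first declaration carries a 1.1-only <assert>, the second is a plain fallback. Note that vc:minVersion is inclusive and vc:maxVersion is exclusive, and that a 1.0 validator only benefits from this if it has been built to honor the vc: attributes.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning"
           elementFormDefault="qualified">

    <!-- Kept by validators implementing version 1.1 or later (minVersion is inclusive) -->
    <xs:element name="Book" vc:minVersion="1.1">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
            <xs:assert test="string-length(Publisher) le 140"/>
        </xs:complexType>
    </xs:element>

    <!-- Kept by validators implementing a version below 1.1 (maxVersion is exclusive) -->
    <xs:element name="Book" vc:maxVersion="1.1">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

</xs:schema>
```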
Conditional Inclusion (a.k.a Version Control)
Example: here are two declarations of an element cost; the first declaration is used by schema validators that understand the datatype vendor:decimal; the second declaration is used by schema validators that don't understand vendor:decimal:
    <element name="cost" vc:typeAvailable="vendor:decimal" type="vendor:decimal" />
    <element name="cost" vc:typeUnavailable="vendor:decimal" type="decimal" />
Example: here are two declarations of an element population; the first declaration is used by schema validators that understand the facet vendor:delimiter; the second declaration is used by schema validators that don't understand vendor:delimiter:
    <element name="population" vc:facetAvailable="vendor:delimiter">
        <simpleType>
            <restriction base="integer">
                <vendor:delimiter value="," />
            </restriction>
        </simpleType>
    </element>
    <element name="population" vc:facetUnavailable="vendor:delimiter" type="integer" />

Inherited Attributes
Attributes can be declared to be inheritable. Inheritable attributes can be used by descendant elements that contain <alternative> elements.
See example on next slide

Inherited Attributes
Example: the root element, BookStore, declares the attribute xml:lang as inheritable.
The descendant Book element has two alternative elements that specify the content of Publisher based on the inherited xml:lang attribute:
    <element name="BookStore">
        <complexType>
            <sequence>
                <element name="Book" maxOccurs="unbounded">
                    <complexType>
                        <sequence>
                            …
                            <xs:element name="Publisher" type="xs:string">
                                <xs:alternative test="@xml:lang eq 'en'">
                                    <xs:simpleType>
                                        <xs:restriction base="xs:string">
                                            <xs:enumeration value="Wrox Press" />
                                            <xs:enumeration value="McMillan Publishing" />
                                        </xs:restriction>
                                    </xs:simpleType>
                                </xs:alternative>
                                <xs:alternative test="@xml:lang eq 'fr'">
                                    <xs:simpleType>
                                        <xs:restriction base="xs:string">
                                            <xs:enumeration value="Bayard Presse" />
                                            <xs:enumeration value="Le Castor Astral" />
                                        </xs:restriction>
                                    </xs:simpleType>
                                </xs:alternative>
                            </xs:element>
                        </sequence>
                    </complexType>
                </element>
            </sequence>
            <attribute ref="xml:lang" inheritable="true" />
        </complexType>
    </element>

Unordered Content using the <all> Element
The <all> element has been enhanced to allow elements with multiple occurrences. Also, <all> can have the wildcard, <any>, at any child position.
Example: the content of Book is: any number of extension elements, any number of Authors, Title, Date, ISBN, and Publisher, and they can be arranged in any order in instance documents:
    <xs:element name="Book">
        <xs:complexType>
            <xs:all>
                <xs:any maxOccurs="unbounded"/>
                <xs:element name="Author" maxOccurs="unbounded"/>
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="Date" type="xs:string"/>
                <xs:element name="ISBN" type="xs:string"/>
                <xs:element name="Publisher" type="xs:string"/>
            </xs:all>
        </xs:complexType>
    </xs:element>

substitutionGroup can Substitute with Multiple Elements
The substitutionGroup capability has been enhanced so that an element can substitute with multiple elements.
Example: the <metro> element is substitutable for either <metrorail> or <subway>:
    <xs:element name="metro" substitutionGroup="metrorail subway" type="xs:NCName" />

New Attributes of the <any> and <anyAttribute> Wildcard Elements
The <any> and <anyAttribute> wildcard elements have been enhanced with additional attributes that allow you to indicate the kind of extension elements or attributes not allowed.
The notNamespace attribute is used to indicate the namespace that extension elements or attributes cannot come from.
The notQName attribute is used to indicate an element or attribute that is not allowed.
Example: the first wildcard does not allow extension elements from the http://www.example.org namespace; the second wildcard does not allow extension attributes from the http://www.example.org namespace; the third wildcard does not allow xsl:value-of as an extension element:
    <any notNamespace="http://www.example.org"/>
    <anyAttribute notNamespace="http://www.example.org"/>
    <any notQName="xsl:value-of"/>

More Flexible Rules for Wildcards
Wildcards (<xs:any>) are important tools for extensible languages, but in XSD 1.0, it is difficult or impossible to use wildcards near optional content. XSD 1.1 is much more flexible.
See example on next slide

More Flexible Rules for Wildcards
Example: in XSD 1.0 the element declaration shown here is not legal, because there are documents with elements like <NumPages> that could match either the explicit element declaration or the wildcard. In XSD 1.1, this schema is valid, and the <NumPages> element is validated as an integer by the element declaration; the <Reviews> and <Binding> elements are validated against the wildcard.
    <xs:element name="Book">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Title" type="xs:string" minOccurs="0"/>
                <xs:element name="NumPages" type="xs:integer" minOccurs="0"/>
                <xs:any minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
Sample instance:
    <Book>
        <Title>The Origin of Wealth</Title>
        <NumPages>321</NumPages>
        <Reviews>Excellent</Reviews>
        <Binding>Hardcover</Binding>
    </Book>

Enhanced Usage of the ID Datatype
In XML Schema 1.1 an element can have multiple attributes of type ID and the ID type can have a fixed or default value.
Example: the <Stereo> element has two ID attributes, model-number and serial-number; the Food attribute has a fixed ID value:
    <element name="Stereo">
        <complexType>
            <sequence>
                …
            </sequence>
            <attribute name="model-number" type="ID" use="required" />
            <attribute name="serial-number" type="ID" use="required" />
        </complexType>
    </element>
    <attribute name="Food" type="ID" fixed="Popcorn" />

The <override> Element
The <override> element replaces the XML Schema 1.0 <redefine> element, which has been deprecated. The <override> element is used to replace the contents of a globally declared item in another schema.
See example on next slide

The <override> Element
Example: Office-calendar.xsd declares a <meeting> element with content <start-time>, <end-time>, and <room-number>.
Conference-calendar.xsd overrides <meeting>'s content with <track-id>, <speaker>, and <room-capacity>.
Office-calendar.xsd declares a meeting element:
    <xs:element name="meeting">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="start-time" type="xs:time" />
                <xs:element name="end-time" type="xs:time" />
                <xs:element name="room-number" type="xs:string" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
Conference-calendar.xsd overrides the meeting element:
    <xs:override schemaLocation="office-calendar.xsd">
        <xs:element name="meeting">
            <xs:complexType>
                <xs:sequence>
                    <xs:element name="track-id" type="xs:string" />
                    <xs:element name="speaker" type="xs:string" />
                    <xs:element name="room-capacity" type="xs:nonNegativeInteger" />
                </xs:sequence>
            </xs:complexType>
        </xs:element>
    </xs:override>

targetNamespace on Element and Attribute Declarations
An XSD 1.0 Schema Document with one targetNamespace could not restrict a type using locally-declared elements from another targetNamespace.
In XML Schema 1.1 you can do this by adding a targetNamespace attribute to each such "foreign" element and attribute in the restriction.
See example on next slide
targetNamespace on Element and Attribute Declarations
Example: this complexType restricts books:Book, which is in another namespace; targetNamespace is placed on each element declaration:
    <xs:schema targetNamespace="http://www.libraries.org"
               xmlns:books="http://www.books.org">
        <xs:import namespace="http://www.books.org" schemaLocation="Books.xsd"/>
        <xs:complexType name="BookInLibrary">
            <xs:complexContent>
                <xs:restriction base="books:Book">
                    <xs:sequence>
                        <xs:element name="Title" type="xs:string" targetNamespace="http://www.books.org"/>
                        <xs:element name="Author" type="xs:string" maxOccurs="2" targetNamespace="http://www.books.org"/>
                        <xs:element name="Date" type="xs:gYear" targetNamespace="http://www.books.org"/>
                        <xs:element name="ISBN" type="xs:string" targetNamespace="http://www.books.org"/>
                        <xs:element name="Publisher" type="xs:string" targetNamespace="http://www.books.org"/>
                    </xs:sequence>
                </xs:restriction>
            </xs:complexContent>
        </xs:complexType>
    </xs:schema>

The anyAtomicType Datatype
The value space of anyAtomicType is the union of the value spaces of all the primitive types.
Example: this illustrates the anyAtomicType datatype:
    <element name="Value" type="anyAtomicType" />
Sample instances:
    <Value xsi:type="xs:string">Hello World</Value>
    <Value xsi:type="xs:decimal">12.36</Value>
    <Value xsi:type="xs:boolean">true</Value>

The dateTimeStamp Datatype
A dateTimeStamp value is identical to a dateTime value, except that the time zone must be specified.
Example: this illustrates the dateTimeStamp datatype:
    <element name="birthdate" type="dateTimeStamp" />
Sample instances:
    <birthdate>1976-06-21T16:04:00-06:00</birthdate>
    <birthdate>1980-01-01T24:00:00-06:00</birthdate>

The yearMonthDuration Datatype
A yearMonthDuration value is a constrained version of the duration datatype; only years and months are specified.
Example: this illustrates the yearMonthDuration datatype:
    <element name="eventDuration" type="yearMonthDuration" />
Sample instances:
    <eventDuration>P1Y3M</eventDuration>
    <eventDuration>P15M</eventDuration>

The dayTimeDuration Datatype
A dayTimeDuration value is a constrained version of the duration datatype; only day and time are specified.
Example: this illustrates the dayTimeDuration datatype:
    <element name="conferenceDuration" type="dayTimeDuration" />
Sample instances:
    <conferenceDuration>P35DT01H22M30S</conferenceDuration>
    <conferenceDuration>PT11H</conferenceDuration>

New Facets
Here are the new facets:
– assertion: use this to constrain a simpleType
– explicitTimezone: use this with date datatypes to specify whether the time zone is required

Let's Dive In!
• That's the 5 minute introduction.
• As you see, there are lots of powerful new capabilities in XML Schema 1.1
• Let's examine each of them in depth.
• Happy learning!

Expressing assertions using the <assert> element
http://www.w3.org/TR/xmlschema11-1/#cAssertions

Example 1
Create a schema for this XML instance document:
    <Document classification="secret">
        <Para classification="unclassified"> ... </Para>
        <Para classification="secret"> ... </Para>
        <Para classification="unclassified"> ... </Para>
        <Para classification="secret"> ... </Para>
    </Document>

Things the Schema Should Check
For the instance document above:
• Are the correct elements and attributes being used?
• Are the classification values correct?
These are grammar checks.

One more thing to check
• Ensure that no <Para> element has a classification higher than the <Document> element's classification.
• top-secret is higher than secret, which is higher than confidential, which is higher than unclassified.
This is a business rule check.

XML Schema 1.0
XML Schema 1.0 just supported grammar checking. For the classification rule we needed to use Schematron. We typically created a validation pipeline: the instance document was passed first to an XML Schema validator and then to a Schematron validator.
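For readers who have not seen the Schematron half of such a pipeline, the business rule might be written roughly as follows. This is a sketch, not the author's actual Schematron schema; it assumes ISO Schematron with its default XPath 1.0 query language and the no-namespace Document/Para instance shown above.

```xml
<?xml version="1.0"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
    <sch:pattern>
        <!-- Each rule applies only to a Document with the given classification -->
        <sch:rule context="Document[@classification = 'secret']">
            <sch:assert test="not(Para/@classification = 'top-secret')">
                A secret Document must not contain a top-secret Para.
            </sch:assert>
        </sch:rule>
        <sch:rule context="Document[@classification = 'confidential']">
            <sch:assert test="not(Para/@classification = 'top-secret') and
                              not(Para/@classification = 'secret')">
                A confidential Document must not contain a top-secret or secret Para.
            </sch:assert>
        </sch:rule>
        <sch:rule context="Document[@classification = 'unclassified']">
            <sch:assert test="not(Para/@classification != 'unclassified')">
                An unclassified Document may contain only unclassified Paras.
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```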
New Capability!
XML Schema 1.1 supports both grammar checking and business rule checking.

The <assert> element
• The XML Schema 1.1 <assert> element is used to make assertions about the values of elements and attributes.
• I will use it to assert:
– If the Document's classification is secret then no Paras have a value equal to top-secret.
– If the Document's classification is confidential then no Paras have a value equal to top-secret and no Paras have a value equal to secret.
– If the Document's classification is unclassified then no Paras have a value equal to top-secret and no Paras have a value equal to secret and no Paras have a value equal to confidential.

Use XPath 2.0 to Express Assertions
    <assert test="xpath" />
The XPath must evaluate to either true or false. If it evaluates to true then the data (in the instance document) is valid, otherwise the data is invalid.

Here's the Assertion
    <xs:assert test="if (@classification eq 'secret') then
                         not(Para/@classification = 'top-secret')
                     else if (@classification eq 'confidential') then
                         not(Para/@classification = 'top-secret') and
                         not(Para/@classification = 'secret')
                     else if (@classification eq 'unclassified') then
                         not(Para/@classification = 'top-secret') and
                         not(Para/@classification = 'secret') and
                         not(Para/@classification = 'confidential')
                     else true()" />
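The three cases above all encode a single ordering of the classification levels. As an alternative sketch (not the formulation used in this tutorial), the ordering can be written once as a sequence and the levels compared by position using the XPath 2.0 index-of function. It assumes the only classification values are the four strings listed earlier; unlike the assertion above, an unrecognized value makes this assertion fail rather than pass.

```xml
<!-- Alternative sketch: rank each level by its position in an ordered sequence.
     It belongs inside the complexType of Document, like the assertion above. -->
<xs:assert test="every $p in Para satisfies
                     index-of(('unclassified', 'confidential', 'secret', 'top-secret'),
                              string($p/@classification))
                     le
                     index-of(('unclassified', 'confidential', 'secret', 'top-secret'),
                              string(@classification))"/>
```

Inside the quantified expression the context item is still the Document element, so the bare @classification refers to the Document's attribute while $p/@classification refers to each Para's.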
Equivalent
1 assert:
    <xs:assert test="if (@classification eq 'secret') then
                         not(Para/@classification = 'top-secret')
                     else if (@classification eq 'confidential') then
                         not(Para/@classification = 'top-secret') and
                         not(Para/@classification = 'secret')
                     else if (@classification eq 'unclassified') then
                         not(Para/@classification = 'top-secret') and
                         not(Para/@classification = 'secret') and
                         not(Para/@classification = 'confidential')
                     else true()" />
3 asserts (much easier to understand, I think):
    <xs:assert test="if (@classification eq 'secret') then
                         not(Para/@classification = ('top-secret'))
                     else true()" />
    <xs:assert test="if (@classification eq 'confidential') then
                         not(Para/@classification = ('top-secret', 'secret'))
                     else true()" />
    <xs:assert test="if (@classification eq 'unclassified') then
                         not(Para/@classification = ('top-secret', 'secret', 'confidential'))
                     else true()" />

Asserts are and'ed
• If you have multiple assert elements, then they are and'ed together.
• Thus, for a value to be valid, all asserts must evaluate to true.

Here's where Assertions go
Place assertions at the bottom of a <complexType> element:
    <xs:element name="Document">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Para" type="ParaType" maxOccurs="unbounded"> </xs:element>
            </xs:sequence>
            <xs:attribute name="classification" type="classificationLevels" use="required"/>
            <xs:assert test="if (@classification eq 'secret') then … else … true()" />
        </xs:complexType>
    </xs:element>
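Putting the pieces together, a complete schema might look like the sketch below. The definitions of classificationLevels and ParaType are not shown in the slides above, so the ones here are assumptions made only so that the example is self-contained: an enumeration of the four levels, and a text-only Para with a required classification attribute.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- Assumed definition: the slides name this type but do not show it -->
    <xs:simpleType name="classificationLevels">
        <xs:restriction base="xs:string">
            <xs:enumeration value="unclassified"/>
            <xs:enumeration value="confidential"/>
            <xs:enumeration value="secret"/>
            <xs:enumeration value="top-secret"/>
        </xs:restriction>
    </xs:simpleType>

    <!-- Assumed definition: text content plus a required classification attribute -->
    <xs:complexType name="ParaType">
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="classification" type="classificationLevels" use="required"/>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>

    <xs:element name="Document">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Para" type="ParaType" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:attribute name="classification" type="classificationLevels" use="required"/>
            <!-- The three asserts are and'ed together: all must evaluate to true -->
            <xs:assert test="if (@classification eq 'secret') then
                                 not(Para/@classification = ('top-secret'))
                             else true()"/>
            <xs:assert test="if (@classification eq 'confidential') then
                                 not(Para/@classification = ('top-secret', 'secret'))
                             else true()"/>
            <xs:assert test="if (@classification eq 'unclassified') then
                                 not(Para/@classification = ('top-secret', 'secret', 'confidential'))
                             else true()"/>
        </xs:complexType>
    </xs:element>

</xs:schema>
```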
Assertions always look down
[Diagram: the instance document drawn as a tree. The Document element, with its classification attribute and the assertion "if (@classification eq 'secret') then … else … true()", is at the top; its child Para elements, each with a classification attribute and text, hang below it.]

Assertions always look down
• Note that I placed the assertion on the <Document> element, not the <Para> element.
• From the <Document> element my XPath expression "looked down" to the <Para> (child) elements.
• If I had placed the assertion on the <Para> element my XPath expression would need to "look up" to the <Document> element (parent). In fact, that won't do what I want. The <Para> is the root for the XPath evaluation, so the XPath won't even see the <Document> element.

When evaluating an assertion, the element containing the assertion is considered to be the root element.
Now you understand why assertions always "look down".

The validity of an element depends only on the content of that element, and not on the context where it is used.
Michael Kay

Position of Assertions
Two use cases to illustrate the positioning of assertions:
Place the assertions on the complexType of the root element. This will give you flexibility in what you incorporate into your assertion: at a later date you may decide to incorporate additional factors into your assertion; since you've positioned it at the top of the XML tree you will be able to use any data in the current document.
When you want to optimize something like business rules for a particular document type, or for a major subsection of a document, then putting the assertions on the type of the document root or the root element of the subtree makes sense.
Place the assertions on the type that the assertion applies to. This makes the type (and its assertions) reusable across documents.
The best way to ensure rules are followed is to get them right in front of people at the exact point where the guidance is relevant. [Ross]

Position of assert is a Crucial Difference
    <BarnesAndNoble>                  assert: string-length(.//Publisher) le 140
        <Book>
            …
            <Publisher>____</Publisher>   assert: string-length(.) le 140
        </Book>
        …
    </BarnesAndNoble>
Note that both assertions are stating that the length of the Publisher element must be at most 140 characters. However, the assertions are critically different. The assertion on BarnesAndNoble says that the BarnesAndNoble element is invalid if the Publisher has a string length greater than 140 characters, whereas the assertion on Publisher says that the Publisher element is invalid if it has a string length greater than 140 characters. Do you see the difference? It's important that you do.

Run it!
Here's my folder hierarchy:
    xml-schemas1.1
        examples
            assertions
                classification
                    classification.xml
                    classification.xsd
Validate the instance document against the schema.
Do Lab1
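To watch the assertion fail, it can help to validate a deliberately invalid instance as well. The document below is a made-up test case, not one of the lab files: the Document is only confidential, yet its second Para is marked secret, so a 1.1 validator should report the assertion on Document as violated.

```xml
<?xml version="1.0"?>
<!-- Hypothetical negative test case for classification.xsd -->
<Document classification="confidential">
    <Para classification="unclassified"> ... </Para>
    <!-- Higher than the Document's own level, so the assertion on Document is false -->
    <Para classification="secret"> ... </Para>
</Document>
```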
Equivalent
These two declarations are equivalent:
    <xs:element name="BarnesAndNoble">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="Book" maxOccurs="unbounded" />
            </xs:sequence>
            <xs:assert test="every $i in Book satisfies string-length($i/Publisher) le 140" />
        </xs:complexType>
    </xs:element>

    <xs:element name="BarnesAndNoble">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="Book" maxOccurs="unbounded" />
            </xs:sequence>
            <xs:assert test="not(Book[string-length(Publisher) gt 140])" />
        </xs:complexType>
    </xs:element>

Does this work?
    <xs:element name="BarnesAndNoble">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="Book" maxOccurs="unbounded" />
            </xs:sequence>
            <xs:assert test="Book[string-length(Publisher) le 140]" />
        </xs:complexType>
    </xs:element>
No! Read the assertion as: "I assert that there exists a Book such that the string-length of its Publisher is le 140." A single Book with a short Publisher satisfies it, even if other Books violate the limit.

Example 2
Consider this element declaration:
    <xs:element name="Publisher" type="xs:string"/>
What values can the <Publisher> element have in an XML instance document?
    <Publisher>_________</Publisher>
Suppose the Publisher element is being used by Book:
    <xs:element name="Book">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="Author" type="xs:string"/>
                <xs:element name="Date" type="xs:gYear"/>
                <xs:element name="ISBN" type="xs:string"/>
                <xs:element ref="Publisher" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
Okay, what values can Publisher have?

Did you answer: The value of Publisher can be any unconstrained string.
If you're using XML Schema 1.1 then your answer may be wrong. The <assert> element can impose additional constraints on the Publisher element.

For example, the <Book> element is nested within a <BarnesAndNoble> element, which has an <assert> element that constrains the Publisher element to a maximum string length of 140 characters:

<xs:element name="BarnesAndNoble">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="Book" maxOccurs="unbounded" />
        </xs:sequence>
        <xs:assert test="not(Book[string-length(Publisher) gt 140])" />
    </xs:complexType>
</xs:element>

The <Book> element is also nested within a <Borders> element, which has an <assert> element that constrains the Publisher element to either 'McMillin Publishing', 'Dell Publishing Co.', 'Harper & Row', or 'Wrox Press':

<xs:element name="Borders">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="Book" maxOccurs="unbounded" />
        </xs:sequence>
        <xs:assert test="not(Book[not(Publisher = ('McMillin Publishing', 'Dell Publishing Co.', 'Harper &amp; Row', 'Wrox Press'))])" />
    </xs:complexType>
</xs:element>

See the examples folder, assertions, whats-the-value-space

What values can Publisher have?

- It could be an unbounded string, or
- It could be a string of max length 140, or
- It could be an enumeration list: 'McMillin Publishing', 'Dell Publishing Co.', 'Harper & Row', or 'Wrox Press'
- Or, it could be something else.

That's interesting, but so what?

Q: What are the valid values of Publisher?
A: You can't tell just by looking at the declaration.

To understand an element you must understand its ancestors.
(Ancestors may exert an action at a distance.)

Do Lab2

Example 3

• countries.xml is a list of countries.
• My schema declares a <country> element.
I want to validate that its value matches one of the values in countries.xml.

countries.xml:

<countries>
    <country>Afghanistan</country>
    <country>Albania</country>
    <country>Algeria</country>
    <country>American Samoa</country>
    <country>Andorra</country>
    <country>Angola</country>
    <country>Anguilla</country>
    <country>Antarctica</country>
    <country>Antigua and Barbuda</country>
    <country>Argentina</country>
    <country>Armenia</country>
    …
</countries>

Instance document:

<Example>
    <country>______</country>
</Example>

Check that the value of <country> matches a value in countries.xml.

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">

    <element name="Example">
        <complexType>
            <sequence>
                <element name="country" type="string" />
            </sequence>
            <assert test="country = doc('countries.xml')//country" />
        </complexType>
    </element>

</schema>

See the folder: validating-against-an-external-document

Bad News

• The cross-document validation described in the previous two slides is not permitted!
• You cannot use the doc() function in an assertion.

Why?

I think the working group felt that introducing context-dependent validation (where the validity of a document depends on factors other than the schema and the instance document) was a risky architectural innovation, and possibly a step that would be later regretted.
Michael Kay

Example 4

• In this example I write an assertion which asserts that a meeting's end time must be greater than its start time.
• I have three versions of the schema:
    – No namespace: the schema doesn't use targetNamespace
    – One namespace: the schema has a targetNamespace
    – Two namespaces: the start-time is in one namespace, the end-time is in another
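To make the rule in Example 4 concrete, here are two instance documents for the no-namespace case. They are illustrative only: the element names (Example, meeting-time, start-time, end-time) follow the no-namespace version of the schema, and the times are made up.

```xml
<!-- Document 1: satisfies the rule, because 10:30:00 is later than 09:00:00 -->
<Example>
    <meeting-time>
        <start-time>09:00:00</start-time>
        <end-time>10:30:00</end-time>
    </meeting-time>
</Example>

<!-- Document 2: violates the rule, because the meeting ends before it starts -->
<Example>
    <meeting-time>
        <start-time>14:00:00</start-time>
        <end-time>13:00:00</end-time>
    </meeting-time>
</Example>
```

Both documents are valid against an XSD 1.0 style content model (two xs:time children in sequence); only an assertion comparing the two values can tell them apart.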
[Diagram: Example contains meeting, which contains start-time and end-time (each with text content). assert on meeting: end-time gt start-time]

v1 - No Namespace

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">

    <element name="Example">
        <complexType>
            <sequence>
                <element name="meeting-time">
                    <complexType>
                        <sequence>
                            <element name="start-time" type="time" />
                            <element name="end-time" type="time" />
                        </sequence>
                        <assert test="end-time gt start-time" />
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>

</schema>

See meeting-time_v1.xsd in the folder: meeting-time

v2 - One Namespace

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.meeting.org"
        elementFormDefault="qualified">

    <element name="Example">
        <complexType>
            <sequence>
                <element name="meeting-time">
                    <complexType>
                        <sequence>
                            <element name="start-time" type="time" />
                            <element name="end-time" type="time" />
                        </sequence>
                        <assert test="end-time gt start-time"
                                xpathDefaultNamespace="##targetNamespace" />
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>

</schema>

See meeting-time_v2.xsd in the folder: meeting-time

xpathDefaultNamespace

• It is an optional attribute of xs:assert
• Here are its legal values:
    – ##targetNamespace: this means the default namespace within the XPath expression is the targetNamespace
    – ##defaultNamespace: this means the default namespace within the XPath expression is the schema's default namespace (which may be different than the targetNamespace)
    – ##local: this means the default namespace within the XPath expression is no namespace (this is the default value)
    – http://www.example.org: this means the default namespace within the XPath expression is the http://www.example.org namespace

Equivalent!

<assert test="end-time gt start-time"
        xpathDefaultNamespace="##targetNamespace" />

xmlns:m="http://www.meeting.org"
…
<assert test="m:end-time gt m:start-time" />

Place xpathDefaultNamespace on <xs:schema> element

• Rather than placing xpathDefaultNamespace on each <xs:assert> element, you can place it on the <xs:schema> element.
• Thus, in one fell swoop you can provide a default namespace for all XPath expressions.

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.meeting.org"
        xpathDefaultNamespace="##targetNamespace"
        elementFormDefault="qualified">
    ...
</schema>

Do Lab3

v3 - Two Namespaces

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.meeting.org"
        xmlns:t="http://www.times.org"
        elementFormDefault="qualified">

    <import namespace="http://www.times.org" schemaLocation="times.xsd" />

    <element name="Example">
        <complexType>
            <sequence>
                <element name="meeting-time">
                    <complexType>
                        <sequence>
                            <element name="start-time" type="time" />
                            <element ref="t:end-time" />
                        </sequence>
                        <assert test="t:end-time gt start-time"
                                xpathDefaultNamespace="##targetNamespace" />
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>

</schema>

See meeting-time_v3.xsd in the folder: meeting-time

[Diagram: meeting-time.xsd (targetNamespace="http://www.meeting.org", declaring Example, meeting, and start-time) imports times.xsd (targetNamespace="http://www.times.org", declaring end-time).]

Do Lab4

[Diagram: Example contains couples, which contains a list of name elements (each with text content). assert on couples: ??? (What is a useful business rule?)]
[Diagram: Example contains couples, which contains a list of name elements (each with text content). assert on couples: count(name) mod 2 = 0]

Example 5

• The XPath functions are available for use in your assertion, apart from doc() as noted earlier. (http://www.w3.org/TR/xpath-functions/)
• Here I use the count() function to assert that the number of <name> elements in <couples> must be even:

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">

    <element name="Example">
        <complexType>
            <sequence>
                <element name="couples">
                    <complexType>
                        <sequence>
                            <element name="name" maxOccurs="unbounded" />
                        </sequence>
                        <assert test="count(name) mod 2 = 0" />
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>

</schema>

See couples.xsd in the folder: couples

Do Lab5

Example 6

• In XSD 1.0 it was difficult to express "If the value of <A> is xyz then there should be a child element <foo>, and if the value of <A> is rst then there should be a child element <bar>."
• That is, expressing conditional presence was difficult.
• It is easy in XSD 1.1, using assertions.

Conditional Presence

<Transportation>
    <mode>air</mode>
    <*****>_____</*****>
</Transportation>

The element that appears after <mode> depends on the value of <mode>.
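Here is a sketch of what conditional presence looks like in instance documents. The child element names (airplane, boat) and the mode values (air, water) are assumptions chosen for illustration, as are the text values.

```xml
<!-- Document 1: mode is 'air', so the second child is expected to be <airplane> -->
<Transportation>
    <mode>air</mode>
    <airplane>Boeing 747</airplane>
</Transportation>

<!-- Document 2: mode is 'water', so the second child is expected to be <boat> -->
<Transportation>
    <mode>water</mode>
    <boat>Ferry</boat>
</Transportation>

<!-- Document 3: the kind of document the rule should reject,
     since the child element does not agree with the mode -->
<Transportation>
    <mode>air</mode>
    <boat>Ferry</boat>
</Transportation>
```

A plain xs:choice accepts all three documents; it takes an assertion relating <mode> to the chosen child to reject the third.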
Conditional Presence

<xs:element name="Transportation">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="mode" type="modeType" />
            <xs:choice>
                <xs:element name="airplane" type="xs:string" />
                <xs:element name="boat" type="xs:string" />
                <xs:element name="car" type="xs:string" />
            </xs:choice>
        </xs:sequence>
        <xs:assert test="if (mode eq 'air') then child::airplane
                         else if (mode eq 'water') then child::boat
                         else if (mode eq 'ground') then child::car
                         else false()" />
    </xs:complexType>
</xs:element>

Do Lab6

See conditional-mode-of-transportation.xsd in the folder: conditional-presence

Conflicting <assert> elements (Who wins?)

<BarnesAndNoble>                        assert: string-length(.//Publisher) le 140
    <Book>
        …
        <Publisher>____</Publisher>     assert: string-length(.) le 70
    </Book>
    …
</BarnesAndNoble>

Suppose one <assert> element says "The value of <Publisher> must be a string not to exceed 140 characters in length." Another <assert> element says "The value of <Publisher> must be a string not to exceed 70 characters in length." Which <assert> element wins?

Example: I have an <assert> element on the root element (<BarnesAndNoble>) which says "The value of each <Publisher> element must be a string not to exceed 140 characters in length." I have another <assert> element directly on the <Publisher> element which says: "The value of <Publisher> must be a string not to exceed 70 characters in length." Is <Publisher> constrained to a length of 140 characters or 70 characters?

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

    <!-- <Publisher>'s value must be no more than 70 characters in length. -->
    <xs:element name="Publisher">
        <xs:complexType>
            <xs:simpleContent>
                <xs:extension base="xs:string">
                    <xs:assert test="not(string-length(.) gt 70)" />
                </xs:extension>
            </xs:simpleContent>
        </xs:complexType>
    </xs:element>

    <xs:element name="Book">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="Author" type="xs:string"/>
                <xs:element name="Date" type="xs:gYear"/>
                <xs:element name="ISBN" type="xs:string"/>
                <xs:element ref="Publisher" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <!-- <Publisher>'s value must be no more than 140 characters in length. -->
    <xs:element name="BarnesAndNoble">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="Book" maxOccurs="unbounded" />
            </xs:sequence>
            <xs:assert test="not(Book[string-length(Publisher) gt 140])" />
        </xs:complexType>
    </xs:element>

</xs:schema>

See BookStore_v2.xsd in the folder: whats-the-value-space

Answer

The assertions are "anded" together: the value must satisfy all of them. In this case it means the value must be no longer than 70. It's quite possible to have assertions that really conflict, e.g. test="string-length() > 100" and test="string-length() < 3". In that case there are no valid instances. Indeed, it's possible to have a single assertion that can never be satisfied, e.g. "string-length() < 0" or more simply, test="false()"
Michael Kay

Example 7

Use <assert> to express this rule: A paragraph cannot appear nested within another paragraph unless there is an intervening table.

<Example>
    <paragraph>
        <table>…</table>
        <paragraph>
            <table>…</table>
            <paragraph>
                <table>…</table>
                <paragraph> … </paragraph>
            </paragraph>
        </paragraph>
    </paragraph>
</Example>

paragraph is recursive.
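To pin down what the rule in Example 7 allows and forbids, here is a pair of small instance documents. This is only a sketch built from the element names in the example above.

```xml
<!-- Document 1: allowed, because a table comes before the nested paragraph -->
<Example>
    <paragraph>
        <table>…</table>
        <paragraph>…</paragraph>
    </paragraph>
</Example>

<!-- Document 2: forbidden, because the inner paragraph is nested
     directly inside the outer paragraph with no intervening table -->
<Example>
    <paragraph>
        <paragraph>…</paragraph>
    </paragraph>
</Example>
```

The difference between the two documents is a sibling relationship (is there a table before the nested paragraph?), which is the kind of condition an XPath test can check and a content model alone cannot easily make mandatory only when a nested paragraph is present.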
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

    <xs:element name="Example">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="paragraph" type="paragraphType" maxOccurs="unbounded" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:complexType name="paragraphType" mixed="true">
        <xs:sequence>
            <xs:element name="table" type="xs:string" minOccurs="0" />
            <xs:element name="paragraph" type="paragraphType" minOccurs="0" />
        </xs:sequence>
        <xs:assert test="if (paragraph) then paragraph/preceding-sibling::table else true()" />
    </xs:complexType>

</xs:schema>

See paragraph.xsd in the paragraph folder

Relocate the <assert>

• In the previous example the <assert> was located right where it is needed.
• Question: What XPath expression would you use if the <assert> was positioned on the root element?

Answer

not(.//paragraph[child::*[1][self::paragraph]])

Quiz

• The paragraphType is defined recursively. Thus, there is no limit to the nesting.
• Write an <assert> which limits the nesting to no more than 10 deep.

A Level 1 manager has a maximum signature authority of $10K.

Equivalent:
    assert: every $i in purchase-request satisfies $i/cost le 10000
    assert: not(purchase-request[cost gt 10000])

[Diagram: Level_1_Manager_Signoffs contains a list of purchase-request elements; each purchase-request contains item and cost (each with text content).]

Example 8

Use <assert> to implement this rule: A Level 1 manager has a maximum signature authority of $10K.
Here's the schema:

<element name="Level_1_Manager_Signoffs">
    <complexType>
        <sequence>
            <element name="purchase-request" maxOccurs="unbounded">
                <complexType>
                    <sequence>
                        <element name="item" type="string" />
                        <element name="cost" type="decimal" />
                    </sequence>
                </complexType>
            </element>
        </sequence>
        <assert test="not(purchase-request[number(cost) gt 10000])"/>
    </complexType>
</element>

See Level_1_Manager_Signoffs.xsd in the purchase-requests folder

Subtype inherits asserts from base type

• Suppose complexType A is a subtype of complexType B (i.e., A derives-by-extension from B or A derives-by-restriction from B).
• If B has one or more <xs:assert> elements, then A inherits those asserts.

[Diagram: Publication (Title, Author, Date; assert: Date gt 1970) is extended by BookPublication (ISBN, Publisher; assert: Publisher eq 'Wrox Press'). BookPublication has two asserts.]

<xsd:complexType name="Publication">
    <xsd:sequence>
        <xsd:element name="Title" type="xsd:string" maxOccurs="unbounded"/>
        <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
        <xsd:element name="Date" type="xsd:date"/>
    </xsd:sequence>
    <xsd:assert test="Date gt xsd:date('1970-01-01')"
                xpathDefaultNamespace="##targetNamespace" />
</xsd:complexType>

"extends"

<xsd:complexType name="BookPublication">
    <xsd:complexContent>
        <xsd:extension base="Publication">
            <xsd:sequence>
                <xsd:element name="ISBN" type="xsd:string"/>
                <xsd:element name="Publisher" type="xsd:string"/>
            </xsd:sequence>
            <xsd:assert test="Publisher = 'Wrox Press'"
                        xpathDefaultNamespace="##targetNamespace" />
        </xsd:extension>
    </xsd:complexContent>
</xsd:complexType>

See BookStore.xsd in the inherit-asserts-from-base-complexType folder
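As a sketch of what inheritance means for an instance document, consider an element of type BookPublication. The element name Book, the namespace http://www.books.org, and all the values below are assumptions made for illustration; BookStore.xsd may declare things differently.

```xml
<!-- Document 1: satisfies both assertions.
     Inherited from Publication: Date is after 1970-01-01.
     Declared on BookPublication: Publisher is 'Wrox Press'. -->
<Book xmlns="http://www.books.org">
    <Title>Sample Title</Title>
    <Author>Sample Author</Author>
    <Date>2001-06-15</Date>
    <ISBN>0-000-00000-0</ISBN>
    <Publisher>Wrox Press</Publisher>
</Book>

<!-- Document 2: fails the assertion inherited from Publication,
     even though BookPublication's own assertion is satisfied. -->
<Book xmlns="http://www.books.org">
    <Title>Sample Title</Title>
    <Author>Sample Author</Author>
    <Date>1965-03-01</Date>
    <ISBN>0-000-00000-0</ISBN>
    <Publisher>Wrox Press</Publisher>
</Book>
```

The second document shows why it matters that asserts are inherited: nothing in the BookPublication definition mentions Date, yet the base type's rule still applies.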
Recap

• XML Schema 1.1 now enables you to make assertions, using the <assert> element: <assert test="XPath" />
• The <assert> element is placed at the bottom of a <complexType>.
• The XPath cannot use the doc() function to reference other documents.
• The XPath cannot "look up" the XML tree to the parent, grandparent, cousins, etc.

The portion of the DTD for XML Schema 1.1 that pertains to <assert>

<!ELEMENT assert ((annotation)?)>
<!ATTLIST assert
          test                   CDATA  #REQUIRED
          id                     ID     #IMPLIED
          xpathDefaultNamespace  CDATA  #IMPLIED>
<!-- the value of test is an XPath expression -->

<!ELEMENT complexType ((annotation)?,
                       (simpleContent | complexContent |
                        openContent?, (all | choice | sequence | group)?),
                       ((attribute | attributeGroup)*, (anyAttribute)?),
                       assert*)>

<!ELEMENT extension ((annotation)?,
                     (openContent?, (all | choice | sequence | group)?),
                     ((attribute | attributeGroup)*, (anyAttribute)?),
                     assert*)>

<!ELEMENT restriction ((annotation)?,
                       (simpleType?, (minExclusive | minInclusive | …)?),
                       ((attribute | attributeGroup)*, (anyAttribute)?),
                       assert*)>

Schematron Niche?

• XML Schema 1.0 did not have the ability to make assertions; thus, expressing things such as co-constraints was not possible.
• Schematron filled that niche.
• Q: Now that XML Schema 1.1 has the ability to make assertions, where does Schematron stand?
• A: Recall that you cannot do cross-document validation using the <assert> element. Schematron can. This is an important niche that Schematron can fill.

See my tutorial on Schematron: http://www.xfront.com/schematron/
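For comparison, here is a minimal sketch of how a cross-document check might be written in Schematron. It assumes ISO Schematron with the XSLT 2.0 query binding, an instance document whose root is <Example> with a <country> child, and a countries.xml file that the processor can resolve from the relative URI; none of this is taken from the Schematron tutorial referenced above.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">

    <sch:pattern>
        <!-- Fires for every country element that is a child of Example -->
        <sch:rule context="Example/country">
            <!-- doc() loads the external list; the general comparison (=)
                 is true if the value equals any one of the listed countries -->
            <sch:assert test=". = doc('countries.xml')//country">
                The country must match one of the values in countries.xml.
            </sch:assert>
        </sch:rule>
    </sch:pattern>

</sch:schema>
```

The test is the same comparison that is not permitted inside <xs:assert>; the difference is only that Schematron's XSLT-based query binding allows doc().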
Example to illustrate the need for Cross Document Validation

countries.xml:

<countries>
    <country>Afghanistan</country>
    <country>Albania</country>
    <country>Algeria</country>
    <country>American Samoa</country>
    <country>Andorra</country>
    <country>Angola</country>
    <country>Anguilla</country>
    <country>Antarctica</country>
    <country>Antigua and Barbuda</country>
    <country>Argentina</country>
    <country>Armenia</country>
    …
</countries>

Instance document:

<Example>
    <country>______</country>
</Example>

Check that the value of <country> matches a value in countries.xml.

Another example to illustrate the need for Cross Document Validation

ATM:

<Request type="withdrawal">
    <Amount>$500</Amount>
    <Member>John Doe</Member>
    <TimeStamp>2009-07-27T08:08:00</TimeStamp>
</Request>

<BankAccount>
    <Member>John Doe</Member>
    <Balance>$1000</Balance>
</BankAccount>

Check that this value (the Amount) is less than this value (the Balance).

Constraining simpleTypes using the assertion facet

http://www.w3.org/TR/xmlschema11-2/#rf-assertions

Example 1

Create a schema for this XML instance document:

<Example>
    <even-integer>____</even-integer>
</Example>

Even integers only, e.g. 2, 4, 6, 8, 10, …

The assertion Facet

XML Schema 1.1 has a new facet: assertion

<element name="even-integer">
    <simpleType>
        <restriction base="integer">
            <assertion test="xpath" />
        </restriction>
    </simpleType>
</element>

Notice that "test" is a departure from the other facets, which use "value", e.g. <minInclusive value="100" />

The $value Variable

• This is a built-in variable.
• Its value is the value of the current (context) element/attribute.

<element name="even-integer">
    <simpleType>
        <restriction base="integer">
            <assertion test="$value mod 2 = 0" />
        </restriction>
    </simpleType>
</element>

I assert that the value of <even-integer> must satisfy this XPath expression, i.e. only even integers are valid.

Run it!

Here's my folder hierarchy:

xml-schemas1.1
    examples
        assertions
            classification
            even-integers
                even-integers.xml
                even-integers_v1.xsd

Validate the instance document against the schema.

assertion Facet: it's for Every Data Type

• The <assertion> facet can be used with every data type.
• Multiple <assertion> facets can be used.
    – These facets may repeat:
        • pattern
        • enumeration
        • assertion
    – Unlike the pattern and enumeration facets (whose repeated values within one restriction are "or-ed"), if there are multiple assertion facets then they are "and-ed" together.

assertion plus other facets

You can use other facets along with assertion:

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">

    <element name="Example">
        <complexType>
            <sequence>
                <element name="even-integer">
                    <simpleType>
                        <restriction base="integer">
                            <minInclusive value="0" />
                            <maxInclusive value="100" />
                            <assertion test="$value mod 2 = 0" />
                        </restriction>
                    </simpleType>
                </element>
            </sequence>
        </complexType>
    </element>

</schema>

The integer must be even and it must be between 0 and 100.

See even-integers_v2.xsd

Do Lab7
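Here is a small sketch of repeated assertion facets being "and-ed". The type name multiple-of-six is hypothetical, and the fragment assumes the XML Schema namespace is the default namespace, as in the schemas above.

```xml
<!-- Both assertion facets must hold, so only integers divisible
     by 2 AND by 3 (that is, multiples of 6) are valid. -->
<simpleType name="multiple-of-six">
    <restriction base="integer">
        <assertion test="$value mod 2 = 0" />
        <assertion test="$value mod 3 = 0" />
    </restriction>
</simpleType>
```

Under this type, 12 and 18 would be accepted, while 4 (fails the second facet) and 9 (fails the first facet) would be rejected. Two pattern or enumeration facets in the same position would instead widen the set of accepted values.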
assertion plus matches() can replace the pattern facet

Instead of this:

<xs:simpleType name="English-language-family-name">
    <xs:restriction base="xs:string">
        <xs:minLength value="2" />
        <xs:maxLength value="100" />
        <xs:pattern value="[a-zA-Z' \.-]+" />
    </xs:restriction>
</xs:simpleType>

Do this:

<xs:simpleType name="English-language-family-name">
    <xs:restriction base="xs:string">
        <xs:minLength value="2" />
        <xs:maxLength value="100" />
        <xs:assertion test="matches($value, '^[a-zA-Z'' \.-]+$')" />
    </xs:restriction>
</xs:simpleType>

(The apostrophe inside the XPath string literal is doubled to escape it.)

There's a tricky point in substituting XPath regexes in the assertion facet for XSD regexes in the pattern facet. The XPath regex language *adds* the ^ and $ meta-characters, which force a match at the beginning and end of the string, respectively. These aren't needed in XSD regexes, since all matches are *always* done against the complete string. So these two facets mean different things:

<xs:pattern value="[\sa-zA-Z0-9,;:\.]*"/>
<xs:assertion test="matches($value, '[\sa-zA-Z0-9,;:\.]*')"/>

The regex is exactly the same in these two facets, but they *are not equivalent*. The pattern facet matches the regex against *the entire value* of the simple type; in contrast, the matches function in the assertion facet matches the regex against *any part of the value* of the simple type. Note, for example, that in neither case is the hyphen (-) allowed in the regular expression. A simple type with the first facet above would fail if the value contains a hyphen. However, a simple type with the second facet above would NOT fail with a hyphen; the second facet passes as long as *any* part of the value matches the regex.
To have equivalent facets, you would need to add the ^ and $ meta-characters to the XPath regex, as follows:

<xs:pattern value="[\sa-zA-Z0-9,;:\.]*"/>
<xs:assertion test="matches($value, '^[\sa-zA-Z0-9,;:\.]*$')"/>

The pattern facet is the same as above and repeated simply for ease in comparison. The only difference in the assertion is that the ^ begins the regex, and $ ends the regex; these meta-characters force the regex to match the entire string. These two facets are equivalent.

Another major difference: XPath regex supports back references. Example taken from the spec: The regular expression ('|").*\1 matches a sequence of characters delimited either by an apostrophe at the start and end, or by a quotation mark at the start and end. This makes XPath regexes much more vulnerable to Regular Expression Denial of Service (ReDOS) attacks, since the *only* way to implement back references is with a backtracking algorithm. At least a smart XSD validating parser could choose an efficient regex engine not vulnerable to ReDOS, but there's no such option with XPath.
Jonathan Cranford

Conditional Type Alternatives (a.k.a. CTA)

http://www.w3.org/TR/xmlschema11-1/#cTypeAlternative

Example 1

If kind="magazine" then the content must be MagazineType.
If kind="book" then the content must be BookType.

<Publication kind="_________">
</Publication>

Type Dependence

<Publication kind="_________">
</Publication>

The type of <Publication> depends on the value of the kind attribute.

xs:alternative

• The <alternative> element is used to provide alternate types for an element.
• Which type is selected is based on the results of a test.

<alternative test="XPath" type="type" />
<alternative test="XPath" type="type" />

If the test evaluates to true, then this type must be used.

Instance document (which type?):

<Publication kind="_________">
</Publication>

Schema document:

<xs:element name="Publication" type="PublicationType">
    <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
    <xs:alternative test="@kind eq 'book'" type="BookType" />
</xs:element>

• PublicationType is the declared type for <Publication>.
• MagazineType and BookType are the alternative types for <Publication>.
• "If the value of the kind attribute is 'magazine' then use MagazineType as <Publication>'s type."
• "If the value of the kind attribute is 'book' then use BookType as <Publication>'s type."
• "If the value of the kind attribute is not 'book' and not 'magazine' then use PublicationType as <Publication>'s type."

Do Lab8

See publication_v1.xsd in the folder: type-alternatives/publication

Alternates must derive from the declared type

The type specified in <alternative> must derive from the element's declared type.

[Diagram: PublicationType at the top, with BookType and MagazineType derived from it.]

In order for BookType and MagazineType to be alternative types, they must derive from <Publication>'s type (PublicationType).

[Diagram: the Publication element has type PublicationType (Title, Author (0 to unbounded occurrences), Date; attribute: kind). BookType "extends" PublicationType, adding ISBN and Publisher. MagazineType "restricts" PublicationType to Title and Date.]

See publication_v1.xsd in the folder: type-alternatives/publication

<xs:complexType name="PublicationType">
    <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Author" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="Date" type="xs:gYear"/>
    </xs:sequence>
    <xs:attribute name="kind" type="xs:string" />
</xs:complexType>

Since the kind attribute is of type string, the value of @kind can be 'book' (<alternative> will then force BookType as the type), or @kind can be 'magazine' (<alternative> will then force MagazineType as the type), or @kind can be anything else and then PublicationType will be the type.

Q: Suppose you always want the type to be restricted to one of the types specified by the <alternative> elements; how would you do that?
A: Restrict the value of @kind

<xs:complexType name="PublicationType">
    <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Author" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="Date" type="xs:gYear"/>
    </xs:sequence>
    <xs:attribute name="kind">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:enumeration value="magazine" />
                <xs:enumeration value="book" />
            </xs:restriction>
        </xs:simpleType>
    </xs:attribute>
</xs:complexType>

Now there are only two possible values for @kind, and the <alternative> elements specify the type for each value.

Do Lab9

See publication_v2.xsd in the folder: type-alternatives/publication

Inlined Type

On the <alternative> element you can specify the name of a type, or you can define a type inline:

<alternative test="XPath" type="type" />

<alternative test="XPath">
    <complexType>
        …
    </complexType>
</alternative>

<alternative test="XPath">
    <simpleType>
        …
    </simpleType>
</alternative>

<xs:element name="Publication" type="PublicationType" maxOccurs="unbounded">
    <xs:alternative test="@kind eq 'magazine'">
        <xs:complexType>
            <xs:complexContent>
                <xs:restriction base="PublicationType">
                    <xs:sequence>
                        <xs:element name="Title" type="xs:string"/>
                        <xs:element name="Date" type="xs:gYear"/>
                    </xs:sequence>
                </xs:restriction>
            </xs:complexContent>
        </xs:complexType>
    </xs:alternative>
    <xs:alternative test="@kind eq 'book'">
        <xs:complexType>
            <xs:complexContent>
                <xs:extension base="PublicationType">
                    <xs:sequence>
                        <xs:element name="ISBN" type="xs:string"/>
                        <xs:element name="Publisher" type="xs:string"/>
                    </xs:sequence>
                </xs:extension>
            </xs:complexContent>
        </xs:complexType>
    </xs:alternative>
</xs:element>

The two <xs:complexType> elements inside the <xs:alternative> elements are the inlined type definitions.

See publication_v3.xsd in the folder: type-alternatives/publication
xpathDefaultNamespace

The <assert> element, the <assertion> facet, and the <alternative> element have an optional attribute, xpathDefaultNamespace:

<alternative test="XPath" type="type" xpathDefaultNamespace="___" />
<assert test="xpath" xpathDefaultNamespace="___" />
<assertion test="xpath" xpathDefaultNamespace="___" />

The values for ___ are: ##targetNamespace, ##defaultNamespace, ##local, or any URI.

Can't look up, can't look down

• The XPath expression in the <alternative> element can only reference attributes of the current (context) element.
• It cannot reference ancestor elements.
• It cannot reference descendant elements.

<xs:element name="Publication" type="PublicationType">
    <xs:alternative test="___" type="…" />
</xs:element>

This XPath expression can only reference attributes of Publication.

Can use <assert> instead

• Rather than using <alternative> you could use <assert>
• Here's how:
    – Group all the elements together. Make some of them optional.
        • ISBN and Publisher should occur only with @kind='book' so make them optional.
    – Create an assertion that identifies the child elements that should be present, given the value of @kind

Group all elements together

<xs:complexType name="PublicationType">
    <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Author" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="Date" type="xs:gYear"/>
        <xs:element name="ISBN" type="xs:string" minOccurs="0"/>
        <xs:element name="Publisher" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="kind" type="xs:string" />
    <xs:assert test="…" />
</xs:complexType>
Create an assertion

<xs:assert test="if (@kind eq 'book') then
                   Title and Date and ISBN and Publisher and
                   empty(* except (Title[1], Date[1], Author, ISBN[1], Publisher[1]))
                 else if (@kind eq 'magazine') then
                   Title and Date and
                   empty(* except (Title[1], Date[1]))
                 else
                   Title and Date and
                   empty(* except (Title[1], Date[1], Author))" />

The complete schema:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:element name="Example">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Publication" type="PublicationType" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:complexType name="PublicationType">
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="Date" type="xs:gYear"/>
      <xs:element name="ISBN" type="xs:string" minOccurs="0"/>
      <xs:element name="Publisher" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="kind" type="xs:string" />
    <xs:assert test="if (@kind eq 'book') then
                       Title and Date and ISBN and Publisher and
                       empty(* except (Title[1],Date[1],Author,ISBN[1],Publisher[1]))
                     else if (@kind eq 'magazine') then
                       Title and Date and
                       empty(* except (Title[1],Date[1]))
                     else
                       Title and Date and
                       empty(* except (Title[1],Date[1], Author))" />
  </xs:complexType>

</xs:schema>

See publication_v4.xsd in the folder: type-alternatives/publication

Do Lab10

Subject: Best Practice: constrain an element's content by (1) a run-time selection of alternate types or (2) a run-time selection of child elements using an XPath expression?
Consider this book publication:

<Publication kind="book">
  <Title>Everything is Miscellaneous</Title>
  <Author>David Weinberger</Author>
  <Date>2007</Date>
  <ISBN>0-8050-8811-3</ISBN>
  <Publisher>Henry Holt and Company, LLC</Publisher>
</Publication>

Next, consider this magazine publication:

<Publication kind="magazine">
  <Title>Science News</Title>
  <Date>2005</Date>
</Publication>

Notice the *kind* attribute in both examples. If its value is 'book' then the content of <Publication> is:
- Title
- Author
- Date
- ISBN
- Publisher

And if its value is 'magazine' then the content of <Publication> is:
- Title
- Date

PROBLEM STATEMENT

What is best practice for constraining the content of Publication?

XML SCHEMA 1.1 PROVIDES TWO APPROACHES

XML Schema 1.1 provides two approaches to constraining the content of the <Publication> element.

APPROACH #1: ALTERNATE TYPES

Create a BookType and a MagazineType and then select one of them to be <Publication>'s type depending on @kind:

if @kind = 'book' then select BookType
else select MagazineType

Here's how it is expressed in XML Schema 1.1:

<xs:element name="Publication" type="PublicationType">
  <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
  <xs:alternative test="@kind eq 'book'" type="BookType" />
</xs:element>

You see the (new) <alternative> element being used to select a type for <Publication> based on the value of @kind.
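The two named types referenced by the <alternative> elements are not shown at this point. The following is a minimal sketch, assuming they mirror the inline definitions shown earlier for publication_v3.xsd: MagazineType restricts PublicationType by dropping Author, and BookType extends it with ISBN and Publisher. The type bodies are an assumption made for illustration; the definitions in the lab files may differ.

```xml
<?xml version="1.0"?>
<!-- Sketch only: MagazineType and BookType are assumed definitions,
     modelled on the inline types of the earlier example. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:element name="Example">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Publication" type="PublicationType" maxOccurs="unbounded">
          <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
          <xs:alternative test="@kind eq 'book'" type="BookType" />
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Declared type: used when no alternative's test is true -->
  <xs:complexType name="PublicationType">
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="Date" type="xs:gYear"/>
    </xs:sequence>
    <xs:attribute name="kind" type="xs:string" />
  </xs:complexType>

  <!-- Magazine: Title and Date only (Author removed by restriction) -->
  <xs:complexType name="MagazineType">
    <xs:complexContent>
      <xs:restriction base="PublicationType">
        <xs:sequence>
          <xs:element name="Title" type="xs:string"/>
          <xs:element name="Date" type="xs:gYear"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <!-- Book: everything in PublicationType, plus ISBN and Publisher -->
  <xs:complexType name="BookType">
    <xs:complexContent>
      <xs:extension base="PublicationType">
        <xs:sequence>
          <xs:element name="ISBN" type="xs:string"/>
          <xs:element name="Publisher" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Both alternative types are derived from the declared type, PublicationType, which is what allows them to be substituted for it.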
APPROACH #2: XPATH EXPRESSION

Let the content of <Publication> be a collection of all the elements (both book elements and magazine elements) and set them optional:
- Title (0,1)
- Author (0, unbounded)
- Date (0,1)
- ISBN (0,1)
- Publisher (0,1)

Then create an XPath expression that selects the set of children for <Publication> depending on the value of @kind:

if (@kind eq 'book') then
  Title and Date and ISBN and Publisher and
  empty(* except (Title[1],Date[1],Author,ISBN[1],Publisher[1]))
else if (@kind eq 'magazine') then
  Title and Date and
  empty(* except (Title[1],Date[1]))
else true()

Here's how it is expressed in XML Schema 1.1:

<xs:element name="Publication">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Title" type="xs:string" minOccurs="0"/>
      <xs:element name="Author" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="Date" type="xs:gYear" minOccurs="0"/>
      <xs:element name="ISBN" type="xs:string" minOccurs="0"/>
      <xs:element name="Publisher" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="kind" type="xs:string" />
    <xs:assert test="if (@kind eq 'book') then
                       Title and Date and ISBN and Publisher and
                       empty(* except (Title[1],Date[1],Author,ISBN[1],Publisher[1]))
                     else if (@kind eq 'magazine') then
                       Title and Date and
                       empty(* except (Title[1],Date[1]))
                     else
                       Title and Date and
                       empty(* except (Title[1],Date[1], Author))" />
  </xs:complexType>
</xs:element>

You see that the content of <Publication> is all the book and magazine elements and they are optional. You see an XPath expression within the (new) <assert> element being used to constrain which child elements are allowed within Publication based on the value of @kind.
TWO APPROACHES

You have seen two ways of solving the problem of constraining the content of <Publication>:
(a) Run-time selection of alternate types
(b) Run-time selection of child elements using XPath

DEFINITION OF "RUN-TIME"

By "run-time" I mean that the content of <Publication> is not determined until an instance document is validated against a schema.

WHICH IS BEST PRACTICE?

Which approach is best practice? What are the pros and cons of each approach?

Reply from Michael Kay:

I think the best advice is probably: if you can do it conveniently using Conditional Type Alternative (CTA), i.e., the <alternative> element, (as you can here), then do. Otherwise use assertions. There are a number of reasons for this.

(1) What is sometimes called the "rule of least power": don't use a chainsaw to snap a twig.

(2) More concretely:
(2a) A schema validator is more likely to adopt a streaming implementation for CTA than for assertions.
(2b) The schema validator is likely to produce better diagnostics if you describe the constraint using CTA.
(2c) You are likely to get a more precise type annotation on the element if you use CTA, which gives benefits when writing schema-aware stylesheets and queries.

Reply from Roger Costello:

An advantage of using <assert> rather than <alternative> is that you can factor in more information using <assert>. The <assert> can be placed at the top of the document, thereby enabling it to factor in information from the entire document. With <alternative> you can make a decision based only on the attributes of the current element. That's pretty limiting.

Example 2

<Meeting start-time="_____" end-time="______"> </Meeting>

1. Check that the end-time is greater than the start-time: use the <assert> element.
2.
If the end-time is before noon, bring tea to the meeting; if the end-time is after noon, bring juice to the meeting: use the <alternative> element.

[Diagram: MeetingType (child: Subject) is extended by MorningMeeting (adds Tea) and by AfternoonMeeting (adds Juice).]

Sample Instance Document

<?xml version="1.0" encoding="utf-8"?>
<Example>
  <Meeting start-time="08:00:00" end-time="09:00:00">
    <Subject>Discuss the new project</Subject>
    <Tea>green</Tea>
  </Meeting>
  <Meeting start-time="13:00:00" end-time="14:00:00">
    <Subject>Discuss the new project</Subject>
    <Juice>apple</Juice>
  </Meeting>
</Example>

• On each <Meeting>, use <assert> to check that the value of end-time is after the value of start-time.
• The first <Meeting> must be of type MorningMeeting since its end-time is before noon. Express this type constraint using <alternative>.
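To make the two rules concrete, here is a sketch of an instance document that a schema enforcing them should reject. The values are made up for illustration; the first <Meeting> breaks the <assert> rule and the second breaks the type selected by <alternative>.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative counterexample (hypothetical data): both meetings should fail validation -->
<Example>
  <!-- Rule 1 violated: end-time (08:30:00) is earlier than start-time (09:00:00) -->
  <Meeting start-time="09:00:00" end-time="08:30:00">
    <Subject>Discuss the new project</Subject>
    <Tea>green</Tea>
  </Meeting>
  <!-- Rule 2 violated: the meeting ends before noon, so Tea is expected, not Juice -->
  <Meeting start-time="10:00:00" end-time="11:00:00">
    <Subject>Discuss the new project</Subject>
    <Juice>apple</Juice>
  </Meeting>
</Example>
```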
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:element name="Example">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Meeting" type="MeetingType" maxOccurs="unbounded">
          <xs:alternative test="@end-time le '12:00:00'" type="MorningMeeting" />
          <xs:alternative test="@end-time gt '12:00:00'" type="AfternoonMeeting" />
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:complexType name="MeetingType">
    <xs:sequence>
      <xs:element name="Subject" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="start-time" type="xs:time" />
    <xs:attribute name="end-time" type="xs:time" />
    <xs:assert test="@end-time gt @start-time" />
  </xs:complexType>

  <xs:complexType name="MorningMeeting">
    <xs:complexContent>
      <xs:extension base="MeetingType">
        <xs:sequence>
          <xs:element name="Tea" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="AfternoonMeeting">
    <xs:complexContent>
      <xs:extension base="MeetingType">
        <xs:sequence>
          <xs:element name="Juice" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>

See meeting.xsd in the folder: type-alternatives/meeting

Example 3

If the current-time is before noon, the value of <Beverage> must be "tea". If the current-time is after noon, the value of <Beverage> must be "juice".

<Beverage current-time="_____"> </Beverage>

[Diagram: BeverageType (datatype xs:string, attribute current-time) is restricted by MorningBeverage (xs:string restricted to one value: "tea") and by AfternoonBeverage (xs:string restricted to one value: "juice").]
Sample Instance Document

<?xml version="1.0" encoding="utf-8"?>
<Example>
  <Beverage current-time="08:00:00">tea</Beverage>
  <Beverage current-time="13:00:00">juice</Beverage>
</Example>

The first <Beverage> must be of type MorningBeverage since current-time is before noon. Express this type constraint using <alternative>:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:element name="Example">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Beverage" type="BeverageType" maxOccurs="unbounded">
          <xs:alternative test="@current-time le '12:00:00'" type="MorningBeverage" />
          <xs:alternative test="@current-time gt '12:00:00'" type="AfternoonBeverage" />
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:complexType name="BeverageType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="current-time" type="xs:time" use="required" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:complexType name="MorningBeverage">
    <xs:simpleContent>
      <xs:restriction base="BeverageType">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value="tea" />
          </xs:restriction>
        </xs:simpleType>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>

  <xs:complexType name="AfternoonBeverage"> … </xs:complexType>

</xs:schema>

See beverage.xsd in the folder: type-alternatives/beverage

Multiple <alternative> elements apply ... which one wins?

Suppose that several <alternative> elements apply. Which one will be used?

Example: the <Beverage> element below has two <alternative> elements.
If the value of the current-time attribute is 08:00:00 then both <alternative> elements apply. Which one will be used?

<xs:element name="Beverage" type="BeverageType">
  <xs:alternative test="@current-time le '12:00:00'" type="MorningBeverage" />
  <xs:alternative test="@current-time le '09:00:00'" type="EarlyMorningBeverage" />
</xs:element>

Answer

The first alternative with a test that evaluates to true is used. In the example above, MorningBeverage is used for 08:00:00.

Section 3.3.4.1 (http://www.w3.org/TR/xmlschema11-1/#sec-sistd):

Given a Type Table T and an element information item E, T conditionally selects a type S for E in the following way. The {test} expressions in T's {alternatives} are evaluated, in order, until one of the Type Alternatives ·successfully selects· a type definition for E, or until all have been tried without success.

The Portion of the DTD for XML Schema 1.1 that Pertains to <alternative>

<!ELEMENT alternative ((annotation)?, (simpleType | complexType)?) >
<!ATTLIST alternative
    test                   CDATA  #IMPLIED   <!-- The value of test is an XPath expression -->
    type                   CDATA  #IMPLIED   <!-- The value of type is a QName -->
    xpathDefaultNamespace  CDATA  #IMPLIED
    id                     ID     #IMPLIED >

<!ELEMENT element ((annotation)?, (complexType | simpleType)?, (alternative)*, (unique | key | keyref)*)>

xs:error

http://www.w3.org/TR/xmlschema11-1/#xsd-error
xs:error Datatype

• This is a new datatype.
• It is used to trigger an error.
• (Recall the example in the previous section.) We can use xs:error to generate an error if @kind is neither 'book' nor 'magazine':

<xs:element name="Publication" type="PublicationType" maxOccurs="unbounded">
  <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
  <xs:alternative test="@kind eq 'book'" type="BookType" />
  <xs:alternative test="(@kind ne 'book') and (@kind ne 'magazine')" type="xs:error" />
</xs:element>

The third alternative says: if @kind is neither 'book' nor 'magazine' then generate an error.

Do Lab11

Equivalent!

If the kind attribute is declared like this:

<attribute name="kind" type="string" />

then the following CTA will ensure that <Publication> has MagazineType content when kind equals 'magazine', that <Publication> has BookType content when kind equals 'book', and that an error is generated when kind has some other value:

<xs:element name="Publication" type="PublicationType">
  <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
  <xs:alternative test="@kind eq 'book'" type="BookType" />
  <xs:alternative test="(@kind ne 'book') and (@kind ne 'magazine')" type="xs:error" />
</xs:element>

In this case xs:error is used to trigger an error when kind does not have the value 'magazine' or 'book'.

If the kind attribute is declared like this:

<attribute name="kind">
  <simpleType>
    <restriction base="string">
      <enumeration value="book" />
      <enumeration value="magazine" />
    </restriction>
  </simpleType>
</attribute>

then this CTA has the same functionality as the other one:

<xs:element name="Publication" type="PublicationType">
  <xs:alternative test="@kind eq 'magazine'" type="MagazineType" />
  <xs:alternative test="@kind eq 'book'" type="BookType" />
</xs:element>

xs:error redundant?
• The previous slide shows that xs:error is not needed there: the same functionality can be achieved without it.
• Q: Is xs:error redundant?
• A: No. Suppose PublicationType is declared in a schema that you cannot modify. And suppose it declares kind to be of type string. Then you will have to use xs:error.

Example 2

Suppose there is a Book schema that you want to use. It declares an optional Review element:

<xs:element name="Book">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Title" />
      <xs:element ref="Author" />
      <xs:element ref="Date" />
      <xs:element ref="ISBN" />
      <xs:element ref="Publisher" />
      <xs:element ref="Review" minOccurs="0" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Title" type="xs:string"/>
<xs:element name="Author" type="xs:string"/>
<xs:element name="Date" type="xs:gYear"/>
<xs:element name="ISBN" type="xs:string"/>
<xs:element name="Publisher" type="xs:string"/>
<xs:element name="Review" type="xs:string"/>

Example 2 (cont.)

Suppose that in your schema you don't want the Review element, so you override Review's type (we'll see the <override> element later):

<xs:override schemaLocation="Book.xsd">
  <xs:element name="Review" type="xs:error" />
</xs:override>

<xs:element name="BookStore">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Book" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

In effect: "Review is optional, but if you do use it you'll get an error."

See the error folder for this example. Thanks to Michael Sperberg-McQueen for this example.

Inherited Attributes

Motivation

<Meeting start-time="_____" end-time="_____">
  <Beverage>_____</Beverage>
</Meeting>

Use the MorningBeverage type if end-time is before noon.
Use the AfternoonBeverage type if end-time is after noon.

Q: Will this Work?

<alternative test="../@end-time le '12:00:00'" type="MorningBeverage" />
<alternative test="../@end-time ge '12:00:00'" type="AfternoonBeverage" />

A: No!

Both tests try to reference an ancestor (the parent) through ../@end-time. The <alternative> element can only reference attributes on the current element.

If only ….

[Diagram: element Meeting, with attributes start-time and end-time, contains element Beverage.]

If only we could get these attributes to be "inherited" by the Beverage element …

We can!

If an attribute of an ancestor element of E is declared as inheritable, then you can refer to it in an <alternative> element as if it appeared on element E itself.

<attribute name="start-time" type="time" inheritable="true" />
<attribute name="end-time" type="time" inheritable="true" />

<xs:element name="Example">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Meeting" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Beverage" type="BeverageType">
              <xs:alternative test="@end-time le '12:00:00'" type="MorningBeverage" />
              <xs:alternative test="@end-time gt '12:00:00'" type="AfternoonBeverage" />
            </xs:element>
          </xs:sequence>
          <xs:attribute name="start-time" type="xs:time" inheritable="true" />
          <xs:attribute name="end-time" type="xs:time" inheritable="true" />
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The <alternative> elements are testing end-time, which is an attribute of Meeting. Normally that's illegal.
Since end-time is declared to be inheritable, it is as though it were declared on the Beverage element. And thus the <alternative> element can use it.

See meeting-beverage.xsd in the folder: inheritable-attributes/meeting-beverage

Only <alternative>

• Inherited attributes can be used only by the <alternative> element, not by the <assert> element.
• Inherited attributes provide a mechanism to circumvent the “can only look at attributes” restriction.
• Since inherited attributes can only be used by <alternative> elements, the <alternative> element should be favored.

Which Design is Better?

<?xml version="1.0"?>
<BarnesAndNoble>
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>New Riders</Publisher>
  </Book>
</BarnesAndNoble>

or

<?xml version="1.0"?>
<BookStore storename="BarnesAndNoble">
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>New Riders</Publisher>
  </Book>
</BookStore>

Business Rule

The value of Publisher depends on the store:
• If the store is BarnesAndNoble then Publisher can be either Wrox Press or New Riders
• If the store is Borders then Publisher can be either Norton Press or friendsofed
Invalid

<?xml version="1.0"?>
<BarnesAndNoble>
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>O’Reilly</Publisher>
  </Book>
</BarnesAndNoble>

<?xml version="1.0"?>
<BookStore storename="BarnesAndNoble">
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>O’Reilly</Publisher>
  </Book>
</BookStore>

Given the business rule on the previous slide, this data is invalid.

What is Invalid?

In the first document, is BarnesAndNoble invalid or is Publisher invalid? In the second document, is BookStore invalid or is Publisher invalid?

What is the Business Rule a Statement About?

• Is the business rule a statement about what are valid BarnesAndNoble and Borders documents?
• Or, is the business rule a statement about what is a valid value of Publisher given its context?

B.R.
is a Statement re: B&N and Borders

Design the XML Schema with the business rule expressed on the root element. Either document design supports this: place the assertion on <BarnesAndNoble> or on <BookStore storename="BarnesAndNoble">.

Assert: Book/Publisher = ('Wrox Press', 'New Riders')

<?xml version="1.0"?>
<BarnesAndNoble>
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>O’Reilly</Publisher>
  </Book>
</BarnesAndNoble>

<?xml version="1.0"?>
<BookStore storename="BarnesAndNoble">
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>O’Reilly</Publisher>
  </Book>
</BookStore>

See Assert-Version.xsd in the folder: inheritable-attributes/book

B.R. is a Statement re: Publisher

Design the XML Schema with the business rule expressed on the Publisher element.

With the <BarnesAndNoble> design, the business rule, applied to the Publisher element, cannot be expressed! The store name is in an element name, and an <alternative> test can only examine attributes.

With the <BookStore storename="BarnesAndNoble"> design, place <alternative> elements on Publisher:

Alternative: if @storename='BarnesAndNoble' then text() = ('Wrox Press', 'New Riders')
Alternative: if @storename='Borders' then text() = ('Norton Press', 'friendsofed')

See Alternative-Version.xsd in the folder: inheritable-attributes/book
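The two "Alternative:" callouts above are pseudocode. One way they might be written in XSD 1.1 is sketched below. This is not the content of Alternative-Version.xsd; the type names BarnesAndNoblePublisher and BordersPublisher are hypothetical, and storename is assumed to be a required, inheritable attribute of BookStore.

```xml
<?xml version="1.0"?>
<!-- Sketch only: type names are hypothetical, not taken from Alternative-Version.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:element name="BookStore">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Title" type="xs:string"/>
              <xs:element name="Author" type="xs:string"/>
              <xs:element name="Date" type="xs:gYear"/>
              <xs:element name="ISBN" type="xs:string"/>
              <!-- Publisher's type is chosen from the inherited storename attribute -->
              <xs:element name="Publisher" type="xs:string">
                <xs:alternative test="@storename eq 'BarnesAndNoble'"
                                type="BarnesAndNoblePublisher" />
                <xs:alternative test="@storename eq 'Borders'"
                                type="BordersPublisher" />
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- inheritable="true" makes storename visible to descendants' alternatives -->
      <xs:attribute name="storename" type="xs:string" use="required" inheritable="true" />
    </xs:complexType>
  </xs:element>

  <xs:simpleType name="BarnesAndNoblePublisher">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Wrox Press" />
      <xs:enumeration value="New Riders" />
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="BordersPublisher">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Norton Press" />
      <xs:enumeration value="friendsofed" />
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```

In this sketch, a storename other than the two listed falls back to the declared type, xs:string, so any publisher would be accepted for an unknown store.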
How you interpret your business rules has a profound impact on XML design

Consider this book store document:

<?xml version="1.0"?>
<BookStore storename="BarnesAndNoble">
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>New Riders</Publisher>
  </Book>
  ...
</BookStore>

The document is used by a community that has this business rule:

-------------------------------------------------
Business Rule

The value of Publisher depends on the store:
If the store is BarnesAndNoble then Publisher can be either Wrox Press or New Riders
If the store is Borders then Publisher can be either Norton Press or friendsofed
-------------------------------------------------

Given that business rule, this is invalid (because 'friendsofed' is an invalid publisher for the store 'BarnesAndNoble'):

<?xml version="1.0"?>
<BookStore storename="BarnesAndNoble">
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>friendsofed</Publisher>
  </Book>
</BookStore>

What is invalid?
- Is BookStore invalid?
- Or, is Publisher invalid?

Is the business rule a statement about what are valid BookStores? Or, is the business rule a statement about what are valid values of Publisher given its context?

The question is important. Its answer has a profound impact on XML design.

IMPACT ON XML DESIGN

If the business rule is a statement about what are valid BookStores then, when you design your XML Schema, you should position an <assert> element on the BookStore element declaration:

Assert: Book/Publisher = ('Wrox Press', 'New Riders')
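The "Assert:" line above is shorthand. A minimal sketch of how it might be written as an actual <xs:assert> on BookStore follows; it is not the content of Assert-Version.xsd. It assumes that every Book in the store must satisfy the rule, so it uses "every … satisfies" (a bare general comparison such as Book/Publisher = (...) is true as soon as any one Publisher matches), and it assumes that stores other than the two named are unconstrained.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the exact assertion in Assert-Version.xsd may differ -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:element name="BookStore">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Title" type="xs:string"/>
              <xs:element name="Author" type="xs:string"/>
              <xs:element name="Date" type="xs:gYear"/>
              <xs:element name="ISBN" type="xs:string"/>
              <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="storename" type="xs:string" use="required" />
      <!-- The rule is stated about the whole BookStore, so it can look down at every Book -->
      <xs:assert test="if (@storename eq 'BarnesAndNoble') then
                         (every $p in Book/Publisher
                          satisfies $p = ('Wrox Press', 'New Riders'))
                       else if (@storename eq 'Borders') then
                         (every $p in Book/Publisher
                          satisfies $p = ('Norton Press', 'friendsofed'))
                       else true()" />
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With this design a bad Publisher makes the BookStore element invalid, rather than the Publisher element itself, which is the distinction the question above is drawing.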
If the business rule is a statement about what are valid values of Publisher given its context then, when you design your XML Schema, you should position <alternative> elements in the Publisher element declaration:

Alternative: if @storename='BarnesAndNoble' then text() = ('Wrox Press', 'New Riders')
Alternative: if @storename='Borders' then text() = ('Norton Press', 'friendsofed')

and you should declare the storename attribute to be "inheritable".

You must specify the store name in an attribute value and not in an element name. The following XML design would make it impossible to implement the business rule:

<?xml version="1.0"?>
<BarnesAndNoble>
  <Book>
    <Title>Don't Make Me Think</Title>
    <Author>Steve Krug</Author>
    <Date>2006</Date>
    <ISBN>0-321-34475-8</ISBN>
    <Publisher>New Riders</Publisher>
  </Book>
</BarnesAndNoble>

QUESTION

Is the business rule a statement about what are valid BookStores or is it a statement about what are valid values of Publisher given its context?

Popular inherited attribute

Declare xml:lang to be inheritable:

<Root xml:lang="fr">
  … any element in here can have an <alternative> element that uses xml:lang …
</Root>

Do Lab12

We've seen that by declaring an attribute to be inheritable, descendant elements can use the attribute in their <alternative> elements. But suppose an attribute is declared to be both inheritable and required (use="required"). Does that mean descendant elements must display that attribute in instance documents?
For example, <Meeting> has two required, inheritable attributes:

<element name="Meeting">
  <complexType>
    <sequence>
      <element name="Beverage" type="b:BeverageType">
        <alternative test="@end-time le '12:00:00'" type="b:MorningBeverage" />
        <alternative test="@end-time gt '12:00:00'" type="b:AfternoonBeverage" />
      </element>
    </sequence>
    <attribute name="start-time" type="xs:time" use="required" inheritable="true" />
    <attribute name="end-time" type="xs:time" use="required" inheritable="true" />
  </complexType>
</element>

Q: In an instance document must the <Beverage> element have the two inherited attributes?

<Meeting start-time="___" end-time="___">
  <Beverage start-time="___" end-time="___"> ... </Beverage>
</Meeting>

A: No. Inherited attributes are not written on the descendant elements in instance documents; only <Meeting> carries them.

Multiple inheritable attributes with the same name ... who wins?

Suppose a <Beverage> element has multiple ancestor elements with an inheritable attribute, start-time. The <Beverage> element has an <alternative> element that references start-time. Which start-time applies?

Example: Suppose each start-time attribute is inheritable:

--------------------------------------
<Conference start-time="08:00:00">
  <Meeting start-time="13:00:00">
    <Beverage>Juice</Beverage>
  </Meeting>
</Conference>
--------------------------------------

Which start-time is visible to Beverage?

Answer

It's the closest one.

The rules are in 3.3.5.6 Inherited Attributes. To paraphrase, this says that:
(a) all inheritable attributes of ancestors of an element E are "potentially inherited" by E
(b) the actual [inherited attributes] are attributes that are potentially inherited excluding any that are "masked" by an inner inherited attribute of the same name.
Section 3.12.4 (rule 1.1.3) then says that in practice, the only [inherited attributes] that are relevant are those that do not have the same name as one of the element's "real" attributes.

One corner case to be aware of is

<a att="3">
  <b att="4">
    <c/>
  </b>
</a>

where element <a> defines @att as an inherited attribute, while element <b> defines @att as a non-inherited attribute. In this case element <c> effectively has the value <c att="3"/>.

Michael Kay

Inherited Attribute + <alternative> + <assert>

• Imagine an <alternative> element that uses inherited attributes, and suppose the type selected by that <alternative> contains an <assert> that applies a constraint on a descendant element.
• Then, to understand that descendant element you must understand its ancestor element that contains the <assert>. But the <assert> depends on the <alternative>, which uses an inherited attribute.
• Phew! Things can get pretty complicated.

Schema-wide Attributes (defaultAttributes)

http://www.w3.org/TR/xmlschema11-1/#declare-schema

Motivation

• Sometimes you want every complexType to have an attribute, such as an ID attribute or a class attribute.
• In XML Schema 1.0 you had to declare those attributes on every complexType.
• Now you can state on the <schema> element, "Hey, all complexTypes shall have the attributes that I've declared in this ____ attributeGroup."
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.books.org"
           xmlns="http://www.books.org"
           defaultAttributes="myDefaultAttributes"
           elementFormDefault="qualified">

  <xs:element name="BookStore">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Title" type="xs:string"/>
              <xs:element name="Author" type="xs:string"/>
              <xs:element name="Date" type="xs:string"/>
              <xs:element name="ISBN" type="xs:string"/>
              <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:attributeGroup name="myDefaultAttributes">
    <xs:attribute name="id" type="xs:ID" use="required" />
    <xs:attribute name="class" type="xs:NMTOKENS" />
  </xs:attributeGroup>

</xs:schema>

The attributes declared in the attributeGroup named "myDefaultAttributes" are to be applied to each complexType.

These elements have @id and @class

[Diagram: element tree. BookStore contains Book; Book contains Title, Author, Date, ISBN, and Publisher, each holding text.]

The non-leaf elements (BookStore and Book) get the attributes. The leaf elements are of type xs:string, a simple type, so the default attributes do not apply to them.

<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org" id="Barnes-and-Noble">
  <Book id="McCartney">
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
  <Book id="Bach">
    <Title>Illusions</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
  </Book>
  <Book id="Krishnamurti">
    <Title>The First and Last Freedom</Title>
    <Author>J.
Krishnamurti</Author> <Date>1954</Date> <ISBN>0-06-064831-7</ISBN> <Publisher>Harper &amp; Row</Publisher> </Book> </BookStore> BookStore and each Book is required to have an id attribute, and optionally a class attribute See the defaultAttributes folder, within it the bookStore folder, and BookStore.xsd 199 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Scope of defaultAttributes defaultAttributes: class This default attribute only applies to the complexTypes within this schema file Book.xsd "import" defaultAttributes: id This default attribute only applies to the complexTypes within this schema file BookStore.xsd 200 Copyright © [2012]. Roger L. Costello. All Rights Reserved. The defaultAttributes applies only to types defined in the schema document for which the defaultAttributes attribute is specified. Types defined in other schema documents are not affected, whether they are included or acquired via import or by some other means. Michael Sperberg-McQueen 201 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
Book.xsd (only @class applies here):

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.book.org"
           xmlns="http://www.book.org"
           defaultAttributes="bookDefaultAttributes"
           elementFormDefault="qualified">
    <xs:element name="Book">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="Author" type="xs:string"/>
                <xs:element name="Date" type="xs:string"/>
                <xs:element name="ISBN" type="xs:string"/>
                <xs:element name="Publisher" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:attributeGroup name="bookDefaultAttributes">
        <xs:attribute name="class" type="xs:NMTOKENS" use="optional" />
    </xs:attributeGroup>
</xs:schema>

BookStore.xsd (only @id applies here):

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.bookstore.org"
           xmlns="http://www.bookstore.org"
           xmlns:bk="http://www.book.org"
           defaultAttributes="bookstoreDefaultAttributes"
           elementFormDefault="qualified">
    <xs:import namespace="http://www.book.org" schemaLocation="Book.xsd" />
    <xs:element name="BookStore">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="bk:Book" maxOccurs="unbounded" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:attributeGroup name="bookstoreDefaultAttributes">
        <xs:attribute name="id" type="xs:ID" use="required" />
    </xs:attributeGroup>
</xs:schema>
@id can only be used on BookStore; @class can only be used on Book.

<?xml version="1.0"?>
<BookStore id="Borders"
           xmlns="http://www.bookstore.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.bookstore.org BookStore.xsd">
    <Book class="McCartney" xmlns="http://www.book.org">
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>1998</Date>
        <ISBN>1-56592-235-2</ISBN>
        <Publisher>McMillin Publishing</Publisher>
    </Book>
    <Book class="Bach" xmlns="http://www.book.org">
        <Title>Illusions</Title>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Publisher>Dell Publishing Co.</Publisher>
    </Book>
    <Book class="Krishnamurti" xmlns="http://www.book.org">
        <Title>The First and Last Freedom</Title>
        <Author>J. Krishnamurti</Author>
        <Date>1954</Date>
        <ISBN>0-06-064831-7</ISBN>
        <Publisher>Harper &amp; Row</Publisher>
    </Book>
</BookStore>

See the defaultAttributes folder, within it the bookStore folder, and BookStore_v2.xsd

Do Lab13

defaultAttributesApply
http://www.w3.org/TR/xmlschema11-1/#dcl.ctd.attuses
• Okay, you specify defaultAttributes and now every complexType in that schema document has the attributes.
• But suppose a complexType doesn't want those attributes? How do you "turn off" defaultAttributes?
• Answer: on the complexType add this attribute: defaultAttributesApply="false"
• That says, "The default attributes don't apply to this complexType."
<?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" defaultAttributes="myDefaultAttributes" elementFormDefault="qualified"> <xs:element name="BookStore"> <xs:complexType defaultAttributesApply="false"> <xs:sequence> <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> BookStore does not have the id or class attributes (but each Book does) <xs:attributeGroup name="myDefaultAttributes"> <xs:attribute name="id" type="xs:ID" use="required" /> <xs:attribute name="class" type="xs:NMTOKENS" /> </xs:attributeGroup> </xs:schema> 205 See the defaultAttributes folder, within it the bookStore folder, and BookStore_v3.xsd Copyright © [2012]. Roger L. Costello. All Rights Reserved. The <all> element http://www.w3.org/TR/xmlschema11-1/#all-mg 206 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Almost the same as before • Just as in XML Schema 1.0: – An <all> element must just contain elements. It cannot contain <sequence> or <choice> – The <all> element cannot be embedded within a <sequence> or <choice> – The <all> element is not repeatable, i.e. this is not allowed: <all maxOccurs="unbounded"> • There are three big changes in 1.1: 1. An <element> in <all> is not restricted to one occurrence 2. The <any> element can be used in <all> 3. The <all> can be used in a base type and then the base type can be extended with a type that uses <all> 207 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
You couldn't do this in 1.0 <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:all> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" maxOccurs="unbounded"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:all> </xs:complexType> </xs:element> XML Schema 1.0 did not allow this. 208 Copyright © [2012]. Roger L. Costello. All Rights Reserved. You can do this in 1.1 <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:all> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" maxOccurs="unbounded"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:all> </xs:complexType> </xs:element> "The content of Book is one Title, any number of Authors, one Date, one ISBN, one Publisher, and they can occur in any order." See BookStore_v1.xsd in the folder: all/book-store 209 Copyright © [2012]. Roger L. Costello. All Rights Reserved. You couldn't do this in 1.0 XML Schema 1.0 did not allow this. <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:all> <xs:any/> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" maxOccurs="unbounded"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:all> </xs:complexType> </xs:element> 210 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
You can do this in 1.1 <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:all> <xs:any/> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" maxOccurs="unbounded"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:all> </xs:complexType> </xs:element> "The content of Book is one Title, any number of Authors, one Date, one ISBN, one Publisher, one element from any namespace, and they can occur in any order." See BookStore_v2.xsd in the folder: all/book-store 211 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:all> <xs:any minOccurs="0"/> <xs:element ref="Author" maxOccurs="unbounded"/> <xs:element name="Title" type="xs:string"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:all> </xs:complexType> </xs:element> <xs:element name="Author" type="xs:string" /> <Book> <Title>My Life and Times</Title> <Author>Paul McCartney</Author> <Date>1998</Date> <ISBN>1-56592-235-2</ISBN> <Publisher>McMillin Publishing</Publisher> <Author>John Ghostwriter</Author> How does a validator validate this? Against <any/> or the Author declaration? </Book> See BookStore_v3.xsd in the folder: all/book-store 212 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
<xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:any minOccurs="0"/> <xs:element ref="Author" maxOccurs="unbounded"/> <xs:element name="Title" type="xs:string"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="Author" type="xs:string" /> <Book> <Author>Paul McCartney</Author> <Author>John Ghostwriter</Author> <Title>My Life and Times</Title> <Date>1998</Date> <ISBN>1-56592-235-2</ISBN> <Publisher>McMillin Publishing</Publisher> How does a validator validate this? Against <any/> or the Author declaration? </Book> See BookStore_v4.xsd in the folder: all/book-store 213 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Rule: If an element could match either an element declaration or a wildcard, then the element declaration is chosen. 214 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:all> <xs:any minOccurs="0"/> <xs:element ref="Author" maxOccurs="unbounded"/> <xs:element name="Title" type="xs:string"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:all> </xs:complexType> </xs:element> <xs:element name="Author" type="xs:string" /> <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:any minOccurs="0"/> <xs:element ref="Author" maxOccurs="unbounded"/> <xs:element name="Title" type="xs:string"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="Author" type="xs:string" /> Non-deterministic content models. 215 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
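To make the rule concrete, here is a sketch of how the children of a <Book> instance would be attributed under the <all> content model shown above (wildcard with minOccurs="0", Author by reference with maxOccurs="unbounded"). The namespace URIs and the <r:Binding> element are assumptions added for illustration, and the comments describe the behavior expected under the rule, not the output of any particular validator.

```xml
<Book xmlns="http://www.books.org"
      xmlns:r="http://www.bookrepository.org">
  <Title>My Life and Times</Title>
  <!-- Could match <xs:any/> or the Author declaration;
       the element declaration wins. -->
  <Author>Paul McCartney</Author>
  <Date>1998</Date>
  <ISBN>1-56592-235-2</ISBN>
  <Publisher>McMillin Publishing</Publisher>
  <!-- Same reasoning: validated against the Author declaration,
       which permits this because maxOccurs="unbounded". -->
  <Author>John Ghostwriter</Author>
  <!-- No element declaration in the content model claims this name,
       so it is the one element matched by the wildcard. Because
       <xs:any/> defaults to processContents="strict", a declaration
       for r:Binding must be available to the validator. -->
  <r:Binding>Hardcover</r:Binding>
</Book>
```

Because the wildcard is only consulted for names that no declaration claims, the second <Author> never "uses up" the single optional wildcard slot.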
Consider this: <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:any minOccurs="0"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Title" type="xs:string"/> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> This will fail because when it gets to the first <Author> (in an instance document) it commits to the declaration for Author. So, when it gets to the second <Author> it expects Title. Recommendation: when using <any> don't use <sequence>, always use <all>. See book.xsd in the folder: all/book 216 Copyright © [2012]. Roger L. Costello. All Rights Reserved. XSD 1.1 allows certain classes of non-deterministic content models prohibited by XSD 1.0! Do Lab14 217 Copyright © [2012]. Roger L. Costello. All Rights Reserved. You couldn't do this in 1.0 XML Schema 1.0 did not allow you to extend a non-empty base type that uses <all>. <xs:complexType name="appliance"> <xs:all> <xs:element name="description" type="xs:string"/> <xs:element ref="warranty" minOccurs="0"/> </xs:all> </xs:complexType> <xs:complexType name="juiceAppliance"> <xs:complexContent> <xs:extension base="appliance"> <xs:all> <xs:element name="name" type="xs:string"/> <xs:element name="image" type="imageType"/> <xs:element name="weight" minOccurs="0" type="xs:positiveInteger" /> <xs:element name="cost" type="xs:decimal" maxOccurs="unbounded" /> <xs:element name="retailer" type="xs:anyURI"/> </xs:all> </xs:extension> </xs:complexContent> </xs:complexType> 218 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
You can do this in 1.1 <xs:complexType name="appliance"> <xs:all> <xs:element name="description" type="xs:string"/> <xs:element ref="warranty" minOccurs="0"/> </xs:all> </xs:complexType> <xs:complexType name="juiceAppliance"> <xs:complexContent> <xs:extension base="appliance"> <xs:all> <xs:element name="name" type="xs:string"/> <xs:element name="image" type="imageType"/> <xs:element name="weight" minOccurs="0" type="xs:positiveInteger" /> <xs:element name="cost" type="xs:decimal" maxOccurs="unbounded" /> <xs:element name="retailer" type="xs:anyURI"/> </xs:all> </xs:extension> </xs:complexContent> </xs:complexType> "The content of juiceAppliance is description, warranty, name, image, weight, cost, retailer and they can occur in any order." See juicers.xsd in the folder: all/juicers 219 Copyright © [2012]. Roger L. Costello. All Rights Reserved. You can do this in 1.1 <all> description warranty </all> "extend" <all> name image weight cost retailer </all> Note: you can extend an <all> with an <all>, but not with a <sequence> or a <choice>. 220 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Wildcard Schema Components (i.e. <any> and <anyAttribute>) http://www.w3.org/TR/xmlschema11-1/#Wildcards 221 Copyright © [2012]. Roger L. Costello. All Rights Reserved. XML Schema 1.0 <any namespace="_____________" /> This enables you to specify what elements you want, e.g. I want any element from the targetNamespace (##targetNamespace). But in XSD 1.0 there is no way to specify what you don't want: "I want any element, provided it doesn't come from the targetNamespace." 222 Copyright © [2012]. Roger L. Costello. All Rights Reserved. New in XML Schema 1.1 <any notNamespace="_____________" /> This enables you to specify what elements you don't want, e.g. I don't want any element from the targetNamespace (##targetNamespace). Or, I don't want any element from this namespace: http://www.example.org. Or, I don't want any element from no namespace. 223 Copyright © [2012]. 
Roger L. Costello. All Rights Reserved.

Specifying what extension elements you don't want

<any notNamespace="##targetNamespace"/>
allows the instance document to contain a new element, provided the element does not come from the targetNamespace.

<any notNamespace="http://www.example.org"/>
allows the instance document to contain a new element, provided the element does not come from this namespace: http://www.example.org

<any notNamespace="##local"/>
allows the instance document to contain a new element, provided the element is in some namespace (i.e. it is not a no-namespace element).

Note: the value of notNamespace can be a whitespace-separated list of ##targetNamespace, anyURIs, and ##local

See BookStore_v1.xsd in the folder: wildcards/book-store

XML Schema 1.0

<anyAttribute namespace="_____________" />

This enables you to specify what attributes you want, e.g. I want any attribute from the targetNamespace (##targetNamespace). But in XSD 1.0 there is no way to specify what you don't want: "I want any attribute, provided it doesn't come from the targetNamespace."

New in XML Schema 1.1

<anyAttribute notNamespace="_____________" />

This enables you to specify what attributes you don't want, e.g. I don't want any attribute from the targetNamespace (##targetNamespace). Or, I don't want any attribute from this namespace: http://www.example.org. Or, I don't want any attribute that is in no namespace.

Specifying what extension attributes you don't want

<anyAttribute notNamespace="##targetNamespace"/>
allows the instance document to contain new attributes, provided the attributes do not come from the targetNamespace.

<anyAttribute notNamespace="http://www.example.org"/>
allows the instance document to contain new attributes, provided the attributes do not come from this namespace: http://www.example.org

<anyAttribute notNamespace="##local"/>
allows the instance document to contain new attributes, provided the attributes are in some namespace (i.e. they are not no-namespace attributes).

Note: the value of notNamespace can be a whitespace-separated list of ##targetNamespace, anyURIs, and ##local

New in XML Schema 1.1

<any notQName="_____________" />

This enables you to name specific elements you don't want, e.g. I don't want the {http://www.example.org}numPages element.

Specifying what extension elements you don't want

<any notQName="xsl:value-of"/>
allows the instance document to contain a new element, provided the element is not the <value-of> element from the xsl namespace.

<any notQName="##defined"/>
allows the instance document to contain a new element, provided its name is not the same as that of a global element declaration in the schema.

<any notQName="##definedSibling"/>
allows the instance document to contain a new element, provided its name is not the same as that of an element declared alongside the wildcard in the same content model (a sibling declaration).

Note: the value of notQName can be a whitespace-separated list of QNames, ##defined, and ##definedSibling

See BookStore_v2.xsd in the folder: wildcards/book-store

Open Content
http://www.w3.org/TR/xmlschema11-1/#Complex_Type_Definition_details

Definition of Open Content

<Book>
    <ISBN>0-06-064831-7</ISBN>
    <Title>The First and Last Freedom</Title>
    <Author>J. Krishnamurti</Author>
    <Date>1954</Date>
    <Publisher>Harper &amp; Row</Publisher>
</Book>

Design a schema so that any element, from any namespace, can occur at any position among the children of <Book>: before the first child, between any two children, or after the last child. Doing so makes the content of <Book> open. That is, <Book> has open content.
231 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Here's How To Do It <xs:element name="Book"> <xs:complexType> <xs:openContent mode="interleave"> <xs:any /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> Read as: "Any elements, from any namespace, can be interleaved with the elements declared in the subsequent <sequence> content model." Dictionary definition of interleave: to insert pages between the pages of a book. 232 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Example Instance Document That Validates Against the Schema <?xml version="1.0"?> <BookStore xmlns="http://www.books.org" xmlns:r="http://www.bookrepository.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.books.org BookStore.xsd http://www.bookrepository.org BookRepository.xsd"> <Book> <r:Binding>Hardcover</r:Binding> <Title>My Life and Times</Title> <r:Size>5 x 7</r:Size> <Author>Paul McCartney</Author> <r:InStock>true</r:InStock> <Date>1998</Date> <r:Category>Non-fiction</r:Category> <ISBN>1-56592-235-2</ISBN> <r:NumPages>299</r:NumPages> <Publisher>McMillin Publishing</Publisher> <r:AvailableOnTape>false</r:AvailableOnTape> </Book> <Book> <Publisher>Dell Publishing Co.</Publisher> <Author>Richard Bach</Author> <Date>1977</Date> <ISBN>0-440-34319-4</ISBN> <Title>Illusions The Adventures of a Reluctant Messiah</Title> </Book> <Book> <ISBN>0-06-064831-7</ISBN> <Title>The First and Last Freedom</Title> <Author>J. 
Krishnamurti</Author> <Date>1954</Date> <Publisher>Harper &amp; Row</Publisher> </Book> </BookStore> These new (extension) elements have been inserted (interleaved) with the elements declared in <sequence> See BookStore_v1.xsd in the folder: open-content/book-store 233 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Equivalent <xs:element name="Book"> <xs:complexType> <xs:openContent mode="interleave"> <xs:any /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="Book" type="BookType" /> Book has a type which is open. Thus, the content of Book is open. The content of BookType is open. Book is of type BookType. Thus, the content of Book is open. <xs:complexType name="BookType"> <xs:openContent mode="interleave"> <xs:any /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> 234 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Any elements, any where, any number If you declare <Book>'s content to have open content, then in the instance document you can insert extension elements around any child element within <Book>. And, you can insert any number of elements. 235 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
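As a small illustration of "any number", the fragment below places several extension elements in a row at a single position. It is a sketch that assumes the interleave-mode Book declaration shown above and reuses the r: elements from the earlier instance document; it is a fragment of a <BookStore>, not a complete document.

```xml
<Book xmlns="http://www.books.org"
      xmlns:r="http://www.bookrepository.org">
  <!-- Three extension elements in a row before the first declared child -->
  <r:Binding>Hardcover</r:Binding>
  <r:Size>5 x 7</r:Size>
  <r:InStock>true</r:InStock>
  <Title>My Life and Times</Title>
  <Author>Paul McCartney</Author>
  <Date>1998</Date>
  <ISBN>1-56592-235-2</ISBN>
  <Publisher>McMillin Publishing</Publisher>
  <!-- Two more after the last declared child -->
  <r:NumPages>299</r:NumPages>
  <r:AvailableOnTape>false</r:AvailableOnTape>
</Book>
```

The declared children (Title, Author, Date, ISBN, Publisher) must still appear in the order the <sequence> requires; only the extension elements are free to appear anywhere among them.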
minOccurs, maxOccurs are ignored <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:openContent mode="interleave"> <xs:any minOccurs="0" maxOccurs="unbounded" /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> minOccurs and maxOccurs are ignored. {0, unbounded} is the fixed setting. In other words, you can insert any number of elements before and after Title, Author, Date, ISBN, and Publisher. 236 Copyright © [2012]. Roger L. Costello. All Rights Reserved. It's okay to use @namespace, @notNamespace, and @notQName on the <any> element <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:openContent mode="interleave"> <xs:any namespace="##other" /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> This specifies that the extension elements must come from a namespace other than the schema's targetNamespace. 237 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Validate the Inserted Elements? 
You can use the processContents attribute with the <any> element to control whether or not the inserted elements must be validated: <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:openContent mode="interleave"> <xs:any processContents="lax" /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> "Hey schema processor, validate the inserted elements if you can. If you can't, then just skip them." 238 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <Book> <r:Binding>Hardcover</r:Binding> <Title>My Life and Times</Title> <r:Size>5 x 7</r:Size> <Author>Paul McCartney</Author> <r:InStock>true</r:InStock> <Date>1998</Date> <r:Category>Non-fiction</r:Category> <ISBN>1-56592-235-2</ISBN> <r:NumPages>299</r:NumPages> <Publisher>McMillin Publishing</Publisher> <r:AvailableOnTape>false</r:AvailableOnTape> This will validate, even if there's no schema to validate the inserted elements. </Book> 239 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:openContent mode="interleave"> <xs:any processContents="strict" /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> "Hey schema validator, you must validate the inserted elements. If you can't validate them, then throw an error." 
<Book>
    <r:Binding>Hardcover</r:Binding>
    <Title>My Life and Times</Title>
    <r:Size>5 x 7</r:Size>
    <Author>Paul McCartney</Author>
    <r:InStock>true</r:InStock>
    <Date>1998</Date>
    <r:Category>Non-fiction</r:Category>
    <ISBN>1-56592-235-2</ISBN>
    <r:NumPages>299</r:NumPages>
    <Publisher>McMillin Publishing</Publisher>
    <r:AvailableOnTape>false</r:AvailableOnTape>
</Book>

This will not validate if there's no schema to validate the inserted elements.

Do Lab15

<openContent> applies only to the children, not the grandchildren

Consider this <Book> element, declared to have open content:

<element name="Book">
    <complexType>
        <openContent mode="interleave">
            <any />
        </openContent>
        <sequence>
            <element name="Title" type="string"/>
            <element name="Author">
                <complexType>
                    <sequence>
                        <element name="FirstName" type="string"/>
                        <element name="LastName" type="string"/>
                    </sequence>
                </complexType>
            </element>
            <element name="Date" type="string"/>
            <element name="ISBN" type="string"/>
            <element name="Publisher" type="string"/>
        </sequence>
    </complexType>
</element>

Notice that the <Author> element has child elements, which are grandchildren of <Book>. Extension elements can only be inserted before and after the *children* of Book. Extension elements cannot be inserted around the *grandchildren* of Book. Thus, extension elements cannot be inserted before or after <FirstName> and <LastName>.

The mode Attribute

<openContent mode="interleave">
    <any />
</openContent>

• There are three possible values for mode:
– interleave
– suffix
– none
• mode="interleave" means extension elements can be inserted anywhere among the children. This is the default, i.e. if you omit the mode attribute then it defaults to mode="interleave"
• mode="suffix" means extension elements can only be inserted at the bottom (e.g. after the <Publisher> element)
• mode="none" means that you can't insert new elements, i.e.
it's not open content. Q: Why bother with <openContent> and then specify mode="none"? A: Suppose your complexType derives from someone else's complexType which has been defined to have open content. Suppose you don't want open content; so, you can use mode="none" to turn off the open content. 242 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:openContent mode="suffix"> <xs:any /> </xs:openContent> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> Any elements (from any namespace) can be inserted at the bottom of <Book> (after the <Publisher> element). 243 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <Book> <ISBN>0-06-064831-7</ISBN> <Title>The First and Last Freedom</Title> <Author>J. Krishnamurti</Author> <Date>1954</Date> <Publisher>Harper &amp; Row</Publisher> </Book> Can only put extension elements here. 244 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <Book> <Title>My Life and Times</Title> <Author>Paul McCartney</Author> <Date>1998</Date> <ISBN>1-56592-235-2</ISBN> <Publisher>McMillin Publishing</Publisher> <r:Binding>Hardcover</r:Binding> <r:Size>5 x 7</r:Size> <r:InStock>true</r:InStock> <r:Category>Non-fiction</r:Category> <r:NumPages>299</r:NumPages> <r:AvailableOnTape>false</r:AvailableOnTape> </Book> The extension elements can only be inserted at the bottom of <Book>'s content model. Do Lab16 See BookStore_v2.xsd in the folder: open-content/book-store 245 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
Schema-wide Open Content • You can specify at the top of the schema: "The entire schema is open" • This is accomplished using the <xs:defaultOpenContent> element <xs:defaultOpenContent mode="interleave"> <xs:any /> </xs:defaultOpenContent> Read as: "New elements, from any namespace, can be inserted before and after every element in the entire document." 246 Copyright © [2012]. Roger L. Costello. All Rights Reserved. <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> <xs:defaultOpenContent mode="interleave"> <xs:any /> </xs:defaultOpenContent> <xs:element name="BookStore"> <xs:complexType> <xs:sequence> <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> This must be at the top of the schema, following <xs:include>, <xs:import>, <xs:redefine>, and <xs:override> </xs:schema> See BookStore_v3.xsd in the folder: open-content/book-store 247 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
Example XML instance document that conforms to the schema on the previous slide <?xml version="1.0"?> <BookStore xmlns="http://www.books.org" xmlns:r="http://www.bookrepository.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.books.org BookStore.xsd http://www.bookrepository.org BookRepository.xsd"> <r:StoreName>Books R Us</r:StoreName> <Book> <Title>My Life and Times</Title> <Author>Paul McCartney</Author> <Date>1998</Date> <ISBN>1-56592-235-2</ISBN> <Publisher>McMillin Publishing</Publisher> <r:Binding>Hardcover</r:Binding> </Book> <Book> <r:Size>5 x 7</r:Size> <Title>Illusions The Adventures of a Reluctant Messiah</Title> <Author>Richard Bach</Author> <Date>1977</Date> <ISBN>0-440-34319-4</ISBN> <Publisher>Dell Publishing Co.</Publisher> </Book> <Book> <Title>The First and Last Freedom</Title> <Author>J. Krishnamurti</Author> <r:NumPages>299</r:NumPages> <Date>1954</Date> <ISBN>0-06-064831-7</ISBN> <Publisher>Harper &amp; Row</Publisher> </Book> </BookStore> Observe the extension elements that have been inserted between the <Book> elements and within them. 248 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
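A single complexType can opt out of a schema-wide default by carrying its own <xs:openContent mode="none"/>. The sketch below illustrates this under stated assumptions: the PersonName type and its FirstName/LastName children are hypothetical names introduced only for the example, and processContents="lax" is chosen so that extension elements need no declarations of their own.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.books.org"
           xmlns="http://www.books.org"
           elementFormDefault="qualified">

  <!-- Schema-wide default: every complexType in this schema
       document is open unless it says otherwise. -->
  <xs:defaultOpenContent mode="interleave">
    <xs:any processContents="lax"/>
  </xs:defaultOpenContent>

  <!-- Open: this anonymous type picks up the schema-wide default,
       so extension elements may be interleaved with Title and Author. -->
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Author" type="PersonName"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Closed: mode="none" switches the default off for this type
       (hypothetical type, for illustration only). No <xs:any> child
       is needed when mode="none". -->
  <xs:complexType name="PersonName">
    <xs:openContent mode="none"/>
    <xs:sequence>
      <xs:element name="FirstName" type="xs:string"/>
      <xs:element name="LastName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

With this schema, an extension element placed between <Title> and <Author> should be accepted, while one placed between <FirstName> and <LastName> should be rejected.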
Consider this schema, which uses <xs:defaultOpenContent> to make the entire schema open: <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> <xs:defaultOpenContent mode="interleave"> <xs:any /> </xs:defaultOpenContent> <xs:element name="BookStore"> <xs:complexType> <xs:sequence> <xs:element name="Book" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="Title" type="xs:string"/> <xs:element name="Author" type="xs:string" /> <xs:element name="Date" type="xs:string"/> <xs:element name="ISBN" type="xs:string"/> <xs:element name="Publisher" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> Q: Can extension elements be added before and after the root element (BookStore)? 249 Copyright © [2012]. Roger L. Costello. All Rights Reserved. Q: Will this instance document validate against the schema on the previous slide? Note that the schema's root element has been wrapped within an extension element. <?xml version="1.0"?> <r:MyFavoriteBookStore xmlns:r="http://www.bookrepository.org"> <BookStore xmlns="http://www.books.org"> <Book> <Title>My Life and Times</Title> <Author>Paul McCartney</Author> <Date>1998</Date> <ISBN>1-56592-235-2</ISBN> <Publisher>McMillin Publishing</Publisher> <r:Binding>Hardcover</r:Binding> </Book> <Book> <r:Size>5 x 7</r:Size> <Title>Illusions The Adventures of a Reluctant Messiah</Title> <Author>Richard Bach</Author> <Date>1977</Date> <ISBN>0-440-34319-4</ISBN> <Publisher>Dell Publishing Co.</Publisher> </Book> <Book> <Title>The First and Last Freedom</Title> <Author>J. Krishnamurti</Author> <r:NumPages>299</r:NumPages> <Date>1954</Date> <ISBN>0-06-064831-7</ISBN> <Publisher>Harper &amp; Row</Publisher> </Book> </BookStore> </r:MyFavoriteBookStore> 250 Copyright © [2012]. Roger L. Costello. All Rights Reserved. 
A: No, extension elements cannot be inserted before or after the root element. The reason is that openness applies to types (i.e. complexTypes have open content); thus, only the content of the root element (BookStore) and of its descendant elements is open.

Do Lab17

mode="none"

<xs:defaultOpenContent mode="______">      (fill in: interleave or suffix)
  <xs:any />
</xs:defaultOpenContent>

When specifying a schema-wide setting for open content, you cannot use mode="none". That value can only be used when defining complex types.

appliesToEmpty
• The defaultOpenContent element has an optional attribute, appliesToEmpty.
• Its value is either true or false.
• Its default is false.
• It controls whether or not extension elements can be inserted into empty elements:
    If appliesToEmpty="true" then extension elements can be inserted into empty elements
    Else extension elements cannot be inserted into empty elements
Consider this element with empty content (but it has an attribute):

<element name="image">
  <complexType>
    <attribute name="href" type="anyURI" />
  </complexType>
</element>

In an instance document we can have this:

<image href="http://wwww.maps.org/boston.gif" />

Now, suppose the image element is declared in a schema that specifies xs:defaultOpenContent with appliesToEmpty="true":

<xs:defaultOpenContent mode="interleave" appliesToEmpty="true">
  <xs:any />
</xs:defaultOpenContent>

Then in the instance document I can insert extension elements *within* the <image> element:

<image href="http://wwww.maps.org/boston.gif">
  <ex:comment>My home town</ex:comment>
</image>

Conversely, if appliesToEmpty="false":

<xs:defaultOpenContent mode="interleave" appliesToEmpty="false">
  <xs:any />
</xs:defaultOpenContent>

Then in the instance document I *cannot* insert extension elements within the <image> element.

Recall from XSD 1.0 that a complexType may extend or restrict another complexType. Suppose the parent (base) type is open. How does that influence the openness of subtypes? The following slides discuss this. Assume there is no schema-wide xs:defaultOpenContent.

Q: Does a subtype inherit its base type's openness?

PublicationType (this base type has open content): <openContent>, Title, Author, Date
    | extend
BookType: ISBN, Publisher
Does this subtype inherit the base type's openness?

Q: Does a subtype inherit its base type's openness?

Open base type:

<complexType name="Publication" abstract="true">
  <openContent mode="interleave">
    <any />
  </openContent>
  <sequence>
    <element name="Title" type="string" />
    <element name="Author" type="string" />
    <element name="Date" type="gYear"/>
  </sequence>
</complexType>

Can extension elements be inserted before and after ISBN and Publisher?
<complexType name="BookPublication">
  <complexContent>
    <extension base="pub:Publication">
      <sequence>
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

See BookStore_v4.xsd in the folder: open-content/book-store

A: Yes! A subtype inherits the openness of its base type

<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
           xmlns:r="http://www.bookrepository.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd
                               http://www.bookrepository.org BookRepository.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <r:Binding>Hardcover</r:Binding>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
  <Book>
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
    <r:Size>5 x 7</r:Size>
  </Book>
  <Book>
    <r:NumPages>299</r:NumPages>
    <Title>The First and Last Freedom</Title>
    <Author>J. Krishnamurti</Author>
    <Date>1954</Date>
    <ISBN>0-06-064831-7</ISBN>
    <Publisher>Harper &amp; Row</Publisher>
  </Book>
</BookStore>

Extension elements can be inserted around the base type elements and the subtype elements.

Do Lab18

Q: Can a subtype turn off the openness of a parent type?

PublicationType (this base type has open content): <openContent mode="interleave">, Title, Author, Date
    | extend
BookType (this subtype specifies that it's not open): <openContent mode="none">, ISBN, Publisher

Q: Can a subtype turn off the openness of a parent type?
Open base type:

<complexType name="Publication" abstract="true">
  <openContent mode="interleave">
    <any />
  </openContent>
  <sequence>
    <element name="Title" type="string" />
    <element name="Author" type="string" />
    <element name="Date" type="gYear"/>
  </sequence>
</complexType>

Closed subtype:

<complexType name="BookPublication">
  <complexContent>
    <extension base="pub:Publication">
      <openContent mode="none" />
      <sequence>
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

Notice that the <openContent> element can be empty.

Invalid Schema!
• If the base type allows open content, then a type derived from it by extension must also allow open content.
• However, a subtype can turn off openness if it derives by restriction.

See BookStore_v5.xsd in the folder: open-content/book-store

Here's how to turn off the openness of a parent type

PublicationType (this base type has open content): <openContent mode="interleave">, Title, Author, Date
    | restriction
BookType (this subtype specifies that it's not open): <openContent mode="none">, Title, Author, Date

See BookStore_v6.xsd in the folder: open-content/book-store
Here's how to turn off the openness of a parent type

<complexType name="Publication" abstract="true">
  <openContent mode="interleave">
    <any />
  </openContent>
  <sequence>
    <element name="Title" type="string" />
    <element name="Author" type="string" />
    <element name="Date" type="gYear"/>
  </sequence>
</complexType>

<complexType name="BookPublication">
  <complexContent>
    <restriction base="pub:Publication">
      <openContent mode="none" />
      <sequence>
        <element name="Title" type="string" />
        <element name="Author" type="string" />
        <element name="Date" type="gYear"/>
      </sequence>
    </restriction>
  </complexContent>
</complexType>

Notice that the <openContent> element can be empty.

Q: Does a base type inherit the openness of a subtype?

PublicationType: Title, Author, Date
Does this base type inherit the subtype's openness?
    | extend
BookType (this subtype has open content): <openContent>, ISBN, Publisher

Q: Does a base type inherit the openness of a subtype?

Can extension elements be inserted before and after Title, Author, and Date?

<complexType name="Publication" abstract="true">
  <sequence>
    <element name="Title" type="string" />
    <element name="Author" type="string" />
    <element name="Date" type="gYear"/>
  </sequence>
</complexType>

Open subtype:

<complexType name="BookPublication">
  <complexContent>
    <extension base="pub:Publication">
      <openContent mode="interleave">
        <any />
      </openContent>
      <sequence>
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

See BookStore_v7.xsd in the folder: open-content/book-store
A: The subtype's openness applies to the base type elements

<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
           xmlns:r="http://www.bookrepository.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd
                               http://www.bookrepository.org BookRepository.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <r:Binding>Hardcover</r:Binding>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
  <Book>
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
    <r:Size>5 x 7</r:Size>
  </Book>
  <Book>
    <r:NumPages>299</r:NumPages>
    <Title>The First and Last Freedom</Title>
    <Author>J. Krishnamurti</Author>
    <Date>1954</Date>
    <ISBN>0-06-064831-7</ISBN>
    <Publisher>Harper &amp; Row</Publisher>
  </Book>
</BookStore>

Extension elements can be inserted around the base type elements as well as the subtype elements.

Q: Does this subtype have interleave or suffix openness?

PublicationType (this base type specifies interleave openness): <openContent mode="interleave">, Title, Author, Date
    | extend
BookType (this subtype specifies suffix openness): <openContent mode="suffix">, ISBN, Publisher

Invalid Schema!
• mode="interleave" has more openness than mode="suffix".
• The subtype is trying to reduce the openness of the base type. This is not legal when deriving by extension.
• However, a subtype can reduce the openness of the base type if it derives by restriction.
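The last bullet above (reducing openness through derive-by-restriction) can be sketched in schema form. This is a minimal illustration, not the contents of any of the tutorial's lab files; the target namespace and the "pub" prefix are assumptions chosen to match the pub:Publication references used on the earlier slides.

```xml
<?xml version="1.0"?>
<!-- Sketch only: targetNamespace and the "pub" prefix are assumed for illustration -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.books.org"
           xmlns:pub="http://www.books.org"
           elementFormDefault="qualified">

  <!-- Base type: extension elements may appear anywhere (interleave) -->
  <xs:complexType name="Publication" abstract="true">
    <xs:openContent mode="interleave">
      <xs:any/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string"/>
      <xs:element name="Date" type="xs:gYear"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Subtype derived by restriction: openness reduced from interleave to suffix,
       so extension elements are only expected after the Date element -->
  <xs:complexType name="BookPublication">
    <xs:complexContent>
      <xs:restriction base="pub:Publication">
        <xs:openContent mode="suffix">
          <xs:any/>
        </xs:openContent>
        <xs:sequence>
          <xs:element name="Title" type="xs:string"/>
          <xs:element name="Author" type="xs:string"/>
          <xs:element name="Date" type="xs:gYear"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Swapping <xs:restriction> for <xs:extension> here would give the "Invalid Schema!" case described above, since the subtype would then be asking for less openness than its base while deriving by extension.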
Here's how to reduce the openness of a parent type

PublicationType (this base type has open content): <openContent mode="interleave">, Title, Author, Date
    | restriction
BookType: <openContent mode="suffix">, Title, Author, Date
This subtype specifies that extension elements can only be inserted at the bottom (after the <Date> element).

See BookStore_v8.xsd in the folder: open-content/book-store

Q: Does this subtype have interleave or suffix openness?

PublicationType (this base type specifies suffix openness): <openContent mode="suffix">, Title, Author, Date
    | extend
BookType (this subtype specifies interleave openness): <openContent mode="interleave">, ISBN, Publisher

See BookStore_v9.xsd in the folder: open-content/book-store

A: The subtype has interleave openness

<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
           xmlns:r="http://www.bookrepository.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd
                               http://www.bookrepository.org BookRepository.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <r:Binding>Hardcover</r:Binding>
    <ISBN>1-56592-235-2</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
  <Book>
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
    <r:Size>5 x 7</r:Size>
  </Book>
  <Book>
    <r:NumPages>299</r:NumPages>
    <Title>The First and Last Freedom</Title>
    <Author>J. Krishnamurti</Author>
    <Date>1954</Date>
    <ISBN>0-06-064831-7</ISBN>
    <Publisher>Harper &amp; Row</Publisher>
  </Book>
</BookStore>

Extension elements can be inserted around the base type elements and the subtype elements.
Order of openness

interleave is greater than suffix is greater than none

Two Rules to Remember
• Derive-by-extension: a subtype cannot specify openness that is less than its parent's openness.
• Derive-by-restriction: a subtype cannot specify openness that is greater than its parent's openness.

Derive by extension

A subtype can extend a base type if its openness is greater than or equal to the openness of the base type. The resulting openness is the subtype's openness.

Openness of the Parent Type      Openness of the Subtype      Result
(i.e. mode="____")
none                             none                         OK
none                             suffix                       OK
none                             interleave                   OK
suffix                           none                         not allowed
suffix                           suffix                       OK
suffix                           interleave                   OK
interleave                       none                         not allowed
interleave                       suffix                       not allowed
interleave                       interleave                   OK

Derive by restriction

A subtype can restrict a base type if its openness is less than or equal to the openness of the base type. The resulting openness is the subtype's openness.

Openness of the Parent Type      Openness of the Subtype      Result
(i.e. mode="____")
none                             none                         OK
none                             suffix                       not allowed
none                             interleave                   not allowed
suffix                           none                         OK
suffix                           suffix                       OK
suffix                           interleave                   not allowed
interleave                       none                         OK
interleave                       suffix                       OK
interleave                       interleave                   OK

Now let's consider the impact of xs:defaultOpenContent on the openness of a subtype.
Two Rules to Remember
• The open content specified on a type wins over the open content specified by xs:defaultOpenContent.
• The open content specified by xs:defaultOpenContent wins over the open content inherited by a subtype.

Two ways to create interleaved, any-order content

Below are two ways to declare a <Book> element. Both versions use <all> to permit the elements within <Book> to occur in any order. The first version uses an unbounded <any>. The second version uses interleaved open content. Are these two versions identical? Yes

VERSION #1

<xs:element name="Book">
  <xs:complexType>
    <xs:all>
      <xs:any minOccurs="0" maxOccurs="unbounded" />
      <xs:element name="Author" type="xs:string" />
      <xs:element name="Title" type="xs:string" />
      <xs:element name="Date" type="xs:string" />
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Publisher" type="xs:string" />
    </xs:all>
  </xs:complexType>
</xs:element>

VERSION #2

<xs:element name="Book">
  <xs:complexType>
    <xs:openContent mode="interleave">
      <xs:any />
    </xs:openContent>
    <xs:all>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string" />
      <xs:element name="Date" type="xs:string"/>
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Publisher" type="xs:string"/>
    </xs:all>
  </xs:complexType>
</xs:element>
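To make the first of the two xs:defaultOpenContent rules concrete (open content specified on a type wins over the schema-wide default), here is a small sketch. The type names MagazineType, BookType and ReceiptType and the namespace are hypothetical, invented only for this illustration.

```xml
<?xml version="1.0"?>
<!-- Sketch only: all type names and the namespace are made up for illustration -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.org/shop"
           xmlns="http://www.example.org/shop"
           elementFormDefault="qualified">

  <!-- Schema-wide default: extension elements allowed at the end only -->
  <xs:defaultOpenContent mode="suffix">
    <xs:any/>
  </xs:defaultOpenContent>

  <!-- No openContent of its own, so the schema-wide default (suffix) applies -->
  <xs:complexType name="MagazineType">
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Date" type="xs:gYear"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Specifies its own openContent, which takes precedence: interleave -->
  <xs:complexType name="BookType">
    <xs:openContent mode="interleave">
      <xs:any/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- mode="none" on the type switches the default off for this type only -->
  <xs:complexType name="ReceiptType">
    <xs:openContent mode="none"/>
    <xs:sequence>
      <xs:element name="Total" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

The third type shows the one place where mode="none" is useful: it cannot appear on xs:defaultOpenContent, but a single complex type can use it to opt out of the schema-wide setting.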
Unconstrained Openness

<xs:element name="Book">
  <xs:complexType>
    <xs:openContent mode="interleave">
      <xs:any namespace="##any" processContents="skip"/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string" />
      <xs:element name="Date" type="xs:string"/>
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<Book>
  <r:Binding>Hardcover</r:Binding>
  <Title>My Life and Times</Title>
  <r:Size>5 x 7</r:Size>
  <Author>Paul McCartney</Author>
  <r:InStock>true</r:InStock>
  <Date>1998</Date>
  <r:Category>Non-fiction</r:Category>
  <ISBN>1-56592-235-2</ISBN>
  <r:NumPages>299</r:NumPages>
  <Publisher>McMillin Publishing</Publisher>
  <r:AvailableOnTape>false</r:AvailableOnTape>
</Book>

The extension elements will not be validated because the schema specified processContents="skip". This means the extension elements can come from any namespace and they will not be validated.
Constrained Openness

<xs:element name="Book">
  <xs:complexType>
    <xs:openContent mode="interleave">
      <xs:any namespace="http://www.repository.org" processContents="strict"/>
    </xs:openContent>
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string" />
      <xs:element name="Date" type="xs:string"/>
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<Book>
  <r:Binding>Hardcover</r:Binding>
  <Title>My Life and Times</Title>
  <r:Size>5 x 7</r:Size>
  <Author>Paul McCartney</Author>
  <r:InStock>true</r:InStock>
  <Date>1998</Date>
  <r:Category>Non-fiction</r:Category>
  <ISBN>1-56592-235-2</ISBN>
  <r:NumPages>299</r:NumPages>
  <Publisher>McMillin Publishing</Publisher>
  <r:AvailableOnTape>false</r:AvailableOnTape>
</Book>

The extension elements will be validated because the schema specified processContents="strict". This means the extension elements must be validated and they can only come from the repository.org namespace.
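With processContents="strict", the validator needs declarations for the extension elements. The tutorial does not show the schema for the http://www.repository.org namespace, so the following is only a guess at what it might contain: global element declarations for the r: elements used in the instance above, with datatypes that are assumptions picked to fit the sample values.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema for the extension namespace; the element types are assumed -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.repository.org"
           elementFormDefault="qualified">

  <!-- Each extension element is declared globally so that a strict wildcard can find it -->
  <xs:element name="Binding" type="xs:string"/>
  <xs:element name="Size" type="xs:string"/>
  <xs:element name="InStock" type="xs:boolean"/>
  <xs:element name="Category" type="xs:string"/>
  <xs:element name="NumPages" type="xs:positiveInteger"/>
  <xs:element name="AvailableOnTape" type="xs:boolean"/>

</xs:schema>
```

Under these assumed declarations, an instance containing <r:NumPages>many</r:NumPages> would be rejected, whereas with processContents="skip" the same content would pass unchecked.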
The portion of the DTD for XML Schema 1.1 that pertains to open content

<!ELEMENT schema ((include | import | redefine | override | annotation)*,
                  (defaultOpenContent, (annotation)*)?,
                  ((simpleType | complexType | element | attribute | attributeGroup | group | notation), (annotation)*)*)>

<!ELEMENT defaultOpenContent ((annotation)?, any)>
<!ATTLIST defaultOpenContent
          appliesToEmpty (true | false) 'false'
          mode (interleave | suffix) 'interleave'
          id ID #IMPLIED>

<!ELEMENT complexType ((annotation)?,
                       (simpleContent | complexContent | openContent?, (all | choice | sequence | group)?),
                       ((attribute | attributeGroup)*, (anyAttribute)?),
                       assert*)>

<!ELEMENT extension ((annotation)?,
                     (openContent?, (all | choice | sequence | group)?),
                     ((attribute | attributeGroup)*, (anyAttribute)?),
                     assert*)>

<!ELEMENT restriction ((annotation)?,
                       (openContent?, (all | choice | sequence | group)?),
                       ((attribute | attributeGroup)*, (anyAttribute)?),
                       assert*)>

<!ELEMENT openContent ((annotation)?, (any)?)>
<!ATTLIST openContent
          mode (none | interleave | suffix) 'interleave'
          id ID #IMPLIED>

ID
http://www.w3.org/TR/2009/CR-xmlschema11-1-20090430/structures.html#cvc-id

XML Schema 1.0

In XML Schema 1.0 an element can have only one ID attribute. Thus, this is illegal:

<element name="Widget">
  <complexType>
    <sequence />
    <attribute name="SKU" type="ID" />
    <attribute name="Model-Num" type="ID" />
  </complexType>
</element>
Multiple ID Attributes

In XML Schema 1.1 an element can have multiple attributes of type ID, e.g.,

<Stereo model-number="M459302432" serial-number="S4390200"> … </Stereo>

<element name="Stereo">
  <complexType>
    <sequence> … </sequence>
    <attribute name="model-number" type="ID" use="required" />
    <attribute name="serial-number" type="ID" use="required" />
  </complexType>
</element>

See stereo.xsd in the folder: ID/stereo

XML Schema 1.0

For every primitive datatype you can "fix" the value of the element/attribute, e.g.,

<element name="Greeting" type="string" fixed="Hello World" />

The string element Greeting has a fixed (constant) value, "Hello World".

But there is one datatype that you can't fix: the ID datatype. Thus, this is illegal:

<attribute name="Food" type="ID" fixed="Popcorn" />

ID Attribute with Fixed, Default Value

XML Schema 1.1 allows an element or attribute of type ID to have a fixed value or a default value, e.g.,

<Document version="v1.0"> … </Document>

<element name="Document">
  <complexType>
    <sequence> … </sequence>
    <attribute name="version" type="ID" fixed="v1.0" />
  </complexType>
</element>

targetNamespace
http://www.w3.org/TR/xmlschema11-1/#sec-src-element

Problem

Suppose that your schema wants to restrict a complexType in another schema. Suppose that other schema has a different targetNamespace.

Book.xsd (targetNamespace="http://www.book.org")
  BookType: Title, Author (unbounded), Date, ISBN, Publisher
    | "restrict"
Bookstore.xsd (targetNamespace="http://www.bookstore.org")
  BookTypeMyNamespace: Title, Author (2), Date, ISBN, Publisher
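The Book.xsd side of the diagram above might look roughly like the sketch below. The tutorial only shows the element names and the unbounded Author, so the datatypes and the elementFormDefault setting here are assumptions. The point of the sketch is that the child elements are local declarations, and with elementFormDefault="qualified" they take on the http://www.book.org namespace of the schema they are declared in.

```xml
<?xml version="1.0"?>
<!-- Hypothetical Book.xsd: datatypes and elementFormDefault are assumed -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.book.org"
           xmlns="http://www.book.org"
           elementFormDefault="qualified">

  <xs:complexType name="BookType">
    <xs:sequence>
      <!-- Local elements: each one is {http://www.book.org}name -->
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string" maxOccurs="unbounded"/>
      <xs:element name="Date" type="xs:gYear"/>
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

A schema with a different targetNamespace that redeclares these local elements would, by the same mechanism, place them in its own namespace, which is where the difficulty in the diagram comes from.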
Book.xsd (targetNamespace="http://www.book.org")
  BookType: Title, Author (unbounded), Date, ISBN, Publisher
    | "restrict"
Bookstore.xsd (targetNamespace="http://www.bookstore.org")
  BookTypeMyNamespace: Title, Author (2), Date, ISBN, Publisher

The Title element in BookTypeMyNamespace is {http://www.bookstore.org}Title, but it needs to be {http://www.book.org}Title (recall that the restricted elements must be the same as in the base type). Ditto for the other elements.

You cannot solve this problem in XSD 1.0

targetNamespace

To solve this problem, in XSD 1.1 you can add a targetNamespace attribute on the element declarations in the subtype:

<xs:import namespace="http://www.book.org" schemaLocation="Book.xsd"/>

<xs:complexType name="BookTypeMyNamespace">
  <xs:complexContent>
    <xs:restriction base="b:BookType">
      <xs:sequence>
        <xs:element name="Title" type="xs:string" targetNamespace="http://www.book.org"/>
        <xs:element name="Author" type="xs:string" maxOccurs="2" targetNamespace="http://www.book.org"/>
        <xs:element name="Date" type="xs:gYear" targetNamespace="http://www.book.org"/>
        <xs:element name="ISBN" type="xs:string" targetNamespace="http://www.book.org"/>
        <xs:element name="Publisher" type="xs:string" targetNamespace="http://www.book.org"/>
      </xs:sequence>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

See the BookStore folder within the targetNamespace folder

Do Lab19

targetNamespace on attributes
• You can also add targetNamespace on attribute declarations.
<xs:import namespace="http://www.book.org" schemaLocation="Book.xsd"/>

<xs:complexType name="BookTypeMyNamespace">
  <xs:complexContent>
    <xs:restriction base="b:BookType">
      <xs:sequence>
        <xs:element name="Title" type="xs:string" targetNamespace="http://www.book.org"/>
        <xs:element name="Author" type="xs:string" maxOccurs="2" targetNamespace="http://www.book.org"/>
        <xs:element name="Date" type="xs:gYear" targetNamespace="http://www.book.org"/>
        <xs:element name="ISBN" type="xs:string" targetNamespace="http://www.book.org"/>
        <xs:element name="Publisher" type="xs:string" targetNamespace="http://www.book.org"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID" use="required" targetNamespace="http://www.book.org"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

The Book.xsd schema declares this attribute, but makes it optional. In this declaration we restrict it to required.

See the BookStore folder within the targetNamespace folder

redefine
http://www.w3.org/TR/xmlschema11-1/#modify-schema

Deprecated
• The XML Schema working group has deprecated the <xs:redefine> element.
• ** Don't use <xs:redefine> any more **
• Reason for deprecation: XSD 1.0 processors implemented conflicting and non-interoperable interpretations of <xs:redefine>.

Definition of deprecated (Wikipedia): the term deprecation is applied to software features that are superseded and should be avoided.

override
http://www.w3.org/TR/xmlschema11-1/#override-schema

Replacement for <redefine>
• The <override> element replaces the <redefine> element.
• With <redefine> you redefined a global type in another schema by extending it or restricting it.
• With <override> you can change a global item in another schema.
The change is not restricted to extension/restriction. You can replace the item. The "item" can be an element, attribute, simpleType, complexType, group, or attributeGroup.

Example

office-calendar.xsd
  meeting: start-time: time, end-time: time, room-number: string
  calendar: meeting
    | override meeting
conference-calendar.xsd
  meeting: track-id: string, speaker: string, room-capacity: nonNegativeInteger

office-calendar.xsd

<xs:element name="meeting">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="start-time" type="xs:time" />
      <xs:element name="end-time" type="xs:time" />
      <xs:element name="room-number" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="calendar">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="meeting" maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

conference-calendar.xsd (overrides meeting)

<xs:override schemaLocation="office-calendar.xsd">
  <xs:element name="meeting">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="track-id" type="xs:string" />
        <xs:element name="speaker" type="xs:string" />
        <xs:element name="room-capacity" type="xs:nonNegativeInteger" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:override>

conference-calendar.xml (conforms to conference-calendar.xsd)

<?xml version="1.0" encoding="utf-8"?>
<calendar>
  <meeting>
    <track-id>XProc</track-id>
    <speaker>Norm Walsh</speaker>
    <room-capacity>225</room-capacity>
  </meeting>
  <meeting>
    <track-id>XSLT</track-id>
    <speaker>Michael Kay</speaker>
    <room-capacity>200</room-capacity>
  </meeting>
</calendar>

See the meeting folder within the override folder

Do Lab20

substitutionGroup

XML Schema 1.0
• XML Schema 1.0 only permitted an element to be substitutable for one element.
• That is, the value of substitutionGroup="___" is one QName.

XML Schema 1.1
• XML Schema 1.1 permits an element to be substitutable for multiple elements.
• That is, the value of substitutionGroup="___" is a list of QNames.

metro is substitutable for metrorail, and metro is substitutable for subway.
This is similar to multiple inheritance!

<xs:element name="metrorail" type="xs:string" />
<xs:element name="subway" type="xs:string" />
<xs:element name="metro" substitutionGroup="metrorail subway" type="xs:NCName" />

See the substitutionGroup folder, transit subfolder.

<xs:element name="X" type="xxx" />
<xs:element name="Y" type="yyy" />
<xs:element name="metro" substitutionGroup="X Y" type="zzz" />

metro's datatype (zzz) must derive from xxx and it must derive from yyy.

Thus, if X's datatype is xs:string and metro's datatype is xs:integer, that's an error.

If X's datatype is xs:string, Y's datatype is xs:token and metro's datatype is xs:Name, that's okay (Name derives from both string and token; see next slide).

[Diagram: built-in datatypes]
http://www.w3.org/TR/xmlschema11-2/#built-in-datatypes

[Diagram]
Thanks to Paul Jones for this diagram

<xs:element name="X" type="xxx" />
<xs:element name="Y" type="yyy" />
<xs:element name="metro" substitutionGroup="X Y" />

Notice that metro does not specify a type.

Q: What is metro's datatype? xxx or yyy?

Answer

metro's datatype is the datatype of the first element in the substitutionGroup list (xxx).
In XML Schema 1.1 you can declare an element to be substitutable for multiple elements, e.g.,

<element name="Example" substitutionGroup="A B C" ...>

Recall that in XML Schema 1.0 if you declared an element and didn't provide a type, then it inherited the type of its head element, e.g.,

<xs:element name="Subway" type="xs:string" />
<xs:element name="Metro" substitutionGroup="Subway" />

Note that Metro does not specify a type, so it inherits Subway's type. Thus the type of Metro is xs:string.

But in XML Schema 1.1 there can be multiple head elements, so what type would the element inherit?

Example: The Comment element is substitutable for Subway (xs:string), isHardcover (xs:boolean), and TodaysDate (xs:date):

<xs:element name="Subway" type="xs:string" />
<xs:element name="Metro" substitutionGroup="Subway" />
<xs:element name="isHardcover" type="xs:boolean" />
<xs:element name="TodaysDate" type="xs:date" />
<xs:element name="Comment" substitutionGroup="Subway isHardcover TodaysDate" />

Also notice that Comment does not specify a type. So what is Comment's data type?

Answer: Comment inherits the data type of the first head element listed. In this case Subway is listed first, so Comment has the xs:string data type.

Suppose that Comment was declared like this:

<xs:element name="Comment" substitutionGroup="isHardcover Subway TodaysDate" />

Now isHardcover is listed first, so Comment has the xs:boolean data type.

Here is the relevant section from the XML Schema 1.1 specification:

An <element> with no referenced or included type definition will correspond to an element declaration which has the same type definition as the *first* substitution-group head named in the substitutionGroup

LESSON LEARNED: THE ORDER OF ELEMENTS LISTED IN A SUBSTITUTIONGROUP IS EXTREMELY IMPORTANT.
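One way to avoid depending on the order of the heads is to give the substitutable element an explicit type, as the earlier metro declaration does. The sketch below reuses those three declarations and adds a hypothetical <trip> element (not part of the tutorial) to show metro standing in for both of its heads.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the trip element is invented for illustration -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="metrorail" type="xs:string"/>
  <xs:element name="subway" type="xs:string"/>

  <!-- Explicit type, so the result does not depend on which head is listed first -->
  <xs:element name="metro" substitutionGroup="metrorail subway" type="xs:NCName"/>

  <xs:element name="trip">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="metrorail"/>
        <xs:element ref="subway"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

An instance that this schema is intended to accept, with metro substituted in both positions:

```xml
<?xml version="1.0"?>
<trip>
  <metro>RedLine</metro>   <!-- stands in for metrorail -->
  <metro>BlueLine</metro>  <!-- stands in for subway -->
</trip>
```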
explicitTimezone
http://www.w3.org/TR/xmlschema11-2/#rf-explicitTimezone

New Facet
• Recall that the timezone offset is optional on the date/time datatypes.
• The purpose of this facet is to specify "Hey, you must specify the timezone offset," or "Hey, you must not specify the timezone offset," or "Hey, it's optional whether you specify a timezone offset."
• The value of this facet is one of these: required, prohibited, optional
• This facet is used with these datatypes: dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth

<simpleType name="event">
  <restriction base="dateTime">
    <explicitTimezone value="required"/>
  </restriction>
</simpleType>

dateTimeStamp
http://www.w3.org/TR/xmlschema11-2/#dateTimeStamp

New Datatype
• dateTimeStamp is a new data type.
• dateTimeStamp is exactly the same as dateTime, except it requires you to specify the time zone.

<element name="birthdate" type="dateTimeStamp" />

<birthdate>1976-06-21T16:04:00-06:00</birthdate>      (you must specify the time zone)
<birthdate>1980-01-01T24:00:00-06:00</birthdate>      (this is how to express end-of-day)

Equivalent!

2009-07-29T24:00:00-06:00      The end of the day July 29, 2009
2009-07-30T00:00:00-06:00      The start of the day July 30, 2009

They are two different lexical representations of the same xs:dateTimeStamp value. (Just as "1" and "true" are different lexical representations of the same boolean.) So, once converted into the value space, the values they represent are identical and equal.

anyAtomicType
http://www.w3.org/TR/xmlschema11-2/#anyAtomicType

New Datatype
• anyAtomicType is a new data type.
• It is the union of the value spaces of all the primitive types.
• All of the primitive types derive from (constrain) anyAtomicType.
• It has no facets. Thus it cannot be used as the base type in a simpleType.

Schema:

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="http://www.example.org"
               xmlns="http://www.example.org"
               elementFormDefault="qualified">
      <xs:element name="Example">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Value" type="xs:anyAtomicType" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

Instance document (note that the xs prefix must be declared because it is used in the xsi:type values):

    <?xml version="1.0"?>
    <Example xmlns="http://www.example.org"
             xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.example.org Example.xsd">
      <Value xsi:type="xs:string">Hello World</Value>
      <Value xsi:type="xs:decimal">12.36</Value>
      <Value xsi:type="xs:boolean">true</Value>
    </Example>

See Example.xsd in the folder: anyAtomicType/value

anyAtomicType vs anySimpleType

Consider these two element declarations:

    <element name="A" type="anySimpleType" />
    <element name="B" type="anyAtomicType" />

The value of <A> can be any primitive type or a list type, e.g.

    <A xsi:type="xs:string">Hello World</A>
    <A xsi:type="xs:decimal">12.39</A>
    <A xsi:type="xs:boolean">true</A>
    <A xsi:type="ex:LotteryNumbers">3 8 19</A>

where ex:LotteryNumbers is defined as a list type:

    <xs:simpleType name="LotteryNumbers">
      <xs:list itemType="xs:positiveInteger" />
    </xs:simpleType>

The value of <B> can only be a primitive type, e.g.

    <B xsi:type="xs:string">Hello World</B>
    <B xsi:type="xs:decimal">12.39</B>
    <B xsi:type="xs:boolean">true</B>

Thus, the difference between anyAtomicType and anySimpleType is that an anySimpleType value can be a list type, whereas that's not legal for anyAtomicType.
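The A and B declarations above can be combined into one schema to try the difference out. This is a minimal sketch: the wrapper element name Values and the ex prefix binding are assumptions made for the illustration, not part of the original lab files.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.org"
           xmlns:ex="http://www.example.org"
           elementFormDefault="qualified">

  <!-- A list type: a whitespace-separated list of positive integers -->
  <xs:simpleType name="LotteryNumbers">
    <xs:list itemType="xs:positiveInteger" />
  </xs:simpleType>

  <!-- Hypothetical wrapper element holding A elements, then B elements -->
  <xs:element name="Values">
    <xs:complexType>
      <xs:sequence>
        <!-- anySimpleType: atomic values and list values are both allowed,
             e.g. <A xsi:type="ex:LotteryNumbers">3 8 19</A> -->
        <xs:element name="A" type="xs:anySimpleType" maxOccurs="unbounded" />
        <!-- anyAtomicType: atomic values only, so using
             xsi:type="ex:LotteryNumbers" on B should be rejected -->
        <xs:element name="B" type="xs:anyAtomicType" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

An instance document would need to declare the xs, ex and xsi prefixes before using them in xsi:type values.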
yearMonthDuration
http://www.w3.org/TR/xmlschema11-2/#yearMonthDuration

New Datatype

• yearMonthDuration is a new datatype.
• It specifies a duration in terms of years and months, or just months.
• Examples:

    P1Y3M    (a duration of 1 year, 3 months)
    P15M     (a duration of 15 months)

dayTimeDuration
http://www.w3.org/TR/xmlschema11-2/#dayTimeDuration

New Datatype

• dayTimeDuration is a new datatype.
• It specifies a duration in terms of days and time, or just time.
• Examples:

    P35DT01H22M30S    (a duration of 35 days, 1 hour, 22 minutes, 30 seconds)
    PT11H             (a duration of 11 hours)
    PT114M            (a duration of 114 minutes)
    PT300S            (a duration of 300 seconds)

The two new duration datatypes were created to satisfy a demand for totally ordered durations, e.g. a duration of 1 month is incomparable with a duration of 30 days -- neither greater than, equal to, nor less than.
    Dave Peterson

Vendor-unique extensions
http://www.w3.org/TR/xmlschema11-2/#idef-idep (see clauses 3 and 4)

In XSD 1.1, vendors can add their own datatypes and facets.

Example

A vendor creates a new decimal datatype and a facet that enables you to specify the delimiter used in the decimal:

    <xs:simpleType name="money">
      <xs:restriction base="vendor:decimal">
        <vendor:delimiter value="," />
      </xs:restriction>
    </xs:simpleType>

Conditional Inclusion (a.k.a. Version Control)
http://www.w3.org/TR/xmlschema11-1/#cip
The Version Control Namespace

http://www.w3.org/2007/XMLSchema-versioning

    minVersion, maxVersion
    typeAvailable, typeUnavailable
    facetAvailable, facetUnavailable

They are attributes. The convention is to use vc: as the prefix.

Ignored by 1.0 Validators

• The version control attributes are added onto element declarations.
• Since they are in a different namespace they will be ignored by 1.0 validators.

vc:minVersion, vc:maxVersion

• They are attributes.
• They can be added to any element declaration in the schema.
• Their values are decimal.

    <element name="Book" vc:minVersion="3.2">
      declare the Book element
    </element>

    <element name="Book" vc:minVersion="1.1" vc:maxVersion="3.2">
      declare the Book element
    </element>

Imagine the year is 2030

XML Schema is now at version 3.2.

A 3.2 validator will use this element declaration:

    <element name="Book" vc:minVersion="3.2">
      declare the Book element
    </element>

and during pre-processing of the schema, this element declaration will be removed:

    <element name="Book" vc:minVersion="1.1" vc:maxVersion="3.2">
      declare the Book element
    </element>

Pre-processing

The schema contains both declarations:

    <element name="Book" vc:minVersion="3.2">
      declare the Book element
    </element>

    <element name="Book" vc:minVersion="1.1" vc:maxVersion="3.2">
      declare the Book element
    </element>

A 1.1 Schema Validator removes the 1st declaration of Book, leaving:

    <element name="Book" vc:minVersion="1.1" vc:maxVersion="3.2">
      declare the Book element
    </element>

A 3.2 Schema Validator removes the 2nd declaration of Book, leaving:

    <element name="Book" vc:minVersion="3.2">
      declare the Book element
    </element>
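The slides never show the vc prefix being declared, so here is a minimal sketch of a complete schema document that uses it. The element name Range and its attributes are hypothetical. The sketch assumes the validator implements conditional inclusion, and it relies on vc:maxVersion being an exclusive upper bound.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

  <!-- Kept by validators implementing version 1.1 or later:
       uses xs:assert, which exists only in 1.1 -->
  <xs:element name="Range" vc:minVersion="1.1">
    <xs:complexType>
      <xs:attribute name="min" type="xs:integer" use="required" />
      <xs:attribute name="max" type="xs:integer" use="required" />
      <xs:assert test="@min le @max" />
    </xs:complexType>
  </xs:element>

  <!-- Kept by validators implementing a version below 1.1
       (maxVersion is exclusive): same structure, no assertion -->
  <xs:element name="Range" vc:maxVersion="1.1">
    <xs:complexType>
      <xs:attribute name="min" type="xs:integer" use="required" />
      <xs:attribute name="max" type="xs:integer" use="required" />
    </xs:complexType>
  </xs:element>

</xs:schema>
```

After pre-processing, each validator sees exactly one declaration of Range, so the duplicate name is not an error.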
Same name, symbol space, Different type, version

• Recall from XSD 1.0: if two element declarations have the same name, are in the same symbol space, and have different types then it is an error.
• Below we see two elements with the same name, in the same symbol space and with different types, but it's not an error because one of them will be removed during pre-processing.

    <element name="Book" vc:minVersion="3.2">
      declare the Book element
    </element>

    <element name="Book" vc:minVersion="1.1" vc:maxVersion="3.2">
      declare the Book element
    </element>

vc:maxVersion is "exclusive"

vc:maxVersion="3.2" means "up to but not including 3.2"

vc:typeAvailable, vc:typeUnavailable

• They are attributes.
• They can be added to any element declaration in the schema.
• Their values are a list of QNames (a list of namespace-qualified types).

    <element name="inStock" vc:typeAvailable="ex:new-datatype" type="ex:new-datatype" />
    <element name="inStock" vc:typeUnavailable="ex:new-datatype" type="boolean" />

If the validator supports this datatype, it will use this element declaration:

    <element name="inStock" vc:typeAvailable="ex:new-datatype" type="ex:new-datatype" />

If the validator does not support this datatype, it will use this element declaration (<inStock> must be a boolean):

    <element name="inStock" vc:typeUnavailable="ex:new-datatype" type="boolean" />

I Mean the Built-in Datatypes

• The datatypes listed in vc:typeAvailable and vc:typeUnavailable are not user-defined datatypes.
• Rather, they are built-in datatypes.
• Suppose vendor X defines a new primitive datatype: vendor:decimal. The X schema validator will understand it, but other validators won't.
    <element name="Size" type="vendor:decimal" vc:typeAvailable="vendor:decimal" />

    X Schema Validator: Include
    Other Schema Validator: Discard

vc:facetAvailable, vc:facetUnavailable

• They are attributes.
• They can be added to any element declaration in the schema.
• Their values are a list of QNames (a list of namespace-qualified facets).

    <element name="range" vc:facetAvailable="vendor:delimiter">
      <simpleType>
        <restriction base="decimal">
          <vendor:delimiter value="," />
        </restriction>
      </simpleType>
    </element>

If the validator supports this facet, it will use this element declaration:

    <element name="range" vc:facetAvailable="vendor:delimiter">
      <simpleType>
        <restriction base="decimal">
          <vendor:delimiter value="," />
        </restriction>
      </simpleType>
    </element>

If the validator does not support this facet, it will use this element declaration:

    <element name="range" vc:facetUnavailable="vendor:delimiter" type="decimal" />

Relevant, today?

> I'm having a hard time seeing the usefulness of
> vc:typeAvailable, vc:typeUnavailable, vc:facetAvailable,
> vc:facetUnavailable, vc:minVersion, and vc:maxVersion.
>
> I can see its usefulness in future versions of XML Schema
> (version 1.2, 1.3, etc.) but I can't see its usefulness today.
>
> Does it have any usefulness today? If so, can you give me a
> practical example please?

It's mainly there because people realized belatedly that it should have been in version 1.0. There is some optimism in some quarters that it will be retrofitted to those 1.0 processors that are still under active development, helping users to move forward to 1.1. It has been designed so that it would be conformant to the 1.0 spec to do so.
    Michael Sperberg-McQueen
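One use that seems plausible today is choosing between a 1.1 built-in datatype and a 1.0 fallback. The sketch below is an illustration under assumptions, not an example from the specification: the element name LastModified is hypothetical, and a validator that does not implement conditional inclusion would see two declarations with the same name and would likely report an error.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

  <!-- Kept when the validator knows the 1.1 datatype xs:dateTimeStamp:
       the time zone is then mandatory -->
  <xs:element name="LastModified"
              type="xs:dateTimeStamp"
              vc:typeAvailable="xs:dateTimeStamp" />

  <!-- Kept when the validator does not know xs:dateTimeStamp:
       falls back to xs:dateTime, where the time zone is optional -->
  <xs:element name="LastModified"
              type="xs:dateTime"
              vc:typeUnavailable="xs:dateTimeStamp" />

</xs:schema>
```

The two attributes name the same datatype, so exactly one of the declarations survives pre-processing.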
Factors Influencing an Element Declaration

Cannot understand an element in isolation

• In XML Schema 1.0 you could examine an element declaration and mostly understand it without consideration of the rest of the schema.
  – substitutionGroup did require you to examine other parts of the schema (if the element was global).
• In XML Schema 1.1 an element could be affected by internal and external factors:
  – one or more alternatives in ancestor elements, and the alternatives may be using information in inherited attributes
  – defaultAttributes
  – vendor-unique capabilities
  – which version of the XML Schema specification the validator implements

Factors affecting elements

[Diagram: an element declaration surrounded by the factors that affect it: inherited attribute, validator version, alternative, default attributes, vendor-unique.]

Many non-local factors affect an element declaration

[Diagram: an XML Schema containing an element declaration, annotated with: "Which schema validator? Which version?"; "This attribute, starttime, is inheritable"; "Alternative: If @starttime is before noon then use MorningBeverage"; "Default attributes".]

Identity Constraints and Substitution Elements

Caution using element substitution: the substituting elements may fail to catch errors in the data

Elements substitutable for a head element are required to have types derived from the head element's type, but are not required to enforce the identity constraints of the head element. In other words, the substitutable elements may not validate the same way as the head element. Beware!

Example: A BookStore consists of multiple Book elements and each Book element is uniquely identified by its ISBN.
Here is an example BookStore:

    <?xml version="1.0"?>
    <BookStore xmlns="http://www.books.org">
      <Book>
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Publisher>Dell Publishing Co.</Publisher>
      </Book>
      <Book>
        <Title>The First and Last Freedom</Title>
        <Author>J. Krishnamurti</Author>
        <Date>1954</Date>
        <ISBN>0-06-064831-7</ISBN>
        <Publisher>Harper &amp; Row</Publisher>
      </Book>
    </BookStore>

Note that each ISBN value is unique. We want the XML Schema to enforce that uniqueness.

One way to accomplish this is with xsd:key. Let's look at the XML Schema code.

BookStore consists of an unbounded number of Book elements:

    <xsd:element name="BookStore">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="Book" maxOccurs="unbounded" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

Each Book element has a child ISBN element that is required to be unique:

    <xsd:element name="Book" type="BookType">
      <xsd:key name="PK">
        <xsd:selector xpath="bk:Book"/>
        <xsd:field xpath="bk:ISBN"/>
      </xsd:key>
    </xsd:element>

Here is Book's type definition:

    <xsd:complexType name="BookType">
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string" />
        <xsd:element name="Author" type="xsd:string" />
        <xsd:element name="Date" type="xsd:string" />
        <xsd:element name="ISBN" type="xsd:string" />
        <xsd:element name="Publisher" type="xsd:string" />
      </xsd:sequence>
    </xsd:complexType>

Some of my clients write Fiction books. I would like them to be able to customize their markup a bit. Namely, I would like for them to be able to use the element name "Fiction" rather than "Book".
So I declare Fiction to be substitutable for Book:

    <xsd:element name="Fiction" substitutionGroup="Book" />

My client constructs an XML document using Fiction elements:

    <?xml version="1.0"?>
    <BookStore xmlns="http://www.books.org">
      <Fiction>
        <Title>Siddhartha</Title>
        <Author>Hermann Hesse</Author>
        <Date>1951</Date>
        <ISBN>0-486-40653-9</ISBN>
        <Publisher>Dover</Publisher>
      </Fiction>
      <Fiction>
        <Title>Atlas Shrugged</Title>
        <Author>Ayn Rand</Author>
        <Date>1957</Date>
        <ISBN>0-486-40653-9</ISBN>
        <Publisher>Penguin Books</Publisher>
      </Fiction>
    </BookStore>

Oops! Notice that my client messed up: he mistakenly identified both books by the same ISBN value.

XML Schema validation of the document returns "valid". My client has an error in his XML document and it is not being caught by schema validation.

Lesson Learned: element substitution is powerful. But without intimate knowledge of the head element, use of your substitution elements may result in XML documents with erroneous data that is not detected by schema validation.

Constrain Data Types to Avoid Malicious Attacks

When creating XML Schemas it is important that you restrict the set of allowable characters. Here is a powerful example of how unconstrained strings could be exploited.

Unicode has this character, RIGHT-TO-LEFT OVERRIDE (RLO). It is hex 202E.

The RLO character reverses the display order of everything that follows it. For example, gpj.exe is reversed to exe.jpg

So, this:

    ann[RLO]gpj.exe

would display as this:

    annexe.jpg

That is, it appears in a displayer as the name of a harmless JPG file, when in fact it is the name of an EXE file.

This technique can be (and is being) used to trick users into opening malware executables.
Here’s how:

Here’s an XML document that contains the RLO character in the content of the element:

    <?xml version="1.0"?>
    <Part-of-Czechoslovakia-Annexed-by-Germany>
      http://www.example.org/ann&#x202e;gpj.exe
    </Part-of-Czechoslovakia-Annexed-by-Germany>

I dragged that into a browser (Firefox) and here’s what it displayed:

    <Part-of-Czechoslovakia-Annexed-by-Germany>http://www.example.org/annexe.jpg</Part-of-Czechoslovakia-Annexed-by-Germany>

It appears that the data is a URL to a harmless JPG file, http://www.example.org/annexe.jpg

In fact, the data is a URL to an executable file, http://www.example.org/anngpj.exe

Since the data looks good it is incorporated (using an XSLT program) into an HTML document, as a hyperlink. Users then click on the link and … bam! … malware just got loaded onto their machines.

You can probably see how the same method could be used to disguise some critical piece of data.

Lessons Learned: Avoid using the unconstrained string data type in your XML Schemas.

Here is an example of a properly constrained element:

    <xs:element name="Surname" type="English-language-family-name" />

    <xs:simpleType name="English-language-family-name">
      <xs:annotation>
        <xs:documentation>
          The vast majority of English language family names
          are at least 2 characters long, under 100 characters,
          and consist of the characters: a-z, A-Z, space, hyphen,
          period, and apostrophe.
        </xs:documentation>
      </xs:annotation>
      <xs:restriction base="xs:string">
        <xs:minLength value="2" />
        <xs:maxLength value="100" />
        <xs:pattern value="[a-zA-Z' \.-]+" />
      </xs:restriction>
    </xs:simpleType>

For more info, see:
https://www.google.com/#hl=en&output=search&sclient=psyab&q=righttoleftoverride%E2%80%9D+%28RLO%29+character&oq=right&gs_l=hp.3.0.35i39j0l3.1543.2211.0.3556.5.5.0.0.0.1.275.813.0j4j1.5.0...0.0...1c.UpSbwsMBQcQ&pbx=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&fp=717d9e3559249a06&biw=1440&bih=742

XML Schema quiz on default values

Recall that when you declare an element (or attribute) you can give it a default value. For example, I give the Altitude element a default value of 100:

    <xs:element name="Altitude" type="xs:integer" default="100" />

In an instance document, if you wish for Altitude to have the default value, then you can simply create it as an empty element:

    <Altitude></Altitude>

The value of Altitude is 100.

Let’s take another example. Here I declare the Title element to be of type string and give it a default value, “Hello World”:

    <xs:element name="Title" type="xs:string" default="Hello World" />

Now in my instance document I create an empty Title element:

    <Title></Title>

Quiz: What is the value of Title?

Answer: the value of Title is the empty string, not the default value. The reason is that the empty string is a valid value of the string data type. If you want Title to have the default value then you must explicitly enter the default value:

    <Title>Hello World</Title>

The answer on the previous slide is not correct. I wrote:

> Answer: the value of Title is the empty string, not the default value.
> The reason is that the empty string is a valid value of the string data type.

The correct answer is:

Answer: the value of Title is the default value. The XML Schema specification explains why:

    An element with a non-empty default value whose simple type definition includes the empty string in its lexical space will nonetheless never receive that empty string value, because the default value will override it.
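The corrected rule can be seen side by side in a small instance document. This is a sketch under assumptions: the wrapper elements Flights and Flight are hypothetical, and each Flight is assumed to contain the Altitude and Title elements declared above. The comments give the values a schema-aware processor is expected to report after validation.

```xml
<?xml version="1.0"?>
<Flights>
  <Flight>
    <!-- Empty element: the default applies, so the value is 100 -->
    <Altitude></Altitude>
    <!-- Empty element: the default applies here too, so the value
         is Hello World, not the empty string -->
    <Title></Title>
  </Flight>
  <Flight>
    <!-- Explicit content: the defaults are not used -->
    <Altitude>250</Altitude>
    <Title>Evening Departure</Title>
  </Flight>
</Flights>
```

A consequence of this behavior is that an element declared with a non-empty default can never end up with the empty string as its value.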
Additional Resources

Articles

XML Schema 1.1, Part 1: An introduction to XML Schema 1.1
http://www.ibm.com/developerworks/xml/library/x-xml11pt1/

XML Schema 1.1, Part 2: An introduction to XML Schema 1.1
http://www.ibm.com/developerworks/xml/library/x-xml11pt2/