ACG 5405
XML Schemas, XML Namespaces, XLink

The XML Foundation
  Many participants: an extended family!
  XML documents: carry data in context
    Each must be parsed into its component parts
  XML schemas: define the rules a class of documents must follow
    Can be used to validate documents and their contents
  XSLT: provides transformation instructions
    Can be used to process XML documents
  Namespaces: qualify elements and attributes
    Differentiate them and associate them with a URI
  XPath … XLink … XQuery …
  XML processors are not designed equally!

XML Languages - Schema
  Instance Document
    Elements (tag sets)
    Meta-data about data
  Schema
    Well-formed XML document
    Defines structure and contents of the Instance Document
      Similar to an ER diagram for databases
    Defines each element and attribute
      Its structure
      Includes business rules
        Cardinalities
    Used to validate the Instance Document
      Valid means the Instance Document conforms to the schema rules

XML Schema
  .xsd extension
  Defines each element and attribute
  Root element is <xs:schema>, which declares the XML Schema namespace
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  Define an element:
    Simple: contains only data
    Complex:
      contains other elements (i.e. Root & Parent)
      contains attributes

Simple Element Definition
  Declare name
  Declare type
    <xs:element name="ID" type="xs:string"/>
  type= defines the data type:
    string
    integer
    date
    decimal
    other types

Complex Element (Parent)
  Declares name
  Declares type
  Declares structure

    <xs:element name="Party">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="PartyName" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="PostalAddress" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="Contact" minOccurs="0" maxOccurs="1"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

Complex Element (attribute)
  Declare name
  Declare type
  Define element and attribute(s)

    <xs:element name="PriceAmount">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:decimal">
            <xs:attribute name="currencyID" type="xs:string" use="required"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>

Create a Schema for Concierge Instance
  Identify types of elements
    Simple
    Complex: Parent
    Complex: Attribute
  Create prolog
  Create root element
  Work down from first element to last

Step 1

    <ServiceRequest>
      <Request>
        <ID>1</ID>
        <ApartmentNumber>1004</ApartmentNumber>
        <TenantName>
          <LastName>Hornik</LastName>
          <FirstName>Steven</FirstName>
        </TenantName>
        <ServiceName>Dry Cleaning</ServiceName>
        <ServiceDate>2009-06-23</ServiceDate>
        <ServiceTime>06:45 PM</ServiceTime>
      </Request>
    </ServiceRequest>

  Complex (Parent) types:
    ServiceRequest
    Request
    TenantName
  Simple types:
    ID
    ApartmentNumber
    LastName
    FirstName
    ServiceName
    ServiceDate
    ServiceTime

Step 2 and 3 (Prolog and Root)

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      ...
    </xs:schema>

Step 4: Build Schema

    <xs:element name="ServiceRequest">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="Request" minOccurs="1" maxOccurs="unbounded">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="ID" type="xs:string"/>
                <xs:element name="ApartmentNumber" type="xs:decimal"/>
                <xs:element name="TenantName" minOccurs="1" maxOccurs="1">
                  <xs:complexType>
                    <xs:sequence>
                      <xs:element name="LastName" type="xs:string"/>
                      <xs:element name="FirstName" type="xs:string"/>
                    </xs:sequence>
                  </xs:complexType>
                </xs:element>
                <xs:element name="ServiceName" type="xs:string"/>
                <xs:element name="ServiceDate" type="xs:date"/>
                <!-- "06:45 PM" is neither an xs:date nor an xs:time value
                     (xs:time expects hh:mm:ss, e.g. 18:45:00), so xs:string is used -->
                <xs:element name="ServiceTime" type="xs:string"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

UBL Schemas
  Schemas for each document type, plus:
  Common Basic Components
    Defines simple elements
    Defines complex (attribute) elements
    Prefix: cbc
  Common Aggregate Components
    Defines complex (parent) elements
    Prefix: cac

Vocabularies & Schemas
  XBRL & UBL are vocabularies
    XBRL for financial reporting
    UBL for business documents
  Vocabularies are designed using
    Agreed-upon element names
    Agreed-upon element types
    Agreed-upon element sequence/structure
  Defined by schemas

Vocabularies and Namespaces
  Namespace
    A unique identifier
      A unique prefix refers to a URI
      The URI identifies the vocabulary the names belong to; it does not have to resolve to a retrievable resource
    Declared as attributes of the root element
    Used to preclude naming collisions
      Method for distinguishing between different elements that share the same name
        <inv:id>10001</inv:id>
        ...
        <employee:id>18897</employee:id>

Declaring a Namespace (in the UBL instance document)

    <Catalogue xmlns="UBLCatalogueDocument"
               xmlns:cbc="UBLCommonBasicComponents"
               xmlns:cac="UBLCommonAggregateComponents">

  Because UBLCatalogueDocument is declared without a prefix, it is the default namespace: any element in the instance document without a prefix belongs to it.
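The naming-collision point above can be checked with a few lines of code. The following is a minimal sketch in Python, using only the standard library's xml.etree.ElementTree. The namespace URIs urn:example:invoice and urn:example:employee and the wrapper element <records> are made up for this illustration and are not part of UBL.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
INV_NS = "urn:example:invoice"
EMP_NS = "urn:example:employee"

DOCUMENT = f"""<records xmlns:inv="{INV_NS}" xmlns:employee="{EMP_NS}">
  <inv:id>10001</inv:id>
  <employee:id>18897</employee:id>
</records>"""


def ids_by_namespace(xml_text):
    """Return {namespace URI: text} for every element whose local name is 'id'."""
    root = ET.fromstring(xml_text)
    found = {}
    for element in root.iter():
        # ElementTree expands a prefixed name to '{namespace-uri}local-name'.
        if element.tag.startswith("{"):
            uri, local_name = element.tag[1:].split("}", 1)
            if local_name == "id":
                found[uri] = element.text
    return found


ids = ids_by_namespace(DOCUMENT)
print(ids[INV_NS])  # the invoice id
print(ids[EMP_NS])  # the employee id
```

Both elements are named id, but the parser treats them as different names because each prefix is bound to a different URI. The prefix itself is only shorthand; the URI is what distinguishes the two.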
Creating UBL Document Schemas
  Declare namespaces and qualifiers
  Import necessary schemas
  Define root element
    Reference reusable data components
    Declare cardinalities

UBL Namespace Declaration

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Simplified UBL Catalogue schema: SkipWhite.com, May 2008 -->
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="UBLCatalogueDocument"
               xmlns="UBLCatalogueDocument"
               xmlns:cbc="UBLCommonBasicComponents"
               xmlns:cac="UBLCommonAggregateComponents"
               elementFormDefault="qualified"
               attributeFormDefault="unqualified">

Namespace Clarification
  targetNamespace="UBLCatalogueDocument"
    The elements declared in this schema belong to the UBLCatalogueDocument namespace
  elementFormDefault="qualified"
    Element names in the instance document must be namespace-qualified, either with a prefix (cac:, cbc:) or through the default namespace
  attributeFormDefault="unqualified"
    Attribute names will not use a namespace prefix

UBL Import

    <xs:import namespace="UBLCommonBasicComponents"
               schemaLocation="http://www.buec.udel.edu/whitec/UBLCommonBasicComponents/UBLCommonBasicComponentsSchema.xsd"/>
    <xs:import namespace="UBLCommonAggregateComponents"
               schemaLocation="http://www.buec.udel.edu/whitec/UBLCommonAggregateComponents/UBLCommonAggregateComponentsSchema.xsd"/>

UBL Root Element (Catalogue)

    <xs:element name="Catalogue">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="cbc:ID" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="cbc:Name" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="cbc:IssueDate" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="cac:ProviderParty" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="cac:ReceiverParty" minOccurs="1" maxOccurs="1"/>
          <xs:element ref="cac:CatalogueLine" minOccurs="1" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

Put it all together: The Entire Schema

Validating XML
  Ensure that the Instance Document
    Follows business rules
    Uses correct data types
    Is properly sequenced

XML Linking Language
  XLink
    Uses attributes to describe relationships between elements
    Simple: HTML-type links
    Extended: more complex relationship links
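To make the "simple" XLink case concrete, here is a minimal sketch in Python (standard library only) that parses one linking element and reads its link attributes. The element name catalogueLink and the target catalogue.xml are hypothetical, invented for this example. The namespace URI http://www.w3.org/1999/xlink and the xlink:type and xlink:href attributes are the ones defined by the W3C XLink specification.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the W3C XLink specification.
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical linking element; the element name and target are illustrative only.
SAMPLE = f"""<catalogueLink xmlns:xlink="{XLINK_NS}"
    xlink:type="simple"
    xlink:href="catalogue.xml">Current catalogue</catalogueLink>"""


def read_simple_link(xml_text):
    """Return the href of a simple XLink element, or None if it is not one."""
    element = ET.fromstring(xml_text)
    # Namespaced attributes are looked up as '{namespace-uri}local-name'.
    link_type = element.get(f"{{{XLINK_NS}}}type")
    if link_type != "simple":
        return None
    return element.get(f"{{{XLINK_NS}}}href")


href = read_simple_link(SAMPLE)
print(href)
```

As the slide says, the link is carried entirely by attributes: any element can act as a link once it carries the xlink attributes, much like an HTML anchor. Extended links use further xlink attributes and child elements and are not covered by this sketch.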