XML and The Relational Data Model
<author> By: Soid Quintero </author>

Overview
  Define XML
  XML model vs. relational model
  XML documents
  XPath
  XQuery
  Going from XML to the relational model

What is XML?
  XML stands for Extensible Markup Language.
  It is a metalanguage used to represent and manipulate data elements.
  It is similar to HTML in structure, but XML is concerned with the description and representation of data rather than with the way it is displayed.
  XML is derived from Standard Generalized Markup Language (SGML).
  XML data is used to create XML documents.

Why XML for databases?
  One of the main reasons XML was developed was to allow the exchange of semi-structured documents (invoices, order forms, applications, etc.) over the internet.
  Storing XML documents in a database system gives users better access to the information in them.
  XML is also very flexible.
  Data is maintained in a self-describing format to accommodate a variety of ever-evolving business needs.

What is an XML Database?
  Simply a database that stores XML documents.
  There are two major types of XML databases:
    XML-enabled: these map all XML to a traditional database (such as a relational database), accepting XML as input and rendering XML as output.
    Native XML (NXD): the internal model of such databases depends on XML and uses XML documents as the fundamental unit of storage.

XML Model vs. Relational Model
  XML model:
    XML data is hierarchical.
    XML data is self-describing.
    XML data has inherent ordering.
    An XML database contains collections.
  Relational model:
    Relational data is represented in a model of logical relationships.
    Relational data is not self-describing.
    Relational data does not have inherent ordering.
    A relational database contains tables.

Relational Model
  Order of rows is not guaranteed unless an ORDER BY clause is used on one or more columns.
  Relations (tables)
    Data is represented in n-ary relations.
  Attributes (columns)
    Each attribute has a domain that represents a set of values.
  Strict schema
    Restrictive
    The strict schema ensures data integrity.

XML Model
  The XML model is a hierarchical format.
  Data is represented in tree structures:
    Nodes
    Relationships between the nodes
  The schema provides flexibility:
    Easily modified format
  Multiple elements are represented in a hierarchy, for example a root "Comments" element and one or more individual "Comment" elements pertaining to a given item.

XML Document Rules
  XML documents must be well formed, meaning among other things that every opening tag needs a closing tag (e.g. <Student> </Student>).
  XML allows users to define their own tags (unlike HTML).
  XML tags need to be properly nested.
  Tag names beginning with XML or xml are reserved for XML itself.
  Use <!-- XXX --> for comments.
  XML is case sensitive, so <Student> is not the same as <STUDENT>.
  Two schema languages are commonly used to define XML documents: Document Type Definition (DTD) and XML Schema Definition (XSD).

What is a Document Type Definition?
  A DTD can be declared inline in the XML document or can reference an external file.
  It provides the composition of the database's logical model and defines the syntax rules or valid tags for each type of XML document.
  An external DTD is a file with a .dtd extension. This file describes XML elements.
  Example of a DTD on the next slide.

Inline DTD
  message.xml
    <?xml version="1.0"?>
    <!DOCTYPE message [
      <!ELEMENT message (to,from,subject,text)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT subject (#PCDATA)>
      <!ELEMENT text (#PCDATA)>
    ]>
    <message>
      <to>Dave</to>
      <from>Susan</from>
      <subject>Reminder</subject>
      <text>Don't forget to buy milk on the way home.</text>
    </message>

External DTD
  message.xml
    <?xml version="1.0"?>
    <!DOCTYPE message SYSTEM "message.dtd">
    <message>
      <to>Dave</to>
      <from>Susan</from>
      <subject>Reminder</subject>
      <text>Don't forget to buy milk on the way home.</text>
    </message>
  message.dtd
    <?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT message (to,from,subject,text)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT subject (#PCDATA)>
    <!ELEMENT text (#PCDATA)>

What is an XML Schema Definition?
  XML Schema is an advanced definition language used to describe the structure of an XML document (elements, data types, relationship types, ranges, and default values).
  It is an alternative to DTD.
  Because data types are supported, data validation is possible and easier to do.
  An XSD file has a .xsd extension.
  Example of an XSD on the next slide.

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="http://www.w3schools.com"
               xmlns="http://www.w3schools.com"
               elementFormDefault="qualified">
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

How do you access XML data?
  Several languages are used to access data in XML documents, including:
    XPath
    XQuery
    XML-QL
    XQL

XPath
  XPath is a language used to extract parts of an XML document.
  XPath uses path expressions to navigate in XML documents.
  XPath has 7 kinds of nodes:
    Element
    Attribute
    Text
    Namespace
    Processing instruction
    Comment
    Document (root)

Examples of XPath
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <bookstore>
      <book>
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
      </book>
    </bookstore>
  Callouts on the slide label a document node, an element node, and an attribute node (lang="en").
  Example XPath expressions:
    /bookstore
      Selects the root element bookstore.
    /bookstore/book
      Selects all book elements that are children of bookstore.

XQuery
  XQuery is a language for finding and extracting elements and attributes from XML documents.
  XQuery is to XML what SQL is to relational databases.
  XQuery is built on XPath expressions.
  XQuery is supported by all the major database engines (IBM, Oracle, Microsoft, etc.).

Example of XQuery
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <bookstore>
      <book category="COOKING">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
      </book>
    </bookstore>
  Example XQuery expression:
    doc("books.xml")/bookstore/book/title
  Will return:
    <title lang="en">Everyday Italian</title>

The XQuery FLWOR
  for: iterates through a sequence, binding a variable to each item
  let: binds a variable to a sequence
  where: eliminates items of the iteration
  order by: reorders items of the iteration
  return: constructs query results
  Example XQuery expression:
    for $x in doc("books.xml")/bookstore/book
    where $x/price>29
    return $x/title
  Will return:
    <title lang="en">Everyday Italian</title>

Going from XML Model to Relational Model
  XML documents can be decomposed into relational tables.
  The decomposed data can later be published from the relational tables back to an XML document (which might differ from the original).
  During decomposition, the XML document loses most of its structure in order to map into the relational tables; not all the tags are stored.
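
The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DBMS does it: the document, the table names (orders, items), the column names, and the function shred_order are all hypothetical, and the sketch assumes the order and customer identifiers are attributes of the root element. It uses only the standard-library xml.etree.ElementTree and sqlite3 modules.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document, used only for illustration.
ORDER_XML = """
<ORDER ORDER_ID="83492" CUST_ID="93457">
  <ITEM>
    <PROD_ID>94872</PROD_ID><PROD_NAME>PEN</PROD_NAME>
    <PRICE>19.95</PRICE><QUANTITY>30</QUANTITY>
  </ITEM>
  <ITEM>
    <PROD_ID>94866</PROD_ID><PROD_NAME>BINDER</PROD_NAME>
    <PRICE>7.95</PRICE><QUANTITY>26</QUANTITY>
  </ITEM>
</ORDER>
"""


def shred_order(xml_text, conn):
    """Decompose one order document into rows of two relational tables."""
    root = ET.fromstring(xml_text)

    # Assumed target schema: one row per order, one row per item.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id TEXT PRIMARY KEY, cust_id TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(order_id TEXT, prod_id TEXT, prod_name TEXT, "
        "price REAL, quantity INTEGER)"
    )

    # Attributes of the root element become columns of the orders row.
    order_id = root.get("ORDER_ID")
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)", (order_id, root.get("CUST_ID"))
    )

    # Each repeated ITEM element becomes one row; the nesting under ORDER
    # is replaced by a foreign-key column, and the tag names are not stored.
    for item in root.findall("ITEM"):
        conn.execute(
            "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
            (
                order_id,
                item.findtext("PROD_ID"),
                item.findtext("PROD_NAME"),
                float(item.findtext("PRICE")),
                int(item.findtext("QUANTITY")),
            ),
        )
    conn.commit()
```

Calling shred_order(ORDER_XML, sqlite3.connect(":memory:")) fills both tables. Note that the items rows carry no record of their position in the document, which is one way the original structure is lost.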

Example of an XML document
    <ORDER ORDER_ID="83492" CUST_ID="93457">
      <ITEM>
        <PROD_ID>94872</PROD_ID>
        <PROD_NAME>PEN</PROD_NAME>
        <PRICE>19.95</PRICE>
        <QUANTITY>30</QUANTITY>
      </ITEM>
      <ITEM>
        <PROD_ID>94866</PROD_ID>
        <PROD_NAME>BINDER</PROD_NAME>
        <PRICE>7.95</PRICE>
        <QUANTITY>26</QUANTITY>
      </ITEM>
      <ITEM>
        <PROD_ID>92219</PROD_ID>
        <PROD_NAME>LABELS</PROD_NAME>
        <PRICE>12.95</PRICE>
        <QUANTITY>250</QUANTITY>
      </ITEM>
    </ORDER>

XML document decomposed into a relation
  [Tables on the slide: Order, Items]

Another Example
  Sales Order
    <SalesOrder Number="123">
      <OrderDate>2003-07-28</OrderDate>
      <CustomerNumber>456</CustomerNumber>
      <Item Number="1">
        <PartNumber>XY47</PartNumber>
        <Quantity>14</Quantity>
        <Price>16.80</Price>
      </Item>
      <Item Number="2">
        <PartNumber>B987</PartNumber>
        <Quantity>6</Quantity>
        <Price>2.34</Price>
      </Item>
    </SalesOrder>
  [Table on the slide: Items table]

Hybrid DBMS
  Some DBMSs support both the relational model and the XML data model, a so-called "hybrid" model.
  An example of such a DBMS is IBM's DB2.

Architecture of DB2
  [Figure on the slide: architecture of DB2]

References
  (1) "Comparison of XML Model and the Relational Model." DB2 Version 9 for Linux, UNIX, and Windows. 10 April 2007. <http://publib.boulder.ibm.com/infocenter/db2luw/v9/topic/com.ibm.db2.udb.apdv.embed.doc/doc/c0023811.htm>
  (2) W3Schools. <http://www.w3schools.com/xpath/default.asp>
  (3) W3Schools. <http://www.w3schools.com/xquery/default.asp>
  (4) Coss, Rafael. "DBS 9 pureXML". IMSS_DB2pureXML_v3.ppt.
  (5) Coss, Rafael. "XML L2 Skills Transfer - template". Jan 14, 2005. XML_L2~1.ppt
  (6) Bourret, Ronald. "XML Database Products". Ronald Bourret: Consulting, writing, and research in XML and databases. March 13, 2007. <http://www.rpbourret.com/xml/XMLDatabaseProds.htm#xmlanddatabases>
  (7) W3Schools. <http://www.w3schools.com/dtd/default.asp>
  (8) Steegmans, Bart; Bourret, Ronald; Olivier, Guvennet. "XML for DB2 Information Integration". IBM Redbooks. <http://www.redbooks.ibm.com/redbooks/SG246994/wwhelp/wwhimpl/java/html/wwhelp.htm>