IT420: Database Management and Organization
XML
21 April 2006
Adina Crăiniceanu
www.cs.usna.edu/~adina

Overview
  - From HTML to XML
  - DTDs
  - Transforming XML: XSLT

Kroenke, Database Processing

Introduction
  - Database processing and document processing need each other
    - Database processing needs document processing for transmitting and expressing database views
    - Document processing needs database processing for storing and manipulating data
  - Internet expansion made the need obvious

XML
  - XML: Extensible Markup Language, developed in the late 1990s (W3C Recommendation in 1998)
  - Hybrid of document processing and database processing
  - It provides a standardized yet customizable way to describe the content of documents
  - A recommendation from the W3C
  - XML = data + structure
  - XML is generated by applications
  - XML is consumed by applications
  - Easy access across platforms and organizations

XML
  - What is this? What does it mean?
      <h2>Madison</h2>
  - HTML: how to display information in a browser
  - HTML: no “semantic” information, i.e. no meaning ascribed to tags

XML: Semantic information
  The same text means different things depending on the tags around it:
      <presidents>
        <name>Madison</name>
      </presidents>

      <US_cities>
        <Wisconsin>
          <name>Madison</name>
        </Wisconsin>
      </US_cities>

      <US_Colleges>
        <name>Madison, U of Wisc</name>
        <name>Madison, James (JMU)</name>
      </US_Colleges>

XML vs. HTML
  - XML is better than HTML for describing data because
    - It provides a clear separation between document structure, content, and materialization
    - It is standardized but allows for extension by developers
    - XML tags represent the semantics of their data

Why is XML important with regard to databases?
  - XML provides a standardized way to describe, validate, and materialize any database view
  - Share information between disparate systems
  - Materialize data any way you want
    - Display data on the web
    - Display data on a salesperson's computer
    - Display data on a mobile device

How does XML work?
Three Primary Components to XML
  - Data has a structure
    - Document Type Definitions (DTDs) and XML Schemas can be used to describe the content of XML documents
  - Data has content
    - XML document
  - Data has materializations
    - Extensible Stylesheet Language Transformations (XSLT)

If we want to share information, is structure important?
  - Structure provides meaning
  - What is the meaning of this bit stream?
      10111011000101110110100101010101101010110110101….
  - The bit stream has meaning if we assign structure

Example: XML DTD & Document

XML DTD
  - An XML document consists of two sections:
    - Document type declaration, which contains or references the DTD
      - It begins with DOCTYPE <document_type_name>
    - Document data
  - XML documents can be
    - Type-valid, if the document conforms to its DTD
    - Well-formed but not type-valid, because
      - It violates the structure of its DTD, or
      - It has no DTD
  - The DTD may be stored externally so that many documents can be validated against the same DTD

Create XML Documents from Relational DB Data
  - Most RDBMSs can output data in XML format
  - MySQL: mysql -u root --xml
  - SQL Server: SELECT . . . FOR XML RAW | AUTO, ELEMENTS | EXPLICIT

Lab exercise
  - Restore some database in MySQL
  - Open the MySQL command line using mysql -u root --xml

XSLT
  - XSLT (Extensible Stylesheet Language Transformations) is used to materialize (transform) XML documents using an XSL document
    - From XML documents into HTML or into XML in another format
  - XSLT is a declarative transformation language
  - XSLT uses stylesheets to indicate how to transform the elements of the XML document into another format

Example: External DTD
      <?xml version="1.0" encoding="UTF-8"?>
      <!ELEMENT customerlist (customer+)>
      <!ELEMENT customer (name, address)>
      <!ELEMENT name (firstname, lastname)>
      <!ELEMENT firstname (#PCDATA)>
      <!ELEMENT lastname (#PCDATA)>
      <!ELEMENT address (street+, city, state, zip)>
      <!ELEMENT street (#PCDATA)>
      <!ELEMENT city (#PCDATA)>
      <!ELEMENT state (#PCDATA)>
      <!ELEMENT zip (#PCDATA)>

Example: XML Document
      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE customerlist SYSTEM "http://localhost/Support-Files-Chap-13-XML/CustomerList.dtd">
      <?xml-stylesheet type="text/xsl" href="http://localhost/Support-Files-Chap-13-XML/CustomerList-StyleSheet.xsl"?>
      <customerlist>
        <customer>
          <name>
            <firstname>Michelle</firstname>
            <lastname>Correlli</lastname>
          </name>
          <address>
            <street>1824 East 7th Avenue</street>
            <street>Suite 700</street>
            <city>Memphis</city>
            <state>TN</state>
            <zip>32123-7788</zip>
          </address>
        </customer>
        <customer>
          <name>
            <firstname>Lynda</firstname>
            <lastname>Jaynes</lastname>
          </name>
          <address>
            <street>2 Elm Street</street>
            <city>New York City</city>
            <state>NY</state>
            <zip>02123-7445</zip>
          </address>
        </customer>
      </customerlist>

XSL Stylesheet for CustomerList

Example: XML Browser

Show XSL document example: CustomerList.xml

XML Review
  - STRUCTURE: DTD or XML Schema
  - CONTENT: XML document
  - MATERIALIZATIONS: XSL document

Sharing Data: Transparency
  [Diagram. Labels shown for each of Business A and Business B: Database, Raw data, XSL Trans, XML data, Validate DTD. Between the two businesses: SHARE, under an agreed-upon structure.]

Example XML Industry Standards
  - Accounting
    - Extensible Financial Reporting Markup Language (XFRML)
  - Architecture and Construction
    - Architecture, Engineering, and Construction XML (aecXML)
  - Automotive
    - Automotive Industry Action Group (AIAG)
    - XML for the Automotive Industry (SAE J2008)
  - Banking
    - Banking Industry Technology Secretariat (BITS)
    - Bank Internet Payment System (BIPS)
  - Electronic Data Interchange
    - Data Interchange Standards Association (DISA)
    - XML/EDI Group

What About XML Queries?
  - XPath
    - A single-document language for “path expressions”
    - Not unlike regular expressions on tags
    - E.g. /Contract/*/UnitPrice, /Contract//UnitPrice, etc.
  - XSLT
    - XPath plus a language for formatting output
  - XQuery

Conclusions
  - XML: the new universal data exchange format
  - Unlike HTML, XML = data + semantics
  - STRUCTURE: DTD or XML Schema
  - CONTENT: XML document
  - MATERIALIZATIONS: XSL document
  - More flexible than the relational model
  - More difficult to query – research
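
Trying the path expressions
  The two path expressions on the "What About XML Queries?" slide can be tried out with a short sketch. This is only an illustration under stated assumptions: the contract document below is hypothetical (only the element names Contract and UnitPrice come from the slide), and Python's standard xml.etree.ElementTree module is used, which supports a limited subset of XPath and evaluates paths relative to the root element, so /Contract/*/UnitPrice is written as ./*/UnitPrice and /Contract//UnitPrice as .//UnitPrice.

```python
import xml.etree.ElementTree as ET

# Hypothetical contract document. Only the element names Contract and
# UnitPrice appear on the slide; Item, Service, Bundle, Name and all the
# values are made up for illustration.
CONTRACT_XML = """
<Contract>
  <Item>
    <Name>Widget</Name>
    <UnitPrice>9.50</UnitPrice>
  </Item>
  <Service>
    <Name>Installation</Name>
    <UnitPrice>120.00</UnitPrice>
  </Service>
  <Bundle>
    <Item>
      <Name>Gadget</Name>
      <UnitPrice>4.25</UnitPrice>
    </Item>
  </Bundle>
</Contract>
"""

root = ET.fromstring(CONTRACT_XML)  # root is the <Contract> element

# /Contract/*/UnitPrice: UnitPrice elements exactly two levels below Contract
# (a child of any direct child of Contract).
grandchild_prices = [e.text for e in root.findall("./*/UnitPrice")]

# /Contract//UnitPrice: UnitPrice elements at any depth below Contract.
all_prices = [e.text for e in root.findall(".//UnitPrice")]

print(grandchild_prices)  # ['9.50', '120.00']
print(all_prices)         # ['9.50', '120.00', '4.25']
```

  The nested Gadget price is found only by the second expression, because * matches exactly one level while // matches any number of levels.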