XML and Oracle: An Overview
Roger Schrag
Database Specialists, Inc.
www.dbspecialists.com

XML and Oracle: An Overview
• XML Basics
• XML’s Potential
• Support for XML in Oracle Products

What Is XML?
Extensible Markup Language
• A standard for representing structured data in human-readable text form
• Any type of data can be represented in XML
• Syntax uses open and close tags similar to HTML
• Use tags common in your industry or make up your own

XML Basics
• XML Documents
• Document Type Definitions
• Document Object Model
• Simple API for XML
• Transformations

XML Documents
An XML document is one logical unit of data marked up in XML, such as a purchase order or a stock quote.
An XML datagram is a packet of data containing an XML document that is being transported between systems.
An XML document is said to be well formed if it adheres to all of the syntax rules of XML.

A Sample XML Document
<?xml version="1.0"?>
<!DOCTYPE drink-recipe SYSTEM "drink-recipe.dtd">
<drink-recipe name="Fuzzy Navel">
  <ingredients>
    <ingredient quantity="1" unit="ounce">
      Vodka
    </ingredient>
    <ingredient quantity="1" unit="ounce">
      Peach schnapps
    </ingredient>
    <ingredient quantity="4" unit="ounce">
      Orange juice
    </ingredient>
  </ingredients>
  <preparation>
    <step>
      Pour ingredients into a highball glass almost filled with ice.
    </step>
    <step>
      Stir.
    </step>
  </preparation>
</drink-recipe>

Document Type Definition (DTD)
A roadmap for how to interpret a specific type of XML document:
• What tags are allowed
• What attributes are allowed within each tag
• Which elements are required and which are optional
• Which tags may be nested inside of other tags

A Sample DTD
<!ELEMENT drink-recipe (ingredients, preparation)>
<!ATTLIST drink-recipe name CDATA #IMPLIED>
<!ELEMENT ingredients (ingredient+)>
<!ELEMENT ingredient (#PCDATA)>
<!ATTLIST ingredient quantity CDATA #IMPLIED
                     unit     CDATA #IMPLIED>
<!ELEMENT preparation (step+)>
<!ELEMENT step (#PCDATA)>

What Can You Do With An XML Document?
Anything you can do with a plain text file:
• Edit it with vi or Notepad
• Move it between servers with FTP or HTTP
• Store it in a VARCHAR2 or CLOB column in your Oracle database

What Else Can You Do With An XML Document?
• Store it in a SYS.xmltype column in your Oracle 9i database
• View it with a web browser (IE 5 or Netscape 6)
• View and edit it with JDeveloper
• Validate it against a DTD

Document Object Model (DOM)
An API for querying and updating XML documents
• Uses a tree structure known as a document’s infoset
• Extract the infoset from an XML document
• Query the infoset using a path-based query language called XPath
• Make changes to the infoset
• Write the infoset back to an XML document
A “tree-based” API

Simple API for XML (SAX)
An API for scanning XML documents
• Documents are represented as a linear sequence of parse events
• Events occur at the start and end of elements and text
• Application provides custom event handlers
• Application code gets executed at specified events in the document
An “event-based” API

Extensible Stylesheet Language Transformations (XSLT)
A process for transforming XML documents:
• From one DTD to another
• Between XML and other formats such as HTML or proprietary flat file formats
An XML document written in the XSLT vocabulary (a stylesheet) defines the transformation.

XML’s Potential
Why all the excitement over XML?
• Strict yet extensible standards
• XML + HTTP

XML Standards
World Wide Web Consortium (W3C) recommendations:
• The XML 1.0 specification (1998) defines XML and DTD syntax
• DOM, XPath, and XSLT are covered by separate specifications (DOM Level 1 in 1998; XPath 1.0 and XSLT 1.0 in 1999)

Standards Both Strict and Extensible
Strict:
• Unambiguous and unforgiving rules leave little to the imagination
• Vendor neutral, platform neutral, language neutral
Extensible:
• New industry-specific DTDs are being developed all the time
• XSLT makes it practical for organizations to develop their own custom DTDs

The Synergy Between XML and HTTP
• HTTP is now commonplace for moving content between systems without concern for the vendor or platform of the sender or recipient.
• Since XML documents are plain text, they can easily be transported via HTTP.
• While HTTP and HTML make it easy to transport simple content, HTTP and XML together make it easy to transport data of any structure and complexity.

The Value of XML: The Bottom Line
XML enables you to publish your complex data in the same way that HTML enables you to publish presentation content.
• Vendor and platform independence in the XML standard enables data transfer between disparate systems.
• DTDs and XSLT facilitate converting published data from one format to any other.
• XML allows you to decouple the data from the presentation.
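
The decoupling of data from presentation can be sketched in a few lines. The example below is an illustration only, under stated assumptions: it uses Python's standard xml.dom.minidom module rather than Oracle's XDK, and a hand-written function (the hypothetical recipe_to_html) stands in for an XSLT stylesheet. It reads a drink-recipe document like the sample shown earlier, with the DOCTYPE line omitted so that no DTD file is needed, and renders it as an HTML fragment.

```python
from html import escape
from xml.dom import minidom

# Same data as the sample drink-recipe document, minus the DOCTYPE line.
RECIPE_XML = """<?xml version="1.0"?>
<drink-recipe name="Fuzzy Navel">
  <ingredients>
    <ingredient quantity="1" unit="ounce">Vodka</ingredient>
    <ingredient quantity="1" unit="ounce">Peach schnapps</ingredient>
    <ingredient quantity="4" unit="ounce">Orange juice</ingredient>
  </ingredients>
  <preparation>
    <step>Pour ingredients into a highball glass almost filled with ice.</step>
    <step>Stir.</step>
  </preparation>
</drink-recipe>"""


def element_text(element):
    """Return the concatenated text children of a DOM element, trimmed."""
    return "".join(
        node.data
        for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    ).strip()


def recipe_to_html(xml_text):
    """Hypothetical helper: render a drink-recipe document as an HTML fragment."""
    # Parse the text into a DOM tree and take its root element.
    recipe = minidom.parseString(xml_text).documentElement

    lines = ["<h1>%s</h1>" % escape(recipe.getAttribute("name")), "<ul>"]
    # Walk the tree: one list item per <ingredient> element.
    for item in recipe.getElementsByTagName("ingredient"):
        lines.append("<li>%s %s %s</li>" % (
            escape(item.getAttribute("quantity")),
            escape(item.getAttribute("unit")),
            escape(element_text(item)),
        ))
    lines.append("</ul>")

    lines.append("<ol>")
    # One numbered item per <step> element.
    for step in recipe.getElementsByTagName("step"):
        lines.append("<li>%s</li>" % escape(element_text(step)))
    lines.append("</ol>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(recipe_to_html(RECIPE_XML))
```

Changing the page layout means editing only recipe_to_html; the XML data stays untouched. In practice an XSLT stylesheet would usually play the role of this function.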

Support for XML in Oracle Products
• XML Developer Kit
  – XML Parser
  – XSLT Processor
  – XSQL Pages
  – XML SQL Utility
• Oracle 9i SYS.xmltype Datatype
• Oracle Text
• JDeveloper
• PLSXML

XML Developer Kit (XDK)
A single Oracle module that XML-enables your Oracle database
Features based on XML standards:
  – XML parser
  – DOM and SAX support
  – XSLT processor
Oracle-specific features:
  – XSQL pages
  – XML SQL utility

XDK Availability and Compatibility
• Installs automatically with Oracle 9i and Oracle 8i Release 3 (8.1.7) databases
• Available for Oracle 8i Release 1 and 2 from the Oracle Technology Network at technet.oracle.com
• Not available for Oracle7 or Oracle8
Oracle’s XDK is evolving rapidly. Check OTN periodically to see if a newer version of the XDK is available for download.

XDK Supported Languages
Oracle’s XDK XML-enables applications written in:
  – Java
  – Java Beans
  – PL/SQL
  – C and C++
Install a separate XDK for each language.

XDK Fun Facts
• Java applications can run inside or outside the database.
• You must install Oracle’s JVM in the database in order to run PL/SQL applications that use Oracle’s XDK.
• Some XDK features (such as SAX support and XSQL pages) are only available in the XDK for Java.

Features Based on XML Standards
XML Parser, DOM, SAX, XSLT Processor
• Multitude of Java classes.
• PL/SQL packages such as xmlparser and xmldom. These are really PL/SQL wrappers encapsulating Java code.
• Command line utilities such as oraxml and oraxsl. These are really shell script wrappers encapsulating Java code.

XSQL Pages (Java XDK only)
A facility for quickly publishing data in XML
• Prepare an XML document encapsulating a SQL query using the XSQL DTD.
• Call a URL or enter a command at the operating system prompt to invoke the XSQL page processor.
• Pass in criteria for the query in the URL or as command-line arguments.
• An XML document is created based on the query results.
• An XSLT stylesheet can be applied to the query results to transform the output to HTML, a different DTD, or any format desired.

Components of the XSQL Pages Framework
• Java servlet that runs under Apache (Oracle 9iAS, Oracle 9i database, or Oracle 8i Release 3 database)
• Command-line program called xsql
• XSQL page processor that gets called by either of the above

Sample XSQL Page Output
<ROWSET>
  <ROW num="1">
    <ENAME>King</ENAME>
    <EMPNO>7839</EMPNO>
    <JOB>President</JOB>
  </ROW>
  <ROW num="2">
    <ENAME>Blake</ENAME>
    <EMPNO>7698</EMPNO>
    <JOB>Manager</JOB>
  </ROW>
</ROWSET>

XML SQL Utility
A facility for loading XML documents into the database and retrieving data from the database into XML documents, without storing the XML text in one large CLOB column
• Apply an XSLT stylesheet to transform data from any format into a <ROWSET><ROW> style XML document.
• Convert the <ROWSET><ROW> document into a SQL INSERT statement and load the data into a table.
• Capabilities exist for updating and deleting data as well.
• Extract data from the database into a document of any format by reversing the process.

Oracle 9i SYS.xmltype Datatype
A new datatype that you can use on columns in tables in an Oracle 9i database
• Oracle creates a hidden CLOB column in the table and stores the XML document there.
• You access the XML document in ordinary SQL statements using built-in member functions of the SYS.xmltype datatype such as createXML, extract, or existsNode.
• You can insert and update XML documents as a whole, and even reference them in the WHERE clause.
• You cannot piece-wise update an XML document.

Oracle Text (interMedia)
• An Oracle facility for searching text documents stored in CLOBs, BFILEs, or referenced by URLs.
• Adds new SQL functions CONTAINS and SCORE.
• Includes support for many document types, and linguistic capabilities such as stemming and fuzzy matching.
• Text indexes can be created in Oracle 9i on SYS.xmltype columns in order to index XML documents for intelligent, XML-aware searching. New SQL functions such as HASPATH and INPATH become available.

JDeveloper
Oracle’s interactive application development environment
XML-aware capabilities:
• XML document editing and syntax checking
• XSLT manipulation
• XSQL page development and viewing

PLSXML
A simple PL/SQL package that returns the result of a SQL query as an XML document
• Download PLSXML from technet.oracle.com (search for “PLSXML”).
• A very simple script that demonstrates converting table data to XML documents; it probably has little value beyond demonstration.
• Oracle’s one XML offering for Oracle7 and Oracle8 users.

Wrapping Up
• XML is a platform-independent, vendor-independent method for transporting structured data.
• XML is defined by rigid yet extensible standards.
• Oracle has shown a huge commitment to XML support in the Oracle 8i and Oracle 9i databases.

Further Reading
• “Building Oracle XML Applications” from O’Reilly by Steve Muench
• “Oracle 9i Application Developer's Guide – XML” in the Oracle 9i server documentation set
• http://technet.oracle.com/tech/xml (XML home page on Oracle Technology Network)
• http://www.w3.org/XML (various XML specifications including the XML 1.0 specification)
• http://www.xml.org (Registry of XML schemas, applications, and resources)
• http://www.xml.com (Collection of articles and information about XML, co-founded by Tim Bray, one of the editors of the XML 1.0 standard)
• http://www.orafaq.org/faqxml.htm (XML Oracle FAQ)

Contact Information
Roger Schrag
rschrag@dbspecialists.com
http://www.dbspecialists.com
Database Specialists, Inc.
388 Market Street, Suite 400
San Francisco, CA 94111
415-344-0500