XQuery
XML Databases
Roger L. Costello
16 June 2010

XQuery Databases
• XQuery is an XML query language.
• It can be used to efficiently and easily extract information from Native XML Databases (NXD).
• It can be used to query XML views of relational data.

Database Systems supporting XQuery
• The following database systems offer XQuery support:
  – Native XML Databases:
    • Berkeley DB XML
    • eXist
    • MarkLogic
    • Software AG Tamino
    • Raining Data TigerLogic
    • Documentum xDb (X-Hive/DB)
  – Relational Databases:
    • IBM DB2
    • Microsoft SQL Server
    • Oracle

Native XML Database (NXD)
• Native XML databases have an XML-based internal model, i.e., their fundamental unit of storage is XML.
• To enable Oxygen XML to use a database, you need to download the database's .jar files and then point Oxygen to them.
• For instructions on configuring each database, select the Oxygen Help menu and search for this: Native XML Database (NXD) Support

Use Cases for Native XML Databases
Table of contents
•1.0 Got XML?
•2.0 A very quick review of native XML databases
•3.0 Storing and querying document-centric XML
•3.1 Documents in the real world
•3.2 Inside the applications
•3.2.1 Managing documents
•3.2.2 Finding documents
•3.2.3 Retrieving information
•3.2.4 Reusing content
•3.3 Why you need a native XML database
•3.4 A peek into the future
•4.0 Data integration
•4.1 Data integration in the real world
•4.2 Inside the applications
•4.2.1 Query architectures
•4.2.2 Handling differences in schemas
•4.3 Why you need a native XML database
•4.4 A peek into the future
•5.0 Working with semi-structured data
•5.1 Semi-structured data in the real world
•5.2 Inside the applications
•5.3 Why you need a native XML database
•5.4 A peek into the future
•6.0 Schema evolution
•6.1 Schema evolution in the real world
•6.2 Inside the applications
•6.3 Why you need a native XML database
•6.4 A peek into the future
•7.0 Long-running transactions
•8.0 Handling large documents
•9.0 Hierarchical data
•10.0 Other uses
•11.0 A final peek into the future
•12.0 Conclusion
•13.0 Thanks
•14.0 Resources
http://www.rpbourret.com/xml/UseCases.htm

Table of Contents
•1.0 Introduction
•2.0 Is XML a Database?
•3.0 Why Use a Database?
•4.0 Data versus Documents
•4.1 Data-Centric Documents
•4.2 Document-Centric Documents
•4.3 Data, Documents, and Databases
•5.0 Storing and Retrieving Data
•5.1 Mapping Document Schemas to Database Schemas
•5.1.1 Table-Based Mapping
•5.1.2 Object-Relational Mapping
•5.2 Query Languages
•5.2.1 Template-Based Query Languages
•5.2.2 SQL-Based Query Languages
•5.2.3 XML Query Languages
•5.3 Storing Data in a Native XML Database
•5.4 Data Types, Null Values, Character Sets, and All That Stuff
•5.4.1 Data Types
•5.4.2 Binary Data
•5.4.3 Null Data
•5.4.4 Character Sets
•5.4.5 Processing Instructions and Comments
•5.4.6 Storing Markup
•5.5 Generating XML Schemas from Relational Schemas and Vice Versa
•6.0 Storing and Retrieving Documents
•6.1 Storing Documents in the File System
•6.2 Storing Documents in BLOBs
•6.3 Native XML Databases
•6.3.1 What is a Native XML Database?
•6.3.2 Native XML Database Architectures
•6.3.2.1 Text-Based Native XML Databases
•6.3.2.2 Model-Based Native XML Databases
•6.3.3 Features of Native XML Databases
•6.3.3.1 Document Collections
•6.3.3.2 Query Languages
•6.3.3.3 Updates and Deletes
•6.3.3.4 Transactions, Locking, and Concurrency
•6.3.3.5 Application Programming Interfaces (APIs)
•6.3.3.6 Round-Tripping
•6.3.3.7 Remote Data
•6.3.3.8 Indexes
XML and Databases
http://www.rpbourret.com/xml/XMLAndDatabases.htm

MarkMail -- An email search site built on XQuery
http://markmail.org/

eXist
• It is a native XML database
• It is open source
• More info: http://exist-db.org
• Want to see some sample XQueries run against an eXist database? Check out the eXist sandbox: http://demo.exist-db.org/exist/sandbox/sandbox.xql
• In the drop-down box labeled "Paste example" select "Find books by author". Examine the XQuery then press the Send button.

Installing eXist
• In your C: folder create a new folder, eXist (C:\eXist)
• In the XQuery folder is a subfolder, eXist-exe-file. Open it. Double click on the .exe file.
• The second prompt will ask: Select the installation path. Enter: C:\eXist
• As the password for user ‘admin’ enter: admin
• Select these packs to install: Core, Javadocs (we don’t need Source)

Running eXist
• After it’s finished installing, start the database:
  – Start -> All Programs -> eXist XML Database -> eXist Database Startup (alternatively, double click startup.bat in c:\eXist\bin)
  – Minimize the DOS window that is generated by starting the database (don’t close it)
  – Open a browser and enter: http://localhost:8080/exist/index.xml
• View the Admin page. The link to it is on the left side. You need to scroll down to see the link. Remember the password is: admin.

How to configure an eXist datasource in Oxygen
1. Open Oxygen. Go to Options -> Preferences -> Data Sources.
2. In the Data Sources panel click the New button. Enter this name for this data source: eXist-DB
3. Select eXist from the driver type combo box. Press the Add button and add the following eXist specific files. They are located in the eXist installation root directory (c:\eXist):
   1. exist.jar
   2. lib/core/xmldb.jar
   3. lib/core/xmlrpc-client-3.1.1.jar
   4. lib/core/xmlrpc-common-3.1.1.jar
   5. lib/core/ws-commons-util-1.0.2.jar
4. Click OK to finish the data source configuration.

How to configure an eXist connection in Oxygen
1. Open Oxygen. Go to Options -> Preferences -> Data Sources.
2. In the Connections panel click the New button. Enter this name for this connection: eXist-Connection
3. Select eXist-DB from the Data Source combo box. Fill in the Connection Details:
   1. XML DB URI: use this value: xmldb:exist://localhost:8080/exist/xmlrpc
   2. User: use this value: guest
   3. Password: use this value: guest
   4. Collection: use this value: /db
4. Click OK to finish the connection configuration.

eXist Collections
• eXist organizes all documents in hierarchical collections.
• Collections are like directories. They are used to group related documents together.
• On the last slide you set the default collection name to /db

Add a new Collection to the Database
• Switch on Oxygen's Database Perspective: Perspective -> Database
• Right-click on db, and select Add a collection…
• Type in example01

Put a Resource (XML File) into the Collection
• Right-click on example01, and select Add Resource …
• Select FitnessCenter.xml in the xquery/eXist-examples/example01 folder

Editing Resources
• Double clicking on FitnessCenter.xml allows you to view and edit it.

Run an XQuery on the XML
• Drag and drop FitnessCenter.xq into Oxygen. Modify the XQuery to use the doc() function (i.e., use an explicit input).

Specify the Transformer
• Click on the wrench icon then click on the edit button.
• Select eXist-Connection in the Transformer drop-down box.

Execute the XQuery
• Click on the red triangle, then view the results.

Remember to Restart eXist
• Suppose that you shut down your computer for the day. Tomorrow you open Oxygen and try to run an XQuery on FitnessCenter.xml in the eXist database.
• You get an error. Why?
• Answer: you need to restart eXist before you open Oxygen.
  – Start -> All Programs -> eXist XML Database -> eXist Database Startup (alternatively, double click startup.bat in c:\eXist\bin)

XQuery Update in eXist
• eXist supports XQuery update in a slightly nonstandard way.
• You must precede each update operation with update, e.g.,
    for $i in doc('FitnessCenter.xml')//(*|@*)
    return update rename node $i as lower-case(name($i))

Do Lab1
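
To see what the rename query above is meant to do without a running eXist server, here is a minimal Python sketch that lower-cases every element and attribute name in a small document. The sample XML is hypothetical (the slides do not show the contents of FitnessCenter.xml), and the code uses Python's standard xml.etree.ElementTree rather than eXist or XQuery, so it only illustrates the intended result, not how eXist performs the update.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real FitnessCenter.xml is not shown
# in the slides, so these element and attribute names are assumptions.
SAMPLE = """<FitnessCenter>
  <Member Level="platinum">
    <Name>Jeff</Name>
  </Member>
</FitnessCenter>"""


def lower_case_names(root):
    """Rename every element and attribute to its lower-case name, in place."""
    for elem in root.iter():  # visits the root element and all descendants
        elem.tag = elem.tag.lower()
        # Rebuild the attribute mapping with lower-cased attribute names.
        renamed = {name.lower(): value for name, value in elem.attrib.items()}
        elem.attrib.clear()
        elem.attrib.update(renamed)
    return root


root = lower_case_names(ET.fromstring(SAMPLE))
result = ET.tostring(root, encoding="unicode")
print(result)
```

Like //(*|@*) in the query, the loop touches every element and every attribute; the element content and attribute values are left unchanged.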