Querying XML Databases

XQuery XML Databases
Roger L. Costello
16 June 2010
XQuery Databases
• XQuery is an XML query language.
• It can be used to efficiently and easily
extract information from Native XML
Databases (NXD).
• It can be used to query XML views of
relational data.
Database Systems supporting XQuery
• The following database systems offer XQuery support:
– Native XML Databases:
• Berkeley DB XML
• eXist
• MarkLogic
• Software AG Tamino
• Raining Data TigerLogic
• Documentum xDb (X-Hive/DB)
– Relational Databases:
• IBM DB2
• Microsoft SQL Server
• Oracle
Native XML Database (NXD)
• Native XML databases have an XML-based
internal model, i.e., their fundamental unit of
storage is XML.
• To enable Oxygen XML to use a database, you
need to download the database's .jar files and
then point Oxygen to them.
• For instructions on configuring each database,
select the Oxygen Help menu and search for
this: Native XML Database (NXD) Support
Use Cases for Native XML Databases
Table of contents
•1.0 Got XML?
•2.0 A very quick review of native XML databases
•3.0 Storing and querying document-centric XML
•3.1 Documents in the real world
•3.2 Inside the applications
•3.2.1 Managing documents
•3.2.2 Finding documents
•3.2.3 Retrieving information
•3.2.4 Reusing content
•3.3 Why you need a native XML database
•3.4 A peek into the future
•4.0 Data integration
•4.1 Data integration in the real world
•4.2 Inside the applications
•4.2.1 Query architectures
•4.2.2 Handling differences in schemas
•4.3 Why you need a native XML database
•4.4 A peek into the future
•5.0 Working with semi-structured data
•5.1 Semi-structured data in the real world
•5.2 Inside the applications
•5.3 Why you need a native XML database
•5.4 A peek into the future
•6.0 Schema evolution
•6.1 Schema evolution in the real world
•6.2 Inside the applications
•6.3 Why you need a native XML database
•6.4 A peek into the future
•7.0 Long-running transactions
•8.0 Handling large documents
•9.0 Hierarchical data
•10.0 Other uses
•11.0 A final peek into the future
•12.0 Conclusion
•13.0 Thanks
•14.0 Resources
http://www.rpbourret.com/xml/UseCases.htm
Table of Contents
•1.0 Introduction
•2.0 Is XML a Database?
•3.0 Why Use a Database?
•4.0 Data versus Documents
•4.1 Data-Centric Documents
•4.2 Document-Centric Documents
•4.3 Data, Documents, and Databases
•5.0 Storing and Retrieving Data
•5.1 Mapping Document Schemas to Database Schemas
•5.1.1 Table-Based Mapping
•5.1.2 Object-Relational Mapping
•5.2 Query Languages
•5.2.1 Template-Based Query Languages
•5.2.2 SQL-Based Query Languages
•5.2.3 XML Query Languages
•5.3 Storing Data in a Native XML Database
•5.4 Data Types, Null Values, Character Sets, and All That Stuff
•5.4.1 Data Types
•5.4.2 Binary Data
•5.4.3 Null Data
•5.4.4 Character Sets
•5.4.5 Processing Instructions and Comments
•5.4.6 Storing Markup
•5.5 Generating XML Schemas from Relational Schemas and Vice Versa
•6.0 Storing and Retrieving Documents
•6.1 Storing Documents in the File System
•6.2 Storing Documents in BLOBs
•6.3 Native XML Databases
•6.3.1 What is a Native XML Database?
•6.3.2 Native XML Database Architectures
•6.3.2.1 Text-Based Native XML Databases
•6.3.2.2 Model-Based Native XML Databases
•6.3.3 Features of Native XML Databases
•6.3.3.1 Document Collections
•6.3.3.2 Query Languages
•6.3.3.3 Updates and Deletes
•6.3.3.4 Transactions, Locking, and Concurrency
•6.3.3.5 Application Programming Interfaces (APIs)
•6.3.3.6 Round-Tripping
•6.3.3.7 Remote Data
•6.3.3.8 Indexes
XML and Databases
http://www.rpbourret.com/xml/XMLAndDatabases.htm
MarkMail -- An email search site
built on XQuery
http://markmail.org/
eXist
• It is a native XML database.
• It is open source.
• More info: http://exist-db.org
• Want to see some sample XQueries run against an eXist database?
Check out the eXist sandbox:
http://demo.exist-db.org/exist/sandbox/sandbox.xql
• In the drop-down box labeled "Paste example"
select "Find books by author". Examine the
XQuery then press the Send button.
Installing eXist
• On your C: drive create a new folder named eXist (C:\eXist).
• In the XQuery folder is a subfolder, eXist-exe-file. Open it. Double
click on the .exe file.
• The second prompt will ask: Select the installation path.
Enter: C:\eXist
• As the password for user ‘admin’ enter: admin
• Select these packs to install: Core, Javadocs (we don’t need
Source)
Running eXist
• After it’s finished installing, start the database:
– Start -> All Programs -> eXist XML Database -> eXist Database Startup
(alternatively, double click startup.bat in c:\eXist\bin)
– Minimize the DOS window that is generated by starting the database
(don’t close it)
– Open a browser and enter: http://localhost:8080/exist/index.xml
• View the Admin page. The link to it is on the left side. You need to
scroll down to see the link. Remember the password is: admin.
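Instead of opening a browser, the same check can be scripted. The following is a minimal sketch, assuming eXist listens at the URL given above; the function name exist_is_running is made up for this illustration and is not part of eXist or Oxygen.

```python
import http.client
import urllib.request

# URL taken from the slide; adjust it if eXist listens on another port.
EXIST_HOME_URL = "http://localhost:8080/exist/index.xml"


def exist_is_running(url: str = EXIST_HOME_URL, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` with a success status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or an HTTP error status (4xx/5xx).
        return False


if __name__ == "__main__":
    if exist_is_running():
        print("eXist is running")
    else:
        print("eXist is not reachable; start the database first")
```

A True result only says that something answered at that address, so it is a quick sanity check rather than proof that the database is fully started.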
How to configure an eXist datasource in Oxygen
1. Open Oxygen. Go to Options -> Preferences -> Data Sources. In the Data Sources panel click the New button. Enter this name for this data source: eXist-DB
2. Select eXist from the driver type combo box.
3. Press the Add button and add the following eXist specific files. They are located in the eXist installation root directory (c:\eXist):
   1. exist.jar
   2. lib/core/xmldb.jar
   3. lib/core/xmlrpc-client-3.1.1.jar
   4. lib/core/xmlrpc-common-3.1.1.jar
   5. lib/core/ws-commons-util-1.0.2.jar
4. Click OK to finish the data source configuration.
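If Oxygen cannot load the driver, a likely cause is that one of the .jar files is missing or has a different version number in its name. The sketch below lists which of the files named above are absent; the helper missing_jars is hypothetical, and the file names are simply the ones from this slide, so they may differ for other eXist versions.

```python
from pathlib import Path
from typing import List

# Relative paths as listed on the slide (eXist installed in c:\eXist).
REQUIRED_JARS = [
    "exist.jar",
    "lib/core/xmldb.jar",
    "lib/core/xmlrpc-client-3.1.1.jar",
    "lib/core/xmlrpc-common-3.1.1.jar",
    "lib/core/ws-commons-util-1.0.2.jar",
]


def missing_jars(exist_root: str) -> List[str]:
    """Return the required .jar paths that are not files under exist_root."""
    root = Path(exist_root)
    return [jar for jar in REQUIRED_JARS if not (root / jar).is_file()]


if __name__ == "__main__":
    missing = missing_jars(r"c:\eXist")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All driver files found")
```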
How to configure an eXist connection in Oxygen
1. Open Oxygen. Go to Options -> Preferences -> Data Sources. In the Connections panel click the New button. Enter this name for this connection: eXist-Connection
2. Select eXist-DB from the Data Source combo box.
3. Fill in the Connection Details:
   1. XML DB URI: xmldb:exist://localhost:8080/exist/xmlrpc
   2. User: guest
   3. Password: guest
   4. Collection: /db
4. Click OK to finish the connection configuration.
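The XML DB URI packs several settings into one string. As a rough illustration, the sketch below takes the URI from this slide apart; the function split_xmldb_uri and the key names in its result are invented for this example, and the reading of the URI as xmldb:<database>://host:port/path is an assumption about its layout.

```python
from urllib.parse import urlsplit

# Value from the connection details on the slide.
XMLDB_URI = "xmldb:exist://localhost:8080/exist/xmlrpc"


def split_xmldb_uri(uri: str) -> dict:
    """Split a URI of the assumed form xmldb:<database>://host:port/path."""
    prefix, sep, rest = uri.partition(":")
    if prefix != "xmldb" or not sep:
        raise ValueError("not an XML:DB URI: " + uri)
    # rest now looks like exist://localhost:8080/exist/xmlrpc
    parts = urlsplit(rest)
    return {
        "database": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


if __name__ == "__main__":
    for key, value in split_xmldb_uri(XMLDB_URI).items():
        print(key, "=", value)
```

The host and port shown here are the same ones used to open the eXist start page in the browser, so a connection failure in Oxygen usually means the database is not running at that address.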
eXist Collections
• eXist organizes all documents in
hierarchical collections.
• Collections are like directories. They are
used to group related documents together.
• On the last slide you set the default
collection name to /db
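Because collections behave like directories, a document is addressed by a slash-separated path that starts at /db. The sketch below only manipulates such path strings and does not talk to eXist; the helper names are hypothetical, and the path /db/example01/FitnessCenter.xml is used purely as an illustration.

```python
import posixpath
from typing import List, Tuple


def split_resource_path(path: str) -> Tuple[str, str]:
    """Split a resource path into (collection, resource name)."""
    collection, resource = posixpath.split(path)
    return collection, resource


def collection_chain(collection: str) -> List[str]:
    """List a collection together with all of its parent collections."""
    parts = [part for part in collection.split("/") if part]
    return ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    collection, resource = split_resource_path("/db/example01/FitnessCenter.xml")
    print("collection:", collection)
    print("resource:", resource)
    print("hierarchy:", collection_chain(collection))
```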
Add a new Collection
to the Database
• Switch on Oxygen's Database Perspective:
Perspective -> Database
• Right-click on db, and select Add a collection…
• Type in the name of the new collection: example01
Put a Resource (XML File)
into the Collection
• Right-click on example01, and select Add Resource …
• Select FitnessCenter.xml in the
xquery/eXist-examples/example01 folder
Editing Resources
• Double clicking on FitnessCenter.xml allows you to
view and edit it:
Run an XQuery on the XML
• Drag and drop FitnessCenter.xq into Oxygen. Modify the
XQuery to use the doc() function (i.e., use an explicit
input):
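An explicit input means the query itself names the document through doc(), so it does not depend on whatever file happens to be open in the editor. As an illustration outside Oxygen, the sketch below wraps such a query in a URL for eXist's REST interface. The base path /exist/rest and the _query parameter are assumptions about the local eXist installation, the document path assumes the collection and resource created on the earlier slides, and build_query_url is a made-up helper.

```python
from urllib.parse import urlencode

# Assumed base URL of eXist's REST interface on the local installation.
REST_BASE = "http://localhost:8080/exist/rest"

# An XQuery with an explicit input: doc() names the stored document,
# and /* selects its root element whatever that element is called.
QUERY = "doc('/db/example01/FitnessCenter.xml')/*"


def build_query_url(xquery: str, collection: str = "/db") -> str:
    """Return a URL that asks eXist to evaluate `xquery` against `collection`."""
    return REST_BASE + collection + "?" + urlencode({"_query": xquery})


if __name__ == "__main__":
    # Paste the printed URL into a browser while eXist is running.
    print(build_query_url(QUERY))
```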
Specify the Transformer
• Click on the wrench icon then click on the edit button:
• Select eXist-Connection in the Transformer
drop-down box:
Execute the XQuery
• Click on the red triangle:
View the Results
Remember to Restart eXist
• Suppose that you shut down your
computer for the day. Tomorrow you open
Oxygen and try to run an XQuery on
FitnessCenter.xml in the eXist database.
• You get an error. Why?
• Answer: you need to restart eXist before
you open Oxygen.
– Start -> All Programs -> eXist XML Database -> eXist Database Startup
(alternatively, double click startup.bat in c:\eXist\bin)
XQuery Update in eXist
• eXist supports XQuery update in a slightly nonstandard way.
• You must precede each update operation with the keyword update, e.g.,
    for $i in doc('FitnessCenter.xml')//(*|@*)
    return update rename node $i as lower-case(name($i))
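To see what this update does to a document, the sketch below applies the same renaming (every element and attribute name converted to lower case) to a small in-memory document. It is only an illustration of the effect, not how eXist performs the update; the sample XML is made up and the real FitnessCenter.xml may look different. The sample has no namespaces, which this simple approach does not handle.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the real FitnessCenter.xml may look different.
SAMPLE = (
    '<FitnessCenter><Member Level="platinum">'
    "<Name>Jeff</Name></Member></FitnessCenter>"
)


def lowercase_names(root: ET.Element) -> ET.Element:
    """Lower-case every element name and attribute name in the tree."""
    for element in root.iter():
        element.tag = element.tag.lower()
        # Rebuild the attribute mapping with lower-cased names.
        renamed = {name.lower(): value for name, value in element.attrib.items()}
        element.attrib.clear()
        element.attrib.update(renamed)
    return root


if __name__ == "__main__":
    root = lowercase_names(ET.fromstring(SAMPLE))
    print(ET.tostring(root, encoding="unicode"))
```

Element content such as Jeff and attribute values such as platinum are left untouched; only the names change.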
Do Lab1