XML Movement in Oracle

Tutorial Preparation
From existing relational tables to XML
  Scenario One
    SQL Solution
    Activity One
    Activity Two
  Scenario Two
    PL/SQL Solution
    Activity Three
  Scenario Three
    Activity Four
From XML to Oracle
  Activity Five
  Scenario One
    Activity Six
  Scenario Two
  Scenario Three
    Activity Seven
    PLSQL Solution
    Activity Eight
There is more!
Sample Solutions
  Activity One
  Activity Two
  Activity Three
  Activity Four
  Activity Five
  Activity Six
  Activity Eight
Tutorial Preparation
The examples and suggested activities call for you to use CDS and Composers tables. If
you have completed either the Performance or the Moving Data tutorials you may well
have these tables already. Just do a select count(*) from both to make sure you have more
than a few rows in each.
If you haven’t got these tables, you can run these two scripts:
1. table_create.sql
2. ShortCDInserts.sql (Yes, there are fewer rows here than in the performance version!)
In order to carry out server-side processing, which this tutorial covers, you need to have
write privileges to a directory that the Oracle instance can see. SHU10g runs on IVY,
which is a Sun Solaris box that also hosts student homespace (your F:\ drive).
For obvious reasons our DBA is not prepared to grant DBA privileges to us mere mortals.
However, we have set up a utility that will allow you to have a folder on IVY which you
can access, through Windows, as a part of your homespace, but which actually maps back
to an Oracle DIRECTORY object (see the online documentation).
In order to set this DIRECTORY up you must run a Unix utility. In a SHU lab you can
follow these steps. If you are at home, you will need to connect to Ivy using Telnet
through your ISP and carry on from item 4:
1. From the START button select Internet Tools, Tera Term Pro.
2. The host should read: ivy.shu.ac.uk
3. Click on OK
4. Log in using your normal network login details
5. At the Unix prompt type: /usr/local/bin/oracle_dir
6. Press Enter to run the script
7. Now type ls (lowercase L, then lowercase S) and press Enter
8. Check that you now have a directory called oracle_work
9. Type exit to close your Unix session
This process has created a Directory Object which maps to this directory in the user's
home directory on ivy and the F: drive: /homedir/oracle_work
In Oracle, the Directory object which points to this folder has the following name:
USERNAME_ORACLE_WORK
For example: CMSPL4_ORACLE_WORK for the author’s directory.
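If you want to confirm from the SQL> prompt that the Directory object exists, a quick check along the following lines should work. This is only a sketch: it assumes your account can see the ALL_DIRECTORIES data dictionary view, and CMSPL4 is the author's username, so substitute your own.

```sql
-- ALL_DIRECTORIES lists the Directory objects your account is allowed to use.
-- Replace CMSPL4 with your own username, in uppercase.
SELECT directory_name, directory_path
FROM all_directories
WHERE directory_name = 'CMSPL4_ORACLE_WORK' ;
```

If no row comes back, re-run the oracle_dir utility described above before going any further.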
From existing relational tables to XML
Scenario One
You have been asked to generate some XML output which contains the composer and
CD Title.
The basis for this would be this SQL Query:
select comp.composer, cd.title
from COMPOSERS comp, CDS cd
WHERE cd.composerid=comp.composerid ;
SQL Solution
From version 9i of Oracle onwards we can do this from a SQL prompt using the SQL
extensions available in Oracle XML_DB.
The two extensions are:
• XMLElement: transforms a relational value into an XML element, e.g.
<element1>value</element1>
• XMLForest: turns each of a list of relational values into an XML element and returns them as a list, called a "forest", of XML elements.
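To see the two functions working together without touching any tables, you can try them on literal values. This is a minimal sketch: the literals simply stand in for the composer and title columns, and the aliases are there because XMLForest needs a name for anything that is not a plain column.

```sql
-- Literal values stand in for table columns here.
-- XMLForest uses the alias (AS ...) as the tag name; unquoted aliases are uppercased.
SELECT xmlelement("CD",
         xmlforest('Adams' AS composer,
                   'Short Ride In a Fast Machine' AS title))
FROM dual ;
```

This should return a single CD element containing a COMPOSER and a TITLE child element.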
With these two functions we can now generate XML output from the SQL> prompt. Try
copying this to SQLPLUS:
-- just for testing let's limit the output to a few rows
SELECT
xmlelement("CD", xmlforest(comp.composer, cd.title))
FROM composers comp, cds cd
where comp.composerid=cd.composerid
and comp.composerid < 10 ;
As this outputs long strings, in order to see the full output you will need to issue the
SQLPLUS command:
SET LONG 1000
Now we need to put this to a file. We can just spool the output to a file by wrapping a
SPOOL round the above. The following code will generate an output file called
TEST.XML in SQLPLUS’s current working directory:
spool TEST.XML
SELECT
xmlelement("CD", xmlforest(comp.composer, cd.title))
FROM composers comp, cds cd
where comp.composerid=cd.composerid
and comp.composerid < 10 ;
spool off ;
Run this code and then try to open TEST.XML in IE. It will report an error: for a variety of
reasons the file is not well-formed. Edit the file in TextPad to see what the
problem is:
SQL> SELECT
2 xmlelement("CD", xmlforest(comp.composer, cd.title))
3 FROM composers comp, cds cd
4 where comp.composerid=cd.composerid
5 and comp.composerid < 10 ;
XMLELEMENT("CD",XMLFOREST(COMP.COMPOSER,CD.TITLE))
-------------------------------------------------------------------------------
<CD>
<COMPOSER>Adams</COMPOSER>
<TITLE>Short Ride In a Fast Machine</TITLE>
</CD>
…..and so on until the end of the file where we find:
<CD>
<COMPOSER>Albeniz</COMPOSER>
<TITLE>Iberia</TITLE>
</CD>
9 rows selected.
SQL> spool off ;
In short, we have SQLPLUS feedback in our text file. In order to solve this problem we can
use a variety of SQLPLUS SET commands. Review the changes made to the script
below. If you want to understand these SET commands, visit the online documentation
(you must be on a SHU PC to do this) or see the crib sheet:
http://www.shu.ac.uk/schools/cms/teaching/pl3/sqlplsqlreminder.doc
SET ECHO off
set pagesize 0
set long 10000
SET VERIFY off
SET FEEDBACK off
spool TEST.XML
SELECT
xmlelement("CD", xmlforest(comp.composer, cd.title))
FROM composers comp, cds cd
where comp.composerid=cd.composerid
and comp.composerid < 10 ;
spool off ;
Try this code. Does this give a well-formed XML file?
No, still not well-formed. The last problem is that, for XML to be well-formed, it must
contain only one root element. In this case CD is the highest branch, but there are
multiple CD entries. We need to wrap them all in a single enclosing tag. It doesn’t matter what we
call it, but we could stick to the Oracle practice of calling the output a ROWSET.
Our spool file needs therefore to look something like this:
<ROWSET>
<CD>……etc
</ROWSET>
To force that output we can use a select from DUAL to output a literal string :
select '<ROWSET>' from dual ;
Put together it looks like this:
SET ECHO off
set pagesize 0
set long 10000
SET VERIFY off
SET FEEDBACK off
spool TEST.XML
select '<ROWSET>' from dual ;
SELECT
xmlelement("CD", xmlforest(comp.composer, cd.title))
FROM composers comp, cds cd
where comp.composerid=cd.composerid
and comp.composerid < 10 ;
select '</ROWSET>' from dual ;
spool off ;
Try this code. Some browsers will display this, but there is still something missing.
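As an aside, the single root element can also be produced inside the query itself rather than with literal strings. The following is only a sketch, assuming your Oracle version provides the SQL/XML aggregate function XMLAgg, which is not used elsewhere in this tutorial. It gathers all the CD elements into one ROWSET element and returns them as a single row.

```sql
SET LONG 10000
SET PAGESIZE 0
-- XMLAgg collects the CD element from every row into one list,
-- and the outer XMLElement wraps that list in a single ROWSET root.
SELECT xmlelement("ROWSET",
         xmlagg(xmlelement("CD", xmlforest(comp.composer, cd.title))))
FROM composers comp, cds cd
WHERE comp.composerid = cd.composerid
AND comp.composerid < 10 ;
```

Because everything comes back as one value, a large result may be cut short unless LONG is set high enough.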
Activity One
We are very nearly there. Can you see what is missing? And can you put it right and
generate a well-formed XML file?
Activity Two
See if you can use this code as a template. Alter it so that you generate a well-formed
XML file which contains CDID, Composer and Title for all pieces by the composer
Sibelius.
Scenario Two
As part of a development team you have been asked to generate an XML file using a
PLSQL stored procedure that other developers can call, for example, every time a new
CD is added to the CDS table. The output should contain all the information in CDS for
one particular cdid, which should be passed as a parameter. The XML file needs to be
stored on the server, where it will be used by other applications.
PL/SQL Solution
Outputting the data
Oracle introduced the XML/SQL Utility (XSU) in Version 8i, and the functionality has
been improved in subsequent versions of Oracle. In this tutorial we will use a couple of
the DBMS_xxxx PLSQL packages.
We are going to build this Procedure gradually. Step one is to make sure we know what
data we want to return. Run this SQL and see if you get the results you need:
SELECT * FROM CDS WHERE cdid=1471 ;
Now we need to embed this in a PLSQL procedure. First of all we will build a procedure
that outputs to the monitor so that we can check we are getting what we want.
Output of CLOBs
In all these examples we will be using either an XMLTYPE or CLOB datatype for our
XML data in the database. You can think of XMLTYPE as an enhanced CLOB.
SQLPLUS doesn’t handle CLOBS too well. It would be useful if we could see CLOB
output, given that XSU likes to use CLOBs for XML output. So let's create a general
purpose procedure which uses the dbms_output package to generate lines we can read
using SERVEROUTPUT ON in sql*plus.
The procedure OutputClob is created in the script: OutputClob.sql. You need to save this
locally and then run the script to create this procedure in your schema.
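The script itself is not reproduced here. Purely to give an idea of what such a procedure might involve, here is a hypothetical sketch, not the contents of OutputClob.sql. The name OutputClobSketch, the parameter name and the 255-character chunk size are all assumptions made for illustration.

```sql
CREATE OR REPLACE PROCEDURE OutputClobSketch(p_clob IN CLOB) IS
  -- 255 is a cautious line length for dbms_output (an assumption for this sketch)
  chunk_size CONSTANT PLS_INTEGER := 255;
  pos        PLS_INTEGER := 1;
  total_len  PLS_INTEGER := NVL(DBMS_LOB.GETLENGTH(p_clob), 0);
BEGIN
  -- walk through the CLOB, printing one chunk per output line
  WHILE pos <= total_len LOOP
    -- DBMS_LOB.SUBSTR takes the amount first, then the offset
    DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(p_clob, chunk_size, pos));
    pos := pos + chunk_size;
  END LOOP;
END;
/
```

Note that this sketch breaks the text every 255 characters rather than at the line ends in the XML, so the real script may well format its output differently.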
In order to see the output if you are using sql*plus, make sure you have issued:
SET SERVEROUTPUT ON SIZE 20000
The default size of the dbms_output buffer is 2000 bytes, so setting 20000 allows 10 times more
output before you get a buffer overflow error. Having set serveroutput on, we can run the
procedures we create from the SQL prompt:
SQL>Exec someprocedureorother
Output to monitor
OK, so now we have to write the procedure. The suggested code is below. Read it
through and make sure you understand what it is trying to do before dropping onto your
SQL> prompt.
CREATE OR REPLACE Procedure OutputNewCD(cdid IN integer) IS
output_xml CLOB;
Qry DBMS_XMLQuery.ctxType;
BEGIN
-- set up the query
Qry := DBMS_XMLQuery.newContext('SELECT * FROM CDS WHERE cdid='||to_char(cdid));
-- get the result
output_xml := DBMS_XMLQuery.getXML(Qry);
-- close the query handle
DBMS_XMLQuery.closeContext(Qry);
-- output to Output Buffer
OutputClob(output_xml) ;
END;
/
Here are some important aspects of the above code:
a) The created procedure allows for the passing of a parameter which contains
the CDID we want to generate the output for.
b) The Qry variable is declared as a datatype ctxType, the definition of which is
in the XMLQuery package. In this case we can think of Qry as a pointer to, or
a handle for, the query running on the Server.
c) The method .getXML will wrap up the output from the SQL statement for us
d) Because the output is a CLOB we need to call OutputClob to see it
Having run the above and got the reassuring message: Procedure created, we can call the
procedure from the SQL> prompt to test it:
Exec OutputNewCD(1471)
Notice how the .getXML method ensures the output is well-formed.
So we know we can get what we want from the procedure; let's now amend it so that it
generates a file, rather than outputting to the monitor. If you haven’t already made sure you
have an oracle_work folder (see Tutorial Preparation above), now is the time to do
so, as we will be writing our XML out to that directory.
Here is the amended code. When you copy it, don’t forget to change the username from
CMSPL4_ORACLE_WORK to your own. Also remember that the Directory name is matched
case-sensitively and Oracle stores it in uppercase, so you must use uppercase for your Directory name.
CREATE OR REPLACE Procedure OutputNewCDFile(cdid IN integer) IS
-- new declarations
file_handle Utl_File.FILE_TYPE;
directory_name CONSTANT VARCHAR2(60) := 'CMSPL4_ORACLE_WORK';
xml_filename CONSTANT VARCHAR2(60) := 'NewCD.xml';
buffer VARCHAR2(32767);
buffer_size CONSTANT BINARY_INTEGER := 32767;
amt BINARY_INTEGER;
offset NUMBER(38);
-- old declarations
output_xml CLOB;
Qry DBMS_XMLQuery.ctxType;
BEGIN
-- set up the query
Qry := DBMS_XMLQuery.newContext('SELECT * FROM CDS WHERE cdid='||to_char(cdid));
-- get the result..!
output_xml := DBMS_XMLQuery.getXML(Qry);
-- close the query handle
DBMS_XMLQuery.closeContext(Qry);
-- NEW BIT output to file
-- open the file in write mode
file_handle := UTL_FILE.FOPEN(
location => directory_name,
filename => xml_filename,
open_mode => 'w',
max_linesize => buffer_size);
-- Set up looping controls
amt := buffer_size;
offset := 1;
-- read clob in chunks of buffer_size characters and write out to file
WHILE amt >= buffer_size
LOOP
DBMS_LOB.READ(
lob_loc => output_xml,
amount => amt,
offset => offset,
buffer => buffer);
offset := offset + amt;
UTL_FILE.PUT(
file => file_handle,
buffer => buffer);
UTL_FILE.FFLUSH(file => file_handle);
END LOOP;
UTL_FILE.FCLOSE(file => file_handle);
END;
/
Key points from this code:
1. We use the READ procedure from a built-in package called DBMS_LOB, which
provides large object manipulation functionality in PL/SQL.
2. We use another built-in package, UTL_FILE, to output to the OS file system. Note
that one of the parameters you pass is a database Directory Object, NOT an OS
path.
3. The call to FFLUSH causes the data to be physically written to disk.
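One thing to watch in the loop above: it carries on while the last read filled the whole buffer, so if the CLOB length happened to be an exact multiple of the buffer size it would attempt one more read past the end, which as far as we understand DBMS_LOB.READ reports as NO_DATA_FOUND. The following stand-alone sketch shows a slightly more defensive way of chunking, bounded by the CLOB length. The 25-character sample CLOB and the chunk size of 10 are assumptions chosen just to make the behaviour easy to see; it prints to the monitor rather than writing a file.

```sql
SET SERVEROUTPUT ON
DECLARE
  src        CLOB := RPAD('x', 25, 'x');   -- 25-character sample CLOB (assumption)
  chunk_size CONSTANT PLS_INTEGER := 10;   -- deliberately small for the demo
  buf        VARCHAR2(10);
  amt        INTEGER;
  pos        INTEGER := 1;
  total_len  INTEGER := DBMS_LOB.GETLENGTH(src);
BEGIN
  -- stop as soon as the offset passes the end of the CLOB
  WHILE pos <= total_len LOOP
    amt := chunk_size;
    -- amt is IN OUT: on return it holds the number of characters actually read
    DBMS_LOB.READ(src, amt, pos, buf);
    DBMS_OUTPUT.PUT_LINE('read ' || amt || ' characters at offset ' || pos);
    pos := pos + amt;
  END LOOP;
END;
/
```

The same loop condition could be dropped into OutputNewCDFile in place of the WHILE amt >= buffer_size test, with the UTL_FILE.PUT call where the PUT_LINE is here.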
Now run the new procedure:
exec outputnewcdfile(1471)
And then check your oracle_work folder on your F: drive.
Activity Three
Make sure you understand the code in OutputNewCDFile. In actual fact, as we will
generate a file every time a row is inserted, we will want not to hard-code the output
filename. Rewrite the procedure so that it outputs to a file whose name contains the cdid
to make sure it is unique. (Eg: NewCD1471.xml)
Scenario Three
You are asked to produce an XML Schema that describes the above XML file so that it
can be used elsewhere with a validating parser.
There are two alternative parameters you can pass to DBMS_XMLQuery.GetXML which
extend it beyond what we did above:
a) DBMS_XMLQUERY.SCHEMA
b) DBMS_XMLQUERY.DTD
Let's write a procedure that generates some XML from any query, and prefixes XML Schema
information which describes the data.
Create or Replace Procedure CreateXMLSchemaFor(qry IN varchar2) IS
res CLOB ;
Begin
res:=DBMS_XMLQUERY.getXML(qry, DBMS_XMLQUERY.SCHEMA) ;
--just to prove the concept, output to monitor
-- don’t forget serveroutput on size 200000
OutputClob(res) ;
End;
/
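Option b) can be tried in much the same way. Here is a minimal sketch as an anonymous block, assuming the OutputClob procedure from earlier is in your schema and that cdid 1471 exists in your CDS table. It asks for a DTD rather than an XML Schema.

```sql
SET SERVEROUTPUT ON SIZE 200000
DECLARE
  res CLOB ;
BEGIN
  -- same call as in CreateXMLSchemaFor, but requesting a DTD instead of a Schema
  res := DBMS_XMLQUERY.getXML('select cdid, title from cds where cdid = 1471',
                              DBMS_XMLQUERY.DTD) ;
  OutputClob(res) ;
END;
/
```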
Activity Four
Now try and run this for a smallish query and see what output you get.
From XML to Oracle
How you deal with incoming XML depends upon how your users and applications will
use the information. XMLType is a new Oracle datatype which builds on the CLOB. It
provides many benefits if you are going to be using the XML in a relational way, or you
will be using XPATH to query it. If you don’t need this functionality, you can just stick
with the base datatype, the CLOB.
Whatever you decide, it seems likely that we will often want to turn the contents of an OS
file into a CLOB, so the first thing we might do is create a general purpose function
which does just that (reusable code):
create or replace function getCLOBfromFile(
from_directory in varchar2,
from_filename in varchar2)
return clob IS
from_bfile bfile;
temp_clob clob;
begin
from_bfile := bfilename(from_directory, from_filename);
-- open the OS file
dbms_lob.open(from_bfile);
-- create a temp clob to store the results
dbms_lob.createtemporary(temp_clob, true, dbms_lob.session);
-- load from file into the temp clob
dbms_lob.loadfromfile(temp_clob, from_bfile, dbms_lob.getlength(from_bfile));
dbms_lob.close(from_bfile);
return temp_clob;
end;
/
Points to note from this code:
1. We are returning a CLOB datatype
2. we create the pointer to the OS file by using the bfilename function, passing the
DIRECTORY (not OS path) and the filename
3. we are using the dbms_lob package again to first open the file and then, using
loadfromfile, put the contents into a temporary CLOB
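If the file is missing, the call to dbms_lob.open will raise an error. A simple pre-check can save some head scratching. This is a sketch only: the Directory and file names are the author's examples from this tutorial, so change them to your own.

```sql
SET SERVEROUTPUT ON
DECLARE
  -- bfilename takes the DIRECTORY object name (uppercase), not an OS path
  f BFILE := BFILENAME('CMSPL4_ORACLE_WORK', 'NewCD1471.xml');
BEGIN
  -- fileexists returns 1 when the file can be found, 0 otherwise
  IF DBMS_LOB.FILEEXISTS(f) = 1 THEN
    DBMS_OUTPUT.PUT_LINE('File found');
  ELSE
    DBMS_OUTPUT.PUT_LINE('File not found in that DIRECTORY');
  END IF;
END;
/
```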
Activity Five
Create the above function in your schema, and then, from the SQL prompt, try it out.
Scenario One
In this application it is important for you to maintain the integrity of the actual XML file
you receive, even including white space, AND your users are not going to spend any time
querying its contents. We receive information about new CDs from a supplier's system as an
XML file. (It is our friend the NewCD.xml file from above.) The simplest solution is to
store the XML in a CLOB column. Simple probably means least performance
overhead, so we won’t impact upon whatever else our system is doing.
Here is some code that creates a table to store the XML:
DROP TABLE NewCDInfo ;
CREATE TABLE NewCDInfo (
DateTime_Recd Date DEFAULT SYSDATE,
SourceFileName Varchar2(60),
XMLContent CLOB
);
Then we store the content of NewCD in a CLOB column, using our getCLOBfromFile
function from above.
Activity Six
Write some plsql to insert values from NewCD1471.xml into the above table. You may
choose to write a procedure, or just have an anonymous block. Test that it has worked by
doing a select * from NewCDInfo.
Scenario Two
Because there may be times when your users query the XML data you are storing, you
decide to use the XMLType datatype to store the data. This will increase the processing
required on insert, but will open up the possibility of using XPATH to query the data.
Your developers have a mix of XPATH and SQL skills.
We could define a table which includes a XMLTYPE column, instead of a CLOB. In this
case though we do not need to gather any other information, so we can just define a table
as of XMLTYPE type:
drop table CDXML ;
CREATE TABLE CDXML of XMLType;
Now we need to use our getClobFromFile to insert into this table. Notice how we convert the
CLOB into an xmltype by passing it to the xmltype constructor:
Declare
this_clob CLOB ;
Begin
this_clob:=getclobfromfile('CMSPL4_ORACLE_WORK','NewCD1471.xml');
INSERT INTO CDXML
VALUES
(
xmltype
(
this_clob
)
);
END ;
/
set long 1000
select * from CDXML ;
Because we have stored this as an XMLTYPE we can get at the data in a number of
ways. This is how to count the rows in the table whose XML contains a CDID node, using XPATH:
select count(*) from CDXML
WHERE
existsNode(object_value,'/ROWSET/ROW/CDID') = 1 ;
Copy this and drop it onto your SQL> prompt. This is not an XPATH tutorial; it is your
developers who will be writing the access code. The main learning point in this code is
that existsNode(some xml, some xpath value) is similar to the Find function in Excel. It
looks for a particular node (in this case /ROWSET/ROW/CDID) and, if it finds one,
returns the value 1; otherwise it returns 0. So the WHERE clause restricts output to those rows which contain
this node. Object_Value is an Oracle pseudo-column which refers to the XMLType content of each row in a table such as CDXML.
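The XPATH string can be made more selective by adding a predicate. As a small sketch, assuming the document you loaded contains a CD whose CDID is 1471, this counts only the rows that hold that particular CD:

```sql
-- The predicate [CDID=1471] limits the match to ROW elements
-- whose CDID child has the value 1471.
SELECT count(*) FROM CDXML
WHERE existsNode(object_value, '/ROWSET/ROW[CDID=1471]') = 1 ;
```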
We can use the object manipulation function EXTRACT to output the XML content
selectively. Try this:
SELECT extract(value(X), '/ROWSET/ROW/TITLE') as Titles
FROM CDXML X;
Scenario Three
In this scenario we have a series of relational tables already in existence. What we need
to do is use an XML file to populate those tables. The file is released by our supplier and
made public by publishing on their web site. Here is an example:
http://otho.cms.shu.ac.uk/dbstaff/cmspl4/NewCD1478.xml
Activity Seven
Review the content of the incoming XML file. How does it map across to your CDS
table?
PLSQL Solution
The work we are going to carry out needs us to access XML from a URL. Oracle
provides a HttpUriType datatype which allows us to work with URLs. It has useful
methods attached to it, such as getXML.
Here is the suggested code:
Create or Replace Procedure InsertNewCDfromXML IS
somexml XMLType;
Titl varchar2(50);
Cond varchar2(30);
Orch varchar2(30);
Comp Binary_Integer ;
BEGIN
-- point to the URL containing the xml
-- get the xml and put it into a variable of xmltype
somexml :=
SYS.HttpUriType.createUri('http://otho.cms.shu.ac.uk/dbstaff/cmspl4/NewCD1478.xml').getXML() ;
--- get values from the xml
Titl := somexml.extract('/ROWSET/ROW/TITLE/text()').getstringval();
Cond := somexml.extract('/ROWSET/ROW/CONDUCTOR/text()').getstringval();
Orch := somexml.extract('/ROWSET/ROW/ORCHESTRA/text()').getstringval();
Comp := TO_NUMBER(somexml.extract('/ROWSET/ROW/COMPOSERID/text()').getstringval());
-- for checking/debug purposes let's output to monitor. Don’t forget serveroutput ON
DBMS_OUTPUT.PUT_LINE((SUBSTR(Titl,1,25))||Cond);
-- now insert into the table. Null is for CDID which autogenerates
-- because of the table trigger
INSERT INTO CDS VALUES (null, Titl,Cond,Orch, Comp) ;
END;
/
Things of note:
1. HttpUriType is owned by the SYS schema, but you have been granted access
2. The .getXML function works on the URL passed in as a parameter to the
createUri function
3. Once the XML is in an XMLType we can use XPATH to extract the value from each
node. text() at the end of the XPATH string just means give me the string data
value in that node.
Activity Eight
Make sure you understand the code, and then create a procedure that could import a new
COMPOSER from an OS file in your DIRECTORY, rather than from a URL. Download
this file (the sample solution assumes it is saved as newcomp.xml) to your oracle_work folder as the basis for your insert procedure.
There is more!
We have extracted data from Oracle as XML and we have populated relational tables
from an XML file.
Using facilities such as what the Oracle XMLDB manual describes as Structured
Mapping of XMLType, we can shred incoming XML into relational storage so that you can use indexes and
create views. You can register schemas in the database to allow you to validate incoming
XML. You can parse XML in PLSQL and then operate on the document model…
Sample Solutions
Activity One
Well actually, since IE will display this file, you could say nothing is missing and that it
is well-formed. However, W3C recommends that every XML file should start with a
version declaration. Here is the complete code; the missing bit is the select that outputs the XML declaration, straight after the spool command:
SET ECHO off
set pagesize 0
set long 10000
SET VERIFY off
SET FEEDBACK off
spool TEST.XML
select '<?xml version="1.0"?>' from dual ;
select '<ROWSET>' from dual ;
SELECT
xmlelement("CD", xmlforest(comp.composer, cd.title))
FROM composers comp, cds cd
where comp.composerid=cd.composerid
and comp.composerid < 10 ;
select '</ROWSET>' from dual ;
spool off ;
Activity Two
SET ECHO off
set pagesize 0
set long 10000
SET VERIFY off
SET FEEDBACK off
spool SIBELIUS.XML
select '<?xml version="1.0"?>' from dual ;
select '<ROWSET>' from dual ;
SELECT
xmlelement("CD", xmlforest(cd.cdid, comp.composer,cd.title))
FROM composers comp, cds cd
where comp.composerid=cd.composerid
and comp.composer='Sibelius' ;
select '</ROWSET>' from dual ;
spool off ;
Activity Three
The changes are minimal and revolve around changing the xml_filename from a constant
to a variable. See the xml_filename declaration and the first line after BEGIN below:
CREATE OR REPLACE Procedure OutputNewCDFile(cdid IN integer) IS
-- new declarations
file_handle Utl_File.FILE_TYPE;
directory_name CONSTANT VARCHAR2(60) := 'CMSPL4_ORACLE_WORK';
xml_filename VARCHAR2(60) ;
buffer VARCHAR2(32767);
buffer_size CONSTANT BINARY_INTEGER := 32767;
amt BINARY_INTEGER;
offset NUMBER(38);
-- old declarations
output_xml CLOB;
Qry DBMS_XMLQuery.ctxType;
BEGIN
xml_filename:= 'NewCD'||to_char(cdid)||'.xml' ;
(…………………..the rest of the procedure is the same)
Activity Four
exec CreateXMLSchemaFor('select cdid, title from cds where cdid>1400')
Activity Five
set long 1000
select getclobfromfile('CMSPL4_ORACLE_WORK','NewCD1471.xml')from dual ;
Activity Six
DECLARE
this_clob CLOB ;
Begin
this_clob:=getclobfromfile('CMSPL4_ORACLE_WORK','NewCD1471.xml');
Insert into NewCDInfo values (SYSDATE, 'NewCD', this_clob) ;
End ;
/
Activity Eight
Create or Replace Procedure InsertNewCompfromXML IS
somexml XMLType;
Compname varchar2(30);
Compid Binary_Integer ;
this_clob CLOB ;
BEGIN
this_clob:=getclobfromfile('CMSPL4_ORACLE_WORK','newcomp.xml') ;
somexml:=xmltype(this_clob) ;
-- the xpath search string is different to the CDS one
Compid := TO_NUMBER(somexml.extract('/NEWRECORD/COMPOSERID/text()').getstringval());
Compname := somexml.extract('/NEWRECORD/COMPOSER/text()').getstringval();
-- DBMS_OUTPUT.PUT_LINE((SUBSTR(Compname,1,25))||TO_CHAR(Compid));
INSERT INTO COMPOSERS VALUES (Compid, Compname) ;
END;
/
exec InsertNewCompfromXML