XML Movement in Oracle

Contents

XML Movement in Oracle
Tutorial Preparation
From existing relational tables to XML
    Scenario One
    SQL Solution
    Activity One
    Activity Two
    Scenario Two
    PL/SQL Solution
    Activity Three
    Scenario Three
    Activity Four
From XML to Oracle
    Activity Five
    Scenario One
    Activity Six
    Scenario Two
    Scenario Three
    Activity Seven
    PLSQL Solution
    Activity Eight
There is more!
Sample Solutions
    Activity One
    Activity Two
    Activity Three
    Activity Four
    Activity Five
    Activity Six
    Activity Eight

Tutorial Preparation

The examples and suggested activities call for you to use the CDS and Composers tables. If you have completed either the Performance or the Moving Data tutorials you may well have these tables already. Just do a select count(*) from both to make sure you have more than a few rows in each. If you haven’t got these tables, you can run these two scripts:

1. table_create.sql
2. ShortCDInserts.sql

(Yes, there are fewer rows here than in the performance version!)

In order to carry out server-side processing, which this tutorial covers, you need to have write privileges to a directory that the Oracle instance can see. SHU10g runs on IVY, which is a Sun Solaris box that also hosts student homespace (your F: drive). For obvious reasons our DBA is not prepared to grant DBA privileges to us mere mortals. However, we have set up a utility that will allow you to have a folder on IVY which you can access, through Windows, as a part of your homespace, but which actually maps back to an Oracle DIRECTORY object.

In order to set this DIRECTORY up you must run a Unix utility. In a SHU lab you can follow these steps. If you are at home, you will need to connect to IVY using Telnet through your ISP and carry on from step 4:

1. From the START button select Internet Tools, Tera Term Pro.
2. The host should read: ivy.shu.ac.uk
3. Click on OK.
4. Log in using your normal network login details.
5. At the Unix prompt type: /usr/local/bin/oracle_dir
6. Press Enter to run the script.
7. Now type ls (lowercase L) and press Enter.
8. Check that you now have a directory called oracle_work.
9. Type EXIT to close your Unix session.

This process has created a Directory object which maps to this directory in the user's home directory on IVY and the F: drive:

    /homedir/oracle_work

In Oracle, the Directory object which points to this folder has the following name:

    USERNAME_ORACLE_WORK

For example: CMSPL4_ORACLE_WORK for the author’s directory.
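
The two preparation checks described above can be sketched from the SQL> prompt as follows. This is only a sketch: it assumes that your Directory object follows the USERNAME_ORACLE_WORK naming pattern and that it is visible to you through Oracle's ALL_DIRECTORIES dictionary view.

```sql
-- Confirm that both tutorial tables exist and hold more than a few rows
SELECT COUNT(*) FROM cds;
SELECT COUNT(*) FROM composers;

-- Look up the Directory object created by the oracle_dir utility.
-- Assumption: the object is named <your username>_ORACLE_WORK and you
-- have been granted access to it, so it appears in ALL_DIRECTORIES.
SELECT directory_name, directory_path
FROM   all_directories
WHERE  directory_name = USER || '_ORACLE_WORK';
```

If the last query returns no rows, the likely explanation is that the Unix utility has not yet been run for your account.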

From existing relational tables to XML

Scenario One

You have been asked to generate some XML output which contains the composer and CD title. The basis for this would be this SQL query:

    select comp.composer, cd.title
    from COMPOSERS comp, CDS cd
    WHERE cd.composerid=comp.composerid ;

SQL Solution

From version 9i of Oracle onwards we can do this from a SQL prompt using the SQL extensions available in Oracle XML DB. The two extensions are:

XMLElement: transforms a relational value into an XML element. Eg: <element1>value</element1>

XMLForest: maps a relational set to a list, called a "forest", of XML elements.

With these two functions we can now generate XML output from the SQL> prompt. Try copying this to SQL*Plus:

    -- just for testing let's limit the output to a few rows
    SELECT
    xmlelement("CD", xmlforest(comp.composer, cd.title))
    FROM composers comp, cds cd
    where comp.composerid=cd.composerid
    and comp.composerid < 10 ;

As this outputs long strings, in order to see the full output you will need to issue the SQL*Plus command:

    SET LONG 1000

Now we need to put this to a file. We can just spool the output to a file by wrapping a SPOOL round the above. The following code will generate an output file called TEST.XML in SQL*Plus’s current working directory:

    spool TEST.XML
    SELECT
    xmlelement("CD", xmlforest(comp.composer, cd.title))
    FROM composers comp, cds cd
    where comp.composerid=cd.composerid
    and comp.composerid < 10 ;
    spool off ;

Run this code and then try to open TEST.XML in IE. It will report an error: for a variety of reasons the file is not well-formed.
Edit the file in TextPad to see what the problem is:

    SQL> SELECT
      2  xmlelement("CD", xmlforest(comp.composer, cd.title))
      3  FROM composers comp, cds cd
      4  where comp.composerid=cd.composerid
      5  and comp.composerid < 10 ;

    XMLELEMENT("CD",XMLFOREST(COMP.COMPOSER,CD.TITLE))
    -------------------------------------------------------------------------------
    <CD> <COMPOSER>Adams</COMPOSER> <TITLE>Short Ride In a Fast Machine</TITLE> </CD>

…..and so on until the end of the file where we find:

    <CD> <COMPOSER>Albeniz</COMPOSER> <TITLE>Iberia</TITLE> </CD>

    9 rows selected.

    SQL> spool off ;

In short, we have SQL*Plus feedback in our text file: the echoed command, the column heading and the row count. In order to solve this problem we can use a variety of SQL*Plus SET commands. Review the changes made to the script below. If you want to understand these SET commands, visit the online documentation (you must be on a SHU PC to do this) or see the crib sheet: http://www.shu.ac.uk/schools/cms/teaching/pl3/sqlplsqlreminder.doc

    SET ECHO off
    set pagesize 0
    set long 10000
    SET VERIFY off
    SET FEEDBACK off
    spool TEST.XML
    SELECT
    xmlelement("CD", xmlforest(comp.composer, cd.title))
    FROM composers comp, cds cd
    where comp.composerid=cd.composerid
    and comp.composerid < 10 ;
    spool off ;

Try this code. Does this give a well-formed XML file?

No, still not well-formed. The last problem is that, for XML to be well-formed, it must contain exactly one root element. In this case CD is the highest-level element, but there are multiple CD entries. We need to collect these together inside one enclosing tag. It doesn’t matter what we call it, but we could stick to the Oracle practice of calling the output a ROWSET.
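
As an aside, the single-root requirement can also be met inside the query itself. The sketch below is an alternative rather than the approach this tutorial follows, and it assumes that your Oracle version provides the XMLAgg aggregate function alongside XMLElement and XMLForest. XMLAgg gathers the CD elements from all rows into one value, which the outer XMLElement then wraps in a single ROWSET element.

```sql
-- Sketch: build one ROWSET root element containing every CD element.
-- Assumption: XMLAgg is available in the Oracle version you are using.
SELECT xmlelement("ROWSET",
         xmlagg(
           xmlelement("CD", xmlforest(comp.composer, cd.title))))
FROM   composers comp, cds cd
WHERE  comp.composerid = cd.composerid
AND    comp.composerid < 10 ;
```

This returns a single row holding the whole document, so SET LONG may need a larger value to display all of it. The rest of this section stays with the simpler spool-based approach.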
Our spool file therefore needs to look something like this:

    <ROWSET>
    <CD>……etc
    </ROWSET>

To force that output we can use a select from DUAL to output a literal string:

    select '<ROWSET>' from dual ;

Put together it looks like this:

    SET ECHO off
    set pagesize 0
    set long 10000
    SET VERIFY off
    SET FEEDBACK off
    spool TEST.XML
    select '<ROWSET>' from dual ;
    SELECT
    xmlelement("CD", xmlforest(comp.composer, cd.title))
    FROM composers comp, cds cd
    where comp.composerid=cd.composerid
    and comp.composerid < 10 ;
    select '</ROWSET>' from dual ;
    spool off ;

Try this code. Some browsers will display this, but there is still something missing.

Activity One

We are very nearly there. Can you see what is missing? And can you put it right and generate a well-formed XML file?

Activity Two

See if you can use this code as a template. Alter it so that you generate a well-formed XML file which contains CDID, Composer and Title for all pieces by the composer Sibelius.

Scenario Two

As part of a development team you have been asked to generate an XML file using a PL/SQL stored procedure that other developers can call, for example, every time a new CD is added to the CDS table. The output should contain all the information in CDS for one particular cdid, which should be passed as a parameter. The XML file needs to be stored on the server, where it will be used by other applications.

PL/SQL Solution

Outputting the data

Oracle introduced the XML SQL Utility (XSU) in Version 8i, and the functionality has been improved in subsequent versions of Oracle. In this tutorial we will use a couple of the DBMS_xxxx PL/SQL packages.

We are going to build this procedure gradually. Step one is to make sure we know what data we want to return. Run this SQL and see if you get the results you need:

    SELECT * FROM CDS WHERE cdid=1471 ;

Now we need to embed this in a PL/SQL procedure. First of all we will build a procedure that outputs to the monitor so that we can check we are getting what we want.

Output of CLOBs

In all these examples we will be using either an XMLTYPE or CLOB datatype for our XML data in the database. You can think of XMLTYPE as an enhanced CLOB.

SQL*Plus doesn’t handle CLOBs too well. It would be useful if we could see CLOB output, given that XSU likes to use CLOBs for XML output. So let's create a general purpose procedure which uses the dbms_output package to generate lines we can read using SERVEROUTPUT ON in SQL*Plus.

The procedure OutputClob is created in the script OutputClob.sql. You need to save this locally and then run the script to create this procedure in your schema.

In order to see the output if you are using SQL*Plus, make sure you have issued:

    SET SERVEROUTPUT ON SIZE 20000

The default size of the dbms_output buffer is 2k, so setting 20000 allows 10 times more output before you get a buffer overflow error.

Having set serveroutput on, we can run the procedures we create from the SQL prompt:

    SQL>Exec someprocedureorother

Output to monitor

OK, so now we have to write the procedure. The suggested code is below. Read it through and make sure you understand what it is trying to do before dropping it onto your SQL> prompt.

    CREATE OR REPLACE Procedure OutputNewCD(cdid IN integer) IS
        output_xml CLOB;
        Qry DBMS_XMLQuery.ctxType;
    BEGIN
        -- set up the query
        Qry := DBMS_XMLQuery.newContext('SELECT * FROM CDS WHERE cdid='||to_char(cdid));
        -- get the result
        output_xml := DBMS_XMLQuery.getXML(Qry);
        -- close the query handle
        DBMS_XMLQuery.closeContext(Qry);
        -- output to Output Buffer
        OutputClob(output_xml) ;
    END;
    /

Here are some important aspects of the above code:

a) The created procedure takes a parameter which contains the CDID we want to generate the output for.
b) The Qry variable is declared with the datatype ctxType, the definition of which is in the DBMS_XMLQuery package. In this case we can think of Qry as a pointer to, or a handle for, the query running on the server.
c) The method .getXML will wrap up the output from the SQL statement as XML for us.
d) Because the output is a CLOB we need to call OutputClob to see it.

Having run the above and got the reassuring message: Procedure created, we can call the procedure from the SQL> prompt to test it:

    Exec OutputNewCD(1471)

Notice how the .getXML method ensures the output is well-formed.

So we know we can get what we want from the procedure; let's now amend it so that it generates a file, rather than outputting to the monitor. If you haven’t already made sure you have an oracle_work folder (see Tutorial Preparation above), now is the time to do so, as we will be using that directory to write our XML out to.

Here is the amended code. When you copy it, don’t forget to change the Directory name from CMSPL4_ORACLE_WORK to your own. Also remember that the Directory object name must be given in uppercase, and that filenames on Unix are case sensitive.

    CREATE OR REPLACE Procedure OutputNewCDFile(cdid IN integer) IS
        -- new declarations
        file_handle     Utl_File.FILE_TYPE;
        directory_name  CONSTANT VARCHAR2(60) := 'CMSPL4_ORACLE_WORK';
        xml_filename    CONSTANT VARCHAR2(60) := 'NewCD.xml';
        buffer          VARCHAR2(32767);
        buffer_size     CONSTANT BINARY_INTEGER := 32767;
        amt             BINARY_INTEGER;
        offset          NUMBER(38);
        -- old declarations
        output_xml      CLOB;
        Qry             DBMS_XMLQuery.ctxType;
    BEGIN
        -- set up the query
        Qry := DBMS_XMLQuery.newContext('SELECT * FROM CDS WHERE cdid='||to_char(cdid));
        -- get the result
        output_xml := DBMS_XMLQuery.getXML(Qry);
        -- close the query handle
        DBMS_XMLQuery.closeContext(Qry);
        -- NEW BIT output to file
        -- open the file in write mode
        file_handle := UTL_FILE.FOPEN(
            location => directory_name,
            filename => xml_filename,
            open_mode => 'w',
            max_linesize => buffer_size);
        -- Set up looping controls
        offset := 1;
        -- read clob in chunks of up to buffer_size characters and write out to file;
        -- loop only while unread characters remain, so that DBMS_LOB.READ is
        -- never called at an offset past the end of the CLOB
        WHILE offset <= DBMS_LOB.GETLENGTH(output_xml) LOOP
            -- amount is IN OUT: ask for a full chunk, get back what was read
            amt := buffer_size;
            DBMS_LOB.READ(
                lob_loc => output_xml,
                amount => amt,
                offset => offset,
                buffer => buffer);
            offset := offset + amt;
            UTL_FILE.PUT(
                file => file_handle,
                buffer => buffer);
            UTL_FILE.FFLUSH(file => file_handle);
        END LOOP;
        UTL_FILE.FCLOSE(file => file_handle);
    END;
    /

Key points from this code:

1. We use the READ procedure from a built-in package called DBMS_LOB, which provides large object manipulation functionality in PL/SQL.
2. We use another built-in package, UTL_FILE, to output to the OS file system. Note that one of the parameters you pass is a database Directory object, NOT an OS path.
3. The call to FFLUSH causes the data to be physically written to disk.

Now run the new procedure:

    exec outputnewcdfile(1471)

And then check your oracle_work folder on your F: drive.

Activity Three

Make sure you understand the code in OutputNewCDFile. In actual fact, as we will generate a file every time a row is inserted, we will not want to hard-code the output filename. Rewrite the procedure so that it outputs to a file whose name contains the cdid, to make sure it is unique. (Eg: NewCD1471.xml)

Scenario Three

You are asked to produce an XML Schema that describes the above XML file so that it can be used elsewhere with a validating parser.

There are two alternative parameters you can pass to DBMS_XMLQuery.getXML which extend it beyond what we did above:

a) DBMS_XMLQUERY.SCHEMA
b) DBMS_XMLQUERY.DTD

Let's write a procedure that generates some XML from any query, and prefixes it with XML Schema information which describes the data.

    Create or Replace Procedure CreateXMLSchemaFor(qry IN varchar2) IS
        res CLOB ;
    Begin
        res:=DBMS_XMLQUERY.getXML(qry, DBMS_XMLQUERY.SCHEMA) ;
        --just to prove the concept, output to monitor
        -- don’t forget serveroutput on size 200000
        OutputClob(res) ;
    End;
    /

Activity Four

Now try to run this for a smallish query and see what output you get.

From XML to Oracle

How you deal with incoming XML depends upon how your users and applications will use the information. XMLType is a new Oracle datatype which builds on the CLOB. It provides many benefits if you are going to be using the XML in a relational way, or if you will be using XPATH to query it. If you don’t need this functionality, you can just stick with the base datatype, the CLOB.

Whatever you decide, it seems likely that we will often want to turn the contents of an OS file into a CLOB, so the first thing we might do is create a general purpose function which does just that (reusable code):

    create or replace function getCLOBfromFile(
        from_directory in varchar2,
        from_filename in varchar2)
    return clob
    IS
        from_bfile bfile;
        temp_clob clob;
    begin
        from_bfile := bfilename(from_directory, from_filename);
        -- open the OS file
        dbms_lob.open(from_bfile);
        -- create a temp clob to store the results
        dbms_lob.createtemporary(temp_clob, true, dbms_lob.session);
        -- load from file into the temp clob
        dbms_lob.loadfromfile(temp_clob, from_bfile, dbms_lob.getlength(from_bfile));
        dbms_lob.close(from_bfile);
        return temp_clob;
    end;
    /

Points to note from this code:

1. We are returning a CLOB datatype.
2. We create the pointer to the OS file by using the bfilename function, passing the DIRECTORY (not OS path) and the filename.
3. We are using the dbms_lob package again, first to open the file and then, using loadfromfile, to put the contents into a temporary CLOB.

Activity Five

Create the above function in your schema, and then, from the SQL prompt, try it out.
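
If the function raises an error when you try it, it can help to confirm first that Oracle can actually see the file. The anonymous block below is a minimal sketch of such a check using DBMS_LOB.FILEEXISTS. It assumes the author's CMSPL4_ORACLE_WORK Directory and the NewCD1471.xml file from Activity Three, so substitute your own names, and it needs serveroutput switched on.

```sql
SET SERVEROUTPUT ON SIZE 20000

DECLARE
    -- Assumption: directory and file names are the ones used in this
    -- tutorial; replace them with your own
    probe BFILE := BFILENAME('CMSPL4_ORACLE_WORK', 'NewCD1471.xml');
BEGIN
    -- FILEEXISTS returns 1 when the file is found through the Directory object
    IF DBMS_LOB.FILEEXISTS(probe) = 1 THEN
        DBMS_OUTPUT.PUT_LINE('File is visible to Oracle');
    ELSE
        DBMS_OUTPUT.PUT_LINE('File not found: check the name and its case');
    END IF;
END;
/
```

A misspelt Directory name or a missing privilege raises an Oracle error instead of printing either message, which is itself a useful clue.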

Scenario One

In this application it is important for you to maintain the integrity of the actual XML file you receive, even including white space, AND your users are not going to spend any time querying its contents. We receive information about new CDs from a supplier's system as an XML file. (It is our friend the NewCD.xml file from above.)

The simplest solution is to store the XML in a CLOB column. Simple probably means least performance overhead, so we won’t impact upon whatever else our system is doing. Here is some code that creates a table to store the XML:

    DROP TABLE NewCDInfo ;
    CREATE TABLE NewCDInfo
    (
        DateTime_Recd Date DEFAULT SYSDATE,
        SourceFileName Varchar2(60),
        XMLContent CLOB
    );

Next we need to store the content of NewCD in the CLOB column, using our getCLOBfromFile function from above.

Activity Six

Write some PL/SQL to insert values from NewCD1471.xml into the above table. You may choose to write a procedure, or just have an anonymous block. Test that it has worked by doing a select * from NewCDInfo.

Scenario Two

Because there may be times when your users query the XML data you are storing, you decide to use the XMLType datatype to store the data. This will increase the processing required on insert, but will open up the possibility of using XPATH to query the data. Your developers have a mix of XPATH and SQL skills.

We could define a table which includes an XMLTYPE column, instead of a CLOB. In this case though we do not need to gather any other information, so we can just define a table as of XMLTYPE type:

    drop table CDXML ;
    CREATE TABLE CDXML of XMLType;

Now we need to use our getCLOBfromFile function to insert into this table. Notice how we convert the CLOB into an XMLType:

    Declare
        this_clob CLOB ;
    Begin
        this_clob:=getclobfromfile('CMSPL4_ORACLE_WORK','NewCD1471.xml');
        INSERT INTO CDXML VALUES ( xmltype ( this_clob ) );
    END ;
    /
    set long 1000
    select * from CDXML ;

Because we have stored this as an XMLTYPE we can get at the data in a number of ways.
This is how to count the documents in the table that contain a CDID node, using XPATH:

    select count(*) from CDXML
    WHERE existsNode(object_value,'/ROWSET/ROW/CDID') = 1 ;

Copy this and drop it onto your SQL> prompt.

This is not an XPATH tutorial; it is your developers who will be writing the access code. The main learning point in this code is that existsNode(some xml, some xpath value) is similar to the Find function in Excel. It looks for a particular node (in this case /ROWSET/ROW/CDID), and if it finds it, returns the value 1. So the WHERE clause restricts output to those rows which contain this node. Object_Value is an Oracle pseudo-column.

We can use the object manipulation function EXTRACT to output the XML content selectively. Try this:

    SELECT extract(value(X), '/ROWSET/ROW/TITLE') as Titles
    FROM CDXML X;

Scenario Three

In this scenario we have a series of relational tables already in existence. What we need to do is use an XML file to populate those tables. The file is released by our supplier and made public by publishing it on their web site. Here is an example:

    http://otho.cms.shu.ac.uk/dbstaff/cmspl4/NewCD1478.xml

Activity Seven

Review the content of the incoming XML file. How does it map across to your CDS table?

PLSQL Solution

The work we are going to carry out needs us to access XML from a URL. Oracle provides a HttpUriType datatype which allows us to work with URLs. It has useful methods attached to it, such as getXML.
Here is the suggested code:

    Create or Replace Procedure InsertNewCDfromXML IS
        somexml XMLType;
        Titl varchar2(50);
        Cond varchar2(30);
        Orch varchar2(30);
        Comp Binary_Integer ;
    BEGIN
        -- point to the URL containing the xml
        -- get the xml and put it into a variable of xmltype
        somexml := SYS.HttpUriType.createUri('http://otho.cms.shu.ac.uk/dbstaff/cmspl4/NewCD1478.xml').getXML() ;
        -- get values from the xml
        Titl := somexml.extract('/ROWSET/ROW/TITLE/text()').getstringval();
        Cond := somexml.extract('/ROWSET/ROW/CONDUCTOR/text()').getstringval();
        Orch := somexml.extract('/ROWSET/ROW/ORCHESTRA/text()').getstringval();
        Comp := TO_NUMBER(somexml.extract('/ROWSET/ROW/COMPOSERID/text()').getstringval());
        -- for checking/debug purposes let's output to monitor. Don’t forget serveroutput ON
        DBMS_OUTPUT.PUT_LINE((SUBSTR(Titl,1,25))||Cond);
        -- now insert into the table. Null is for CDID which autogenerates
        -- because of the table trigger
        INSERT INTO CDS VALUES (null, Titl,Cond,Orch, Comp) ;
    END;
    /

Things of note:

1. HttpUriType is owned by the SYS schema, but you have been granted access.
2. The .getXML function works on the URL passed in as a parameter to the createUri function.
3. Once the XML is in an XMLType we can use XPATH to extract the value from each node. text() at the end of the XPATH string just means "give me the string data value in that node".

Activity Eight

Make sure you understand the code, and then create a procedure that could import a new COMPOSER from an OS file in your DIRECTORY, rather than from a URL. Download this file to your oracle_work folder as the basis for your insert procedure.

There is more!

We have extracted data from Oracle as XML and we have populated relational tables from an XML file. Using facilities such as what the Oracle XML DB manual describes as Structured Mapping of XMLType, we can shred incoming XML to allow you to use indexes and create views. You can register schemas in the database to allow you to validate incoming XML.
You can parse XML in PL/SQL and then operate on the document model…..

Sample Solutions

Activity One

Well actually, since IE will display this file, you could say nothing is missing and that it is well-formed. However, the W3C recommends that every XML file should start with an XML declaration giving the version. Here is the complete code; the added line is the first select after the spool command:

    SET ECHO off
    set pagesize 0
    set long 10000
    SET VERIFY off
    SET FEEDBACK off
    spool TEST.XML
    select '<?xml version="1.0"?>' from dual ;
    select '<ROWSET>' from dual ;
    SELECT
    xmlelement("CD", xmlforest(comp.composer, cd.title))
    FROM composers comp, cds cd
    where comp.composerid=cd.composerid
    and comp.composerid < 10 ;
    select '</ROWSET>' from dual ;
    spool off ;

Activity Two

    SET ECHO off
    set pagesize 0
    set long 10000
    SET VERIFY off
    SET FEEDBACK off
    spool SIBELIUS.XML
    select '<?xml version="1.0"?>' from dual ;
    select '<ROWSET>' from dual ;
    SELECT
    xmlelement("CD", xmlforest(cd.cdid, comp.composer,cd.title))
    FROM composers comp, cds cd
    where comp.composerid=cd.composerid
    and comp.composer='Sibelius' ;
    select '</ROWSET>' from dual ;
    spool off ;

Activity Three

The changes are minimal and revolve around changing the xml_filename from a constant to a variable.
The two changed lines are the xml_filename declaration and the assignment at the start of the procedure body:

    CREATE OR REPLACE Procedure OutputNewCDFile(cdid IN integer) IS
        -- new declarations
        file_handle     Utl_File.FILE_TYPE;
        directory_name  CONSTANT VARCHAR2(60) := 'CMSPL4_ORACLE_WORK';
        xml_filename    VARCHAR2(60) ;
        buffer          VARCHAR2(32767);
        buffer_size     CONSTANT BINARY_INTEGER := 32767;
        amt             BINARY_INTEGER;
        offset          NUMBER(38);
        -- old declarations
        output_xml      CLOB;
        Qry             DBMS_XMLQuery.ctxType;
    BEGIN
        xml_filename:= 'NewCD'||to_char(cdid)||'.xml' ;
        (…………………..the rest of the procedure is the same)

Activity Four

    exec CreateXMLSchemaFor('select cdid, title from cds where cdid>1400')

Activity Five

    set long 1000
    select getclobfromfile('CMSPL4_ORACLE_WORK','NewCD1471.xml') from dual ;

Activity Six

    DECLARE
        this_clob CLOB ;
    Begin
        this_clob:=getclobfromfile('CMSPL4_ORACLE_WORK','NewCD1471.xml');
        Insert into NewCDInfo values (SYSDATE, 'NewCD', this_clob) ;
    End ;
    /

Activity Eight

    Create or Replace Procedure InsertNewCompfromXML IS
        somexml XMLType;
        Compname varchar2(30);
        Compid Binary_Integer ;
        this_clob CLOB ;
    BEGIN
        this_clob:=getclobfromfile('CMSPL4_ORACLE_WORK','newcomp.xml') ;
        somexml:=xmltype(this_clob) ;
        -- the xpath search string is different to the CDS one
        Compid := TO_NUMBER(somexml.extract('/NEWRECORD/COMPOSERID/text()').getstringval());
        Compname := somexml.extract('/NEWRECORD/COMPOSER/text()').getstringval();
        -- DBMS_OUTPUT.PUT_LINE((SUBSTR(Compname,1,25))||TO_CHAR(Compid));
        INSERT INTO COMPOSERS VALUES (Compid, Compname) ;
    END;
    /
    exec InsertNewCompfromXML
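
Before running the insert, the XPATH strings in the Activity Eight solution can be tried out against an XMLType built from a literal string, with no file or table involved. The block below is only a sketch: the sample document is hypothetical, with element names taken from the XPATH strings above and values invented purely for testing.

```sql
SET SERVEROUTPUT ON SIZE 20000

DECLARE
    -- Hypothetical test document: the structure mirrors the XPATH strings
    -- used in InsertNewCompfromXML, the values are made up
    somexml  XMLType := XMLType(
        '<NEWRECORD><COMPOSERID>9999</COMPOSERID><COMPOSER>Test Composer</COMPOSER></NEWRECORD>');
    Compid   BINARY_INTEGER;
    Compname VARCHAR2(30);
BEGIN
    -- same extraction calls as the procedure, but nothing is inserted
    Compid   := TO_NUMBER(somexml.extract('/NEWRECORD/COMPOSERID/text()').getstringval());
    Compname := somexml.extract('/NEWRECORD/COMPOSER/text()').getstringval();
    DBMS_OUTPUT.PUT_LINE(TO_CHAR(Compid)||' '||Compname);
END;
/
```

Because the block never touches COMPOSERS, it can be rerun as often as you like while you adjust the XPATH strings.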