Ya Bing Chen Tok Wang Ling Mong Li Lee
School of Computing
National University of Singapore
1
Introduction
ORA-SS Data Model
Motivating Example
Generation of XQuery View Definitions
Conclusion
2
3
XML
Publish data on the Web
Exchange data for Internet-based business applications
XML Views
Secure the source data
Provide application-specific views of the source data
4
SilkRoute [14]
Export relational data into XML
Complex – two different languages to define and query XML views over relational data respectively
XPERANTO [6]
Export relational data into XML and evaluate XML queries over XML views of relational data
Ignore semantics in defining XML views
Xyleme [11]
Define XML views over XML data based on DTD with selection and join operators, etc,.
Do not support complex XML views such as involving swap operators
ActiveView [3]
Define views with active features (method calls & triggers) over XML repository
Do not support semantics checking
5
Major commercial databases
Oracle 9i [22]
Export relational data to materialized XML views using
SQLX, XSQL or XML SQL Utility (XSU)
Microsoft SQL Server [21]
Create XML view of existing relational data using XML
Schema
IBM DB2 [23] (XTABLE – derived from
XPERANTO [6])
Define XML views of relational tables with XQuery
Difficult to write view definitions
6
In related works
Ignore semantics in source data
Useful for validate XML views
Difficult to use languages to define XML views
Long and complex
Hard to understand
A novel approach proposed in this paper
Adopt a semantically-rich data model – ORA-SS
Automatically generate view definitions in query expression from ORA-SS views
7
8
Object class
An entity type in Entity-Relationship
An element in XML documents
Relationship type
Relationship among object classes
Attribute
Property of object class or relationship type
Key attribute, single-value attribute, multi-value attribute, etc,.
supplier sp , 2, 1: n, 1:n sno sname part pno pname price sp spj,3,1:n,1:n project spj jno jname qty
A ORA-SS schema diagram
9
Distinguish between object class, attribute and relationship type
Differentiate attributes of object class and relationship type
Enable us to design valid XML views
That is, the views do not violate semantics in source data
Other data model such as OEM, Dataguide cannot support all the semantics above
10
Extract ORA-SS schema from XML data
Enrich ORA-SS schema with necessary semantics
Design valid XML views based on ORA-SS
Selection, drop, join & swap operators
11
12
Use XQuery to express XML views
Long and complex
Hard to understand
The following example illustrates this
<db>
<supplier sno=”s001” >
<part pno=”p001”> <price>100</price> </part>
<part pno=”p002”> <price>120</price> </part>
</supplier>
<supplier sno=”s002”>
<part pno=”p002”> <price>40</price> </part>
<part pno=”p003”> <price>30</price> </part>
</db> part supplier sno part sp , 2, 1:n, 1:n swap sp pno price
Figure 6. The ORA-SS Source Schema sp , 2, 1:n, 1:n pno supplier sp sno price
Figure 7. The ORA-SS View Schema applied with a swap operator
Figure 5. A source XML file
13
supplier supplier sno : s001 part part sno: s002 part part swap pno: p001 price:
100 pno: p002 price:
120 pno: p002 price:
40 pno: p003 price:
30
Figure 8. The ORA-SS instance diagram for source schema in Figure 6
Instance diagrams part pno: p001 supplier part pno: p002 supplier supplier part pno: p003 supplier sno: s001 price:
100 sno: s001 price:
120 sno: s002 price:
40 sno: s002 price:
30
Figure 9. The ORA-SS instance diagram for view schema in Figure 7
•Swap operator will keep the relationship set’s attributes in the current position
•For each part, we retrieve all suppliers that supply this part
14
User can write the XQuery expression for the view.
part supplier sno pno price
Figure 6. The ORA-SS Source Schema sp , 2, 1:n, 1:n part sp , 2, 1:n, 1:n swap pno sp supplier sp sno price
Figure 7. The ORA-SS View Schema applied with a swap operator
User need to know price is an attribute of relationship type sp
1. let $pno_set := distinct-values($in//part/@pno)
2. let $sno_set := distinct-values($in//supplier/@sno)
3. return <db>
4. for $p_no in $pno_set
5. let $p := $in//part[@pno=$p_no]
6. return <part pno=”{$p_no}”>
7. { for $s_no in $sno_set
8. where some $p1 in $in//part
9. satisfies ( exists($p1[@pno=$p_no]) and
10. exists($p1[ancestor::supplier/@sno=$s_no]) )
11. let $s :=$in//supplier[@sno=$s_no]
12. return <supplier sno=”{$s_no}”>
13. {$s/part[@pno=$p_no]/price}
14. </supplier>
15. }
16. </part>
17. </db>
Figure 10. XQuery expression for the view 15
XQuery expression for ORA-SS view
Complex and hard to understand
Prone to making error
User need to know where to put the related relationship attribute in the correct position in the view!
Solution:
Design views using view operators based on ORA-SS
Guarantee the designed view is valid in terms of semantics
Automatically generate XQuery expression for ORA-SS view
16
17
1.
•
•
Generate query expression of each object class in an ORA-SS view
A
expression: For, Let, Where, Order by and Return clauses
Often used for restructuring data
2.
Combine query expressions of all object classes together according to the tree structure of the view
18
It is straightforward to automatically generate
For/Let and Return clauses for an object class in the view
However, it is not a trivial task to generate the condition constrains in Where clause for an object class in the view
Many different object classes in source schema may determine the values of a given object class in the view.
Therefore, it is important to know all the relevant object classes in source schema for a given object class in the view.
19
Intuitively, the values of an object class in a view can only be determined by the values of object classes in the path from the root to the object class in the view.
Definition 1.
For any object class o in an ORA-SS view, the path from the root of the view to o is called the vpath (viewpath) of o .
The object classes that occur in the vpath of o may influence the values of o in the view.
20
Three types of object classes in the vpath of an object class o in an ORA-SS view
Type I: originate from the ancestors or descendants of o in the source schema.
Type II: originate from the descendants of o ’s ancestors in the source schema.
Type III: originate from the object classes in another source schema, whose ancestor or descendant has a key-foreign key reference with o ’s ancestor or descendant.
21
Generate condition constraints in where clause
For each type of object class in vpath of o
Present a set of rules to generate where conditions for o
In the following rules
vo refers to an arbitrary object class in o ’s vpath
o_no and vo_no refer to the key attribute of o and vo
vo is processed before o, we can employ vo_no when generating where conditions for o
$in represent the source document
$in vo vo_no o o_no source
Example schema
22
Rule Type I_A
(
If vo is an ancestor of o in source schema, then the following condition constraints (red color) are generated in the Where clause
$in vo vo_no vo vo_no let $ono_set := distinct-values($in//o/@o_no) for $o_no in $ono_set where some $vo1 in $in//vo satisfies ( exists( $vo1[@vo_no=$vo_no]) and exists($vo1[descendant::o/@o_no=$o_no]) ) o o source o_no view o_no
Figure 12(a). The case for Rule Type I_A
Figure 12(b). Where condition generated for o in Rule Type I_A
23
Rule Type I_B
If vo is a descendant of o in source schema, then the following condition constraints (red color) are generated in the Where clause
$in o vo let $ono_set := distinct-values($in//o/@o_no) for $o_no in $ono_set where some $vo1 in $in//vo satisfies ( exists( $vo1[@vo_no=$vo_no]) and exists($vo1[ancestor::o/@o_no=$o_no]) ) o_no vo_no o vo source vo_no view o_no
Figure 13(a). The case for Rule Type I_B
Figure 13(b). Where condition generated for o in Rule Type I_B
24
Rule Type II_A
Case (A): LCA is in the vpath of o
If vo is a Type II object class in o ’s vpath and the
Lowest Common Ancestor of vo and o in source schema, say LCA , is in the vpath of o in the view , then there is no need to generate a condition constraint in the Where clause for the restriction of vo on o .
$in
LCA
LCA vo vo o o source view
Figure 14(a). The case for Rule Type II_A vo
1
LCA
1
LCA
1
LCA
2
LCA
2 vo
1 vo
2 o
1 vo
2 o
2 o
1 o
2 source instance
Figure 14(b). The instance case for Rule Type II_A view instance
25
Rule Type II_B
Case (B): LCA is not in the vpath of o
If vo is a Type II object class in o
’s vpath and the Lowest
Common Ancestor of vo and o , say LCA , is not in the vpath of o in the view, then the following condition constraints
(red color) are generated in the Where clause.
$in vo
LCA o vo vo_no let $ono_set := distinct-values($in//o/@o_no) for $o_no in $ono_set where some $LCA in $in//LCA satisfies ( exists($LCA//o[@o_no=$o_no]) and exists($LCA//vo[@vo_no=$vo_no]) ) o vo_no source o_no view
Figure 15(a). The case for Rule Type II_B o_no
Figure 15(b). Where condition generated for o in Rule Type II_B
26
Rule Type III_A (
)
Case (A): referenced object class is in o ’s vpath
If vo is a descendant of the referenced object class in source schema and the referencing object class is in o ’s vpath in the view, then there is no need to generate a condition constraints for the restriction of vo on o
$in1 referenced
$in2 o vo r_no vo_no o_no vo_no referencing vo referencing o source r_no o_no
Figure 16. The case for rule type III_A view
27
r_no
Rule Type III_B
Case (B): o is the referencing object class itself
If vo is a descendant of the referenced object class and o is the referencing object class itself in source schema, then the following condition constraints are generated
$in1 referenced
$in2 o (referencing) vo let $ono_set := distinct-values($in//o/@o_no) for $o_no in $ono_set where some $referenced in $in1//referenced satisfies ( exists($referenced[@r_no=$o_no]) and exists($referenced[descendant::vo/@vo_no=$vo_no]) ) o_no vo_no vo o (referencing) vo_no source o_no
Figure 18(a). The case for rule type III_B view
Figure 18(b). Where condition generated for o in Rule Type III_B
28
Rule Type III_C
Case (C): referenced object class is not in o ’s vpath and o is an ancestor of referencing object class
If vo is a descendant of the referenced object class and o is an ancestor of the referencing object class in the source, and the referencing object class is not in o’s vpath , then the following condition constraints (red color) are generated in the Where clause
$in1 referenced
$in2 o vo r_no vo_no o_no vo_no o vo referencing source r_no o_no
Figure 17(a). The case for rule type III_C view let $ono_set := distinct-values($in//o/@o_no) for $o_no in $ono_set where some $referenced in $in1//referenced satisfies ( exists($referenced[descendant::vo/@vo_no=$vo_no]) ) and some $referencing in $in2//referencing satisfies ( exists($referencing[@r_no=$referenced/@r_no]) and exists($referencing[ancestor::o/@o_no=$o_no])
)
Figure 17(b). Where condition generated for o in Rule Type III_C 29
$in1
Rule Type III_D
Case (D): referencing object class is not in o ’s vpath and o is a descendant of referencing object class
If vo is a descendant of the referenced object class and o is a descendant of the referencing object class in the source, and the referencing object class is not in o’s vpath , then the following condition constraints are generated
$in2 let $ono_set := distinct-values($in2//o/@o_no) vo referenced referencing for $o_no in $ono_set r_no vo r_no o vo_no o where some $referenced in $in1//referenced satisfies ( exists($referenced[descendant::vo/@vo_no=$vo_no]) ) and some $referencing in $in2//referencing satisfies ( exists($referencing[@r_no=$referenced/@r_no]) and exists($referencing[descendant::o/@o_no=$o_no]) ) vo_no source o_no o_no view
Figure 19(a). The case for Rule Type III_D
Figure 19(b). Where condition generated for o in Rule Type III_D
In the cases where vo is an ancestor of the referenced object class in source schema, similar rules can be easily derived.
30
Rule Type I & Type II
Consider all different cases where vo is in the same source schema as o
Rule Type III
Consider cases where vo is in another source schema different from o’ s source schema
In summary, all the rules above consider all possible cases in designing XML views using our query operators
31
The ORA-SS view input in the algorithm is valid
First generate a set of let clauses for all object classes in the view in pre order
Generate root element for the view
Process each child of the root from left to right
Algorithm Generate_View_Definition
Input: view schema v; source schema s
Output: view definition of v
1.
for each object class o in the view{
2 generate a let clause:
“let $ono_set := distinct-alues($in//o/@o_no)”
3 }
4 generate the start tag for root of the view:
5
“return <root>”
6 for each child o of the root of v {
7 generate a start bracket: “{“
8 Generate_Objectclass_Definition(o, v) ;
9 generate a end bracket: “}”
10 }
11 generate the end tag for root of the view:
“</root>”
Figure 20. Algorithm to generate view definition
32
Algorithm Generate_ObjectClass_Definition
Input: object class o; view schema v source schema s
Output: view definition of o and its descendants
1. generate a for clause
“for $o_no in $ono_set”
2. generate an empty where clause for o ;
3. for each object class vo in the vpath of o { // top down
4. if vo belongs to type I {
5. ProcessTypeI(vo, o) ;
//generate a where condition based on rules for Type I
6. }
7. if vo belongs to type II {
8. ProcessTypeII(vo, o) ;
//generate a where condition based on rules for Type II
9. }
10. if vo belongs to type III {
11. ProcessTypeIII(vo, o) ;
//generate a where condition based on rules for Type III
12. }
13. append the generated condition in the where clause ;
14.}
15. if there is any selection operator applied to o {
16.
generate a where condition for all the operators in the where clause;
17.}
18. generate a let clause:
“let $o := $in//o[@o_no = $o_no]”
19. generate a return clause :
“return
<o o_no=”{$o_no}” distinct($o/@attributes)>”
// construct o and those attributes shown as attributes of o in the source file
20. for each attribute of o shown as a sub element of o in the source file {
21. generate it as a sub element of o:
“{distinct($o/@attribute)}”
22. }
23. if o has no child {
24. generate an end tag for o: “</o>”
25. return the generated definition;
26. }
27. else {
28. for each child object class co of o {
29. generate a start bracket: “{“
30. Generate_Objectclass_Definition(co, v, s) ;
31. generate an end bracket: “}“
32. }
33. generate an end tag for o: “</o>”
34. return the generated definition;
35. }
33
Figure 21. Algorithm to generate object class definition
34
Our motivation
Complexity of defining and querying XML views using XQuery
Propose a novel approach
Automatically generate XQuery view definitions from views defined based on ORA-SS
Application of our approach
Materialized XML views
Map queries on XML views into the equivalent queries on source XML data
35
Advantages of our work
The first work to employ a semantic data model for the design and query of XML views
Provides a facility to design valid XML views
Provides a fast and user-friendly approach to retrieve XML data via views
Ongoing work
XML views in the case where XML data are stored in traditional databases
36
7.
8.
9.
10.
11.
12.
13.
1.
2.
3.
4.
5.
6.
14.
15.
16.
21.
22.
23.
24.
17.
18.
19.
20.
25.
S. Abiteboul, D. Quass, J. McHugh, et. al., “The lorel query language for semistructured data”, International Journal of Digital
Libraries, Volume 1, No. 1, pp. 68-88, 1997.
S. Abiteboul. On views and XML. 18th ACM Symposium on Principles of Database Systems, pp. 1-9, 1999.
S. Abiteboul, S. Cluet, L. Mignet, et. al., “Active views for electronic commerce”, VLDB, pp.138-149, 1999.
S. Abiteboul, V, Aguilear, S, Ailleret, et. al., “XML repository and Active Views Demonstration”, VLDB Demo, pp.742-745, 1999.
C. Baru, A. Gupta, B. Ludaescher, et. al., “XML-Based Information Mediation with MIX”, ACM SIGMOD Demo, 1999.
M. Carey, J. Kiernan, J. hanmugasundaram, et. al., “XPERANTO: A Middleware for Publishing Object-Relational Data as XML
Documents”, VLDB, pp. 646-648, 2000.
M. Carey, D. Florescu, Z. Ives, et. al., “XPERANTO: Publishing Object-Relational Data as XML”, WebDB Workshop, 2000.
Y.B. Chen, T.W. Ling, M.L. Lee, “A Case Tool for Designing XML Views”, DIWeb Workshop, 2002.
Y.B. Chen, T.W. Ling, M.L. Lee, “Designing Valid XML Views”, ER Conference, 2002
V. Christophides, S. Cluet, J. Simeon,“On Wrapping Query Languages and Efficient XML Integration”, SIGMOD, pp. 141-152, 2000.
S. Cluet, P. Veltri, D. Vodislav, “Views in a large scale xml repository”, VLDB, pp. 271-280, 2001.
G. Dobbie, X.Y Wu, T.W Ling, M.L Lee, “ORA-SS: An Object-Relationship-Attribute Model for SemiStructured Data”, Technical
Report TR21/00, School of Computing, National University of Singapore, 2000.
C. Fan, J. Funderburk, H. Lam, Et. al., “XTABLES: Bridging Relational Technology and XML”, IBM Research Report,2002. http://www7b.boulder.ibm.com/dmdd/library/techarticle/0203shekita/0203shekita.pdf
M. Fernandez, W. Tan, D. Suciu, “Efficient Evaluation of XML Middleware Queries”, ACM SIGMOD, pp. 103-114, 2001.
M. Fernandez, W. Tan, D. Suciu, “SilkRoute: Trading Between Relations and XML”, World Wide Web Conference, 1999.
J. Shanmugasundaram, E. Shekita, R. Barr, et. al., “Efficiently Publishing Relational Data as XML Documents”, VLDB, pp. 65-76,
2000.
J. Shanmugasundaram, J. Kiernan, E. Shekita, et. al., “Querying XML Views of Relational Data”, VLDB, pp. 261-270, 2001.
“XML Schema”, W3C Recommendation, 2001. http://www.w3.org/XML/Schema
“XQuery: A Query Language for XML”, W3C Working Draft, 2002. http://www.w3.org/XML/Query
“XML Path Language”, W3C Recommendation, 1999. http://www.w3.org/TR/xpath
Microsoft Corp. http://www.microsoft.com/XML.
Oracle Corp. http://www.oracle.com/XML.
IBM Corp. http://www.ibm.com/XML.
[24] I. Manolescu, D. Florescu, D. Kossmann, “Answering XML Queries over Heterogeneous Data Sources”, VLDB Conf, 2001, pp.241-25
[25] S. Ceri, S. Comai, E. Damiani, et. al., “XML-GL: a graphical language of querying and restructuring XML documents”, WWW
Conf, pp. 151-165, 1999
37
K
Source schema 1
F
L
A
Source schema 2
Type I
B C
E
D
Type III
O H
G
M
P
Type I
Figure 11(a). Two simplified ORA-SS source schema
N
J
Type II
Vpath of O
B
P
J
K
O
...
...
...
Figure 11(b). An simplified ORA-SS view schema
38
i.
ii.
iii.
iv.
v.
For clause evaluates the expression and iterates over the items in the resulting sequence, binding the variable to each item in turn.
Let clause binds a variable to the result of its associated expression (a set of values) without iteration .
Where clause serves as a filter for the tuples of variable bindings generated by the For and Let clauses
Order by clause determine the order of the tuple stream
Return clause is evaluated once for each tuple in the tuple stream and construct the instances of the XML data.
39
supplier sp , 2, 1: n, 1:n sno part sp pno
Source Schema price swap part sp , 2, 1: n, 1:n pno supplier sp sno
View Schema price
40