Translate Graphical XML Query Language to SQLX Wei Ni Tok Wang Ling

advertisement
Translate Graphical XML Query
Language to SQLX
Wei Ni
Tok Wang Ling
Department of Computer Science
University of Singapore
{niwei, lingtw}@comp.nus.edu.sg
7/1/2016
DASFAA 2005, Beijing, China
1
Roadmap
 ORA-SS and overview of our project
 GLASS query
 Translation from GLASS to SQLX
• What is SQLX
• Preprocessing
• Translation
 Conclusion and future works
7/1/2016
DASFAA 2005, Beijing, China
2
1. ORA-SS
 ORA-SS [6] (Object-Relational-Attribute model for SemiStructured data ) is a rich semantic model for semistructured data
• Represents both the tree-like data structure and the relationship
types contained in the data set;
• Distinguishes relationship attributes from object attributes;
• Uses Object ID indicating different object instances rather than
element instances.
 It is different from ER diagram in that ORA-SS represent the
hierarchical structure of the semi-structured data.
6.
G. Dobbie, X. Y. Wu, T. W. Ling, M. L. Lee. ORA-SS: An Object-Relationship-Attribute Model for
Semistructured Data. TR21/00, Technical Report, Department of Computer Science, National University
of Singapore, December 2000.
7/1/2016
DASFAA 2005, Beijing, China
3
An XML example
binary relationship type jm
between project and member
ternary relationship
type jmp among
project, member
and publication
project
jm, 2, 1:n, 1:n
J#
member
Jname
jm
jmp, 3, 0:n, 1:n
M#
Mname
age
*
qualification
The DTD of the example dataset
job_title
degree
year
publication
job_title is an
attribute of the
relationship
type jm
university
P#
title
<!ELEMENT project (Jname?, member+)>
<!ATTLIST project project_id ID #REQUIRED>
<!ELEMENT Jname PCDATA>
<!ELEMENT member (Mname, age, job_title+, publication*, qualification*)>
<!ATTLIST member member_id CDATA #IMPLIED >
<!ELEMENT Mname PCDATA>
<!ELEMENT age PCDATA>
<!ELEMENT job_title PCDATA>
<!ELEMENT qualification PCDATA>
<!ATTLIST qualification degree CDATA #IMPLIED
university CDATA #IMPLIED
year CDATA #IMPLIED>
<!ELEMENT publication (review*)>
<!ATTLIST publication pub_id CDATA #IMPLIED
title CDATA #IMPLIED>
<!ELEMENT review PCDATA>
*
review
An example ORA-SS schema
Object Relations
project (J#, Jname)
member (M#, Mname, age
qualification(degree,
university,
year)*)
publication (P#, title, (review)*)
Relationship Relations
jm (J#, M#, job_title)
jmp (J#, M#, P#)
The ORDB storage schema of the example
XML data set
7/1/2016
DASFAA 2005, Beijing, China
4
Overview of our project
 In our project, the XML data is stored in ORDB
(e.g. Oracle 9i), where
• Relationship types and object classes are separately
stored;
• Relationship type attributes and object attributes are
differentiated;
• Multi-valued attributes and composite attributes are
stored as nested tables.
 We choose SQLX in our translation rather than
XQuery because SQLX supports SQL-in-XML-out
in ORDB (e.g. Oracle 9i). Meanwhile, we also
have translation to XQuery.
7/1/2016
DASFAA 2005, Beijing, China
5
2. GLASS query
 GLASS [12] (Graphical query LAnguage for SemiStructured data) is a high expressive graphical
query language for semi-structured data, which
is designed on the base of ORA-SS.
• Separates the complex query logic from the query
graph (the query graph is the graphical part of a
GLASS query).
• Considers relationship types in querying XML data.
12. W. Ni, T. W. Ling. GLASS: A Graphical Query Language for Semi-Structured Data. DASFAA 2003.
7/1/2016
DASFAA 2005, Beijing, China
6
2. GLASS query (Cont.)
 A typical GLASS query consists of four parts
(1) Left Hand Side Graph (LHS graph) – denotes the basic
conditions of a query (which can be different from the
structure of source schema, where we shall do view validation
first [3]);
(2) Right Hand Side Graph (RHS graph) – defines the output
structure of the query result;
(3) Link Set – specifies the bindings between the RHS graph and
LHS graph. When two graph entities are linked, they are
visually connected by a line, which means the data type and
value of the entity in the RHS graph are from the
corresponding linked entity in the LHS graph.
(4) Condition Logic Window (CLW) – It is an optional part
where users write conditions and constructions that are difficult
to draw, which includes Logic expressions, Mathematic
expressions, Comparison expressions and IF-THEN statements.
7/1/2016
DASFAA 2005, Beijing, China
7
Original Schema
2. GLASS query – example
project
jm, 2, 1:n, 1:n
J#
member
Jname
jm
Example 1.
Mname
Find the member whose age is less than 35, and he either has
taken part in less than 5 projects or written more than 6 publications in
some of the projects he attended; display the member id and name.
member
:A:
To group publication
under each pair of
member and project in the
ternary relationship type
jmp
age
age
< 35
member
_group :B:
jm, 2
project
CNT<5
_group :C: M#
jmp, 3
Mname
publication
*
qualification
job_title
degree
year
publication
university
P#
To group project under
member in the binary
relationship type jm
The hierarchical position
of member and project is
swapped in this query.
jmp, 3, 0:n, 1:n
M#
title
*
review
The object node with
name “member” is
linked with the object
node with the same
name in the LHS graph,
which means the
member in the RHS
graph (the result view)
is from the member in
the LHS graph.
CNT>6
CLW
A AND (B OR C);
The GLASS query of Example 1.
7/1/2016
DASFAA 2005, Beijing, China
8
Original Schema
2. GLASS query – example
project
jm, 2, 1:n, 1:n
J#
member
Jname
jm
Example 1.
jmp, 3, 0:n, 1:n
M#
Mname
Find the member whose age is less than 35, and he either has
taken part in less than 5 projects or written more than 6 publications in
some of the projects he attended; display the member id and name.
age
:A:
There are 3 condition
identifiers defined in the
LHS graph, “A”, “B” and
“C”, each condition
identifier is quoted by a
pair of “:”s.
age
< 35
qualification
job_title
degree
year
publication
university
P#
member
*
title
*
review
member
_group :B:
jm, 2
project
CNT<5
_group :C: M#
jmp, 3
Mname
publication
The default logic “AND”
is replaced by the logic
expressions in the CLW.
CNT>6
CLW
A AND (B OR C);
The GLASS query of Example 1.
7/1/2016
DASFAA 2005, Beijing, China
9
3. Translation from GLASS to SQLX
 What is SQLX [7]
• SQLX (aka. SQL/XML) is an XML-related specification expanded
on SQL.
• SQLX combines the features in both XML document processing
and the traditional SQL
• Three fundamental SQLX functions:
• XMLELEMENT
• XMLATTRIBUTES
• XMLAGG
 Why SQLX
• An extended standard on SQL, which is supported by Oracle,
IBM, Microsoft, etc.
• It can be processed by ORDB systems.
7.
Information technology -- Database languages -- SQL -- Part 14: XML-Related
Specifications. ISO/IEC 9075-14:2003
7/1/2016
DASFAA 2005, Beijing, China
10
3. Translation from GLASS to SQLX
– Preprocessing
 Expansion of the simple projection;
 Expansion of the abbreviated RHS graph;
 Construction of the condition tree from LHS
graph and CLW.
• In comparison with the condition tree in TQL [13],
our condition tree
• Contains the information of relationship types;
• Can represent aggregation and restructuring (such as swap);
• Expresses logic in CLW.
13. Y. Papakonstantinou, M. Petropoulos and V.Vassalos. QURSED: Querying and Reporting
Semistructured Data. ACM SIGMOD 2002, Jun 4-6, Madison, Wisconsin, USA.
7/1/2016
DASFAA 2005, Beijing, China
11
3. Translation from GLASS to SQLX (Preprocessing)
Original Schema
– The Generation of condition tree
project
jm, 2, 1:n, 1:n
J#
member
Jname
Step 1. Initializing the condition tree.
jm
Link to the member in
the RHS graph
jmp, 3, 0:n, 1:n
M#
Mname
age
*
qualification
job_title
degree
year
publication
university
P#
member
:A:
member
project
age
< 35
age
< 35
CNT<5
_group :C: M#
jmp, 3
_group :B:
jm, 2
project
CNT<5
Mname
publication
A AND (B OR C);
The original GLASS query
7/1/2016
The condition tree is
initially a copy of the
LHS graph from the
original GLASS query.
_group :C:
jmp, 3
publication
CNT>6
CLW
*
review
member
:A:
_group :B:
jm, 2
title
CNT>6
CLW
A AND (B OR C);
The CLW remains
unchanged in this step.
The condition tree after Step 1.
DASFAA 2005, Beijing, China
12
3. Translation from GLASS to SQLX (Preprocessing)
– The Generation of condition tree
Step 2. Decomposing the LHS graph (making duplicate nodes as necessary).
This is the box of the group
entity that consists of
“member” and “project”
The red dot line means that
the 3 member nodes should
be the same, i.e. the same
variable in SQLX or XQuery
expressions.
Link to the member in the RHS graph
Link to the
member in the
RHS graph
member
:A:
_group :B:
jm, 2
project
age
< 35
member
:A:
CNT<5
_group :C:
jmp, 3
publication
age
<35
member
project
member
:B:
_group
jm, 2
project
CNT<5
:C:
_group
jmp, 3
the box
publication
CNT>6
CNT>6
CLW
A AND (B OR C);
The condition tree after Step 1.
7/1/2016
CLW
A AND (B OR C);
The condition tree after Step 2.
DASFAA 2005, Beijing, China
13
3. Translation from GLASS to SQLX (Preprocessing)
– The Generation of condition tree
Step 3. Adding the logic expressions (in CLW) into the condition tree.
If there are any existential and universal quantifiers in
CLW, they will be added also in this step.
Clip A and the logic node over B
and C with a logic node “AND”
Link to the member in the RHS graph
Link to the member in the RHS graph
[AND]
Clip B and C first with
a logic node “OR”
[OR]
member
member
:B:
:A:
_group
jm, 2
project
CNT<5
age
<35
project
member
member
:B:
:A:
:C:
_group
jm, 2
_group
jmp, 3
publication
CNT>6
project
CNT<5
age
<35
project
member
:C:
_group
jmp, 3
publication
CNT>6
The condition tree after Step 3.
CLW
A AND (B OR C);
The condition tree after Step 2.
7/1/2016
member
The “AND” node here is necessary to differentiate the
following two different query logic:
- A AND B OR C
- A AND (B OR C)
DASFAA 2005, Beijing, China
14
3. Translation from GLASS to SQLX (Preprocessing)
– The Generation of condition tree
Step 4. Eliminating universal quantifiers in the condition tree. (not applied in this
example)
Link to the member in the RHS graph
[AND]
 Comments:
[OR]
member
member
project
member
:B:
:A:
:C:
_group
jm, 2
project
CNT<5
age
<35
_group
jmp, 3
publication
CNT>6
The condition tree is a combination
of LHS graph and CLW, where we
generate the WHERE clauses in the
SQLX expressions directly by traversing
the condition tree instead of the original
LHS graph and/or CLW.
The final condition tree.
7/1/2016
DASFAA 2005, Beijing, China
15
3. Translation from GLASS to SQLX
– Translate Condition Tree and RHS graph into SQLX
 Traverse the expanded RHS graph in
depth-first order and generate the select
statements of the SQLX.
 Generate where clauses:
• Render join for parent-child/ancestordescendant relations (such relations on a
XPath are performed by joining a series of
tables in ORDB);
• Specify conditions for linked nodes by
traversing the condition tree.
7/1/2016
DASFAA 2005, Beijing, China
16
3. Translation from GLASS to SQLX
– Translate Condition Tree and RHS graph into SQLX
[AND]
[OR]
member
member
:B:
:A:
_group
jm, 2
project
CNT<5
age
<35
member
Traverse the (expanded) RHS
graph and obtain the query
block below:
project
member
:C:
_group
jmp, 3
M#
Mname
publication
CNT>6
Preprocessed GLASS query for translation
For Each Node N in DF-Traversing the
expanded RHS graph
The SQLX query expressions
SELECT XMLELEMENT(NAME “member”,
XMLATTRIBUTES (M1.M# AS “member_id”)
XMLELEMENT (NAME “Mname”, M1.Mname) )
FROM member M1
WHERE …
CASE (N is the root of RHS) THEN generate a block of SELECT XMLELEMENT;
SELECT XMLELEMENT (NAME “N” …) FROM… WHERE…
CASE (N is an attribute in XML) THEN generate a block of SELECT XMLATTRIBUTES;
SELECT XMLATTRIBUTES (… AS “N” …) FROM… WHERE…
CASE (other kind of N) THEN generate a block of SELECT XMLAGG(XMLELEMENT); //i.e. N is an element but not the root
SELECT XMLAGG (XMLELEMENT (NAME “N” …) FROM… WHERE…)
7/1/2016
DASFAA 2005, Beijing, China
17
3. Translation from GLASS to SQLX
– Translate Condition Tree and RHS graph into SQLX
[AND]
[OR]
member
member
:B:
:A:
_group
jm, 2
project
CNT<5
age
<35
member
Traverse the (expanded) RHS
graph and obtain the query
block below:
project
member
:C:
_group
jmp, 3
M#
Mname
publication
CNT>6
The SQLX query expressions
M1.age <35
AND ( M1.M# IN (SELECT M# FROM member
WHERE (SELECT COUNT(J#) FROM jm
WHERE M# = jm.M#)<5)
OR M1.M# IN (SELECT M# FROM member
WHERE (SELECT J# FROM project
WHERE(SELECT COUNT(P#) FROM jmp
WHERE jmp.J# = project.J#
AND jmp.M# = member.M#)>6))
)
7/1/2016
DASFAA 2005, Beijing, China
SELECT XMLELEMENT(NAME “member”,
XMLATTRIBUTES (M1.M# AS “member_id”)
XMLELEMENT (NAME “Mname”, M1.Mname) )
FROM member M1
WHERE …
18
4. Conclusion & future works
 We present the translation from GLASS to SQLX.
• GLASS supports SWAPPING and GROUPING on the base of the
semantic information in ORA-SS.
• By translating into SQLX, GLASS can be processed on ORDB
systems such as Oracle 9i.
• The translation method is simple. Compared with other XML-toSQL translation works,
• The ORDB data storage preserves the semantic information in ORASS: differentiates relationship type attributes from object attributes.
• GLASS query considers the relationship information in ORA-SS which
are important to define the query semantic clearly.
• Our translation is simpler in comparison with the literature review in
[8] and supports SWAPPING, GROUPING (also Quantifiers) in
GLASS.
8.
R. Krishnamurthy, R. Kaushik, and J. F. Naughton XML-to-SQL Query Translation Literature: The State of the Art
and Open Problems. University of Wisconsin-Madison
7/1/2016
DASFAA 2005, Beijing, China
19
4. Conclusion & future works
 Future works
• Query optimization on the base of ORA-SS
• Decrease the cost of join
• Cost model and optimized query plan
• Improving our case tool (so far partially
developed)
7/1/2016
DASFAA 2005, Beijing, China
20
References:
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
S. Ceri, S.Comai, E. Damiani, P.Fraternali, S. Paraboschi, and L.Tanca. XML-GL: a graphical language of querying
and restructuring XML documents. In Proc. WWW8, Toronto, Canada, May 1999.
S. Ceri, S. Comai, E. Damiani, P. Fraternali, and L. Tanca. Complex Queries in XML-GL. SAC(2) 2000:888-893.
Y. B. Chen, T. W. Ling, M. L. Lee. Designing Valid XML Views. 21st International Conference on Conceptual Modeling
(ER'2002), pp 463-477, October 7-11, 2002, Tampere, Finland.
S. Cohen, Y. Kanza, Y. Kogan, W. Nutt, Y. Sagiv and A. Serebrenik. Equix – Easy Querying in XML Databases. In
proceedings of Webdb’98 – The Web and Database Workshop, 1998.
S. Comai, E. Damiani, P. Fraternali. Computing Graphical Queries over XML Data. ACM Transactions on Information
Systems, Vol. 19, No. 4, October 2001, Pages 371-430.
G. Dobbie, X. Y. Wu, T. W. Ling, M. L. Lee. ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data.
TR21/00, Technical Report, Department of Computer Science, National University of Singapore, December 2000.
Information technology -- Database languages -- SQL -- Part 14: XML-Related Specifications. ISO/IEC 9075-14:2003
R. Krishnamurthy, R. Kaushik, and J. F. Naughton XML-to-SQL Query Translation Literature: The State of the Art and
Open Problems. University of Wisconsin-Madison
B. Ludaescher, Y. Papakonstantinou, and P. Velikhov. Navigation-driven evaluation of virtual mediated views. In
Proceedings of the sixth International Conference on Extending Database Technology (EDBT)(Konstanz, Germany,
March), Lecture Notes in Computer Science, vol. 1777, Springer-Verlage, New York, 2000.
L. Mark, etc. XMLApe. College of Computing, Georgia Institue of Technology.
http://www.cc.gatech.edu/projects/XMLApe/
K. D. Munroe, B. Ludaescher and Y. Papakonstantinou. Blended Browsing and Querying of XML in Lazy Mediator
System. Konstanz, Germany, March 2000.
W. Ni, T. W. Ling. GLASS: A Graphical Query Language for Semi-Structured Data. DASFAA 2003.
Y. Papakonstantinou, M. Petropoulos and V.Vassalos. QURSED: Querying and Reporting Semistructured Data. ACM
SIGMOD 2002, Jun 4-6, Madison, Wisconsin, USA.
XQuery 1.0: An XML Query Language. W3C Working Draft 22 August 2003 http://www.w3.org/TR/xquery/
XML Path Language (XPath) 2.0. W3C Working Draft 22 August 2003 http://www.w3.org/TR/xpath20/
XML Schema. http://www.w3.org/XML/Schema
7/1/2016
DASFAA 2005, Beijing, China
21
The End
Thank you very much
&
wish you have wonderful time in Beijing.
7/1/2016
DASFAA 2005, Beijing, China
22
Download