XML Structures for Relational Data
Wenyue Du, Mong Li Lee, Tok Wang Ling
Department of Computer Science
School of Computing
National University of Singapore
{duwenyue, leeml, lingtw}@comp.nus.edu.sg
Contents
1. Introduction
– Motivation
– Related Works
– Our Approach
2. Background
– XML
– XML DTD
– Semantic Enrichment
3. Proposed Relational to XML Translation
4. Comparison
5. Conclusion
1. Introduction
Outline
– Motivation
– Related Works
– Our Approach
Introduction
Motivation

XML is emerging as a standard for information
publishing on the World Wide Web. However,
the underlying data is often stored in traditional
relational databases. Some mechanism is needed
to translate the relational data into XML data.
Motivation (cont.)

The translation mechanism should:
• Generate XML structures that describe the semantics and structure of the underlying relational databases.
• Obtain properly structured XML data without unnecessary redundancies or a proliferation of disconnected XML elements.
Related Works
• [1, 5, 6] basically focus on single-relation translation. To handle a set of related relations, the relations are first denormalized into one single relation.
– The flat XML structure does not provide a good way to show the structure of the data.
– It causes a lot of redundancy.
Relations:
Dept (D#, Dname)
Employee (E#, Ename, JoinDate, D#)
Maps to:
<!ELEMENT Results (Employee*)>
<!ELEMENT Employee (EMPTY)>
<!ATTLIST Employee
  E#        CDATA  #REQUIRED
  Ename     CDATA  #IMPLIED
  JoinDate  CDATA  #IMPLIED
  D#        CDATA  #REQUIRED
  DNAME     CDATA  #IMPLIED>
Related Works (cont.)
• [7] developed a method to generate a hierarchical DTD for XML data from a relational schema.
– It lacks semantic enrichment, so it cannot handle more complex situations.
Relations:
Dept (D#, Dname)
Employee (E#, Ename, JoinDate, D#)
Maps to:
<!ELEMENT Results (Employee*)>
<!ELEMENT Employee (Dept)>
<!ATTLIST Employee
  E#        ID     #REQUIRED
  Ename     CDATA  #IMPLIED
  JoinDate  CDATA  #IMPLIED>
<!ELEMENT Dept (EMPTY)>
<!ATTLIST Dept … >
Open question for JoinDate: is it an attribute of an object or of a relationship?
Our Approach
XML structures for relational data can be obtained by the following steps:
1. Start from the Relational Schema.
2. Apply Semantic Enrichment to obtain the Semantically Enriched Relational Schema.
3. Apply the Translation Rules to obtain the ORA-SS Schema Diagram.
4. Apply the ORA-SS to XML-Schema Algorithm to obtain the XML Schema.
2. Background
Outline
– XML
– XML DTD
– Semantic Enrichment
Background / XML
XML

Basic constructs of XML:
– Element
– Attribute
– Reference (link) :
a relationship between resources (e.g. elements).
It is specified by attaching specific attributes or
sub-elements.
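As a minimal sketch of the three constructs, the Python snippet below parses a small document with the standard-library xml.etree.ElementTree module. The element and attribute names (STUDENT, SID, COURSE, ENROLLED, S_REF) are made up for illustration, and the reference is resolved by hand because ElementTree does not interpret ID/IDREF attributes itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names are illustrative, not taken from the slides.
DOC = """
<RESULTS>
  <STUDENT SID="s1"><SNAME>J. Tan</SNAME></STUDENT>
  <COURSE CODE="c1">
    <ENROLLED S_REF="s1"/>
  </COURSE>
</RESULTS>
"""

root = ET.fromstring(DOC)

# Element: a named node that may contain text or other elements.
student = root.find("STUDENT")

# Attribute: a name/value pair attached to an element.
sid = student.get("SID")

# Reference: an attribute whose value identifies another element.
# We build the lookup table ourselves and follow the link manually.
students_by_id = {s.get("SID"): s for s in root.iter("STUDENT")}
enrolled = root.find("COURSE/ENROLLED")
target = students_by_id[enrolled.get("S_REF")]
referenced_name = target.findtext("SNAME")

print(sid, referenced_name)
```

Here the ENROLLED element carries no student data of its own; it only points at the STUDENT element, which is the pattern the later slides use to avoid redundancy.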
Background / XML DTD
XML DTD
A Document Type Definition (DTD) describes the structure of an XML document.
XML document:
<RESULTS>
<CUSTOMER CID="C980054Z">
<CNAME>J. Tan</CNAME>
<AGE>36</AGE>
</CUSTOMER>
…
</RESULTS>
Corresponding DTD:
<!ELEMENT RESULTS (CUSTOMER*)>
<!ELEMENT CUSTOMER (CNAME, AGE)>
<!ATTLIST CUSTOMER CID ID #REQUIRED>
<!ELEMENT CNAME (#PCDATA)>
<!ELEMENT AGE (#PCDATA)>
Background / Semantic Enrichment
Semantic Enrichment
• Semantic enrichment is a process that upgrades the semantics of databases in order to explicitly express semantics that are implicit in the data, such as the various relationship types and cardinality constraints.
Extra information needed:
• Functional dependencies (FDs) and keys
• Inclusion dependencies (INDs), e.g.
STUDENT (S#, SNAME)
HOBBIES (S#, HOBBY)
HOBBIES[S#] ⊆ STUDENT[S#]
• Semantic dependencies (SDs) (T.W. Ling & M.L. Lee, 1995)
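To make the inclusion dependency concrete, here is a small Python sketch that tests whether every S# value in HOBBIES also appears in STUDENT. The function name satisfies_ind and the sample rows are assumptions made for illustration; the slides only state the dependency itself.

```python
def satisfies_ind(lhs_rows, lhs_attrs, rhs_rows, rhs_attrs):
    """Return True if lhs[lhs_attrs] is a subset of rhs[rhs_attrs]."""
    rhs_values = {tuple(row[a] for a in rhs_attrs) for row in rhs_rows}
    return all(
        tuple(row[a] for a in lhs_attrs) in rhs_values for row in lhs_rows
    )


# Invented sample data for STUDENT(S#, SNAME) and HOBBIES(S#, HOBBY).
student = [{"S#": "s1", "SNAME": "Tan"}, {"S#": "s2", "SNAME": "Lim"}]
hobbies = [{"S#": "s1", "HOBBY": "chess"}, {"S#": "s1", "HOBBY": "golf"}]

# HOBBIES[S#] is included in STUDENT[S#] for this data.
ind_holds = satisfies_ind(hobbies, ["S#"], student, ["S#"])

# Adding a hobby for an unknown student breaks the dependency.
orphaned = hobbies + [{"S#": "s9", "HOBBY": "tennis"}]
ind_violated = not satisfies_ind(orphaned, ["S#"], student, ["S#"])

print(ind_holds, ind_violated)
```

A check like this can only refute an inclusion dependency on a given instance; whether it holds as a constraint still has to come from the schema designer.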
Semantic Dependencies
EMPLOYEE(E#, ENAME, JOINDATE, D#)
- JOINDATE is functionally dependent on E# alone.
- Assuming JOINDATE is the date on which an employee assumes duty with the department, we say that JOINDATE is semantically dependent on {E#, D#}.
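The difference between the two kinds of dependency can be sketched in Python. The functional dependency E# → JOINDATE can be tested against data, whereas the semantic dependency cannot be derived from the data and is simply recorded as designer-supplied metadata. The helper satisfies_fd, the sample rows and the semantic_dependencies dictionary are all hypothetical.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if equal lhs values always imply equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Invented sample data for EMPLOYEE(E#, ENAME, JOINDATE, D#).
employee = [
    {"E#": "e1", "ENAME": "Ann", "JOINDATE": "2001-01-15", "D#": "d1"},
    {"E#": "e2", "ENAME": "Bob", "JOINDATE": "2001-03-01", "D#": "d1"},
]

# The FD E# -> JOINDATE holds on this instance.
fd_holds = satisfies_fd(employee, ["E#"], ["JOINDATE"])

# The SD is a statement about meaning, so it is declared rather than tested:
# JOINDATE describes the relationship between an employee and a department.
semantic_dependencies = {"JOINDATE": ("E#", "D#")}

print(fd_holds, semantic_dependencies["JOINDATE"])
```

Recording the semantic dependency this way is what later lets a translation treat JOINDATE as a relationship attribute rather than as an attribute of the employee object.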
Semantic Enrichment using SD together with FD and IND
To obtain:
• Object relations and object attributes that represent regular and weak entity types, and their properties.
• Relationship relations and relationship attributes that represent various relationship types such as binary, n-ary, recursive and ISA (inheritance), and their properties.
• Mix-type relations: we need to split them into object relations and relationship relations.
• Fragments of object relations or relationship relations that represent multi-valued attributes of entity types or relationship types.
• Cardinality constraints
An Original Relational Schema
COURSE (CODE, TITLE)
DEPT (D#, DNAME)
STUDENT (S#, SNAME)
TUTORIAL (T#, TUTORIALTITLE)
HOBBIES(S#, HOBBY)
STUDENTDEPT (S#, D#)
C_S (CODE, S#, GRADE)
ATTEND (CODE, T#, S#)
COURSEMEETING (CODE, S#,MEETINGHISTORY)
The Semantically Enriched Schema
Object Relations:
COURSE (CODE, TITLE)
DEPT (D#, DNAME)
STUDENT (S#, SNAME)
TUTORIAL (T#, TUTORIALTITLE)
Relationship Relations:
STUDENTDEPT (S#, D#)
C_S (CODE, S#, GRADE)
ATTEND (CODE, T#, S#)
Fragment of Object Relations:
HOBBIES (S#, HOBBY)
Fragment of Relationship Relations:
COURSEMEETING (CODE, S#, MEETINGHISTORY), a fragment of C_S
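One possible way to hold the enriched schema in a program is sketched below in Python. The Kind enum, the Relation dataclass and the fragment_of field are illustrative names, not part of the original method; the relation names and classifications are those listed above, with HOBBIES taken to be a fragment of STUDENT.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    OBJECT = "object relation"
    RELATIONSHIP = "relationship relation"
    OBJECT_FRAGMENT = "fragment of object relation"
    RELATIONSHIP_FRAGMENT = "fragment of relationship relation"


@dataclass(frozen=True)
class Relation:
    name: str
    attributes: tuple
    kind: Kind
    fragment_of: str = ""  # owning relation, for fragments only


ENRICHED_SCHEMA = [
    Relation("COURSE", ("CODE", "TITLE"), Kind.OBJECT),
    Relation("DEPT", ("D#", "DNAME"), Kind.OBJECT),
    Relation("STUDENT", ("S#", "SNAME"), Kind.OBJECT),
    Relation("TUTORIAL", ("T#", "TUTORIALTITLE"), Kind.OBJECT),
    Relation("STUDENTDEPT", ("S#", "D#"), Kind.RELATIONSHIP),
    Relation("C_S", ("CODE", "S#", "GRADE"), Kind.RELATIONSHIP),
    Relation("ATTEND", ("CODE", "T#", "S#"), Kind.RELATIONSHIP),
    Relation("HOBBIES", ("S#", "HOBBY"), Kind.OBJECT_FRAGMENT, "STUDENT"),
    Relation(
        "COURSEMEETING",
        ("CODE", "S#", "MEETINGHISTORY"),
        Kind.RELATIONSHIP_FRAGMENT,
        "C_S",
    ),
]


def relations_of(kind):
    """Names of all relations with the given classification."""
    return [r.name for r in ENRICHED_SCHEMA if r.kind is kind]


print(relations_of(Kind.OBJECT))
print(relations_of(Kind.RELATIONSHIP))
```

Keeping the classification next to each relation means the later translation rules can dispatch on it without re-deriving the semantics.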
3. Proposed Relational to XML Translation
Outline
– ORA-SS Model
– Relational Schema to ORA-SS Translation
– ORA-SS to XML Schema Translation
Proposed Relational to XML Translation / ORA-SS
ORA-SS Model
ORA-SS (Object-Relationship-Attribute model for Semi-Structured data)
G. Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee, “ORA-SS: An Object-Relationship-Attribute Model for Semi-structured Data”, TR 21/00, National Univ. of Singapore, 2001
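Concepts of ORA-SS (cont.)
[Figure: ORA-SS schema diagram with object classes COURSE (CODE, TITLE), STUDENT (S#, SNAME) and TUTORIAL (T#, TUTORIALTITLE). COURSE has child STUDENT1 through the binary relationship C_S (2,1:n,1:n), with relationship attribute GRADE and reference C_S_Ref to STUDENT. STUDENT1 has child TUTORIAL1 through the ternary relationship ATTEND (3,1:n,1:n), with reference T_Ref to TUTORIAL. Callouts: object class, identifier, binary relationship, ternary relationship, relationship attribute, reference.]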
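The relationship labels in the diagram, such as "2,1:n,1:n", can be read mechanically. The Python sketch below assumes the usual ORA-SS reading, in which the label gives the degree of the relationship followed by the participation constraint of the parent and then of the child, and in which "?", "*" and "+" abbreviate 0:1, 0:n and 1:n. That reading, and the names RelationshipLabel, parse_participation and parse_label, are assumptions not spelled out on the slides.

```python
from dataclasses import dataclass

# Assumed shorthand: ? = 0:1, * = 0:n, + = 1:n (None stands for "n").
SHORTHAND = {"?": (0, 1), "*": (0, None), "+": (1, None)}


@dataclass(frozen=True)
class RelationshipLabel:
    degree: int
    parent: tuple  # (min, max) participation of the parent object class
    child: tuple   # (min, max) participation of the child object class


def parse_participation(text):
    """Turn '1:n', '0:1', '?', '*' or '+' into a (min, max) pair."""
    text = text.strip()
    if text in SHORTHAND:
        return SHORTHAND[text]
    low, high = text.split(":")
    return (int(low), None if high == "n" else int(high))


def parse_label(text):
    """Parse a label of the form 'degree,parent,child'."""
    degree, parent, child = text.split(",")
    return RelationshipLabel(
        int(degree), parse_participation(parent), parse_participation(child)
    )


binary = parse_label("2,1:n,1:n")
ternary = parse_label("3,1:n,1:n")
print(binary)
print(ternary.degree)
```

Under this reading, C_S is a binary relationship in which every course has at least one student and every student takes at least one course.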
Enriched Relational Schema to ORA-SS Schema
Translation
Objectives:
• Identify object classes and their attributes
from object relations
• Identify relationship types and their
attributes from relationship relations
• Identify hierarchical structure
• Generate ORA-SS schema
Overview of Translation Rules
1. Object relation rules: to translate object relations.
2. Relationship relation rules: to translate relationship relations.
3. Combination rule: to be applied to the result obtained from the application of the object and relationship relation rules, to generate the final ORA-SS schema.
Enriched Relational Schema to ORA-SS Schema Translation
/Object Relation Translation Rules
Rule O1: Mapping object relations
STUDENT(S#, SNAME)
Maps to:
[Figure: object class STUDENT with attributes S# and SNAME; SNAME is labeled as a single-valued attribute.]
Rule O2: Mapping fragment of object relations
STUDENT(S#, SNAME)
HOBBIES(S#, HOBBY)
Maps to:
[Figure: object class STUDENT with attributes S#, SNAME and HOBBY; HOBBY is marked with * and labeled as a multivalued attribute.]
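Rules O1 and O2 can be sketched together in Python: an object relation becomes an object class with single-valued attributes, and each fragment of it contributes a multivalued attribute. The function map_object_relation and the dictionary layout of the result are illustrative choices, not the authors' implementation.

```python
def map_object_relation(name, key, attributes, fragments=()):
    """Sketch of rules O1 and O2.

    name       -- relation name, reused as the object class name
    key        -- key attribute of the relation (the identifier)
    attributes -- attributes of the object relation (rule O1)
    fragments  -- attribute lists of fragment relations (rule O2)
    """
    object_class = {
        "name": name,
        "identifier": key,
        "single_valued": [a for a in attributes if a != key],
        "multi_valued": [],
    }
    for fragment_attributes in fragments:
        # The fragment repeats the key; its other attributes are multivalued.
        object_class["multi_valued"] += [
            a for a in fragment_attributes if a != key
        ]
    return object_class


student = map_object_relation(
    "STUDENT", "S#", ["S#", "SNAME"], fragments=[["S#", "HOBBY"]]
)
print(student)
```

For the example on the slide, the result has SNAME as a single-valued attribute and HOBBY as a multivalued one, matching the diagram.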
Enriched Relational Schema to ORA-SS Schema Translation
/Relationship Relation Translation Rules
Rule R1: Mapping 1-m/1-1 relationship relation
Objectives:
• Reduce disconnected elements: use parent-child structure
• Avoid unnecessary redundancies: use references
Example:
ADVISOR(STAFF#, POSITION) // object relation
STUDENT(S#, SNAME) // object relation
STU_ADV(S#, STAFF#) //1-m relationship relation
Rule R1: Mapping 1-m/1-1 relationship relation (cont.)
Case 1:
All the objects (instances) of STUDENT participate in the relationship type STU_ADV.
STU_ADV maps to: ADVISOR as parent and STUDENT as child, through relationship STU_ADV (2,0:n,1:1).
Use parent-child structure
Rule R1: Mapping 1-m/1-1 relationship relation (cont.)
Case 2:
1. Not all the objects of STUDENT participate in STU_ADV,
or
2. STUDENT is already a child object and all the objects of ADVISOR participate in STU_ADV.
STU_ADV maps to: STUDENT as parent and ADVISOR as child, through relationship STU_ADV (2,0:1,1:n).
Use parent-child structure
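Rule R1: Mapping 1-m/1-1 relationship relation (cont.)
Case 3:
There exist objects of both STUDENT and ADVISOR that do not participate in STU_ADV.
STU_ADV maps to either:
– STUDENT as parent with child ADVISOR1 through relationship STU_ADV (2,*,?), where ADVISOR1 holds the reference A_Ref to ADVISOR; or
– ADVISOR as parent with child STUDENT1 through relationship STU_ADV (2,*,?), where STUDENT1 holds the reference S_Ref to STUDENT.
Use reference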
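The three cases of Rule R1 can be summarized as a small decision function. The Python sketch below is one reading of the slides: the function name, its parameters and the order in which the cases are tested are assumptions, and the combination that the slides do not cover (every STUDENT participates, STUDENT is already a child, but not every ADVISOR participates) falls through to the reference structure here.

```python
def choose_r1_mapping(one_side, many_side, one_total, many_total,
                      many_is_child=False):
    """Pick a structure for a 1-m relationship relation (sketch of rule R1).

    one_total     -- every object on the "one" side participates
    many_total    -- every object on the "many" side participates
    many_is_child -- the "many" side is already a child object elsewhere
    """
    if many_total and not many_is_child:
        # Case 1: nest the "many" side under the "one" side.
        return {"case": 1, "parent": one_side, "child": many_side,
                "structure": "parent-child"}
    if one_total:
        # Case 2: nest the "one" side under the "many" side.
        return {"case": 2, "parent": many_side, "child": one_side,
                "structure": "parent-child"}
    # Case 3: keep both object classes and link them with a reference.
    return {"case": 3, "parent": None, "child": None,
            "structure": "reference"}


case1 = choose_r1_mapping("ADVISOR", "STUDENT", one_total=False, many_total=True)
case2 = choose_r1_mapping("ADVISOR", "STUDENT", one_total=True, many_total=False)
case3 = choose_r1_mapping("ADVISOR", "STUDENT", one_total=False, many_total=False)
print(case1["case"], case2["case"], case3["case"])
```

With STU_ADV as the example, case 1 yields ADVISOR as parent of STUDENT, case 2 yields STUDENT as parent of ADVISOR, and case 3 leaves both at the top level.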
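Rule R2: Mapping m-n binary relationship relation
COURSE (CODE, TITLE)
STUDENT (S#, SNAME)
C_S (S#, CODE, GRADE)
Three ways to map:
1. Nest STUDENT (S#, SNAME, GRADE) directly under COURSE (CODE, TITLE).
2. Preferred mapping: keep COURSE and STUDENT as separate object classes; under COURSE, add a child STUDENT1 through relationship C_S (2,1:n,1:n) that holds the reference C_S_REF to STUDENT and the relationship attribute GRADE.
3. Keep COURSE and STUDENT as separate object classes; under STUDENT, add a child COURSE1 through relationship C_S (2,1:n,1:n) that holds the reference C_S_REF to COURSE and the relationship attribute GRADE.
[Figure: ORA-SS diagrams of the three mappings.]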
Other relationship relation rules
• A fragment of a relationship relation is translated similarly to a fragment of an object relation.
• An n-ary relationship relation is translated using reference structures. The level of each referencing object may be determined by the aggregations.
• If B ISA A, then B is mapped to a child object class (O_B) of O_A.
Enriched Relational Schema to ORA-SS Schema Translation
/Combination Rule
Combination Rule:

to be applied to the result obtained from the application
of object and relationship relation rules, and generate
the final ORA-SS schema.
Example:
PERSON(SSNO, RACE) //object relation
STUDENT(S#, SSNO, MAJOR) //object relation
DEPT(D#, DNAME) //object relation
STU_DEPT(S#, D#) //relationship relation
STUDENT ISA PERSON, and one DEPT has many STUDENTs.
In this case, STUDENT potentially has multiple parents (i.e., DEPT and PERSON).
Combination Rule:
Current solution:
• Use references (K. Williams, et al., January 2001)
-- It causes too many disconnected elements.
<!ELEMENT Results (PERSON*, STUDENT*, DEPT*)>
<!ELEMENT PERSON (EMPTY)>
<!ATTLIST PERSON
  SSNO      ID      #REQUIRED
  RACE      CDATA   #IMPLIED
  STU_REF1  IDREF   #REQUIRED>
<!ELEMENT STUDENT (EMPTY)>
<!ATTLIST STUDENT
  S#        ID      #REQUIRED
  MAJOR     CDATA   #IMPLIED>
<!ELEMENT DEPT (EMPTY)>
<!ATTLIST DEPT
  D#        ID      #REQUIRED
  DNAME     CDATA   #IMPLIED
  STU_REF2  IDREFS  #REQUIRED>
Combination Rule: (cont.)
Our approach:
• Translations are produced sequentially according to their priorities.
• The translation with the lowest priority is carried out last.
The priorities of translations (in descending order):
1. ISA, etc. semantic relationship relations and their fragments
// high semantic cohesion among these participating object classes
2. 1-1 and 1-m relationship relations and their fragments
// potentially represented as hierarchy (p-c) structure
3. m-1 relationship relations and their fragments
// potentially represented as hierarchy structure; preferably viewed as 1-m
4. m-n, n-ary relationship relations and their fragments
This rule is used to avoid or reduce potential multiple parents.
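The ordering above amounts to a sort by priority class. The Python sketch below orders pending translations so that ISA relationships are handled first and m-n or n-ary ones last; the PRIORITY table mirrors the list above, while the tuple representation and the name order_translations are made up for illustration.

```python
# Priority classes from the list above (1 is translated first).
PRIORITY = {"ISA": 1, "1-1": 2, "1-m": 2, "m-1": 3, "m-n": 4, "n-ary": 4}


def order_translations(relationships):
    """Sort (name, type) pairs by translation priority.

    sorted() is stable, so relationships of equal priority keep
    their original relative order.
    """
    return sorted(relationships, key=lambda r: PRIORITY[r[1]])


# The PERSON / STUDENT / DEPT example: one DEPT has many STUDENTs.
relationships = [("STU_DEPT", "1-m"), ("STUDENT ISA PERSON", "ISA")]
ordered = order_translations(relationships)
print(ordered)
```

Because the ISA relationship is translated first, STUDENT is already a child of PERSON by the time STU_DEPT is handled, which is what keeps STUDENT from acquiring a second parent.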
Combination Rule: (cont.)
We map STUDENT to the child object class of PERSON first. Then we map DEPT according to the 1-m relationship relation rule. Thus, we may get the following result.
[Figure: ORA-SS diagram. PERSON (SSNO, RACE) has child STUDENT (S#, MAJOR) through the relationship labeled "ISA, 2,1:?,1:1". DEPT (D#, DNAME) has child STUDENT1, which holds the reference D_S_REF to STUDENT.]
<!ELEMENT OurSolution (PERSON*, DEPT*)>
<!ELEMENT PERSON (STUDENT)>
<!ATTLIST PERSON
  SSNO     ID      #REQUIRED
  RACE     CDATA   #IMPLIED>
<!ELEMENT STUDENT (EMPTY)>
<!ATTLIST STUDENT
  S#       ID      #REQUIRED
  MAJOR    CDATA   #IMPLIED>
<!ELEMENT DEPT (EMPTY)>
<!ATTLIST DEPT
  D#       ID      #REQUIRED
  DNAME    CDATA   #IMPLIED
  D_S_REF  IDREFS  #REQUIRED>
A possible ORA-SS Schema diagram derived from the university database
Object Relations:
COURSE (CODE, TITLE)
DEPT (D#, DNAME)
STUDENT (S#, SNAME)
TUTORIAL (T#, TUTORIALTITLE)
Relationship Relations:
STUDENTDEPT (S#, D#)
C_S (CODE, S#, GRADE)
ATTEND (CODE, T#, S#)
Fragment of Object Relations:
HOBBIES (S#, HOBBY)
Fragment of Relationship Relations:
COURSEMEETING (CODE, S#, MEETINGHISTORY), a fragment of C_S
[Figure: ORA-SS schema diagram with top-level object classes STUDENT (S#, SNAME, HOBBY*), COURSE (CODE, TITLE), TUTORIAL (T#, TUTORIALTITLE) and DEPT (D#, DNAME). COURSE has child STUDENT1 through C_S (2,1:n,1:n), with reference C_S_REF and the C_S relationship attributes MEETINGHISTORY* and GRADE. STUDENT1 has child TUTORIAL1 through ATTEND (3,1:n,1:n), with reference T_REF. DEPT has child STUDENT2 through STUDENTDEPT (2,0:n,1:1), with reference D_S_REF.]
Algorithm: Mapping ORA-SS Schema Diagram to XML DTD
Input: an ORA-SS schema diagram SD
Output: an XML DTD
Begin
Start from the top of SD and proceed downward. For each object class O encountered do:
Step 1. For the sub-object classes of O, generate
<!ELEMENT O (subelementsList)>
Step 2. For each attribute A of O:
Case (1) A is a single-valued simple attribute:
<!ATTLIST O A type>
Case (2) A is a single-valued composite attribute: replace A with its components and add each of them to
<!ATTLIST O attributename type>
Case (3) A is a multivalued simple attribute:
<!ELEMENT A (#PCDATA)>
Case (4) A is a multivalued composite attribute:
<!ELEMENT A (EMPTY)>
and, for A’s components,
<!ATTLIST A componentName type>
Step 3. For each relationship attribute A under O, add A to subelementsList in <!ELEMENT O (subelementsList)>.
Case (1) A is a simple attribute:
<!ELEMENT A (#PCDATA)>
Case (2) A is a composite attribute:
<!ELEMENT A (EMPTY)>
and, for A’s components,
<!ATTLIST A componentName type>
End
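The steps above can be sketched in Python as a recursive walk over object classes. This is a minimal illustration rather than the authors' implementation: the Attribute and ObjectClass dataclasses and the function to_dtd are hypothetical, and the sketch assumes that child object classes and multivalued attributes appear in the content model with "*" (as in the example DTDs) and that an object class with no sub-elements is declared empty. It emits the bare keyword EMPTY, which is the form DTD syntax requires, where the slides write (EMPTY).

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    dtd_type: str = "CDATA #IMPLIED"
    multivalued: bool = False
    components: list = field(default_factory=list)  # non-empty if composite


@dataclass
class ObjectClass:
    name: str
    attributes: list = field(default_factory=list)
    relationship_attributes: list = field(default_factory=list)
    children: list = field(default_factory=list)


def attlist(owner, attrs):
    body = " ".join(f"{a.name} {a.dtd_type}" for a in attrs)
    return f"<!ATTLIST {owner} {body}>"


def sub_element(attr):
    """Declare an attribute that has to become a sub-element."""
    if attr.components:  # composite: empty element carrying its components
        return [f"<!ELEMENT {attr.name} EMPTY>",
                attlist(attr.name, attr.components)]
    return [f"<!ELEMENT {attr.name} (#PCDATA)>"]


def to_dtd(obj):
    """Return the DTD declarations for obj and its descendants."""
    sub_elements, own_attrs, extra = [], [], []
    for a in obj.attributes:                       # Step 2
        if a.multivalued:                          # cases (3) and (4)
            sub_elements.append(a.name + "*")      # assumption: listed with *
            extra += sub_element(a)
        else:                                      # cases (1) and (2)
            own_attrs += a.components or [a]
    for a in obj.relationship_attributes:          # Step 3
        sub_elements.append(a.name + ("*" if a.multivalued else ""))
        extra += sub_element(a)
    sub_elements += [c.name + "*" for c in obj.children]   # Step 1
    content = f"({', '.join(sub_elements)})" if sub_elements else "EMPTY"
    lines = [f"<!ELEMENT {obj.name} {content}>"]
    if own_attrs:
        lines.append(attlist(obj.name, own_attrs))
    lines += extra
    for child in obj.children:                     # proceed downward
        lines += to_dtd(child)
    return lines


student = ObjectClass("STUDENT", attributes=[
    Attribute("S#", "ID #REQUIRED"),
    Attribute("SNAME"),
    Attribute("HOBBY", multivalued=True),
])
student_dtd = to_dtd(student)
print("\n".join(student_dtd))
```

For the STUDENT object class, the sketch produces an element whose content model is (HOBBY*), an attribute list with S# and SNAME, and a #PCDATA declaration for HOBBY.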
The obtained XML structures (DTD)
<!ELEMENT UNIVERSITY (COURSE*, STUDENT*, DEPT*, TUTORIAL*)>
<!ELEMENT COURSE (STUDENT1*)>
<!ATTLIST COURSE
  CODE     ID     #REQUIRED
  TITLE    CDATA  #IMPLIED>
<!ELEMENT STUDENT1 (MEETINGHIS*, TUTORIAL1*)>
<!ATTLIST STUDENT1
  C_S_REF  IDREF  #REQUIRED
  GRADE    CDATA  #IMPLIED>
<!ELEMENT MEETINGHIS (#PCDATA)>
<!ELEMENT TUTORIAL1 (EMPTY)>
<!ATTLIST TUTORIAL1
  T_REF    IDREF  #REQUIRED>
<!ELEMENT STUDENT (HOBBIES*)>
<!ATTLIST STUDENT
  S#       ID     #REQUIRED
  SNAME    CDATA  #IMPLIED>
<!ELEMENT HOBBIES (#PCDATA)>
<!ELEMENT DEPT (STUDENT2*)>
<!ATTLIST DEPT
  D#       ID     #REQUIRED
  DNAME    CDATA  #IMPLIED>
<!ELEMENT STUDENT2 (EMPTY)>
<!ATTLIST STUDENT2
  D_S_REF  IDREF  #IMPLIED>
<!ELEMENT TUTORIAL (EMPTY)>
<!ATTLIST TUTORIAL
  T#              ID     #REQUIRED
  TUTORIAL_TITLE  CDATA  #IMPLIED>
4. Comparison
Richly structured and represents the real world accurately
– Yes: [7], this paper
– Partially: [3]
– No: [1, 5, 6]
The representation of various relationship types and their attributes
– Yes: this paper
– Partially: [7]
– No: [1, 3, 5, 6]
Number of disconnected elements
– Few: [7], this paper
– Many: naïve approaches
Unnecessary redundancies
– Avoidable: this paper
– Partially: [3, 7]
– Many: [1, 5, 6]
5. Conclusion
The method proposed in this paper makes it possible to:
• Generate semantically sound XML structures for relational data.
• Generate properly structured XML data without unnecessary redundancies or a proliferation of disconnected XML elements.
References
[1] S. Banerjee, et al., “Oracle 8i – The XML Enabled Data Management System”, Proc. 16th Int’l Conf. on Data Engineering, 2000
[2] G. Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee, “ORA-SS: An Object-Relationship-Attribute Model for Semi-structured Data”, TR 21/00, NUS, 2001
[3] D.W. Lee, M. Mani, F. Chiu, W.W. Chu, “Nesting-based Relational-to-XML Schema Translation”, Proc. 4th Int’l Workshop on Web and Databases, 2001
[4] T.W. Ling, M.L. Lee, “Relational to Entity-Relationship Schema Translation Using Semantic and Inclusion Dependencies”, Journal of Integrated Computer-Aided Engineering, pages 125-145, 1995
[5] SYBASE, “Using XML with the Sybase Adaptive Server SQL Databases, A Technical Whitepaper”, http://www.sybase.com, 2000
[6] V. Turau, “Making Legacy Data Accessible for XML Applications”, http://www.informatik.fh-wiesbaden.de/~turau/veroeff.html, 1999
[7] K. Williams, et al., “XML Structures for Existing Databases”, http://www106.ibm.com/developerworks/library/x-struct/, January 2001
[8] W.Y. Du, M.L. Lee, T.W. Ling, “XML Structures for Relational Data”, Proc. 2nd Int’l Conf. on Web Information Systems Engineering (WISE), IEEE Computer Society, 2001