Rainbow - Bridging XML and Relational Databases: Design, Implementation, and Evaluation

MQP Project Members: Tien Vu, Mirek Cymer, John Lee
MQP Advisor: Prof. Elke A. Rundensteiner, PhD
Sponsor: Verizon Laboratories Incorporated
04-19-2001
HTML vs. XML
XML Data Management by RDBMS

Microsoft, IBM, Informix, Oracle,...
Advantages:
• Mature database tools available.
• Efficient query and analysis tools.
• Easy integration with existing business databases.
Issues:
• Mapping between XML and the relational model.
• Update propagation.
• Query translation and optimization.
Traditional System Architecture
[Architecture diagram. Subsystems: XML Query Engine, XML Manager, RDBMS. Data: XML Query and XML (exchanged with the User), XML Data. Legend: Data, Subsystem.]
Motivation for Flexible Mapping
Query Performance varies with respect to how data is mapped.

Mapping 1: one table per element. Example query: SELECT * FROM model;

Car
iid  pid
1    0
…    …

Make
iid  pid  Value
2    1    Ford
…    …    …

Model
iid  pid  Value
3    1    Mustang
…    …    …

Year
iid  pid  Value
3    1    2001
…    …    …

Mapping 2: a single car table with one column per child element. Example query: SELECT model FROM car WHERE make = 'Ford';

car
iid  pid  Make  Model    Year
1    0    Ford  Mustang  2001
…    …    …     …        …
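To make the contrast between the two mappings concrete, here is a minimal sketch that loads the slide's sample rows into an in-memory SQLite database and asks "which models does Ford make?" against each layout. SQLite stands in for the RDBMS, and the table names car_elem and car_flat are hypothetical (chosen only so both layouts can coexist in one database); the row values are copied from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Mapping 1: one table per element; a child row points at its parent via pid.
CREATE TABLE car_elem (iid INTEGER, pid INTEGER);
CREATE TABLE make     (iid INTEGER, pid INTEGER, value TEXT);
CREATE TABLE model    (iid INTEGER, pid INTEGER, value TEXT);
INSERT INTO car_elem VALUES (1, 0);
INSERT INTO make     VALUES (2, 1, 'Ford');
INSERT INTO model    VALUES (3, 1, 'Mustang');

-- Mapping 2: one wide table with the child elements inlined as columns.
CREATE TABLE car_flat (iid INTEGER, pid INTEGER, make TEXT, model TEXT, year TEXT);
INSERT INTO car_flat VALUES (1, 0, 'Ford', 'Mustang', '2001');
""")

# Mapping 1 needs a join: make and model rows are siblings under the same car.
rows_per_element = conn.execute(
    "SELECT m.value FROM make k JOIN model m ON m.pid = k.pid "
    "WHERE k.value = 'Ford'"
).fetchall()

# Mapping 2 answers the same question with a single-table scan.
rows_flat = conn.execute(
    "SELECT model FROM car_flat WHERE make = 'Ford'"
).fetchall()

print(rows_per_element, rows_flat)
```

Both queries return the same answer; what differs is the work the database has to do, which is the point the slide makes about query performance depending on the mapping.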
Rainbow Architecture
[Architecture diagram. Subsystems: XML Query Engine, Restructuring Subsystem, DTD Manager, XML Manager, RDBMS. Data: XML Query and XML (exchanged with the User), DTD, XML Data. Legend: Data, Subsystem.]
Flexible mapping = fixed mapping + restructuring
Rainbow Restructuring Subsystem
[Architecture diagram highlighting the Restructuring Subsystem. Components shown: User, XML Query Engine, Restructuring Subsystem, DTD Manager, XML Manager. Data: XML Query, XML, DTD. Legend: Subsystem, Data, Process.]
Rainbow Restructuring Subsystem
[Detailed view of the Restructuring Subsystem. Inside the subsystem: Restructure Operator Library, Mapping, Restructurer. Surrounding components: User, XML Query Engine, DTD Manager, XML Manager. Data: XML Query, XML, DTD. Legend: Subsystem, Data, Process.]
Restructuring Operator Library


The library contains the following operators:
• Pushup/Pushdown Attribute
• Pushup/Pushdown Nesting
• Rename Item/Rename Attribute
• SwitchNesting
• Split/Merge Nesting
• Reference/Dereference
Each operator is composed of:
• DTD modifications
• Data changes
Pushup Attribute Operator
DTD Modifications:
  In:  A contains B; B carries x.
  Out: A contains B and carries x directly (x pushed up from B to A).
Data Changes:
  CREATE VIEW out.$A AS
  SELECT p.<all_columns>, c.$x
  FROM in.$A p, in.$B c
  WHERE c.pid = p.iid

  CREATE VIEW out.$B AS
  SELECT <all-columns-but-x>
  FROM in.$B
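The operator template above can be read as a small SQL generator. The sketch below fills in $A, $B and $x and expands the two column placeholders. The function name, its signature, and the explicit column lists are assumptions made for illustration; they are not Rainbow's actual API.

```python
def push_up_attribute_sql(parent, child, attr, parent_cols, child_cols):
    """Instantiate the Pushup Attribute template (hypothetical helper).

    parent/child play the roles of $A/$B, attr is $x, and the column lists
    stand in for <all_columns> and <all-columns-but-x>.
    """
    if attr not in child_cols:
        raise ValueError(f"{attr!r} is not a column of {child!r}")

    # <all_columns> of the parent, qualified with the alias p.
    p_cols = ", ".join(f"p.{col}" for col in parent_cols)
    # <all-columns-but-x> of the child.
    remaining = ", ".join(col for col in child_cols if col != attr)

    parent_view = (
        f"CREATE VIEW out.{parent} AS "
        f"SELECT {p_cols}, c.{attr} "
        f"FROM in.{parent} p, in.{child} c "
        f"WHERE c.pid = p.iid"
    )
    child_view = (
        f"CREATE VIEW out.{child} AS "
        f"SELECT {remaining} "
        f"FROM in.{child}"
    )
    return parent_view, child_view


parent_view, child_view = push_up_attribute_sql(
    "Car", "Model", "Value", ["iid", "pid"], ["iid", "pid", "Value"]
)
print(parent_view)
print(child_view)
```

Passing the column lists explicitly keeps the sketch free of any catalog lookup; a real system would presumably derive them from the DTD or the database schema.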
Instantiated Pushup Operator
DTD Modifications:
  In:  Car contains Model; Model carries Value.
  Out: Car contains Model and carries Value directly (Value pushed up from Model to Car).
Data Changes:
  CREATE VIEW out.Car AS
  SELECT p.iid, p.pid, c.value
  FROM in.Car p, in.Model c
  WHERE c.pid = p.iid

  CREATE VIEW out.Model AS
  SELECT iid, pid
  FROM in.Model
Operator call:
  pushUpAttribute('Model', 'Value', 'Car', 'Model');
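The instantiated views can be tried out directly. The following sketch runs them against an in-memory SQLite database holding the Car and Model sample rows from the slides. Two things are assumptions: SQLite stands in for the RDBMS, and the in./out. schema qualifiers are replaced by in_/out_ name prefixes, because `in` is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Input tables (the "in" side), using the sample rows from the slides.
CREATE TABLE in_Car   (iid INTEGER, pid INTEGER);
CREATE TABLE in_Model (iid INTEGER, pid INTEGER, value TEXT);
INSERT INTO in_Car   VALUES (1, 0);
INSERT INTO in_Model VALUES (3, 1, 'Mustang');

-- Output views (the "out" side), as produced by the instantiated operator.
CREATE VIEW out_Car AS
  SELECT p.iid, p.pid, c.value
  FROM in_Car p, in_Model c
  WHERE c.pid = p.iid;

CREATE VIEW out_Model AS
  SELECT iid, pid
  FROM in_Model;
""")

out_car = conn.execute("SELECT iid, pid, value FROM out_Car").fetchall()
out_model = conn.execute("SELECT iid, pid FROM out_Model").fetchall()
print(out_car, out_model)
```

Because out_Car is defined with an inner join, a Car row that has no Model child would not appear in the view; whether that is the intended behavior is not stated on the slides.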
Mapping
A mapping is a sequence of instantiated operators. For example:
1. pushUpAttribute('Model', 'Value', 'Car', 'Model');
2. renameAttribute('Car', 'Value', 'Model');

Initial tables:

Car
iid  pid
1    0
…    …

Model
iid  pid  Value
3    1    Mustang
…    …    …

After step 1:

Car
iid  pid  Value
1    0    Mustang
…    …    …

Model
iid  pid
3    1
…    …

After step 2:

Car
iid  pid  Model
1    0    Mustang
…    …    …

Model
iid  pid
3    1
…    …
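The idea of a mapping as an ordered list of operators can be sketched in a few lines. Here tables are modeled as plain Python lists of row dictionaries rather than database views, and the two functions are simplified stand-ins for the slide's pushUpAttribute and renameAttribute. Their names and argument order are assumptions, not Rainbow's API.

```python
def push_up_attribute(tables, parent, child, attr):
    """Move column `attr` from each child row onto its parent row.

    Assumes at most one child row per parent, as in the slide's example.
    """
    value_by_parent = {row["pid"]: row[attr] for row in tables[child]}
    new_parent = [
        {**row, attr: value_by_parent[row["iid"]]}
        for row in tables[parent]
        if row["iid"] in value_by_parent  # mirrors the inner join in the view
    ]
    new_child = [
        {key: val for key, val in row.items() if key != attr}
        for row in tables[child]
    ]
    return {**tables, parent: new_parent, child: new_child}


def rename_attribute(tables, table, old, new):
    """Rename column `old` to `new` in every row of `table`."""
    renamed = [
        {(new if key == old else key): val for key, val in row.items()}
        for row in tables[table]
    ]
    return {**tables, table: renamed}


# The mapping: an ordered sequence of instantiated operators.
mapping = [
    lambda t: push_up_attribute(t, "Car", "Model", "Value"),
    lambda t: rename_attribute(t, "Car", "Value", "Model"),
]

source = {
    "Car": [{"iid": 1, "pid": 0}],
    "Model": [{"iid": 3, "pid": 1, "Value": "Mustang"}],
}

result = source
for operator in mapping:
    result = operator(result)

print(result)
```

Each operator returns a new table set instead of mutating its input, so the intermediate state after any prefix of the mapping can be inspected, much like the step-by-step tables on the slide.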
Rainbow Implementation


Development Tools:
• Java: Visual Café 4, Javadoc, Java 2
• Oracle 8i, XML4J, JDBC 1.2, SQL
Statistics of Class Implementation (44 total):
• 17 created (new): 39%
• 19 extended: 43%
• 8 reused: 18%
Screen Shots of Rainbow
[Three screenshots of Rainbow.]
Setup for Rainbow Evaluation

Experimental setup:
• Database Server: Oracle 8i on a PII 300MHz, 256MB, Microsoft NT Server
• Client: Pentium 233MHz, 128MB, Microsoft NT Workstation
Data:
• Designed a DTD
• Generated XML using IBM’s XML-Generator
DTD content:
<!ELEMENT one (two+)>
<!ELEMENT two (three)>
<!ELEMENT three (four)>
<!ELEMENT four (five)>
<!ELEMENT five (six)>
<!ELEMENT six (seven)>
<!ELEMENT seven EMPTY>
<!ATTLIST seven attribute CDATA #REQUIRED>
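For readers who want to reproduce data of the same shape, the sketch below builds a document that follows the DTD: a root `one` with one or more `two` children, each holding the chain three/four/five/six/seven, where the empty `seven` element carries the required attribute. This is not IBM's XML-Generator; it is a hand-written stand-in using Python's standard library, and the attribute values are made up.

```python
import xml.etree.ElementTree as ET

# Each <two> holds exactly one chain of nested single children.
CHAIN = ["three", "four", "five", "six", "seven"]


def build_document(n_two, attr_prefix="v"):
    """Build a tree shaped like the DTD with `n_two` <two> elements."""
    if n_two < 1:
        raise ValueError("the DTD requires at least one <two> element (two+)")
    root = ET.Element("one")
    for i in range(n_two):
        node = ET.SubElement(root, "two")
        for tag in CHAIN:
            node = ET.SubElement(node, tag)
        # `node` is now the empty <seven>; set its required attribute.
        node.set("attribute", f"{attr_prefix}{i}")
    return root


doc = build_document(3)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Raising `n_two` scales the document size linearly, which is one simple way to approach a target data size for an experiment.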
Query Performance Evaluation
[Chart: Time of Join Query (data size = 22 MB). x-axis: # of pushUpAttribute, 0 to 7; y-axis: Query Time (s), 0 to 40.]
Overhead Cost
[Chart: Time of overhead (data size = 22 MB). x-axis: # of pushUpAttribute, 0 to 7; y-axis: Restructure Time (s), 0 to 350.]
MQP Accomplishments

Technical accomplishments:
• Implemented functional prototype system
• Confirmed feasibility of Rainbow architecture
• Designed automated test bed
• Conducted preliminary experimental studies
Knowledge acquired:
• OO, Java, JDBC, SQL, RDBMS, XML, DTD
• Logistics of setting up experiments
• Teamwork, software engineering, and software reuse
Potential Future Work
• XML query translation to SQL
• Experiment with test plans and test beds to realize the full potential of the restructuring component.
Special thanks to:
Prof. Elke A. Rundensteiner
Ph.D. Xin Zhang
Visit Rainbow at http://davis.wpi.edu/dsrg/TJM/
Project Members:
Tien Vu, Mirek Cymer, John Lee