Query Processing in the AquaLogic Data Services Platform Vinayak Borkar, Michael Carey,

advertisement
Query Processing in
the AquaLogic Data
Services Platform
Vinayak Borkar, Michael Carey,
Dmitry Lychagin, Till Westmann,
Daniel Engovatov, Nicola Onose
BEA Systems
www.bea.com
Data Is Everywhere Today
Relational databases made things too easy
Departmental vs. inter-galactic centralized databases
Databases come in many flavors
Relational: Oracle, DB2(s), SQL Server, MySQL, …
Hangers-on: IMS, IDMS, VSAM, …
Not all data is SQL-accessible
Packaged applications: SAP, PeopleSoft, Siebel, Oracle, SalesForce, …
Custom “homegrown” applications
Files of various shapes and sizes (XML, non-XML)
And the list goes on…
(C) Copyright 2006, BEA Systems, Inc | 2
Painful to Develop Applications
No one “single view of X” for any X
What data do I have about X?
How do I stitch together the info I have?
What else is X related to?
No uniformity in source model or language
Data about X is stored in many different formats
Accessing or updating X involves many different APIs
Manual coding of “distributed query plans”
No reuse of artifacts
Different access criteria or returned data  different access plans
No model to help organize or find the artifacts anyway
(C) Copyright 2006, BEA Systems, Inc | 3
Agenda
Why data services?
Overview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 4
Overview
JAVA/J2EE
WEB SERVICE
JAVA API
WSDL
REPORTING
JDBC/SQL
Client API
Developer
Tooling
Data Processing Engine
Connectivity
Relational
Web
Services
Tables, views
stored
procedures,
SQL
Business
partners,
Adapter,
.Net
Files
XML, Flat
Files
Java Functions
LDAP
J2EE
JCA
JMS
(C) Copyright 2006, BEA Systems, Inc | 5
Excel
Custom
Access
Example: Customer Profile Data Service
Customer Info
Order Info
Credit Card Info
Rating Info
CUSTOMER,
ORDER
CREDIT_CARD
(C) Copyright 2006, BEA Systems, Inc | 6
getRating(…)
Data Service – Design View
(C) Copyright 2006, BEA Systems, Inc | 7
Data Service – “Get All” Read Method
(::pragma function ... kind="read" ...::)
declare function tns:getProfile() as element(ns0:PROFILE)*
{
for $CUSTOMER in db1:CUSTOMER()
return
<tns:PROFILE>
<CID>{ fn:data($CUSTOMER/CID) }</CID>
<LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME>
<ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS>
<CREDIT_CARDS>{
db2:CREDIT_CARD()[CID eq $CUSTOMER/CID]
}</CREDIT_CARDS>
<RATING>{
fn:data(ws1:getRating(
<ns5:getRating>
<ns5:lName>{ data($CUSTOMER/LAST_NAME) }</ns5:lName>
<ns5:ssn>{ data($CUSTOMER/SSN) }</ns5:ssn>
</ns5:getRating>
)
}</RATING>
</tns:PROFILE>
};
(C) Copyright 2006, BEA Systems, Inc | 8
Data Service – Read & Navigate Methods
(::pragma function ... kind="read" ...::)
declare function tns:getProfileByID($id as xs:string)
as element(ns0:PROFILE)*
{
tns:getProfile()[CID eq $id]
};
...
(::pragma function ... kind="navigate" ...::)
declare function tns:getCOMPLAINTs($arg as element(ns0:PROFILE))
as element(ns8:COMPLAINT)*
{
db3:COMPLAINT()[CID eq $arg/CID]
};
...
(C) Copyright 2006, BEA Systems, Inc | 9
Agenda
Why data services?
Overview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 10
Query Processor Overview
(C) Copyright 2006, BEA Systems, Inc | 11
Efficient processing
Avoid unnecessary work
Function inlining (view unfolding)
Push work to the sources
Queryable sources can do some of our work
(C) Copyright 2006, BEA Systems, Inc | 12
Optimization: Function Inlining
Example: This fragment
let $x := <CUSTOMER>
<LAST_NAME>{$name}</LAST_NAME>
<ORDERS>…</ORDERS>
</CUSTOMER>
return
fn:data($x/LAST_NAME)
can be replaced by
$name
But we need to:
maintain structural type information for compilation
extend “preserve” mode for runtime
(C) Copyright 2006, BEA Systems, Inc | 13
Pushdown: Overview
SQL Translator tries to “swallow” as much as possible.
Make translation easy
Remove unnecessary functions (by inlining)
Translation to joins and grouping
Split sorting and grouping
Make sources do as much as possible
DBMS specific code
Maximize Pushdown
Inverse functions
(C) Copyright 2006, BEA Systems, Inc | 14
Pushdown: Preparation
Translation to joins and grouping
Split sorting and grouping
for $CUSTOMER in db1:CUSTOMER()
return
<tns:PROFILE>
<CID>{ fn:data($CUSTOMER/CID) }</CID>
…
<ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS>
<CREDIT_CARDS>{
db2:CREDIT_CARD()[CID eq $CUSTOMER/CID]
}</CREDIT_CARDS>
…
</tns:PROFILE>
(C) Copyright 2006, BEA Systems, Inc | 15
Pushdown: Inverse Functions I
Example fragment of tns:getProfile:
<tns:PROFILE>
<CID>{fn:data($CUSTOMER/CID)}</CID>
<LAST_NAME>{
fn:data($CUSTOMER/LAST_NAME)
}</LAST_NAME>
<SINCE>{int2date($CUSTOMER/SINCE)}</SINCE>
...
<tns:PROFILE>
used in this query
for $c in tns:getProfile()
where $c/SINCE gt $start
return $c
yields (after inlining)
for $c1 in ns3:CUSTOMER()
where int2date($c1/SINCE) gt $start
return <tns:PROFILE> ... </tns:PROFILE>
(C) Copyright 2006, BEA Systems, Inc | 16
Pushdown: Inverse Functions II
Register
Inverse function date2int for int2date
Transformation rule (gt, int2date)  gt-intfromdate with
declare function gt-intfromdate($x1 as xs:dateTime,
$x2 as xs:dateTime)
as xs:boolean?{
date2int($x1) gt date2int($x2)
};
Now we can rewrite the query into
for $c1 in ns3:CUSTOMER()
where $c1/SINCE gt ns1:date2int($start)
return <tns:PROFILE> ... </tns:PROFILE>
can be pushed as
SELECT * FROM "CUSTOMER" t1 WHERE (t1."SINCE" > ?)
(C) Copyright 2006, BEA Systems, Inc | 17
Optimization: PP-k Join
Parameter Passing in chunks of k
Prerequisites
Distributed join
Right side is a relational source
Idea: relational source can partition its content
Steps
Read k items from the left into L
Select all items from the right that match any of the items in L into R
Join L and R in the middleware
Repeat until the left is exhausted
Benefit: excellent trade-off between
Memory footprint in the middleware
Roundtrip overhead imposed by the data source
k = 20
(C) Copyright 2006, BEA Systems, Inc | 18
Example: “Get All” Read Method Revisited
(::pragma function ... kind="read" ...::)
declare function tns:getProfile() as element(ns0:PROFILE)*
{
for $CUSTOMER in db1:CUSTOMER()
return
<tns:PROFILE>
<CID>{ fn:data($CUSTOMER/CID) }</CID>
<LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME>
<ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS>
<CREDIT_CARDS>{
db2:CREDIT_CARD()[CID eq $CUSTOMER/CID]
}</CREDIT_CARDS>
<RATING>{
fn:data(ws1:getRating(
<ns5:getRating>
<ns5:lName>{ data($CUSTOMER/LAST_NAME) }</ns5:lName>
<ns5:ssn>{ data($CUSTOMER/SSN) }</ns5:ssn>
</ns5:getRating>
)
}</RATING>
</tns:PROFILE>
};
(C) Copyright 2006, BEA Systems, Inc | 19
Evaluation Plan, Example 1 (getProfile)
(C) Copyright 2006, BEA Systems, Inc | 20
Evaluation Plan, Example 2 (query getProfile)
(C) Copyright 2006, BEA Systems, Inc | 21
Agenda
Why data services?
Overview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 22
Some ALDSP Work in Progress
Native JDBC/SQL92 support (available on Sep 15)
Bilingual engine for efficient reporting/BI tool access
Support for compensating transactions
Extend update facility to support safe non-XA updates (sagas)
XQuery Update support (as well as XQueryP)
Goal: no Java coding for many Web service use cases
(C) Copyright 2006, BEA Systems, Inc | 23
Agenda
Why data services?
Overview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 24
Summary
Covered here
Why Data Services?
How are Data Services used?
Some techniques for efficient evaluation
In the paper
XQuery extensions
Runtime architecture
Caches
Updates
Security
(C) Copyright 2006, BEA Systems, Inc | 25
Download