A Bright Future with OGSA Data Services Malcolm Atkinson Director

www.nesc.ac.uk
7th June 2004
OGSA-DAI
Components: Analyst (client), Registry (GDSR), Factory (GDSF), Grid Data Service (GDS), Consumer, Database (Xindice, MySQL, Oracle, DB2)
Interaction types: SOAP/HTTP, service creation, API interactions
1. Analyst sends a request to the Registry for sources of data about “x”.
2. Registry responds with a Factory handle.
3. Analyst sends a request to the Factory for access to the database.
4. Factory creates a GridDataService.
5. Factory returns the handle of the GDS to the client.
6. Client queries the GDS with SQL, XPath, XQuery etc.
7. GDS interacts with the database.
8. Query results are returned as XML, or delivered to a consumer as XML.
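The interaction above can be sketched as a toy, in-memory Python model. This is not the OGSA-DAI API: the class names Registry, Factory and GridDataService, their methods, and the use of sqlite3 as the database are all assumptions made for illustration, and plain method calls stand in for SOAP/HTTP messages.

```python
import sqlite3
from xml.etree import ElementTree as ET


class GridDataService:
    """Hypothetical stand-in for a GDS: wraps one database connection."""

    def __init__(self, connection):
        self._connection = connection

    def query(self, sql, consumer=None):
        """Run SQL; return the results as XML, or hand them to a consumer."""
        cursor = self._connection.execute(sql)
        columns = [description[0] for description in cursor.description]
        root = ET.Element("results")
        for row in cursor:
            record = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(record, name).text = str(value)
        xml = ET.tostring(root, encoding="unicode")
        if consumer is not None:
            consumer(xml)  # delivery to a third party instead of the caller
            return None
        return xml


class Factory:
    """Hypothetical stand-in for a GDSF: creates GDS instances on request."""

    def __init__(self, connection):
        self._connection = connection

    def create_service(self):
        return GridDataService(self._connection)


class Registry:
    """Hypothetical stand-in for a GDSR: maps a topic to a Factory handle."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories[topic] = factory

    def find(self, topic):
        return self._factories[topic]


def demo():
    # Set up a database and publish its factory under the topic "x".
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE x (id INTEGER, name TEXT)")
    db.execute("INSERT INTO x VALUES (1, 'alpha')")
    registry = Registry()
    registry.register("x", Factory(db))

    factory = registry.find("x")      # steps 1 and 2
    gds = factory.create_service()    # steps 3 to 5
    return gds.query("SELECT id, name FROM x")  # steps 6 to 8
```

Passing a callable as `consumer` models the second delivery option, where the XML goes to a consumer rather than back to the client.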
Extensibility
Data resources
  Unbounded variety
Data access languages
  Established standards, with many variants
  SQL, OQL, semi-structured query, domain languages
Investment in DBs, DBMSs, File Stores, Bulk stores, …
  Not sensible to expect them to change to fit us
Data Access Models must be extensible
  Static extension used extensively by OGSA-DAI users
Should extensibility be supported by foundation interfaces?
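One way to picture an extensible data access model is a service whose single entry point dispatches on the name of the access language, so new languages are plugged in without changing the interface. The sketch below is an assumption for illustration only: DataAccessService, register_language, perform and the two toy "languages" are hypothetical names, not part of OGSA-DAI.

```python
class DataAccessService:
    """Hypothetical service with one generic operation and pluggable languages."""

    def __init__(self):
        self._handlers = {}

    def register_language(self, name, handler):
        """Extension point: add support for another access language."""
        self._handlers[name.lower()] = handler

    def supported_languages(self):
        return sorted(self._handlers)

    def perform(self, language, expression, records):
        """Evaluate an expression in the named language against the records."""
        try:
            handler = self._handlers[language.lower()]
        except KeyError:
            raise ValueError(f"unsupported language: {language}") from None
        return handler(expression, records)


def select_field(expression, records):
    """Toy language 'field': the expression is a field name to project."""
    return [record[expression] for record in records]


def filter_contains(expression, records):
    """Toy language 'contains': keep records with a value containing the text."""
    return [
        record
        for record in records
        if any(expression in str(value) for value in record.values())
    ]


def build_service():
    service = DataAccessService()
    service.register_language("field", select_field)
    service.register_language("contains", filter_contains)
    return service
```

Registering handlers when the service is built corresponds to static extension; calling `register_language` on a running service would be the dynamic case.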
Move Computation to Data
Increasingly necessary
Code scale
  Depends on wet-ware
  No noticeable rate of improvement
Data scale
  Grows with Moore’s Law or Moore’s Law²
Analysis of data
  Extracts & derivatives used
  Often smaller, more value for current investigation
  Implies move code to data
  SQL, XQuery, Java code, …
Extensibility mechanisms used by OGSA-DAIers
  Java mobility (e.g. DataCutter), database procedures, …
Application control or higher-level service decisions
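A small sketch of why moving code to data pays off when the derived result is smaller than the raw data. It uses Python's built-in sqlite3 module as a stand-in for a remote database; the table name, column names and row counts are assumptions chosen for illustration.

```python
import sqlite3


def build_database(n_rows=10_000):
    """Create an in-memory table of readings spread over ten sensors."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (sensor INTEGER, value REAL)")
    db.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        ((i % 10, float(i)) for i in range(n_rows)),
    )
    return db


def move_data_to_code(db):
    """Fetch every row, then aggregate on the client side."""
    rows = db.execute("SELECT sensor, value FROM readings").fetchall()
    totals = {}
    for sensor, value in rows:
        totals[sensor] = totals.get(sensor, 0.0) + value
    return totals, len(rows)


def move_code_to_data(db):
    """Send the aggregation to the database; only the summary comes back."""
    rows = db.execute(
        "SELECT sensor, SUM(value) FROM readings GROUP BY sensor"
    ).fetchall()
    return dict(rows), len(rows)
```

Both functions return the same per-sensor totals, but the first transfers every row while the second transfers one row per sensor.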
Integration is Everything
No business or research team is satisfied with one data resource
Domain-specialist driven
Dynamic specification of combination function
Iterative processes over a range of time scales
Sources inevitably heterogeneous
Content, structure & policies time-varying
Robust & stable steerable integration services
Higher-level services over multiple resources
Fundamental requirements for (re)negotiation
Federation or Virtualisation preceding integration, or kit of integration tools to be interwoven with an application?
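The idea of a dynamically specified combination function over heterogeneous sources can be sketched as follows. Everything here is an assumption for illustration: the two source formats, the field names (id, value, ident, reading) and the integrate helper are hypothetical. Each adapter maps its source onto a common shape, and the caller supplies the combination function at request time.

```python
import csv
import io


def from_records(records):
    """Adapter for a source that already holds id/value dictionaries."""
    for record in records:
        yield {"id": record["id"], "value": record["value"]}


def from_csv(text):
    """Adapter for a CSV source that uses different column names and types."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"id": int(row["ident"]), "value": float(row["reading"])}


def integrate(sources, combine):
    """Merge items from all sources, resolving shared ids with `combine`."""
    merged = {}
    for source in sources:
        for item in source:
            key = item["id"]
            if key in merged:
                merged[key] = combine(merged[key], item["value"])
            else:
                merged[key] = item["value"]
    return merged


def demo(combine):
    records = [{"id": 1, "value": 2.0}, {"id": 2, "value": 5.0}]
    text = "ident,reading\n1,3.5\n3,1.0\n"
    return integrate([from_records(records), from_csv(text)], combine)
```

Calling `demo(max)` keeps the larger value for ids present in both sources, while `demo(lambda a, b: a + b)` sums them; the integration code itself does not change.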
Multiple tasks / request
[Figure: diagram with vertical labels CLIENT, REQUESTOR, API and STUB, two boxes labelled Data Set, and a sequence of records numbered 0 to 7, each with Ident, Type and Value fields.]
Be Direct
Breaks down boundaries and merges data, execution & transport requirements. Demands smart workflow enactment.
Double Handling costs too much
  Memory cycles, bus capacity, cache disruption, …
  Double Handling via discs pathologically bad
Data translation expensive
  Avoid or compose
Main memory is not big enough
Couple generator & consumer directly
  Data pipe from RAM to RAM
Requires coupled computation execution service & foundation services
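Coupling a generator directly to a consumer, and composing translations rather than applying them in separate passes, can be sketched with Python generators. This is a single-process illustration under assumed names (generate_readings, compose, consume, run_pipeline are hypothetical); it shows the shape of the idea, not a distributed RAM-to-RAM data pipe.

```python
def generate_readings(n):
    """Producer: yields one item at a time, with no intermediate file or list."""
    for i in range(n):
        yield i


def compose(*steps):
    """Fuse several per-item translations into one function (one pass)."""
    def fused(item):
        for step in steps:
            item = step(item)
        return item
    return fused


def consume(items):
    """Consumer: reduces the stream while holding only one item at a time."""
    count = 0
    total = 0
    for item in items:
        count += 1
        total += item
    return count, total


def run_pipeline(n):
    # Two translations composed, so each item is handled exactly once.
    translate = compose(lambda v: v * 2, lambda v: v + 1)
    return consume(map(translate, generate_readings(n)))
```

Because `map` and the generator are lazy, each item flows from producer through translation to consumer without the whole data set ever being materialised.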