Databases and the Grid
OGSA-DAI
Architecture & Requirements
Malcolm Atkinson
OGSA-DAI Chief Architect
Director of National e-Science Centre
www.nesc.ac.uk
30th May 2002
OGSA Early Adopters’ Workshop
Argonne National Laboratory
Overview
UK e-Science
Scale, Coordination, Structure, Projects
Database Task Force & GGF DAI-WG
OGSA-DAI Project
Scope, Scale, Participants, Plans
Architecture
Relationship with OGSA
Requirements
UK e-Science Programme
Tony Hey
DG Research Councils
E-Science Steering Committee
Director’s Awareness and Co-ordination Role
Academic Application Support Programme
Research Councils (£74m), DTI (£5m)
PPARC (£26m)
BBSRC (£8m)
MRC (£8m)
NERC (£7m)
£80m
ESRC (£3m)
EPSRC (£17m)
CLRC (£5m)
Grid TAG
Director
Director’s Management Role
Generic Challenges
EPSRC (£15m), DTI (£15m)
Collaborative projects
Industrial Collaboration (£40m)
UK Grid Network
National e-Science Centre
Edinburgh
Glasgow
AccessGrid always-on video walls
Newcastle
Belfast
Daresbury Lab
Manchester
Cambridge
Hinxton
Oxford
Cardiff
RAL
London
Southampton
NeSC’s Roles
Coordination, Stimulation & Education
e-Science Centres Application Pilots
IRCs …
e-Scientists, Grid users, Grid services & Grid Developers
TAG
ETF
GNT
DBTF
ATF
NeSC
STF
GSC
UK Core Directorate
eSI
CS Research
Global Grid Forum …
UK Architectural Task Force (ATF)
Malcolm Atkinson (NeSC)
Jon Crowcroft (Cambridge U.)
Vijay Dialani (Southampton U.)
Ian Leslie (Cambridge U.)
Ken Moody (Cambridge U.)
Tony Storey (IBM)
Geoff Coulson (Lancaster U.)
David De Roure (Southampton U.)
Andrew Herbert (Microsoft)
Andrew Martin (Oxford U.)
Steven Newhouse (ICSTM & LeSC)
…………… Plus consultations
UK Role in Open Grid Services Architecture, Version 0.6 11th March 2002
www.nesc.ac.uk
→ teams
→ ATF
Obtained Agreement: OGSA as Foundation for UK work, 18 April 2002
e-Science Institute
National e-Science Centre
Edinburgh + Glasgow Universities
Physics & Astronomy × 2
Informatics, Computing Science
EPCC
£6M EPSRC/DTI + £2M SHEFC over 3 years
e-Science Institute
visitors, workshops, co-ordination, outreach
middleware development
50 : 50 industry : academia
‘last-mile’ networking
www.nesc.ac.uk
UK Pilot Projects
Research Councils Autonomy
> 30 Projects
$5 million to $0.3 million
Wide Range of Disciplines
Industrial Involvement
Integration and Access to Information
e-Science Centre Projects
> 50% Industrial Involvement
IRC ‘Grand Challenge’ Projects
Equator: Technological innovation in physical and digital life
AKT: Advanced Knowledge Technologies
DIRC: Dependability of Computer-Based Systems
MIAS: From Medical Images and Signals to Clinical Information
From presentation by Tony Hey
Particle Physics and Astronomy
e-Science Projects
GridPP
links to EU DataGrid, CERN LHC Computing Project, US GriPhyN and PPDataGrid Projects, and iVDGL Global Grid Project
AstroGrid
links to EU AVO and US NVO projects
OGSA-DAI Early Adopter
From presentation by Tony Hey
EPSRC e-Science Projects (1)
Comb-e-Chem: Structure-Property Mapping
Southampton, Bristol, Roche, Pfizer, IBM
DAME: Distributed Aircraft Maintenance Environment
York, Oxford, Sheffield, Leeds, Rolls Royce
Reality Grid: A Tool for Investigating Condensed Matter and Materials
QMW, Manchester, Edinburgh, IC, Loughborough, Oxford, Schlumberger, …
From presentation by Tony Hey
EPSRC e-Science Projects (2)
MyGrid: Personalised Extensible Environments for Data Intensive in silico Experiments in Biology
Manchester, EBI, Southampton, Nottingham, Newcastle, Sheffield, GSK, AstraZeneca, IBM, Sun
OGSA-DAI Early Adopter
GEODISE: Grid Enabled Optimisation and Design Search for Engineering
Southampton, Oxford, Manchester, BAE, Rolls Royce
Discovery Net: High Throughput Sensing Applications
Imperial College, Infosense, …
From presentation by Tony Hey
MyGrid e-Science Workbench
Goal is to develop ‘workbench’ to support:
Experimental process of data accumulation
Use of community information
Scientific collaboration
Provide facilities for resource selection, data management and process enactment
Bioinformatics applications
Functional genomics, pattern database annotation
Manchester, EBI, Newcastle, Nottingham, Sheffield, Southampton
GSK, AstraZeneca, Merck, IBM, Sun, ...
From presentation by Tony Hey
Overview
UK e-Science
Scale, Coordination, Structure, Projects
Database Task Force & GGF DAI-WG
OGSA-DAI Project
Scope, Scale, Participants, Plans
Architecture
Relationship with OGSA
Requirements
DBTF Web Pages
http://www.cs.man.ac.uk/grid-db
DBTF Membership
Malcolm Atkinson (NeSC)
Vijay Dialani (Southampton University)
Norman Paton (Manchester University)
Dave Pearson (Oracle UK)
Tony Storey (IBM Hursley)
Paul Watson (Newcastle University)
DBTF: Aims & Actions
Requirements Capture
Pilot Project Meetings
Report
Dave Pearson
Roadmap
UK Coordination
GGF Articulation
Standards
BoF GGF4
Papers GGF5
Implementation
Projects
OGSA-DAI
Architecture
Liaise with ATF
Liaise with Globus team
Education
e-Science Institute
Pilot Projects
GSC
Evolving
GGF DAIS WG
Broader community
Overview
UK e-Science
Scale, Coordination, Structure, Projects
Database Task Force & GGF DAI-WG
OGSA-DAI Project
Scope, Scale, Participants, Plans
Architecture
Relationship with OGSA
Requirements
OGSA-DAI Partners
[Map of the UK e-Science centres with the project partners highlighted]
EPCC & NeSC
Oracle
IBM UK
IBM Hursley
IBM USA
Manchester e-SC
Newcastle e-SC
$5 million, 18 months, started 1st February 2002
OGSA-DAI Scope
Definition and development of generic Grid data services which provide access to and integration of data held in databases, and the management of data within a distributed environment.
Database
A stored, structured collection of data
Accessed using an API that takes account of the structure of the data stored
Includes
Relational and object databases
XML repositories
Adequately described collections of files
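The point that a database is "accessed using an API that takes account of the structure of the data stored" can be illustrated with a small sketch. It is not OGSA-DAI code: it uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules, and the table, element names and sample values are made up for illustration. The same question is asked of a relational store and of an XML document, each through an interface that understands its structure.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Illustrative sample data only (hypothetical table/element names).
SAMPLE = [("Vega", 0.03), ("Polaris", 1.98)]


def bright_names_relational():
    """Ask the question with SQL, which knows about tables and columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE star (name TEXT, magnitude REAL)")
    conn.executemany("INSERT INTO star VALUES (?, ?)", SAMPLE)
    rows = conn.execute(
        "SELECT name FROM star WHERE magnitude < 1 ORDER BY name"
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


def bright_names_xml():
    """Ask the same question of an XML document via element/attribute access."""
    root = ET.Element("stars")
    for name, magnitude in SAMPLE:
        ET.SubElement(root, "star", name=name, magnitude=str(magnitude))
    return sorted(
        star.get("name")
        for star in root.findall("star")
        if float(star.get("magnitude")) < 1
    )
```

Both functions return the same answer for the sample data; what differs is the structure-aware API used to obtain it, which is the variation a generic Grid data service has to accommodate.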
Databases in the Grid
Data
Complexity
Computational Complexity
Scope of Database Services
Discovery of Data by Content
Query and Update Statements
Metadata Management & Evolution
Transactions (Flavours of)
Distributed queries and updates
Specialised types
Encapsulated (safe) Function application
Notification (driven by triggers, etc.)
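As a rough illustration of "Notification (driven by triggers, etc.)", the sketch below uses an SQLite trigger to record a notification row whenever data is inserted. It assumes Python's standard `sqlite3` module; the table and trigger names are hypothetical, and how such events would be surfaced through a Grid service is not specified here.

```python
import sqlite3


def collect_notifications():
    """Insert two rows and return the messages the trigger recorded."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE observations (id INTEGER PRIMARY KEY, value REAL);
        CREATE TABLE notifications (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            message TEXT
        );
        -- The trigger fires inside the database on every insert.
        CREATE TRIGGER notify_on_insert AFTER INSERT ON observations
        BEGIN
            INSERT INTO notifications (message)
            VALUES ('new observation ' || NEW.id);
        END;
        """
    )
    with conn:  # commits the inserts as one transaction
        conn.executemany(
            "INSERT INTO observations (id, value) VALUES (?, ?)",
            [(1, 0.5), (2, 1.5)],
        )
    rows = conn.execute(
        "SELECT message FROM notifications ORDER BY id"
    ).fetchall()
    conn.close()
    return [message for (message,) in rows]
```

A service wrapping the database could read the `notifications` table and forward each message to subscribers; that forwarding step is left out of the sketch.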
OGSA-DAI Objectives
Produce specifications for generic data services
based on a common design framework
consistent with Open Grid Service Architecture
Design specifications
as basis of standards recommendations
via Database Access and Integration Services Working Group to the Global Grid Forum
Deliver Grid data services software
in future releases of the Globus Toolkit (GT3 December 2002)
Refine identified requirements
evaluate design options
develop demonstrators
transfer skills to the Grid community
Develop reference implementations of generic data services
Ensure that the Grid model and OGSA standards address fully the needs of data access and integration
Ensure Grid data services meet the levels of service required
performance, scalability, resilience, availability, and manageability
evolution and distribution
large user populations and large data volumes
OGSA-DAI Plan
Two Phases
Phase 1: Started Feb 02 ends GGF5
Detailed Plan –
Requirements, Designs & Prototypes
6 Work Packages
Project Management (Oracle, EPCC)
Architecture (NeSC, DBTF)
XML Data Management (NeSC & EPCC)
Distributed Query Systems (Manchester & Newcastle)
Metadata & Registries (NeSC & EPCC)
Relational Databases (IBM UK)
Phase 2: 12 months
Structure and Objectives to be Refined in Major Review
GGF5 DAIS WG meeting a major input
OGSA-DAI Time Line
WS + GSI UK support ( > 60 downloads)
XML + OGSA Prototypes for Early Adopters
RDB + GT2 / OGSA Prototypes for Early Adopters
Design Documents & Demos for DAIS WG @ GGF5
XML + OGSA Prototype Available
RDB + GT2 / OGSA Prototypes Available
Ship for GT3 Integration
Feb ’02
May ’02
Phase 1 Starts
Jul ’02
Sep ’02
Dec ’02
Phase 2 Starts
Feb ’03
May ’03
Sep ’03
Milestones & Deliverables
3rd Jul 2002: GGF 5 Deliverables
1st Draft – OGSA-DAI Design Specification
Working Grid data service prototype with workshop material
Draft Phase 2 functional scope for each Work Package
30th Sept 2002: End Phase 1
Phase 1 Review Report and recommendations including: revisions to Phase 2 streams of work, Work Package structure, content, and scope
Completed, Tested, Work Package prototypes with evaluation report detailing functional scope and deficiencies, design options, measures for acceptance
RDBMS/Globus-2 prototype implementation
Phase 2 scope
Agreed 2nd Draft – OGSA-DAI design specification
Dissemination programme for UK e-Science community
Transition programme for UK Grid Support Team and Globus Development Team
31st Dec 2002: Globus Toolkit Release
1st Grid data services reference implementation for Globus Toolkit 3
1st Grid data services specification for Globus Toolkit 3
Scope of functional content for 2nd Globus Toolkit release and specification
1st release training and support courses
31st Mar 2003: Interim UK e-Science community release
Interim Grid data services implementation for UK e-Science community
Release training and support courses, with documentation
31st Jul 2003: Globus Toolkit Release
2nd Grid data services reference implementation for Globus Toolkit 3
2nd Grid data services specification for Globus Toolkit 3
2nd release training and support courses
Publications and papers to support reference implementations through WG discussions and GGF standards processes
Final Project Report
OGSA-DAI: Key Components
Grid Database Services (GDS)
GXDS, GRDS, GSFDS, …
Perform DB actions
Extra Data Service Elements
DB-action-Management Functions
Notifications from Triggers
Grid Database Service Factories (GDSF)
Create the above
Extra Data Service Elements
Database Service Registries (DSR)
Specialised Registries to find DBs, Services & Factories
Grid Data Transfer Services (GDTS)
Described at Requirement Level
Flexible & mapped to grid-FTP, MQ Series, …
OGSA-DAI Architecture
[Diagram, built up step by step: client, DSR, GDSF, GDS1, GDS2, GDS3]
1 request for factory
2 response with GDSFs GSHs
3 script for 3 GDSs
4 creation of 3 GDSs
5 response with 3 GSHs
6 scripts requesting DB actions
7 transfer data: batch to GDS2, stream to GDS3
8 stream data to GDS2
9 transfer data batch to client
10 stream data to specified destination
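Steps 1 to 6 of the architecture walk-through can be sketched in a few lines of Python. This is only an in-memory illustration of the registry, factory and service roles: the class names, the `gsh://example/...` handle format and the method names are assumptions made for this sketch, not the OGSA-DAI interfaces, and the data transfers of steps 7 to 10 are left out.

```python
from dataclasses import dataclass, field
from itertools import count

_handle_ids = count(1)


def new_handle(kind):
    """Return a fresh, made-up Grid Service Handle (GSH) string."""
    return f"gsh://example/{kind}/{next(_handle_ids)}"


@dataclass
class GridDatabaseService:  # GDS
    handle: str
    log: list = field(default_factory=list)

    def perform(self, script):
        """Step 6: accept a script requesting DB actions (only recorded here)."""
        self.log.append(script)
        return f"{self.handle} accepted: {script}"


@dataclass
class GridDatabaseServiceFactory:  # GDSF
    handle: str
    services: dict = field(default_factory=dict)

    def create(self, n):
        """Steps 3-5: create n GDSs and respond with their handles."""
        created = [GridDatabaseService(new_handle("gds")) for _ in range(n)]
        for service in created:
            self.services[service.handle] = service
        return [service.handle for service in created]


@dataclass
class DatabaseServiceRegistry:  # DSR
    factories: dict = field(default_factory=dict)

    def register(self, factory):
        self.factories[factory.handle] = factory

    def find_factories(self):
        """Steps 1-2: answer a request for a factory with known GDSF handles."""
        return list(self.factories)


def run_client():
    """Walk through steps 1-6 as a client would."""
    registry = DatabaseServiceRegistry()
    factory = GridDatabaseServiceFactory(new_handle("gdsf"))
    registry.register(factory)

    factory_handle = registry.find_factories()[0]   # steps 1-2
    gdsf = registry.factories[factory_handle]       # handle resolution, simplified
    gds_handles = gdsf.create(3)                    # steps 3-5
    replies = [                                     # step 6
        gdsf.services[handle].perform(f"query {i}")
        for i, handle in enumerate(gds_handles, start=1)
    ]
    return gds_handles, replies
```

Calling `run_client()` returns three distinct GDS handles and one acknowledgement per script. In a real deployment each handle would be resolved over the network rather than looked up in a local dictionary.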
OGSA-DAI & OGSA <((-:}
Description, e.g. portType Works Well
Adding only one portType / GDS(F) | DSR
Expect to make extensive use of
Data Service Elements
Special to DBs: Static & Dynamic
Component Management
Notification
Grid-FTP
Accounting
Security:
Authentication, Authorisation & Privacy
Reliable invocation
…
OGSA-DAI & OGSA <))-:}
Lifetime Issues
Conditions for termination
Controlled clean-up opportunity
Scope of State
Evolution
Notification Issues
Registering & using same notification system
For DBs, e.g. triggers, do we have to construct a dummy Service Data Element?
Type System Issues
Standards needed for wide range of types
Service Definition Issues
How to create / obtain standard definitions for common services
OGSA-DAI Summary
On Schedule & Going Well
Expect Contributions via DAIS-WG @ GGF5
Expect Contributions to GT3 Releases
Early Days
Testing Architectural Design
Using OGSA
Working with Early Adopter Pilot Projects
AstroGrid & MyGrid
Planned release of prototypes
Influence OGSA-DAI direction
Via DAIS-WG