Databases and the Grid Norman Paton University of Manchester

(Chair of UK e-Science Programme Database Taskforce)
Databases and the Grid
• Why they matter.
• What the issues are.
• What the e-Science Programme is doing about it.
The Grid in Context
[Figure: axes are Data Complexity and Computational Complexity.]
Grids
• Classical Grids emphasise sharing of physical resources.
• Existing Grid middleware allows resource discovery, resource allocation, data movement, …
[Figure: NASA Information Power Grid (http://www.ipg.nasa.gov/)]
Managing Data
• Data collections need to be managed to provide:
  • Scalability.
  • Reliability.
  • Concurrency.
  • Evolution.
• Both size and complexity matter.
Data Management Complexity
• Many different:
  • Models of data.
  • Domains of control.
  • Locations.
  • Patterns of use.
• No well-defined application boundaries:
  • Grid applications combine data access and update with computation.
Middleware Complexity
[Figure: layered Grid middleware architecture diagram, "Combining Grid and Web Services". Labels recovered from the diagram, grouped by apparent layer:]
• Clients: PDA; Web Browser; X Windows (http, https, etc.).
• Application Portals: Problem Solving Environments (AVS, SciRun, Cactus); Discipline / Application Specific Portals (e.g. SDSC TeleScience); Environment Management (LaunchPad, HotPage).
• XML / SOAP over Grid Security Infrastructure.
• Web Services: composition frameworks (e.g. XCAT); Job Submission / Control; File Transfer; Data Management; Monitoring; Events; Credential Management; Workflow Management; Grid Web Service Description (WSDL) & Discovery (UDDI); other services (visualization, interface builders, collaboration tools, numerical grid generators, etc.).
• Implementation technologies: Python, Java, etc., JSPs; CoG Kits implementing Web Services in servlets, servers, etc.; Apache SOAP, .NET, etc.; Apache Tomcat & WebSphere & Cold Fusion = JVM + servlet instantiation + routing.
• Grid Protocols and Grid Security Infrastructure.
• Grid Services (Collective and Resource Access): Grid ssh; CORBA; GRAM; Condor-G; SRB / Metadata Catalogue; Data Replica and Metadata Catalog; GridFTP; Grid Monitoring Architecture; Grid X.509 Certification Authority; Grid Information Service; MPI; Secure, Reliable Group Comm.; ……
• Grid Protocols and Grid Security Infrastructure.
• Resources: Compute (many); Storage (many); Communication; Instruments (various).
The Issues
• Identifying the most important services.
• Agreeing consistent interfaces.
• Integrating with other Grid services.
• Implementing services for database:
  • Access.
  • Integration.
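To make the distinction between database access and database integration concrete, here is a minimal sketch in Python. It is an illustration only, not the interface of any Grid standard or product: the class names `DatabaseAccessService`, `SqliteAccessService` and `IntegrationService`, their methods, and the `samples` table are all hypothetical, and an in-memory SQLite database stands in for a real database at each site.

```python
import sqlite3
from abc import ABC, abstractmethod


class DatabaseAccessService(ABC):
    """Hypothetical consistent interface that every database service exposes."""

    @abstractmethod
    def describe(self) -> dict:
        """Return metadata that a discovery service could publish."""

    @abstractmethod
    def query(self, statement: str, params: tuple = ()) -> list:
        """Run a query and return the rows as a list of tuples."""


class SqliteAccessService(DatabaseAccessService):
    """Access: wraps one database (in-memory SQLite as a stand-in)."""

    def __init__(self, name: str, setup_statements: list):
        self.name = name
        self._conn = sqlite3.connect(":memory:")
        for statement in setup_statements:
            self._conn.execute(statement)
        self._conn.commit()

    def describe(self) -> dict:
        tables = self._conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return {"name": self.name, "tables": [t[0] for t in tables]}

    def query(self, statement: str, params: tuple = ()) -> list:
        return self._conn.execute(statement, params).fetchall()


class IntegrationService:
    """Integration: sends one query to several access services and merges rows."""

    def __init__(self, services):
        self._services = list(services)

    def query_all(self, statement: str, params: tuple = ()) -> list:
        rows = []
        for service in self._services:
            # Tag each row with the name of the service it came from.
            rows.extend(
                (service.name, *row) for row in service.query(statement, params)
            )
        return rows


# Two independent "sites", each with its own database behind the same interface.
site_a = SqliteAccessService("site_a", [
    "CREATE TABLE samples (id INTEGER, label TEXT)",
    "INSERT INTO samples VALUES (1, 'alpha')",
])
site_b = SqliteAccessService("site_b", [
    "CREATE TABLE samples (id INTEGER, label TEXT)",
    "INSERT INTO samples VALUES (2, 'beta')",
])

integrated = IntegrationService([site_a, site_b])
results = integrated.query_all("SELECT id, label FROM samples ORDER BY id")
print(results)
```

Because both sites implement the same interface, the integration layer needs no site-specific code. A real service would also have to address the other issues listed above, such as security and integration with other Grid services, which this sketch leaves out.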
Pilot Projects
• Several application-based pilot projects need database functionalities.
• These projects:
  • Push current technologies.
  • Identify generic requirements.
Database Task Force
• Collating requirements.
• Developing standards.
• Developing reference implementations.
Development Project
• Developing database access and integration services.
• Joint between industry and e-Science Centres.
• Exploiting the Open Grid Services Architecture (OGSA).
Summary
• Early Grid middleware largely overlooks the importance of databases.
• The UK e-Science programme was quick to identify this omission.
• The UK e-Science programme is working within GGF to establish standards.
• Reference implementations of the standards will be pioneered within the UK.