A Bright Future with OGSA & DAIS Data Services

Malcolm Atkinson
Director
www.nesc.ac.uk
6th July 2004
A la carte Menu
• E-Science
• Grid Infrastructure Deployment
• WS & Grid comparisons
• OGSI → WS-RF →
WS-RF (5) + WS-Notification (3)
• OGSA Data & DAIS
• OGSA-DAI
• Future Look
BNCOD Pre-meeting on Grids, NeSC, Edinburgh 6th July 2004 - 2
What is e-Science?
• Goal: to enable better research
• Method: Invention and exploitation of
advanced computational methods
to generate, curate and analyse research data
• From experiments, observations and simulations
• Quality management, preservation and reliable evidence
to develop and explore models and simulations
• Computation and data at extreme scales
• Trustworthy, economic, timely and relevant results
to enable dynamic distributed virtual organisations
• Facilitating collaboration with information and resource sharing
• Security, reliability, accountability, manageability and agility
The Primary Requirement …
Enabling People to Work Together on Challenging Projects: Science, Engineering & Medicine
The e-Science Centres
[Map figure; labels:]
• Globus Alliance
• Open Middleware Infrastructure Institute
• Digital Curation Centre
• e-Science Institute
• Grid Operations Centre
• CeSC (Cambridge)
• EGEE
UK e-Science Grids
[Map figure; labels as extracted:]
• 1600 x CPU, AIX
• 512 x CPU, Irix
• Engineering Task Force (contributions from e-Science Centres)
• HPC(x)
• 20 x CPU, 18TB Disk, Linux
• Grid Support Centre / Grid Operations Centre
• NGS
• OGSA Test Grid projects
• CeSC (Cambridge)
• 64 x CPU, 4TB Disk, Linux
Importance of collaboration: VDT
• A highly successful collaborative effort
VDT Working Group
VDS (Chimera/Pegasus) team
• Provides the “V” in VDT
Condor Team
Globus Alliance
NMI Build and Test team
EDG/LCG/EGEE
• Middleware, testing, patches, feedback …
PPDG
• Hardening and testing
Pacman
• Provides easy installation capability
• Currently Pacman 2, moving to Pacman 3 soon
Used by many projects
Systematic testing
Rich integration of components
The UK will be part of this
– exploit test bed
– contribute components
Thanks to Miron Livny
E-Science’s Growing Assets
• Understanding of Processes & Requirements
• International and Multi-disciplinary Skill base
• Experience composing & adapting existing
technologies
and of building new components
• Experience Supporting Developers and Users
• Experience Establishing Virtual Organisations
across Enterprise boundaries
Embedded in People & Teams, Growing – they need nurture
EGEE Implementation
• From day 1 (1st April 2004)
Production grid service based on the LCG infrastructure running LCG-2 grid
middleware (SA)
LCG-2 will be maintained until the new generation has proven itself (fallback
solution)
• In parallel develop a “next generation” grid facility (JRA)
Produce a new set of grid services according to evolving standards (Web Services)
Run a development service providing early access for evaluation purposes
Will replace LCG-2 on production facility in 2005
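[Timeline figure: LCG-1 and LCG-2 (Globus 2 based), followed by EGEE-1 and EGEE-2 (Web services based). Contributing middleware shown: VDT, EDG, ... feeding LCG; AliEn, LCG, ... feeding EGEE.]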
Certification, Testing and Release Cycle
[Flow diagram; recoverable labels:]
• Activities: JRA1, SA1
• Phases: Development & Integration, Unit & Functional Testing; Certification Testing; Application Integration; Deployment Preparation; Pre-production; Production
• Steps: Integrate; Basic Functionality Tests; Run Certification Matrix; Run tests (C&T suites, Site suites); Apps SW Installation; Deploy
• Tags: Dev tag; Release candidate tag; Certified release tag; Deployment release tag; Production tag
• Application groups: HEP experiments; Bio-med; Other (TBD)
• Services
Sites in LCG-2/EGEE-0: June 4 2004
Austria: U-Innsbruck
Canada: Triumf, Alberta, Carleton, Montreal, Toronto
Czech Republic: Prague-FZU, Prague-CESNET
France: CC-IN2P3, Clermont-Ferrand
Germany: FZK, Aachen, DESY, Wuppertal
Greece: HellasGrid
Hungary: Budapest
India: TIFR
Israel: Tel-Aviv, Weizmann
Italy: CNAF, Frascati, Legnaro, Milano, Napoli, Roma, Torino
Japan: Tokyo
Netherlands: NIKHEF
Pakistan: NCP
Poland: Krakow
Portugal: LIP
Russia: SINP-Moscow, JINR-Dubna
Spain: PIC, UAM, USC, UB-Barcelona, IFCA, CIEMAT, IFIC
Switzerland: CERN, CSCS
Taiwan: ASCC, NCU
UK: RAL, Birmingham, Cavendish, Glasgow, Imperial, Lancaster, Manchester, QMUL, RAL-PP, Sheffield, UCL
US: BNL, FNAL
HP: Puerto-Rico
• 22 Countries
• 58 Sites (45 Europe, 2 US, 5 Canada, 5 Asia, 1 HP)
• Coming: New Zealand, China, other HP (Brazil, Singapore)
• 3800 cpu
Examples of HealthGRID applications
Grids for medical development
• Preparation and follow-up of medical missions in developing countries
• Support to local medical centres in terms of second diagnosis, patient follow-up and e-learning
[Map figure; labels: Clermont-Ferrand/Paris, Ibagué, Chuxiong, Hand surgery, Medical centre]
2 missions (Ibagué & Chuxiong) with the French NPO « Chaîne de l’Espoir » used as test cases
The grid impact:
• Improved telemedicine services
• Federation of patient databases
• Interactive e-learning (high bandwidth network required)
DataGrid: status of biomedical applications
[Status legend: deployed; tested on EDG; under preparation]
• Bio-informatics
Phylogenetics: BBE Lyon (T. Sylvestre)
Search for primers: Centrale Paris (K. Kurata)
Bio-informatics web portal: IBCP (C. Blanchet)
Parasitology: LBP Clermont, Univ B. Pascal (N. Jacq)
Data-mining on DNA chips: Karolinska (R. Médina, R. Martinez)
Geometrical protein comparison: Univ. Padova (C. Ferrari)
• Medical imaging
MR image simulation: CREATIS (H. Benoit-Cattin)
Medical data and metadata management: CREATIS (J. Montagnat)
Mammography analysis: ERIC/Lyon 2 (S. Miguet, T. Tweed)
Simulation platform for PET/SPECT based on Geant4: GATE collaboration (L. Maigne)
GATE Monte Carlo simulation platform for nuclear medicine
[Chart: time in minutes (0 to 180) against parallelisation; series: Local_Monopro1500MHz, X10, X20, X50, X100]
October 2001 View
Web Services
Grid Technology
Grid Services
Web Services – Q4 2001 view
• Independence
Client from Service
Service from Client
• Description
Web Services DL
…
• Separation
Function from Delivery
• Tools & Platforms
Java ONE
Visual .NET
WebSphere
Oracle
• Commercial Buy in
www.w3.org/TR/SOAP or www.w3.org/TR/wsdl
Grid Technology
• Distribution
Various Protocols
FTP
• Security
Single Sign in
• Resource Sharing
Discovery
Process Creation
Scheduling
• Portability
APIs
• Gov’nm’t Agency Buy in
Foster, I., Kesselman, C. and Tuecke, S., The Anatomy of the Grid: Enabling Scalable Virtual Organizations, Intl. J. Supercomputer Applications, 15(3), 2001
Service-Oriented Architecture
Registry
Discovery
Registration
Invocation
Client
Service
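The three interactions named on this slide (registration, discovery, invocation) can be sketched in a few lines. This is only an in-process illustration: the `Registry` class, the `"echo"` capability name and `echo_service` are hypothetical names chosen for the example, and a real deployment would use a networked registry and message-based invocation.

```python
from typing import Callable, Dict, List

ServiceFn = Callable[[str], str]


class Registry:
    """Hypothetical in-process stand-in for a service registry."""

    def __init__(self) -> None:
        self._services: Dict[str, List[ServiceFn]] = {}

    def register(self, capability: str, service: ServiceFn) -> None:
        # Registration: a service advertises what it can do.
        self._services.setdefault(capability, []).append(service)

    def discover(self, capability: str) -> List[ServiceFn]:
        # Discovery: a client asks which services offer a capability.
        return list(self._services.get(capability, []))


def echo_service(message: str) -> str:
    """Hypothetical service used only for this example."""
    return f"echo: {message}"


registry = Registry()
registry.register("echo", echo_service)   # Registration (service -> registry)
candidates = registry.discover("echo")    # Discovery (client -> registry)
reply = candidates[0]("hello")            # Invocation (client -> service)
print(reply)
```

The client never names `echo_service` directly; it only depends on the capability it looked up, which is the independence of client from service that the Web Services slides emphasise.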
WS & Grid Comparison 1
Web Services
• Goals
Computational presentation & access of Enterprise services
Marketing integrated large scale software and systems
Model for independent development
Model for independent operation
Grid Services
• Goals
Inter-organisational collaboration
Sharing information and resources
Framework for collaborative development
Framework for collaborative operation
WS & Grid Comparison 2
Web Services
• Commitment
Most large technology providers
Some service providers
Some service hosters
• Standardisation
W3C
Oasis
…
Grid Services
• Commitment
Some large laboratories
Many government-funded research programmes
Some resource providers
• Standardisation
IETF
GGF
Oasis
WS & Grid Comparison 3
Web Services
• Standards
WS-I
• Core of agreed & provided
• WSDL, SOAP, UDDI, WS-security
• Revised regularly
Many others under way
• WS-* are important
• Competition & synthesis
Commercial battleground
• Do these standards support my business model
• When do I want them
Hard to understand and engage
Grid Services
• Standards
None
Many exist as proposals
Important Architecture
• OGSA
Continuum from requirements & research to well specified standards proposals
Building on & influencing WS
Hard to understand & engage
What matters is what is widely adopted
WS & Grid Comparison 4
Web Services
• Usage
Complex services created & delivered persistently by owner organisation
Client interactions short-lived
Multi-organisation integration responsibility of client
• Workflow enactment
• Transaction coordination
• May be by an intermediate service
Security on a local basis
Grid Services
• Usage
All of WS patterns +
Dynamic services / resources
Long-lived interactions
Persistent computational integration
• Data management
• Computation management
Persistent operational infrastructures
• GOC managing European-scale grid
System organised optimisation
End-to-end security (goal)
Virtual Organisations
• Establish multi-organisation security policies
WS & Grid Comparison 5
Web Services
• Status
Commercially successful operational applications
Several good toolsets available
• Mostly costly to use outside academia
Workflow enactment
• BPEL4WS
• High-level work-load generators
Scale, usability & reliability problems in free-ware
• Many fixes were needed to Apache Tomcat
Beware hype and marketing
Much momentum
Very high levels of investment
Grid Services
• Status
Operational research projects and grids
• >100 projects use GT2 or GT3
No toolsets
Scientific workflow
• Chimera, Pegasus, VDT, …
Some very robust and well tested technologies
• Condor, GT2, VDT, GT3.2, LCG2, EGEE1
All free-ware
Performance, usability and reliability problems
Much momentum
High levels of investment
WS & Grid Comparison 6
Web Services
• Interaction
Grids will influence provision systems
Grids stimulating many standards development
Grid Services
• Interaction
Using web services extensively
Balancing act
• Reach goals
• Retain access to WS tools
Expect a continuous co-evolution
• Significant new species next year
Application goals push technical limits in both cases
At limits expect difficulties – most work not at limits
What changed and why?
• OGSI
One unit
• Dynamic resources & properties
• Static functions
Dynamic lifetime management
• Creation & termination common (cheap) operations
Global persistent identification [Callout: Lost]
[Callout: Notification continues]
• WS-Resource Framework
WS-Address
WS has functions
Resource has lifetime, properties, etc.
[Callout: Partitioned specs more manageable ☺]
Reconciles static view of WS with required dynamics
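The split described above, where the service keeps a static set of functions and each resource carries its own properties and lifetime, can be illustrated with a minimal sketch. It does not use any WSRF toolkit: `CounterService`, `Resource`, the `"count"` property and the key format are hypothetical names for this example, and the clock is passed in explicitly so the behaviour is deterministic.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Resource:
    """State lives here: properties plus a scheduled termination time."""
    properties: Dict[str, str] = field(default_factory=dict)
    termination_time: Optional[float] = None  # None means no scheduled end


class CounterService:
    """Static set of functions; every call names the resource it acts on."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}
        self._ids = itertools.count(1)

    def create(self, lifetime_s: float, now: float) -> str:
        # Creation is a cheap, common operation that returns a resource key.
        key = f"res-{next(self._ids)}"
        self._resources[key] = Resource({"count": "0"}, now + lifetime_s)
        return key

    def get_property(self, key: str, name: str, now: float) -> str:
        return self._live(key, now).properties[name]

    def increment(self, key: str, now: float) -> None:
        res = self._live(key, now)
        res.properties["count"] = str(int(res.properties["count"]) + 1)

    def destroy(self, key: str) -> None:
        # Immediate termination.
        self._resources.pop(key, None)

    def _live(self, key: str, now: float) -> Resource:
        # Scheduled termination: expired resources are dropped on access.
        res = self._resources.get(key)
        expired = (res is not None and res.termination_time is not None
                   and now >= res.termination_time)
        if res is None or expired:
            self._resources.pop(key, None)
            raise KeyError(key)
        return res


service = CounterService()
key = service.create(lifetime_s=60.0, now=0.0)
service.increment(key, now=1.0)
value = service.get_property(key, "count", now=2.0)
print(key, value)
```

In this sketch the service object could be restarted or replicated without changing its interface, since nothing about an individual counter is baked into the functions themselves.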
GT & WSRF Timeline
[Timeline figure, 2004 to 2005. Releases: 3.2 (March 2004), 4.0 β (Q2), 4.0 (Q3), 4.2 (Q2 ‘05). Milestones marked include the GGF10 interop demo, OASIS TC and a tech preview.]
• GT3.2
Improved robustness, scalability, performance, usability
• GT4.0
WSRF; some new functionality; further usability, performance enhancements
Not waiting for finalisation of WSRF specs. Use as submitted
• GT4.2
Numerous new WSRF-based services
Components in GT 3.2
• Security
GSI
WS-Security
CAS (OGSI)
SimpleCA
• Data Management
WU GridFTP
RFT (OGSI)
RLS
OGSA-DAI
• Resource Management
Pre-WS GRAM
WS GRAM (OGSI)
• Information Services
MDS2
WS-Index (OGSI)
• WS Core
JAVA WS Core (OGSI)
OGSI C Bindings
OGSI Python Bindings (contributed)
pyGlobus (contributed)
XIO
Planned Components in GT 4.0
• Security
GSI
WS-Security
CAS (WSRF)
SimpleCA
Authz Framework
• Data Management
New GridFTP
RFT (WSRF)
RLS
OGSA-DAI
• Resource Management
Pre-WS GRAM
WS-GRAM (WSRF)
CSF (contribution)
• Information Services
MDS2
WS-Index (WSRF)
• WS Core
JAVA WS Core (WSRF)
C WS Core (WSRF)
pyGlobus (contributed)
XIO
Relative Importance – a sense of proportion
• What envelopes you put your messages in
How they are delivered
Infrastructure to organise a common technical
platform – the foundations of communication
Relative Importance
• What envelopes you put your messages in
How they are delivered
Infrastructure to organise a common technical platform – the foundations of
communication
• What information you send in your messages
Their patterns of Use - sequences that mean
something
Their Contents
The Grammar and Vocabulary of Communication
Agreed Interpretations
Scope of OGSI to WS-Resource Framework change
Relative Importance
[Callout: Technical Experts]
• What envelopes you put your messages in
How they are delivered
Infrastructure to organise a common technical platform – the foundations of communication
• What information you send in your messages
Their patterns of Use - sequences that mean something
Their Contents
The Grammar and Vocabulary of Communication
Agreed Interpretations
• What you do when you get a message
The Application Code you Execute
The Middleware Services
• Security, Privacy, Authorisation, Accounting, Registries, Brokers, …
Integration Services
• Multi-site Hierarchical Scheduling, Data Access & Integration, …
Portals, Workflow Systems, Virtual Data, Semantic Grids
Tools to support Application Developers, Users & Operations
• Incremental deployment tools, diagnostic aids, performance monitoring, …
Relative Importance
[Callout: Domain Experts]
• What envelopes you put your messages in
How they are delivered
Infrastructure to organise a common technical platform – the foundations of communication
• What information you send in your messages
Their patterns of Use - sequences that mean something
Their Contents
The Grammar and Vocabulary of Communication
Agreed Interpretations
• What you do when you get a message
The Application Code you Execute
The Middleware Services
• Security, Privacy, Authorisation, Accounting, Registries, Brokers, …
Integration Services
• Multi-site Hierarchical Scheduling, Data Access & Integration, …
Portals, Workflow Systems, Virtual Data, Semantic Grids
Tools to support Application Developers, Users & Operations
• Creative Actions and Judgements of Researchers, Designers & Clinicians
Data, Models & Analyses
In Silico Experiments, Design, Diagnosis & Planning
Creating the Scientific Record
Move Computation to Data
• Code scale
Depends on wet-ware
• No noticeable rate of improvement
• Data scale
Grows Moore’s Law or Moore’s Law²
• Analysis of data
Extracts & derivatives used
• Often smaller – more value for current investigation
• Implies move code to data
SQL, XQuery, Java code, …
• Extensibility mechanisms used by OGSA-DAIers
• Java mobility (e.g. DataCutter), database procedures, …
[Callout: Increasingly necessary. Application control or higher-level service decisions]
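The point about extracts being smaller than the source data can be made concrete with a small sketch. It uses Python's standard `sqlite3` module as a stand-in for a remote data resource; the `readings` table and its contents are invented for the example. The first approach ships every row to the client, the second ships the query to the data and returns only the derived result.

```python
import sqlite3

# An in-memory database stands in for a remote data resource (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", 20.0)],
)

# Data to code: every row crosses the boundary, then the client aggregates.
rows = conn.execute("SELECT sensor, value FROM readings").fetchall()
totals, counts = {}, {}
for sensor, value in rows:
    totals[sensor] = totals.get(sensor, 0.0) + value
    counts[sensor] = counts.get(sensor, 0) + 1
client_side = {s: totals[s] / counts[s] for s in totals}

# Code to data: the query travels, only one row per sensor comes back.
summary = conn.execute(
    "SELECT sensor, AVG(value) FROM readings GROUP BY sensor"
).fetchall()
server_side = dict(summary)

print(len(rows), "rows moved versus", len(summary))
print(client_side == server_side)
```

With four rows the saving is trivial, but the ratio between rows moved and rows returned is what grows as the data scale outpaces the code scale.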
Integration is Everything
• Motivation
No business or research team is satisfied with one data resource
Human centred
Domain-specialist driven
Dynamic specification of combination function
Iterative processes
• Revised request minutes later
• Revised request after months of thought
Sources inevitably heterogeneous
Time-varying content, structure & policies
Robust & stable steerable integration services
Higher-level services over multiple resources
Fundamental requirements for (re)negotiation
[Callout: Federation or Virtualisation preceding integration, or kit of integration tools to be interwoven with an application?]
Multiple tasks / request
[Diagram; recoverable labels: CLIENT, REQUESTOR, API, STUB, Data Set 1, Data Set 2, and a sequence of entries numbered 0 to 7, each with fields Ident, Type, Value]
Be Direct
• Double Handling costs too much
• Double Handling via discs pathologically bad
• Data translation expensive
Avoid
• Deliver as stored, …
Memory cycles, bus capacity, cache disruption, …
Compose
Stream
• Main memory is not big enough
Stream or use Disk
• Couple generator & consumer directly
Stream from RAM to RAM
Requires coupled computation execution services
Models for process transformation and optimisation
[Callout: Breaks down boundaries and merges data, execution & transport requirements. Demands smart workflow enactment service & foundation]
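Coupling a generator directly to a consumer, so that items flow through without an intermediate copy, can be sketched with Python generators. The function names `generate`, `transform` and `consume` are illustrative only, and this single-process pipeline is an analogy for the RAM-to-RAM streaming between services that the slide asks for, not an implementation of it.

```python
from typing import Iterable, Iterator


def generate(n: int) -> Iterator[int]:
    """Producer: yields items one at a time instead of building a list."""
    for i in range(n):
        yield i


def transform(items: Iterable[int]) -> Iterator[int]:
    """Composed step: processes each item as it passes through."""
    for item in items:
        yield item * item


def consume(items: Iterable[int]) -> int:
    """Consumer: pulls items directly from the upstream stage."""
    total = 0
    for item in items:
        total += item
    return total


# The stages are composed, so no stage ever holds the whole data set
# and nothing is written to disk between them.
result = consume(transform(generate(1000)))
print(result)
```

Replacing `generate(1000)` with a much larger range does not change the memory footprint, which is the property that matters when main memory is not big enough for the whole data set.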
OGSA-DAI
[Diagram; components: Analyst, Registry (GDSR), Factory (GDSF), Grid Data Service (GDS), Database (Xindice, MySQL, Oracle, DB2), Consumer. Link types: SOAP/HTTP, service creation, API interactions]
• Request to Registry for sources of data about “x”
• Registry responds with Factory handle
• Request to Factory for access to database
• Factory creates GridDataService
• Factory returns handle of GDS to client
• Client queries GDS with SQL, XPath, XQuery etc
• GDS interacts with database
• Query results returned as XML, or delivered to consumer as XML
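The interaction sequence above (registry lookup, factory creation of a data service, query, XML result) can be mimicked in a single process. This sketch does not use the OGSA-DAI API: the classes `Registry`, `Factory` and `GridDataService`, their method names, the `"stars"` topic and the table contents are all hypothetical, and `sqlite3` stands in for the backing database.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Dict


class GridDataService:
    """Hypothetical stand-in for a GDS: runs a query, returns XML."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def perform(self, sql: str) -> str:
        cursor = self._conn.execute(sql)          # GDS interacts with database
        columns = [d[0] for d in cursor.description]
        root = ET.Element("results")
        for row in cursor:
            row_el = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_el, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class Factory:
    """Hypothetical stand-in for a GDSF: creates a service per request."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def create_service(self) -> GridDataService:
        return GridDataService(self._conn)


class Registry:
    """Hypothetical stand-in for a GDSR: maps topics to factories."""

    def __init__(self) -> None:
        self._factories: Dict[str, Factory] = {}

    def register(self, topic: str, factory: Factory) -> None:
        self._factories[topic] = factory

    def find(self, topic: str) -> Factory:
        return self._factories[topic]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stars (name TEXT, magnitude REAL)")
conn.execute("INSERT INTO stars VALUES ('Sirius', -1.46)")

registry = Registry()
registry.register("stars", Factory(conn))

factory = registry.find("stars")            # Registry responds with factory
gds = factory.create_service()              # Factory creates and returns GDS
xml_text = gds.perform("SELECT name FROM stars")  # Client queries GDS
print(xml_text)
```

In the real system each arrow is a SOAP/HTTP exchange and the handles are service references rather than Python objects; the sketch only preserves the order of the steps and the XML-shaped result.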