An overview of e-science and Grid technologies

Enabling Grids for E-sciencE
An overview of e-science
and Grid technologies
Dave Berry
Deputy Director, Research & E-Infrastructure Development
National e-Science Centre
daveb@nesc.ac.uk
9th March 2006
www.eu-egee.org
INFSO-RI-508833
Contents
• Introduction to E-Science
• E-Infrastructure & Grids
• E-Infrastructure - Where are we now?
• Summary
– Enabling the research & business of the future
– and for early adopters… the present!
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
John Taylor
Director General of Research Councils
Office of Science and Technology
A new way of doing science
[Diagram: technology push (networking, grids, instrumentation, computation, data curation…) meets application pull (value added of distributed collaborative research (virtual organisations)).]
a new way for all scientists to work on research challenges that would otherwise be difficult to address
Mário Campolargo DG INFSO F3, Pisa 24th October 2005
EBank
Slide from Jeremy Frey
Biomedical Research Informatics
Delivered by Grid Enabled Services
[Diagram: the CFG Virtual Organisation (Glasgow, Edinburgh, Oxford, Leicester, London, Netherlands), with private data held at each site, reaches publicly curated data (Ensembl, OMIM, SWISS-PROT, MGI, HUGO, RGD, …) through a Portal, a Data Hub and a Synteny Grid Service.]
http://www.brc.dcs.gla.ac.uk/projects/bridges/
eDiaMoND: Screening for
Breast Cancer
[Diagram: the eDiaMoND Grid, scaling from 1 Trust to Many Trusts, carries case and reading information between:
– Screening: patients, radiology reporting systems, letters, electronic patient records, case information, X-rays
– Assessment / Symptomatic: biopsy, secondary capture or FFD, case information, other modalities (MRI, PET, ultrasound)
– Digital reading: SMF, CAD, temporal comparison, 3D images; better access to case information and digital tools
– Training: manage training cases, perform training, SMF, CAD; supplement mentoring with access to digital training cases and sharing of information across clinics
Also shown: collaborative working, audit capability, epidemiology.]
Provided by eDiamond project: Prof. Sir Mike Brady et al.
climateprediction.net and GENIE
• Largest climate model ensemble
• >45,000 users, >1,000,000 model years
[Figure: Response of Atlantic circulation to freshwater forcing; panels labelled 2K and 10K]
UK Grid for Particle Physics (2003)
[Figure labels: CMS, LHCb, ATLAS]
GridPP www.gridpp.ac.uk
What is e-science?
• Collaborative research that is made possible by the sharing across the Internet of resources (data, instruments, computation, people’s expertise...)
– Crosses organisational boundaries
– Often very compute intensive
– Often very data intensive
– Sometimes large-scale collaboration
• Began with focus in the “big sciences”
– Spreading to new user communities (social science, arts, humanities…)
• Technologies also relevant in industry, government, public services
DAME: Grid-based tools and infrastructure for aero-engine diagnosis and prognosis
[Diagram: engine flight data from London Airport and New York Airport flows over the Grid to the airline office, Diagnostics Centre, Maintenance Centre, American data center and European data center. Tools shown: XTO, Engine Model, Case Based Reasoning, Signal Data Explorer.]
• “A significant factor in the success of the Rolls-Royce campaign to power the Boeing 7E7 with the Trent 1000 was the emphasis on the new aftermarket support service for the engines provided via DS&S. Boeing personnel were shown DAME as an example of the new ways of gathering and processing the large amounts of data that could be retrieved from an advanced aircraft such as the 7E7, and they were very impressed”, DS&S 2004
Companies: Rolls-Royce, DS&S, Cybula
Universities: York, Leeds, Sheffield, Oxford
Healthcare @ Home
[Diagram: referrals connect the participants in a patient's care, each working from home, mobile or clinic:
– Patient (via TV, PDA, laptop, PC or paper)
– GP (via PDA, laptop, PC or paper)
– Diabetician (via PDA, laptop, PC or paper)
– Various clinical specialists (distributed), e.g. ophthalmologist, podiatrist, vascular surgeons, renal specialists, wound clinic, foot care clinic, neurologists, cardiologists
– Dietitian, biochemist, diabetes specialist / other specialist nurses
– Community nurses / health visitors (via TV, PDA, laptop, PC or paper)
Other labels: VARIABLES, ACCESS MATRIX, CASE.]
What is E-Infrastructure? – Political view
• A shared resource
– That enables science, research, engineering, medicine, industry, …
– It will improve UK / European / … productivity
  ▪ Lisbon Accord 2000
  ▪ E-Science Vision SR2000 – John Taylor
– Commitment by UK government
  ▪ Sections 2.23-2.25
– Always there
  ▪ cf. telephones, transport, power, internet
What is Grid Computing?
• The grid vision is of “Virtual computing” (+ information services to locate computation, storage resources)
– Compare: The web: “virtual documents” (+ search engine to locate them)
• MOTIVATION: collaboration through sharing resources (and expertise) to expand horizons of
– Research
– Commerce – engineering, …
– Public service – health, environment, …
The Grid Metaphor
[Diagram: Grid middleware connects mobile access, workstations and visualising with supercomputers, PC clusters, data storage, sensors and experiments, over the Internet and other networks.]
Computing as a Commodity
• From hand-built research computers to PCs on every desktop
• From individual computers to the Internet and the WWW
• From specialist supercomputers to clusters and cycle-scavenging
• From proprietary formats to standards and ontologies
• From room-sized computers to PDAs and mobile phones
• From individual servers to virtualised, dynamically provisioned blade farms
• From applications to services
• From ownership to computing-on-demand
What is E-Infrastructure?
• Grids permit resource sharing across administrative domains
– Dynamic allocation & configuration
• Networks permit communication across geographical distance
• Supporting organisations
– Operations for grids, networks
• Resources
– Computers
– Digital libraries
– Research data
– Instruments
• Middleware
– Authentication, Authorisation
– Registries, search engines
– Toolkits, environments
– Shared vocabularies
– Provenance
[Diagram labels: Collaboration; Grid; Operations, Support and training; Network infrastructure & Resources]
Typical current science Grid
• Grid middleware runs on each shared resource
– Data storage
– (Usually) batch jobs on pools of processors
• Users join a Virtual Organisation (VO)
• VO negotiates with sites to agree access to resources
• Distributed services (both people and software) enable the grid
[Diagram label: INTERNET]
Typical current science grid
[Diagram: job flow through a grid.
Components: User/Grid interface, Resource Broker, Information Service, File Replica Catalogue, Logging & Book-keeping, Authorisation & Authentication, Storage Resource, Computing Resource (= batch queue).
Labelled flows: Input files, Output files, Datasets info, Publish, Job Submit Event, Job Query, Job Status.]
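The role of the Resource Broker among the components listed above can be sketched in a few lines of Python. This is a toy illustration of matchmaking only, not the interface of any real grid middleware: the class names, fields, site names and the ranking rule (prefer a site that already holds a replica of the input data, then the queue with the most free slots) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ComputingResource:
    """One batch queue as published to the information service (hypothetical fields)."""
    name: str
    free_slots: int
    supported_vos: frozenset


@dataclass
class Job:
    """A job request as it reaches the broker (hypothetical fields)."""
    vo: str
    input_dataset: str


def broker(job, information_service, replica_catalogue):
    """Pick a computing resource for the job, or return None if nothing matches.

    information_service: list of ComputingResource records
    replica_catalogue: dict mapping dataset name -> set of site names holding a copy
    """
    sites_with_data = replica_catalogue.get(job.input_dataset, set())
    candidates = [
        r for r in information_service
        if job.vo in r.supported_vos and r.free_slots > 0
    ]
    if not candidates:
        return None
    # Assumed ranking: data locality first, then the emptiest queue.
    return max(candidates, key=lambda r: (r.name in sites_with_data, r.free_slots))


# Made-up sites and datasets, for illustration only.
information_service = [
    ComputingResource("site-a", 4, frozenset({"biomed"})),
    ComputingResource("site-b", 10, frozenset({"biomed", "atlas"})),
    ComputingResource("site-c", 0, frozenset({"atlas"})),
]
replica_catalogue = {"run-42": {"site-a"}}

chosen = broker(Job(vo="biomed", input_dataset="run-42"),
                information_service, replica_catalogue)
print(chosen.name)  # site-a: it supports the VO and already holds the input data
```

A production broker would also consult authorisation, logging and job-status services; those are left out here to keep the matchmaking step visible.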
Empowering users
[Diagram: layered stack, top to bottom]
Application
Application toolkits, standards
Middleware: “collective services”
Basic Grid services: AA, job submission, info, …

VO-specific developments:
– Portals
– Virtual Research Environments
– Semantics, ontologies
– Workflow
– Registries of VO services
Production grids provide these services.
Develop above these to empower ordinary users!
Example Portal - BRIDGES
[Diagram, read in the order of the job flow:]
– End user: authenticate at BRIDGES web portal with username and password only
– BRIDGES web portal: job request is passed on securely with username
– NeSC grid server with host credentials: get user authorisations from the NeSC machine with PERMIS authorisation service (GT3.3)
– NeSC grid server: make host proxy, authenticate with NGS and submit job
– NGS clusters: Leeds, Oxford, RAL, Manchester
Slide by Micha Bayer, NeSC
Workflow example
• Taverna in MyGrid http://www.mygrid.org.uk/
• “allows the e-Scientist to describe and enact their experimental processes in a structured, repeatable and verifiable way”
• GUI
• Workflow language
• Enactment engine
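What an enactment engine does can be shown with a small Python sketch: steps are declared with the steps they depend on, and the engine runs them in dependency order while keeping a log so that the run is repeatable and verifiable. This is not Taverna's workflow language or API; the `enact` function, the dictionary layout and the example steps are hypothetical, and the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter


def enact(workflow, inputs):
    """Run every step after the steps it depends on.

    workflow: dict mapping step name -> (function, list of upstream names)
    inputs:   dict mapping input name -> value
    Returns (results, log), where log records the order steps were run in.
    """
    graph = {name: deps for name, (_, deps) in workflow.items()}
    results = dict(inputs)
    log = []
    for name in TopologicalSorter(graph).static_order():
        if name in results:  # a workflow input, nothing to run
            continue
        func, deps = workflow[name]
        results[name] = func(*(results[d] for d in deps))
        log.append(name)
    return results, log


# Hypothetical experiment: clean a sequence, then measure it.
workflow = {
    "cleaned": (lambda s: s.strip().upper(), ["sequence"]),
    "length": (len, ["cleaned"]),
    "gc_count": (lambda s: s.count("G") + s.count("C"), ["cleaned"]),
    "gc_fraction": (lambda gc, n: gc / n, ["gc_count", "length"]),
}

results, log = enact(workflow, {"sequence": " gattaca "})
print(results["cleaned"], results["length"], results["gc_count"])  # GATTACA 7 2
print(log)  # "cleaned" runs first and "gc_fraction" last
```

Keeping the workflow as data rather than as a script is the design choice that makes it describable in a GUI and re-runnable by someone else.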
Notification
Pub/Sub for laboratory data using a broker and ultimately delivered over GPRS
Comb-e-chem: Jeremy Frey
Workshop
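The publish/subscribe pattern named on this slide can be sketched as follows. It is an in-process toy under stated assumptions, not the Comb-e-chem system: the `Broker` class, the topic names and the message fields are invented for illustration, and a real deployment would deliver messages over a network (GPRS on the slide) rather than by a direct function call.

```python
from collections import defaultdict


class Broker:
    """Routes each published message to every subscriber of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message and return how many subscribers received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


lab_broker = Broker()
phone_inbox = []  # stands in for a scientist's mobile device

# The publisher (an instrument) and the subscriber never refer to each other.
lab_broker.subscribe("lab/temperature", phone_inbox.append)
delivered = lab_broker.publish("lab/temperature", {"celsius": 21.5})
missed = lab_broker.publish("lab/pressure", {"kPa": 101.3})

print(delivered, missed, phone_inbox)  # 1 0 [{'celsius': 21.5}]
```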
The many scales of grids
[Diagram: grids at increasing scale; the axis reads “Wider collaboration, greater resources”.]
Grid scales, largest first:
– International grid (EGEE)
– National grids (e.g. National Grid Service)
– Regional grids (e.g. White Rose Grid)
– Campus grids
– Condor pools
Resources at these scales:
– International instruments, …
– National datacentres, HPC, instruments
– Institutes’ data
Security and trust -1
• Providers of resources (computers, databases, …) need risks to be controlled: they are asked to trust users they do not know
– They trust a VO
– The VO trusts its members
• Users need
– Single sign-on: to be able to log on to a machine that can pass the user’s authorisation to other resources
– To trust owners of the resources they are using
• Build middleware on a layer providing:
– Authentication: know who wants to use resource
– Authorisation: know what the user is allowed to do
– Security: reduce vulnerability, e.g. from outside the firewall
– Non-repudiation: knowing who did what
Security and trust -2
• Achieved by Certification:
– User’s identity has to be certified by one of the national Certification Authorities (CAs)
  ▪ CAs are mutually recognized: http://www.gridpma.org/
  ▪ E.g. in UK go to http://www.grid-support.ac.uk/ca/ralist.htm
– Resources are also certified by CAs
• User
– User joins a VO
– Digital certificate is basis of AA (authentication and authorisation)
– Identity passed to resources you use, where it is mapped to a local account
• Policies express the rights for a Virtual Organization to use resources
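The chain of trust described on these two slides (a CA certifies the user, the VO vouches for its members, the site trusts the VO and maps the identity to a local account) can be sketched in Python. This is a deliberately simplified model: the names, subject strings and policy tables are made up, and the cryptographic verification of the certificate signature that real middleware performs is replaced here by a plain lookup of the issuer.

```python
# Hypothetical policy data; a real grid keeps these in CA, VO and site services.
TRUSTED_CAS = {"/C=UK/O=eScienceCA"}
VO_MEMBERS = {"biomed": {"/C=UK/O=eScience/OU=Edinburgh/CN=alice"}}
SITE_POLICY = {"biomed": "biomed001"}  # VO -> local account at this site


def authenticate(certificate):
    """Authentication: accept the identity only if a recognised CA issued it."""
    if certificate["issuer"] not in TRUSTED_CAS:
        raise PermissionError("certificate not issued by a recognised CA")
    return certificate["subject"]


def authorise(subject, vo):
    """Authorisation: the site trusts the VO, and the VO vouches for its members."""
    if vo not in SITE_POLICY:
        raise PermissionError(f"site has no agreement with VO {vo!r}")
    if subject not in VO_MEMBERS.get(vo, set()):
        raise PermissionError(f"{subject} is not a member of VO {vo!r}")
    return SITE_POLICY[vo]


def map_to_local_account(certificate, vo):
    """Return the local account the certified identity is mapped to."""
    return authorise(authenticate(certificate), vo)


alice = {
    "subject": "/C=UK/O=eScience/OU=Edinburgh/CN=alice",
    "issuer": "/C=UK/O=eScienceCA",
}
account = map_to_local_account(alice, "biomed")
print(account)  # biomed001
```

Note that the site never needs to know Alice personally: it checks the CA and the VO, which is the point of the trust model above.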
If “The Grid” vision leads us here…
… then where are we now?
Grids: where are we now?
• Many key concepts identified and known
• Major efforts now on establishing:
– Standards (a slow process), e.g. Global Grid Forum, http://www.gridforum.org/
– Production Grids for multiple VOs
  ▪ “Production” = reliable, sustainable, with commitments to quality of service
    • In Europe, EGEE
    • In UK, National Grid Service
    • In US, TeraGrid and OSG
  ▪ One stack of middleware that serves many research communities
  ▪ Establishing operational procedures and organisation
• “Service orientation” - “the way to build grids”
Where are we now? – user’s view
[Diagram: adoption curve with stages Research, Pilot projects, Early adopters, Routine production, Unimagined possibilities.
Technologies placed on the curve: Networks, Grids, Web.
Communities placed on the curve: Sciences, engineering; Arts, Humanities; e-Soc-Sci.]
Early production grids:
UK – National Grid Service
International – EGEE
National Grid initiatives now include…
CroGrid
Service-Oriented Architecture
[Diagram: a Service performs Registration with the Registry; a Client performs Discovery through the Registry, followed by Invocation of the Service.]
Service orientation – software components that are…
• Accessible across a network
• Loosely coupled, defined by the messages they receive / send
• Interoperable: each service has a description that is accessible and can be used to create software to invoke that service
• Based on standards
• Developed in anticipation of new uses
[Diagram: a Client, a Registry and several Services]
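The three interactions of a service-oriented architecture (registration, discovery, invocation) can be sketched in Python. This is an in-process illustration under stated assumptions: the `Registry` class and the two example services are hypothetical, plain functions stand in for network endpoints, and dictionaries stand in for the messages a real service would exchange over a standard protocol.

```python
class Registry:
    """Holds a description and an endpoint for each registered service."""

    def __init__(self):
        self._services = {}

    def register(self, name, description, endpoint):
        """Registration: a service advertises what it does and how to reach it."""
        self._services[name] = {"description": description, "endpoint": endpoint}

    def discover(self, keyword):
        """Discovery: find services whose description mentions the keyword."""
        return sorted(
            name for name, entry in self._services.items()
            if keyword.lower() in entry["description"].lower()
        )

    def lookup(self, name):
        return self._services[name]["endpoint"]


# Two toy services, each defined only by the messages it receives and sends.
def word_count(message):
    return {"count": len(message["text"].split())}


def upper_case(message):
    return {"text": message["text"].upper()}


registry = Registry()
registry.register("wc", "Count words in a text", word_count)
registry.register("upper", "Convert text to upper case", upper_case)

matches = registry.discover("count")   # the client knows no service names in advance
service = registry.lookup(matches[0])
reply = service({"text": "grids permit resource sharing"})  # invocation
print(matches, reply)  # ['wc'] {'count': 4}
```

The client depends only on the registry and on the message format, which is what lets services be replaced or reused for purposes their authors did not anticipate.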
Grid and Web Services
Web Services
• Commerce
• Standards
• Tools
Grid Technology
• Research driven
• Data-intensive
• Compute intensive
• Collaboration – sharing of resources
Grids based on Web Services
Summary - 1: it’s about collaboration!!
(As well as resource utilisation!)
• Grids: collaboration across administrative domains
• Networks: collaboration across geographical distance
• Semantics, ontologies: collaboration across disciplines
• Storage (“curation”): collaboration across time
[Diagram labels: Collaboration; Grid; Operations, Support and training; Network infrastructure & Resource centres]
Summary - 2
• Ask not what “the Grid” can do for you
• BUT
• With whom do you collaborate?
• What resources / services can you provide?
• What resources would empower your research?
[Diagram labels: People, Data, Computation]