E-Science, National e-Science Centre and Open Grid Services Architecture
Malcolm Atkinson
Director of NeSC
Universities of Edinburgh and Glasgow
6th March 2002
Outline
• Review e-Science
  – What is it?
  – Assumptions & Progress
• UK e-Science Programme
• NeSC & e-Science Institute
• Open Grid Services Architecture
What is e-Science?
• An acceleration of a trend?
• A sea change in scientific method?
• A new opportunity for science?
  – And for every other collaborative, information-intensive activity
Accelerating Trend
• More and More data ⇒ must change methods
  – Instrument resolution doubling / 12 months
    ▪ Instrument and telemetry speeds increasing
    ▪ Mobile sensors & radio digital networks
  – Storage capacity doubling / 12 months
  – Number of data sources doubling / ?? months
  – Laboratory automation capacity doubling / ??
• More and More Computation
  – Computations available doubling / 18 months
  – Analyses and simulations increasing
• Faster networks ⇒ can change methods
  – Raw bandwidth doubling / 9 months
• These Integrate and Enable
  – More interplay between computation and data
  – More collaboration: scientists, medics, engineers, …
  – More international collaboration
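A quick compound-growth comparison (my arithmetic, not on the original slide) shows why these doubling times force new methods: over the same three years, data and bandwidth outrun the computation available per byte.

\[
\text{growth over } t \text{ months with doubling period } d = 2^{t/d}:\qquad
2^{36/12} = 8\times \;(\text{storage}),\quad
2^{36/9} = 16\times \;(\text{bandwidth}),\quad
2^{36/18} = 4\times \;(\text{computation}).
\]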
Sea Change
• In Silico discovery + systematic exploration
  – Exploration of data and models predicts results
  – Verified by directed experiments
    ▪ Combinatorial chemistry
    ▪ Gene function
    ▪ Protein Structure, …
• Shared Resources ⇒ need “intelligent” labs
  – Researcher’s Workbench →
  – Laboratory team →
  – Multi-national network of labs + modellers →
  – Public instruments, repositories and simulations
• Floods of (public) data ⇒ must integrate data
  – More than can be used by human inspection
  – Gene sequence doubling / 9 months ⇒
    ▪ Searches required double / 4.5 months
  – Discovery by correlating diverse data
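One way to see the halved doubling time (my reading of the arithmetic; the slide does not spell it out): if every new sequence must be searched against everything already held, search work grows as the square of the data volume:

\[
S(t) = S_0\,2^{t/9} \;\Rightarrow\; W(t) \propto S(t)^2 = S_0^2\,2^{2t/9} = S_0^2\,2^{t/4.5},
\]

so all-against-all search effort doubles every 4.5 months even though the data itself doubles only every 9.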
But …
• Skilled scientists and computer scientists
  – Roughly static in number
  – Diminishing in available attention / task
  – Distributed systems remain hard
    ▪ E.g. partial failures and latency are always with us
    ▪ E.g. operational information goes stale
  – Integration remains hard
    ▪ E.g. heterogeneity & autonomy are essential
  – Important data in documents
• More subjects experiencing the
  – Data deluge
  – Analysis avalanche
  – Simulation bonanza
  – Collaboration growth
• Therefore find general solutions
• Make technology easier to use
The New Behaviour
• Shared Infrastructure
  – Intrinsically distributed
  – Intrinsically multi-organisational
  – Multiple uses interwoven
• Shared Software
  – A new attempt at making distributed computing economic, dependable and accessible
  – Scientists from all disciplines share in its design and use
• Shared & Automated System Administration
  – Replicated farms of replicated systems
  – Autonomic management
• Immediate benefit
  – Faster transfer of ideas and techniques between disciplines
  – Amortisation of development, operation and education
Online Access to Scientific Instruments
[Figure: Advanced Photon Source pipeline: real-time collection, wide-area dissemination, archival storage, tomographic reconstruction, desktop & VR clients with shared controls]
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
From Steve Tuecke, 12 Oct. 01
Supernova Cosmology Requires Complex, Widely Distributed Workflow Management
Mathematicians Solve NUG30
• Looking for the solution to the NUG30 quadratic assignment problem
• An informal collaboration of mathematicians and computer scientists
• Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) across 8 sites in the U.S. and Italy
• Solution: 14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
From Miron Livny, 7 Aug. 01
Network for Earthquake Engineering Simulation
• NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
• On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
From Steve Tuecke, 12 Oct. 01
Home Computers Evaluate AIDS Drugs
• Community =
  – 1000s of home computer users
  – Philanthropic computing vendor (Entropia)
  – Research group (Scripps)
• Common goal = advance AIDS research
From Steve Tuecke, 12 Oct. 01
Whole-System Simulations
[Figure: coupled aircraft sub-system models: wing models (lift capabilities, drag capabilities, responsiveness); stabilizer models (deflection capabilities, responsiveness); airframe models; engine models (thrust performance, reverse thrust performance, responsiveness, fuel consumption); landing gear models (braking performance, steering capabilities, traction, dampening capabilities); human/crew models (accuracy, perception, stamina, reaction times, SOPs)]
NASA Information Power Grid: coupling all sub-system simulations
Global In-Flight Engine Diagnostics
[Figure: in-flight data flows over a global network (e.g. SITA) from the airline ground station to the DS&S Engine Health Center, then via internet, e-mail and pager to the data centre and maintenance centre]
Distributed Aircraft Maintenance Environment: Universities of Leeds, Oxford, Sheffield & York
National Airspace Simulation Environment
[Figure: simulation drivers feed a Virtual National Air Space (VNAS): 22,000 commercial US flights a day drive 44,000 wing runs, 50,000 engine runs (GRC), 66,000 stabilizer runs, 48,000 human crew runs, 22,000 airframe impact runs and 132,000 landing/take-off gear runs (ARC, LaRC); inputs include FAA ops data, weather data, airline schedule data, digital flight data, radar tracks, terrain data and surface data]
NASA Information Power Grid: aircraft, flight paths, airport operations and the environment are combined to produce a virtual national airspace
Not Just Scientists
• Engineers
  – They already travel the same path
• Medicine
  – As above
• Industry & Commerce
  – As above
• Finance, economy, politics, humanities, arts, …
  – We can expect best use of data and models to guide the decisions that affect our lives
  – e.g. home climate simulation may moderate greenhouse gas emissions
• The UK Office of Science & Technology
  – Has these extensions firmly in mind
  – So have twelve computing & S/W companies
    ▪ Signed agreements with GGF
    ▪ Major collaboration with IBM (Microsoft, HP, Sun, Oracle, …)
Several Assumptions
• The Technology is Ready
  – Not true — it’s emerging
    ▪ Building middleware, advancing standards, developing dependability
• The Scientists / Engineers, … want this
  – Not universally true
    ▪ Pilot projects and demonstrators
    ▪ The e-Science Institute
• One Size Fits All
  – Not true
    ▪ Addressed by a minimum set of composable virtual services
    ▪ But starting with Globus
• It’s only for “big” science
  – No — “small” science collaborates too!
• We know how we will use grid services
  – No — disruptive technology
Outline
• Review e-Science
  – What is it?
  – Assumptions & Progress
• UK e-Science Programme
• NeSC & e-Science Institute
• Open Grid Services Architecture
UK e-Science
e-Science and the Grid
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
‘e-Science will change the dynamic of the way science is undertaken.’
John Taylor, Director General of Research Councils, Office of Science and Technology
From a presentation by Tony Hey
NASA’s IPG
• The vision for the Information Power Grid is to promote a revolution in how NASA addresses large-scale science and engineering problems, by providing persistent infrastructure for
  – “highly capable” computing and data management services that, on demand, will locate and co-schedule the multi-Center resources needed to address large-scale and/or widely distributed problems
  – the ancillary services that are needed to support the workflow management frameworks that coordinate the processes of distributed science and engineering problems
US Grid Projects ($500 million)
• NASA Information Power Grid
• DOE Science Grid
• NSF National Virtual Observatory
• NSF GriPhyN
• DOE Particle Physics Data Grid
• NSF Distributed Terascale Facility
• DOE ASCI Grid
• DOE Earth Systems Grid
• DARPA CoABS Grid
• NEESGrid
• NSF BIRN
• NSF iVDGL
EU Grid Projects (45 million Euros)
• DataGrid (CERN, ..)
• EuroGrid (Unicore)
• DataTag (TTT…)
• Astrophysical Virtual Observatory
• GRIP (Globus/Unicore)
• GRIA (e-Business, …)
• GridLab (Cactus, …)
• CrossGrid
• EGSO (Solar Physics)
• GridStart
National Grid Projects
• UK e-Science Programme
• Japan – Grid Data Farm, ITBL
• Netherlands – VLAM, PolderGrid
• Germany – UNICORE, Grid proposal
• France – Grid funding approved
• Italy – INFN Grid
• Eire – Grid proposals
• Switzerland – Network/Grid proposal
• Hungary – DemoGrid, Grid proposal
• ……
UK e-Science Programme
[Diagram: DG Research Councils → e-Science Steering Committee → Director, advised by the Grid TAG]
• Director’s Awareness and Co-ordination Role
  – Academic Application Support Programme: Research Councils (£74m), DTI (£5m)
  – PPARC (£26m), BBSRC (£8m), MRC (£8m), NERC (£7m), ESRC (£3m), EPSRC (£17m), CLRC (£5m): £80m
• Director’s Management Role
  – Generic Challenges: EPSRC (£15m), DTI (£15m)
  – Collaborative projects: Industrial Collaboration (£40m)
[Map: e-Science Centres at Edinburgh, Glasgow, Newcastle, Belfast, Manchester, DL, Cambridge, Hinxton, Oxford, Cardiff, RAL, London and Southampton, linked by AccessGrid always-on video walls]
Outline
• Review e-Science
  – What is it?
  – Assumptions & Progress
• UK e-Science Programme
• NeSC & e-Science Institute
• Open Grid Services Architecture
NeSC’s context
[Diagram: NeSC coordinates among the UK Core Directorate, the e-Science Centres, Application Pilots and IRCs …, the task forces and bodies (GNT, DBTF, ATF, TAG, GSC), the eSI, CS Research and the Global Grid Forum …, serving e-Scientists, Grid users, Grid services & Grid developers]
NeSC’s Roles
• Stimulation of Grid & e-Science Activity
  – Meetings
  – Visiting Researchers
  – International Collaboration
• Coordination of Grid & e-Science Activity
  – Regional Centres, Task Forces, Pilots & IRCs
  – Technical and Managerial Fora
  – Support for training, travel, participation
• Developing a High-Profile e-Science Institute
  – Users, developers, researchers
  – Education, Training, Support
• International Research & Standards
  – Regional Support
• Portfolio of Industrial Research Projects
NeSC — The Team
• Director
  – Malcolm Atkinson (Universities of Glasgow & Edinburgh)
• Deputy Director
  – Stuart Anderson (Edinburgh Informatics)
• Chairman
  – Richard Kenway (Edinburgh Physics & Astronomy)
• Regional Director
  – Arthur Trew (Director EPCC)
• Commercial Director
  – Mark Parsons (EPCC)
• Centre Manager
  – Anna Kenway
• Conference Manager
  – Andrea Grainger
• Initial Board Members
  – Muffy Calder (Glasgow Computing Science)
  – Tony Doyle (Glasgow Physics & Astronomy)
Scotland at the frontier… leading
• UK core e-Science
  – data integration
  – linked to US Globus
• UK AstroGrid
  – virtual observatory
  – linked to EU AVO
• UK GridPP + ScotGrid
  – particle physics data analysis
  – linked to EU DataGrid
• EU ENACTS + GRIDSTART
  – supercomputer centres
  – EU grid projects
National e-Science Centre
• Edinburgh + Glasgow Universities
  – Physics & Astronomy × 2
  – Informatics, Computing Science
  – EPCC
• £6M EPSRC/DTI + £2M SHEFC over 3 years
• e-Science Institute
  – visitors, workshops, co-ordination, outreach
• Middleware development
  – 50 : 50 industry : academia
• ‘Last-mile’ networking
www.nesc.ac.uk
e-Science Institute
• Highlights so far
  – August & September
    ▪ Steve Tuecke Globus tutorial (oversubscribed)
    ▪ 4-day workshop Getting Going with Globus (G3), with reports on DataGrid & GridPP experience
    ▪ Biologist Grid Users’ Meeting 1 (BiGUM1)
  – October
    ▪ 3 workshops in week 1: DF1, GUM1 & DBAG1
    ▪ GridPP
    ▪ Configuration management
  – November
    ▪ HEC2 and the Grid
    ▪ DIRC meeting
  – December
    ▪ preGGF3 & DF2
    ▪ Architecture & Strategy with Ian Foster et al.
    ▪ AstroGrid
• 625 participants, 120 organisations, 20+ countries
eSI Highlights cont.
2002
• January
  – Blue Gene: Protein Folding Workshop, 14th to 17th, IBM sponsor
  – XML, XML Schema, Web Services Advanced Workshop
  – Getting OGSA Going Workshop
  – Regional meeting
• February — closed for renovation
• March
  – Steve Tuecke et al., 4-day Globus Developers’ Workshop
  – Pilot project workshop
  – Grid Portals & Problem Solving Environments Workshop
• April
  – Managing Grid Software Projects Advanced Workshop
  – Digital Libraries, Librarians, Museums and the Grid
• May
  – 4-day Advanced Grid & Globus Tutorial (probable)
  – Mind and Brain Workshop
eSI continued
• 21st to 26th July 2002: GGF5 & HPDC-11 at the EICC
• August: Research Festival
• 14th to 16th April 2003: Dependability
Suggestions Please
• The e-Science Institute welcomes suggestions and organisers
• Any topic related to e-Science
  – How your subject may use e-Science
  – How your technology may benefit e-Science
• Any format
  – Tutorial, advanced tutorial, workshop, scientific meeting
• We can give
  – Travel, organisation and accommodation support
• This building renovated!
• Mail director@nesc.ac.uk
Research Visitors
• We will welcome and support
  – Active e-Science Researchers
• Suggestions please
  – People, Topics & Groups
• Applications via the web site
www.nesc.ac.uk
Grid Net
• Support for those engaged in Grid development
  – International working groups
  – Sustained commitment
• Travel, meeting costs, …
• Application process via the web site
www.nesc.ac.uk
Outline
• Review e-Science
  – What is it?
  – Assumptions & Progress
• UK e-Science Programme
• NeSC & e-Science Institute
• Open Grid Services Architecture
Challenge 1
Composing Software
• Encapsulating ideas, methods & understanding
• Developed independently
• Multiple technologies
• Heterogeneous models and interfaces
• Changing components
• Uncertainty about component quality
Solving a Problem
• Iteration
• Reason to Trust the Answer
• An Answer in Time
Engineering questions: Trustworthy? Trade-offs? Problem handling? Flexibility? Understood? Reuse?
Challenge 2
Deluge of Data
• More Digital Sources
• Faster Digital Streams
• Faster Data Generation
• Heterogeneous models and standards
• Changing structures
• Uncertainty about data quality
Finding the Nuggets
• Iteration, Search, Indexing, Mining, Statistics, Inference
• Reason to Trust the Answer
Challenge 3
Geographic Distribution
• Intrinsic: scientists, resources & instruments
• Diverse & Independent Regimes: Organisations, Countries
• Faster Networks
• Mobile: equipment, people & phenomena
• Changing structures
• Uncertainty about communication quality
Sustaining the Computation
• Problem Detection & Recovery, Security, Authentication, …
• Reason to Trust the System’s Dependability
Ambition
[Figure: the Distributed Aircraft Maintenance Environment again: in-flight data via a global network (e.g. SITA) and airline ground station to the DS&S Engine Health Center, then via internet, e-mail and pager to the data centre and maintenance centre (Universities of Leeds, Oxford, Sheffield & York)]
• Fire fighting safety
• Volcanic Eruption Prediction
• Flood & Pollution Response
• Diagnosis & Treatment Planning
• Whole population health monitoring
• Collision avoidance
• Epidemic Detection & Management
• Understanding Cells & Organs
• In-Flight problem management
• Oceans, Climate, Ecosystems, …
Ultimate Challenge
Challenge 1 + Challenge 2 + Challenge 3
• Do it often
• Do it quickly
• Do it for everybody
• Do it for everything
• Change it quickly
Human Race Exhausted
Web Services + Grid Technology ⇒ Grid Services
•
Independence
–
–
•
Description
–
–
•
Client from Service
Service from Client
Web Services DL
…
Separation
Function from Delivery
www.w3.org/TR/SOAP
–
•
Tools & Platforms
–
–
–
–
•
Java ONE
Visual .NET
WebSphere
Oracle
Commercial Buy in
www. w3c. org / TR / SOAP or TR/wsdl
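To make the separation of function from delivery concrete, here is a minimal sketch (mine, not from the slides) of a SOAP 1.1 call made with only the Python standard library; the endpoint, namespace and getTemperature operation are invented for illustration.

```python
# Minimal SOAP 1.1 request over HTTP, using only the Python standard library.
# The service endpoint, namespace and operation are hypothetical.
import urllib.request

ENDPOINT = "http://example.org/weather"            # hypothetical service URL
ACTION = "http://example.org/weather#getTemperature"

# The envelope separates *what* is asked (the body) from *how* it travels (HTTP).
envelope = """<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getTemperature xmlns="http://example.org/weather">
      <city>Edinburgh</city>
    </getTemperature>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": ACTION},
)
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))         # the reply is another envelope
```

Because the envelope is self-describing XML, the same request could travel over a different transport without changing the client's view of the service, which is the independence the slide emphasises.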
Grid Technology
• Distribution
  – Discovery
  – Process Creation
  – Scheduling
• Security
  – Single Sign-in
• Resource Sharing
  – Various Protocols
  – FTP
• Portability
  – APIs
• Government Agency Buy-in
Foster, I., Kesselman, C. and Tuecke, S., The Anatomy of the Grid: Enabling Scalable Virtual Organizations, Intl. J. Supercomputer Applications, 15(3), 2001
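A rough sketch (my illustration, not Globus code) of the single sign-in idea behind grid security: the user authenticates once to mint a short-lived proxy credential, and that proxy, never the long-lived secret, is what remote resources see and can re-delegate. Real grids use X.509 proxy certificates and public-key cryptography; this toy HMAC scheme only shows the shape of the protocol.

```python
# Conceptual sketch of single sign-in via short-lived proxy credentials.
# Illustrative only: not the Globus GSI implementation.
import hmac, hashlib, time
from dataclasses import dataclass

@dataclass
class Credential:
    subject: str        # who this credential speaks for
    issuer: str         # who signed it
    expires: float      # short lifetime limits damage if it is stolen
    signature: str      # binds subject + expiry to the issuer's secret

def sign_once(user: str, passphrase: str, lifetime_s: float = 12 * 3600) -> Credential:
    """The single sign-in step: unlock the long-lived secret exactly once."""
    expires = time.time() + lifetime_s
    sig = hmac.new(passphrase.encode(), f"{user}|{expires}".encode(),
                   hashlib.sha256).hexdigest()
    return Credential(subject=user, issuer=user, expires=expires, signature=sig)

def delegate(proxy: Credential, to_service: str) -> Credential:
    """A resource re-delegates the proxy; the user's passphrase never travels."""
    expires = min(proxy.expires, time.time() + 3600)   # never outlive the parent
    sig = hmac.new(proxy.signature.encode(),
                   f"{proxy.subject}|{to_service}|{expires}".encode(),
                   hashlib.sha256).hexdigest()
    return Credential(subject=proxy.subject, issuer=to_service,
                      expires=expires, signature=sig)

proxy = sign_once("C=UK/O=eScience/CN=alice", "correct horse battery staple")
job_cred = delegate(proxy, "gridftp.example.org")       # hypothetical host
print(job_cred.subject, "valid until", time.ctime(job_cred.expires))
```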
Open Grid Services Architecture
[Diagram: Applications, using operations, sit on Virtual Grid Services, implemented by multiple implementations of Grid Services; Grid Services run in a Computation Context on the OGSI infrastructure, over a Hosting Environment and Platforms]
OGSA Features
• Description & Discovery
  – WSDL + WSIL
• Invocation
  – SOAP
  – RPC
  – …
• Representations
  – XML + Schema
• Life Time Management
  – Factories
  – Transient & Persistent GS
  – GS Handles
  – GS Records
  – Soft State
• Notification
• Authentication
  – Certificates + Delegation
• Change Management
• Tools & Platforms
  – Apache Axis
  – …
Foster, I., Kesselman, C., Nick, J. and Tuecke, S., The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration
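A toy sketch (mine, not OGSI specification code) of three of these features working together: a factory creates transient service instances, handles name them for discovery, and soft-state lifetimes reap an instance unless the client keeps renewing it.

```python
# Toy model of OGSA-style factories, handles and soft-state lifetime.
# Names and behaviour are illustrative, not the OGSI specification.
import time, uuid

class TransientService:
    def __init__(self, lifetime_s: float):
        self.handle = f"gsh://example.org/{uuid.uuid4()}"  # hypothetical GS handle
        self.expires = time.time() + lifetime_s

    def keepalive(self, extension_s: float) -> None:
        """Soft state: the client must keep asking, or the service dies."""
        self.expires = max(self.expires, time.time() + extension_s)

class Factory:
    """Creates transient service instances and reaps expired ones."""
    def __init__(self):
        self.registry: dict[str, TransientService] = {}

    def create(self, lifetime_s: float = 60.0) -> str:
        svc = TransientService(lifetime_s)
        self.registry[svc.handle] = svc        # discovery: handle -> instance
        return svc.handle

    def sweep(self) -> None:
        now = time.time()
        for handle, svc in list(self.registry.items()):
            if svc.expires < now:              # no keepalive arrived in time
                del self.registry[handle]

factory = Factory()
handle = factory.create(lifetime_s=0.1)
time.sleep(0.2)                                # client forgets to renew
factory.sweep()
print(handle in factory.registry)              # False: reaped by soft state
```

The soft-state discipline is what lets a multi-organisational system clean up after failed or vanished clients without any explicit "delete" message ever arriving.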
OGSA Development
• More Description
  – More Languages
    ▪ Partial Models
    ▪ Precision & Semantics
    ▪ Standard Schemas
    ▪ Namespaces
  – Varied, open, analysis, synthesis
  – Directed composition
  – Change Managers
• Invocation & Reps
  – Factories
  – Transient & Persistent GS
  – GS Handles
  – GS Records
  – Soft State
  – Notification
• Engineering
  – Trustworthy services
  – Owners, Costs & Charging
  – Transaction & Coordination
  – Work Flow
• Tools & Platforms
  – Design for Testability
  – Dynamic Testing
• Change Management
  – Dynamic Evolution
• Platforms
  – Mapping to host, invocation, notification, protocol transmission, authentication
OGSA Development 2
• Higher-Level Description
  – Information-level
  – Semantic-level
  – Virtualisation
  – Higher-level Models
• User-Guided Automation
  – Agreed Semantic Models
  – Trustworthy Translation
  – Q Testing & Certification
  – Accessible Trade-offs
  – Dynamic Control
  – Autonomic
• Invocation & Reps
• Engineering
  – Design for QA
• Change Management
  – Dynamic Evolution
• Platforms
  – Raising their level
    ▪ More high-level facilities
    ▪ Coherent
    ▪ Understandable
    ▪ Specified
• Tools & Platforms
Families of Components
• Members of a Family
  – Address a Domain
    ▪ Data Integration
    ▪ Biological Search
    ▪ Fluid Dynamics
    ▪ Ecological Models
    ▪ …
  – Comply with Rules
    ▪ Terms for Description
    ▪ Schemas / Namespaces
    ▪ Standard Operations
  – Varied Implementation
• Design
  – For change
  – For test
  – For performance
• Engineer
  – Development
  – Trade-offs
  – Constructive Rivalry
• Measure
  – Operational
  – Usage
  – Effects on Science
• Review and Revise
  – Based on Real use
• Accredit
The Yellow Brick Road
• Many Players
• Many Paths
• Many Challenges
• Worthwhile Goal
• Join in
Where to Concentrate
• International & Industrial Collaboration
  – Ideas, experiments, software, standards
• Integrating Data across the Grid
  – Data growth demands new methods
  – Data ownership expects respect & security
  – Data is hard to scan — indexing & query
  – Data is hard to move — query & move code (see the sketch below)
  – Human attention is scarce but essential
    ▪ Machine-assisted annotation, provenance, archiving
    ▪ Machine-assisted data mining
    ▪ Machine-assisted ontology construction & integration
  – Human factors must drive designs
• Dynamic, Dependable and Virtual Fabric
• Improved Programming Models
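A minimal sketch of the "move the code to the data" point (my illustration, not a grid API): instead of pulling a huge remote dataset to the client, ship a small predicate to where the data lives and return only the matches. The record fields and site are hypothetical.

```python
# Sketch of "query & move code": ship the filter, not the archive.
from typing import Callable, Iterable

def run_at_data_site(dataset: Iterable[dict],
                     query: Callable[[dict], bool]) -> list[dict]:
    """Pretend this executes at the data's host; only matches cross the network."""
    return [record for record in dataset if query(record)]

# Hypothetical genome annotation records held at a remote site.
remote_records = [
    {"gene": "BRCA2", "organism": "human",  "length": 84193},
    {"gene": "ftsZ",  "organism": "e.coli", "length": 1152},
]

# Shipping this predicate (a few bytes) beats shipping the archive (terabytes).
matches = run_at_data_site(remote_records, lambda r: r["organism"] == "human")
print(matches)
```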
For more Information
• Ask me
• www.nesc.ac.uk
• director@nesc.ac.uk
Thank you for your attention, or for arriving early for the next talk ☺