The National e-Science Centre and UK e-Science Programme

Muffy Calder
University of Glasgow
http://www.nesc.ac.uk
http://www.dcs.gla.ac.uk/~muffy
An overview
• e-Science
  – challenges and opportunities
• Grid computing
  – challenges and opportunities
• Activities
  – UK and international
  – Scotland and the National e-Science Centre
e-Science
‘e-Science is about global collaboration in key areas of
science, and the next generation of infrastructure that will
enable it.’
‘e-Science will change the dynamic of the way science is
undertaken.’
John Taylor
Director General of Research Councils
Office of Science and Technology
The drivers for e-Science
• More data
  – instrument resolution and laboratory automation
  – storage capacity and data sources
• More computation
  – computations available; simulations doubling every year
• Faster networks
  – bandwidth
  – need to schedule
• More interplay and collaboration
  – between scientists, engineers, computer scientists etc.
  – between computation and data
The drivers for e-Science
• Exploration of data and models
– in silico discovery
• Floods of public data
– gene sequence data doubling every 9 months
– searches required doubling every 4-5 months
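These doubling times outpace machine capacity, which is the crux of the argument. A minimal back-of-the-envelope sketch makes the gap concrete; the 9-month and 4-5-month figures are from the slide, while the 18-month Moore's-law doubling is my assumption:

```python
# Compare the quoted growth rates with an assumed Moore's-law doubling (~18 months).
def growth_factor(years, doubling_months):
    """Multiplicative growth over `years` of a quantity doubling every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

years = 5
print(f"gene-sequence data (9-month doubling):    x{growth_factor(years, 9):,.0f}")
print(f"search workload (4.5-month doubling):     x{growth_factor(years, 4.5):,.0f}")
print(f"Moore's law (18-month doubling, assumed): x{growth_factor(years, 18):,.0f}")
```

Over five years the data grows roughly 100-fold and the search workload roughly 10,000-fold, while a single machine gains only about 10-fold, so the shortfall has to be met by pooling distributed resources.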
In summary
Shared data, information and computation by
geographically dispersed communities.
Current model: ad-hoc client-server
[Diagram: the scientist connects point-to-point to each resource separately: experiment, analysis, computing, storage, HPC.]
The vision: computation & information utility
[Diagram: the scientist accesses many experiment, analysis, computing and storage resources through a single MIDDLEWARE layer.]
e-Science Examples
• Bioinformatics/Functional genomics
• Collaborative Engineering
• Medical/Healthcare informatics
• Earth Observation Systems (flood monitoring)
• TeleMicroscopy
• Virtual Observatories
• Robotic Telescopes
• Particle Physics at the LHC
  – EU DataGrid: particle physics, biology & medical imaging, Earth observation
  – GridPP, ScotGrid
  – AstroGrid
Multi-disciplinary Simulations
Wing Models
  • lift capabilities
  • drag capabilities
  • responsiveness
Stabilizer Models
  • deflection capabilities
  • responsiveness
Human Models (crew capabilities)
  • accuracy
  • perception
  • stamina
  • reaction times
  • SOPs
Engine Models
  • thrust performance
  • reverse thrust performance
  • responsiveness
  • fuel consumption
Landing Gear Models
  • braking performance
  • steering capabilities
  • traction
  • dampening capabilities
Airframe Models

Whole-system simulations are produced by coupling all of the sub-system simulations, as sketched below.
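A minimal sketch of what "coupling" can mean in code. The class names echo the slide, but the interfaces and the placeholder physics are entirely hypothetical: each sub-system model advances one time step against a shared aircraft state.

```python
# Hypothetical sketch: couple sub-system simulations by exchanging shared
# state each time step. The physics here is a toy placeholder.
class SubSystemModel:
    """Base class: each model advances one step given the shared state."""
    def step(self, state: dict, dt: float) -> None:
        raise NotImplementedError

class WingModel(SubSystemModel):
    def step(self, state, dt):
        # Toy lift model: lift grows with airspeed squared.
        state["lift"] = 0.5 * state["airspeed"] ** 2

class EngineModel(SubSystemModel):
    def step(self, state, dt):
        # Toy thrust and fuel model.
        state["airspeed"] += state["throttle"] * dt
        state["fuel"] -= state["throttle"] * 0.1 * dt

def couple(models, state, dt, steps):
    """Whole-system simulation: advance every sub-system model each step."""
    for _ in range(steps):
        for m in models:
            m.step(state, dt)
    return state

state = {"airspeed": 100.0, "throttle": 0.8, "fuel": 1000.0, "lift": 0.0}
print(couple([EngineModel(), WingModel()], state, dt=1.0, steps=10))
```

In a real multi-disciplinary simulation each `step` would be a large code possibly running at a different site, and moving the shared state between them is exactly the kind of job grid middleware takes on.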
Multi-disciplinary Simulations
US National Air Space Simulation Environment
[Diagram: NASA centres (GRC, ARC, LaRC) contribute sub-system simulation runs (44,000 wing runs, 50,000 engine runs, 66,000 stabilizer runs, 48,000 human crew runs, 22,000 airframe impact runs, 132,000 landing/take-off gear runs), fed by real-world simulation drivers (FAA ops data, weather data, airline schedule data, digital flight data, radar tracks, terrain data, surface data) for the 22,000 commercial US flights a day, and pulled together under the NASA AvSP Aviation ExtraNet (AEN) into a Virtual National Air Space (VNAS).]
Many aircraft, flight paths, airport operations, and the environment are combined to get a virtual national airspace.
Global in-flight engine diagnostics
[Diagram: in-flight data flows from the aircraft over a global network (e.g. SITA) to a ground station, on to the DS&S Engine Health Center, and out via internet, e-mail and pager to the maintenance centre and data centre.]
Distributed Aircraft Maintenance Environment: Universities of Leeds, Oxford, Sheffield & York
LHC computing
(assumes 1 PC ≈ 25 SpecInt95)
• Online system: one bunch crossing per 25 ns, 100 triggers per second, each event ~1 MByte; ~PByte/sec off the detector, ~100 MByte/sec into the offline farm (~20,000 PCs)
• Tier 0: CERN Computer Centre, >20,000 PCs; ~Gbit/sec links (or air freight) to Tier 1
• Tier 1: regional centres (US, Italian, French, RAL)
• Tier 2: centres of ~1,000 PCs each, e.g. ScotGRID++; ~Gbit/sec links
• Tier 3: institute servers (~200 PCs) with physics data caches; 100-1000 Mbit/sec onwards
• Tier 4: physicists' workstations
Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels, and data for these channels is cached by the institute server.
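The rates on this slide fit together arithmetically; a quick sanity check using only the figures quoted above:

```python
# Back-of-the-envelope check of the LHC online data rates from this slide.
bunch_crossing_ns = 25      # one bunch crossing per 25 ns
triggers_per_s = 100        # events kept per second after triggering
event_size_mb = 1.0         # ~1 MByte per event

crossing_rate_hz = 1e9 / bunch_crossing_ns             # 40 MHz of crossings
recorded_rate_mb_s = triggers_per_s * event_size_mb    # ~100 MByte/sec offline
yearly_pb = recorded_rate_mb_s * 3600 * 24 * 365 / 1e9 # sustained, PBytes/year

print(f"Crossing rate: {crossing_rate_hz / 1e6:.0f} MHz")
print(f"Offline rate:  {recorded_rate_mb_s:.0f} MByte/sec")
print(f"~{yearly_pb:.1f} PBytes/year if sustained year-round")
```

Even the filtered 100 MByte/sec stream accumulates around 3 PBytes a year, which is why the data is fanned out across tiers rather than analysed solely at CERN.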
Emergency response teams
• bring sensors, data, simulations and experts together
  – wildfire: predict movement of fire & direct fire-fighters
  – also earthquakes, peacekeeping forces, battlefields, …
National Earthquake Simulation Grid
Los Alamos National Laboratory: wildfire
Earth observation
• ENVISAT
  – €3.5 billion
  – 400 terabytes/year
  – 700 users
• ground deformation prior to a volcanic eruption
To achieve e-Science we need…
Grid: the vision
[Diagram: a complex problem, computing resources, instruments, data, knowledge and people feed into the GRID, which delivers a solution.]
The Grid
• Computing cycles, data storage, bandwidth and facilities viewed as commodities
  – like the electricity grid
• Software and hardware infrastructure to support a model of computation and information utilities on demand
  – middleware
• An emergent infrastructure
  – delivering dependable, pervasive and uniform access to a set of globally distributed, dynamic and heterogeneous resources
The Grid
Supporting computations with differing characteristics:
• High throughput
  – unbounded, robust, scalable; uses otherwise idle machines
• On demand
  – short-term, bounded, hard timing constraints
• Data intensive
  – computed, measured, stored, recalled; e.g. particle physics
• Distributed supercomputing
  – classical CPU- and memory-intensive work
• Collaborative
  – mainly in support of human-human interaction, e.g. the Access Grid
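These classes make very different demands on a scheduler. A minimal sketch, loosely in the spirit of Condor-style matchmaking; the function and attribute names are hypothetical, not any real middleware's API. Jobs declare a class and requirements, resources advertise attributes, and a broker pairs them:

```python
# Hypothetical matchmaking sketch: pair jobs with resources that satisfy
# the constraints implied by the job's computation class.
def matches(job: dict, resource: dict) -> bool:
    """True if the resource satisfies every requirement the job declares."""
    if job["class"] == "high-throughput" and not resource.get("idle", False):
        return False  # high-throughput work should only soak up idle machines
    if job["class"] == "on-demand" and \
            resource.get("queue_delay_s", 0) > job.get("deadline_s", 0):
        return False  # hard timing constraint: queue delay must fit the deadline
    return resource.get("cpus", 0) >= job.get("min_cpus", 1)

resources = [
    {"name": "idle-desktop", "idle": True,  "cpus": 1,  "queue_delay_s": 0},
    {"name": "hpc-node",     "idle": False, "cpus": 64, "queue_delay_s": 600},
]
job = {"class": "high-throughput", "min_cpus": 1}
print([r["name"] for r in resources if matches(job, r)])  # -> ['idle-desktop']
```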
Grid Data
• Generated by sensor
• From a database
• Computed on request
• Measured on request
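Whatever its origin, the consumer should not need to care. A minimal sketch (a hypothetical interface, not from any real grid middleware) putting all four kinds of grid data behind one uniform `read()` call:

```python
# Hypothetical uniform-access sketch for the four data origins listed above.
from abc import ABC, abstractmethod

class DataSource(ABC):
    @abstractmethod
    def read(self) -> bytes: ...

class SensorSource(DataSource):
    def read(self):
        return b"latest sensor reading"        # generated by a sensor

class DatabaseSource(DataSource):
    def read(self):
        return b"row fetched from a database"  # from a database

class ComputedSource(DataSource):
    def read(self):
        return str(sum(range(100))).encode()   # computed on request

class MeasuredSource(DataSource):
    def read(self):
        return b"measurement taken on demand"  # measured on request

for src in (SensorSource(), DatabaseSource(), ComputedSource(), MeasuredSource()):
    print(type(src).__name__, src.read())
```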
Grid Services
• Deliver bandwidth, data, computation cycles
• Architecture and basic services for
  – authentication
  – authorisation
  – resource scheduling and co-scheduling
  – monitoring, accounting and payment
  – protection and security
  – quality of service guarantees
  – secondary storage
  – directories
  – interoperability
  – fault-tolerance
  – reliability
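A minimal sketch of how a few of these services might compose on a single job submission; every name is hypothetical, and the checks are stand-ins for real certificate and policy machinery:

```python
# Hypothetical facade wiring authentication, authorisation, scheduling and
# accounting together for one job submission.
from dataclasses import dataclass

@dataclass
class Credential:
    user: str
    proxy: str  # stand-in for a delegated short-lived proxy certificate

class GridService:
    def __init__(self, users: set, accounting: dict):
        self.users = users
        self.accounting = accounting

    def authenticate(self, cred: Credential) -> bool:
        return cred.user in self.users       # stand-in for certificate checks

    def authorise(self, cred: Credential, resource: str) -> bool:
        return resource != "restricted"      # stand-in for a policy lookup

    def submit(self, cred: Credential, resource: str, job: str) -> str:
        if not self.authenticate(cred):
            raise PermissionError("authentication failed")
        if not self.authorise(cred, resource):
            raise PermissionError("not authorised for this resource")
        self.accounting[cred.user] = self.accounting.get(cred.user, 0) + 1
        return f"job '{job}' scheduled on {resource}"  # scheduling stub

svc = GridService(users={"muffy"}, accounting={})
print(svc.submit(Credential("muffy", "proxy-cert"), "compute-farm", "analysis"))
```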
But… has the emperor got any clothes on?
• Maybe string vest and pants:
  – semantic web
    • machine-understandable information on the web
  – virtual organisations
  – pervasive computing

"… a billion people interacting with a million e-businesses with a trillion intelligent devices interconnected"
Lou Gerstner, IBM (2000)
National and International Activities
• USA
  – NASA Information Power Grid
  – DOE Science Grid
  – NSF National Virtual Observatory etc.
• UK e-Science Programme
• Japan – Grid Data Farm, ITBL
• Netherlands – VLAM, PolderGrid
• Germany – UNICORE, Grid proposal
• France – Grid funding approved
• Italy – INFN Grid
• Eire – Grid proposals
• Switzerland – Network/Grid proposal
• Hungary – DemoGrid, Grid proposal
• …
UK e-Science centres
[Map: Edinburgh, Glasgow, Belfast, Newcastle, Manchester, DL, Oxford, Cardiff, RAL, Soton, Cambridge, London, Hinxton; linked by AccessGrid always-on video walls.]
National e-Science Centre
• Edinburgh + Glasgow Universities
  – Physics & Astronomy × 2
  – Informatics, Computing Science
  – EPCC
• £6M EPSRC/DTI + £2M SHEFC over 3 years
• e-Science Institute
  – visitors, workshops, co-ordination, outreach
• middleware development
  – 50:50 industry : academia
• UK representation at GGF
• coordinate regional centres
Generic Grid Middleware
• All e-Science Centres will donate resources to form a UK 'national' Grid
• All Centres will run the same Grid software, based on Globus*, SRB and Condor
• Work with the Global Grid Forum (GGF) and major computing companies.

*Globus: a project to develop open-architecture, open-source Grid software; the Globus Toolkit 2.0 is now available.
How can you participate?
• Informal collaborations
• Formal collaborative LINK funded projects
• Attend seminars at eSI
• Suggest visitors for eSI
• Suggest themes and topics for eSI
eSI forthcoming events
Friday, March 15     Blue Gene
Monday, March 18     IWOX: Introductory Workshop on XML Schema
Tuesday, March 19    AWOX: Advanced Workshop on XML: XML Schema, Web Services and Tools
Thursday, March 21   GOGO: Getting OGSA Going 1
Monday, April 8      Software Management for Grid Projects
Monday, April 15     DAI: Database Access and Integration
Wednesday, April 17  DBFT Meeting
Monday, April 29     The Information Grid
Sunday, July 21      GGF5/HPDC11
Watch the web pages… www.nesc.ac.uk
Thank you!