UK e-Science OGC Technical Committee Edinburgh

Malcolm Atkinson
Director & e-Science Envoy
e-Science Institute
www.nesc.ac.uk
28th June 2006
Overview
Brief History
E-Science, Grids & Service-oriented Architectures
(Geo)Data Deluge
Causes of Growth
Interpretational challenges
Crucial Issues
Usability & Abstraction
Interoperation & Federations
What is e-Science?
Goal: to enable better research in all disciplines
Method: Develop collaboration supported by advanced distributed computation
• to generate, curate and analyse rich data resources
– From experiments, observations and simulations
– Quality management, preservation and reliable evidence
• to develop and explore models and simulations
– Computation and data at all scales
– Trustworthy, economic, timely and relevant results
• to enable dynamic distributed collaboration
– Facilitating collaboration with information and resource sharing
– Security, trust, reliability, accountability, manageability and agility
A Grid Computing Timeline
[Timeline figure spanning 1995 to 2006; the rotated milestone labels along the axis are not recoverable]
• UK e-Science program starts
• DARPA funds Globus Toolkit & Legion
• EU funds UNICORE project
• US DoE pioneers grids for scientific research
• NSF funds National Technology Grid
• NASA starts Information Power Grid
• Japan government funds:
– Business Grid project
– NAREGI project
• Today:
– Grid solutions are common for HPC
– Grid-based business solutions are becoming common
– Required technologies & standards are evolving
Source: Hiro Kishimoto GGF17 Keynote May 2006
What is a Grid?
A grid is a system consisting of
– Distributed but connected resources and
– Software and/or hardware that provides and manages logically seamless access to those resources to meet desired objectives
[Figure: example resources: License, Web server, Handheld, Server, Workstation, Database, Supercomputer, Cluster, Data Center, Printer (R2AD)]
Grid & Related Paradigms
Distributed Computing
• Loosely coupled
• Heterogeneous
• Single Administration
Grid Computing
• Large scale
• Cross-organizational
• Geographical distribution
• Distributed Management
Utility Computing
• Computing “services”
• No knowledge of provider
• Enabled by grid technology
Cluster
• Tightly coupled
• Homogeneous
• Cooperative working
How Are Grids Used?
High-performance computing
Collaborative design
E-Business
High-energy physics
Financial modeling
Data center automation
Drug discovery
Life sciences
E-Science
Collaborative data-sharing
Commitment to e-Infrastructure
A shared resource
That enables science, research, engineering, medicine, industry, …
It will improve UK / European / … productivity
– Lisbon Accord 2000
– e-Science Vision SR2000 – John Taylor
Commitment by UK government
– Sections 2.23-2.25
Always there
– cf. telephones, transport, power
UK e-Science Budget (2001-2006)
Total: £213M + £100M via JISC
Council    Amount     Share
MRC        £21.1M     10%
EPSRC      £77.7M     37%
BBSRC      £18M       8%
NERC       £15M       7%
PPARC      £57.6M     27%
CLRC       £10M       5%
ESRC       £13.6M     6%
EPSRC Breakdown
Item       Amount     Share
Applied    £35M       45%
HPC        £11.5M     15%
Core       £31.2M     40%
Staff costs only; Grid Resources, Computers & Network funded separately
+ Industrial Contributions £25M
Source: Science Budget 2003/4 – 2005/6, DTI(OST)
Slide from Steve Newhouse
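As a quick consistency check on the budget slide, the quoted percentage shares can be recomputed from the quoted amounts. The sketch below is illustrative only: the function name `check_shares` and the data layout are assumptions introduced here, and the figures are simply those printed on the slide.

```python
# Amounts in £M and percentage shares exactly as quoted on the slide.
council_budget = {
    "MRC": (21.1, 10), "EPSRC": (77.7, 37), "BBSRC": (18.0, 8),
    "NERC": (15.0, 7), "PPARC": (57.6, 27), "CLRC": (10.0, 5),
    "ESRC": (13.6, 6),
}
epsrc_breakdown = {"Applied": (35.0, 45), "HPC": (11.5, 15), "Core": (31.2, 40)}


def check_shares(entries):
    """Return (total in £M, {name: computed share minus quoted share, in points})."""
    total = sum(amount for amount, _ in entries.values())
    gaps = {
        name: 100 * amount / total - quoted
        for name, (amount, quoted) in entries.items()
    }
    return total, gaps


for label, entries in (("Councils", council_budget), ("EPSRC", epsrc_breakdown)):
    total, gaps = check_shares(entries)
    worst = max(abs(gap) for gap in gaps.values())
    print(f"{label}: total £{total:.1f}M, largest gap {worst:.2f} points")
```

The council amounts sum to the £213M total and the EPSRC items sum to £77.7M; every quoted share agrees with the recomputed value to within one percentage point, which is consistent with the slide rounding its shares so that they add up to 100%.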
The e-Science On The Map Today
[Map of UK e-Science sites; legend: Funded centres]
• Globus Apache Project & CDIG
• National Centre for e-Social Science
• NERC e-Science Centre
• e-Science Institute
• National Grid Service
• Digital Curation Centre
• NGS Support Centre
• CeSC (Cambridge)
• OMII-UK
• EGEE-II
• National Institute for Environmental e-Science
Invest in People
• Training
– Targeted
– Immediate goals
– Specific skills
– Building a workforce
• Education
– Pervasive
– Long term and sustained
– Generic conceptual models
– Developing a culture
• Both are needed
[Diagram: Society, Organisation, Education, Training, Graduates, Skilled Workers, Innovation, Services & Applications; arrows labelled Invests, Prepares, Create, Develop, Enriches, Strengthens]
INFSO-SSA-26637, 25 May 2006
Compound Causes of (Geo)Data Growth
Faster devices
Cheaper devices
Higher-resolution
all ~ Moore’s law
Increased processor throughput
⇒ more derived data
Cheaper & higher-volume storage
Remote data more accessible
Public policy to make research data available
Bandwidth increases
Latency does not decrease, though
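The contrast between bandwidth and latency can be made concrete with a simple model, assumed here for illustration rather than taken from the slides: the time for one remote request is the round-trip latency plus the payload size divided by the bandwidth. The function name `transfer_time` and all numbers below are hypothetical.

```python
def transfer_time(size_bytes, bandwidth_bps, latency_s, round_trips=1):
    """Seconds to fetch data under a simple latency-plus-throughput model."""
    return round_trips * latency_s + (size_bytes * 8) / bandwidth_bps


# Hypothetical workload: 1,000 sequential requests of 10 kB each
# over a link with a 100 ms round-trip latency.
size = 10_000
latency = 0.1
for bandwidth in (10e6, 100e6, 1e9):
    total = 1000 * transfer_time(size, bandwidth, latency)
    print(f"{bandwidth / 1e6:>6.0f} Mbit/s: {total:7.2f} s")
```

Under these assumptions, raising the bandwidth a hundredfold barely changes the total, because the fixed latency term dominates for many small requests. This is one reason access patterns for remote data matter as much as raw link speed.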
Interpretational Challenges
Finding & Accessing data
Variety of mechanisms & policies
Interpreting data
Variety of forms, value systems & ontologies
Independent provision & ownership
Autonomous changes in availability, form, policy, …
Processing data
Understanding how it may be related
Devising models that expose the relationships
Presenting results
Humans need either
– Derived small volumes of statistics
– Visualisations
Collaboration
Collaboration is a Key Issue
Multi-disciplinary
Multi-national
Academia & industry
Trustworthy data sharing key for collaboration
Plenty of opportunities for research and innovation
Establish common frameworks where possible
– Islands of stability: reference points for data integration
Establish international standards and cooperative behaviour
– Extend incrementally
Trustworthy code & service sharing also key
Federation
Federation is a Key Issue
Multi-organisation
Multi-purpose
Multi-national
Academia & industry
Build shared standards and ontologies
Require immense effort
Require critical mass of adoption
Trustworthy code & e-Infrastructure sharing
Economic & social necessity
Major Intellectual Challenges
Require
Many approaches to be integrated
Many minds engaged
Many years of effort
Using the Systems
Requires well-tuned models
Well-tuned relationships between systems & people
Flexibility, adaptability & agility
Enabling this
Is itself a major intellectual challenge