e-Science and the Grid –
for Research and Industry
Tony Hey
Director of UK e-Science Core Programme
Tony.Hey@epsrc.ac.uk
The e-Science Paradigm
• The Integrative Biology Project involves the
University of Oxford (and others) in the UK and
the University of Auckland in New Zealand
– Models of electrical behaviour of heart cells
developed by Denis Noble’s team in Oxford
– Mechanical models of beating heart developed by
Peter Hunter’s group in Auckland
• Researchers need to be able to easily build a
secure ‘Virtual Organisation’ allowing access to
each group’s resources
→ Will enable researchers to do different science
The Grid = A set of core middleware services
running on top of high performance global
networks
RCUK e-Science Funding
First Phase: 2001–2004
• Application Projects
– £74M
– All areas of science
and engineering
• Core Programme
– £15M Research
infrastructure
– £20M Collaborative
industrial projects
Second Phase: 2003–2006
• Application Projects
– £96M
– All areas of science and
engineering
• Core Programme
– £16M Research
Infrastructure
– £10M DTI Technology
Fund
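The funding lines above can be totalled per phase. The following is a minimal sketch, assuming the three figures listed for each phase are separate, non-overlapping budget lines (the slide does not say so explicitly). The dictionary keys are shortened labels chosen here for illustration.

```python
# Figures in millions of pounds, copied from the slide.
# Assumption: the three lines in each phase do not overlap.
PHASE_1 = {
    "Application Projects": 74,
    "Core Programme: research infrastructure": 15,
    "Core Programme: collaborative industrial projects": 20,
}
PHASE_2 = {
    "Application Projects": 96,
    "Core Programme: research infrastructure": 16,
    "Core Programme: DTI Technology Fund": 10,
}


def total(phase):
    """Sum all budget lines of one funding phase (in £M)."""
    return sum(phase.values())


if __name__ == "__main__":
    for label, phase in (("First Phase", PHASE_1), ("Second Phase", PHASE_2)):
        print(f"{label}: £{total(phase)}M")
```

Under that assumption the First Phase comes to £109M and the Second Phase to £122M.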
UK Focus on Data and Security
• Data Access and Integration
– OGSA-DAI and DAIT project with IBM
• Key grid data services
– Workflow, Provenance
– Distributed Query, Knowledge Management
• Data Curation and Data Handling
– Digital Curation Centre with JISC
• Security, AA and all that
– e-Science CA, GSI and WS-Security
– Shibboleth/PERMIS deployment with Internet2
Comb-e-Chem Project
[Diagram labels: Video; Simulation; Diffractometer; Properties; Analysis; Structures Database; X-Ray e-Lab; Properties e-Lab; Grid Middleware]
myGrid Project
• Imminent ‘deluge’ of
data
• Highly heterogeneous
• Highly complex and
inter-related
• Convergence of data
and literature archives
Discovery Net Project
Interactive Editor & Visualisation
Nucleotide Annotation Workflows
• Download sequence from Reference Server
• Sources: InterPro, SMART, KEGG, EMBL, NCBI, SWISS-PROT, TIGR, SNP, GO
• Save to Distributed Annotation Server
• 1800 clicks, 500 Web accesses, 200 copy/paste operations and 3 weeks of work
→ 1 workflow and a few seconds of execution
• Execute distributed annotation workflow
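The idea of replacing many manual look-ups with a single workflow can be sketched as follows. This is a minimal illustration, not Discovery Net's own API: `make_stub_source` and `run_workflow` are hypothetical names, and each step is a local placeholder that stands in for a remote database query rather than contacting any real service.

```python
from typing import Callable, Dict, List

Annotation = Dict[str, str]
Step = Callable[[str], Annotation]

# Source names taken from the slide; the look-ups themselves are stubs.
SOURCES = ["InterPro", "SMART", "KEGG", "EMBL", "NCBI",
           "SWISS-PROT", "TIGR", "SNP", "GO"]


def make_stub_source(name: str) -> Step:
    """Return a placeholder annotator standing in for a remote look-up."""
    def annotate(sequence: str) -> Annotation:
        # A real step would query the remote service here (assumption).
        return {name: f"stub annotation for {len(sequence)}-base sequence"}
    return annotate


def run_workflow(sequence: str, steps: List[Step]) -> Annotation:
    """Apply every annotation step to the sequence and merge the results."""
    merged: Annotation = {}
    for step in steps:
        merged.update(step(sequence))
    return merged


workflow = [make_stub_source(name) for name in SOURCES]

if __name__ == "__main__":
    for source, note in run_workflow("ACGTACGT", workflow).items():
        print(source, "->", note)
```

Once the steps are captured in a list like this, the whole chain is re-run with one call, which is the point the click and copy/paste counts on the slide are making.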
DAME Project
[Diagram labels: In-flight data; Global Network (e.g. SITA); Airline; Ground Station; DS&S Engine Health Center; Maintenance Centre; Internet, e-mail, pager; Data centre]
eDiaMoND Project
Mammograms have different
appearances, depending on image
settings and acquisition systems
[Diagram labels: Standard Mammo Format; Temporal mammography; Computer Aided Detection; 3D View]
The UK e-Science Experience:
Phase 1
• All Research Council e-Science funds
committed
– e-Science pilots launched covering many areas
of science, engineering and medicine
• UK e-Science Core Programme
– DTI £20M for collaborative industrial R&D
→ About 80 UK companies participating
→ Over £30M industrial contributions
• Engineering, Pharmaceutical, Petrochemical
• IT companies, Commerce, Media
UK e-Science: Phase 2
Three major new activities:
1. Deploy National Grid Service and establish
Grid Operation Support Centre
2. Fund Open Middleware Infrastructure
Institute for testing, software engineering
and repository for UK middleware
3. Set up Digital Curation Centre for R&D into
long-term data preservation issues
UK National Grid Service
• From April 2004, NGS offers free access to
two 128 processor compute nodes and two
data nodes
• Initial software is based on GT2 via VDT and
LCG releases plus SRB and OGSA-DAI
• Plan to move to Web Services based Grid
middleware by April 2005
• Need for resource allocation mechanisms
Accounting, Performance Prediction
The Web Services ‘Magic Bullet’
[Diagram: Company A (J2EE), Company B (LAMP) and Company C (.Net) connected through Web services]
Open Grid Services Architecture
• Development of Web Services
• OGSA/WSRF/… will provide
Naming / Authorization / Security / Privacy / …
→ Projects should look at higher level services: Workflow,
Transactions, Data Mining, Knowledge Discovery…
→ Exploit Synergy: Commercial Internet
with Grid Services
The UK Open Middleware
Infrastructure Institute (OMII)
• Repository for UK-developed Open Source
‘e-Science/Cyber-infrastructure’ Middleware
• Documentation, specification, QA and standards
• Fund work to bring ‘research project’ software
up to ‘production strength’
• Fund Middleware projects for identified ‘gaps’
• Work with US NSF, EU Projects and others
• Supported by major IT companies
→ Southampton selected as the OMII site
Digital Curation Centre (DCC)
• In the next 5 years e-Science projects will produce
more scientific data than has been collected in the
whole of human history
• In 20 years can guarantee that the operating system,
the spreadsheet program and the hardware used to
store data will not exist
→ Research curation technologies and best practice
→ Need to liaise closely with individual research
communities, data archives and libraries
→ Edinburgh with Glasgow, CLRC and UKOLN
selected as site of DCC
MIT DSpace Vision
‘As more and more research and educational material
is ‘born digital’, institutions and organizations are
increasingly realizing the need for a stable place in
which such material may be stored and accessed long-term. The Massachusetts Institute of Technology is a
perfect example of an organization with this need.
Much of the material produced by faculty, such as
datasets, experimental results and rich media data as
well as more conventional document-based material
(e.g. articles and reports) is housed on an individual’s
hard drive or department Web server. Such material is
often lost forever as faculty and departments change
over time.’
Three Industry Perspectives
• An SAP view
• BAE Systems and Virtual Organisations
• The Burger Model from T-Systems
Naturally Distributed Processing
[Diagram labels: Pick or Produce; Vendor; Build HU; Issue Goods (Loading); Associate Items / Pallet / Tags; Register ID of Pallet; Scan IDs; AII; Event Management; Post Goods Issue; Create Event Handler; ERP System; Create HU; Delivery; WM; TRM; Adv. Ship Notification; Cust. Order; Purchase Order; Buyer]
BAEgrid – deployment of virtual
organisations
• VO needs better definition to support
asymmetric operation.
[Diagram labels: BAE sites R, G, W, F and B; Platform; HP Labs; Cardiff e-Science; Manchester e-Science; Swansea U.; Southampton e-Science; Singapore iHPC]
• VO lifecycle tools are required:
Identification, Formation, Operation, Dissolution
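The four lifecycle stages named on this slide can be modelled as a small state machine. This is a minimal sketch, assuming the stages run strictly in the order listed and that members may only join before operation begins. Neither rule is stated on the slide, and `VirtualOrganisation`, `add_member` and `advance` are hypothetical names, not part of any BAEgrid tool.

```python
from enum import Enum, auto


class VOStage(Enum):
    IDENTIFICATION = auto()
    FORMATION = auto()
    OPERATION = auto()
    DISSOLUTION = auto()


# Assumed linear ordering of stages (not specified in the source).
_NEXT = {
    VOStage.IDENTIFICATION: VOStage.FORMATION,
    VOStage.FORMATION: VOStage.OPERATION,
    VOStage.OPERATION: VOStage.DISSOLUTION,
}


class VirtualOrganisation:
    """Tracks which lifecycle stage a VO is in and who belongs to it."""

    def __init__(self, name):
        self.name = name
        self.stage = VOStage.IDENTIFICATION
        self.members = set()

    def add_member(self, site):
        # Assumption: membership is fixed once the VO is operating.
        if self.stage not in (VOStage.IDENTIFICATION, VOStage.FORMATION):
            raise RuntimeError(f"cannot add members during {self.stage.name}")
        self.members.add(site)

    def advance(self):
        """Move to the next stage; dissolving releases all members."""
        if self.stage is VOStage.DISSOLUTION:
            raise RuntimeError("VO is already dissolved")
        self.stage = _NEXT[self.stage]
        if self.stage is VOStage.DISSOLUTION:
            self.members.clear()
        return self.stage


if __name__ == "__main__":
    vo = VirtualOrganisation("demo-vo")
    vo.add_member("site A")
    vo.add_member("site B")
    while vo.stage is not VOStage.DISSOLUTION:
        print(vo.advance().name)
```

Making the stage explicit gives lifecycle tools a single place to enforce rules such as who may join and when resources are released.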
T-Systems Burger Model
Pervasive Computing
Information Glue
Corporate Computing
The Commoditization of Middleware
• Microsoft and IBM have agreed on the Web
Services ‘open standard’ approach to
interoperable low level distributed middleware
• Providing high value-added services and
products based on this secure, robust, common
open standard middleware infrastructure will be
central to the new economy.
• The existence of ‘open source’ implementations
of this ‘open standard’ middleware will enable
new SMEs to compete with traditional packaged
software vendors
Realizing Licklider’s Vision
“Lick had this concept of the intergalactic network
which he believed was everybody could use
computers anywhere and get at data anywhere in
the world. He didn’t envision the number of
computers we have today by any means, but he
had the same concept – all of the stuff linked
together throughout the world, that you can use a
remote computer, get data from a remote
computer, or use lots of computers in your job.
The vision was really Lick’s originally.”
Larry Roberts – Principal Architect of the ARPANET
e-Government and the Grid
‘[The Grid] intends to make access to
computing power, scientific data repositories
and experimental facilities as easy as the
Web makes access to information.’
Tony Blair, 2002