e-Science and the Grid
- Towards a BioGrid?
Tony Hey
Director of UK
e-Science Core Program
Tony.Hey@epsrc.ac.uk
e-Science and the Grid
‘e-Science is about global collaboration
in key areas of science, and the next
generation of infrastructure that will
enable it.’
John Taylor
Director General of Research Councils
Office of Science and Technology
UK e-Science Initiative
• £120M Programme over 3 years
• £75M is for Grid Applications in all
areas of science and engineering
• £35M ‘Core Program’ to encourage
development of generic ‘industrial
strength’ Grid middleware
- Require £20M additional ‘matching’
funds from industry
UK e-Science Projects
• £75M for e-Science application ‘pilots’
- span all sciences and engineering
• Particle Physics and Astronomy (PPARC)
- £17M GridPP and £5M AstroGrid
• Engineering and Physical Sciences (EPSRC)
- funding 6 projects at around £3M each
• Biology, Medical and Environmental Science
- funding projects with total value of £23M
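As a rough cross-check of the figures quoted on the two funding slides, the short Python sketch below tallies them. The variable names are illustrative, the EPSRC entry uses the approximate "around £3M each", and the slides do not say where any remainder is allocated.

```python
# Figures in £M, copied from the funding slides.
PROGRAMME_TOTAL = 120
APPLICATIONS = 75
CORE_PROGRAM = 35

# Application pilots itemised on the slide.
# The EPSRC entry is approximate: 6 projects at around £3M each.
pilots = {
    "PPARC GridPP": 17,
    "PPARC AstroGrid": 5,
    "EPSRC (6 projects at about 3)": 6 * 3,
    "Biology, Medical and Environmental": 23,
}


def remainder(total, parts):
    """Return the part of `total` not covered by the listed `parts`."""
    return total - sum(parts)


programme_gap = remainder(PROGRAMME_TOTAL, [APPLICATIONS, CORE_PROGRAM])
pilot_gap = remainder(APPLICATIONS, pilots.values())

print(f"Programme funding not itemised on the slide: £{programme_gap}M")
print(f"Application funding not itemised on the slide: about £{pilot_gap}M")
```

The gaps only show what the slides leave unlisted; they are not evidence of an error in the programme figures.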
Steve Lloyd
Tony Doyle
John Gordon
GridPP Presentation
to PPARC Grid
Steering Committee
26 July 2001
Data Handling and Computation for Physics Analysis
[Figure: data-flow diagram. Labels: detector; event filter (selection & reconstruction); raw data; reconstruction; event reprocessing; event simulation; processed data; event summary data; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis. Source: CERN, les.robertson@cern.ch]
Estimated Physics Computation Capacity at CERN
[Chart: capacity in M SI95 (axis 0.0 to 6.0) by year, 2000 to 2010, compared with Moore’s law capacity growth with a fixed CPU count or a fixed annual budget]
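The two Moore's-law reference curves named on the chart can be illustrated with a small Python sketch. This is one interpretation, not the calculation behind the slide. It assumes performance per CPU doubles every 18 months, CPU prices stay constant, and under the fixed-budget case old machines are kept in service. Capacities are in arbitrary units, not the M SI95 values plotted, and all function names are hypothetical.

```python
def moore_factor(years, doubling_years=1.5):
    """Per-CPU performance multiplier after `years` (assumed doubling period)."""
    return 2.0 ** (years / doubling_years)


def fixed_cpu_count(base_capacity, years, doubling_years=1.5):
    """Capacity if the same number of CPUs is replaced by current models."""
    return base_capacity * moore_factor(years, doubling_years)


def fixed_annual_budget(first_year_capacity, years, doubling_years=1.5):
    """Capacity if each year's budget buys new CPUs and older ones are kept."""
    return sum(
        first_year_capacity * moore_factor(y, doubling_years)
        for y in range(years + 1)
    )


# Print both curves every two years from 2000 to 2010.
for offset in range(0, 11, 2):
    print(
        2000 + offset,
        round(fixed_cpu_count(1.0, offset), 1),
        round(fixed_annual_budget(1.0, offset), 1),
    )
```

Under these assumptions the fixed-budget curve always sits at or above the fixed-count curve, because it accumulates every earlier purchase.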
CERN's Users in the World
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
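The totals implied by these two rows can be worked out in a few lines of Python. The sketch assumes the second figure on the "Elsewhere" row is also a user count, and the dictionary layout is only illustrative.

```python
# Counts as quoted on the slide.
cern_users = {
    "Europe": {"institutes": 267, "users": 4603},
    "Elsewhere": {"institutes": 208, "users": 1632},
}

total_institutes = sum(region["institutes"] for region in cern_users.values())
total_users = sum(region["users"] for region in cern_users.values())

# Fraction of all users based at European institutes.
europe_share = cern_users["Europe"]["users"] / total_users

print(f"Institutes: {total_institutes}")
print(f"Users: {total_users}")
print(f"European share of users: {europe_share:.1%}")
```

That gives 475 institutes and 6235 users in total, with a little under three quarters of the users in Europe.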
Powering the Virtual
Universe
http://www.astrogrid.ac.uk
(Edinburgh, Belfast, Cambridge,
Leicester, London, Manchester, RAL)
Multi-wavelength images showing the jet in M87: from top to bottom
– Chandra X-ray, HST optical, Gemini mid-IR, VLA radio.
AstroGrid will provide advanced, Grid-based federation and
data mining tools to facilitate better and faster scientific
output.
Picture credits: “NASA / Chandra X-ray Observatory /
Herman Marshall (MIT)”, “NASA/HST/Eric Perlman
(UMBC)”, “Gemini Observatory/OSCIR”, “VLA/NSF/Eric
Perlman (UMBC)/Fang Zhou, Biretta (STScI)/F Owen
(NRAO)”
Printed: 27/06/2002
UK ‘BioGrid’ Projects
EPSRC Projects
• Comb-e-Chem (EPSRC)
• myGrid (EPSRC)
• DiscoveryNet (EPSRC)
BBSRC Projects
• Biomolecular Grid (BBSRC)
• Proteome Annotation Pipeline (BBSRC)
• High-Throughput Structural Biology
(BBSRC)
• Global Biodiversity (BBSRC)
UK ‘BioGrid’ Projects
MRC Projects
• Biology of Ageing (BBSRC + MRC)
• Sequence and Structure Data (MRC)
• Molecular Genetics (MRC)
• Cancer Management (MRC + PPARC)
• Clinical e-Science Framework – CLEF
(MRC)
• Neuroinformatics Modeling Tools (MRC)
The Comb-e-Chem Project
• Goal is to integrate simulated and
experimental data within a knowledge
environment
- Accumulate and model data using new
combinatorial methods
- Automate metadata annotation for
provenance
• Southampton, Bristol, Cambridge
Crystallographic Data Centre, Pfizer,
IBM
Comb-e-Chem Architecture
[Architecture diagram. Labels: X-Ray e-Lab; Properties e-Lab; Diffractometer; Video; Simulation; Analysis; Properties; Structures Database; Globus]
The myGrid Project
• Goal is to develop a ‘workbench’ to support:
– Experimental process of data accumulation
– Use of community information
• Provide facilities for resource selection, data
management and process enactment
– Functional genomics, pattern database
annotation
• Manchester, EBI, Newcastle, Nottingham,
Sheffield, Southampton, GSK, AstraZeneca,
Merck, IBM, Sun, …
Functional Genomics Data
• Imminent ‘deluge’
of data
• Highly
heterogeneous
• Highly complex
and inter-related
• Convergence of
data and
literature
archives
myGrid Generic Technologies
1. Database access from the Grid
2. Process enactment on the Grid
3. Personalisation services
4. Metadata services
5. Development of Agent Services
Grid Services + Ontologies
- Towards the ‘Semantic Grid’
The Discovery Net Project
• Data issues: Calibration
– Diversity of resource: Normalisation
– Diversity of quality: Cleaning
• Information issues: Integration
– Information structuring (XML/Schema)
– Information abstraction
• Knowledge issues: Assimilation
– Validation & Reference: knowledge schema
– Management: discovery process
Discovery Deployment
Discovery Component
Active Report
Discovery Process Markup Language
Batch processing
Discovery Service
GRAB - Biodiversity and the Grid
Federated catalogue of life
Species data from Edinburgh
Specialist in Cardiff
Climate data from York
Images from Cambridge
Application control at Southampton
GRAB - Biodiversity and the Grid
GRAB Results
[Screen mock-up with result panels: Species, Locations, Graphics]
e-Science ‘Core Program’
1. Network of e-Science Centres
- UK e-Science Grid and AccessGrid
2. Generic/Industrial Grid Middleware
3. e-Health Grid ‘Grand Challenge’
4. Support for e-Science Applications
5. Outreach/International Activities
6. Grid Network Issues
UK e-Science Grid
[Map of UK e-Science Grid sites: Edinburgh, Glasgow, DL, Belfast, Newcastle, Manchester, Cambridge, Oxford, Cardiff, RAL, London, Southampton, Hinxton]
Access Grid
‘Grid Computing is one of the three
next big things for Sun and our
customers’
Ed Zander, COO Sun
‘The alignment of OGSA with XML
Web services is important because
it will make Internet-scale,
distributed Grid Computing possible’
Robert Wahbe, General Manager
of Web Services, Microsoft
Timescales for Exploitation?
• IBM see ‘early adopters’ of Grid
technology coming from pharmaceutical,
engineering and petrochemical sectors
- UK program confirms this picture
(AstraZeneca, GSK, Merck, Pfizer,
Rolls-Royce, BAE Systems, Schlumberger)
• IBM see Grid middleware being adopted by
mainstream commerce and industry in
2003/2004 timeframe
Collaborative Industrial
Grid Projects
• Grid Application Projects have more
than £8M industrial input
- mostly major pharmaceutical and
engineering companies
• Around £15M allocated for
collaborative industrial projects for
middleware/tools
- at present £5M allocated with
matching industrial funding
Databases in the Grid
[Chart axes: Data Complexity against Computational Complexity]
OGSA – Data Access and
Integration Project
- Key middleware area for UK Program
- Develop high-quality data-centric
middleware capability
- Total Budget $5M (CP $2M)
- Three Centres: Edinburgh, Manchester
and Newcastle
- Industrial partners: IBM US, IBM
Hursley and Oracle UK.
e-Health ‘Grand Challenge’
• Equator: Technological
innovation in physical and
digital life
• AKT: Advanced Knowledge
Technologies
• DIRC: Dependability of
Computer-Based Systems
• MIAS: From Medical
Images and Signals to
Clinical Information
e-Health Grid Projects
• MIAS-Grid
– Annotated Database of digitized
mammographic data for epidemiology
studies and diagnosis support
• Grid-Enabled Knowledge Services for
Medical Informatics
– Triple Assessment in Breast Cancer:
Fusion of Clinical, Radiological and
Cytological data
• Grid-based Medical Devices for Everyday
Health
– Patient sensors, mobile wireless
communication
Support for e-Science Projects
• Grid Support Centre
- Support Grid middleware for users
- Grid Certification Authority
• National e-Science Institute for UK
Research Seminar and Training Program
– see www.nesc.ac.uk
• Grid Network Team
- QoS project with Cisco on MPLS
- Advise on end-to-end e-Science issues
SuperJanet4, June 2002
[Network map. Backbone nodes: WorldCom Glasgow, WorldCom Edinburgh, WorldCom Manchester, WorldCom Leeds, WorldCom Reading, WorldCom London, WorldCom Bristol, WorldCom Portsmouth. Regional networks: Scotland via Edinburgh, Scotland via Glasgow, NNW, NorMAN, YHMAN, Northern Ireland, MidMAN, EMMAN, EastNet, TVN, South Wales MAN, SWAN & BWEMAN, LMN, LeNSE, Kentish MAN. External Links. Link speeds in legend: 20Gbps, 10Gbps, 2.5Gbps, 622Mbps, 155Mbps]
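To give a feel for what the link speeds in the map legend mean for data-intensive e-Science, the Python sketch below estimates ideal transfer times. The 1 TB dataset size is a hypothetical example, and the calculation ignores protocol overhead and link sharing, so real transfers would take longer.

```python
# Link speeds from the SuperJanet4 map legend, in bits per second.
LINK_SPEEDS_BPS = {
    "20Gbps": 20e9,
    "10Gbps": 10e9,
    "2.5Gbps": 2.5e9,
    "622Mbps": 622e6,
    "155Mbps": 155e6,
}


def transfer_seconds(size_bytes, link_bps):
    """Ideal time to move `size_bytes` over a link of `link_bps` bits/s."""
    return size_bytes * 8 / link_bps


# Hypothetical dataset size (1 TB, decimal); not a figure from the slides.
DATASET_BYTES = 1e12

for label, bps in LINK_SPEEDS_BPS.items():
    hours = transfer_seconds(DATASET_BYTES, bps) / 3600
    print(f"{label:>8}: {hours:.2f} h")
```

Even under these ideal assumptions, the slowest and fastest links in the legend differ by more than a factor of 100.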
e-Science and the Grid
‘e-Science will change the dynamic
of the way science is undertaken.’
John Taylor