The UK e-Science Programme in 2003

Tony Hey,

Director of UK e-Science Core Programme

EPSRC, UK

A Definition of the Grid

‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’

Tony Blair, 2002

UK e-Science Initiative – Phase 1

• £120M Programme over 3 years

• £75M is for Grid Applications in all areas of science and engineering

• £35M ‘Core Programme’ to encourage development of generic ‘industrial strength’ Grid middleware

➢ Requires £20M additional ‘matching’ funds from industry

UK Grid Projects: First Phase (1)

Particle Physics and Astronomy (PPARC)

• GRIDPP

• ASTROGRID

• GridOneD

Engineering and Physical Sciences (EPSRC)

• Comb-e-Chem

• DAME

• DiscoveryNet

• GEODISE

• myGrid

• RealityGrid

UK Grid Projects: First Phase (2)

Natural Environment Applications (NERC)

• Climateprediction.net

• GODIVA Oceanographic Grid

• e-Minerals Molecular Environmental Grid

• NERC DataGrid (with CP)

• GENIE

Biotechnology and Biological Sciences (BBSRC)

• Biomolecular Grid

• Proteome Annotation Pipeline

• High-Throughput Structural Biology

• Global Biodiversity

UK Grid Projects: First Phase (3)

Medical Applications (MRC)

• Biology of Ageing (with BBSRC)

• Sequence and Structure Data

• Molecular Genetics

• Cancer Management (with PPARC)

• Clinical e-Science Framework

• Neuroinformatics Modelling Tools

Computer Science for e-Science

• £9M programme

• 18 projects funded to date; £5.6M committed (plus £120k CP) in 2 calls

• Ontologies, dealing with incomplete data sets, autonomic architectures, data publishing & curation, QoS, Grid communication services, provenance, pervasive computing, SLAs, services for VOs, etc.

• Links to applications in bioinformatics, particle physics, materials modelling, maths, etc.

➢ Leading CS groups engaged (> 50% in 5* Depts)

Core Programme

Overall Rationale: Four major functions of CP

– Assist development of essential, well-engineered, generic Grid middleware usable by both e-scientists and industry

– Provide necessary infrastructure support for UK e-Science Research Council projects

– Collaborate with the international e-Science and Grid communities

– Work with UK industry to develop industrial-strength Grid middleware

Core Programme: Phase 1

1. Network of e-Science Centres, UK e-Science Grid and e-Science Institute

2. Support for e-Science Projects

3. Development of Generic Grid Middleware

4. Grid IRC Grand Challenge Projects

5. Grid Network Team

6. Outreach and International Involvement

UK e-Science Grid

[Map of UK e-Science Grid nodes: Edinburgh, Glasgow, Belfast, Newcastle, DL (Daresbury Laboratory), Manchester, Cardiff, Oxford, RAL (Rutherford Appleton Laboratory), Cambridge, London, Southampton and Hinxton]

UK e-Science Grid – Level 2 deployment

➢ A Globus GT2-based Grid infrastructure with resources, middleware and applications from the e-Science Centres and CCLRC (a minimal job-submission sketch follows below)

➢ Implemented by the Grid Engineering Task Force

➢ Possibly the most heterogeneous Grid deployment worldwide

➢ Middleware from Centre developments deployed

➢ Real applications and e-Science project users starting to generate scientific results

➢ Now identifying steps to a full production Grid with JCSR compute and data nodes
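As a rough illustration of how an end user drives such a GT2 deployment, the sketch below wraps the standard Globus command-line tools (grid-proxy-init, globus-job-run) from Python; the resource contact string is a hypothetical placeholder rather than an actual UK e-Science node.

```python
# Illustrative sketch only: submitting a trivial job to a Globus GT2 gatekeeper
# by shelling out to the standard command-line tools.
import subprocess

def run_grid_job(contact: str, executable: str, *args: str) -> str:
    """Create a proxy credential, submit a job and return its standard output."""
    # grid-proxy-init derives a short-lived proxy credential from the user's
    # e-Science certificate (it prompts for the private-key passphrase).
    subprocess.run(["grid-proxy-init"], check=True)

    # globus-job-run sends the executable to the job manager named by the
    # resource contact string and waits for completion.
    result = subprocess.run(
        ["globus-job-run", contact, executable, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout

if __name__ == "__main__":
    # "grid-node.example.ac.uk" is a placeholder, not a real deployment node.
    print(run_grid_job("grid-node.example.ac.uk/jobmanager-fork", "/bin/hostname"))
```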

e-Science Centres of Excellence

• Birmingham/Warwick – Modelling

• Bristol – Media

• UCL – Networking

• White Rose Grid – Leeds, York, Sheffield

• Lancaster – Social Science

• Leicester – Astronomy

• Reading – Environment


Support for e-Science Projects

• Grid Support Centre in operation

– support for Grid middleware & users

– see www.grid-support.ac.uk

• National e-Science Institute

– Research Seminars

– Training Programme

– See www.nesc.ac.uk

• National Certificate Authority

– Issue digital certificates for projects

– Goal is ‘single sign-on’ (see the illustrative certificate sketch below)
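For illustration only, the sketch below shows the kind of certificate signing request a user generates before a CA issues a certificate; it uses the Python cryptography package rather than the CA’s actual tooling, and the subject fields and file names are invented.

```python
# Illustrative sketch: generate a key pair and a certificate signing request
# (CSR) of the kind a user would submit to a certificate authority.
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([
        # The subject fields below are invented for the example.
        x509.NameAttribute(NameOID.COUNTRY_NAME, "UK"),
        x509.NameAttribute(NameOID.ORGANIZATION_NAME, "eScience"),
        x509.NameAttribute(NameOID.COMMON_NAME, "jane researcher"),
    ]))
    .sign(key, hashes.SHA256())
)

# Store the passphrase-protected private key and the PEM-encoded request;
# the CSR (not the key) is what gets sent to the certificate authority.
with open("user.key", "wb") as f:
    f.write(key.private_bytes(
        serialization.Encoding.PEM,
        serialization.PrivateFormat.TraditionalOpenSSL,
        serialization.BestAvailableEncryption(b"change-this-passphrase"),
    ))
with open("user.csr", "wb") as f:
    f.write(csr.public_bytes(serialization.Encoding.PEM))
```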

Networking Research Projects

[Diagram: projects positioned across Grid infrastructure, service infrastructure and network infrastructure layers]

• GRS – Grid resource management

• FutureGRID – P2P architecture

• GridMcast – multicast-enabled data distribution (a minimal multicast sketch follows this list)

• MB-NG – QoS features

• GRIDprobe – passive backbone monitoring at 10 Gbps
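For context, the sketch below shows the underlying IP-multicast pattern (one sender, many group subscribers) using Python’s standard socket module; the group address and port are arbitrary examples and do not come from any of the projects above.

```python
# Illustrative sketch of one-to-many data distribution over IP multicast.
import socket
import struct

GROUP, PORT = "239.1.1.1", 5007   # arbitrary administratively scoped group

def send(message: bytes) -> None:
    """Send one datagram to every receiver currently joined to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
    sock.sendto(message, (GROUP, PORT))

def receive() -> bytes:
    """Join the multicast group and block until a single datagram arrives."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Ask the kernel to join the group on the default interface.
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    data, _ = sock.recvfrom(65535)
    return data
```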

The UK Grid Experience: Phase 1

• UK Programme on Grids for e-Science

– £75M for e-Science Applications

• UK Grid Core Programme for Industry

– £35M for collaborative industrial R&D

➢ Over 80 UK companies participating

➢ Over £30M industrial contributions

• Engineering, Pharmaceutical, Petrochemical

• IT companies, Commerce, Media

Subset of Industrial Involvement

• IT Companies

– Sun, IBM, Intel, Microsoft

– SGI, HP, Fujitsu, Cisco, …

• Major End User Companies

– Rolls-Royce, Data Systems and Solutions, BAE Systems, Shell, Siemens

– GSK, AstraZeneca, Pfizer, Merck, Schlumberger, BT, …

• SMEs

– NAG, Cybula, Compusys, Mesophotonics, Fluent, Epistemics, Mirada, …

UK e-Science Funding

First Phase: 2001–2004

• Application Projects

– £74M

– All areas of science and engineering

• Core Programme

– £15M + £20M (DTI)

– Collaborative industrial projects

Second Phase: 2003–2006

• Application Projects

– £96M

– All areas of science and engineering

• Core Programme

– £16M

– Core Grid Middleware

– DTI follow-on?

Core Programme: Phase 2

1. UK e-Science Grid/Centres and e-Science Institute

2. Grid Support Centre and Network Monitoring

3. Core Middleware engineering

4. National Data Curation Centre

5. e-Science Exemplars/New Opportunities

6. Outreach and International involvement

Research Prototype Middleware to Production Quality

• Research projects are not funded to do the regression testing, configuration and QA required to produce production-quality middleware

• A common rule of thumb (Brooks) is that it takes at least 10 times more effort to bring ‘proof of concept’ research software to production quality – for example, a prototype built in two person-years implies roughly twenty person-years of further engineering

➢ Key issue for UK e-Science projects is to ensure that there is some documented, maintainable, robust Grid middleware by the end of the 5-year, £250M initiative

Core ‘e-Science’ Middleware

• Need to develop an open-source, open-standards-compliant Grid middleware stack that will integrate and federate with industrial solutions

• Software engineering focus

– Aim is to produce robust, well-documented, re-usable software that is maintainable and can evolve to embrace emerging Grid Service standards

➢ Major focus of Phase 2 of the UK e-Science Core Programme

A UK Open Middleware Infrastructure Institute

• Repository for UK-developed Open Source ‘e-Science/Cyber-infrastructure’ Middleware

• Compliance testing for GGF/WS standards

• Documentation, specification and QA

• Fund work to bring ‘research project’ software up to ‘production strength’

• Fund Middleware projects for identified ‘gaps’

• Work with US NMI, EU Projects and others

• Supported by major IT companies

Grid Middleware Gap Analysis

• Examined requirements and services already understood/developed for e-Science (reasonably broad coverage) and e-Business, e-Government and e-Services (more limited coverage)

• Gaps divided into four broad areas

– Near-term Technical

– Education and Support

– Research

– Perception and Organization

• Appendix lists over 60 significant services and tools under development

Categorization of Technical Gaps

[Diagram: Grid services (application-specific, resource-specific and generic) layered over a runtime/hosting environment and compute, file, information and network resources]

• Architecture and Style (8.1)

• Runtime and Hosting Environment (8.2)

• Security (8.3)

• Workflow (8.4)

• Notification (8.5)

• Meta-data (8.6)

• Information Resources (8.7)

• Compute/File Resources (8.8)

• Other Grid Services (8.9)

• Portals and PSEs (8.10)

• Network Resources (8.11)

Security Task Force Activities

• Raise awareness of Security

• Requirements collection and risk analysis

• Develop e-Science Security policy

• Develop Security Technology roadmap

➢ Jointly fund key security projects with EPSRC and JCSR, and coordinate effort with the NSF NMI and Internet2 projects

➢ JCSR £2M call in preparation

UK Data Curation Centre

• In the next 5 years e-Science projects will produce more scientific data than has been collected in the whole of human history

• In 20 years we can guarantee that the operating system, the spreadsheet program and the hardware used to store the data will no longer exist

➢ Need to research and develop technologies and best practice for curating digital data (a minimal curation-record sketch follows below)

➢ Need to liaise closely with individual research communities, data archive centres and university libraries
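Purely as an illustration of one basic curation practice – recording provenance and a fixity checksum alongside the data in an open, self-describing format – a minimal Python sketch follows; the metadata fields and file naming are assumptions for the example, not a specification from the Centre.

```python
# Illustrative sketch: write a JSON sidecar recording simple provenance
# metadata and a SHA-256 fixity digest next to a data file.
import hashlib
import json
from datetime import date
from pathlib import Path

def write_curation_record(data_file: str, creator: str, description: str) -> Path:
    """Create '<data_file>.metadata.json' describing the file and return its path."""
    payload = Path(data_file).read_bytes()
    record = {
        "file": data_file,
        "creator": creator,
        "description": description,
        "created": date.today().isoformat(),
        "format": Path(data_file).suffix.lstrip("."),
        "sha256": hashlib.sha256(payload).hexdigest(),  # fixity check
    }
    sidecar = Path(data_file + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return sidecar
```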

UK Data Curation Centre (2)

• With CP and JCSR funding, plan to establish an internationally significant Centre for R&D and Best Practice in Data Curation technologies

➢ Centre will liaise closely with individual research communities and data archives

• Joint call from JCSR and Core Programme

➢ £3M call for initial 3-year funding closes in October

The UK Dual Support System

Provides two streams of public funding for university research:

– Funding provided by the HEFCs for research infrastructure – salaries of permanent academic staff, premises, libraries & central computing costs

– Funding from the Research Councils for specific projects – in response to proposals submitted & approved through peer review

➢ ‘Well-Founded Laboratory’ concept

UK Research Infrastructure Funding

• National Teraflop/s Supercomputer (OST)

➢ 2002 – 3 Teraflop/s

➢ 2004 – 6 Teraflop/s

➢ 2006 – 12 Teraflop/s

• National Academic Network (JISC)

➢ SuperJANET4 plus MANs

➢ ‘UKLight’ lambda connection

Joint Information Systems Committee (JISC)

Mission:

“Help further and higher education institutions and the research community realise their ambitions in exploiting the opportunities of information and communications technology by exercising vision and leadership, encouraging collaboration and cooperation and by funding and managing national development programmes and services of the highest quality”

JISC Committee for Support of Research (JCSR)

• Established in 2002 after Follett Review

• Remit is to ensure JISC retains focus on research community

• Budget of £3M p.a.

• Seeking research support requirements from Research Councils

• Funded analysis of research data curation requirements

• Funded scoping study on legal, IPR and provenance issues for e-Science collaboratories

Initial JCSR Portfolio

• Grid Middleware Testbed with Compute and Data Clusters, with CCLRC

• AAA Initiative with JCIE

• Autonomic Computing/Semantic Grid initiative with EPSRC

• Access Grid Support Service

• e-Social Science Training material with ESRC

• Intelligent Text Mining Service for Biosciences with BBSRC

• Digital Curation Centre with e-Science Core Programme

Status of the Grid and e-Science

• Today – ‘early adoption’ phase, like the Web in its early days

➢ A time of vigorous debate and experimentation

• Tomorrow – need to plan for supporting e-Science infrastructure after the e-Science Initiative ends

➢ Need to redefine the ‘well-founded laboratory’ so that JISC supports key e-Science services over the SuperJANET network – ‘e-Infrastructure’

e-Science Timeframes

[Timeline chart: the SR2000, SR2002 and SR2004 spending-review funding periods and the LHC/LCG project plotted against the years 2001–2007]

SR2004 – e-Science Infrastructure

1. Persistent UK e-Science Research Grid

2. Grid Operations Centre

3. UK Open Middleware Infrastructure Institute

4. National e-Science Institute

5. UK Digital Curation Centre

6. AccessGrid Support Service

7. e-Science/Grid collaboratories Legal Service

8. International Standards Activity

e-Science and Universities

‘e-Science will change the dynamic of the way science is undertaken.’

John Taylor, 2001

➢ Need to break down the barriers between the Victorian ‘bastions’ of science – biology, chemistry, physics, …

➢ Develop ‘permeable’ structures that promote rather than hinder multidisciplinary collaboration

➢ Need to engage University IT Service departments – Computing Services, Libraries, …
