The UK e-Science Vision: Building a Sustainable e-Infrastructure

Tony Hey
Director of UK e-Science Core Programme
Tony.Hey@epsrc.ac.uk

The UK e-Science Paradigm
• The Integrative Biology Project involves seven UK universities, led by Oxford, and the University of Auckland in New Zealand
  – Models of electrical behaviour of heart cells developed by Denis Noble’s team in Oxford
  – Mechanical models of beating heart developed by Peter Hunter’s group in Auckland
• Researchers need robust middleware services to routinely build secure ‘Virtual Organisations’ to support an international “collaboratory”
Goal is to enable ‘faster, better or different’ research

RCUK e-Science Funding
First Phase: 2001–2004
• Application Projects
  – £74M
  – All areas of science and engineering
• Core Programme
  – £15M Research infrastructure
  – £20M Collaborative industrial projects
Second Phase: 2003–2006
• Application Projects
  – £96M
  – All areas of science and engineering
• Core Programme
  – £16M Research Infrastructure
  – £11M DTI Technology Fund

Some Example e-Science Projects
• Particle Physics – global sharing of data and computation
• Astronomy – ‘Virtual Observatory’ for multi-wavelength astrophysics
• Chemistry – remote control of equipment and electronic logbooks
• Engineering – industrial healthcare and virtual organisations
• Bioinformatics – data integration, knowledge discovery and workflow
• Healthcare – sharing normalized mammograms
• Environment – ocean, weather, climate modelling, sensor networks

UK e-Science Grid
[Map of sites: Edinburgh, Glasgow, DL, Belfast, Newcastle, Manchester, Cambridge, Oxford, Cardiff, RAL, London, Southampton, Hinxton]

Access Grid – Group Conferencing
• Multi-site group-to-group conferencing system
• Continuous audio and video contact with all participants
• Globally deployed
• All UK e-Science Centres have AG rooms
• Widely used for technical and management meetings

A Status Report on UK e-Science
• An exciting portfolio of Research Council e-Science projects (~40 projects)
  – Beginning to
    see e-Science infrastructure deliver some early ‘wins’ in several areas
  – DiscoveryNet success at SC02
  – TeraGyroid success at SC03: ‘heroic’ achievement
• The UK is unique in having a strong collaborative industrial component (~50 projects)
  – Nearly 80 UK companies contributing over £30M
  – Engineering, Pharmaceutical, Petrochemical, IT companies, Commerce, Media, …

AHM 2004 Attendees
Breakdown of AHM attendees:
  Computer Services & Networks   3%
  eScience Centres              11%
  Gov                            5%
  Computer Science              33%
  Laboratory                     8%
  Environmental                  5%
  Medical                        3%
  Industry                       7%
  International                  5%
  Eng & Math                     4%
  Bio                            3%
  Physical Science               8%
  Social                         5%

Identifiable UK e-Science Focus
• Data Access and Integration
  – OGSA-DAI and DAIT project
• Grid Data Services
  – Workflow, Provenance, Notification
  – Distributed Query, Knowledge Management
• Data Curation and Data Handling
  – Digital Curation Centre
• Security, AA and all that
  – Digital Certificates and Single Sign-On
  – Federated Shibboleth framework

UK e-Science Grid: Second Phase
Web Service Grids
[Map of sites: Edinburgh, Glasgow, DL, Belfast, Newcastle, Manchester, Oxford, Cardiff, Cambridge, RL, London, Soton, Hinxton]

The e-Science Core Programme: Phase 2 - Building a Sustainable National e-Infrastructure
The UK Government ‘Investment Framework for Research and Innovation 2004–2014’ report emphasizes the need for:
• Creation of a multidisciplinary research environment
• National Information Infrastructure
  – Access to experimental data sets and publications
  – Collection and preservation of digital information

Key Elements of a UK e-Infrastructure
1. Research Network
2. National Grid and HEC Service
3. Open Middleware Infrastructure Institute
4. Digital Curation Centre
5. National e-Science Institute
6. Portals and Discovery Services
7. Access to facilities, data services and repositories
8. Tools and Services to support collaboration
9. National data archive
10.
Support for International Standards

[Network diagram labels: SuperJANET4/5; Local Research Equipment; UK Researchers; International Point-of-Access; Extended JANET Development Network; UKLight (London); StarLight (Chicago); NetherLight (Amsterdam); CA*net; Abilene; CERN; CzechLight; GEANT. Existing and proposed connections are marked at 10Gb/s and 2.5Gb/s.]
JISC £6.5M for UKLight ‘Lambda’ Network

The Future: Hybrid Networks?
• Standard packet-routed production network for email, Web access, …
• User-controlled ‘lambda’ connections for e-Science applications requiring high performance end-to-end Quality of Service

NGS “Today”
Projects:
• e-Minerals
• e-Materials
• Orbital Dynamics of Galaxies
• Bioinformatics (using BLAST)
• GEODISE project
• UKQCD Singlet meson project
• Census data analysis
• MIAKT project
• e-HTPX project
• RealityGrid (chemistry)
Users: Leeds, Oxford, UCL, Cardiff, Southampton, Imperial, Liverpool, Sheffield, Cambridge, Edinburgh, QUB, BBSRC, CCLRC
Interfaces: OGSI::Lite

NGS “Tomorrow”

GOSC Timeline
[Timeline figure; axis in quarters across 2004, 2005 and 2006. Milestone labels: NGS Production Service; NGS WS Service; NGS WS Service 2; NGS Expansion (Bristol, Cardiff…); NGS Expansion; OGSA-DAI; WS plan; WS2 plan; OMII release; OMII Release; EGEE gLite alpha release; EGEE gLite release; gLite release 1.]
Grid Operation Support Centre
Web Services based National Grid Infrastructure

Lessons from the NGS?
• Do users want an NGS?
• What will they use it for?
• Will the data nodes be useful?

Research Prototypes to Production Quality Middleware?
• Research projects are not funded to do the regression testing, configuration and QA required to produce production quality middleware
• Common rule of thumb is that it requires at least 10 times more effort to take ‘proof of concept’ research software to production quality
Key issue for UK e-Science projects is to ensure that there is some documented, maintainable, robust grid middleware by the end of the 5 year £250M initiative

Open Middleware Infrastructure Institute (OMII) Vision
• To be the national provider of reliable, interoperable, open source grid middleware
• Provide one-stop portal and software repository for grid middleware
• Provide quality assured software engineering, testing, packaging and maintenance for our products
Located in Southampton with Edinburgh ‘node’ producing OGSA-DAI middleware

Digital Curation?
• In the next 5 years e-Science projects will produce more scientific data than has been collected in the whole of human history
• In 20 years we can guarantee that the operating system, the spreadsheet program and the hardware used to store the data will not exist
Research curation technologies and best practice
Need to liaise closely with individual research communities, data archives and libraries
Edinburgh with Glasgow, CCLRC and UKOLN selected as site of DCC

Digital Curation Centre
• Actions needed to maintain and utilise digital data and research results over entire life-cycle
  – For current and future generations of users
• Digital Preservation
  – Long-run technological/legal accessibility and usability
• Data curation in science
  – Maintenance of body of trusted data to represent current state of knowledge in area of research
• Research in tools and technologies
  – Integration, annotation, provenance, metadata, security…

[Diagram labels: Imperial; EBI; UCL; GLOBUS; User query; Local databases; GRID sharing; Linux farms]

BioSimGrid Project
BioSimGRID - A biosimulation GRID database
[Diagram labels: sites York, Nottingham, RAL, Oxford, Bristol, Southampton, London; Application; Analyse Data; Simulation Data; Distributed Query; 2nd Level Metadata – Describing the Results of Generic Analyses…; 1st Level Metadata – Describing the Simulation Data…; Distributed Raw Data]

e-HTPx Project
[Workflow diagram stages: START; Target Selection; Protein Production; Crystallisation; Data Collection; Phasing; Protein Structure; Structure analysis; Deposition]

NERC DataGrid Project
[Diagram labels: British Atmospheric Data Centre; British Oceanographic Data Centre; Simulations; Assimilation; “+ Remote Access = Grid Challenge”]
http://ndg.nerc.ac.uk

Partners for e-Infrastructure
• Sustainability requires long-term support
• e-Science is creating the e-Infrastructure for research
• The support of the e-Infrastructure will, over time, become the role of JISC.
• Research and Development of this e-Infrastructure will be the responsibility of OST/RCUK.

JISC Funded e-Research Projects
• Security
• Knowledge management
• Collaborative Environments
• Visualisation
• Data curation and handling
• Middleware architectures and development
• Education and Outreach

UK e-Science Grid Vision
• TeraGrid and DEISA Vision of Supercomputer Centre Grids
• SETI@HOME and ClimatePrediction.Net Vision of Public ‘Idle-Cycle’ Grids
• Particle Physicists’ Vision of truly global ‘ComputeFile’ Grids
• UK e-Science ‘Plug and Play’ Grid Vision driven by user needs
Need to provide robust, interoperable Middleware Services so that different user communities can connect the resources and research groups that they need for their type of e-Science Grid