The National e-Science Centre and the UK e-Science Programme
Muffy Calder, University of Glasgow
http://www.nesc.ac.uk
http://www.dcs.gla.ac.uk/~muffy

An overview
• e-Science – challenges and opportunities
• Grid computing – challenges and opportunities
• Activities
  – UK and international
  – Scotland and the National e-Science Centre

e-Science
'e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.'
'e-Science will change the dynamic of the way science is undertaken.'
John Taylor, Director General of Research Councils, Office of Science and Technology

The drivers for e-Science
• More data
  – instrument resolution and laboratory automation
  – storage capacity and data sources
• More computation
  – computations available; simulations doubling every year
• Faster networks
  – bandwidth; the need to schedule
• More interplay and collaboration
  – between scientists, engineers, computer scientists etc.
  – between computation and data

The drivers for e-Science (continued)
• Exploration of data and models – in silico discovery
• Floods of public data
  – gene sequence data doubling every 9 months
  – searches required doubling every 4-5 months

In summary
Shared data, information and computation by geographically dispersed communities.

Current model: ad-hoc client-server
[Diagram: each scientist connects directly to individual experiments, analysis tools, HPC, computing and storage resources.]

The vision: computation and information utility
[Diagram: scientists reach experiments, analysis, computing and storage through a common layer of middleware.]

e-Science examples
• Bioinformatics / functional genomics
• Collaborative engineering
• Medical / healthcare informatics
• Earth observation systems (flood monitoring)
• TeleMicroscopy
• Virtual observatories
• Robotic telescopes
• Particle physics at the LHC
  – EU DataGrid: particle physics, biology & medical imaging, Earth observation
  – GridPP, ScotGrid
  – AstroGrid

Multi-disciplinary simulations
[Diagram: an aircraft simulation coupling wing models (lift, drag, responsiveness), airframe models, stabilizer models (deflection, responsiveness), human/crew models (accuracy, perception, stamina, reaction times, standard operating procedures), engine models (thrust, reverse thrust, responsiveness, fuel consumption) and landing gear models (braking, steering, traction, damping).]
Whole-system simulations are produced by coupling all of the sub-system simulations (see the coupling sketch below).

Multi-disciplinary simulations: the US National Air Space Simulation Environment
[Diagram: NASA centres (GRC, ARC, LaRC) running 50,000 engine runs, 44,000 wing runs, 66,000 stabilizer runs, 48,000 human crew runs, 22,000 airframe impact runs and 132,000 landing/take-off gear runs, driven by the 22,000 commercial US flights a day. Simulation drivers (FAA operations data, weather data, airline schedule data, digital flight data, radar tracks, terrain data and surface data) are being pulled together under the NASA AvSP Aviation ExtraNet (AEN) into a Virtual National Air Space (VNAS).]
Many aircraft, flight paths, airport operations and the environment are combined to produce a virtual national airspace.
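The coupling idea in the two slides above can be illustrated with a minimal sketch: independently developed sub-system models exchange their outputs on a shared time axis. The class names, dynamics and numbers below are invented for illustration and bear no relation to the real NASA models.

```python
# Toy sketch of coupling sub-system simulations into a whole-system run.
# Hypothetical models only; each one advances a step using the others'
# outputs from the previous step.

from dataclasses import dataclass


@dataclass
class EngineModel:
    thrust: float = 0.0          # N, hypothetical state

    def step(self, throttle: float, dt: float) -> None:
        # toy first-order lag towards a commanded thrust
        target = 200_000.0 * throttle
        self.thrust += (target - self.thrust) * min(1.0, dt / 2.0)


@dataclass
class WingModel:
    drag_coeff: float = 0.02

    def drag(self, velocity: float) -> float:
        # toy quadratic drag law
        return self.drag_coeff * velocity ** 2


@dataclass
class AirframeModel:
    velocity: float = 80.0       # m/s
    mass: float = 70_000.0       # kg

    def step(self, thrust: float, drag: float, dt: float) -> None:
        self.velocity += (thrust - drag) / self.mass * dt


def whole_system_simulation(duration: float = 60.0, dt: float = 0.1) -> float:
    """Couple the sub-system models over a shared time axis."""
    engine, wing, airframe = EngineModel(), WingModel(), AirframeModel()
    t = 0.0
    while t < duration:
        engine.step(throttle=0.8, dt=dt)
        airframe.step(thrust=engine.thrust,
                      drag=wing.drag(airframe.velocity), dt=dt)
        t += dt
    return airframe.velocity


if __name__ == "__main__":
    print(f"velocity after 60 s: {whole_system_simulation():.1f} m/s")
```

In a Grid setting the point is that each sub-model could run at a different site, with the middleware moving the coupled state between them at each step.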
Global in-flight engine diagnostics
[Diagram: in-flight data travels over the airline's global network (e.g. SITA) to a ground station and on to the DS&S Engine Health Center data centre; results reach the maintenance centre by internet, e-mail and pager.]
Distributed Aircraft Maintenance Environment: Universities of Leeds, Oxford, Sheffield & York

LHC computing (assumes 1 PC ≈ 25 SpecInt95)
• Online system: one bunch crossing every 25 ns, 100 triggers per second, each event ~1 MByte; ~PByte/sec off the detector, ~100 MByte/sec to the offline farm (~20,000 PCs)
• Tier 0: CERN Computer Centre, >20,000 PCs; data arrives at ~100 MByte/sec and is distributed at ~Gbit/sec (or by air freight)
• Tier 1: regional centres (US, Italian, French, RAL), linked at ~Gbit/sec
• Tier 2: centres of ~1,000 PCs each, e.g. ScotGRID++
• Tier 3: institute servers (~200 PCs) holding physics data caches, serving workstations (Tier 4) at 100-1000 Mbit/sec
• Physicists work on analysis 'channels'; each institute has ~10 physicists working on one or more channels, and the data for these channels is cached by the institute server

Emergency response teams
• bring sensors, data, simulations and experts together
  – wildfire: predict the movement of a fire and direct fire-fighters
  – also earthquakes, peacekeeping forces, battlefields, …
• National Earthquake Simulation Grid
• Los Alamos National Laboratory: wildfire

Earth observation
• ENVISAT
  – €3.5 billion
  – 400 terabytes/year
  – 700 users
• ground deformation prior to a volcanic eruption

To achieve e-Science we need… the Grid

The Grid: the vision
[Diagram: computing resources, instruments, data, knowledge and people are brought together by the Grid to turn a complex problem into a solution.]

The Grid
• Computing cycles, data storage, bandwidth and facilities viewed as commodities
  – like the electricity grid
• Software and hardware infrastructure to support a model of computation and information utilities on demand
  – middleware
• An emergent infrastructure
  – delivering dependable, pervasive and uniform access to a set of globally distributed, dynamic and heterogeneous resources

The Grid
Supporting computations with differing characteristics:
• High throughput
  – unbounded, robust, scalable; uses otherwise idle machines
• On demand
  – short-term, bounded, with hard timing constraints
• Data intensive
  – data computed, measured, stored and recalled, e.g. particle physics
• Distributed supercomputing
  – classical CPU- and memory-intensive computation
• Collaborative
  – mainly in support of human-human interaction, e.g. the Access Grid

Grid data
• generated by a sensor
• from a database
• computed on request
• measured on request

Grid services
• Deliver bandwidth, data and computation cycles
• Architecture and basic services (composed in the sketch below) for:
  – authentication
  – authorisation
  – resource scheduling and co-scheduling
  – monitoring, accounting and payment
  – protection and security
  – quality-of-service guarantees
  – secondary storage
  – directories
  – interoperability
  – fault-tolerance
  – reliability
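To see how the basic services listed above fit together from a user's point of view, here is a deliberately simplified sketch. The GridSession and Job classes and every method name are hypothetical placeholders, not part of Globus, SRB, Condor or any real middleware.

```python
# Hypothetical client-side job lifecycle built from the basic services listed
# above: authentication, directories, resource scheduling, monitoring and
# accounting. All names are invented for illustration.

from dataclasses import dataclass


@dataclass
class Job:
    executable: str
    cpu_hours: float
    site: str = ""
    state: str = "PENDING"


class GridSession:
    def __init__(self, credential: str):
        self.credential = credential           # authentication / authorisation
        self.directory = {"scotgrid": 0.3,     # directory service: site -> load
                          "ral": 0.7}
        self.usage = 0.0                       # accounting

    def schedule(self, job: Job) -> Job:
        # resource scheduling: pick the least-loaded site from the directory
        job.site = min(self.directory, key=self.directory.get)
        job.state = "RUNNING"
        return job

    def monitor(self, job: Job) -> str:
        # monitoring: a real system would poll the remote resource;
        # here the job simply completes and its usage is accounted for
        job.state = "DONE"
        self.usage += job.cpu_hours
        return job.state


session = GridSession(credential="x509-proxy")   # placeholder credential
job = session.schedule(Job(executable="reconstruct_events", cpu_hours=12.0))
print(job.site, session.monitor(job), f"{session.usage} CPU hours billed")
```

The point of the sketch is the separation of concerns: the user presents one credential, the directory and scheduler decide where work runs, and monitoring and accounting close the loop, which is what the middleware layer in the earlier "vision" slide is meant to provide.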
But… has the emperor got any clothes on?
• Maybe a string vest and pants:
  – the semantic web
    • machine-understandable information on the web
  – virtual organisations
  – pervasive computing
    '… a billion people interacting with a million e-businesses with a trillion intelligent devices interconnected' – Lou Gerstner, IBM (2000)

National and international activities
• USA
  – NASA Information Power Grid
  – DOE Science Grid
  – NSF National Virtual Observatory, etc.
• UK e-Science Programme
• Japan – Grid Data Farm, ITBL
• Netherlands – VLAM, PolderGrid
• Germany – UNICORE, Grid proposal
• France – Grid funding approved
• Italy – INFN Grid
• Eire – Grid proposals
• Switzerland – network/Grid proposal
• Hungary – DemoGrid, Grid proposal
• …

UK e-Science centres
Edinburgh, Glasgow, Newcastle, Belfast, Manchester, DL, Oxford, Cardiff, RAL, Southampton, Cambridge, London, Hinxton
AccessGrid: always-on video walls

National e-Science Centre
• Edinburgh + Glasgow Universities
  – Physics & Astronomy × 2
  – Informatics, Computing Science
  – EPCC
• £6M EPSRC/DTI + £2M SHEFC over 3 years
• e-Science Institute
  – visitors, workshops, co-ordination, outreach
• middleware development
  – 50:50 industry : academia
• UK representation at the GGF
• co-ordination of the regional centres

Generic Grid middleware
• All e-Science Centres will donate resources to form a UK 'national' Grid
• All centres will run the same Grid software, based on Globus*, SRB and Condor
• Work with the Global Grid Forum (GGF) and major computing companies
*Globus – a project to develop open-architecture, open-source Grid software; Globus Toolkit 2.0 is now available.

How can you participate?
• Informal collaborations
• Formal collaborative LINK-funded projects
• Attend seminars at the eSI
• Suggest visitors for the eSI
• Suggest themes and topics for the eSI

eSI forthcoming events
Friday, March 15 – Blue Gene
Monday, March 18 – IWOX: Introductory Workshop on XML Schema
Tuesday, March 19 – AWOX: Advanced Workshop on XML: XML Schema, Web Services and Tools
Thursday, March 21 – GOGO: Getting OGSA Going 1
Monday, April 8 – Software Management for Grid Projects
Monday, April 15 – DAI: Database Access and Integration
Wednesday, April 17 – DBFT Meeting
Monday, April 29 – The Information Grid
Sunday, July 21 – GGF5 / HPDC11

Watch the web pages… www.nesc.ac.uk
Thank you!