What is e-Science?
Dr. Dave Berry
Research Manager
www.nesc.ac.uk
19th November 2003

Foundation for e-Science
• e-Science methodologies will rapidly transform science, engineering, medicine and business
• driven by exponential growth (×1000/decade)
• enabling a whole-system approach
[Diagram labels: X, HPC, software, Grid, sensor nets, instruments, colleagues, mass storage, Shared data. Diagram derived from Ian Foster's slide.]

The Grid
• Distributed computing model
• Platform- and protocol-neutral standards
• Resource virtualisation and resource sharing
    • Hardware, storage, network, data, function, instruments
• Service oriented model
    • Discovery
    • Negotiated access and allocation
    • Introspection and management of state
• Unlimited resources
• Dependability
• Performance and scalability
• Community driven standards process
    • Global Grid Forum (GGF)
• Open source reference implementations (Globus, OGSA-DAI)

Exponential Growth
• Gilder's Law (32X in 4 yrs)
• Storage Law (16X in 4 yrs)
• Moore's Law (5X in 4 yrs)
Source: Triumph of Light, Scientific American, George Stix, January 2001

It's Easy to Forget How Different 2003 is From 1993
• Enormous quantities of data: Petabytes
    • For an increasing number of communities, the gating step is not collection but analysis
• Ubiquitous Internet: >100 million hosts
    • Collaboration & resource sharing the norm
    • Security and Trust are crucial issues
• Ultra-high-speed networks: >10 Gb/s
    • Global optical networks
    • Bottlenecks: last kilometre & firewalls
• Huge quantities of computing: >100 Top/s
    • Moore's law gives us all supercomputers
    • Organising their effective use is the challenge
• Moore's law everywhere
    • Instruments, detectors, sensors, scanners, …
    • Organising their effective use is the challenge
Derived from Ian Foster's slide at SSDBM, July 03

The Emergence of Global Knowledge Communities
Slide from Ian Foster's SSDBM 03 keynote

Global Knowledge Communities Often Driven by Data: E.g., Astronomy
No. & sizes of data sets as of mid-2002, grouped by wavelength
• 12 waveband coverage of large areas of the sky
• Total about 200 TB data
• Doubling every 12 months
• Largest catalogues near 1B objects
Data and images courtesy Alex Szalay, Johns Hopkins

Database Growth
[Chart: database growth. Bases: 41,073,690,490]

PDB Content Growth
[Chart: PDB content growth]

Challenging Requirements
• Dynamic formation and management of virtual organisations
• Online negotiation of access to services: who, what, why, when, how
• Configuration of applications and systems able to deliver multiple qualities of service
• Autonomic management of distributed infrastructures, services, and applications

Open Grid Services Architecture
• Share resource, Access resource, Manage resource
• Web Services: Continuous Availability, Applications on demand, Secure and universal access, Business integration
• Grid Protocols: Resources on demand, Global Accessibility, Vast resource scalability
See: The Physiology Of The Grid …

Middleware Architecture
[Layered diagram, top to bottom]
• Applications for X Research
• Simulation, Analysis & Integration Technology for X
• OGSA services: Brokering, Integration, VOs, Transactions, Reservation, Discovery, Replication, Workflow, Queueing, Registry, Data Access, Accounting, CMM/WSDM, Provisioning, Authorisation, Execution, WS-Agreement
• OGSI: Interface to Grid Infrastructure
• Distributed Compute, Data & Storage Resources

Three-way Alliance
Multi-national, Multi-discipline, Computer-enabled Consortia, Cultures & Societies
• Theory: Models & Simulations → Shared Data
• Experiment & Advanced Data Collection → Shared Data
• Computing Science: Systems, Notations & Formal Foundation → Process & Trust
• Requires Much Engineering, Much Innovation
• Changes Culture, New Mores, New Behaviours
• New Opportunities, New Results, New Rewards

Take-home message
• Technology enables Grids
    • More Data, More Compute Power, More Sensors, …
• Collaboration essential
    • Combining approaches
    • Combining skills
    • Sharing resources
• (Structured) Data is the language of Collaboration
    • Data Access & Integration a Ubiquitous Requirement
• Many hard technical challenges
    • Scale, heterogeneity, distribution, dynamic variation
    • Intimate combinations of data and computation
    • With unpredictable (autonomous) development of both
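
Note on the Exponential Growth figures
The growth rates quoted in the slides (32X, 16X and 5X in 4 years, ×1000 per decade, and astronomy data doubling every 12 months) are easier to compare when converted to doubling times. The following is a minimal sketch of that arithmetic, assuming steady exponential growth throughout; the function names are illustrative and do not come from the talk, and 1 PB is taken as 1000 TB.

```python
import math


def doubling_time_years(factor, period_years):
    """Doubling time implied by growing `factor`-fold over `period_years`."""
    return period_years * math.log(2) / math.log(factor)


def years_to_reach(initial, target, doubling_years):
    """Years for `initial` to grow to `target` at a fixed doubling time."""
    return doubling_years * math.log2(target / initial)


# Growth rates as quoted on the slides: (factor, period in years).
quoted_rates = {
    "Gilder's Law": (32, 4),
    "Storage Law": (16, 4),
    "Moore's Law": (5, 4),
    "Overall (x1000/decade)": (1000, 10),
}

for name, (factor, period) in quoted_rates.items():
    months = 12 * doubling_time_years(factor, period)
    print(f"{name}: doubles about every {months:.1f} months")

# Astronomy example: about 200 TB in mid-2002, doubling every 12 months.
# Assumes 1 PB = 1000 TB and that the doubling rate stays constant.
years = years_to_reach(200, 1000, 1.0)
print(f"200 TB reaches 1 PB after about {years:.1f} years")
```

Under these assumptions, 32X in 4 years corresponds to a doubling roughly every 9.6 months and 16X in 4 years to a doubling every 12 months, which matches the astronomy figure quoted above.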