International Computing e-Infrastructures: Past, Present and Future...
Fabrizio Gagliardi
EMEA Director Technical Computing, Microsoft Corporation
CGW'05

Outline of the talk
• Some personal introductory remarks
• Definition of e-Infrastructures
• Need for e-Infrastructures
• Recent past history
• Current situation, accomplishments and challenges
• Outlook for the future

Definition of e-Infrastructures
• Infrastructures to support widely geographically distributed communities which share problems and resources to work towards common goals
• Leveraging international network interconnectivity
• Based on a safe AAA architecture
• Need persistent software middleware (S/W is an integral part of the infrastructure)

‘Grids’: A Catch-All Marketing Term
• ‘Grids’ mean many different things to many different people/companies:
  – P2P desktop cycle-stealing
  – Linked supercomputer centers
  – Managed virtual distributed clusters
  – Internet access to giant, distributed repositories
  – Virtualization of data center IT resources
  – Out-sourcing to “utility compute centers”
  – Sharing resources distributed among different administrative domains (Ian Foster)
• For Microsoft, Grids are about data management as much as compute cycles

Need for e-Infrastructures
• Science, industry and commerce are more and more digital, process vast amounts of data and need massive computing power
• We live in a “flat” world:
  – Science is more and more an international collaboration and often requires a multidisciplinary approach
• Need to use technology for a good cause:
  – Fight the digital divide
• Industrial uptake has become essential

Recent past history
• Early examples of meta-computing and distributed computing in the '80s and '90s (CASA, I-Way, Unicore, Condor etc.)
• EU-US workshop in Annapolis in 1999 on large scientific databases: http://www.cacr.caltech.edu/euus/
• EU FP5, US Trillium and national Grids
• EU FP6, US OSG, NAREGI/Japan…

Chronology
[Timeline figure: CHEP2000; EDG start 2001; EGEE start 2004; EGEE-II start 2006; EGEE-XX 2008?; also labelled: IGO, ETICS/EGO; marker: “We are here”]

ARCADE 2002
Barcelona, 14th February 2002
“Unlimited” bandwidth, breaking the frontiers of computing: the path in Europe from FP5 to FP6.
Antonella Karlson
Research Networks Unit, DG INFSO, EC
antonella.karlson@cec.eu.int
"The views expressed in this presentation are those of the author and do not necessarily reflect the views of the European Commission"

GRIDs - IST projects (~36m Euro)
An integrated approach
[Figure: projects (EGSO, CROSSGRID, GRIP, EUROGRID, DAMIEN, GRIDSTART, GRIA, GRIDLAB, DATAGRID, DATATAG) arranged by layer (Applications; Middleware & Tools; Underlying Infrastructures) and by domain (Industry / business to Science)]

GRIDs: Examples of large testbeds
DATAGRID, CROSSGRID
• 17 European countries
• Collaboration of more than 2000 scientists
Application requirements:
• Computing > 20 TFlop/s
• Downloads > 0.5 PBytes
• Network speeds at 10 Gbps

GRIDs: Examples of large testbeds
• DATATAG (cross-Atlantic testbed, 2 Gbps)
• Links with US projects (GriPhyN, PPDG, iVDGL,…)

Current situation: accomplishments and challenges
• Many Grids around the world, very few maintained as a persistent infrastructure (the best example is the “secret” Google Grid)
• Need for public and open Grids (OSG, EGEE and related projects, NAREGI, TERAGRID and DEISA are good prototypes)
• Persistence, support, sustainability, long-term funding and easy access are the major challenges

Projects in Europe (I)
Enabling Grids for E-sciencE
• Access to IT-resources (connectivity, computing, data, instrumentation…) for scientists:
  – Providing e-Infrastructure: Géant2, EGEE, DEISA, SEE-GRID
  – Benefiting from e-Infrastructure: DILIGENT, SIMDAT, GRIDCC, CoreGRID, GridLab
  – Concertation: GRIDSTART, GridCoord
  – Grid mobility: Akogrimo
INFSO-RI-508833

Projects in Europe (II)
• Sample of National Grid projects:
  – Austrian Grid Initiative
  – DutchGrid
  – France: e-Toile, ACI Grid
  – Germany: D-Grid, Unicore
  – Grid Ireland
  – Italy: INFNGrid, GRID.IT
  – NorduGrid
  – UK: e-Science, National Grid Service, OMII, GridPP project

Policy Forums
• The e-Infrastructures Reflection Group (eIRG)
  – Mission: study and promote policies for easy and cost-effective shared use of electronic resources in Europe
  – 25 countries (government-appointed representatives), EU: 2 members
  – White Papers
• European Strategy Forum on Research Infrastructures (ESFRI)
  – Role: to support a coherent approach to policy-making on research infrastructures in Europe, and to act as an incubator for international negotiations about concrete initiatives
  – Representatives of the 25 EU Member States, appointed by Research Ministers, and a representative of the European Commission
• ESFRI + eIRG: European roadmap for new research infrastructures of pan-European interest (10-20 years)

GÉANT2
• GÉANT2 is the 7th generation of the pan-European research and education network, successor to the multi-gigabit research network GÉANT.
  – Official start: 1 September 2004, duration: 4 years
  – Funding: EC, national research and education networks
  – Managed by DANTE
• Goal:
  – To connect 34 countries through 30 national research and education networks (NRENs)
  – Using multiple 10 Gbps wavelengths
• Status:
  – Equipment and services currently in operation (officially inaugurated by Commissioner Reding last June)
  – Transition from the GÉANT network to GÉANT2, started in the first quarter of 2005, is gradually completing

EU Grid technology & infrastructure (I)
New Grid Research Projects in FP6
EU funding: 52 million; start: summer 2004
• GRIDCOORD: Building the ERA in Grid research
• K-WF Grid: Knowledge based workflow & collaboration
• inteliGRID: Semantic Grid based virtual organisations
• SIMDAT: Grid-based generic enabling application technologies to facilitate solution of industrial problems
• OntoGrid: Knowledge Services for the semantic Grid
• UniGridS: Extended OGSA implementation based on UNICORE
• NextGRID: EU-driven Grid services architecture for business and industry
• Akogrimo: Mobile Grid architecture and services for dynamic virtual organisations
• HPC4U: Fault tolerance, dependability for Grid
• CoreGRID: European-wide virtual laboratory for longer term Grid research, creating the foundation for next generation Grids
• DataminingGrid: Datamining tools & services
• Provenance: Trust and provenance for Grids
Legend (project types): Specific support action; Integrated project; Network of excellence; Specific targeted research project
From a talk by Ulf Dahlsten, Den Haag, Nov 2004

EU Grid technology & infrastructure (II)
Building the European eInfrastructure for research
[Timeline figure, 2000 to 2008, spanning FP5, FP6 and FP7: TEN 155 network, GÉANT network, GÉANT network (FP6); IPv6 testbeds, IPv6 actions; Grid testbeds, Grid enabled infrastructures (EGEE, DEISA, SEE-GRID,…); (other) testbeds]
Complementary to national infrastructures
From a talk by Ulf Dahlsten, Den Haag, Nov 2004

EU Grid technology & infrastructure (III)
eInfrastructure – Testbeds
• GRIDCC: Real time Grid for remote control of instruments
• MUPPET: Optical solutions for Grid infrastructures
• DILIGENT: New user communities using Grids – Digital Libraries
• EUQoS: Flexible Quality of Service Assurance
• EUROLABS: Experimental testbeds
• LOBSTER: Traffic monitoring
• Specific Support Actions:
  – IPv6TF SC: IPv6 Task Force support
Courtesy of K. Baxevanidis, EU

EU Grid technology & infrastructure (V)
eInfrastructure – achievements
• Connectivity service
  – GÉANT network: 10 Gbit/s, IPv6 enabled, 3900 research centres connected
• Computing, storage service
  – EGEE: production quality, >10000 CPUs, >5 PB storage, training, coverage of 27 countries
  – DEISA: supercomputer network, reaching 40 Tflop/s
• Testbeds
  – Rich set of technologies tested/verified (IPv6, Grids, Optical, End-to-End QoS, Security, Mobility…) and communities involved (scientific, industry)
• International links
  – USA, Russia, Mediterranean, Asia, Latin America...
From a talk by Ulf Dahlsten, Den Haag, Nov 2004

Outlook for the future

Supercomputing Goes Personal
              | 1991 | 1998 | 2005
System        | Cray Y-MP C916 | Sun HPC10000 | Shuttle @ NewEgg.com
Architecture  | 16 x Vector, 4GB, Bus | 24 x 333MHz UltraSPARCII, 24GB, SBus | 4 x 2.2GHz x64, 4GB, GigE
OS            | UNICOS | Solaris 2.5.1 | Windows Server 2003 SP1
GFlops        | ~10 | ~10 | ~10
Top500 #      | 1 | 500 | N/A
Price         | $40,000,000 | $1,000,000 (40x drop) | < $4,000 (250x drop)
Customers     | Government Labs | Large Enterprises | Every Engineer & Scientist
Applications  | Classified, Climate, Physics Research | Manufacturing, Energy, Finance, Telecom | Bioinformatics, Materials Sciences, Digital Media

The Continuing Trend Towards Decentralized, Networked Resources
• Grids of personal & departmental clusters
• Personal workstations & departmental servers
• Minicomputers
• Mainframes

Leverage IT Industry’s Existing R&D
• Parallel applications development
  – High-productivity IDEs
  – Integrated debugging/profiling/tracing/analysis
  – Code designer wizards
  – Concurrent programming frameworks
  – Platform optimizations
  – Dynamic, profile-guided optimization
  – New programming abstractions
• Digital experimentation
  – Collaboration-enhanced Office productivity tools
  – Structure experiment data and derived results in a manner appropriate for human reading/reasoning (as opposed to optimizing for query processing and/or storage efficiency)
  – Enable collaboration among colleagues
  – (Scientific) workflow environments
  – Automated orchestration
  – Visual scripting
  – Provenance
• Distributed systems issues
  – Web Services & HPC grids
  – Security
  – Interoperability
  – Scalability
  – Dynamic Systems Management
  – Self (re)configuration & tuning
  – Reliability & availability
• RDMS + data mining
  – Ease-of-use
  – Advanced indexing & query processing
  – Advanced data mining algorithms

Scientific Information Worker: Past and Future
Past
• Buy lab equipment
• Keep lab notebook
• Run experiments by hand
• Assemble & analyze data (using stat pkg)
• Collaborate by phone/email; write up results with LaTeX
• Metaphor: Physical experimentation
• “Do it yourself”
• Lots of disparate systems/pieces
Future
• Buy hardware & software
• Automatic provenance
• Workflow with 3rd party domain packages
• Excel & Access/SQL-Server
• Office tool suite with collaboration support
• Metaphor: Digital experimentation
• Turn-key desktop supercomputer
• Single integrated system

Where will Grids be in 5 years?
• Like ES, AI, networking and OS in the past, they will disappear from the hot research (and hype) space and become mainstream technology
• Major Grids already work in production (EGEE: 18’000 computers, Google: 100’000 computers?...)
• Major IT vendors will integrate Grid middleware in their standard products (industrial uptake)
• Computing and data resources will become commodities on the Internet
• ISPs will offer a wide range of Grid-based services; a fully mature market will develop for these services
• The result will be tremendous computing and data processing power, which will enable a new set of scientific applications and generate large revenues for business applications
• A potential leveler for worldwide science and the economy => the digital divide could be moderated

… And time will tell how wrong we are in our predictions now
See you back here next year!