HPC in France and Europe: Overview of GENCI and PRACE
Stéphane REQUENA, CTO, GENCI
Franco-British Workshop on Big Data in Science, 6 November 2012

Supercomputing: driving Science and Industry through simulation
- Aerospace
- Environment: weather / climatology, pollution / ozone hole
- Ageing society: medicine, biology
- Materials / information technology: spintronics, nano-science
- Energy: plasma physics, fuel cells, virtual power plant
- Automotive
- Finance
- Multimedia

HPC is a "key technology"
- Supercomputers: an indispensable tool for solving the most challenging problems via simulation
- Access to world-class computers: essential to be competitive in science and engineering
- Providing competitive HPC services: a continuous endeavour
- This has been acknowledged by leading industrial nations:
  - Europe: PRACE
  - France: GENCI
[Illustration caption: a $200M effort by 6 agencies]

GENCI: Grand Equipement National de Calcul Intensif
[Shareholding: French State 49%, CEA 20%, CNRS 20%, Universities 10%, Inria 1%]
Missions:
- To implement a national HPC strategy in France and to provide the 3 national academic HPC centres with supercomputers
- To contribute to the creation of the European HPC ecosystem
- To promote numerical simulation and HPC in academia and industry

GENCI: powering the 3 national HPC centres
- Coordination and optimisation of investments in HPC
- Common allocation of HPC hours via calls for proposals
- The 3 centres: IDRIS (Orsay), TGCC (Bruyères-le-Châtel), CINES (Montpellier)
Resources are divided per thematic committee (CT), each covering a scientific area:
- CT1: Environment
- CT2: CFD, reactive & complex flows
- CT3: Biomedical and health
- CT4: Astrophysics and geophysics
- CT5: Theoretical and plasma physics
- CT6: Computer science, algorithmics and mathematics
- CT7: Molecular systems and biology
- CT8: Quantum chemistry and molecular simulation
- CT9: Physics, chemistry and materials
- CT10: New and transverse applications
[Pie chart: allocated hours divided per CT]

A huge effort for increasing French HPC capacities
Systems deployed (category / features):
- Hybrid cluster: Bull - 103 Tflop/s (plus 192 Tflop/s SP on GPUs)
- Cluster of SMP fat nodes: IBM x3750M4 cluster - 233 Tflop/s
- MPP: IBM BlueGene/Q - 836 Tflop/s
- Cluster of SMP thin nodes: SGI Altix ICE - 267 Tflop/s
- Cluster of SMP thin, hybrid and fat nodes: Bull bullx cluster - 2.0 Pflop/s
Specifications of the CCRT extension (credit: CEA/DIF, C. Ménaché, 11/06/2008):
- Provide about 100 Tflop/s complementing the current Bull production platform of the CCRT:
  - using latest-generation "standard" technologies,
  - based on Bull open-source products (Linux, Lustre, OFED),
  - offering the same working environments to users.
- Give access to alternative technologies such as GPUs on part of the machine, through full integration, in order to:
  - gain a significant acceleration factor on targeted codes,
  - validate a research platform to anticipate future production machines.

PRACE: a European Research Infrastructure (RI) on the ESFRI list
- The PRACE RI has been in operation since April 2010:
  - 1st Council on June 9, 2010; PRACE AISBL created with 20 countries, head office in Brussels
  - Now 25 member countries
- The PRACE RI has been providing services since August 2010:
  - Now 6 Tier-0 systems available
  - 4.3 billion core hours awarded to 159 projects through a single pan-European peer-review process
- Funding secured for 2010-2015:
  - €400 million from France, Germany, Spain and Italy, provided as Tier-0 services on a TCO basis
  - €130 million of additional funding: €70 million from EC FP7 preparatory and implementation projects plus €60 million from PRACE members
- Technical, organisational and legal support for PRACE (FP7 projects):
  - Prepared the creation of the AISBL as a legal entity
  - Established the PRACE brand
  - Provided extensive HPC training
  - Deployed and evaluated promising architectures
  - Ported and petascaled applications
[Photo: PRACE-3IP kick-off in Paris]

2012: PRACE is providing nearly 15 PFlop/s...
- JUQUEEN: IBM BlueGene/Q at GCS partner FZJ (Forschungszentrum Jülich)
- MareNostrum: IBM at BSC
- FERMI: IBM BlueGene/Q at CINECA
- CURIE: Bull bullx at GENCI partner CEA
- HERMIT: Cray at GCS partner HLRS (High Performance Computing Center Stuttgart)
- SuperMUC: IBM at GCS partner LRZ (Leibniz-Rechenzentrum)

PRACE boosts Science to face the tempest!
The UPSCALE project aims to continue developing our climate modelling capability and goes for even higher global resolution, all the way to 12 km, which is not even envisioned for Met Office global weather forecasting before 2015.
AWARD: 144 million CPU hours on the Cray XE6 system Hermit at GCS@HLRS
Credits: Prof. Pier Luigi Vidale, Univ. of Reading, U.K. Also featured in Nature Climate Change, July 2012.

CURIE: the French PRACE supercomputer
- CURIE, France's commitment to PRACE, is overseen by GENCI
- Hosted at the TGCC and operated by CEA DAM teams
- A modular and balanced architecture by Bull: a cluster of SMP nodes with fat, thin and hybrid nodes
- Complementary to the other PRACE Tier-0 systems
- Fully available since March 8, 2012
- Named in honour of Marie Curie
- Global peak performance of 2 PFlop/s: more than 92,000 Intel cores, 360 TB of memory, 15 PB of Lustre storage at 250 GB/s, 120 racks, less than 200 m², 2.5 MW, 50 km of cables

Example of recent results on CURIE: understanding the evolution of the Universe (1/2)
- Grand challenge conducted by the Observatoire de Paris and the DEUS Consortium (http://www.deus-consortium.org)
- Goal: perform 3 FULL-Universe simulations, from the Big Bang to nowadays, using 3 different dark energy distributions
  - Influence of dark matter on the evolution of the Universe
  - Directly linked with the 2011 Physics Nobel Prize
  - Data will be used to feed the next EU EUCLID telescope
- Unprecedented HPC requirements:
  - More than 550 billion particles, an 8192³ mesh in a 21 h⁻¹ Gpc box
  - RAMSES code (CEA) and a dedicated workflow toolchain
  - 76k cores, more than 300 TB of main memory
  - Specific memory, MPI and parallel I/O optimisations

Example of recent results on CURIE: understanding the evolution of the Universe (2/2)
- A WORLDWIDE record, completed 2 months ago:
  - First full-Universe ΛCDM and RPCDM simulations, performed on the CURIE thin nodes
  - 3 runs for a total of 92 hours elapsed on 76,032 cores; the last run lasted 29 hours without any failure
  - Wow, CURIE is very stable!
- A strong need for sustained I/O rates on the Lustre scratch file system (a minimal parallel I/O sketch follows below)
- We have here a Big Data problem:
  - A total of 10 PB of data (scratch and raw data) generated
  - 4 PB of raw results after the simulations
  - 1.2 PB of refined data after post-processing for the 3 dark energy simulations, which need to be made available to scientists worldwide!
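The DEUS team's own I/O layer is not shown in the slides; as a generic, hypothetical illustration of the kind of parallel I/O optimisation mentioned above, the following Python sketch uses mpi4py collective MPI-IO so that every rank writes a disjoint block of one shared file in a single coordinated operation. The file name, block size and data are invented for the example.

    # Minimal sketch of collective parallel I/O with MPI-IO via mpi4py.
    # This illustrates the general technique only; it is NOT the RAMSES/DEUS
    # I/O layer. The file name, block size and data are invented.
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    n_local = 1_000_000                              # elements owned by this rank
    data = np.full(n_local, rank, dtype=np.float64)  # stand-in for simulation output

    # Every rank writes its own disjoint block of one shared file.
    fh = MPI.File.Open(comm, "snapshot.dat",
                       MPI.MODE_CREATE | MPI.MODE_WRONLY)
    offset = rank * n_local * data.itemsize          # byte offset of this rank's block
    fh.Write_at_all(offset, data)                    # collective write: ranks coordinate
    fh.Close()

    if rank == 0:
        total_gb = size * n_local * data.itemsize / 1e9
        print("wrote %.2f GB across %d ranks" % (total_gb, size))

Collective calls such as Write_at_all let the MPI-IO layer aggregate many small requests into large, aligned writes, which is what a Lustre scratch file system needs in order to deliver sustained bandwidth; on a real system the output directory would typically also be striped across many OSTs (for example with lfs setstripe).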
Explosion of computational data: another example from climatology
- Evolution of the global climate
- 5th IPCC campaign; the French production, on a dedicated NEC SX-9, generated more than 1 TB/day
- Strong issues with storage, post-processing and archiving of data
- And the future is:

                                      CMIP5     CMIP6      CMIP7
  Year                                2012      2017       2022
  Power factor                        1         30         1000
  Npp                                 200       357        647
  Resolution [km]                     100       56         31
  Number of mesh points [millions]    3.2       18.1       108.4
  Ensemble size                       200       357        647
  Number of variables                 800       1068       1439
  Interval of 3D output [hours]       6         4          3
  Years simulated                     90000     120170     161898
  Storage density                     0.00002   0.00002    0.00002
  Archive size (atmosphere) [PB]      5.31      143.42     3766.99

One conclusion: data is exploding
- Observational / experimental data:
  - Particle accelerators and detectors (LHC@CERN)
  - Genome sequencers and personalised medicine
  - Next-generation satellites and (radio)telescopes
  - Sensors in weather forecasting / climatology or oil & gas
  - Finance, insurance, ...
- Computational data:
  - Increase in HPC resources (PRACE = 15 PF in 2012)
  - Increase in the space and time resolution of models
  - Multi-physics and multi-scale simulations
  - Rise of uncertainty quantification, ensemble simulations, ...
- With problems relative to:
  - Size of data (number and size of files, (un)structured data, formats, ...)
  - Uncertainties of data and fault tolerance
  - Metadata issues
  - Post-processing (20% of the time)
  - Dissemination of refined data to worldwide communities over decades
- That means deploying PERENNIAL and SUSTAINABLE Research Infrastructures

Another conclusion: people are aware of this!
- On the hardware and system software side:
  - Multi-level storage with new I/O devices: a mix of flash-based memory (SSD, PCM, ...) with hard drives
  - Asynchronous I/O and active I/O (servers embedded into I/O controllers)
  - Next generation of parallel file systems (Lustre, GPFS, Xyratex, ...)
  - Flops will be "almost free" -> post-processing at the same time as computation
- A lot of European projects and R&D initiatives:
  - PRACE implementation projects: data management, remote visualisation, portals, ...
  - EUDAT: data services between computing/data centres and end-user communities
  - EESI2: a cartography of Exascale R&D efforts
  - The French INRIA BlobSeer R&D project, ...
- But a lot of applications will need to be rewritten or adapted:
  - The complete I/O strategy needs to be rethought
  - New methods for data analysis and exploration are needed (MapReduce, Hadoop, NoSQL, ...?); see the sketch at the end of this document
  - Raw data will stay in the computing/data centre and ONLY refined data will go out
  - Network bandwidth will need to increase
  - Use of remote visualisation

HPC: en route to international synergies
- HPC enables scientific discoveries and innovation for both research and industry
- We mustn't follow the trends, we must anticipate them!
  - To face future societal and industrial challenges
  - To prepare users for future parallel architectures and applications
  - To increase the involvement of scientists and engineers in these techniques
- Global European HPC ecosystem integration
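Appendix: as referenced in the "Another conclusion" slide, here is a minimal, self-contained sketch of the MapReduce style of analysis named there (map to key/value pairs, shuffle by key, reduce per key), written in plain Python rather than in Hadoop. The sample records and field names are invented purely for illustration.

    # Minimal MapReduce-style sketch in plain Python (not Hadoop): map each
    # record to (key, value) pairs, shuffle by key, then reduce per key.
    # The sample records and field names are invented for illustration only.
    from collections import defaultdict

    records = [
        {"model": "LCDM",  "snapshot": 1, "gigabytes": 4.0},
        {"model": "LCDM",  "snapshot": 2, "gigabytes": 5.5},
        {"model": "RPCDM", "snapshot": 1, "gigabytes": 4.2},
    ]

    def map_phase(record):
        """Emit (key, value) pairs: here, data volume per cosmological model."""
        yield record["model"], record["gigabytes"]

    def reduce_phase(key, values):
        """Aggregate all values for one key: here, a simple sum."""
        return key, sum(values)

    # Shuffle: group the mapped values by key.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_phase(record):
            groups[key].append(value)

    # Reduce each group independently (trivially parallel in a real framework).
    results = dict(reduce_phase(k, v) for k, v in groups.items())
    print(results)  # {'LCDM': 9.5, 'RPCDM': 4.2}

In a real framework such as Hadoop, the map and reduce phases run in parallel across many nodes and the shuffle moves data over the network, but the programming model is the same.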