GENCI data policy

HPC in France and Europe
Overview of GENCI and PRACE
Stéphane REQUENA, CTO GENCI
Supercomputing - driving Science and
Industry through simulation
Aerospace
Environment
Weather / Climatology
Pollution / Ozone Hole
Ageing Society
Medicine
Biology
Materials / Inf. Tech
Spintronics
Nano-science
Energy
Plasma Physics
Fuel Cells
Virtual power plant
Automotive
Finance
Multimedia
HPC is a «key technology»
 Supercomputers : an indispensable tool to solve the
most challenging problems via simulations
A 200M$ effort by 6 agencies
 Access to world class computers : essential to be
competitive in science and engineering
 Providing competitive HPC services : a continuous
endeavor
 This has been acknowledged by leading industrial
nations
→ Europe : PRACE
→ France : GENCI
GENCI
Grand Equipement National de Calcul Intensif
[Pie chart: GENCI shareholding - 49 %, 20 %, 20 %, 10 %, 1 %]
Missions :
 To implement a national HPC strategy in France and to provide the
3 national HPC academic centres with supercomputers
 To contribute to the creation of the European HPC ecosystem
 To promote numerical simulation and HPC in academia and industry
GENCI : powering the 3 national
HPC centres
 Coordination and optimization
of investments in HPC
 Common allocation of HPC hours via call
for proposals
The 3 centres: IDRIS (Orsay), TGCC (Bruyères-le-Châtel), CINES (Montpellier)
CT - Scientific area:
 1 - Environment
 2 - CFD, reactive & complex flows
 3 - Bio medical and health
 4 - Astrophysics and geophysics
 5 - Theoretical and plasma physics
 6 - CS, algorithmic and mathematics
 7 - Molecular systems and biology
 8 - Quantum chemistry and molecular simulation
 9 - Physics, chemistry and materials
 10 - New and transverse applications

[Pie chart: resources divided per CT / scientific area]
A huge effort for increasing French
HPC capacities
Category / Features:
 Hybrid cluster: BULL - 103 Teraflop/s (+ 192 GPU Teraflop/s SP)
 Cluster of SMP, fat nodes: IBM x3750M4 cluster - 233 Teraflop/s
 MPP: IBM BG/Q - 836 Teraflop/s
 Cluster of SMP, thin nodes: SGI Altix ICE - 267 Teraflop/s
 Cluster of SMP, thin, hybrid and fat nodes: BULL Bullx cluster - 2.0 Petaflop/s

Specifications of the extension (CEA/DIF - C. Ménaché, 11/06/2008):
Provide about 100 Tflop/s complementing the current BULL production platform of the CCRT:
 using latest-generation "standard" technologies,
 based on open-source products (Linux, Lustre, OFED),
 offering users the same working environments.
Give access to alternative technologies such as GPUs on part of the machine, through a full integration, in order to:
 achieve a significant speed-up on targeted codes,
 validate a research platform to anticipate future production machines.
PRACE: a European Research Infrastructure (RI)
on the ESFRI list
 PRACE RI has been in operation since April 2010
• PRACE AISBL created with 20 countries, head office in Brussels (1st Council on June 9, 2010)
• Now 25 member countries
 PRACE RI has been providing services since August 2010
• Now 6 Tier0 systems available
• 4.3 billion core hours awarded to 159 projects through a single pan-European peer review process
 Funding secured for 2010-2015
• 400 Million€ from France, Germany, Spain and Italy, provided as Tier0 services on a TCO basis
• 130 Million€ additional funding = 70 Million€ from EC FP7 preparatory and implementation projects + 60 Million€ from PRACE members:
 Technical, organizational and legal support for PRACE
• Prepared the creation of the AISBL as a legal entity
• Established the PRACE brand
• Provided extensive HPC Training
• Deployed and evaluated promising architectures
• Ported and petascaled applications
PRACE-3IP kick-off in Paris
2012: PRACE is providing nearly 15 PFlop/s...
 JUQUEEN: IBM BlueGene/Q at GCS partner FZJ (Forschungszentrum Jülich)
 MareNostrum: IBM at BSC
 FERMI: IBM BlueGene/Q at CINECA
 CURIE: Bull Bullx at GENCI partner CEA
 HERMIT: Cray at GCS partner HLRS (High Performance Computing Center Stuttgart)
 SuperMUC: IBM at GCS partner LRZ (Leibniz-Rechenzentrum)
PRACE boosts Science to face the tempest!
The UPSCALE project aims to continue developing our climate modelling capability and goes for even higher global resolution, all the way to 12 km, which is not even envisioned for Met Office global weather forecasting before 2015.
AWARD : 144M CPU HOURS
Credits: Prof. Pier Luigi Vidale, Univ. Reading, U.K. - Cray XE6 system Hermit in GCS@HLRS
Also in NATURE Climate Change, July 2012
[Photo: PRACE-1IP kick-off meeting]
CURIE : the French PRACE
supercomputer
 CURIE, France’s commitment to PRACE, is overseen by GENCI
 Located at the TGCC and operated by CEA DAM teams
 A modular and balanced architecture by Bull
 Cluster of SMP nodes with fat, thin and hybrid nodes
 Complementary to other PRACE Tier0 systems
 Fully available since March 8, 2012
In honour of Marie Curie
Global peak performance of 2 PFlop/s:
> 92 000 Intel cores, 360 TB memory, 15 PB Lustre @ 250 GB/s,
120 racks, < 200 m², 2.5 MW, 50 km of cables
Example of recent results on CURIE
Understanding the evolution of the Universe (1/2)
 Grand challenge conducted by Observatoire de Paris and the DEUS
Consortium (http://www.deus-consortium.org)
 Goal: perform 3 FULL Universe simulations, from the Big Bang to
nowadays, using 3 different dark energy distributions
 Influence of dark matter on the evolution of the Universe
 Directly linked to the 2011 Nobel Prize in Physics
 Data will be used to feed the next EU EUCLID telescope
 Unprecedented HPC requirements
 >550 billion particles, 8192³ mesh in a 21 h⁻¹ Gpc box
 RAMSES code (CEA) and a dedicated workflow toolchain
 76k cores, >300 TB of main memory
 Specific memory, MPI and parallel I/O optimisations (a collective-I/O sketch follows below)
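The parallel I/O optimisation mentioned above typically relies on collective MPI-IO, where every rank writes its slab of a snapshot through one shared file handle so the I/O layer can merge the requests into a few large, well-aligned operations on the parallel filesystem. Below is a minimal sketch of that pattern using mpi4py; it only illustrates the technique, not the actual RAMSES/DEUS toolchain, and the file name and array sizes are invented for the example.

```python
# Minimal collective MPI-IO sketch (illustrative only, not the DEUS workflow).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank owns a contiguous slab of particle positions (hypothetical size).
n_local = 1_000_000
positions = np.random.rand(n_local, 3).astype(np.float64)

# Byte offset of this rank = sum of the byte counts of all lower ranks.
offset = comm.exscan(positions.nbytes)
if offset is None:          # exscan is undefined on rank 0
    offset = 0

# Collective write: all ranks participate, letting MPI-IO aggregate the
# requests into a few large operations on the Lustre-like scratch filesystem.
fh = MPI.File.Open(comm, "snapshot.bin",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY)
fh.Write_at_all(offset, positions)
fh.Close()
```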
Example of recent results on CURIE
Understanding the evolution of the Universe (2/2)
 WORLDWIDE record, finished 2 months ago
 First FULL Universe ΛCDM, LCDM and RPCDM simulations performed on
Curie thin nodes
 3 runs for a total of 92 hours elapsed on 76 032 cores; the last run lasted 29
hours without any failure → wow, CURIE is very stable!
 A strong need for sustained I/O rates on the Lustre scratch filesystem
 We have a Big Data problem here
 A total of 10 PB of data
(scratch and raw data) generated
 4 PB of raw results after simulation
 1.2 PB of refined data after post-processing for the 3 dark energy
simulations need to be made available to worldwide scientists!
Explosion of computational data
Another example, from climatology
 Evolution of the global climate
 5th IPCC campaign, French production on a dedicated NEC SX9: > 1 TB/day
 Strong issues with storage, post-processing and archiving of data
 And the future is :
                                     CMIP5     CMIP6     CMIP7
Year                                 2012      2017      2022
Power factor                         1         30        1000
Npp                                  200       357       647
Resolution [km]                      100       56        31
Number of mesh points [millions]     3.2       18.1      108.4
Ensemble size                        200       357       647
Number of variables                  800       1068      1439
Interval of 3-D output [hours]       6         4         3
Years simulated                      90000     120170    161898
Storage density                      0.00002   0.00002   0.00002
Archive size [PB] (atmosphere)       5.31      143.42    3766.99
One conclusion
Data is exploding
 Observational/experimental data
• Particle accelerators and detectors (LHC@CERN)
• Genome sequencers and personalized medicine
• Next-gen satellites and (radio)telescopes
• Sensors in weather forecasting/climatology or oil & gas
• Finance, insurance, …
 Computational data
• Increase in HPC resources (PRACE = 15 PF in 2012)
• Increase in space and time resolution of models
• Multi-physics and multi-scale simulations
• Rise of uncertainty quantification, ensemble simulations, …
 With problems related to
• Size of data (number and size of files), (un)structured data, formats, …
• Uncertainties of data and fault tolerance
• Metadata issues
• Post-processing (20% of time)
• Dissemination of refined data to worldwide communities over decades
That means:
To deploy PERENNIAL and SUSTAINABLE Research Infrastructures
Another conclusion
People are aware of that!
 On the hardware and system software side
• Multi-level storage with new I/O devices: a mix of flash-based memory (SSD, PCM, …) with hard drives
• Asynchronous I/O and active I/O (servers embedded into I/O controllers)
• Next generation of parallel file systems (Lustre, GPFS, Xyratex, …)
• Flops will be "almost free" -> post-processing at the same time as computation
 A lot of European projects and R&D initiatives
• PRACE implementation projects: data management, remote viz, portals, …
• EUDAT: data services between computing/data centres and end-user communities
• EESI2: cartography of Exascale R&D efforts
• French INRIA BlobSeer R&D project, …
 But a lot of applications will need to be rewritten/adapted
• The complete I/O strategy needs to be rethought
• New methods for data analysis/exploration are needed (MapReduce, Hadoop, NoSQL, …?) - see the sketch after this list
• Raw data will stay in the computing/data centre and ONLY refined data will go out
• Network bandwidth will need to increase
• Use of remote visualisation
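As a pointer for the data-analysis methods named above, here is a minimal map/shuffle/reduce sketch in plain Python; it only shows the programming pattern that frameworks such as Hadoop distribute across nodes, and the record format (simulation run name, halo mass) is an invented example.

```python
# Minimal map/shuffle/reduce sketch; records and aggregation are hypothetical.
from collections import defaultdict

records = [("lcdm", 1.2e14), ("rpcdm", 8.0e13), ("lcdm", 3.5e14)]

# Map: emit (key, value) pairs from each input record.
mapped = [(run, mass) for run, mass in records]

# Shuffle: group all values that share the same key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: aggregate each group independently; this is the step a framework
# like Hadoop runs in parallel on the nodes holding each partition.
totals = {key: sum(values) for key, values in groups.items()}
print(totals)  # total mass per simulation run
```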
HPC : en route to international synergies
HPC enables scientific discoveries and innovation for both
research and industry
We mustn’t follow the trends. We must anticipate them !
 To face future societal or industrial challenges
 To prepare users for future parallel architectures and applications
 To increase involvement of scientists or engineers in these techniques
Global European HPC ecosystem integration