NCAR Supercomputing, 1963–2005

Supercomputing Systems at NCAR
SC2005
Marc Genty
Supercomputing Services Group
November 15 - 17, 2005
NCAR Mission
The National Center for Atmospheric Research (NCAR) is a federally funded research and development
center. Together with our partners at universities and research centers, we are dedicated to exploring
and understanding our atmosphere and its interactions with the Sun, the oceans, the biosphere, and
human society. NCAR consists of the following:
• Computational and Information Systems Laboratory (CISL)
– SCD: Scientific Computing Division
– IMAGe: Institute for Mathematics Applied to Geosciences
• Earth and Sun Systems Laboratory (ESSL)
– ACD: Atmospheric Chemistry Division
– CGD: Climate & Global Dynamics Division
– HAO: High Altitude Observatory
– MMM: Mesoscale & Microscale Meteorology Division
– TIMES: The Institute for Multidisciplinary Earth Studies
• Earth Observing Laboratory (EOL)
– HIAPER: High-Performance Instrumented Airborne Platform for Environmental Research
• Research Applications Laboratory (RAL)
– RAP: Research Applications Programs
• Societal-Environmental Research and Education Laboratory (SERE)
– ASP: Advanced Study Program
– CCB: Center for Capacity Building
– ISSE: Institute for the Study of Society and Environment (formerly ESIG)
NCAR Science
Space Weather
Climate
Turbulence
Weather
The Sun
Atmospheric Chemistry
More than just the atmosphere… from the earth’s oceans to the solar interior
2005: Climate Simulation Lab Science
• Community Climate System Model (CCSM)
• Modeling Climate Change and Climate Variability in Coupled Climate-Land Vegetation Models: Present, Past, and Future Climates
• 50-year Regional Downscaling of NCEP/NCAR Reanalysis Over California Using the Regional Spectral Model
• Climate Variability in the Atlantic Basin
• Aerosol Effects on the Hydrological Cycle
• Pacific Decadal Variability due to Tropical-Extratropical Interaction
• Predictability of the Coupled Ocean-Atmosphere-Land Climate System: Seasonal-to-Interannual Time Scales
• The Whole Atmosphere Community Climate Model
• Decadal to Century Coupled Ocean/Ice Simulations at High Resolution (0.2) Using an Innovative Geodesic Grid
• Ocean State Estimation
• Development and Application of Seasonal Climate Predictions
=> http://www.scd.ucar.edu/csl/cslannual0505.html
69 Member Universities
University of Alabama in Huntsville
University of Illinois at Urbana-Champaign
Princeton University
University of Alaska
University of Iowa
Purdue University
University at Albany, State University of New York
Iowa State University
University of Rhode Island
The Johns Hopkins University
Rice University
University of Arizona
University of Maryland
Rutgers University
Arizona State University
Massachusetts Institute of Technology
Saint Louis University
California Institute of Technology
McGill University
University of California, Berkeley
University of Miami
Scripps Institution of Oceanography at UCSD
Stanford University
University of California, Davis
University of Michigan - Ann Arbor
Texas A&M University
University of California, Irvine
University of Minnesota
University of Texas at Austin
University of California, Los Angeles
University of Missouri
Texas Tech University
University of Chicago
Naval Postgraduate School
University of Toronto
Colorado State University
University of Nebraska, Lincoln
Utah State University
University of Colorado at Boulder
Nevada System of Higher Education
University of Utah
Columbia University
University of New Hampshire, Durham
University of Virginia
Cornell University
University of Washington
University of Denver
New Mexico Institute of Mining and Technology
Drexel University
New York University
University of Wisconsin - Madison
Florida State University
North Carolina State University
University of Wisconsin - Milwaukee
Georgia Institute of Technology
The Ohio State University
Woods Hole Oceanographic Institution
Harvard University
University of Oklahoma
University of Wyoming
University of Hawaii
Old Dominion University
Yale University
University of Houston
Oregon State University
York University
Howard University
Pennsylvania State University
Washington State University
http://www.ucar.edu/governance/members/institutions.shtml
SCD Mission
The Scientific Computing Division (SCD) is part of the
National Center for Atmospheric Research (NCAR) in
Boulder, Colorado. The goal of SCD is to enable the best
atmospheric research in the world by providing and
advancing high-performance computing technologies. SCD
offers computing, research datasets, data storage,
networking, and data analysis tools to advance the scientific
research agenda of NCAR. NCAR is managed by the
University Corporation for Atmospheric Research (UCAR)
and is sponsored by the National Science Foundation.
The NCAR Mesa Lab
History of Supercomputing at NCAR
[Timeline chart, 1960-2005, showing production and non-production machines; systems currently in the NCAR/SCD computing facility are highlighted. Revised Nov '05. Machines shown:]
IBM p5-575/624 bluevista
Aspen Nocona/InfiniBand coral
IBM BlueGene/L frost
IBM e1350/140 pegasus
IBM e1350/264 lightning
IBM p690-C/1600 bluesky
IBM p690-F/64 thunder
IBM p690-C/1216 bluesky
SGI Origin 3800/128 tempest
IBM p690-C/16 bluedawn
IBM SP WH2/1308 blackforest
IBM SP WH2/604 blackforest
IBM SP WH2/64 babyblue
Compaq ES40/36 prospect
IBM SP WH1/296 blackforest
IBM SP WH1/32 babyblue
Beowulf/16 tevye
SGI Origin2000/128 ute
Cray J90se/24 chipeta
HP SPP-2000/64 sioux
Cray T3D/128
Cray C90/16 antero
Cray J90se/24 ouray
Cray J90/20 aztec
Cray J90/16 paiute
Cray T3D/64
Cray Y-MP/8I antero
CCC Cray 3/4 graywolf
IBM SP1/8 eaglesnest
TMC CM5/32 littlebear
IBM RS/6000 Cluster
Cray Y-MP/2 castle
Cray Y-MP/8 shavano
TMC CM2/8192 capitol
Cray X-MP/4
Cray 1-A S/N 14
Cray 1-A S/N 3
CDC 7600
CDC 6600
CDC 3600
In the beginning…
(1963)
CDC 3600 System Overview
• Circuitry Design: Seymour Cray
• Clock Speed: 0.7 MHz
• Memory: 32 Kbytes
• Peak Performance: 1.3 MFLOPs
Today’s NCAR Supercomputers
• Bluesky [IBM POWER4 AIX - Production - General Scientific Use]
– 125-node (50 frame) p690 cluster, 1600 1.3GHz CPUs, SP Switch2, 15TB FC disk
– Configured as 76 8-way (LPAR) nodes and 25 32-way (SMP) nodes
• Bluevista [IBM POWER5 AIX - Production - General Scientific Use]
– 78-node p575 cluster, 624 1.9GHz CPUs, HPS Switch, 55TB FC disk
– NCAR codes are typically seeing a speedup of 2x-3x over bluesky
• Frost [IBM Blue Gene/L - Single Rack - Pset Size = 32]
• Lightning [IBM SuSE Linux - Production - General Scientific Use]
– 132-node AMD64/Xeon cluster, 264 2.2/3.0GHz CPUs, Myrinet Switch, ~6TB SATA disk
• Pegasus [IBM SuSE Linux - Production - Real-Time Weather Forecasting]
– 70-node AMD64/Xeon cluster, 140 2.2/3.0GHz CPUs, Myrinet Switch, ~6TB SATA disk
• Coral [Aspen Systems SuSE Linux - Production - IMAGe Divisional System]
– 24-node Nocona cluster, 44 3.2GHz CPUs, InfiniBand Switch, ~6TB SATA disk
• Test Systems: Thunder [P4/HPS], Bluedawn [P4/SP Switch2], Otis [P5/HPS]
Bluesky
Bluesky System Overview
• IBM POWER4 Cluster 1600
• AIX 5.1, PSSP, GPFS, LoadLeveler
• 125-node (50 frame) p690 cluster
• Compute Node Breakdown: 76 8-way (LPAR) & 25 32-way (SMP) nodes; see the sketch below
• 1600 1.3GHz CPUs
• SP Switch2 (Colony)
• 15TB FC disk
• General purpose, computational resource
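For readers who want to check the node arithmetic, a minimal sketch in Python (the counts are from this slide; the interpretation that the remaining CPUs host interactive and system workloads is an assumption, not stated here):

```python
# Back-of-the-envelope check of the bluesky configuration figures quoted above.
# Node and CPU counts are from the slide; the interpretation that the remaining
# CPUs host interactive and system workloads is an assumption, not stated here.

total_cpus = 1600              # 50 p690 frames x 32 CPUs per frame
lpar_8way_nodes = 76           # batch nodes partitioned as 8-way LPARs
smp_32way_nodes = 25           # batch nodes run as full 32-way SMPs

batch_cpus = lpar_8way_nodes * 8 + smp_32way_nodes * 32
print(f"CPUs in batch compute nodes: {batch_cpus}")                                  # 1408
print(f"CPUs remaining (assumed interactive/system use): {total_cpus - batch_cpus}") # 192
```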
Bluesky 32-Way LPAR Usage
[Chart: bluesky 32-way LPAR utilization (% User, % System, % Idle), 8/29/04 through 10/23/05]
Bluesky 8-Way LPAR Usage
[Chart: bluesky 8-way LPAR utilization (% User, % System, % Idle), 8/29/04 through 10/23/05]
Bluesky Science Highlights
• CCSM3: The Community Climate System Model
– Fully-coupled, global climate model that provides state-of-the-art computer simulations of the Earth's past, present, and future climate states
– The CCSM3 IPCC (Intergovernmental Panel on Climate Change) integrations now include roughly 11,000 years of simulated climate (19th - 24th centuries)
– The CCSM3 control run archive contains 4,500 years of simulated climate at three resolutions
– http://www.ccsm.ucar.edu/
Bluesky Science Highlights
• ARW: Advanced Research WRF (Weather Research & Forecasting) Model
– Next-Generation Mesoscale Numerical Weather Prediction System
– http://www.mmm.ucar.edu/index.php
Bluevista
Bluevista System Overview
• IBM POWER5 Cluster
• AIX 5.2, CSM, GPFS, LSF
• 78-node p575 cluster
• 624 1.9GHz CPUs
• HPS Switch (Federation)
• 55TB FC disk
• General purpose, computational resource
• NCAR codes are typically seeing a speedup of 2x-3x over bluesky
• The bluevista cluster is estimated to have the same sustained
computing capacity as the bluesky cluster
Bluevista Usage
(Not Yet In Full Production)
[Chart: bluevista utilization (% User, % System, % Idle), 10/6 through 10/20, 2005]
Bluevista Science Highlights
• 2005: Nested Regional Climate Model (NRCM)
– Focus: To develop a state-of-the-science nested climate model based on WRF and to provide this to the community
– http://www.mmm.ucar.edu/facilities/nrcm/nrcm.php
• 2005: Limited friendly-user time also allocated
• 2006: General scientific production system
– Will augment bluesky capacity
Frost
Frost System Overview
• IBM Blue Gene/L
• Single rack
– One I/O node per thirty-two compute nodes (pset size = 32); see the sketch below
• Service node
– One IBM p630 server (two POWER4+ 1.2GHz CPUs & 4GB memory)
– SuSE (SLES9), DB2 FixPak9
• Front-end nodes
– Four IBM OpenPower 720 servers (four POWER5 1.65GHz CPUs & 8GB memory)
– SuSE (SLES9), GPFS, COBALT
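As a quick illustration of what the pset size means at rack scale, a minimal sketch in Python (the 1024-compute-node, two-processors-per-node rack size is standard Blue Gene/L packaging and matches the 2048 processors quoted later in these slides; it is assumed here, not taken from this slide):

```python
# Rough I/O-node arithmetic for a single Blue Gene/L rack.
# The 1024-compute-node, two-processors-per-node rack size is standard BG/L
# packaging (and matches the 2048 processors quoted for frost later in these
# slides); it is an assumption here, not a figure from this slide.

compute_nodes = 1024   # compute nodes in one BG/L rack (assumed)
pset_size = 32         # compute nodes served by each I/O node (from the slide)

io_nodes = compute_nodes // pset_size
print(f"I/O nodes in the rack : {io_nodes}")             # 32
print(f"Processors in the rack: {compute_nodes * 2}")    # 2048
```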
Blue Gene/L At NCAR
Blue Gene/L is jointly owned by NCAR and the University of Colorado (Boulder & Denver) and is managed collaboratively.

There are Principal Investigators (PIs) associated with each research facility, and each PI has a small group of scientists running on the system.

Blue Gene/L is a targeted system at this time, with allocations split among the three primary research facilities.
Bluesky / Frost Side-By-Side
                  Bluesky   Frost
Processors:       1600      2048
Peak Teraflops:   8.3       5.73
Linpack:          4.2       4.6
Power (kW):       400*      25*

*The average personal computer consumes about 0.12 kW
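One way to read the comparison is delivered Linpack per kilowatt; a rough sketch using the figures above (the power numbers are the starred estimates):

```python
# Delivered Linpack per unit power, from the side-by-side figures above.
# Power values are the starred estimates on the slide.

systems = {
    # name: (linpack_teraflops, power_kw)
    "bluesky (POWER4)": (4.2, 400),
    "frost (Blue Gene/L)": (4.6, 25),
}

for name, (linpack_tf, power_kw) in systems.items():
    gflops_per_kw = linpack_tf * 1000 / power_kw
    print(f"{name}: {gflops_per_kw:.1f} Linpack GFLOPs per kW")
# bluesky: ~10.5 GFLOPs/kW; frost: ~184 GFLOPs/kW
```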
Frost Usage
Frost Principal Research Areas
• Climate and Weather Simulation
– http://www.ucar.edu/research/climate/
– http://www.ucar.edu/research/prediction/
• Computational Fluid Dynamics and Turbulence
– http://www.image.ucar.edu/TNT/
• Coupled Atmosphere-Fire Modeling
– http://www.ucar.edu/research/climate/drought.shtml
• Scalable Solvers
– http://amath.colorado.edu/faculty/tmanteuf/
• Aerospace Engineering
– http://icme.stanford.edu/faculty/cfarhat.html
Frost Science Highlights
“Modeling Aqua Planet on Blue Gene/L”
Dr. Amik St-Cyr - Scientist
Computational Science Section
Scientific Computing Division
• NCAR Booth: 1:00PM - Tuesday, November 15, 2005
• NCAR Booth: 1:00PM - Thursday, November 16, 2005
Lightning
Lightning System Overview
• IBM Cluster 1350
• SuSE (SLES9) Linux, CSM, GPFS, LSF
• 132-node AMD64/Xeon cluster
• 264 2.2/3.0GHz CPUs
• Myrinet Switch
• ~6TB SATA disk
• General purpose, computational resource
Lightning Usage
[Chart: lightning utilization (% User, % System, % Idle), 11/1/04 through 10/17/05]
Lightning Science Highlights
• TCSP: Tropical Cloud Systems and Processes field research investigation
– Joint NCAR/NASA/NOAA study of the dynamics and thermodynamics of precipitating cloud systems, including tropical cyclones
– http://box.mmm.ucar.edu/projects/wrf_tcsp/
Lightning Science Highlights
• TIME-GCM: Thermosphere Ionosphere Mesosphere Electrodynamics General Circulation Model
– Distributed memory parallelism using eight nodes (16 MPI tasks)
– Run completed in 12 jobs at 5-6 wallclock hours each, for a total of 70 hours (~12 minutes per simulated day; see the sketch below)
– http://www.hao.ucar.edu/Public/models/models.html
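The throughput in that bullet is easy to sanity-check; a minimal sketch (the ~350 simulated days, roughly a model year, is inferred arithmetic, not a figure from the slide):

```python
# Throughput check for the TIME-GCM run described above.
# The 70 wallclock hours, 12 jobs, and ~12 minutes per simulated day are from
# the slide; the ~350 simulated days (roughly a model year) is inferred.

total_wallclock_hours = 70
minutes_per_sim_day = 12
jobs = 12

sim_days = total_wallclock_hours * 60 / minutes_per_sim_day
print(f"Simulated days covered: ~{sim_days:.0f}")                       # ~350
print(f"Average hours per job : ~{total_wallclock_hours / jobs:.1f}")   # ~5.8
```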
Pegasus
Pegasus System Overview
• IBM Cluster 1350
• SuSE (SLES9) Linux, CSM, GPFS, LSF
• 70-node AMD64/Xeon cluster
• 140 2.2/3.0GHz CPUs
• Myrinet Switch
• ~6TB SATA disk
• Essentially a 0.5 scale model of the lightning cluster
• Real-Time Weather Forecasting
Pegasus Usage
[Chart: pegasus utilization (% User, % System, % Idle), 4/1/05 through 10/1/05]
Pegasus Science Highlights
• AMPS: Antarctic Mesoscale Prediction System (Polar MM5)
– Twice daily operational forecasts for the Antarctic Region (McMurdo Station, Antarctica)
– Sponsored by the NSF Office of Polar Programs
– http://box.mmm.ucar.edu/rt/mm5/amps/
Coral
Coral System Overview
• Aspen Systems 24-node Nocona cluster (48 3.2/3.6GHz CPUs)
• SuSE (SLES9) Linux, ABC, NFS, LSF
• Two HP Visualization Nodes (RedHat Enterprise Linux V3)
• InfiniBand Switch
• ~6TB SATA disk
• Dedicated resource belonging to the Institute for Mathematics Applied to Geosciences (IMAGe) Division
– http://www.image.ucar.edu/
Tempest
• SGI Origin 3800
• IRIX 6.5.25, NQE
– No cluster mgt s/w or parallel file system
• 128 500-MHz R14000 CPUs
• 64GB Distributed Shared Memory
• NUMAlink Interconnect
• ~8.5TB Ciprico SATA RAID disk
• General purpose, post-processing and data analysis server
• Managed by the Data Analysis Services Group (DASG)
Tempest Science Highlights
“Desktop Techniques for the Exploration of
Terascale-sized Turbulence Data Sets”
John Clyne - Senior Software Engineer
High-Performance Systems Section
• NCAR Booth: 4:00PM - Tuesday, November 15, 2005
• NCAR Booth: 3:00PM - Thursday, November 17, 2005
Blackforest
(318 WHII Node RS/6000 SP)
R.I.P. - 12Jan05 @ 8am
Blackforest Highlights
• 5.4 Year Lifetime
• 30.5 Million CPU Hours Of Work
• 600,000 Batch Jobs
• 50 CPU Hours/Job (On Average)
• Average Job Size: 27.28 CPUs (7 Nodes)
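The per-job averages follow from the lifetime totals; a quick sketch of the arithmetic (the average wallclock time per job is inferred, not a slide figure):

```python
# Sanity check of the blackforest lifetime statistics quoted above.
# The average wallclock time per job is inferred arithmetic, not a slide figure.

cpu_hours_total = 30.5e6    # CPU hours of work over the machine's lifetime
batch_jobs = 600_000        # batch jobs run
avg_cpus_per_job = 27.28    # average job size in CPUs

cpu_hours_per_job = cpu_hours_total / batch_jobs           # ~50.8, matches "50 CPU Hours/Job"
wallclock_per_job = cpu_hours_per_job / avg_cpus_per_job   # ~1.9 hours (inferred)

print(f"CPU hours per job (avg)      : {cpu_hours_per_job:.1f}")
print(f"Wallclock hours per job (avg): {wallclock_per_job:.1f}")
```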
NCAR Supercomputer Performance Numbers

System Name        Peak TFLOPs  Memory (TBytes)  Disk (TBytes)  Power (kWatts)  Est'd Code Efficiency  Sustained GFLOPs (meas./est.)  Sustained MFLOPs/Watt

Production Systems
bluevista          4.742        1.248            55.0           276             7.8%                   369.9 (est.)                   1.34
bluesky            8.320        3.328            28.5           385             4.3%                   355.3 (meas.)                  0.92
lightning          1.162        0.544            7.8            48              5.8%                   67.4 (est.)                    1.40
tempest            0.128        0.064            7.9            50.0            9.8%                   12.5 (est.)                    0.25

Research Systems, Divisional Systems & Test Systems
frost (BG/L)       5.734        0.524            6.6            25.2            6.8%                   389.9 (est.)                   15.5
pegasus (AMPS)     0.616        0.288            5.6            28              5.8%                   35.7 (est.)                    1.28
coral (IMAGe)      0.256        0.088            6.4            9               5.0%                   12.8 (est.)                    1.42
thunder            0.333        0.128            1.2            15.5            5.1%                   17.1 (est.)                    1.10
bluedawn           0.070        0.032            0.7            6.5             4.3%                   3.0 (est.)                     0.46

TOTAL              21.36        6.24             119.63         843.20                                 1263.61                        23.65
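The derived columns follow from the raw figures: code efficiency is the sustained rate over peak, and MFLOPs/Watt is the sustained rate over power draw. A minimal sketch using the bluevista row:

```python
# How the derived columns in the table above follow from the raw figures.
# Values are the bluevista row of the table.

peak_tflops = 4.742        # Peak TFLOPs
sustained_gflops = 369.9   # Est'd Sustained GFLOPs
power_kw = 276             # Power (kWatts)

code_efficiency = sustained_gflops / (peak_tflops * 1000)         # sustained / peak
mflops_per_watt = (sustained_gflops * 1000) / (power_kw * 1000)   # MFLOPs / Watts

print(f"Code efficiency : {code_efficiency:.1%}")   # ~7.8%
print(f"MFLOPs per Watt : {mflops_per_watt:.2f}")   # ~1.34
```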
Conclusion & Questions
• Read more about it:
– http://www.scd.ucar.edu/main/computers.html
• Questions or Comments?
• Special thanks to:
– Lynda Lester / Pam Gillman (SCD): Photographs
– Tom Engel (SCD): Utilization Charts / Stats
– Irfan Elahi / John Clyne (SCD): Fun Facts
– Ben Foster (HAO): TIME-GCM Data
– Sean McCreary (CSS/CU): BG/L Research Areas
– BJ Heller (SCD): Production Assistance