Stuart Bell
Met Office Science Review Meetings 2012
MOSAC 17.5
Strategy for optimal use of HPC for NWP 2013-2015
Stuart Bell, Weather Science
© Crown copyright 2012 Met Office
Presentation overview
• NWP 2013
• IBM P7 Performance
• Resources for Operations
• Issues
• NWP 2015?
Production NWP (end FY2013-14)
12km NAE
18km MOGREPS-R
• 1.5km model: up to 36hr f/c, 3-hourly update
• 2.2km ensemble: up to 36hr f/c, 6-hourly update
• 4.4km model: up to 120hr f/c, 6-hourly update
• 33km ensemble: up to 3day f/c, 6-hourly update
• 60km coupled model: up to 6 months, daily lagged ensemble
• 17km model: up to 144hr f/c, 6-hourly update
Production NWP (end FY2013-14)
Model | Resolution (km) | Forecast Length | Frequency (Members) | Other Details
UK | 1.5 | T+36 | 8 | 3dvar
UK ensemble | 2.2 | T+36 | 4 (12) | lbc perturbed
Europe | 4.4 | T+144 (T+72 at 06,18) | 2+2 | No DA – global downscaler
Global | 17 | T+144 (T+72 at 06,18) | 2+2 | Hybrid 4dvar
Global Ensemble | 33 | T+72 | 4 (12+) | ETKF + Stochastic Phys
Global Extended Ensemble | 60 | T+72=>monthly and seasonal | tbd | MOGREPS-15 and Glosea5 merger details tbd
Implementation Plan
By Dec 2012:
• 33km Global EPS
• 2.2km UK EPS
• Additional Global EPS members to T+12 for Hybrid DA
• Expanded domain for 4km global downscaler
Q1 & Q2 2013
• Business as usual science changes (e.g. UK model physics; new satellites)
• Global 4DVAR resolution increase
• Retire various deprecated model configurations
• New Suite Control System
Autumn 2013
• 17km Global (with ENDGAME dynamical core)
• Migrate Global EPS to ENDGAME
Spring 2014
• Migrate UK and UK EPS to ENDGAME
HPC Resource
Resource | IBM P6 | IBM P7 | Factor
Number of Compute Cores | 7904 | 38912 | 4.9
Linpack Performance (TFlops) | 120 | 879 | 7.3
Disk Capacity (TBytes) | 860 | 1500 | 1.7
?
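The Factor figures in the HPC Resource table are simply the P7 value divided by the P6 value. A minimal Python sketch of that arithmetic follows; the numbers are copied from the slide, while the dictionary layout and the function name `upgrade_factor` are illustrative choices, not part of the original material.

```python
# Figures copied from the HPC Resource slide: (IBM P6, IBM P7).
# The data layout and helper name are illustrative only.
resources = {
    "Compute cores": (7904, 38912),
    "Linpack (TFlops)": (120, 879),
    "Disk (TBytes)": (860, 1500),
}


def upgrade_factor(p6, p7):
    """Ratio of the P7 figure to the P6 figure, rounded to one decimal place."""
    return round(p7 / p6, 1)


factors = {name: upgrade_factor(p6, p7) for name, (p6, p7) in resources.items()}

for name, factor in factors.items():
    print(f"{name}: x{factor}")
```

The rounded ratios reproduce the 4.9, 7.3 and 1.7 quoted on the slide, which shows that Linpack performance grew faster than core count, and disk capacity much more slowly.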
Benchmark performance
Speedup if same share of resource used
Models: Global N512, 1.5km
Fraction of Cluster (416 nodes for P7) | 50% | 25% | 25% | 12.5%
Speedup (P7: P6_2009) | 2.5 | 2.0 | 3.6 | 4.1
Speedup (P7: P6_2012) | 2.8 | 2.0 | 2.4 | 3.3
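The cluster fractions in the same-share benchmark can be converted to node and core counts using the two figures the slides give: 416 nodes for the P7 cluster and 32 cores per node. The sketch below does only that conversion; the function and constant names are hypothetical.

```python
P7_CLUSTER_NODES = 416  # from the slide: "416 nodes for P7"
CORES_PER_NODE = 32     # from the slide: "Nodes (32 cores)"


def nodes_for_share(fraction, cluster_nodes=P7_CLUSTER_NODES):
    """Number of whole nodes corresponding to a fraction of the cluster."""
    return int(round(fraction * cluster_nodes))


# Cluster fractions listed in the same-share benchmark, in slide order.
shares = [0.50, 0.25, 0.25, 0.125]

node_counts = [nodes_for_share(s) for s in shares]
core_counts = [n * CORES_PER_NODE for n in node_counts]

for share, nodes, cores in zip(shares, node_counts, core_counts):
    print(f"{share:.1%} of cluster = {nodes} nodes = {cores} cores")
```

The 12.5% share works out to 52 nodes, the same node count listed for the 1.5km model in the same-core-count benchmark.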
Speedup if same number of compute cores used
Model | 1.5km | Global N512 | Global N96
Nodes (32 cores) | 52 | 26 | 2
Speedup (P7: P6_2009) | 1.8 | 2.0 | 1.6
Speedup (P7: P6_2012) | 1.9 | 1.3 | 1.4
ENDGAME Coupled Model
Performance (N216L85 + O0.25L75)
• Coupled Core Count = Sum of components
• Coupled Time = Max Component time plus coupling overhead
• Currently more Cores needed
• But scalability much better
• Scope for more optimisation
• Scope for longer timestep
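The two coupled-model cost rules above (cores add up, wall-clock time is set by the slowest component plus coupling overhead) can be sketched in a few lines of Python. The component core counts, timings and overhead below are made-up placeholders, not measurements from the talk, and the `Component` class and `coupled_cost` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    cores: int      # cores assigned to this component
    seconds: float  # wall-clock time for this component alone


def coupled_cost(components, coupling_overhead_s):
    """Return (total cores, wall-clock seconds) for a concurrently coupled run.

    Cores are the sum over components; time is the slowest component
    plus the coupling overhead, as stated on the slide.
    """
    total_cores = sum(c.cores for c in components)
    wall_seconds = max(c.seconds for c in components) + coupling_overhead_s
    return total_cores, wall_seconds


# Placeholder values for illustration only.
atmosphere = Component("atmosphere", cores=512, seconds=300.0)
ocean = Component("ocean", cores=256, seconds=280.0)

cores, wall = coupled_cost([atmosphere, ocean], coupling_overhead_s=15.0)
print(cores, wall)
```

Under this model the faster component idles while the slower one finishes, so balancing the core split between components matters as much as the total core count.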
ENDGAME Performance (N512L70)
Planned Resource
Operational Constraints
• Cluster Sizes: 6 frames + 5 frames (+ 2 frames)
• Seasonal Forecast System = 1 frame
• Operational NWP: max 2 frames
• < Half cluster limit keeps the NWP R&D/Operations ratio in check, hopefully at 3:1
• Delivery constraints for NWP
• Many Operational Configurations => forces overlap of suites => quarter cluster limit per configuration
• Customers require end-product as soon as possible after data time => maximum run length ~60 minutes
Resource Shares
Table 5 – Breakdown of resource allocation
Modelling System | Fraction of Total resource on IBM P6 (Jan 2012) | Projected Fraction resource on IBM P7 (Apr 2014)
Global Deterministic | 3.5% | ~2.5%
Global Ensemble (excluding MOGREPS-15) | 1.0% | ~1.5%
Monthly & Seasonal Ensembles (plus Hindcast) | 2.1% | ~5.5%
Regional and Local Deterministic (for UK & Europe) | 4.9% | ~2.5%
Regional and Local Ensembles (for UK) | 0.8% | ~2%
Other LAMs | 1.3% | <1%
Wet Models | 0.6% | <1%
Total Operational | 14.3% | 15%
NWP (R&D) | 36.7% | 30%
Climate (Production and R&D) | 49.0% | 55%
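As a consistency check on the resource shares, the seven operational P6 rows can be summed and compared with the quoted Total Operational figure, and the three top-level shares checked against 100%. The sketch below assumes the row-to-value pairing shown in the table and uses only the percentages printed on the slide; the variable names are illustrative.

```python
# P6 (Jan 2012) operational shares in percent, in table order:
# Global Deterministic, Global Ensemble, Monthly & Seasonal,
# Regional/Local Deterministic, Regional/Local Ensembles, Other LAMs, Wet Models.
p6_operational_rows = [3.5, 1.0, 2.1, 4.9, 0.8, 1.3, 0.6]
p6_quoted_operational = 14.3

operational_p6 = round(sum(p6_operational_rows), 1)

# Each row is rounded to one decimal, so a 0.1 difference from the quoted
# total is within rounding; 0.15 is an arbitrary tolerance for this check.
within_rounding = abs(operational_p6 - p6_quoted_operational) < 0.15

# Top-level split: Total Operational, NWP (R&D), Climate.
p6_total = round(14.3 + 36.7 + 49.0, 1)
p7_total = 15 + 30 + 55

print(operational_p6, within_rounding, p6_total, p7_total)
```

The operational rows sum to 14.2%, which agrees with the quoted 14.3% to within rounding, and both the P6 and projected P7 splits add up to 100%.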
Expected Usage Profile: March 2014
[Figure: Projected NWP Resource Usage. Y-axis: P7 Nodes (0 to 288); x-axis: time of day, hourly from 0:00 to 0:00; series: Seasonal, Global, Global EPS, UK, Euro, UK EPS]
WHY EURO4?
BBC
12=>4
•Increasing the domain
•Increasing the resolution.
PRIORITIES FOR UK MODELLING
RESOURCE 2014-15
Assuming Best Case for ENDGAME COST and
OPTIMISATION PROGRESS
Priorities if affordable
1. Increasing the number of levels
2. Extending the forecast range beyond T+36
3. Increasing the domain
4. Increasing the resolution
•Increasing the domain
•Increasing the resolution.
PRIORITIES FOR GLOBAL
ENSEMBLE 2014-15
Assuming Best Case for ENDGAME COST and
OPTIMISATION PROGRESS
Priorities if affordable
1. Extending the forecast range to Day 5
• Does Short Range Forecasting Extend Out to Day 5?
• Tim Johns and Adam Scaife review Global EPS beyond Day 5 in a later talk
2. Increasing the number of levels => 85
What about 2015?
1/3 towards the 2020 vision:
•10km Global Coupled EPS &
•1km Regional Coupled EPS or 500m Local Coupled EPS – which?
2015 Strawman – Short-range NWP
• Global 12km with >100 levels
• Global EPS at 25km
• UK 1.5km ensemble + bigger domain [(x,y)=>(1.5x*1.5y)] + >100 levels
• UK DA =>4DVAR; hourly with NWP Nowcast
• Monthly / Seasonal 40km? or embedded 10km? – Question for a later presentation
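A rough sense of what the strawman implies for horizontal grid size can be had from simple geometry: at fixed domain, the number of grid columns scales with the inverse square of the grid length, and at fixed resolution it scales with the domain area. The sketch below applies that to the resolutions quoted in these slides. It is an illustrative estimate only: it ignores extra levels, shorter timesteps, data assimilation and I/O, and the function names are hypothetical.

```python
def horizontal_point_factor(old_km, new_km):
    """Growth in grid columns when grid length shrinks over a fixed domain."""
    return (old_km / new_km) ** 2


def domain_point_factor(x_scale, y_scale):
    """Growth in grid columns when the domain is stretched at fixed resolution."""
    return x_scale * y_scale


# Global deterministic: 17km (end FY2013-14) to 12km (2015 strawman).
global_factor = round(horizontal_point_factor(17, 12), 2)

# Global EPS: 33km to 25km.
eps_factor = round(horizontal_point_factor(33, 25), 2)

# UK 1.5km ensemble domain: (x, y) => (1.5x, 1.5y).
uk_domain_factor = domain_point_factor(1.5, 1.5)

print(global_factor, eps_factor, uk_domain_factor)
```

On this estimate the global model roughly doubles its horizontal grid, the global EPS grows by about three quarters, and the enlarged UK domain holds 2.25 times as many columns, before any allowance for the move to more than 100 levels.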
Questions, answers & discussion