Probabilistic Prediction
Cliff Mass
University of Washington
Uncertainty in Forecasting
• Most numerical weather prediction (NWP) today
and most forecast products reflect a deterministic
approach.
• This means that we do the best job we can for a
single forecast and do not consider uncertainties in
the model, initial conditions, or the very nature of
the atmosphere.
• However, the uncertainties are usually very
significant and information on such uncertainty
can be very useful.
This is really ridiculous!
A Fundamental Issue
• The work of Lorenz (1963, 1965,
1968) demonstrated that the
atmosphere is a chaotic system, in
which small differences in the
initialization, well within
observational error, can have large
impacts on the forecasts, particularly
for longer forecasts.
• In a series of experiments he found
that small errors in initial conditions
can grow so that all deterministic
forecast skill is lost at about two
weeks.
Butterfly Effect: a small change
at one place in a complex system
can have large effects elsewhere
Not unlike a pinball game
Uncertainty Extends Beyond
Initial Conditions
• There is also uncertainty in our model physics,
– such as microphysics and boundary-layer
parameterizations.
• And further uncertainty is produced by our
numerical methods.
Probabilistic NWP
• To deal with forecast uncertainty, Epstein (1969)
suggested stochastic-dynamic forecasting, in which
forecast errors are explicitly considered during
model integration.
• Essentially, uncertainty estimates are added to each
term in the primitive equations.
• This stochastic method was not and still is not
computationally practical.
Probabilistic-Ensemble
Numerical Prediction (NWP)
• Another approach, ensemble prediction, was
proposed by Leith (1974), who suggested that
prediction centers run a collection (ensemble) of
forecasts, each starting from a different initial state.
• The variations in the resulting forecasts could be
used to estimate the uncertainty of the prediction.
• But even the ensemble approach was not feasible at
that time due to limited computer resources.
• It became practical in the late 1980s as computer
power increased.
Ensemble Prediction
• Ensembles can be used to estimate the probability that
some weather feature will occur.
• The ensemble mean is more accurate on average than
any individual ensemble member.
• Forecast skill of the ensemble mean is related to the
spread of the ensemble:
– When the ensemble forecasts are similar, ensemble-mean
skill tends to be higher.
– When the forecasts differ greatly, ensemble-mean
forecast skill tends to be lower.
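These quantities are easy to compute from member output. Below is a minimal Python (NumPy) sketch, using made-up member values, of the three basic products just described: the ensemble mean, the spread, and an event probability estimated as the fraction of members predicting the event.

```python
import numpy as np

# Hypothetical 20-member ensemble of 48-h 2-m temperature forecasts (deg C)
# for one location; real values would come from the ensemble system output.
members = np.array([1.2, 0.8, 1.9, -0.3, 0.5, 1.1, 2.2, 0.1, 1.4, 0.9,
                    1.7, -0.6, 0.4, 1.3, 2.0, 0.7, 1.0, 1.6, -0.1, 0.6])

ens_mean = members.mean()         # on average beats any single member
ens_spread = members.std(ddof=1)  # spread: a proxy for forecast uncertainty

# Probability that a weather feature occurs, estimated as the fraction
# of members predicting it (here: temperature below freezing).
p_freeze = (members < 0.0).mean()

print(f"mean={ens_mean:.2f} C  spread={ens_spread:.2f} C  P(T<0C)={p_freeze:.0%}")
```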
Deterministic Forecasting
A point in phase space completely describes an instantaneous state of the atmosphere (pressure, temperature, etc. at all points at one time). The true state of the atmosphere exists as a single point in phase space that we never know exactly. An analysis produced to run an NWP model is somewhere in a cloud of likely states, and any point in the cloud is equally likely to be the truth.
[Figure: phase-space schematic in which the 12-, 24-, 36-, and 48-h forecasts drift away from the corresponding observed states: nonlinear error growth and model deficiencies drive apart the forecast and true trajectories (i.e., chaos theory).]
Ensemble Forecasting, a Stochastic Approach
An ensemble of likely analyses leads to an ensemble of likely forecasts.
[Figure: the same phase-space schematic, now with a cloud of analyses evolving into a cloud of forecasts around the true trajectory.]
Ensemble forecasting:
• Encompasses truth
• Reveals flow-dependent uncertainty
• Yields an objective stochastic forecast
Probability Density Functions
[Figure: histograms of ensemble members with fitted smooth PDFs.]
• Usually we fit the distribution of ensemble members with a Gaussian or other reasonably smooth theoretical distribution as a first step (see the sketch below).
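As a concrete illustration of that first step, here is a small Python sketch (SciPy, with made-up member values) that fits a Gaussian to an 8-member ensemble and reads a tail probability off the fitted distribution.

```python
import numpy as np
from scipy import stats

# Hypothetical 8-member ensemble forecast of some scalar quantity.
members = np.array([2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 3.7, 3.0])

# The "first step" from the slide: fit a smooth theoretical distribution
# (here a Gaussian) to the ensemble members.
mu, sigma = stats.norm.fit(members)

# The fitted PDF can then be interrogated anywhere, e.g. for the
# probability of exceeding a threshold of 3.5 units.
p_exceed = 1.0 - stats.norm.cdf(3.5, loc=mu, scale=sigma)
print(f"fit: mu={mu:.2f} sigma={sigma:.2f}  P(x > 3.5)={p_exceed:.2f}")
```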
A critical issue is the development of
ensemble systems that create probabilistic
guidance that is both reliable and sharp.
We Need to Create Probability Density
Functions (PDFs) of Each Variable That Have
These Characteristics
Elements of a Good Probability Forecast:
• Sharpness (also known as resolution)
– The width of the predicted distribution should
be as small as possible.
[Figure: two probability density functions (PDFs) for some forecast quantity, one sharp and one less sharp.]
Elements of a Good Probability Forecast:
• Reliability (also known as calibration)
– A probability forecast p ought to verify with relative
frequency p.
– Forecasts from climatology are reliable (by definition), so
calibration alone is not enough.
[Figure: reliability diagram, plotting observed relative frequency against forecast probability.]
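One way to check reliability is to bin the forecast probabilities and compare each bin against its observed relative frequency, which is exactly what a reliability diagram plots. A minimal sketch (the function name and the synthetic data are mine, not from the slides):

```python
import numpy as np

def reliability_curve(p_fcst, obs, n_bins=11):
    """Observed relative frequency of the event in each forecast-probability
    bin. For a reliable system the frequencies track the bin centers."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(p_fcst, bins) - 1, 0, n_bins - 1)
    freq = np.full(n_bins, np.nan)            # NaN where a bin is empty
    for i in range(n_bins):
        hits = obs[idx == i]
        if hits.size:
            freq[i] = hits.mean()
    centers = 0.5 * (bins[:-1] + bins[1:])
    return centers, freq

# Synthetic check: outcomes generated so the event occurs with exactly the
# forecast probability, i.e. a perfectly reliable (but not sharp) system.
rng = np.random.default_rng(0)
p = rng.uniform(0.0, 1.0, 5000)
o = (rng.uniform(0.0, 1.0, 5000) < p).astype(float)
centers, freq = reliability_curve(p, o)
print(np.round(centers, 2), np.round(freq, 2))   # freq ~ centers
```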
Verification Rank Histogram
(a.k.a. Talagrand Diagram): Another Measure of Reliability
Over many trials, record the verification’s position (its “rank”) among the ordered EF members.
[Figure: three 8-member examples. Top row: frequency of verification ranks 1–9 for a reliable EF (flat histogram, each rank near 0.1), an under-spread EF (U-shaped, extreme ranks too common), and an over-spread EF (dome-shaped, middle ranks too common). Bottom row: for each case, the EF PDF (curve) and 8 sample members (bars) against the true PDF (curve) and verification value (bar), plotted against cumulative precipitation (mm).]
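The rank histogram itself is simple to compute: count how many members fall below each verifying observation. A sketch under the same 8-member setup (synthetic data invented for illustration):

```python
import numpy as np

def rank_histogram(ens, obs):
    """Counts of the verification's rank (1 .. n_members+1) among the
    ordered ensemble members, accumulated over many cases. Flat = reliable
    spread; U-shaped = under-spread; dome-shaped = over-spread."""
    n_members = ens.shape[1]
    ranks = 1 + (ens < obs[:, None]).sum(axis=1)   # members below the obs
    return np.bincount(ranks, minlength=n_members + 2)[1:]

# Synthetic under-spread example: the forecast mean errs by ~1 unit but the
# 8 members cluster within ~0.3 of it, so the truth usually falls outside
# the ensemble and the extreme ranks (1 and 9) dominate.
rng = np.random.default_rng(1)
truth = rng.normal(size=2000)
fcst_mean = truth + rng.normal(size=2000)
ens = fcst_mean[:, None] + 0.3 * rng.normal(size=(2000, 8))
print(rank_histogram(ens, truth))   # U-shaped counts over ranks 1..9
```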
Brier Skill Score (BSS)
The BSS directly examines reliability, resolution, and overall skill.

Continuous Brier Score:

$$BS = \frac{1}{M}\sum_{j=1}^{M}\left(p_{e_j} - o_j\right)^2$$

M : number of forecast/observation pairs
$p_{e_j}$ : forecast probability {0.0 … 1.0}
$o_j$ : observation {0.0 = no, 1.0 = yes}
BS = 0 for perfect forecasts; BS = 1 for perfectly wrong forecasts.

Brier Score decomposed by discrete, contiguous bins:

$$BS = \underbrace{\frac{1}{M}\sum_{i=1}^{I} N_i\left(p'_{e_i} - \bar{o}_i\right)^2}_{\text{reliability (rel)}} \;-\; \underbrace{\frac{1}{M}\sum_{i=1}^{I} N_i\left(\bar{o}_i - \bar{o}\right)^2}_{\text{resolution (res)}} \;+\; \underbrace{\bar{o}\,(1 - \bar{o})}_{\text{uncertainty (unc)}}$$

I : number of probability bins (normally 11)
$N_i$ : number of data pairs in bin i
$p'_{e_i}$ : binned forecast probability (0.0, 0.1, … 1.0 for 11 bins)
$\bar{o}_i$ : observed relative frequency for bin i
$\bar{o}$ : sample climatology (total occurrences / total forecasts)

Brier Skill Score:

$$BSS = \frac{BS_{fcst} - BS_{clim}}{BS_{perf} - BS_{clim}} = 1 - \frac{BS_{fcst}}{BS_{clim}} \qquad (\text{since } BS_{perf} = 0)$$

Substituting the decomposition, and noting that climatology has $rel_{clim} = 0$ and $res_{clim} = 0$ (so $BS_{clim} = unc$):

$$BSS = 1 - \frac{rel - res + unc}{unc} = \frac{res - rel}{unc}$$

BSS = 1 for perfect forecasts; BSS < 0 for forecasts worse than climatology.
ADVANTAGES:
1) No need for a long-term climatology
2) Can be computed and visualized in a reliability diagram
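The decomposition translates directly into code. The sketch below computes BS, rel, res, unc, and the climatology-relative BSS = (res − rel)/unc by nearest-bin assignment; note the identity BS = rel − res + unc holds exactly only when the forecast probabilities sit on the bin values.

```python
import numpy as np

def brier_decomposition(p_fcst, obs, n_bins=11):
    """Brier score and its Murphy decomposition BS = rel - res + unc.
    Assumes binary obs (0/1) and a sample climatology strictly in (0, 1)."""
    p_fcst, obs = np.asarray(p_fcst, float), np.asarray(obs, float)
    M = p_fcst.size
    bs = np.mean((p_fcst - obs) ** 2)

    o_bar = obs.mean()                              # sample climatology
    bin_vals = np.linspace(0.0, 1.0, n_bins)        # 0.0, 0.1, ... 1.0
    idx = np.argmin(np.abs(p_fcst[:, None] - bin_vals[None, :]), axis=1)

    rel = res = 0.0
    for i in range(n_bins):
        sel = obs[idx == i]
        if sel.size:                                # N_i > 0
            rel += sel.size * (bin_vals[i] - sel.mean()) ** 2
            res += sel.size * (sel.mean() - o_bar) ** 2
    rel, res = rel / M, res / M
    unc = o_bar * (1.0 - o_bar)
    return bs, rel, res, unc, (res - rel) / unc     # last value is the BSS

# Forecasts rounded to tenths sit exactly on the bin values, so the
# identity BS = rel - res + unc holds exactly for this synthetic sample.
rng = np.random.default_rng(3)
p = np.round(rng.uniform(0.0, 1.0, 2000), 1)
o = (rng.uniform(0.0, 1.0, 2000) < p).astype(float)
print(brier_decomposition(p, o))
```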
Probabilistic Information Can Produce Substantial Economic and Public Protection Benefits
Decision theory tells us how to use probabilistic information for economic savings:
C = cost of protection
L = loss if a damaging event occurs
Decision theory says you should protect whenever the probability of occurrence is greater than C/L.

Critical event: surface winds > 50 kt
Cost (of protecting): $150K
Loss (if damage occurs): $1M
C/L = 0.15 (15%)
Deterministic vs. probabilistic forecasts for ten cases. Cost ($K) is the outcome of acting on the deterministic forecast, protecting whenever it exceeds 50 kt.

Case | Deterministic Forecast (kt) | Observation (kt) | Cost ($K) | Probabilistic Forecast
1 | 65 | 54 | 150 | 42%
2 | 58 | 63 | 150 | 71%
3 | 73 | 57 | 150 | 95%
4 | 55 | 37 | 150 | 13%
5 | 39 | 31 | 0 | 3%
6 | 31 | 55 | 1000 | 36%
7 | 62 | 71 | 150 | 85%
8 | 53 | 42 | 150 | 22%
9 | 21 | 27 | 0 | 51%
10 | 52 | 39 | 150 | 77%

Total cost (deterministic forecast): $2,050K
Decision Theory Example: cost of each forecast/outcome combination

              | Observed? YES | Observed? NO
Forecast? YES | Hit: $150K    | False alarm: $150K
Forecast? NO  | Miss: $1,000K | Correct rejection: $0K
Cost ($K) by Threshold for Protective Action

Case | 0% | 20% | 40% | 60% | 80% | 100%
1 | 150 | 150 | 150 | 1000 | 1000 | 1000
2 | 150 | 150 | 150 | 150 | 1000 | 1000
3 | 150 | 150 | 150 | 150 | 150 | 1000
4 | 150 | 0 | 0 | 0 | 0 | 0
5 | 150 | 0 | 0 | 0 | 0 | 0
6 | 150 | 150 | 1000 | 1000 | 1000 | 1000
7 | 150 | 150 | 150 | 150 | 150 | 1000
8 | 150 | 150 | 0 | 0 | 0 | 0
9 | 150 | 150 | 150 | 0 | 0 | 0
10 | 150 | 150 | 150 | 150 | 0 | 0
Total | $1,500 | $1,200 | $1,900 | $2,600 | $3,300 | $5,000

Optimal threshold = 15% (= C/L); of the thresholds tabulated, 20% yields the lowest total cost, $1,200K.
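The whole cost-loss exercise can be reproduced in a few lines. The sketch below uses the ten cases from the tables above and sweeps the protection threshold; the minimum total cost appears at the lowest tested threshold at or above C/L = 15%.

```python
import numpy as np

C, L = 150, 1000   # protection cost and unprotected loss, in $K
prob = np.array([0.42, 0.71, 0.95, 0.13, 0.03, 0.36, 0.85, 0.22, 0.51, 0.77])
event = np.array([1, 1, 1, 0, 0, 1, 1, 0, 0, 0])   # observed winds > 50 kt?

def total_cost(threshold):
    """Protect whenever the forecast probability reaches the threshold;
    pay C when protecting, L when an unprotected event occurs."""
    protect = prob >= threshold
    return int(np.where(protect, C, np.where(event == 1, L, 0)).sum())

for t in [0.0, 0.15, 0.2, 0.4, 0.6, 0.8, 1.0]:
    print(f"threshold {t:4.0%}: total cost ${total_cost(t)}K")
```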
History of Probabilistic Weather
Prediction (in the U.S.)
Early Forecasting Started
Probabilistically!!!
• Early forecasters, faced with large gaps in their
young science, understood the uncertain nature of
the weather prediction process and were comfortable
with a probabilistic approach to forecasting.
• Cleveland Abbe, who organized the first forecast
group in the United States as part of the U.S. Signal
Corps, did not use the term “forecast” for his first
prediction in 1871, but rather used the term
“probabilities,” resulting in his being known as
“Old Probabilities” or “Old Probs” to the public.
“Ol’ Probs”
• Professor Cleveland Abbe issued the first public
“Weather Synopsis and Probabilities” on February 19,
1871.
• A few years later, the term “indications” was substituted
for “probabilities,” and by 1889 the term “forecasts”
received official approval (Murphy, 1997).
History of Probabilistic Prediction
• The first modern operational probabilistic
forecasts in the United States were
produced in 1965. These forecasts, for the
probability of precipitation, were produced
by human weather forecasters and thus were
subjective probabilistic predictions.
• The first objective probabilistic forecasts
were produced as part of the Model Output
Statistics (MOS) system that began in 1969.
NOTE: Model Output Statistics (MOS)
• Based on simple linear regression with 12 predictors:
Y = a0 + a1X1 + a2X2 + a3X3 + a4X4 + …
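A minimal sketch of this MOS-style regression (synthetic predictors and observations, invented for illustration): fit the coefficients a0…a12 on past cases by least squares, then apply them to a new model forecast.

```python
import numpy as np

# Synthetic MOS-style training set (invented for illustration): 500 past
# cases, each with 12 model-output predictors and an observed target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))
y = 2.0 + X @ rng.normal(size=12) + 0.5 * rng.normal(size=500)

# Fit Y = a0 + a1*X1 + ... + a12*X12 by ordinary least squares.
A = np.column_stack([np.ones(len(X)), X])           # prepend intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a0, a = coef[0], coef[1:]

# Apply the regression to a new model forecast's predictors.
new_case = rng.normal(size=12)
print("MOS forecast:", a0 + new_case @ a)
```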
Ensemble Prediction
• Ensemble prediction began at NCEP in the early
1990s. ECMWF rapidly joined the club.
• During the past decades the size and sophistication
of the NCEP and ECMWF ensemble systems have
grown considerably, with the medium-range
global ensemble system becoming an integral tool
for many forecasters.
• Also during this period, NCEP has constructed a
higher resolution, short-range ensemble system
(SREF) that uses breeding to create initial
condition variations.
Example: NCEP Global Ensemble System
• Begun in 1993 with the MRF (now GFS).
• First tried “lagged” ensembles as the basis, using runs of
various initializations verifying at the same time.
• Then used the “breeding” method to find perturbations
to the initial conditions of each ensemble member.
• Breeding adds random perturbations to an initial state,
lets them grow, rescales them back down to a small
amplitude, lets them grow again, and so on (see the toy
sketch below).
• Gives an idea of what types of perturbations are growing
rapidly in the period BEFORE the forecast.
• Does not include physics uncertainty.
• Now replaced by the Ensemble Transform Filter
approach.
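The breeding cycle is easy to demonstrate on a toy chaotic model. The sketch below is my construction, using the Lorenz (1963) system rather than a real NWP model: perturb a control run, let the difference grow, rescale it back to small amplitude, and repeat; the surviving “bred vector” points along rapidly growing directions.

```python
import numpy as np

def lorenz63_step(x, dt=0.005, s=10.0, r=28.0, b=8.0 / 3.0):
    """One forward-Euler step of the Lorenz (1963) system (the toy 'model')."""
    dx = np.array([s * (x[1] - x[0]),
                   x[0] * (r - x[2]) - x[1],
                   x[0] * x[1] - b * x[2]])
    return x + dt * dx

def breed(x0, n_cycles=40, steps=200, amp=1e-3, seed=0):
    """Toy breeding cycle: perturb, integrate control and perturbed runs,
    rescale the grown difference back to small amplitude, repeat."""
    rng = np.random.default_rng(seed)
    ctl = np.asarray(x0, float)
    pert = ctl + amp * rng.normal(size=3)
    for _ in range(n_cycles):
        for _ in range(steps):
            ctl = lorenz63_step(ctl)
            pert = lorenz63_step(pert)
        diff = pert - ctl
        diff *= amp / np.linalg.norm(diff)   # rescale, keeping direction
        pert = ctl + diff                    # re-seed the perturbed run
    return diff                              # the "bred vector"

print("bred vector:", breed([1.0, 1.0, 20.0]))
```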
NCEP Global Ensemble
• 20 members at 00, 06, 12, and 18 UTC plus
two control runs for each cycle
• 28 levels
• T190 resolution (roughly 80 km grid spacing)
• 384 hours
• Uses stochastic physics to get some physics
diversity
ECMWF Global Ensemble
• 50 members and 1 control
• 60 levels
• T399 (roughly 40 km) through 240 hours,
T255 afterwards
• Singular vector approach to creating
perturbations
• Stochastic physics
Several Nations Have Global
Ensembles Too!
• China, Canada, Japan and others!
• And there are combinations of global
ensembles like:
– TIGGE: THORPEX Interactive Grand Global
Ensemble, from ten national NWP centers
– NAEFS: North American Ensemble
Forecasting System combining U.S. and
Canadian Global Ensembles
Popular Ensemble-Based Products
• Spaghetti diagrams
• Ensemble mean charts
• Ensemble spread charts
Spread Chart
• “best guess” = high-resolution control forecast or ensemble mean
• ensemble spread = standard deviation of the members at each grid point
• Shows where the “best guess” can be trusted (i.e., areas of low or high predictability)
• Details unpredictable aspects of waves: amplitude vs. phase
[Figure: Global Forecast System (GFS) ensemble spread chart; see http://www.cdc.noaa.gov/map/images/ens/ens.html]
Meteograms Versus “Plume Plots”
[Figure: plume plot of 1000/500-hPa geopotential thickness (m) at Yokosuka, initial DTG 00Z 28 JAN 1999; each ensemble member traced over forecast days 0–10.]
• Data range = meteogram-type trace of each ensemble member’s raw output.
• An excellent tool for point forecasting, if calibrated.
• Can easily (and should) calibrate for model bias.
• Calibrating for ensemble spread problems is difficult.
• Must use box-and-whisker or confidence-interval plots for large ensembles.
FNMOC Ensemble Forecast System: Box-and-Whisker Plots
[Figure: NAEFS box-and-whisker meteograms; the gray shaded area is the 90% confidence interval (CI). Source: http://www.weatheroffice.gc.ca/ensemble/index_naefs_e.html]
AFWA Forecast Multimeteogram
[Figure: multimeteogram for Misawa AB, Japan (RWY 100/280); JME cycle 11 Nov 06, 18Z; 15-km resolution. Shows wind speed (kt) with the ensemble mean, extreme max/min, and 90% CI, plus wind direction, at valid times from 11/18 through 14/06 UTC.]
Hurricane Track Forecast & Potential: Ensemble-Based Probabilities
[Figure: ensemble-based hurricane track probability product.]
Postage Stamp Plots
[Figure: SLP and winds from 13 ensemble members (1: cent, 2: eta, 3: ukmo, 4: tcwb, 5: ngps, 6: cmcg, 7: avn, 8: eta*, 9: ukmo*, 10: tcwb*, 11: ngps*, 12: cmcg*, 13: avn*) plus verification.]
– Reveals high uncertainty in storm track and intensity
– Indicates low probability of a Puget Sound wind event
A Number of Nations Are
Experimenting with Higher-Resolution Ensembles
UK Met Office MOGREPS:
– 24-km resolution
– Uses the ETKF (Ensemble Transform Kalman Filter), rather than
breeding, for initial-condition diversity
– Stochastic physics
NCEP Short-Range Ensembles
(SREF)
• Resolution of 32 km
• Out to 87 h twice a day (09 and 21 UTC
initialization)
• Uses both initial condition uncertainty
(breeding) and physics uncertainty.
• Uses the Eta and Regional Spectral Models
and recently the WRF model (21 total
members)
SREF Current System

Model | Res (km) | Levels | Members | Cloud Physics | Convection
RSM-SAS | 45 | 28 | Ctl, n, p | GFS physics | Simplified Arakawa-Schubert
RSM-RAS | 45 | 28 | n, p | GFS physics | Relaxed Arakawa-Schubert
Eta-BMJ | 32 | 60 | Ctl, n, p | Op Ferrier | Betts-Miller-Janjic
Eta-SAT | 32 | 60 | n, p | Op Ferrier | BMJ, moist profile
Eta-KF | 32 | 60 | Ctl, n, p | Op Ferrier | Kain-Fritsch
Eta-KFD | 32 | 60 | n, p | Op Ferrier | Kain-Fritsch with enhanced detrainment

PLUS:
• NMM-WRF control and 1 perturbation pair
• ARW-WRF control and 1 perturbation pair
The UW Ensemble System
• Perhaps the highest-resolution operational
ensemble systems are those running at the
University of Washington:
• UWME: 8 members at 36- and 12-km grid spacing
• UW EnKF system: 60 members at 36- and
4-km grid spacing
Calibration (Post-Processing) of Ensembles Is Essential
Calibration of Mesoscale Ensemble Systems: The Problem
• The component models of virtually all ensemble
systems have systematic biases that substantially
degrade the resulting probabilistic forecasts.
• Since different models or runs have different
systematic biases, this produces forecast variance that
DOES NOT represent true forecast uncertainty.
• Systematic bias reduces sharpness and degrades
reliability.
• Also, most ensemble systems produce forecasts that
are underdispersive: not enough variability!
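The simplest remedy is the one illustrated on the next slide: estimate each member’s mean error over a training period and subtract it. A minimal sketch with invented numbers:

```python
import numpy as np

def bias_correct(fcst_train, obs_train, fcst_new):
    """Subtract a member's mean error, estimated over a training period,
    from its new forecast."""
    bias = np.mean(np.asarray(fcst_train) - np.asarray(obs_train))
    return fcst_new - bias

# Invented example: a member that ran ~1.8 C too warm over two weeks.
rng = np.random.default_rng(2)
obs = rng.normal(10.0, 3.0, size=14)
fcst = obs + 1.8 + rng.normal(0.0, 1.0, size=14)
print(bias_correct(fcst, obs, fcst_new=15.0))   # roughly 13.2
```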
Example of Bias Correction for the UW Ensemble System
[Figure, two panels: “Uncorrected T2” and “Bias-Corrected T2.” Each shows average RMSE (°C) and bias (shaded) of 2-m temperature at 12, 24, 36, and 48 h for the eight members (plus01–plus08) and the ensemble mean.]
Skill for Probability of T2 < 0°C
[Figure: Brier Skill Score (BSS) versus forecast hour (00–48) for:
*ACMEcore: UW Basic Ensemble with bias correction
ACMEcore: UW Basic Ensemble, no bias correction
*ACMEcore+: UW Enhanced Ensemble with bias correction
ACMEcore+: UW Enhanced Ensemble without bias correction
plus the uncertainty term.]
BSS: Brier Skill Score
The Next Step: Bayesian Model Averaging
• Although bias correction is useful, it is possible to do more:
– Optimize the variance of the forecast distributions.
– Weight the various ensemble members by their previous performance.
– An effective way to do this is through Bayesian Model Averaging (BMA).
Bayesian Model Averaging
• Assumes a Gaussian (or other) PDF for each
ensemble member.
• Assumes the variance of each member is the same
(in the current version).
• Includes a simple bias correction for each
member.
• Weights each member by its performance during a
training period (we are using 25 days).
• Adds the PDFs from each member to get a total
PDF (see the sketch below).
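A sketch of the resulting predictive density. The weights, biases, and common variance below are invented placeholders; in practice they are estimated over the ~25-day training period (e.g., by the EM approach of Raftery et al. 2005).

```python
import numpy as np
from scipy import stats

def bma_pdf(x, members, weights, biases, sigma):
    """BMA predictive density: a weighted sum of Gaussians, one centered
    on each bias-corrected member, all sharing a single variance."""
    centers = members - biases                        # simple bias correction
    comp = stats.norm.pdf(x[:, None], loc=centers[None, :], scale=sigma)
    return comp @ weights                             # mix with BMA weights

# Hypothetical 5-member forecast of 2-m temperature (deg C).
members = np.array([21.3, 22.1, 20.8, 23.0, 21.7])
weights = np.array([0.30, 0.25, 0.20, 0.15, 0.10])    # sum to 1
biases  = np.array([0.4, -0.2, 0.1, 0.8, 0.0])
x = np.linspace(15.0, 28.0, 521)
pdf = bma_pdf(x, members, weights, biases, sigma=1.2)

# Read an exceedance probability off the mixture PDF numerically.
dx = x[1] - x[0]
print("P(T > 23 C) ~", pdf[x > 23.0].sum() * dx)
```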
Application of BMA: Max 2-m Temperature
[Figure: BMA applied to maximum 2-m temperature, all stations in the 12-km domain.]
Being Able to Create Reliable
and Sharp Probabilistic
Information is Only Half the
Problem!
Even more difficult will be
communication and getting
people and industries to use it.
Deterministic Nature?
• People seem to prefer deterministic
products: “tell me what is going to happen.”
• People complain that they find probabilistic
information confusing; many don’t
understand POP (probability of
precipitation).
• The media and the internet are not moving
forward very quickly on this.
National Weather Service Icons
are not effective in
communicating probabilities
And a “slight” chance of freezing
drizzle reminds one of a trip to
Antarctica
The commercial sector is no better
(Weather.com)
A great deal of research and
development is required to find
effective approaches for
communicating probabilistic
forecasts: approaches that will not
overwhelm people and that will let
them extract real value.