BAYESIAN PROCESSOR OF ENSEMBLE (BPE)

advertisement
BAYESIAN PROCESSOR OF ENSEMBLE (BPE)
By
Roman Krzysztofowicz
Professor of Systems Engineering and Professor of Statistics
University of Virginia
Presented at the
Forecast Applications Branch
NOAA / OAR / ESRL / GSD
Boulder, Colorado
11 May 2010
Acknowledgments:
Work supported by the National Science Foundation
under Grant No. ATM-0641572.
Collaboration of Zoltan Toth, Forecast Applications Branch
NOAA / OAR / ESRL / GSD.
1
CONCEPT OF BAYESIAN PROCESSOR
NWP Models
Ensemble
Forecast
Climatic Data
High Resolution
Forecast
BPE
Distribution Function
Adjusted Ensemble
1. Extracts and Fuses Information
maximizes informativeness
2. Quantifies Total Uncertainty
guarantees calibration
Estimated for: predictand, grid point, day, lead time
Versions for: binary, multi-category, continuous predictands
2
INFORMATION FUSION
Example
Predictand: 2m temp. at 12 UTC
Location:
Savannah, Georgia
Forecast time: 00 UTC
Lead time:
108 h
Data Samples for Estimation
Asymmetric samples
Two sources
W — predictand
Y — ensemble (vector)
● Climatic sample of W — long
NCEP/NCAR re-analysis
Jan. 1959 – Dec. 1998 (40 years)
Sample size: M  600 per day ( 40 y  15 d )
estimate prior density function
● Joint sample of Y, W — short
NCEP ensemble forecasts and analyses
Mar. 2007 – Feb. 2009 (2 years)
Sample size N ≃ 360 per season 2 y  180 d
estimate likelihood function
3
BAYESIAN THEORY: General
W k — predictand on day k k  1, . . . , 365
W k−l — antecedent observation on day k − l (forecast day)
Y kl — ensemble forecast (vector of estimators) for day k with lead time l days
Wk
Wk-l
prior
likelihood
posterior
Ykl
h kl wk | wk−l  — climatic l -step transition density function (prior)
f kl y kl | wk , wk−l  — conditional density function (likelihood function)
Bayes Theorem: Posterior density function
 kl wk | y kl , wk−l  
f klykl | w k , w k−l h kl w k | w k−l
 klykl | w k−l
Total Probability Law: Expected density function
 kl y kl | wk−l   

−
f kl y kl | wk , wk−l  h kl wk | wk−l  dwk
4
THEORETIC PROPERTIES OF BPE
f klykl | w k , w k−l h klw k | w k−l
 kl wk | y kl , wk−l  
 klykl | w k−l
Revision of Uncertainty: Climatic forecast, based on ensemble forecast
Fusion of Information: Two asymmetric samples
prior
W k climatic sample — long (e.g., 40 years)
Y kl joint sample
— short (e.g., 3 months)
likelihood
Convergence: As the informativeness of Y kl decreases
 kl wk | y kl , wk−l 
→
h kl wk | wk−l 
As the lead time increases
h kl wk | wk−l 
→ g k wk 
Calibration: Against the climatic density on day k (standard)
E kl wk |Y kl , W k−l   g k wk 
5
MODELING CHALLENGES
Statistical properties of meteorological variates
1. Asymmetric samples from two sources
2. Non-stationary daily time series (seasonality)
3. Non-Gaussian marginal distributions (of many forms)
4. Non-linear, Heteroscedastic dependence structures
5. Non-random ensembles
Why to worry?
Want to: maximize INFORMATIVENESS
guarantee CALIBRATION
for all events (extremes as well)
at every grid point
 for every real user
(not “average user”)
on every day of year

*BPE offers theoretical and numerical solutions
6
METHODOLOGY to Implement BPE
Comprehensive approach to post-processing
(formulated jointly with Zoltan Toth)
Mathematical models and statistical procedures
(tested partially: on temperature, precipitation
at representative points)
This talk highlights 4 components:
1. STANDARDIZATION of Time Series
2. NORMAL QUANTILE TRANSFORM (NQT) of Variate
3. SUFFICIENT STATISTICS of Ensemble
4. Basic GAUSSIAN-GAMMA BPE
7
STANDARDIZATION
Obtain time series that are:
(1) Margin-stationary
(2) Quasi-ergodic
k k  1, . . . , 365 : 40 years x 15 days = 600
m k — prior (climatic) mean
sk
Sample
Fourier
300
298
Mean m k [K]
296
294
292
290
288
286
— prior (climatic) standard deviation
6
Standard Deviation sk [K]
Climatic sample for day
Sample
Fourier
5
4
3
2
1
284
282
0
50
100
150
200
Day of Year k
250
300
350
0
0
50
100
150
200
250
300
350
Day of Year k
8
STANDARDIZATION
Standardized joint realization on day
wk−m k
w k 
sk
′
y ′j k
Original ensemble
Standardized Temperature
Temperature [°K]
5
2008
300
290
280
0
50
100
150
200
Day of Year k
y j k−m k

sk
,
j  1, . . . , 20
Standardized: (1) Margin-stationary
(2) Quasi-ergodic
Lead Time 108 h
270
k
250
300
350
Lead Time 108 h
2008
4
3
2
1
0
-1
-2
-3
-4
-5
0
50
100
150
200
250
300
350
Day of Year k
9
PROPERTIES OF ENSEMBLE
(1) Approx. Identically
Distributed:
P(Yj ≤ yj )
0.8
Lead Time 108 h
Warm Season
N = 366 × 20
1.0
0.8
P(X ≤ x )
1.0
(4) Miscalibrated:
0.6
0.4
0.6
0.4
0.2
0.2
0.0
0.0
-4
-3
-2
-1
0
1
2
3
4
5
Lead Time 108 h
Warm Season
N = 366
Analysis
All 20 Members
-4
-3
yj
-2
-1
0
1
2
3
4
5
x
(2) NOT Stochastically Independent:
0.83 < Cor (Yi, Yj) < 0.92
(3) NOT Conditionally Stochastically Independent:
0.69 < Cor (Yi, Yj | W = w) < 0.85
⇒ Ensemble is NOT a random sample
10
PROPERTIES OF ENSEMBLE
Conditional Correlation:
Informativeness Score:
CorY i , Y j | W  w
1.0
Ensemble members
Ensemble mean
Best combination of members
0.8
0.7
0.6
0.5
Max
0.3
Average
0.2
Min
0.1
Informativeness Score (IS)
0.9
0.4
1
−1
1.0
Warm Season
0.9
Conditional Correlation
IS 
2
a2
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.0
12
60
108
156
204
252
Lead Time [h]
300
348
0.0
12
60
108
156
204
252
300
348
Lead Time [h]
11
NORMAL QUANTILE TRANSFORM (NQT)
Obtain variates that have:
(1) Gaussian marginal distributions
(2) Linear and Homoscedastic dependence structure
G ′ — prior (climatic) distribution of W ′
′
K̄ ′ — marginal (initial) distribution of X
Q −1 — inverse of the standard normal distribution
V  Q −1 G ′ W ′ 
̄ ′ X ′ 
Z  Q −1 K
Theoretic implication:
Testable assumption:
V  N0, 1
Z | V  v  Nav  b,  2 
12
Distribution Functions
KUIL 12-36h Cond. Precip. Amount
N = 818
MARGINAL OF X
1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
P(X ≤ x | W > 0)
P(W ≤ w | W > 0)
PRIOR (CLIMATIC) OF W
Cool
0.6
0.5
0.4
N = 470
0.6
0.5
0.4
0.3
0.3
Weibull
α = 0.592, β = 0.880
0.2
0.1
Weibull
α1 = 9.603, β1 = 0.910
0.2
0.1
0.0
0.0
0
10
20
30
40
50
60
70
80
90 100 110 120
Precip. Amount w [mm]
0
10
20
30
40
50
60
70
80
90 100 110 120
24-h Total Precip. x [mm]
13
Likelihood Dependence Structure
125
125
ρ = 0.613
100
100
24-h Total Precip. x [mm]
24-h Total Precip. x [mm]
Original
Non-Gaussian
Non-linear
Heteroscedastic
75
50
25
75
conditional
median
50
25
0
0
0
25
50
75
100
125
0
25
75
100
125
Precip. Amount w [mm]
3.5
3.5
2.5
2.5
1.5
1.5
_
z = Q-1(K(x))
_
Transformed
Gaussian
Linear
Homoscedastic
z = Q-1(K(x))
Precip. Amount w [mm]
50
0.5
-0.5
conditional
mean
γ = 0.631
0.5
-0.5
-1.5
-1.5
-2.5
-2.5
-3.5
-3.5
-3.5
-3.5
a = 0.681
b = 0.028
σ = 0.829
-2.5
-1.5
-0.5
0.5
v = Q-1(G(w))
1.5
2.5
3.5
-2.5
-1.5
-0.5
0.5
v = Q-1(G(w))
1.5
2.5
3.5
14
SUFFICIENT STATISTICS OF ENSEMBLE
Reduce dimension without loss of information
W – predictand
Y – ensemble (vector of estimators)
Y
Function

Y  Y1 , . . . , YJ 
X, T
X – predictor of central tendency of W
T – predictor of uncertainty about W
Experimental Task: Find
(e.g., ensemble mean)
(e.g., ensemble range)
 such that for every x, t  y
and
w
w|x, t  w|y
 X, T and Y are equally informative
15
Basic BPE
Empirical Findings ( ≡ Assumptions)
X is stochastically dependent on T, W
T is stochastically independent of W
Basic Processor
fx | t,w gw
w | x, t 
x | t
x | t  

−
fx | t, w gw dw
Parametric Models: Conjugate families of distributions
Gaussian-Gamma
e.g., W — temperature
Meta-Gaussian-Gamma (After NQT)
e.g., W — precip. amount
16
X|W  w
Step 1: GAUSSIAN MODEL of_________
Predictive Regression
x  x 1 , . . . , x I 
EW |x  ∑ i1 c i x i  c 0
I
Aggregate Predictor
X  ∑ i1 c i X i
I
Dependence Structure
X  aw  b  Θ
Θ independent of W, Gaussian
EΘ  0
EX|w  aw  b
VarX|w   2
VarΘ   2
Conditional density (Intermediate result)
f 0 x|w ≃ Naw  b,  2 
−  x  
−  w  
17
VALIDATION OF GAUSSIAN MODEL of (X |W = w)
Linear
5
3
x = ens. mean
Lead Time 108 h
Warm Season
N = 366
3
2
1
1
θ
2
0
0
E(X | w) = aw + b
-1
Var(X | w) = σ
a = 0.60
b = 0.49
σ = 0.70
IS = 0.43
2
-2
-3
-4
-5
-4
-3
-2
x = ens. mean
Lead Time 108 h
Warm Season
N = 366
4
-1
0
1
2
3
4
-1
-2
-3
5
-5
w
-4
-3
-2
-1
0
1
2
3
4
5
w
Gaussian
x = ens. mean
Lead Time 108 h
Warm Season
N = 366
1.0
0.8
P( Θ ≤ θ )
x = ensemble mean
4
Homoscedastic
0.6
0.4
N(0, σ 2)
σ = 0.70
0.2
0.0
-3
-2
-1
0
1
θ
2
3
4
18
X|T  t, W  w
Step 2: GAUSSIAN–GAMMA MODEL of___________
Dependence Structure
X  aw  b  Θ
EX|t, w  aw  b
VarX|t, w  v 2 t
Θ independent of W, Gaussian
Θ dependent on T
EΘ|t  0
VarΘ|t  v 2 t
Distribution of Uncertainty Predictor
1
T
 Gamma, 
, scale   0, shape   1
Unit Variance
v 2   − 1 2
Conditional density (final result)
fx | t, w ≃ Naw  b, v 2 t
−  x  
−  w  
0t
Expected density (for validation)
x | t ≃ NaM  b, a 2 S 2  v 2 t
19
VALIDATION OF GAMMA MODEL of T
Gamma
Linear
7
t = range
Lead Time 108 h
Warm Season
N = 366
6
E(Θ2|t) = ν2t
9
1.0
8
0.6
t = range
Lead Time 108 h
Warm Season
N = 366
0.4
θ2
P(1/Τ ≤1/ t )
0.8
0.0
= 0.34
LSQ regression
4
Cor(Θ2,Τ ) = 0.25
3
Gamma (α, β )
α = 0.38
β = 2.80
0.2
ν2 = α (β−1)σ2
5
2
1
0
0
1
2
3
4
5
6
7
8
0
1
2
1/ t
Independent
4
5
Constant - Heteroscedastic
4
Cor(Τ, W) = -0.19
3
2
1
t = range
Lead Time 108 h
Warm Season
N = 366
6
5
x = ensemble mean
t = range
Lead Time 108 h
Warm Season
N = 366
5
t = range
3
t
4
3
2
E(X | t ) = aM + b
1
0
1 Std. Dev.
-1
2 Std. Dev.
-2
-3
-4
0
-5
-4
-3
-2
-1
-0
w
1
2
3
4
5
0
1
2
3
t
4
5
20
POSTERIOR DENSITY FUNCTION
w|x, t ≃ NEW |x, t, VarW |x, t
Posterior Mean
2
Mv 2 t − abS 2
aS
EW |x, t  2 2 2 x  2 2 2
a S v t
a S v t
Posterior Variance
S2v 2t
VarW |x, t  2 2 2
a S v t
−  w  
−  x  
0t
linear in x
non-linear in t
non-linear in
t
COMBINING PREDICTORS (FORECASTS, MODELS)
Aggregate Predictors
x  ∑ i1 c i x i
I
t  t 1 , . . . , t I 
21
EW |X  x, T  t
POSTERIOR MEAN_______________
w — temperature [°F], x — ensemble mean [°F], t — ensemble range [°F]
↓ t * = σ 2/ν2 = 10.6
110
90
Ensemble Mean x [°F]
100
80
90
70
80
60
70
60
E(W | x, t) = 50
←
x = aM + b
= 54
50
40
40
30
30
20
20
10
10
0
0
2
4
6
8
10
12
14
16
18
20
Ensemble Range t [°F]
22
Var 1/2 W |X  x, T  t
POSTERIOR STD. DEVIATION_________________
w — temperature [°F], x — ensemble mean [°F], t — ensemble range [°F]
S= 3
Var1/2(W | x,t)
|a|/ν = 0.5
2
1
2
1
0
0
2
4
6
8
10
12
14
Ensemble Range t [°F]
16
18
20
23
FORECAST for Savannah, GA, 1 May 2008, 00 UTC
For: 5 May 2008, 12 UTC
Ensemble Forecast:
mean
66.3 °F
std. dev. 1.91 °F
x  −0. 261
t  2. 057
1
0.98
0.92
0.77
P(W ≤ w)
Sufficient Statistics:
LT: 108 h [day 5]
Ensemble
N(66.3, 3.65)
Prior
N(67.4, 16.92)
Posterior
N(65.6, 11.13)
Actual Temperature
70.4
0.5
0
55
60
65
70
75
80
Temperature w [°F]
24
BPE — Outputs
Input:
Model ensemble (for predictand, grid point, lead time)
Output: (1) Posterior
density function
(2) Posterior distribution function
(3) Posterior quantile function
(4) Posterior ensemble
(calibrated ensemble)
Each member is mapped into a posterior quantile
via the inverse of the posterior distribution function
Usage
(1) Solve a decision problem analytically
(2) Find probability of non-exceedance of given threshold
(3) Find quantile corresponding to given non-exceedance probability
(4) Simulate operation of a dynamic system
25
BPE — Probabilistic Forecast
Prior d.f.
w | x, t
Posterior D.F.
gw
Posterior d.f. w | x, t
Model ensemble
○
Posterior ensemble
1.0
1.0
0.9
0.9
p(3)
Density
0.8
0.8
0.7
0.7
0.6
0.6
p(2)
0.5
0.5
0.4
0.4
0.3
0.3
0.2
p(1)
0.1
0.2
0.1
0.0
0
10
20
30
40
50
60
70
80
90 100 110 120
0.0
0
10
20
30
40
50
60
70
80
90 100 110 120
y(1)
Precip. Amount w [mm]
w(1)
w(2)
y(2) y(3)
w(3)
26
BPE — Basic Properties
Theoretically-based optimal fusion of ensemble forecast with climatic data
Revises prior (climatic) distribution given ensemble forecast
based on comparison of past forecasts with observations
1. CORRECT THEORETIC STRUCTURE
• Always valid
• Modular: Framework for different – modeling assumptions
– estimation procedures
2. FLEXIBLE ANALYTIC MODELS
•
•
•
•
Handle distributions of any form (not only normal)
Handle non-linear, heteroscedastic dependence structures
Parametric (easy to estimate and manipulate)
Robust when joint sample is small (lesser need for “freeze” or “re-forecast ”)
3. UNIQUE PERFORMANCE ATTRIBUTES
• Removes bias in distribution
• Guarantees calibration of the adjusted ensemble
• Stable calibration (against climatic distribution | antecedent, regime)
• Stationary calibration (equally good for all lead times)
• User-specific calibration (point-specific, time-specific)
• When predictability vanishes:
climatic ensemble
adjusted ensemble
• Preserves spatial / temporal / cross-variate rank correlations in ensemble
BAYESIAN FORECASTING THEORY
“Why Should a Forecaster and a Decision Maker Use
Bayes’ Theorem?” R. Krzysztofowicz
Water Resources Research, 19(2), 327–336, 1983.
•
•
•
“Probabilistic Forecasts from the National Digital Forecast
Data Base”, R. Krzysztofowicz and W.B. Evans,
Weather and Forecasting, 23(2), 270–289, 2008.
“The Role of Climatic Autocorrelation in Probabilistic
Forecasting”, R. Krzysztofowicz and W.B. Evans,
Monthly Weather Review, 136(12), 4572–4592, 2008.
28
* * *
Some Details of BPE
29
ENSEMBLE PROCESSING: Challenges & Solutions
W – predictand
Y – ensemble (vector of estimators) Y  Y 1 , . . . , Y J 
1. Samples are asymmetric
• Climatic sample of W – long
NCEP / NCAR re-analysis ~40 years, 2.5 x 2.5 grid
• Joint sample of Y, W – short
Recent ensemble forecasts and observations ~ 90 days
* BPE: prior distribution, likelihood function
Bayesian fusion
2. Time series of Y, W are non-stationary (seasonality)
* “Standardize” using climatic statistics for the day
Margin-Stationary
Quasi-Ergodic
30
ENSEMBLE PROCESSING: Challenges & Solutions
3. Ensemble members are:
• not independent
0. 82  RankCorYi , Yj   0. 93 i ≠ j ,
j  1, . . . , 20
• not conditionally independent
0. 68  RankCorYi , Yj | W  0. 86
warm,
108 h
* BPE: models dependence (meta-Gaussian likelihood function)
4. Ensemble members have
• different informativeness
0. 31  IS j  0. 42
j  1, . . . , 20
• diminishing marginal informativeness
ISY14   0. 411
ISY14 , Y4   0. 440
* BPE: Sufficient Statistic: X  Y
ensemble Y
dimension 20
X
statistic
dimension 1–7
• X is as informative as Y
• fy|w is replaced with fx|w
ISY14 , Y4 , Y16   0. 447
(NCEP, 2008)
31
ENSEMBLE PROCESSING: Challenges & Solutions
5. Distributions of W and X i have many forms (non-Gaussian)
* BPE: allows any form of the distribution (meta-Gaussian model)
• Parametric distribution of each variate (2–4 parameters)
• Library of 43 parametric distributions
• Automatic estimation and selection
6. Dependence structure between X and W is
• non-linear (in mean)
• heteroscedastic (in variance)
* BPE: models structure (meta-Gaussian likelihood function)
• Normal Quantile Transform (NQT)
• Each variate transformed into a standard normal
• Multiple linear regression
32
VALIDATION OF GAUSSIAN MODEL of (X |W = w)
Linear
5
3
x = ens. mean
Lead Time 60 h
Warm Season
N = 368
3
2
1
1
θ
2
0
0
E(X | w) = aw + b
-1
Var(X | w) = σ
a = 0.69
b = 0.47
σ = 0.65
IS = 0.54
2
-2
-3
-4
-5
-4
-3
x = ens. mean
Lead Time 60 h
Warm Season
N = 368
4
-2
-1
0
1
2
3
4
-1
-2
-3
5
-5
w
-4
-3
-2
-1
0
1
2
3
4
5
w
Gaussian
x = ens. mean
Lead Time 60 h
Warm Season
N = 368
1.0
0.8
P(Θ ≤ θ )
x = ensemble mean
4
Homoscedastic
0.6
0.4
N(0, σ 2)
σ = 0.65
0.2
0.0
-3
-2
-1
0
1
θ
2
3
4
33
VALIDATION OF GAMMA MODEL of T
Gamma
Linear
1.0
12
0.8
10
E(Θ2|t) = ν2t
0.6
t = range
Lead Time 60 h
Warm Season
N = 368
0.4
8
θ2
P(1/Τ ≤1/ t )
t = range
Lead Time 60 h
Warm Season
N = 368
0.0
= 0.50
LSQ regression
6
Cor(Θ2,Τ ) = 0.29
4
Gamma (α, β )
α = 0.60
β = 3.00
0.2
ν2 = α (β−1)σ2
2
0
0
1
2
3
4
5
6
7
8
9
0.0
0.5
1.0
1.5
1/ t
2.5
3.0
3.5
4.0
t
Independent
Constant - Heteroscedastic
2.5
Cor(Τ, W) = -0.08
2.0
1.5
1.0
0.5
t = range
Lead Time 60 h
Warm Season
N = 368
6
5
x = ensemble mean
t = range
Lead Time 60 h
Warm Season
N = 368
3.0
t = range
2.0
4
3
2
E(X | t ) = aM + b
1
0
1 Std. Dev.
-1
2 Std. Dev.
-2
-3
-4
0.0
-5
-4
-3
-2
-1
0
w
1
2
3
4
5
0.0
0.5
1.0
1.5
t
2.0
2.5
3.0
34
VALIDATION OF GAUSSIAN MODEL of (X |W = w)
Linear
5
3
x = ens. mean
Lead Time 252 h
Cool Season
N = 334
3
2
1
1
θ
2
0
0
E(X | w) = aw + b
-1
Var(X | w) = σ
a = 0.27
b = 0.25
σ = 0.62
IS = 0.16
2
-2
-3
-4
-5
-4
-3
-2
x = ens. mean
Lead Time 252 h
Cool Season
N = 334
4
-1
0
1
2
3
4
-1
-2
-3
5
-5
w
-4
-3
-2
-1
0
1
2
3
4
5
w
Gaussian
x = ens. mean
Lead Time 252 h
Cool Season
N = 334
1.0
0.8
P(Θ ≤ θ )
x = ensemble mean
4
Homoscedastic
0.6
0.4
N(0, σ 2)
σ = 0.62
0.2
0.0
-3
-2
-1
0
1
θ
2
3
4
35
VALIDATION OF GAMMA MODEL of T
Gamma
Linear
t = r (ck – 1)
Lead Time 252 h
Cool Season
N = 334
7
1.0
6
0.8
0.6
t = r (ck – 1)
Lead Time 252 h
Cool Season
N = 334
0.4
θ2
P(1/Τ ≤1/ t )
5
0.0
ν2 = α (β−1)σ2
4
= 0.06
LSQ regression
3
Cor(Θ2,Τ ) = 0.32
2
Gamma (α, β )
α = 0.14
β = 2.07
0.2
E(Θ2|t) = ν2t
1
0
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
2.0
0
10
20
1/ t
Independent
40
Constant - Heteroscedastic
Cor(Τ, W) = -0.05
20
10
t = r (ck – 1)
Lead Time 252 h
Cool Season
N = 334
6
5
x = ensemble mean
t = r (ck – 1)
Lead Time 252 h
Cool Season
N = 334
30
t = r (ck – 1)
30
t
4
3
2
1
E(X | t ) = aM + b
0
1 Std. Dev.
-1
2 Std. Dev.
-2
-3
-4
0
-5
-4
-3
-2
-1
0
w
1
2
3
4
5
0
10
20
t
30
36
Download