BAYESIAN PROCESSOR OF ENSEMBLE (BPE): CONCEPT AND DEVELOPMENT By Roman Krzysztofowicz University of Virginia Presented at the 19th Conference on Probability and Statistics in the Atmospheric Sciences New Orleans, Louisiana 20–24 January 2008 Acknowledgments: Work supported by the National Science Foundation under Grant No. ATM0641572. Collaboration of Zoltan Toth, Environmental Modeling Center NOAA / NWS / NCEP. NEW STATISTICAL TECHNIQUES for PROBABILISTIC WEATHER FORECASTING Techniques Bayesian Processor of Output (BPO) Bayesian Processor of Ensemble (BPE) NWP Model Ensemble Climatic Data BPE Distribution Function Adjusted Ensemble • extracts and fuses information • quantifies total uncertainty • calibrates (de-biases) ensemble Versions for binary predictands multi-category predictands continuous predictands ENSEMBLE PROCESSING: Challenges & Solutions W – predictand Y – ensemble (vector of estimators) Y Y0 , Y1 , . . . , YJ 1. Samples are asymmetric • Climatic sample of W – long NCEP / NCAR re-analysis ~40 years, 2.5 x 2.5 grid • Joint sample of Y, W – short Recent ensemble forecasts and observations ~ 90 days * BPE: prior distribution, likelihood function Bayesian fusion 2. Time series of Y, W are non-stationary (seasonality) * “Standardize” using climatic statistics for the day Stationary (in marginal distributions) (?) Ergodic * “Homogenize” across variates using ensemble statistics ENSEMBLE PROCESSING: Challenges & Solutions 3. Ensemble members are: • not independent 0. 42 Rank CorY´i , Y´j 0. 80 i j, j 0, 1, . . . , 10 • not conditionally independent fy ´i , y ´j |w f i y ´i |wf j y ´j |w * BPE: models dependence (meta-Gaussian likelihood function) 4. Ensemble members have • very different informativeness 0. 259 IS j 0. 53 j 0, 1, . . . , 10 • diminishing marginal informativeness ISY0 0. 5259 ISY0 , Y2 0.563, * BPE: Sufficient Statistic: X TY´ ensemble Y´ dimension 22 X statistic dimension 2–5 • X is as informative as Y´ • fy ´ |w is replaced with fx|w ISY0 , Y2 , Y8 0. 575 (NCEP, 2008) ENSEMBLE PROCESSING: Challenges & Solutions 5. Distributions of W and X i have many forms (non-Gaussian) * BPE: allows any form of the distribution (meta-Gaussian model) • Parametric distribution of each variate (2–3 parameters) • Library of 43 parametric distributions • Automatic estimation and selection 6. Dependence structure between X and W is • non-linear (in mean) • heteroscedastic (in variance) * BPE: models structure (meta-Gaussian likelihood function) • Normal Quantile Transform (NQT) • Each variate transformed into a standard normal • Multiple linear regression THEORY for CONTINUOUS PREDICTAND Variates W – predictand X – vector of predictors, X X 1 , . . . , X I Bayesian Theory gw fx|w prior (climatic) conditional (likelihood) x fx|w gw dw w|x In real time, x is given; write fx|w gw w x w Fusion • Two sources • Asymmetric samples: climatic sample of W – long joint sample of X, W – short FORECASTING EQUATIONS • Posterior Distribution Function w Q 1 T I Q1 Gw c i Q1 K i x i c 0 i1 • Posterior Density Function exp 12 Q1 w 2 w 1 T exp 1 Q1 Gw 2 2 gw • Posterior Quantile I w p G 1 Q c i Q1 K i x i c 0 TQ1 p i1 0p1 p 0. 1, 0. 25, 0. 5, 0. 75, 0. 9 Distribution Functions KUIL 12-36h Cond. Precip. Amount MARGINAL OF X N = 818 1.0 1.0 0.9 0.9 0.8 0.8 0.7 0.7 P(X ≤ x | W > 0) P(W ≤ w | W > 0) PRIOR (CLIMATIC) OF W Cool 0.6 0.5 0.4 N = 470 0.6 0.5 0.4 0.3 0.3 Weibull α = 0.592, β = 0.880 0.2 0.1 Weibull α1 = 9.603, β1 = 0.910 0.2 0.1 0.0 0.0 0 10 20 30 40 50 60 70 80 90 100 110 120 Precip. Amount w [mm] 0 10 20 30 40 50 60 70 80 90 100 110 120 24-h Total Precip. x [mm] Likelihood Dependence Structure 125 125 ρ = 0.613 100 24-h Total Precip. x [mm] 24-h Total Precip. x [mm] 100 75 50 25 75 50 25 0 0 0 25 50 75 100 125 0 25 75 100 125 Precip. Amount w [mm] 3.5 3.5 2.5 2.5 1.5 1.5 _ z = Q-1(K(x)) _ z = Q-1(K(x)) Precip. Amount w [mm] 50 0.5 -0.5 γ = 0.631 0.5 -0.5 -1.5 -1.5 -2.5 -2.5 -3.5 -3.5 -3.5 -3.5 a = 0.681 b = 0.028 σ = 0.829 -2.5 -1.5 -0.5 0.5 v = Q-1(G(w)) 1.5 2.5 3.5 -2.5 -1.5 -0.5 0.5 v = Q-1(G(w)) 1.5 2.5 3.5 Probabilistic Forecast Cool 1.0 1.0 0.9 0.9 0.8 0.8 0.7 0.7 Conditional Density P(W ≤ w | W > 0) KUIL 12–36h Cond. Precip. Amount 0.6 0.5 0.4 0.6 Prior Posterior 0.5 0.4 0.3 0.3 0.2 0.2 0.1 0.1 0.0 0.0 0 10 20 30 40 50 60 70 80 90 100 110 120 Precip. Amount w [mm] 0 10 20 30 40 50 60 70 80 90 100 110 120 Precip. Amount w [mm] BPE — Outputs Input: Ensemble forecast of a predictand Posterior distribution function (continuous cdf) (2) Posterior density function (continuous pdf) (3) Adjusted ensemble Output: (1) Each member is mapped into a posterior quantile via the inverse of the posterior distribution function (4) Probability of non-exceedance for each member This probability is identical for all predictands USAGE • Given (3) and (4), the user can construct a discrete approximation to the posterior cdf • Given (1), any quantile can be calculated (0.1, 0.5, 0.9 for NDGD)