MWRreforecast_figs

Figure 1: Previous day’s temperature (persistence) used as a forecast of 24hr
temperature, Salt Lake City airport data, 1979 to 2001. Red line is the fit for the central
tendency (mean) using standard linear regression; middle black line is the fit of the
median (0.5 quantile); upper black line is the 0.9 quantile; lower black line is the 0.1
quantile. Note that the median and mean fits are similar but diverge noticeably at higher
temperatures. Note also the heteroscedastic behavior of the persistence fit, seen in the
convergence of the 0.1 and 0.9 quantile lines at higher temperatures.
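The difference between a mean fit and a quantile fit in Figure 1 comes from the loss being minimized. The following is a minimal sketch, not the fitting code used for the figure: it assumes a constant predictor instead of a regression line, and the function names and sample values are hypothetical. It shows that minimizing the quantile ("pinball") loss at the 0.5 level recovers the median, which is far less sensitive to an outlier than the mean.

```python
def pinball_loss(residual, tau):
    """Quantile (pinball) loss for one residual = observation - prediction."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def mean_pinball(data, prediction, tau):
    """Average pinball loss of a constant prediction over the data."""
    return sum(pinball_loss(y - prediction, tau) for y in data) / len(data)


def best_constant(data, tau):
    """Brute-force search over the data values for the constant that
    minimizes the mean pinball loss (illustration only)."""
    return min(sorted(data), key=lambda c: mean_pinball(data, c, tau))


# Hypothetical sample with one large outlier.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(sample, 0.5)   # minimizer of the 0.5-quantile loss
mean_fit = sum(sample) / len(sample)      # minimizer of the squared-error loss
print(median_fit, mean_fit)
```

In a full quantile regression the same loss would be minimized over the slope and intercept of a line, once per quantile level (0.1, 0.5, 0.9).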
Figure 2: Time-series of the daily uncalibrated 15-member ensemble temperature
forecasts (colors) versus the observation (black) at station KSLC over the period of
1990-2001 for: a) 24hr lead-time January forecasts; b) 24hr July; c) 360hr January; d) 360hr
July. Note the strong underbias of the forecasts for both seasons and lead-times. (Red
oval in panel b discussed in text and Figure 8.)
Figure 3: Rank histograms for the same data shown in Figure 2, but for the complete
data set (period of 1979-2001) although sub-sampled to remove temporal autocorrelations
(see text). Red dotted lines show 95% confidence limits for a perfectly calibrated
forecast. Note the strong underbias of the forecasts for both seasons and lead-times.
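As a rough illustration of how a rank histogram like those in Figure 3 is built, the sketch below counts, for each forecast, how many ensemble members fall below the observation. The function name and sample values are hypothetical, and ties between members and the observation are ignored for simplicity.

```python
def rank_histogram(ensembles, observations):
    """Count how often the observation falls in each of the
    n_members + 1 intervals defined by the sorted ensemble members."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        # Rank = number of members strictly below the observation.
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


# Hypothetical 3-member ensemble that is biased low relative to the observations.
ensembles = [[1.0, 2.0, 3.0]] * 4
observations = [3.5, 4.0, 2.5, 5.0]
counts = rank_histogram(ensembles, observations)
print(counts)
```

An ensemble that is biased low piles the observations into the highest interval, while a well-calibrated ensemble gives roughly equal counts in every interval.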
Figure 4: Schematic of the logistic regression (LR) ensemble fitting procedure: step 1 –
prescribe the climatological temperature thresholds at which probabilities are estimated
(99 chosen); step 2 – fit the LR model and generate out-of-sample conditional probabilities
(CDF) of being less than or equal to each threshold; step 3 – use the CDF to estimate (by
linear interpolation) an evenly-spaced 15-member ensemble for each day and each lead-time.
The final result is a “sharper” posterior forecast PDF than the climatological prior, which
is used here as an independent regressor set in the QR procedure.
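Step 3 of the procedure in Figure 4 can be sketched as inverting a piecewise-linear CDF. This is a minimal sketch under stated assumptions: the probability levels k/(n+1) are one possible reading of "evenly-spaced", the CDF is assumed to be non-decreasing, and the function name and example values are hypothetical.

```python
def ensemble_from_cdf(thresholds, cdf, n_members=15):
    """Linearly interpolate a threshold-based CDF to get ensemble members
    at the evenly spaced probability levels k / (n_members + 1)."""
    members = []
    for k in range(1, n_members + 1):
        p = k / (n_members + 1)
        if p <= cdf[0]:            # below the first threshold: clamp
            members.append(thresholds[0])
            continue
        if p >= cdf[-1]:           # above the last threshold: clamp
            members.append(thresholds[-1])
            continue
        for i in range(1, len(cdf)):
            if cdf[i] >= p:        # first segment that brackets p
                frac = (p - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
                step = thresholds[i] - thresholds[i - 1]
                members.append(thresholds[i - 1] + frac * step)
                break
    return members


# Hypothetical CDF on three thresholds (the caption uses 99).
thresholds = [0.0, 10.0, 20.0]
cdf = [0.0, 0.5, 1.0]
members = ensemble_from_cdf(thresholds, cdf, n_members=3)
print(members)
```

With 15 members the levels are 1/16, 2/16, ..., 15/16, which matches the 16 intervals of a 15-member rank histogram.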
Figure 5: Schematic of the QR post-processing procedure. See text for details.
Figure 6: Same as Figure 2, but for the spread-interval-post-processed time-series. See
text for details. (Red oval in panel b discussed in text and Figure 8.)
Figure 7: Rank histograms of July 24-hr lead-time 15-member (16 interval)
postprocessed ensemble using logistic regression (LR) and 2mo training periods (panel
a), dispersion-selected quantile regression (QR) and 2mo training periods (panel b), LR
and 22mo training periods (panel c), and QR and 22mo training periods (panel d). Red
dotted lines show 95% confidence limits for a perfectly calibrated forecast (upper line in
panels b and c not shown).
Figure 8: A kernel fitting creates a PDF out of the original uncalibrated (black line) and
calibrated (blue) 24-hr ensemble forecast for one day (July 3, 1995), as highlighted by the
red oval in Figure 6 panel b. Comparison to the observation (red) shows the bias shift and
increase in dispersion that calibration performs. Also shown is the tail of the
climatological PDF (dashed), showing the forecasts for this anomalously cold event are
significantly sharper than climatology. Is the non-Gaussian behavior of the calibrated
forecast PDF consistent across other forecasts?
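The kernel fitting behind Figure 8 can be illustrated with a simple kernel density estimate. The caption does not state the kernel or bandwidth, so the Gaussian kernel, the bandwidth of 1.0, and the member values below are all assumptions made for this sketch.

```python
import math


def gaussian_kde(members, bandwidth):
    """Return a PDF built by placing one Gaussian kernel on each member."""
    norm = 1.0 / (len(members) * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - m) / bandwidth) ** 2) for m in members
        )

    return pdf


# Hypothetical ensemble member temperatures for one forecast day.
members = [14.0, 15.0, 15.5, 16.0, 18.0]
pdf = gaussian_kde(members, bandwidth=1.0)

# Crude numerical check that the estimated PDF integrates to about 1.
dx = 0.01
grid = [5.0 + dx * i for i in range(2201)]   # 5.0 to 27.0
area = sum(pdf(x) for x in grid) * dx
print(round(area, 3))
```

The estimated PDF is a sum of kernels, so it is not constrained to be Gaussian; asymmetry or multiple modes in the members carry through to the PDF.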
Figs:
-- add obs noise to raw-forecast rank histograms only (say, 0.5 deg), commenting on trying
to inflate dispersion (don’t do for postprocessed forecasts since calibration already
accounts for the need for additional obs spread)
-- convert rank histograms to skill scores w/ error bars, including error bars for
perfect and no-skill forecasts, then define “potential signal to noise” as
(perfect - no-skill) / 95% confidence limits of the perfect forecast; point out this should
be >> 1 to be useful; then 1) generate SS w/ error bars; 2) compare PDFs of perfect
forecasts w/ calibrated forecasts using ROC; and 3) KS tests
-- 2D skill plots with error bars – jan only, 2-yr & 22-yr on same plot: RMSE, brier
(jan, lower 10%), RPSS, ROC
-- 3D skill (training window vs fcst lead vs SS => gray scale (white being 0%, black
100%)) score plots using persistence/climatology/raw (whichever is stricter) as
reference (state when each is used): raw, LR, QR, SS-QR for winter/summer for SS:
RMSE, brier 10% (jan), 90% (july), RPSS, ROC, myscore (drop ones that are similar,
and just describe);
-- redo, but using raw ensemble as ref
-- utility of usage for regressors (grouping all quantiles); bar plots by season and
lead-time (1-day, 5-day, 10-day, 15-day) max window – then do same but using SS
gray-scale plot format