Application of Forecast Verification Science to Operational River Forecasting in the National Weather Service
Julie Demargne, James Brown, Yuqiong Liu and D-J Seo
UCAR
NROW, November 4-5, 2009
Approach to river forecasting
[Diagram: observations, input forecasts, and forecasters feed the hydrologic models (e.g. the Sacramento soil-moisture accounting model, with tension and free water storages in the upper and lower zones, evapotranspiration, infiltration, percolation, interflow, surface runoff, direct runoff, baseflow, and subsurface outflow), which produce forecast products for users.]
Where is the …?
In the past
• Limited verification of hydrologic forecasts
• How good are the forecasts for application X?
Where is the …?
Now
• Verification papers
• Verification experts
• Verification products
• Verification systems
Hydrologic forecasting: a multi-scale problem
• National
• Forecast group
• Major river system
• River basin with river forecast points
• Headwater basin with radar rainfall grid
• High-resolution flash flood basins
Hydrologic forecasts must be verified consistently across all spatial scales and resolutions.
Hydrologic forecasting: a multi-scale problem
[Diagram: forecast uncertainty grows with lead time, from minutes and hours (protection of life and property) through days and weeks (reservoir control, flood mitigation and navigation, hydropower) to months, seasons, and years (recreation, agriculture, ecosystem, state/local planning, health, environment, commerce).]
Seamless probabilistic water forecasts are required for all lead times and all users; so is verification information.
Need for hydrologic forecast verification
• In 2006, the NRC recommended that the NWS expand verification of its uncertainty products and make it easily available to all users in near real time
• Users decide whether to take action using risk-based decisions
• Users must be educated on how to interpret forecast and verification information
River forecast verification service
http://www.nws.noaa.gov/oh/rfcdev/docs/Final_Verification_Report.pdf
http://www.nws.noaa.gov/oh/rfcdev/docs/NWS-Hydrologic-Forecast-VerificationTeam_Final-report_Sep09.pdf.pdf
River forecast verification service
• To help us answer:
◦ How good are the forecasts for application X?
◦ What are the strengths and weaknesses of the forecasts?
◦ What are the sources of error and uncertainty in the forecasts?
◦ How are new science and technology improving the forecasts and the verifying observations?
◦ What should be done to improve the forecasts?
◦ Do forecasts help users in their decision making?
River forecast verification service
[Diagram: verification systems sit alongside the river forecasting system, ingesting the observations, input forecasts, model outputs, and forecast products to generate verification products for forecasters and users.]
River forecast verification service
• Verification Service within the Community Hydrologic Prediction System (CHPS) to:
◦ Compute metrics
◦ Display data & metrics
◦ Disseminate data & metrics
◦ Provide real-time access to metrics
◦ Analyze uncertainty and error in forecasts
◦ Track performance
Verification challenges
• Verification is useful if the information generated leads to decisions about the forecast/system being verified
◦ Verification needs to be user oriented
• No single verification measure provides complete information about the quality of a forecast product
◦ Several verification metrics and products are needed
• To facilitate communication of forecast quality, common verification practices and products are needed across weather, climate, and water forecasts
◦ Collaborations between the meteorology and hydrology communities are needed (e.g., Thorpex-Hydro, HEPEX)
Verification challenges: two classes of verification
• Diagnostic verification:
◦ to diagnose and improve model performance
◦ done off-line with archived forecasts or hindcasts to analyze forecast quality relative to different conditions/processes
• Real-time verification:
◦ to help forecasters and users make decisions in real time
◦ done in real time (before the verifying observation occurs) using information from historical analogs and/or past forecasts and verifying observations under similar conditions
Diagnostic verification products
• Key verification metrics for 4 levels of information for single-valued and probabilistic forecasts:
1. Observation-forecast comparisons (scatter plots, box plots, time series plots)
2. Summary verification (e.g. MAE/mean CRPS, skill score; see the sketch below)
3. More detailed verification (e.g. measures of reliability, resolution, discrimination, correlation, results for specific conditions)
4. Sophisticated verification (e.g. for specific events with ROC)
• To be evaluated by forecasters and forecast users
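A minimal Python sketch of the level-2 summary scores named above (not the operational IVP/EVS code; the array layout is an assumption), using the standard sample estimator CRPS = E|X − y| − 0.5·E|X − X′|:

import numpy as np

def mean_absolute_error(fcst, obs):
    # MAE for single-valued forecasts; fcst and obs are 1-D paired arrays
    return np.mean(np.abs(fcst - obs))

def mean_crps(ens, obs):
    # Mean CRPS for ensemble forecasts via the sample estimator
    # CRPS = E|X - y| - 0.5 * E|X - X'|
    # ens: (n_forecasts, n_members), obs: (n_forecasts,)
    term1 = np.mean(np.abs(ens - obs[:, None]), axis=1)
    term2 = 0.5 * np.mean(np.abs(ens[:, :, None] - ens[:, None, :]), axis=(1, 2))
    return np.mean(term1 - term2)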
Diagnostic verification products
• Examples for level 1: scatter plot, box-and-whiskers plot
[Figure: scatter plot of forecast value versus observed value, with a user-defined threshold marked.]
Diagnostic verification products
• Examples for level 1: box-and-whiskers plot
[Figure: box-and-whiskers plots of forecast error (forecast − observed, in mm) versus observed daily total precipitation (mm) for the American River in California, 24-hr precipitation ensembles at lead day 1; each box summarizes the 'errors' for one forecast (min., 10%, 20%, median, 80%, 90%, max.), with a zero-error line separating high bias from low bias and flagging "blown" forecasts.]
Diagnostic verification products
• Examples for level 2: skill score maps by months
[Figure: maps of skill score for January, April, and October; a smaller score is better.]
Diagnostic verification products
• Examples for level 3: more detailed plots
[Figure: score plotted against different conditions and against different months to show conditional performance.]
Diagnostic verification products
• Examples for level 4: event-specific plots (see the sketch below)
◦ Event: exceedance of the 85th percentile of the observed distribution
[Figure: reliability diagram (observed frequency versus predicted probability, with the perfect-reliability diagonal) and discrimination plot (ROC: probability of detection POD versus probability of false detection POFD, with the perfect point marked).]
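A minimal Python sketch of one point on the discrimination (ROC) plot for such an event, assuming arrays of forecast event probabilities and binary observed outcomes (hypothetical layout, with both events and non-events present in the sample):

import numpy as np

def roc_point(event_prob, event_obs, p_threshold):
    # Declare the event when the forecast probability reaches p_threshold;
    # sweeping p_threshold over [0, 1] traces the full ROC curve.
    fcst = event_prob >= p_threshold
    hits = np.sum(fcst & event_obs)
    misses = np.sum(~fcst & event_obs)
    false_alarms = np.sum(fcst & ~event_obs)
    correct_negatives = np.sum(~fcst & ~event_obs)
    pod = hits / (hits + misses)                              # probability of detection
    pofd = false_alarms / (false_alarms + correct_negatives)  # probability of false detection
    return pofd, pod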
Diagnostic verification products
• Examples for level 4: user-friendly spread-bias plot (see the sketch below)
[Figure: spread-bias plot comparing observed "hit rates" with nominal coverage; 60% of the time, the observation should fall in the window covering the middle 60% of the ensemble (i.e. median ±30%). The example ensemble is "underspread".]
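A minimal Python sketch of the spread-bias check described above, assuming an (n_forecasts, n_members) ensemble array and matching observations (hypothetical layout):

import numpy as np

def central_coverage(ens, obs, coverage=0.60):
    # Fraction of observations falling inside the central `coverage` window
    # of the ensemble; for 60%, the 20th-80th percentile band (median ± 30%).
    lo = np.percentile(ens, 50.0 * (1.0 - coverage), axis=1)
    hi = np.percentile(ens, 50.0 * (1.0 + coverage), axis=1)
    hit_rate = np.mean((obs >= lo) & (obs <= hi))
    return hit_rate  # well below `coverage` indicates an underspread ensemble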
Diagnostic verification analyses
• Analyze any new forecast process with verification
• Use different temporal aggregations
◦ Analyze each verification statistic as a function of lead time (see the sketch after this list)
◦ If performance is similar across lead times, data can be pooled
• Perform spatial aggregation carefully
◦ Analyze results for each basin and plot them on spatial maps
◦ Use normalized metrics (e.g. skill scores)
◦ Aggregate verification results across basins with similar hydrologic processes (e.g. by response time)
• Report verification scores with sample size
◦ In the future, with confidence intervals
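A minimal Python sketch of the lead-time analysis referenced above, computing an MAE skill score per lead time and reporting it with the sample size; the reference forecast (e.g. persistence or climatology) and array layout are assumptions:

import numpy as np

def mae_skill_by_lead_time(fcst, ref, obs, lead_times):
    # Skill score = 1 - MAE_forecast / MAE_reference, per lead time.
    # fcst, ref, obs, lead_times: 1-D arrays over forecast-observation pairs.
    results = {}
    for lead in np.unique(lead_times):
        m = lead_times == lead
        mae_f = np.mean(np.abs(fcst[m] - obs[m]))
        mae_r = np.mean(np.abs(ref[m] - obs[m]))
        results[lead] = (1.0 - mae_f / mae_r, int(np.sum(m)))  # (skill, sample size)
    return results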
Diagnostic verification analyses
• Evaluate forecast performance under different conditions (see the sketch below)
◦ with time conditioning: by month, by season
◦ with atmospheric/hydrologic conditioning:
– low/high probability thresholds
– absolute thresholds (e.g., PoP, flood stage)
◦ Check that the sample size is not too small
• Analyze sources of uncertainty and error
◦ Verify forcing input forecasts and output forecasts
◦ For extreme events, verify both stage and flow
◦ Sensitivity analyses to be set up at all RFCs:
1) what is the optimal QPF horizon for hydrologic forecasts?
2) do run-time modifications made on the fly improve forecasts?
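A minimal Python sketch of the absolute-threshold conditioning above, verifying only cases where the observation exceeds a threshold such as flood stage and flagging small samples; names and layout are hypothetical:

import numpy as np

def conditional_mae(fcst, obs, threshold, min_samples=30):
    # MAE restricted to cases with obs > threshold, reported with sample size.
    m = obs > threshold
    n = int(np.sum(m))
    if n < min_samples:
        print(f"warning: only {n} samples above threshold {threshold}")
    return np.mean(np.abs(fcst[m] - obs[m])), n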
Diagnostic verification software
• Interactive Verification Program (IVP) developed at OHD: verifies single-valued forecasts at given locations/areas
Diagnostic verification software
• Ensemble Verification System (EVS) developed at OHD: verifies ensemble forecasts at given locations/areas
Dissemination of diagnostic verification
• Example: WR water supply website
http://www.nwrfc.noaa.gov/westernwater/
◦ Data visualization
◦ Error: MAE, RMSE; conditional on lead time, year
◦ Skill: skill relative to climatology; conditional
◦ Categorical: FAR, POD, contingency table (based on climatology or user-definable)
Dissemination of diagnostic verification
• Example: OHRFC bubble plot online
http://www.erh.noaa.gov/ohrfc/bubbles.php
Real-time verification
• How good could the ‘live’ forecast be?
[Figure: a live forecast shown alongside the observations to date.]
Real-time verification
• Select analogs from a pre-defined set of historical events and compare with the ‘live’ forecast
[Figure: live forecast overlaid with three analog forecasts and their verifying observations, supporting the interpretation “Live forecast for flood is likely to be too high”.]
Real-time verification
• Adjust the ‘live’ forecast based on information from the historical analogs
[Figure: the live forecast versus what happened; the live forecast was too high.]
Real-time verification
• Example for ensemble forecasts
[Figure: temperature (°F) versus forecast lead day, showing the live forecast (L), analog forecasts (H) selected so that μH = μL ± 1.0°C, and the analog observations, suggesting “Day 1 forecast is probably too high”.]
Real-time verification
• Build an analog query prototype using multiple criteria (a minimal sketch follows below)
◦ Seeking analogs for precipitation: “Give me past forecasts for the 10 largest events relative to hurricanes for this basin.”
◦ Seeking analogs for temperature: “Give me all past forecasts with lead time 12 hours whose ensemble mean was within 5% of the live ensemble mean.”
◦ Seeking analogs for flow: “Give me all past forecasts with lead times of 12-48 hours whose probability of flooding was >= 0.95, where the basin-averaged soil moisture was > x and the immediately prior observed flow exceeded y at the forecast issue time.”
• Requires forecasters’ input!
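A minimal Python sketch of the temperature query quoted above; the archive record fields ('lead_hours', 'ens_mean') are hypothetical, not an existing CHPS interface:

def find_temperature_analogs(archive, live_mean, lead_hours=12, tol=0.05):
    # archive: iterable of past-forecast records (dicts); return those issued
    # at the requested lead time whose ensemble mean is within `tol` (5%)
    # of the live ensemble mean.
    return [rec for rec in archive
            if rec["lead_hours"] == lead_hours
            and abs(rec["ens_mean"] - live_mean) <= tol * abs(live_mean)]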
Outstanding science issues
• Define meaningful reference forecasts for skill scores (see the formula below)
• Account for observational error (measurement and representativeness errors) and rating curve error
• Account for non-stationarity (e.g., climate change)
• Separate timing error and amplitude error in forecasts
• Verify rare events and specify sampling uncertainty in metrics
• Analyze sources of uncertainty and error in forecasts
• Consistently verify forecasts on multiple space and time scales
• Verify multivariate forecasts (issued at multiple locations and for multiple time steps) by accounting for statistical dependencies
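On the first issue, the generic skill-score form (standard verification practice; the operational choice of score and reference may differ) makes the dependence on the reference explicit:

SS = 1 - \frac{S_{\mathrm{fcst}}}{S_{\mathrm{ref}}}

where S is a negatively oriented score (e.g. MAE or mean CRPS) and S_ref is the score of the reference forecast (e.g. climatology or persistence); an unrealistically weak reference inflates apparent skill.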
Verification service development
[Diagram: verification service development links OHD, OCWWS, NCEP, forecasters, users, academia, the private sector, and other forecast agencies, including the COMET-OHD-OCWWS collaboration on training, the OHD-NCEP Thorpex-Hydro project, the OHD-Deltares collaboration for CHPS enhancements, and the HEPEX Verification Test Bed (CMC, Hydro-Quebec, ECMWF).]
Looking ahead
• 2012:
◦ Information on the quality of the forecast service available online
◦ Real-time and diagnostic verification implemented in CHPS
◦ RFC standard verification products available online along with the forecasts
• 2015:
◦ Leveraging grid-based verification tools
Thank you
Questions?
Julie.Demargne@noaa.gov
Extra slide
Diagnostic verification products
• Key verification metrics from NWS Verification Team report