
Decision-Making from Probability Forecasts using Calculations of Forecast Value

Kenneth R. Mylne

The Met. Office, Bracknell, UK

(To be submitted to Meteorological Applications)

'How do I make a decision based on a probability forecast?'

Abstract: A method of estimating the economic value of weather forecasts for decision-making is described. The method may be applied equally to either probability forecasts or deterministic forecasts, and provides a forecast user with a direct comparison of the value of each in terms of money saved, which is more relevant to users than most standard verification scores. For a user who wishes to use probability forecasts to decide when to take protective action against a weather event, the method identifies the optimum probability threshold for action, thus answering the question of how to use probability forecasts for decision-making. The system optimises decision-making for any probability forecast system, whatever its quality, and therefore removes any need to calibrate the probability forecasts. The method is illustrated using site-specific probability forecasts generated from the ECMWF ensemble prediction system and deterministic forecasts from the ECMWF high-resolution global model. It is found that for most forecast events and most users the probability forecasts have greater user value than the deterministic forecasts from a higher-resolution model.

1. Introduction

A weather forecast, however skilful, has no intrinsic value unless it can be used to make decisions which bring some benefit, financial or otherwise, to the end user. Conventionally in most weather forecast services the forecast provider supplies the user with their best estimate of whether a defined event will occur (e.g. wind speed will or will not exceed 15 m s⁻¹), or of a value for a measurable parameter (e.g. maximum wind speed = 18 m s⁻¹). Decision-making is often based on whether a defined event is expected to occur or not. For example, the owner of a small fishing boat may decide to seek shelter when the forecast wind speed exceeds 15 m s⁻¹.

The nature of atmospheric predictability is such that there is frequently a significant uncertainty associated with such deterministic forecasts. Forecast uncertainty can be expressed in many ways, either qualitatively or quantitatively, and where such information is included in a forecast this can aid the decision-maker who understands the potential impact of a wrong decision. However, uncertainty is most commonly estimated subjectively by a forecaster; such estimates are often inconsistent, and may be affected by factors such as forecasters "erring on the safe side", which may not lead to optimal decision-making. In recent years there has been considerable development of objective methods of estimating forecast uncertainty, notably ensemble prediction systems (EPS) such as those operated by the European Centre for Medium Range Weather Forecasts (ECMWF) (Molteni et al, 1996; Buizza and Palmer, 1998) and the US National Centers for Environmental Prediction (NCEP) (Toth and Kalnay, 1993). Output from an EPS is normally in the form of probability forecasts, and there is growing evidence (e.g. Molteni et al, 1996; Toth et al, 1997) that these have greater skill than equivalent deterministic forecasts based on single high-resolution model forecasts, particularly on medium-range time-scales. To make use of this additional skill, the decision-maker needs to know how to respond to a forecast such as

‘There is a 30% probability that the wind speed will exceed 15 m s⁻¹.’

This paper will describe a technique which estimates the economic value of a probability forecast system for a particular user based on verification of past performance, and use it to determine the user's optimal decision-making strategy. The value of deterministic forecasts can be calculated in the same way, and this allows a direct comparison of the utility of probability and equivalent deterministic forecasts in terms which are clear and relevant to the user.

2. Background to Ensemble Probability Forecasts

Uncertainty in weather forecasts derives from a number of sources, in particular uncertainty in the initial state of the atmosphere and approximations in the model used to predict the atmospheric evolution. Errors in the analysis of the initial state result from observational errors, shortage of observations in some regions of the globe and limitations of the data assimilation system.

Model errors are due to numerous approximations which must be made in the formulation of a model, most notably the many small-scale processes which cannot be resolved explicitly, and whose effect must therefore be represented approximately by parametrization. The non-linear nature of atmospheric evolution means that even very small errors in the model representation of the atmospheric state, whether due to the analysis or the model formulation, will be amplified through the course of a forecast and can result in large errors in the forecast. This sensitivity was first recognised by Lorenz (1963), and was influential in the development of chaos theory. Gross errors in the synoptic-scale evolution are common in medium-range forecasts (over 3 days), but can occasionally occur even at less than 24 hours. Errors in the fine detail of a forecast, such as precipitation amounts or locations, are common even in short-range forecasts. Ensemble prediction systems have been developed in an attempt to estimate the probability density function (pdf) of forecast solutions by sampling the uncertainty in the analysis and running a number of forecasts from perturbed analyses (Molteni et al, 1996; Toth and Kalnay, 1993). In more recent developments, Buizza et al (1999) have included some allowance for model errors in the ECMWF EPS, by adding stochastic perturbations to the effects of model physics. Houtekamer et al (1996) describe an ensemble which accounts for both model errors and analysis errors by using a range of perturbations in both model formulation and analysis cycles. With any of these ensemble systems, probability forecasts may be generated by interpreting the proportion of ensemble members predicting an event to occur as giving a measure of the probability of that event.
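As a concrete illustration of this interpretation, the following Python sketch converts a set of ensemble wind-speed forecasts into a probability forecast for a threshold event. The member values and the ten-member ensemble are invented for illustration, not data from this study:

    import numpy as np

    def event_probability(members, threshold):
        """Fraction of ensemble members predicting the event, interpreted
        as the forecast probability of that event."""
        members = np.asarray(members, dtype=float)
        return float(np.mean(members > threshold))

    # Ten hypothetical ensemble wind-speed forecasts (m/s); three of the
    # ten members exceed 15 m/s, giving a 30% probability forecast.
    winds = [12.1, 16.3, 9.8, 14.9, 17.2, 11.0, 13.5, 15.4, 10.2, 12.8]
    print(event_probability(winds, 15.0))  # 0.3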

A range of standard verification diagnostics are used to assess the skill of such probability forecasts. For example the Brier Score (see Wilks, 1995), originally introduced by Brier (1950), is essentially the mean square error for probability forecasts of an event. Murphy (1973) showed how the Brier score could be decomposed into three terms, reliability, resolution and uncertainty, which measure different aspects of probabilistic forecasting ability. Of these the reliability term measures how well forecast probabilities relate to the actual probability of occurrence of the event, and resolution measures how effectively the forecast system is able to distinguish between high and low probabilities on different occasions. ROC (Relative Operating Characteristics), described by Stanski et al (1989), measures the skill of a forecast in predicting an event in terms of Hit Rates and False Alarm Rates. Rank Histograms (Hamill and Colucci, 1997 and 1998) specifically measure the extent to which the ensemble spread covers the forecast uncertainty, and can also reveal biases in the ensemble forecasts. However, while all these diagnostics are of great value to scientists developing ensemble systems, they are of little interest or relevance to most forecast users. In particular they do not tell users how useful or valuable the forecasts will be for their applications, nor do they answer the question of how to use probability forecasts for decision-making.

3. Calculation of Forecast Value

An overview of techniques for estimating the economic value of weather forecasts is given by Murphy (1994), and a comprehensive review by Katz and Murphy (1997). The method applied in this study is closely related to ROC verification (Stanski et al, 1989). It has recently been discussed by Richardson (2000), and is rapidly becoming accepted as a valuable tool for user-oriented verification of probability forecasts. The method has also been applied to seasonal forecasts by Graham et al (2000). The aim of this paper is to present the method in a way which is particularly suitable for aiding forecast users with decision-making.

The concept of forecast value is that forecasts only have value if a user takes action as a result, and the action saves the user money. Calculation of forecast value for predictions of a defined event therefore requires information on (a) the ability of the forecast system to predict the event, and (b) the user's costs and losses associated with the various possible forecast outcomes. Consequently the value depends on the application as well as on the skill of the forecast. Forecast value will be defined first for a simple deterministic forecast, and the generalisation to probability forecasts will be considered in more detail in section 3.6.

3.1 Ability of the Forecast System


The basis of most estimates of forecast value is the cost-loss situation described by Murphy (1977). This is based on forecasts of a simple binary event, against which a user can take protective action when the event is expected to occur. For such an event the ability of a forecast system is fully specified for a deterministic forecast by the 2 × 2 contingency table shown in Table 1, where h, m, f and r give the frequencies of occurrence of each possible forecast outcome.

3.2 User Costs and Losses

For any user making decisions based on forecasts, each of the four outcomes in Table 1 has an associated cost, or loss, as given in Table 2.

                             Event Forecast
                      Yes - User Protects      No - No Protective Action
Event Observed  Yes   Hit H (h)                Miss M (m)
                No    False Alarm F (f)        Correct Rejection R (r)

Table 1: Contingency table of forecast performance. Upper case letters H, M, F, R represent the total numbers of occurrences of each contingency, while the lower case versions in brackets represent the relative frequencies of occurrence.

                             Event Forecast
                      Yes - User Protects      No - No Protective Action
Event Observed  Yes   Mitigated Loss L_m       Loss L
                No    Cost C                   Normal Loss N = 0

Table 2: Contingency table of generalised user costs and losses. Note: in the simple cost/loss situation described by Murphy (1977), this is simplified such that L_m = C.

For convenience it is normal to measure all costs and losses relative to the user's costs for a Correct Rejection, so the 'Normal Loss' N for this contingency is set to zero. (Note however that this assumption is not necessary, and the method readily accounts for non-zero values of N.) If the event occurs with no protective action being taken, the user incurs a loss L. If the event is forecast to occur, the user is assumed to take protective action at cost C. In the simple cost/loss situation (Murphy, 1977), this action is assumed to give full protection if the event does occur, so the user incurs the cost C for both Hits and False Alarms. In reality protection will often not be fully effective in preventing all losses, and the losses may be generalised by specifying a Mitigated Loss L_m for Hits, as in Table 2.

For a forecast to have value it is necessary that L_m < L. In most circumstances it would be expected that C ≤ L_m < L, but it is possible that in some circumstances L_m < C. For example, protective action could involve using an alternative process which works effectively in the weather conditions specified by the event, but does not work in the non-event conditions; in this case the cost C of a False Alarm would be high compared to L_m.

The above examples assume that costs, losses and forecast value are specified in monetary terms. They could, instead, be calculated in terms of a user's energy consumption, for example; the concept is the same. Note that one limitation of the system arises where a forecast is used to protect life, owing to the difficulty of objectively placing a cost on lives lost.

3.3 Mean Expense Following Forecasts

Given the information specified in Tables 1 and 2, and assuming the user takes protective action whenever the event is forecast, it can be expected that over a number of forecast occasions the user will experience a mean expense E_fx of

    E_fx = h L_m + m L + f C + r N        (1)

Note that the last term rN in equation (1) is normally zero, but this specifies the generalisation to any definition of the Normal Loss.
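As a minimal sketch (not code from the paper), equation (1) translates directly into a function of the contingency frequencies and the user's loss matrix; the frequencies in the example are invented for illustration:

    def mean_expense(h, m, f, r, L_m, L, C, N=0.0):
        """Mean expense per occasion following the forecasts, equation (1):
        E_fx = h*L_m + m*L + f*C + r*N."""
        return h * L_m + m * L + f * C + r * N

    # Illustrative frequencies (h, m, f, r sum to 1) combined with the
    # costs of User A from Table 3 (N=0, L_m=1, L=5, C=1):
    print(mean_expense(0.05, 0.02, 0.10, 0.83, L_m=1, L=5, C=1))  # 0.25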

3.4 Climatological Reference Expense

Forecast value represents the saving the user gains by using the forecasts, and therefore requires a baseline reference expense for comparison with E_fx. If no forecast is available the user has two options: either always protect or never protect. In Murphy's (1977) simple cost-loss situation where L_m = C these options will incur mean expenses E_cl over many occasions of C or oL respectively, where o is the climatological frequency of occurrence of the event. The user's best choice is to always protect if C < oL, or C/L < o, and never to protect otherwise. Assuming the user takes the best option, the mean climatological expense in the simple cost-loss situation is thus given by

    E_cl = min(C, oL)        (2)

For the generalised user loss matrix given in Table 2, the mean expense of the always-protect option is given by (1 - o)C + oL_m, and the mean expense of following climatology is given by

    E_cl = min((1 - o)C + oL_m, oL)        (3)

Fully generalising this to allow for the possibility of N ≠ 0 gives

    E_cl = min((1 - o)C + oL_m, (1 - o)N + oL)        (4)

In this case the user's best strategy is to always take protective action if

    α = (C - N) / (L - L_m + C - N) ≤ o        (5)

where α is the generalised cost/loss ratio introduced by Richardson (2000). In some circumstances one of the climatological options may not be viable for a user, since taking protective action may involve stopping their normal economic activity (e.g. a fisherman's protective action against strong winds may be to leave his boat in port). The user cannot do that all the time or he would go out of business. In this case the forecast value should be calculated using the viable climate option. This will be considered further in section 4.3.
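A short sketch of equations (4) and (5), using the notation of Table 2 (an illustration, not code from the paper):

    def climate_expense(o, L_m, L, C, N=0.0):
        """Cheapest climatological option, equation (4):
        E_cl = min((1-o)*C + o*L_m, (1-o)*N + o*L)."""
        always_protect = (1 - o) * C + o * L_m
        never_protect = (1 - o) * N + o * L
        return min(always_protect, never_protect)

    def cost_loss_ratio(L_m, L, C, N=0.0):
        """Generalised cost/loss ratio of equation (5); always protecting
        is the better climatological option when this ratio is below o."""
        return (C - N) / (L - L_m + C - N)

    # User A of Table 3 (N=0, L_m=1, L=5, C=1): alpha = 0.2, so always
    # protecting pays only if the event occurs more than 20% of the time.
    print(cost_loss_ratio(L_m=1, L=5, C=1))  # 0.2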

3.5 Definition of Forecast Value

The value of the forecast in monetary terms, V, is the saving the user can expect to make from following the forecast, averaged over a number of occasions:

    V = E_cl - E_fx        (6)

This basic definition of value is the most relevant for a user, except that it does not account for the cost of buying the forecast service. The true value to the user is therefore

    V_u = V - C_fx = E_cl - E_fx - C_fx        (7)

where C_fx is the purchase price of the forecast. However, although V_u is the correct definition to use when estimating the value of a forecast to a user, C_fx is specific to any forecast service and cannot be estimated in general terms. For the purposes of this paper this term will therefore be ignored and value will be defined as in equation (6).

For general assessments of forecast value, it is convenient to scale V relative to the value of a perfect forecast, in a similar fashion to the normal definition of a skill score (see Wilks, 1995). With a perfect forecast m = f = 0 and the user takes protective action only when the event occurs. The mean expense is therefore E_p = oL_m (or, if N ≠ 0, then E_p = oL_m + (1 - o)N). The relative economic value of a forecast is then defined as:

    V_r = (E_cl - E_fx) / (E_cl - E_p)        (8)

V_r has a maximum of 1 for a perfect forecast system, and is zero for a climatology forecast. V_r may also be multiplied by 100 and expressed as a percentage.

It is important to note that while V_r has a maximum value of 1 (or 100%), there is no lower limit, and from equations (1) and (8) it is clear that negative values are likely when m or f, or their corresponding user losses L or C, are large. Negative value simply indicates that the forecast system does not have sufficient skill to provide value to this particular user. In this case the user's best strategy is to follow the best climatological option. It may also be possible to find a different event for which the associated costs and losses will be different, and for which the forecasts may be more skilful.
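The value definitions of equations (6) and (8) are equally direct; a minimal sketch, again using the paper's notation:

    def perfect_expense(o, L_m, N=0.0):
        """Mean expense with a perfect forecast: E_p = o*L_m + (1-o)*N."""
        return o * L_m + (1 - o) * N

    def value(E_cl, E_fx):
        """Absolute forecast value, equation (6): V = E_cl - E_fx."""
        return E_cl - E_fx

    def relative_value(E_cl, E_fx, E_p):
        """Relative value, equation (8): 1 for a perfect forecast, 0 for
        the best climatological option, and unbounded below."""
        return (E_cl - E_fx) / (E_cl - E_p)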

3.6 Value for a Probability Forecast

Forecast value is defined above for deterministic forecasts of a simple binary event, against which the user takes protective action when the event is expected to occur. Probability forecasts are normally also defined for binary events, since they are expressed as the probability that an event will occur. To make decisions from probability forecasts, the user takes protective action ('Forecast = Yes' in Table 1) when the probability p of the event exceeds a chosen threshold p_t. The value of the forecast therefore depends on the choice of p_t. To completely specify the value of a probability forecast system, the value is calculated for a range of probability thresholds and plotted as a function of p_t as shown in figure 1. (This use of probability thresholds is identical to that used in ROC verification, for which hit rates (HR) and false alarm rates (FAR) are calculated for various probability thresholds using the same contingency table as in Table 1. The appendix describes how forecast value may be evaluated directly for any forecast system for which ROC verification is available.)
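To make the procedure concrete, the sketch below sweeps candidate thresholds over a verification archive and returns the value curve of figure 1. The arrays probs and occurred, and the function name, are assumptions for illustration; any archive of forecast probabilities and observed outcomes would serve:

    import numpy as np

    def value_curve(probs, occurred, o, L_m, L, C, N=0.0, thresholds=None):
        """Relative value V_r as a function of probability threshold p_t
        (assumes E_cl > E_p, i.e. a perfect forecast beats climatology)."""
        probs = np.asarray(probs, dtype=float)
        occurred = np.asarray(occurred, dtype=bool)
        if thresholds is None:
            thresholds = np.arange(0.0, 1.01, 0.1)  # 0%, 10%, ..., 100%
        E_cl = min((1 - o) * C + o * L_m, (1 - o) * N + o * L)
        E_p = o * L_m + (1 - o) * N
        curve = []
        for p_t in thresholds:
            protect = probs >= p_t                # 'Forecast = Yes'
            h = np.mean(protect & occurred)       # hit frequency
            f = np.mean(protect & ~occurred)      # false alarm frequency
            m = np.mean(~protect & occurred)      # miss frequency
            r = np.mean(~protect & ~occurred)     # correct rejection frequency
            E_fx = h * L_m + m * L + f * C + r * N
            curve.append((p_t, (E_cl - E_fx) / (E_cl - E_p)))
        return curve

    # The optimum threshold p_max is simply the argmax of the curve:
    # p_max, v_max = max(value_curve(probs, occurred, o, 1, 5, 1),
    #                    key=lambda point: point[1])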

4. Results

Three examples of forecast value plots are shown in figure 1. All are based on verification of the same forecasts, but the user costs and losses are different, as shown in Table 3. The verification is based on daily forecasts of the event "10 m wind speed of Beaufort Force 5 or more" for 41 sites in the UK, over two winter seasons (DJF 1998/99 and 1999/2000). Results are shown for forecast lead-times of 48, 96, 144 and 192 hours. Probability forecasts were taken from the 51-member ECMWF operational Ensemble Prediction System (EPS). The value of the probability forecasts is calculated and plotted at probability thresholds of 0, 10, 20, 30, …, 90, 100%. For comparison, equivalent deterministic forecasts from the ECMWF high-resolution (T_L319) global model are also included.

[Figure 1(a) and (b): for the full caption see figure 1(c).]

User    N (CR)    L_m (H)    L (M)    C (FA)
A          0         1          5        1
B          0         1          2        1
C          0         2         10        1

Table 3: User costs and losses in arbitrary monetary units for the three examples of forecast value plots shown in figure 1. (For the meaning of the column headings, see Tables 1 and 2.)


(c) Figure 1: Examples of relative forecast value V_r, expressed as a percentage of the value of a perfect forecast, plotted against probability threshold p_t for probability forecasts of an event. The values of equivalent deterministic forecasts are shown in the columns at the right-hand side of each graph. The best climatological option, selected following equation (4), is labelled above the graph. Details of the forecasts used are given in Section 4. Forecast lead-times are: 48h (solid line), 96h (dotted), 144h (dot-dot-dot-dash) and 192h (dashed). The three graphs are for the same forecasts, but for different user loss functions as given in Table 3: (a) A, (b) B and (c) C. (Note that values for deterministic forecasts at 144h and 192h in figure 1(c) are missing because they are off the graph, below -40%.)

4.1 Value of Probability Forecasts

From figure 1(a) it can be seen that the value of a probability forecast is strongly dependent on p_t. (Note that the form of this function is identical for V and V_r, since the scaling of V by E_cl - E_p in equation (8) is independent of p_t.) The requirement of the user is to maximise the value of the forecast, and therefore the practical value of the probability forecast is given by the maximum of the curve, at p_t = p_max. By using p_max as a decision threshold, the user can obtain the optimum value out of the probability forecast. The form of the graphs in figure 1 allows easy identification of p_max for a particular user, and also allows direct comparison of the value of the probability forecast V_r(p_max) with the value of an equivalent deterministic forecast.

For User A (figure 1(a)) the forecast value for all lead-times is a maximum for probability thresholds of around 20 or 30%, and the value becomes negative for large thresholds. For User B (figure 1(b)) maximum value is obtained with p_t of 70-80%. The only difference between the two sets of losses is that the loss L for a missed event is greater for User A than for User B. User A therefore benefits more by lowering the miss rate. Reducing p_t increases the number of occasions when protective action is taken, thus increasing the hit rate and false alarm rate, but reducing the frequency of misses and correct rejections. Thus User A obtains optimum value by using a lower probability threshold than User B, because the benefit of reducing the miss rate is greater than the cost of a higher false alarm rate.

The example of User C shown in figure 1(c) is similar to User A, except that the False Alarm cost is half that of User A (relative to the other losses; see Table 3), resulting in an even lower optimum threshold p_max. This example demonstrates the flexibility of the system in allowing L_m > C. In figure 1(c), except at 192h, maximum value is obtained with p_max = 10%, indicating that, with a 51-member ensemble, the user should take protective action when only 5 ensemble members predict the event to occur. For 96 and 144h forecasts, the value is still increasing rapidly as the threshold decreases to 10%, and it appears likely that even better value could be obtained using a threshold of 2 or 3 ensemble members. For the 192h forecasts the value is negative even at p_t = 10%. Again, it is possible that a small positive benefit could be obtained from the forecasts using a lower threshold. In this study forecast verification was only available at 10% probability thresholds, so it is not possible to confirm this. However, for practical applications where a user has a low cost/loss ratio (C/L) it is important to carry out the verification for all possible low probability thresholds down to a single ensemble member, in order to ensure optimum use of the ensemble information. This is likely to be particularly important when considering the use of ensembles for predicting severe weather events, when the losses from not protecting are often high.

It must be noted that for some users the forecast value is always negative, even though the forecast has skill, because the costs of misses and false alarms are too high relative to the savings made from a correct forecast.

4.2 Comparison of Probability and Deterministic Forecasts

Also shown in figure 1 are the values of the deterministic forecasts from the higher-resolution T_L319 ECMWF model. Forecast value provides a direct comparison between the benefits of deterministic and probabilistic forecasts since it is calculated in exactly the same way for each, but with only a single yes/no threshold instead of a range of probability thresholds. This comparison is in terms which are clear and relevant to users of the forecast, for whom other verification tools, such as RMS errors, Brier Scores and ROC, may be meaningless. In the case of the 48-hour forecasts for User A the deterministic value is almost identical to the maximum value of the probability forecast; in every other case shown the probability forecast has more user value than the equivalent deterministic forecast, despite the higher resolution of the deterministic model. In many cases the probability forecast has positive value when the deterministic forecast has negative value. This is largely because the deterministic forecast can generate probabilities of only 0 or 100%, and so cannot adapt to the requirements of different users. By contrast the probability forecast can use a range of probabilities, and can be interpreted appropriately for each user as explained in section 4.1 above (Toth et al, 2000).


For some forecast events and users the probability forecast can have greater value even at 12 or 24 hours, although in general the benefits are greater at longer lead-times. For longer-range forecasts there is inherently more uncertainty and a deterministic forecast is likely to be wrong more frequently, whereas a probability forecast can account for the uncertainty by using more mid-range probabilities. Occasionally it is found that the deterministic forecast is more valuable, even at quite long lead-times. This occurs when the behaviour of the deterministic model for the defined event is well-tuned to the particular user's losses. In any case, the method allows the best strategy for any user to be identified, and in most cases this will be from the probability forecasts.

4.3 Effect of the Climatological Reference Expense

From figure 1 it can be seen that in cases A and C the value of the probability forecasts converges to zero at p_t = 0, and in case B at p_t = 1. The points at p_t = 0 and 1 correspond to the event always and never being predicted, respectively. Forecast value V is therefore equal to zero by definition at one of these points because E_fx = E_cl. Which threshold has V = 0 depends on which of the climatological options is the cheaper, as given by equation (4).

In general the correct climatological reference option to use will be the cheapest, as described in section 3.4. This minimises the forecast value and therefore avoids any risk of over-estimating it. However in some real situations the user may not have that option, as described at the end of section 3.4. For such users it would be appropriate to choose a fixed climatological baseline for the forecast value, rather than the minimum-expense baseline given by equation (4). In figure 2 the example of figure 1(a) is shown assuming that the user does not have the option to Always Protect. In this case it is found that the relative value of the forecasts is now much greater. Since it is only E_cl which has changed in equation (8), and E_cl > E_p, the absolute value V is also larger.

5. Calibration of Probability Forecasts

It is easily shown (Richardson, 2000; Toth et al, 2000) that for perfect probability forecasts the optimum probability threshold p_max should be equal to the cost/loss ratio C/L in the simple cost/loss situation of Murphy (1977); for the generalised cost/loss situation of Table 2, p_max should be equal to the generalised cost/loss ratio α given in equation (5). For User A above, α = C/L = 0.2, so p_max should be 20%. However this assumes that the probability forecasts are perfectly reliable, so that when the forecast probability is 20%, the event will occur on exactly 20% of those occasions. In practice pure ensemble forecasts are not perfectly reliable, and indeed it can be seen from figure 1(a) that while the optimum threshold p_max is 20% for lead-times of 144h and 192h, it is actually 30% at 48h and 96h.

Figure 2: As figure 1(a), but with the climatological reference expense restricted to the 'Never Protect' option.

The reliability of the probability forecasts used in figure 1 is illustrated by the reliability diagram in figure 3. Forecasts are categorised by the forecast probability, and the frequency of occurrence of the event for each category is plotted against the forecast probability. For a perfectly reliable forecast the points will lie along the diagonal line, but it can be seen that the ensemble forecasts are generally over-confident. For high forecast probabilities, the event occurs less frequently than predicted, and vice versa for low probabilities. It can be seen in figure 3 that for lead-times of 48h and 96h the observed frequency is closer to 20% for forecast probabilities of 30% than of 20%, in agreement with the optimum thresholds identified from the forecast value curves.

A representative reliability diagram such as that in figure 3 can be used to calibrate forecast probabilities and produce near-perfectly reliable forecasts, although this process reduces the resolution of the forecasts by reducing the range of probabilities used in the issued forecasts. Toth et al (2000) propose using calibrated probabilities so that the ideal decision threshold can be deduced directly from the C/L ratio. The method described in this paper does not require calibrated probability forecasts. The method identifies the ideal p_max for the uncalibrated forecasts, and in effect the calibration of the forecasts is built into the forecast value system. This allows the user to extract the maximum benefit from any probability forecast system, without any need to calibrate the forecast probabilities.
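For completeness, a sketch of the binning computation behind a reliability diagram such as figure 3 (illustrative only; the bin edges and function name are assumptions):

    import numpy as np

    def reliability_points(probs, occurred, n_bins=10):
        """(mean forecast probability, observed frequency) for each
        probability bin; points on the diagonal indicate reliability."""
        probs = np.asarray(probs, dtype=float)
        occurred = np.asarray(occurred, dtype=bool)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        points = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            # Half-open bins [lo, hi), with 1.0 included in the last bin.
            in_bin = (probs >= lo) & ((probs < hi) | (hi == 1.0))
            if in_bin.any():
                points.append((probs[in_bin].mean(),
                               occurred[in_bin].mean()))
        return points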

Figure 3: Reliability diagram for the probability forecasts used to generate the forecast value graphs of figures 1 and 2. Line types indicate the forecast lead-times as in figure 1.

6. Practical Application

Practical application of this technique for the use of probability forecasts as decision-making tools will require close collaboration between forecast providers and users. The user is required to define precise events which are to be forecast, and also an estimate of the losses expected in each of the contingencies. The forecast provider must provide a set of verification diagnostics for the forecast system's ability to predict the defined event. Given these data, forecast value functions can be plotted. The probability threshold which gives the maximum value can then be identified, and the user then takes protective action whenever the forecast probability exceeds this threshold.

Although this process is simple in principle, it is less straightforward in practice. Many decision-making processes are more complex than a simple protect/no protect choice. Estimation of users' losses is not easy, and verification data may not be immediately available for the defined event.

Benefits are likely to be greatest for users with large weather-sensitivity and large losses, for whom the investment in initial analysis is justified by the returns. Forecast providers also need to be aware that analysis of some customers' requirements may reveal that it is not possible to provide a forecast with value, in which case the customer is best advised not to purchase the forecast! Conversely, Graham et al (2000) demonstrate that the technique may be used to gain useful information from seasonal forecasts for which forecast skill, measured by conventional verification techniques such as ROC, is relatively low compared to medium-range forecasts.

7. Conclusions

A technique for estimating forecast value has been described which allows direct comparison of the value of probability and deterministic forecasts in terms which are relevant to forecast users. Probability forecasts are usually found to have greater value than equivalent deterministic forecasts, particularly for medium-range forecasts with lead-times greater than about 48 hours.

The technique may be used to apply probability forecasts as decision-making tools by identifying a probability threshold which maximises the forecast value. The forecast user then takes protective action against the event whenever the forecast probability exceeds this threshold.

8. Acknowledgements

The author would like to thank Tim Legg and Caroline Woolcock of The Met. Office for providing the ensemble verification data used in this study, and also Mike Harrison, now at WMO in Geneva, for stimulating the ideas behind the project.

9. References

Brier, G.W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1-3.

Buizza, R. and Palmer, T.N., 1998: Impact of ensemble size on ensemble prediction. Mon. Wea. Rev., 126, 2503-2518.

Buizza, R., Miller, M. and Palmer, T.N., 1999: Stochastic representation of model uncertainties in the ECMWF EPS. Quart. J. Roy. Meteorol. Soc., 125, 2887-2908.

Graham, R.J., Evans, A.D.L., Mylne, K.R., Harrison, M.S.J. and Robertson, K.B., 2000: An assessment of seasonal predictability using atmospheric general circulation models. Quart. J. Roy. Meteorol. Soc., to appear.

Hamill, T.M. and Colucci, S.J., 1997: Verification of Eta-RSM short-range ensemble forecasts. Mon. Wea. Rev., 125, 1312-1327.

Hamill, T.M. and Colucci, S.J., 1998: Evaluation of Eta-RSM ensemble probabilistic precipitation forecasts. Mon. Wea. Rev., 126, 711-724.

Houtekamer, P.L., Lefaivre, L., Derome, J., Ritchie, H. and Mitchell, H.L., 1996: A system simulation approach to ensemble prediction. Mon. Wea. Rev., 124, 1225-1242.

Katz, R.W. and Murphy, A.H. (editors), 1997: Economic Value of Weather and Climate Forecasts. Cambridge University Press, Cambridge, UK.

Lorenz, E.N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130-141.

Molteni, F., Buizza, R., Palmer, T.N. and Petroliagis, T., 1996: The ECMWF Ensemble Prediction System: methodology and validation. Quart. J. Roy. Meteorol. Soc., 122, 73-119.

Murphy, A.H., 1973: A new vector partition of the probability score. J. Appl. Meteorol., 12, 595-600.

Murphy, A.H., 1977: The value of climatological, categorical and probabilistic forecasts in the cost-loss situation. Mon. Wea. Rev., 105, 803-816.

Murphy, A.H., 1994: Assessing the economic value of weather forecasts: an overview of methods, results and issues. Meteorol. Appl., 1, 69-74.

Richardson, D., 2000: Skill and relative economic value of the ECMWF ensemble prediction system. Quart. J. Roy. Meteorol. Soc., 126, 649-667.

Stanski, H.R., Wilson, L.J. and Burrows, W.R., 1989: Survey of common verification methods in meteorology. WMO WWW Tech. Report No. 8, WMO TD No. 358.

Toth, Z. and Kalnay, E., 1993: Ensemble forecasting at the NMC: the generation of perturbations. Bull. Amer. Meteor. Soc., 74, 2317-2330.

Toth, Z., Kalnay, E., Tracton, S.M., Wobus, R. and Irwin, J., 1997: A synoptic evaluation of the NCEP ensemble. Weather and Forecasting, 12, 140-153.

Toth, Z., Zhu, Y. and Wobus, R., 2000: On the economic value of ensemble-based weather forecasts. Submitted to Bull. Amer. Meteor. Soc.

Wilks, D.S., 1995: Statistical Methods in the Atmospheric Sciences: An Introduction. International Geophysics Series, Vol. 59, Academic Press.

Appendix - The Link between Forecast Value and ROC Verification

The contingency table in Table 1 is the same as that used in ROC verification (see Stanski et al, 1989). ROC expresses the forecast performance in terms of a hit rate (HR) and false alarm rate (FAR) defined as:

    HR = H / (H + M)        (A1)

    FAR = F / (F + R)        (A2)

To evaluate E_fx from equation (1) we require values of h, m, f and r. By definition, the frequency of hits h is

    h = H / (H + M + F + R)        (A3)

and the climatological frequency of the event is

    o = (H + M) / (H + M + F + R)        (A4)

Hence, combining equations (A1), (A3) and (A4):

    h = o HR

Similarly:

    m = o (1 - HR)
    f = (1 - o) FAR
    r = (1 - o)(1 - FAR)

In ROC verification of probability forecasts HR and FAR are evaluated for a range of probability thresholds in exactly the same way as described in this paper for forecast value. Thus equation (1), and hence the forecast value, are easily evaluated for any forecast for which ROC verification data are available and the climatological frequency of the event, o , is known.
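As a sketch of these relations (illustrative code, not from the paper), the contingency frequencies, and hence E_fx, follow directly from published ROC data:

    def frequencies_from_roc(HR, FAR, o):
        """Contingency frequencies recovered from ROC statistics:
        h = o*HR, m = o*(1-HR), f = (1-o)*FAR, r = (1-o)*(1-FAR)."""
        h = o * HR
        m = o * (1 - HR)
        f = (1 - o) * FAR
        r = (1 - o) * (1 - FAR)
        return h, m, f, r

    # E_fx then follows from equation (1): E_fx = h*L_m + m*L + f*C + r*N.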
