A Managerial Approach to Using Error Measures in the Evaluation of Forecasting Methods
[published in the International Journal of Business Research, Vol. 7, No. 3, 2007, pp. 143-149]
James E. Cox, Jr., Illinois State University, Normal, Illinois, USA
David G. Loomis, Illinois State University, Normal, Illinois, USA
ABSTRACT
Summary error accuracy measures are used in several steps of the forecasting process. For example,
error measures are used to evaluate and choose forecasting methods. However, more effort should be
made to take into account the implicit managerial assumptions made by using these error measures.
Keywords: Error Measures, Forecasting, Forecasting Management, Accuracy Measures, Forecasting
Process, Forecasting Techniques.
1. INTRODUCTION
The topic of forecasting error has been addressed by many authors. Forecasting books have shown how
to calculate basic summary error (accuracy) measures [Makridakis, Wheelwright, and Hyndman, 1998;
DeLurgio, 1998]. In addition, articles have been written on ways of making forecasts more accurate
[Geurts and Whitlark, 2000], how to measure the impact of error on an enterprise [Kahn, 2003],
measuring the cost of forecasting error [Jain, 2004], examining errors in various industries [Jain, 2003],
and the empirical measurement of error [Mentzer and Cox, 1984]. However, surprisingly absent from the
literature is what impact the implicit managerial assumptions regarding the summary error measures have
on the selection of the proper methods to use in a forecasting situation. What the forecast user or
manager is ready to assume will influence what summary error measures should be used in the
technique selection process. The choice of error measures can affect the selection and ranking of
methods [Armstrong, 2001].
Empirical research has shown that the most preferred error measures have varied over time. A study by
Carbone and Armstrong (1982) found that 33% of practitioners preferred root mean squared error
(RMSE), which was the most popular of seven error measures, and 11% preferred mean absolute
percent error (MAPE). A later study by Mentzer and Kahn (1995) showed that 52% of their sample of
companies used MAPE and 10% used RMSE.
The purpose of this article is not to do a survey of all possible error measures but rather to draw attention
to the importance of considering the assumptions made in selecting an error measure. Understanding
the context and managerial assumptions/preferences for which a forecast is made will assist the
forecaster in choosing the appropriate error measure for that particular situation. This is important
regardless of whether managers are generating their own forecasts or whether forecasters are preparing
the forecasts for managers. Our focus in this article is to give managers/forecasters advice on the best
error measure to use for a particular time series rather than on choosing the best error measure across
multiple series. Focusing on the best error measure for one series would be an important forecasting
perspective when a particular product accounts for a significant proportion of company sales. The
multi-series situation has been addressed in other literature, such as Makridakis et al. (1982),
Thompson (1990), Gardner (1990), Armstrong and Collopy (1992), Fildes (1992), and Armstrong and
Fildes (1995). Armstrong and Fildes (1995, 68) state, “We
believe that most users of forecasts are able to examine more than one error measure. Computer
technology is making such comparisons easier…Software designers now include a variety of summary
statistics and the issue should be which error measure or measures are appropriate for a given situation.”
2. THE FORECASTING PROCESS
In generating forecasts the manager or forecaster will go through a forecasting process similar to the one
below in Figure 1.
FIGURE 1 – The Forecasting Process

Step 1: Set objectives for the forecast
Step 2: Select possible forecasting techniques
Step 3: Data collection and preparation
Step 4: Parameterize the technique(s)
Step 5: Technique(s) evaluation and selection
Step 6: Application of technique(s) and forecast revision
Step 7: Evaluation of technique performance
Typically, the forecaster uses a summary measure of forecasting error to parameterize the technique
(Step 4), to evaluate the technique (Step 5), and to evaluate the technique’s performance (Step 7).
Forecasting error is defined by taking the actual sales and subtracting the forecast. However, there are
several ways to come up with a summary of the error over several periods. The summary error measure
that is chosen will ultimately have a significant impact on selecting a forecasting method.
One of the necessary steps in generating forecasts is parameterizing the forecasting methods. What the
forecaster needs to do at this stage is to pick the particular parameters that will be used to run the
method. For example, single exponential smoothing uses the parameter alpha (α), where alpha is chosen
to be between 0 and 1. The formula for single exponential smoothing is:

F(t+1) = F(t) + α[X(t) − F(t)], where 0 ≤ α ≤ 1

Initialize by letting F(1) = X(1), where
F(t+1) = the one-step-ahead forecast made at time period t (i.e., the forecast for period t+1)
F(t) = the forecast for time period t
X(t) = actual sales for time period t
As different values for the parameter are chosen, the forecasting method will generate very different
forecasts. In the following example, single exponential smoothing is used to generate forecasts using
three different parameter values. The difference the parameter value makes in the forecasts, and
consequently in the technique’s accuracy, is readily apparent. Although this simple example uses only a
one-step-ahead forecast for illustration purposes, the methodology generalizes to multi-step-ahead
forecasts as well.
TABLE 1 – Forecasts and Parameterization

t    Sales    Forecast (α=.2)    Forecast (α=.5)    Forecast (α=.8)
1     20           20                  20                  20
2     40           20                  20                  20
3     30           24                  30                  36
4     50           25.2                30                  31.2
5     40           30.16               40                  46.24
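To make the mechanics concrete, here is a minimal Python sketch of single exponential smoothing that reproduces the forecasts in Table 1. The function name ses and its interface are our own for illustration, not from the original article.

    def ses(sales, alpha):
        # Single exponential smoothing: F(t+1) = F(t) + alpha * (X(t) - F(t)),
        # initialized with F(1) = X(1). Returns the forecasts F(1)..F(n).
        forecasts = [sales[0]]
        for x in sales[:-1]:          # each observation updates the next forecast
            f = forecasts[-1]
            forecasts.append(f + alpha * (x - f))
        return forecasts

    sales = [20, 40, 30, 50, 40]
    for alpha in (0.2, 0.5, 0.8):
        print(alpha, [round(f, 2) for f in ses(sales, alpha)])
    # 0.2 [20, 20, 24.0, 25.2, 30.16]
    # 0.5 [20, 20, 30.0, 30.0, 40.0]
    # 0.8 [20, 20, 36.0, 31.2, 46.24]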
The key question is which parameter values will do the best job of forecasting. Although some software
programs do not allow the user to choose different error measures in parameterization, the most popular
statistical program, SAS, allows this flexibility. Other programs
may offer this feature as well if it is requested by users. In addition, users could use spreadsheets to
calculate these measures.
Typically, the best job in parameterization is defined by the parameter(s) that give the forecasting method
the lowest forecasting error. Error is defined by taking the actual sales and subtracting the forecast.
However, there are several ways to come up with a summary of the error over several periods. The
summary error measure that is chosen will have a great influence on the parameters chosen.
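As an illustration of how the choice of summary error measure feeds back into parameterization, the following sketch grid-searches alpha and can select different "best" values depending on which measure is minimized. It reuses the ses function from the sketch above; the mse and mad helpers are our own names, defined formally in the list that follows.

    # Hypothetical grid search over alpha: the "best" alpha depends on
    # which summary error measure is minimized.
    def mse(actual, forecasts):
        errors = [a - f for a, f in zip(actual, forecasts)]
        return sum(e * e for e in errors) / len(errors)

    def mad(actual, forecasts):
        errors = [a - f for a, f in zip(actual, forecasts)]
        return sum(abs(e) for e in errors) / len(errors)

    sales = [20, 40, 30, 50, 40]
    grid = [i / 100 for i in range(1, 100)]     # candidate alphas 0.01 .. 0.99
    for measure in (mse, mad):
        best = min(grid, key=lambda a: measure(sales, ses(sales, a)))
        print(measure.__name__, best)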
Here are seven summary measures of error commonly used:

- Mean (Average) Error (ME) = the average of the error for each period
- Mean Absolute Deviation (MAD) = the average of the absolute values of the error for each period
- Mean Squared Error (MSE) = the average of the squared error for each period
- Standard Deviation (SD) = the square root of the MSE, also known as root mean squared error (RMSE)
- Signed Squared Error (SSE) = the average of the squared error with the sign of the error retained
- Mean Percentage Error (MPE) = the average of the percentage error for each period, where the percentage error is the error divided by actual sales
- Mean Absolute Percentage Error (MAPE) = the average of the absolute value of the percentage error for each period
Table 2 contains a brief example that illustrates how each summary error measure is computed.

TABLE 2 – Error Measures

t     Sales   Forecast   Error   Abs     Sqr.    Signed       Percent   Abs.
                                 Error   Error   Sqr. Error             Percent
1      20        10        10     10      100       100         50%       50%
2      40        30        10     10      100       100         25%       25%
3      30        50       -20     20      400      -400        -67%       67%
4      50        40        10     10      100       100         20%       20%
Sum                        10     50      700      -100         28%      162%
Ave                        2.5    12.5    175       -25          7%      40.5%
                          (ME)   (MAD)   (MSE)     (SSE)       (MPE)    (MAPE)

SD/RMSE = √175 = 13.23
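For readers who want to verify Table 2, here is a short Python sketch (the variable names are ours) that computes all seven summary measures from the example's actuals and forecasts:

    import math

    actual   = [20, 40, 30, 50]
    forecast = [10, 30, 50, 40]
    errors = [a - f for a, f in zip(actual, forecast)]        # error = actual - forecast
    n = len(errors)

    me   = sum(errors) / n                                    # Mean Error: 2.5
    mad  = sum(abs(e) for e in errors) / n                    # Mean Absolute Deviation: 12.5
    mse  = sum(e * e for e in errors) / n                     # Mean Squared Error: 175
    sd   = math.sqrt(mse)                                     # SD / RMSE: 13.23
    sse  = sum(math.copysign(e * e, e) for e in errors) / n   # Signed Squared Error: -25
    mpe  = sum(e / a for e, a in zip(errors, actual)) / n     # Mean Percentage Error: ~7%
    mape = sum(abs(e) / a for e, a in zip(errors, actual)) / n  # MAPE: ~40.4% (40.5% in the
                                                                # table after per-period rounding)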
Each of these error summary measures has certain characteristics.
Mean Error (ME)
- shows direction of error
- does not penalize extreme errors
- errors cancel out (no idea of how much)
- in original units
Mean Absolute Deviation (MAD)
- shows magnitude of overall error
- does not penalize extreme errors
- errors do not cancel out
- no idea of direction of error
- in original units
Mean Squared Error (MSE)
- penalizes extreme errors
- errors do not offset one another
- not in original units
- does not show direction of error
Standard Deviation (SD) or (RMSE)
- penalizes extreme errors
- errors do not offset one another
- in original units
- does not show direction of error
Signed Squared Error (SSE)
- penalizes extreme errors
- errors can offset one another
- shows direction of error
- not in original units
Mean Percentage Error (MPE)
- takes percentage of actual sales
- does not penalize extreme error
- errors can offset one another
- shows direction of error
- assumes more sales can absorb more error in units
Mean Absolute Percentage Error (MAPE)
- takes percentage of actual sales
- does not penalize extreme deviations
- does not cancel offsetting errors
- assumes more sales can absorb more error in units
- does not show direction of error
These characteristics are illustrated in Figure 2 below.
Figure 2 - Summary Error Measures Taxonomy

Units or Percentage?
  Units:
    Let + and – errors cancel?
      Yes: Penalize extreme errors?
        Yes: SSE
        No:  ME
      No: Penalize extreme errors?
        Yes: Original units?
          Yes: SD/RMSE
          No:  MSE
        No:  MAD
  Percentage:
    Let + and – errors cancel?
      Yes: MPE
      No:  MAPE
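The taxonomy can also be expressed as a small decision function. The following Python sketch (the function and argument names are ours) simply mirrors the branches of Figure 2:

    def choose_measure(percentage, errors_cancel, penalize_extreme=False, original_units=True):
        # Walk the Figure 2 decision tree and return the matching summary measure.
        if percentage:
            return "MPE" if errors_cancel else "MAPE"
        if errors_cancel:
            return "SSE" if penalize_extreme else "ME"
        if penalize_extreme:
            return "SD/RMSE" if original_units else "MSE"
        return "MAD"

    # e.g., units, errors should not cancel, penalize extremes, original units:
    print(choose_measure(percentage=False, errors_cancel=False, penalize_extreme=True))  # SD/RMSE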
Since the summary error measure characteristics differ, a manager should ask the following questions to
help him/her determine which summary error measure would be best for use in technique evaluation
and in the other stages of the forecasting process.

1. Is the manager looking at a long-term perspective, i.e., more interested in the final result than in
period-by-period accuracy? Or is period-by-period accuracy more important than ultimate accuracy?
If the final result is more important, then ME, SSE, and MPE would be most appropriate.
If period-by-period accuracy is more important, then MAD, MSE, SD, and MAPE would be most appropriate.

2. Would the manager have trouble comprehending error (accuracy) unless it is expressed in "regular" units?
If regular units are desired, then ME, MAD, and SD would be appropriate.

3. Is the manager willing to accept more error if the (sales) base is larger?
If yes, then MPE and MAPE would be appropriate.

4. Would extreme error be very costly, so that the manager would be willing to accept lower overall
accuracy if extreme error could be avoided in any one period?
If yes, then MSE, SD, and SSE would be most appropriate.

5. Does the direction (sign) of the error make a difference in cost? In other words, is there an
asymmetrical loss function? (For a complete discussion of asymmetrical loss functions, see
Diebold (2001, 34-37).)
If yes, then ME, SSE, and MPE would be most appropriate.
Table 3 reflects the summary error measures which would be most appropriate given the managerial
questions posed above.

TABLE 3 – Managerial Questions Versus Summary Error Measures

                                                 ME   MAD   MSE   SD   SSE   MPE   MAPE
End result accuracy: errors cancel               x                      x     x
Period-by-period accuracy: errors don't cancel        x     x     x                 x
Regular units?                                   x    x           x
Willing to accept more absolute error
  if the base is larger?                                                      x     x
Penalize extreme error?                                     x     x     x
Does direction of error make a difference?
  (asymmetrical loss)                            x                      x     x
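The mapping in Table 3 can also be captured programmatically. In this hypothetical Python sketch (the APPROPRIATE dictionary and candidates function are our own names), each managerial concern maps to the set of measures marked appropriate for it, and intersecting the sets narrows the choice as more questions are answered:

    # Hypothetical encoding of Table 3.
    APPROPRIATE = {
        "end_result_errors_cancel":    {"ME", "SSE", "MPE"},
        "period_by_period":            {"MAD", "MSE", "SD", "MAPE"},
        "regular_units":               {"ME", "MAD", "SD"},
        "more_error_with_larger_base": {"MPE", "MAPE"},
        "penalize_extreme_error":      {"MSE", "SD", "SSE"},
        "direction_matters":           {"ME", "SSE", "MPE"},
    }

    def candidates(concerns):
        # Intersect the measure sets for every concern the manager raises.
        sets = [APPROPRIATE[c] for c in concerns]
        return sorted(set.intersection(*sets)) if sets else []

    # A manager who wants extreme errors penalized, in regular units:
    print(candidates(["penalize_extreme_error", "regular_units"]))  # ['SD']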
3. APPLICATION
Four industries with particular issues can be examined to illustrate the importance of accounting for
managerial concerns in selecting an error measure. We have purposely limited the complexity of the
applications so as not to obscure the managerial perspective regarding the error measure used in the
forecasting situation. First, in the electricity industry, short-term forecasters are called upon to forecast
the demand for electricity on an hourly basis one day ahead, taking into account the number and type of
customers, weather, etc. If the forecaster overestimates the demand, the utility will have excess capacity
on hand for which it receives no payment because no customer buys it. If the forecaster underestimates
demand, the utility may have to buy power on the spot market at very high prices or there may be
blackouts if no electricity can be found from other sources. Clearly, managers would be somewhat
unhappy about the former but extremely upset at the latter. Thus, underforecasts are much worse for the
utility than overforecasts. In this case, the forecaster would be best served by choosing Signed Squared
Error (SSE) because it is a unit-based (rather than percentage) measure, it penalizes extreme errors, and
it retains the direction of the error.
Second, consider a manufacturer of airplanes. In this industry, a forecaster may be called upon to
forecast future demand for a model of airplane that is not built yet. In this case, the management might
be biased against overforecasts for fear that it would have billions of dollars tied up in airplanes that it
cannot sell. In technical terms, there is an asymmetrical loss function. In this case, if the
management is risk averse, more error might be acceptable only as the expected number of planes sold
gets larger since a larger revenue base can absorb the cost of the error and consequently, the most
appropriate error measure would be in percentage terms. Because of the danger of overforecasting, the
sign of the error is still important, so the logical choice here would be Mean Percentage Error (MPE). If,
on the other hand, management is not risk averse and a larger revenue base is not important, Mean Error
(ME) would be acceptable. Beyond tying up billions of dollars, an overforecast might also disrupt the
production process and raise per-unit costs.
A third industry is represented by a restaurant where demand for different dishes needs to be forecasted
in order to buy ingredients. Some of these ingredients are perishable goods that cannot be carried over
from day to day. If ingredients are only used for one dish, an overforecast would result in the waste of the
perishable goods but an underforecast may result in angry customers who cannot order the dish that they
want. In this case, Mean Absolute Percentage Error (MAPE) would be the best error measure for the
forecaster since errors don’t cancel and, as in the previous example, management is willing to accept
more error when a larger revenue base can absorb it through more sales. If management would
not be willing to accept more error with larger revenue, Mean Absolute Deviation (MAD) would be the
best error measure.
Finally, a manager forecasting future commodity prices may need to be as accurate as possible, with
underforecasts being just as bad as overforecasts. Further, the manager may not be willing to accept
greater error as prices rise, and errors in one month don’t cancel out errors in another month since all
errors are costly. Since an extreme error could be disastrous financially, the manager should penalize
extreme errors. Thus, the best error measure would be Mean Squared Error (MSE) or, if the forecaster
prefers to work in regular units, Standard Deviation (SD).
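As a cross-check, the four industry examples can be run through the hypothetical candidates helper sketched after Table 3; under the assumptions encoded there, each case narrows to the measure(s) recommended above:

    print(candidates(["direction_matters", "penalize_extreme_error"]))      # ['SSE']  electricity
    print(candidates(["direction_matters", "more_error_with_larger_base"])) # ['MPE']  airplanes (risk averse)
    print(candidates(["period_by_period", "more_error_with_larger_base"]))  # ['MAPE'] restaurant
    print(candidates(["period_by_period", "penalize_extreme_error"]))       # ['MSE', 'SD'] commodity prices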
4. SUMMARY
When an error measure is chosen for use in evaluating technique performance, certain assumptions are
being made about what is important in the forecasting situation. For example, the measurement chosen
should reflect the importance that the company’s management places on the size and sign of the
forecasting error. If a manager is extremely concerned with underforecasts of demand for fear of not
having enough of his/her product and disappointing customers, the forecaster should choose an error
measurement that reflects this concern. If a manager only feels comfortable thinking in regular units then
this should be considered.
All too often, management does not consider the characteristics of the summary error measures used by
their forecasters for technique selection. More effort should be made to use these characteristics in the
evaluation stage of the forecasting process. If forecasters do this, they will be more likely to meet
managerial expectations for the forecast.
5. REFERENCES
Armstrong, Scott J. and Collopy, Fred, “Error Measures for Generalizing About Forecasting
Methods: Empirical Comparisons”, International Journal of Forecasting, Vol. VIII (1),
1992, 69-80.
Armstrong, Scott J. and Fildes, Robert, “On the Selection of Error Measures for Comparisons
Among Forecasting Methods”, Journal of Forecasting, Vol. XIV (1), 1995, 67-71.
Armstrong, Scott J., Principles of Forecasting, Kluwer, Boston, 2001.
Carbone, Robert and Armstrong, Scott J. “Evaluation of Extrapolative Forecasting Methods:
Results of a Survey of Academics and Practitioners”, Journal of Forecasting, Vol. I, 1982,
215-217.
DeLurgio, Stephen A., Forecasting Principles and Applications, Irwin/McGraw Hill, New York,
1998.
Diebold, Francis X., Elements of Forecasting, 2nd Ed., South-Western, Cincinnati, OH, 2001.
Fildes, Robert, “The Evaluation of Extrapolative Forecasting Methods”, International Journal of
Forecasting, Vol. VIII (1), 1992, 88-98.
Gardner, E. S., Jr., “Evaluating Forecast Performance in an Inventory Control System,” Management
Science, Vol. XXXVI (4), 1990, 490-499.
Geurts, M.D. and Whitlark, D.B., "Six Ways to Make Sales Forecasts More Accurate”, The
Journal of Business Forecasting, Vol. XVIII (4), Winter 1999-2000, pp. 21-23,30.
Jain, Chaman L., "Forecasting Errors in the Consumers Products Industry”, The Journal of
Business Forecasting, Vol. XXII (2), Summer 2003, 2-4.
Jain, Chaman L., "How to Measure the Cost of Forecast Error", The Journal of Business
Forecasting Methods and Systems, Vol. XXII (4), Winter 2003-04, 2, 29-30.
Kahn, K.B. "How to Measure the Impact of Forecast Error on an Enterprise", The Journal of
Business Forecasting, Spring 2003, 21-25.
Makridakis, S., et al., “The Accuracy of Extrapolation (Time Series) Methods: Results of a
Forecasting Competition”, Journal of Forecasting, Vol. I, 1982, 111-153.
Makridakis, S., Wheelwright, S.C., and Hyndman, R.J., Forecasting: Methods and Applications,
Wiley & Sons, New York, 1998.
Mentzer, John T. and Cox, James E., Jr., “Familiarity, Application, and Performance of Sales
Forecasting Techniques”, Journal of Forecasting, Vol. III (1), January-March 1984, 27-36.
Mentzer, John T. and Kahn, Kenneth B., “Forecasting Technique Familiarity, Satisfaction, Usage
and Application”, Journal of Forecasting, Vol. XIV (5), 1995, 465-476.
Thompson, Patrick A., “An MSE Statistic for Comparing Forecast Accuracy Across Series”,
International Journal of Forecasting, Vol. VI (2), 1990, 219-227.
6. AUTHOR PROFILES
Dr. James E. Cox, Jr. earned his Ph.D. at the University of Illinois at Urbana-Champaign in 1981. Currently
he is a professor of marketing at Illinois State University. In addition to teaching forecasting for over 25
years, he has consulted with major corporations in the forecasting area.
Dr. David G. Loomis earned his Ph.D. in economics at Temple University in 1995. Currently he is an
associate professor of economics at Illinois State University where he teaches forecasting. He formerly
worked at Bell Atlantic in the forecasting area.