COORDINATING DEMAND FORECASTING AND OPERATIONAL DECISION-MAKING WITH ASYMMETRIC COSTS: THE TREND CASE

Robert M. Saltzman, San Francisco State University

ABSTRACT

This article presents two methods for coordinating demand forecasting and operational decision-making when time series data display a trend and the unit costs of under-forecasting (cu) and over-forecasting (co) may differ. The first method, called ordinary least squares with adjustment (OLSA), makes a simple additive adjustment to the forecasts generated by OLS regression. The second method generates forecasts by fitting a weighted least absolute value (WLAV) regression line through the data, where the weights correspond to the unit underage and overage costs. Using simulation, both methods are tested against OLS regression forecasts under four types of error distributions and numerous combinations of cu and co. The performance measure used for evaluation is the mean opportunity loss (MOL) over the post-sample periods. Simulation results show that the benefits of both approaches increase rapidly as cu and co diverge, regardless of the error distribution. In particular, when the ratio of costs is 3:1, OLSA can reduce the post-sample MOL of OLS by 19-23%. Even when the ratio of costs is closer to 2:1, OLSA can still reduce the post-sample MOL by 6-7%. In a few scenarios, WLAV outperforms OLSA, but it is generally not as robust as OLSA.

I. INTRODUCTION

Traditionally, articles and texts concerned with demand forecasting tend to ignore the related decision-making aspects of the situation. Similarly, those concerned with operational decision-making tend to ignore how the relevant demand forecasts were developed. Many assume that a manager somehow already possesses a demand forecast, or that demand can be represented by some specific probability distribution (Brown, 1982). In reality, demand forecasting and decision-making should not be activities performed in isolation from one another. A few articles bridge this gap and show how these two activities interact, especially in the context of inventory control. For example, Gardner (1990) demonstrated how the choice of forecasting model plays a key role in determining the amount of investment needed to support a desired customer service level in a large physical distribution system. Eppen and Martin (1988) showed how various stationary demand distributions and lead time lengths affect forecasting error and, thus, error in the amount of safety stock required to meet a given service level. Watson (1987) studied how lumpy demand patterns can lead to large fluctuations in forecasts, which, in turn, increase ordering and holding costs and distort the actual level of service provided.

Importantly, positive and negative forecasting errors often have asymmetric impacts on operations and, therefore, may be viewed differently by the manager who actually uses the forecasts to make a decision. Symmetric performance measures (e.g., MAD, MSE, MAPE) may be misleading if the cost of over-forecasting is substantially different from that of under-forecasting. Consequently, the methods presented here are evaluated from the viewpoint of a decision-maker who adopts mean opportunity loss (MOL) as the best measure of forecasting performance. Unfortunately, the forecasting literature is dominated by articles and texts that use symmetric measures of forecasting performance and do not consider the role played by forecasts within the larger decision-making context.
One exception is the situation described by Lee and Adam (1986), who report that some MRP systems can reduce total production costs if demand for the final product is over-forecast by 10-30%. Conversely, class schedulers at a large university (SFSU) prefer to under-forecast enrollment in multi-section courses, finding it easier to add a section at the last minute if demand turns out to be higher than expected than to cancel an under-filled section if demand is lower than expected. Another common situation in which the costs of over-forecasting (co) and under-forecasting (cu) may be quite different is the classical single-period inventory problem. Gerchak and Mossman (1992, p. 805) claim that cu is usually much greater than co. In all of these settings, planners could estimate the relative costs of over- and under-forecasting and adjust the given forecasts in a manner that appropriately balances those costs. Rech, Saltzman and Mehrotra (2002) present an easy-to-use, general procedure for doing this that still allows the forecaster/decision-maker to first uncover the underlying pattern in the time series data. The aim of this article is to extend that integrative approach by demonstrating how demand forecasting and decision-making can be effectively coordinated when the time series of interest displays a linear trend.

II. MODEL AND FORECASTING METHODS

This study used the linear model y_t = α + βt + ε_t, for t = 1, 2, …, n, where y_t is the tth observation of the time series and ε_t is a random error for the tth observation. Four different ε_t distributions, described in the next section, were tested. The parameters α and β were estimated (by a and b, respectively) by each of three forecasting methods:

(1) OLS: Ordinary Least Squares regression;
(2) OLSA: Ordinary Least Squares regression with Adjustment, a method first described in Rech, Saltzman and Mehrotra (2002) that makes a simple additive adjustment to the OLS forecasts; and
(3) WLAV: Weighted Least Absolute Value regression, a new approach that finds the line minimizing the total opportunity loss.

Of course, OLS minimizes the sum of the squared residuals Σ(y_t - f_t)², where f_t = a + bt. For example, the OLS regression equation for the time series data (with n = 18) listed in the first two columns of Table 1 is f_t = 7.447 + 3.971t, graphed as the long-dashed line in Figure 1. In the simulation experiments, OLS served as a benchmark for comparison with the other two methods.

OLSA makes a simple additive adjustment to the OLS forecasts that takes into account the weights placed on positive and negative errors (cu and co, respectively) by the decision-maker. Given any set of forecasts and the associated errors, Rech, Saltzman and Mehrotra (2002) show how to easily find the adjustment a* that, when added back into the original forecasts, would minimize the historical MOL. This adjustment a* is, in essence, just the 100S-th percentile of the distribution of forecast errors, where S = cu/(co + cu) is called the critical ratio.
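As a concrete illustration of the OLSA calculation just described, the following Python sketch fits the OLS trend line and computes the adjustment a* as the ceil(nS)-th smallest in-sample error, matching the order-statistic convention used in the example below. It is a minimal sketch for exposition only (the original study was implemented in Excel/VBA), and the function and variable names are my own.

    import numpy as np

    def olsa_fit(y, c_u, c_o):
        """Fit an OLS trend line to y_1..y_n, then compute the OLSA adjustment a*,
        taken here as the ceil(n*S)-th smallest in-sample error, S = c_u/(c_u + c_o)."""
        y = np.asarray(y, dtype=float)
        n = len(y)
        t = np.arange(1, n + 1)
        b, a = np.polyfit(t, y, 1)                    # slope b, intercept a
        errors = y - (a + b * t)                      # e_t = y_t - f_t
        S = c_u / (c_u + c_o)                         # critical ratio
        a_star = np.sort(errors)[int(np.ceil(n * S)) - 1]
        return a, b, a_star                           # OLSA forecasts: (a + a_star) + b*t

For the Table 1 data with cu = $65 and co = $35 (so S = 0.65 and n = 18), this procedure should return the 12th smallest error, i.e., an adjustment near the a* = 5.653 reported below.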
TABLE 1: EXAMPLE OF FORECASTING METHODS & THEIR OPPORTUNITY LOSSES

                     Forecast                     Opportunity Loss ($)
  t   Actual     OLS    OLSA    WLAV          OLS      OLSA      WLAV
  1     5.77   11.42   17.07   19.02       197.71    395.57    463.70
  2    22.79   15.39   21.04   22.79       480.80    113.35      0.00
  3    28.04   19.36   25.01   26.55       564.06    196.60     96.44
  4    25.67   23.33   28.98   30.32       152.06    115.98    162.81
  5    20.74   27.30   32.96   34.09       229.66    427.52    467.25
  6    22.23   31.27   36.93   37.86       316.37    514.23    546.86
  7    30.81   35.24   40.90   41.63       155.19    353.05    378.58
  8    51.41   39.21   44.87   45.39       792.45    424.99    390.78
  9    49.16   43.19   48.84   49.16       388.48     21.03      0.00
 10    36.70   47.16   52.81   52.93       365.98    563.85    568.07
 11    48.35   51.13   56.78   56.70        97.29    295.15    292.27
 12    45.57   55.10   60.75   60.47       333.49    531.36    521.37
 13    67.57   59.07   64.72   64.23       552.31    184.85    216.58
 14    55.11   63.04   68.69   68.00       277.74    475.60    451.41
 15    73.36   67.01   72.66   71.77       412.57     45.11    103.22
 16    76.64   70.98   76.64   75.54       367.46      0.00     71.29
 17    75.75   74.95   80.61   79.31        51.45    170.16    124.67
 18    77.44   78.92   84.58   83.08        52.06    249.92    197.33
MOL                                        321.51    282.13    280.70

FIGURE 1: PLOT OF THE DATA AND FORECASTS WHEN S = 0.65 (demand vs. period t, showing the actual data and the OLS, OLSA, and WLAV lines)

To illustrate, let cu = $65 and co = $35, so that S = 0.65. Then, based on the set of errors generated by using the OLS regression equation to forecast the 18 historical periods, a* is set to 5.653, the value of the 12th smallest error (i.e., the 65th percentile of the error distribution). The resulting OLSA equation is f_t = 7.447 + 3.971t + a* = 13.100 + 3.971t, shown as the solid line in Figure 1. Because under-forecasting is nearly twice as costly as over-forecasting, the OLSA forecasts are considerably higher than the OLS forecasts and thereby reduce the historical MOL of OLS by 12.2% (from $321.51 to $282.13).

WLAV extends the Least Absolute Value criterion that others have investigated (e.g., Dielman and Rose 1994, Dielman 1986, and Rech, Schmidbauer and Eng 1989) to allow positive and negative errors to have different costs. Its objective is to find the line that minimizes the sum of weighted absolute errors, that is,

    cu Σ_{e_t > 0} (y_t - f_t) + co Σ_{e_t < 0} (f_t - y_t),   where e_t = y_t - f_t.

Thus, WLAV minimizes both the total and the mean opportunity loss over the historical periods. To solve the WLAV problem, we reformulate it as the following linear programming problem:

    Minimize     cu Σ P_t + co Σ N_t
    subject to   P_t - N_t + a + bt = y_t,   for t = 1, 2, …, n
                 P_t ≥ 0, N_t ≥ 0,           for t = 1, 2, …, n

The variables P_t and N_t represent the magnitudes of the positive and negative errors, respectively, i.e., the vertical distances from the actual data point (t, y_t) down and up to the fitted line. The difference P_t - N_t defines the residual for period t. With appropriate costs in the objective, the WLAV problem combines the forecasting and decision-making activities into a single problem and gives the decision-maker the flexibility to adapt to any situation with asymmetric costs. Applying this approach to the data given in Table 1 results in a WLAV equation of f_t = 15.25 + 3.768t, shown as the short-dashed line in Figure 1. As with OLSA, the WLAV forecasts are also well above those generated by OLS. Moreover, they reduce the historical MOL to $280.70, a 12.7% reduction relative to OLS and a slight improvement over OLSA.
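The WLAV linear program above is small (the two line coefficients plus 2n nonnegative error variables) and can be handed to any LP solver. The sketch below is offered only as an illustrative reformulation, not as the article's own implementation; the function name and the choice of SciPy's linprog are my assumptions.

    import numpy as np
    from scipy.optimize import linprog

    def wlav_fit(y, c_u, c_o):
        """Solve: minimize c_u*sum(P_t) + c_o*sum(N_t)
           subject to P_t - N_t + a + b*t = y_t, with P_t, N_t >= 0."""
        y = np.asarray(y, dtype=float)
        n = len(y)
        t = np.arange(1, n + 1)
        # Decision vector x = [a, b, P_1..P_n, N_1..N_n]; a and b are free.
        c = np.concatenate(([0.0, 0.0], np.full(n, float(c_u)), np.full(n, float(c_o))))
        A_eq = np.zeros((n, 2 + 2 * n))
        A_eq[:, 0] = 1.0                  # coefficient of the intercept a
        A_eq[:, 1] = t                    # coefficient of the slope b
        A_eq[:, 2:2 + n] = np.eye(n)      # +P_t
        A_eq[:, 2 + n:] = -np.eye(n)      # -N_t
        bounds = [(None, None), (None, None)] + [(0, None)] * (2 * n)
        res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
        a, b = res.x[:2]
        return a, b                       # WLAV forecasts: a + b*t

Applied to the 18-period series in Table 1 with cu = $65 and co = $35, a formulation of this kind should recover a line close to f_t = 15.25 + 3.768t.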
III. EXPERIMENTAL DESIGN

To test these methods on a wide variety of sample data, and to assess their performance on future or "post-sample" data, a Monte Carlo simulation experiment was conducted for the model presented in the previous section. The experimental design included two factors: the error distribution and the critical ratio. The following four error distributions, each with a mean of zero, were tested in the experiments:

(a) Normal, with mean µ = 0 and standard deviation σ = 5, or N(0, 5);
(b) Laplace (or double exponential), with each exponential having σ = 5;
(c) Contaminated Normal, where the ε_t are drawn from a N(0, 5) distribution with probability 0.85 and from a N(0, 25) distribution with probability 0.15; and
(d) Uniform, with floor -15 and ceiling +15, or U(-15, 15).

A Normal error distribution is typically assumed to hold in OLS applications; the latter three distributions permit investigation of how sensitive the results are to increasing weight in the tails of the error distribution. The critical ratio S = cu/(co + cu) expresses the cost of under-forecasting relative to that of over-forecasting. For example, a cu:co ratio of 3:2 is represented by S = 0.60. To determine how the results change over a wide range of values that S might take on in practice (i.e., from 0.10 to 0.90), the value of cu was varied from $10 to $90 in increments of $5, while setting co = $100 - cu. Results in the next section are presented as a function of S.

All of the experiments were run on a personal computer in Microsoft Excel using Visual Basic for Applications (Harris, 1996). Each experiment used the same parameter values of α = 0 and β = 5, the same sample size of n = 24 historical periods, and the same forecasting horizon of h = 6 post-sample periods. This relatively small amount of data is typical of business situations in which a time series may exhibit a trend, e.g., 2 years of monthly data (projecting up to half a year ahead) or 6 years of quarterly data (projecting up to a year and a half ahead).

A scenario here refers to a particular combination of critical ratio and error distribution. Within each replication of a scenario, the traditional design for testing forecasting methods on a time series (Makridakis et al., 1982) was followed. First, a time series of the desired length (n + h = 30 periods) was randomly generated using one of the four error distributions. The first n periods were then used to estimate (fit) the parameters of each forecasting method, while the last h periods were used by each method to generate a set of post-sample forecasts. Successively applying each forecasting method to the data yielded both a set of n historical forecasts and a set of h post-sample forecasts. Performance was then measured separately for each set of forecasts, i.e., the MOL across the n historical periods was recorded, as was the MOL over the h post-sample periods. Finally, an average MOL over 1000 replications was calculated for each scenario; these averages are reported in the next section.
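To make the replication procedure concrete, the sketch below codes one replication of a scenario in Python rather than the article's Excel/VBA setup. It is a sketch under the assumptions noted in the comments (e.g., the Laplace scale), with helper names of my own; only OLS and OLSA are scored here, and the WLAV fit from the earlier sketch could be evaluated the same way.

    import numpy as np

    rng = np.random.default_rng(2003)
    ALPHA, BETA, N, H = 0.0, 5.0, 24, 6       # trend parameters, fit and post-sample lengths

    def draw_errors(dist, size):
        """Zero-mean errors, parameterized as in the list above."""
        if dist == "normal":
            return rng.normal(0, 5, size)
        if dist == "laplace":
            return rng.laplace(0, 5, size)     # assumes each exponential has scale 5
        if dist == "contaminated":
            sd = np.where(rng.random(size) < 0.85, 5.0, 25.0)
            return rng.normal(0.0, sd)
        if dist == "uniform":
            return rng.uniform(-15, 15, size)
        raise ValueError(dist)

    def mol(actual, forecast, c_u, c_o):
        """Mean opportunity loss: c_u per unit under-forecast, c_o per unit over-forecast."""
        e = actual - forecast
        return float(np.mean(np.where(e > 0, c_u * e, -c_o * e)))

    def one_replication(dist, c_u, c_o):
        t = np.arange(1, N + H + 1)
        y = ALPHA + BETA * t + draw_errors(dist, N + H)
        b, a = np.polyfit(t[:N], y[:N], 1)     # OLS fit on the n historical periods
        S = c_u / (c_u + c_o)
        errs = np.sort(y[:N] - (a + b * t[:N]))
        a_star = errs[int(np.ceil(N * S)) - 1] # ceil(n*S)-th smallest error, as in Section II
        post_t, post_y = t[N:], y[N:]
        return {"OLS":  mol(post_y, a + b * post_t, c_u, c_o),
                "OLSA": mol(post_y, a + a_star + b * post_t, c_u, c_o)}

Averaging the values returned by one_replication over 1000 replications for each (distribution, S) scenario mirrors the post-sample MOL figures discussed next.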
IV. SIMULATION RESULTS

A few general observations can be made before examining the results in detail. First, in every scenario tested, WLAV always achieves the lowest historical MOL. This is not surprising, because its objective is precisely to find the line that minimizes the MOL over the historical periods. OLSA, on the other hand, is more constrained than WLAV in that it takes the forecasts from the OLS line as fixed and then adjusts only the intercept of this line by adding a* to all the historical forecasts. Regardless of the error distribution, WLAV significantly lowers the historical MOL of OLS. However, OLSA does nearly as well as WLAV in reducing the historical MOL (its historical MOL is only about 2% higher). Since both OLSA and WLAV reduce the historical MOL of OLS in all scenarios tested, the historical MOL case is not discussed further.

Second, the relative performance of WLAV and OLSA over the historical periods tended to be reversed over the post-sample periods in most scenarios. The simulation showed that, for the scenarios with Normal and Uniform errors, OLSA always outperforms WLAV; with Laplace and Contaminated Normal errors, OLSA outperforms WLAV for some values of S. While unexpected, this reversal in performance between sample and post-sample periods is not uncommon. Makridakis (1990, p. 505) references a number of empirical studies in which the method that best fit the historical data did not necessarily provide the most accurate post-sample forecasts.

Third, for each error distribution tested, the relative benefits of using either OLSA or WLAV over OLS are roughly symmetric about S = 0.50 (as can be seen in Figures 2-5), i.e., the percentage reductions in MOL at S = 0.25 and S = 0.75 are quite similar to one another, as are those at S = 0.35 and S = 0.65. Therefore, statements made below for values of S a certain distance above 0.50 are also approximately true for values of S the same distance below 0.50.

Normal errors. Two important differences from the historical data case emerge in the post-sample periods (Figure 2). First, OLSA outperforms WLAV for all values of S, and second, there are some values of S for which neither OLSA nor WLAV outperforms OLS. In particular, OLSA improves upon OLS when either S ≥ 0.60 or S ≤ 0.40; otherwise, OLSA yields a slightly higher MOL than OLS. Meanwhile, WLAV reduces the MOL of OLS only when S ≥ 0.65 or S ≤ 0.35. Still, both methods improve on OLS as the costs diverge from one another. For instance, at S = 0.65 (where cu is almost twice co), OLSA reduces the MOL by 5.8% (from $227.38 to $214.14), while WLAV reduces it by 2.1% (to $222.63). At S = 0.75 (where cu is three times co), OLSA reduces the MOL by 20.1% (from $226.79 to $181.30), while WLAV reduces it by 13.5% (to $196.07).

FIGURE 2: POST-SAMPLE MOL VS. S WITH NORMAL ERRORS
FIGURE 3: POST-SAMPLE MOL VS. S WITH LAPLACE ERRORS
FIGURE 4: POST-SAMPLE MOL VS. S WITH CONTAMINATED NORMAL ERRORS
FIGURE 5: POST-SAMPLE MOL VS. S WITH UNIFORM ERRORS
(Each figure plots the post-sample mean opportunity loss ($) against the critical ratio S, from 0.1 to 0.9, for OLS, OLSA, and WLAV.)

Laplace errors. Figure 3 shows that WLAV and OLSA both improve upon OLS for all S values. However, for 0.40 ≤ S ≤ 0.60, WLAV does slightly better than OLSA, while for S ≤ 0.30 and S ≥ 0.70, OLSA does increasingly better than WLAV.
As before, both methods rapidly improve on OLS as the costs diverge from one another. For instance, at S = 0.65, OLSA reduces the MOL of OLS by almost 6%, while WLAV reduces it by 5%. At S = 0.75, OLSA reduces the MOL by 17%, while WLAV reduces it by just under 13%.

Contaminated Normal errors. Figure 4 shares two key features with Figure 3. First, both OLSA and WLAV improve upon OLS for every value of S, and second, WLAV is superior to OLSA for some values of S (roughly 0.25-0.70). In many cases, WLAV is quite a bit better than OLS, e.g., at S = 0.45, where it reduces the MOL by 8.3% (from $384.56 to $352.49) while OLSA reduces the MOL by only 1.4% (to $379.08).

Uniform errors. Figure 5 shows OLSA consistently outperforming WLAV, and by larger amounts than in the Normal error case. OLSA appears to do slightly worse than OLS when S is between 0.45 and 0.55. However, at S = 0.65, OLSA reduces the MOL by 6.2% (from $410.99 to $385.32), while WLAV actually increases it by 2% (to $419.12). At S = 0.75, OLSA reduces the MOL by nearly 23% (from $409.58 to $315.70), while WLAV reduces it by 15%. As S approaches 0.50, WLAV performs much worse here than it did in the other cases. In particular, WLAV's post-sample MOL is 2-12% higher than that of OLS for values of S between 0.40 and 0.65. Clearly, one would not use WLAV in this case unless S were at least 0.70 or at most 0.30.

TABLE 2: SUMMARY OF POST-SAMPLE MOL RESULTS FOR SELECTED VALUES OF S

                               Error Distribution
  S     Method    Normal            Laplace           Contam. Normal    Uniform
 0.25   OLS      $222.95           $284.64           $375.16           $404.43
        OLSA     $179.84 (19.3%)   $237.35 (16.6%)   $314.87 (16.1%)   $315.54 (22.0%)
        WLAV     $190.78 (14.4%)   $246.84 (13.3%)   $305.56 (18.6%)   $340.04 (15.9%)
 0.35   OLS      $222.86           $291.73           $381.66           $403.72
        OLSA     $208.39 (6.5%)    $271.06 (7.1%)    $356.84 (6.5%)    $376.82 (6.7%)
        WLAV     $214.02 (4.0%)    $270.63 (7.2%)    $337.97 (11.4%)   $398.86 (1.2%)
 0.65   OLS      $227.38           $286.60           $392.80           $410.99
        OLSA     $214.14 (5.8%)    $270.09 (5.8%)    $361.38 (8.0%)    $385.32 (6.2%)
        WLAV     $222.63 (2.1%)    $272.22 (5.0%)    $353.16 (10.1%)   $419.12 (-2.0%)
 0.75   OLS      $226.79           $279.10           $400.54           $409.58
        OLSA     $181.30 (20.1%)   $231.62 (17.0%)   $324.19 (19.1%)   $315.70 (22.9%)
        WLAV     $196.07 (13.5%)   $243.07 (12.9%)   $325.65 (18.7%)   $346.88 (15.3%)

Table 2 summarizes the post-sample MOL results for all of the tested error distributions at four selected values of S. The percentages in parentheses beside the MOL figures for OLSA and WLAV represent the relative changes from the MOL with OLS.

V. CONCLUSION

This article has pursued two main themes. First, demand forecasting and operational decision-making can be easily coordinated in some circumstances. Second, when positive and negative forecasting errors have significantly different costs for the decision-maker, it is usually well worth adjusting the forecasts to account for the difference. This article has shown that the straightforward adjustment approach described in Rech, Saltzman and Mehrotra (2002) works well in the trend case across a variety of error distributions and critical ratios. It appears that the best forecasting method depends on the type of data being analyzed as well as the relative costs of under- and over-forecasting. In particular, if the pattern of errors about the trend line is well approximated by either a Normal or Uniform distribution, and the critical ratio S is not too close to 0.50, using OLSA provides a consistently greater reduction in MOL than does WLAV. Specifically, when the ratio of costs is 3:1 (i.e., S = 0.75 or 0.25), OLSA reduces post-sample MOL by 19-23%.
Even when the ratio of costs is closer to 2:1 (i.e., S = 0.65 or 0.35), OLSA still reduces post-sample MOL by 6-7%. If the errors appear to follow either a Normal or Uniform distribution and S is in the 0.45-0.55 range, it is probably not worthwhile to use either OLSA or WLAV. When the errors fit either a Contaminated Normal or Laplace distribution, both OLSA and WLAV improve upon the post-sample MOL of OLS for all values of S. However, WLAV tends to outperform OLSA for values of S close to 0.50, while OLSA does better for values of S closer to 0 or 1. In general, the benefits of using either OLSA or WLAV over OLS increase dramatically as the costs become more asymmetric, regardless of the error distribution.

Given the uncertainty in actually knowing the underlying error distribution, and given that OLSA's performance is more robust than WLAV's with respect to the error distribution, I recommend that OLSA be used in general, especially if S is outside of the 0.45-0.55 range. OLSA is much easier to implement than WLAV and outperforms WLAV in most scenarios. While WLAV has the pleasing property of finding the forecasting line and accounting for asymmetric costs simultaneously, it requires setting up and solving a linear program. Moreover, it significantly improves upon OLSA only when the errors are Contaminated Normal and 0.30 ≤ S ≤ 0.70.

This article has looked at long-term forecasting methods (i.e., regression) for the linear trend case. An area for further research would be to investigate the effect of adjusting short-term, adaptive forecasting methods in the trend case. For example, using exponential smoothing or moving average forecasts with some type of moving adjustment may well provide benefits similar to those described here relative to more sophisticated approaches such as Holt's method.

VI. REFERENCES

Dielman, T., “A Comparison of Forecasts from Least Absolute Value and Least Squares Regression”, Journal of Forecasting, Vol. 5, 1986, 189-195.

Dielman, T. and Rose, E., “Forecasting in Least Absolute Value Regression with Autocorrelated Errors: A Small-Sample Study”, International Journal of Forecasting, Vol. 10, 1994, 539-547.

Eppen, G. and Martin, R., “Determining Safety Stock in the Presence of Stochastic Lead Time and Demand”, Management Science, Vol. 34, 1988, 1380-1390.

Gardner, E., “Evaluating Forecast Performance in an Inventory Control System”, Management Science, Vol. 36 (4), 1990, 490-499.

Gerchak, Y. and Mossman, D., “On the Effect of Demand Randomness on Inventories and Costs”, Operations Research, Vol. 40 (4), 1992, 804-807.

Harris, M., Teach Yourself Excel Programming with Visual Basic for Applications in 21 Days, SAMS Publishing, Indianapolis, 1996.

Lee, T. and Adam, E., “Forecasting Error Evaluation in Material Requirements Planning Production-Inventory Systems”, Management Science, Vol. 32 (9), 1986, 1186-1205.

Makridakis, S. et al., “The Accuracy of Extrapolation (Time Series) Methods: Results of a Forecasting Competition”, Journal of Forecasting, Vol. 1, 1982, 111-153.

Makridakis, S., “Sliding Simulation: A New Approach to Time Series Forecasting”, Management Science, Vol. 36 (4), 1990, 505-512.

Rech, P., Saltzman, R. and Mehrotra, V., “Coordinating Demand Forecasting and Operational Decision Making: Results from a Monte Carlo Study and a Call Center”, Proceedings of the 14th Annual CSU-POM Conference, San Jose, CA, February 2002, 232-241.

Rech, P., Schmidbauer, P. and Eng, J., “Least Absolute Regression Revisited: A Simple Labeling Method for Finding a LAR Line”, Communications in Statistics - Simulation, Vol. 18 (3), 1989, 943-955.
Watson, R., “The Effects of Demand-Forecast Fluctuations on Customer Service Level and Inventory Cost when Demand is Lumpy”, Journal of the Operational Research Society, Vol. 38 (1), 1987, 75-82.