CHAPTER 2: Forecasting

Introduction to Forecasting
What is forecasting? Its primary function is to predict the future.
Why are we interested? Forecasts affect the decisions we make today.
Examples: who uses forecasting in their jobs?
- Forecasting demand for products and services
- Forecasting availability of manpower
- Forecasting inventory and materiel needs daily

Characteristics of Forecasts
- They are usually wrong!
- A good forecast is more than a single number: report a mean and standard deviation, or a range (high and low).
- Aggregate forecasts are usually more accurate.
- Accuracy erodes as we go further into the future.
- Forecasts should not be used to the exclusion of known information.

What Makes a Good Forecast
- It should be timely.
- It should be as accurate as possible.
- It should be reliable.
- It should be in meaningful units.
- It should be presented in writing.
- The method should, in most cases, be easy to use and understand.

Forecast Horizons in Operation Planning
(Figure 2.1)

Subjective Forecasting Methods
- Sales force composites: aggregation of sales personnel estimates
- Customer surveys
- Jury of executive opinion
- The Delphi method: individual opinions are compiled and reconsidered; repeat until an overall group consensus is (hopefully) reached.

Objective Forecasting Methods
Two primary methods: causal models and time series methods.
Causal models: let Y be the quantity to be forecasted and (X1, X2, . . . , Xn) be n variables that have predictive power for Y. A causal model is Y = f(X1, X2, . . . , Xn). A typical relationship is a linear one:
  Y = a0 + a1 X1 + . . . + an Xn

Time Series Methods
A time series is just a collection of past values of the variable being predicted; time series methods are also known as naive methods. The goal is to isolate patterns in past data (see the figures on the following pages):
- Trend
- Seasonality
- Cycles
- Randomness

(Figure 2.2)

Notation Conventions
Let D1, D2, . . . , Dn, . . . be the past values of the series to be predicted (demand). If we are making a forecast in period t, assume we have observed Dt, Dt-1, etc.
Let Ft, t+τ = the forecast made in period t for the demand in period t + τ, where τ = 1, 2, 3, . . . . Then Ft-1, t is the forecast made in t-1 for t, and Ft, t+1 is the forecast made in t for t+1 (one-step-ahead forecasts). Use the shorthand notation Ft = Ft-1, t.

Evaluation of Forecasts
The forecast error in period t, et, is the difference between the forecast for demand in period t and the actual value of demand in t.
- For a multiple-step-ahead forecast: et = Ft-τ, t - Dt
- For a one-step-ahead forecast: et = Ft - Dt
  MAD = (1/n) Σ |ei|
  MSE = (1/n) Σ ei²

Biases in Forecasts
A bias occurs when the average value of the forecast errors tends to be positive or negative. Mathematically, an unbiased forecast is one in which E(ei) = 0. See Figure 2.3 on page 64 in the text (next slide).

Forecast Errors Over Time
(Figure 2.3)

Forecasting for Stationary Series
A stationary time series has the form
  Dt = μ + εt
where μ is a constant and εt is a random variable with mean 0 and variance σ². Two common methods for forecasting stationary series are moving averages and exponential smoothing.

Moving Averages
In words: the forecast is the arithmetic average of the N most recent observations. For a one-step-ahead forecast:
  Ft = (1/N)(Dt-1 + Dt-2 + . . . + Dt-N)
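As an illustration (not from the original slides), here is a minimal Python sketch of the one-step-ahead moving-average forecast; the function name and the demand data are invented for the example.

  # Minimal sketch of a one-step-ahead moving-average forecast:
  # F_t = (1/N) * (D_{t-1} + D_{t-2} + ... + D_{t-N}).
  # The helper name and data below are hypothetical, for illustration only.

  def moving_average_forecast(demand, N):
      """Return the one-step-ahead forecast from the last N observations."""
      if len(demand) < N:
          raise ValueError("need at least N observations")
      return sum(demand[-N:]) / N

  # Example: a three-period moving average of monthly demand.
  demand = [92, 87, 95, 90, 88, 93]
  print(moving_average_forecast(demand, N=3))  # (90 + 88 + 93) / 3 = 90.33...

Note how only the last N observations enter the forecast, which is why the method both "forgets" old data and lags behind any trend.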
Summary of Moving Averages
Advantages of the moving average method:
- Easily understood
- Easily computed
- Provides stable forecasts
Disadvantages of the moving average method:
- Requires saving the last N data points
- Lags behind a trend
- Ignores complex relationships in the data

A Moving Average Lags a Trend
(Figure 2.4)

Exponential Smoothing Method
A type of weighted moving average that applies declining weights to past data.
1. New forecast = α(most recent observation) + (1 - α)(last forecast), or
2. New forecast = last forecast - α(last forecast error),
where 0 < α < 1 and α is generally small for stability of forecasts (around 0.1 to 0.2).

Exponential Smoothing (cont.)
In symbols:
  Ft+1 = α Dt + (1 - α) Ft
       = α Dt + (1 - α)(α Dt-1 + (1 - α) Ft-1)
       = α Dt + (1 - α)(α) Dt-1 + (1 - α)²(α) Dt-2 + . . .
Hence the method applies a set of exponentially declining weights to past data. It is easy to show that the sum of the weights is exactly one. (Equivalently, Ft+1 = Ft - α(Ft - Dt).)

Weights in Exponential Smoothing
(Figure)

Comparison of ES and MA
Similarities:
- Both methods are appropriate for stationary series.
- Both methods depend on a single parameter.
- Both methods lag behind a trend.
- One can achieve the same distribution of forecast error by setting α = 2/(N + 1).
Differences:
- ES carries all past history; MA eliminates "bad" data after N periods.
- MA requires storing all N past data points, while ES requires only the last forecast and the last observation.

Using Regression for Time Series Forecasting
Regression methods can be used when a trend is present.
Model: Dt = a + bt.
If t is scaled to 1, 2, 3, . . . , n, then the least squares estimates for a and b can be computed as follows:
  Sxx = n²(n + 1)(2n + 1)/6 - [n(n + 1)/2]²
  Sxy = n Σ i·Di - [n(n + 1)/2] Σ Di
  b = Sxy / Sxx
  a = D̄ - b(n + 1)/2,  where D̄ is the sample mean of the Di.
These values of a and b provide the "best" fit of the data in a least squares sense.

Other Methods When a Trend Is Present
Double exponential smoothing, of which Holt's method is only one example, can also be used to forecast when there is a linear trend present in the data. The method requires separate smoothing constants for the slope and the intercept.

Forecasting for Seasonal Series
Seasonality corresponds to a pattern in the data that repeats at regular intervals (see the figure on the next slide).
Multiplicative seasonal factors: c1, c2, . . . , cN, where i = 1 is the first period of the season, i = 2 is the second period of the season, etc. The factors satisfy Σ ci = N.
- ci = 1.25 implies demand 25% higher than the baseline on average.
- ci = 0.75 implies demand 25% lower than the baseline on average.

(Figure 2.8)

Quick and Dirty Method of Estimating Seasonal Factors
- Compute the sample mean of the entire data set (there should be at least several seasons of data).
- Divide each observation by the sample mean. (This gives a factor for each observation.)
- Average the factors for like periods in a season.
The resulting N numbers will sum exactly to N and correspond to the N seasonal factors.

Deseasonalizing a Series
To remove seasonality from a series, simply divide each observation in the series by the appropriate seasonal factor. The resulting series will have no seasonality and may then be predicted using an appropriate method. Once a forecast is made on the deseasonalized series, multiply that forecast by the appropriate seasonal factor to obtain a forecast for the original series.
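The following minimal Python sketch (illustrative, not from the slides) combines the quick-and-dirty seasonal factors with simple exponential smoothing on the deseasonalized series; all function names, parameter values, and data are invented for the example.

  # Sketch: quick-and-dirty seasonal factors, then forecast the deseasonalized
  # series with exponential smoothing and reseasonalize the result.
  # All names and numbers below are hypothetical.

  def seasonal_factors(demand, N):
      """Quick-and-dirty multiplicative seasonal factors for season length N."""
      mean = sum(demand) / len(demand)       # sample mean of the whole series
      ratios = [d / mean for d in demand]    # one factor per observation
      # Average the factors for like periods; the N results sum exactly to N.
      return [sum(ratios[i::N]) / len(ratios[i::N]) for i in range(N)]

  def smooth(series, alpha, f0):
      """One-step-ahead exponential smoothing: F(t+1) = alpha*D(t) + (1-alpha)*F(t)."""
      f = f0
      for d in series:
          f = alpha * d + (1 - alpha) * f
      return f

  # Two "years" of quarterly demand (season length N = 4).
  demand = [10, 20, 26, 17, 12, 23, 30, 16]
  c = seasonal_factors(demand, N=4)
  deseasonalized = [d / c[i % 4] for i, d in enumerate(demand)]
  f = smooth(deseasonalized, alpha=0.2, f0=deseasonalized[0])
  print(f * c[len(demand) % 4])  # reseasonalize: forecast for the next quarter

The division/multiplication pair mirrors the deseasonalize-forecast-reseasonalize cycle described above; any stationary-series method could replace the smoothing step.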
Box-Jenkins Models
Recommended when at least 72 data points of past history are available.
Primary feature: exploits the structure of the autocorrelation function of the time series.
Autocorrelation coefficient of lag k (a small numerical sketch appears at the end of the chapter):
  rk = [ Σ (t = k+1 to n) (Dt - D̄)(Dt-k - D̄) ] / [ Σ (t = 1 to n) (Dt - D̄)² ]

Stationary Time Series
Box-Jenkins models can only be constructed for stationary series, that is, series that exhibit no trend, seasonality, growth, etc. If the series is represented by D1, D2, . . . , then this translates to the assumptions that E(Di) = μ and Var(Di) = σ², independent of i. Later we will show how differencing can convert many non-stationary series to stationary series.

The Autoregressive Process
  Dt = a0 + a1 Dt-1 + a2 Dt-2 + . . . + ap Dt-p + εt
Interpret (a0, a1, . . . , ap) as the linear regression coefficients and εt as the error term. This is an AR(p) process. Simpler and more common is the AR(1) process, given by:
  Dt = a0 + a1 Dt-1 + εt

Theoretical Autocorrelation Function of the AR(1) Process
(Figures)

The Moving Average Process
  Dt = b0 - b1 εt-1 - b2 εt-2 - . . . - bq εt-q + εt
Note that the weights (b1, b2, . . . , bq) are shown with negative signs by convention. It can be shown that an AR(1) process is equivalent to an MA(∞) process. The MA(1) model is powerful because its autocorrelation function, which has a non-zero value only at lag 1, is often observed in practice.

Typical Realizations of the MA(1) Process
(Figures: realizations with negative and positive one-period autocorrelations)

ARMA Models
An ARMA model is one that includes both AR terms and MA terms. For example, the ARMA(1,1) model is:
  Dt = c + a1 Dt-1 - b1 εt-1 + εt
By combining AR and MA terms into a single model, we are able to capture complex relationships in the data with a parsimonious model (i.e., one with as few terms as possible).

ARIMA Models
The "I" in ARIMA stands for integrated, which means applying an ARMA model to a differenced process. Differencing can convert a non-stationary time series into a stationary one under some circumstances. One order of differencing eliminates a linear trend, and two orders of differencing eliminate a quadratic trend. First differencing would be denoted:
  Ut = Dt - Dt-1

Practical Considerations in Forecasting
- Overly sophisticated forecasting methods can be problematic, especially for long-term forecasting. (Refer to the figure on the next slide.)
- Tracking signals may be useful for indicating forecast bias.
- Box-Jenkins methods require a substantial data history, use the correlation structure of the data, and can provide significantly improved forecasts under some circumstances.

(Figure 2.12)

Case Study: Sport Obermeyer Saves Money Using Sophisticated Forecasting Methods
Problem: the company had to commit at least half of its production based on forecasts, which were often very wrong. The standard jury-of-executive-opinion method of forecasting was replaced by a type of Delphi method that could itself predict forecast accuracy from the dispersion of the forecasts received. The firm could commit early to items whose forecasts were more likely to be accurate and hold off on items whose forecasts were probably off. Use of early information from retailers improved forecasting on difficult items. Consensus forecasting in this case was not the best method.
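To make the Box-Jenkins calculations concrete, here is the promised minimal Python sketch of the lag-k autocorrelation coefficient rk defined earlier and of first differencing; it is illustrative only, and the demand data are invented.

  # Sketch (illustrative, not from the slides): the lag-k autocorrelation
  # coefficient r_k and first differencing U_t = D_t - D_{t-1}.

  def autocorrelation(series, k):
      """r_k = sum_{t=k+1..n}(D_t - Dbar)(D_{t-k} - Dbar) / sum_{t=1..n}(D_t - Dbar)^2."""
      n = len(series)
      mean = sum(series) / n
      num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
      den = sum((d - mean) ** 2 for d in series)
      return num / den

  def first_difference(series):
      """U_t = D_t - D_{t-1}; one order of differencing removes a linear trend."""
      return [series[t] - series[t - 1] for t in range(1, len(series))]

  # A short series with a clear upward trend.
  demand = [10, 12, 15, 17, 20, 22, 25, 27, 30, 32]
  print(autocorrelation(demand, k=1))  # about 0.71: strong positive lag-1 correlation
  print(first_difference(demand))      # [2, 3, 2, 3, ...]: the trend is removed

In Box-Jenkins practice one would examine rk across many lags (the sample autocorrelation function) on the differenced, stationary series before choosing the AR and MA orders.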