BUAD306
Chapter 3 – Forecasting

Everyday Forecasting
• Weather
• Time
• Traffic
• Other examples???

What is Forecasting?
Forecast: A statement about the future
Used to help managers:
• Plan the system
• Plan the use of the system

Use of Forecasts
Accounting               Cost/profit estimates
Finance                  Cash flow and funding
Human Resources          Hiring/recruiting/training
Marketing                Pricing, promotion, strategy
MIS                      IT/IS systems, services
Operations               Schedules, MRP, workloads
Product/service design   New products and services

Forecasting Basics
• Assumes causal system: past ==> future
• Forecasts rarely perfect because of randomness
• Forecasts more accurate for groups vs. individuals
• Forecast accuracy decreases as time horizon increases

Elements of a Good Forecast
• Timely – feasible horizon
• Reliable – works consistently
• Accurate – degree should be stated
• Expressed in meaningful units
• Written – for consistency of usage
• Easy to Use – KISS

Approaches to Forecasting
• Judgmental – subjective inputs
• Time Series – historical data
• Associative – explanatory variables

Judgmental Forecasts
• Executive Opinions: Accuracy??
• Outside Opinions: Industry experts
• Sales Force Feedback: Bias???
• Consumer Surveys: Guarantee???

What would you rather evaluate?
[Figure: line chart of series A, B, and C over periods 1–15]

Period   A    B    C
1        30   18   46
2        34   17   26
3        32   19   27
4        34   19   23
5        35   22   22
6        30   23   48
7        34   23   29
8        36   25   20
9        29   24   14
10       31   26   18
11       35   27   47
12       31   28   26
13       37   29   27
14       34   31   24
15       33   33   22
16

Time Series Forecasts
Based on observations over a period of time
Identifies:
• Trend – LT movement in data
• Seasonality – ST variations
• Cycles – wavelike variations
• Irregular Variations – unusual events
• Random Variations – chance/residual

Forecast Variations
[Figure: illustrations of irregular variation, random variation, trend, cycles, and seasonal variations]

Naïve Forecasting
• Simple to use
• Minimal to no cost
• Data analysis is almost nonexistent
• Easily understandable
• Cannot provide high accuracy
• Can be a standard for accuracy
RULE: “Whatever happened ‘yesterday’ is going to happen tomorrow as long as I apply LOGIC.”

HW Problem 1
Day   Muffins   Buns   Cupcakes
1     30        18     46
2     34        17     26
3     32        19     27
4     34        19     23
5     35        22     22
6     30        23     48
7     34        23     29
8     36        25     20
9     29        24     14
10    31        26     18
11    35        27     47
12    31        28     26
13    37        29     27
14    34        31     24
15    33        33     22
16

HW Problem 1
[Figure: line chart of Muffins, Buns, and Cupcakes over days 1–15]

Techniques for Averaging
• Moving average
• Weighted moving average
• Exponential smoothing

Simple Moving Average
$$MA_n = \frac{\sum_{i=1}^{n} A_i}{n}$$
Where:
• $i$ = index that corresponds to periods
• $n$ = number of periods (data points)
• $A_i$ = Actual value in time period $i$
• $MA$ = Moving Average
• $F_t$ = Forecast for period $t$

Example 1: Moving Average
Period   1      2      3      4      5      6      7      8
Sales    3520   2860   4005   3740   4310   5001   4890   ??

Four period moving average for period 7:
Four period moving average for period 8:
Four period moving average for period 9 if actual for 8 = 5025:

Weighted Moving Average
Similar to a moving average, but assigns more weight to the most recent observations. Total of weights must equal 1.

Example 2: Weighted Moving Average
Period   1      2      3      4      5      6      7      8
Sales    3520   2860   4005   3740   4310   5001   4890   ??
Compute a weighted moving average forecast for period 8 using the following weights: .40, .30, .20 and .10:

HW #2 – Let’s Discuss
Month   Feb   Mar   Apr   May   June   July   Aug
Sales   19    18    15    20    18     22     20

Calculating Error
Mathematically: $e_t = A_t - F_t$
Let’s discuss examples on board…

Premise – Exponential Smoothing
• The most recent observations might have the highest predictive value….
• And since all forecasts have error…
• We should give more weight to the error in the more recent time periods when forecasting.

Exponential Smoothing
$$F_t = F_{t-1} + \alpha(A_{t-1} - F_{t-1})$$
Next forecast = Previous forecast + α(Actual – Previous forecast)

Smoothing Constant
About α:
• α = Smoothing constant selected by forecaster
• It is a percentage of the forecast error
• The closer the value is to zero, the slower the forecast will be to adjust to forecast errors (greater smoothing)
• The closer the value is to 1.00, the greater the responsiveness to errors and the less smoothing

Example 3: Exponential Smoothing
$$F_t = F_{t-1} + \alpha(A_{t-1} - F_{t-1})$$
Period   1      2      3      4      5      6      7      8
Sales    3520   2860   4005   3740   4310   5001   4890   ??

Assume a starting forecast of 4030 for period 3. Given the data above and α = .10, what would the forecast be for period 8?

Example 3: Exponential Smoothing
Period   Act    Forecast Calc   Fore
3        4005                   4030
4        3740
5        4310
6        5001
7        4890
8

HW #2 – Let’s Discuss
Month   Feb   Mar   Apr   May   June   July   Aug
Sales   19    18    15    20    18     22     20

Techniques for Seasonality
Seasonal Variations – regularly repeating movements in series values that can be tied to recurring events
Computing Seasonal Relatives: Although we will discuss how relatives are created in class, you do not have to know this for the exam – just how to apply the relatives to a forecast.

Using Seasonal Relatives
Allows you to incorporate seasonality or deseasonalize data
• Incorporate: Adds seasonality into the trend forecast so that you can see peaks and valleys.
• Deseasonalize: Removes seasonal components to get a clearer picture of non-seasonal components (underlying trend)

Example 4: Using Seasonal Relatives
A publisher wants to predict quarterly demand for a certain book for periods 11 and 12, which happen to be in the 3rd and 4th quarters of a particular year. The data series consists of both trend and seasonality. The trend portion of demand is projected using the equation: $y_t = 12{,}500 + 150.5t$. Quarter relatives are Q1 = 1.3, Q2 = .8, Q3 = 1.4, Q4 = .9.
Use this information to predict demand for periods 9 and 16.
The trend values:
Applying the relatives:

HW #11 – Let’s Discuss
The following equation summarizes the trend portion of quarterly sales of condos over a long cycle. Prepare a forecast for each quarter of next year and the first quarter of the following year.
$$F_t = 40 - 6.5t + 2t^2$$
$F_t$ = unit sales
$t$ = 0 at 1Q of last year

Quarter   Relative
1         1.1
2         1
3         .6
4         1.3

Assoc. Forecasting Technique: Simple Linear Regression
• Predictor variables – used to predict values of the variable of interest
• Regression – technique for fitting a line to a set of points
• Least squares line – minimizes sum of squared deviations around the line

Linear Regression Assumptions
• Variations around the line are random
• No patterns are apparent
• Deviations around the line should be normally distributed
• Predictions are being made only in the range of observed values
• Should use a minimum of 20 observations for best results

Suppose you analyze the following data...
X   7    2    6    4    14   15   16   12   14   20   15   7
Y   15   10   13   15   25   27   24   20   27   44   34   17

[Figure: scatter plot of Y against X]

The regression line has the following equation:
$$y_c = a + bx$$
Where:
• $y_c$ = Predicted (dependent) variable
• $x$ = Predictor (independent) variable
• $b$ = slope of the line
• $a$ = Value of $y_c$ when $x = 0$

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
$$a = \frac{\sum y - b\sum x}{n}$$

Example 5 - Linear Regression:
Suppose that a manufacturing company made batches of a certain product.
The accountant for the company wished to determine the cost of a batch of product given the following data:

Size of batch   Cost of batch (in $1000s)
20              1.4
30              3.4
40              4.1
50              3.8
70              6.7
80              6.6
100             7.8
120             10.4
150             11.7

Question… which is the dependent (y) and which is the independent (x) variable?

Batch (x)   Cost (y)   xy     x^2
20          1.4        28     400
30          3.4        102    900
40          4.1        164    1600
50          3.8        190    2500
70          6.7        469    4900
80          6.6        528    6400
100         7.8        780    10000
120         10.4       1248   14400
150         11.7       1755   22500
SUM = 660   55.9       5264   63600

We are now ready to determine the values of b and a:
SUM: Batch (x) = 660, Cost (y) = 55.9, xy = 5264, x^2 = 63600

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{9(5264) - (660)(55.9)}{9(63600) - (660)^2} = \frac{47376 - 36894}{572400 - 435600} = \frac{10482}{136800} = .0766$$
$$a = \frac{\sum y - b\sum x}{n} = \frac{55.9 - .0766(660)}{9} =$$

Our linear regression equation: $y_c = a + bx$
$y_c =$
What is the cost of a batch of 125 pieces?
$y_c =$

[Figure: cost (y) plotted against batch size (x), with fitted linear trend line]

Problem #7
Freight car loadings at a busy port are as follows:

Week   Loadings   Week   Loadings
1      220        10     380
2      245        11     420
3      280        12     450
4      275        13     460
5      300        14     475
6      310        15     500
7      350        16     510
8      360        17     525
9      400        18     541

Problem #7
Period (t)   Loads (y)   t^2     t*y
1            220         1       220
2            245         4       490
3            280         9       840
4            275         16      1,100
5            300         25      1,500
6            310         36      1,860
7            350         49      2,450
8            360         64      2,880
9            400         81      3,600
10           380         100     3,800
11           420         121     4,620
12           450         144     5,400
13           460         169     5,980
14           475         196     6,650
15           500         225     7,500
16           510         256     8,160
17           525         289     8,925
18           541         324     9,738
SUM 171      7,001       2,109   75,713

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
$$a = \frac{\sum y - b\sum x}{n}$$

Correlation (r)
A measure of the relationship between two variables
• Strength
• Direction (positive or negative)
Ranges from -1.00 to +1.00
• Correlation close to 0 signifies a weak relationship – other variables may be at play
• Correlation close to +1 or -1 signifies a strong relationship

Calculating a Correlation Coefficient
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$

Example 6:
Continued
SUM: Batch (x) = 660, Cost (y) = 55.9, xy = 5264, x^2 = 63600, y^2 = 439.11

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
$$r = \frac{9(5264) - (660)(55.9)}{\sqrt{9(63600) - (660)^2}\,\sqrt{9(439.11) - (55.9)^2}}$$
$$r = \frac{47376 - 36894}{\sqrt{136800}\,\sqrt{827.18}} = \frac{10482}{369.86 \times 28.76} = .985$$

Coefficient of Determination (r^2)
• How well a regression line “fits” the data
• Ranges from 0.00 to 1.00
• The closer to 1.0, the better the fit

Example 6: Continued
[Figure: cost plotted against batch size, with fitted linear trend line]
$r = .985$
$r^2 = .985^2 = .97$

Conclusion of Example
• $r = .985$: Positive, close to one
• $r^2 = .985^2 = .97$: The closer to one, the better the fit to the line

Forecast Accuracy
• Error – difference between actual value and predicted value
• Mean absolute deviation (MAD) – Average absolute error
• Mean squared error (MSE) – Average of squared error
Why can’t we simply calculate error for each observed period and then select the technique with the lowest error?

Error Example
Period     Actual   3PMA       5PMA   3P WMA .6, .3, .1   EX SM .2   LR
1          55                                                        55.69
2          60                                             55         60.66
3          75                                             56         65.63
4          58       63.33333          68.5                59.8       70.6
5          80       64.33333          63.3                59.44      75.57
6          90       71         65.6   72.9                63.552     80.54
7          70       76         72.6   83.8                68.8416    85.51
8          92       80         74.6   77                  69.07328   90.48
9          100      84         78     85.2                73.65862   95.45
10                  87.33333   86.4   94.6                78.9269    100.42
#errors?            6          4      6                   8          9

Does the # of errors calculated impact the "accuracy" comparison???

Calculating Error
Mathematically: $e_t = A_t - F_t$
What do the negative errors mean? How do they affect total error?

Calculating MAD and MSE
$$MAD = \frac{\sum \left|\text{Actual} - \text{forecast}\right|}{n}$$
$$MSE = \frac{\sum \left(\text{Actual} - \text{forecast}\right)^2}{n - 1}$$

Conclusions with MAD & MSE
The MAD and MSE can be used as a comparison tool for several forecasting techniques. The forecasting technique that yields the lowest MAD and MSE is the preferred forecasting method.

MAD & MSE Comparison
Tech          MAD   MSE
3 Period MA   3.6   8.1
Exp Sm .02    2.2   5.6
Exp Sm .04    2.6   6.1

Which technique would you select for your forecast approach?
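
The MAD and MSE comparison can also be tried out in code. The following is a minimal Python sketch, not part of the original slides, that reuses the Actual series from the Error Example (55, 60, 75, 58, 80, 90, 70, 92, 100). It builds a 3-period moving average, a 3-period weighted moving average with weights .6, .3, .1, and exponential smoothing with α = .2, then scores each with MAD (dividing by n) and MSE (dividing by n - 1), as the formulas on the slide do. The function names are illustrative assumptions. The exponential smoothing start (forecast for period 2 equals the period 1 actual) is inferred from the Error Example table.

```python
def moving_average(actuals, n):
    """Simple n-period moving average; first forecast is for period n + 1."""
    return [sum(actuals[i - n:i]) / n for i in range(n, len(actuals) + 1)]


def weighted_moving_average(actuals, weights):
    """weights[0] applies to the most recent period; weights should sum to 1."""
    n = len(weights)
    return [
        sum(w * a for w, a in zip(weights, reversed(actuals[i - n:i])))
        for i in range(n, len(actuals) + 1)
    ]


def exponential_smoothing(actuals, alpha):
    """F_t = F_{t-1} + alpha * (A_{t-1} - F_{t-1}); first forecast is for period 2.

    Assumption: the period 2 forecast is set to the period 1 actual.
    """
    forecasts = [actuals[0]]
    for actual in actuals[1:]:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (actual - previous))
    return forecasts


def forecast_errors(actuals, forecasts, first_period):
    """e_t = A_t - F_t for every period that has both an actual and a forecast."""
    return [a - f for a, f in zip(actuals[first_period - 1:], forecasts)]


def mad(errors):
    """Mean absolute deviation: sum of |e| divided by n."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean squared error with the n - 1 divisor used on the slide."""
    return sum(e * e for e in errors) / (len(errors) - 1)


actuals = [55, 60, 75, 58, 80, 90, 70, 92, 100]

# Each entry: (forecasts, period of the first forecast).
techniques = {
    "3PMA": (moving_average(actuals, 3), 4),
    "3P WMA": (weighted_moving_average(actuals, [0.6, 0.3, 0.1]), 4),
    "Ex Sm .2": (exponential_smoothing(actuals, 0.2), 2),
}

results = {}
for name, (forecasts, first_period) in techniques.items():
    errors = forecast_errors(actuals, forecasts, first_period)
    results[name] = (len(errors), mad(errors), mse(errors))
    print(f"{name:10s} errors={len(errors)} "
          f"MAD={mad(errors):.2f} MSE={mse(errors):.2f}")
```

Each technique ends up with a different number of errors (6, 6, and 8 here), because each needs a different amount of history before its first forecast. That is the issue the "#errors?" row raises: when comparing MAD and MSE across techniques, it may be fairer to score them over the same set of periods.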