INDR 372 - Spring 2024
Fikri Karaesmen
Review Exercises for the Midterm Exam

1. Let us assume that $D_t = \mu + \epsilon_t$ (for $t = 1, 2, \ldots$), where the $\epsilon_t$ are independent and identically distributed normal random variables with mean 0 and variance $\sigma^2$. Consider the following three forecasts:
(i) $F_t = D_{t-1}$,
(ii) $G_t = 0.8 D_{t-1} + 0.2 D_{t-2}$,
(iii) $H_t = D_{t-1} + (D_{t-2} - D_{t-3})$.
For all three forecasts:
(a) Determine whether the forecast is unbiased.
(b) Find the variance of the error $e_t = F_t - D_t$.
(c) Are $F_t$ and $D_{t-1}$ correlated? If so, find the auto-covariance $\mathrm{Cov}(F_t, D_{t-1})$.
(d) Are $G_t$ and $D_{t-1}$ correlated? If so, find the auto-covariance $\mathrm{Cov}(G_t, D_{t-1})$.

2. Let us assume that $D_t = a + bt + \epsilon_t$ (for $t = 1, 2, \ldots$), where the $\epsilon_t$ are independent and identically distributed normal random variables with mean 0 and variance $\sigma^2$.
(a) Assume that $b$ is unknown (but constant). Show that $0.5(D_{t-1} - D_{t-2}) + 0.5(D_{t-2} - D_{t-3})$ is an unbiased estimator for $b$. What is the variance of this estimator?
(b) Assume $a$ is known, and consider the following forecast: $F_t = a + (D_{t-1} - D_{t-2})t$. Is this unbiased?

3. The Australian Beer Production Data (see the blackboard page) for 1991 starts with the following monthly production quantities: $y_1 = 164$, $y_2 = 148$, $y_3 = 152$, $y_4 = 144$, $y_5 = 155$, $y_6 = 125$.
(a) Compute $F_4$ using a moving average with a 3-period window.
(b) Compute $F_4$ using exponential smoothing with $\alpha = 0.8$.
(c) Compute $F_4$ using double exponential smoothing with $\alpha = 0.8$ and $\beta = 0.5$. You can use initial estimates of the level and slope $\hat{a} = 150$ and $\hat{b} = 10$.
(d) Compute $F_4$ using the following ARIMA model: $D_t = 80 + 0.4 D_{t-1} + 0.1 D_{t-2} + \epsilon_t$.

4. John Kittle, an independent insurance agent, uses a five-year moving average to forecast the number of claims made in a single year for one of the large insurance companies he sells for. He has just discovered that a clerk in his employ incorrectly entered the number of claims made four years ago as 1,400 when it should have been 1,200.
(a) What adjustment should Mr.
Kittle make in next year's forecast to take into account the corrected value of the number of claims four years ago?
(b) Suppose that Mr. Kittle used simple exponential smoothing with $\alpha = 0.2$ instead of moving averages to determine his forecast. What adjustment is now required in next year's forecast? (Note that you do not need to know the value of the forecast for next year in order to solve this problem.)

5. The owner of a small brewery in Milwaukee, Wisconsin, is using Winters's method to forecast his quarterly beer sales. He has been using smoothing constants of $\alpha = 0.2$, $\beta = 0.2$, and $\gamma = 0.2$. He has currently obtained the following values of the various slope, intercept, and seasonal factors: $S_{10} = 120$, $G_{10} = 14$, $c_{10} = 1.2$, $c_9 = 1.1$, $c_8 = 0.8$, $c_7 = 0.9$.
(a) Determine the forecast for beer sales in quarter 11.
(b) Suppose that the actual sales turn out to be 128 in quarter 11. Find $S_{11}$ and $G_{11}$, and find the updated values of the seasonal factors. Also determine the forecast made at the end of quarter 11 for quarter 13.

6. Consider monthly sales data that exhibit an increasing trend with a 12-month seasonality.
(a) What would be an appropriate ARIMA model based on this information only? (i.e., $D_t = \alpha_1 D_{t-1} + \alpha_2 e_{t-1} + \ldots$?)
(b) How would you analyze the data for a complete ARIMA study?

7. Consider the autocorrelation and the partial autocorrelation plots for two different time series in Figure 1.
(a) What is an appropriate ARIMA model for the series in Figure 1A?
(b) What is an appropriate ARIMA model for the series in Figure 1B?
(c) Assume that $D_1 = 20$, $D_2 = 15$, $D_3 = 12$, and that you are using an ARIMA(1,0,0) model with $a_0 = 10$ and $a_1 = -0.5$. What is your forecast for period 4 and period 5 (made in period 3)?

8. Consider the bike share demand time series (the daily number of bikes that are checked out from the Paris municipality bike share system), whose plot and autocorrelation plot are shown in Figure 2.
(a) Explain what is observed in the ACF.
(b) What steps should be taken to fit an ARIMA model to this data?

9. Consider the bike share demand time series (the daily number of bikes that are checked out from the Paris municipality bike share system). The data also include weather-related measurements for each day. Thinking that the number of bikes shared may depend on weather factors, we perform a linear regression using three weather-related measurements. Note that the weather-related measurements, such as wind speed, temperature, and humidity, are normalized to values between 0 and 1. The results are summarized in the table in Figure 3.
(a) Which predictors (factors) are significant? Why?
(b) Is the regression satisfactory? What can be done to improve it?
(c) Assume that tomorrow is predicted to be an average day (i.e., $x_1 = x_2 = x_3 = 0.5$). Find a point estimate and a 90% confidence interval for the number of bikes demanded.

10. Consider the refrigerator sales data from the lectures (the monthly number of refrigerators sold over a number of years). Because there appears to be both trend and seasonality in the data, we use the following model to predict the monthly sales:
$y_t = b_0 + b_1 x_{1,t} + b_2 x_{2,t} + \ldots + b_{12} x_{12,t} + \epsilon_t$
where $x_{1,t}$ models the trend, i.e., $x_{1,t} = t$, and $x_{i,t}$ (for $i = 2, 3, \ldots, 12$) are the monthly binary (dummy) variables, i.e., $x_{2,t} = 1$ if month $t$ is January and is zero otherwise. Similarly, $x_{3,t} = 1$ if month $t$ is February and is zero otherwise. Note that we need only 11 such variables since $x_{13,t} = 1 - (x_{2,t} + x_{3,t} + \ldots + x_{12,t})$ (the binary variable for December is linearly dependent on the binaries for the other months). The regression results are in the table in Figure 4.
(a) Is the regression statistically significant? Explain.
(b) Which predictors are statistically significant? Explain.
(c) Predict the average sales of refrigerators for month 25 (January of year 3).
(d) What is the expected change in sales of refrigerators from month 25 to month 27 (January to March of year 3)?
(e) What is the expected change in sales of refrigerators from month 25 to month 36 (January to December of year 3)?
(f) Would the accuracy of the regression (measured by $R^2$) improve if we added a new term $w_t = t^2$ for a quadratic trend? What about the adjusted $R^2$?
(g) If we were to shrink (reduce) the model by a lasso regression, which predictor is likely to vanish first as we increase the penalty parameter $\lambda$? Which predictor is likely to survive until the end as $\lambda$ increases further? (Note that we can only make a reasonable guess but cannot give a definitive answer since the optimization problem is complicated. That is why we need computational tools.)

11. Mr. Meadows Cookie Company makes a variety of chocolate chip cookies in the plant in Albion, Michigan. Based on orders received and forecasts of buying habits, it is estimated that the demand for the next four months is 850, 1,260, 510, and 980, expressed in thousands of cookies. During a 46-day period when there were 120 workers, the company produced 1.7 million cookies. Assume that the numbers of workdays over the four months are respectively 26, 24, 20, and 16. There are currently 100 workers employed, and there is no starting inventory of cookies.
(a) What is the minimum constant workforce required to meet demand over the next four months?
(b) Assume that $c_I$ = 10 cents per cookie per month, $c_H = \$100$, and $c_F = \$200$. Evaluate the cost of the plan derived in part (a).
(c) Formulate as a linear program. Be sure to define all variables and include the required constraints.
(d) Suppose now that the cost of hiring workers each period is \$100 for each worker until 20 workers are hired, \$400 for each worker when between 21 and 50 workers are hired, and \$700 for each worker hired beyond 50. Write down the complete linear programming formulation of the problem for these hiring costs.

12.
Consider the following extension to the standard production modeling framework. The workforce has a learning curve and becomes more efficient each month they work. Assume that each month of experience leads to an additional $r$% productivity with respect to the previous month, i.e., if a newly starting worker makes $K$ units in a day, the next month he makes $K(1 + r)$ units in a day, and at month $\tau + 1$ he makes $K(1 + r)^{\tau}$ units in a day.
(a) Let $V_{i,j}$ be the number of workers hired in month $i$ who are still working in month $j$ ($j > i$). (Assume that workers who are fired and later rehired start from the beginning of the learning curve.) Write the constraints for a production plan that determines how many workers to employ in each period depending on their experience (i.e., hiring and firing decisions depend on the time that a worker started). Explain which workers to fire in priority under such a structure.
(b) Assume that the wages of the workers also depend on their experience (i.e., wages are increasing in the number of months worked). Write the objective function to take into account a $q$% increase in the wage for each month worked.

Online Exam Questions

13. (3 points) Assume that we have observed the following demand over the first five periods: $d(1) = 12$, $d(2) = 18$, $d(3) = 12$, $d(4) = 15$, $d(5) = 17$. What is the forecast for the expected demand in period 6 if a Moving Average forecast with a 5-period window is used?

14. (3 points) Assume that we have observed the following demand over the first five periods: $d(1) = 12$, $d(2) = 18$, $d(3) = 9$, $d(4) = 15$, $d(5) = 11$. What is the forecast for the expected demand in period 6 if an Exponential Smoothing forecast with smoothing constant $\alpha = 0.1$ is used? Assume that the ES forecast for period 5 is $F_5 = 19$.

15. (3 points) Let $D_t = \mu + \epsilon_t$ where the $\epsilon_t$ are independent random variables that are normally distributed with mean zero and standard deviation 7. Consider the following forecast: $F_t = (0.7) D_{t-1} + (1 - 0.7) D_{t-2}$.
What is the variance of the forecast error $e_t$? Note that $e_t = F_t - D_t$.

16. (4 points) Let $D_t = 30 + (5)t + \epsilon_t$ where the $\epsilon_t$ are independent random variables with mean zero and standard deviation $\sigma$. Consider the following forecast: $F_t = D_{t-2} + (2)(5)$. Find the expected value of the forecast $F_{12}$: $E(F_{12})$.

17. (4 points) Let $D_t = a + bt + \epsilon_t$ where the $\epsilon_t$ are independent random variables with mean zero and standard deviation $\sigma$. Consider the following forecast: $F_t = D_{t-1} + (D_{t-1} - D_{t-2})$. Note that this is an unbiased forecast. Assume that the observed demands in the first three periods are $d(1) = 10$, $d(2) = 13$, and $d(3) = 9$. Find a forecast for the demand in period 8 made after observing the demand in the first three periods: $F_{3,8}$.

18. (3 points) Let $D_t = 5 + (40)t + \epsilon_t$ where the $\epsilon_t$ are independent random variables with mean zero and standard deviation $\sigma$. Consider the following transformation: $Z_t = D_t - D_{t-1}$. Assume that we obtain an unbiased forecast $G_t$ for $Z_t$. Find the expected value of $G_t$: $E(G_t)$.

19. (3 points) Let $D_t = 12 + (15)t + \epsilon_t$ where the $\epsilon_t$ are independent random variables with mean zero and standard deviation $\sigma$. Consider the following transformation: $Z_t = D_t - D_{t-1}$. Find the variance of $Z_t$: $\mathrm{Var}(Z_t)$.

20. (4 points) Consider the following Box-Jenkins process: $D_t = 100 + (0.1) D_{t-1} + (0.3) D_{t-2} + \epsilon_t$. Assume that the observed demands in the first two periods are $d(1) = 75$, $d(2) = 70$. Find the forecast for the expected demand in period 3.

21. (4 points) Consider the following Box-Jenkins process: $W_t = 100 + (-0.5) W_{t-1} + \epsilon_t$. Now assume that $D_t = (11)t + W_t$. Note that $D_t$ has a known trend. Assume that we observed the outcomes of $W_t$ in the first two periods as $w(1) = 71$, $w(2) = 60$. Find the forecast for the expected demand in period 4 made after observing demands (and outcomes of $W_t$) in the first two periods: $F_{2,4}$.

22.
(4 points) Assume that we have quarterly demand data and we use the following time series regression as a forecasting model:
$d_t = b_0 + b_1 t + b_2 y_{2t} + b_3 y_{3t} + b_4 y_{4t} + \epsilon_t$
where $y_{it}$ ($i = 2, 3, 4$) are binary (dummy) variables for seasons 2, 3, and 4 ($y_{it} = 1$ if period $t$ is quarter $i$ of a year and $y_{it} = 0$ otherwise). We fit a regular least squares regression and estimate the coefficients as follows: $b_0 = 80$, $b_1 = 6.5$, $b_2 = 10$, $b_3 = 70$, and $b_4 = -20$. Let us assume that all coefficients are statistically significant. Find a forecast for the demand in period 17. (Assume that $t = 1$ corresponds to Q1, $t = 2$ corresponds to Q2, and so on.)

23. (16 points) Let $D_t = a + bt + \epsilon_t$ where the $\epsilon_t$ are independent random variables with mean zero and standard deviation $\sigma$. Consider the following forecast:
$F_t = D_{t-1} + (0.1)(D_{t-1} - D_{t-2}) + (1 - 0.1)(D_{t-2} - D_{t-3})$
(a) (3 points) Assume that the observed demands in the first three periods are $d(1) = 10$, $d(2) = 11$, and $d(3) = 16$. Find the forecast for the expected demand in period 4: $F_4$.
(b) (4 points) Establish whether $F_t$ is an unbiased forecast or not for all $t > 3$.
(c) (5 points) Find the variance of the forecast $F_t$: $\mathrm{Var}(F_t)$.
(d) (4 points) Consider the forecast $G_t = D_{t-2} + 2(D_{t-1} - D_{t-2})$. Find $\mathrm{Var}(G_t)$ and compare with part (c). Explain the difference.

24. (14 points) Assume that we have quarterly demand data and we use the following time series regression as a forecasting model:
$d_t = b_0 + b_1 t + b_2 y_{2t} + b_3 y_{3t} + b_4 y_{4t} + \epsilon_t$
where $y_{it}$ ($i = 2, 3, 4$) are binary (dummy) variables for seasons 2, 3, and 4 ($y_{it} = 1$ if period $t$ is quarter $i$ of a year and $y_{it} = 0$ otherwise).
We fit a regular least squares regression to the first 6 years (24 quarters) of demand observations and obtain the following results (assume that $t = 1$ corresponds to the first quarter of the year):

Estimated Coefficients:
               Estimate    SE         tStat      pValue
(Intercept)    104.29      9.0209     11.561     4.8499e-10
x1             4.7794      0.51429    9.2931     1.6914e-08
x2             17.39       9.9503     1.7477     0.096663
x3             -28.776     9.9901     -2.8804    0.0095816
x4             39.867      10.056     3.9645     0.00083098

Number of observations: 24, Error degrees of freedom: 19
Root Mean Squared Error: 17.2
R-squared: 0.885, Adjusted R-Squared: 0.861
F-statistic vs. constant model: 36.7, p-value = 1.09e-08

(a) (2 points) Which predictors are statistically significant? Explain.
(b) (2 points) What is the forecast for the expected value of demand in quarter 28?
(c) (2 points) Find a 90% prediction interval for the forecast in part (b). Note that $z_{0.95} = 1.64$.
(d) (2 points) What is the expected change in demand between quarter 20 and quarter 28?
(e) (2 points) What is the expected change in demand between quarter 21 and quarter 28?
(f) (2 points) Assume that we also have data on the demand of the competitor's product, which is correlated with our demand. How would the $R^2$ value change if we add the competitor's demand as a new predictor in the above model? Explain.
(g) (2 points) Assume that we have an additional 16 quarters of data that we can use as a test set. We can then test the above regression model on the test set. How would you expect the Root Mean Squared Error to change in the test set? Explain.

25. (10 points) Consider an online retailer that ships healthy meals to customers. Their veggie menu is becoming popular. The demand for this menu for the next 4 days is forecasted as follows:

Day       1    2    3    4
Demand    8    8    32   12

Assume that each veggie menu requires two hours of preparation and packaging time. Assume that the cooks for veggie meals work for eight hours a day (and only cook veggie menus) and the menus can be prepared in advance and kept in inventory.
Assume also that the workforce (number of cooks) must be an integer.
(a) (2 points) What is the number of cooks needed to satisfy the demand on day 1 (disregarding the other days)?
(b) (8 points) What is the minimum constant workforce level (number of cooks employed) if there is no inventory available in the beginning and it is targeted to have no inventories of veggie meals at the end of day 4? You can assume that production can be slowed down to avoid excess inventories in period 4 if required. Show all your work and explain all steps.
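
One way to check an answer to question 25(b) is the cumulative-demand argument: a constant workforce is feasible only if cumulative production capacity covers cumulative demand at the end of every day. The Python sketch below illustrates that check under stated assumptions: no backorders are allowed, each cook prepares 8/2 = 4 menus per day, and production in the last day can be slowed down so that no inventory remains. The function and variable names are hypothetical and are not part of the course material.

```python
import math


def min_constant_workforce(demand, units_per_worker_per_day):
    """Smallest integer workforce whose cumulative capacity covers
    cumulative demand at the end of every day (no backorders assumed)."""
    cumulative_demand = 0
    required = 0
    for day, d in enumerate(demand, start=1):
        cumulative_demand += d
        # Capacity one worker accumulates by the end of this day.
        capacity_per_worker = units_per_worker_per_day * day
        required = max(required, math.ceil(cumulative_demand / capacity_per_worker))
    return required


# Data from question 25.
demand = [8, 8, 32, 12]
hours_per_day = 8
hours_per_menu = 2
menus_per_cook_per_day = hours_per_day // hours_per_menu

workforce = min_constant_workforce(demand, menus_per_cook_per_day)
print("Minimum constant workforce:", workforce)
```

The binding day is the one with the largest ratio of cumulative demand to cumulative per-cook capacity; the written solution should still show that ratio for each day and explain how production is slowed down in the final period.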