DS 533
Fall 2003
Exam # 1
Name: ___________________

Show All Your Work

1. You are the quality control manager in a plant that produces bungee cords. Your responsibility is to oversee the production of the synthetic material in the cord. Specifically, your responsibility is to ensure that bungee cords have the correct elastic qualities to avoid personal injury lawsuits. Your efforts are complicated by the fact that you use two procedures for testing bungee cord elasticity, procedure A and procedure B. Procedure A is generally subject to error, but few of its errors are very large. On the other hand, procedure B is very accurate but subject to large one-time errors. Specifically, forecast errors in evaluating the dynamic cord elasticity per pound of load are presented below for a random sample of four cords.

Procedure A Forecast Errors: .01, -.01, -.02, .02
Procedure B Forecast Errors: .008, -.009, -.008, .03

Using mean absolute error (MAE) and root mean squared error (RMSE), evaluate the relative accuracy of each procedure. Which procedure will you use in quality control testing?

Procedure A

$$\text{MAE} = \frac{1}{n}\sum |y_i - \hat{y}_i| = \frac{.01 + .01 + .02 + .02}{4} = \frac{.06}{4} = .015$$

$$\text{RMSE} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n}} = \sqrt{\frac{.0001 + .0001 + .0004 + .0004}{4}} = \sqrt{\frac{.001}{4}} = .0158$$

Procedure B

$$\text{MAE} = \frac{.008 + .009 + .008 + .03}{4} = \frac{.055}{4} = .01375$$

$$\text{RMSE} = \sqrt{\frac{.001109}{4}} = .01665$$

Under MAE procedure B is superior; under RMSE procedure A is superior. Ultimately the choice depends on the relative costs of large versus small forecast errors underlying the accuracy measures. I suspect that large errors are more costly to the firm than small ones, and accordingly apply the RMSE and conclude that procedure A is superior.

2. A random sample of twelve automobiles showed the following figures for miles achieved on a gallon of gas. Assume the population distribution is normal. From the data:

$$n = 12, \quad \sum_{i=1}^{12} X_i = 232.9, \quad \sum_{i=1}^{12} X_i^2 = 4533.49, \quad \text{and} \quad \sum_{i=1}^{12} (X_i - \bar{X})^2 = 13.2867.$$

(a) Find the average miles per gallon of gas.

$$\bar{x} = \frac{232.9}{12} = 19.41$$

(b) Find the sample variance.
$$s^2 = \frac{13.2867}{11} = 1.21$$

(c) Find an approximate 95% confidence interval for the population mean.

From the t-table for $\alpha = .05$ and $n - 1 = 11$ degrees of freedom, we get $t_{\alpha/2,\, n-1} = t_{.025,\, 11} = 2.201$.

$$\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} = 19.41 \pm 2.201 \cdot \frac{1.099}{\sqrt{12}} = (18.71,\ 20.11)$$

3. Private housing starts (PHS) are considered leading indicators of future economic activity. Using quarterly data on private housing starts over the period Feb. 1980-Feb. 2000, the estimated correlograms are reported below.

[Figure: correlogram of the PHS series, lags 1 to 12, showing the ACF with its upper and lower limits]

a) Based on $\alpha = 5\%$, do you think that the private housing starts data exhibit seasonal variation? Explain your inferences clearly and note what hypotheses you are testing in your answer.

$H_0: \rho_k = 0$
$H_a: \rho_k \neq 0$
$k = 4, 8, 12$

Since $r_4$, $r_8$, and $r_{12}$ are greater than the critical value $2/\sqrt{80} = .224$, reject $H_0$ that $\rho_k = 0$ for $k = 4, 8, 12$. This implies that seasonal variation exists in this time series.

b) To evaluate the possibility of a significant trend in the data, it is recommended to deseasonalize the series first. The correlograms for the deseasonalized PHS series are reported below.

[Figure: correlogram of the deseasonalized PHS series, lags 1 to 12, showing the ACF with its upper and lower limits]

Using $\alpha = 5\%$, can you say that there is strong evidence for a trend in the PHS series? Explain your inferences clearly and note what hypotheses you are testing in your answer.

$H_0: \rho_k = 0$
$H_a: \rho_k \neq 0$
$k = 1, 2, 3, \ldots, 12$

The $r_k$ for $k = 1, 2, 3, \ldots, 7$ decline gradually and are no longer significant after $r_7$ at the 5% level of significance. This implies that there is a trend in this time series.

4. In 1985, the government bond yield in the United States was 10.62 percent. A random sample of government bond yields in nine foreign countries was: 11.04, 6.34, 10.94, 13.00, 7.34, 13.09, 4.78, 10.62, 6.87. The mean foreign bond yield was 9.34 with variance 9.31. Assume that government bond yields are normally distributed.
At the 5 percent level of significance, test whether the government bond yields in the rest of the world during 1985 were lower than in the United States (state the null and alternative hypotheses, evaluate the test statistic, draw the decision criteria or evaluate the P-value, and state your conclusion).

$H_0: \mu \geq 10.62$
$H_a: \mu < 10.62$

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{9.34 - 10.62}{3.05 / \sqrt{9}} = -1.26$$

Decision Criteria: $t^* = -t_{\alpha,\, n-1} = -t_{.05,\, 8} = -1.86$. Reject $H_0$ if $t < -1.86$.

Conclusion: Do not reject $H_0$. Since $t = -1.26$ is not less than $-1.86$, there is insufficient evidence that government bond yields in the rest of the world were lower than the government bond yield in the U.S. during 1985.

Multiple Choice
Select the best answer

1. Which measure of forecast accuracy is analogous to standard deviation?
A) Mean Absolute Error.
B) Mean Absolute Percentage Error.
C) Mean Squared Error.
D) Root Mean Squared Error. **

2. Which of the following measures is a poor indicator of forecast accuracy, but useful in determining the direction of bias in a forecasting model?
A) Mean Absolute Percentage Error.
B) Mean Percentage Error. **
C) Mean Squared Error.
D) Root Mean Squared Error.
E) None of the above.

3. Which of the following is incorrect? Evaluation of forecast accuracy
A) is important since the production of forecasts is costly to the firm.
B) requires the use of symmetric error cost functions.
C) is important since it may reduce business losses from inaccurate forecasts.
D) is done by averaging forecast errors.
E) both b) and d) are incorrect. **
F) both a) and b) are incorrect.

4. Measures of forecast accuracy based upon a quadratic error cost function, notably root mean square error (RMSE), tend to treat
A) levels of large and small forecast errors equally.
B) large and small forecast errors equally on the margin.
C) large and small forecast errors unequally on the margin. **
D) every forecast error with the same penalty.
E) None of the above.

5. You are given a time series of sales data with 10 observations.
You construct forecasts according to last period's actual level of sales plus the most recent observed change in sales. How many data points will be lost in the forecast process relative to the original data series?
A) One.
B) Two. **
C) Three.
D) Zero.
E) None of the above.

6. Forecasts based solely on the most recent observation(s) of the variable of interest
A) are called "naive" forecasts.
B) are the simplest of all quantitative forecasting methods.
C) lead to the loss of one data point in the forecast series relative to the original series.
D) are consistent with the "random walk" hypothesis in finance, which states that the optimal forecast of today's stock rate of return is yesterday's actual rate of return.
E) All the above. **

7. The sampling distribution of the sample mean is
A) normally distributed with mean $\mu$ and variance $\sigma^2$.
B) normally distributed with mean $\mu$ and variance $s^2$.
C) distributed as a t distribution with variance 1.
D) normally distributed with mean 0 and variance 1.
E) None of the above. **

8. An unbiased model
A) is one that does not consistently over-estimate or under-estimate the true value of a parameter. **
B) is one that consistently produces estimates with the smallest RMSE.
C) is one that contains no independent variable; it depends solely on time-series pattern recognition.
D) is one made up by a team of forecasters.

9. When testing the null hypothesis that the population correlation between a pair of variables is zero
A) the normal sampling distribution is used.
B) the chi-square distribution is used.
C) the standard normal distribution is used.
D) the t distribution is used for small samples. **

10. Which of the following is not consistent with the presence of a trend in a time series?
A) The autocorrelation function declines quickly to zero as the lag increases. **
B) The autocorrelation function of the first-differences declines quickly to zero as the lag increases.
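C) The autocorrelation function declines slowly towards zero as the lag increases.
D) The autocorrelation function of the first-differences quickly declines to zero.

11. Autocorrelation refers to the correlation between a variable and:
A) itself.
B) another very similar variable.
C) itself when lagged one or more periods. **
D) another variable when the analysis is done on a computer.
E) None of the above.

12. Stationarity refers to
A) the size of the RMSE of a forecasting model.
B) the size of variances of the model's estimates.
C) a method of forecast optimization.
D) lack of trend in a given time series. **
E) None of the above.

Formulas

Mean Absolute Error

$$\text{MAE} = \frac{1}{n}\sum_{t=1}^{n} |y_t - \hat{y}_t|$$

Mean Squared Error

$$\text{MSE} = \frac{1}{n}\sum_{t=1}^{n} (y_t - \hat{y}_t)^2$$

Mean Absolute Percentage Error

$$\text{MAPE} = \frac{1}{n}\sum_{t=1}^{n} \frac{|y_t - \hat{y}_t|}{y_t}$$

Mean Percentage Error

$$\text{MPE} = \frac{1}{n}\sum_{t=1}^{n} \frac{(y_t - \hat{y}_t)}{y_t}$$

Root Mean Square Error

$$\text{RMSE} = \sqrt{\frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{n}}$$

$$\hat{y}_t = y_{t-1}$$

$$\hat{y}_t = y_{t-1} + P(y_{t-1} - y_{t-2})$$

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}$$

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

Standard Normal Test

$$z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$

Confidence Interval

$$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

T-test

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Confidence Interval

$$\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$$

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\ \sqrt{n\sum Y^2 - (\sum Y)^2}}$$

$$r_k = \frac{\sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2}$$

$$t = \frac{r - 0}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

$$t = \frac{r_k - 0}{1 / \sqrt{n - k}}$$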
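
The accuracy measures, the t-test, the confidence interval, and the lag-k autocorrelation listed in the formula sheet can be checked numerically against the worked answers to Problems 1, 2, and 4. The following is a minimal Python sketch, not part of the exam. The function names (`mae`, `rmse`, `t_statistic`, `t_interval`, `autocorr`) are hypothetical. The critical value 2.201 is copied from the worked answer rather than computed, so no statistics library is assumed.

```python
import math


def mae(errors):
    """Mean absolute error of a list of forecast errors (y_t - yhat_t)."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / math.sqrt(n))


def t_interval(sample_mean, s, n, t_crit):
    """Confidence interval xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def autocorr(y, k):
    """Lag-k autocorrelation r_k as defined in the formula sheet."""
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, n))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den


# Problem 1: forecast errors for the two test procedures.
errors_a = [0.01, -0.01, -0.02, 0.02]
errors_b = [0.008, -0.009, -0.008, 0.03]
mae_a, rmse_a = mae(errors_a), rmse(errors_a)
mae_b, rmse_b = mae(errors_b), rmse(errors_b)

# Problem 2: 95% interval, with t_{.025, 11} = 2.201 taken from the answer key.
ci_low, ci_high = t_interval(19.41, 1.099, 12, 2.201)

# Problem 4: test statistic for the foreign bond yields.
t_bond = t_statistic(9.34, 10.62, 3.05, 9)

# Problem 3: approximate 5% significance bound for r_k with n = 80 quarters.
acf_bound = 2 / math.sqrt(80)

print(f"A: MAE={mae_a:.5f} RMSE={rmse_a:.5f}")
print(f"B: MAE={mae_b:.5f} RMSE={rmse_b:.5f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"t = {t_bond:.2f}, ACF bound = {acf_bound:.3f}")
```

Up to rounding, the computed values agree with the worked answers: procedure B has the smaller MAE, procedure A has the smaller RMSE, and the bond-yield statistic stays above the rejection cutoff of -1.86.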