252solnK2 11/26/07 (Open this document in 'Page Layout' view!)

K. REGRESSION EXTENSIONS

1. Residual Analysis
   Text 13.23, 13.24, 13.26, 14.18 [13.20, 13.21, 13.22, 14.9] (13.20, 13.21, 13.22, 14.9)
2. Dummy Variables
   14.38-14.34, 14.41 [14.33 – 14.35] (15.6 – 15.8)
3. Nonlinear regression
   15.1, 15.6, 15.7 [15.1, 15.6, 15.7] (15.1, 15.13, 15.14)
4. Runs test
   K1
5. Durbin-Watson test
   13.32-13.34 [13.28, 13.29, 13.30] (13.28, 13.29, 13.30)

Solutions to sections 4 and 5 are in this document.
----------------------------------------------------------------------------------------------

Runs test

Problem K1: (Levin and Rubin, modified)
a. Men and women are admitted to a training program in the following order. Is it random? MWWMMMMWWWWMMWMWWM.
b. What about WMWMWMWMWMWMWMWWMM?
c. A professor hypothesizes that the more able students will tend to turn their exams in either earlier or later than the majority and that less able students are more apt to turn exams in after an average amount of time. If we only count grades above 89 as high, test the following sequence for randomness.
94 70 85 89 92 98 63 88 74 85 69 90 57 86 79 72 80 93 66 74 50 55 47 59 68 63 89 51 90 88

Solution: $H_0$: Randomness. $\alpha = .05$.

a. There are $n_1 = 9$ women, $n_2 = 9$ men and $r = 9$ runs. From Table 4a, reject $H_0$ if $r \le 5$, and from Table 4b, reject $H_0$ if $r \ge 16$. Since 9 lies between these limits, do not reject $H_0$.

b. There are $n_1 = 9$ women, $n_2 = 9$ men and $r = 16$ runs. From Table 4a, reject $H_0$ if $r \le 5$, and from Table 4b, reject $H_0$ if $r \ge 16$. Reject $H_0$.

c. We can replace these by H for high and L for low. The sequence is shown below.
HLLLHHLLLL LHLLLLLHLL LLLLLLLLHL
There are $n_1 = 6$ Hs, $n_2 = 24$ Ls and $r = 10$ runs. This is out of the range of the table, so use $z = \frac{r - \mu}{\sigma}$, where
$\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(6)(24)}{6 + 24} + 1 = 10.6$
and
$\sigma^2 = \frac{(\mu - 1)(\mu - 2)}{n_1 + n_2 - 1} = \frac{(10.6 - 1)(10.6 - 2)}{6 + 24 - 1} = 2.8469$,
so that $\sigma = \sqrt{2.8469} = 1.6872$ and $z = \frac{10 - 10.6}{1.6872} = -0.356$. Since $\alpha = .05$, we do not reject $H_0$ if $z$ is between -1.960 and +1.960. In this case we do not reject $H_0$.

Durbin-Watson test

Exercise 13.32 [13.28 in 8th]: We have residuals for 10 periods of {-5, -4, -3, -2, -1, 1, 2, 3, 4, 5}. a) Plot – is there a pattern? b) Is there a conclusion about autocorrelation?

Solution: (a) You don’t need a plot – these suckers are taking off. An increasing linear relationship exists.
(b) The presence of similar values before each value indicates high positive autocorrelation. D = 0.109 (don’t worry about the computation). This is very far from 2 and looks very suspicious.
(c) There really aren’t enough numbers here to use the table. The critical values of the Durbin-Watson statistic for $n = 15$ and $k = 1$ are $d_L = 1.08$ and $d_U = 1.36$. The numbers we want must be closer to 1 and 1.3. The positive autocorrelation test is given by the diagram below.

   0 ---- ρ > 0 ---- d_L ---- ? ---- d_U ---- ρ = 0 ---- 2
   (ρ > 0: positive autocorrelation; ?: inconclusive; ρ = 0: no evidence of positive autocorrelation)

If we stick in 1 and 1.3, we get

   0 ---- ρ > 0 ---- 1 ---- ? ---- 1.3 ---- ρ = 0 ---- 2

Since D = 0.109 is obviously in the bottom of the ρ > 0 region, there is enough evidence to conclude that there is strong positive autocorrelation among the residuals.

Exercise 13.33 [13.29 in 8th]: We have residuals for 15 periods of {4, -6, -1, -5, 2, 5, -2, 7, 6, -3, 1, 3, 0, 4, -7}. a) Plot – is there a pattern? b) Is there a conclusion about autocorrelation? Use DW.

Solution: (a) There is no apparent pattern in the residuals over time.
(b) I’m going to pretend that the word ‘positive’ did not appear in this problem. D = 1.661. (Working directly from the fifteen residuals as listed here gives D = 577/280 ≈ 2.061, which leads to the same conclusion.) The critical values of the Durbin-Watson statistic for $n = 15$ and $k = 1$ are $d_L = 1.08$ and $d_U = 1.36$.
(c) The diagram is below.

   0 -- ρ > 0 -- d_L -- ? -- d_U -- ρ = 0 -- 2 -- ρ = 0 -- 4 - d_U -- ? -- 4 - d_L -- ρ < 0 -- 4

If we use the values we have, it becomes

   0 -- ρ > 0 -- 1.08 -- ? -- 1.36 -- ρ = 0 -- 2 -- ρ = 0 -- 2.64 -- ? -- 2.92 -- ρ < 0 -- 4

D = 1.661 falls into the ρ = 0 region.
There is no evidence of positive autocorrelation among the residuals. The data are not autocorrelated.

Exercise 13.34 [13.30 in 9th]: a) In PETFOOD do we need to calculate DW? b) When do we need DW?

Solution: (a) No, since the data have been collected for a single period for a set of stores.
(b) If a single store were studied over a period of time and the amount of shelf space varied over time, computation of the Durbin-Watson statistic would be necessary.

The following problems are worth looking at, even if you don’t have the data.
----------------------------------------------------------------------------------------------
Exercise 13.47 (McClave et al.): In this problem we are faced with a picture of residuals cycling around the x axis. The D-W statistic is 0.6292. It is a simple regression, so $k = 1$ is the number of independent variables. Also the total degrees of freedom are 38, so there must be $n = 39$ observations. We are told that $\alpha = .10$.

a) Even a brief look at the plot on page 802 indicates that positive residuals are generally followed by positive residuals, while negative residuals are followed by negative residuals. If we represent the regression line as a line with a slight negative slope relative to time, the actual values of y seem to be cycling around it. This may indicate that inflation is a relatively stable process: when it becomes faster than the trend, people will eventually cut back on buying and cause a lower rate, while slower than anticipated inflation could eventually bring about higher spending.

b) The D-W statistic is 0.6292. This is a test for randomness of the regression residuals $e_t$. The alternative hypothesis is (first-order) autocorrelation. The computer will print the residuals $e_t$ and compute the Durbin-Watson statistic,
$DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$.
Use a Durbin-Watson table to fill in the diagram below.

   0 -- ρ > 0 -- d_L -- ? -- d_U -- ρ = 0 -- 2 -- ρ = 0 -- 4 - d_U -- ? -- 4 - d_L -- ρ < 0 -- 4

To find $d_L$ and $d_U$, remember that $k = 1$ and $n = 39$. We are told that $\alpha = .10$, so that for a 2-sided test $\alpha/2 = .05$. The D-W table is in the text appendix on pages 1001-1002, and we find $d_L = 1.43$ and $d_U = 1.54$, so that $4 - d_U = 4 - 1.54 = 2.46$ and $4 - d_L = 4 - 1.43 = 2.57$. Our diagram is now

   0 -- ρ > 0 -- 1.43 -- ? -- 1.54 -- ρ = 0 -- 2 -- ρ = 0 -- 2.46 -- ? -- 2.57 -- ρ < 0 -- 4

Since .6292 falls in the lower rejection region (below $d_L = 1.43$), we reject the null hypothesis of no autocorrelation and conclude that the residuals are autocorrelated and thus somewhat predictable.

Exercise 13.44 (McClave et al.): This is here just to remind you of the definition of autocorrelation. First-order autocorrelation indicates that adjacent residuals (for example, those at time 20 and 21) are correlated. At the very least this means that either each residual is very frequently followed by a residual of the same sign (positive autocorrelation) or each residual is frequently followed by a residual of the opposite sign (negative autocorrelation).
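If you would rather not take D = 0.109 in Exercise 13.32 on faith, the following is a minimal sketch of the Durbin-Watson computation (sum of squared successive differences over sum of squared residuals) together with a lookup on the two-sided diagram. The function names `durbin_watson` and `dw_region` are made up for this illustration, and the treatment of a value that lands exactly on a boundary is an assumption, since the tables do not settle it. The critical values are the ones quoted above from the tables, not computed.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_region(d: float, d_lower: float, d_upper: float) -> str:
    """Locate d on the diagram 0, dL, dU, 2, 4-dU, 4-dL, 4.

    Boundary handling (<= versus <) is an assumption made for this sketch.
    """
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no evidence of autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"


# Exercise 13.32: ten residuals, table values for n = 15, k = 1 as in the text.
resid_13_32 = [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
d_13_32 = durbin_watson(resid_13_32)
print(round(d_13_32, 3))               # 12 / 110, i.e. 0.109
print(dw_region(d_13_32, 1.08, 1.36))  # positive autocorrelation

# Exercise 13.47: statistic given in the problem, table values for n = 39, k = 1.
print(dw_region(0.6292, 1.43, 1.54))   # positive autocorrelation
```

For a one-sided test for positive autocorrelation, only the left half of the diagram (0 to 2) matters, so the last two branches of `dw_region` are never reached in that case.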
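Going back to Problem K1, the run counts and the normal approximation in part c can be checked the same way. This is only a sketch: `count_runs` and `runs_z` are names invented here, and the variance is written in the same form as in the solution, (μ - 1)(μ - 2)/(n1 + n2 - 1). The small-sample decisions in parts a and b still need the tables; the code only counts the runs.

```python
import math


def count_runs(sequence: str) -> int:
    """Number of maximal blocks of identical adjacent symbols."""
    if not sequence:
        return 0
    return 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)


def runs_z(n1: int, n2: int, r: int) -> float:
    """Large-sample z statistic for the runs test."""
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (mu - 1) * (mu - 2) / (n1 + n2 - 1)
    return (r - mu) / math.sqrt(variance)


# Parts a and b: count the runs in the two admission sequences.
print(count_runs("MWWMMMMWWWWMMWMWWM"))  # 9
print(count_runs("WMWMWMWMWMWMWMWWMM"))  # 16

# Part c: grades above 89 are H, everything else is L.
grades = [94, 70, 85, 89, 92, 98, 63, 88, 74, 85,
          69, 90, 57, 86, 79, 72, 80, 93, 66, 74,
          50, 55, 47, 59, 68, 63, 89, 51, 90, 88]
labels = "".join("H" if g > 89 else "L" for g in grades)
n_high = labels.count("H")
n_low = labels.count("L")
r = count_runs(labels)
z = runs_z(n_high, n_low, r)
print(n_high, n_low, r, round(z, 3))     # 6 24 10 -0.356
print(abs(z) < 1.960)                    # True, so do not reject randomness
```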