C22.0015 Information for Final Exam 2011.MAY.04

The problems below are taken from previously-used exams, or were considered for exams. Solutions follow the problems.

1. In the following analysis of variance table, there are some missing entries. Fill in those entries and be sure to leave empty those positions that are left empty by convention.

   SOURCE OF     DEGREES OF     SUM        MEAN
   VARIATION     FREEDOM        SQUARES    SQUARES     F
   Regression     3                        45.00       5.00
   Error
   Total         23

2. Suppose that you have three random variables X1, X2, X3, independent and normally distributed. It is known that the standard deviation is σ = 10 for each value, but the means μ1, μ2, and μ3 are not known. Find the likelihood ratio test criterion Λ for H0: μ1 = μ2 versus HA: μ1 ≠ μ2. Show that −2 log Λ is exactly chi-squared on one degree of freedom.

3. Consider the regression model Yi = β xi + εi, where the xi's are known nonrandom quantities and the εi's are independent normal N(0, σ²). Suppose that the value of σ is known.
(a) Find the maximum likelihood estimate of β.
(b) Find Fisher's information and thus give the limiting variance of the maximum likelihood estimate.
(c) Find the exact variance of your estimate from (a).
(d) If (b) and (c) do not produce exactly the same variance, show how these two solutions are reconciled as n → ∞.

4. There are two independent samples of data from normal populations. We have

   X1, X2, …, Xm as a sample from N(μ, σ²)
   Y1, Y2, …, Yn as a sample from N(ν, τ²)

Observe that the sample sizes are m and n, which are in general not equal. The standard deviations are, of course, σ and τ. This problem asks about some of the steps needed to do the likelihood ratio test of H0: σ² = τ². (The final form of the test is not part of the problem, however.)
(a) Write the likelihood for this problem.
(b) Give the maximum likelihood estimates for μ, σ², ν, and τ². (If you remember these, you can simply state the result without any derivation.)
(c) Write the maximized likelihood. That is, substitute your answers in (b) into the likelihood of (a).
(d) What would be the maximum likelihood estimate of σ² under H0: σ² = τ²? (Obviously this is also the maximum likelihood estimate of τ².)

5. A certain biochemical measurement is used to determine whether or not a person has been exposed to a certain disease. If the person has been exposed to the disease, then the measurement (call it X) follows a normal distribution with μ = 4.0 and σ = 1.2. If the person has not been exposed, then X follows a normal distribution with μ = 2.2 and σ = 1.2. Since the only difference is in the mean, the problem has been set up as a hypothesis test of H0: μ = 4.0 versus HA: μ = 2.2. (For medical considerations the "exposed" version has been set up as the null hypothesis.)
(a) Based on a single observation X, give the details of the best test at level α = 0.01. (Here "details" means that you should give cutoff points as numbers.)
(b) For the rule in (a), give the probability of type II error.
(c) This is somewhat harder. You'll find that the solution to (b) is not very satisfying. Suppose that you could take a whole sample of independent values X1, X2, ..., Xn and base the procedure on X̄. What is the smallest sample size n for which you would be able to get the probability of type II error at or below 0.01 (while still keeping α = 0.01)?

6. The model for Poisson regression is that x1, x2, ..., xn are n positive known values, and Y1, Y2, ..., Yn are n independent random variables, with Yi ~ Poisson(λxi).
(a) Find the least squares estimate of λ. HINT: E Yi = λxi.
(b) Find the maximum likelihood estimate of λ.

7. The random variable U takes its values in the set {0, 1, 2, 3, 4, 5, 6}. Two possible probability laws are under consideration.
   u      0      1      2      3      4      5      6
   f0   1/16   2/16   3/16   4/16   3/16   2/16   1/16
   f1    1/7    1/7    1/7    1/7    1/7    1/7    1/7

(a) Based on one observation of U, you wish to test H0: f0 versus HA: f1. What is the best test of size α = 1/8?
(b) For the test that you found in (a), give the power, meaning 1 − P(Type II error).
(c) Find the best test of size α = 3/8.
(d) For the test in (c), give the power.

8. Suppose that you have exactly one observation X on a random variable which has the triangular distribution on [0, θ]. The density is f(x | θ) = (2(θ − x)/θ²) I(0 ≤ x ≤ θ). You want to test H0: θ = 10 versus H1: θ = 12 at significance level α = 0.05. Give the details of the best test.

9. Suppose that you have exactly one observation X on a random variable which is uniform on the interval [0, θ]. You want to test H0: θ = 10 versus H1: θ = 12 at significance level α = 0.05. Give the details of the best test.

10. You have a sample X1, X2, …, Xn from the normal probability density N(θ, θ²). Note that the mean and standard deviation are the same. Assume that θ > 0. Find the maximum likelihood estimate of θ.

11. Consider the regression of Y on X1, X2, X3, X4, and X5. The units on the variables are these:

   Y    hours
   X1   inches
   X2   (none)
   X3   weeks
   X4   hours
   X5   dollars

Here X2 has no units; its values are pure numbers.

Give the units for each of the following statistical quantities.

   (a) b1                       (f) ei (a residual)
   (b) b2                       (g) ri (a Studentized residual)
   (c) b4                       (h) F (the F statistic)
   (d) s                        (i) SSE (error sum of squares)
   (e) R²                       (j) det(X′X)

12. Consider the regression model noted as

   Y = X β + ε,  where Y is n×1, X is n×p, β is p×1, and ε is n×1.

Two people are making independent attempts to analyze this model. Andrea assumes that Var ε = σ² I and thus uses the ordinary least squares estimate bLS = (X′X)⁻¹X′Y. Barbara assumes that Var ε = σ² V, where V is a specified (known) n-by-n matrix, and she uses the generalized least squares estimate given by bV = (X′V⁻¹X)⁻¹X′V⁻¹Y.
(a) Suppose that Andrea is correct and that Var ε really is σ² I.
Exactly what do we mean when we say that bLS is better than bV?
(b) Again assuming that Andrea's assumption is correct, what is Var(bV)?
(c) Suppose that Barbara is really correct in that Var ε = σ² V. Find Var(bV) in this case.

13. This problem concerns the regression given by the model

   Yi = β0 + βA Ai + βB Bi + βC Ci + βD Di + εi

(a) Compare the R² statistic for the model to the R² statistic after omitting variable B.
(b) How would you test the null hypothesis H0: βD = 0?
(c) How would you test the combined null hypothesis H0: βC = 0 and βD = 0?
(d) How would you test the null hypothesis H0: βC = βD? (Assume that variables C and D are in the same units.)

14. You have data (x1, y1), …, (x24, y24), and you have done the regression of Y on X. This resulted in the fitted regression equation Y = 13.2 + 4.8 x. The standard error of regression was reported as s = 2.2. The estimated variance matrix for (b0, b1)′ was given as

   [ 16      −4   ]
   [ −4      2.25 ]

Give a 95% confidence interval for the linear combination 2β0 + 3β1.

15. The multiple regression model is Yi = β0 + β1 xi1 + β2 xi2 + … + βk xik + εi for i = 1, 2, …, n. It is assumed that the xij's are all nonrandom and that ε1, ε2, …, εn is a sample from a normal N(0, σ²) population. What are the sufficient statistics for this model?

16. This problem considers the regression of YIELD on the predictors ARCH, BERRY, CRUST, DEPLETE, and EDGAR. There is some concern as to whether we can accept the hypothesis H0: βARCH = 0, βBERRY = 0, βCRUST = 0. There is a separate question as to whether we can accept H0: βDEPLETE = 0, βEDGAR = 0. Observe that each of these hypotheses involves more than a single β.
The regression of YIELD on all five variables yielded this output:

The regression equation is
YIELD = 734 + 1.36 ARCH - 3.26 BERRY - 0.050 CRUST + 0.0123 DEPLETE - 0.258 EDGAR

Analysis of Variance
Source          DF        SS        MS       F       P
Regression       5   20540.6    4108.1    6.60   0.000
Error           76   47297.3     622.3
Total           81   67837.9

The regression of YIELD on ARCH, BERRY, and CRUST gave this:

The regression equation is
YIELD = 720 + 1.10 ARCH - 3.34 BERRY - 0.004 CRUST

Analysis of Variance
Source          DF        SS        MS       F       P
Regression       3   19948.5    6649.5   10.83   0.000
Error           78   47889.4     614.0
Total           81   67837.9

The regression of YIELD on DEPLETE and EDGAR gave this:

The regression equation is
YIELD = 598 - 0.0531 DEPLETE + 0.611 EDGAR

Analysis of Variance
Source          DF        SS        MS       F       P
Regression       2    6478.0    3239.0    4.17   0.019
Error           79   61359.9     776.7
Total           81   67837.9

(a) Based on these findings give the test for H0: βARCH = 0, βBERRY = 0, βCRUST = 0 and indicate whether we would accept or reject H0 at the 0.05 level of significance.
(b) Based on these findings give the test for H0: βDEPLETE = 0, βEDGAR = 0 and indicate whether we would accept or reject H0 at the 0.05 level of significance.

Solutions begin below.

SOLUTIONS

1. The completed table is this:

   SOURCE OF     DEGREES OF     SUM        MEAN
   VARIATION     FREEDOM        SQUARES    SQUARES     F
   Regression     3             135.00     45.00       5.00
   Error         20             180.00      9.00
   Total         23             315.00

2. It is tempting to drop X3 from the problem because it is irrelevant to the hypothesis. In this case, it's a legitimate decision, but we'll show the results while keeping X3. The likelihood is

   L = ∏(i=1..3) (1/(10√(2π))) exp( −(xi − μi)²/200 )

Under the alternative HA, the best (maximum likelihood) estimate for each μi is xi, so μ̂1 = x1, μ̂2 = x2, and μ̂3 = x3. The exponential part of the maximized likelihood will drop out completely, and the maximized likelihood is

   LA,max = ∏(i=1..3) 1/(10√(2π)) = 1/(1,000 (2π)^(3/2))
Under H0, you'll find μ̂1 = (x1 + x2)/2, and similarly μ̂2 = (x1 + x2)/2. Of course μ̂3 = x3, so that x3 − μ̂3 = 0. Observe then that

   ( x1 − (x1 + x2)/2 )² = ( (x1 − x2)/2 )² = (x1 − x2)²/4

and in the same way ( x2 − (x1 + x2)/2 )² = (x1 − x2)²/4. This results in a maximized likelihood under H0 which is

   L0,max = (1/(1,000 (2π)^(3/2))) exp( −(1/200) [ (x1 − x2)²/4 + (x1 − x2)²/4 + 0 ] )
          = (1/(1,000 (2π)^(3/2))) exp( −(x1 − x2)²/400 )

It follows that the likelihood ratio test statistic is

   Λ = L0,max / LA,max = exp( −(x1 − x2)²/400 )

It follows immediately that −2 log Λ = (x1 − x2)²/200. In random variable form, this is (X1 − X2)²/200. However, under H0, X1 − X2 is N(0, 2×100), and so it follows that −2 log Λ ~ χ²(1) exactly.

3. We must begin by writing the likelihood. This is

   L = ∏(i=1..n) (1/(σ√(2π))) exp( −(yi − βxi)²/(2σ²) ) = (1/(σ^n (2π)^(n/2))) exp( −(1/(2σ²)) Σ (yi − βxi)² )

For part (a), we need the maximum likelihood estimate. This is done in the usual way.

   log L = −n log σ − (n/2) log 2π − (1/(2σ²)) Σ (yi − βxi)²

   ∂/∂β log L = (1/σ²) Σ xi (yi − βxi)   set = 0

This condition can be written as β Σ xi² = Σ xi yi. The solution is then

   β̂ML = Σ xi Yi / Σ xi²,  in random variable form.

For part (b), we need Fisher's information. There are several ways to do this, but probably the easiest is to exploit the second derivative.

   I(β) = −E[ ∂²/∂β² log L ] = −E[ ∂/∂β ( (1/σ²) Σ xi (yi − βxi) ) ] = −E[ −(1/σ²) Σ xi² ] = (Σ xi²)/σ²

It follows that the limiting variance of the maximum likelihood estimate is the reciprocal of this, namely σ²/Σ xi².

For part (c) we'll find the exact variance of the maximum likelihood estimate. This is

   Var( Σ xi Yi / Σ xi² ) = (1/(Σ xi²)²) Var( Σ xi Yi ) = (1/(Σ xi²)²) Σ xi² Var(Yi) = (σ² Σ xi²)/(Σ xi²)² = σ²/Σ xi²

Finally, for part (d) we note that the exact and asymptotic variances are exactly the same. Thus, there is no need to consider what happens as n → ∞.

4.
The likelihood is

   L = ∏(i=1..m) (1/(σ√(2π))) exp( −(xi − μ)²/(2σ²) ) × ∏(j=1..n) (1/(τ√(2π))) exp( −(yj − ν)²/(2τ²) )
     = (1/(σ^m (2π)^(m/2))) exp( −(1/(2σ²)) Σ (xi − μ)² ) × (1/(τ^n (2π)^(n/2))) exp( −(1/(2τ²)) Σ (yj − ν)² )

For part (b), the maximum likelihood estimates from the X's are μ̂ = X̄ and σ̂² = (1/m) Σ (Xi − X̄)²; these are given in random variable form. From the Y's, the maximum likelihood estimates are ν̂ = Ȳ and τ̂² = (1/n) Σ (Yj − Ȳ)².

The maximized likelihood (call it LA,max) is

   LA,max = (1/(σ̂^m (2π)^(m/2))) exp( −(1/(2σ̂²)) Σ (xi − x̄)² ) × (1/(τ̂^n (2π)^(n/2))) exp( −(1/(2τ̂²)) Σ (yj − ȳ)² )
          = (1/(σ̂^m (2π)^(m/2))) e^(−m/2) × (1/(τ̂^n (2π)^(n/2))) e^(−n/2)
          = (1/(σ̂^m τ̂^n (2π)^((m+n)/2))) e^(−(m+n)/2)

For part (d), we've got to consider the likelihood with σ² = τ². This is

   L0 = (1/(σ^(m+n) (2π)^((m+n)/2))) exp( −(1/(2σ²)) [ Σ (xi − μ)² + Σ (yj − ν)² ] )

Now, in finding the maximum likelihood estimate, we'll still have to have μ̂ = X̄ and ν̂ = Ȳ. The likelihood at this intermediate stage of maximization is

   L0(intermediate) = (1/(σ^(m+n) (2π)^((m+n)/2))) exp( −(1/(2σ²)) [ Σ (xi − x̄)² + Σ (yj − ȳ)² ] )

To get the maximum likelihood estimate for σ (or σ²), let's take the logarithm.

   log L0(intermediate) = −(m + n) log σ − ((m + n)/2) log 2π − (1/(2σ²)) [ Σ (xi − x̄)² + Σ (yj − ȳ)² ]

Then the derivative:

   ∂/∂σ log L0(intermediate) = −(m + n)/σ + (1/σ³) [ Σ (xi − x̄)² + Σ (yj − ȳ)² ]   set = 0

The solution is

   σ̂0² = (1/(m + n)) [ Σ (xi − x̄)² + Σ (yj − ȳ)² ]

The details of the likelihood ratio test would be straightforward at this point, though messy.

5. The best test will be Neyman-Pearson, and H0 will be rejected for

   [ (1/(1.2√(2π))) exp( −(x − 4.0)²/(2 × 1.2²) ) ] / [ (1/(1.2√(2π))) exp( −(x − 2.2)²/(2 × 1.2²) ) ] ≤ k

This quickly reduces to rejecting H0 if X < k (for some k). This form of the rule could be deemed obvious just from the statement of the problem. The relevant detail comes in the selection of k so that the level of the test is 0.01.
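That selection is carried out with normal tables below; it can also be checked numerically. A minimal sketch in Python, using math.erf for the standard normal CDF and a simple bisection for the quantile (both are implementation choices for the check, not part of the solution):

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_quantile(p, lo=-10.0, hi=10.0):
    # Invert phi by bisection; phi is increasing on [lo, hi].
    for _ in range(200):
        mid = (lo + hi) / 2
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = z_quantile(0.01)                 # about -2.3263
k = 4.0 + 1.2 * z                    # cutoff for the level-0.01 test
beta = 1.0 - phi((k - 2.2) / 1.2)    # P(type II error) at mu = 2.2
print(round(z, 4), round(k, 3), round(beta, 3))
```

The bisection gives −2.3263, slightly more precise than the table value −2.33 used below, which is why this check produces k ≈ 1.208 rather than 1.204; the type II error probability is essentially the same either way.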
We set up the following condition:

   α = P[ X < k | μ = 4.0 ] = P[ (X − 4.0)/1.2 < (k − 4.0)/1.2 ] = P[ Z < (k − 4.0)/1.2 ],  want = 0.01

The normal table gives P[ Z < −2.33 ] = 0.01, so we solve (k − 4.0)/1.2 = −2.33, for which the solution is k = 1.204. This takes care of (a). We note that the cutoff k is not between the null and alternative values (4.0 and 2.2), so we expect very high probability of Type II error.

For (b), we need the probability of type II error. This is found as

   P[ X ≥ 1.204 | μ = 2.2 ] = P[ (X − 2.2)/1.2 ≥ (1.204 − 2.2)/1.2 ] = P[ Z ≥ −0.83 ] = 0.7967 ≈ 80%

For part (c), you know that the rule will be to reject H0 if X̄ < k. You also know that SD(X̄) = 1.2/√n. Now find

   α = P[ X̄ < k | μ = 4.0 ] = P[ (X̄ − 4.0)/(1.2/√n) < (k − 4.0)/(1.2/√n) ] = P[ Z < √n (k − 4.0)/1.2 ],  want = 0.01

As P[ Z ≤ −2.33 ] = 0.01, we solve √n (k − 4.0)/1.2 = −2.33, giving k = 4.0 − 2.796/√n. We will now use this k in dealing with the type II error probability.

   P[ X̄ ≥ 4.0 − 2.796/√n | μ = 2.2 ] = P[ (X̄ − 2.2)/(1.2/√n) ≥ (4.0 − 2.796/√n − 2.2)/(1.2/√n) ]
      = P[ Z ≥ √n (1.8 − 2.796/√n)/1.2 ] = P[ Z ≥ (1.8√n − 2.796)/1.2 ],  want ≤ 0.01

As P[ Z ≥ 2.33 ] = 0.01, we solve (1.8√n − 2.796)/1.2 ≥ 2.33. The solution is √n ≥ 3.107. The objective can be reached if n ≥ (3.107)² ≈ 9.65. Thus we would need at least 10 observations in order to make both misclassification probabilities 1% or less. This is not a massive sample size. Since the null and alternative values are (4.0 − 2.2)/1.2 = 1.5 standard deviations apart, we do not need a lot of data to decide between H0 and H1.

6. The least squares criterion is simply Q = Σ (yi − λxi)². Routine differentiation will lead to

   λ̂LS = Σ xi yi / Σ xi²

To obtain the maximum likelihood estimate, we start naturally with the likelihood:

   L = ∏(i=1..n) e^(−λxi) (λxi)^(yi) / yi! = e^(−λ Σ xi) λ^(Σ yi) ( ∏ xi^(yi) ) / ( ∏ yi! )

Then we find

   log L = −λ Σ xi + (Σ yi) log λ + Σ yi log xi − Σ log yi!

Then

   ∂/∂λ log L = −Σ xi + (Σ yi)/λ   set = 0

This solves as λ̂ML = Σ yi / Σ xi = ȳ/x̄.

7. The best tests are based on the likelihood ratio. This table gives the relevant information.

   u        0      1      2      3      4      5      6
   f0     1/16   2/16   3/16   4/16   3/16   2/16   1/16
   f1      1/7    1/7    1/7    1/7    1/7    1/7    1/7
   f0/f1  7/16  14/16  21/16  28/16  21/16  14/16   7/16

The null hypothesis is to be rejected for small values of f0/f1. The obvious most extreme rejection set is {0, 6}. This has probability 2/16 = 1/8 under H0, solving (a). The set {0, 6} has probability 2/7 under HA; this is the power of the test, as requested in (b). If we enlarge the rejection set to {0, 1, 5, 6}, this has probability 6/16 = 3/8 under H0 and thus is our solution to (c). For part (d), we simply notice that this set has probability 4/7 under HA.

8. This is clearly a Neyman-Pearson problem. The null hypothesis likelihood is

   f(x | θ = 10) = ((10 − x)/50) I(0 ≤ x ≤ 10)

The alternative likelihood is

   f(x | θ = 12) = ((12 − x)/72) I(0 ≤ x ≤ 12)

The best test rejects H0 whenever

   Λ = f(x | θ = 10) / f(x | θ = 12) ≤ c

The value c can be chosen to bring the test to level α = 0.05. However, we see easily that

   Λ = [ ((10 − x)/50) I(0 ≤ x ≤ 10) ] / [ ((12 − x)/72) I(0 ≤ x ≤ 12) ]
     = (72/50) × [ (10 − x) I(0 ≤ x ≤ 10) ] / [ (12 − x) I(0 ≤ x ≤ 12) ]

We can absorb the numerical constant into c so that the best test requires that we reject H0 when

   [ (10 − x) I(0 ≤ x ≤ 10) ] / [ (12 − x) I(0 ≤ x ≤ 12) ] ≤ c

There are four critical ranges of x to consider.

                  Numerator   Denominator   Ratio
   x < 0             0            0         irrelevant
   0 ≤ x ≤ 10     10 − x       12 − x       (10 − x)/(12 − x)
   10 < x ≤ 12       0         12 − x       0
   12 < x            0            0         irrelevant

We will ignore the "irrelevant" sets, since they are impossible under both H0 and also under H1. The set { 10 < x ≤ 12 } should certainly be included in the rejection set, since a value of x in that range is a clear indicator that H0 is false. We should also reject for the condition (10 − x)/(12 − x) < c. This condition is quickly rewritten as x > (10 − 12c)/(1 − c). This fraction is just another constant, so we might as well describe the condition for rejection as x > c.

For this final use of c, we need to pick the c so that P[ X > c | θ = 10 ] = 0.05. The condition we'll work with is

   P[ X > c | θ = 10 ] = ∫ from c to 10 of ((10 − x)/50) dx = (1/50) [ 10x − x²/2 ] from x = c to x = 10
      = (1/50) [ (100 − 100/2) − (10c − c²/2) ] = (1/50) [ 50 − 10c + c²/2 ]

and we want this to be 0.05. An equivalent statement is

   2.50 = 50 − 10c + c²/2   or   c² − 20c + 95 = 0

This is a routine quadratic whose roots are

   c = ( 20 ± √(20² − 4 × 1 × 95) ) / 2 = ( 20 ± √20 ) / 2 = 10 ± √5

Numerically, these are 7.7639 and 12.2361. For this example, it's the smaller root that matters. Thus, we'll reject H0 whenever x > 7.7639.

9. The null hypothesis likelihood is f(x | θ = 10) = (1/10) I(0 ≤ x ≤ 10). The alternative likelihood is f(x | θ = 12) = (1/12) I(0 ≤ x ≤ 12). The best test rejects H0 whenever

   Λ = f(x | θ = 10) / f(x | θ = 12) ≤ c

The value c can be chosen to bring the test to level α = 0.05. However, we see easily that

   Λ = [ (1/10) I(0 ≤ x ≤ 10) ] / [ (1/12) I(0 ≤ x ≤ 12) ] = (6/5) × I(0 ≤ x ≤ 10) / I(0 ≤ x ≤ 12)

We can absorb the numerical constant into c so that the best test requires that we reject H0 when I(0 ≤ x ≤ 10) / I(0 ≤ x ≤ 12) ≤ c. However, the numerator and denominator can only take values of 0 or 1. Thus, there are only four possible values.

                  Numerator   Denominator   Ratio
   x < 0             0            0         irrelevant
   0 ≤ x ≤ 10        1            1         1
   10 < x ≤ 12       0            1         0
   12 < x            0            0         irrelevant

We will choose to ignore the possibilities x < 0 and 12 < x, as these are impossible under both H0 and H1. Certainly the ratio is small on the set 10 < x ≤ 12, so we should include this interval in our rejection set. This takes nothing out of our level of significance, since P[ 10 < X ≤ 12 | H0 ] = 0. This means that we can spend the entire α = 0.05 on any subset of the interval [0, 10]. It seems most natural to reject H0 on the set { x > 9.5 }, since P[ X > 9.5 | H0 ] = 0.05. We could however create other equally-good rejection sets. Some odd examples might be these:

   Reject H0 on the set { x < 0.5 } ∪ { x > 10 }.
   Reject H0 on the set { 0.5 < x < 1.0 } ∪ { x > 10 }.
   Reject H0 on the set { 1.0 < x < 1.25 } ∪ { 3.5 < x < 3.75 } ∪ { x > 10 }.

This is a consequence of the uniform distribution, as it has a completely flat likelihood.

10. The likelihood is

   L = ∏(i=1..n) (1/(θ√(2π))) exp( −(xi − θ)²/(2θ²) ) = (1/(θ^n (2π)^(n/2))) exp( −(1/(2θ²)) Σ (xi − θ)² )

The maximizing will almost certainly be easier through log L:

   log L = −n log θ − (n/2) log 2π − (1/(2θ²)) Σ (xi − θ)²
         = −n log θ − (n/2) log 2π − (1/(2θ²)) [ Σ xi² − 2θ Σ xi + nθ² ]
         = −n log θ − (Σ xi²)/(2θ²) + (Σ xi)/θ + terms without θ

Now we can differentiate:

   d/dθ log L = −n/θ + (Σ xi²)/θ³ − (Σ xi)/θ²   set = 0

If we multiply through by −θ³, we produce this quadratic equation:

   nθ² + θ Σ xi − Σ xi² = 0

This looks a little better if we divide by n. This gets us to

   θ² + x̄θ − (1/n) Σ xi² = 0

The roots of this are

   θ = ( −x̄ ± √( x̄² + (4/n) Σ xi² ) ) / 2

The item under the square root looks a little better if rearranged as follows:

   x̄² + (4/n) Σ xi² = x̄² + (4/n)( Σ xi² − n x̄² ) + 4x̄² = 5x̄² + 4 ((n − 1)/n) s²

Thus, we can re-identify the roots as

   θ = ( −x̄ ± √( 5x̄² + 4 ((n − 1)/n) s² ) ) / 2

It's clear that the + part of the ± is relevant here, as the − part will produce a negative answer. Thus, we present our maximum likelihood estimate as

   θ̂ML = ( −x̄ + √( 5x̄² + 4 ((n − 1)/n) s² ) ) / 2

This answer makes sense in data for which x̄ ≈ s, as we expect here since the mean and standard deviation are equal. Then the item under the square root is about 9x̄², and the estimate is about ( −x̄ + 3x̄ )/2, which is x̄.

11. Units are shown here.

   (a) b1                            hours/inch
   (b) b2                            hours
   (c) b4                            no units
   (d) s                             hours
   (e) R²                            no units
   (f) ei (a residual)               hours
   (g) ri (a Studentized residual)   no units
   (h) F (the F statistic)           no units
   (i) SSE (error sum of squares)    hours²
   (j) det(X′X)                      inch² × week² × hours² × dollars²

12. (a) The Gauss-Markov theorem states that bLS is best in the sense that Var(b*) − Var(bLS) is a positive semi-definite matrix.
In this statement, b* is any other unbiased estimate, including bV.

(b) Using Andrea's assumptions and using Andrea's calculations,

   Var(bV) = Var( (X′V⁻¹X)⁻¹ X′V⁻¹ Y )
           = (X′V⁻¹X)⁻¹ X′V⁻¹ Var(Y) [ (X′V⁻¹X)⁻¹ X′V⁻¹ ]′
           = (X′V⁻¹X)⁻¹ X′V⁻¹ σ²I V⁻¹X (X′V⁻¹X)⁻¹
           = σ² (X′V⁻¹X)⁻¹ X′V⁻¹V⁻¹X (X′V⁻¹X)⁻¹
           = σ² (X′V⁻¹X)⁻¹ X′V⁻²X (X′V⁻¹X)⁻¹

This cannot be simplified further. This derivation used (X′)′ = X and [ (X′V⁻¹X)⁻¹ ]′ = (X′V⁻¹X)⁻¹, along with the symmetry of V.

(c) Using Barbara's assumptions, and Barbara's calculations,

   Var(bV) = Var( (X′V⁻¹X)⁻¹ X′V⁻¹ Y )
           = (X′V⁻¹X)⁻¹ X′V⁻¹ Var(Y) [ (X′V⁻¹X)⁻¹ X′V⁻¹ ]′
           = (X′V⁻¹X)⁻¹ X′V⁻¹ σ²V V⁻¹X (X′V⁻¹X)⁻¹
           = σ² (X′V⁻¹X)⁻¹ X′V⁻¹X (X′V⁻¹X)⁻¹
           = σ² (X′V⁻¹X)⁻¹

13. (a) The R² for the full model is bigger.

(b) There are two ways to do this. One is to use the t-value listed for variable D; this gives an ordinary t statistic, and we reject H0 if | t | ≥ t(n−5; α/2). The other way uses the reduced sum of squares method:

   F = [ (residual SS from model with A, B, C) − (residual SS from model with A, B, C, D) ]
       / [ (residual SS from model with A, B, C, D) / (n − 5) ]
     = [ (residual SS from model with A, B, C) − (residual SS from model with A, B, C, D) ]
       / (residual mean square from model with A, B, C, D)

The null hypothesis is to be rejected if this calculation equals or exceeds F(α; 1, n−5), the upper α point for the F distribution with 1, n − 5 degrees of freedom.

(c) For this part, you MUST use the reduced sum of squares method:

   F = { [ (residual SS from model with A, B) − (residual SS from model with A, B, C, D) ] / 2 }
       / [ (residual SS from model with A, B, C, D) / (n − 5) ]

The null hypothesis is to be rejected if this calculation equals or exceeds F(α; 2, n−5), the upper α point for the F distribution with 2, n − 5 degrees of freedom.

(d) Observe that βC Ci + βD Di = βC (Ci + Di) + (βD − βC) Di.
Thus, by defining a new variable Ti = Ci + Di and a new coefficient βE = βD − βC, we can rewrite the model as

   Yi = β0 + βA Ai + βB Bi + βC Ti + βE Di + εi

Now we have changed the hypothesis to H0: βE = 0. This is now a question about a single coefficient, and the method of part (b) can be used.

14. Certainly the point estimate for 2β0 + 3β1 is 2b0 + 3b1 = 2(13.2) + 3(4.8) = 26.4 + 14.4 = 40.8. Next observe that

   Var(2b0 + 3b1) = 4 Var(b0) + 12 Cov(b0, b1) + 9 Var(b1)

This is estimated as 4(16.0) + 12(−4.0) + 9(2.25) = 36.25. Then SE(2b0 + 3b1) = √36.25 ≈ 6.0208. With n = 24, the estimate of σ must have 22 degrees of freedom. Thus, the 95% confidence interval is

   40.8 ± t(22; 0.025) × 6.0208,  or  40.8 ± 2.0744 × 6.0208,  or  40.8 ± 12.4895

You could reasonably give the interval as 40.8 ± 12.49, meaning (28.31, 53.29). Observe that you do not explicitly use the value of s. This was incorporated into the estimated variance matrix of (b0, b1)′.

15. The likelihood for this model is

   L = ∏(i=1..n) (1/(σ√(2π))) exp( −(yi − β0 − β1xi1 − β2xi2 − … − βkxik)²/(2σ²) )
     = (1/(σ^n (2π)^(n/2))) exp( −(1/(2σ²)) Σ (yi − β0 − β1xi1 − β2xi2 − … − βkxik)² )

It's clear that all the sufficient statistics must come from the exponent. This exponent is

   −(1/(2σ²)) Σ (yi − β0 − β1xi1 − β2xi2 − … − βkxik)²
      = −(1/(2σ²)) [ Σ yi² − 2β0 Σ yi − 2β1 Σ xi1 yi − 2β2 Σ xi2 yi − … − 2βk Σ xik yi + terms with no y's ]

Thus the sufficient statistics are these:

   Σ yi²,  Σ yi,  Σ xi1 yi,  Σ xi2 yi,  …,  Σ xik yi

There are k + 2 of these. This makes perfect sense, since there are k + 2 parameters, namely β0, β1, …, βk, σ.

Many people also include in the sufficient statistics the sums Σ xij, Σ xij², and Σ xig xih. Technically, we don't use these as statistics, since the x's are modeled as nonrandom. However, we will still need these sums to compute the estimates, so that there is no harm in including them.
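The interval arithmetic in the solution to problem 14 is easy to recheck mechanically. A small sketch, where the t value 2.0744 is copied from the solution and everything else comes from the stated variance matrix:

```python
import math

b0, b1 = 13.2, 4.8
var_b0, cov_b0_b1, var_b1 = 16.0, -4.0, 2.25

point = 2 * b0 + 3 * b1                            # estimate of 2*beta0 + 3*beta1
var = 4 * var_b0 + 12 * cov_b0_b1 + 9 * var_b1     # Var(2 b0 + 3 b1)
se = math.sqrt(var)
half_width = 2.0744 * se                           # t(22; 0.025) times the SE

print(round(point, 1), var, round(se, 4), round(half_width, 4))
# prints 40.8 36.25 6.0208 12.4895
```

The same pattern (squared and cross-product coefficients against the variance matrix entries) checks any linear combination of the estimated coefficients.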
16. Each of these tests is based on the reduced sum of squares method. For part (a) we examine

   F = { [ (residual SS from model using D, E) − (residual SS from model using A, B, C, D, E) ] / 3 }
       / [ (residual SS from model using A, B, C, D, E) / 76 ]

Here we use A for ARCH, B for BERRY, and so on. Numerically, the statistic is

   F = [ (61,359.9 − 47,297.3) / 3 ] / [ 47,297.3 / 76 ] ≈ 7.532

This is to be compared to F(0.05; 3, 76) = 2.7249. As our computed value of 7.532 exceeds this, we reject H0. We cannot accept H0: βARCH = 0, βBERRY = 0, βCRUST = 0.

Part (b) is similar. We do this statistic:

   F = { [ (residual SS from model using A, B, C) − (residual SS from model using A, B, C, D, E) ] / 2 }
       / [ (residual SS from model using A, B, C, D, E) / 76 ]

The numeric work is this:

   F = [ (47,889.4 − 47,297.3) / 2 ] / [ 47,297.3 / 76 ] ≈ 0.476

As this is less than 1, we will certainly accept H0. Just for the record, the comparison point is F(0.05; 2, 76) = 3.1170.
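Both F statistics in solution 16 come straight from the error sums of squares printed in the three ANOVA tables above; a quick recomputation:

```python
sse_full = 47297.3    # error SS, model with all five predictors (df = 76)
sse_de = 61359.9      # error SS, model with DEPLETE and EDGAR only
sse_abc = 47889.4     # error SS, model with ARCH, BERRY, CRUST only

mse_full = sse_full / 76              # residual mean square of the full model

F_a = ((sse_de - sse_full) / 3) / mse_full   # tests ARCH = BERRY = CRUST = 0
F_b = ((sse_abc - sse_full) / 2) / mse_full  # tests DEPLETE = EDGAR = 0

print(round(F_a, 3), round(F_b, 3))
# prints 7.532 0.476
```

Note that the numerator degrees of freedom (3 and 2) are the number of coefficients set to zero, while the denominator always uses the full model's residual mean square.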