STAT 511 Spring 2001 1. EXAM I NAME _________________ Different levels of an amino acid (0, 1, 3, or 6 mg/kg) were added to the rations fed to dairy cows. Ten different cows were used in the study. The objective was to examine the possible effect of the amino acid on milk production. Four cows were randomly assigned to the control group ( 0 mg/kg of the amino acid) and two cows were randomly assigned to each of the other three levels of the amino acid. Aside from the level of this amino acid, the rations fed to the ten cows were identical. The cows were kept in separate pens, so you can assume that one cow responded independently of any other cow. Each cow was put on the assigned diet immediately after giving birth to a calf. The response was the amount of milk produced by the cow during the next 200 days. The data are displayed in the following table. Observed Milk Production Level of Amino Acid (mg/kg) Y11 Y12 Y13 Y14 Y21 Y22 Y31 Y32 Y41 Y42 X1 X1 X1 X1 X2 X2 X3 X3 X4 X4 = = = = = = = = = = 0 0 0 0 1 1 3 3 6 6 Consider the following linear models: Model A: Yij = β0 + β1(Xi – 2) + εij where εij ~ NID(0, σ 2) Model B: Yij = µ + αi + εij where εij ~ NID(0, σ 2) A. What is the minimal set of conditions for a model to be a Gauss-Markov model? B. State the definition of an estimable function of parameters for a linear model. Use the definition to determine if either of the following is estimable for model B. (In each case give an explanation. A simple yes or no answer is not sufficient). (i) α2 + α3 (ii) 2µ + 4α1 - α2 - α3 2 C. Y = W α + ε , where Model B can be written as ~ W = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 ~ and ~ α = ~ µ α1 α2 α3 α4 Compute a generalized inverse for W T W and use it to obtain a solution to the normal equations. Express your solution as a function of Y1•• , Y2•• , Y3•• , Y4•• , the sample means for milk production at the four levels of amino acid. D. Verify that 3α1 - α2 - α3 - α4 is an estimable function of the parameters in model B. Present a formula for the ordinary least squares estimator for 3α1 - α2 - α3 - α4 as a function of Y1. , Y2. , Y3. , Y4. . E. State the Gauss-Markov theorem and describe what it implies about the properties of the ordinary least squares estimator for 3α1 - α2 - α3 - α4 that you presented in Part D. F. Write model A as Y = X β + ε , where ~ X = ~ ~ 1 1 1 1 1 1 1 1 1 1 −2 −2 −2 −2 −1 −1 1 1 4 4 and β0 β = ~ β1 3 ( ( ) ) and define PX = X X T X −1 X T . Also define PW = W W T W − WT for Model B. Furthermore, recall the following results about the normal distribution and distributions of quadratic forms. Result 1. Let A be a symmetric matrix with rank (A) = k, and let Y ~ N(µ , ∑ ) . ~ If A ∑ is idempotent, then Y ~ T ~ A Y ~ χ2k (µ T A µ) . ~ ~ ~ Result 2. Let Y ~ N(µ , ∑ ) and let A1 and A2 be symmetric matrices. If ~ ~ A1 ∑ A 2 = 0 , then Y T A1 Y and Y T A 2 Y are independent random ~ ~ ~ ~ variables. Result 3. If Y ~ N(µ , ∑ ) and A is a symmetric matrix of non-random constants, ~ ~ then E Y T A Y = µT A µ + tr (A ∑ ) ~ ~ ~ ~ ( ) Var Y T A Y = 4 µT A ∑ A µ + 2 tr (A ∑ A ∑ ) ~ ~ Result 4. If Y ~ N(µ , ∑ ) and A is a matrix of non-random constants, then ~ ~ A Y ~ N(A µ , A ∑ AT ) . ~ ~ Y1 µ1 ~ ∑11 ~ ~ N Result 5. If , Y2 µ2 ∑ 21 ~ ~ ∑12 ∑ 22 then Y1 and Y2 ~ are ~ independent random vectors if and only if Cov Y1 , Y2 = ∑12 = 0 . ~ ~ 4 (i) 1 YT (I − PX ) Y has a chi-square distribution under model B, 2 ~ ~ σ Show that i.e., when Y ~ N W γ , σ 2I . Report degrees of freedom. Does it have a ~ ~ central chi-square distribution? (ii) G. 1 Show that 2 YT ( I − PX ) PW ( I − PX ) Y has a chi-square distribution under ~ σ model B. Report degrees of freedom. Does it have a central chi-square distribution? Using the notation and results presented in Part (F), show that F= c Y T (I − PX ) PW (I − PX ) Y ~ Y ~ T ~ ( I − PW ) Y ~ has an F-distribution for some constant c when model B is the correct model. Report a numerical value for c and degrees of freedom, and give a formula for the non-centrality parameter of the F-distribution. H. 2. Can the F-statistic defined in Part (G) be used to test a null hypothesis of practical importance? Explain. Consider the study in Problem 1 and suppose the researchers also had information on the age of each cow. Let Z ij denote the age of the j-th cow fed the i-th level of amino acid. The following model was proposed. Model C: Y = M γ+ ε ~ Mγ = ~ ~ 1 1 1 1 1 1 1 1 1 1 −2 −2 −2 −2 −1 −1 1 1 4 4 where ~ Z11 Z12 Z13 Z14 0 0 0 0 0 0 0 0 0 0 Z 21 Z 22 0 0 0 0 ε ~ N 0,σ 2 I and ~ ~ 0 0 0 0 0 0 Z 31 Z 32 0 0 0 0 0 0 0 0 0 0 Z 41 Z 42 γ0 γ1 γ2 γ3 γ4 γ5 5 (A) Under what conditions, if any, would γ 2 be estimable in model C? (B) Show how you would test the null hypothesis that the linear effect of age on average milk production is the same at every level of amino acid in the diet, i.e., show how to test the null hypothesis H 0 : γ 2 = γ 3 = γ 4 = γ 5 against the alternative that H 0 is false. Report a formula for your test statistic, degrees of freedom, and indicate how you would reach a conclusion. (Just state the formula for your test statistic, you do not have to prove anything about the distribution of your test statistic or show any derivation.) (C) Suppose model C is correct. Show how you would construct a 95% confidence interval for σ 2 . (D) 3. Is model C a reparameterization of model B in Problem 1? Give an explanation. A simple "yes" or "no" response is not sufficient. Consider the linear model Y = X β + ε , where X is an n × k matrix with rank(X)=k ~ 2 2 ~ ~ T and Var ( ε ) = σ I + δ 1 1 , where I is the n × n identity matrix and 1 is an n × 1 ~ ~~ ~ 2 vector of ones. For this model, each observation has the same variance σ + δ 2 , but the observations are correlated. Any pair of observations has correlation δ 2 /(σ 2 + δ2 ) . Furthermore, since X has full column rank, every element of β is estimable and c T β is ~ ~ ~ T estimable for any c ≠ 0 . The ordinary least squares estimator for c β is ~ ~ ~ ~ c T b = c T (X T X) −1 X T Y . ~ ~ ~ ~ (A) Show that c T b = c T (X T X) −1 X T Y is an unbiased estimator for c T β . ~ ~ ~ ~ ~ ~ (B) Is c T b = c T (X T X) −1 X T Y a best linear unbiased estimator for c T β ? Justify ~ ~ ~ your answer. Exam Score ___________ ~ ~ ~