STAT 511 EXAM I NAME _________________

advertisement
STAT 511
Spring 2001
1.
EXAM I
NAME _________________
Different levels of an amino acid (0, 1, 3, or 6 mg/kg) were added to the rations fed to dairy
cows. Ten different cows were used in the study. The objective was to examine the
possible effect of the amino acid on milk production. Four cows were randomly assigned to
the control group ( 0 mg/kg of the amino acid) and two cows were randomly assigned to
each of the other three levels of the amino acid. Aside from the level of this amino acid, the
rations fed to the ten cows were identical. The cows were kept in separate pens, so you can
assume that one cow responded independently of any other cow. Each cow was put on the
assigned diet immediately after giving birth to a calf. The response was the amount of milk
produced by the cow during the next 200 days.
The data are displayed in the following table.
Observed
Milk Production
Level of
Amino Acid (mg/kg)
Y11
Y12
Y13
Y14
Y21
Y22
Y31
Y32
Y41
Y42
X1
X1
X1
X1
X2
X2
X3
X3
X4
X4
=
=
=
=
=
=
=
=
=
=
0
0
0
0
1
1
3
3
6
6
Consider the following linear models:
Model A:
Yij = β0 + β1(Xi – 2) + εij
where
εij ~ NID(0, σ 2)
Model B:
Yij = µ + αi + εij
where
εij ~ NID(0, σ 2)
A.
What is the minimal set of conditions for a model to be a Gauss-Markov model?
B.
State the definition of an estimable function of parameters for a linear model. Use the
definition to determine if either of the following is estimable for model B. (In each
case give an explanation. A simple yes or no answer is not sufficient).
(i)
α2 + α3
(ii)
2µ + 4α1 - α2 - α3
2
C.
Y = W α + ε , where
Model B can be written as
~
W =















1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
1
1
~















and
~



α = 
~



µ 
α1 
α2 

α3 
α4 
Compute a generalized inverse for W T W and use it to obtain a solution to the
normal equations. Express your solution as a function of Y1•• , Y2•• , Y3•• , Y4•• , the
sample means for milk production at the four levels of amino acid.
D.
Verify that 3α1 - α2 - α3 - α4 is an estimable function of the parameters in model B.
Present a formula for the ordinary least squares estimator for 3α1 - α2 - α3 - α4 as a
function of Y1. , Y2. , Y3. , Y4. .
E.
State the Gauss-Markov theorem and describe what it implies about the properties of
the ordinary least squares estimator for 3α1 - α2 - α3 - α4 that you presented in Part
D.
F.
Write model A as Y = X β + ε , where
~
X =
~
~















1
1
1
1
1
1
1
1
1
1
−2
−2
−2
−2
−1
−1
1
1
4
4















and
 β0 
β = 

~
 β1 
3
(
(
)
)
and define PX = X X T X −1 X T . Also define PW = W W T W − WT for Model
B. Furthermore, recall the following results about the normal distribution and
distributions of quadratic forms.
Result 1. Let A be a symmetric matrix with rank (A) = k, and let Y ~ N(µ , ∑ ) .
~
If A ∑ is idempotent, then Y
~
T
~
A Y ~ χ2k (µ T A µ) .
~
~
~
Result 2. Let Y ~ N(µ , ∑ ) and let A1 and A2 be symmetric matrices. If
~
~
A1 ∑ A 2 = 0 , then Y T A1 Y and Y T A 2 Y are independent random
~
~
~
~
variables.
Result 3. If Y ~ N(µ , ∑ ) and A is a symmetric matrix of non-random constants,
~
~
then
E  Y T A Y  = µT A µ + tr (A ∑ )
~
~
~
~
(
)
Var Y T A Y = 4 µT A ∑ A µ + 2 tr (A ∑ A ∑ )
~
~
Result 4. If Y ~ N(µ , ∑ ) and A is a matrix of non-random constants, then
~
~
A Y ~ N(A µ , A ∑ AT ) .
~
~
 Y1 
 µ1 
  ~   ∑11
 ~ 
~
N
Result 5. If 
  ,
Y2 
 µ2   ∑ 21
 ~ 
 ~



∑12  

∑ 22  

then Y1 and Y2
~
are
~


independent random vectors if and only if Cov Y1 , Y2  = ∑12 = 0 .
 ~
~ 
4
(i)
1
YT (I − PX ) Y has a chi-square distribution under model B,
2 ~
~
σ
Show that


i.e., when Y ~ N W γ , σ 2I  . Report degrees of freedom. Does it have a
~

 ~
central chi-square distribution?
(ii)
G.
1
Show that
2
YT ( I − PX ) PW ( I − PX ) Y has a chi-square distribution under
~
σ
model B. Report degrees of freedom. Does it have a central chi-square
distribution?
Using the notation and results presented in Part (F), show that
F=
c Y T (I − PX ) PW (I − PX ) Y
~
Y
~
T
~
( I − PW ) Y
~
has an F-distribution for some constant c when model B is the correct model.
Report a numerical value for c and degrees of freedom, and give a formula for the
non-centrality parameter of the F-distribution.
H.
2.
Can the F-statistic defined in Part (G) be used to test a null hypothesis of practical
importance? Explain.
Consider the study in Problem 1 and suppose the researchers also had information on the
age of each cow. Let Z ij denote the age of the j-th cow fed the i-th level of amino acid.
The following model was proposed.
Model C:
Y = M γ+ ε
~







Mγ = 
~







~
1
1
1
1
1
1
1
1
1
1
−2
−2
−2
−2
−1
−1
1
1
4
4
where
~
Z11
Z12
Z13
Z14
0
0
0
0
0
0
0
0
0
0
Z 21
Z 22
0
0
0
0
ε ~ N 0,σ 2 I  and
~

~
0
0
0
0
0
0
Z 31
Z 32
0
0
0
0
0
0
0
0
0
0
Z 41
Z 42
























γ0 
γ1 
γ2 

γ3 
γ4 

γ5 
5
(A)
Under what conditions, if any, would γ 2 be estimable in model C?
(B)
Show how you would test the null hypothesis that the linear effect of age on average
milk production is the same at every level of amino acid in the diet, i.e., show how to
test the null hypothesis H 0 : γ 2 = γ 3 = γ 4 = γ 5 against the alternative that H 0 is
false. Report a formula for your test statistic, degrees of freedom, and indicate how
you would reach a conclusion. (Just state the formula for your test statistic, you do
not have to prove anything about the distribution of your test statistic or show any
derivation.)
(C)
Suppose model C is correct. Show how you would construct a 95% confidence
interval for σ 2 .
(D)
3.
Is model C a reparameterization of model B in Problem 1? Give an explanation. A
simple "yes" or "no" response is not sufficient.
Consider the linear model Y = X β + ε , where X is an n × k matrix with rank(X)=k
~
2
2
~
~
T
and Var ( ε ) = σ I + δ 1 1 , where I is the n × n identity matrix and 1 is an n × 1
~
~~
~
2
vector of ones. For this model, each observation has the same variance σ + δ 2 , but the
observations are correlated. Any pair of observations has correlation δ 2 /(σ 2 + δ2 ) .
Furthermore, since X has full column rank, every element of β is estimable and c T β is
~
~
~
T
estimable for any c ≠ 0 . The ordinary least squares estimator for c β is
~
~
~
~
c T b = c T (X T X) −1 X T Y .
~ ~
~
~
(A) Show that c T b = c T (X T X) −1 X T Y is an unbiased estimator for c T β .
~
~
~
~
~
~
(B) Is c T b = c T (X T X) −1 X T Y a best linear unbiased estimator for c T β ? Justify
~
~
~
your answer.
Exam Score ___________
~
~
~
Download