Introduction to the Gauss-Markov Linear Model
Copyright © 2012 Dan Nettleton (Iowa State University)
Statistics 511
Random Vectors



\[
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\]
is a random vector if and only if each element of $y$ is a random variable (i.e., $y_i$ is a random variable $\forall\, i = 1, \ldots, n$).

The mean of the random vector $y$ is
\[
E(y) = \begin{bmatrix} E(y_1) \\ E(y_2) \\ \vdots \\ E(y_n) \end{bmatrix}.
\]

The variance of the random vector $y$ is the matrix whose $(i, j)$th element is $\mathrm{Cov}(y_i, y_j) = E(y_i y_j) - E(y_i)E(y_j)$.
Example: Variance of a Random Vector


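For example, the variance of $y = [y_1, y_2, y_3]'$ is
\[
\mathrm{Var}(y) =
\begin{bmatrix}
\mathrm{Cov}(y_1, y_1) & \mathrm{Cov}(y_1, y_2) & \mathrm{Cov}(y_1, y_3) \\
\mathrm{Cov}(y_2, y_1) & \mathrm{Cov}(y_2, y_2) & \mathrm{Cov}(y_2, y_3) \\
\mathrm{Cov}(y_3, y_1) & \mathrm{Cov}(y_3, y_2) & \mathrm{Cov}(y_3, y_3)
\end{bmatrix}
=
\begin{bmatrix}
\mathrm{Var}(y_1) & \mathrm{Cov}(y_1, y_2) & \mathrm{Cov}(y_1, y_3) \\
\mathrm{Cov}(y_2, y_1) & \mathrm{Var}(y_2) & \mathrm{Cov}(y_2, y_3) \\
\mathrm{Cov}(y_3, y_1) & \mathrm{Cov}(y_3, y_2) & \mathrm{Var}(y_3)
\end{bmatrix}.
\]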
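As a concrete check of the definition, the sketch below computes E(y) and Var(y) exactly for a small discrete random vector. The joint distribution (four equally likely outcomes) is made up for illustration and is not from the course; plain Python with `fractions` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical joint distribution of y = (y1, y2, y3):
# four outcomes, each with probability 1/4 (an assumption for illustration).
outcomes = [(0, 0, 0), (1, 1, 0), (1, 1, 1), (0, 0, 1)]
p = Fraction(1, 4)
n = 3

# E(y): the vector of element-wise expectations E(y_i).
mean = [sum(p * y[i] for y in outcomes) for i in range(n)]

# Var(y): the (i, j) element is Cov(y_i, y_j) = E(y_i y_j) - E(y_i) E(y_j).
var = [
    [sum(p * y[i] * y[j] for y in outcomes) - mean[i] * mean[j] for j in range(n)]
    for i in range(n)
]

for row in var:
    print([str(v) for v in row])
```

The diagonal holds the variances Var(y_i), and the matrix is symmetric because Cov(y_i, y_j) = Cov(y_j, y_i).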
The Gauss-Markov Linear Model
\[
y = X\beta + \varepsilon
\]
$y$ is an $n \times 1$ random vector of responses.
$X$ is an $n \times p$ matrix of constants with columns corresponding to explanatory variables. $X$ is sometimes referred to as the design matrix.
$\beta$ is an unknown parameter vector in $\mathbb{R}^p$.
$\varepsilon$ is an $n \times 1$ random vector of errors.
$E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, where $\sigma^2$ is an unknown parameter in $\mathbb{R}^+$.
The Gauss-Markov Linear Model
Note that the model is not completely specified because the
distribution of y is not completely specified.
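\[
y = X\beta + \varepsilon, \quad E(\varepsilon) = 0, \quad \mathrm{Var}(\varepsilon) = \sigma^2 I
\]
\[
\implies E(y) = X\beta, \quad \mathrm{Var}(y) = \sigma^2 I
\]
\[
\implies y \sim (X\beta, \sigma^2 I)
\]
“$y$ has a distribution with mean $X\beta$ and variance $\sigma^2 I$.”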
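The step from the assumptions on the errors to the moments of y can be spelled out in two lines. This is a short derivation rather than part of the original slide; it uses only the fact that Xβ is a non-random vector, so it passes through the expectation and does not contribute to the variance.

```latex
\begin{aligned}
E(y) &= E(X\beta + \varepsilon) = X\beta + E(\varepsilon) = X\beta, \\
\mathrm{Var}(y) &= \mathrm{Var}(X\beta + \varepsilon) = \mathrm{Var}(\varepsilon) = \sigma^2 I.
\end{aligned}
```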
The Normal Theory Gauss-Markov Linear Model
We often add an assumption of multivariate normality to the Gauss-Markov linear model: $\varepsilon \sim N(0, \sigma^2 I)$.
The assumption $\varepsilon \sim N(0, \sigma^2 I)$ is equivalent to
\[
\varepsilon_1, \ldots, \varepsilon_n \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2).
\]
The assumption $\varepsilon \sim N(0, \sigma^2 I) \implies y \sim N(X\beta, \sigma^2 I)$, i.e.,
$y_1, \ldots, y_n$ are independent normal random variables,
$\mathrm{Var}(y_i) = \sigma^2 \ \forall\, i = 1, \ldots, n$, and
$E(y_i) = x_{(i)}'\beta$ (where $x_{(i)}'$ is the $i$th row of $X$) $\forall\, i = 1, \ldots, n$.
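To make the statement E(y_i) = x_(i)'β concrete, here is a minimal sketch that computes the mean vector row by row and draws one response vector under the normal-theory model. The design matrix, parameter values, seed, and the function names `mean_vector` and `draw_y` are all hypothetical choices for illustration.

```python
import random

def mean_vector(X, beta):
    """Return X beta; element i is the inner product of row i of X with beta."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]

def draw_y(X, beta, sigma, rng):
    """Draw y = X beta + eps with eps_1, ..., eps_n i.i.d. N(0, sigma^2)."""
    return [m + rng.gauss(0.0, sigma) for m in mean_vector(X, beta)]

# Hypothetical design: an intercept column plus one explanatory variable.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta = [2.0, 0.5]

rng = random.Random(511)  # fixed seed so the draw is reproducible
y = draw_y(X, beta, sigma=1.0, rng=rng)

print("E(y) =", mean_vector(X, beta))
print("one draw of y =", y)
```

Each y_i is centered at its own mean x_(i)'β but shares the common variance σ², which is what the σ²I variance matrix expresses.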
Goal of Analysis
\[
y = X\beta + \varepsilon
\]
The analysis often focuses on answering questions about linear functions of $\beta$ of the form $C\beta$ for a specified matrix $C$.
The normality assumption is useful for constructing confidence intervals and performing tests concerning $C\beta$.
Example 1
Researchers harvested five randomly selected ears of corn from a
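field. For $i = 1, \ldots, 5$, let $y_i$ denote the weight in grams of the $i$th ear.
\[
y_1, \ldots, y_5 \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)
\]
\[
y_i = \mu + \varepsilon_i, \quad i = 1, \ldots, 5; \qquad \varepsilon_1, \ldots, \varepsilon_5 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]
\[
\begin{aligned}
y_1 &= \mu + \varepsilon_1 \\
y_2 &= \mu + \varepsilon_2 \\
y_3 &= \mu + \varepsilon_3 \\
y_4 &= \mu + \varepsilon_4 \\
y_5 &= \mu + \varepsilon_5
\end{aligned}
\qquad \varepsilon_1, \ldots, \varepsilon_5 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]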
Example 1 (continued)
Stacking the five equations $y_i = \mu + \varepsilon_i$ gives the vector form
\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}
=
\begin{bmatrix} \mu \\ \mu \\ \mu \\ \mu \\ \mu \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix}
\sim N(0, \sigma^2 I).
\]
Example 1 (continued)







\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}
=
\begin{bmatrix} \mu \\ \mu \\ \mu \\ \mu \\ \mu \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} [\mu]
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix}
\sim N(0, \sigma^2 I)
\]
Example 1 (continued)






\[
y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I),
\]
where
\[
y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}, \quad
X = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \quad
\beta = [\mu], \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix}.
\]
\[
C\beta = [1][\mu] = \mu
\]
Example 2
Researchers randomly assigned eight experimental units to two
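treatments and measured a response of interest. For $i = 1, 2$, let $y_{i1}, y_{i2}, y_{i3}, y_{i4}$ denote the responses of the experimental units in the $i$th treatment group.
\[
y_{11}, y_{12}, y_{13}, y_{14} \overset{\text{i.i.d.}}{\sim} N(\mu_1, \sigma^2)
\quad \text{independent of} \quad
y_{21}, y_{22}, y_{23}, y_{24} \overset{\text{i.i.d.}}{\sim} N(\mu_2, \sigma^2)
\]
\[
y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2; \ j = 1, \ldots, 4
\]
\[
\varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]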
Example 2 (continued)
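\[
\begin{aligned}
y_{11} &= \mu_1 + \varepsilon_{11} \\
y_{12} &= \mu_1 + \varepsilon_{12} \\
y_{13} &= \mu_1 + \varepsilon_{13} \\
y_{14} &= \mu_1 + \varepsilon_{14} \\
y_{21} &= \mu_2 + \varepsilon_{21} \\
y_{22} &= \mu_2 + \varepsilon_{22} \\
y_{23} &= \mu_2 + \varepsilon_{23} \\
y_{24} &= \mu_2 + \varepsilon_{24}
\end{aligned}
\]
\[
\varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]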
Example 2 (continued)












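\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{14} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{24} \end{bmatrix}
=
\begin{bmatrix} \mu_1 \\ \mu_1 \\ \mu_1 \\ \mu_1 \\ \mu_2 \\ \mu_2 \\ \mu_2 \\ \mu_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix}
\sim N(0, \sigma^2 I)
\]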
Example 2 (continued)












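\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{14} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{24} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix}
\sim N(0, \sigma^2 I)
\]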
Example 2 (continued)












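\[
y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I),
\]
where
\[
y = \begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{14} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{24} \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad
\beta = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix}.
\]
\[
C\beta = [1, -1] \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} = \mu_1 - \mu_2
\]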
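The two-treatment setup can be checked numerically. The sketch below builds the 8 × 2 design matrix for Example 2, multiplies it by a parameter vector, and evaluates Cβ with C = [1, −1]. The values of µ1 and µ2 and the helper `matvec` are hypothetical, chosen only for illustration.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

n_per_group = 4

# Rows 1-4 belong to treatment 1, rows 5-8 to treatment 2.
X = [[1, 0] for _ in range(n_per_group)] + [[0, 1] for _ in range(n_per_group)]

beta = [10.0, 7.5]   # hypothetical (mu1, mu2)
C = [[1, -1]]        # picks out the difference mu1 - mu2

mean_y = matvec(X, beta)   # E(y) = X beta
C_beta = matvec(C, beta)   # C beta = mu1 - mu2

print(mean_y)
print(C_beta)
```

Each row of X selects the mean of the treatment group that observation belongs to, so E(y) repeats µ1 four times and then µ2 four times.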
Example 3
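Suppose eight fertilizer amounts denoted $x_1, \ldots, x_8$ were randomly assigned to eight field plots. For $i = 1, \ldots, 8$, let $y_i$ denote the yield of the plot that received fertilizer amount $x_i$.
\[
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, 8; \qquad \varepsilon_1, \ldots, \varepsilon_8 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]
\[
\begin{aligned}
y_1 &= \beta_0 + \beta_1 x_1 + \varepsilon_1 \\
y_2 &= \beta_0 + \beta_1 x_2 + \varepsilon_2 \\
y_3 &= \beta_0 + \beta_1 x_3 + \varepsilon_3 \\
y_4 &= \beta_0 + \beta_1 x_4 + \varepsilon_4 \\
y_5 &= \beta_0 + \beta_1 x_5 + \varepsilon_5 \\
y_6 &= \beta_0 + \beta_1 x_6 + \varepsilon_6 \\
y_7 &= \beta_0 + \beta_1 x_7 + \varepsilon_7 \\
y_8 &= \beta_0 + \beta_1 x_8 + \varepsilon_8
\end{aligned}
\]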
Example 3 (continued)












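\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{bmatrix}
=
\begin{bmatrix}
\beta_0 + \beta_1 x_1 \\ \beta_0 + \beta_1 x_2 \\ \beta_0 + \beta_1 x_3 \\ \beta_0 + \beta_1 x_4 \\
\beta_0 + \beta_1 x_5 \\ \beta_0 + \beta_1 x_6 \\ \beta_0 + \beta_1 x_7 \\ \beta_0 + \beta_1 x_8
\end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix}
\sim N(0, \sigma^2 I)
\]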
Example 3 (continued)












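\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \\ 1 & x_5 \\ 1 & x_6 \\ 1 & x_7 \\ 1 & x_8 \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix}
\sim N(0, \sigma^2 I)
\]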
Example 3 (continued)












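\[
y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I),
\]
where
\[
y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \\ 1 & x_5 \\ 1 & x_6 \\ 1 & x_7 \\ 1 & x_8 \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix}.
\]
\[
C\beta = [0, 1] \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \beta_1
\]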
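A small numerical sketch of the regression setup: the design matrix has a column of ones for the intercept and a column of fertilizer amounts, and C = [0, 1] extracts the slope. The fertilizer amounts, the values of β0 and β1, and the helper `matvec` are hypothetical, used only to show the mechanics.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Hypothetical fertilizer amounts x_1, ..., x_8.
x = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]

# Design matrix: first column is the intercept, second column is x_i.
X = [[1.0, xi] for xi in x]

beta = [3.0, 0.25]   # hypothetical (beta0, beta1)
C = [[0.0, 1.0]]     # picks out the slope beta1

mean_y = matvec(X, beta)   # E(y_i) = beta0 + beta1 * x_i
C_beta = matvec(C, beta)

print(mean_y)
print(C_beta)
```

Row i of X is [1, x_i], so the ith element of Xβ is β0 + β1 x_i, matching the scalar form of the model.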
Example 4
Eight hogs were randomly assigned to two diets and two inoculations
such that two hogs received each combination of diet and inoculation.
This experiment involves two factors: diet and inoculation.
In this case, each factor has two levels (denoted here generically
as 1 and 2).
A combination of one level from each factor forms a treatment.
In this case, we have four treatments:
\[
\begin{array}{c|cc}
\text{Treatment} & \text{Diet} & \text{Inoculation} \\ \hline
1 & 1 & 1 \\
2 & 1 & 2 \\
3 & 2 & 1 \\
4 & 2 & 2
\end{array}
\]
Example 4 (continued)
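For $i = 1, 2$; $j = 1, 2$; and $k = 1, 2$; let $y_{ijk}$ denote the average daily gain of the $k$th hog that received diet $i$ and inoculation $j$.
\[
y_{ijk} = \mu + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2;
\]
\[
\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]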
Under this model, neither diet nor inoculation affects average daily
gain.
Example 4 (continued)
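For $i = 1, 2$; $j = 1, 2$; and $k = 1, 2$; let $y_{ijk}$ denote the average daily gain of the $k$th hog that received diet $i$ and inoculation $j$.
\[
y_{ijk} = \mu + \alpha_i + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2;
\]
\[
\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]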
Under this model, only diet affects average daily gain.
Example 4 (continued)
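For $i = 1, 2$; $j = 1, 2$; and $k = 1, 2$; let $y_{ijk}$ denote the average daily gain of the $k$th hog that received diet $i$ and inoculation $j$.
\[
y_{ijk} = \mu + \beta_j + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2;
\]
\[
\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]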
Under this model, only inoculation affects average daily gain.
Example 4 (continued)
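\[
y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2;
\]
\[
\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]
Under this model, factors diet and inoculation affect the mean average daily gain in an additive manner.
There is no interaction between the factors diet and inoculation.
\[
\begin{array}{l|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{inoculation difference} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 & \mu + \alpha_1 + \beta_2 & \beta_1 - \beta_2 \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 & \mu + \alpha_2 + \beta_2 & \beta_1 - \beta_2 \\ \hline
\text{diet difference} & \alpha_1 - \alpha_2 & \alpha_1 - \alpha_2 &
\end{array}
\]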
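The additive structure can be verified numerically: whatever values the parameters take, the diet difference is the same at both inoculation levels, and vice versa. The parameter values below are hypothetical integers chosen so the arithmetic is exact.

```python
# Hypothetical parameter values for the additive model (illustration only).
mu = 10
alpha = {1: 3, 2: 1}   # diet effects alpha_1, alpha_2
beta = {1: 2, 2: 5}    # inoculation effects beta_1, beta_2

# Cell means under the additive model: mu + alpha_i + beta_j.
cell = {(i, j): mu + alpha[i] + beta[j] for i in (1, 2) for j in (1, 2)}

# Diet difference within each inoculation level j.
diet_diff = {j: cell[(1, j)] - cell[(2, j)] for j in (1, 2)}
# Inoculation difference within each diet level i.
inoc_diff = {i: cell[(i, 1)] - cell[(i, 2)] for i in (1, 2)}

print(diet_diff)   # both entries equal alpha_1 - alpha_2
print(inoc_diff)   # both entries equal beta_1 - beta_2
```

Equal differences across the levels of the other factor are exactly what "no interaction" means in the table above.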
Example 4 (continued)
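\[
y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2;
\]
\[
\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]
Under this model, there is one mean for each combination of diet and inoculation.
Those four means are free to take any four values with no restrictions.
\[
\begin{array}{l|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \Delta\text{inoculation} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\ \hline
\Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22} &
\end{array}
\]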
Example 4 (continued)
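An equivalent model is the so-called cell means model:
\[
y_{ijk} = \mu_{ij} + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2;
\]
\[
\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]
\[
\begin{array}{l|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \Delta\text{inoculation} \\ \hline
\text{diet 1} & \mu_{11} & \mu_{12} & \mu_{11} - \mu_{12} \\
\text{diet 2} & \mu_{21} & \mu_{22} & \mu_{21} - \mu_{22} \\ \hline
\Delta\text{diet} & \mu_{11} - \mu_{21} & \mu_{12} - \mu_{22} &
\end{array}
\]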
Example 4 (continued)
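\[
y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2;
\]
\[
\begin{aligned}
y_{111} &= \mu + \alpha_1 + \beta_1 + \gamma_{11} + \varepsilon_{111} \\
y_{112} &= \mu + \alpha_1 + \beta_1 + \gamma_{11} + \varepsilon_{112} \\
y_{121} &= \mu + \alpha_1 + \beta_2 + \gamma_{12} + \varepsilon_{121} \\
y_{122} &= \mu + \alpha_1 + \beta_2 + \gamma_{12} + \varepsilon_{122} \\
y_{211} &= \mu + \alpha_2 + \beta_1 + \gamma_{21} + \varepsilon_{211} \\
y_{212} &= \mu + \alpha_2 + \beta_1 + \gamma_{21} + \varepsilon_{212} \\
y_{221} &= \mu + \alpha_2 + \beta_2 + \gamma_{22} + \varepsilon_{221} \\
y_{222} &= \mu + \alpha_2 + \beta_2 + \gamma_{22} + \varepsilon_{222}
\end{aligned}
\]
\[
\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]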
Example 4 (continued)












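\[
\begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{122} \\ y_{211} \\ y_{212} \\ y_{221} \\ y_{222} \end{bmatrix}
=
\begin{bmatrix}
\mu + \alpha_1 + \beta_1 + \gamma_{11} \\
\mu + \alpha_1 + \beta_1 + \gamma_{11} \\
\mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\mu + \alpha_2 + \beta_1 + \gamma_{21} \\
\mu + \alpha_2 + \beta_1 + \gamma_{21} \\
\mu + \alpha_2 + \beta_2 + \gamma_{22} \\
\mu + \alpha_2 + \beta_2 + \gamma_{22}
\end{bmatrix}
+
\begin{bmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{bmatrix}
\sim N(0, \sigma^2 I)
\]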
Example 4 (continued)












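\[
\begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{122} \\ y_{211} \\ y_{212} \\ y_{221} \\ y_{222} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \\ \gamma_{11} \\ \gamma_{12} \\ \gamma_{21} \\ \gamma_{22} \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{bmatrix}
\]
\[
y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)
\]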
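The 8 × 9 design matrix for Example 4 follows a simple rule: each row holds a 1 for µ, an indicator for the diet level, an indicator for the inoculation level, and an indicator for the diet-by-inoculation cell. The sketch below generates it from the indices; the function name `design_row` is a hypothetical helper, not notation from the course.

```python
from itertools import product

levels = (1, 2)
cells = list(product(levels, levels))  # (1,1), (1,2), (2,1), (2,2)

def design_row(i, j):
    """Row of X for a hog on diet i, inoculation j.

    Column order: mu, alpha_1, alpha_2, beta_1, beta_2,
                  gamma_11, gamma_12, gamma_21, gamma_22.
    """
    row = [1]                                        # mu
    row += [int(i == a) for a in levels]             # alpha_1, alpha_2
    row += [int(j == b) for b in levels]             # beta_1, beta_2
    row += [int((i, j) == cell) for cell in cells]   # gamma_11, ..., gamma_22
    return row

# Two hogs (k = 1, 2) per cell, ordered y111, y112, y121, ..., y222.
X = [design_row(i, j) for (i, j) in cells for _k in (1, 2)]

for row in X:
    print(row)
```

Although X has nine columns, it contains only four distinct rows (one per treatment), which reflects that this parameterization uses nine parameters to describe four cell means.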
Example 4 (continued)
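\[
\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'
\]
\[
\begin{array}{l|cc}
 & \text{inoculation 1} & \text{inoculation 2} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline
\Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22}
\end{array}
\]
Is the difference between diet means for inoculation 1 the same as the difference between diet means for inoculation 2?
\[
C\beta = [0, 0, 0, 0, 0, 1, -1, -1, 1]\beta = \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22} = 0?
\]
This question asks whether there is interaction between the factors diet and inoculation.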
Example 4 (continued)
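\[
\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'
\]
\[
\begin{array}{l|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \Delta\text{inoculation} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22}
\end{array}
\]
Is the difference between inoculation means for diet 1 the same as the difference between inoculation means for diet 2?
\[
C\beta = [0, 0, 0, 0, 0, 1, -1, -1, 1]\beta = \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22} = 0?
\]
This question also asks whether there is interaction between the factors diet and inoculation.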
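Both phrasings of the interaction question lead to the same contrast, which can be confirmed numerically. The sketch below evaluates Cβ and compares it with the difference of diet differences and the difference of inoculation differences. The numeric parameter vector and the helper `cell_mean` are hypothetical, chosen only for illustration.

```python
# Parameter order: [mu, alpha_1, alpha_2, beta_1, beta_2,
#                   gamma_11, gamma_12, gamma_21, gamma_22].
beta = [10, 3, 1, 2, 5, 1, 0, 0, 4]  # hypothetical values, illustration only

def cell_mean(beta, i, j):
    """Mean for diet i, inoculation j: mu + alpha_i + beta_j + gamma_ij."""
    mu, alpha_i, beta_j = beta[0], beta[i], beta[2 + j]
    gamma_ij = beta[5 + 2 * (i - 1) + (j - 1)]
    return mu + alpha_i + beta_j + gamma_ij

C = [0, 0, 0, 0, 0, 1, -1, -1, 1]
C_beta = sum(c * b for c, b in zip(C, beta))

# Difference between diet means within each inoculation level.
diet_diff = {j: cell_mean(beta, 1, j) - cell_mean(beta, 2, j) for j in (1, 2)}
# Difference between inoculation means within each diet level.
inoc_diff = {i: cell_mean(beta, i, 1) - cell_mean(beta, i, 2) for i in (1, 2)}

print(C_beta, diet_diff[1] - diet_diff[2], inoc_diff[1] - inoc_diff[2])
```

All three printed values agree, so asking whether Cβ = 0 is the same as asking whether either set of differences is constant across the other factor.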
Example 4 (continued)
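\[
\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'
\]
\[
\begin{array}{l|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{Diet Means} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \mu + \alpha_1 + \bar{\beta}_{\cdot} + \bar{\gamma}_{1\cdot} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \mu + \alpha_2 + \bar{\beta}_{\cdot} + \bar{\gamma}_{2\cdot}
\end{array}
\]
Is the average over inoculation means for diet 1 different from the average over inoculation means for diet 2?
\[
C\beta = [0, 1, -1, 0, 0, .5, .5, -.5, -.5]\beta = \alpha_1 - \alpha_2 + \bar{\gamma}_{1\cdot} - \bar{\gamma}_{2\cdot} = 0?
\]
This question asks about the main effect of the factor diet.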
Example 4 (continued)
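\[
\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'
\]
\[
\begin{array}{l|cc}
 & \text{inoculation 1} & \text{inoculation 2} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline
\text{Inoculation Means} & \mu + \bar{\alpha}_{\cdot} + \beta_1 + \bar{\gamma}_{\cdot 1} & \mu + \bar{\alpha}_{\cdot} + \beta_2 + \bar{\gamma}_{\cdot 2}
\end{array}
\]
Is the average over diet means for inoculation 1 different from the average over diet means for inoculation 2?
\[
C\beta = [0, 0, 0, 1, -1, .5, -.5, .5, -.5]\beta = \beta_1 - \beta_2 + \bar{\gamma}_{\cdot 1} - \bar{\gamma}_{\cdot 2} = 0?
\]
This question asks about the main effect of the factor inoculation.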
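The two main-effect contrasts can be checked the same way: Cβ should equal the difference between marginal means obtained by averaging cell means over the other factor. The parameter values and the helpers `cell_mean` and `dot` are hypothetical; `fractions` keeps the halves exact.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Parameter order: [mu, alpha_1, alpha_2, beta_1, beta_2,
#                   gamma_11, gamma_12, gamma_21, gamma_22].
beta = [10, 3, 1, 2, 5, 1, 0, 0, 4]  # hypothetical values, illustration only

def cell_mean(beta, i, j):
    """Mean for diet i, inoculation j: mu + alpha_i + beta_j + gamma_ij."""
    return beta[0] + beta[i] + beta[2 + j] + beta[5 + 2 * (i - 1) + (j - 1)]

def dot(c, b):
    return sum(ci * bi for ci, bi in zip(c, b))

C_diet = [0, 1, -1, 0, 0, half, half, -half, -half]
C_inoc = [0, 0, 0, 1, -1, half, -half, half, -half]

# Marginal means: average the cell means over the other factor.
diet_mean = {i: half * (cell_mean(beta, i, 1) + cell_mean(beta, i, 2)) for i in (1, 2)}
inoc_mean = {j: half * (cell_mean(beta, 1, j) + cell_mean(beta, 2, j)) for j in (1, 2)}

print(dot(C_diet, beta), diet_mean[1] - diet_mean[2])
print(dot(C_inoc, beta), inoc_mean[1] - inoc_mean[2])
```

Each printed pair matches, so the contrast for a main effect is the difference between that factor's marginal means.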
Example 4 (continued)
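\[
\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'
\]
\[
\begin{array}{l|cc}
 & \text{inoculation 1} & \text{inoculation 2} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline
\Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} &
\end{array}
\]
Is there a difference between the diet means for inoculation 1?
\[
C\beta = [0, 1, -1, 0, 0, 1, 0, -1, 0]\beta = \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} = 0?
\]
This question asks about the simple effect of the factor diet for the first level of the factor inoculation.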
Example 4 (continued)
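\[
\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'
\]
\[
\begin{array}{l|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \Delta\text{inoculation} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\ \hline
\Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22} &
\end{array}
\]
Are all four treatment means identical?
\[
C\beta =
\begin{bmatrix}
0 & 0 & 0 & 1 & -1 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 1 & -1 & 0 & 0 & 1 & -1 \\
0 & 1 & -1 & 0 & 0 & 1 & 0 & -1 & 0
\end{bmatrix} \beta
=
\begin{bmatrix}
\beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\
\alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21}
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}?
\]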
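The three rows of C compare the cell means pairwise (cell 11 with 12, cell 21 with 22, and cell 11 with 21), so Cβ = 0 exactly when all four treatment means coincide. The sketch below illustrates this with two hypothetical parameter vectors, one with unequal cell means and one with equal cell means; the helper names are assumptions for illustration.

```python
# Parameter order: [mu, alpha_1, alpha_2, beta_1, beta_2,
#                   gamma_11, gamma_12, gamma_21, gamma_22].
C = [
    [0, 0, 0, 1, -1, 1, -1, 0, 0],   # mean(1,1) - mean(1,2)
    [0, 0, 0, 1, -1, 0, 0, 1, -1],   # mean(2,1) - mean(2,2)
    [0, 1, -1, 0, 0, 1, 0, -1, 0],   # mean(1,1) - mean(2,1)
]

def cell_mean(beta, i, j):
    """Mean for diet i, inoculation j: mu + alpha_i + beta_j + gamma_ij."""
    return beta[0] + beta[i] + beta[2 + j] + beta[5 + 2 * (i - 1) + (j - 1)]

def all_cell_means(beta):
    return [cell_mean(beta, i, j) for i in (1, 2) for j in (1, 2)]

def C_times(beta):
    return [sum(c * b for c, b in zip(row, beta)) for row in C]

beta_unequal = [10, 3, 1, 2, 5, 1, 0, 0, 4]   # hypothetical: means 16, 18, 13, 20
beta_equal = [10, 1, 1, 2, 2, 0, 0, 0, 0]     # hypothetical: every mean is 13

print(all_cell_means(beta_unequal), C_times(beta_unequal))
print(all_cell_means(beta_equal), C_times(beta_equal))
```

Note that `beta_equal` has nonzero α and β entries yet still gives Cβ = 0, because the question concerns the treatment means rather than the individual parameters.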