THE ANOVA APPROACH TO THE ANALYSIS OF LINEAR MIXED EFFECTS MODELS
We begin with a relatively simple special case. Suppose

yijk = µ + τi + uij + eijk ,  (i = 1, . . . , t; j = 1, . . . , n; k = 1, . . . , m),

β = (µ, τ1 , . . . , τt )′ ,  u = (u11 , u12 , . . . , utn )′ ,  e = (e111 , e112 , . . . , etnm )′ ,

β ∈ ℝ^{t+1} is an unknown parameter vector, and

⎡ u ⎤       ⎛ ⎡ 0 ⎤   ⎡ σu² I    0    ⎤ ⎞
⎢   ⎥  ∼  N ⎜ ⎢   ⎥ , ⎢               ⎥ ⎟ ,
⎣ e ⎦       ⎝ ⎣ 0 ⎦   ⎣   0    σe² I  ⎦ ⎠

where σu², σe² ∈ ℝ⁺ are unknown variance components.

© Copyright 2012 (Iowa State University), Statistics 511
This is the standard model for a CRD with t treatments, n
experimental units per treatment, and m observations per
experimental unit.
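As a quick sketch (not part of the original notes), the model and its design matrices can be simulated with NumPy; the values of t, n, m, the treatment effects, and the variance components below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
t, n, m = 3, 4, 2            # treatments, units per treatment, obs per unit
sigma_u, sigma_e = 1.0, 0.5  # illustrative standard deviations (assumptions)
mu = 10.0
tau = np.array([0.0, 1.0, 2.0])

# Design matrices exactly as written in the text:
X = np.hstack([np.ones((t * n * m, 1)),
               np.kron(np.eye(t), np.ones((n * m, 1)))])
Z = np.kron(np.eye(t * n), np.ones((m, 1)))

beta = np.concatenate([[mu], tau])          # (mu, tau_1, ..., tau_t)'
u = rng.normal(0, sigma_u, size=t * n)      # experimental-unit effects
e = rng.normal(0, sigma_e, size=t * n * m)  # observation-level errors

y = X @ beta + Z @ u + e
print(y.shape)  # (24,)
```

The Kronecker products reproduce the block structure: each column of Z picks out the m observations on one experimental unit.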
We can write the model as y = Xβ + Zu + e, where

X = [1tnm×1 , It×t ⊗ 1nm×1 ]   and   Z = Itn×tn ⊗ 1m×1 .
The ANOVA Table

Source                              DF
treatments                          t − 1
exp.units(treatments)               t(n − 1)
obs.units(exp.units, treatments)    tn(m − 1)
c.total                             tnm − 1
The ANOVA Table

Source        DF           Sum of Squares
trt           t − 1        Σᵢ Σⱼ Σₖ (yi·· − y··· )²
xu(trt)       t(n − 1)     Σᵢ Σⱼ Σₖ (yij· − yi·· )²
ou(xu, trt)   tn(m − 1)    Σᵢ Σⱼ Σₖ (yijk − yij· )²
c.total       tnm − 1      Σᵢ Σⱼ Σₖ (yijk − y··· )²
Source        DF           Sum of Squares                  Mean Square
trt           t − 1        nm Σᵢ (yi·· − y··· )²
xu(trt)       tn − t       m Σᵢ Σⱼ (yij· − yi·· )²          SS/DF
ou(xu, trt)   tnm − tn     Σᵢ Σⱼ Σₖ (yijk − yij· )²
c.total       tnm − 1      Σᵢ Σⱼ Σₖ (yijk − y··· )²
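The table above can be computed directly from simulated data; the decomposition of the corrected total sum of squares is exact. This is a sketch with illustrative parameter values, not code from the notes:

```python
import numpy as np

rng = np.random.default_rng(1)
t, n, m = 3, 4, 2
# simulate y_ijk = mu + tau_i + u_ij + e_ijk with illustrative values
y = 10 + np.repeat([0.0, 1.0, 2.0], n * m) \
       + np.repeat(rng.normal(0, 1.0, t * n), m) \
       + rng.normal(0, 0.5, t * n * m)
y = y.reshape(t, n, m)            # index as y[i, j, k]

ybar_i = y.mean(axis=(1, 2))      # y_i..
ybar_ij = y.mean(axis=2)          # y_ij.
ybar = y.mean()                   # y_...

SS_trt = n * m * np.sum((ybar_i - ybar) ** 2)
SS_xu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2)
SS_ou = np.sum((y - ybar_ij[:, :, None]) ** 2)
SS_tot = np.sum((y - ybar) ** 2)

DF = {"trt": t - 1, "xu(trt)": t * (n - 1), "ou(xu,trt)": t * n * (m - 1)}
MS = {k: ss / DF[k] for k, ss in
      [("trt", SS_trt), ("xu(trt)", SS_xu), ("ou(xu,trt)", SS_ou)]}

# The three sums of squares decompose the corrected total:
print(np.isclose(SS_trt + SS_xu + SS_ou, SS_tot))  # True
```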
Expected Mean Squares

E(MStrt ) = nm/(t − 1) Σᵢ E(yi·· − y··· )²
          = nm/(t − 1) Σᵢ E(µ + τi + ui· + ei·· − µ − τ · − u·· − e··· )²
          = nm/(t − 1) Σᵢ E(τi − τ · + ui· − u·· + ei·· − e··· )²
          = nm/(t − 1) Σᵢ [(τi − τ · )² + E(ui· − u·· )² + E(ei·· − e··· )²]
          = nm/(t − 1) [Σᵢ (τi − τ · )² + E{Σᵢ (ui· − u·· )²} + E{Σᵢ (ei·· − e··· )²}]
To simplify this expression further, note that

u1· , . . . , ut·  i.i.d. ∼ N(0, σu²/n).

Thus,

E{ Σᵢ (ui· − u·· )² } = (t − 1) σu²/n.

Similarly,

e1·· , . . . , et··  i.i.d. ∼ N(0, σe²/(nm)).

Thus,

E{ Σᵢ (ei·· − e··· )² } = (t − 1) σe²/(nm).

It follows that

E(MStrt ) = nm/(t − 1) Σᵢ (τi − τ · )² + mσu² + σe² .
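The closed form for E(MStrt ) can be checked by Monte Carlo; a sketch under illustrative parameter values (the loop below averages MStrt over many simulated data sets and compares it with the formula):

```python
import numpy as np

rng = np.random.default_rng(2)
t, n, m = 3, 4, 2
sigma_u2, sigma_e2 = 1.0, 0.25      # illustrative variance components
tau = np.array([0.0, 1.0, 2.0])

ms_trt = []
for _ in range(20000):
    u = rng.normal(0, np.sqrt(sigma_u2), (t, n, 1))   # u_ij
    e = rng.normal(0, np.sqrt(sigma_e2), (t, n, m))   # e_ijk
    y = tau[:, None, None] + u + e
    ybar_i = y.mean(axis=(1, 2))
    ms_trt.append(n * m * np.sum((ybar_i - y.mean()) ** 2) / (t - 1))

# Theoretical value: nm/(t-1) * sum (tau_i - tau_bar)^2 + m*sigma_u^2 + sigma_e^2
theory = n * m / (t - 1) * np.sum((tau - tau.mean()) ** 2) \
         + m * sigma_u2 + sigma_e2
print(abs(np.mean(ms_trt) - theory) < 0.2)  # True
```

With these values the formula gives 8 + 2 + 0.25 = 10.25, and the simulated average lands within Monte Carlo error of it.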
Similar calculations allow us to add an Expected Mean Squares (EMS)
column to our ANOVA table.

Source        EMS
trt           σe² + mσu² + nm/(t − 1) Σᵢ (τi − τ · )²
xu(trt)       σe² + mσu²
ou(xu, trt)   σe²
The entire table could also be derived using matrices

X1 = 1,  X2 = It×t ⊗ 1nm×1 ,  X3 = Itn×tn ⊗ 1m×1 .

Source        Sum of Squares     DF                      MS
trt           y′(P2 − P1 )y      rank(X2 ) − rank(X1 )
xu(trt)       y′(P3 − P2 )y      rank(X3 ) − rank(X2 )   SS/DF
ou(xu, trt)   y′(I − P3 )y       tnm − rank(X3 )
c.total       y′(I − P1 )y       tnm − 1
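The projection-matrix version of the table can be verified numerically. A sketch, where `proj` builds the orthogonal projection onto a column space via the pseudoinverse (a helper name introduced here, not from the notes):

```python
import numpy as np

def proj(A):
    # orthogonal projection onto the column space of A
    return A @ np.linalg.pinv(A)

rng = np.random.default_rng(3)
t, n, m = 3, 4, 2
N = t * n * m
X1 = np.ones((N, 1))
X2 = np.kron(np.eye(t), np.ones((n * m, 1)))
X3 = np.kron(np.eye(t * n), np.ones((m, 1)))
P1, P2, P3 = proj(X1), proj(X2), proj(X3)

y = rng.normal(size=N)
SS = {"trt": y @ (P2 - P1) @ y,
      "xu(trt)": y @ (P3 - P2) @ y,
      "ou(xu,trt)": y @ (np.eye(N) - P3) @ y,
      "c.total": y @ (np.eye(N) - P1) @ y}
DF = {"trt": int(np.linalg.matrix_rank(X2) - np.linalg.matrix_rank(X1)),
      "xu(trt)": int(np.linalg.matrix_rank(X3) - np.linalg.matrix_rank(X2)),
      "ou(xu,trt)": int(N - np.linalg.matrix_rank(X3))}
print(DF)  # {'trt': 2, 'xu(trt)': 9, 'ou(xu,trt)': 12}
```

The DF agree with t − 1, tn − t, and tnm − tn, and the three quadratic forms sum to the corrected total.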
Expected Mean Squares (EMS) could be computed using

E(y′Ay) = tr(AΣ) + E(y)′A E(y),

where

Σ = Var(y) = ZGZ′ + R = σu² Itn×tn ⊗ 11′m×m + σe² Itnm×tnm

and

E(y) = (µ + τ1 , µ + τ2 , . . . , µ + τt )′ ⊗ 1nm×1 .
Furthermore, it can be shown that

y′(P2 − P1 )y / (σe² + mσu² )  ∼  χ²t−1 ( nm Σᵢ (τi − τ · )² / (σe² + mσu² ) ),

y′(P3 − P2 )y / (σe² + mσu² )  ∼  χ²tn−t ,

y′(I − P3 )y / σe²  ∼  χ²tnm−tn ,

and that these three χ² random variables are independent.
It follows that

F1 = MStrt / MSxu(trt)  ∼  Ft−1, tn−t ( nm Σᵢ (τi − τ · )² / (σe² + mσu² ) )

and

F2 = MSxu(trt) / MSou(xu,trt)  ∼  [(σe² + mσu² ) / σe² ] Ftn−t, tnm−tn .

Thus, we can use F1 to test H0 : τ1 = ... = τt and F2 to test H0 : σu² = 0.
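Both F statistics are simple ratios of the mean squares already computed. A minimal sketch with illustrative parameters (p-values could then be obtained from an F distribution with the stated DF, e.g. via `scipy.stats.f.sf`):

```python
import numpy as np

rng = np.random.default_rng(4)
t, n, m = 3, 4, 2
y = (np.array([0.0, 1.0, 2.0])[:, None, None]     # tau_i
     + rng.normal(0, 1.0, (t, n, 1))              # u_ij
     + rng.normal(0, 0.5, (t, n, m)))             # e_ijk

ybar_i, ybar_ij, ybar = y.mean(axis=(1, 2)), y.mean(axis=2), y.mean()
MS_trt = n * m * np.sum((ybar_i - ybar) ** 2) / (t - 1)
MS_xu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * (n - 1))
MS_ou = np.sum((y - ybar_ij[:, :, None]) ** 2) / (t * n * (m - 1))

F1 = MS_trt / MS_xu   # tests H0: tau_1 = ... = tau_t on (t-1, tn-t) DF
F2 = MS_xu / MS_ou    # tests H0: sigma_u^2 = 0 on (tn-t, tnm-tn) DF
print(F1 > 0, F2 > 0)  # True True
```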
Estimating σu²

Note that

E[ (MSxu(trt) − MSou(xu,trt) ) / m ] = [(σe² + mσu² ) − σe² ] / m = σu² .

Thus,

(MSxu(trt) − MSou(xu,trt) ) / m

is an unbiased estimator of σu² .
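Both the unbiasedness and the negative-value problem discussed next can be seen by simulation. A sketch with illustrative variance components chosen small enough (σu² = 0.1) that negative estimates occur often:

```python
import numpy as np

rng = np.random.default_rng(5)
t, n, m = 3, 4, 2
sigma_u2, sigma_e2 = 0.1, 1.0   # illustrative values

est = []
for _ in range(10000):
    y = (rng.normal(0, np.sqrt(sigma_u2), (t, n, 1))     # u_ij
         + rng.normal(0, np.sqrt(sigma_e2), (t, n, m)))  # e_ijk
    ybar_i, ybar_ij = y.mean(axis=(1, 2)), y.mean(axis=2)
    MS_xu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * (n - 1))
    MS_ou = np.sum((y - ybar_ij[:, :, None]) ** 2) / (t * n * (m - 1))
    est.append((MS_xu - MS_ou) / m)   # ANOVA estimator of sigma_u^2

est = np.array(est)
print(abs(est.mean() - sigma_u2) < 0.02)  # True: unbiased on average
print((est < 0).any())                    # True: negative estimates do occur
```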
Although

(MSxu(trt) − MSou(xu,trt) ) / m

is an unbiased estimator of σu² , this estimator can take negative values.

This is undesirable because σu² , the variance of the u random effects,
cannot be negative.
As we have seen previously,

Σ ≡ Var(y) = σu² Itn×tn ⊗ 11′m×m + σe² Itnm×tnm .

It turns out that

β̂Σ = (X′Σ⁻¹X)⁻X′Σ⁻¹y = (X′X)⁻X′y = β̂.

Thus, the GLS estimator of any estimable Cβ is equal to the OLS estimator in
this special case.
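This equality holds because ΣX = (mσu² + σe²)X here, so the GLS and OLS projections coincide. It is easy to confirm numerically; a sketch with illustrative variance components:

```python
import numpy as np

t, n, m = 3, 4, 2
N = t * n * m
X = np.hstack([np.ones((N, 1)), np.kron(np.eye(t), np.ones((n * m, 1)))])
# Sigma = sigma_u^2 I ⊗ 11' + sigma_e^2 I with illustrative values:
Sigma = 1.0 * np.kron(np.eye(t * n), np.ones((m, m))) + 0.25 * np.eye(N)

rng = np.random.default_rng(6)
y = rng.normal(size=N)
Si = np.linalg.inv(Sigma)
gls = np.linalg.pinv(X.T @ Si @ X) @ X.T @ Si @ y   # beta_hat_Sigma
ols = np.linalg.pinv(X.T @ X) @ X.T @ y             # beta_hat

# An estimable function, e.g. tau_1 - tau_2 = c'beta with c = (0, 1, -1, 0):
c = np.array([0.0, 1.0, -1.0, 0.0])
print(np.isclose(c @ gls, c @ ols))  # True
```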
An Analysis Based on the Average for Each Experimental Unit

Recall that our model is

yijk = µ + τi + uij + eijk ,  (i = 1, ..., t; j = 1, ..., n; k = 1, ..., m).

The average of the observations for experimental unit ij is

yij· = µ + τi + uij + eij· .
If we define

εij = uij + eij·  ∀ i, j

and

σ² = σu² + σe²/m,

we have

yij· = µ + τi + εij ,

where the εij terms are i.i.d. N(0, σ² ). Thus, averaging the same number (m)
of multiple observations per experimental unit results in a normal theory
Gauss-Markov linear model for the averages {yij· : i = 1, ..., t; j = 1, ..., n}.
Inferences about estimable functions of β obtained by analyzing these
averages are identical to the results obtained using the ANOVA approach, as
long as the number of multiple observations per experimental unit is the same
for all experimental units.

When using the averages as data, our estimate of σ² is an estimate of
σu² + σe²/m.

We can't separately estimate σu² and σe² , but this doesn't matter if our focus
is on inference for estimable functions of β.
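The equivalence of the two routes to estimating σ² is an algebraic identity: MSxu(trt) /m equals the MSE of a one-way ANOVA on the unit means. A sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(7)
t, n, m = 3, 4, 2
y = (np.array([0.0, 1.0, 2.0])[:, None, None]
     + rng.normal(0, 1.0, (t, n, 1)) + rng.normal(0, 0.5, (t, n, m)))

# Full-data route: MS_xu(trt) / m estimates sigma^2 = sigma_u^2 + sigma_e^2/m.
ybar_ij = y.mean(axis=2)              # experimental-unit means
ybar_i = ybar_ij.mean(axis=1)
MS_xu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * (n - 1))

# Averages route: MSE from a one-way ANOVA of the unit means.
MSE_avg = np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * (n - 1))

print(np.isclose(MS_xu / m, MSE_avg))  # True: identical estimate of sigma^2
```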
Because

E(y) = (µ + τ1 , µ + τ2 , . . . , µ + τt )′ ⊗ 1nm×1 ,

the only estimable quantities are linear combinations of the treatment means
µ + τ1 , µ + τ2 , ..., µ + τt , whose Best Linear Unbiased Estimators are
y1·· , y2·· , . . . , yt·· , respectively.
Thus, any estimable Cβ can always be written as

A (µ + τ1 , µ + τ2 , . . . , µ + τt )′  for some matrix A.

It follows that the BLUE of Cβ can be written as

A (y1·· , y2·· , . . . , yt·· )′ .
Now note that

Var(yi·· ) = Var(µ + τi + ui· + ei·· )
           = Var(ui· + ei·· )
           = Var(ui· ) + Var(ei·· )
           = σu²/n + σe²/(nm)
           = (1/n)(σu² + σe²/m)
           = σ²/n.
Thus

Var( (y1·· , y2·· , . . . , yt·· )′ ) = (σ²/n) It×t ,

which implies that the variance of the BLUE of Cβ is

Var( A (y1·· , . . . , yt·· )′ ) = A (σ²/n) It×t A′ = (σ²/n) AA′ .
Thus, we don't need separate estimates of σu² and σe² to carry out inference
for estimable Cβ.

We do need to estimate σ² = σu² + σe²/m.

This can equivalently be estimated by MSxu(trt) /m or by the MSE in an
analysis of the experimental unit means {yij· : i = 1, ..., t; j = 1, ..., n}.
For example, suppose we want to estimate τ1 − τ2 . The BLUE is y1·· − y2·· ,
whose variance is

Var(y1·· − y2·· ) = Var(y1·· ) + Var(y2·· )
                 = 2σ²/n = 2(σu²/n + σe²/(mn))
                 = (2/(mn))(σe² + mσu² )
                 = 2E(MSxu(trt) )/(mn).

Thus,

V̂ar(y1·· − y2·· ) = 2MSxu(trt) /(mn).
A 100(1 − α)% confidence interval for τ1 − τ2 is

y1·· − y2·· ± tt(n−1),1−α/2 √( 2MSxu(trt) /(mn) ).

A test of H0 : τ1 = τ2 can be based on

t = (y1·· − y2·· ) / √( 2MSxu(trt) /(mn) )  ∼  tt(n−1) ( (τ1 − τ2 ) / √( 2(σe² + mσu² )/(mn) ) ).
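Computing the interval is direct; a sketch with illustrative parameters, where the critical value t_{9, 0.975} ≈ 2.262 is hard-coded for t(n − 1) = 9 DF (a table lookup, which could also come from `scipy.stats.t.ppf`):

```python
import numpy as np

rng = np.random.default_rng(8)
t, n, m = 3, 4, 2
y = (np.array([0.0, 1.0, 2.0])[:, None, None]
     + rng.normal(0, 1.0, (t, n, 1)) + rng.normal(0, 0.5, (t, n, m)))

ybar_ij = y.mean(axis=2)
ybar_i = ybar_ij.mean(axis=1)
MS_xu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2) / (t * (n - 1))

diff = ybar_i[0] - ybar_i[1]            # BLUE of tau_1 - tau_2
se = np.sqrt(2 * MS_xu / (m * n))       # sqrt(2 MS_xu(trt) / (mn))
tcrit = 2.262                           # t_{t(n-1), 0.975} with t(n-1) = 9 DF
ci = (diff - tcrit * se, diff + tcrit * se)
print(ci[0] < ci[1])  # True
```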
What if the number of observations per experimental unit is not the same for
all experimental units?

Let us look at two miniature examples to understand how this type of
imbalance affects estimation and inference.
First Example

    ⎡ y111 ⎤        ⎡ 1 0 ⎤        ⎡ 1 0 0 ⎤
y = ⎢ y121 ⎥ ,  X = ⎢ 1 0 ⎥ ,  Z = ⎢ 0 1 0 ⎥
    ⎢ y211 ⎥        ⎢ 0 1 ⎥        ⎢ 0 0 1 ⎥
    ⎣ y212 ⎦        ⎣ 0 1 ⎦        ⎣ 0 0 1 ⎦
X1 = 1,  X2 = X,  X3 = Z

MStrt = y′(P2 − P1 )y = 2(y1·· − y··· )² + 2(y2·· − y··· )² = (y1·· − y2·· )²

MSxu(trt) = y′(P3 − P2 )y = (y111 − y1·· )² + (y121 − y1·· )² = (1/2)(y111 − y121 )²

MSou(xu,trt) = y′(I − P3 )y = (y211 − y2·· )² + (y212 − y2·· )² = (1/2)(y211 − y212 )²
E(MStrt ) = E(y1·· − y2·· )²
          = E(τ1 − τ2 + u1· − u21 + e1·· − e2·· )²
          = (τ1 − τ2 )² + Var(u1· ) + Var(u21 ) + Var(e1·· ) + Var(e2·· )
          = (τ1 − τ2 )² + σu²/2 + σu² + σe²/2 + σe²/2
          = (τ1 − τ2 )² + 1.5σu² + σe²
E(MSxu(trt) ) = (1/2)E(y111 − y121 )²
             = (1/2)E(u11 − u12 + e111 − e121 )²
             = (1/2)(2σu² + 2σe² )
             = σu² + σe²

E(MSou(xu,trt) ) = (1/2)E(y211 − y212 )²
                = (1/2)E(e211 − e212 )²
                = σe²
SOURCE        EMS
trt           (τ1 − τ2 )² + 1.5σu² + σe²
xu(trt)       σu² + σe²
ou(xu, trt)   σe²

F = [MStrt / (1.5σu² + σe² )] / [MSxu(trt) / (σu² + σe² )]  ∼  F1,1 ( (τ1 − τ2 )² / (1.5σu² + σe² ) )
The test statistic that we used to test

H0 : τ1 = · · · = τt

in the balanced case is not F distributed in this unbalanced case:

MStrt / MSxu(trt)  ∼  [(1.5σu² + σe² ) / (σu² + σe² )] F1,1 ( (τ1 − τ2 )² / (1.5σu² + σe² ) ).
A Statistic with an Approximate F Distribution

We'd like our denominator to be an unbiased estimator of 1.5σu² + σe² in this
case.

Consider 1.5MSxu(trt) − 0.5MSou(xu,trt) . Its expectation is

1.5(σu² + σe² ) − 0.5σe² = 1.5σu² + σe² .

The ratio

MStrt / (1.5MSxu(trt) − 0.5MSou(xu,trt) )

can be used as an approximate F statistic with 1 numerator DF and a
denominator DF obtained using the Cochran-Satterthwaite method.
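The mean squares and the linear-combination denominator for this tiny example can be assembled directly; a sketch with illustrative parameter values (note the denominator, being a difference of mean squares, can itself go negative):

```python
import numpy as np

rng = np.random.default_rng(9)
sigma_u, sigma_e = 1.0, 0.5       # illustrative values
tau1, tau2 = 0.0, 0.0             # H0 true

# The unbalanced layout of the first example: unit 11, unit 12 (one obs each
# in treatment 1), unit 21 (two obs in treatment 2).
u11, u12, u21 = rng.normal(0, sigma_u, 3)
e = rng.normal(0, sigma_e, 4)
y111, y121 = tau1 + u11 + e[0], tau1 + u12 + e[1]
y211, y212 = tau2 + u21 + e[2], tau2 + u21 + e[3]

y1bar, y2bar = (y111 + y121) / 2, (y211 + y212) / 2
MS_trt = (y1bar - y2bar) ** 2
MS_xu = (y111 - y121) ** 2 / 2
MS_ou = (y211 - y212) ** 2 / 2

denom = 1.5 * MS_xu - 0.5 * MS_ou   # unbiased for 1.5*sigma_u^2 + sigma_e^2
F_approx = MS_trt / denom           # ~ F with 1 and Satterthwaite DF
print(MS_trt >= 0 and MS_xu >= 0 and MS_ou >= 0)  # True
```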
The Cochran-Satterthwaite method will be explained in the next set of notes.

We should not expect this approximate F-test to be reliable in this case
because of our pitifully small dataset.
Best Linear Unbiased Estimates in this First Example

What do the BLUEs of the treatment means look like in this case? Recall

    ⎡ µ1 ⎤        ⎡ 1 0 ⎤        ⎡ 1 0 0 ⎤
β = ⎢    ⎥ ,  X = ⎢ 1 0 ⎥ ,  Z = ⎢ 0 1 0 ⎥ .
    ⎣ µ2 ⎦        ⎢ 0 1 ⎥        ⎢ 0 0 1 ⎥
                  ⎣ 0 1 ⎦        ⎣ 0 0 1 ⎦
Σ = Var(y) = ZGZ′ + R = σu² ZZ′ + σe² I

         ⎡ 1 0 0 0 ⎤        ⎡ 1 0 0 0 ⎤
  = σu²  ⎢ 0 1 0 0 ⎥ + σe²  ⎢ 0 1 0 0 ⎥
         ⎢ 0 0 1 1 ⎥        ⎢ 0 0 1 0 ⎥
         ⎣ 0 0 1 1 ⎦        ⎣ 0 0 0 1 ⎦

    ⎡ σu² + σe²      0           0           0      ⎤
  = ⎢     0      σu² + σe²       0           0      ⎥
    ⎢     0          0       σu² + σe²      σu²     ⎥
    ⎣     0          0          σu²      σu² + σe²  ⎦
It follows that

β̂Σ = (X′Σ⁻¹X)⁻X′Σ⁻¹y = ⎡ 1/2 1/2  0   0  ⎤ y = ⎡ y1·· ⎤ .
                        ⎣  0   0  1/2 1/2 ⎦     ⎣ y2·· ⎦

Fortunately, this is a linear estimator that does not depend on unknown
variance components.
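That β̂Σ reduces to the pair of treatment means, whatever σu² and σe² are, can be confirmed numerically. A sketch with illustrative variance components:

```python
import numpy as np

rng = np.random.default_rng(10)
sigma_u2, sigma_e2 = 1.0, 0.5   # illustrative values
X = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], float)
Z = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]], float)
Sigma = sigma_u2 * Z @ Z.T + sigma_e2 * np.eye(4)

y = rng.normal(size=4)          # stands in for (y111, y121, y211, y212)'
Si = np.linalg.inv(Sigma)
beta_gls = np.linalg.inv(X.T @ Si @ X) @ X.T @ Si @ y

means = np.array([y[:2].mean(), y[2:].mean()])  # (y1.., y2..)
print(np.allclose(beta_gls, means))  # True
```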
Second Example

    ⎡ y111 ⎤        ⎡ 1 0 ⎤        ⎡ 1 0 0 ⎤
y = ⎢ y112 ⎥ ,  X = ⎢ 1 0 ⎥ ,  Z = ⎢ 1 0 0 ⎥
    ⎢ y121 ⎥        ⎢ 1 0 ⎥        ⎢ 0 1 0 ⎥
    ⎣ y211 ⎦        ⎣ 0 1 ⎦        ⎣ 0 0 1 ⎦
In this case, it can be shown that

β̂Σ = (X′Σ⁻¹X)⁻X′Σ⁻¹y

                                                                      ⎡ y111 ⎤
   = ⎡ (σe²+σu²)/(3σe²+4σu²)  (σe²+σu²)/(3σe²+4σu²)  (σe²+2σu²)/(3σe²+4σu²)  0 ⎤ ⎢ y112 ⎥
     ⎣          0                       0                      0             1 ⎦ ⎢ y121 ⎥
                                                                      ⎣ y211 ⎦

   = ⎡ [(2σe²+2σu²)/(3σe²+4σu²)] y11· + [(σe²+2σu²)/(3σe²+4σu²)] y121 ⎤ .
     ⎣                            y211                                ⎦
It is straightforward to show that the weights on y11· and y121 are

[1/Var(y11· )] / [1/Var(y11· ) + 1/Var(y121 )]   and   [1/Var(y121 )] / [1/Var(y11· ) + 1/Var(y121 )],

respectively.

This is a special case of a more general phenomenon: the BLUE is a
weighted average of independent linear unbiased estimators, with weights
proportional to the inverse variances of those estimators.
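The inverse-variance form of the weights matches the closed forms displayed above, since Var(y11· ) = σu² + σe²/2 and Var(y121 ) = σu² + σe². A quick check with illustrative variance components:

```python
import numpy as np

sigma_u2, sigma_e2 = 1.0, 0.5          # illustrative values
var_y11bar = sigma_u2 + sigma_e2 / 2   # Var(y_11.) = sigma_u^2 + sigma_e^2/2
var_y121 = sigma_u2 + sigma_e2         # Var(y_121)

# Inverse-variance weights:
w1 = (1 / var_y11bar) / (1 / var_y11bar + 1 / var_y121)
w2 = (1 / var_y121) / (1 / var_y11bar + 1 / var_y121)

# Closed forms from the text:
w1_text = (2 * sigma_e2 + 2 * sigma_u2) / (3 * sigma_e2 + 4 * sigma_u2)
w2_text = (sigma_e2 + 2 * sigma_u2) / (3 * sigma_e2 + 4 * sigma_u2)
print(np.isclose(w1, w1_text), np.isclose(w2, w2_text))  # True True
```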
Of course, in this case and in many others,

β̂Σ = ⎡ [(2σe²+2σu²)/(3σe²+4σu²)] y11· + [(σe²+2σu²)/(3σe²+4σu²)] y121 ⎤
     ⎣                            y211                                ⎦

is not an estimator because it is a function of unknown parameters.

Thus, we use β̂Σ̂ as our estimator (i.e., we replace σe² and σu² by estimates
in the expression above).
β̂Σ̂ is an approximation to the BLUE.

β̂Σ̂ is not even a linear estimator in this case.

Its exact distribution is unknown.

When sample sizes are large, it is reasonable to assume that the distribution
of β̂Σ̂ is approximately the same as the distribution of β̂Σ .
Var(β̂Σ ) = Var[(X′Σ⁻¹X)⁻¹X′Σ⁻¹y]
         = (X′Σ⁻¹X)⁻¹X′Σ⁻¹ Var(y) [(X′Σ⁻¹X)⁻¹X′Σ⁻¹]′
         = (X′Σ⁻¹X)⁻¹X′Σ⁻¹ΣΣ⁻¹X(X′Σ⁻¹X)⁻¹
         = (X′Σ⁻¹X)⁻¹X′Σ⁻¹X(X′Σ⁻¹X)⁻¹
         = (X′Σ⁻¹X)⁻¹

Var(β̂Σ̂ ) = Var[(X′Σ̂⁻¹X)⁻¹X′Σ̂⁻¹y] = ????  ≈  (X′Σ̂⁻¹X)⁻¹
Summary of Main Points

Many of the concepts we have seen by examining special cases hold in
greater generality.

For many of the linear mixed models commonly used in practice, balanced
data are nice because...
1. It is relatively easy to determine degrees of freedom, sums of squares, and
   expected mean squares in an ANOVA table.

2. Ratios of appropriate mean squares can be used to obtain exact F-tests.

3. For estimable Cβ, Cβ̂Σ̂ = Cβ̂ (OLS = GLS).

4. When Var(c′β̂) = constant × E(MS), exact inferences about c′β can be
   obtained by constructing t-tests or confidence intervals based on

   t = (c′β̂ − c′β) / √(constant × MS)  ∼  tDF(MS) .

5. Simple analysis based on experimental unit averages gives the same
   results as those obtained by linear mixed model analysis of the full data set.
When data are unbalanced, the analysis of linear mixed models may be
considerably more complicated.

1. Approximate F-tests can be obtained by forming linear combinations of
   mean squares to obtain denominators for test statistics.

2. The estimator Cβ̂Σ̂ may be a nonlinear estimator of Cβ whose exact
   distribution is unknown.

3. Approximate inference for Cβ is often obtained by using the distribution of
   Cβ̂Σ , with unknowns in that distribution replaced by estimates.
Whether data are balanced or unbalanced, unbiased estimators of
variance components can be obtained using linear combinations of
mean squares from the ANOVA table.