Linear mixed models
Example: random blocks (penicillin production)
- Comparison of four processes for producing penicillin.
- The four processes A, B, C, and D are the levels of a "fixed" treatment effect.
- A random sample of five batches of raw material (corn steep liquor) is taken.
- Each batch is split into four parts; each process is run on one part, and the order in which the processes are run within each batch is randomized.
- We would like to know the effect of batch on penicillin production.
Example continued
Let Y_ij be the yield for the i-th process (i = 1, 2, 3, 4) applied to the j-th batch. Consider the following simple model:

  Y_ij = µ + α_i + e_ij,

where α_i is the main effect associated with the i-th process. In linear models, it is typically assumed that e_ij and e_i′j (i ≠ i′) are independent. But this might not be realistic for this data set, because both observations come from the j-th batch. Thus, e_ij and e_i′j (i ≠ i′) could be dependent on each other.
Example continued
To study the batch effect, we might consider a two-way ANOVA model as follows:

  Y_ij = µ + α_i + γ_j + ε_ij.

In a usual two-way ANOVA model, γ_j is typically treated as a fixed effect.
Linear mixed models
However, this might not be entirely appropriate for this data set, for the following reasons:
- We are not interested in the effects of these five selected batches. Instead, we are interested in the effects of any batch sampled from the population.
- To repeat this experiment, one would need to use a different set of batches of raw material.
- The observations obtained from the same batch are dependent on each other.
Example continued
As a result, we could consider the following random block effects model:

  Y_ij = µ + α_i + γ_j + ε_ij,

where γ_j is the random block effect (batch effect), with γ_j ~ iid N(0, σ_γ²), ε_ij ~ iid N(0, σ²), and the γ_j independent of the ε_ij.
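To make the model concrete, here is a minimal simulation sketch (not from the slides; µ, the α_i, σ_γ² = 2, and σ² = 1 are illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 10.0
alpha = np.array([1.0, -1.0, 0.5, -0.5])   # illustrative process effects
sigma2_gamma, sigma2 = 2.0, 1.0            # illustrative variance components

gamma = rng.normal(0.0, np.sqrt(sigma2_gamma), size=5)   # batch effects gamma_j
eps = rng.normal(0.0, np.sqrt(sigma2), size=(4, 5))      # errors eps_ij

# Y[i, j] = mu + alpha_i + gamma_j + eps_ij  (4 processes x 5 batches)
Y = mu + alpha[:, None] + gamma[None, :] + eps
print(Y)
```

Entries in the same column share the same γ_j, which is what induces the within-batch correlation derived next.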
Variance-covariance structure
- The variance of an observation Y_ij is
    Var(Y_ij) = Var(µ + α_i + γ_j + ε_ij) = Var(γ_j + ε_ij)
              = Var(γ_j) + Var(ε_ij) = σ_γ² + σ².
- The covariance of Y_ij and Y_kj (k ≠ i), two observations from the same batch, is
    Cov(Y_ij, Y_kj) = Cov(µ + α_i + γ_j + ε_ij, µ + α_k + γ_j + ε_kj)
                    = Cov(γ_j + ε_ij, γ_j + ε_kj)
                    = Cov(γ_j, γ_j) = σ_γ².
- The covariance of Y_ij and Y_kl (j ≠ l), two observations from different batches, is 0.
Variance-covariance structure
For the observations from the same batch, Y_j = (Y_1j, Y_2j, Y_3j, Y_4j)^T, the variance-covariance matrix of Y_j is

             ⎡ σ_γ² + σ²   σ_γ²        σ_γ²        σ_γ²      ⎤
  Var(Y_j) = ⎢ σ_γ²        σ_γ² + σ²   σ_γ²        σ_γ²      ⎥
             ⎢ σ_γ²        σ_γ²        σ_γ² + σ²   σ_γ²      ⎥
             ⎣ σ_γ²        σ_γ²        σ_γ²        σ_γ² + σ² ⎦ .

This is the so-called compound symmetric structure.
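Equivalently, this block is σ² I₄ + σ_γ² J₄, with J₄ the 4 × 4 all-ones matrix; a quick numpy check with the illustrative values σ_γ² = 2 and σ² = 1:

```python
import numpy as np

sigma2_gamma, sigma2 = 2.0, 1.0

# Compound symmetric block: sigma^2 * I_4 + sigma_gamma^2 * J_4
V = sigma2 * np.eye(4) + sigma2_gamma * np.ones((4, 4))
print(V)   # 3.0 on the diagonal (sigma_gamma^2 + sigma^2), 2.0 off it (sigma_gamma^2)
```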
Example continued
The above random block effects model can be written in matrix form as follows:

  ⎡ Y₁₁ ⎤   ⎡ 1 1 0 0 0 ⎤          ⎡ 1 0 0 0 0 ⎤          ⎡ ε₁₁ ⎤
  ⎢ Y₂₁ ⎥   ⎢ 1 0 1 0 0 ⎥ ⎡ µ  ⎤   ⎢ 1 0 0 0 0 ⎥ ⎡ γ₁ ⎤   ⎢ ε₂₁ ⎥
  ⎢ Y₃₁ ⎥   ⎢ 1 0 0 1 0 ⎥ ⎢ α₁ ⎥   ⎢ 1 0 0 0 0 ⎥ ⎢ γ₂ ⎥   ⎢ ε₃₁ ⎥
  ⎢ Y₄₁ ⎥ = ⎢ 1 0 0 0 1 ⎥ ⎢ α₂ ⎥ + ⎢ 1 0 0 0 0 ⎥ ⎢ γ₃ ⎥ + ⎢ ε₄₁ ⎥
  ⎢  ⋮  ⎥   ⎢     ⋮     ⎥ ⎢ α₃ ⎥   ⎢     ⋮     ⎥ ⎢ γ₄ ⎥   ⎢  ⋮  ⎥
  ⎢ Y₁₅ ⎥   ⎢ 1 1 0 0 0 ⎥ ⎣ α₄ ⎦   ⎢ 0 0 0 0 1 ⎥ ⎣ γ₅ ⎦   ⎢ ε₁₅ ⎥
  ⎢ Y₂₅ ⎥   ⎢ 1 0 1 0 0 ⎥          ⎢ 0 0 0 0 1 ⎥          ⎢ ε₂₅ ⎥
  ⎢ Y₃₅ ⎥   ⎢ 1 0 0 1 0 ⎥          ⎢ 0 0 0 0 1 ⎥          ⎢ ε₃₅ ⎥
  ⎣ Y₄₅ ⎦   ⎣ 1 0 0 0 1 ⎦          ⎣ 0 0 0 0 1 ⎦          ⎣ ε₄₅ ⎦

Here Y is the 20 × 1 response vector, the first matrix is X (20 × 5), the second is Z (20 × 5), β = (µ, α₁, α₂, α₃, α₄)^T, and u = (γ₁, ..., γ₅)^T.
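As a sketch (assuming the observations are stacked with process varying fastest within batch, as above), X and Z can be built with Kronecker products:

```python
import numpy as np

n_proc, n_batch = 4, 5

# X: intercept column plus one indicator per process, repeated for each batch (20 x 5)
X = np.kron(np.ones((n_batch, 1)), np.hstack([np.ones((n_proc, 1)), np.eye(n_proc)]))

# Z: one indicator column per batch, covering that batch's four runs (20 x 5)
Z = np.kron(np.eye(n_batch), np.ones((n_proc, 1)))

print(X.shape, Z.shape)   # (20, 5) (20, 5)
```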
Fixed and random effects
- Fixed effects are unknown constants that we try to estimate from the data. In the previous example, we treat the process effects as fixed treatment effects, because we are interested in the effects of these specific processes on penicillin production.
- Random effects are random variables, used to account for the variation due to the associated factors (predictors). For random effects, we are not interested in their particular values; rather, we try to understand the variation due to the random effects and their distribution.
Linear mixed models
In general, a linear mixed model may be represented as

  Y = Xβ + Zu + ε,

where
- Y is an n × 1 vector of responses;
- X is an n × p design matrix;
- β is a p × 1 vector of "fixed" unknown parameters;
- Z is an n × q model matrix of known constants;
- u is a q × 1 random vector;
- ε is an n × 1 vector of random errors.
Linear mixed models
We typically assume that

  E(u) = 0, Var(u) = G,  E(ε) = 0, Var(ε) = R,

and that u and ε are uncorrelated. As a result,

  Var(Y) = Var(Zu) + Var(ε) = ZGZ^T + R.
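In the penicillin example, G = σ_γ² I₅ and R = σ² I₂₀, so Σ = ZGZ^T + R is block diagonal with the compound symmetric blocks from before; a minimal numpy sketch with illustrative values:

```python
import numpy as np

Z = np.kron(np.eye(5), np.ones((4, 1)))   # batch indicator matrix (20 x 5)
G = 2.0 * np.eye(5)                       # Var(u), sigma_gamma^2 = 2 (illustrative)
R = 1.0 * np.eye(20)                      # Var(eps), sigma^2 = 1 (illustrative)

Sigma = Z @ G @ Z.T + R                   # Var(Y) = Z G Z^T + R
print(Sigma[:4, :4])                      # one compound symmetric batch block
```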
Linear mixed models
In normal theory mixed models, we often assume that

  ⎡ u ⎤        ⎛ ⎡ 0 ⎤   ⎡ G  0 ⎤ ⎞
  ⎢   ⎥  ~  N ⎜ ⎢   ⎥ , ⎢      ⎥ ⎟ .
  ⎣ ε ⎦        ⎝ ⎣ 0 ⎦   ⎣ 0  R ⎦ ⎠

It then follows that

  Y ~ N(Xβ, Σ),

where Σ = ZGZ^T + R.
Example: dry-weight data
A study was conducted to compare two plant genotypes (genotypes 1 and 2). Suppose 10 seeds (5 of genotype 1 and 5
of genotype 2) were planted in a total of 4 pots. Suppose 3
genotype 1 seeds were planted in one pot, and the other 2
genotype 1 seeds were planted in another pot. The same
planting strategy was used when planting genotype 2 seeds in
the other two pots. The seeds germinated and emerged from
the soil as seedlings. After a four-week growing period, each
seedling was dried and weighed.
Example: dry-weight data
Let Y_ijk denote the weight for genotype i, pot j, seedling k.
- Consider the genotype effect and the pot effect. Which one should be a fixed effect and which one a random effect?
- Provide a linear mixed-effects model for the dry-weight data. Determine Y, X, β, Z, u, G, R, and Var(Y). (One possible setup is sketched below.)
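One natural answer (a sketch, not the only valid model): treat genotype as fixed, since these two specific genotypes are of interest, and pot as random, since the pots stand in for a population of pots. Assuming the seedlings are stacked pot by pot, the design matrices are:

```python
import numpy as np

# Pots 1-2 hold genotype 1 (3 and 2 seedlings), pots 3-4 hold genotype 2 (3 and 2)
pot_sizes = [3, 2, 3, 2]
genotype = np.repeat([1, 1, 2, 2], pot_sizes)   # genotype of each of the 10 seedlings
pot = np.repeat([1, 2, 3, 4], pot_sizes)        # pot of each seedling

# X: intercept plus genotype indicators (10 x 3); beta = (mu, alpha_1, alpha_2)^T
X = np.column_stack([np.ones(10), genotype == 1, genotype == 2]).astype(float)

# Z: pot indicators (10 x 4); u = (gamma_1, ..., gamma_4)^T
Z = (pot[:, None] == np.arange(1, 5)).astype(float)

# G = sigma_gamma^2 I_4, R = sigma^2 I_10 (values illustrative), Var(Y) = Z G Z^T + R
sigma2_gamma, sigma2 = 1.0, 0.5
Sigma = sigma2_gamma * (Z @ Z.T) + sigma2 * np.eye(10)
print(X.shape, Z.shape, Sigma.shape)
```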
Estimation of β
- For any estimable function C^Tβ, the ordinary least squares estimator (OLSE) of C^Tβ is C^T β̂_OLS, where

    β̂_OLS = (X^T X)⁻ X^T Y

  (see the sketch below).
- The OLSE is unbiased for C^Tβ. But the OLSE is not necessarily a best linear unbiased estimator (BLUE), because Var(Y) ≠ σ² I_n. It is also not UMVUE under normal theory, because X^T Y is no longer a complete sufficient statistic.
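A minimal numpy sketch of the OLSE of an estimable contrast on simulated penicillin-style data (design and variance values illustrative); np.linalg.pinv supplies a generalized inverse of X^T X:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.kron(np.ones((5, 1)), np.hstack([np.ones((4, 1)), np.eye(4)]))  # 20 x 5
Z = np.kron(np.eye(5), np.ones((4, 1)))                                # 20 x 5

beta = np.array([10.0, 1.0, -1.0, 0.5, -0.5])      # (mu, alpha_1, ..., alpha_4)
y = X @ beta + Z @ rng.normal(0, np.sqrt(2.0), 5) + rng.normal(0, 1.0, 20)

# Estimable contrast: alpha_1 - alpha_2
C = np.array([0.0, 1.0, -1.0, 0.0, 0.0])
olse = C @ np.linalg.pinv(X.T @ X) @ (X.T @ y)
print(olse)   # unbiased for alpha_1 - alpha_2 = 2, but not BLUE here
```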
Generalized LSE
If β is estimable, a generalized least squares estimator (GLSE) is the minimizer of the following objective function:

  β̂ = argmin_β (Y − Xβ)^T Σ⁻¹ (Y − Xβ).

Setting the gradient to zero yields the estimating equation for β,

  (X^T Σ⁻¹ X) β = X^T Σ⁻¹ Y.

The GLSE of β is then

  β̂ = (X^T Σ⁻¹ X)⁻¹ X^T Σ⁻¹ Y.

It can be shown that this GLSE is the BLUE.
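A companion sketch of the GLSE with Σ treated as known (σ_γ² = 2, σ² = 1, illustrative). Because this X is overparameterized, the sketch solves the estimating equation with a generalized inverse and reports the estimable contrast α₁ − α₂, as on the next slide:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.kron(np.ones((5, 1)), np.hstack([np.ones((4, 1)), np.eye(4)]))  # 20 x 5
Z = np.kron(np.eye(5), np.ones((4, 1)))                                # 20 x 5
Sigma = 2.0 * (Z @ Z.T) + np.eye(20)          # Z G Z^T + R with illustrative values

beta = np.array([10.0, 1.0, -1.0, 0.5, -0.5])
y = rng.multivariate_normal(X @ beta, Sigma)  # Y ~ N(X beta, Sigma)

Si_X = np.linalg.solve(Sigma, X)              # Sigma^{-1} X
beta_gls = np.linalg.pinv(X.T @ Si_X) @ (Si_X.T @ y)

C = np.array([0.0, 1.0, -1.0, 0.0, 0.0])      # estimable contrast alpha_1 - alpha_2
print(C @ beta_gls)                           # BLUE of alpha_1 - alpha_2
```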
Generalized LSE
- For any estimable function C^Tβ, the unique BLUE is

    C^T β̂ = C^T (X^T Σ⁻¹ X)⁻ X^T Σ⁻¹ Y.

- The variance of this estimator is

    Var(C^T β̂) = C^T (X^T Σ⁻¹ X)⁻ C.

- Asymptotically,

    C^T β̂ ~ N(C^Tβ, C^T (X^T Σ⁻¹ X)⁻ C).
Generalized LSE
- Recall that Σ = ZGZ^T + R, where G and R are typically unknown.
- We estimate G and R by replacing their unknown parameters with estimates, obtaining Ĝ and R̂.
- Finally, we estimate Σ by Σ̂ = ZĜZ^T + R̂.
- The estimation of G and R will be introduced soon. (A preview using standard software is sketched below.)
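As a preview, standard software estimates the variance components by (restricted) maximum likelihood. A hedged sketch using statsmodels' MixedLM on simulated penicillin-style data (the data frame and its column names are assumptions for illustration, not from the slides):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "process": np.tile(["A", "B", "C", "D"], 5),
    "batch": np.repeat(np.arange(5), 4),
})
alpha = {"A": 0.0, "B": 1.0, "C": -1.0, "D": 0.5}   # illustrative process effects
gamma = rng.normal(0, np.sqrt(2.0), 5)              # batch effects, sigma_gamma^2 = 2
df["yield_"] = (10 + df["process"].map(alpha)
                + gamma[df["batch"].to_numpy()]
                + rng.normal(0, 1.0, 20))           # sigma^2 = 1

# Random intercept per batch; REML estimates of sigma_gamma^2 and sigma^2
fit = smf.mixedlm("yield_ ~ process", df, groups=df["batch"]).fit(reml=True)
print(fit.cov_re)   # estimated Var(gamma_j), the parameter in G
print(fit.scale)    # estimated sigma^2, the parameter in R
```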