Linear Mixed-Effects Models
Copyright 2012 Dan Nettleton (Iowa State University), Statistics 611

The Linear Mixed-Effects Model

    y = Xβ + Zu + e,

where

    X is an n × p design matrix of known constants,
    β ∈ R^p is an unknown parameter vector,
    Z is an n × q matrix of known constants,
    u is a q × 1 random vector, and
    e is an n × 1 vector of random errors.

The elements of β are considered to be non-random and are called "fixed effects." The elements of u are random variables and are called "random effects." The elements of the error vector e are always considered to be random variables.

Because the model includes both fixed and random effects (in addition to the random errors), it is called a "mixed-effects" model or, more simply, a "mixed" model. The model is called a "linear" mixed-effects model because (as we will soon see) E(y | u) = Xβ + Zu, a linear function of fixed and random effects.

We assume that

    E(e) = 0,  Var(e) = R,  E(u) = 0,  Var(u) = G,  Cov(e, u) = 0.

It follows that

    E(y) = E(Xβ + Zu + e) = Xβ + Z E(u) + E(e) = Xβ

and

    Var(y) = Var(Xβ + Zu + e) = Var(Zu + e) = Var(Zu) + Var(e)
           = Z Var(u) Z′ + R = ZGZ′ + R ≡ Σ.

We usually consider the special case in which

    [ u ]      ( [ 0 ]   [ G  0 ] )
    [ e ]  ~  N( [ 0 ] , [ 0  R ] )   ⇒   y ~ N(Xβ, ZGZ′ + R).

The conditional moments, given the random effects, are

    E(y | u) = Xβ + Zu   and   Var(y | u) = R.

Example 1

Suppose a study was conducted to compare two plant genotypes (genotype 1 and genotype 2). Suppose 10 seeds (5 of genotype 1 and 5 of genotype 2) were planted in a total of 4 pots: 3 genotype 1 seeds in one pot and the other 2 genotype 1 seeds in another pot, with the same planting strategy used for the genotype 2 seeds in the other two pots. The seeds germinated and emerged from the soil as seedlings. After a four-week growing period, each seedling was dried and weighed. Let y_ijk denote the weight for genotype i, pot j, seedling k. Provide a linear mixed-effects model for the dry-weight data. Determine y, X, β, Z, u, G, R, and Var(y).

Consider the model

    y_ijk = μ + γ_i + p_ij + e_ijk,

where p_11, p_12, p_21, p_22 are i.i.d. N(0, σ²_p), independent of the e_ijk terms, which are assumed to be i.i.d. N(0, σ²_e). This model can be written in the form y = Xβ + Zu + e, where

    y = [y_111, y_112, y_113, y_121, y_122, y_211, y_212, y_213, y_221, y_222]′,
    β = [μ, γ_1, γ_2]′,

        [ 1 1 0 ]        [ 1 0 0 0 ]
        [ 1 1 0 ]        [ 1 0 0 0 ]
        [ 1 1 0 ]        [ 1 0 0 0 ]
        [ 1 1 0 ]        [ 0 1 0 0 ]
    X = [ 1 1 0 ],   Z = [ 0 1 0 0 ],   u = [p_11, p_12, p_21, p_22]′,
        [ 1 0 1 ]        [ 0 0 1 0 ]
        [ 1 0 1 ]        [ 0 0 1 0 ]
        [ 1 0 1 ]        [ 0 0 1 0 ]
        [ 1 0 1 ]        [ 0 0 0 1 ]
        [ 1 0 1 ]        [ 0 0 0 1 ]

and e = [e_111, e_112, e_113, e_121, e_122, e_211, e_212, e_213, e_221, e_222]′. Then

    G = Var(u) = Var([p_11, p_12, p_21, p_22]′) = σ²_p I_{4×4},
    R = Var(e) = σ²_e I_{10×10},
    Var(y) = ZGZ′ + R = Z(σ²_p I)Z′ + σ²_e I = σ²_p ZZ′ + σ²_e I.
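As a concrete check of these formulas, the following minimal numpy sketch assembles X, Z, G, and R for Example 1 and computes Var(y) = ZGZ′ + R. The numerical values chosen for σ²_p and σ²_e are arbitrary illustrative assumptions, not part of the example.

```python
import numpy as np

# Example 1: 10 seedlings, 2 genotypes, 4 pots (3 + 2 seedlings per genotype).
X = np.column_stack([
    np.ones(10),                   # intercept column (mu)
    np.repeat([1.0, 0.0], 5),      # genotype 1 indicator (gamma_1)
    np.repeat([0.0, 1.0], 5),      # genotype 2 indicator (gamma_2)
])

pot = np.repeat([0, 1, 2, 3], [3, 2, 3, 2])  # pot membership of each seedling
Z = np.eye(4)[pot]                           # 10 x 4 pot-indicator matrix

sigma2_p, sigma2_e = 0.5, 1.0                # illustrative values (assumed)
G = sigma2_p * np.eye(4)                     # Var(u)
R = sigma2_e * np.eye(10)                    # Var(e)
Sigma = Z @ G @ Z.T + R                      # Var(y) = Z G Z' + R

print(Sigma[:3, :3])   # first pot's block: sigma2_p + sigma2_e on the
                       # diagonal, sigma2_p off the diagonal
```

Indexing the rows of a 4 × 4 identity matrix by pot membership is a compact way to build the indicator matrix Z.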
Explicitly,

          [ 1 1 1 0 0 0 0 0 0 0 ]
          [ 1 1 1 0 0 0 0 0 0 0 ]
          [ 1 1 1 0 0 0 0 0 0 0 ]
          [ 0 0 0 1 1 0 0 0 0 0 ]
    ZZ′ = [ 0 0 0 1 1 0 0 0 0 0 ]
          [ 0 0 0 0 0 1 1 1 0 0 ]
          [ 0 0 0 0 0 1 1 1 0 0 ]
          [ 0 0 0 0 0 1 1 1 0 0 ]
          [ 0 0 0 0 0 0 0 0 1 1 ]
          [ 0 0 0 0 0 0 0 0 1 1 ]

Thus, Var(y) = σ²_p ZZ′ + σ²_e I is a block diagonal matrix. The first block is

    Var([y_111, y_112, y_113]′) = [ σ²_p + σ²_e    σ²_p           σ²_p        ]
                                  [ σ²_p           σ²_p + σ²_e    σ²_p        ]
                                  [ σ²_p           σ²_p           σ²_p + σ²_e ].

Note that

    Var(y_ijk) = σ²_p + σ²_e             ∀ i, j, k,
    Cov(y_ijk, y_ijk*) = σ²_p            ∀ i, j, and k ≠ k*,
    Cov(y_ijk, y_i*j*k*) = 0             if i ≠ i* or j ≠ j*.

Any two observations from the same pot have covariance σ²_p. Any two observations from different pots are uncorrelated.

Note that Var(y) may be written as σ²_e V, where V is a block diagonal matrix with blocks of the form

    [ 1 + σ²_p/σ²_e    σ²_p/σ²_e        ...   σ²_p/σ²_e       ]
    [ σ²_p/σ²_e        1 + σ²_p/σ²_e    ...   σ²_p/σ²_e       ]
    [     :                 :                      :          ]
    [ σ²_p/σ²_e        σ²_p/σ²_e        ...   1 + σ²_p/σ²_e   ].

Thus, if σ²_p/σ²_e were known, we would have the Aitken model

    y = Xβ + ε,  where  ε = Zu + e ~ N(0, σ²V),  σ² ≡ σ²_e,

and we would use GLS to estimate any estimable Cβ by

    Cβ̂_GLS = C(X′V⁻¹X)⁻X′V⁻¹y.

However, we seldom know σ²_p/σ²_e or, more generally, Σ or V. For the general problem where Var(y) = Σ is an unknown positive definite matrix, we can rewrite Σ as σ²V, where σ² is an unknown positive variance and V is an unknown positive definite matrix. As in our simple example, each entry of V is usually assumed to be a known function of a few unknown parameters.

Thus, our strategy for estimating an estimable Cβ involves estimating the unknown parameters in V to obtain the estimated GLS (EGLS) estimator

    Cβ̂_EGLS = C(X′V̂⁻¹X)⁻X′V̂⁻¹y.

In general, Cβ̂_EGLS is a nonlinear estimator that is an approximation to Cβ̂_GLS = C(X′V⁻¹X)⁻X′V⁻¹y, which would be the BLUE of Cβ if V were known.

In special cases, Cβ̂_EGLS may be a linear estimator. For example, if there exists a matrix Q such that VX = XQ, then we know that Cβ̂_GLS = Cβ̂ and Cβ̂_EGLS = Cβ̂, where β̂ denotes the ordinary least squares estimator; this is a linear estimator of Cβ.

However, even for our simple example involving seedling dry weight, Cβ̂_EGLS is a nonlinear estimator of Cβ for

    C = [1, 1, 0]   ⇔  Cβ = μ + γ_1,
    C = [1, 0, 1]   ⇔  Cβ = μ + γ_2,  and
    C = [0, 1, −1]  ⇔  Cβ = γ_1 − γ_2.

Confidence intervals and tests for these estimable functions are not exact.
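To make the plug-in character of EGLS concrete, here is a minimal numpy sketch for a single-ratio model like Example 1, where V = I + (σ²_p/σ²_e)ZZ′. The function name egls and the argument rho_hat, an estimate of σ²_p/σ²_e that would in practice come from ML or REML estimation of the variance components, are illustrative assumptions rather than notation from the text.

```python
import numpy as np

def egls(C, X, Z, y, rho_hat):
    """Estimated GLS for C beta, plugging in rho_hat for sigma2_p / sigma2_e.

    V-hat = I + rho_hat * Z Z'; a Moore-Penrose pseudoinverse stands in for
    the generalized inverse (X' V-hat^{-1} X)^-.
    """
    W = np.linalg.inv(np.eye(len(y)) + rho_hat * Z @ Z.T)  # V-hat^{-1}
    return C @ np.linalg.pinv(X.T @ W @ X) @ X.T @ W @ y

# e.g., estimate gamma_1 - gamma_2 with C = [0, 1, -1]:
# egls(np.array([0.0, 1.0, -1.0]), X, Z, y, rho_hat=0.4)
```

Because rho_hat would itself be computed from y, the composite estimator is nonlinear in y, which is precisely why the associated intervals and tests are only approximate.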
In our simple example involving seedling dry weight, there was only one random factor (pot). When there are m random factors, we can partition Z and u as

    Z = [Z_1, ..., Z_m]   and   u = [u_1′, ..., u_m′]′,

where u_j is the vector of random effects associated with factor j (j = 1, ..., m). We can then write Zu as

    Zu = [Z_1, ..., Z_m][u_1′, ..., u_m′]′ = Σ_{j=1}^m Z_j u_j.

We often assume that all random effects (including random errors) are mutually independent and that the random effects associated with the jth random factor have variance σ²_j (j = 1, ..., m). Under these assumptions,

    Var(y) = ZGZ′ + R = Σ_{j=1}^m σ²_j Z_j Z_j′ + σ²_e I.

Example 2

Consider an experiment involving 4 litters of 4 animals each. Suppose 4 treatments are randomly assigned to the 4 animals in each litter. Suppose we obtain two replicate muscle samples from each animal and measure the response of interest for each muscle sample.

Let y_ijk denote the kth measure of the response for the animal from litter j that received treatment i (i = 1, 2, 3, 4; j = 1, 2, 3, 4; k = 1, 2). Suppose

    y_ijk = μ + τ_i + ℓ_j + a_ij + e_ijk,

where β = [μ, τ_1, τ_2, τ_3, τ_4]′ ∈ R⁵ is an unknown vector of fixed parameters, u = [ℓ_1, ℓ_2, ℓ_3, ℓ_4, a_11, a_21, a_31, a_41, a_12, ..., a_34, a_44]′ is a vector of random effects, and e = [e_111, e_112, e_211, e_212, ..., e_411, e_412, ..., e_441, e_442]′ is a vector of random errors. With y = [y_111, y_112, y_211, y_212, ..., y_411, y_412, ..., y_441, y_442]′, we can write the model as a linear mixed-effects model y = Xβ + Zu + e, where the first 8 rows of X are

    [ 1 1 0 0 0 ]
    [ 1 1 0 0 0 ]
    [ 1 0 1 0 0 ]
    [ 1 0 1 0 0 ]
    [ 1 0 0 1 0 ]
    [ 1 0 0 1 0 ]
    [ 1 0 0 0 1 ]
    [ 1 0 0 0 1 ]

with this 8 × 5 block repeated three more times (once for each remaining litter), and Z is the 32 × 20 matrix whose first 4 columns indicate litter membership (each column contains eight 1s, one for each observation in the corresponding litter) and whose remaining 16 columns indicate animal membership (each column contains two 1s, one for each replicate sample from the corresponding animal).

We can write less and be more precise using Kronecker product notation:

    X = 1_{4×1} ⊗ [1_{8×1}, I_{4×4} ⊗ 1_{2×1}],
    Z = [I_{4×4} ⊗ 1_{8×1}, I_{16×16} ⊗ 1_{2×1}].

In this experiment, we have two random factors: litter and animal. We can partition our random effects vector u into a vector of litter effects and a vector of animal effects:

    u = [ ℓ ],   where   ℓ = [ℓ_1, ℓ_2, ℓ_3, ℓ_4]′   and   a = [a_11, a_21, a_31, a_41, a_12, ..., a_44]′.
        [ a ]

We make the usual assumption that

    u = [ ℓ ]  ~  N( [ 0 ],  [ σ²_ℓ I     0    ] )
        [ a ]        [ 0 ]   [    0    σ²_a I  ],

where σ²_ℓ, σ²_a ∈ R⁺ are unknown parameters.

We can partition

    Z = [I_{4×4} ⊗ 1_{8×1}, I_{16×16} ⊗ 1_{2×1}] = [Z_ℓ, Z_a].

We have

    Zu = [Z_ℓ, Z_a][ℓ′, a′]′ = Z_ℓ ℓ + Z_a a

and

    Var(Zu) = ZGZ′ = [Z_ℓ, Z_a] [ σ²_ℓ I     0    ] [ Z_ℓ′ ]
                                [    0    σ²_a I  ] [ Z_a′ ]
            = Z_ℓ(σ²_ℓ I)Z_ℓ′ + Z_a(σ²_a I)Z_a′
            = σ²_ℓ Z_ℓ Z_ℓ′ + σ²_a Z_a Z_a′
            = σ²_ℓ (I_{4×4} ⊗ 11′_{8×8}) + σ²_a (I_{16×16} ⊗ 11′_{2×2}).
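The Kronecker expressions translate directly into code. The following minimal numpy sketch (assuming the observation ordering given above) builds X, Z_ℓ, and Z_a for Example 2 and verifies the Kronecker form of Var(Zu); the values of σ²_ℓ and σ²_a are arbitrary illustrative assumptions.

```python
import numpy as np

def ones(k):
    return np.ones((k, 1))

# X = 1_{4x1} (x) [1_{8x1}, I_{4x4} (x) 1_{2x1}]            (32 x 5)
X = np.kron(ones(4), np.hstack([ones(8), np.kron(np.eye(4), ones(2))]))

# Z = [Z_litter, Z_animal] = [I_4 (x) 1_8, I_16 (x) 1_2]    (32 x 20)
Z_l = np.kron(np.eye(4), ones(8))
Z_a = np.kron(np.eye(16), ones(2))

s2_l, s2_a = 2.0, 1.0                       # illustrative values (assumed)
V_Zu = s2_l * Z_l @ Z_l.T + s2_a * Z_a @ Z_a.T

# The same matrix via the Kronecker identities for Z_l Z_l' and Z_a Z_a':
V_kron = s2_l * np.kron(np.eye(4), np.ones((8, 8))) \
       + s2_a * np.kron(np.eye(16), np.ones((2, 2)))
print(np.allclose(V_Zu, V_kron))            # True
```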
We usually assume that all random effects and random errors are mutually independent and that the errors (like the effects within each factor) are identically distributed:

    [ ℓ ]       ( [ 0 ]   [ σ²_ℓ I     0        0    ] )
    [ a ]  ~  N ( [ 0 ] , [    0    σ²_a I      0    ] )
    [ e ]       ( [ 0 ]   [    0       0     σ²_e I  ] ).

The unknown variance parameters σ²_ℓ, σ²_a, σ²_e ∈ R⁺ are called variance components.

In this case, we have R = Var(e) = σ²_e I. Thus,

    Var(y) = ZGZ′ + R = σ²_ℓ Z_ℓ Z_ℓ′ + σ²_a Z_a Z_a′ + σ²_e I.

This is a block diagonal matrix with blocks as follows (writing ℓ for σ²_ℓ, a for σ²_a, and e for σ²_e to keep the display compact):

    [ ℓ+a+e  ℓ+a    ℓ      ℓ      ℓ      ℓ      ℓ      ℓ     ]
    [ ℓ+a    ℓ+a+e  ℓ      ℓ      ℓ      ℓ      ℓ      ℓ     ]
    [ ℓ      ℓ      ℓ+a+e  ℓ+a    ℓ      ℓ      ℓ      ℓ     ]
    [ ℓ      ℓ      ℓ+a    ℓ+a+e  ℓ      ℓ      ℓ      ℓ     ]
    [ ℓ      ℓ      ℓ      ℓ      ℓ+a+e  ℓ+a    ℓ      ℓ     ]
    [ ℓ      ℓ      ℓ      ℓ      ℓ+a    ℓ+a+e  ℓ      ℓ     ]
    [ ℓ      ℓ      ℓ      ℓ      ℓ      ℓ      ℓ+a+e  ℓ+a   ]
    [ ℓ      ℓ      ℓ      ℓ      ℓ      ℓ      ℓ+a    ℓ+a+e ]
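As a final check, this block can be reproduced numerically. The following minimal sketch reuses the Kronecker construction of Z_ℓ and Z_a; the variance-component values are again arbitrary illustrative assumptions.

```python
import numpy as np

l, a, e = 2.0, 1.0, 0.5          # sigma2_l, sigma2_a, sigma2_e (assumed)
Z_l = np.kron(np.eye(4), np.ones((8, 1)))
Z_a = np.kron(np.eye(16), np.ones((2, 1)))
V = l * Z_l @ Z_l.T + a * Z_a @ Z_a.T + e * np.eye(32)

print(V[:8, :8])   # first litter's block: l+a+e on the diagonal, l+a between
                   # the two samples from the same animal, l elsewhere
```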