THE ANOVA APPROACH TO THE ANALYSIS OF LINEAR MIXED EFFECTS MODELS

We begin with a relatively simple special case. Suppose

y_{ijk} = \mu + \tau_i + u_{ij} + e_{ijk} \quad (i = 1, \ldots, t; \; j = 1, \ldots, n; \; k = 1, \ldots, m),

\beta = (\mu, \tau_1, \ldots, \tau_t)', \quad u = (u_{11}, u_{12}, \ldots, u_{tn})', \quad e = (e_{111}, e_{112}, \ldots, e_{tnm})',

where \beta \in \mathbb{R}^{t+1} is an unknown parameter vector,

\begin{bmatrix} u \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_u^2 I & 0 \\ 0 & \sigma_e^2 I \end{bmatrix} \right),

and \sigma_u^2, \sigma_e^2 \in \mathbb{R}^+ are unknown variance components.

© Copyright 2012 Iowa State University, Statistics 511

This is the standard model for a CRD with t treatments, n experimental units per treatment, and m observations per experimental unit. We can write the model as

y = X\beta + Zu + e, \quad \text{where } X = [1_{tnm \times 1}, \; I_{t \times t} \otimes 1_{nm \times 1}] \text{ and } Z = I_{tn \times tn} \otimes 1_{m \times 1}.

The ANOVA Table

Source                            DF          Sum of Squares
treatments                        t - 1       nm \sum_{i=1}^t (\bar{y}_{i..} - \bar{y}_{...})^2
exp.units(treatments)             t(n - 1)    m \sum_{i=1}^t \sum_{j=1}^n (\bar{y}_{ij.} - \bar{y}_{i..})^2
obs.units(exp.units, treatments)  tn(m - 1)   \sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (y_{ijk} - \bar{y}_{ij.})^2
c.total                           tnm - 1     \sum_{i=1}^t \sum_{j=1}^n \sum_{k=1}^m (y_{ijk} - \bar{y}_{...})^2

Each mean square is MS = SS/DF. Note that t(n - 1) = tn - t and tn(m - 1) = tnm - tn.
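The model and its Kronecker-structured design matrices can be illustrated with a short simulation. This is a sketch, not part of the notes: the dimensions, seed, and parameter values below are my own arbitrary illustration.

```python
import numpy as np

# Simulate one data set from the CRD-with-subsampling model
#   y = X beta + Z u + e,
# using the Kronecker forms X = [1, I_t (x) 1_nm] and Z = I_tn (x) 1_m.
t, n, m = 3, 4, 2                              # treatments, units/trt, obs/unit
rng = np.random.default_rng(0)

X = np.hstack([np.ones((t * n * m, 1)),
               np.kron(np.eye(t), np.ones((n * m, 1)))])   # [1, I_t (x) 1_nm]
Z = np.kron(np.eye(t * n), np.ones((m, 1)))                # I_tn (x) 1_m

beta = np.array([10.0, 1.0, 0.0, -1.0])        # (mu, tau_1, ..., tau_t), made up
sigma_u, sigma_e = 1.5, 0.8                    # sqrt of variance components, made up
u = rng.normal(0.0, sigma_u, size=t * n)       # one random effect per exp. unit
e = rng.normal(0.0, sigma_e, size=t * n * m)
y = X @ beta + Z @ u + e

print(X.shape, Z.shape, y.shape)
```

Each row of Z has exactly one 1, placing every observation with its experimental unit; the m rows sharing a column of Z share the same u_{ij}, which is what induces the within-unit correlation.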
Expected Mean Squares

E(MS_trt) = \frac{nm}{t-1} \sum_{i=1}^t E(\bar{y}_{i..} - \bar{y}_{...})^2
          = \frac{nm}{t-1} \sum_{i=1}^t E(\mu + \tau_i + \bar{u}_{i.} + \bar{e}_{i..} - \mu - \bar{\tau}_{.} - \bar{u}_{..} - \bar{e}_{...})^2
          = \frac{nm}{t-1} \sum_{i=1}^t E(\tau_i - \bar{\tau}_{.} + \bar{u}_{i.} - \bar{u}_{..} + \bar{e}_{i..} - \bar{e}_{...})^2
          = \frac{nm}{t-1} \sum_{i=1}^t \left[ (\tau_i - \bar{\tau}_{.})^2 + E(\bar{u}_{i.} - \bar{u}_{..})^2 + E(\bar{e}_{i..} - \bar{e}_{...})^2 \right]
          = \frac{nm}{t-1} \left[ \sum_{i=1}^t (\tau_i - \bar{\tau}_{.})^2 + E\left\{\sum_{i=1}^t (\bar{u}_{i.} - \bar{u}_{..})^2\right\} + E\left\{\sum_{i=1}^t (\bar{e}_{i..} - \bar{e}_{...})^2\right\} \right].

(The cross-product terms vanish because u and e are independent of each other and have mean zero.)

To simplify this expression further, note that

\bar{u}_{1.}, \ldots, \bar{u}_{t.} \overset{i.i.d.}{\sim} N\left(0, \frac{\sigma_u^2}{n}\right), \quad \text{so} \quad E\left\{\sum_{i=1}^t (\bar{u}_{i.} - \bar{u}_{..})^2\right\} = (t-1)\frac{\sigma_u^2}{n}.

Similarly,

\bar{e}_{1..}, \ldots, \bar{e}_{t..} \overset{i.i.d.}{\sim} N\left(0, \frac{\sigma_e^2}{nm}\right), \quad \text{so} \quad E\left\{\sum_{i=1}^t (\bar{e}_{i..} - \bar{e}_{...})^2\right\} = (t-1)\frac{\sigma_e^2}{nm}.

It follows that

E(MS_trt) = \frac{nm}{t-1} \sum_{i=1}^t (\tau_i - \bar{\tau}_{.})^2 + m\sigma_u^2 + \sigma_e^2.

Similar calculations allow us to add an Expected Mean Squares (EMS) column to our ANOVA table.

Source        EMS
trt           \sigma_e^2 + m\sigma_u^2 + \frac{nm}{t-1}\sum_{i=1}^t (\tau_i - \bar{\tau}_{.})^2
xu(trt)       \sigma_e^2 + m\sigma_u^2
ou(xu, trt)   \sigma_e^2

The entire table could also be derived using the matrices

X_1 = 1, \quad X_2 = I_{t \times t} \otimes 1_{nm \times 1}, \quad X_3 = I_{tn \times tn} \otimes 1_{m \times 1}.

With P_j denoting the orthogonal projection onto the column space of X_j:

Source        Sum of Squares    DF                       MS
trt           y'(P_2 - P_1)y    rank(X_2) - rank(X_1)    SS/DF
xu(trt)       y'(P_3 - P_2)y    rank(X_3) - rank(X_2)    SS/DF
ou(xu, trt)   y'(I - P_3)y      tnm - rank(X_3)          SS/DF
c.total       y'(I - P_1)y      tnm - 1

Expected Mean Squares (EMS) could be computed using

E(y'Ay) = tr(A\Sigma) + E(y)'A\,E(y),

where

\Sigma = Var(y) = ZGZ' + R = \sigma_u^2 \, I_{tn \times tn} \otimes 1_{m \times 1}1'_{m \times 1} + \sigma_e^2 \, I_{tnm \times tnm}

and

E(y) = (\mu + \tau_1, \mu + \tau_2, \ldots, \mu + \tau_t)' \otimes 1_{nm \times 1}.

Furthermore, it can be shown that

\frac{y'(P_2 - P_1)y}{\sigma_e^2 + m\sigma_u^2} \sim \chi^2_{t-1}\left( \frac{nm \sum_{i=1}^t (\tau_i - \bar{\tau}_{.})^2}{\sigma_e^2 + m\sigma_u^2} \right),

\frac{y'(P_3 - P_2)y}{\sigma_e^2 + m\sigma_u^2} \sim \chi^2_{tn-t}, \qquad \frac{y'(I - P_3)y}{\sigma_e^2} \sim \chi^2_{tnm-tn},

and that these three \chi^2 random variables are independent.

It follows that

F_1 = \frac{MS_trt}{MS_xu(trt)} \sim F_{t-1, \, tn-t}\left( \frac{nm \sum_{i=1}^t (\tau_i - \bar{\tau}_{.})^2}{\sigma_e^2 + m\sigma_u^2} \right)

and

F_2 = \frac{MS_xu(trt)}{MS_ou(xu,trt)} \sim \frac{\sigma_e^2 + m\sigma_u^2}{\sigma_e^2} \, F_{tn-t, \, tnm-tn}.

Thus, we can use F_1 to test H_0: \tau_1 = \cdots = \tau_t and F_2 to test H_0: \sigma_u^2 = 0.
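The ANOVA quantities above are straightforward to compute directly from a t × n × m data array. The following sketch (the helper name and the simulated data are my own illustration) computes the three mean squares and the two F statistics; the sums of squares add to the corrected total, as the table requires.

```python
import numpy as np

def anova_crd_subsampling(y):
    """y: array of shape (t, n, m). Returns (MS_trt, MS_xu, MS_ou, F1, F2)."""
    t, n, m = y.shape
    ybar_ij = y.mean(axis=2)              # experimental-unit means, (t, n)
    ybar_i = y.mean(axis=(1, 2))          # treatment means, (t,)
    ybar = y.mean()                       # grand mean
    ss_trt = n * m * np.sum((ybar_i - ybar) ** 2)
    ss_xu = m * np.sum((ybar_ij - ybar_i[:, None]) ** 2)
    ss_ou = np.sum((y - ybar_ij[:, :, None]) ** 2)
    ms_trt = ss_trt / (t - 1)
    ms_xu = ss_xu / (t * (n - 1))
    ms_ou = ss_ou / (t * n * (m - 1))
    return ms_trt, ms_xu, ms_ou, ms_trt / ms_xu, ms_xu / ms_ou

# Illustrative simulated data with all tau_i equal (so H0 is true):
rng = np.random.default_rng(1)
t, n, m = 3, 5, 4
y = 10.0 + rng.normal(0, 1.0, (t, n, 1)) + rng.normal(0, 0.5, (t, n, m))
ms_trt, ms_xu, ms_ou, F1, F2 = anova_crd_subsampling(y)
```

F1 is compared to an F(t−1, tn−t) distribution and F2 to F(tn−t, tnm−tn), exactly as in the result above.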
Estimating \sigma_u^2

Note that

E\left( \frac{MS_xu(trt) - MS_ou(xu,trt)}{m} \right) = \frac{(\sigma_e^2 + m\sigma_u^2) - \sigma_e^2}{m} = \sigma_u^2.

Thus,

\frac{MS_xu(trt) - MS_ou(xu,trt)}{m}

is an unbiased estimator of \sigma_u^2.

Although it is unbiased, this estimator can take negative values. This is undesirable because \sigma_u^2, the variance of the u random effects, cannot be negative.

As we have seen previously,

\Sigma \equiv Var(y) = \sigma_u^2 \, I_{tn \times tn} \otimes 1_{m \times 1}1'_{m \times 1} + \sigma_e^2 \, I_{tnm \times tnm}.

It turns out that

\hat{\beta}_\Sigma = (X'\Sigma^{-1}X)^- X'\Sigma^{-1}y = (X'X)^- X'y = \hat{\beta}.

Thus, the GLS estimator of any estimable C\beta is equal to the OLS estimator in this special case.

An Analysis Based on the Average for Each Experimental Unit

Recall that our model is

y_{ijk} = \mu + \tau_i + u_{ij} + e_{ijk} \quad (i = 1, \ldots, t; \; j = 1, \ldots, n; \; k = 1, \ldots, m).

The average of the observations for experimental unit ij is

\bar{y}_{ij.} = \mu + \tau_i + u_{ij} + \bar{e}_{ij.}.

If we define \epsilon_{ij} = u_{ij} + \bar{e}_{ij.} for all i, j and \sigma^2 = \sigma_u^2 + \sigma_e^2/m, we have

\bar{y}_{ij.} = \mu + \tau_i + \epsilon_{ij},

where the \epsilon_{ij} terms are i.i.d. N(0, \sigma^2). Thus, averaging the same number (m) of observations per experimental unit results in a normal theory Gauss-Markov linear model for the averages \{\bar{y}_{ij.} : i = 1, \ldots, t; \; j = 1, \ldots, n\}.

Inferences about estimable functions of \beta obtained by analyzing these averages are identical to the results obtained using the ANOVA approach, as long as the number of observations per experimental unit is the same for all experimental units. When using the averages as data, our estimate of \sigma^2 is an estimate of \sigma_u^2 + \sigma_e^2/m.
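The two estimation ideas above fit together neatly: from the mean squares one can estimate \sigma_u^2 (possibly negative) and \sigma_e^2 separately, while \sigma^2 = \sigma_u^2 + \sigma_e^2/m is estimated by MS_xu(trt)/m. A minimal sketch; the helper name and numeric values are my own illustration.

```python
def variance_components(ms_xu, ms_ou, m):
    """ANOVA (method-of-moments) estimates from the two mean squares."""
    sigma_e2 = ms_ou                      # unbiased for sigma_e^2
    sigma_u2 = (ms_xu - ms_ou) / m        # unbiased for sigma_u^2, can be negative
    sigma2 = ms_xu / m                    # equals sigma_u2 + sigma_e2 / m
    return sigma_u2, sigma_e2, sigma2

su2, se2, s2 = variance_components(ms_xu=6.0, ms_ou=2.0, m=4)
print(su2, se2, s2)                       # 1.0 2.0 1.5
```

If MS_xu(trt) happens to fall below MS_ou(xu,trt), the \sigma_u^2 estimate is negative; a common ad hoc fix is to truncate at zero, at the cost of introducing bias.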
We can't separately estimate \sigma_u^2 and \sigma_e^2 from the averages alone, but this doesn't matter if our focus is on inference for estimable functions of \beta.

Because

E(y) = (\mu + \tau_1, \mu + \tau_2, \ldots, \mu + \tau_t)' \otimes 1_{nm \times 1},

the only estimable quantities are linear combinations of the treatment means \mu + \tau_1, \mu + \tau_2, \ldots, \mu + \tau_t, whose Best Linear Unbiased Estimators are \bar{y}_{1..}, \bar{y}_{2..}, \ldots, \bar{y}_{t..}, respectively.

Thus, any estimable C\beta can always be written as

A (\mu + \tau_1, \mu + \tau_2, \ldots, \mu + \tau_t)'

for some matrix A. It follows that the BLUE of C\beta can be written as

A (\bar{y}_{1..}, \bar{y}_{2..}, \ldots, \bar{y}_{t..})'.

Now note that

Var(\bar{y}_{i..}) = Var(\mu + \tau_i + \bar{u}_{i.} + \bar{e}_{i..}) = Var(\bar{u}_{i.}) + Var(\bar{e}_{i..}) = \frac{\sigma_u^2}{n} + \frac{\sigma_e^2}{nm} = \frac{1}{n}\left(\sigma_u^2 + \frac{\sigma_e^2}{m}\right) = \frac{\sigma^2}{n}.

Thus,

Var\left[ (\bar{y}_{1..}, \bar{y}_{2..}, \ldots, \bar{y}_{t..})' \right] = \frac{\sigma^2}{n} I_{t \times t},

which implies that the variance of the BLUE of C\beta is

Var\left[ A (\bar{y}_{1..}, \ldots, \bar{y}_{t..})' \right] = A \, \frac{\sigma^2}{n} I_{t \times t} \, A' = \frac{\sigma^2}{n} AA'.

Thus, we don't need separate estimates of \sigma_u^2 and \sigma_e^2 to carry out inference for estimable C\beta. We do need to estimate \sigma^2 = \sigma_u^2 + \sigma_e^2/m. This can be estimated by MS_xu(trt)/m, or equivalently by the MSE in an analysis of the experimental unit means \{\bar{y}_{ij.} : i = 1, \ldots, t; \; j = 1, \ldots, n\}.

For example, suppose we want to estimate \tau_1 - \tau_2. The BLUE is \bar{y}_{1..} - \bar{y}_{2..}, whose variance is

Var(\bar{y}_{1..} - \bar{y}_{2..}) = Var(\bar{y}_{1..}) + Var(\bar{y}_{2..}) = \frac{2\sigma^2}{n} = 2\left(\frac{\sigma_u^2}{n} + \frac{\sigma_e^2}{mn}\right) = \frac{2}{mn}(\sigma_e^2 + m\sigma_u^2) = \frac{2}{mn} E(MS_xu(trt)).

Thus,

\widehat{Var}(\bar{y}_{1..} - \bar{y}_{2..}) = \frac{2\,MS_xu(trt)}{mn}.
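A sketch of this calculation: the BLUE of \tau_1 - \tau_2 and its standard error \sqrt{2\,MS_xu(trt)/(mn)}. All numeric values here are made up for illustration; a confidence interval would multiply the standard error by the appropriate t_{t(n-1)} quantile.

```python
import math

def contrast_se(ms_xu, m, n):
    """Standard error of ybar_1.. - ybar_2.. in the balanced design."""
    return math.sqrt(2.0 * ms_xu / (m * n))

ybar1, ybar2 = 12.4, 10.9        # treatment means (illustrative values)
ms_xu, m, n = 1.8, 2, 4          # MS_xu(trt) and design sizes (illustrative)

est = ybar1 - ybar2              # BLUE of tau_1 - tau_2
se = contrast_se(ms_xu, m, n)    # sqrt(2 * 1.8 / 8) = sqrt(0.45)
print(est, se)
```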
A 100(1 - \alpha)\% confidence interval for \tau_1 - \tau_2 is

\bar{y}_{1..} - \bar{y}_{2..} \pm t_{t(n-1), \, 1-\alpha/2} \sqrt{\frac{2\,MS_xu(trt)}{mn}}.

A test of H_0: \tau_1 = \tau_2 can be based on

t = \frac{\bar{y}_{1..} - \bar{y}_{2..}}{\sqrt{2\,MS_xu(trt)/(mn)}} \sim t_{t(n-1)}\left( \frac{\tau_1 - \tau_2}{\sqrt{2(\sigma_e^2 + m\sigma_u^2)/(mn)}} \right).

What if the number of observations per experimental unit is not the same for all experimental units? Let us look at two miniature examples to understand how this type of unbalancedness affects estimation and inference.

First Example

y = \begin{bmatrix} y_{111} \\ y_{121} \\ y_{211} \\ y_{212} \end{bmatrix}, \quad X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}.

Here treatment 1 has two experimental units with one observation each, while treatment 2 has one experimental unit with two observations. With X_1 = 1, X_2 = X, X_3 = Z:

MS_trt = y'(P_2 - P_1)y = 2(\bar{y}_{1..} - \bar{y}_{...})^2 + 2(\bar{y}_{2..} - \bar{y}_{...})^2 = (\bar{y}_{1..} - \bar{y}_{2..})^2,

MS_xu(trt) = y'(P_3 - P_2)y = (y_{111} - \bar{y}_{1..})^2 + (y_{121} - \bar{y}_{1..})^2 = \frac{1}{2}(y_{111} - y_{121})^2,

MS_ou(xu,trt) = y'(I - P_3)y = (y_{211} - \bar{y}_{2..})^2 + (y_{212} - \bar{y}_{2..})^2 = \frac{1}{2}(y_{211} - y_{212})^2.

E(MS_trt) = E(\bar{y}_{1..} - \bar{y}_{2..})^2 = E(\tau_1 - \tau_2 + \bar{u}_{1.} - u_{21} + \bar{e}_{1..} - \bar{e}_{2..})^2
          = (\tau_1 - \tau_2)^2 + Var(\bar{u}_{1.}) + Var(u_{21}) + Var(\bar{e}_{1..}) + Var(\bar{e}_{2..})
          = (\tau_1 - \tau_2)^2 + \frac{\sigma_u^2}{2} + \sigma_u^2 + \frac{\sigma_e^2}{2} + \frac{\sigma_e^2}{2}
          = (\tau_1 - \tau_2)^2 + 1.5\sigma_u^2 + \sigma_e^2.

E(MS_xu(trt)) = \frac{1}{2} E(y_{111} - y_{121})^2 = \frac{1}{2} E(u_{11} - u_{12} + e_{111} - e_{121})^2 = \frac{1}{2}(2\sigma_u^2 + 2\sigma_e^2) = \sigma_u^2 + \sigma_e^2.

E(MS_ou(xu,trt)) = \frac{1}{2} E(y_{211} - y_{212})^2 = \frac{1}{2} E(e_{211} - e_{212})^2 = \sigma_e^2.

Source        EMS
trt           (\tau_1 - \tau_2)^2 + 1.5\sigma_u^2 + \sigma_e^2
xu(trt)       \sigma_u^2 + \sigma_e^2
ou(xu, trt)   \sigma_e^2

F = \frac{MS_trt / (1.5\sigma_u^2 + \sigma_e^2)}{MS_xu(trt) / (\sigma_u^2 + \sigma_e^2)} \sim F_{1,1}\left( \frac{(\tau_1 - \tau_2)^2}{1.5\sigma_u^2 + \sigma_e^2} \right).

The test statistic that we used to test H_0: \tau_1 = \cdots = \tau_t in the balanced case is not F distributed in this unbalanced case.
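The projection-matrix identities in this first example can be checked numerically. The sketch below (data values are arbitrary) builds X_1, X_2, X_3, forms the projections, and verifies the three mean-square formulas; each degrees of freedom equals 1 here, so MS = SS.

```python
import numpy as np

def proj(M):
    """Orthogonal projection onto the column space of M."""
    return M @ np.linalg.pinv(M)

y = np.array([3.0, 5.0, 4.0, 6.0])       # (y_111, y_121, y_211, y_212), made up
X1 = np.ones((4, 1))
X2 = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])   # X
X3 = np.array([[1., 0., 0.], [0., 1., 0.],
               [0., 0., 1.], [0., 0., 1.]])               # Z
P1, P2, P3 = proj(X1), proj(X2), proj(X3)

ms_trt = y @ (P2 - P1) @ y                # equals (ybar_1.. - ybar_2..)^2
ms_xu = y @ (P3 - P2) @ y                 # equals (y_111 - y_121)^2 / 2
ms_ou = y @ (np.eye(4) - P3) @ y          # equals (y_211 - y_212)^2 / 2
print(ms_trt, ms_xu, ms_ou)
```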
In fact,

\frac{MS_trt}{MS_xu(trt)} \sim \frac{1.5\sigma_u^2 + \sigma_e^2}{\sigma_u^2 + \sigma_e^2} \, F_{1,1}\left( \frac{(\tau_1 - \tau_2)^2}{1.5\sigma_u^2 + \sigma_e^2} \right).

A Statistic with an Approximate F Distribution

We'd like our denominator to be an unbiased estimator of 1.5\sigma_u^2 + \sigma_e^2 in this case. Consider

1.5\,MS_xu(trt) - 0.5\,MS_ou(xu,trt).

Its expectation is 1.5(\sigma_u^2 + \sigma_e^2) - 0.5\sigma_e^2 = 1.5\sigma_u^2 + \sigma_e^2. The ratio

\frac{MS_trt}{1.5\,MS_xu(trt) - 0.5\,MS_ou(xu,trt)}

can be used as an approximate F statistic with 1 numerator degree of freedom and a denominator degrees of freedom obtained using the Cochran-Satterthwaite method.

The Cochran-Satterthwaite method will be explained in the next set of notes. We should not expect this approximate F-test to be reliable in this case because of our pitifully small dataset.

Best Linear Unbiased Estimates in this First Example

What do the BLUEs of the treatment means look like in this case? Writing the model in terms of the treatment means \mu_1 = \mu + \tau_1 and \mu_2 = \mu + \tau_2, recall

\beta = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}.

\Sigma = Var(y) = ZGZ' + R = \sigma_u^2 ZZ' + \sigma_e^2 I
       = \sigma_u^2 \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix} + \sigma_e^2 \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
       = \begin{bmatrix} \sigma_u^2 + \sigma_e^2 & 0 & 0 & 0 \\ 0 & \sigma_u^2 + \sigma_e^2 & 0 & 0 \\ 0 & 0 & \sigma_u^2 + \sigma_e^2 & \sigma_u^2 \\ 0 & 0 & \sigma_u^2 & \sigma_u^2 + \sigma_e^2 \end{bmatrix}.

It follows that

\hat{\beta}_\Sigma = (X'\Sigma^{-1}X)^- X'\Sigma^{-1}y = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \end{bmatrix} y = \begin{bmatrix} \bar{y}_{1..} \\ \bar{y}_{2..} \end{bmatrix}.

Fortunately, this is a linear estimator that does not depend on unknown variance components.
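A sketch of the approximate F statistic just described. The Cochran-Satterthwaite degrees-of-freedom formula used here is deferred to the next set of notes, so treat it as a preview: for a linear combination \sum a_i MS_i of independent mean squares, df \approx (\sum a_i MS_i)^2 / \sum (a_i MS_i)^2/df_i. The helper name and mean-square values are my own illustration.

```python
def satterthwaite_df(coefs, mean_squares, dfs):
    """Approximate df for a linear combination of independent mean squares."""
    lin_comb = sum(a * ms for a, ms in zip(coefs, mean_squares))
    return lin_comb ** 2 / sum((a * ms) ** 2 / df
                               for a, ms, df in zip(coefs, mean_squares, dfs))

ms_trt, ms_xu, ms_ou = 6.0, 2.0, 1.0       # illustrative mean squares
denom = 1.5 * ms_xu - 0.5 * ms_ou          # unbiased for 1.5*sigma_u^2 + sigma_e^2
F = ms_trt / denom                         # approximate F, 1 numerator df
df_denom = satterthwaite_df([1.5, -0.5], [ms_xu, ms_ou], [1, 1])
print(F, df_denom)
```

With both component degrees of freedom equal to 1, as in this miniature example, the approximate denominator df comes out below 1, underscoring why the test is unreliable here.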
Second Example

y = \begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{211} \end{bmatrix}, \quad X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.

In this case, it can be shown that

\hat{\beta}_\Sigma = (X'\Sigma^{-1}X)^- X'\Sigma^{-1}y
= \begin{bmatrix} \frac{\sigma_e^2 + \sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} & \frac{\sigma_e^2 + \sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} & \frac{\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{211} \end{bmatrix}
= \begin{bmatrix} \frac{2\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} \bar{y}_{11.} + \frac{\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} y_{121} \\ y_{211} \end{bmatrix}.

It is straightforward to show that the weights on \bar{y}_{11.} and y_{121} are

\frac{1/Var(\bar{y}_{11.})}{1/Var(\bar{y}_{11.}) + 1/Var(y_{121})} \quad \text{and} \quad \frac{1/Var(y_{121})}{1/Var(\bar{y}_{11.}) + 1/Var(y_{121})},

respectively. This is a special case of a more general phenomenon: the BLUE is a weighted average of independent linear unbiased estimators, with weights proportional to the inverse variances of those estimators.

Of course, in this case and in many others,

\hat{\beta}_\Sigma = \begin{bmatrix} \frac{2\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} \bar{y}_{11.} + \frac{\sigma_e^2 + 2\sigma_u^2}{3\sigma_e^2 + 4\sigma_u^2} y_{121} \\ y_{211} \end{bmatrix}

is not an estimator because it is a function of unknown parameters. Thus, we use \hat{\beta}_{\hat{\Sigma}} as our estimator; i.e., we replace \sigma_e^2 and \sigma_u^2 by estimates in the expression above.

\hat{\beta}_{\hat{\Sigma}} is an approximation to the BLUE. \hat{\beta}_{\hat{\Sigma}} is not even a linear estimator in this case. Its exact distribution is unknown. When sample sizes are large, it is reasonable to assume that the distribution of \hat{\beta}_{\hat{\Sigma}} is approximately the same as the distribution of \hat{\beta}_\Sigma.

Var(\hat{\beta}_\Sigma) = Var[(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y]
                       = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} \, Var(y) \, [(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}]'
                       = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\Sigma\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1}
                       = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1}
                       = (X'\Sigma^{-1}X)^{-1}.

Var(\hat{\beta}_{\hat{\Sigma}}) = Var[(X'\hat{\Sigma}^{-1}X)^{-1}X'\hat{\Sigma}^{-1}y] = ????
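For fixed (\sigma_u^2, \sigma_e^2), the second example can be checked numerically: compute \hat{\beta}_\Sigma = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y and confirm the inverse-variance weighting described above. The variance-component and data values below are arbitrary illustrations.

```python
import numpy as np

su2, se2 = 2.0, 1.0                        # sigma_u^2, sigma_e^2 (illustrative)
y = np.array([4.0, 6.0, 7.0, 3.0])         # (y_111, y_112, y_121, y_211), made up
X = np.array([[1., 0.], [1., 0.], [1., 0.], [0., 1.]])
Z = np.array([[1., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
Sigma = su2 * Z @ Z.T + se2 * np.eye(4)

XtSi = X.T @ np.linalg.inv(Sigma)
beta_gls = np.linalg.solve(XtSi @ X, XtSi @ y)    # beta_hat_Sigma

# Inverse-variance weighting of the two independent unbiased estimators of
# mu + tau_1: ybar_11. (variance sigma_u^2 + sigma_e^2/2) and y_121
# (variance sigma_u^2 + sigma_e^2).
ybar11 = y[:2].mean()
w1 = 1.0 / (su2 + se2 / 2)
w2 = 1.0 / (su2 + se2)
mu1_weighted = (w1 * ybar11 + w2 * y[2]) / (w1 + w2)
print(beta_gls, mu1_weighted)
```

The first component of `beta_gls` matches the inverse-variance weighted average, and the second is simply y_211, as the closed-form expression above says it should be.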
In practice, we use the approximation

Var(\hat{\beta}_{\hat{\Sigma}}) \approx (X'\hat{\Sigma}^{-1}X)^{-1}.

Summary of Main Points

Many of the concepts we have seen by examining special cases hold in greater generality. For many of the linear mixed models commonly used in practice, balanced data are nice because...

1. It is relatively easy to determine degrees of freedom, sums of squares, and expected mean squares in an ANOVA table.

2. Ratios of appropriate mean squares can be used to obtain exact F-tests.

3. For estimable C\beta, C\hat{\beta}_{\hat{\Sigma}} = C\hat{\beta} (OLS = GLS).

4. When Var(c'\hat{\beta}) = constant \times E(MS), exact inferences about c'\beta can be obtained by constructing t-tests or confidence intervals based on

   t = \frac{c'\hat{\beta} - c'\beta}{\sqrt{constant \times MS}} \sim t_{DF(MS)}.

5. Simple analysis based on experimental unit averages gives the same results as those obtained by linear mixed model analysis of the full data set.

When data are unbalanced, the analysis of linear mixed models may be considerably more complicated.

1. Approximate F-tests can be obtained by forming linear combinations of mean squares to serve as denominators of test statistics.

2. The estimator C\hat{\beta}_{\hat{\Sigma}} may be a nonlinear estimator of C\beta whose exact distribution is unknown.

3. Approximate inference for C\beta is often obtained by using the distribution of C\hat{\beta}_\Sigma, with unknowns in that distribution replaced by estimates.

Whether data are balanced or unbalanced, unbiased estimators of variance components can be obtained using linear combinations of mean squares from the ANOVA table.
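This last point amounts to solving a linear system: equate the observed mean squares to their expectations and solve for the variance components. A sketch for the subsampling model of these notes, where E(MS_xu) = \sigma_e^2 + m\sigma_u^2 and E(MS_ou) = \sigma_e^2; the mean-square values are my own illustration.

```python
import numpy as np

m = 4                                      # observations per experimental unit
ms = np.array([9.0, 1.0])                  # observed (MS_xu, MS_ou), illustrative

# EMS coefficient matrix: rows give the coefficients of (sigma_e^2, sigma_u^2)
# in E(MS_xu) and E(MS_ou), respectively.
A = np.array([[1.0, float(m)],
              [1.0, 0.0]])
sigma_e2_hat, sigma_u2_hat = np.linalg.solve(A, ms)
print(sigma_e2_hat, sigma_u2_hat)
```

The same recipe extends to designs with more random effects: one EMS equation per mean square, one unknown per variance component.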