Variance component estimation

Linear mixed models

In general, a linear mixed model may be represented as

$$Y = X\beta + Zu + \varepsilon,$$

where
- $Y$ is an $n \times 1$ vector of responses;
- $X$ is an $n \times p$ design matrix;
- $\beta$ is a $p \times 1$ vector of "fixed" unknown parameter values;
- $Z$ is an $n \times q$ model matrix of known constants;
- $u$ is a $q \times 1$ random vector;
- $\varepsilon$ is an $n \times 1$ random error.

We typically assume that

$$E(\varepsilon) = 0, \quad \mathrm{Var}(\varepsilon) = R, \quad E(u) = 0, \quad \mathrm{Var}(u) = G.$$

As a result, when $u$ and $\varepsilon$ are uncorrelated, $\mathrm{Var}(Y) = ZGZ^T + R$.

Generalized LSE

- For any estimable function $C^T\beta$, the unique BLUE is
  $$\widehat{C^T\beta} = C^T (X^T \Sigma^{-1} X)^{-} X^T \Sigma^{-1} Y.$$
- Note that $\Sigma = \mathrm{Var}(Y) = ZGZ^T + R$, and that $G$ and $R$ are typically unknown. They are functions of some unknown parameters, so we estimate $G$ and $R$ by replacing those parameters with their estimators.

Variance component estimation

Three basic methods:
- ANOVA methods (method of moments)
- Maximum likelihood (ML) method
- Restricted maximum likelihood (REML) method

ANOVA methods

- Step 1: Compute an ANOVA table.
- Step 2: Find the expectations of the mean squares.
- Step 3: Equate the mean squares to their expectations and solve the resulting equations.

Example: random blocks (penicillin production)

- Comparison of four processes for producing penicillin.
- The four processes A, B, C and D are the levels of a "fixed" treatment effect.
- The blocks are a random sample of five batches of raw material, corn steep liquor.
- Each batch is split into four parts. Each process is run on one part, and the order in which the processes are run is randomized within each batch.

Example continued

Let us consider the following random block effects model:

$$Y_{ij} = \mu + \alpha_i + \gamma_j + \varepsilon_{ij}, \quad i = 1, \dots, a; \; j = 1, \dots, b,$$

where $\gamma_j$ is the random block effect (batch effect), $\gamma_j \overset{iid}{\sim} N(0, \sigma_\gamma^2)$, $\varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$, and the $\gamma_j$'s are independent of the $\varepsilon_{ij}$'s.
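As a rough sketch of how the pieces above fit together, the following NumPy code builds $X$ and $Z$ for the random block model, forms $\Sigma = ZGZ^T + R$ with $G = \sigma_\gamma^2 I_b$ and $R = \sigma^2 I_n$, and evaluates the generalized least squares formula for one contrast. The function names `block_design` and `gls_estimate`, the numerical values of the variance components and of $\beta$, and the choice of the Moore-Penrose inverse as the generalized inverse are assumptions made for illustration. The data are simulated, not the penicillin measurements, and the variance components are treated as known here, which is exactly what is not available in practice.

```python
import numpy as np


def block_design(a, b):
    """Return X (n x (a+1)) and Z (n x b) for Y_ij = mu + alpha_i + gamma_j + eps_ij.

    Rows are ordered Y_11, Y_21, ..., Y_a1, Y_12, ..., Y_ab (i varies fastest).
    """
    X = np.kron(np.ones((b, 1)), np.hstack([np.ones((a, 1)), np.eye(a)]))
    Z = np.kron(np.eye(b), np.ones((a, 1)))
    return X, Z


def gls_estimate(C, X, Sigma, Y):
    """Evaluate C^T (X^T Sigma^{-1} X)^- X^T Sigma^{-1} Y.

    The Moore-Penrose inverse is used as the generalized inverse (an assumption;
    for an estimable C^T beta the result does not depend on this choice).
    """
    Sinv_X = np.linalg.solve(Sigma, X)   # Sigma^{-1} X
    A = X.T @ Sinv_X                     # X^T Sigma^{-1} X
    rhs = Sinv_X.T @ Y                   # X^T Sigma^{-1} Y, since Sigma is symmetric
    return C @ np.linalg.pinv(A) @ rhs


a, b = 4, 5                              # four processes, five batches
sigma2, sigma2_gamma = 1.0, 2.0          # hypothetical variance components
X, Z = block_design(a, b)
G = sigma2_gamma * np.eye(b)             # Var(u)
R = sigma2 * np.eye(a * b)               # Var(eps)
Sigma = Z @ G @ Z.T + R                  # Var(Y)

# Simulated data with hypothetical fixed effects (mu, alpha_1, ..., alpha_4).
rng = np.random.default_rng(0)
beta = np.array([10.0, 1.0, 0.0, -1.0, 0.0])
u = rng.normal(0.0, np.sqrt(sigma2_gamma), size=b)
eps = rng.normal(0.0, np.sqrt(sigma2), size=a * b)
Y = X @ beta + Z @ u + eps

C = np.array([0.0, 1.0, -1.0, 0.0, 0.0])  # estimable contrast alpha_1 - alpha_2
estimate = gls_estimate(C, X, Sigma, Y)
print(estimate)
```

In this balanced layout the estimate coincides with the difference between the sample means of the first two treatments, which gives a quick sanity check on the construction of $X$, $Z$ and $\Sigma$.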
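Variance covariance structure

If $a = 4$, the variance-covariance matrix of $Y_j = (Y_{1j}, Y_{2j}, Y_{3j}, Y_{4j})^T$ is

$$\mathrm{Var}(Y_j) = \begin{pmatrix}
\sigma_\gamma^2 + \sigma^2 & \sigma_\gamma^2 & \sigma_\gamma^2 & \sigma_\gamma^2 \\
\sigma_\gamma^2 & \sigma_\gamma^2 + \sigma^2 & \sigma_\gamma^2 & \sigma_\gamma^2 \\
\sigma_\gamma^2 & \sigma_\gamma^2 & \sigma_\gamma^2 + \sigma^2 & \sigma_\gamma^2 \\
\sigma_\gamma^2 & \sigma_\gamma^2 & \sigma_\gamma^2 & \sigma_\gamma^2 + \sigma^2
\end{pmatrix}.$$

This variance-covariance structure is a function of $\sigma^2$ and $\sigma_\gamma^2$. We need to estimate the unknown variance components $\sigma^2$ and $\sigma_\gamma^2$.

Random block effects model

The random block effects model can be written in matrix form as

$$\begin{pmatrix} Y_{11} \\ Y_{21} \\ \vdots \\ Y_{ab} \end{pmatrix}
= \begin{pmatrix} 1_a & I_a \\ \vdots & \vdots \\ 1_a & I_a \end{pmatrix}
\begin{pmatrix} \mu \\ \alpha_1 \\ \vdots \\ \alpha_a \end{pmatrix}
+ \begin{pmatrix} 1_a & 0 & \cdots & 0 \\ 0 & 1_a & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1_a \end{pmatrix}
\begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_b \end{pmatrix}
+ \begin{pmatrix} \varepsilon_{11} \\ \varepsilon_{21} \\ \vdots \\ \varepsilon_{ab} \end{pmatrix}.$$

Denote

$$X = \begin{pmatrix} 1_a & I_a \\ \vdots & \vdots \\ 1_a & I_a \end{pmatrix}
\quad \text{and} \quad
Z = \begin{pmatrix} 1_a & 0 & \cdots & 0 \\ 0 & 1_a & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1_a \end{pmatrix}.$$

ANOVA table

From what we learned in Chapter 1, the variation explained by the $\alpha_i$'s is

$$SSA = R(\alpha_i\text{'s} \mid \mu) = Y^T (P_X - P_{1_{ab}}) Y,$$

where $1_{ab} = (1_a^T, \dots, 1_a^T)^T$, $P_X = X (X^T X)^{-} X^T$ and $P_{1_{ab}} = 1_{ab} (1_{ab}^T 1_{ab})^{-} 1_{ab}^T$. Similarly, the variation explained by the $\gamma_j$'s is

$$SSB = R(\gamma_j\text{'s} \mid \mu, \alpha_i\text{'s}) = Y^T (P_{(X,Z)} - P_X) Y,$$

where $P_{(X,Z)}$ is the projection onto the column space of $(X, Z)$, and the error sum of squares is

$$SSE = Y^T (I - P_{(X,Z)}) Y.$$

ANOVA method

Step 1: The ANOVA table is

Source          Sum of Squares                                                                              DF
$\alpha_i$'s    $SSA = b \sum_{i=1}^{a} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$                          $a - 1$
$\gamma_j$'s    $SSB = a \sum_{j=1}^{b} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2$                         $b - 1$
error           $SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})^2$   $(a - 1)(b - 1)$

Step 2: We can show that

$$E(MSE) = \sigma^2,$$
$$E(MSB) = \sigma^2 + a\sigma_\gamma^2,$$
$$E(MSA) = \sigma^2 + \frac{b}{a-1} \sum_{i=1}^{a} (\alpha_i - \bar{\alpha})^2,$$

where $MSE = SSE/\{(a - 1)(b - 1)\}$, $MSA = SSA/(a - 1)$ and $MSB = SSB/(b - 1)$.

Step 3: Equate the mean squares with their expectations:

$$MSE = \sigma^2,$$
$$MSB = \sigma^2 + a\sigma_\gamma^2,$$
$$MSA = \sigma^2 + \frac{b}{a-1} \sum_{i=1}^{a} (\alpha_i - \bar{\alpha})^2.$$

Solving the first two equations gives the estimators of the variance components:

$$\hat{\sigma}^2 = MSE, \qquad \hat{\sigma}_\gamma^2 = (MSB - MSE)/a.$$

Example: Hierarchical (nested) random effects model

Analysis of sources of variation in a process used to monitor the production of pigment paste.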
The following sampling scheme is used:
- Sample $b$ barrels of pigment paste.
- Take $s$ samples from each barrel.
- Each sample is mixed and divided into $r$ parts. Each part is sent to the lab for determination of moisture content.

There are a total of $n = bsr$ observations.

Example continued

Measured response: moisture content of the pigment paste.

Problem: the variation in moisture content is too large, with an average moisture content of approximately 2.5% and a standard deviation of about 0.6%.

Goal: to identify the sources of variation in order to improve the pigment paste production.
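Returning to the random block model of the penicillin example, the three ANOVA steps (compute the sums of squares, use the expected mean squares, solve for the variance components) can be sketched in NumPy as follows. The function name `anova_variance_components` and the layout of the data as an $a \times b$ array (rows are treatments, columns are blocks) are assumptions for illustration, and the data in the usage lines are simulated with hypothetical parameter values rather than taken from the penicillin experiment.

```python
import numpy as np


def anova_variance_components(Y):
    """ANOVA (method-of-moments) estimates for Y_ij = mu + alpha_i + gamma_j + eps_ij.

    Y is an a x b array: row i is treatment i, column j is block j.
    """
    a, b = Y.shape
    grand = Y.mean()                    # overall mean
    trt_means = Y.mean(axis=1)          # treatment means, one per row
    blk_means = Y.mean(axis=0)          # block means, one per column

    # Step 1: sums of squares and mean squares.
    ssa = b * np.sum((trt_means - grand) ** 2)
    ssb = a * np.sum((blk_means - grand) ** 2)
    resid = Y - trt_means[:, None] - blk_means[None, :] + grand
    sse = np.sum(resid ** 2)
    msa = ssa / (a - 1)
    msb = ssb / (b - 1)
    mse = sse / ((a - 1) * (b - 1))

    # Steps 2 and 3: E(MSE) = sigma^2 and E(MSB) = sigma^2 + a * sigma_gamma^2,
    # so equating mean squares to expectations gives the estimates below.
    sigma2_hat = mse
    sigma2_gamma_hat = (msb - mse) / a  # not truncated; may come out negative
    return {
        "SSA": ssa, "SSB": ssb, "SSE": sse,
        "MSA": msa, "MSB": msb, "MSE": mse,
        "sigma2_hat": sigma2_hat,
        "sigma2_gamma_hat": sigma2_gamma_hat,
    }


# Usage on simulated data (hypothetical parameter values, a = 4, b = 5).
rng = np.random.default_rng(1)
alpha = np.array([1.0, 0.0, -1.0, 0.0])
gamma = rng.normal(0.0, np.sqrt(2.0), size=5)
sim = 10.0 + alpha[:, None] + gamma[None, :] + rng.normal(0.0, 1.0, size=(4, 5))
print(anova_variance_components(sim))
```

Because $\hat{\sigma}_\gamma^2$ is a difference of mean squares, it can be negative in a given sample; the sketch returns the raw moment estimate and leaves any truncation at zero to the user.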