ANOVA Variance Component Estimation

Copyright © 2012 Dan Nettleton (Iowa State University), Statistics 611

We now consider the ANOVA approach to variance component estimation. The ANOVA approach is based on the method of moments.

Suppose $y = X\beta + Zu + e$, where $u$ and $e$ are independent random vectors satisfying $E(u) = 0$ and $E(e) = 0$.

Furthermore, suppose the $n \times q$ matrix $Z$ can be partitioned as

$$Z = [Z_1, \ldots, Z_m],$$

where $Z_j$ is $n \times q_j$, and $u$ can be partitioned correspondingly as

$$u = \begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix}$$

so that $Zu = \sum_{j=1}^{m} Z_j u_j$.

Suppose

$$\operatorname{Var}(u) = \operatorname{Var}\begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix} = \operatorname{diag}(\sigma_1^2 1'_{q_1}, \ldots, \sigma_m^2 1'_{q_m}) = \begin{bmatrix} \sigma_1^2 I_{q_1 \times q_1} & & 0 \\ & \ddots & \\ 0 & & \sigma_m^2 I_{q_m \times q_m} \end{bmatrix}.$$

Then

$$\operatorname{Var}(y) = \sum_{j=1}^{m+1} \sigma_j^2 Z_j Z_j',$$

where $Z_{m+1} \equiv I_{n \times n}$ and $\sigma_{m+1}^2 = \sigma_e^2$ (that is, $\operatorname{Var}(e) = \sigma_e^2 I$).

By Lemma 4.1,

$$\begin{aligned}
E(y'Ay) &= \operatorname{tr}(A \operatorname{Var}(y)) + [E(y)]' A E(y) \\
&= \sum_{j=1}^{m+1} \sigma_j^2 \operatorname{tr}(A Z_j Z_j') + \beta' X' A X \beta \\
&= \sum_{j=1}^{m+1} \sigma_j^2 \operatorname{tr}(Z_j' A Z_j) + \beta' X' A X \beta.
\end{aligned}$$

Now suppose we choose a set of matrices $A_1, \ldots, A_{m+1}$ such that $X' A_i X = 0 \ \forall\ i = 1, \ldots, m+1$. Then

$$E(y' A_i y) = \sum_{j=1}^{m+1} \sigma_j^2 \operatorname{tr}(Z_j' A_i Z_j) \quad \forall\ i = 1, \ldots, m+1.$$

If we use the method of moments, we replace $E(y' A_i y)$ with its observed value $y' A_i y$ to obtain the equations

$$y' A_i y = \sum_{j=1}^{m+1} \hat{\sigma}_j^2 \operatorname{tr}(Z_j' A_i Z_j) \quad \forall\ i = 1, \ldots, m+1.$$

We can write these equations in matrix form as

$$\begin{bmatrix}
\operatorname{tr}(Z_1' A_1 Z_1) & \cdots & \operatorname{tr}(Z_{m+1}' A_1 Z_{m+1}) \\
\operatorname{tr}(Z_1' A_2 Z_1) & \cdots & \operatorname{tr}(Z_{m+1}' A_2 Z_{m+1}) \\
\vdots & \ddots & \vdots \\
\operatorname{tr}(Z_1' A_{m+1} Z_1) & \cdots & \operatorname{tr}(Z_{m+1}' A_{m+1} Z_{m+1})
\end{bmatrix}
\begin{bmatrix} \hat{\sigma}_1^2 \\ \hat{\sigma}_2^2 \\ \vdots \\ \hat{\sigma}_{m+1}^2 \end{bmatrix}
=
\begin{bmatrix} y' A_1 y \\ y' A_2 y \\ \vdots \\ y' A_{m+1} y \end{bmatrix}.$$

Solving for $\hat{\sigma}_1^2, \hat{\sigma}_2^2, \ldots, \hat{\sigma}_{m+1}^2$ gives the ANOVA estimates.

The matrices $A_1, \ldots, A_{m+1}$ are usually chosen so that $y' A_1 y, \ldots, y' A_{m+1} y$ correspond to sums of squares from an ANOVA table.

Many strategies for choosing $A_1, \ldots, A_{m+1}$ have been proposed. The book Variance Components by Searle, Casella, and McCulloch contains an extensive discussion of several strategies.

Let $M$ denote the matrix whose $(i, j)$th element is $\operatorname{tr}(Z_j' A_i Z_j)$. Let $s = [y' A_1 y, y' A_2 y, \ldots, y' A_{m+1} y]'$.

If $M$ is nonsingular, then the vector of ANOVA variance component estimates is

$$\hat{\sigma}^2 \equiv \begin{bmatrix} \hat{\sigma}_1^2 \\ \vdots \\ \hat{\sigma}_{m+1}^2 \end{bmatrix} = M^{-1} s.$$

Recall $M$ and $s$ were chosen so that $M \sigma^2 = E(s)$, where

$$\sigma^2 = \begin{bmatrix} \sigma_1^2 \\ \vdots \\ \sigma_{m+1}^2 \end{bmatrix}.$$

Thus,

$$E(\hat{\sigma}^2) = E(M^{-1} s) = M^{-1} E(s) = M^{-1} M \sigma^2 = \sigma^2.$$

Therefore, the ANOVA estimator of $\sigma^2$ is unbiased.

Find ANOVA-based estimates of $\sigma_p^2$ and $\sigma_e^2$ for the seedling dry weight example.

Recall

$$X = \begin{bmatrix}
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & 0 & 1 \\
1 & 0 & 1 \\
1 & 0 & 1 \\
1 & 0 & 1
\end{bmatrix},
\qquad
Z = \begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1
\end{bmatrix}.$$

Because there is only one variance component besides the error variance, we have $m = 1$, $Z_1 = Z$, and $Z_2 = I$.

We need to identify matrices $A_1$ and $A_2$ such that $X' A_1 X = 0$ and $X' A_2 X = 0$.

Of course, we also require the quadratic forms $y' A_1 y$ and $y' A_2 y$ to contain information about $\sigma_p^2$ and $\sigma_e^2$.
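The general recipe described earlier (form $M$ with entries $\operatorname{tr}(Z_j' A_i Z_j)$, form $s$ with entries $y' A_i y$, and solve $M \hat{\sigma}^2 = s$) can be sketched in Python with NumPy. This is only a minimal illustration, not code from the course: the helper names `proj` and `anova_estimates`, the toy layout (3 groups of 2 observations with an intercept-only $X$), and the response values are all made up for the sketch.

```python
import numpy as np

def proj(W):
    """Orthogonal projection matrix onto the column space of W."""
    return W @ np.linalg.pinv(W)

def anova_estimates(y, X, Z_list, A_list, tol=1e-8):
    """Return (M, s, sigma2_hat) with M[i, j] = tr(Z_j' A_i Z_j), s[i] = y' A_i y."""
    n = len(y)
    Zs = list(Z_list) + [np.eye(n)]  # Z_{m+1} = I carries the error variance
    for A in A_list:
        # The fixed effects drop out of E(y' A y) only if X' A X = 0.
        if not np.allclose(X.T @ A @ X, 0.0, atol=tol):
            raise ValueError("each A_i must satisfy X' A_i X = 0")
    M = np.array([[np.trace(Z.T @ A @ Z) for Z in Zs] for A in A_list])
    s = np.array([y @ A @ y for A in A_list])
    return M, s, np.linalg.solve(M, s)  # requires M to be nonsingular

# Hypothetical toy layout: 3 groups of 2 observations, intercept-only X.
X = np.ones((6, 1))
Z = np.kron(np.eye(3), np.ones((2, 1)))       # group indicator columns
y = np.array([1.0, 3.0, 4.0, 6.0, 8.0, 8.0])  # made-up responses

A1 = proj(Z) - proj(X)    # between-group sum of squares
A2 = np.eye(6) - proj(Z)  # within-group (error) sum of squares
M, s, sigma2_hat = anova_estimates(y, X, [Z], [A1, A2])
print(M)
print(s)
print(sigma2_hat)
```

For this toy layout the solution reproduces the familiar one-way formulas: the error estimate is the within-group mean square, and the group variance estimate is (between mean square minus within mean square) divided by the group size.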
To find appropriate $A_1$ and $A_2$, start by writing down an ANOVA table with columns labeled Source, Matrix, and DF.

Which of the matrices in the ANOVA table satisfy $X' A X = 0$?

To apply the ANOVA variance component estimation method, we need to set up the equations

$$\begin{bmatrix}
\operatorname{tr}(Z_1' A_1 Z_1) & \operatorname{tr}(Z_2' A_1 Z_2) \\
\operatorname{tr}(Z_1' A_2 Z_1) & \operatorname{tr}(Z_2' A_2 Z_2)
\end{bmatrix}
\begin{bmatrix} \hat{\sigma}_p^2 \\ \hat{\sigma}_e^2 \end{bmatrix}
=
\begin{bmatrix} y' A_1 y \\ y' A_2 y \end{bmatrix}$$

$$\iff
\begin{bmatrix}
\operatorname{tr}(Z'(P_Z - P_X)Z) & \operatorname{tr}(P_Z - P_X) \\
\operatorname{tr}(Z'(I - P_Z)Z) & \operatorname{tr}(I - P_Z)
\end{bmatrix}
\begin{bmatrix} \hat{\sigma}_p^2 \\ \hat{\sigma}_e^2 \end{bmatrix}
=
\begin{bmatrix} y'(P_Z - P_X)y \\ y'(I - P_Z)y \end{bmatrix}.$$

Let $n_{ij}$ = number of seedlings in the $j$th pot for genotype $i$. Thus, we have $n_{11} = 3$, $n_{12} = 2$, $n_{21} = 3$, $n_{22} = 2$.

Write $y' A_1 y = y'(P_Z - P_X)y$ and $y' A_2 y = y'(I - P_Z)y$ using summation notation.

Find $\operatorname{tr}(Z'(P_Z - P_X)Z)$, $\operatorname{tr}(Z'(I - P_Z)Z)$, $\operatorname{tr}(P_Z - P_X)$, and $\operatorname{tr}(I - P_Z)$.

Find expressions for the ANOVA estimators of $\sigma_p^2$ and $\sigma_e^2$.

The ANOVA estimates of the variance components are sometimes equal to the REML estimates. This equality occurs for certain types of balanced designs when the ANOVA estimates are positive.

For the seedling dry weight example, the REML and ANOVA estimates agree when the latter are positive. However, if the last pot contained 3 seedlings instead of 2 (for example), the REML and ANOVA estimates would differ.
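Returning to the trace computations posed for the seedling example, the following NumPy sketch is one way to check them numerically. It rebuilds $X$ and $Z$ from the pot sizes $n_{11} = 3$, $n_{12} = 2$, $n_{21} = 3$, $n_{22} = 2$, takes $A_1 = P_Z - P_X$ and $A_2 = I - P_Z$ as in the equations above, and confirms that $X' A_i X = 0$. The helper names `proj` and `seedling_anova_estimates` are hypothetical, and no dry weight data are included because none appear in these slides.

```python
import numpy as np

def proj(W):
    """Orthogonal projection matrix onto the column space of W."""
    return W @ np.linalg.pinv(W)

pot_sizes = [3, 2, 3, 2]  # n11, n12, n21, n22
n = sum(pot_sizes)

# Z: one indicator column per pot.
Z = np.zeros((n, len(pot_sizes)))
row = 0
for k, size in enumerate(pot_sizes):
    Z[row:row + size, k] = 1.0
    row += size

# X: intercept plus one indicator column per genotype (5 seedlings each).
g1 = np.r_[np.ones(5), np.zeros(5)]
X = np.column_stack([np.ones(n), g1, 1.0 - g1])

P_X, P_Z = proj(X), proj(Z)
A1 = P_Z - P_X
A2 = np.eye(n) - P_Z

# Both candidate matrices remove the fixed effects: X' A_i X = 0.
fixed_effects_removed = all(
    np.allclose(X.T @ A @ X, 0.0, atol=1e-10) for A in (A1, A2)
)

# M[i, j] = tr(Z_j' A_i Z_j) with Z_1 = Z and Z_2 = I.
M = np.array([[np.trace(Z.T @ A1 @ Z), np.trace(A1)],
              [np.trace(Z.T @ A2 @ Z), np.trace(A2)]])

def seedling_anova_estimates(y):
    """Solve M [sigma_p^2, sigma_e^2]' = s for a length-10 response vector y."""
    s = np.array([y @ A1 @ y, y @ A2 @ y])
    return np.linalg.solve(M, s)

print(fixed_effects_removed)
print(np.round(M, 10))
```

Under these choices the matrix $M$ works out to rows $(4.8, 2)$ and $(0, 6)$, since $\operatorname{tr}(Z'Z) = 10$, $\operatorname{tr}(Z' P_X Z) = (9 + 4 + 9 + 4)/5 = 5.2$, $\operatorname{rank}(Z) = 4$, and $\operatorname{rank}(X) = 2$. Solving the system then gives $\hat{\sigma}_e^2 = y'(I - P_Z)y / 6$ and $\hat{\sigma}_p^2 = [\,y'(P_Z - P_X)y - 2\hat{\sigma}_e^2\,] / 4.8$.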