Orthogonal Linear Combinations, Contrasts, and Additional Partitioning of ANOVA Sums of Squares c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 1 / 27 Orthogonal Linear Combinations Under the model y = Xβ + , ∼ N(0, σ 2 I), two estimable linear combinations c01 β and c02 β are orthogonal if and only if their best linear unbiased estimators c01 β̂ and c02 β̂ are uncorrelated. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 2 / 27 Orthogonal Linear Combinations Recall c0k β estimable ⇐⇒ ∃ ak 3 c0k = a0k X. Cov(c01 β̂, c02 β̂) = Cov(a01 Xβ̂, a02 Xβ̂) = Cov(a01 PX y, a02 PX y) = a01 PX Cov(y, y)P0X a2 = a01 PX Var(y)P0X a2 = a01 PX (σ 2 I)P0X a2 = σ 2 a01 PX PX a2 = σ 2 a01 PX a2 = σ 2 a01 X(X0 X)− X0 a2 = σ 2 c01 (X0 X)− c2 . Thus, estimable linear combinations c01 β and c02 β are orthogonal if and only if c01 (X0 X)− c2 = 0. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 3 / 27 Orthogonal Contrasts A linear combinations c0 β is a contrast if and only if c0 1 = 0. Two estimable contrasts c01 β and c02 β that are orthogonal are called orthogonal contrasts. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 4 / 27 Suppose c01 β, . . . , c0q β are pairwise orthogonal linear combinations. Let C0 = [c1 , . . . , cq ]. Then c01 (X0 X)− c1 c01 (X0 X)− c2 c02 (X0 X)− c1 c02 (X0 X)− c2 = .. .. . . 0 0 0 − 0 cq (X X) c1 cq (X X)− c2 0 0 − c1 (X X) c1 0 0 0 0 c2 (X X)− c2 = .. .. . . C(X0 X)− C0 0 c Copyright 2016 Dan Nettleton (Iowa State University) 0 · · · c01 (X0 X)− cq · · · c02 (X0 X)− cq .. .. . . 0 · · · c0q (X X)− cq ··· 0 ··· 0 . .. ... . ··· c0q (X0 X)− cq Statistics 510 5 / 27 When C has rank q, it follows that the sum of squares 0 0 0 − 0 −1 β̂ C [C(X X) C ] Cβ̂ = q X 0 β̂ ck [c0k (X0 X)− ck ]−1 c0k β̂ k=1 = q X (c0k β̂)2 /c0k (X0 X)− ck . k=1 0 Thus, the sum of squares β̂ C0 [C(X0 X)− C0 ]−1 Cβ̂ with q degrees of freedom can be partitioned into q single-degree-of-freedom sums of squares (c01 β̂)2 /c01 (X0 X)− c1 , . . . , (c0q β̂)2 /c0q (X0 X)− cq , corresponding to orthogonal linear combinations. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 6 / 27 Example: Balanced Two-Factor Diet-Drug Experiment y111 y112 y121 y122 y131 y132 y211 y212 y221 y222 y231 y232 = 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 y = Xβ + , c Copyright 2016 Dan Nettleton (Iowa State University) 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 µ11 µ12 µ13 µ21 µ22 µ23 + 111 112 121 122 131 132 211 212 221 222 231 232 ∼ N(0, σ 2 I) Statistics 510 7 / 27 Example: Balanced Two-Factor Diet-Drug Experiment X0 X = 2 0 0 0 0 0 0 2 0 0 0 0 0 0 2 0 0 0 0 0 0 2 0 0 0 0 0 0 2 0 0 0 0 0 0 2 1 2 = 0 0 0 0 0 (X0 X)−1 0 1 2 0 0 0 0 0 0 1 2 0 0 0 0 0 0 1 2 0 0 0 0 0 0 0 1 2 0 0 0 0 0 1 2 Thus, in this case, c01 (X0 X)− c2 = c01 c2 /2 so that linear combinations c01 β and c02 β are orthogonal if and only if c01 c2 = 0. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 8 / 27 Example: Balanced Two-Factor Diet-Drug Experiment It follows that c01 β = [ 1, 1, 1, −1, −1, −1]β c02 β = [ 1, −1, 0, 1, −1, 0]β c03 β = [ 1, 1, −2, 1, 1, −2]β c04 β = [ 1, −1, 0, −1, 1, 0]β c05 β = [ 1, 1, −2, −1, −1, 2]β comprise a set of pairwise orthogonal contrasts. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 9 / 27 Connection to the ANOVA Table Source Sum of Squares Diets y0 (P2 − P1 )y 0 DF 2−1=1 Drugs y (P3 − P2 )y 4−2=2 Diets × Drugs y0 (P4 − P3 )y 6−4=2 Error y0 (I − P4 )y 12 − 6 = 6 C. Total y0 (I − P1 )y 12 − 1 = 11 c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 10 / 27 Connection to the ANOVA Table SS DF y0 (P2 − P1 )y = (c01 β̂)2 /c01 (X0 X)− c1 1 y0 (P3 − P2 )y = (c02 β̂)2 /c02 (X0 X)− c2 + (c03 β̂)2 /c03 (X0 X)− c3 2 y0 (P4 − P3 )y = (c04 β̂)2 /c04 (X0 X)− c4 + (c05 β̂)2 /c05 (X0 X)− c5 2 y0 (I − P4 )y 6 y0 (I − P1 )y 11 c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 11 / 27 ANOVA Table with Additional Partitioning Source Diet Drug 1 − Drug 2 (Drug 1 + Drug 2)/2 − Drug 3 Diet × (Drug 1 − Drug 2) Diet × [(Drug 1 + Drug 2)/2 − Drug 3] Error C.Total c Copyright 2016 Dan Nettleton (Iowa State University) SS DF 0 0 2 0 − (c1 β̂) /c1 (X X) c1 1 (c02 β̂)2 /c02 (X0 X)− c2 1 0 0 2 0 − (c3 β̂) /c3 (X X) c3 1 0 0 2 0 − (c4 β̂) /c4 (X X) c4 1 0 0 2 0 − (c5 β̂) /c5 (X X) c5 1 0 y (I − P4 )y 6 y0 (I − P1 )y 11 Statistics 510 12 / 27 Additional Partitioning of ANOVA Sums of Squares The previous example shows how the Drug and Diet × Drug sums of squares can each be partitioned into two single-degree of freedom sums of squares corresponding to estimable orthogonal contrasts. More generally, any ANOVA sum of squares with q degrees of freedom can be partitioned into q single-degree-of-freedom sums of squares corresponding to q estimable orthogonal linear combinations c01 β, . . . , c0q β. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 13 / 27 Proof of ANOVA Partitioning To see that such a partitioning is always possible, first recall from the last set of slides that an ANOVA sum of squares 0 y0 (Pj+1 − Pj )y = β̂ C0 [C(X0 X)− C0 ]−1 Cβ̂, where C is any matrix q × p matrix of rank q = rj+1 − rj whose row space is the same as the row space of (Pj+1 − Pj )X. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 14 / 27 Proof of ANOVA Partitioning (continued) Now let a1 , . . . , aq be an orthogonal basis for C(Pj+1 − Pj ). Then, ∀ k = 1, . . . , q, ak ∈ C(Pj+1 − Pj ) =⇒ ∃ vk 3 ak = (Pj+1 − Pj )vk . ∀ k = 1, . . . , q, let c0k = a0k X so that c0k β is estimable. Also, ∀ k 6= `, c0k (X0 X)− c` = a0k X(X0 X)− X0 a` = a0k PX a` = a0k PX (Pj+1 − Pj )v` = a0k (Pj+1 − Pj )v` = a0k a` = 0 so that c01 β, . . . , c0q β are orthogonal linear combinations. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 15 / 27 Proof of ANOVA Partitioning (continued) Let C0 = [c1 , . . . , cq ] = [X0 a1 , . . . , X0 aq ] = X0 [a1 , . . . , aq ] Is the row space of C the same as the row space of (Pj+1 − Pj )X? Equivalently, is C(C0 ) = C(X0 (Pj+1 − Pj ))? C0 = X0 [a1 , . . . , aq ] = X0 [(Pj+1 − Pj )v1 , . . . , (Pj+1 − Pj )vq ] = X0 (Pj+1 − Pj )[v1 , . . . , vq ] =⇒ C(C0 ) ⊆ C(X0 (Pj+1 − Pj )). Also, X0 (Pj+1 − Pj ) = X0 [a1 , . . . , aq ]M = C0 M, for some q × n matrix M because a1 , . . . , aq comprises a basis for C(Pj+1 − Pj ). Thus, C(X0 (Pj+1 − Pj )) ⊆ C(C0 ). c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 16 / 27 Proof of ANOVA Partitioning (continued) Thus, we have C(X0 (Pj+1 − Pj )) = C(C0 ). It follows that 0 y0 (Pj+1 − Pj )y = β̂ C0 [C(X0 X)− C0 ]−1 Cβ̂ = q X 0 β̂ ck [c0k (X0 X)− ck ]−1 c0k β̂ k=1 = c Copyright 2016 Dan Nettleton (Iowa State University) q X (c0k β̂)2 /c0k (X0 X)− ck . 2 k=1 Statistics 510 17 / 27 ANOVA Partitioning is Not Always Necessary Just because we can partition ANOVA sums of squares does not mean we need to partition ANOVA sums of squares. The goals of an analysis typically involve constructing estimates or conducting tests of scientific interest. The tests of scientific interest do not necessarily involve orthogonal linear combinations. For example, suppose the goal of the researchers who conducted the diet-drug study is to determine which of the three drugs is best for enhancing weight gain of pigs on each diet. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 18 / 27 SAS Code proc mixed; class diet drug; model weightgain=diet drug diet*drug; lsmeans diet*drug / slice=diet; estimate ’drug 1 - drug 2 for diet 1’ drug 1 -1 0 diet*drug 1 -1 0 0 0 0 / cl; estimate ’drug 1 - drug 3 for diet 1’ drug 1 0 -1 diet*drug 1 0 -1 0 0 0; estimate ’drug 2 - drug 3 for diet 1’ drug 0 1 -1 diet*drug 0 1 -1 0 0 0; estimate ’drug 1 - drug 2 for diet 2’ drug 1 -1 0 diet*drug 0 0 0 1 -1 0; estimate ’drug 1 - drug 3 for diet 2’ drug 1 0 -1 diet*drug 0 0 0 1 0 -1; estimate ’drug 2 - drug 3 for diet 2’ drug 0 1 -1 diet*drug 0 0 0 0 1 -1; run; c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 19 / 27 SAS Output The Mixed Procedure Type 3 Tests of Fixed Effects Effect diet drug diet*drug Num DF Den DF F Value Pr > F 1 2 2 6 6 6 61.96 6.04 5.01 0.0002 0.0365 0.0526 c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 20 / 27 SAS Output Tests of Effect Slices Effect diet*drug diet*drug diet Num DF Den DF F Value Pr > F 1 2 2 2 6 6 9.59 1.46 0.0135 0.3042 c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 21 / 27 SAS Output Least Squares Means Effect diet*drug diet*drug diet*drug diet*drug diet*drug diet*drug diet drug Estimate Standard Error 1 1 1 2 2 2 1 2 3 1 2 3 42.5000 40.0500 37.6500 35.7000 33.9500 35.4500 0.7832 0.7832 0.7832 0.7832 0.7832 0.7832 c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 22 / 27 SAS Output Estimates Label drug 1 drug 1 drug 2 drug 1 drug 1 drug 2 - drug drug drug drug drug drug 2 3 3 2 3 3 for for for for for for diet diet diet diet diet diet 1 1 1 2 2 2 Label drug 1 drug 1 drug 2 drug 1 drug 1 drug 2 - drug drug drug drug drug drug 2 3 3 2 3 3 for for for for for for diet diet diet diet diet diet 1 1 1 2 2 2 c Copyright 2016 Dan Nettleton (Iowa State University) Estimate 2.4500 4.8500 2.4000 1.7500 0.2500 -1.5000 Lower -0.2601 2.1399 -0.3101 -0.9601 -2.4601 -4.2101 Standard Error DF t Value Pr > |t| 1.1075 6 2.21 0.0689 1.1075 6 4.38 0.0047 1.1075 6 2.17 0.0734 1.1075 6 1.58 0.1652 1.1075 6 0.23 0.8289 1.1075 6 -1.35 0.2244 Upper 5.1601 7.5601 5.1101 4.4601 2.9601 1.2101 Statistics 510 23 / 27 Main Conclusions For pigs on diet 1, treatment with drug 1 led to significantly greater mean weight gain than treatment with drug 3. No other differences in mean weight gain between drugs within either diet were statistically significant. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 24 / 27 Comments on the Analysis Note that the main analysis focuses on pairwise comparisons of drugs within each diet. This involves a set of six contrasts, but the contrasts are not pairwise orthogonal within either diet. The sums of squares for these contrasts do not add up to any ANOVA sums of squares, but they are the contrasts that best address the researchers’ questions. If we want to control the probability of one or more type I errors, we could use Bonferroni’s method. In this case, the adjustment for multiple testing would not change the conclusions. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 25 / 27 Comments on the Cell Means vs. Additive Model We used the cell means model for analysis even though the interactions were not significant at the 0.05 level. I tend to prefer the cell means model in experiments with a full-factorial treatment design even if interactions are not significant. The cell means model is less restrictive than an additive model. The cell means model estimator of error variance σ 2 is not inflated by incorrectly specifying an additive mean structure when the additive mean structure is too restrictive. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 26 / 27 Comments on the Cell Means vs. Additive Model Using the cell means model honors the treatment structure. Using the cell means model avoids problems with using the data once to select a model and a second time to perform inference. Some other statisticians may favor a different strategy, especially in experiments with many factors or few degrees of freedom for error. c Copyright 2016 Dan Nettleton (Iowa State University) Statistics 510 27 / 27