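Comments: For $R(\alpha|\mu)$,

(i) $\sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij}$ may not be equal for all $i = 1, \ldots, a$ when $\frac{1}{b}\sum_{j=1}^{b} \mu_{ij}$ are equal for all $i = 1, \ldots, a$.

(ii) $\sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij}$ may be equal for all $i = 1, \ldots, a$ when $\frac{1}{b}\sum_{j=1}^{b} \mu_{ij}$ are not equal for some $i = 1, \ldots, a$.

Consider
\[ R(\beta|\mu,\alpha) = Y^T (P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) Y \]
and the corresponding F-statistic
\[ F = \frac{R(\beta|\mu,\alpha)/(b-1)}{MSE} \sim F_{(b-1,\; n_{\cdot\cdot}-ab)}(\delta^2). \]
Here,
\[ \frac{1}{\sigma^2} R(\beta|\mu,\alpha) \sim \chi^2_{rank(X_{\mu,\alpha,\beta}) - rank(X_{\mu,\alpha})}(\delta^2), \]
with $[1+(a-1)+(b-1)] - [1+(a-1)] = b-1$ degrees of freedom, and
\[ \delta^2 = \frac{1}{\sigma^2} [(P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) X\beta]^T [(P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) X\beta]. \]

For $a = 2$ and $b = 3$,
\[ P_{\mu,\alpha,\beta} X\beta = X_{\mu,\alpha,\beta} (X_{\mu,\alpha,\beta}^T X_{\mu,\alpha,\beta})^{-} X_{\mu,\alpha,\beta}^T X\beta
 = X_{\mu,\alpha,\beta}
\begin{bmatrix}
n_{\cdot\cdot} & n_{1\cdot} & n_{2\cdot} & n_{\cdot 1} & n_{\cdot 2} & n_{\cdot 3} \\
n_{1\cdot} & n_{1\cdot} & 0 & n_{11} & n_{12} & n_{13} \\
n_{2\cdot} & 0 & n_{2\cdot} & n_{21} & n_{22} & n_{23} \\
n_{\cdot 1} & n_{11} & n_{21} & n_{\cdot 1} & 0 & 0 \\
n_{\cdot 2} & n_{12} & n_{22} & 0 & n_{\cdot 2} & 0 \\
n_{\cdot 3} & n_{13} & n_{23} & 0 & 0 & n_{\cdot 3}
\end{bmatrix}^{-}
X_{\mu,\alpha,\beta}^T X\beta. \]
Call the matrix in the middle $\begin{bmatrix} A & B \\ B^T & C \end{bmatrix}$. Then
\[ \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}^{-}
 = \begin{bmatrix} A^{-1} & 0 \\ 0 & 0 \end{bmatrix}
 + \begin{bmatrix} -A^{-1}B \\ I \end{bmatrix} [C - B^T A^{-1} B]^{-1} [\, -B^T A^{-1} \mid I \,] \]
\[ = \begin{bmatrix} 0 & 0 \\ 0 & C^{-1} \end{bmatrix}
 + \begin{bmatrix} I \\ -C^{-1}B^T \end{bmatrix} [A - B C^{-1} B^T]^{-1} [\, I \mid -B C^{-1} \,]
 = \begin{bmatrix} W & -W B C^{-1} \\ -C^{-1} B^T W & C^{-1} + C^{-1} B^T W B C^{-1} \end{bmatrix} \]
where $W = [A - B C^{-1} B^T]^{-1}$.

The null hypothesis is
\[ H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} (\beta_j + \gamma_{ij}) - \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}} (\beta_k + \gamma_{ik}) = 0 \quad \text{for all } j = 1, \ldots, b. \]
With respect to the cell means, $E(Y_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$, this null hypothesis is
\[ H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \mu_{ij} - \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}} \mu_{ik} = 0 \quad \text{for all } j = 1, 2, \ldots, b. \]

Consider
\[ R(\gamma|\mu,\alpha,\beta) = Y^T [P_X - P_{\mu,\alpha,\beta}] Y \]
and the associated F-statistic
\[ F = \frac{R(\gamma|\mu,\alpha,\beta)/[(a-1)(b-1)]}{MSE} \sim F_{((a-1)(b-1),\; n_{\cdot\cdot}-ab)}(\delta^2). \]
The null hypothesis is:
\[ H_0: (\mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell}) = (\gamma_{ij} - \gamma_{i\ell} - \gamma_{kj} + \gamma_{k\ell}) = 0 \quad \text{for all } (i,j) \text{ and } (k,\ell). \]

Type I sums of squares

Source of variation   d.f.                 Sums of squares                              Mean square   F      p-val
Soil types            a-1 = 1              $R(\alpha|\mu)$ = 52.5                       52.5          3.94   .0785
Var.                  b-1 = 2              $R(\beta|\mu,\alpha)$ = 124.73               62.4          4.68   .0405
Interaction           (a-1)(b-1) = 2       $R(\gamma|\mu,\alpha,\beta)$ = 222.76        111.38        8.35   .0089
Resid.                $n_{\cdot\cdot}$-ab = 9   $Y^T(I - P_X)Y$ = 120                   13.33
Corr. total           $n_{\cdot\cdot}$-1 = 14   $Y^T(I - P_1)Y$ = 520
Corr. for the mean    1                    $R(\mu)$ = 3375

Summary:

Sums of squares and associated null hypotheses when soil types are entered first:

$R(\mu)$:
  $H_0: \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j + \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \gamma_{ij} = 0$
  (or $H_0: \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij} = 0$)

$R(\alpha|\mu)$:
  $H_0: \alpha_i + \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} (\beta_j + \gamma_{ij})$ are equal for all $i = 1, \ldots, a$
  (or $H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij}$ are equal for all $i = 1, \ldots, a$)

$R(\beta|\mu,\alpha)$:
  $H_0: \beta_j + \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \gamma_{ij} = \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}} (\beta_k + \gamma_{ik})$ for all $j = 1, \ldots, b$
  (or $H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \mu_{ij} = \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}} \mu_{ik}$ for all $j = 1, \ldots, b$)

$R(\gamma|\mu,\alpha,\beta)$:
  $H_0: \gamma_{ij} - \gamma_{i\ell} - \gamma_{kj} + \gamma_{k\ell} = 0$ for all $(i,j)$ and $(k,\ell)$
  (or $H_0: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} = 0$ for all $(i,j)$ and $(k,\ell)$)

Sums of squares and associated null hypotheses when varieties are entered first:

$R(\mu)$: as above.

$R(\beta|\mu)$:
  $H_0: \beta_j + \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} (\alpha_i + \gamma_{ij})$ are equal for all $j = 1, \ldots, b$
  (or $H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \mu_{ij}$ are equal for all $j = 1, \ldots, b$)

$R(\alpha|\mu,\beta)$:
  $H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} (\alpha_i + \gamma_{ij}) = \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \sum_{k=1}^{a} \frac{n_{kj}}{n_{\cdot j}} (\alpha_k + \gamma_{kj})$ for all $i = 1, \ldots, a$
  (or $H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij} = \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \sum_{k=1}^{a} \frac{n_{kj}}{n_{\cdot j}} \mu_{kj}$ for all $i = 1, \ldots, a$)

$R(\gamma|\mu,\alpha,\beta)$:
  $H_0: \gamma_{ij} - \gamma_{i\ell} - \gamma_{kj} + \gamma_{k\ell} = 0$ for all $(i,j)$ and $(k,\ell)$
  (or $H_0: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} = 0$ for all $(i,j)$ and $(k,\ell)$)

Type I sums of squares (varieties entered first)

Source of variat.   d.f.                                Sums of squares                           Mean square   F      p-val
"Var."              b-1 = 2                             $R(\beta|\mu)$ = 93.33                    46.67         3.50   .0751
"Soils"             a-1 = 1                             $R(\alpha|\mu,\beta)$ = 83.90             83.90         6.29   .0334
Interaction         (a-1)(b-1) = 2                      $R(\gamma|\mu,\alpha,\beta)$ = 222.76     111.38        8.35   .0089
"Res."              $\sum\sum (n_{ij}-1)$ = 9           $Y^T(I - P_X)Y$ = 120.00                  13.33
Corr. total         $n_{\cdot\cdot}$-1 = 14             $Y^T(I - P_1)Y$ = 520.00

Type II sums of squares:

Source of variat.   d.f.                                Sums of squares                           Mean square   F      p-val
"Soils"             a-1 = 1                             $R(\alpha|\mu,\beta)$ = 83.90             83.90         6.3    .0339
"Var."              b-1 = 2                             $R(\beta|\mu,\alpha)$ = 124.73            62.37         4.7    .0405
Interaction         (a-1)(b-1) = 2                      $R(\gamma|\mu,\alpha,\beta)$ = 222.76     111.38        8.4    .0089
"Res."              $n_{\cdot\cdot}$-ab = 9             $Y^T(I - P_X)Y$ = 120                     13.33
Corr. total         $n_{\cdot\cdot}$-1 = 14             $Y^T(I - P_1)Y$ = 520

Examine the soil type effect on time to germination for each variety:

Time to Germination

             Soil Type 1                                Soil Type 2
Variety      $\bar Y_{1j\cdot}$   $S_{\bar Y_{1j\cdot}}$     $\bar Y_{2j\cdot}$   $S_{\bar Y_{2j\cdot}}$     t       p-value
j=1           9.0                 2.11                       16.0                 1.83                       -2.51   .0333
j=2          14.0                 2.58                       31.0                 3.65                       -3.80   .0042
j=3          18.0                 2.58                       13.0                 2.11                        1.50   .1679

[Figure: Average Time to Carrot Seed Germination. Mean time plotted against variety (1, 2, 3), with one profile for Soil Type 1 and one for Soil Type 2.]

Time to germination for variety 2 is shorter in soil type 1.

Time to germination for variety 1 may also be shorter in soil type 1.

For variety 3 there is no significant difference in average germination times for the two soil types.

In the previous analysis:
\[ \bar Y_{ij\cdot} = \hat\mu_{ij} = \hat\mu + \hat\alpha_i + \hat\beta_j + \hat\gamma_{ij} \]
is the OLS estimator (b.l.u.e.) for $\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$. Also,
\[ S_{\bar Y_{ij\cdot}} = \sqrt{\frac{MSE}{n_{ij}}} \quad \text{for } i = 1, \ldots, a \text{ and } j = 1, \ldots, b. \]
Since $\bar Y_{1j\cdot}$ is independent of $\bar Y_{2j\cdot}$,
\[ t = \frac{\bar Y_{1j\cdot} - \bar Y_{2j\cdot}}{\sqrt{MSE \left( \frac{1}{n_{1j}} + \frac{1}{n_{2j}} \right)}} \quad \text{for } j = 1, \ldots, b. \]

Method of Unweighted Means (Type III sums of squares in SAS when $n_{ij} > 0$ for all $(i,j)$).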
Consider the cell means reparameterization of the model:
\[ Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk} = \mu_{ij} + \epsilon_{ijk}. \]
In matrix form the model is $Y = D\mu + \epsilon$, where $\mu = (\mu_{11}, \mu_{12}, \mu_{13}, \mu_{21}, \mu_{22}, \mu_{23})^T$, $\epsilon$ is the corresponding vector of errors $\epsilon_{ijk}$, and the rows of $D$ are

Observation   mu11  mu12  mu13  mu21  mu22  mu23
Y111           1     0     0     0     0     0
Y112           1     0     0     0     0     0
Y113           1     0     0     0     0     0
Y121           0     1     0     0     0     0
Y122           0     1     0     0     0     0
Y131           0     0     1     0     0     0
Y132           0     0     1     0     0     0
Y211           0     0     0     1     0     0
Y212           0     0     0     1     0     0
Y213           0     0     0     1     0     0
Y214           0     0     0     1     0     0
Y221           0     0     0     0     1     0
Y231           0     0     0     0     0     1
Y232           0     0     0     0     0     1
Y233           0     0     0     0     0     1

The least squares estimator (b.l.u.e.) for $\mu$ is
\[ \hat\mu = (D^T D)^{-1} D^T Y
 = \mathrm{diag}(n_{11}^{-1}, n_{12}^{-1}, n_{13}^{-1}, n_{21}^{-1}, n_{22}^{-1}, n_{23}^{-1})
 \begin{bmatrix} Y_{11\cdot} \\ Y_{12\cdot} \\ Y_{13\cdot} \\ Y_{21\cdot} \\ Y_{22\cdot} \\ Y_{23\cdot} \end{bmatrix}
 = \begin{bmatrix} \bar Y_{11\cdot} \\ \bar Y_{12\cdot} \\ \bar Y_{13\cdot} \\ \bar Y_{21\cdot} \\ \bar Y_{22\cdot} \\ \bar Y_{23\cdot} \end{bmatrix}. \]
This estimator is denoted by $b$ below.

Test the null hypothesis
\[ H_0: \frac{1}{b}\sum_{j=1}^{b} \mu_{1j} = \frac{1}{b}\sum_{j=1}^{b} \mu_{2j} = \cdots = \frac{1}{b}\sum_{j=1}^{b} \mu_{aj} \]
vs.
\[ H_A: \frac{1}{b}\sum_{j=1}^{b} \mu_{ij} \neq \frac{1}{b}\sum_{j=1}^{b} \mu_{kj} \quad \text{for some } i \neq k. \]

The OLS estimator (b.l.u.e.) for $\frac{1}{b}\sum_{j=1}^{b} \mu_{ij}$ is
\[ \tilde Y_{i\cdot\cdot} = \frac{1}{b}\sum_{j=1}^{b} \bar Y_{ij\cdot} \quad \text{with} \quad Var(\tilde Y_{i\cdot\cdot}) = \frac{1}{b^2}\sum_{j=1}^{b} \frac{\sigma^2}{n_{ij}} = \frac{\sigma^2}{b^2} \sum_{j=1}^{b} \frac{1}{n_{ij}}. \]

Express the null hypothesis in matrix form: $H_0: C_1 \mu = 0$, where
\[ C_1 = [\, I_{a-1} \mid -1_{a-1} \,] \otimes 1_b^T
 = \begin{bmatrix} 1_b^T & & & -1_b^T \\ & \ddots & & \vdots \\ & & 1_b^T & -1_b^T \end{bmatrix},
 \qquad
 C_1 \mu = \begin{bmatrix} \sum_j \mu_{1j} - \sum_j \mu_{aj} \\ \vdots \\ \sum_j \mu_{a-1,j} - \sum_j \mu_{aj} \end{bmatrix}. \]
Then
\[ C_1 b = C_1 (D^T D)^{-1} D^T Y = \begin{bmatrix} \sum_j \bar Y_{1j\cdot} - \sum_j \bar Y_{aj\cdot} \\ \vdots \\ \sum_j \bar Y_{a-1,j\cdot} - \sum_j \bar Y_{aj\cdot} \end{bmatrix} \]
is the OLS estimator (b.l.u.e.) of $C_1\mu$, and
\[ Var(C_1 b) = Var(C_1 (D^T D)^{-1} D^T Y) = C_1 (D^T D)^{-1} D^T (\sigma^2 I) D (D^T D)^{-1} C_1^T = \sigma^2 C_1 (D^T D)^{-1} C_1^T. \]

Compute
\[ SS_{H_0} = (C_1 b - 0)^T [C_1 (D^T D)^{-1} C_1^T]^{-1} (C_1 b - 0)
 = Y^T D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T Y. \]
Use result 4.7 to show
\[ \frac{1}{\sigma^2} SS_{H_0} \sim \chi^2_{(a-1)}(\delta^2). \]
Check that $A(\sigma^2 I)$ is idempotent, where
\[ A = \frac{1}{\sigma^2} D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T, \]
and that $a - 1 = rank(C_1 (D^T D)^{-1} C_1^T)$.

Compute
\[ SSE = Y^T (I - P_D) Y \quad \text{where} \quad P_D = D (D^T D)^{-1} D^T. \]
Use result 4.7 to show
\[ \frac{1}{\sigma^2} SSE \sim \chi^2_{\sum\sum (n_{ij}-1)}. \]

Use result 4.8 to show that $SSE = Y^T (I - P_D) Y$ (call the matrix of this quadratic form $A_1$) is distributed independently of
\[ SS_{H_0} = Y^T D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T Y \]
(call the matrix of this quadratic form $A_2$). Check that
\[ A_1 (\sigma^2 I) A_2 = \sigma^2 A_1 A_2 = \sigma^2 (I - P_D) D (D^T D)^{-1} C_1^T (C_1 (D^T D)^{-1} C_1^T)^{-1} C_1 (D^T D)^{-1} D^T = 0. \]
This is true because $(I - P_D) D = 0$.

Then
\[ F = \frac{SS_{H_0}/(a-1)}{SSE / \sum\sum (n_{ij}-1)} \sim F_{(a-1,\; \sum\sum (n_{ij}-1))}(\delta^2)
 \quad \text{where} \quad \delta^2 = \frac{1}{\sigma^2} \mu^T C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 \mu. \]
Reject $H_0: \frac{1}{b}\sum_j \mu_{1j} = \frac{1}{b}\sum_j \mu_{2j} = \cdots = \frac{1}{b}\sum_j \mu_{aj}$ if
\[ F = \frac{SS_{H_0}/(a-1)}{SSE / \sum\sum (n_{ij}-1)} > F_{(a-1,\; \sum\sum (n_{ij}-1)),\,\alpha} \]
or, equivalently, if p-value $= Pr\{ F_{(a-1,\; \sum\sum (n_{ij}-1))} > F \} < \alpha$.

Test
\[ H_0: \frac{1}{a}\sum_{i=1}^{a} \mu_{i1} = \frac{1}{a}\sum_{i=1}^{a} \mu_{i2} = \cdots = \frac{1}{a}\sum_{i=1}^{a} \mu_{ib} \]
vs.
\[ H_A: \frac{1}{a}\sum_{i=1}^{a} \mu_{ij} \neq \frac{1}{a}\sum_{i=1}^{a} \mu_{ik} \quad \text{for some } j \neq k. \]
Write the null hypothesis in matrix form as $H_0: C_2 \mu = 0$, where
\[ C_2 = 1_a^T \otimes [\, I_{b-1} \mid -1_{b-1} \,],
 \qquad
 C_2 \mu = \begin{bmatrix} \sum_i \mu_{i1} - \sum_i \mu_{ib} \\ \vdots \\ \sum_i \mu_{i,b-1} - \sum_i \mu_{ib} \end{bmatrix}. \]
Compute
\[ SS_{H_0,2} = Y^T D (D^T D)^{-1} C_2^T [C_2 (D^T D)^{-1} C_2^T]^{-1} C_2 (D^T D)^{-1} D^T Y \]
and reject $H_0$ if
\[ F = \frac{SS_{H_0,2}/(b-1)}{SSE / \sum\sum (n_{ij}-1)} > F_{(b-1,\; \sum\sum (n_{ij}-1)),\,\alpha}. \]

Test for Interaction:

Test
\[ H_0: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} = 0 \quad \text{for all } (i,j) \text{ and } (k,\ell) \]
vs.
\[ H_A: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} \neq 0 \quad \text{for some } (i \neq k) \text{ and } (j \neq \ell). \]
Write the null hypothesis in matrix form as $H_0: C_3 \mu = 0$, where
\[ C_3 = [\, I_{a-1} \mid -1_{a-1} \,] \otimes [\, I_{b-1} \mid -1_{b-1} \,]. \]
Compute
\[ b = (D^T D)^{-1} D^T Y = (\bar Y_{11\cdot}, \ldots, \bar Y_{ab\cdot})^T, \]
\[ SS_{H_0,3} = (C_3 b - 0)^T [C_3 (D^T D)^{-1} C_3^T]^{-1} (C_3 b - 0)
 = Y^T D (D^T D)^{-1} C_3^T [C_3 (D^T D)^{-1} C_3^T]^{-1} C_3 (D^T D)^{-1} D^T Y, \]
and reject $H_0$ if
\[ F = \frac{SS_{H_0,3}/((a-1)(b-1))}{SSE / \sum\sum (n_{ij}-1)} > F_{((a-1)(b-1),\; \sum\sum (n_{ij}-1)),\,\alpha}. \]

Note that PROC GLM in SAS reports these as Type III sums of squares.

Source of variation   d.f.                Sum of Squares              Mean Square   F      p-val
Soils                 a-1 = 1             $SS_{H_0}$ = 123.77         123.77        9.28   .0139
Var.                  b-1 = 2             $SS_{H_0,2}$ = 192.13       96.06         7.20   .0135
Inter.                (a-1)(b-1) = 2      $SS_{H_0,3}$ = 222.76       111.38        8.35   .0089

The five quadratic forms
\[ Y^T P_1 Y \]
\[ + \; Y^T D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T Y \]
\[ + \; Y^T D (D^T D)^{-1} C_2^T [C_2 (D^T D)^{-1} C_2^T]^{-1} C_2 (D^T D)^{-1} D^T Y \]
\[ + \; Y^T D (D^T D)^{-1} C_3^T [C_3 (D^T D)^{-1} C_3^T]^{-1} C_3 (D^T D)^{-1} D^T Y \]
\[ + \; Y^T (I - P_D) Y \]
do not necessarily sum to $Y^T Y$, nor do the middle three terms $(SS_{H_0}, SS_{H_0,2}, SS_{H_0,3})$ necessarily sum to $SS_{model,corrected} = Y^T (P_D - P_1) Y$; nor are $(SS_{H_0}, SS_{H_0,2}, SS_{H_0,3})$ necessarily independent of each other.
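The Type III sums of squares for the germination example can be checked numerically from the cell means and cell sizes alone. The following is a minimal sketch, not the method used to produce the SAS output: it assumes the cell sizes 3, 2, 2, 4, 1, 3 implied by the observation indices in the cell means model, takes SSE = 120 on 9 degrees of freedom from the ANOVA tables, and relies on $(D^T D)^{-1}$ being diagonal with entries $1/n_{ij}$. The helper names (`kron`, `solve`, `hypothesis_ss`) are hypothetical.

```python
# Cell means and sample sizes in the order (1,1), (1,2), (1,3), (2,1), (2,2), (2,3);
# i indexes soil types and j indexes varieties.
means = [9.0, 14.0, 18.0, 16.0, 31.0, 13.0]
counts = [3, 2, 2, 4, 1, 3]          # assumed from the observation indices
SSE, DF_ERROR = 120.0, 9             # residual line of the ANOVA tables


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def hypothesis_ss(c):
    """(Cb)^T [C (D^T D)^{-1} C^T]^{-1} (Cb) with (D^T D)^{-1} = diag(1/n_ij)."""
    cb = [sum(w * m for w, m in zip(row, means)) for row in c]
    gram = [[sum(u * v / n for u, v, n in zip(r1, r2, counts)) for r2 in c]
            for r1 in c]
    return sum(u * v for u, v in zip(cb, solve(gram, cb)))


contrast_a = [[1, -1]]                     # [I_{a-1} | -1_{a-1}] for a = 2
contrast_b = [[1, 0, -1], [0, 1, -1]]      # [I_{b-1} | -1_{b-1}] for b = 3
c1 = kron(contrast_a, [[1, 1, 1]])         # soil type main effect
c2 = kron([[1, 1]], contrast_b)            # variety main effect
c3 = kron(contrast_a, contrast_b)          # interaction

mse = SSE / DF_ERROR
ss_soils, ss_var, ss_inter = (hypothesis_ss(c) for c in (c1, c2, c3))
f_soils = ss_soils / len(c1) / mse
f_var = ss_var / len(c2) / mse
f_inter = ss_inter / len(c3) / mse

print(round(ss_soils, 2), round(ss_var, 2), round(ss_inter, 2))
print(round(f_soils, 2), round(f_var, 2), round(f_inter, 2))
```

Under these assumptions the three sums of squares agree with the Type III table up to rounding, and their total differs from the corrected model sum of squares (520 - 120 = 400), which illustrates the remark that the Type III terms need not add up.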
Note that
\[ SS_{H_0} = \sum_{i=1}^{a} w_i \left[ \tilde Y_{i\cdot\cdot} - \frac{\sum_{k=1}^{a} w_k \tilde Y_{k\cdot\cdot}}{\sum_{k=1}^{a} w_k} \right]^2 \]
where
\[ \tilde Y_{i\cdot\cdot} = \frac{1}{b} \sum_{j=1}^{b} \bar Y_{ij\cdot}
 \quad \text{and} \quad
 w_i = \frac{\sigma^2}{Var(\tilde Y_{i\cdot\cdot})} = \left[ \frac{1}{b^2} \sum_{j=1}^{b} \frac{1}{n_{ij}} \right]^{-1}, \]
and $\tilde Y_{i\cdot\cdot}$ is not necessarily equal to
\[ \bar Y_{i\cdot\cdot} = \frac{\sum_{j=1}^{b} n_{ij} \bar Y_{ij\cdot}}{\sum_{j=1}^{b} n_{ij}} = \frac{\sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} Y_{ijk}}{\sum_{j=1}^{b} n_{ij}}. \]

Furthermore,
\[ SS_{H_0,2} = \sum_{j=1}^{b} w_j \left[ \tilde Y_{\cdot j\cdot} - \frac{\sum_{\ell=1}^{b} w_\ell \tilde Y_{\cdot \ell\cdot}}{\sum_{\ell=1}^{b} w_\ell} \right]^2 \]
where
\[ \tilde Y_{\cdot j\cdot} = \frac{1}{a} \sum_{i=1}^{a} \bar Y_{ij\cdot}
 \quad \text{and} \quad
 w_j = \frac{\sigma^2}{Var(\tilde Y_{\cdot j\cdot})} = \left[ \frac{1}{a^2} \sum_{i=1}^{a} \frac{1}{n_{ij}} \right]^{-1}, \]
and $\tilde Y_{\cdot j\cdot}$ is not necessarily equal to
\[ \bar Y_{\cdot j\cdot} = \frac{\sum_{i=1}^{a} n_{ij} \bar Y_{ij\cdot}}{\sum_{i=1}^{a} n_{ij}} = \frac{\sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} Y_{ijk}}{\sum_{i=1}^{a} n_{ij}}. \]

Balanced factorial experiments

$n_{ij} = n$ for $i = 1, \ldots, a$ and $j = 1, \ldots, b$.

Example 8.2: Sugar Cane Yields (from Snedecor and Cochran)

                                       Nitrogen Level
             150 lb/acre               210 lb/acre               270 lb/acre
Variety 1    Y111 = 70.5               Y121 = 67.3               Y131 = 79.9
             Y112 = 67.5               Y122 = 75.9               Y132 = 72.8
             Y113 = 63.9               Y123 = 72.2               Y133 = 64.8
             Y114 = 64.2               Y124 = 60.5               Y134 = 86.3
Variety 2    Y211 = 58.6               Y221 = 64.3               Y231 = 64.4
             Y212 = 65.2               Y222 = 48.3               Y232 = 67.3
             Y213 = 70.2               Y223 = 74.0               Y233 = 78.0
             Y214 = 51.8               Y224 = 63.6               Y234 = 72.0
Variety 3    Y311 = 65.8               Y321 = 64.1               Y331 = 56.3
             Y312 = 68.3               Y322 = 64.8               Y332 = 54.7
             Y313 = 72.7               Y323 = 70.9               Y333 = 66.2
             Y314 = 67.6               Y324 = 58.3               Y334 = 54.4

For a balanced experiment, Type I, Type II, and Type III sums of squares are the same:
\[ R(\alpha|\mu) = R(\alpha|\mu,\beta) = SS_{H_0} = n\,b \sum_{i=1}^{a} (\bar Y_{i\cdot\cdot} - \bar Y_{\cdot\cdot\cdot})^2 \]
\[ R(\beta|\mu) = R(\beta|\mu,\alpha) = SS_{H_0,2} = n\,a \sum_{j=1}^{b} (\bar Y_{\cdot j\cdot} - \bar Y_{\cdot\cdot\cdot})^2 \]
\[ R(\gamma|\mu,\alpha,\beta) = SS_{H_0,3} = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar Y_{ij\cdot} - \bar Y_{i\cdot\cdot} - \bar Y_{\cdot j\cdot} + \bar Y_{\cdot\cdot\cdot})^2 \]

Summary

Sum of squares and associated null hypothesis:

$R(\mu) = Y^T P_1 Y = a\,b\,n\,\bar Y_{\cdot\cdot\cdot}^2$
  $H_0: \mu + \frac{1}{a}\sum_{i=1}^{a} \alpha_i + \frac{1}{b}\sum_{j=1}^{b} \beta_j + \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} \gamma_{ij} = 0$
  ($H_0: \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} \mu_{ij} = 0$)

$R(\alpha|\mu) = R(\alpha|\mu,\beta) = n\,b \sum_{i=1}^{a} (\bar Y_{i\cdot\cdot} - \bar Y_{\cdot\cdot\cdot})^2$
  $H_0: \alpha_i + \frac{1}{b}\sum_{j=1}^{b} (\beta_j + \gamma_{ij})$ are equal
  ($H_0: \frac{1}{b}\sum_{j=1}^{b} \mu_{ij}$ are equal)

$R(\beta|\mu) = R(\beta|\mu,\alpha) = n\,a \sum_{j=1}^{b} (\bar Y_{\cdot j\cdot} - \bar Y_{\cdot\cdot\cdot})^2$
  $H_0: \beta_j + \frac{1}{a}\sum_{i=1}^{a} (\alpha_i + \gamma_{ij})$ are equal
  ($H_0: \frac{1}{a}\sum_{i=1}^{a} \mu_{ij}$ are equal)

$R(\gamma|\mu,\alpha,\beta) = n \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar Y_{ij\cdot} - \bar Y_{i\cdot\cdot} - \bar Y_{\cdot j\cdot} + \bar Y_{\cdot\cdot\cdot})^2$
  $H_0: \gamma_{ij} - \gamma_{kj} - \gamma_{i\ell} + \gamma_{k\ell} = 0$ for all $(i,j)$ and $(k,\ell)$
  ($H_0: \mu_{ij} - \mu_{kj} - \mu_{i\ell} + \mu_{k\ell} = 0$ for all $(i,j)$ and $(k,\ell)$)

# A file with the S-PLUS commands is
# posted as cane.ssc

# Enter the data. Note that the first
# line of this file is a line of data,
# not a line of variable names.

> cane <- read.table("cane.dat",
      col.names=c("Variety","Nitrogen","Yield"))

# Create factors

> cane$V <- as.factor(cane$Variety)
> cane$N <- as.factor(cane$Nitrogen)

# Print the data frame

> cane
   Variety Nitrogen Yield   N V
 1       1      150  70.5 150 1
 2       1      150  67.5 150 1
 3       1      150  63.9 150 1
 4       1      150  64.2 150 1
 5       1      210  67.3 210 1
 6       1      210  75.9 210 1
 7       1      210  72.2 210 1
 8       1      210  60.5 210 1
 9       1      270  79.9 270 1
10       1      270  72.8 270 1
11       1      270  64.8 270 1
12       1      270  86.3 270 1
13       2      150  58.6 150 2
14       2      150  65.2 150 2
15       2      150  70.2 150 2
16       2      150  51.8 150 2
17       2      210  64.3 210 2
18       2      210  48.3 210 2
19       2      210  74.0 210 2
20       2      210  63.6 210 2
21       2      270  64.4 270 2
22       2      270  67.3 270 2
23       2      270  78.0 270 2
24       2      270  72.0 270 2
25       3      150  65.8 150 3
26       3      150  68.3 150 3
27       3      150  72.7 150 3
28       3      150  67.6 150 3
29       3      210  64.1 210 3
30       3      210  64.8 210 3
31       3      210  70.9 210 3
32       3      210  58.3 210 3
33       3      270  56.3 270 3
34       3      270  54.7 270 3
35       3      270  66.2 270 3
36       3      270  54.4 270 3

# Compute mean yields for all combinations
# of nitrogen levels and varieties and
# make a profile plot. At this point UNIX
# users should open a graphics window with
# the motif( ) function.
> means <- tapply(cane$Yield,
      list(cane$Variety,cane$Nitrogen), mean)
> means
     150    210    270
1 66.525 68.975 75.950
2 61.450 62.550 70.425
3 68.600 64.525 57.900

# Set up the profile plot

> par(fin=c(7,7),cex=1.2,lwd=3,mex=1.5,mkh=.20)
> x.axis <- unique(cane$Nitrogen)
> matplot(c(130,270), c(50,80), type="n",
      xlab="Nitrogen(lb/acre)", ylab="Mean Yield",
      main= "Sugar Cane Yields")

# Plot symbols for the sample means.
# matpoints and matlines draw one series per
# column, so the table of means (varieties in
# rows) is transposed to give one profile per
# variety.

> matpoints(x.axis,t(means), pch=c(15,16,18))

# Add a profile for each variety

> matlines(x.axis,t(means),type='l',
      lty=c(1,3,5),lwd=3)

# Add a legend to the plot

> legend(130,60, legend=c('Variety 1',
      'Variety 2','Variety 3'),
      lty=c(1,3,5),bty='n')

[Figure: Sugar Cane Yields. Mean yield plotted against nitrogen (lb/acre), with one profile for each of Variety 1, Variety 2, and Variety 3.]

# Fit a model with main effects and
# interaction effects. Compute both
# sets of Type I sums of squares.

> options(contrasts=c('contr.sum','contr.poly'))
> lm.out1 <- lm(Yield~N*V, data=cane)
> anova(lm.out1)

Analysis of Variance Table

Response: Yield

Terms added sequentially (first to last)
          Df Sum of Sq  Mean Sq F Value    Pr(F)
        N  2    56.541  28.2703 0.60847 0.551478
        V  2   319.374 159.6869 3.43698 0.046797
      N:V  4   559.788 139.9469 3.01211 0.035471
Residuals 27  1254.460  46.4615

> lm.out2 <- lm(Yield~V*N, data=cane)
> anova(lm.out2)

Analysis of Variance Table

Response: Yield

Terms added sequentially (first to last)
          Df Sum of Sq  Mean Sq F Value    Pr(F)
        V  2   319.374 159.6869 3.43698 0.046797
        N  2    56.541  28.2703 0.60847 0.551478
      V:N  4   559.788 139.9469 3.01211 0.035471
Residuals 27  1254.460  46.4615

> summary(lm.out2, correlation=F)

Call: lm(formula = Yield ~ V * N, data = cane)
Residuals:
    Min     1Q  Median    3Q   Max
 -14.25 -3.131 -0.3625 3.956 11.45

Coefficients:
              Value Std. Error t value Pr(>|t|)
(Intercept) 66.3222     1.1360 58.3800   0.0000
         V1  4.1611     1.6066  2.5900   0.0153
         V2 -1.5139     1.6066 -0.9423   0.3544
         N1 -0.7972     1.6066 -0.4962   0.6238
         N2 -0.9722     1.6066 -0.6051   0.5501
       V1N1 -3.1611     2.2721 -1.3913   0.1755
       V2N1 -2.5611     2.2721 -1.1272   0.2696
       V1N2 -0.5361     2.2721 -0.2360   0.8152
       V2N2 -1.2861     2.2721 -0.5660   0.5760

Residual standard error: 6.816 on 27 df
Multiple R-Squared: 0.4272
F-statistic: 2.517 on 8 and 27 df,
the p-value is 0.03462

> model.matrix(lm.out2)
   (Intercept) V1 V2 N1 N2 V1N1 V2N1 V1N2 V2N2
 1           1  1  0  1  0    1    0    0    0
 2           1  1  0  1  0    1    0    0    0
 3           1  1  0  1  0    1    0    0    0
 4           1  1  0  1  0    1    0    0    0
 5           1  1  0  0  1    0    0    1    0
 6           1  1  0  0  1    0    0    1    0
 7           1  1  0  0  1    0    0    1    0
 8           1  1  0  0  1    0    0    1    0
 9           1  1  0 -1 -1   -1    0   -1    0
10           1  1  0 -1 -1   -1    0   -1    0
11           1  1  0 -1 -1   -1    0   -1    0
12           1  1  0 -1 -1   -1    0   -1    0
13           1  0  1  1  0    0    1    0    0
14           1  0  1  1  0    0    1    0    0
15           1  0  1  1  0    0    1    0    0
16           1  0  1  1  0    0    1    0    0
17           1  0  1  0  1    0    0    0    1
18           1  0  1  0  1    0    0    0    1
19           1  0  1  0  1    0    0    0    1
20           1  0  1  0  1    0    0    0    1
21           1  0  1 -1 -1    0   -1    0   -1
22           1  0  1 -1 -1    0   -1    0   -1
23           1  0  1 -1 -1    0   -1    0   -1
24           1  0  1 -1 -1    0   -1    0   -1
25           1 -1 -1  1  0   -1   -1    0    0
26           1 -1 -1  1  0   -1   -1    0    0
27           1 -1 -1  1  0   -1   -1    0    0
28           1 -1 -1  1  0   -1   -1    0    0
29           1 -1 -1  0  1    0    0   -1   -1
30           1 -1 -1  0  1    0    0   -1   -1
31           1 -1 -1  0  1    0    0   -1   -1
32           1 -1 -1  0  1    0    0   -1   -1
33           1 -1 -1 -1 -1    1    1    1    1
34           1 -1 -1 -1 -1    1    1    1    1
35           1 -1 -1 -1 -1    1    1    1    1
36           1 -1 -1 -1 -1    1    1    1    1

# Create diagnostic plots

> par(mfrow=c(3,2))
> plot(lm.out1)

[Figure: six diagnostic plots for lm.out1: residuals vs. fitted values (Fitted : N * V), sqrt(abs(Residuals)) vs. fits, Yield vs. fitted values, residuals vs. quantiles of standard normal, fitted values and residuals vs. f-value, and Cook's distance vs. index. Observations 11, 18, and 19 are labeled.]

# Create a data frame containing the original
# data and the residuals and estimated means

> data.frame(cane$Nitrogen,cane$Variety,
      cane$Yield,Pred=lm.out1$fitted,
      Resid=round(lm.out1$resid,3))

    X1 X2   X3   Pred   Resid
 1 150  1 70.5 66.525   3.975
 2 150  1 67.5 66.525   0.975
 3 150  1 63.9 66.525  -2.625
 4 150  1 64.2 66.525  -2.325
 5 210  1 67.3 68.975  -1.675
 6 210  1 75.9 68.975   6.925
 7 210  1 72.2 68.975   3.225
 8 210  1 60.5 68.975  -8.475
 9 270  1 79.9 75.950   3.950
10 270  1 72.8 75.950  -3.150
11 270  1 64.8 75.950 -11.150
12 270  1 86.3 75.950  10.350
13 150  2 58.6 61.450  -2.850
14 150  2 65.2 61.450   3.750
15 150  2 70.2 61.450   8.750
16 150  2 51.8 61.450  -9.650
17 210  2 64.3 62.550   1.750
18 210  2 48.3 62.550 -14.250
19 210  2 74.0 62.550  11.450
20 210  2 63.6 62.550   1.050
21 270  2 64.4 70.425  -6.025
22 270  2 67.3 70.425  -3.125
23 270  2 78.0 70.425   7.575
24 270  2 72.0 70.425   1.575
25 150  3 65.8 68.600  -2.800
26 150  3 68.3 68.600  -0.300
27 150  3 72.7 68.600   4.100
28 150  3 67.6 68.600  -1.000
29 210  3 64.1 64.525  -0.425
30 210  3 64.8 64.525   0.275
31 210  3 70.9 64.525   6.375
32 210  3 58.3 64.525  -6.225
33 270  3 56.3 57.900  -1.600
34 270  3 54.7 57.900  -3.200
35 270  3 66.2 57.900   8.300
36 270  3 54.4 57.900  -3.500

# Compute Type III sums of squares and
# corresponding F-tests.
# Generate an identity matrix and a
# vector of ones

> Iden <- function(n) diag(rep(1,n))
> one <- function(n) matrix(rep(1,n),ncol=1)

# Compute the model matrix for the
# cell means model

> s <- length(unique(cane$Nitrogen))
> t <- length(unique(cane$Variety))
> st <- s*t
> r <- length(cane$Yield)/(st)
> D <- t(kronecker(Iden(st), t(one(r))))

# Least squares estimation

> y <- matrix(cane$Yield,ncol=1)
> b <- solve(crossprod(D)) %*% crossprod(D,y)
> yhat <- D %*% b
> sse <- crossprod(y-yhat)
> df2 <- nrow(y) - st

# Test for variety effects (the cells are
# ordered by variety, then nitrogen level)

> c1 <- kronecker( cbind(Iden(s-1),-one(s-1)),
      t(one(t)) )
> q1 <- t(b) %*% t(c1) %*% solve( c1 %*%
      solve(crossprod(D)) %*% t(c1)) %*% c1 %*% b
> df1 <- s-1
> f <- (q1/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    1    1    0    0    0   -1   -1   -1
[2,]    0    0    0    1    1    1   -1   -1   -1

> data.frame(SS=q1,df=df1,F.stat=f,p.value=p)
        SS df   F.stat    p.value
1 319.3739  2 3.436975 0.04679743

# Test for nitrogen effects

> c2 <- kronecker( t(one(s)),
      cbind(Iden(t-1),-one(t-1)) )
> q2 <- t(b) %*% t(c2) %*% solve( c2 %*%
      solve(crossprod(D)) %*% t(c2)) %*% c2 %*% b
> df1 <- t-1
> f <- (q2/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c2
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    0   -1    1    0   -1    1    0   -1
[2,]    0    1   -1    0    1   -1    0    1   -1

> data.frame(SS=q2,df=df1,F.stat=f,p.value=p)
        SS df   F.stat  p.value
1 56.54056  2 0.608467 0.551478

# Test for interaction

> c3 <- kronecker( cbind(Iden(s-1),-one(s-1)),
      cbind(Iden(t-1),-one(t-1)) )
> q3 <- t(b) %*% t(c3) %*% solve( c3 %*%
      solve(crossprod(D)) %*% t(c3)) %*% c3 %*% b
> df1 <- (s-1)*(t-1)
> f <- (q3/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c3
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    0   -1    0    0    0   -1    0    1
[2,]    0    1   -1    0    0    0    0   -1    1
[3,]    0    0    0    1    0   -1   -1    0    1
[4,]    0    0    0    0    1   -1    0   -1    1

> data.frame(SS=q3,df=df1,F.stat=f,p.value=p)
        SS df   F.stat    p.value
1 559.7878  4 3.012107 0.03547072

Conclusions:

* Variety 1 appears to provide a consistently higher yield than Variety 2, but the difference in these two varieties is not "significant" at the .05 level.

* Varieties 1 and 2 exhibit parallel "linear" increasing trends in yield as nitrogen increases from 150 lb/acre to 270 lb/acre.

* Variety 3 exhibits a "linear" decrease in yield as nitrogen increases from 150 lb/acre to 270 lb/acre.

* Variety 3 seems to do as well as Variety 1 at 150 lb/acre of nitrogen.

/* Analysis of completely randomized
   factorial experiments with an
   application to the sugar cane
   data from Snedecor and Cochran.
   This program is posted as cane.sas */

data set1;
  infile 'cane.dat';
  input variety nitrogen yield;
run;

/* Print the data */

proc print data=set1;
  var yield;
run;

/* Compute an ANOVA table */

proc glm data=set1;
  class variety nitrogen;
  model yield = variety|nitrogen / p clm alpha=.05
        ss1 ss2 ss3 ss4 e e1 e2 e3 e4;
  output out=setr r=resid p=yhat;
  lsmeans variety*nitrogen / stderr pdiff;
  means variety nitrogen / tukey;
  contrast 'n-linear' nitrogen -1 0 1;
  contrast 'n-quad' nitrogen -1 2 -1;
  contrast 'v1-v2' variety 1 -1 0;
  contrast '(v1+v2)-v3' variety .5 .5 -1;
  contrast '(v1-v2)*(n-lin)' variety*nitrogen
        -1 0 1 1 0 -1 0 0 0;
  contrast '(v1-v2)*(n-quad)' variety*nitrogen
        -1 2 -1 1 -2 1 0 0 0;
  contrast '(.5(v1+v2)-v3)*(n-lin)' variety*nitrogen
        -.5 0 .5 -.5 0 .5 1 0 -1;
  contrast '(.5(v1+v2)-v3)*(n-quad)' variety*nitrogen
        -.5 1 -.5 -.5 1 -.5 1 -2 1;
  estimate 'n-linear' nitrogen -1 0 1;
  estimate 'n-quad' nitrogen -1 2 -1;
  estimate 'v1-v2' variety 1 -1 0;
  estimate '(v1+v2)-v3' variety .5 .5 -1;
  estimate '(v1-v2)*(n-lin)' variety*nitrogen
        -1 0 1 1 0 -1 0 0 0;
  estimate '(v1-v2)*(n-quad)' variety*nitrogen
        -1 2 -1 1 -2 1 0 0 0;
  estimate '(.5(v1+v2)-v3)*(n-lin)' variety*nitrogen
        -.5 0 .5 -.5 0 .5 1 0 -1;
  estimate '(.5(v1+v2)-v3)*(n-quad)' variety*nitrogen
        -.5 1 -.5 -.5 1 -.5 1 -2 1;
run;

/* Make a profile plot for the interaction
   between varieties and nitrogen levels */

/* UNIX users can use the following options */

/* goptions cback=white colors=(black)
     targetdevice=ps300 rotate=landscape; */

/* Windows users can use the following */

goptions cback=white colors=black
   device=WIN target=WINPRTC;

proc sort data=set1; by variety nitrogen;

proc means data=set1 noprint;
  by variety nitrogen;
  var yield;
  output out=means mean=my;
run;

axis1 label=(f=swiss h=2.5)
      ORDER = 120 to 300 by 30
      value=(f=swiss h=2.0) w=3.0 length= 5.5 in;

axis2 label=(f=swiss h=2.0)
      order = 50 to 80 by 10
      value=(f=swiss h=2.0) w= 3.0 length = 5.5 in;

SYMBOL1 V=CIRCLE H=2.0 w=3 l=1 i=join ;
SYMBOL2 V=DIAMOND H=2.0 w=3 l=3 i=join ;
SYMBOL3 V=square H=2.0 w=3 l=9 i=join ;

PROC GPLOT DATA=means;
  PLOT my*nitrogen=variety / vaxis=axis2 haxis=axis1;
  TITLE1 H=3.0 F=swiss "Sugar Cane Yields";
  LABEL my='Mean Yield';
  LABEL nitrogen = 'Nitrogen (lb/acre)';
RUN;

General Form of Estimable Functions

Effect                        Coefficients
Intercept                     L1
variety          1            L2
variety          2            L3
variety          3            L1-L2-L3
nitrogen         150          L5
nitrogen         210          L6
nitrogen         270          L1-L5-L6
variety*nitrogen 1 150        L8
variety*nitrogen 1 210        L9
variety*nitrogen 1 270        L2-L8-L9
variety*nitrogen 2 150        L11
variety*nitrogen 2 210        L12
variety*nitrogen 2 270        L3-L11-L12
variety*nitrogen 3 150        L5-L8-L11
variety*nitrogen 3 210        L6-L9-L12
variety*nitrogen 3 270        L1-L2-L3-L5-L6+L8+L9+L11+L12

Type I Estimable Functions

                              ------------------------ Coefficients ------------------------
Effect                        variety                nitrogen               variety*nitrogen
Intercept                     0                      0                      0
variety          1            L2                     0                      0
variety          2            L3                     0                      0
variety          3            -L2-L3                 0                      0
nitrogen         150          0                      L5                     0
nitrogen         210          0                      L6                     0
nitrogen         270          0                      -L5-L6                 0
variety*nitrogen 1 150        0.3333*L2              0.3333*L5              L8
variety*nitrogen 1 210        0.3333*L2              0.3333*L6              L9
variety*nitrogen 1 270        0.3333*L2              -0.3333*L5-0.3333*L6   -L8-L9
variety*nitrogen 2 150        0.3333*L3              0.3333*L5              L11
variety*nitrogen 2 210        0.3333*L3              0.3333*L6              L12
variety*nitrogen 2 270        0.3333*L3              -0.3333*L5-0.3333*L6   -L11-L12
variety*nitrogen 3 150        -0.3333*L2-0.3333*L3   0.3333*L5              -L8-L11
variety*nitrogen 3 210        -0.3333*L2-0.3333*L3   0.3333*L6              -L9-L12
variety*nitrogen 3 270        -0.3333*L2-0.3333*L3   -0.3333*L5-0.3333*L6   L8+L9+L11+L12

Type II, Type III, and Type IV Estimable Functions

For these balanced data, the Type II, Type III, and Type IV estimable functions listed by PROC GLM have the same coefficients as the Type I estimable functions shown above.

Dependent Variable: yield

                   Sum of      Mean
Source      DF     Squares     Square     F      Pr > F
Model        8     935.7022    116.9628   2.52   0.0346
Error       27    1254.4600     46.4615
C. Total    35    2190.1622

                                Mean
Source      DF   Type I SS     Square     F      Pr > F
variety      2   319.3739      159.6869   3.44   0.0468
nitrogen     2    56.5406       28.2703   0.61   0.5515
var*nit      4   559.7878      139.9469   3.01   0.0355

The Type II, Type III, and Type IV sums of squares, mean squares, F values, and p-values reported for variety, nitrogen, and var*nit are identical to the Type I values.

Least Squares Means

                     yield         Standard
variety  nitrogen    LSMEAN        Error        Pr > |t|
1        150         66.525        3.408133     <.0001
1        210         68.975        3.408133     <.0001
1        270         75.950        3.408133     <.0001
2        150         61.450        3.408133     <.0001
2        210         62.550        3.408133     <.0001
2        270         70.425        3.408133     <.0001
3        150         68.600        3.408133     <.0001
3        210         64.525        3.408133     <.0001
3        270         57.900        3.408133     <.0001

Least Squares Means for effect variety*nitrogen
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: yield

i/j      1       2       3       4       5       6       7       8       9
1             0.6154  0.0610  0.3017  0.4168  0.4255  0.6702  0.6815  0.0848
2     0.6154          0.1594  0.1301  0.1937  0.7658  0.9386  0.3640  0.0296
3     0.0610  0.1594          0.0056  0.0098  0.2617  0.1389  0.0252  0.0009
4     0.3017  0.1301  0.0056          0.8212  0.0735  0.1495  0.5289  0.4678
5     0.4168  0.1937  0.0098  0.8212          0.1139  0.2202  0.6852  0.3432
6     0.4255  0.7658  0.2617  0.0735  0.1139          0.7079  0.2315  0.0150
7     0.6702  0.9386  0.1389  0.1495  0.2202  0.7079          0.4053  0.0350
8     0.6815  0.3640  0.0252  0.5289  0.6852  0.2315  0.4053          0.1806
9     0.0848  0.0296  0.0009  0.4678  0.3432  0.0150  0.0350  0.1806

Contrast                    DF    Contrast SS    Mean Square    F Value   Pr > F
n-linear                     1     39.5266667     39.5266667      0.85    0.3645
n-quad                       1     17.0138889     17.0138889      0.37    0.5501
v1-v2                        1    193.2337500    193.2337500      4.16    0.0513
(v1+v2)-v3                   1    126.1401389    126.1401389      2.71    0.1110
(v1-v2)*(n-lin)              1      0.2025000      0.2025000      0.00    0.9478
(v1-v2)*(n-quad)             1      1.6875000      1.6875000      0.04    0.8503
(.5(v1+v2)-v3)*(n-lin)       1    528.0133333    528.0133333     11.36    0.0023
(.5(v1+v2)-v3)*(n-quad)      1     29.8844444     29.8844444      0.64    0.4296

                                            Standard
Parameter                    Estimate       Error         t Value   Pr > |t|
n-linear                     2.5666667      2.7827289      0.92     0.3645
n-quad                      -2.9166667      4.8198279     -0.61     0.5501
v1-v2                        5.6750000      2.7827289      2.04     0.0513
(v1+v2)-v3                   3.9708333      2.4099139      1.65     0.1110
(v1-v2)*(n-lin)              0.4500000      6.8162659      0.07     0.9478
(v1-v2)*(n-quad)             2.2500000     11.8061189      0.19     0.8503
(.5(v1+v2)-v3)*(n-lin)      19.9000000      5.9030595      3.37     0.0023
(.5(v1+v2)-v3)*(n-quad)     -8.2000000     10.2243989     -0.80     0.4296

Two factor experiments with empty cells

Data from Littell, Freund, and Spector, 1991, SAS System for Linear Models, 3rd edition, SAS Institute, Cary, N.C.
j=1 Factor A Y111 = 5 Y112 = 6 i=1 Y211 = 2 Y212 = 3 i=2 Factor B j=2 Y121 = 2 Y122 = 3 Y123 = 5 Y124 = 6 Y125 = 7 Y221 = 8 Y222 = 8 Y223 = 9 j=3 { Sample sizes: Factor B Factor A j = 1 j = 2 j = 3 i = 1 n11 = 2 n12 = 5 { i = 2 n21 = 2 n22 = 3 n23 = 5 Eects model: Y231 = 4 Y232 = 4 Y233 = 6 Y234 = 6 Y235 = 7 Yijk = + i + j + ij + ijk for (i; j ) = 6 (1; 3) and k = 1; : : : ; nij 634 633 ij = = E (Yij:) + i + j + ij is estimable for all (i; j ) 6= (1; 3). Functions of parameters that are not estimable include: 13 = + 1 + 3 + 13 :: = 61 2 3 ij i=1 j =1 = + 21 (1 + 2) + 13 (1 + 2 + 3) X X + 61 (11 + 12 + 13 + 21 + 22 + 23): 1: = 31 3 1j j =1 X 635 :3 = 12 (13 + 23) Two factor classications with empty cells: Compute F-tests and sums of No single \best" or \correct" Compare estimated means for analysis. Analysis of variance { Test for interaction is useful { Use SSE to estimate the error variance 2. { Tests for \main eects" may not be meaningful, especially in the presence of interaction. squares for meaningful contrasts. dierent combinations of factor levels. Consider the combinations of factor levels as levels of a single \combined" factor. { one-way ANOVA { contrasts { compare means 637 636 2 1 2 1 2 2 2 2 2 2 2 3 2 3 2 3 2 3 2 3 run; /* SAS code for analyzing data from the two factor experiment with no data for one combination of factors> This code is posted as littell.sas */ data set1; input A B y; cards; 1 1 5 1 1 6 1 2 2 1 2 3 1 2 5 1 2 6 1 2 7 2 3 8 8 9 4 4 6 6 7 /* Print the data */ proc print data=set1; run; /* Compute sample means for all factor combinations with data. Make a profile plot. 
*/

proc sort data=set1;
  by a b;
proc means data=set1 noprint;
  by a b;
  var Y;
  output out=means mean=my;
run;

SYMBOL1 V=circle H=2.0 w=3 l=1 i=join;
SYMBOL2 V=diamond H=2.0 w=3 l=3 i=join;

goptions cback=white colors=black device=WIN target=WINPRTC;

/* goptions cback=white colors=(black)
   targetdevice=ps300 rotate=landscape; */

axis1 label=(f=swiss h=2.0)
      value=(f=swiss h=1.8) w=3.0 length=5.0 in;
axis2 label=(f=swiss h=2.0 a=90 r=0)
      value=(f=swiss h=1.8) w=3.0 length=5.0 in;

proc gplot data=means;
  plot my*b=a / vaxis=axis2 haxis=axis1;
  title ls=0.8in H=3.0 F=swiss "Sample Means";
  label my='Mean';
  label b = 'Factor B';
  footnote ls=0.4in ' ';
run;

/* Perform analysis of variance where factor A
   is entered into the model before factor B.
   Use the LSMEANS statement to compare means
   for different combinations of factor A
   and factor B. */

proc glm data=set1;
  class A B;
  model y = A B A*B / solution ss1 ss2 ss3 ss4 e e1 e2 e3 e4 p;
  means A B A*B;
  lsmeans A*B / pdiff tdiff stderr;
  estimate 'A1-A2' A 1 -1 / e;
  contrast 'A1-A2' A 1 -1 / e;
  estimate 'A1-A2 within B1' A 1 -1 A*B 1 0 -1 0 0 / e;
  estimate 'A1-A2 within B2' A 1 -1 A*B 0 1 0 -1 0 / e;
  estimate 'A1-A2 over B' A 1 -1 A*B .5 .5 -.5 -.5 0 / e;
  estimate 'B1-B2 over A' B 1 -1 0 A*B .5 -.5 .5 -.5 0 / e;
  estimate 'B3-.5(B1+B2) in A2' B -.5 -.5 1 A*B 0 0 -.5 -.5 1 / e;
  estimate 'interaction' A*B 1 -1 -1 1 0 / e;
run;

/* Do everything with a one-factor ANOVA by
   combining the two factors into a single
   factor with 5 categories.
*/

data set1;
  set set1;
  C=10*A+B;
run;

proc glm data=set1;
  class C;
  model y = C / solution e e2;
  estimate 'C11-C21' C 1 0 -1 0 0;
  estimate 'C12-C22' C 0 1 0 -1 0;
  estimate '.5(C11+C12-C21+C22)' C .5 .5 -.5 -.5 0;
  estimate '.5(C11-C12+C21-C22)' C .5 -.5 .5 -.5 0;
  estimate 'C23-.5(C21+C22)' C 0 0 -.5 -.5 1;
  estimate 'C11-C12-C21+C22' C 1 -1 -1 1 0;
  lsmeans C / stderr tdiff pdiff;
run;

Output for the two-factor model (effects A, B, A*B):

General Form of Estimable Functions

Effect           Coefficients
Intercept        L1
A      1         L2
A      2         L1-L2
B      1         L4
B      2         L5
B      3         L1-L4-L5
A*B    1 1       L7
A*B    1 2       L2-L7
A*B    2 1       L4-L7
A*B    2 2       -L2+L5+L7
A*B    2 3       L1-L4-L5

Type III Estimable Functions

                 --------------Coefficients--------------
Effect           A           B                    A*B
Intercept        0           0                    0
A      1         L2          0                    0
A      2         -L2         0                    0
B      1         0           L4                   0
B      2         0           L5                   0
B      3         0           -L4-L5               0
A*B    1 1       0.5*L2      0.25*L4-0.25*L5      L7
A*B    1 2       0.5*L2      -0.25*L4+0.25*L5     -L7
A*B    2 1       -0.5*L2     0.75*L4+0.25*L5      -L7
A*B    2 2       -0.5*L2     0.25*L4+0.75*L5      L7
A*B    2 3       0           -L4-L5               0

Type IV Estimable Functions

                 ---------Coefficients---------
Effect           A           B           A*B
Intercept        0           0           0
A      1         L2          0           0
A      2         -L2         0           0
B      1         0           L4          0
B      2         0           L5          0
B      3         0           -L4-L5      0
A*B    1 1       0.5*L2      0           L7
A*B    1 2       0.5*L2      0           -L7
A*B    2 1       -0.5*L2     L4          -L7
A*B    2 2       -0.5*L2     L5          L7
A*B    2 3       0           -L4-L5      0

NOTE: Other Type IV estimable functions exist.

Output for the one-factor model (combined factor C):

General Form of Estimable Functions

Effect           Coefficients
Intercept        L1
C      11        L2
C      12        L3
C      21        L4
C      22        L5
C      23        L1-L2-L3-L4-L5

Type II Estimable Functions

                 -Coefficients-
Effect           C
Intercept        0
C      11        L2
C      12        L3
C      21        L4
C      22        L5
C      23        -L2-L3-L4-L5

Dependent Variable: y

Source      DF    Sum of Squares    Mean Square    F Value    Pr > F
Model        4           45.8157        11.4539       5.27    0.0110
Error       12           26.0667         2.1722
C. Total    16           71.8824

Parameter                 Estimate    Standard Error        t    Pr > |t|
C11-C21                     3.0000            1.4738     2.04      0.0645
C12-C22                    -3.7333            1.0763    -3.47      0.0046
.5(C11+C12-C21+C22)        -0.3667            0.9125    -0.40      0.6949
.5(C11-C12+C21-C22)        -2.4667            0.9125    -2.70      0.0192
C23-.5(C21+C22)            -0.0167            0.9418    -0.02      0.9862
C11-C12-C21+C22             6.7333            1.8250     3.69      0.0031

Least Squares Means

C      y LSMEAN    Standard Error    Pr > |t|    LSMEAN Number
11       5.5000            1.0421      0.0002                1
12       4.6000            0.6591      <.0001                2
21       2.5000            1.0422      0.0336                3
22       8.3333            0.8509      <.0001                4
23       5.4000            0.6591      <.0001                5

Least Squares Means for Effect C
t for H0: LSMean(i)=LSMean(j) / Pr > |t|
Dependent Variable: y

i/j          1          2          3          4          5
1                  0.7299     2.0355    -2.1059     0.0811
                   0.4795     0.0645     0.0569     0.9367
2      -0.7299                1.7030    -3.4685    -0.8582
        0.4795                0.1143     0.0046     0.4076
3      -2.0355    -1.7030               -4.3357    -2.3518
        0.0645     0.1143                0.0010     0.0366
4       2.1059     3.4685     4.3357                2.7253
        0.0569     0.0046     0.0010                0.0184
5      -0.0811     0.8582     2.3518    -2.7253
        0.9367     0.4076     0.0366     0.0184

Estimable functions for Type IV sums of squares may depend on

• location of empty cells
• ordering of the levels for the row and column factors

Example: Exchange columns 1 and 3 in the previous example.

                                  Factor 2
Factor 1     A (old j=3)          B                     C (old j=1)
i = 1        -                    Ybar12. = 4.6         Ybar13. = 5.5
                                  n12 = 5               n13 = 2
i = 2        Ybar21. = 5.4        Ybar22. = 8.33        Ybar23. = 2.5
             n21 = 5              n22 = 3               n23 = 2

Type IV estimable functions for Factor B:

             A       B       C
i = 1        -       0       0
i = 2        1       0      -1

$$\mu_{2A} - \mu_{2C}$$

             A       B       C
i = 1        -      .5     -.5
i = 2        0      .5     -.5

$$\frac{1}{2}(\mu_{1B} + \mu_{2B}) - \frac{1}{2}(\mu_{1C} + \mu_{2C})$$

In either case, Type IV sums of squares and testable functions are not the same as Type III sums of squares and testable functions.

Additive model (main effects, no interaction)

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}, \qquad i = 1, \ldots, a, \quad j = 1, \ldots, b, \quad k = 1, \ldots, n_{ij}$$

For this model $E(Y_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j$ may be estimable when $n_{ij} = 0$.
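For example 8.1, $n_{13} = 0$, but

$$\mu_{13} = \mu + \alpha_1 + \beta_3
  = (\mu + \alpha_2 + \beta_3) - (\mu + \alpha_2 + \beta_2) + (\mu + \alpha_1 + \beta_2)
  = \mu_{23} + (\mu_{12} - \mu_{22})
  = E(\bar{Y}_{23\cdot} - \bar{Y}_{22\cdot} + \bar{Y}_{12\cdot})$$

Summary

Sum of Squares: $R(\mu)$
Associated null hypothesis:
    $H_0: \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j = 0$
    or $H_0: \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij} = 0$

Sum of Squares: $R(\alpha \mid \mu)$
Associated null hypothesis:
    $H_0: \alpha_i + \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \beta_j$ are equal for all $i = 1, \ldots, a$
    or $H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij}$ are equal for all $i = 1, \ldots, a$

Sum of Squares: $R(\beta \mid \mu, \alpha)$
Associated null hypothesis:
    $H_0: \beta_j$ are equal for all $j = 1, \ldots, b$

Sum of Squares: $R(\mu)$
Associated null hypothesis:
    $H_0: \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j = 0$
    or $H_0: \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij} = 0$

Sum of Squares: $R(\beta \mid \mu)$
Associated null hypothesis:
    $H_0: \beta_j + \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \alpha_i$ are equal for all $j = 1, \ldots, b$
    or $H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \mu_{ij}$ are equal for all $j = 1, \ldots, b$

Sum of Squares: $R(\alpha \mid \mu, \beta)$
Associated null hypothesis:
    $H_0: \alpha_i$ are equal for all $i = 1, \ldots, a$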
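
As a cross-check on the empty-cell example from Littell, Freund, and Spector, the short Python sketch below recomputes a few quantities by hand. It is not part of littell.sas. The dictionary `cells`, the helper `contrast`, and the name `mu13_hat` are illustrative names introduced here, and the data values are assumed to be the 17 observations listed earlier in this section. For a contrast $\sum_c k_c \mu_c$ in the cell means, the sketch uses the estimate $\sum_c k_c \bar{Y}_c$ with standard error $\sqrt{MSE \sum_c k_c^2 / n_c}$, where MSE is the pooled within-cell mean square.

```python
from math import sqrt

# Observations by cell (level i of factor A, level j of factor B).
# Cell (1, 3) has no data and is simply absent.
cells = {
    (1, 1): [5, 6],
    (1, 2): [2, 3, 5, 6, 7],
    (2, 1): [2, 3],
    (2, 2): [8, 8, 9],
    (2, 3): [4, 4, 6, 6, 7],
}

n = {c: len(y) for c, y in cells.items()}
mean = {c: sum(y) / len(y) for c, y in cells.items()}

# Pooled within-cell (error) sum of squares, its d.f., and the MSE.
sse = sum((v - mean[c]) ** 2 for c, y in cells.items() for v in y)
df_error = sum(n.values()) - len(cells)
mse = sse / df_error


def contrast(coef):
    """Return (estimate, standard error, t) for sum_c coef[c] * mean[c]."""
    est = sum(k * mean[c] for c, k in coef.items())
    se = sqrt(mse * sum(k ** 2 / n[c] for c, k in coef.items()))
    return est, se, est / se


results = {
    "C11-C21": contrast({(1, 1): 1, (2, 1): -1}),
    "C12-C22": contrast({(1, 2): 1, (2, 2): -1}),
    "C11-C12-C21+C22": contrast(
        {(1, 1): 1, (1, 2): -1, (2, 1): -1, (2, 2): 1}
    ),
}

# Under the additive model only: one unbiased estimator of mu_13,
# built from observed cell means (not necessarily the least squares
# estimate from the additive fit).
mu13_hat = mean[(2, 3)] - mean[(2, 2)] + mean[(1, 2)]

print(f"SSE = {sse:.4f} on {df_error} d.f., MSE = {mse:.4f}")
for label, (est, se, t) in results.items():
    print(f"{label:16s} {est:8.4f} {se:8.4f} {t:6.2f}")
print(f"mu13_hat = {mu13_hat:.4f}")
```

The error sum of squares, the contrast estimates, and their standard errors should agree with the PROC GLM listing for the one-factor model up to rounding. The last quantity relies on the additive model, so it should not be read as an estimate of $\mu_{13}$ when interaction is present.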