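Unbalanced 2-Factor Studies
KNNL – Chapter 23

Unequal Sample Sizes
• When sample sizes are unequal, calculations and parameter interpretations (especially marginal ones) become messier
• Observational studies often have unequal sample sizes because sampling units are not equally available for all combinations of factor levels (villagers of certain types in a rural study, for instance)
• Experimental studies, even when planned with equal sample sizes, can end up unbalanced through technical problems or "drop outs"
• Some conditions may be cheaper to measure than others, and will have larger sample sizes
• Some situations have particular contrasts of higher importance

Regression Approach - I

Sample sizes: number of cases when Factor A is at level i and Factor B is at level j: $n_{ij}$

\[ n_{i\cdot} = \sum_{j=1}^{b} n_{ij} \qquad n_{\cdot j} = \sum_{i=1}^{a} n_{ij} \qquad n_T = \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij} \qquad Y_{ij\cdot} = \sum_{k=1}^{n_{ij}} Y_{ijk} \qquad \bar{Y}_{ij\cdot} = \frac{Y_{ij\cdot}}{n_{ij}} \]

Model:
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \qquad \varepsilon_{ijk} \sim N(0, \sigma^2) \text{ (independent)} \]

Restrictions on effects:
\[ \sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = \sum_{i=1}^{a} (\alpha\beta)_{ij} = \sum_{j=1}^{b} (\alpha\beta)_{ij} = 0 \]
\[ \alpha_a = -\alpha_1 - \alpha_2 - \dots - \alpha_{a-1} \qquad \beta_b = -\beta_1 - \beta_2 - \dots - \beta_{b-1} \]
\[ (\alpha\beta)_{ib} = -(\alpha\beta)_{i1} - (\alpha\beta)_{i2} - \dots - (\alpha\beta)_{i,b-1} \qquad (\alpha\beta)_{aj} = -(\alpha\beta)_{1j} - (\alpha\beta)_{2j} - \dots - (\alpha\beta)_{a-1,j} \]

Regression Approach - II

Regression model:
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_1 X_{ijk1} + \dots + \alpha_{a-1} X_{ijk,a-1} + \beta_1 X_{ijk,a} + \dots + \beta_{b-1} X_{ijk,a+b-2} + (\alpha\beta)_{11} X_{ijk1} X_{ijk,a} + \dots + (\alpha\beta)_{a-1,b-1} X_{ijk,a-1} X_{ijk,a+b-2} + \varepsilon_{ijk} \]

where, for Factor A:
\[ X_{ijk1} = \begin{cases} 1 & \text{if case from level 1 of factor A} \\ -1 & \text{if case from level } a \text{ of factor A} \\ 0 & \text{otherwise} \end{cases} \qquad \dots \qquad X_{ijk,a-1} = \begin{cases} 1 & \text{if case from level } a-1 \text{ of factor A} \\ -1 & \text{if case from level } a \text{ of factor A} \\ 0 & \text{otherwise} \end{cases} \]

and, for Factor B:
\[ X_{ijk,a} = \begin{cases} 1 & \text{if case from level 1 of factor B} \\ -1 & \text{if case from level } b \text{ of factor B} \\ 0 & \text{otherwise} \end{cases} \qquad \dots \qquad X_{ijk,a+b-2} = \begin{cases} 1 & \text{if case from level } b-1 \text{ of factor B} \\ -1 & \text{if case from level } b \text{ of factor B} \\ 0 & \text{otherwise} \end{cases} \]

Regression Approach – Example I

Style (Factor A)                  Poets (B1)   Year at Peak   Novelists (B2)   Year at Peak
Conceptualists (A1, Finders)      Eliot        23             Fitzgerald       29
                                  Cummings     26             Hemingway        30
                                  Plath        30             Melville         32
                                  Pound        30             Lawrence         35
                                  Wilbur       34             Joyce            40
  n & Mean                        n11 = 5      28.60          n12 = 5          33.20
Experimentalists (A2, Seekers)    Bishop       29             James            38
                                  Moore        32             Faulkner         39
                                  Williams     40             Dickens          41
                                  Lowell       41             Woolf            45
                                  Stevens      42             Conrad           47
                                  Frost        48             Twain            50
                                                              Hardy            51
  n & Mean                        n21 = 6      38.67          n22 = 7          44.43

With a = b = 2 the regression model is
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_1 X_{ijk1} + \beta_1 X_{ijk2} + (\alpha\beta)_{11} X_{ijk1} X_{ijk2} + \varepsilon_{ijk} \]
\[ \alpha_2 = -\alpha_1 \qquad \beta_2 = -\beta_1 \qquad (\alpha\beta)_{12} = (\alpha\beta)_{21} = -(\alpha\beta)_{11} \qquad (\alpha\beta)_{22} = (\alpha\beta)_{11} \]

Predictor values (identical for every case within a cell):

Cell      n_ij   X_ijk1   X_ijk2   X_ijk1*X_ijk2
A1, B1    5       1        1        1
A1, B2    5       1       -1       -1
A2, B1    6      -1        1       -1
A2, B2    7      -1       -1        1

Testing Strategies – Models Fit
1) Model 1: all Factor A, Factor B, and Interaction AB effects
2) Model 2: all Factor A and Factor B effects (remove Interaction)
3) Model 3: all Factor B and Interaction AB effects (remove A)
4) Model 4: all Factor A and Interaction AB effects (remove B)
5) Testing for Interaction effects: Full = Model 1, Reduced = Model 2; df_num = (a-1)(b-1), df_den = n_T - ab
6) Testing for Factor A effects: Full = Model 1, Reduced = Model 3; df_num = a-1, df_den = n_T - ab
7) Testing for Factor B effects: Full = Model 1, Reduced = Model 4; df_num = b-1, df_den = n_T - ab

Regression Approach – Example - Continued

Model 1: $E\{Y_{ijk}\} = \mu_{\cdot\cdot} + \alpha_1 X_{ijk1} + \beta_1 X_{ijk2} + (\alpha\beta)_{11} X_{ijk1} X_{ijk2}$, fitted as $\hat{Y} = 36.22 - 5.32 X_1 - 2.59 X_2 + 0.29 X_1 X_2$
Model 2: $E\{Y_{ijk}\} = \mu_{\cdot\cdot} + \alpha_1 X_{ijk1} + \beta_1 X_{ijk2}$, fitted as $\hat{Y} = 36.23 - 5.33 X_1 - 2.63 X_2$
Model 3: $E\{Y_{ijk}\} = \mu_{\cdot\cdot} + \beta_1 X_{ijk2} + (\alpha\beta)_{11} X_{ijk1} X_{ijk2}$, fitted as $\hat{Y} = 36.90 - 2.77 X_2 + 0.47 X_1 X_2$
Model 4: $E\{Y_{ijk}\} = \mu_{\cdot\cdot} + \alpha_1 X_{ijk1} + (\alpha\beta)_{11} X_{ijk1} X_{ijk2}$, fitted as $\hat{Y} = 36.31 - 5.41 X_1 + 0.62 X_1 X_2$

ANOVA         Model 1          Model 2          Model 3          Model 4
Source        df   SS          df   SS          df   SS          df   SS
Regression    3    827.91      2    826.01      2    188.77      2    676.58
Residual      19   557.05      20   558.95      20   1196.19     20   708.37
Total         22   1384.96     22   1384.96     22   1384.96     22   1384.96

Regression Approach – Example - Continued

Test for interaction:
$H_0: (\alpha\beta)_{11} = (\alpha\beta)_{12} = (\alpha\beta)_{21} = (\alpha\beta)_{22} = 0$ versus $H_A$: Interaction exists
SSE(R) = 558.95, df_E(R) = 20; SSE(F) = 557.05, df_E(F) = 19
\[ TS: F^*_{AB} = \frac{[SSE(R) - SSE(F)] / [df_E(R) - df_E(F)]}{SSE(F) / df_E(F)} = \frac{(558.95 - 557.05)/(20 - 19)}{557.05/19} = 0.065 \qquad RR: F^*_{AB} \ge F(.95; 1, 19) = 4.381 \]
Since 0.065 < 4.381, do not reject $H_0$.

Regression Approach – Example - Continued

Test for Factor A effects:
$H_0: \alpha_1 = \alpha_2 = 0$ versus $H_A$: Factor A effects exist
SSE(R) = 1196.19, df_E(R) = 20
\[ F^*_A = \frac{(1196.19 - 557.05)/(20 - 19)}{557.05/19} = 21.80 \qquad RR: F^*_A \ge F(.95; 1, 19) = 4.381 \]
Since 21.80 > 4.381, reject $H_0$.

Test for Factor B effects:
$H_0: \beta_1 = \beta_2 = 0$ versus $H_A$: Factor B effects exist
SSE(R) = 708.37, df_E(R) = 20
\[ F^*_B = \frac{(708.37 - 557.05)/(20 - 19)}{557.05/19} = 5.16 \qquad RR: F^*_B \ge F(.95; 1, 19) = 4.381 \]
Since 5.16 > 4.381, reject $H_0$.

Estimating Treatment and Factor Level Means/Contrasts

Treatment means:
\[ \text{Parameter: } \mu_{ij} \qquad \text{Estimator: } \hat{\mu}_{ij} = \bar{Y}_{ij\cdot} = \frac{\sum_{k=1}^{n_{ij}} Y_{ijk}}{n_{ij}} \qquad \text{Estimated standard error: } s\{\hat{\mu}_{ij}\} = \sqrt{\frac{MSE}{n_{ij}}} \]

Factor A means:
\[ \text{Parameter: } \mu_{i\cdot} = \frac{\sum_{j=1}^{b} \mu_{ij}}{b} \qquad \text{Estimator: } \hat{\mu}_{i\cdot} = \frac{\sum_{j=1}^{b} \bar{Y}_{ij\cdot}}{b} \qquad s\{\hat{\mu}_{i\cdot}\} = \sqrt{\frac{MSE}{b^2} \sum_{j=1}^{b} \frac{1}{n_{ij}}} \]

Factor B means:
\[ \text{Parameter: } \mu_{\cdot j} = \frac{\sum_{i=1}^{a} \mu_{ij}}{a} \qquad \text{Estimator: } \hat{\mu}_{\cdot j} = \frac{\sum_{i=1}^{a} \bar{Y}_{ij\cdot}}{a} \qquad s\{\hat{\mu}_{\cdot j}\} = \sqrt{\frac{MSE}{a^2} \sum_{i=1}^{a} \frac{1}{n_{ij}}} \]

Contrast or linear function of Factor A means:
\[ \text{Parameter: } L_A = \sum_{i=1}^{a} c_i \mu_{i\cdot} \qquad \text{Estimator: } \hat{L}_A = \sum_{i=1}^{a} c_i \hat{\mu}_{i\cdot} \qquad s\{\hat{L}_A\} = \sqrt{\frac{MSE}{b^2} \sum_{i=1}^{a} c_i^2 \sum_{j=1}^{b} \frac{1}{n_{ij}}} \]

Contrast or linear function of Factor B means:
\[ \text{Parameter: } L_B = \sum_{j=1}^{b} c_j \mu_{\cdot j} \qquad \text{Estimator: } \hat{L}_B = \sum_{j=1}^{b} c_j \hat{\mu}_{\cdot j} \qquad s\{\hat{L}_B\} = \sqrt{\frac{MSE}{a^2} \sum_{j=1}^{b} c_j^2 \sum_{i=1}^{a} \frac{1}{n_{ij}}} \]

Contrast or linear function of treatment means:
\[ \text{Parameter: } L_{AB} = \sum_{i=1}^{a} \sum_{j=1}^{b} c_{ij} \mu_{ij} \qquad \text{Estimator: } \hat{L}_{AB} = \sum_{i=1}^{a} \sum_{j=1}^{b} c_{ij} \bar{Y}_{ij\cdot} \qquad s\{\hat{L}_{AB}\} = \sqrt{MSE \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{c_{ij}^2}{n_{ij}}} \]

Standard Error Multipliers

Single comparisons: $t(1 - \alpha/2;\ n_T - ab)$

General multiple comparisons of treatment (cell) means:
• Scheffé: $S = \sqrt{(ab - 1)\, F(1 - \alpha;\ ab - 1,\ n_T - ab)}$
• Bonferroni: $B = t(1 - \alpha/(2g);\ n_T - ab)$
• Tukey (all pairs of treatment means): $T = \frac{1}{\sqrt{2}}\, q(1 - \alpha;\ ab,\ n_T - ab)$

General multiple comparisons of factor level means:
• Scheffé: Factor A: $S_A = \sqrt{(a - 1)\, F(1 - \alpha;\ a - 1,\ n_T - ab)}$; Factor B: $S_B = \sqrt{(b - 1)\, F(1 - \alpha;\ b - 1,\ n_T - ab)}$
• Bonferroni: Factor A or Factor B: $B = t(1 - \alpha/(2g);\ n_T - ab)$
• Tukey: Factor A: $T_A = \frac{1}{\sqrt{2}}\, q(1 - \alpha;\ a,\ n_T - ab)$; Factor B: $T_B = \frac{1}{\sqrt{2}}\, q(1 - \alpha;\ b,\ n_T - ab)$

Creative Life Cycles – Comparing Treatment Means

Comparing all 4 treatment means (although no interaction was present), with MSE = 557.05/19 = 29.32:

Treatment                      Mean                          n_ij   s{Ybar_ij.} = sqrt(MSE/n_ij)
Conceptualists/Poets           $\bar{Y}_{11\cdot}$ = 28.60   5      sqrt(29.32/5) = 2.42
Conceptualists/Novelists       $\bar{Y}_{12\cdot}$ = 33.20   5      sqrt(29.32/5) = 2.42
Experimentalists/Poets         $\bar{Y}_{21\cdot}$ = 38.67   6      sqrt(29.32/6) = 2.21
Experimentalists/Novelists     $\bar{Y}_{22\cdot}$ = 44.43   7      sqrt(29.32/7) = 2.05

\[ T = \frac{1}{\sqrt{2}}\, q(0.95;\ 4,\ 23 - 4 = 19) = \frac{3.977}{\sqrt{2}} = 2.812 \qquad s\{\bar{Y}_{ij\cdot} - \bar{Y}_{i'j'\cdot}\} = \sqrt{MSE \left( \frac{1}{n_{ij}} + \frac{1}{n_{i'j'}} \right)} \qquad HSD = T \cdot s\{\bar{Y}_{ij\cdot} - \bar{Y}_{i'j'\cdot}\} \]

Comparison                                    Difference                  Standard error                  HSD
$\bar{Y}_{11\cdot} - \bar{Y}_{12\cdot}$       28.60 - 33.20 = -4.60       sqrt(29.32(1/5 + 1/5)) = 3.42   9.63
$\bar{Y}_{11\cdot} - \bar{Y}_{21\cdot}$       28.60 - 38.67 = -10.07      sqrt(29.32(1/5 + 1/6)) = 3.28   9.22
$\bar{Y}_{11\cdot} - \bar{Y}_{22\cdot}$       28.60 - 44.43 = -15.83      sqrt(29.32(1/5 + 1/7)) = 3.17   8.92
$\bar{Y}_{12\cdot} - \bar{Y}_{21\cdot}$       33.20 - 38.67 = -5.47       sqrt(29.32(1/5 + 1/6)) = 3.28   9.22
$\bar{Y}_{12\cdot} - \bar{Y}_{22\cdot}$       33.20 - 44.43 = -11.23      sqrt(29.32(1/5 + 1/7)) = 3.17   8.92
$\bar{Y}_{21\cdot} - \bar{Y}_{22\cdot}$       38.67 - 44.43 = -5.76       sqrt(29.32(1/6 + 1/7)) = 3.01   8.47

Treatment means in increasing order: Conceptualists/Poets, Conceptualists/Novelists, Experimentalists/Poets, Experimentalists/Novelists. The differences that exceed their HSD in absolute value are $\bar{Y}_{11\cdot} - \bar{Y}_{21\cdot}$, $\bar{Y}_{11\cdot} - \bar{Y}_{22\cdot}$, and $\bar{Y}_{12\cdot} - \bar{Y}_{22\cdot}$.

Creative Life Cycles – Comparing Factor Level Means

Factor A (Style):
\[ \hat{\mu}_{1\cdot} = \frac{\sum_{j=1}^{b} \bar{Y}_{1j\cdot}}{b} = \frac{28.60 + 33.20}{2} = 30.90 \qquad \hat{\mu}_{2\cdot} = \frac{\sum_{j=1}^{b} \bar{Y}_{2j\cdot}}{b} = \frac{38.67 + 44.43}{2} = 41.55 \qquad \hat{\mu}_{1\cdot} - \hat{\mu}_{2\cdot} = 30.90 - 41.55 = -10.65 \]
\[ s\{\hat{\mu}_{1\cdot} - \hat{\mu}_{2\cdot}\} = \sqrt{\frac{29.32}{2^2} \left[ (1)^2 \left( \frac{1}{5} + \frac{1}{5} \right) + (-1)^2 \left( \frac{1}{6} + \frac{1}{7} \right) \right]} = \sqrt{5.201} = 2.28 \qquad t(0.975;\ 23 - 4 = 19) = 2.093 \]
95% CI for $\mu_{1\cdot} - \mu_{2\cdot}$ (Conceptualists - Experimentalists): $-10.65 \pm (2.093)(2.28) = -10.65 \pm 4.77 = (-15.42, -5.88)$

Factor B (Writer Type):
\[ \hat{\mu}_{\cdot 1} = \frac{\sum_{i=1}^{a} \bar{Y}_{i1\cdot}}{a} = \frac{28.60 + 38.67}{2} = 33.635 \qquad \hat{\mu}_{\cdot 2} = \frac{\sum_{i=1}^{a} \bar{Y}_{i2\cdot}}{a} = \frac{33.20 + 44.43}{2} = 38.815 \qquad \hat{\mu}_{\cdot 1} - \hat{\mu}_{\cdot 2} = 33.635 - 38.815 = -5.18 \]
\[ s\{\hat{\mu}_{\cdot 1} - \hat{\mu}_{\cdot 2}\} = \sqrt{\frac{29.32}{2^2} \left[ (1)^2 \left( \frac{1}{5} + \frac{1}{6} \right) + (-1)^2 \left( \frac{1}{5} + \frac{1}{7} \right) \right]} = \sqrt{5.201} = 2.28 \qquad t(0.975;\ 23 - 4 = 19) = 2.093 \]
95% CI for $\mu_{\cdot 1} - \mu_{\cdot 2}$ (Poets - Novelists): $-5.18 \pm (2.093)(2.28) = -5.18 \pm 4.77 = (-9.95, -0.41)$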
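
The full-versus-reduced model tests in the regression approach example can be checked numerically. The following is a minimal sketch in plain Python, not the software used to produce the slides. It assumes the names and year-at-peak values pair up in the order listed in the example table. The helper names `solve`, `fit`, and `general_f` are illustrative, and solving the normal equations by Gauss-Jordan elimination is a convenience choice made here. The critical value 4.381 is copied from the example rather than computed.

```python
# Year-at-peak values from the example, keyed by (style i, writer type j).
cells = {
    (1, 1): [23, 26, 30, 30, 34],          # Conceptualists / Poets
    (1, 2): [29, 30, 32, 35, 40],          # Conceptualists / Novelists
    (2, 1): [29, 32, 40, 41, 42, 48],      # Experimentalists / Poets
    (2, 2): [38, 39, 41, 45, 47, 50, 51],  # Experimentalists / Novelists
}


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(columns):
    """Least-squares fit of Y on an intercept plus the named predictors.

    Returns (coefficients, SSE). Predictors use the 1 / -1 effect coding
    from the example: x1 for factor A, x2 for factor B, x1x2 for AB.
    """
    rows, y = [], []
    for (i, j), values in cells.items():
        x1 = 1 if i == 1 else -1
        x2 = 1 if j == 1 else -1
        coded = {"x1": x1, "x2": x2, "x1x2": x1 * x2}
        for value in values:
            rows.append([1.0] + [float(coded[c]) for c in columns])
            y.append(float(value))
    p = len(rows[0])
    xtx = [[sum(r[s] * r[t] for r in rows) for t in range(p)] for s in range(p)]
    xty = [sum(r[s] * yi for r, yi in zip(rows, y)) for s in range(p)]
    coefs = solve(xtx, xty)
    sse = sum((yi - sum(c * x for c, x in zip(coefs, r))) ** 2
              for r, yi in zip(rows, y))
    return coefs, sse


def general_f(sse_reduced, df_reduced, sse_full, df_full):
    """General linear test statistic for a reduced model nested in a full one."""
    return ((sse_reduced - sse_full) / (df_reduced - df_full)) / (sse_full / df_full)


n_total = sum(len(v) for v in cells.values())
coefs_full, sse_full = fit(["x1", "x2", "x1x2"])  # Model 1 (full)
_, sse_no_ab = fit(["x1", "x2"])                  # Model 2 (no interaction)
_, sse_no_a = fit(["x2", "x1x2"])                 # Model 3 (no factor A)
_, sse_no_b = fit(["x1", "x1x2"])                 # Model 4 (no factor B)

df_full = n_total - 4        # n_T - ab
df_reduced = n_total - 3     # each reduced model drops one parameter
f_ab = general_f(sse_no_ab, df_reduced, sse_full, df_full)
f_a = general_f(sse_no_a, df_reduced, sse_full, df_full)
f_b = general_f(sse_no_b, df_reduced, sse_full, df_full)

F_CRITICAL = 4.381           # F(.95; 1, 19), taken from the example
for label, stat in [("AB", f_ab), ("A", f_a), ("B", f_b)]:
    decision = "reject H0" if stat >= F_CRITICAL else "do not reject H0"
    print(f"F*_{label} = {stat:.3f}: {decision}")
```

Run as written, the three statistics should agree with the example's values (about 0.065, 21.80, and 5.16) up to rounding. Because every reduced model removes exactly one parameter when a = b = 2, each test has 1 numerator degree of freedom and n_T - ab = 19 denominator degrees of freedom.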