Analyses of Variance

Review

Simple Situation
Genotype   Observations            Mean
A          135   115   102   110   115.5
B           34    76    83    64    64.2

t-test
$t = \dfrac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{2\,[(\sigma_1^2 + \sigma_2^2)/(n_1 + n_2)]}}$

More than two treatments
Rep.   Brundage   Lambert   Croft   Stephens
1      64         78        75      55
2      72         91        93      66
3      68         97        78      49
4      77         82        71      64
5      56         85        63      70
6      95         77        76      68

Multiple t-tests
Brundage v Lambert; Brundage v Croft; Brundage v Stephens; Lambert v Croft; Lambert v Stephens; Croft v Stephens.
Problems? If all tests were done at the 95% significance level and one difference was significant: we have done 6 tests, and 1 test in 20 would be expected to be significant at random.

Analysis of Variance
Is an elegant and quicker way to calculate a pooled error term.
Analysis is simple in simple designs but can be complicated and lengthy in some designs (e.g. rectangular lattices).
In some experimental designs the ANOVA is the only method to estimate a pooled error term.
It can provide an F-test to test specific hypotheses (e.g. to test general differences between different treatments).
Can be an invaluable initial contribution to interpretation of experiments.

Theory of Analysis of Variance
$\sum_{ij}(x_{ij} - \bar{x}_{..})^2 = \sum_{ij}[(x_{ij} - \bar{x}_{i.}) + (\bar{x}_{i.} - \bar{x}_{..})]^2$
$= \sum_{ij}[(x_{ij} - \bar{x}_{i.})^2 + 2(x_{ij} - \bar{x}_{i.})(\bar{x}_{i.} - \bar{x}_{..}) + (\bar{x}_{i.} - \bar{x}_{..})^2]$
$\sum_{ij}(x_{ij} - \bar{x}_{..})^2 = \sum_{ij}(x_{ij} - \bar{x}_{i.})^2 + k\sum_{i}(\bar{x}_{i.} - \bar{x}_{..})^2$
$k\sum_{i}(\bar{x}_{i.} - \bar{x}_{..})^2$ = Between Treatment SS
$\sum_{ij}(x_{ij} - \bar{x}_{i.})^2$ = Within Treatment SS

BTMS ~ $\chi^2$ with n-1 df : WTMS ~ $\chi^2$ with nk-n df
$\chi^2_{n-1\,\mathrm{df}} \,/\, \chi^2_{nk-n\,\mathrm{df}} \sim F$ distribution with n-1, nk-n df

Source of variation   df     EMS
Between treatments    n-1    $\sigma_e^2 + k\sigma_t^2$
Within treatments     nk-n   $\sigma_e^2$
Total                 nk-1
$[\sigma_e^2 + k\sigma_t^2]/\sigma_e^2 = 1$, if $k\sigma_t^2 = 0$

Assumptions behind the ANOVA
Assumption of data being normally distributed.
Homogeneity of error variance.
Additivity of variance effects.
Data collected from a properly randomized experiment.
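The partition of the total sum of squares into between- and within-treatment parts can be checked numerically. The sketch below is a minimal illustration, not the course's own code: it applies the partition to the four-genotype yields, assuming the flattened table is read column-wise after the first replicate (that reading reproduces the mean of 73.75 and the error mean square of 100.9 quoted later in this section). The function name `one_way_anova` and the dictionary layout are hypothetical.

```python
# Yields for the four genotypes, six replicates each.
# ASSUMPTION: the flattened slide table is read column-wise after rep 1.
yields = {
    "Brundage": [64, 72, 68, 77, 56, 95],
    "Lambert": [78, 91, 97, 82, 85, 77],
    "Croft": [75, 93, 78, 71, 63, 76],
    "Stephens": [55, 66, 49, 64, 70, 68],
}


def one_way_anova(groups):
    """Split the total SS into between- and within-treatment SS."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)

    total_ss = sum((x - grand_mean) ** 2 for x in all_values)
    between_ss = 0.0
    within_ss = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        # k * (treatment mean - grand mean)^2, k = replicates in this group
        between_ss += len(values) * (mean - grand_mean) ** 2
        within_ss += sum((x - mean) ** 2 for x in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = between_ss / df_between
    ms_within = within_ss / df_within
    return {
        "total_ss": total_ss,
        "between_ss": between_ss,
        "within_ss": within_ss,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_value": ms_between / ms_within,
    }


result = one_way_anova(yields)
for name, value in result.items():
    print(f"{name:>11}: {value:.2f}")
```

The between- and within-treatment SS add up to the total SS, and the F value is the ratio of the two mean squares on n-1 and nk-n degrees of freedom. Looking up its significance needs an F table or a statistics library, which this sketch leaves out.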
Analyses of CRB Designs
$Y_{ij} = \mu + t_i + e_{ij}$

Analysis of Variance of CRB
Source               df     SS
Between treatments   k-1    $[G_1^2/n_1 + G_2^2/n_2 + \dots + G_k^2/n_k] - CF$
Within treatments    jk-k   By difference
Total                jk-1   $[x_{11}^2 + x_{12}^2 + \dots + x_{jk}^2] - CF$
$CF = [\sum x_{ij}]^2/jk$

Analyses of RCB Designs
$Y_{ij} = \mu + b_i + t_j + e_{ij}$

Analysis of Variance of RCB
Source       df           SS
Blocks       r-1          $[B_1^2 + B_2^2 + \dots + B_r^2]/t - CF$
Treatments   t-1          $[T_1^2 + T_2^2 + \dots + T_t^2]/r - CF$
Error        (r-1)(t-1)   By difference
Total        rt-1         $[x_{11}^2 + x_{12}^2 + \dots + x_{rt}^2] - CF$
$CF = [\sum x_{ij}]^2/rt$

Analyses of Latin Designs
$Y_{ijk} = \mu + r_i + c_j + t_{k(ij)} + e_{ijk}$

Analysis of Variance of Latin
Source       df           SS
Rows         t-1          $[R_1^2 + R_2^2 + \dots + R_t^2]/t - CF$
Columns      t-1          $[C_1^2 + C_2^2 + \dots + C_t^2]/t - CF$
Treatments   t-1          $[T_1^2 + T_2^2 + \dots + T_t^2]/t - CF$
Error        (t-1)(t-2)   By difference
Total        t^2-1        $[x_{11}^2 + x_{12}^2 + \dots + x_{tt}^2] - CF$
$CF = [\sum x_{ij}]^2/t^2$

Efficiency of Latin Squares compared with CRB Design
$[MS_r + MS_c + (t-1)\,EMS] / [(t+1)\,EMS]$
If the value is 325%, the Latin square has increased precision by 225% over the CRB design, and a CRD would have needed 2.25 x 4 = 9 replicates to be as accurate.

Efficiency of Latin Squares compared with RCB Design
Row (RCB) = $[MS_r + (t-1)\,EMS] / [(t+1)\,EMS]$
Col (RCB) = $[MS_c + (t-1)\,EMS] / [(t+1)\,EMS]$
[Slide graphic: example efficiency values of -19%, +266%, +226% and -19%]

Analyses of Lattice Squares
$Y_{ijk} = \mu + r_i + b^a_j + t^a_k + e_{ijk}$

Lattice Square ANOVA
Source        df   SS       MS      F
Reps           4    5,946   1,486   4.03 *
Blk(adj)      15   11,382     759   2.35 ns
Intra error   45   14,533     323
T(adj)        15   24,030   1,602   4.34 **
Eff. Error    45   16,605     369   -

Efficiency of Lattice Design
$100 \times [\text{Blk(adj) SS} + \text{Intra error SS}] / [k(k^2-1)\,EMS]$
$100 \times [11{,}382 + 14{,}533] / [4(15)(369)] = 117\%$
[Figure: layout showing replicates I to V]

Dealing with Wrongful Data
It is usually assumed that the data collected are correct.
Why would data not be correct?
Mis-recording, mis-classification, transcription errors, errors in data entry.
Outliers.
What things can help?
Keep detailed records on each experimental unit.
Decide beforehand what values would arouse suspicion.
What do you do with suspicious data?
If correct and discarded, then valuable information is lost. This will bias the results.
If wrong and included, it will bias the results and may have extreme consequences.

Checking ANOVA Accuracy
Coefficient of variation: $[\sigma_e/\mu] \times 100$.
CV = (√100.9/73.75) x 100 = 13.6%
R² value = {[TSS - ESS]/TSS} x 100.
R² = (1636/3654) x 100 = 44.8%.
Compare the effect of blocking or sub-blocking (discussed later).

ANOVA of Factorial Designs

Factorial AOV Example
Source         df   SS      MS       F
Reps            2    0.01    0.005   ns
Seed Density    2    2.75    1.375   33.9 ***
Nitrogen        5   81.56   16.312   401.9 ***
SxN            10    1.33    0.133   3.28 ***
Error          34    1.38    0.041
Total          53   87.03

Source             df   SS       MS       F
Reps x Seed Rate    4   0.2268   0.0567   1.63 ns
Rep x N rate       10   0.4528   0.0453   1.30 ns
Rep x Seed x N     20   0.6936   0.0347

Split-plot AOV
Source         df   SS       MS       F
Reps            2    0.01     0.005   ns
Seed Density    2    2.75     1.375   24.2 ***
Error (1)       4    0.2268   0.057
Nitrogen        5   81.56    16.312   426.9 ***
SxN            10    1.33     0.133   3.5 ***
Error (2)      30    1.1464   0.038   -
Total          53   87.03

Strip-plot AOV
Source           df   SS       MS       F
Reps              2    0.01     0.005   ns
Seed Density      2    2.75     1.375   24.2 ***
Error 1 (Seed)    4    0.2268   0.0567
Nitrogen          5   81.56    16.312   360.1 ***
Error 2 (N)      10    0.4528   0.0453
SxN              10    1.33     0.133   3.83 ***
Error 3 (SxN)    20    0.6936   0.0347  -
Total            53   87.03

Fixed and Random Effects

Expected Mean Squares
Dependent on whether factor effects are Fixed or Random.
Necessary to determine which F-tests are appropriate and which are not.

Setting Expected Mean Squares
The expected mean square for a source of variation (say X) contains:
the error term;
a term in $\sigma^2_X$ (or $S^2_X$);
a variance term for other selected interactions involving the factor X.

Coefficients for EMS
Coefficient for error mean square is always 1.
Coefficient of other expected mean squares is n times the product of factor levels that do not appear in the factor name.

Expected Mean Squares
Which interactions to include in an EMS?
All the letters (i.e. A, B, C, …) of X appear in the interaction.
All the other letters in the interaction (except those in X) are Random Effects.

A and B Fixed Effects
Source   d.f.          EMSq
A (a)    a-1           $\sigma_e^2 + rbS_A^2$
B (b)    b-1           $\sigma_e^2 + raS_B^2$
AxB      (a-1)(b-1)    $\sigma_e^2 + rS_{AB}^2$
Error    r(a-1)(b-1)   $\sigma_e^2$
Model yield=A B A*B;

A and B Random Effects
Source   d.f.          EMSq
A (a)    a-1           $\sigma_e^2 + r\sigma_{AB}^2 + rb\sigma_A^2$
B (b)    b-1           $\sigma_e^2 + r\sigma_{AB}^2 + ra\sigma_B^2$
AxB      (a-1)(b-1)    $\sigma_e^2 + r\sigma_{AB}^2$
Error    r(a-1)(b-1)   $\sigma_e^2$
Model yield=A B A*B;
Test h = A B e=A*B;

A Fixed and B Random
Source   d.f.          EMSq
A (a)    a-1           $\sigma_e^2 + r\sigma_{AB}^2 + rbS_A^2$
B (b)    b-1           $\sigma_e^2 + ra\sigma_B^2$
AxB      (a-1)(b-1)    $\sigma_e^2 + r\sigma_{AB}^2$
Error    r(a-1)(b-1)   $\sigma_e^2$
Model yield=A B A*B;
Test h = A e=A*B;

Multiple Comparisons
Multiple Range Tests: Tukey's and Duncan's.
Orthogonal Contrasts.

Tukey's Multiple Range Test
$W = q_{(p,f)} \times se[\bar{x}]$
$se[\bar{x}] = \sqrt{\sigma^2/n} = \sqrt{94{,}773/4} = 153.9$
W = 4.64 x 153.9 = 714.1

Comparison                Result
A = 2678 - W = 1963.9     A=B, C & D; A>E, F & G
B = 2552 - W = 1837.9     B=C & D; B>E, F & G
C = 2128 - W = 1413.9     C=D, E & F; C>G
D = 2127 - W = 1412.9     D=E & F; D>G
E = 1796 - W = 1081.9     E=F & G
F = 1681 - W = 966.9      F=G

Duncan's Multiple Range Test
$R_p = r_p \times se[\bar{x}]$
p     2      3      4      5      6      7
r_p   2.94   3.02   3.18   3.24   3.30   3.33
R_p   453    465    489    499    508    513

Comparison               Result
A = 2678 - R7 = 2165     A=B; A>C, D, E, F & G
B = 2552 - R6 = 2044     B=C & D; B>E, F & G
C = 2128 - R5 = 1629     C=D, E & F; C>G
D = 2127 - R4 = 1638     D=E & F; D>G
E = 1796 - R3 = 1331     E=F; E>G
F = 1681 - R2 = 1229     F=G

Multiple Comparisons
Genotype   Mean   Tukey   Duncan
A          2678   a       a
B          2552   ab      ab
C          2128   abc     bc
D          2127   abcd    bcd
E          1796   cde     cde
F          1681   cdef    cdef
G          1316   ef      ef

Orthogonal Contrasts
Maximum number of orthogonal contrasts is the df for treatment.
SS of all contrasts must equal SS of the treatment effect.
Remainder SS is the difference between treatment SS and the sum of contrast SS.
Contrasts can help understand main effects and interactions.
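The first two properties above can be checked with a small numerical sketch. This is an illustration under stated assumptions, not the course's own worked example: it reuses the Brundage, Lambert, Croft and Stephens yields from the more-than-two-treatments example (totals 432, 510, 456 and 372 over six replicates, assuming the flattened table is read column-wise), picks three mutually orthogonal contrasts, and uses the standard single-contrast formula SS = (Σ c_i T_i)² / (r Σ c_i²), where T_i are treatment totals and r is the number of replicates. The function name `contrast_ss` is hypothetical.

```python
from itertools import combinations

# Treatment totals for Brundage, Lambert, Croft, Stephens (six reps each).
# ASSUMPTION: totals come from reading the flattened data table column-wise.
totals = [432, 510, 456, 372]
reps = 6

# Three contrasts among four treatments: as many as the treatment df (4 - 1).
contrasts = [
    [-1, -1, +1, +1],
    [-1, +1, -1, +1],
    [+1, -1, -1, +1],
]


def contrast_ss(coeffs, totals, reps):
    """Sum of squares (1 df) for one contrast among treatment totals."""
    numerator = sum(c * t for c, t in zip(coeffs, totals)) ** 2
    return numerator / (reps * sum(c * c for c in coeffs))


# Each contrast sums to zero, and every pair has a zero cross-product.
sums_to_zero = all(sum(c) == 0 for c in contrasts)
pairwise_orthogonal = all(
    sum(a * b for a, b in zip(c1, c2)) == 0
    for c1, c2 in combinations(contrasts, 2)
)

ss_values = [contrast_ss(c, totals, reps) for c in contrasts]

# Treatment SS from totals: sum(T_i^2)/r - CF, with CF = (grand total)^2 / N.
correction_factor = sum(totals) ** 2 / (reps * len(totals))
treatment_ss = sum(t * t for t in totals) / reps - correction_factor

print("contrast SS:", ss_values)
print("sum of contrast SS:", sum(ss_values))
print("treatment SS:", treatment_ss)
```

With a full set of orthogonal contrasts the contrast SS add up to the treatment SS, so the remainder SS is zero. Using fewer contrasts than the treatment df would leave a non-zero remainder.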
Orthogonality
$\sum c_i = 0$
$\sum [c_{1i} \times c_{2i}] = 0$
-1   -1   +1   +1   --  $\sum c_i = 0$
-1   +1   -1   +1   --  $\sum c_i = 0$
+1   -1   -1   +1   --  $\sum c_i = 0$

Calculating Orthogonal Contrasts
d.f. (single contrast) = 1
S.Sq(contrast) = M.Sq = $[\sum c_i \times Y_i]^2 / [n \sum c_i^2]$

Analyses of Variance
Detect significant differences between treatment means.
Determine trends that may exist as a result of varying specific factor levels.

Trend Analyses
[Figure: four panels showing Linear, Quadratic, Cubic and Quartic trends]

When rabbits come to dinner
Two carrot cultivars ('Orange Gold' and 'Bug's Delight').
Four seeding rates (1.5, 2.0, 2.5 and 3.0 lb/acre).
Three replicates.
TQ #1 p. 155.

Analysis of Variance
Source         d.f.   S.Sq      M.Sq     F-val
Replicates      2      0.3575   0.1787    0.50 ns
Cultivar        1      0.0122   0.0122    0.03 ns
Seeding Dens    3     12.2496   4.0832   11.9 ***
C x SD          3      6.4490   2.1497    6.27 ***
Error          14      4.7967   0.3426

When rabbits come to dinner
                Seeding Rate (lb/acre)
Cultivar        1.5       2.0       2.5       3.0
Orange Gold     4.53 bc   4.01 cd   5.23 ab   4.48 bc
Bug's Delight   3.25 d    3.70 cd   5.41 ab   6.08 a
Mean            3.89 B    3.85 B    5.32 A    5.28 A

When rabbits come to dinner
                Seeding rate (lb/acre)
Cultivars       1.5     2.0     2.5     3.0
Orange Gold     4.53    4.01    5.23    4.48
Bug's Delight   3.25    3.70    5.41    6.08
Total           23.34   23.13   31.92   31.68   SSq
Linear          -3      -1      +1      +3      1143/120
Quadratic       +1      -1      -1      +1      0.0/24
Cubic           -1      +3      -3      +1      325/120

Analysis of Variance
Source        d.f.   S.Sq     M.Sq     F-val
Replicates     2     0.3575   0.1787    0.50 ns
Cultivar       1     0.0122   0.0122    0.03 ns
Seeding (L)    1     9.5316   9.5316   27.82 ***
        (Q)    1     0.0000   0.0000    0.00 ns
        (C)    1     2.7180   2.7180    7.93 **
C x SD         3     6.4490   2.1497    6.27 ***
Error         14     4.7967   0.3426

When rabbits come to dinner
Cultivar        Seeding Rate   Yield   Cv's   Linear   C x L
Orange Gold     1.5            4.53    -1     -3       +3
                2.0            4.01    -1     -1       +1
                2.5            5.23    -1     +1       -1
                3.0            4.48    -1     +3       -3
Bug's Delight   1.5            3.25    +1     -3       -3
                2.0            3.70    +1     -1       -1
                2.5            5.41    +1     +1       +1
                3.0            6.08    +1     +3       +3

Analysis of Variance
Source        d.f.   S.Sq     M.Sq     F-val
Replicates     2     0.3575   0.1787    0.50 ns
Cultivar       1     0.0122   0.0122    0.03 ns
Seeding (L)    1     9.5316   9.5316   27.82 ***
        (Q)    1     0.0000   0.0000    0.00 ns
        (C)    1     2.7180   2.7180    7.93 **
C x (L)        1     6.2199   6.2199   18.15 ***
    (Q)        1     0.0794   0.0794    0.23 ns
    (C)        1     0.1498   0.1498    0.44 ns
Error         14     4.7967   0.3426

When rabbits come to dinner
[Figure: Yield (3 to 6.5) plotted against Seeding Rate (1.5 to 3) for Orange Gold and Bug's Delight]

End of Analyses of Variance Section