COURSE: JUST 3900
TIPS FOR APLIA
Chapter 12: Analysis of Variance (ANOVA)
Developed By: Ethan Cooper (Lead Tutor), John Lohman, Michael Mattocks, Aubrey Urwick

Key Terms: Don't Forget Notecards
  Factors (p. 388)
  Levels (p. 388)
  Testwise Alpha Level (p. 391)
  Experimentwise Alpha Level (p. 391)
  Error Term (p. 394)
  Post Hoc Tests or Post Tests (p. 416)

ANOVA Notation
  k is used to identify the number of treatment conditions.
  n is used to identify the number of scores in each treatment condition.
  N is used to identify the total number of scores in the entire study.
    N = kn, when samples are the same size.
  T stands for treatment total and is calculated by ΣX, which equals the sum of the scores for each treatment condition.
  G stands for the sum of all scores in a study (Grand Total).
    Calculate by adding up all N scores or by adding the treatment totals (G = ΣT).
  You will also need SS and M for each sample, and ΣX² for the entire set of all scores.

Formulas
  F-ratio: $F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$
  SStotal: $SS_{\text{total}} = \sum X^2 - \frac{G^2}{N}$
  SSwithin: $SS_{\text{within}} = SS_1 + SS_2 + SS_3 + \dots$
  SSbetween: $SS_{\text{between}} = SS_{\text{total}} - SS_{\text{within}}$
  SSbetween: $SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N}$
  dftotal: $df_{\text{total}} = N - 1$
  dfwithin: $df_{\text{within}} = \sum (n - 1)$ or $df_{\text{within}} = N - k$
  dfbetween: $df_{\text{between}} = k - 1$

More Formulas
  MSwithin: $MS_{\text{within}} = s^2_{\text{within}} = \frac{SS_{\text{within}}}{df_{\text{within}}}$
  MSbetween: $MS_{\text{between}} = s^2_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}}$
  Tukey's HSD: $HSD = q \sqrt{\frac{MS_{\text{within}}}{n}}$
  Scheffé Test: $F_{A \text{ versus } B} = \frac{MS_{\text{between}}}{MS_{\text{within}}}$
  Effect Size: $\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}} = \frac{SS_{\text{between}}}{SS_{\text{total}}}$

Hypothesis Testing with ANOVA
Question 1: A psychologist studied three computer keyboard designs. Three samples of individuals were given material to type on a particular keyboard, and the number of errors committed by each participant was recorded.
The data are as follows:

  Keyboard A    Keyboard B    Keyboard C
      0             6             6
      4             8             5
      0             5             9
      1             4             4
      0             2             6
    T = 5         T = 25        T = 30
    SS = 12       SS = 20       SS = 14

  N = 15    G = 60    ΣX² = 356

Are these data sufficient to conclude that there are significant differences in typing performance among the three keyboard designs? Set alpha at α = 0.05.

Question 1 Answer:

Step 1: State the hypotheses.
  H0: μ1 = μ2 = μ3 (Type of keyboard has no effect)
  H1: At least one of the treatment means is different.

Step 2: Locate the critical region.
  $df_{\text{between}} = k - 1 = 3 - 1 = 2$
  $df_{\text{within}} = N - k = 15 - 3 = 12$
  For this problem df = 2, 12 and the critical value for α = 0.05 is F = 3.88.
  If F-ratio ≤ Fcritical (3.88), then fail to reject H0.
  If F-ratio > Fcritical (3.88), then reject H0.

Step 3: Perform the analysis.
  $SS_{\text{total}} = \sum X^2 - \frac{G^2}{N} = 356 - \frac{60^2}{15} = 356 - \frac{3600}{15} = 356 - 240 = 116$
  $SS_{\text{within}} = SS_1 + SS_2 + SS_3 = 12 + 20 + 14 = 46$
  $SS_{\text{between}} = SS_{\text{total}} - SS_{\text{within}} = 116 - 46 = 70$
  or
  $SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{5^2}{5} + \frac{25^2}{5} + \frac{30^2}{5} - \frac{60^2}{15} = 310 - 240 = 70$
  $MS_{\text{between}} = s^2_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{70}{2} = 35$
  $MS_{\text{within}} = s^2_{\text{within}} = \frac{SS_{\text{within}}}{df_{\text{within}}} = \frac{46}{12} = 3.83$
  $F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{35}{3.83} = 9.14$

  Source      SS     df     MS
  Between     70      2     35       F = 9.14
  Within      46     12     3.83
  Total      116     14

Step 4: Make a decision.
  If F-ratio ≤ Fcritical (3.88), then fail to reject H0.
  If F-ratio > Fcritical (3.88), then reject H0.
  F-ratio (9.14) > Fcritical (3.88). Therefore, we reject H0. The type of keyboard used has a significant effect on the number of errors committed.
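The Step 3 arithmetic can be double-checked with a short script. The following is a minimal sketch in plain Python, not part of the original handout; the function name `one_way_anova` and the dictionary keys are illustrative choices, and the critical value 3.88 is simply copied from the table lookup in Step 2. Because the script does not round MS within before dividing, it gives F ≈ 9.13 rather than the 9.14 obtained from the rounded value 3.83.

```python
# Scores from Question 1: typing errors per participant on each keyboard.
scores = {
    "A": [0, 4, 0, 1, 0],
    "B": [6, 8, 5, 4, 2],
    "C": [6, 5, 9, 4, 6],
}


def one_way_anova(groups):
    """Compute the ANOVA summary values with the computational formulas."""
    k = len(groups)                                # number of treatment conditions
    N = sum(len(xs) for xs in groups.values())     # total number of scores
    G = sum(sum(xs) for xs in groups.values())     # grand total
    sum_x2 = sum(x * x for xs in groups.values() for x in xs)

    # SS inside each treatment: sum of X^2 minus T^2 / n
    ss_each = {
        name: sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)
        for name, xs in groups.items()
    }

    ss_total = sum_x2 - G ** 2 / N
    ss_within = sum(ss_each.values())
    ss_between = sum(sum(xs) ** 2 / len(xs) for xs in groups.values()) - G ** 2 / N

    df_between = k - 1
    df_within = N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "SS_total": ss_total,
        "SS_within": ss_within,
        "SS_between": ss_between,
        "df_between": df_between,
        "df_within": df_within,
        "MS_between": ms_between,
        "MS_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova(scores)

# Assumption: critical value taken from the F table for df = 2, 12, alpha = 0.05.
F_CRITICAL = 3.88

for name, value in result.items():
    print(f"{name}: {value:.2f}")
print("Reject H0" if result["F"] > F_CRITICAL else "Fail to reject H0")
```

The script also confirms that the two routes to SS between agree: SS total minus SS within equals the value from the treatment totals.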
Computing Effect Size for ANOVA
Question 2: Compute effect size (η²), the percentage of variance explained, for the data that were analyzed in Question 1.

Question 2 Answer:
  $\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}} = \frac{SS_{\text{between}}}{SS_{\text{total}}} = \frac{70}{116} = 0.60 = 60\%$

Post Hoc Tests
Question 3: For the data used in Question 1, perform a post hoc test to determine which mean differences are significant and which are not. Use both Tukey's HSD and the Scheffé Test.

Post Hoc Tests: Tukey's HSD
Question 3 Answer:
  1. Find q. For k = 3 treatments and df within = 12 at α = 0.05, q = 3.77 (Table B.5, p. 708).
  2. $HSD = q \sqrt{\frac{MS_{\text{within}}}{n}} = 3.77 \sqrt{\frac{3.83}{5}} = 3.77 \sqrt{0.766} = 3.30$
  3. Thus, the mean difference between any two samples must be at least 3.30 to be significant.
  4. Find the means for each treatment.
       $M_A = \frac{\sum X}{n} = \frac{5}{5} = 1$
       $M_B = \frac{\sum X}{n} = \frac{25}{5} = 5$
       $M_C = \frac{\sum X}{n} = \frac{30}{5} = 6$
  5. $M_A - M_B = 1 - 5 = -4$. The size of the difference (4) exceeds 3.30, so Treatment A is significantly different from Treatment B.
  6. $M_A - M_C = 1 - 6 = -5$. The size of the difference (5) exceeds 3.30, so Treatment A is significantly different from Treatment C.
  7. $M_B - M_C = 5 - 6 = -1$. The size of the difference (1) is less than 3.30, so Treatment B is not significantly different from Treatment C.

Post Hoc Tests: Scheffé Test
Question 3 Answer:
First, compute SSbetween for Treatments A and B.
  $SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{5^2}{5} + \frac{25^2}{5} - \frac{30^2}{10} = 5 + 125 - 90 = 40$
Notice: G is equal to the total of Treatments A and B, not A, B, and C. Similarly, N is equal to nA + nB.
Now, find MSbetween. For dfbetween, use k − 1 with k from the full study (3 − 1 = 2).
  $MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{40}{2} = 20$
  $F_{A \text{ versus } B} = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{20}{3.83} = 5.22$
For df = 2, 12 and α = 0.05, the critical value for F is 3.88. Our obtained F-ratio (5.22) is in the critical region, so we conclude that these data show a significant difference between Treatment A and Treatment B.

First, compute SSbetween for Treatments A and C.
  $SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{5^2}{5} + \frac{30^2}{5} - \frac{35^2}{10} = 5 + 180 - 122.5 = 62.5$
Notice: G is equal to the total of Treatments A and C, not A, B, and C. Similarly, N is equal to nA + nC.
Now, find MSbetween. For dfbetween, use k − 1 with k from the full study (3 − 1 = 2).
  $MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{62.5}{2} = 31.25$
  $F_{A \text{ versus } C} = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{31.25}{3.83} = 8.16$
For df = 2, 12 and α = 0.05, the critical value for F is 3.88. Our obtained F-ratio (8.16) is in the critical region, so we conclude that these data show a significant difference between Treatment A and Treatment C.

First, compute SSbetween for Treatments B and C.
  $SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{25^2}{5} + \frac{30^2}{5} - \frac{55^2}{10} = 125 + 180 - 302.5 = 2.5$
Notice: G is equal to the total of Treatments B and C, not A, B, and C. Similarly, N is equal to nB + nC.
Now, find MSbetween. For dfbetween, use k − 1 with k from the full study (3 − 1 = 2).
  $MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{2.5}{2} = 1.25$
  $F_{B \text{ versus } C} = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{1.25}{3.83} = 0.33$
For df = 2, 12 and α = 0.05, the critical value for F is 3.88. Our obtained F-ratio (0.33) is not in the critical region, so we conclude that these data show no significant difference between Treatment B and Treatment C.

Assumptions for ANOVA
Question 4: What three assumptions are required for ANOVA?

Question 4 Answer:
  1. The observations within each sample must be independent.
  2. The populations from which the samples are selected must be normal.
  3. The populations from which the samples are selected must have equal variances (homogeneity of variance).
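
Returning to Questions 2 and 3, the effect size and both post hoc tests can be checked with a short script. This is a minimal sketch in plain Python rather than anything from the original handout: the variable names are illustrative, and the values q = 3.77 and Fcritical = 3.88 are copied from the table lookups quoted in the answers, not computed. Working from the unrounded MS within (46/12), Tukey's HSD comes out near 3.30, and the Scheffé F-ratios near 5.22, 8.15, and 0.33.

```python
from itertools import combinations
from math import sqrt

# Summary values from Question 1.
totals = {"A": 5, "B": 25, "C": 30}   # treatment totals T
n = 5                                 # scores per treatment
k = 3                                 # number of treatments in the full study
ss_between, ss_within = 70, 46
ms_within = ss_within / 12            # df within = N - k = 12

# Effect size: proportion of variance explained by the treatments.
eta_squared = ss_between / (ss_between + ss_within)

# Assumptions: both constants are table values quoted in the answers.
Q = 3.77            # studentized range statistic for k = 3, df = 12, alpha = 0.05
F_CRITICAL = 3.88   # F table value for df = 2, 12, alpha = 0.05

# Tukey's HSD: the smallest mean difference that counts as significant.
hsd = Q * sqrt(ms_within / n)
means = {name: t / n for name, t in totals.items()}

posthoc = {}
for a, b in combinations(totals, 2):
    # Tukey: compare the size of the mean difference with HSD.
    mean_diff = means[a] - means[b]

    # Scheffe: SS between for this pair only (G and N come from the pair),
    # but df between (k - 1) and MS within come from the full study.
    pair_total = totals[a] + totals[b]
    ss_pair = totals[a] ** 2 / n + totals[b] ** 2 / n - pair_total ** 2 / (2 * n)
    f_pair = (ss_pair / (k - 1)) / ms_within

    posthoc[(a, b)] = {
        "mean_diff": mean_diff,
        "tukey_significant": abs(mean_diff) >= hsd,
        "scheffe_F": f_pair,
        "scheffe_significant": f_pair > F_CRITICAL,
    }

print(f"eta squared = {eta_squared:.2f}, HSD = {hsd:.2f}")
for pair, row in posthoc.items():
    print(pair, row)
```

Both tests reach the same decisions for these data: A differs from B and from C, while B and C do not differ significantly.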