A PowerPoint Presentation Package to Accompany Applied Statistics in Business & Economics, 4th edition
David P. Doane and Lori E. Seward
Prepared by Lloyd R. Jaisingh
McGraw-Hill/Irwin
Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.

Chapter 11 Analysis of Variance

Chapter Contents
11.1 Overview of ANOVA
11.2 One-Factor ANOVA (Completely Randomized Model)
11.3 Multiple Comparisons
11.4 Tests for Homogeneity of Variances
11.5 Two-Factor ANOVA without Replication (Randomized Block Model)
11.6 Two-Factor ANOVA with Replication (Full Factorial Model)
11.7 Higher Order ANOVA Models (Optional)

Chapter Learning Objectives
LO11-1: Use basic ANOVA terminology correctly.
LO11-2: Recognize from data format when one-factor ANOVA is appropriate.
LO11-3: Interpret sums of squares and calculations in an ANOVA table.
LO11-4: Use Excel or other software for ANOVA calculations.
LO11-5: Use a table or Excel to find critical values for the F distribution.
LO11-6: Explain the assumptions of ANOVA and why they are important.
LO11-7: Understand and perform Tukey's test for paired means.
LO11-8: Use Hartley's test for equal variances in c treatment groups.
LO11-9: Recognize from data format when two-factor ANOVA is needed.
LO11-10: Interpret main effects and interaction effects in two-factor ANOVA.
LO11-11: Recognize the need for experimental design and GLM (optional).

11.1 Overview of ANOVA
LO11-1: Use basic ANOVA terminology correctly.
• Analysis of variance (ANOVA) is a comparison of means.
• ANOVA allows you to compare more than two means simultaneously.
• Proper experimental design efficiently uses limited data to draw the strongest possible inferences.

The Goal: Explaining Variation
• ANOVA seeks to identify sources of variation in a numerical dependent variable Y (the response variable).
• Variation in Y about its mean is explained by one or more categorical independent variables (the factors) or is unexplained (random error).
• Each possible value of a factor or combination of factors is a treatment.
• We test whether each factor has a significant effect on Y using (for example) the hypotheses:
  H0: μ1 = μ2 = μ3 = μ4 (e.g., mean defect rates are the same for all four plants)
  H1: Not all the means are equal
• The test uses the F distribution.
• If we cannot reject H0, we conclude that observations within each treatment have a common mean μ.

Figure 11.3

LO11-6: Explain the assumptions of ANOVA and why they are important.

ANOVA Assumptions
• Analysis of variance assumes that the
  - observations on Y are independent,
  - populations being sampled are normal,
  - populations being sampled have equal variances.
• ANOVA is somewhat robust to departures from the normality and equal-variance assumptions.

ANOVA Calculations
• Software (e.g., Excel, MegaStat, MINITAB, SPSS) is used to analyze data.
• Large samples increase the power of the test, but power also depends on the degree of variation in Y.
• Lowest power would be in a small sample with high variation in Y.

11.2 One-Factor ANOVA (Completely Randomized Model)
LO11-2: Recognize from data format when one-factor ANOVA is appropriate.

One-Factor ANOVA as a Linear Model
• An equivalent way to express the one-factor model is to say that an observation in treatment j came from a population with a common mean (μ) plus a treatment effect (Aj) plus random error (εij):
  yij = μ + Aj + εij,  j = 1, 2, …, c and i = 1, 2, …, n
• Random error is assumed to be normally distributed with zero mean and the same variance for all treatments.
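The one-factor model above can be illustrated with a short calculation. What follows is a minimal sketch in Python with NumPy and SciPy. That choice is an assumption, since the slides themselves refer to Excel, MegaStat, MINITAB, and SPSS. The data values and the function name one_factor_anova are hypothetical and chosen only for illustration. The sketch splits the variation in Y into a part explained by the treatments and an unexplained part, forms the F statistic, and reads a right-tail p-value from the F distribution.

```python
import numpy as np
from scipy import stats

# Hypothetical data: one numerical response observed in c = 3 treatments.
groups = [
    np.array([12.0, 15.0, 11.0, 14.0]),
    np.array([18.0, 17.0, 20.0, 19.0]),
    np.array([13.0, 12.0, 16.0, 15.0]),
]


def one_factor_anova(groups):
    """Partition the sum of squares and compute the one-factor F test."""
    c = len(groups)                      # number of treatments
    all_obs = np.concatenate(groups)
    n = all_obs.size                     # total number of observations
    grand_mean = all_obs.mean()

    # Variation explained by the factor (between treatments).
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Unexplained variation (random error within treatments).
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    # Total variation of Y about its mean.
    ss_total = ((all_obs - grand_mean) ** 2).sum()

    df_between, df_within = c - 1, n - c
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # Right-tail area under the F distribution.
    p_value = stats.f.sf(f_stat, df_between, df_within)

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df": (df_between, df_within),
        "F": f_stat,
        "p_value": p_value,
    }


result = one_factor_anova(groups)
print(result)
```

As a cross-check, scipy.stats.f_oneway(*groups) should report the same F statistic and p-value as this hand calculation.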
• A fixed effects model only looks at what happens to the response for particular levels of the factor.
  H0: A1 = A2 = … = Ac = 0
  H1: Not all Aj are zero
• If H0 is true, then the ANOVA model collapses to yij = μ + εij.
• One can use Excel's one-factor ANOVA menu under Data Analysis to analyze data.

LO11-3: Interpret sums of squares and calculations in an ANOVA table.

Partitioned Sum of Squares
Table 11.2
• Use Appendix F or Excel to obtain the critical value of F for a given α.
• For ANOVA, the F test is a right-tailed test.

LO11-5: Use a table or Excel to find critical values for the F distribution.

Decision Rule for an F-test

11.3 Multiple Comparisons
LO11-7: Understand and perform Tukey's test for paired means.

Tukey's Test
• After rejecting the hypothesis of equal means, we naturally want to know: Which means differ significantly?
• In order to maintain the desired overall probability of Type I error, a simultaneous confidence interval for the difference of means must be obtained.
• For c groups, there are c(c – 1)/2 distinct pairs of means to be compared.
• These types of comparisons are called multiple comparison tests.
• Tukey's studentized range test (or HSD, for "honestly significant difference") is a multiple comparison test that has good power and is widely used.
• Named for statistician John Wilder Tukey (1915 – 2000).
• This test is not available in Excel's Tools > Data Analysis but is available in MegaStat and Minitab.

11.4 Tests for Homogeneity of Variances
LO11-8: Use Hartley's test for equal variances in c treatment groups.

ANOVA Assumptions
• ANOVA assumes that observations on the response variable are from normally distributed populations that have the same variance.
• The one-factor ANOVA test is only slightly affected by inequality of variance when group sizes are equal.
• Test this assumption of homogeneous variances using Hartley's Fmax test.
• The hypotheses are
  H0: σ1² = σ2² = … = σc²
  H1: Not all the σj² are equal
• The test statistic is the ratio of the largest sample variance to the smallest sample variance.

Hartley's Test
• The decision rule is:

LO11-6: Explain the assumptions of ANOVA and why they are important.

Levene's Test
• Levene's test is a more robust alternative to Hartley's Fmax test.
• Levene's test does not assume a normal distribution.
• It is based on the distances of the observations from their sample medians rather than their sample means.
• A computer program (e.g., MINITAB) is needed to perform this test.
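The two variance checks above can be sketched in Python with NumPy and SciPy. That choice is an assumption, since the slides point to MINITAB. The data are hypothetical, with equal group sizes. Hartley's statistic is computed directly as the ratio of the largest to the smallest sample variance. Its critical value has to be read from a table for Hartley's Fmax statistic and is not computed here, so the sketch stops short of a decision for that test. Levene's test uses scipy.stats.levene with center="median", which corresponds to the median-based description above.

```python
import numpy as np
from scipy import stats

# Hypothetical data: c = 3 treatment groups of equal size.
groups = [
    np.array([12.0, 15.0, 11.0, 14.0]),
    np.array([18.0, 17.0, 20.0, 19.0]),
    np.array([13.0, 12.0, 16.0, 15.0]),
]

# Hartley's statistic: largest sample variance over smallest sample variance.
# ddof=1 gives the sample variance (divisor n - 1).
sample_variances = [np.var(g, ddof=1) for g in groups]
h_stat = max(sample_variances) / min(sample_variances)
# The critical value comes from a Hartley Fmax table; it is not computed here.

# Levene's test based on absolute deviations from each group's median.
levene_stat, levene_p = stats.levene(*groups, center="median")

print(f"Hartley statistic: {h_stat:.3f}")
print(f"Levene statistic: {levene_stat:.3f}, p-value: {levene_p:.3f}")
```

A small Levene p-value would suggest that the equal-variance assumption is doubtful for these groups.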