Chapter 12 ONE-WAY ANALYSIS OF VARIANCE

We wish to compare the mean responses for several populations, where the levels of a single explanatory variable define the populations.

Example: One-Way ANOVA Logic

The two figures provide frequency plots for data obtained by taking independent random samples of size 10 from three populations. Each of the three populations had a normal distribution (normality is one of the assumptions of ANOVA), and the population means were 60, 65, and 70, respectively. So the population means are indeed not all equal. In Scenario I (Figure 11.2) the population standard deviations were all equal to 1.5. In Scenario II (Figure 11.3) the population standard deviations were all equal to 3. Another assumption of ANOVA is that the populations have equal standard deviations.

a) Just looking at the frequency plots, which of the two scenarios do you think would provide more evidence that at least one of the population means is different from the others?

b) The sample summary statistics, produced using MINITAB, for all six samples are provided next:

Scenario I
            N    MEAN   MEDIAN   STDEV   MIN    MAX    Q1     Q3
Sample 1    10   60.6   60.7     1.6     58.6   62.8   59.1   62.2
Sample 2    10   64.5   64.7     1.1     62.3   66.1   64.0   65.1
Sample 3    10   70.2   70.3     2.2     66.3   73.3   69.0   71.6

Scenario II
            N    MEAN   MEDIAN   STDEV   MIN    MAX    Q1     Q3
Sample 1    10   60.9   60.3     3.5     55.7   67.3   58.2   63.6
Sample 2    10   66.9   66.9     2.9     60.6   70.3   65.8   69.1
Sample 3    10   69.4   71.1     3.7     61.2   73.2   66.5   72.2

How did the sample means compare to the mean of the population from which the sample was generated? How did the sample standard deviations in Scenario I compare to those in Scenario II?

Solution

(a) It is much more obvious in Scenario I that the three samples come from different populations. In Scenario II there is a lot more overlap between the samples.

(b) Each sample mean is not exactly equal to the mean of the population from which the sample was generated.
The two Sample 1 means of 60.6 and 60.9 were equal neither to each other nor to the population mean of 60. The samples in Scenario I were generated from populations whose natural variation within each population was smaller than the natural variation within each population in Scenario II. For each scenario, even though the population standard deviations were equal, the sample standard deviations were not exactly equal, but they were comparable. So the sample means do vary, from about 60 to about 70, in each scenario. There is a good deal of variation between the sample means.

c) What about Scenario III, in which we have frequency plots for three independent samples of size 10, each taken from a normal population with a mean of 65 and a standard deviation of 1.5? In Scenario III the population means are indeed all equal. Do the sample means vary? Is there variation within each sample? Do the data in Scenario III provide evidence that the population means are not all equal?

Although the population means were all equal, there is still a small amount of variation between the sample means. The variation within each sample seems to mask whatever slight variation there is in the sample means. The data in Scenario III do not provide evidence that the population means are different.

What We've Learned

As seen in the Example, the decision about equality of the population means is based on examining the variability between the sample means. This between-sample variability is contrasted with how much natural variation there is within the groups. The test statistic actually used to make the decision is called an F-statistic and can be loosely viewed as follows:

F = (Variation BETWEEN the sample means) / (Natural variation WITHIN the samples)

One-Way ANOVA Logic Revisited

Recall the three scenarios that were presented in the Example above.
Scenario I: Independent random samples from three normal populations with population means of 60, 65, and 70, respectively, and population standard deviations all equal to 1.5.
Scenario II: Independent random samples from three normal populations with population means of 60, 65, and 70, respectively, and population standard deviations all equal to 3.
Scenario III: Independent random samples from three normal populations with population means all equal to 65 and population standard deviations all equal to 1.5.

The test in ANOVA is essentially the ratio of two measures of variation in the sample data. The variation between the sample means is compared to the natural variation of the observations within the samples through their ratio, which is called the F-statistic.

F = (Variation BETWEEN the sample means) / (Natural variation WITHIN the samples) = MSB / MSW

Formally, these two measures of variation are called mean squares, with the numerator referred to as the mean square between the groups (MSB) and the denominator referred to as the mean square within the groups (MSW). The larger the variation between the sample means (MSB) relative to the natural variation within the samples (MSW), the more support there is for a difference in the population means. F-tests are one-sided tests, with the direction of extreme being to the right.

The variation between the sample means was greater for Scenarios I and II than for Scenario III. The natural variation within the samples was greater for Scenario II than for Scenarios I and III.

Q: So which of the three scenarios would you expect to result in the largest value of the F-statistic?
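One way to explore this question is to simulate the three scenarios and compute MSB, MSW, and F directly from their definitions. The following is only a minimal sketch, not the procedure used to produce the chapter's figures: the names f_statistic and simulate, the seed, and the use of Python's standard library are illustrative assumptions. Because the samples are random, the printed values will not match the values reported in the chapter.

```python
import random
from statistics import mean


def f_statistic(groups):
    """Return (MSB, MSW, F) for a list of samples, one list per group."""
    k = len(groups)                      # number of groups, I
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [mean(g) for g in groups]

    # Between-groups sum of squares: spread of the sample means
    # around the grand mean, weighted by sample size.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))

    # Within-groups sum of squares: spread of the observations
    # around their own sample mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    msb = ssb / (k - 1)                  # numerator df = I - 1
    msw = ssw / (n - k)                  # denominator df = n - I
    return msb, msw, msb / msw


def simulate(pop_means, sigma, size=10, seed=1):
    """Draw one normal sample of the given size per population mean.

    The seed value is arbitrary (an assumption for reproducibility).
    """
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for _ in range(size)] for mu in pop_means]


# Population means and standard deviations as described in the text.
scenarios = {
    "Scenario I": ([60, 65, 70], 1.5),
    "Scenario II": ([60, 65, 70], 3.0),
    "Scenario III": ([65, 65, 65], 1.5),
}

for name, (mus, sigma) in scenarios.items():
    msb, msw, f = f_statistic(simulate(mus, sigma))
    print(f"{name}: MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f:.2f}")
```

As a small check of the arithmetic, f_statistic([[1, 2, 3], [2, 3, 4], [3, 4, 5]]) gives MSB = 3, MSW = 1, and F = 3: the sample means 2, 3, 4 differ from the grand mean 3 by -1, 0, 1, and each sample has the same spread around its own mean.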
The table below provides the values of the F-statistic and the p-value for the one-way ANOVA test of H0: μ1 = μ2 = μ3, the equality of the population means.

Scenario                      F-statistic   p-value
Scenario I: H1 is true        80.4          about 0
Scenario II: H1 is true       16.4          < 0.01
Scenario III: H0 is true      0.17          0.84

Note that the value of the F-statistic is smallest and the p-value is largest when the null hypothesis is true (Scenario III). For Scenarios I and II the population means are different, but the smaller population standard deviation in Scenario I accentuates the differences, producing a larger F-ratio and an extremely small p-value. A larger value of the F-statistic corresponds to more evidence (based on the data) that the population means are not all equal.

Let's Do It! 12.1 Would We Reject the Null Hypothesis?

Two sets of side-by-side boxplots are shown. Each set represents the boxplots based on independent random samples selected from three normal populations with possibly different population means but equal population standard deviations. In answering the questions below, remember that the F-test in ANOVA is based on comparing the variation between the sample means to the natural variation within the samples.

(a) Based on the boxplots in Set I, do you think the null hypothesis H0: μ1 = μ2 = μ3 will be rejected using a one-way ANOVA F-test? Explain your answer.

(b) Based on the boxplots in Set II, do you think the null hypothesis H0: μ1 = μ2 = μ3 will be rejected using a one-way ANOVA F-test? Explain your answer.

THE F-DISTRIBUTION

Recall that the test statistic in ANOVA is essentially the ratio of two measures of variation in the sample data. The variation between the sample means is compared to the natural variation of the observations within the samples through their ratio, which is called the F-statistic. The probability distribution of the F-statistic is called an F-distribution.
The F-distributions form a family of right-skewed distributions, each with a minimum value of 0. F-distributions are indexed by a pair of degrees of freedom, referred to as the numerator and denominator degrees of freedom.

Example: Working with the F-Distribution

A study was conducted to assess the effectiveness of I = 3 different ads: Ad A, Ad B, and Ad C. A total of n = 43 subjects took part: 14 were shown Ad A, 14 Ad B, and 15 Ad C. Each subject gave a score of effectiveness (a larger score indicates a more favorable rating). Is there evidence at the 5% level to conclude that the ads are not all equally effective?

Suppose the ANOVA assumptions hold, MSB = 303.7, and MSW = 120.4. Then F = MSB / MSW = 303.7 / 120.4 ≈ 2.52. The distribution of the F-statistic under H0 is an F-distribution with I - 1 = 2 and n - I = 43 - 3 = 40 degrees of freedom. The area to the right of 2.52 under this distribution, the p-value, is approximately 0.09. The same test can be run with the Oneway ANOVA command in MINITAB.

Decision and Conclusion

Since the p-value is greater than 0.05, at the 5% significance level we fail to reject H0. The data do not provide sufficient evidence that the three ads differ in mean effectiveness.

Let's Do It! Working with the F-Distribution

An ANOVA was performed to test the equality of I = 4 population means. Independent random samples of size seven were obtained from each of the four populations. The F-test resulted in an observed test statistic value of 3.28.

A) Numerator degrees of freedom = I - 1 = ______
B) Denominator degrees of freedom = n - I = ______
C) Find the p-value for this test.
D) Complete the picture with the proper labeling for the distribution and shade the area corresponding to the p-value.
E) Are the results statistically significant at the 5% level? YES NO
Explain:

Performing an ANOVA F-Test Using the TI

Example: Three Treatments

Suppose we enter the data for the three headache drugs from Example 12.7 under the lists L1, L2, and L3. Then the following steps can be used to perform the one-way ANOVA on these data:

Note: The term Factor represents the between source of variation. The term Error represents the within source of variation.

Complete the ANOVA table.

The p-value is 0.034.
At the 5% level, we conclude that there appears to be a difference in the mean time to relief for the three drugs.

MULTIPLE COMPARISONS*

In one-way ANOVA we are trying to compare many population means, that is, we are trying to do multiple comparisons. However, we would like to do the many comparisons at one time and still be able to attach some overall measure of confidence to our conclusions. The statistical approach to this problem of multiple comparisons is to first do an overall test to assess whether there are any differences between the population means and then, if we conclude that there is a difference, to do a follow-up analysis that helps determine which of the means differ and estimates by how much they differ. The F-test described in the preceding sections is this overall test in ANOVA. In this last section on one-way ANOVA we discuss the need for this overall F-test and present some multiple comparison methods that are part of the follow-up analysis should our decision be to reject H0.

When we perform an ANOVA to test whether the population means are equal, we must remember that if we reject the null hypothesis we can only conclude that at least one of the population means appears to be different from the other population means. Thus the ANOVA F-test is only the first step in our analysis. If significant differences exist among the treatment means, we would like to investigate which means are significantly different and perhaps measure how different they are. We can determine where the differences among the means seem to occur by conducting multiple comparisons.

Let's Do It! Three Treatments Revisited

(a) Report the p-value for each pairwise comparison:
P-value for μ1 - μ2: ______________ , Are Drugs 1 and 2 different?
P-value for μ1 - μ3: ______________ , Are Drugs 1 and 3 different?
P-value for μ2 - μ3: ______________ , Are Drugs 2 and 3 different?
(b) State your conclusions regarding the differences between the mean responses for the three drug groups based on the Bonferroni multiple comparison method.

Let's Do It!

A researcher wishes to try three different techniques to lower the blood pressure of individuals diagnosed with high blood pressure. The subjects are randomly assigned to three groups: the first group takes medication, the second group exercises, and the third group follows a special diet (1: Medication, 2: Exercise, 3: Diet). After four weeks, the reduction in each person's blood pressure is recorded. A one-way ANOVA on the reduction in blood pressure was computed using SPSS.

ANOVA
                  Sum of Squares   df   Mean Square   F       Sig.
Between Groups    160.133          2    80.067        9.168   .004
Within Groups     104.800          12   8.733
Total             264.933          14

Multiple Comparisons
(I) Technique   (J) Technique   Mean Difference (I-J)   Std. Error   Sig.
1               2               8.000*                  1.869        .003
1               3               4.200                   1.869        .103
2               1               -8.000*                 1.869        .003
2               3               -3.800                  1.869        .147
3               1               -4.200                  1.869        .103
3               2               3.800                   1.869        .147

a. F = _____
b. p-value = __________
c. Using α = 0.05, is there a significant difference between the three techniques? YES NO
d. If your answer in part c is yes, determine which techniques are significantly different. (Check all that apply.)
___ Medication and Exercise
___ Medication and Diet
___ Exercise and Diet
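The F and Sig. entries of an ANOVA table like the SPSS output above can be checked by hand from the sums of squares and degrees of freedom. The sketch below uses only the values printed in that table. The function name f_pvalue_2df is a hypothetical helper, not part of SPSS or any library, and it relies on a closed form for the right-tail area of the F-distribution that holds only when the numerator degrees of freedom equal 2 (that is, when three groups are compared). For other numerator degrees of freedom, a table or statistical software would be needed.

```python
def f_pvalue_2df(f_value, df_within):
    """Right-tail area P(F > f_value) for an F(2, df_within) distribution.

    Valid only for numerator df = 2, where the tail area reduces to
    (1 + 2 * f / df_within) ** (-df_within / 2).
    """
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)


# Values copied from the SPSS ANOVA table.
ss_between, df_between = 160.133, 2
ss_within, df_within = 104.800, 12

ms_between = ss_between / df_between   # mean square between groups (MSB)
ms_within = ss_within / df_within      # mean square within groups (MSW)
f_value = ms_between / ms_within       # F = MSB / MSW
p_value = f_pvalue_2df(f_value, df_within)

print(f"MSB = {ms_between:.3f}, MSW = {ms_within:.3f}")
print(f"F = {f_value:.3f}, p-value = {p_value:.3f}")
```

The printed F of 9.168 and p-value of 0.004 agree with the F and Sig. columns of the table. The same helper applied to the ad example, f_pvalue_2df(303.7 / 120.4, 40), gives a p-value of about 0.093, which is above 0.05.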