Author: Brenda Gunderson, Ph.D., 2012 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License: http://creativecommons.org/licenses/by-nc-sa/3.0/ The University of Michigan Open.Michigan initiative has reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The attribution key provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact open.michigan@umich.edu with any questions, corrections, or clarification regarding the use of content. For more information about how to attribute these materials visit: http://open.umich.edu/education/about/terms-of-use. Some materials are used with permission from the copyright holders. You may need to obtain new permission to use those materials for other uses. This includes all content from: Mind on Statistics Utts/Heckard, 4th Edition, Cengage L, 2012 Text Only: ISBN 9781285135984 Bundled version: ISBN 9780538733489 SPSS and its associated programs are trademarks of SPSS Inc. for its proprietary computer software. Other product names mentioned in this resource are used for identification purposes only and may be trademarks of their respective companies. Attribution Key For more information see: http:://open.umich.edu/wiki/AttributionPolicy Content the copyright holder, author, or law permits you to use, share and adapt: Creative Commons Attribution-NonCommercial-Share Alike License Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain. Make Your Own Assessment Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. Public Domain – Ineligible. WOrkds that are ineligible for copyright protection in the U.S. (17 USC §102(b)) *laws in your jurisdiction may differ. Content Open.Michigan has used under a Fair Use determination Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act (17 USC § 107) *laws in your jurisdiction may differ. Our determination DOES NOT mean that all uses of this third-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair. To use t his content you should conduct your own independent analysis to determine whether or not your use will be Fair. Module 8: One-Way Analysis of Variance (ANOVA) Objective: In this module you will perform a one-way analysis of variance, often abbreviated ANOVA. We have already seen that the two independent samples ttest can be used to compare the means of two populations (when the samples are independent). However, when we want to compare the means of three or more populations, we turn to ANOVA. You can think of ANOVA as an extension of the two independent sample pooled t-test since it compares several population means and requires the assumption that the populations have equal variances. Overview: ANOVA is a statistical tool for analyzing how the mean value of a quantitative response (or dependent) variable is affected by one or more categorical variables, known as treatment variables or factors. For example, we might administer a new antibiotic drug to a random sample of people. The response variable is white blood cell count and the grouping variable or factor could be age with 4 levels: 1 = 0 to 19 years, 2 = 20 to 29 years, 3 = 31 to 40 years, and 4 = 40 years and older. We would then use the ANOVA method to see if the mean white blood cell count for the four age group populations are all the same. In this example, the number of populations under study is k = 4. Another example would be a study of painkillers for relief of headache pain. The response variable might be the time to relief and the factor or treatment variable might be the type of painkiller, using different levels: new drug, standard drug, and a placebo. In this example the number of treatment groups is k = 3. While ANOVA allows us to compare the means of more than two populations, it can only tell us whether differences exist, not specifically which means vary. Consequently, the appropriate hypotheses for ANOVA are H0: μ1 = μ2 = … = μk (that the population means are equal, where k is the number of treatment groups) and Ha: at least one of the population means, μi, is different. As with our other hypothesis tests, several assumptions are required for ANOVA. The response for each population is assumed to be normally distributed with equal variance across the populations. The equal population variance assumption here can be checked the same way as the two independent samples t-test – side-by-side boxplots, comparing sample standard deviations, and Levene’s test. Further, the data are assumed to consist of independent random samples. 100 The analysis of variance involves decomposing the total variation of the responses into two parts: (1) that due to the variation among sample means (between groups variation), and (2) that due to natural variation within groups (variation due to error, or within groups variation). SS Total = SS Groups + SS Error If the sum of squares between groups (SS Groups) is large relative to the sum of squares within groups (SS Error), it implies that the model of different treatment means explains a significant portion of the observed variability. In order to determine what is "relatively large," the sum of squares values are divided by their respective degrees of freedom, creating what are called mean square terms. The degrees of freedom for SS Groups is the number of treatment groups, k, minus one (k - 1), while for SS Error, the degrees of freedom is the total number of observations, N, minus the number of treatment groups (N -k). MS Groups = SS Groups/(k – 1) MS Error = SS Error/(N – k) The ratio of these two mean squares forms the F-statistic, which has numerator degrees of freedom (k - 1) and denominator degrees of freedom (N - k). Note in the following equation that MSE stands for MS Error. F Variation among sample means MS Groups Natural variation within groups MSE We can view this F-statistic as the ratio of two estimators of the common population variance, σ2, where the denominator (MSE) is a good (unbiased) estimator, and the numerator (MS Groups) is only good when H0 is true (otherwise, it tends to overestimate σ2). Thus, large F values are evidence against the null hypothesis of equal population means. If we reject the null hypothesis, indicating that at least one of the population means is different, then we can turn to a multiple comparisons procedure for determining which population mean(s) appear to be different and how they differ. The most common method to analyze this is by looking at the set of all pairwise comparisons. Two equivalent techniques can be used for each pair of means: either perform a hypothesis test to see if two population means are significantly different, or construct a confidence interval for the difference in population means to see whether the value of 0 is contained in the interval. Specifically, a multiple comparisons procedure called Tukey’s procedure, which is available in SPSS, controls for the overall Type I error rate (overall significance level) or the overall confidence level. 101 Formula Card: Activity: Is There a Difference Among the Mean Freshman GPAs for Three Different Socioeconomic Classes? Background: Sociologists often conduct experiments to investigate the relationship between socioeconomic status and college performance. Socioeconomic status is generally partitioned into three classes: lower, middle, and upper. Consider the problem of comparing the mean grade point average (GPA) of college freshmen across the three socioeconomic populations. The GPAs for random samples of seven college freshmen from each of the three socioeconomic classes were selected from a university’s files at the end of the first academic year. The data are in the gpa.sav data set. (Source: Mendenhall and Sincich, 1996, page 589) Task: Perform a test to assess whether the mean freshman GPAs among the three socioeconomic classes differ. If there is sufficient evidence to indicate significant differences, determine which groups differ and how. Recall: Write out the Five Steps for conducting a test of hypotheses (reference page 53). 1. 2. 3. 4. 5. 102 Clarify: Before conducting any test, here are a set of questions to ask yourself: a. How many populations are there? How many variables are there? One Two One Two More than two What is the response variable? What type of variable is the response? Categorical Quantitative What is the explanatory variable? What type of variable is the explanatory variable? Categorical Quantitative What type of parameter would be useful for summarizing this response, considering the explanatory variable? (See Supplement 3) Proportion Mean Other Procedure: Use your answers to these questions, to guide you to the appropriate inference procedure. You may refer back to Supplement 3: Name that Scenario for assistance. The appropriate inference procedure for this scenario is __________________________________________________________ , and the value of k for this problem is ___________________ . Hypothesis Test: You can now implement the Five Steps for conducting a test of hypotheses. 1. State the Hypotheses: H0: _________________________ and Ha: _________________________ , where __________ represents: Remember: Your hypotheses and parameter definition should always be a statement about the population(s) under study. 103 2. Checking the Assumptions and Computing the Test Statistic Assumptions: a. For this scenario, we need to assume that the k samples are ____________________ from each other. b. We need to assume that each sample is a ________________ sample. To check this assumption, we would make a ________ plot (if there was time order) for each sample, and look for ____________________ . c. Each sample needs to come from a normally distributed ___________________ . To check this assumption, we would make a __________ plot for each ______________ . d. Finally, for ANOVA, we need to check the assumption that all k populations have equal ___________________ . e. Comment on each assumption below, using graphs and output when appropriate. i. Are the three samples independent? ii. Are the samples random samples? iii. Note that there is no time order for this data. If there were, since you need EACH sample to be a random sample, how many time plots would you need to make to check this assumption? _______ time plot(s) iv. Construct the Q-Q plots necessary to check the assumption about normally distributed populations. (Recall that in order to split a data file, the command is: Data> Split File.) Does it appear that the assumption that each sample comes from a normally distributed population is met? Why? Note: The equal population variances assumption will be considered after the ANOVA output is generated. 104 Test-Statistic: f. Generate the ANOVA output using Analyze> Compare Means> One-Way ANOVA. (Make sure you have turned off any split file features you may have been using in your assumptions checks.) Under Options, select the Descriptive (which gives you sample means and standard deviations) and the Homogeneity of variance test (Levene’s test) options. i. Choose a way to determine if the assumption of equal population variances is valid. Check the assumption and comment. ii. The assumption of equal population variances appears to be: valid not valid iii. The symbol for the estimate of the common population standard deviation is ____ . For ANOVA, this value can be obtained by computing __________ , which for this problem is equal to ____________ . g. What is the value of the test statistic? ____ = ____________ h. What is the distribution of the test statistic if the null hypothesis is true? Note: This is not the same as the distribution of the population that the data were drawn from, and will be the model used to find the p-value. 3. Calculate the p-Value: a. What is the SPSS reported p-value? _____________ b. Draw a picture of the desired p-value. Include labels for both the distribution and x-axis. Use the pval() function in R to check your work. 4. Decision: What is your decision at a 5% significance level? Reject H0 Fail to Reject H0 Remember: Reject H0 Fail to Reject H0 Results statistically significant Results not statistically significant 105 5. Conclusion: What is your conclusion in the context of the problem? Note: Conclusions should always include a reference to the population parameter of interest. Be careful that your conclusion is not too strong; you can say that you have sufficient evidence or something equivalent, but do NOT say that we have proven anything. Follow-up Analyses: If ANOVA has indicated that there appears to be significant differences between two or more of groups, we can use a multiple comparison test to tell us which groups appear to be different and by how much. Obtain the multiple comparisons output using Analyze > Compare Means > OneWay ANOVA. Under Post Hoc…, choose Tukey, which has a default significance level of 0.05. The multiple comparisons output contains both p-values and confidence intervals for every possible pairwise comparison of groups, and either can be used to determine where differences exist. The p-values that are equal to or smaller than 0.05 or the confidence intervals that do NOT contain 0 indicate a difference between those two population means. Summarize the findings about the differences in population means for the GPAs of freshmen in the different socioeconomic classes. Which pairs are significantly different? Calculate a 95% confidence interval for the mean GPA for the middle class group, where the sample mean based on the 7 subjects involved in the group was 3.25. 106 Check Your Understanding: Complete the following sentences by circling words or filling in blanks as necessary. The p-value of 0.025 from this activity implies that if this study were repeated many times, we would see an F test statistic of 4.579 or greater less in about __________ % of repetitions if the population means were really all equal not equal. ANOVA procedures can be thought of as an extension of the two independent sample pooled population unpooled t-test, and hence requires the assumption of equal sample variances. One way to check this assumption is to use Levene’s test and see if the p-value is greater than less than or equal to 0.10 (or any reasonable significance level). Think About It: For an ANOVA test, would there ever be a situation in which we would need to divide the SPSS output p-value by 2? Why or why not? 107 Example Exam Question on ANOVA A study was conducted to compare the effects of two different therapy treatments and a control condition on weight gain in anorexic girls. Group 1 was the control condition subjects that received no intervention, Group 2 subjects received a cognitive-behavioral treatment condition, and Group 3 subjects received a family therapy condition. The response was weight gain over a fixed time period. a. The ANOVA output provided below is used to test a set of hypotheses. ANOVA Gain in Weight Between Groups Within Groups Total i. Sum of Squares 601.916 3331.037 3932.953 df 2 60 62 Mean Square 300.958 55.517 F 5.421 Sig. .007 State the null and alternative hypotheses. H0: __________________________________________________________ Ha: __________________________________________________________ The p-value for this test is reported as 0.007. Draw a sketch of the appropriate distribution showing how the p-value was determined for this ANOVA study. Provide all details. 108 b. Multiple comparisons were performed on the weight gain data (using Tukey’s method). Use the results to circle all pairs that are significantly different (using a 5% significance level). control versus cognitive behavior control versus family therapy cognitive behavior versus family therapy Multiple Comparisons Dependent Variable: Gain in Weight Tukey HSD (I) Condition Control Cog Behav Family (J) Condition Cog Behav Family Control Family Control Cog Behav Mean Difference (I-J) -3.65 -8.29 3.65 -4.64 8.29 4.64 95% Confidence Interval Lower Bound Upper Bound -8.77 1.48 -14.36 -2.22 -1.48 8.77 -10.58 1.30 2.22 14.36 -1.30 10.58 c. Calculate a 95% confidence interval for the mean weight gain for the family therapy group, where the sample mean based on the 23 subjects involved in group 3 was 7.4 pounds. Final answer: __________________________ 109