ANALYSIS OF VARIANCE
Jennifer Kensler

ONE-WAY ANOVA
ANOVA is used to determine whether three or more populations have different means.
[Figure: groups A, B and C; axis label: Medical Treatment]

ANOVA STRATEGY
The first step is to use the ANOVA F test to determine whether there are any significant differences among the means.
If the ANOVA F test shows that the means are not all the same, then follow-up tests can be performed to see which pairs of means differ.

ANOVA ASSUMPTIONS
The samples are random and independent of each other.
The populations are normally distributed.
The populations all have the same variance.
The ANOVA F test is robust to moderate departures from the assumptions of normality and equal variances.

THE ANOVA MODEL
The one-way ANOVA is a linear model.
ANOVA can be formulated as a regression model.

ONE-WAY ANOVA MODEL
$y_{ij} = \mu_i + \varepsilon_{ij}$, where $y_{ij}$ is the $j$th observation in group $i$, $\mu_i$ is the mean of group $i$, and $\varepsilon_{ij}$ is random error.
In other words, for each group the observed value is the group mean plus some random variation.

ONE-WAY ANOVA HYPOTHESIS
We test whether there is a difference in the means.
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
$H_a$: not all of the $\mu_i$ are equal

ANOVA F TEST
[Figure: two panels showing groups A, B and C; axis label: Medical Treatment]
Compare the variation within the samples to the variation between the samples.

ANOVA TEST STATISTIC
$F = \mathrm{MSG} / \mathrm{MSE}$
Variation within groups small compared with variation between groups → Large F
Variation within groups large compared with variation between groups → Small F

MSG
The mean square for groups, MSG, measures the variability of the sample averages.
$\mathrm{MSG} = \mathrm{SSG} / (k - 1)$, where $k$ is the number of groups.
SSG stands for sum of squares for groups.

MSE
Mean square error, MSE, measures the variability within the groups.
$\mathrm{MSE} = \mathrm{SSE} / (N - k)$, where $N$ is the total number of observations.
SSE stands for sum of squares for error.

EXAMPLE 1
We would like to determine whether a health index differs depending on which medical treatment (A, B or C) is used.
A total of 150 patients are randomly assigned to a treatment (50 people in each treatment).
JMP demonstration
Analyze > Fit Y by X
Y, Response: Health Index
X, Factor: Treatment

EXAMPLE 1: JMP OUTPUT
[Figure: JMP one-way ANOVA output for Health Index by Treatment]

FOLLOW-UP TEST
The p-value of the overall F test indicates that the mean health index is not the same for all treatments.
We would like to know which pairs of treatments are different.
One method is to use Tukey’s HSD (honestly significant difference).

TUKEY TESTS
Tukey’s test simultaneously compares all pairs of factor levels.
Tukey’s HSD controls the overall (familywise) type I error rate.
JMP demonstration
Oneway Analysis of Health Index By Treatment > Compare Means > All Pairs, Tukey HSD

JMP OUTPUT
The JMP output shows that all pairs of treatments are significantly different from one another.

ANALYSIS OF COVARIANCE (ANCOVA)
Covariates are variables that may affect the response but cannot be controlled.
Covariates are not of primary interest to the researcher.
We will look at an example with two covariates; the model is
$y_{ij} = \mu_i + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \varepsilon_{ij}$, where $\mu_i$ is the intercept for treatment $i$, $x_{1ij}$ and $x_{2ij}$ are the two covariate values for observation $j$ in treatment $i$, $\beta_1$ and $\beta_2$ are their coefficients, and $\varepsilon_{ij}$ is random error.

EXAMPLE 2
Consider the previous example, where we tested whether the health index differed depending on the treatment.
Age and gender may also influence the health index.
We can use age and gender as covariates.
JMP demonstration
Analyze > Fit Model
Y: Health Index
Add: Treatment, Age, Gender

JMP OUTPUT
[Figure: JMP Fit Model output for Health Index]

CONCLUSION
ANOVA and ANCOVA methods allow us to determine whether the means of several groups are statistically different.
For information about using SAS and SPSS to do ANOVA:
http://www.ats.ucla.edu/stat/sas/topics/anova.htm
http://www.ats.ucla.edu/stat/spss/topics/anova.htm
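
For readers without JMP, SAS or SPSS, the one-way ANOVA F statistic can also be computed by hand. The following is a minimal Python sketch, not part of the original slides, that compares the mean square for groups (MSG) with the mean square error (MSE). The function name `one_way_anova` and the three small samples are hypothetical, chosen only so the arithmetic is easy to follow; they are not the health index data from Example 1.

```python
from statistics import mean


def one_way_anova(groups):
    """Compute SSG, SSE, MSG, MSE and F for a list of samples (one list per group)."""
    k = len(groups)                                  # number of groups
    n_total = sum(len(g) for g in groups)            # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Between-group variation: squared distance of each group mean from the
    # grand mean, weighted by the group size.
    ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

    # Within-group variation: squared distance of each observation from its
    # own group mean.
    sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

    msg = ssg / (k - 1)          # mean square for groups
    mse = sse / (n_total - k)    # mean square error
    return {"SSG": ssg, "SSE": sse, "MSG": msg, "MSE": mse, "F": msg / mse}


# Hypothetical illustration data (NOT the data from Example 1).
treatment_a = [4.0, 5.0, 6.0]
treatment_b = [6.0, 7.0, 8.0]
treatment_c = [8.0, 9.0, 10.0]

result = one_way_anova([treatment_a, treatment_b, treatment_c])
print(f"SSG = {result['SSG']:.2f}, SSE = {result['SSE']:.2f}, F = {result['F']:.2f}")
```

For these made-up samples the group means are 5, 7 and 9 and the grand mean is 7, so SSG = 24, SSE = 6, MSG = 12, MSE = 1 and F = 12. The p-value would then come from the F distribution with 2 and 6 degrees of freedom, which this sketch leaves to a statistics package.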