Module 7: One-Way Analysis of Variance (ANOVA)

Author(s): Brenda Gunderson, Ph.D., 2011
License: Unless otherwise noted, this material is made available under the
terms of the Creative Commons Attribution–Non-commercial–Share
Alike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/
We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your
ability to use, share, and adapt it.
Copyright holders of content included in this material should contact open.michigan@umich.edu with any
questions, corrections, or clarification regarding the use of content.
For more information about how to cite these materials visit http://open.umich.edu/education/about/terms-of-use.
Some material may be sourced from:
Mind on Statistics
Utts/Heckard, 3rd Edition, Duxbury, 2006
Text Only: ISBN 0495667161
Bundled version: ISBN 1111978301
Material from this publication used with permission.
Module 8: One-Way Analysis of Variance (ANOVA)
Objective: In this module you will perform a one-way Analysis of Variance, often abbreviated ANOVA.
We have already seen that the two independent samples t test can be used to compare the means of
two populations (when the samples are independent). What if we want to compare the means of
three or more populations? We turn to a technique called Analysis of Variance (ANOVA). You can think
of ANOVA as an extension of the two independent samples pooled t-test: it compares
several population means and requires the assumption that the populations have equal variances.
Overview: Analysis of Variance (ANOVA) is a statistical tool for analyzing how the mean value of a
quantitative response (or dependent) variable is affected by one or more categorical variables, known
as treatment variables or factors. We base our conclusions regarding the equality of the population
means on an F test that ANOVA produces.
For example, we might administer a new antibiotic drug to a random sample of people. The response
variable is white blood cell count and the grouping variable or factor could be age with levels 1 = 0 to 19
years; 2 = 20 to 29 years; 3 = 30 to 39 years; and 4 = 40 years and older. We would then use the ANOVA
method to see if the mean white blood cell counts for the four age group populations are all the same.
In this example the number of populations under study is k = 4. Another example is in a study of
painkillers for relief of headache pain. The response variable might be the time to relief and the factor
or treatment variable might be the type of painkiller. The different levels of the type of painkiller might
be a new drug, a standard drug, and a placebo. In this example the number of populations under study,
or treatment groups, is k = 3.
Several assumptions are made in ANOVA. The response for each population is assumed to be normally
distributed with equal variance across the populations. The data are assumed to consist of
independent random samples. The analysis of variance involves decomposing the Total variation of the
responses into two parts: (1) that due to the variation among sample means (Between Groups
variation), and (2) that due to natural variation within groups (variation due to Error).
SS Total = SS Groups + SS Error
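In symbols, the three sums of squares are usually written as shown here. The notation is an assumption chosen for illustration rather than taken from this workbook: y_ij is the j-th observation in group i, n_i is the size of group i, \bar{y}_i and s_i are the sample mean and standard deviation of group i, and \bar{y} is the overall mean of all N observations.

```latex
\begin{aligned}
\text{SS Total}  &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2 \\
\text{SS Groups} &= \sum_{i=1}^{k} n_i\,(\bar{y}_i - \bar{y})^2 \\
\text{SS Error}  &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2
                  = \sum_{i=1}^{k} (n_i - 1)\,s_i^2
\end{aligned}
```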
If the sum of squares between groups (SS Groups) is large relative to the sum of squares within groups
(SS Error), it implies that the model of different treatment means explains a significant portion of the
observed variability. In this case, the null hypothesis H0: μ1 = μ2 = … = μk (that the population means are
equal) might then be rejected, in favor of the alternative hypothesis Ha: at least one of the population
means μi is different. In order to determine what is "large" (for SS Groups relative to SS Error), the sum
of squares values are divided by their respective degrees of freedom, and the resulting mean square
terms are used to calculate an F-statistic. The degrees of freedom for SS Groups are the number of
treatment groups, k, minus one (k - 1); for SS Error they are the total sample size, N, minus the number
of treatment groups (N -k).
MS Groups = SS Groups/(k – 1)
MS Error = SS Error/(N – k)
The ratio of these two mean squares forms the F-statistic with numerator degrees of freedom
(k - 1) and denominator degrees of freedom (N -k).
F = (Variation among sample means) / (Natural variation within groups) = MS Groups / MSE
Here MSE is shorthand for MS Error.
We can view this F-statistic as the ratio of two estimators of the common population variance, σ²: the
denominator (MSE) is a good (unbiased) estimator, while the numerator (MS Groups) is only good when
H0 is true and otherwise tends to overestimate σ². Thus, large F values are evidence against the null
hypothesis of equal population means.
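As a rough illustration of the calculation just described, the sketch below computes the sums of squares, mean squares, and F-statistic for three small made-up samples. The data and function names are hypothetical and are not from the GPA activity. The p-value helper relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2 (that is, k = 3); for other values of k you would use statistical software or an F table.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_groups, ss_error, df_groups, df_error, f_stat) for a list of samples."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n_total = len(all_values)
    # Between-groups variation: group means around the grand mean, weighted by group size
    ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variation: observations around their own group mean
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_groups, df_error = k - 1, n_total - k
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return ss_groups, ss_error, df_groups, df_error, ms_groups / ms_error

def f_upper_tail_df1_2(f_stat, df_error):
    """P(F >= f_stat) for an F(2, df_error) distribution; valid only when k - 1 = 2."""
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)

# Hypothetical data: three groups of four observations each
groups = [[4, 5, 6, 5], [6, 7, 8, 7], [8, 9, 10, 9]]
ss_groups, ss_error, df_groups, df_error, f_stat = one_way_anova(groups)
p_value = f_upper_tail_df1_2(f_stat, df_error)
print(f"SS Groups = {ss_groups:.3f}, SS Error = {ss_error:.3f}")
print(f"F({df_groups}, {df_error}) = {f_stat:.3f}, p-value = {p_value:.5f}")
```

Because the data are invented, the printed numbers only show the mechanics; SPSS reports the same quantities in its ANOVA table.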
If at least one of the population means appears to be different, then we can turn to a multiple
comparisons procedure for learning which population mean(s) appear to be different and how they
differ. The most common set of multiple comparisons analyzed is the set of all pairwise
comparisons. Either of two equivalent techniques can be used for each pair of means: perform a test
to see if the two population means are significantly different; or construct a confidence interval for the
difference in population means and see whether the value of 0 is in the interval or not. Several multiple
comparisons procedures are available that control for the overall type I error rate (overall significance
level) or the overall confidence level. One such procedure is called Tukey’s procedure, which is one of
the options available in SPSS.
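The interval-based check for each pair can be sketched in a few lines. The intervals below are invented for illustration (they are not SPSS output and do not come from any data set in this module), and the function name is hypothetical. The sketch only reads intervals that a procedure such as Tukey's has already produced; it does not compute them.

```python
def pair_differs(lower, upper):
    """Return True when a confidence interval for a difference in means excludes 0."""
    return not (lower <= 0 <= upper)

# Hypothetical simultaneous 95% intervals for mean(first) - mean(second)
intervals = {
    ("A", "B"): (-1.2, 3.4),   # contains 0
    ("A", "C"): (0.8, 5.1),    # entirely above 0
    ("B", "C"): (-6.0, -0.5),  # entirely below 0
}

for (first, second), (lower, upper) in intervals.items():
    if pair_differs(lower, upper):
        direction = "higher" if lower > 0 else "lower"
        print(f"{first} vs {second}: significant; mean of {first} appears {direction}")
    else:
        print(f"{first} vs {second}: no significant difference detected")
```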
Formula Card:
Activity:
Is there a Difference among the Mean Freshman GPAs for
three different socioeconomic classes?
Background: Sociologists often conduct experiments to investigate the relationship between
socioeconomic status and college performance. Socioeconomic status is generally partitioned into three
groups: lower class, middle class and upper class. Consider the problem of comparing the mean grade
point average of college freshmen across the three socioeconomic populations. The grade point
averages (GPA) for random samples of seven college freshmen from each of the three socioeconomic
classes (socclass) were selected from a university’s files at the end of the first academic year. The data
are in the GPA.sav data set (Source: Mendenhall and Sincich, 1996, page 589).
Do the data provide sufficient evidence to indicate a difference among the mean freshmen GPAs for the
three different socioeconomic classes? If so, which groups appear to be significantly different and how
do they differ?
Task: Perform and interpret an analysis of variance using the GPA data set.
Recall: Write out the Five Steps for conducting a test of hypotheses (Reference page 51).
1.
2.
3.
4.
5.
Before conducting any test, here are a set of questions to ask yourself:
• How many populations are there?
One    Two    More than two
• How many variables are there?
One    Two
• What is the response variable?
• What type of variable is the response?
Categorical    Quantitative
• What is the explanatory variable (if applicable)?
• What type of variable is the explanatory variable (if applicable)?
Categorical    Quantitative
• What type of parameter would be useful for summarizing this response, considering the explanatory
variable (if any)?
Proportion    Mean    Other (see Supplement 3)
Based on the answers to these questions, you should be able to identify the appropriate inference
procedure. You may refer back to Supplement 3 – Name that Scenario for assistance.
The appropriate inference procedure for this scenario is ______________________________
and the value of k for this problem is ___________________ .
1. State the hypotheses: H0: ___________________ Ha: _______________________
where _____ represents
Your parameter definition should always be a statement about the population(s) under study.
2. Assumption Checks and Computing the Test Statistic:
Assumptions:
a. For this scenario, we need to assume that the k samples are ________________
from each other.
b. We need to assume that each sample is a ___________ sample.
To check this assumption, we would make a __________ plot (if there was time order) for
each sample and look for ________________________________.
c. Each sample needs to come from a normally distributed _________________ .
To check this assumption, we would make a _______ plot for each __________.
d. Finally, for ANOVA, we need to assume all k populations have equal ________________.
Checking equal population variances:
There are three ways to check the assumption of equal population variances.
o Examine the sample standard deviations. If they are similar, the assumption
appears reasonable. (Variance is the standard deviation squared, so similar standard
deviations mean similar variances.)
o Examine side-by-side boxplots of the sample data. If the IQRs are similar,
the assumption appears reasonable.
o Use Levene’s test. If the Levene’s test p-value is greater than 0.05 (or the specified
significance level), the assumption of equal population variances appears to hold.
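For intuition about what Levene's test does, here is a minimal sketch of one common version: replace each observation by its absolute deviation from its own group mean, then run a one-way ANOVA F test on those deviations. The data are made up, the function names are hypothetical, and the p-value shortcut applies only to k = 3 groups (numerator df = 2). Treat the SPSS output as the reference, since it may differ in details.

```python
from statistics import mean, stdev

def anova_f(groups):
    """One-way ANOVA F-statistic with its degrees of freedom."""
    values = [y for g in groups for y in g]
    grand = mean(values)
    df1, df2 = len(groups) - 1, len(values) - len(groups)
    ms_groups = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / df1
    ms_error = sum((y - mean(g)) ** 2 for g in groups for y in g) / df2
    return ms_groups / ms_error, df1, df2

def levene_mean_based(groups):
    """Levene-type statistic: ANOVA F on absolute deviations from each group mean."""
    deviations = [[abs(y - mean(g)) for y in g] for g in groups]
    return anova_f(deviations)

# Hypothetical samples from three groups
groups = [[2.1, 2.9, 3.4, 2.6, 3.0], [3.2, 2.8, 3.6, 3.1, 2.7], [3.9, 3.1, 3.5, 2.9, 3.6]]
print("sample SDs:", [round(stdev(g), 3) for g in groups])

w, df1, df2 = levene_mean_based(groups)
p_value = (1 + 2 * w / df2) ** (-df2 / 2)  # closed form valid only because df1 == 2
print(f"Levene W = {w:.3f} on ({df1}, {df2}) df, p-value = {p_value:.3f}")
```

A large p-value here would be read the same way as in the third bullet: no strong evidence against equal population variances.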
e. Do the Assumptions Appear Valid?
Comment on each assumption below, using graphs and output when appropriate.
Are the three samples independent?
Are the samples random samples?
Note there is no time order for this data. If there was time order, since you need EACH sample
to be a random sample, how many time plots would you need to make to check this
assumption? ______ time plot(s)
Construct the Q-Q plots to check the assumption about normally distributed populations. Recall
that if you need to split a data file the command is: Data> Split File
Does it appear that the assumption that each sample comes from a normally distributed
population is met? Why?
Note: The equal population variances assumption will be considered after the ANOVA output is
generated next.
Test-statistic:
e. Generate the ANOVA output.
Use Analyze> Compare Means> One-Way ANOVA.
Under Options, select the Descriptive (gives you sample means and standard deviations)
and the Homogeneity of variance test (this is the Levene’s test) options. Use this and any
additional output you feel is appropriate to answer the following questions.
o Choose a way to determine if the assumption of equal population variances is valid.
Check the assumption and comment.
o The assumption of equal population variances appears to be    valid    not valid.
Explain.
o Obtain an estimate of the common population standard deviation for the response.
The notation for the common population standard deviation is ______.
This value can be obtained by computing __________ ,
and for this problem it is equal to ___________
f. What is the notation for and value of the test statistic? ________ = ____________
g. What is the distribution of the test statistic if the null hypothesis is true?
This is the same as asking what model you use to find the p-value.
3. Calculate the p-value:
a. What is the SPSS reported p-value? _____________.
b. Draw a picture of the p-value. Use the “pval()” function in R to check your work.
4. Decision:
What is your decision at a 5% significance level? Reject H0 Fail to reject H0
Remember:
Reject H0 ⇔ Results statistically significant
Fail to reject H0 ⇔ Results not statistically significant
5. Conclusion: What is your conclusion in context of the problem?
Conclusions should not be too strong -- i.e. say you have sufficient evidence or equivalent, do NOT
say we have proven.
Conclusions should always include a reference to the population parameter of interest.
6. Follow-up Analyses: ANOVA assesses whether there appears to be a difference between two or
more of the groups. A multiple comparison test can tell us which groups appear to be different and by
how much those groups differ. Multiple comparison tests are a group of tests that follow an
ANOVA, but only if significant differences have been found. It might seem that they could be
used on their own, but because they are less powerful than the ANOVA F test, they can occasionally
fail to find differences that the F test would detect.
a. Obtain the multiple comparisons output. You can request multiple comparisons by clicking on
the Post Hoc … button in the dialog box under the One-Way ANOVA command. Choose Tukey
from the list. The default significance level is 0.05. Click on continue and then on Ok. The
multiple comparisons output contains p-values and confidence intervals for every possible
pairwise comparison of groups to indicate where the differences are. The p-values that are
equal to or smaller than 0.05 or the confidence intervals that do NOT contain 0 indicate a
difference between those two population means.
b. Summarize the findings about the differences in population means for the GPAs of freshmen in
the different socioeconomic classes. Which pairs are significantly different?
c. Calculate a 95% confidence interval for the mean GPA for the middle income group, where the
sample mean based on the 7 subjects involved in the group was 3.25.
Check Your Understanding:
Circle the appropriate words and fill in the blank line to complete the following sentences.
The p-value of 0.025 from this activity implies that if this study were repeated many times,
we would see an F test statistic of 4.579 or ( greater / less ) in about ____________%
of repetitions if the population means were really all ( equal / not equal ).
ANOVA procedures can be thought of as an extension of the two independent sample
( pooled / unpooled ) t-test and hence require the assumption of equal ( population / sample )
variances. One way to check this assumption is to use Levene’s test and see if the p-value is
( greater than / less than or equal to ) 0.10 (or any reasonable significance level).
Think about it…
For the p-value of an ANOVA test, would there be a situation in which we would need to divide the SPSS
output p-value by 2? Why or why not?
Example Exam Question on ANOVA
A study was conducted to compare the effects of two different therapy treatments and a control
condition on weight gain in anorexic girls. Group 1 was the control condition subjects that received no
intervention, Group 2 subjects received a cognitive-behavioral treatment condition, and Group 3
subjects received a family therapy condition. The response was weight gain over a fixed time period.
a. The ANOVA output provided below is used to test a set of hypotheses.
ANOVA
Gain in Weight
                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    601.916           2     300.958        5.421    .007
Within Groups     3331.037          60    55.517
Total             3932.953          62
i. State the null and alternative hypotheses.
H0: ___________________________________________________________
Ha: ____________________________________________________________
ii. The p-value for this test is reported as 0.007. Draw a sketch of the appropriate distribution
showing how the p-value was determined for this ANOVA study. Provide all details.
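The columns of an ANOVA table are tied together by the mean square and F formulas from the overview, which gives a quick way to check a table by hand or in code. The sketch below (variable names are illustrative) re-derives the mean squares and F from the sums of squares and degrees of freedom printed in the table and compares them with the reported values.

```python
# Values copied from the ANOVA table for Gain in Weight
ss_between, df_between = 601.916, 2
ss_within, df_within = 3331.037, 60
ss_total, df_total = 3932.953, 62

ms_between = ss_between / df_between   # Mean Square = Sum of Squares / df
ms_within = ss_within / df_within
f_stat = ms_between / ms_within        # F = MS Between / MS Within

# The Between and Within rows should add up to the Total row
assert abs(ss_between + ss_within - ss_total) < 0.001
assert df_between + df_within == df_total
print(f"MS Between = {ms_between:.3f} (table: 300.958)")
print(f"MS Within  = {ms_within:.3f} (table: 55.517)")
print(f"F          = {f_stat:.3f} (table: 5.421)")
```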
b. Multiple comparisons were performed on the weight gain data (using Tukey’s method). Use
the results to circle all pairs that are significantly different (using a 5% significance level).
Multiple Comparisons
Dependent Variable: Gain in Weight
Tukey HSD
                                  Mean Difference    95% Confidence Interval
(I) Condition    (J) Condition    (I-J)              Lower Bound    Upper Bound
Control          Cog Behav        -3.65              -8.77          1.48
Control          Family           -8.29              -14.36         -2.22
Cog Behav        Control          3.65               -1.48          8.77
Cog Behav        Family           -4.64              -10.58         1.30
Family           Control          8.29               2.22           14.36
Family           Cog Behav        4.64               -1.30          10.58
control versus cognitive behavior
control versus family therapy
cognitive behavior versus family therapy
c. Calculate a 95% confidence interval for the mean weight gain for the family therapy group, where
the sample mean based on the 23 subjects involved in group 3 was 7.4 pounds.
Final answer: ____________________________________