19-f12-bgunderson-wb-module8 - Open.Michigan

advertisement
Author: Brenda Gunderson, Ph.D., 2012
License: Unless otherwise noted, this material is made available under the terms of the
Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License:
http://creativecommons.org/licenses/by-nc-sa/3.0/
The University of Michigan Open.Michigan initiative has reviewed this material in accordance
with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it.
The attribution key provides information about how you may share and adapt this material.
Copyright holders of content included in this material should contact
open.michigan@umich.edu with any questions, corrections, or clarification regarding the use of
content.
For more information about how to attribute these materials visit:
http://open.umich.edu/education/about/terms-of-use. Some materials are used with permission
from the copyright holders. You may need to obtain new permission to use those materials for
other uses. This includes all content from:
Mind on Statistics
Utts/Heckard, 4th Edition, Cengage L, 2012
Text Only: ISBN 9781285135984
Bundled version: ISBN 9780538733489
SPSS and its associated programs are trademarks of SPSS Inc. for its proprietary
computer software. Other product names mentioned in this resource are used for identification
purposes only and may be trademarks of their respective companies.
Attribution Key
For more information see: http:://open.umich.edu/wiki/AttributionPolicy
Content the copyright holder, author, or law permits you to use, share and adapt:
Creative Commons Attribution-NonCommercial-Share Alike License
Public Domain – Self Dedicated: Works that a copyright holder has
dedicated to the public domain.
Make Your Own Assessment
Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for
copyright.
Public Domain – Ineligible. WOrkds that are ineligible for copyright
protection in the U.S. (17 USC §102(b)) *laws in your jurisdiction may
differ.
Content Open.Michigan has used under a Fair Use determination
Fair Use: Use of works that is determined to be Fair consistent with the
U.S. Copyright Act (17 USC § 107) *laws in your jurisdiction may differ.
Our determination DOES NOT mean that all uses of this third-party content are Fair Uses and
we DO NOT guarantee that your use of the content is Fair. To use t his content you should
conduct your own independent analysis to determine whether or not your use will be Fair.
Module 8: One-Way Analysis of Variance
(ANOVA)
Objective: In this module you will perform a one-way analysis of variance, often
abbreviated ANOVA. We have already seen that the two independent samples ttest can be used to compare the means of two populations (when the samples are
independent). However, when we want to compare the means of three or more
populations, we turn to ANOVA. You can think of ANOVA as an extension of the
two independent sample pooled t-test since it compares several population means
and requires the assumption that the populations have equal variances.
Overview: ANOVA is a statistical tool for analyzing how the mean value of a
quantitative response (or dependent) variable is affected by one or more
categorical variables, known as treatment variables or factors. For example, we
might administer a new antibiotic drug to a random sample of people. The
response variable is white blood cell count and the grouping variable or factor
could be age with 4 levels: 1 = 0 to 19 years, 2 = 20 to 29 years, 3 = 31 to 40 years,
and 4 = 40 years and older. We would then use the ANOVA method to see if the
mean white blood cell count for the four age group populations are all the same.
In this example, the number of populations under study is k = 4. Another example
would be a study of painkillers for relief of headache pain. The response variable
might be the time to relief and the factor or treatment variable might be the type
of painkiller, using different levels: new drug, standard drug, and a placebo. In this
example the number of treatment groups is k = 3.
While ANOVA allows us to compare the means of more than two populations, it
can only tell us whether differences exist, not specifically which means vary.
Consequently, the appropriate hypotheses for ANOVA are H0: μ1 = μ2 = … = μk (that
the population means are equal, where k is the number of treatment groups) and
Ha: at least one of the population means, μi, is different. As with our other
hypothesis tests, several assumptions are required for ANOVA. The response for
each population is assumed to be normally distributed with equal variance across
the populations. The equal population variance assumption here can be checked
the same way as the two independent samples t-test – side-by-side boxplots,
comparing sample standard deviations, and Levene’s test. Further, the data are
assumed to consist of independent random samples.
100
The analysis of variance involves decomposing the total variation of the responses
into two parts: (1) that due to the variation among sample means (between
groups variation), and (2) that due to natural variation within groups (variation due
to error, or within groups variation).
SS Total = SS Groups + SS Error
If the sum of squares between groups (SS Groups) is large relative to the sum of
squares within groups (SS Error), it implies that the model of different treatment
means explains a significant portion of the observed variability. In order to
determine what is "relatively large," the sum of squares values are divided by their
respective degrees of freedom, creating what are called mean square terms. The
degrees of freedom for SS Groups is the number of treatment groups, k, minus one
(k - 1), while for SS Error, the degrees of freedom is the total number of
observations, N, minus the number of treatment groups (N -k).
MS Groups = SS Groups/(k – 1)
MS Error = SS Error/(N – k)
The ratio of these two mean squares forms the F-statistic, which has numerator
degrees of freedom (k - 1) and denominator degrees of freedom (N - k). Note in
the following equation that MSE stands for MS Error.
F
Variation among sample means MS Groups

Natural variation within groups
MSE
We can view this F-statistic as the ratio of two estimators of the common
population variance, σ2, where the denominator (MSE) is a good (unbiased)
estimator, and the numerator (MS Groups) is only good when H0 is true
(otherwise, it tends to overestimate σ2). Thus, large F values are evidence against
the null hypothesis of equal population means.
If we reject the null hypothesis, indicating that at least one of the population
means is different, then we can turn to a multiple comparisons procedure for
determining which population mean(s) appear to be different and how they differ.
The most common method to analyze this is by looking at the set of all pairwise
comparisons. Two equivalent techniques can be used for each pair of means:
either perform a hypothesis test to see if two population means are significantly
different, or construct a confidence interval for the difference in population means
to see whether the value of 0 is contained in the interval. Specifically, a multiple
comparisons procedure called Tukey’s procedure, which is available in SPSS,
controls for the overall Type I error rate (overall significance level) or the overall
confidence level.
101
Formula Card:
Activity:
Is There a Difference Among the Mean
Freshman GPAs for Three Different
Socioeconomic Classes?
Background: Sociologists often conduct experiments to investigate the
relationship between socioeconomic status and college performance.
Socioeconomic status is generally partitioned into three classes: lower, middle, and
upper. Consider the problem of comparing the mean grade point average (GPA) of
college freshmen across the three socioeconomic populations. The GPAs for
random samples of seven college freshmen from each of the three socioeconomic
classes were selected from a university’s files at the end of the first academic year.
The data are in the gpa.sav data set. (Source: Mendenhall and Sincich, 1996, page
589)
Task: Perform a test to assess whether the mean freshman GPAs among the three
socioeconomic classes differ. If there is sufficient evidence to indicate significant
differences, determine which groups differ and how.
Recall: Write out the Five Steps for conducting a test of hypotheses (reference
page 53).
1.
2.
3.
4.
5.
102
Clarify: Before conducting any test, here are a set of questions to ask yourself:
a.
How many populations are there?
 How many variables are there?
One
Two
One
Two
More than two
 What is the response variable?
 What type of variable is the response?
Categorical
Quantitative
 What is the explanatory variable?
 What type of variable is the explanatory variable?
Categorical
Quantitative
 What type of parameter would be useful for summarizing this response,
considering the explanatory variable? (See Supplement 3)
Proportion
Mean
Other
Procedure: Use your answers to these questions, to guide you to the appropriate
inference procedure. You may refer back to Supplement 3: Name that Scenario for
assistance.
The appropriate inference procedure for this scenario is
__________________________________________________________ ,
and the value of k for this problem is ___________________ .
Hypothesis Test: You can now implement the Five Steps for conducting a test of
hypotheses.
1. State the Hypotheses: H0: _________________________ and
Ha: _________________________ , where __________ represents:
Remember: Your hypotheses and parameter definition should always be a
statement about the population(s) under study.
103
2. Checking the Assumptions and Computing the Test Statistic
Assumptions:
a. For this scenario, we need to assume that the k samples are
____________________ from each other.
b. We need to assume that each sample is a ________________ sample. To
check this assumption, we would make a ________ plot (if there was time
order) for each sample, and look for ____________________ .
c. Each sample needs to come from a normally distributed
___________________ . To check this assumption, we would make a
__________ plot for each ______________ .
d. Finally, for ANOVA, we need to check the assumption that all k populations
have equal ___________________ .
e. Comment on each assumption below, using graphs and output when
appropriate.
i. Are the three samples independent?
ii. Are the samples random samples?
iii. Note that there is no time order for this data. If there were, since you
need EACH sample to be a random sample, how many time plots
would you need to make to check this assumption? _______ time
plot(s)
iv. Construct the Q-Q plots necessary to check the assumption about
normally distributed populations. (Recall that in order to split a data
file, the command is: Data> Split File.) Does it appear that the
assumption that each sample comes from a normally distributed
population is met? Why?
Note: The equal population variances assumption will be considered after the
ANOVA output is generated.
104
Test-Statistic:
f. Generate the ANOVA output using Analyze> Compare Means> One-Way
ANOVA. (Make sure you have turned off any split file features you may
have been using in your assumptions checks.) Under Options, select the
Descriptive (which gives you sample means and standard deviations) and
the Homogeneity of variance test (Levene’s test) options.
i. Choose a way to determine if the assumption of equal population
variances is valid. Check the assumption and comment.
ii. The assumption of equal population variances appears to be:
valid
not valid
iii. The symbol for the estimate of the common population standard
deviation is ____ . For ANOVA, this value can be obtained by
computing
__________ , which for this problem is equal to
____________ .
g. What is the value of the test statistic? ____ = ____________
h. What is the distribution of the test statistic if the null hypothesis is true?
Note: This is not the same as the distribution of the population that the
data were drawn from, and will be the model used to find the p-value.
3. Calculate the p-Value:
a. What is the SPSS reported p-value? _____________
b. Draw a picture of the desired p-value. Include labels for both the
distribution and x-axis. Use the pval() function in R to check your work.
4. Decision:
What is your decision at a 5% significance level?
Reject H0 Fail to Reject H0
Remember: Reject H0

Fail to Reject H0 
Results statistically significant
Results not statistically significant
105
5. Conclusion:
What is your conclusion in the context of the problem?
Note: Conclusions should always include a reference to the population
parameter of interest. Be careful that your conclusion is not too strong;
you can say that you have sufficient evidence or something equivalent,
but do NOT say that we have proven anything.
Follow-up Analyses: If ANOVA has indicated that there appears to be significant
differences between two or more of groups, we can use a multiple comparison test
to tell us which groups appear to be different and by how much.
Obtain the multiple comparisons output using Analyze > Compare Means > OneWay ANOVA. Under Post Hoc…, choose Tukey, which has a default
significance level of 0.05. The multiple comparisons output contains both
p-values and confidence intervals for every possible pairwise comparison of
groups, and either can be used to determine where differences exist. The
p-values that are equal to or smaller than 0.05 or the confidence intervals that
do NOT contain 0 indicate a difference between those two population means.
Summarize the findings about the differences in population means for the GPAs of
freshmen in the different socioeconomic classes. Which pairs are significantly
different?
Calculate a 95% confidence interval for the mean GPA for the middle class group,
where the sample mean based on the 7 subjects involved in the group was
3.25.
106
Check Your Understanding:
Complete the following sentences by circling words or filling in blanks as necessary.
The p-value of 0.025 from this activity implies that if this study were repeated
many times, we would see an F test statistic of 4.579 or greater less in about
__________ % of repetitions if the population means were really all
equal
not equal.
ANOVA procedures can be thought of as an extension of the two independent
sample pooled
population
unpooled t-test, and hence requires the assumption of equal
sample variances. One way to check this assumption is
to use Levene’s test and see if the p-value is greater than
less than or equal to
0.10 (or any reasonable significance level).
Think About It:
For an ANOVA test, would there ever be a situation in which we would need to
divide the SPSS output p-value by 2? Why or why not?
107
Example Exam Question on ANOVA
A study was conducted to compare the effects of two different therapy treatments
and a control condition on weight gain in anorexic girls. Group 1 was the control
condition subjects that received no intervention, Group 2 subjects received a
cognitive-behavioral treatment condition, and Group 3 subjects received a family
therapy condition. The response was weight gain over a fixed time period.
a. The ANOVA output provided below is used to test a set of hypotheses.
ANOVA
Gain in Weight
Between Groups
Within Groups
Total
i.
Sum of
Squares
601.916
3331.037
3932.953
df
2
60
62
Mean Square
300.958
55.517
F
5.421
Sig.
.007
State the null and alternative hypotheses.
H0: __________________________________________________________
Ha: __________________________________________________________
The p-value for this test is reported as 0.007. Draw a sketch of the appropriate
distribution showing how the p-value was determined for this ANOVA
study. Provide all details.
108
b. Multiple comparisons were performed on the weight gain data (using Tukey’s
method). Use the results to circle all pairs that are significantly different (using
a 5% significance level).
control versus cognitive behavior
control versus family therapy
cognitive behavior versus family therapy
Multiple Comparisons
Dependent Variable: Gain in Weight
Tukey HSD
(I) Condition
Control
Cog Behav
Family
(J) Condition
Cog Behav
Family
Control
Family
Control
Cog Behav
Mean
Difference
(I-J)
-3.65
-8.29
3.65
-4.64
8.29
4.64
95% Confidence Interval
Lower Bound Upper Bound
-8.77
1.48
-14.36
-2.22
-1.48
8.77
-10.58
1.30
2.22
14.36
-1.30
10.58
c. Calculate a 95% confidence interval for the mean weight gain for the family
therapy group, where the sample mean based on the 23 subjects involved in
group 3 was 7.4 pounds.
Final answer: __________________________
109
Download