Handout

Sociology 541
Analysis of Variance (ANOVA)
So far, we’ve been comparing the means between two groups, but we often want to make
comparisons between three or more groups.
ANOVA enables us to compare means for multiple (two or more) groups.
Running a separate t test for every pair of groups rapidly increases the probability of making at least one Type I error.
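To get a feel for how fast the error rate grows, here is a rough sketch. It assumes the pairwise tests are independent and each uses alpha = 0.05, so the chance of at least one false rejection across k tests is 1 - (1 - alpha)^k. Pairwise t tests on the same groups are not truly independent, so read the numbers as an approximation only. The function name is made up for this illustration.

```python
from math import comb

def familywise_error(num_groups: int, alpha: float = 0.05) -> float:
    """Approximate P(at least one Type I error) over all pairwise t tests,
    under the simplifying assumption that the tests are independent."""
    num_tests = comb(num_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** num_tests

# Show how the approximate error rate rises with the number of groups.
for g in (3, 4, 5, 6):
    print(g, "groups:", comb(g, 2), "tests, error rate about",
          round(familywise_error(g), 3))
```

With three groups there are three pairwise tests, and the approximate chance of at least one Type I error is already about 0.14 rather than 0.05.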
ONE-WAY ANOVA
An alternative way to compare means is a technique called Analysis of Variance (ANOVA),
developed by Ronald Fisher in the 1920s.
ANOVA can be used when:
The dependent/response variable is quantitative (interval level), e.g., hours of exercise per week
The independent/explanatory variable is qualitative (categorical), e.g., age group
Notation
g denotes the number of groups/populations (g also serves as the group subscript in the formulas below).
The means of the response/dependent variable for the g groups/populations are $\mu_1, \mu_2, \ldots, \mu_g$
Denote group sample sizes by $n_1, n_2, \ldots, n_g$ and total sample size by $N = n_1 + n_2 + \cdots + n_g$
Sample means for each group are: $\bar{Y}_1, \bar{Y}_2, \ldots, \bar{Y}_g$
Sample standard deviations for each group are: $s_1, s_2, \ldots, s_g$
Grand mean (across all groups): $\bar{Y}$
ANOVA is a test of $H_0: \mu_1 = \mu_2 = \cdots = \mu_g$ against $H_1$ that at least two means are unequal.
Basic logic behind ANOVA:
The greater the variability between sample means and the smaller the variability within each
group of sample observations, the stronger the evidence that the means across the groups may not
be equal.
An ANOVA analysis constructs a test statistic called an F statistic that compares the variation
between groups (variability of the sample means about the overall mean) to the variation within
groups (variability of the sample observations about their separate means).
How can we go about measuring these different types of variation?
Sum of Squares
Sum of squares is found by squaring the deviations of each observation from the mean of a
distribution and adding these squared deviations together. This is the numerator of the variance.
$$\sum_{i} (Y_i - \bar{Y})^2$$
Remember that this is equivalent to the following expression: $\sum Y_i^2 - (\sum Y_i)^2 / n$
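The equivalence of the two expressions can be checked numerically. The sketch below is an illustration only: the function names are invented here, and the data are the "Young" row from the Example later in this handout.

```python
def ss_definitional(values):
    """Sum of squared deviations of each observation from the mean."""
    mean = sum(values) / len(values)
    return sum((y - mean) ** 2 for y in values)

def ss_computational(values):
    """Same quantity via the shortcut: sum(Y^2) - (sum(Y))^2 / n."""
    n = len(values)
    return sum(y * y for y in values) - sum(values) ** 2 / n

young = [11, 12, 6, 7, 3, 9]  # hours of exercise, "Young" group
print(ss_definitional(young), ss_computational(young))
```

Both functions return 56 for this group: the squared values sum to 440, and (48)^2 / 6 = 384.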
Taking into account that each of the N observations belongs to one of the g groups:
$$\sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \sum_{g} \sum_{i=1}^{n_g} (Y_{ig} - \bar{Y})^2$$
To be able to estimate the proportion of variation in the outcome Y (mean hours of exercise) due
to group effects (age) and due to unexplained factors or random variation, we can partition the
numerator of the total variance into two independent additive components: Variation between
groups and variation within groups.
How might we partition this total sum of squares?
$$(Y_{ig} - \bar{Y}) = Y_{ig} + (\bar{Y}_g - \bar{Y}_g) - \bar{Y}$$
You can re-group these terms:
$$(Y_{ig} - \bar{Y}) = (Y_{ig} - \bar{Y}_g) + (\bar{Y}_g - \bar{Y})$$
The second term in this equation is the estimate of the variation between groups.
The first term in the equation is an estimate of the within-group variation.
Squaring both sides and summing over all observations and groups (the cross-product term sums to zero) gives:
SStotal = SSwithin + SSbetween
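The step from the deviation identity to the sum-of-squares identity can be written out as follows. This is a standard derivation added for reference; the sum over g runs over all groups.

```latex
\begin{align*}
\sum_{g}\sum_{i=1}^{n_g} (Y_{ig} - \bar{Y})^2
  &= \sum_{g}\sum_{i=1}^{n_g} \left[ (Y_{ig} - \bar{Y}_g) + (\bar{Y}_g - \bar{Y}) \right]^2 \\
  &= \sum_{g}\sum_{i=1}^{n_g} (Y_{ig} - \bar{Y}_g)^2
   + 2 \sum_{g} (\bar{Y}_g - \bar{Y}) \sum_{i=1}^{n_g} (Y_{ig} - \bar{Y}_g)
   + \sum_{g} n_g (\bar{Y}_g - \bar{Y})^2
\end{align*}
```

Within each group the deviations from the group mean sum to zero, so the middle term vanishes. What remains is the within sum of squares plus the between sum of squares.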
Estimate of Total Variance (ignoring groups)
$$\frac{\sum_{i} (Y_i - \bar{Y})^2}{N - 1}$$
Numerator of this variance estimate based on the entire sample is called the total sum of squares
(TSS).
Denominator of this variance estimate based on the entire sample is the total degrees of freedom
(DFt)
Between Estimate of Variance (or mean square between groups)
How much variation is there between the groups in outcome of interest?
Estimate of $\sigma^2$ (variance) based on variability between each sample mean and the overall mean:
$$\hat{\sigma}^2_{\text{between groups}} = \frac{\sum_{g} n_g (\bar{Y}_g - \bar{Y})^2}{g - 1}$$
Numerator for this estimate is called the between sum of squares (BSS).
Denominator of between variance is the degrees of freedom between groups.
Ratio of BSS to its degrees of freedom, g-1, is the between groups estimate of the variance.
Within Estimate of Variance (or mean square within groups)
How much variation is there within each group in outcome of interest?
Pool together the sum of squares of the observations about their respective means. Since we're
pooling these variances, the homogeneity of variance assumption is required.
$$\hat{\sigma}^2_{\text{within groups}} = \frac{\sum_{g} \sum_{i=1}^{n_g} (Y_{ig} - \bar{Y}_g)^2}{N - g}$$
The numerator for this expression is called the within sum of squares (WSS).
The denominator is called the degrees of freedom within groups.
WSS has degrees of freedom equal to sum of DF of component parts:
$(n_1 - 1) + (n_2 - 1) + \cdots + (n_g - 1) = N - g$
Note: Kurtz uses the sums-of-squares formulas to calculate the different variances.
Remember that this approach and the one presented in this handout for calculating
variances are equivalent (review pages 69-74 of Kurtz).
F Test Statistic
The above formulas enable you to partition the variance in the total sample into the amount of
variance between groups and the amount of variance within groups.
If a relatively large amount of the variance is explained between groups compared to within
groups, we can conclude that the differences between groups are probably real.
To determine this, calculate an F Statistic.
F test statistic for $H_0: \mu_1 = \mu_2 = \cdots = \mu_g$ is the ratio of the between-group variance estimate to the
within-group variance estimate.
$$F = \frac{\text{Between Estimate}}{\text{Within Estimate}} = \frac{BSS / (g - 1)}{WSS / (N - g)}$$
Known as the Analysis of Variance F statistic, or ANOVA F statistic.
We know the sampling distribution of F and therefore know the probability of finding a given F.
Thus, we know the magnitudes of F needed to establish statistical significance at various levels.
Table b.7 in Appendix B of Kurtz presents the minimum F ratios necessary for significance at
different p levels.
The probability associated with an F ratio depends on the degrees of freedom. The two degrees
of freedom terms are the denominators of the between estimate and the within estimate. This F
test statistic has the F sampling distribution with df1=g-1 and df2=N-g.
A p value less than 0.05 indicates that, if the null hypothesis were true (the outcome does not
differ by group), the probability of obtaining an F ratio at least as large as the one observed
would be less than 5%.
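For the special case of df1 = 2 (three groups), the upper-tail probability of F has a simple closed form, P(F >= f) = (1 + 2f/df2)^(-df2/2), which allows a quick check of a table lookup. The sketch below uses that special case only; for any other df1 an F table or statistical software is needed. The function name is invented for this illustration, and the inputs are the F ratio and within-groups degrees of freedom from the ANOVA table later in this handout.

```python
def f_pvalue_df1_equals_2(f_stat: float, df2: int) -> float:
    """Upper-tail probability P(F >= f_stat) for an F distribution with
    df1 = 2 numerator and df2 denominator degrees of freedom.
    Valid ONLY when df1 = 2, where the tail area has a closed form."""
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)

# F = 8.077 with df1 = 2 and df2 = 15, as in the handout's ANOVA table.
p_value = f_pvalue_df1_equals_2(8.077, 15)
print(round(p_value, 3))
```

The result rounds to .004, which agrees with the Sig. value reported for the exercise example.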
NOTE FOR FUTURE REFERENCE:
Post-hoc comparisons, such as Tukey's HSD (honestly significant difference) or Bonferroni, can
be used to investigate the multiple comparison of means, controlling for the Type I error
probabilities.
Example
Group          Hours of exercise per week    Group Mean    Sample size
Young          11  12   6   7   3   9        ____          6
Middle-aged     0   8   4   2   5   5        ____          6
Old             0   2   4   1   3   2        ____          6
Grand Mean: ____
Are the age groups really different in their propensity to exercise? Or are differences due to
chance fluctuations alone?
I State null hypothesis.
II Calculate mean hours of exercise for each group (group mean) and mean hours of exercise for
all groups combined (grand mean). Does there appear to be variation between groups based on
this information? Does there appear to be variation within groups in hours of exercise?
III Calculate the between-group sum of squares (BSS) and the between-group variance.
IV Calculate the within-group sum of squares (WSS) and the within-group variance.
V Calculate the appropriate test statistic (F) and find corresponding p-value.
VI Conclusions.
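Steps II through V can be sketched in a few lines of Python as a check on hand calculations. This is a minimal illustration, not part of the original handout: the function and variable names are invented for the example, and the p-value lookup in step V is left to an F table or statistical software.

```python
# Hours of exercise per week, copied from the Example table.
groups = {
    "Young": [11, 12, 6, 7, 3, 9],
    "Middle-aged": [0, 8, 4, 2, 5, 5],
    "Old": [0, 2, 4, 1, 3, 2],
}

def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA table as a dictionary."""
    all_values = [y for values in groups.values() for y in values]
    n_total = len(all_values)            # N
    n_groups = len(groups)               # g
    grand_mean = sum(all_values) / n_total
    group_means = {name: sum(v) / len(v) for name, v in groups.items()}

    # Between sum of squares: group size times squared deviation of each
    # group mean from the grand mean.
    bss = sum(len(v) * (group_means[name] - grand_mean) ** 2
              for name, v in groups.items())
    # Within sum of squares: squared deviations of observations from
    # their own group mean, pooled over groups.
    wss = sum((y - group_means[name]) ** 2
              for name, v in groups.items() for y in v)

    df_between = n_groups - 1
    df_within = n_total - n_groups
    ms_between = bss / df_between
    ms_within = wss / df_within
    return {
        "group_means": group_means, "grand_mean": grand_mean,
        "bss": bss, "wss": wss, "tss": bss + wss,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

result = one_way_anova(groups)
for key, value in result.items():
    print(key, value)
```

Run on the Example data, this gives group means of 8, 4, and 2, a grand mean of about 4.67, BSS = 112, WSS = 104, and an F ratio of about 8.08 on 2 and 15 degrees of freedom.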
ANOVA Table
Common way to summarize the results of analysis of variance.
ANOVA: HOURS
                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    112.000            2    56.000         8.077    .004
Within Groups     104.000           15     6.933
Total             216.000           17
Sum of BSS and WSS is the total sum of squares, denoted by TSS.
$$TSS = \sum_{i} (Y_i - \bar{Y})^2 = BSS + WSS$$
Sums of Squares divided by their degrees of freedom are called mean squares. The two mean
squares are the between-groups and within-groups estimates of the population variance $\sigma^2$.
Ratio of two mean squares is the F test statistic.
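Reading the ANOVA table row by row, the arithmetic works out as follows. The values are taken from the table; the MS notation for mean squares is shorthand introduced here.

```latex
\begin{align*}
MS_{\text{between}} &= \frac{BSS}{g - 1} = \frac{112}{2} = 56.000 \\
MS_{\text{within}}  &= \frac{WSS}{N - g} = \frac{104}{15} \approx 6.933 \\
F &= \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{56.000}{6.933} \approx 8.077 \\
TSS &= BSS + WSS = 112 + 104 = 216, \qquad df_{\text{total}} = 2 + 15 = 17 = N - 1
\end{align*}
```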
Assumptions of ANOVA
1. The dependent variable/outcome is measured at the interval/ratio level.
2. Random samples are selected from the g populations.
3. The g samples are independent of one another.
4. Population distributions on response variable for g groups are normal (One-way independent
groups ANOVA is generally considered robust against violation of this assumption if $n \geq 30$
for all groups).
5. Standard deviations/variances of population distributions for g groups are equal
(Homogeneity of variance assumption).
Problem
Listed below are gains on the SAT for three groups: those who did nothing special to prepare for
the test (Control), those who prepared by using a set of print materials designed to improve SAT
scores (Print), and those who used a computer-assisted set of materials designed to
improve the scores (Computer):
Control    Print    Computer
4          5         7
2          7         9
2          7        10
3          8        10
5          5        11
Conduct an ANOVA analysis of these data and display your results in an ANOVA summary
table. Interpret all components of the ANOVA table.
SPSS menu path: Analyze > Compare Means > One-Way ANOVA > Options
(data in file called anova1.sav)
ONEWAY
hours BY agegrp
/STATISTICS DESCRIPTIVES HOMOGENEITY
/PLOT MEANS
/MISSING ANALYSIS .
Oneway

Descriptives: HOURS
(95% CI = 95% Confidence Interval for Mean)
               N     Mean      Std. Deviation    Std. Error    95% CI Lower Bound    95% CI Upper Bound    Minimum    Maximum
young           6    8.0000    3.3466            1.3663        4.4879                11.5121               3.00       12.00
middle-aged     6    4.0000    2.7568            1.1255        1.1069                 6.8931                .00        8.00
old             6    2.0000    1.4142             .5774         .5159                 3.4841                .00        4.00
Total          18    4.6667    3.5645             .8402        2.8941                 6.4393                .00       12.00
ANOVA: HOURS
                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    112.000            2    56.000         8.077    .004
Within Groups     104.000           15     6.933
Total             216.000           17
Means Plots
[Figure: plot of Mean of HOURS (vertical axis, scale 1 to 9) by AGEGRP (horizontal axis: young, middle-aged, old)]
Relationship of t to F
The t test and ANOVA comparing two groups yield identical results. Indeed, the square root of the
F statistic with 1 and x degrees of freedom equals the absolute value of the equal-variances t
statistic with x degrees of freedom for the same set of data, and the two p values are the same.
Take the example of the difference in hours of exercise per week between the young and middle-aged
groups only (2-group comparison).
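The identity can also be checked directly from the raw data. The sketch below is an illustration with invented function names: it computes the equal-variances (pooled) t statistic and the two-group ANOVA F ratio for the young and middle-aged rows of the Example table.

```python
from math import sqrt

young = [11, 12, 6, 7, 3, 9]
middle_aged = [0, 8, 4, 2, 5, 5]

def mean(values):
    return sum(values) / len(values)

def sum_sq(values):
    """Sum of squared deviations from the group's own mean."""
    m = mean(values)
    return sum((y - m) ** 2 for y in values)

def pooled_t(a, b):
    """Independent-samples t statistic assuming equal variances."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_sq(a) + sum_sq(b)) / df
    std_error = sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / std_error

def two_group_f(a, b):
    """One-way ANOVA F ratio for two groups (df = 1 and N - 2)."""
    grand_mean = mean(a + b)
    bss = (len(a) * (mean(a) - grand_mean) ** 2
           + len(b) * (mean(b) - grand_mean) ** 2)
    wss = sum_sq(a) + sum_sq(b)
    return (bss / 1) / (wss / (len(a) + len(b) - 2))

t_stat = pooled_t(young, middle_aged)
f_stat = two_group_f(young, middle_aged)
print(round(t_stat, 3), round(t_stat ** 2, 3), round(f_stat, 3))
```

Here t is about 2.260 on 10 degrees of freedom, and both t squared and F are about 5.106.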
T-Test

Group Statistics: HOURS
AGEGRP         N    Mean      Std. Deviation    Std. Error Mean
young          6    8.0000    3.3466            1.3663
middle-aged    6    4.0000    2.7568            1.1255
Independent Samples Test: HOURS

Levene's Test for Equality of Variances: F = .488, Sig. = .501

t-test for Equality of Means
(95% CI = 95% Confidence Interval of the Difference)
                               t        df       Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed        2.260    10       .047               4.0000             1.7701                   5.592E-02       7.9441
Equal variances not assumed    2.260    9.646    .048               4.0000             1.7701                   3.623E-02       7.9638
Oneway

Descriptives: HOURS
(95% CI = 95% Confidence Interval for Mean)
               N     Mean      Std. Deviation    Std. Error    95% CI Lower Bound    95% CI Upper Bound    Minimum    Maximum
young           6    8.0000    3.3466            1.3663        4.4879                11.5121               3.00       12.00
middle-aged     6    4.0000    2.7568            1.1255        1.1069                 6.8931                .00        8.00
Total          12    6.0000    3.5929            1.0372        3.7172                 8.2828                .00       12.00
ANOVA: HOURS
                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups     48.000            1    48.000         5.106    .047
Within Groups      94.000           10     9.400
Total             142.000           11
1998 GSS:
Test the hypothesis that marital status (MARITAL) is related to hours per day spent
watching TV (TVHOURS) by doing an ANOVA. Set $\alpha = 0.01$. Interpret the results.
MARITAL: What is your current marital status?
  1   married
  2   widowed
  3   divorced
  4   separated
  5   never married
  9   NA
TVHOURS: Hours per day watching TV?
  98  DK
  99  NA
Remember to make any necessary recodes – treat all NA/DK responses as missing.
Test the hypothesis that marital status (MARITAL) is related to happiness (HAPPY) by
doing an ANOVA. Set $\alpha = 0.01$. Interpret the results.
HAPPY: General happiness?
  0   NAP
  1   Very happy
  2   Pretty happy
  3   Not too happy
  8   DK
  9   NA
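The recoding step can be sketched as follows. This is only an illustration of the logic, not the handout's method: the function names are invented, the three respondent records are hypothetical (not GSS data), and treating HAPPY = 0 (NAP) as missing is an assumption beyond the stated NA/DK rule.

```python
# Missing-value codes copied from the codebook excerpts in this handout.
# Treating HAPPY = 0 (NAP) as missing is an assumption.
MISSING_CODES = {
    "MARITAL": {9},        # 9 = NA
    "TVHOURS": {98, 99},   # 98 = DK, 99 = NA
    "HAPPY": {0, 8, 9},    # 0 = NAP, 8 = DK, 9 = NA
}

def recode_missing(record):
    """Return a copy of one respondent's record with missing codes set to None."""
    return {
        var: None if value in MISSING_CODES.get(var, set()) else value
        for var, value in record.items()
    }

def complete_cases(records, variables):
    """Keep recoded records that have valid values on every listed variable."""
    recoded = [recode_missing(r) for r in records]
    return [r for r in recoded if all(r[v] is not None for v in variables)]

# Hypothetical respondents, invented purely for illustration (not GSS data).
sample = [
    {"MARITAL": 1, "TVHOURS": 3, "HAPPY": 2},
    {"MARITAL": 5, "TVHOURS": 99, "HAPPY": 1},
    {"MARITAL": 9, "TVHOURS": 2, "HAPPY": 8},
]
print(complete_cases(sample, ["MARITAL", "TVHOURS"]))
```

Only respondents with valid values on both the grouping variable and the outcome should enter each ANOVA, so the case list is built separately for the TVHOURS and HAPPY analyses.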