Briand CSBS 320 Notes Caldwell Ch.10 - 1 Ch.10 Analysis of Variance IN-CLASS READING ASSIGNMENT 03/08/07
Definition
ANOVA: Analysis Of Variance (one-way ANOVA, i.e. one variable)
Extension of the difference of means test since it's based on a comparison of sample means
But it involves the comparison of different estimates of population variance -- hence the name ANOVA
ANOVA is appropriate for situations involving 3 or more samples and a variable measured at the
interval/ratio level of measurement.
e.g. educational psychologist wanting to know if students exposed to 3 different treatment conditions or
learning environments (positive sanction, negative sanction, sanction neutral) exhibit different test scores. If
test scores are based on interval/ratio scale of measurement, ANOVA is appropriate
e.g. a geographer is interested in the growth rates of 4 types of cities -- manufacturing centers, government
centers, retail centers, and financial centers.
e.g. a market researcher wants to determine if there's a significant difference between the response rates to 5
different marketing campaigns.
e.g. a sociologist wants to determine if different types of school personnel (teachers, counselors, and
coaches) vary in their abilities to recognize risk factors for youth suicide.
e.g. we want to know if scores on an aptitude test actually vary for students in different types of schooling
environments -- home schooling, public schooling, and private schooling.
Research problems like the ones above could be approached by running t tests on all possible pairs of
sample means, but there are problems with that:
- in the e.g. with 4 types of cities, we would need to run 6 t tests
- the probability of a Type I error would be magnified. E.g. from Spatz p. 225 Ch. 10: Suppose you have 15
samples that all come from the same population (because there is just one population, the null
hypothesis is clearly true). These 15 sample means will vary from one another as a result of chance
factors. Now suppose you calculate every possible t test (all 105 of them), retaining or rejecting
each null hypothesis at the .05 level. How many times would you reject the null hypothesis? The
answer is about 5. When the null hypothesis is true (as in this example) and α is .05, 100 t tests will
produce about 5 Type I errors. Back to the reporting of the experiment. Suppose you conducted a 15
group experiment, ran 105 tests, and found five significant differences. If you then pulled out those 5
and said they were reliable differences (i.e. differences that are not due to chance), you don't
understand statistics. You can protect yourself from making such a mistake by using a statistical
technique that keeps the overall risk of Type I error at an acceptable level (such as .05 or .01).
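The counts in the two points above (6 tests for 4 groups, 105 tests for 15 groups, about 5 false rejections) can be checked with a minimal sketch. The function names are made up for illustration, and the last function ASSUMES the tests are independent, which pairwise t tests on shared samples are not, so read its result as a rough indication only.

```python
from math import comb


def pairwise_tests(k: int) -> int:
    """Number of distinct pairs among k sample means (k choose 2)."""
    return comb(k, 2)


def expected_false_rejections(k: int, alpha: float = 0.05) -> float:
    """Expected number of Type I errors when every pairwise null is true."""
    return pairwise_tests(k) * alpha


def familywise_error_if_independent(k: int, alpha: float = 0.05) -> float:
    """Chance of at least one Type I error, ASSUMING independent tests.

    Pairwise t tests reuse the same samples, so this is only illustrative.
    """
    return 1 - (1 - alpha) ** pairwise_tests(k)


for k in (4, 15):
    print(
        f"{k} groups: {pairwise_tests(k)} t tests, "
        f"about {expected_false_rejections(k):.2f} false rejections expected, "
        f"rough chance of at least one: {familywise_error_if_independent(k):.3f}"
    )
```

With 15 groups the expected number of false rejections is 105 x .05 = 5.25, which is the "about 5" in the Spatz example.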
Application we’ll be working with: We're interested in urban unemployment and whether or not the
unemployment levels in cities vary by region of the country. We've used a random sampling technique to
select cities in 4 different regions and we've recorded unemployment levels.
Table 10.1 Levels of unemployment by region

                         North    South    East     West
                         3.8      4.2      8.8      4.8
                         7.1      6.5      5.1      1.2
                         9.6      4.4      12.7     8.0
                         8.4      8.1      6.4      9.4
                         5.1      7.6      9.8      3.6
                         11.6     5.8      6.3      8.7
                         6.2      4.0      10.2     6.5
                         7.9      7.3      8.5
                         9.0      5.2      11.9
                         10.3     4.8      8.6

n                        10       10       10       7
Sum of X                 79       57.9     88.3     42.2
Mean = (Sum of X) / n    7.9      5.79     8.83     6.03
Each sample mean or group mean is simply the average of the unemployment levels in each group.
We can also compute an overall mean or grand mean based on all the data in all groups:
$$\bar{X}_{grand} = \frac{\sum X_{all}}{n_{total}} = \frac{\sum X_{North} + \sum X_{South} + \sum X_{East} + \sum X_{West}}{n_{North} + n_{South} + n_{East} + n_{West}}$$
$$= \frac{3.8 + 7.1 + \dots + 4.2 + 6.5 + \dots + 8.8 + 5.1 + \dots + 3.6 + 8.7 + \dots}{37} = \frac{79 + 57.9 + 88.3 + 42.2}{10 + 10 + 10 + 7} = 7.23$$
NOTE:
(1) The number of cities in each sample or group is not necessarily the same.
(2) Because each group has a different number of cases, you can’t simply take the average of the group
means to compute the grand mean:
$$\text{grand mean} \neq \frac{\bar{X}_{North} + \bar{X}_{South} + \bar{X}_{East} + \bar{X}_{West}}{K}$$
where K is the number of groups (in the example above K = 4)
(3) To compute the grand mean, you could compute a weighted average of the group means:
$$\text{grand mean} = \frac{n_{North}\bar{X}_{North} + n_{South}\bar{X}_{South} + n_{East}\bar{X}_{East} + n_{West}\bar{X}_{West}}{n_{North} + n_{South} + n_{East} + n_{West}}$$
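To see notes (2) and (3) in numbers, here is a minimal sketch using the Table 10.1 data. The variable names are illustrative only. It computes the grand mean directly from all 37 scores, then as a weighted average of the group means, and compares both to the plain average of the four group means.

```python
# Unemployment levels from Table 10.1
regions = {
    "North": [3.8, 7.1, 9.6, 8.4, 5.1, 11.6, 6.2, 7.9, 9.0, 10.3],
    "South": [4.2, 6.5, 4.4, 8.1, 7.6, 5.8, 4.0, 7.3, 5.2, 4.8],
    "East": [8.8, 5.1, 12.7, 6.4, 9.8, 6.3, 10.2, 8.5, 11.9, 8.6],
    "West": [4.8, 1.2, 8.0, 9.4, 3.6, 8.7, 6.5],
}

n = {r: len(x) for r, x in regions.items()}
mean = {r: sum(x) / len(x) for r, x in regions.items()}
n_total = sum(n.values())

# Grand mean straight from all scores
grand_direct = sum(sum(x) for x in regions.values()) / n_total
# Grand mean as a weighted average of the group means, as in note (3)
grand_weighted = sum(n[r] * mean[r] for r in regions) / n_total
# Plain average of the group means, as in note (2); ignores the group sizes
naive_average = sum(mean.values()) / len(regions)

print(round(grand_direct, 2), round(grand_weighted, 2), round(naive_average, 2))
# prints: 7.23 7.23 7.14
```

The plain average comes out lower (7.14) because it gives the 7 West cities as much weight as each group of 10.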
The null hypothesis
H0 simply states that the means of the regions are equal.
H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
Logic of ANOVA
We want to look at the variation of scores within each group, i.e. how far away scores from each group
deviate from their group mean. We want to compare this variation of scores within groups to the variation of
the group means or group mean scores from their own mean – the grand mean.
If there is more variation between groups than within groups, then there’s support for the assertion that
unemployment levels in cities vary by region.
To compare the variation of scores within groups to the variation of scores between groups, we compute an
F-statistic, which is the ratio of an estimate of between-groups variance over an estimate of within-groups
variance.
$$F\text{-ratio} = \frac{\text{estimate of between-groups variance}}{\text{estimate of within-groups variance}}$$
If there is more variation between groups than within groups, our F-ratio or F-statistic is large and that gives
us grounds to reject H0.
How large does our F-ratio need to be for us to reject H0?
We qualify our F-ratio as being large by comparing it to a critical F-ratio, FC (sound familiar?). If F-statistic ≥
FC, reject H0. If F-statistic < FC, fail to reject H0.
NOTE: F stands for Fisher, after Sir Ronald A. Fisher (1890-1962) who invented ANOVA. Fisher wrote the
book on statistics: Statistical Methods for Research Workers first published in 1925.
The F-ratio
If you recall the definition of variance for a sample, $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$, you’ll see the similarity between $s^2$ and the
estimates of between-groups variance and within-groups variance used in computing the F-ratio.
NOTE: the sum of squared deviations, $\sum (X - \bar{X})^2$, will be referred to, from now on, as the sum of squares.
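As a quick illustration of the sample variance formula, here is a minimal sketch on the North sample from Table 10.1. It computes the sum of squares by hand, divides by n - 1, and checks the result against Python's built-in `statistics.variance`, which uses the same n - 1 definition. The variable names are illustrative.

```python
from statistics import variance

# North sample from Table 10.1
north = [3.8, 7.1, 9.6, 8.4, 5.1, 11.6, 6.2, 7.9, 9.0, 10.3]

mean = sum(north) / len(north)
sum_of_squares = sum((x - mean) ** 2 for x in north)  # sum of squared deviations
s2 = sum_of_squares / (len(north) - 1)                # sample variance: divide by n - 1

print(f"sum of squares = {sum_of_squares:.2f}, s^2 = {s2:.2f}")
print("matches statistics.variance:", abs(s2 - variance(north)) < 1e-9)
```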
$$F = \frac{\text{estimate of between-groups variance}}{\text{estimate of within-groups variance}} = \frac{MS_B}{MS_W}$$
where:
MSB is the mean square between (or estimate of the between-groups variance)
$$MS_B = \frac{SS_B}{df_B}$$
where: SSB is the between-groups sum of squares
$$SS_B = n_1(\bar{X}_1 - \bar{X}_{grand})^2 + n_2(\bar{X}_2 - \bar{X}_{grand})^2 + \dots + n_k(\bar{X}_k - \bar{X}_{grand})^2$$
dfB are the between-groups degrees of freedom
dfB = K – 1, where K is the number of groups or samples
MSW is the mean square within (or estimate of the within-groups variance)
$$MS_W = \frac{SS_W}{df_W}$$
where: SSW is the within-groups sum of squares
$$SS_W = \sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2 + \dots + \sum (X_k - \bar{X}_k)^2$$
dfW are the within-groups degrees of freedom
dfW = ntotal – K, where ntotal is the total number of cases across samples
or, equivalently, in our example:
dfW = (nNorth - 1) + (nSouth - 1) + (nEast - 1) + (nWest - 1)
Calculating the F-ratio
Table 10.1 Levels of unemployment by region, with squared deviations from each group mean

North (mean 7.9)       South (mean 5.79)      East (mean 8.83)       West (mean 6.03)
X      (X - mean)^2    X      (X - mean)^2    X      (X - mean)^2    X      (X - mean)^2
3.8    16.81           4.2    2.53            8.8    0.00            4.8    1.51
7.1    0.64            6.5    0.50            5.1    13.91           1.2    23.32
9.6    2.89            4.4    1.93            12.7   14.98           8.0    3.89
8.4    0.25            8.1    5.34            6.4    5.90            9.4    11.37
5.1    7.84            7.6    3.28            9.8    0.94            3.6    5.90
11.6   13.69           5.8    0.00            6.3    6.40            8.7    7.14
6.2    2.89            4.0    3.20            10.2   1.88            6.5    0.22
7.9    0.00            7.3    2.28            8.5    0.11
9.0    1.21            5.2    0.35            11.9   9.42
10.3   5.76            4.8    0.98            8.6    0.05

                         North    South    East     West
Sum of X                 79       57.9     88.3     42.2
Sum of (X - mean)^2      51.98    20.39    53.60    53.33
n                        10       10       10       7
Mean = (Sum of X) / n    7.9      5.79     8.83     6.03

$\bar{X}_{grand} = 7.23$
$$SS_B = n_{North}(\bar{X}_{North} - \bar{X}_{grand})^2 + n_{South}(\bar{X}_{South} - \bar{X}_{grand})^2 + n_{East}(\bar{X}_{East} - \bar{X}_{grand})^2 + n_{West}(\bar{X}_{West} - \bar{X}_{grand})^2$$
$$= 10(7.9 - 7.23)^2 + 10(5.79 - 7.23)^2 + 10(8.83 - 7.23)^2 + 7(6.03 - 7.23)^2$$
$$= 4.49 + 20.74 + 25.60 + 10.08 = 60.91$$
dfB = K – 1 = 4 – 1 = 3
=> $$MS_B = \frac{SS_B}{df_B} = \frac{60.91}{3} = 20.30$$
$$SS_W = \sum (X_{North} - \bar{X}_{North})^2 + \sum (X_{South} - \bar{X}_{South})^2 + \sum (X_{East} - \bar{X}_{East})^2 + \sum (X_{West} - \bar{X}_{West})^2$$
$$= 51.98 + 20.39 + 53.60 + 53.33 = 179.30$$
dfW = ntotal – K = 37 – 4 = 33
=> $$MS_W = \frac{SS_W}{df_W} = \frac{179.30}{33} = 5.43$$
=> $$F = \frac{MS_B}{MS_W} = \frac{20.30}{5.43} = 3.74$$
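The hand calculation can be cross-checked with a minimal sketch in plain Python using the Table 10.1 data. The variable names are illustrative. The script carries unrounded means throughout, so SSB and MSB may differ from the hand-computed figures in the second decimal, but F still rounds to 3.74.

```python
# Unemployment levels from Table 10.1
groups = {
    "North": [3.8, 7.1, 9.6, 8.4, 5.1, 11.6, 6.2, 7.9, 9.0, 10.3],
    "South": [4.2, 6.5, 4.4, 8.1, 7.6, 5.8, 4.0, 7.3, 5.2, 4.8],
    "East": [8.8, 5.1, 12.7, 6.4, 9.8, 6.3, 10.2, 8.5, 11.9, 8.6],
    "West": [4.8, 1.2, 8.0, 9.4, 3.6, 8.7, 6.5],
}


def mean(xs):
    return sum(xs) / len(xs)


all_values = [x for xs in groups.values() for x in xs]
grand_mean = mean(all_values)
k, n_total = len(groups), len(all_values)

# Between-groups sum of squares: squared distance of each group mean
# from the grand mean, weighted by the group size
ss_b = sum(len(xs) * (mean(xs) - grand_mean) ** 2 for xs in groups.values())
# Within-groups sum of squares: squared distance of each score
# from its own group mean
ss_w = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)

df_b, df_w = k - 1, n_total - k
ms_b, ms_w = ss_b / df_b, ss_w / df_w
f_ratio = ms_b / ms_w

print(f"SS_B = {ss_b:.2f}, df_B = {df_b}, MS_B = {ms_b:.2f}")
print(f"SS_W = {ss_w:.2f}, df_W = {df_w}, MS_W = {ms_w:.2f}")
print(f"F = {f_ratio:.2f}")
```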
Comparing F to FC, and interpreting results
F= 3.74
Look for FC in Appendix D or Appendix E, p. 305 and 306 of your textbook.
Let α = 5%
dfB = 3 and dfW = 33
Since the table has no row for dfW = 33, choose dfW = 30 (the closest lower degrees of freedom available; this
makes the test slightly more conservative).
=> FC = 2.92
F>FC, reject H0: Our result suggests that levels of unemployment in cities do vary by region.
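The table lookup and the decision rule can be sketched as follows. ASSUMPTION: the dictionary is an excerpt typed from a standard F table for α = .05 and dfB = 3. Only the dfW = 30 entry (2.92) appears in these notes, so check the other rows against your textbook's appendix. The function name `critical_f` is made up for illustration.

```python
# Critical F values for alpha = .05 and df_B = 3, keyed by df_W.
# ASSUMPTION: excerpt of a standard F table; only the df_W = 30 row is
# confirmed by these notes.
F_CRIT_05_DFB3 = {20: 3.10, 24: 3.01, 30: 2.92, 40: 2.84, 60: 2.76, 120: 2.68}


def critical_f(df_w, table):
    """Return the critical value at the largest tabulated df_W not exceeding df_w."""
    usable = [df for df in table if df <= df_w]
    if not usable:
        raise ValueError("df_W is below the smallest tabulated row")
    return table[max(usable)]


f_stat = 3.74                            # F-ratio from the calculation above
f_c = critical_f(33, F_CRIT_05_DFB3)     # no row for 33, so the row for 30 is used
decision = "reject H0" if f_stat >= f_c else "fail to reject H0"
print(f"F = {f_stat}, FC = {f_c}: {decision}")
```

Falling back to the next lower row gives a slightly larger critical value, so the lookup errs on the side of not rejecting H0.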
NOTE: ANOVA allows us to determine whether or not there's a significant difference across groups or
samples, but it doesn't tell us whether there is a difference between each one of those groups.
Ch.10 Analysis of Variance IN-CLASS READING ASSIGNMENT 03/09/07
As we pointed out earlier, ANOVA allows us to determine whether or not there's a significant difference
across groups or samples, but it doesn't tell us whether there is a difference between each one of those
groups.
In our urban unemployment example, our null hypothesis was that the means of the unemployment levels in
cities were equal across regions.
Our implicit alternative hypothesis was that at least one of the regions had a different unemployment level
than the others.
When we rejected H0, we were able to conclude that the levels of unemployment in cities do vary by region.
But we were not able to tell which region had a different unemployment level than the others, nor were we
able to tell whether one or more regions had a different unemployment level than the others.
Tukey’s Honestly Significant Difference (HSD)
The Tukey’s Honestly Significant Difference test, henceforth called HSD, is a procedure that allows us to
determine between which regions the levels of unemployment differ.
NOTE: The HSD test is used ONLY after significant results are found. In other words, it is used only if H0
was rejected; if H0 wasn’t rejected, no additional test is needed.
How does the HSD test work?
The HSD test is equivalent to doing successive difference of means tests. We compare two sample means at
a time, and compute a Q statistic for each comparison or for each pair.
In our example of regional unemployment, we have four sample means, and thus we’ll have 6 pairwise
comparisons:
North-South
North-West
North-East
South-West
South-East
West-East
We compare each Q statistic to a critical Q value, QC, to determine whether or not sample means are
significantly different for each pair. If Q-statistic ≥ QC, the two means are significantly different. If
Q-statistic < QC, the two means are not significantly different.
Q-statistics
(1) When all sample sizes are equal:
$$Q = \frac{|\bar{X}_1 - \bar{X}_2|}{\sqrt{MS_W / n}}$$
where
$\bar{X}_1$ and $\bar{X}_2$ are any two means
n: # of cases in each sample
(2) When sample sizes are unequal:
$$Q = \frac{|\bar{X}_1 - \bar{X}_2|}{\sqrt{MS_W / \tilde{n}}}$$
where
$\bar{X}_1$ and $\bar{X}_2$ are any two means
$\tilde{n}$: harmonic mean of the sample sizes
$$\tilde{n} = \frac{K}{\frac{1}{n_1} + \frac{1}{n_2} + \dots + \frac{1}{n_k}}$$
K: number of samples or groups
Calculating the Q-statistics
Since our regional unemployment example involves unequal sample sizes, we’ll use equation (2) above.
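Denominator of Q statistics:
$$\tilde{n} = \frac{K}{\frac{1}{n_{North}} + \frac{1}{n_{South}} + \frac{1}{n_{East}} + \frac{1}{n_{West}}} = \frac{4}{\frac{1}{10} + \frac{1}{10} + \frac{1}{10} + \frac{1}{7}} = \frac{4}{.10 + .10 + .10 + .14} = \frac{4}{.44} = 9.09$$
$MS_W = 5.43$ (previous result)
=> $$\sqrt{\frac{MS_W}{\tilde{n}}} = \sqrt{\frac{5.43}{9.09}} = \sqrt{0.60} = 0.77$$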
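The denominator can be checked with a minimal sketch. It computes the harmonic mean of the four sample sizes from the formula and again with the standard library's `statistics.harmonic_mean`. The variable names are illustrative. Carrying full precision gives a harmonic mean of about 9.03 rather than 9.09 (the hand calculation rounds 1/7 to .14), and the denominator moves only in the third decimal (about 0.775 versus about 0.773).

```python
from math import sqrt
from statistics import harmonic_mean

sizes = {"North": 10, "South": 10, "East": 10, "West": 7}
ms_w = 5.43  # mean square within, from the F-ratio calculation

k = len(sizes)
# Harmonic mean written out: K / (1/n1 + 1/n2 + ... + 1/nk)
n_tilde_formula = k / sum(1 / n for n in sizes.values())
# Same quantity from the standard library
n_tilde = harmonic_mean(list(sizes.values()))

denominator = sqrt(ms_w / n_tilde)

print(f"harmonic mean = {n_tilde:.2f}")
print(f"sqrt(MS_W / harmonic mean) = {denominator:.3f}")
```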
Q statistics:
North-South comparison: $$Q = \frac{|\bar{X}_{North} - \bar{X}_{South}|}{\sqrt{MS_W / \tilde{n}}} = \frac{|7.9 - 5.79|}{0.77} = \frac{2.11}{0.77} = 2.74$$
North-West comparison: $$Q = \frac{|\bar{X}_{North} - \bar{X}_{West}|}{\sqrt{MS_W / \tilde{n}}} = \frac{|7.9 - 6.03|}{0.77} = \frac{1.87}{0.77} = 2.43$$
North-East comparison: $$Q = \frac{|\bar{X}_{North} - \bar{X}_{East}|}{\sqrt{MS_W / \tilde{n}}} = \frac{|7.9 - 8.83|}{0.77} = \frac{0.93}{0.77} = 1.21$$
South-West comparison: $$Q = \frac{|\bar{X}_{South} - \bar{X}_{West}|}{\sqrt{MS_W / \tilde{n}}} = \frac{|5.79 - 6.03|}{0.77} = \frac{0.24}{0.77} = 0.31$$
South-East comparison: $$Q = \frac{|\bar{X}_{South} - \bar{X}_{East}|}{\sqrt{MS_W / \tilde{n}}} = \frac{|5.79 - 8.83|}{0.77} = \frac{3.04}{0.77} = 3.95$$
West-East comparison: $$Q = \frac{|\bar{X}_{West} - \bar{X}_{East}|}{\sqrt{MS_W / \tilde{n}}} = \frac{|6.03 - 8.83|}{0.77} = \frac{2.80}{0.77} = 3.64$$
Comparing Q statistics to QC, and interpreting results
Look for QC in Appendix F or Appendix G, p. 307 and 308 of your textbook.
Let α = 5%
K = 4 and dfW = 33 (previous result)
Since the table has no row for dfW = 33, choose dfW = 30 (the closest lower degrees of freedom available).
=> QC = 3.85
North-South comparison:   Q = 2.74 < QC   =>   the means are not significantly different
North-West comparison:    Q = 2.43 < QC   =>   the means are not significantly different
North-East comparison:    Q = 1.21 < QC   =>   the means are not significantly different
South-West comparison:    Q = 0.31 < QC   =>   the means are not significantly different
South-East comparison:    Q = 3.95 > QC   =>   the means are significantly different
West-East comparison:     Q = 3.64 < QC   =>   the means are not significantly different
Only the South and East regions differ significantly in their unemployment levels.
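The six comparisons can be reproduced with a minimal sketch. It takes the group means from Table 10.1 and the rounded values used in these notes (0.77 for the denominator and 3.85 for QC) as given. The variable names are illustrative.

```python
from itertools import combinations

# Group means from Table 10.1
means = {"North": 7.90, "South": 5.79, "West": 6.03, "East": 8.83}
denominator = 0.77  # sqrt(MS_W / harmonic mean of n), as used in these notes
q_critical = 3.85   # alpha = .05, K = 4, df_W = 30, as read from the textbook table

results = {}
for a, b in combinations(means, 2):
    q = abs(means[a] - means[b]) / denominator
    # Store the rounded Q statistic and whether it reaches the critical value
    results[(a, b)] = (round(q, 2), q >= q_critical)

for (a, b), (q, significant) in results.items():
    verdict = "significantly different" if significant else "not significantly different"
    print(f"{a}-{b}: Q = {q:.2f} => {verdict}")
```

Only the South-East pair, which has the largest gap between means (3.04), reaches the critical value.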
ANNOUNCEMENT:
(1) For practice problems, use end of the chapter ones
(2) Monday, March 12: short-exam on Ch.10 OPEN BOOK
(3) For any question, please email me at gbriand@ewu.edu
(4) HAVE A GOOD WEEKEND!