Some additional information about pooling variances

The following illustrates what would happen if we just averaged the variances of two groups instead
of taking a weighted average, when sample sizes are unequal. (You won’t be tested on this. I’m just
posting this information in response to a student’s question).
Suppose you have:
$s_1^2 = 4$
$s_2^2 = 8$
$n_1 = 16$
$n_2 = 10$
As you can see, the sample sizes are unequal.
Suppose you took a “regular” (non-weighted) average of the two variances.
You’d get
(4 + 8) / 2 = 6.
Now, suppose we find the standard error. We use this average variance in the standard error
formula:
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{6}{16} + \frac{6}{10}} = \sqrt{.375 + .6} \approx .9874$$
So, if we take a “regular” average of the variances, we’d get a standard error of .9874.
If INSTEAD we get a weighted (pooled) average of the variances, we’d get this:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(16 - 1)4 + (10 - 1)8}{16 + 10 - 2}$$
$$s_p^2 = \frac{(15)4 + (9)8}{24} = \frac{60 + 72}{24} = 5.5$$
The first group has a bigger sample size, and this method gives more weight to that group’s
variance in the calculation. And, since the group with the bigger sample size had the smaller
variance, our weighted-average variance estimate is smaller (5.5) than it was when we took a
“regular” average variance estimate (6).
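The two ways of averaging can be checked with a few lines of Python. This is only a minimal sketch using the numbers from this example; the function names `simple_average_variance` and `pooled_variance` are made up for illustration and do not come from any library.

```python
# Values from the example in this handout
s1_sq, s2_sq = 4.0, 8.0   # sample variances of group 1 and group 2
n1, n2 = 16, 10           # sample sizes of group 1 and group 2


def simple_average_variance(v1, v2):
    """'Regular' (non-weighted) average: each variance counts equally."""
    return (v1 + v2) / 2


def pooled_variance(v1, n1, v2, n2):
    """Weighted average: each variance is weighted by its degrees of freedom (n - 1)."""
    return ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)


print(simple_average_variance(s1_sq, s2_sq))   # 6.0
print(pooled_variance(s1_sq, n1, s2_sq, n2))   # 5.5
```

If the two sample sizes were equal, both functions would return the same value, which is why the distinction only matters when sample sizes are unequal.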
Now, suppose we find the standard error again, this time using the pooled variance in the standard error
formula:
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{5.5}{16} + \frac{5.5}{10}} = \sqrt{.34375 + .55} \approx .9454$$
So, if we take a weighted (pooled) average of the variances, we’d get a standard error of .9454.
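The two standard errors can be reproduced with the short sketch below. It assumes, as in this example, that one variance estimate is plugged in for both groups; the helper name `standard_error` is just an illustrative choice.

```python
import math


def standard_error(variance, n1, n2):
    """Standard error of the difference between two means,
    using the same variance estimate for both groups."""
    return math.sqrt(variance / n1 + variance / n2)


n1, n2 = 16, 10  # sample sizes from the example

se_simple = standard_error(6.0, n1, n2)   # "regular" average variance
se_pooled = standard_error(5.5, n1, n2)   # pooled (weighted) variance

print(round(se_simple, 4))  # 0.9874
print(round(se_pooled, 4))  # 0.9454
```

The only thing that changes between the two calls is the variance estimate, so the difference in the standard errors comes entirely from how the variances were averaged.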
In general, as sample sizes get bigger, sample variances are better estimates of their population
variances. So, using this pooled (weighted) procedure allows us to give more weight to the better
variance estimates.
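One way to see the weighting directly (a rearrangement of the pooled formula, using the numbers from this example) is to write each variance multiplied by its group's share of the total degrees of freedom:

$$s_p^2 = \frac{n_1 - 1}{n_1 + n_2 - 2}\,s_1^2 + \frac{n_2 - 1}{n_1 + n_2 - 2}\,s_2^2 = \frac{15}{24}(4) + \frac{9}{24}(8) = 2.5 + 3 = 5.5$$

Here the first group's variance gets a weight of 15/24 = .625 and the second group's gets 9/24 = .375. A "regular" average would give each a weight of .5.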