Paul von Hippel - The University of Texas at Austin

Paul T. von Hippel
Department of Sociology & Initiative in Population Research
Ohio State University
300 Bricker Hall
190 N. Oval Mall
Columbus, OH 43210
von-hippel.1@osu.edu
614 688-3768
592 words
Difference of proportions. When people are divided into groups, different
proportions of each group will display certain traits, attitudes, or outcomes. For example,
different proportions of men and women vote for Democrats, different proportions of
Americans and Japanese suffer from breast cancer, and different proportions of mainline
and fundamentalist Protestants approve of abortion for unmarried women.
The difference between the abortion attitudes of mainline and fundamentalist
Protestants may be estimated using data from the US portion of the 1990 World Values
Survey, summarized in the following table.
Table 1.
              Mainline      Fundamentalist   Totals
Approve       172 (n11)     23 (n12)         195 (n1.)
Disapprove    404 (n21)     125 (n22)        529 (n2.)
Totals        576 (n.1)     148 (n.2)        724 (n..)
Note. To simplify the example, we treat the World Values Survey like a SIMPLE RANDOM
SAMPLE. The actual survey design was more complicated.
Paul von Hippel
Page 1
2/16/2016
Of n.1=576 mainline Protestants, n11=172 approved of abortion for unmarried women, a
proportion of p1=172/576=.2986. Of n.2=148 fundamentalist Protestants, n12=23
approved of abortion for unmarried women, a proportion of p2=23/148=.1554. Among
the surveyed Protestants, then, the difference of proportions was p1-p2=.1432: the
surveyed mainline Protestants were 14.32 percentage points more likely to approve of
abortion for unmarried women.
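The arithmetic above can be reproduced in a few lines. This is only a sketch; the variable names (n_col1, n_col2, diff) are illustrative choices, not notation from the entry.

```python
# Cell counts from Table 1.
n11, n12 = 172, 23     # approve: mainline, fundamentalist
n21, n22 = 404, 125    # disapprove: mainline, fundamentalist

# Column totals, i.e. the size of each group.
n_col1 = n11 + n21     # n.1, mainline Protestants
n_col2 = n12 + n22     # n.2, fundamentalist Protestants

# Sample proportions approving, and their difference.
p1 = n11 / n_col1
p2 = n12 / n_col2
diff = p1 - p2

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, p1 - p2 = {diff:.4f}")
# prints: p1 = 0.2986, p2 = 0.1554, p1 - p2 = 0.1432
```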
If we wish to generalize from this sample to the US Protestant population, we
need to know the SAMPLING DISTRIBUTION for the difference of proportions. In large
samples, the sampling distribution of p1-p2 is approximately a NORMAL DISTRIBUTION,
whose STANDARD DEVIATION is known as the STANDARD ERROR for the difference of
proportions.
Different standard errors are used for CONFIDENCE INTERVALS and HYPOTHESIS
TESTS.
For confidence intervals, the most common standard error formula is
$$ s_{p_1 - p_2} = \sqrt{\frac{p_1(1 - p_1)}{n_{.1}} + \frac{p_2(1 - p_2)}{n_{.2}}}, \quad \text{where } p_1 = \frac{n_{11}}{n_{.1}} \text{ and } p_2 = \frac{n_{12}}{n_{.2}}, $$
and the corresponding approximate confidence interval is
$$ p_1 - p_2 \pm z_{1-\alpha/2} \, s_{p_1 - p_2}, $$
where $z_{1-\alpha/2}$ is the $(1-\alpha/2) \times 100$th PERCENTILE of the STANDARD NORMAL DISTRIBUTION.
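A minimal sketch of this unadjusted interval, applied to the Table 1 counts, might look as follows. The function name wald_ci and its argument names are hypothetical; only the Python standard library is assumed.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(x1, n1, x2, n2, conf=0.95):
    """Unadjusted large-sample interval for p1 - p2.

    x1, x2 are the counts with the trait; n1, n2 are the group sizes.
    Returns (difference, standard error, lower limit, upper limit).
    """
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # (1 - alpha/2) x 100th percentile of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p1 - p2
    return diff, se, diff - z * se, diff + z * se


# Mainline: 172 of 576 approve; fundamentalist: 23 of 148 approve.
diff, se, lower, upper = wald_ci(172, 576, 23, 148)
print(f"difference = {diff:.4f}, standard error = {se:.4f}")
print(f"95% interval: {lower:.4f} to {upper:.4f}")
```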
Although popular, this confidence interval has poor coverage, especially when the
table counts nij are low. Better coverage is obtained by adding 1 to each nij and
2 to each n.j (Agresti & Caffo, 2000). The improved standard error is
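$$ \tilde{s}_{p_1 - p_2} = \sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_{.1} + 2} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_{.2} + 2}}, \quad \text{where } \tilde{p}_1 = \frac{n_{11} + 1}{n_{.1} + 2} \text{ and } \tilde{p}_2 = \frac{n_{12} + 1}{n_{.2} + 2}, $$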
and the corresponding approximate confidence interval is
$$ \tilde{p}_1 - \tilde{p}_2 \pm z_{1-\alpha/2} \, \tilde{s}_{p_1 - p_2}. $$
For the Protestant data, $\tilde{p}_1 - \tilde{p}_2 = .1393$, $\tilde{s}_{p_1 - p_2} = .0355$, and $z_{.975} = 1.96$, so a 95%
confidence interval runs from .0698 to .2088. In other words, the population of mainline
Protestants is probably between 6.98 and 20.88 percentage points more likely to approve
of abortion for unmarried women. (Sometimes the confidence limits calculated from this
formula fall below -1 or above 1, the logical bounds for a difference of proportions. On
such occasions, it is customary to adjust the confidence limits inward.)
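The adjusted interval can be sketched in the same way. The function name agresti_caffo_ci is hypothetical, and clipping the limits to the range from -1 to 1 is one assumed reading of "adjust the confidence limits inward".

```python
from math import sqrt
from statistics import NormalDist


def agresti_caffo_ci(x1, n1, x2, n2, conf=0.95):
    """Adjusted interval: add 1 to each count and 2 to each group size.

    Returns (adjusted difference, adjusted standard error, lower, upper).
    """
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    se = sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p1 - p2
    # Assumption: "adjusting inward" is implemented as clipping to [-1, 1],
    # the logical range of a difference between two proportions.
    lower = max(-1.0, diff - z * se)
    upper = min(1.0, diff + z * se)
    return diff, se, lower, upper


diff, se, lower, upper = agresti_caffo_ci(172, 576, 23, 148)
print(f"difference = {diff:.4f}, standard error = {se:.4f}")
print(f"95% interval: {lower:.4f} to {upper:.4f}")
# prints: difference = 0.1393, standard error = 0.0355
# prints: 95% interval: 0.0698 to 0.2088
```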
If we wish to test the NULL HYPOTHESIS of equal proportions, we use a slightly
different standard error. Under the null hypothesis, we regard the surveyed Protestants as
a single sample of n..=724 persons of whom n1.=195 approve of abortion for unmarried
women, a proportion of p=195/724=.2693. In this setting, an appropriate standard error
formula is
$$ \hat{s}_{p_1 - p_2} = \sqrt{p(1 - p)\left(\frac{1}{n_{.1}} + \frac{1}{n_{.2}}\right)}, \quad \text{where } p = \frac{n_{1.}}{n_{..}}. $$
For the Protestant data, $\hat{s}_{p_1 - p_2} = .0409$. An appropriate test statistic is
$$ z = \frac{p_1 - p_2}{\hat{s}_{p_1 - p_2}}, $$
which, under the null hypothesis, follows an approximate standard normal distribution.
For the Protestant data, z=.1432/.0409=3.503, which has a 2-TAILED P VALUE of .00046.
So we reject the null hypothesis that mainline and fundamentalist Protestants are equally
likely to approve of abortion for unmarried women.
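The test can be sketched as follows. The function name two_proportion_z_test is hypothetical; the pooled proportion p and the standard error follow the formulas above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """z test of equal proportions using the pooled standard error.

    Returns (z statistic, pooled standard error, two-tailed p value).
    """
    p1, p2 = x1 / n1, x2 / n2
    # Under the null hypothesis both groups share one proportion p.
    p = (x1 + x2) / (n1 + n2)
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, se, p_value


z, se, p_value = two_proportion_z_test(172, 576, 23, 148)
print(f"standard error = {se:.4f}, z = {z:.3f}, p = {p_value:.5f}")
# prints: standard error = 0.0409, z = 3.503, p = 0.00046
```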
When squared, this z statistic becomes the CHI-SQUARE ($\chi^2$) statistic used to test
for association in a 2×2 contingency table. For the Protestant data $\chi^2 = z^2 = 12.27$, which
like the z statistic has a p value of .00046.
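As a check on this equivalence, the chi-square statistic can be computed directly from observed and expected counts. The sketch below assumes Pearson's statistic without a continuity correction; the function name is hypothetical. With 1 degree of freedom, the p value equals erfc(sqrt(chi-square / 2)), which is the two-tailed normal p value of z.

```python
from math import erfc, sqrt


def pearson_chi_square_2x2(n11, n12, n21, n22):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Returns (chi-square statistic, p value on 1 degree of freedom).
    """
    observed = ((n11, n12), (n21, n22))
    rows = (n11 + n12, n21 + n22)
    cols = (n11 + n21, n12 + n22)
    n = sum(rows)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if rows and columns were independent.
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = pearson_chi_square_2x2(172, 23, 404, 125)
print(f"chi-square = {chi2:.2f}, p = {p_value:.5f}")
# prints: chi-square = 12.27, p = 0.00046
```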
The z and $\chi^2$ tests are considered reasonable approximations when all table counts
nij are at least 5 or so. For tables with lower counts, z and $\chi^2$ should be avoided in favor of
Fisher’s exact test (Hollander & Wolfe, 1999).
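For reference, a two-sided Fisher's exact test can be sketched with the hypergeometric distribution. The entry does not specify how the two-sided p value is defined; the sketch assumes one common convention, summing the probabilities of all tables (with the observed margins) that are no more probable than the observed table. The function name and the small tolerance used for ties are illustrative assumptions.

```python
from math import comb


def fisher_exact_two_sided(n11, n12, n21, n22):
    """Two-sided Fisher's exact test for a 2x2 table.

    Assumed convention: sum the hypergeometric probabilities of every
    table with the same margins that is no more likely than the one observed.
    """
    r1, r2 = n11 + n12, n21 + n22   # row totals
    c1 = n11 + n21                  # first column total
    n = r1 + r2
    denom = comb(n, c1)

    def prob(k):
        # Probability that the upper-left cell equals k, margins fixed.
        return comb(r1, k) * comb(r2, c1 - k) / denom

    k_min = max(0, c1 - r2)
    k_max = min(r1, c1)
    p_obs = prob(n11)
    # Small relative tolerance so ties with the observed table are included.
    total = sum(pk for pk in map(prob, range(k_min, k_max + 1))
                if pk <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Table 1 has large counts, so the exact test is not required here;
# it is run only to illustrate the call.
print(f"exact two-sided p = {fisher_exact_two_sided(172, 23, 404, 125):.5f}")
```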
PAUL T. VON HIPPEL
References
Agresti, Alan, & Caffo, Brian. (2000). Simple and effective confidence intervals
for proportions and differences of proportions result from adding two successes and two
failures. The American Statistician, 54(4), 280-288.
Agresti, Alan, & Finlay, Barbara. (1997). Statistical methods for the social
sciences. Upper Saddle River, NJ: Prentice Hall.
Hollander, Myles, & Wolfe, Douglas A. (1999). Nonparametric statistical
methods (2nd ed.). New York: Wiley.