Exact and Approximate Confidence Intervals for Proportions

ENV 260
In a population, a certain proportion p has a certain characteristic. To estimate p, a
random sample of size n is drawn. The proportion of the sample that has the
characteristic is p̂. We would like to calculate a confidence interval for the population
proportion from the sample.
There is no simple closed-form formula for an exact confidence interval. However,
computers can now calculate exact confidence intervals. In Minitab, simply use
Stat->Basic Statistics->1 Proportion.
There are a number of formulas for approximate confidence intervals for population
proportions. By far the most popular approximation is the Wald estimate. A very good,
but less well known, estimate is the Wilson estimate, which is what our book presents. I
will present both.
The Wald estimate gives an approximate confidence interval for a population proportion
of
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where p̂ is the sample proportion and z* is NORMINV(…), the standard normal critical
value for the chosen confidence level. For example, a 95% confidence interval is
$$\hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
The Wald estimate is sometimes called the normal distribution approximation method.
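As a rough illustration, the Wald formula can be sketched in Python using only the standard library. The function names `z_star` and `wald_interval` are made up for this sketch, and it assumes z* is the two-sided standard normal critical value, which agrees with the 1.96 used above for 95% confidence.

```python
from math import sqrt
from statistics import NormalDist


def z_star(confidence=0.95):
    """Two-sided standard normal critical value (assumed meaning of z*)."""
    # For 95% confidence this is the 97.5th percentile, about 1.96.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def wald_interval(y, n, confidence=0.95):
    """Approximate Wald interval for a proportion: y successes out of n."""
    p_hat = y / n
    margin = z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


print(round(z_star(0.95), 2))  # 1.96
```

Passing a different `confidence` changes only the multiplier; the rest of the formula stays the same.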
A more accurate approximate confidence interval is the Wilson estimate. Define p̃ to
be (y+2)/(n+4), where y is the number in the sample that possess the characteristic and n
is the sample size. Then
$$\tilde{p} \pm 1.96 \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}$$
is an approximate 95% confidence interval.
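A minimal Python sketch of this adjustment follows, assuming the fixed 1.96 multiplier for 95% confidence. The name `wilson_interval` and the counts in the demonstration are hypothetical. This add-two-successes, add-two-failures form is also known as the "plus four" interval, and software that labels a method "Wilson" may instead compute the Wilson score interval, which gives slightly different numbers.

```python
from math import sqrt


def wilson_interval(y, n, z=1.96):
    """Approximate 95% interval built from the adjusted proportion (y + 2) / (n + 4)."""
    p_tilde = (y + 2) / (n + 4)
    # The square root divides by n + 4, not by n.
    margin = z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin


# Hypothetical counts, chosen only to show the call: 30 of 100.
low, high = wilson_interval(30, 100)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```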
Example.
In a study of human blood types in nonhuman primates, a sample of 71 orangutans was
tested and 14 were found to be blood type B. (Erskine, A. G. and Socha, W. W. (1978)
The Principles and Practices of Blood Grouping. St. Louis: Mosby. p. 209.) Construct a
95% confidence interval for the relative frequency of blood type B in the orangutan
population.
Answer.
The exact confidence interval from Minitab is 11.2% to 30.9%.
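For readers without Minitab, the Python sketch below computes an exact interval under the assumption that "exact" means the Clopper-Pearson interval, found here by bisection on the binomial tail probabilities. The function names are hypothetical, and this handout does not say which algorithm Minitab uses, so agreement with the figures above should be treated as a check rather than a guarantee.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Root of a function that is positive at lo and negative at hi."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(y, n, confidence=0.95):
    """Clopper-Pearson interval (assumed meaning of 'exact')."""
    alpha = 1 - confidence
    # Lower limit: the p at which P(X >= y) equals alpha / 2.
    lower = 0.0 if y == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(y - 1, n, p)))
    # Upper limit: the p at which P(X <= y) equals alpha / 2.
    upper = 1.0 if y == n else _bisect(
        lambda p: binom_cdf(y, n, p) - alpha / 2)
    return lower, upper


lower, upper = exact_interval(14, 71)
print(f"{100 * lower:.1f}% to {100 * upper:.1f}%")
```

Bisection is used only to keep the sketch free of third-party libraries; a statistics package would normally obtain the same limits from beta distribution quantiles.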
Using the Wald estimate you would get
$$\frac{14}{71} \pm 1.96 \sqrt{\frac{\frac{14}{71}\left(1-\frac{14}{71}\right)}{71}}$$
This yields an interval from 10.5% to 29.0%.
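For readers checking the Wald arithmetic by hand, the intermediate values (rounded) are:

$$
\begin{aligned}
\hat{p} &= \frac{14}{71} \approx 0.1972 \\
\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} &= \sqrt{\frac{0.1972 \times 0.8028}{71}} \approx 0.0472 \\
1.96 \times 0.0472 &\approx 0.0925 \\
0.1972 \pm 0.0925 &\approx (0.105,\ 0.290)
\end{aligned}
$$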
Using the Wilson estimate, we first have to calculate p̃ = (y+2)/(n+4):
$$\tilde{p} = \frac{14+2}{71+4} = \frac{16}{75} \approx 21.3\%$$
The Wilson estimate would be
$$\frac{16}{75} \pm 1.96 \sqrt{\frac{\frac{16}{75}\left(1-\frac{16}{75}\right)}{75}}$$
This yields 12.1% to 30.6%. The Wilson estimate is usually a little more accurate than
the Wald estimate; here it lies closer to the exact interval of 11.2% to 30.9%. It is
not very popular, however!
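The Wilson figures for the orangutan sample can also be checked by hand. The intermediate values (rounded) are:

$$
\begin{aligned}
\tilde{p} &= \frac{16}{75} \approx 0.2133 \\
\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}} &= \sqrt{\frac{0.2133 \times 0.7867}{75}} \approx 0.0473 \\
1.96 \times 0.0473 &\approx 0.0927 \\
0.2133 \pm 0.0927 &\approx (0.121,\ 0.306)
\end{aligned}
$$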