Determining the Sample Size Necessary for a Desired Margin of Error

8 - Determining the Sample Size Necessary to Obtain a Desired Margin
of Error When Estimating μ and p with a Confidence Interval
Population Mean (μ)
In the handout 7 – Sampling Distributions we found that the interval
$$\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \quad \text{up to} \quad \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}$$
had a 95% chance of covering the population mean provided we “knew” the population
standard deviation σ. The margin of error for this interval is
$$\text{Margin of Error} = 1.96\,\frac{\sigma}{\sqrt{n}}$$
If we wanted our margin of error to be at most E units what sample size should we use?
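The algebra behind this question is short. As a sketch, require the margin of error to be at most E and solve for n (all quantities are positive, so squaring preserves the inequality):

```latex
\begin{align*}
1.96\,\frac{\sigma}{\sqrt{n}} &\le E \\
\sqrt{n} &\ge \frac{1.96\,\sigma}{E} \\
n &\ge \left(\frac{1.96\,\sigma}{E}\right)^{2}
\end{align*}
```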
This says that to obtain a 95% CI for μ with a margin of error no larger than E we should
use a sample size of
$$n = \left(\frac{1.96\,\sigma}{E}\right)^{2}$$
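As a rough illustration, the sample size formula for a mean can be wrapped in a small Python helper. The function name `sample_size_mean` and the inputs (a guessed standard deviation of 10 and E = 2) are hypothetical, chosen only to show the arithmetic. The result is rounded up because a sample size must be a whole number, and rounding down would let the margin of error exceed E.

```python
import math

def sample_size_mean(sigma, E, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= E (z = 1.96 for 95% confidence)."""
    if sigma <= 0 or E <= 0:
        raise ValueError("sigma and E must be positive")
    return math.ceil((z * sigma / E) ** 2)

# Hypothetical inputs, not taken from the handout: sigma guessed as 10, E = 2.
n = sample_size_mean(sigma=10, E=2)
print(n)  # (1.96 * 10 / 2)^2 = 96.04, rounded up to 97
```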
However, we cannot calculate this in practice unless we know σ, which of course we
don’t. Furthermore, we don’t even know s, the sample standard deviation, until we
have our data in hand. Thus, in order to use this result we need to plug in a “best guess”
for σ. This guess might come from:
• A pilot study where s = sample standard deviation is calculated
• Prior studies (literature reviews)
• An approximation based on the range, σ ≈ Range/4. Granted, we don’t know
the range until the data are collected, but we might be able to guess the
largest and smallest values we expect to see when we collect our data.
• In general, using a σ which is too large is better than using one that is too
small: a guess that is too large only produces a bigger sample than needed, while a
guess that is too small produces a margin of error larger than E.
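The range-based guess and the advice to err on the large side can be sketched together. Everything named here (`sigma_from_range`, `sample_size_mean`, and the guessed extremes 0 and 40) is a hypothetical illustration, not data from the handout.

```python
import math

def sigma_from_range(smallest, largest):
    """Rough guess for sigma: (largest - smallest) / 4."""
    return (largest - smallest) / 4

def sample_size_mean(sigma, E, z=1.96):
    """Sample size for a 95% CI for the mean with margin of error at most E."""
    return math.ceil((z * sigma / E) ** 2)

# Hypothetical guesses for the smallest and largest values we expect to see.
sigma_guess = sigma_from_range(smallest=0, largest=40)

n_guess = sample_size_mean(sigma_guess, E=2)          # sample size from the guess
n_larger = sample_size_mean(1.5 * sigma_guess, E=2)   # if sigma is guessed larger
n_smaller = sample_size_mean(0.5 * sigma_guess, E=2)  # if sigma is guessed smaller
print(n_smaller, n_guess, n_larger)
```

A larger guess for the standard deviation always gives a larger n, so overestimating costs extra observations but still meets the target margin of error.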
Example: What sample size would be necessary to estimate the mean cholesterol level
for the population of females between the ages of 30 – 40 with a 95% confidence interval
that has a margin of error no larger than 5 mg/dl?
Population Proportion (p)
In the handout 7 – Sampling Distributions we found that the interval
$$\hat{p} - 1.96\sqrt{\frac{p(1-p)}{n}} \quad \text{up to} \quad \hat{p} + 1.96\sqrt{\frac{p(1-p)}{n}}$$
had a 95% chance of covering the population proportion. The margin of error for this
interval is
$$\text{Margin of Error} = 1.96\sqrt{\frac{p(1-p)}{n}}$$
If we wanted this to be at most E units what sample size should we use?
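A sketch of the algebra: require the margin of error to be at most E, square both sides (both are positive), and solve for n:

```latex
\begin{align*}
1.96\sqrt{\frac{p(1-p)}{n}} &\le E \\
\frac{1.96^{2}\,p(1-p)}{n} &\le E^{2} \\
n &\ge \frac{1.96^{2}\,p(1-p)}{E^{2}}
\end{align*}
```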
This says that to obtain a 95% CI for p with a margin of error no larger than E we should
use a sample size of
$$n = \frac{1.96^{2}\,p(1-p)}{E^{2}}$$
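The sample size formula for a proportion can be sketched in Python in the same spirit. The name `sample_size_proportion` and the inputs (a guessed proportion of 0.30 and E = 0.05) are hypothetical and serve only to show the calculation. As with any sample size, the result is rounded up to a whole number.

```python
import math

def sample_size_proportion(p, E, z=1.96):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= E."""
    if not 0 < p < 1 or E <= 0:
        raise ValueError("need 0 < p < 1 and E > 0")
    return math.ceil(z ** 2 * p * (1 - p) / E ** 2)

# Hypothetical inputs, not from the handout: p guessed as 0.30, E = 0.05.
n = sample_size_proportion(p=0.30, E=0.05)
print(n)  # 1.96^2 * 0.30 * 0.70 / 0.05^2 = 322.69, rounded up to 323
```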


However, we cannot calculate this in practice unless we know p, which of course we
don’t. Furthermore, we don’t even know p̂, the sample proportion, until we have our
data in hand. In order to use this result we need to plug in a “best guess” for p. This
guess might come from:
• A pilot study where p̂ = sample proportion is calculated
• Prior studies
• The worst case scenario, noting that p(1 − p) ≤ .25 and is equal to
.25 when p = .50. Using p = .50 simplifies the formula to
$$n = \frac{1.96^{2}}{4E^{2}}$$
If you have no “best guess” for p, this conservative approach is the one
you should take.
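A quick numerical check of the worst case claim, as a sketch. The helper names and the margin of error E = 0.05 are hypothetical. The code scans a grid of candidate proportions, confirms that p(1 − p) peaks at p = .50, and confirms that the simplified formula matches the largest sample size over the grid.

```python
import math

def sample_size_proportion(p, E, z=1.96):
    """General formula: z^2 * p * (1 - p) / E^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / E ** 2)

def sample_size_worst_case(E, z=1.96):
    """Simplified formula obtained by setting p = 0.50."""
    return math.ceil(z ** 2 / (4 * E ** 2))

# p * (1 - p) over a grid of candidate values peaks at p = 0.50.
grid = [i / 100 for i in range(1, 100)]
p_worst = max(grid, key=lambda p: p * (1 - p))
print(p_worst, p_worst * (1 - p_worst))  # 0.5 0.25

E = 0.05  # hypothetical margin of error
sizes = [sample_size_proportion(p, E) for p in grid]
print(max(sizes), sample_size_worst_case(E))  # both 385
```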
Example: How many patients would need to be used to estimate the success rate of a
medical procedure, if researchers initially believe the success rate is no smaller than 85%
and wish to estimate the true success rate using a 95% confidence interval with a margin
of error no larger than E = .03?
What if they wish to assume nothing about the success rate initially?
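One way to work this example, sketched in Python with an illustrative function name. Because p(1 − p) shrinks as p moves away from .50, the largest value consistent with a success rate of at least 85% occurs at p = .85, so that is the value plugged in for the first question. The second question uses the worst case p = .50.

```python
import math

def sample_size_proportion(p, E, z=1.96):
    """Sample size for a 95% CI for p with margin of error at most E."""
    return math.ceil(z ** 2 * p * (1 - p) / E ** 2)

E = 0.03

# Success rate believed to be at least 85%: p = 0.85 is the conservative choice.
n_prior = sample_size_proportion(0.85, E)
print(n_prior)     # 1.96^2 * 0.85 * 0.15 / 0.03^2 = 544.23, rounded up to 545

# No assumption about the success rate: worst case p = 0.50.
n_no_prior = sample_size_proportion(0.50, E)
print(n_no_prior)  # 1.96^2 / (4 * 0.03^2) = 1067.11, rounded up to 1068
```

Dropping the prior belief roughly doubles the number of patients needed.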