8.2 Estimating a Population Proportion

Objectives
SWBAT:
• STATE and CHECK the Random, 10%, and Large Counts conditions for
constructing a confidence interval for a population proportion.
• DETERMINE critical values for calculating a C% confidence interval for a
population proportion using a table or technology.
• CONSTRUCT and INTERPRET a confidence interval for a population proportion.
• DETERMINE the sample size required to obtain a C% confidence interval for a
population proportion with a specified margin of error.
What are the three conditions for constructing a confidence interval for a
proportion? What happens if one of the conditions is violated?
Conditions for Constructing a Confidence Interval About a Proportion
• Random: The data come from a well-designed random sample or
randomized experiment.
o 10%: When sampling without replacement, check that $n \le \frac{1}{10}N$.
• Large Counts: Both $n\hat{p}$ and $n(1-\hat{p})$ are at least 10.
• If a condition is violated, then the actual capture rate of the intervals will differ from the
advertised capture rate (e.g., 95%).
• Why do we use p̂ instead of p in the Large Counts condition?
• We don’t have p. That is what we are trying to estimate.
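The two numerical checks above (10% and Large Counts) can be sketched in a few lines of Python. This is only an illustration: the function name `check_conditions` and the sample values are hypothetical, and the Random condition still has to be judged from how the data were collected.

```python
def check_conditions(successes, n, population_size=None):
    """Return (ten_percent_ok, large_counts_ok) for a one-sample z interval.

    population_size=None means the 10% condition does not apply
    (for example, when we are not sampling without replacement).
    """
    failures = n - successes
    # n * p_hat is the number of successes; n * (1 - p_hat) is the number of failures.
    large_counts_ok = successes >= 10 and failures >= 10
    # 10% condition: n <= N / 10, written as 10 * n <= N to avoid division.
    ten_percent_ok = population_size is None or 10 * n <= population_size
    return ten_percent_ok, large_counts_ok


# Illustrative values only (not from the slides).
print(check_conditions(30, 80, population_size=5000))  # both conditions met
print(check_conditions(4, 40))                         # Large Counts fails: only 4 successes
```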
1) What is the difference between the standard deviation of a statistic and the standard error
of a statistic?
2) What is the formula for the standard error of the sample proportion? How do you interpret
this value? Is this formula on the formula sheet?
We can use the general formula from
Section 8.1 to construct a confidence
interval for an unknown population
proportion p:
statistic ± (critical value) × (standard deviation of statistic)
The sample proportion pˆ is the statistic we use to estimate p.
When the 10% condition (independence) is met, the standard deviation
of the sampling distribution of $\hat{p}$ is
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
Since we don't know p, we replace it with the sample proportion $\hat{p}$.
This gives us the standard error (SE) of the sample proportion:
$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
When the standard deviation of a statistic is estimated from
data, the result is called the standard error of the statistic.
Standard error describes how close
the sample proportion will typically
be to the population proportion in
repeated SRSs of size n.
This is NOT on the formula sheet.
Only standard deviation is on there.
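As a quick check on the standard error formula, here is a minimal Python sketch. The function name `standard_error` and the numbers used are illustrative assumptions, not part of the slides.

```python
from math import sqrt


def standard_error(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return sqrt(p_hat * (1 - p_hat) / n)


# Illustrative values: 60% successes in a sample of size 100.
print(round(standard_error(0.60, 100), 4))

# Quadrupling the sample size cuts the standard error in half.
print(round(standard_error(0.60, 400), 4))
```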
What is a critical value? How is it calculated? What’s up with the *?
How do we find the critical value for our confidence interval?
statistic ± (critical value) × (standard deviation of statistic)
If the Large Counts condition is met, we can use a Normal curve. To
find a level C confidence interval, we need to catch the central area C
under the standard Normal curve.
To find a 95% confidence interval,
we use a critical value of 2 based
on the 68-95-99.7 rule.
Using Table A or a calculator, we
can get a more accurate critical
value.
Note, the critical value z* is
actually 1.96 for a 95% confidence
level.
How can we do this on the calculator?
Use the invNorm command. For the area, enter the area to the left of
the boundary (the lower tail). Use a mean of 0 and a standard deviation of 1.
For an 80% confidence interval, 10% of the area lies in each tail, so
invNorm(area: 0.1, mean: 0, st dev: 1) = -1.281551567.
So use a z* of 1.28.
Use Table A to find the critical value z* for an 80% confidence interval.
Assume that the Large Counts condition is met.
Since we want to capture the central
80% of the standard Normal
distribution, we leave out 20%, or
10% in each tail.
Search Table A to find the point z*
with area 0.1 to its left.
The closest entry is z = – 1.28.
z       .07     .08     .09
–1.3    .0853   .0838   .0823
–1.2    .1020   .1003   .0985
–1.1    .1210   .1190   .1170
So, the critical value z* for an 80% confidence interval is z* = 1.28.
A few more words on critical values…
• The critical value measures how many standard errors we need to extend the
interval to get the desired level of confidence.
• The asterisk reminds us it isn’t calculated from the data like other z-scores.
Alternate Example: Find the critical value for a 96% confidence interval for a
proportion.
A 96% interval leaves 2% in each tail, so the area to the left of z* is 0.98.
From the table: look up 0.98. The closest entry is 0.9798, which gives a z* of
2.05.
From the calculator: invNorm(area: 0.02, mean: 0, st dev: 1) = -2.05, so z* = 2.05.
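The same lookup can be sketched in Python using `statistics.NormalDist` from the standard library, whose `inv_cdf` method plays the role of invNorm. The helper name `critical_value` is an assumption made for this illustration.

```python
from statistics import NormalDist


def critical_value(confidence_level):
    """z* such that the central `confidence_level` area lies between -z* and z*."""
    tail_area = (1 - confidence_level) / 2
    # inv_cdf takes the area to the LEFT of the boundary, just like invNorm,
    # so the lower-tail result is negative and we flip its sign.
    return -NormalDist(mu=0, sigma=1).inv_cdf(tail_area)


for level in (0.80, 0.95, 0.96):
    print(level, round(critical_value(level), 2))
```

The printed values should agree with the table and calculator results in the examples above once rounded to two decimal places.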
What is the formula for a one-sample z interval for a proportion? Is this
formula on the formula sheet?
One-Sample z Interval for a Population Proportion
When the conditions are met, a C% confidence interval for the
unknown proportion p is
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where z* is the critical value for the standard Normal curve with
C% of its area between −z* and z*.
This is not on the formula sheet.
Alternate Example: Students in an AP Statistics class want to estimate the proportion of
pennies in circulation that are more than 10 years old. To do this, they gathered all the
pennies they had in their pockets and purses. Overall, 57 of the 102 pennies they have are
more than 10 years old.
a) Identify the population and the parameter of interest.
The population is all pennies currently in circulation.
The parameter is p = the true proportion of all pennies currently in circulation that are more
than 10 years old.
b) Check the conditions for calculating a confidence interval for the parameter.
Random: The class took an SRS of 102 pennies from circulation.
10%: Because the class sampled without replacement, the number of pennies in the
population must be at least 10(102) = 1020. Because there are more than 1020 pennies in
circulation, the 10% condition is met.
Large Counts: $n\hat{p} = 57$ and $n(1-\hat{p}) = 45$ are both at least 10.
c) Construct a 99% confidence interval for the parameter.
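With $\hat{p} = 57/102 \approx 0.559$ and $z^* = 2.576$ for 99% confidence:
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.559 \pm 2.576\sqrt{\frac{(0.559)(0.441)}{102}} = 0.559 \pm 0.127 = (0.432,\ 0.686)$$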
d) Interpret the interval in context.
We are 99% confident that the interval from 0.432 to 0.686 captures the true
proportion of pennies in circulation that are more than 10 years old.
e) Is it plausible that more than 60% of all pennies in circulation are more
than 10 years old?
Yes. Because the confidence interval includes values greater than 0.60 (up to 0.686), it is
plausible that more than 60% of all pennies in circulation are more than 10 years old.
How can we make life easier and do this on our calculator?
STAT -> TESTS -> A: 1-PropZInt
Enter the number of successes for x, the sample size for n, and your
confidence level for C-Level.
For our previous example we would enter
x: 57
n: 102
C-Level: 0.99
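For anyone without the calculator at hand, the same computation can be sketched in Python. The function name `one_prop_z_interval` is hypothetical; it is meant to mirror the inputs of 1-PropZInt (x, n, C-Level), not to reproduce the calculator's internals.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, c_level):
    """One-sample z interval for a proportion: p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = x / n
    # Area to the left of z* is the confidence level plus one tail.
    z_star = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = one_prop_z_interval(57, 102, 0.99)
print(f"({low:.3f}, {high:.3f})")
```

Because this sketch carries full precision throughout, an endpoint can differ in the third decimal place from a hand calculation that rounds the sample proportion and the margin of error first.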
What is the four-step process for calculating a confidence interval? What do
you need to do in each step? Do you always have to do the four steps?
Confidence Intervals: A Four-Step Process
State: What parameter do you want to estimate, and at what
confidence level?
Plan: Identify the appropriate inference method. Check conditions.
Do: If the conditions are met, perform calculations.
Conclude: Interpret your interval in the context of the problem.
It is essential that you follow the four-step process whenever a problem asks you to
calculate a confidence interval. These steps were carefully designed based on the scoring
rubrics for confidence interval questions on the AP exam.
Is it OK to use your calculator to calculate the interval?
• You may use your calculator to compute a confidence interval on the AP
exam, but there is a risk involved. If you just give the calculator answer with
no work you’ll get either full credit for the “Do” step (if the interval is correct)
or no credit (if it’s wrong). If you opt for the calculator-only method, be sure
to name the procedure (e.g., one-proportion z interval) and to give the
interval (e.g., 0.514 to 0.607).
• If you want to be able to get partial credit, show work.
• It is still important to know the formula because you will need it for multiple
choice questions.
• Also, on the calculator, make sure x is an integer!
Alternate Example: In her first-grade social studies class, Jordan learned that 70% of the Earth’s surface
was covered in water. She wondered if this was really true and asked her dad for help. To investigate,
he tossed an inflatable globe to her 50 times, being careful to spin the globe each time. When she
caught it, he recorded where her right index finger was pointing. In 50 tosses, her finger was pointing
to water 33 times. Construct and interpret a 95% confidence interval for the proportion of the Earth’s
surface that is covered in water.
State: We want to estimate p = the true proportion of the Earth’s surface that is covered in water with
95% confidence.
Plan: We should use a one-sample z interval for p if the conditions are met.
• Random: The 50 locations are a random sample of all possible locations on the globe.
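• 10%: We do not need to check the 10% condition because the locations were not
sampled without replacement (the same location could be selected more than once).
• Large Counts: $n\hat{p} = 33$ and $n(1-\hat{p}) = 17$ are both at least 10.
Do: $\hat{p} = 33/50 = 0.66$ and $z^* = 1.96$, so
$$0.66 \pm 1.96\sqrt{\frac{(0.66)(0.34)}{50}} = 0.66 \pm 0.131 = (0.529,\ 0.791)$$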
Conclude: We are 95% confident that the interval from 0.529 to 0.791 captures the true proportion of
the Earth’s surface that is covered in water. This is consistent with the claim that 70% of the Earth’s
surface is covered in water, because 0.70 is one of the plausible values in the interval.
What is the formula for the margin of error for a confidence interval for a
proportion? Is this formula on the formula sheet?
The margin of error (ME) in the confidence interval for p is
$$ME = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Not on the formula sheet
Because the margin of error involves the sample proportion $\hat{p}$, we have to
guess the latter value when choosing n. There are two ways to do this:
• Use a guess for pˆ based on past experience or a pilot study
• Use pˆ = 0.5 as the guess. ME is largest when pˆ = 0.5
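To see why 0.5 is the conservative guess, the margin of error can be tabulated for several values of the sample proportion at a fixed sample size. This is a small sketch; the function name `margin_of_error`, the sample size of 100, and the use of z* = 1.96 are illustrative assumptions.

```python
from math import sqrt


def margin_of_error(p_hat, n, z_star=1.96):
    """ME = z* * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


n = 100  # illustrative sample size
for p_hat in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p_hat, round(margin_of_error(p_hat, n), 3))
```

The product p_hat(1 - p_hat) peaks at 0.25 when p_hat = 0.5, so a sample size chosen with that guess is large enough for any other value of the sample proportion.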
Sample Size for a Desired Margin of Error
To determine the sample size n that will yield a level C confidence
interval for a population proportion p with a maximum margin of
error ME, solve the following inequality for n:
$$z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le ME$$
where pˆ is a guessed value for the sample proportion. The margin of error
will always be less than or equal to ME if you take the guess pˆ to be 0.5.
Alternate Example: Suppose that you wanted to estimate p = the true proportion
of students at your school who have a tattoo with 98% confidence and a margin
of error of no more than 0.10. How many students should you survey?
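Because we have no prior guess for $\hat{p}$, use $\hat{p} = 0.5$. For 98% confidence, our z* will be 2.33.
$$2.33\sqrt{\frac{(0.5)(0.5)}{n}} \le 0.10 \quad\Rightarrow\quad n \ge \left(\frac{2.33(0.5)}{0.10}\right)^2 \approx 135.72$$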
We need to survey at least 136 students to estimate the true
proportion of students with a tattoo with 98% confidence and
a margin of error of at most 0.10.
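The sample-size calculation can also be sketched in Python. The function name `required_sample_size` is hypothetical, and the default guess of 0.5 follows the conservative approach described above.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence_level, p_guess=0.5):
    """Smallest n with z* * sqrt(p_guess * (1 - p_guess) / n) <= margin_of_error."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Solve the inequality for n, then always round UP to a whole person.
    n = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return ceil(n)


# Tattoo example: 98% confidence, margin of error at most 0.10.
print(required_sample_size(0.10, 0.98))
```

Note that the sketch uses the unrounded critical value rather than 2.33, so the value of n before rounding up is slightly different from the hand calculation, although the final answer for this example is the same.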