"How many statisticians does it take to change a light bulb? 1 ± 3"
Author Unknown
Chapter 19: Confidence Intervals for Proportions (Pages 432 - 450)
Overview: Last chapter we looked at how proportions varied from sample to sample. When we use our
sample to estimate the parameters, there will be some difference in what the next Joe Schmo
says about the population based off his stats and what we said. Since we expect sampling
variability, we do our best by creating confidence intervals. Confidence Intervals are formed
by taking our estimate and adding or subtracting a margin of error. The methods we use in
this chapter will allow us to make statements such as “We have 95% confidence that our
interval contains the true proportion.”
Statistical Inference allows us to draw a conclusion about the population from a sample.
*Inference is most reliable if the sample was chosen by a random sampling design!
2 Types of Formal Statistical Inference:
1) Confidence Intervals –
2) Test of Significance –
**both are based on what would happen if we repeated the sample or experiment many times!
Confidence Interval – an interval of values computed from sample data that is likely to include the
true population value. A confidence interval has 2 parts:
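1) an interval calculated from the data, of the form estimate ± margin of error
2) a confidence level, C, which says how often the method produces an interval that captures the true value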
As a user of statistics, you (or I) will often choose the confidence level. Most often we choose a high
confidence level (90% or above) because we want to be quite sure of our conclusion.
C = confidence level in decimal form
A level C confidence interval for a parameter is an interval computed from sample data by a method that
has a probability, C, of producing an interval containing the true value of the parameter.
EXAMPLE: C = .95 means 95% confidence
Interpreted as: In the long run, about 95% of all confidence intervals computed in this way will
capture the true population proportion, and about 5% of them will miss it.
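The "in the long run" reading can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true proportion (0.40), the sample size (200), the number of trials, and the seed are all made-up values, and the interval used is estimate ± z* times the standard error with z* = 1.96 assumed for 95% confidence.

```python
import random
from math import sqrt

def coverage_rate(p_true=0.40, n=200, z_star=1.96, trials=2000, seed=1):
    """Fraction of simulated intervals that contain p_true (all inputs hypothetical)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        # Margin of error = z* times the standard error of p-hat.
        moe = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - moe <= p_true <= p_hat + moe:
            hits += 1
    return hits / trials

print(coverage_rate())  # should land near 0.95, not exactly on it
```

Any single interval either captures the true proportion or misses it; the 95% describes how often the method succeeds over many repetitions.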
When the following conditions are met, we are ready to find the confidence interval for the population
proportion, p.
Assumptions:
1. The sample values must be independent of each other.
2. The sample size, n , must be large enough.
Since it is hard to check assumptions, we verify the following conditions:
1. Randomization Condition: The data values must be sampled randomly.
2. 10% Condition: The sample size, n, is less than 10% of the population.
3. Success/Failure Condition: The sample size has to be big enough to have at least 10 successes and
10 failures. So $n\hat{p} \ge 10$ and $n\hat{q} \ge 10$.
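The numeric conditions above can be sketched as a small helper. The function name, its arguments, and the numbers in the usage line are hypothetical. Randomization cannot be verified from the counts, so it is passed in as a flag the user supplies.

```python
def check_conditions(successes, n, population_size=None, random_sample=True):
    """Report the three conditions for a one-proportion z-interval (illustrative helper)."""
    failures = n - successes
    return {
        # Randomization must be judged from how the data were collected.
        "randomization": random_sample,
        # 10% Condition: unknown (None) if the population size is not given.
        "ten_percent": None if population_size is None else n < 0.10 * population_size,
        # Success/Failure Condition: at least 10 of each.
        "success_failure": successes >= 10 and failures >= 10,
    }

# Hypothetical sample: 40 successes out of 100, drawn from a population of 5000.
print(check_conditions(40, 100, population_size=5000))
```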
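Confidence Interval for a Population Proportion
$\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
where $\hat{p}$ is the estimate and $z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$ is the margin of error (MOE).
$\hat{p}$ = the sample proportion of successes
$\hat{q} = 1 - \hat{p}$
$SE(\hat{p}) = \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$ = the standard error of $\hat{p}$
n = the sample size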
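The interval (estimate ± z* times the standard error) translates directly into a few lines of code. This is a minimal sketch: the function name and the sample in the usage line (60 successes out of 150) are made up for illustration, and z* is passed in rather than computed.

```python
from math import sqrt

def one_prop_z_interval(successes, n, z_star):
    """Return (lower, upper, margin_of_error) for a one-proportion z-interval."""
    p_hat = successes / n          # sample proportion
    q_hat = 1 - p_hat              # proportion of failures
    se = sqrt(p_hat * q_hat / n)   # standard error of p-hat
    moe = z_star * se              # margin of error
    return p_hat - moe, p_hat + moe, moe

# Hypothetical sample: 60 successes in 150 trials, z* = 1.96 assumed for 95% confidence.
lower, upper, moe = one_prop_z_interval(60, 150, 1.96)
print(f"MOE = {moe:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

Here p-hat is 0.40 and the standard error is 0.04, so the margin of error is 1.96 × 0.04 = 0.0784.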
How do we get z* (critical value)?
Let’s say we are finding a 90% confidence interval. (C = .90)
Draw a normal curve & mark the middle 90% of the data, find the z value (or z score) that would
capture the middle 90% of the data (use the body of the z table or the invnorm(.05) function on the
calculator). z = -1.645 and 1.645. z* is the positive z score known as the upper critical value. In this
example z* = 1.645.
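The same lookup can be done in software instead of with the z table or calculator. The sketch below assumes Python 3.8 or later, where the standard library's `statistics.NormalDist` provides `inv_cdf`, an inverse normal function; the helper name `z_star` is our own.

```python
from statistics import NormalDist

def z_star(confidence):
    """Upper critical value that captures the middle `confidence` of a standard Normal curve."""
    tail = (1 - confidence) / 2            # area to the right of z*
    return NormalDist().inv_cdf(1 - tail)  # inverse Normal of the area to the left of z*

for c in (0.80, 0.90, 0.95, 0.99):
    print(f"{c:.0%}: tail area = {(1 - c) / 2:.3f}, z* = {z_star(c):.3f}")
```

For C = .90 the tail area is .05, so `inv_cdf(0.95)` plays the role of the positive z score described above.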
Most common confidence intervals and their corresponding z*:
CONFIDENCE LEVEL    TAIL AREA (to the right of z*)    z*
80%                 0.10                              1.282
90%                 0.05                              1.645
95%                 0.025                             1.960
99%                 0.005                             2.576
Steps to Construct a Confidence Interval:
1) Identify the population of interest and the parameter you want to draw conclusions
about.
2) Choose the appropriate inference procedure. Verify the conditions/assumptions for
using the selected procedure.
3) If the conditions are met, carry out the inference procedure.
CI = Sample estimate ± Margin of error
4) Interpret your results in context of the problem.
Let’s try one…
1. Your local newspaper polls a random sample of 330 voters, finding 144 who say they will vote
“yes” on the upcoming school budget. Create a confidence interval for actual sentiment of all voters.
Solution:
Think – We want a confidence interval for the proportion of all voters who will vote “yes” on the
upcoming school budget, based on a random sample of 330 voters. In our sample, 144 respondents
stated that they would vote “yes”. We want to be reasonably confident in our results, so we will
construct a confidence interval of approximately 95% confidence.
Plausible Independence Assumption: It is reasonable to think that the responses were
independent, provided good surveying techniques were used.
Random Sampling Condition: The voters were sampled randomly.
10% Condition: Provided there are more than 3300 eligible voters, 330 is less than 10% of the
population of voters.
Big Enough Sample Assumption: must have at least 10 successes and 10 failures.
Success/Failure Condition: $n\hat{p} = 144$ and $n\hat{q} = 186$, which are both at least 10, so the sample is large
enough.
The conditions are satisfied, so I can use the Normal model to find a one proportion z-interval.
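Show – $\hat{p} = 144/330 \approx 0.436$ and $\hat{q} = 186/330 \approx 0.564$
The margin of error is MOE = $z^*\sqrt{\hat{p}\hat{q}/n} = 1.96\sqrt{(0.436)(0.564)/330} \approx 0.0535$
The confidence interval is $0.436 \pm 0.0535$, or about (0.383, 0.490).
Tell – We are 95% confident that between 38.3% and 49.0% of all voters will vote “yes” on the upcoming school budget.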
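The arithmetic for the school budget poll can be checked step by step. The sketch assumes z* = 1.96 for the 95% confidence level chosen in the Think step; the variable names are our own.

```python
from math import sqrt

n = 330          # voters polled
yes = 144        # voters who said "yes"
z_star = 1.96    # assumed critical value for 95% confidence

p_hat = yes / n                      # sample proportion
se = sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
moe = z_star * se                    # margin of error
lower, upper = p_hat - moe, p_hat + moe

print(f"p-hat = {p_hat:.3f}, SE = {se:.4f}, MOE = {moe:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The whole interval sits below 0.50, which is worth mentioning when interpreting the result in context.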
Choosing the Sample Size
To determine the sample size n that will yield a confidence interval for a population proportion with a
specific margin of error, MOE: let the MOE be greater than or equal to the expression for the margin of
error and solve for n.
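Solving MOE ≥ z* times the square root of p̂q̂/n for n gives n ≥ (z*)² p̂q̂ / MOE², rounded up to the next whole number. A minimal sketch follows; the function name and the numbers in the usage line are hypothetical, and using 0.5 when no estimate of the proportion is available is the usual cautious choice.

```python
from math import ceil

def required_sample_size(moe, z_star, p_guess=0.5):
    """Smallest n whose margin of error is at most `moe` for a guessed proportion."""
    n = (z_star ** 2) * p_guess * (1 - p_guess) / moe ** 2
    return ceil(n)  # always round up: rounding down would exceed the target MOE

# Hypothetical target: margin of error of 4% with z* = 1.96 and no prior estimate.
print(required_sample_size(0.04, 1.96))
```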
2. An experiment finds that 27% of 53 subjects report improvement after using a new medicine. Create a
95% confidence interval for the actual cure rate. Why is this interval so wide? Make it narrower – 90%
confidence. What are the advantages and disadvantages? What sample size would we need in a follow up study if we want a margin of error of 5% with 98% confidence?
Solution:
Think – We want a confidence interval for the proportion of all people who will improve after using a
new medication, based on an experiment involving 53 subjects. In our sample, 27% of the subjects
improved.
Plausible Independence Assumption: One patient responding to the medication shouldn’t have an
influence on other patients responding to the medication.
Random Sampling Condition: The patients were part of an experiment, which hopefully included
random assignment of volunteers to treatment groups.
10% Condition: The 10% condition doesn’t apply. We are testing medication, not patients.
Big Enough Sample Assumption: must have at least 10 successes and 10 failures
Success/Failure Condition: $n\hat{p} = 53(0.27) \approx 14$ and $n\hat{q} = 53(0.73) \approx 39$, which are both at least 10, so the
sample is large enough.
The conditions are satisfied, so I can use the Normal model to find a one proportion z-interval.
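Show – $\hat{p} = 0.27$, $\hat{q} = 0.73$, $n = 53$
The margin of error is MOE = $1.96\sqrt{(0.27)(0.73)/53} \approx 0.120$
The confidence interval is $0.27 \pm 0.120$, or about (0.150, 0.390).
Tell – We are 95% confident that between 15.0% and 39.0% of people will improve after using the new medication.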
This interval is quite wide for a couple of reasons. The sample size of 53 people doesn’t provide a great
deal of accuracy in our estimate. The standard error is still quite large. Also, the more confident we are
that we have succeeded in capturing the true proportion, the less precise our interval becomes.
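90% Confidence Interval
The margin of error is MOE = $1.645\sqrt{(0.27)(0.73)/53} \approx 0.100$
The confidence interval is $0.27 \pm 0.100$, or about (0.170, 0.370).
We are 90% confident that between 17.0% and 37.0% of people will improve after using the new
medication.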
This interval has the advantage of being more precise, but we are less confident in our ability
to capture the true proportion within our interval.
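The trade-off between confidence and precision shows up when both intervals are computed side by side. The sketch assumes z* = 1.960 for 95% and z* = 1.645 for 90%; the standard error is the same in both cases, so only the critical value changes the width.

```python
from math import sqrt

p_hat, n = 0.27, 53
se = sqrt(p_hat * (1 - p_hat) / n)   # same standard error for both intervals

# Assumed critical values: 1.960 for 95% confidence, 1.645 for 90% confidence.
for label, z_star in (("95%", 1.960), ("90%", 1.645)):
    moe = z_star * se
    print(f"{label}: MOE = {moe:.3f}, interval = ({p_hat - moe:.3f}, {p_hat + moe:.3f})")
```

Dropping from 95% to 90% confidence trims the margin of error by about two percentage points on each side.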
Finding the sample size required for ± 5%, with 98% confidence.
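Consider the formula for margin of error. We believe the improvement rate to be 0.27 from our
preliminary study. The value of z* for 98% confidence is 2.326.
$0.05 \ge 2.326\sqrt{(0.27)(0.73)/n}$, so $n \ge (2.326)^2(0.27)(0.73)/(0.05)^2 \approx 426.5$
A sample of at least 427 subjects is required.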
3. What sample size does it take to estimate the outcome of an election with a margin of error of 3%?
(with 95% confidence)
Solution: We’ll do what polling organizations usually do, using 95% confidence and choosing the most
cautious proportion, 50%, since we have no estimate of the population proportion.
Solving $0.03 \ge 1.96\sqrt{(0.5)(0.5)/n}$ gives $n \ge (1.96)^2(0.5)(0.5)/(0.03)^2 \approx 1067.1$. A sample size of at
least 1068 likely voters is required.
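A quick calculation shows why 50% is the most cautious choice: the product p(1 − p) is largest at p = 0.5, so that guess demands the largest sample. This is a sketch only; the helper name and the list of trial proportions are made up, and z* = 1.96 is assumed for 95% confidence.

```python
from math import ceil

z_star = 1.96   # assumed critical value for 95% confidence
moe = 0.03      # target margin of error

def needed_n(p):
    """Sample size needed for the target margin of error if the true proportion were p."""
    return ceil(z_star ** 2 * p * (1 - p) / moe ** 2)

for p in (0.3, 0.4, 0.5, 0.6, 0.7):
    print(f"p = {p:.1f}: n = {needed_n(p)}")
```

Any guess other than 0.5 gives a smaller required n, so planning with 0.5 guarantees the margin of error is met whatever the true proportion turns out to be.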