Chapter 8.1, 8.2 and 9.1, 9.2: Confidence Intervals for a Proportion

In chapter 7 we learned that in a sampling distribution,
(____________________________________________________________________________) we looked
at SOCS, without the O!
Shape: Because the population distribution is Normal, so is the ____________________________________ of
x̄. Thus the ______________ condition is met. (Recall that would be np ≥ 10 and nq ≥ 10)
Center: The mean of the sampling distribution of x̄ was the same as the unknown mean μ of the entire population.
μ_x̄ = μ
Spread: The standard deviation of x̄: _______________
Sampling distribution of x̄ tells us:
_____________________________________________________________________________________.
Lesson 8.1: Now, when we want to talk about how close to x̄ the unknown population mean µ is likely to be, we are
looking at the confidence intervals for µ.
General form of a confidence interval:
estimate ± margin of error
or
statistic ± (critical value) ∗ (standard deviation of statistic)

Margin of error: ________________________________________________________________.

Confidence level:__________________________________________________________________.

Confidence interval: “We are C% confident that the interval from ________ to __________ captures the
actual value of the [population parameter in context].”
WHAT THE CONFIDENCE INTERVAL DOES NOT DO:
The confidence level does not tell us
_________________________________________________________________________.
We are basically saying we got these numbers by a method that gives correct results 95% (or whatever level) of
the time.
The probability that our 95% confidence interval captures the parameter is NOT ________!
• We would have a 95% chance of getting a sample mean that is within 2σ_x̄ of the mystery μ, which would
lead to a confidence interval that captures μ.
• Once we have chosen a random sample, the sample mean x̄ either is or is not within 2σ_x̄ of μ. The
resulting confidence interval either does or does not contain μ.
• After we construct a confidence interval, the probability that it captures the population parameter is
either ________________________.
Confidence intervals are statements about __________________________.
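The idea that the confidence level describes the method, not any single interval, can be checked with a small simulation. The sketch below is only an illustration under assumed values: the population mean, standard deviation, sample size, and seed are all hypothetical, and σ is treated as known so that a z-interval applies.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu=10.0, sigma=2.0, n=25, conf=0.95, reps=2000, seed=1):
    """Fraction of simulated z-intervals for the mean that capture mu.

    mu, sigma, n, and seed are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    margin = z_star * sigma / n ** 0.5   # sigma is assumed known here
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # Each individual interval either captures mu or it does not.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

rate = coverage_rate()
print(f"Share of intervals that captured mu: {rate:.3f}")
```

Over many repetitions the share should land near the stated confidence level, even though each single interval is simply a hit or a miss.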
Check your understanding:
How much does the fat content of Oscar Mayer hot dogs vary? To find out, researchers measured the fat content
(in grams) of a sample of 10 Oscar Mayer hot dogs. A 95% confidence interval for the population standard
deviation σ is 2.84 to 7.55.
1. Interpret the confidence interval.
_______________________________________________________________________________
2. Interpret the confidence level.
______________________________________________________________________________
3. True or False: The interval from 2.84 to 7.55 has a 95% chance of containing the actual population
standard deviation σ.
________________________________________________________________________________
Constructing a Confidence Interval:
• The confidence interval for estimating a population parameter has the form:
statistic ± (critical value) ∗ (standard deviation of statistic)
A couple of things to keep in mind:
• The critical value is tied directly to the confidence level.
• Greater confidence requires a larger critical value.
• The standard deviation of the statistic depends on the sample size n: the larger the sample, the
more precise the estimate, which means less variability in the statistic.
• The margin of error gets smaller when either 1) the confidence level decreases or 2) the sample size n
increases.
• Usually, if we do not know µ, we do not know σ either. In section 8.2 we will learn how to construct
confidence intervals for a population proportion p. In section 8.3, we will construct confidence
intervals for a population mean.
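To see how the confidence level and the sample size each move the margin of error, here is a minimal sketch. It assumes the one-proportion standard error from Lesson 8.2, and the values p̂ = 0.5, n = 100, and n = 400 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, conf):
    """Margin of error = (critical value) * (standard error) for one proportion."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

base = margin_of_error(0.5, 100, 0.95)        # starting point
lower_conf = margin_of_error(0.5, 100, 0.90)  # same n, less confidence
bigger_n = margin_of_error(0.5, 400, 0.95)    # same confidence, 4 times the sample

print(f"n=100, 95%: {base:.4f}")
print(f"n=100, 90%: {lower_conf:.4f}")
print(f"n=400, 95%: {bigger_n:.4f}")
```

Because n sits under a square root, quadrupling the sample size cuts the margin of error in half.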
Example A:
Each bead in the bag represents a person’s opinion.
People are in favor (___________) or opposed (____________) to some new legislation.
Take a sample of size 50 and find the proportion in favor of the new legislation. We’ll use these results in a second.
Before calculating a confidence interval for µ or for p, you must check the conditions:
1. Randomization Condition: ________________________________________________________
2. 10% Condition: _______________________________________________________________
3. Success/Failure Condition (AKA Normal):
For means:
______________________________________________________________________________________
______________________________________________________________________________________
For proportions: _________________________________________________________________________
Lesson 8.2:
So we can use a Normal Model, but why do we need to?
The critical value (z*, pronounced "z-star") from the confidence interval formula comes from the normal model.
Option 1: Use the 68-95-99.7 Rule (not exact):
We want to pick a 95% confidence level, so we should choose where 95% of the numbers should fall. Roughly within
2 sd of the mean, so our z* = 2.
Option 2: Use the invNorm function
invNorm(.025) = ___________________
invNorm(.975) = ___________________
so our z* = 1.96
Option 3: Use the formula sheet: Use the table in the back of the book.
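For anyone working without a calculator, Python's standard library offers a rough equivalent of invNorm. This sketch assumes `statistics.NormalDist().inv_cdf` plays the role of invNorm on the standard normal curve; the helper name `z_star` is made up for illustration.

```python
from statistics import NormalDist

def z_star(conf):
    """Critical value that leaves (1 - conf) / 2 in each tail of N(0, 1)."""
    return NormalDist().inv_cdf((1 + conf) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

The printed values should agree with the table in the back of the book to about three decimal places.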
The Standard Deviation of Statistic part of the confidence interval formula also comes from the normal model.
(Really called the Standard Error because we are using p̂ instead of p.)
S.E. = √(p̂q̂ / n)
Let’s put all of this together: From above:
Example A: Each bead in the bag represents a person’s opinion. People are in favor (___________) or opposed (____________) to some new
legislation. Take a sample of size 50 and find the proportion in favor of the new legislation.
General Form: statistic ± (critical value) ∗ (standard deviation of statistic)
For One Proportion: p̂ ± z* ∙ √(p̂q̂ / n)
With Numbers:
Conclusion using words in context:
I am 95% confident that the true proportion of Orange County voters who are in favor of this new legislation lies
between _________ and __________.
What if we wanted to be 90% confident?
What if we wanted to be 99% confident?
Now if we want to find the confidence interval with our calculator:
STAT → TESTS → A: 1-PropZInt → ENTER
Then enter:
x: # of successes
n: sample size
C-Level: confidence level as a decimal
Then scroll down to Calculate and hit ENTER.
Type of test used: 1-PropZInt.
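The same three inputs (x, n, C-Level) can be turned into an interval by hand. The function below is a sketch of the one-proportion formula p̂ ± z* · √(p̂q̂ / n), not the calculator's actual code, and the sample of 30 successes out of 50 is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, c_level):
    """Return (low, high) for a one-proportion z-interval."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf((1 + c_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical bead sample: 30 of 50 in favor, 95% confidence.
low, high = one_prop_z_interval(30, 50, 0.95)
print(f"({low:.4f}, {high:.4f})")
```

The conditions (randomization, 10%, success/failure) still have to be checked separately; the function only does the arithmetic.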
Example 1: Advertisers send junk mail to thousands of potential customers in the hope that some will buy the
company’s product. The response rate is usually quite low. Suppose a company wants to test the response to a new
flyer, and sends it to 1000 people randomly selected from their mailing list of over 200,000 people. They get
orders from 123 recipients. Create a 90% confidence interval for the percentage of people the company contacts
who may buy something.
Assumptions:
Name of Test:
Math:
Conclusion:
The company must decide whether to now do a mass mailing. The mailing won’t be cost-effective unless it produces
at least a 5% return. What does your confidence interval suggest?
Example 2: Say we take a sample of size 100 and construct a 95% confidence interval.
How can we make the margin of error smaller?
If we still want to be 95% confident, what sample size do we need to cut the margin of error in half?
What if we wanted to make the margin of error one third as large? What sample size do we need?
Examples 3 and 4 – Finding the sample size:
A state’s environmental agency worries that many cars may be violating clean air emissions standards. The agency
hopes to check a sample of vehicles in order to estimate that percentage with a margin of error of 3% with 90%
confidence. To gauge the size of the problem, the agency first picks 60 cars and finds 9 with faulty emissions
systems. How many should be sampled for a full investigation?
In preparing a report on the economy, we need to estimate the percentage of businesses that plan to hire
additional employees in the next 60 days. How many randomly selected employers must we contact in order to
create an estimate in which we are 98% confident with a margin of error of 5%?
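Sample-size questions like these come from solving ME = z* · √(p̂q̂ / n) for n, which gives n = (z*)² · p̂q̂ / ME², rounded up. The sketch below assumes the usual convention of using p̂ = 0.5 when no earlier estimate is available; the margin and confidence level shown are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, conf, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin`.

    p_guess = 0.5 is the most cautious choice when no pilot estimate exists.
    """
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    return ceil(z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2)

# Hypothetical target: margin of error 4% at 95% confidence.
print(sample_size_for_proportion(0.04, 0.95))               # no prior estimate
print(sample_size_for_proportion(0.04, 0.95, p_guess=0.2))  # pilot estimate of 20%
```

Always round up, since rounding down would leave the margin of error slightly larger than the target.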
Example 5: A TV news reporter says that a proposed constitutional amendment is likely to win approval in the
upcoming election because a poll of 1505 likely voters indicated that 52% would vote in favor. The reporter goes
on to say that the margin of error for this poll was 3%.
Explain why the poll is actually inconclusive.
What confidence level did the pollsters use?
Now we are going to do a full work-up of a scenario:
Example 6: In a random sample of machine parts, 18 out of 225 were found to have been damaged in shipment.
Establish a 99% confidence interval for the proportion of machine parts that are damaged in shipment.
Assumptions:
1. Randomization:
2. 10% Condition:
3. Success/Failure:
∴ Since all conditions are met, we may use a Normal model N(          )
Test:
Math:
Conclusion:
Suppose we want to decrease the margin of error to 3% while keeping our 99% confidence level. What size sample
do we need to use?
Practice Multiple Choice:
Chapter 9.1 & 9.2: Hypothesis Test for One Proportion
Card Activity: We have a population of 416 cards (8 decks). Take a sample of size 40. What is the proportion of
black cards?
What value do we expect p̂ to be close to?
Sampling Results:
(number of black cards) / (total number of cards) =
What is the value of p̂?
Is there a difference between what we expected (based on what we know about cards) and what we observed from
the sample?
Can we quantify how usual/unusual the value of p̂ is?
Option 1: z-value
Option 2: probability
Based on the analysis we just completed using our sample results, what do you think about the true proportion of
black cards in the population?
We have just done an informal hypothesis test. In this case it would be called a one-proportion z test.
Let’s go through the fancy math steps now to answer the question: Is the proportion of black cards less than 50%?
ASSUMPTIONS:
These are the same 3 we have been checking in the past two chapters.
We want to be able to use a Normal Model.
TEST:
There will be many types of intervals and tests as the semester progresses so we need to indicate
if we are doing a confidence interval or a hypothesis test and which kind.
HYPOTHESES:
There are two hypotheses:
The null hypothesis (Ho) is in the form parameter = hypothesized value
The alternative hypothesis (Ha) is in the form parameter (>, <, ≠) hypothesized value
MATH:
We did this up above in our informal analysis of the sample results but let’s go through it again.
We are finding a standardized test statistic…in this case a z-value.
PICTURE/P-value:
It’s nice to visualize how a standardized test statistic (z-value) relates to the original data.
How often do we see a sample result at least as extreme as ours if Ho is true (p = 0.5)?
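The MATH and P-value steps for the card question can be sketched as follows. The count of 17 black cards out of 40 is hypothetical (your own sample will differ). Note that the standard deviation uses the null value p = 0.5 rather than p̂, because the calculation assumes Ho is true.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.5        # null value: half the cards are black
x, n = 17, 40   # hypothetical sample: 17 black cards out of 40

p_hat = x / n
sd_under_null = sqrt(p0 * (1 - p0) / n)   # built from p0, not p_hat
z = (p_hat - p0) / sd_under_null          # standardized test statistic
p_value = NormalDist().cdf(z)             # lower tail, because Ha is p < 0.5

print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

A P-value this large would mean a sample like this one is not unusual when Ho is true.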
CONCLUSION:
We want to explain 3 things in our conclusion:
• How unusual our sample was using a pre-determined α level
• What this means with regard to the null hypothesis (Ho)
• How strong our evidence is with regard to the alternative hypothesis (Ha)
CONFIDENCE INTERVAL:
Construct a 95% confidence interval for the true proportion of black cards.
Since we don’t think the proportion is 0.5, what is it?
What you may be thinking:
How do we know what to write for the hypotheses?
1) Census data for a certain county shows that 19% of the adult residents are Hispanic. Suppose 72 people
are called for jury duty, and only 9 of them are Hispanic. Does this apparent under-representation of
Hispanics call into question the fairness of the jury selection system?
2) A study of the effects of acid rain on trees in the Hopkins Forest shows that of 100 trees sampled, 25 of
them exhibited some sort of damage from acid rain. This rate seemed to be higher than the 15% quoted in
a recent Environmetrics article on the average proportion of damaged trees in the Northeast. Does the
sample suggest that trees in the Hopkins Forest are more susceptible than the rest of the region?
3) The College Board reported that 60% of all students who took the 2004 AP Statistics exam earned scores
of 3 or higher. One teacher wondered if the performance of her school was different. She believed that
year’s students to be typical of those who will take AP Stats at that school and was pleased when 65% of
her 54 students achieved scores of 3 or better. Can she claim her school is different?
How does the sign in the alternative hypothesis affect the “Math” and “Picture” parts of the solution?

One-sided alternative
Ho: p = 0.5
Ha: p > 0.5
Sample results: 31 out of 50 support a local school bond

Two-sided alternative
Ho: p = 0.5
Ha: p ≠ 0.5
Sample results: 31 out of 50 support a local school bond
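The two setups share the same z-value; only the shaded area changes. The sketch below runs both on the school bond sample (31 out of 50). The function name and the `alternative` labels are illustrative choices, not calculator syntax.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0, alternative):
    """Return (z, p_value) for a one-proportion z test."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    std = NormalDist()
    if alternative == "greater":
        p_value = 1 - std.cdf(z)             # shade the upper tail only
    elif alternative == "less":
        p_value = std.cdf(z)                 # shade the lower tail only
    elif alternative == "two-sided":
        p_value = 2 * (1 - std.cdf(abs(z)))  # shade both tails
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

z, p_one = one_prop_z_test(31, 50, 0.5, "greater")
_, p_two = one_prop_z_test(31, 50, 0.5, "two-sided")
print(f"z = {z:.3f}")
print(f"one-sided P-value = {p_one:.4f}")
print(f"two-sided P-value = {p_two:.4f}")
```

The two-sided P-value is exactly double the one-sided one, so the same sample can fall on different sides of a 0.05 cutoff depending on which alternative was chosen before looking at the data.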
What am I supposed to say in the “Conclusion” part of the solution?
There are two options:
P-Value
Ho
Ha
Whoa, what’s a P-value?
What does it mean for a sample result to be “statistically significant”?
And what about the α level?
Why can’t we accept the null hypothesis?
How do confidence intervals and α levels relate so that our decisions will be consistent?
Example #1: The proportion of people that typically feel better after taking extra strength Tylenol is 0.67. You
are in charge of a chemistry lab and have stumbled upon a “new and improved” extra strength Tylenol pill. You test
your new pill on 200 people and find that 144 people felt better after taking your new pill. Is your new and
improved Tylenol pill really better? Provide statistical evidence to support your answer.
ASSUMPTIONS:
TEST:
HYPOTHESES:
MATH:
PICTURE/P-Value:
CONCLUSION:
Example #2: You are a campaign manager. You believe that 45% of potential voters currently support your
candidate. You are trying to decide on a new ad for television. You test the new ad out on 90 people and find that
50 of them will be supporting your candidate. Is there evidence the ad has changed the proportion of people who
will be supporting your candidate? Provide statistical evidence to support your answer.
ASSUMPTIONS:
TEST:
HYPOTHESES:
MATH:
PICTURE/P-Value:
CONCLUSION:
Example #3: The proportion of cars running red lights in 2008 was 0.06. City planners hope that the installation
of red light cameras has reduced the proportion of red light runners in recent years. Of 250 cars going through an
intersection, 10 ran through a red light. Do cameras reduce the proportion of red light runners? Provide
statistical evidence to support your answer.
ASSUMPTIONS:
TEST:
HYPOTHESES:
MATH:
PICTURE/P-Value:
CONCLUSION:
Example #4: A company hopes to improve customer satisfaction, setting as a goal no more than 5% negative
comments. A random survey of 350 customers found only 10 with complaints. Create a 95% confidence interval
for the true level of dissatisfaction among customers. Does this provide evidence that the company has
reached its goal?
ASSUMPTIONS:
TEST:
HYPOTHESES:
MATH:
CONCLUSION:
Practice Multiple Choice:
Lessons 8.3 and 9.3 - One Sample Means
A couple of items to address:
Why are we using a t-distribution instead of the normal distribution?
What are the similarities and differences between the distributions?
What does df stand for?
What is t* for the following situations?
99% confident with a sample size of 22
95% confident with a sample size of 5
90% confident with a sample size of 10
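Critical values t* normally come from the t table or the calculator's invT, using df = n − 1. For reference, here is a standard-library sketch that approximates t* numerically. The Simpson's-rule integration and bisection search are implementation choices made for this illustration, and the example (95% confidence, n = 11) is hypothetical.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_star(conf, df):
    """Critical value with area `conf` between -t* and +t*, found by bisection."""
    target = (1 + conf) / 2
    lo, hi = 0.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical case: 95% confidence with a sample of size 11, so df = 10.
print(f"t* = {t_star(0.95, 10):.3f}")
```

For the same confidence level, t* is larger than z* and shrinks toward it as df grows.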
Finding the Sample Size
Example 1: How large a sample would you need to estimate the mean body temperature to within 0.1 degrees
with 98% confidence? Suppose you have taken a sample previously and know that s = 0.6824.
Example 2: How large a sample would you need to estimate the mean glove size to within 0.3 cm with
95% confidence? Suppose you have taken a sample previously and know that s = 0.04.
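For a mean, solving ME = (critical value) · s / √n for n gives n = (critical value · s / ME)², rounded up. The sketch below assumes z* is used as the critical value, a common simplification because t* depends on the sample size we are trying to find. The numbers s = 2.0 and ME = 0.5 are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(margin, conf, s):
    """Approximate n needed for a given margin of error, using z* in place of t*."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    return ceil((z_star * s / margin) ** 2)

# Hypothetical target: estimate a mean to within 0.5 units at 95% confidence, s = 2.0.
print(sample_size_for_mean(0.5, 0.95, 2.0))
```

Because t* is a little larger than z*, the true requirement may be slightly higher than this estimate, especially for small samples.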
P-values and Significance Levels
Say you are testing the following hypotheses:
Ho: μ = 1000
Ha: μ > 1000
You have taken a random sample of 15 people and know that their average was 1036 and the
standard deviation was 75.
Find the standardized test statistic and p-value:
What is your conclusion at the 0.05 significance level?
We would like to make a corresponding confidence interval that gives similar results.
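The test statistic here is t = (x̄ − μ₀) / (s / √n) with df = n − 1. The sketch below computes it for the sample above and approximates the upper-tail P-value by numerically integrating the t density. The integration routine is an assumption made for this illustration; on the calculator, tcdf would do the same job.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, via Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

mu_0, x_bar, s, n = 1000, 1036, 75, 15    # values from the problem statement
t_stat = (x_bar - mu_0) / (s / sqrt(n))   # standardized test statistic
p_value = t_upper_tail(t_stat, n - 1)     # upper tail, because Ha is mu > 1000

print(f"t = {t_stat:.3f}, df = {n - 1}, P-value = {p_value:.4f}")
```

Compare the P-value with 0.05 to decide whether to reject Ho.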
Example 3:
A coffee machine dispenses coffee into paper cups. You're supposed to get 10 ounces of coffee, but the amount
varies slightly from cup to cup. Here are the amounts measured in a random sample of 20 cups:
9.9  9.9  9.9  10  10  9.7  9.6  9.6  9.9  9.5
10  9.8  10.2  9.5  9.7  10.1  9.8  9.8  9.9  10.1
Find a 95% confidence interval for the average amount of coffee the machine dispenses.
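A one-sample t-interval has the form x̄ ± t* · s / √n. The sketch below takes raw data and a t* that has already been looked up in the table. The five measurements are hypothetical stand-ins (not the coffee data), and t* = 2.776 is the table value for 95% confidence with df = 4.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_star):
    """Return (low, high) for a one-sample t-interval, given a looked-up t*."""
    n = len(data)
    x_bar = mean(data)
    margin = t_star * stdev(data) / sqrt(n)   # stdev() is the sample standard deviation s
    return x_bar - margin, x_bar + margin

# Hypothetical measurements in ounces; n = 5, so df = 4 and t* = 2.776 at 95%.
sample = [9.8, 10.1, 9.9, 10.0, 9.7]
low, high = t_interval(sample, 2.776)
print(f"({low:.4f}, {high:.4f})")
```

For the 20 cups above, the same function applies once t* for df = 19 has been looked up.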
Example 4: A company has set a goal of developing a battery that lasts over 5 hours (300 minutes) in continuous
use. In a first test of 12 of these batteries the following lifespans (in minutes) were recorded:
270  326  321  281  295  336
332  311  351  253  311  288
Is there evidence that the company has met its goal?
Example 5: An IRS representative claims that the average deduction for medical care is $1250. A taxpayer who
believes that the real figure is lower samples 12 families and comes up with a mean of $920 and a standard
deviation of $616.
Is this evidence that the average deduction for medical care is indeed lower than $1250?