Chapter 22: “Inference for Two Proportions”

This chapter extends inference to the difference between two proportions. The approach, logic, and
interpretations are the same as for one proportion. Only the standard deviation changes, and it is not
hard to find: for the difference of independent random variables, variances add. That allows us to create a
confidence interval for the difference of two proportions. Add one more
idea (pooling the samples to better estimate the common population proportion) and we can test a
hypothesis about the two proportions. TI Tips lead students through the calculator procedures for
two-proportion inference.
Comments
This is the chapter where you should begin to see the benefits of the approach we have taken to
inference. All the pieces are in place. We’ve been working with proportions for three chapters
now. Students can create and interpret a confidence interval and write their conclusions. It’s quite
logical to turn from looking at one proportion to comparing two proportions. Toss in adding
variances to describe the variability in the difference between independent random variables, and
these pieces go together easily. Students see they can handle inference for two proportions quite
well. You’ll find this chapter is as much about reviewing and solidifying what students already
know about inference as it is about learning two-sample procedures.
Looking Ahead
Our first test of students’ understanding of inference follows this chapter, and then we move on
to look at inference for means. The fundamental concepts about confidence intervals and
hypothesis tests are the same – if you’ve seen one confidence interval or hypothesis test you’ve
seen ‘em all. With a firm grasp of the ideas, logic, and meaning of inference, students should
readily embrace inference for one mean in Chapter 23, then compare two means in Chapter 24,
and analyze matched pairs in Chapter 25.
Using proper notation, always important, becomes even more vital now that there are two
groups. Insist that students clearly identify what their notation means. Use subscripts
to distinguish between the two groups, and make them something obvious, like “M” and “F” for male
and female.
Hypotheses. We wonder if the true proportions are the same in the two populations from
which our samples are drawn. As always, a null hypothesis means nothing
unusual is afoot: the two proportions are really the same, and any difference
that showed up in the samples can be explained by sampling error. Students will probably
propose that you make the null hypothesis about equal proportions ($p_1 = p_2$). There’s nothing
wrong with that, and you should write it down. But also show them the alternative way of saying “no
difference”: $p_1 - p_2 = 0$. The fact that we are talking about the difference of two proportions will
be important when we need to find the standard deviation of the sampling model for $\hat{p}_1 - \hat{p}_2$.
Model. Students should list the conditions. They need to check the usual conditions for inference
about proportions for both groups: randomness, less than 10% of the respective populations, and
enough successes and failures. There’s another condition, but don’t write that yet. When the need
to add variances arises, we’ll see that the two groups must be independent. You can return to the
conditions and add that assumption to the list. At that point, students will understand why it’s
important.
Mechanics, part I. Let the students proceed as they usually would: writing down the observed
statistics, drawing the Normal curve, shading the region representing the P-value, and starting
to find z. The numerator is easy: the observed difference in sample proportions minus the
hypothesized difference of 0. The denominator presents the first place they’ll hesitate. The
problem is in knowing what standard deviation to use. (As an aside, you can point out that
standard deviations are always the problem. Once we know how to find the appropriate
standard deviation, inference is usually pretty straightforward.) Let them stumble around for a few
minutes right here. One of your students is likely to remember the “variances add” mantra.
Then you can derive (or show) the formula you need in the denominator and add the
independence requirement to the list of conditions.
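The derivation can be sketched as follows. The sample sizes $n_1$ and $n_2$ are notation assumed here, since the text does not name them, and the first equality holds only when the two groups are independent:

```latex
\mathrm{Var}(\hat{p}_1 - \hat{p}_2)
  = \mathrm{Var}(\hat{p}_1) + \mathrm{Var}(\hat{p}_2)
  = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}

SD(\hat{p}_1 - \hat{p}_2)
  = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}
```

Note that the variances add even though the proportions are subtracted, which is why independence belongs on the list of conditions.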
Mechanics, part II. The formula for the standard deviation of the difference of two independent
proportions requires that we know both population proportions, $p_1$ and $p_2$. We don’t. In the
earlier one-proportion case, we had a hypothesized value to use, but not now. Students will
suggest that we simply find a standard error by substituting our best sample-based estimates, $\hat{p}_1$
and $\hat{p}_2$. Great idea – this shows they have caught on to our basic tactics. But we can make a
slight conceptual improvement here. The null hypothesis is that there is no difference in the
two population proportions. In other words, we are currently operating under the assumption
that $p_1 = p_2$. This means that there is one common population proportion (call it $p$), and that is
the value we should be substituting for both $p_1$ and $p_2$. (Substituting two different estimates
into the formula is defensible, but since we have hypothesized that the values are the same,
it is better to use one common
value.) We don’t know this magical value of $p$. We need to come up with the best possible
estimate. Chances are someone will suggest using the total number of successes and total
number of trials in both samples lumped together. That’s called “pooling” and is the right
approach when we are testing the hypothesis that two proportions are the same.
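In symbols, the pooled estimate and the resulting test statistic might be written as below. The success counts $x_1, x_2$ and sample sizes $n_1, n_2$ are notation introduced here for illustration; the text itself does not name them.

```latex
\hat{p}_{\text{pooled}} = \frac{x_1 + x_2}{n_1 + n_2}

SE_{\text{pooled}}(\hat{p}_1 - \hat{p}_2)
  = \sqrt{\hat{p}_{\text{pooled}}\,(1 - \hat{p}_{\text{pooled}})
          \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}

z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{\text{pooled}}(\hat{p}_1 - \hat{p}_2)}
```

This is the same standard deviation formula as before, with the single pooled value substituted for both $p_1$ and $p_2$.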
Conclusion. Students know what the P-value means and how to link it to
their decision. They will be able to write a good conclusion in the proper context.
And now the confidence interval . . . We just decided that the difference was significant. So
how big might it be? We need a confidence interval. We have already checked the conditions.
The interval is (as always) an estimate plus or minus a margin of error.
Here, that’s the observed difference in proportions plus or minus 1.96 (for 95% confidence) times the standard error
of the difference. We already have the formula for that standard error, but now we no longer
believe the two proportions are the same. Now there’s no justification for pooling. Because we
believe proportions $p_1$ and $p_2$ could be different, we should use the two different estimates
$\hat{p}_1$ and $\hat{p}_2$ to find the standard error. Students will need to think carefully to write a clear statement
that correctly interprets that confidence interval.
Continue to clarify the pooling issue. A short
explanation should make sense. When finding a confidence interval for the difference there is an
implied assumption that the two proportions could well be different. So use different estimates –
don’t pool. When testing the hypothesis that the two proportions are equal, we pretend they are,
so we use the same value – the pooled estimate – for each. This latter point should be seen for
what it is – a technical improvement. The difference between the two SDs isn’t great, but when
we assume from the null hypothesis that the two proportions are equal, we should use that
information.
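To see how small the gap between the two standard errors typically is, here is a minimal Python sketch. The function names and the counts (60 of 100 versus 45 of 100) are hypothetical, invented only for illustration; they do not come from the text.

```python
import math


def se_unpooled(x1, n1, x2, n2):
    """SE of p1_hat - p2_hat using separate estimates (confidence intervals)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    return math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)


def se_pooled(x1, n1, x2, n2):
    """SE using one pooled estimate (testing H0: p1 = p2)."""
    p_pool = (x1 + x2) / (n1 + n2)
    return math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))


# Hypothetical counts, for illustration only: 60 of 100 vs. 45 of 100.
x1, n1, x2, n2 = 60, 100, 45, 100
unpooled = se_unpooled(x1, n1, x2, n2)
pooled = se_pooled(x1, n1, x2, n2)
print(f"unpooled SE = {unpooled:.4f}, pooled SE = {pooled:.4f}")
```

With these made-up counts the two values differ by less than a tenth of a percentage point, which is the sense in which pooling is a technical refinement rather than a different method.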
Test your Understanding
AP Statistics Quiz Chapter 22
Great Britain has a great literary tradition that spans centuries. One might assume,
then, that Britons read more than citizens of other countries. Some Canadians,
however, feel that a higher percentage of Canadians than Britons read. A recent
Gallup Poll reported that 86% of 1004 randomly sampled Canadians read at least
one book in the past year, compared to 81% of 1009 randomly sampled Britons.
Do these results confirm a higher reading rate in Canada?
1. Test an appropriate hypothesis and state your conclusions.
2. Find a 99% confidence interval for the difference in the proportion of
Britons and Canadians who read at least one book in the last year. Interpret
your interval.
We are 99% confident that the proportion of Britons who read at least one book in the past year is
between 0.8 percentage points and 9.3 percentage points lower than the proportion of Canadians
who read at least one book in the past year.
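For checking student work, both quiz questions can be sketched in Python as below. This is one possible setup, not an official answer key: the variable names are assumptions, the one-sided alternative (Canada higher) is our reading of the question, and the reported percentages are used directly rather than reconstructed whole-number counts, so the last digit may differ slightly from a calculator run on counts.

```python
import math
from statistics import NormalDist

# Figures from the quiz: 86% of 1004 Canadians, 81% of 1009 Britons.
p_can, n_can = 0.86, 1004
p_brit, n_brit = 0.81, 1009
diff = p_can - p_brit

# Question 1: test H0: p_can - p_brit = 0 against HA: p_can - p_brit > 0.
# Under H0 the proportions are equal, so pool the two samples.
p_pool = (p_can * n_can + p_brit * n_brit) / (n_can + n_brit)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_can + 1 / n_brit))
z = (diff - 0) / se_pool
p_value = 1 - NormalDist().cdf(z)  # upper-tail area under the Normal model

# Question 2: 99% confidence interval, so do not pool.
se_diff = math.sqrt(p_can * (1 - p_can) / n_can + p_brit * (1 - p_brit) / n_brit)
z_star = NormalDist().inv_cdf(0.995)  # critical value for 99% confidence
lower, upper = diff - z_star * se_diff, diff + z_star * se_diff

print(f"z = {z:.2f}, one-sided P-value = {p_value:.4f}")
print(f"99% CI for p_can - p_brit: ({lower:.3f}, {upper:.3f})")
```

The interval should land close to the range quoted in the answer above; small differences come from rounding.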