Transcript

Slide 1
Statistical Sampling Part III – Confidence Intervals
Slide 2
This video is designed to accompany pages 41-76 of the workbook “Making Sense of Uncertainty:
Activities for Teaching Statistical Reasoning," a publication of the Van-Griner Publishing Company.
Slide 3
Let’s take a look at a CBS News/New York Times poll concerning gun laws. Here is what the article says:
As the president outlined sweeping new proposals aimed to reduce gun violence, a new CBS News/New
York Times poll found that Americans back the central components of the president's proposals,
including background checks, a national gun sale database, limits on high capacity magazines and a ban
on semi-automatic weapons. Asked if they generally back stricter gun laws, more than half of
respondents - 54 percent - support stricter gun laws …. That is a jump from April - before the Newtown
and Aurora shootings - when only 39 percent backed stricter gun laws but about the same as ten years
ago.
…
(Emphasize the underlined part)
This poll was conducted by telephone from January 11-15, 2013 among 1,110 adults nationwide. Phone
numbers were dialed from samples of both standard land-line and cell phones. The error due to sampling
for results based on the entire sample could be plus or minus three percentage points.
In what sense does this statement about sampling error make us better informed?
Slide 4
Our goal is to develop a mature understanding of what sampling error is. To do this, we will need to
become familiar with some of the mathematical theory on which statistical science is built.
“Sampling error” or “error due to sampling” is also commonly referred to as “margin of error.” You may
well have heard about margin of error a long time ago, but even if you did, there are likely things here
for you to learn. Indeed, the well-known polling organization Harris Polls claimed in a
November 2007 article on BusinessWire that "'Margin of Error', When Used by Pollsters, Is Widely
Misunderstood and Confuses Most People."
Those are pretty strong words coming from a polling organization. The fact is, however, a solid
understanding of margin of error requires us to embark on a journey that is, in many ways, much more
subtle than most people suspect.
Slide 5
As the King said to the White Rabbit: “Begin at the beginning and go on till you come to the end: then
stop.”
For our beginning, we need to acknowledge a fundamental problem. Another, equally well-chosen
sample of size 1,110 adult Americans, asked the same question, would almost surely not yield the same
sample percentage (54%) who favor stricter gun laws.
Nor do we know if that would even be a good thing since we have no way of knowing for sure if the true
proportion of all adult Americans who favor stricter gun laws – the parameter of interest – is even close
to 0.54.
Slide 6
The variability seen in a statistic from sample to sample is called “sampling variability.”
We might ask, if a sample statistic is going to vary from sample to sample, then which one is the correct
one? As long as every one of the samples was taken correctly – as a probabilistic sample – then the
answer is that all of the statistics are correct.
A better question would be “how can we do a decent job of even estimating the parameter of interest in
the face of this kind of variability?”
The answer to this last question is the essence of elementary statistical inference.
Slide 7
Meaningful, useful parameter estimation depends largely on the following fact: with SRS-like samples,
the variability in a statistic from sample to sample is predictable, in some senses “known in advance” for
all such samples. Think about this some. It is a mathematical statement in disguise, but a critical
statement nonetheless.
With samples that are not SRS-like – really, not probabilistic samples – such as convenience samples,
then sampling variability is not predictable and, as such, even elementary parameter estimation is hard
to do with any integrity.
Slide 8
In what sense is the sampling variability that emerges from SRS-like samples predictable?
Understanding the answer depends on you remembering what a histogram is. If you don’t recall, you
may want to pause this video and do a quick web search to remind yourself.
Here is what we know. Suppose we were to take many different samples of size n, and for each sample
record the statistic of interest, say the proportion in the sample who said “yes” when asked “Are you in
favor of same-sex marriages?” If we were to plot all of those sample proportions in a simple histogram
then we would be guaranteed the following. The histogram would be bell-shaped and the peak of the
histogram would be above the parameter from the population.
Think about this. It is much more subtle than it often appears the first time it is encountered. The
upshot is not a recommendation in any sense to take many different samples of size n. However, it is a
powerful piece of mathematical insight. It says, if we were (emphasized) to take many different samples
of size n and look at the sampling variability by way of a simple histogram, then the variability would not
be erratic. It would follow the shape of a bell, and, moreover the peak of the bell would be above the
population parameter.
This histogram is called a “sampling distribution.”
Slide 9
Let’s have another look at sampling distributions. Suppose you took an SRS of 80 people, asked “Are
you in favor of same-sex marriages?” and recorded the proportion who said “yes.” Suppose, also, that you
took 24 more samples of size 80 and did the same thing. In the end you will have 25 sample
proportions.
A histogram of those 25 proportions would look something like this. It would be bell-shaped and peak
above the parameter in the population. In this case, the parameter appears to be about 2/3.
A word of caution is in order. It is true that constructing a sampling distribution would be a way of
getting a handle on what the parameter really is. But this is never how the parameter is actually
estimated. Rather, what statistical inference does is to take advantage of the knowledge that the
sampling distribution has to look this way, if one were to take the time and undergo the expense to
construct it, and uses that knowledge to make meaningful statements about the parameter just based
on the one sample of size n that was taken. That’s what is awesome about elementary inference.
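If you would like to see this mechanism in action, here is a minimal Python sketch (not part of the workbook) that simulates the scenario just described: 25 SRS-like samples of size 80 drawn from a hypothetical population in which two-thirds would say "yes." The value 2/3 and the crude text histogram are assumptions made purely for illustration.

```python
import random

random.seed(1)

p_true = 2 / 3      # hypothetical population proportion (assumed for illustration)
n = 80              # sample size, as in the slide
num_samples = 25    # number of repeated samples, as in the slide

# Draw 25 independent samples of size 80 and record each sample proportion.
p_hats = []
for _ in range(num_samples):
    yes_count = sum(1 for _ in range(n) if random.random() < p_true)
    p_hats.append(yes_count / n)

# Crude text histogram: bins of width 0.05 covering the likely range of p-hat.
bins = [0.50 + 0.05 * k for k in range(8)]
for lo in bins:
    hi = lo + 0.05
    count = sum(lo <= ph < hi for ph in p_hats)
    print(f"{lo:.2f}-{hi:.2f} | " + "*" * count)
```

Re-running the sketch gives histograms that differ in their details, but the sample proportions consistently pile up near 2/3, which is exactly the behavior the slide describes.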
Slide 10
What kind of mathematical knowledge comes from knowing that a sampling distribution has a particular
bell shape that peaks above the parameter? Let’s answer that by speaking specifically to the sampling
distribution of a sample proportion. We will use the letter “p” to denote the true, unknown population
proportion of interest, and p̂ to denote a prototypical sample proportion.
Mathematics guarantees us that p will be right under the peak of the bell curve. Moreover one can
comment on how likely it is that a particular p̂ will be in any particular range of possible values of p̂ on
the horizontal axis, under the bell. A few such ranges are easy to describe. For example: 95% of all p̂
values based on a simple random sample of size n will fall between the values of p − 1/√n and p + 1/√n.
68% will fall between p − (1/2)(1/√n) and p + (1/2)(1/√n).
And 99.7% of all p̂ values will fall between p − (3/2)(1/√n) and p + (3/2)(1/√n).
This may be a little hard to appreciate at first glance because it may be unclear how to operationalize
the knowledge. Notice, however, that these very specific probability statements can be made
regardless of what p is. You only needed to know n and that your sample was SRS-like.
For example, if n is 100 then 1/√n is 0.10. So regardless of what p is in the application at hand, you know
in advance that about 95% of all possible sample proportions you could observe in samples of size n
have to be within 0.10 of p on either side of p.
This is profound but still has to be operationalized into the idea of a confidence interval.
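As a quick check on the 95% statement above, the following sketch (again illustrative, not from the workbook) draws many samples of size n = 100 for a few assumed values of p and reports how often p̂ lands within 1/√n = 0.10 of p.

```python
import math
import random

random.seed(2)

n = 100
bound = 1 / math.sqrt(n)          # the conservative 95% bound, here 0.10
trials = 10_000                   # number of simulated samples (arbitrary choice)

for p_true in (0.30, 0.54, 0.80): # a few illustrative values of p
    inside = 0
    for _ in range(trials):
        yes = sum(1 for _ in range(n) if random.random() < p_true)
        p_hat = yes / n
        if abs(p_hat - p_true) <= bound:
            inside += 1
    print(f"p = {p_true:.2f}: {100 * inside / trials:.1f}% of p-hats within {bound:.2f} of p")
```

The simulated percentages come out at 95% or a little above, since 1/√n is a deliberately conservative bound that is tightest when p is near one-half.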
Slide 11
The well-known margin of error can now be more carefully defined. If you are estimating a proportion
p using the sample proportion from an SRS-like sample, then a 95% margin of error is roughly 1/√n. The
“95%” is often suppressed in the media and just the words “margin of error” are used. A 99.7% margin
of error would be (3/2)(1/√n).
What is the role of this percentage? The MOE is operationalized in what is called a “confidence
interval” and the percentage being used is the “confidence.” For example, a 95% confidence interval is
a range in which you have 95% confidence of having captured p. The range is formed by constructing p̂
plus or minus the 95% margin of error, or p̂ ± 1/√n.
The correct interpretation of such an interval is very important to us and we will pay close attention to
that below. We will also make sure we understand better how this interval comes from the sampling
distribution. But first let’s practice constructing one.
Slide 12
Recall the survey on gun laws. The survey was of 1,110 adults and reported a sampling error of +/- 3
percentage points. Where did this come from?
(Hit Return)
Since the sample size is 1,110 and the (unstated) confidence level is 95%, we should be able to reproduce
the 3% by forming 1 divided by the square root of 1,110. Notice that this works. In fact, a 95% confidence
interval would be 0.54 plus or minus 0.03.
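The arithmetic behind the reported three percentage points is easy to reproduce; here is a tiny sketch using the sample size and sample proportion quoted from the poll.

```python
import math

n = 1110         # sample size from the CBS News/New York Times poll
p_hat = 0.54     # reported sample proportion favoring stricter gun laws

moe_95 = 1 / math.sqrt(n)                 # conservative 95% margin of error
print(f"95% margin of error: {moe_95:.3f}")                      # about 0.030
print(f"95% CI: ({p_hat - moe_95:.3f}, {p_hat + moe_95:.3f})")   # about (0.510, 0.570)
```

The printed margin is about 0.030, matching the plus or minus three percentage points reported in the article.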
Slide 13
A confidence interval for a population proportion based on a sample proportion from an SRS-like sample
can be computed for any level of confidence. Here is the formula for eight common levels. To use this,
just identify the confidence level (say 70%) and then take the corresponding z* value (say 1.04), divide it
by 2 and multiply it by 1 over the square root of n. This will be the margin of error corresponding to
that confidence level. From there the confidence interval is easy to form as phat plus or minus the
margin of error.
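The table of eight z* values appears on the slide rather than in this transcript, so the sketch below simply computes z* from the normal distribution for a few confidence levels (the 70% level does come out to the 1.04 mentioned in the narration). The helper name margin_of_error is my own label, not workbook notation.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence, n):
    """Conservative margin of error for a sample proportion: (z*/2) * (1/sqrt(n))."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return (z_star / 2) * (1 / math.sqrt(n))

n = 1110
for conf in (0.70, 0.90, 0.95, 0.997):
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    moe = margin_of_error(conf, n)
    print(f"{conf:.1%} confidence: z* = {z_star:.2f}, margin of error = {moe:.3f}")
```

From there the confidence interval is p̂ plus or minus the printed margin, exactly as described above.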
Slide 14
Remember, if the sample is not an SRS-like sample, then the formulas we’ve seen for the margin of error
and corresponding confidence intervals don’t apply. If they are used regardless then you have created
an assessment of where p might be that is not only meaningless, but deceptively so.
Slide 15
Finally, it is important to understand the correct interpretation of a confidence interval. It is common to
say that a 95% confidence interval is a range in which there is a 95% chance that the true, unknown
parameter p will be. That’s not quite the case and we don’t want to settle for an interpretation that
loose.
To access the correct interpretation we need to revisit our bell curve. This sampling distribution
describes where the sample proportion phat is likely to fall. Now think of phat as being the center of an
interval whose width is two times a specified margin of error. This is precisely what a confidence
interval is (for example: p̂ ± 1/√n). But if you think of this center, the phat, as being the random part of
the interval, then the sampling distribution tells you how likely you are to have an interval like this one
cover the parameter p. That is, as long as phat is not more than 1 over the square root of n on either
side of p, then the interval will be wide enough to cover p. But the sampling distribution tells you that
95% of all phats will be that close to p. Similarly for all other levels of confidence.
Slide 16
So what is the correct interpretation of a confidence interval? Have a look at the animated graphic
shown here.
(Hit return twice)
Let’s suppose we are taking SRS-like samples from a population where, unbeknownst to us, 79% of the
population would say “Yes” to the question “Are you in favor of gay marriage?” The parameter is
indicated by the dashed vertical line segment. The horizontal line segment is just a part of the number
line showing the values of possible sample proportions.
(Return)
Let’s suppose you take a sample of size n, ask the question above, and find that about 90% in that
sample say “Yes.”
(Return)
Now suppose you compute the corresponding margin of error and create a simple 95% confidence
interval around that sample value. Notice that the interval does not contain the parameter.
(Return)
What about other samples of size n? They might all yield different phat values.
(Return)
And as such they might all yield distinct confidence intervals, though all the intervals will be the same
width of course. Notice that some of those intervals contain the parameter and some don’t.
(Return)
The fact is, if the intervals were 95% confidence intervals, then about 95% of the samples of size n would
produce intervals that would contain the parameter. Think about this. This is true even when you
(obviously) don’t know the parameter. What’s tricky is that you don’t know if the interval you have
computed or are reading from the news clip contains the parameter or not. But in the careful sense
mentioned above you can quantify the chance that it does.
So the correct interpretation of a K% confidence interval based on a sample of size n is as follows:
about K% of all samples of size n will produce confidence intervals that contain the parameter.
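To see this interpretation numerically, here is an illustrative simulation using the 79% population value from the animation; the sample size of 100 and the number of repetitions are assumptions, since the slide only says "size n."

```python
import math
import random

random.seed(3)

p_true = 0.79     # the population value used in the slide's animation
n = 100           # sample size (the slide just says "size n"; 100 is an assumed value)
trials = 10_000   # number of repeated samples (arbitrary)

moe = 1 / math.sqrt(n)   # conservative 95% margin of error
covered = 0
for _ in range(trials):
    yes = sum(1 for _ in range(n) if random.random() < p_true)
    p_hat = yes / n
    if p_hat - moe <= p_true <= p_hat + moe:
        covered += 1

print(f"{100 * covered / trials:.1f}% of the {trials} intervals contain p = {p_true}")
```

Because 1/√n is a slightly conservative margin, the simulated coverage is at least 95%, and it runs somewhat higher when p is far from one-half, as it is here.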
Slide 17
This concludes our video on confidence intervals. Remember, simple formulas are available for the
margin of error and associated confidence intervals, provided the data were collected via a simple
random sample or in a similarly statistically correct fashion.