Statistical Sampling Part III – Confidence Intervals

This video is designed to accompany pages 41-76 in Making Sense of Uncertainty: Activities for Teaching Statistical Reasoning, Van-Griner Publishing Company.

Error Due to Sampling

Americans and Their Guns
Title: Poll: Majority of Americans Back Stricter Gun Laws
Authors: Sarah Dutton, Jennifer De Pinto, Anthony Salvanto, Fred Backus, Leigh Ann Caldwell
Source: CBS News, January 17, 2013

As the president outlined sweeping new proposals aimed to reduce gun violence, a new CBS News/New York Times poll found that Americans back the central components of the president's proposals, including background checks, a national gun sale database, limits on high capacity magazines and a ban on semi-automatic weapons. Asked if they generally back stricter gun laws, more than half of respondents - 54 percent - support stricter gun laws …. That is a jump from April - before the Newtown and Aurora shootings - when only 39 percent backed stricter gun laws but about the same as ten years ago. …

This poll was conducted by telephone from January 11-15, 2013 among 1,110 adults nationwide. Phone numbers were dialed from samples of both standard land-line and cell phones. The error due to sampling for results based on the entire sample could be plus or minus three percentage points.

Margin of Error
The 3% is called the “margin of error” for the survey.
• You may have heard about this in elementary school.
• You may never have heard about it before now.
• Our goal is to make sure you understand what it really is and isn’t.

Fundamental Problem
Another, equally well-chosen SRS of n = 1,110 adult Americans, asked the same question, will almost surely yield a different $\hat{p}$. And it can be quite different from the 0.54 seen here.

Sampling Variability
The variability seen in a statistic from sample to sample is called “sampling variability.”
Question: Which statistic is correct?
Answer: They are all correct!
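Sampling variability can be seen directly in a small simulation. The sketch below is illustrative only: it assumes a hypothetical population in which the true proportion is exactly 0.54 (the poll cannot tell us the true value), and it uses independent yes/no draws as a stand-in for a simple random sample from a very large population. The names `sample_proportion`, `P_TRUE`, and `N` are made up for this example.

```python
import random


def sample_proportion(p_true, n, rng):
    """Simulate one sample of size n and return the proportion saying "yes"."""
    yes_count = sum(1 for _ in range(n) if rng.random() < p_true)
    return yes_count / n


# Assumption for illustration only: the true population proportion is 0.54.
P_TRUE = 0.54
N = 1110  # sample size reported for the poll

rng = random.Random(2013)  # fixed seed so the run is repeatable

# Ten equally well-chosen samples, each asked the same question.
p_hats = [sample_proportion(P_TRUE, N, rng) for _ in range(10)]

for i, p_hat in enumerate(p_hats, start=1):
    print(f"sample {i:2d}: p-hat = {p_hat:.3f}")
```

Each run of the loop gives a slightly different sample proportion, even though the population never changes.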
Better Question: How can you estimate the parameter in the face of this variability?

Mathematics to the Rescue
With SRS-type samples, sampling variability is understandable and predictable. With convenience samples it is not. If the sample is not a probabilistic sample, then it will be very difficult to do formal inference with integrity.

Predictable? How?
1. If you were to do the sampling over and over and plot the different statistics you get ….
2. Then that plot, called a sampling distribution, would exhibit predictable characteristics.
3. In particular, it would be bell-shaped and peak above the parameter from the population.

Predictable? How?
Suppose you take an SRS of size 80 and ask “Are you in favor of same-sex marriages?” Record the proportion who say “Yes.” Put those 80 back in the population and take another sample of size 80. Do this 25 times.

A Sampling Distribution
[Figure: histogram of the 25 sample proportions. Vertical axis: Number of Sample Proportions Observed (0 to 7). Horizontal axis: Interval for Sample Proportion, in bins from .41 to .45 through .91 to .95.]
A histogram of these 25 sample proportions would be roughly bell-shaped and peak above the parameter in the population, which appears to be about 2/3 in this case.

Enormously Useful
Can make quantitative statements about how far $\hat{p}$ is likely to be from p, even if you don’t know p. From the graphic, one can see that 95% of $\hat{p}$’s based on samples of size n will be within $\frac{1}{\sqrt{n}}$ of p.

95% Confidence Interval
If you are estimating a proportion p …
From a simple random sample of size n …
And want 95% “confidence” in your estimate …
Then your margin of error is
$\text{MOE} = \frac{1}{\sqrt{n}}$
And a 95% confidence interval for p is
$\hat{p} \pm \frac{1}{\sqrt{n}}$

Example Revisited
Americans and Their Guns (the CBS News/New York Times poll of 1,110 adults, with $\hat{p} = 0.54$). A 95% confidence interval for p is
$0.54 \pm \frac{1}{\sqrt{1110}} = 0.54 \pm 0.03$

Changing Confidence?
$\hat{p} \pm \frac{z^*}{2} \cdot \frac{1}{\sqrt{n}}$

Level of Confidence    z*
50%                    0.67
60%                    0.84
70%                    1.04
80%                    1.28
90%                    1.64
95%                    2.00
99%                    2.58
99.9%                  3.29

No Probabilistic Sample?
$\hat{p} \pm \frac{z^*}{2} \cdot \frac{1}{\sqrt{n}}$
Numbers can still be produced from this type of formula since plugging in is easy! But if the samples were not SRS-like, then those numbers are meaningless.

A Technical Interpretation
This kind of plot describes where $\hat{p}$ is likely to fall, relative to p, for any SRS of size n.
Since 95% of $\hat{p}$’s based on samples of size n will be within $\frac{1}{\sqrt{n}}$ of p, it follows that for 95% of all the $\hat{p}$’s observed, the range $\hat{p} \pm \frac{1}{\sqrt{n}}$ will be wide enough to cover p.

Illustration of Interpretation
[Figure: confidence intervals drawn against an axis running from 0.60 to 0.90, with the parameter p = 0.79 marked.]
“95% confidence” means that 95% of a long list of these intervals would contain the parameter!

One-Sentence Reflection
Simple formulas are available for the margin of error and associated confidence intervals, provided the data were collected by a simple random sample or in a similarly statistically sound fashion.
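The margin-of-error formula and the "long list of intervals" interpretation can both be sketched in a few lines. This is an illustration under stated assumptions, not a definitive implementation: the function names (`margin_of_error`, `confidence_interval`, `coverage`) are invented for this example, the z* values are copied from the table in the slides, and the coverage check uses a hypothetical sample size of n = 100 with p = 0.79 taken from the illustration. Independent yes/no draws stand in for a simple random sample from a large population.

```python
import math
import random

# z* values copied from the "Changing Confidence?" table in the slides.
Z_STAR = {50: 0.67, 60: 0.84, 70: 1.04, 80: 1.28,
          90: 1.64, 95: 2.00, 99: 2.58, 99.9: 3.29}


def margin_of_error(n, level=95):
    """Margin of error (z*/2) * (1/sqrt(n)) for a proportion, sample size n."""
    return Z_STAR[level] / 2 / math.sqrt(n)


def confidence_interval(p_hat, n, level=95):
    """Return (low, high) = p_hat -/+ margin of error."""
    moe = margin_of_error(n, level)
    return (p_hat - moe, p_hat + moe)


# Gun-law poll from the slides: p-hat = 0.54, n = 1,110.
low, high = confidence_interval(0.54, 1110)
print(f"95% CI: {low:.2f} to {high:.2f}")  # prints "95% CI: 0.51 to 0.57"


def coverage(p_true, n, level=95, repeats=2000, seed=1):
    """Fraction of simulated intervals that contain the true parameter."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # One simulated sample of size n and its sample proportion.
        p_hat = sum(rng.random() < p_true for _ in range(n)) / n
        lo, hi = confidence_interval(p_hat, n, level)
        if lo <= p_true <= hi:
            hits += 1
    return hits / repeats


# Assumptions: p = 0.79 as in the illustration; n = 100 is hypothetical.
rate = coverage(0.79, 100)
print(f"share of intervals covering p: {rate:.3f}")
```

The 1/sqrt(n) margin is a worst-case value (it is widest when p = 0.5), so when the true p is far from 0.5 the simulated share of intervals covering p can come out somewhat above 95%.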