Math 3680 Lecture #20

Polling

As we all know, chance error is inevitable in a public opinion poll (or any other kind of poll, for that matter). For example, in 2004, the Gallup poll predicted that President Bush would garner 49.0% of the popular vote. In fact, he had 51.1% of the vote, for an error of -2.1 percentage points.

For the Gallup poll in 2004,
Population percentage, $p$ = 51.1% (parameter)
Sample percentage, $\hat{p}$ = 49.0% (statistic)
We typically represent parameters by Greek letters.

Recall that the magnitude of the chance error can be estimated (using the conservative estimate p ≈ 0.5) by

Standard Error = $0.5/\sqrt{n}$

For example, if the survey has size n = 400, then the chance error is likely to have magnitude around $0.5/\sqrt{400} = 0.025$, or 2.5%.

To determine the outcome of the next election, it is impractical to ask the entire population, unless you actually hold the election. Instead, a sample is chosen, and the results from the sample are extrapolated to the population. Such an extrapolation is only reasonable when the sample is representative, which is the topic of today's lecture.

Example. Suppose two hormonally-charged 16-year-old boys are asked to conduct a mall survey to determine the demographics of mall patrons. Just who do you think will be overrepresented in the sample?

Big Idea: Sample percentage = Population percentage + chance error + bias

A Simple Example: Explain the roles of bias and chance error when attempting to measure the height of a toddler.

TYPES OF BIAS THAT CAN OCCUR IN A PUBLIC OPINION POLL
• Sample bias: the sample does not represent the population
• Response bias: the phrasing of the questions influences the answers
• Non-response bias: a high rate of nonparticipation
• Overparticipation: people who say they intend to vote but do not
• Undecided: people who have no idea how they’ll eventually vote

Imagine a pot of soup. By tasting a tablespoonful from the pot, can you tell what the soup tastes like?
Answer: it depends. If the soup was well stirred, even a small sample will give a good sense of the flavor of the whole pot. On the other hand, if the soup is not stirred well, a small sample is not likely to be representative.

It’s the same with polling. A carefully designed survey can effectively estimate a population parameter even with a relatively small sample. On the other hand, a survey that includes bias can be useless even if a large proportion of the population is polled.

Literary Digest poll of 1936
Prediction: Roosevelt 43%; Actual: 62%
Problems:
1. The sample was not representative of America’s population: sample bias.
2. The Digest poll was sent to 10 million people; only 2.4 million responded. Thus, the results were subject to non-response bias.

This high rate of non-response is typical of surveys of convenience, like call-in polls or the daily poll on CNN’s Web page. You will often hear journalists apply the caveat “This is a nonscientific poll, but the results are…” Whenever you hear or read such a statement, replace the word nonscientific with worthless.

Question: for a call-in poll, why isn’t the sample representative of the population?

NON-RESPONSE BIAS
Non-respondents can be very different from respondents. In practice, low-income and high-income people tend not to respond to questionnaires, so the middle class is overrepresented. For this reason pollsters prefer interviews, which have a higher response rate (65%) than questionnaires (25%). However, non-response bias is still a concern in modern polling, which is why (in a telephone poll) the pollster will call back up to 3 times and call in the evening (or during the day or on weekends).

Non-response bias (for example, people hanging up when called to participate in a survey) has been on the rise in the past 20 years. Why?

Gallup Poll of 1948 (predicted vote share)
Truman    Dewey    Thurmond    Wallace
44%       50%      2%          4%

Problem: quota sampling. For example, the interviewer in St. Louis had to survey 13 people.
Of these 13 people:
• 6 women, 7 men
• of the 7 men: 1 black, 6 white
• monthly rent for the 6 white men: 1 of $44.01 or more, 3 of $18.01-$44, 2 of $18.00 or less

In this way, the sample is forced to match the characteristics of the population (collected from the Census Bureau). What potential bias exists in this method of sampling?

ACCURACY OF THE GALLUP POLL
Year  Winner      Final Survey  Results  Deviation
2008  Obama       55.0%         52.6%    +2.4%
2004  Bush        49.0%         51.1%    -2.1%
2000  Bush        48.0%         47.9%    +0.1%
1996  Clinton     52.0%         50.1%    +1.9%
1992  Clinton     49.0%         43.2%    +5.8%
1988  Bush        56.0%         53.9%    +2.1%
1984  Reagan      59.0%         59.1%    -0.1%
1980  Reagan      47.0%         50.8%    -3.8%
1976  Carter      48.0%         50.0%    -2.0%
1972  Nixon       62.0%         61.8%    +0.2%
1968  Nixon       43.0%         43.5%    -0.5%
1964  Johnson     64.0%         61.3%    +2.7%
1960  Kennedy     51.0%         50.1%    +0.9%
1956  Eisenhower  59.5%         57.8%    +1.7%
1952  Eisenhower  51.0%         55.4%    -4.4%
1948  Truman      44.5%         49.9%    -5.4%
1944  Roosevelt   51.5%         53.3%    -1.8%
1940  Roosevelt   52.0%         55.0%    -3.0%
1936  Roosevelt   55.7%         62.5%    -6.8%

As most everyone will remember, not every major poll has gone well in modern times. For example, in 1994, there were three major political contests in the New York area:
NJ governor: Florio (D) v. Whitman (R)
NYC mayor: Dinkins (D) v. Giuliani (R)
NY governor: Cuomo (D) v. Pataki (R)
The Saturday before the election, Channel 2 (WCBS) in New York predicted that the three Democratic candidates would all win by at least 10 percentage points. When the election was held, all of the Republican candidates won by more than 10 percentage points.

Ideal: Taking a simple random sample of the population. That is, the pollster draws tickets at random without replacement from a box of tickets. Every ticket has an equal chance of being drawn, so the interviewer has no discretion as to whom they interview. The law of averages thus dictates that the sample percentage is likely to be close to the population percentage.

Problem: In real life, there is no “master list” with the names of all of the millions of Americans who will participate in the next election.
Also, even if such a list existed, the potential respondents would be dispersed all over the country, and the cost of doing personal interviews would be too high. Besides, even if sample bias were eliminated in this way, there are still plenty of other types of bias to worry about.

Richard Morin, “Choice Words,” Washington Post (1/10/99)
Wall Street Journal (2/8/99)

Exit poll mania spread through media and campaign circles Tuesday afternoon after first wave of morning data showed Kerry competitive in key states.... National Election Pool -- representing six major news organizations -- shows Kerry in striking distance -- with small lead -- in Florida and Ohio, sources tell DRUDGE... But early sample was based on a 59-41 women to men ratio... MORE... Senate races: Thune +4 Castor +3 Burr +6 Bunning +6 Coburn +6 Demint +4 Salazar +4...
KERRY FINDS COMFORT IN FIRST BATCH OF EXIT POLLS; BOTH CAMPS URGE CAUTION
November 2, 2004 at 8:38 PM EST

‘I'm Going to Learn’
By Evan Thomas
January 10, 2005
It was a little after 7 p.m. on election night 2004. The network exit polls showed John Kerry leading George Bush in both Florida and Ohio by three points.
Kerry's aides were confident that the Democratic candidate would carry these key swing states; Bush had not broken 48 percent in Kerry's recent tracking polls. The aides were a little hesitant to interrupt Kerry as he was fielding satellite TV interviews in a last get-out-the-vote push. Still, the 7 o'clock exit polls were considered to be reasonably reliable. Time to tell the candidate the good news. Kerry had slept only two hours the night before. He was sitting in a small hotel room at the Westin Copley (in a small irony of history, next door to the hotel where his grandfather, a boom-and-bust businessman, shot himself some 80 years ago). Bob Shrum, Kerry's friend and close adviser, couldn't resist the moment. "May I be the first to say 'Mr. President'?" said Shrum.

“Choice Words,” Washington Post (1/10/99)

R. Morin, “Campaign Trail Oversight,” Washington Post (2/27/00)
The media’s current infatuation with Big Picture stories, written in the narrative voice and awash in prediction, encourages journalists and opinion writers to simplify complex and fast-moving events, or to draw broad conclusions from scattered, conflicting or otherwise confusing facts.
Will Michigan Democrats and independents stand by John McCain in November?
The correct answer is, of course, who knows? But just try saying that on national network news.

Of course a good narrative needs tension and action. And that’s why reporters love to tell stories like these:

“New Hampshire typically is keeping the country guessing with a too-close-to-call Republican race.”
Bob Edwards, NPR, day before NH primary

“On the eve of a South Carolina primary that’s too close to call, both candidates were beginning to focus on the mechanics of getting voters to the polls.”
AP, day before SC primary

“The race is too close to call.”
Alison Stewart, ABC, day before MI primary

Let the record show that
• McCain comfortably beat Bush by 8 percentage points in Michigan.
• Bush beat McCain by 11 points in South Carolina, and
• McCain won by 18 points in New Hampshire.
Voters might be forgiven if they wonder: How close do you have to get to see a wipeout coming?

…[A] word of advice for my colleagues: If you must write those Big Picture stories awash in narrative and dramatic sweep, then remember to use only the tastiest words. Chances are you’ll be eating them.

Dana Blanton, “FOX News Poll: Obama Leads in New Hampshire, Clinton Slips to Second,” Fox News.com (1/7/08)
In the new post-Iowa Caucuses world of politics, Barack Obama is now the front-runner in [tomorrow’s] New Hampshire Democratic primary, with Hillary Clinton in second place behind the Hawkeye State winner and John Edwards unchanged in third place, according to a poll released by FOX News on Monday.
Obama received a nice bump from his performance in the Iowa caucuses. Obama now captures the support of 32 percent of likely Democratic primary voters in New Hampshire, up from 25 percent in mid-December and Clinton receives the backing of 28 percent today, down from 34 percent (December 11-13). Edwards also received a bit of help from Iowa and is now at 18 percent, up from 15 percent last month.
The telephone poll was conducted for FOX News by Opinion Dynamics Corp.
among 500 likely Democratic primary voters in New Hampshire from Jan. 4 to Jan. 6. The poll has a 4-point error margin. Howard Kurtz, “Media Blow It Again,” washingtonpost.com (1/9/08) Let's review yesterday's papers… Boston Herald: "She's So Yesterday," with a cover shot of the old Beatles record. That was then. This is now…. at 10:31, MSNBC projected Hillary as the winner. CNN and Fox followed suit 15 minutes later, and the scrambling began. Spin was modified, explanations revised. "One of the greatest political upsets in American political history," Tim Russert said. "Bill Clinton helped her in the end," Fox's Bill Kristol said. CNN's Anderson Cooper questioned whether people lied to pollsters about supporting an African-American candidate. MSNBC's Joe Scarborough raised the same issue. Washington Post columnist Gene Robinson said he didn't think it was a major factor. But a lot of people made up their minds in the last 24 hours — too late to be caught by the almighty polls. Jon Cohen and Jennifer Agiesta, “Polls Were Right About McCain but Missed the Call on Clinton’s Primary Win,” WP (1/9/08) While pre-election polls in New Hampshire got Sen. John McCain's margin of victory about right on the Republican side, late polls fundamentally mischaracterized the status of the Democratic race. Polls released in the two days before the election had Sen. Barack Obama (Ill.) with a five- to 13-percentage-point lead over Sen. Hillary Rodham Clinton (N.Y.) in the Granite State, but Clinton defeated Obama, 39 percent to 36 percent. [Edwards: 17 percent.] Most polls accurately reflected the large bloc of likely Democratic voters yet to make up their minds or who said they were open to switching their support in the closing days. On the network exit poll, nearly 4 in 10 said they made their final decision within the last three days; 17 percent said they decided how to vote yesterday. 
Among those making up their minds on the day of the primary, 39 percent supported Clinton, 36 percent Obama. Clinton did even better among the third of the electorate who settled on their choice a month or more ago.

Evaluate the credibility of the following poll results based on the given news article.

Schwarzenegger Poll Validity Questioned
Wednesday, April 13, 2005
For many Californians, Republican Gov. Arnold Schwarzenegger came to office in 2003 as something of a knight in shining armor following the recall of former Gov. Gray Davis. But these days, poll numbers show that his attempts to reform state government are not wildly popular. A San Jose State University survey shows Schwarzenegger's approval rating below 50 percent for the first time since he took office. Pollster Phil Trounstine led the research.
“He's too interested in gimmicks, public relations and image.”
“I think things in California are generally going in the right direction or are they seriously off on the wrong track.”

2004 Exit Poll, Question J
"Which ONE issue mattered most in deciding how you voted for president?"
• Education, 4%
• Taxes, 5%
• Health Care, 8%
• Iraq, 15%
• Terrorism, 19%
• Economy and Jobs, 20%
• Moral Values, 22%

Daily Telegraph (UK) poll of British Muslims a week after the 7/7/05 London bombings

Roger Clemens squared off against his former trainer in nearly five hours of testimony before a House committee on Wednesday [2/12/08]. Please answer a few questions on what you took away from the proceedings.
Who was more convincing in his testimony in Wednesday's hearing?
Roger Clemens / Brian McNamee
Who did you believe was telling the truth prior to the hearing?
Roger Clemens / Brian McNamee
Considering the Republicans in the hearing sided generally with Clemens and the Democrats with McNamee, do you believe partisan politics played a role in the questioning?
Yes / No
Should Clemens be investigated for perjury based on his testimony under oath in the hearing and his deposition?
Yes / No
Should Andy Pettitte have been excused from the hearing?
Yes / No
Should Pettitte be suspended for his further admission of HGH use in his deposition?
Yes / No
Do you approve of Congress spending time and taxpayer money on investigating steroids in sports?
Yes, they have helped clean up sports / No, the congressmen are grandstanding
Would you like to see Congress continue its hands-on oversight of MLB and other professional sports leagues?
Yes, the sports can't police themselves / No, Congress has better things to do

For future reference, the following brief article gives good practical advice about how to conduct a survey:
Rachel Ivie and Roman Czujko, “What’s your survey telling you?” Physics Today (November 2007), pp. 78-79.

SUMMARY
• Statistic = Parameter + Chance Error + Bias
• Bias may lead to incorrect conclusions.
• To minimize sample bias, a probability method uses an objective chance process to construct the sample.
• Large samples do not preclude the possibility of bias.
• Relatively small samples that are properly constructed can be used to predict the behavior of a population of millions. For example, the Gallup poll has correctly called the winner for 50 years.
• Even if a sample is properly chosen, bias may result when soliciting information from the sample.
• Practicing statistics is a complicated mix of art and science.
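
A numerical illustration of the first and fourth summary points may help. The Python sketch below is not part of the lecture. It computes the conservative standard error 0.5/√n used earlier, then compares a small unbiased sample with a very large sample that suffers from non-response bias. The population percentage (55%), the two response rates (20% and 40%), and all function names are hypothetical choices made only for this illustration, and independent draws stand in for a simple random sample from a very large population.

```python
import math
import random


def standard_error(n, p=0.5):
    """Standard error of a sample proportion for a sample of size n.

    With the conservative choice p = 0.5 this reduces to 0.5 / sqrt(n).
    """
    return math.sqrt(p * (1 - p) / n)


def simple_random_sample_estimate(true_p, n, rng):
    """Sample proportion from n independent draws: chance error only, no bias."""
    hits = sum(rng.random() < true_p for _ in range(n))
    return hits / n


def biased_sample_estimate(true_p, n, rng, respond_yes=0.2, respond_no=0.4):
    """Sample proportion when supporters and opponents respond at different rates.

    respond_yes and respond_no are hypothetical response rates. People who do
    not respond are simply replaced by the next contact until n responses
    have been collected, which is how non-response bias enters.
    """
    yes = 0
    collected = 0
    while collected < n:
        supports = rng.random() < true_p
        rate = respond_yes if supports else respond_no
        if rng.random() < rate:
            collected += 1
            yes += supports  # True counts as 1, False as 0
    return yes / n


if __name__ == "__main__":
    rng = random.Random(3680)  # fixed seed so the run is repeatable
    true_p = 0.55              # hypothetical population percentage

    print(f"SE for n = 400: {standard_error(400):.4f}")
    small_fair = simple_random_sample_estimate(true_p, 400, rng)
    large_biased = biased_sample_estimate(true_p, 100_000, rng)
    print(f"fair sample, n = 400:       {small_fair:.3f}")
    print(f"biased sample, n = 100,000: {large_biased:.3f}")
```

Under these assumed response rates the biased sample settles near 0.55(0.2) / [0.55(0.2) + 0.45(0.4)] ≈ 0.38 no matter how large n becomes, while the sample of 400 typically lands within a few percentage points of 0.55. As a side check, if the 4-point margin quoted for the FOX News poll is read as roughly two standard errors (an assumption, since the article does not say how it was computed), then 2 × 0.5/√500 ≈ 0.045 is consistent with it.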