3680 Lecture 20 - College of Arts & Sciences

Math 3680
Lecture #20
Polling
As we all know, chance error is
inevitable in a public opinion poll (or
any other kind of poll, for that matter).
For example, in 2004, the Gallup poll
predicted that President Bush would
garner 49.0% of the popular vote. In
fact, he had 51.1% of the vote, for an
error of -2.1%.
For the Gallup poll in 2004,
Population percentage, p = 51.1% (parameter)
Sample percentage, p̂ = 49.0% (statistic)
We typically represent parameters
by Greek letters.
Chance error is inevitable in a public
opinion poll. Recall that the magnitude of
the chance error can be estimated (using
the conservative estimate p ≈ 0.5) by
Standard Error = 0.5 /n
For example, if the survey has size
n = 400, then the chance error is likely
to have magnitude around 0.025, or 2.5%.
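The arithmetic behind this estimate can be checked with a short Python sketch. The function name and the list of sample sizes are illustrative choices, not part of the lecture; the formula is the conservative one with p ≈ 0.5, so the standard error is 0.5/√n.

```python
import math


def conservative_standard_error(n: int, p: float = 0.5) -> float:
    """Standard error of a sample proportion, sqrt(p * (1 - p) / n).

    With the conservative choice p = 0.5 this reduces to 0.5 / sqrt(n).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return math.sqrt(p * (1 - p) / n)


# Illustrative sample sizes; n = 400 is the one used in the lecture.
for n in (100, 400, 1600):
    se = conservative_standard_error(n)
    print(f"n = {n:>4}: standard error is about {se:.2%}")
```

For n = 400 this gives 0.025, or 2.5%. Because n sits under a square root, the sample has to be four times as large to cut the standard error in half.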
To determine the outcome of the next election, it is
impractical to ask the entire population — unless you
actually held the election. Instead, a sample is
chosen, and the results from the sample are
extrapolated to the population. Such an extrapolation
is only reasonable when the sample is representative
— the topic of today's lecture.
Example. Suppose two hormonally-charged
16-year-old boys are asked to conduct a mall survey
to determine the demographics of mall patrons. Just
who do you think will be overrepresented in the
sample?
Big Idea:
Sample percentage =
Population percentage
+ chance error
+ bias
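One way to see the two error terms separately is to simulate many polls of the same population. The sketch below is only an illustration under made-up assumptions: the response rates (supporters answer 50% of the time, non-supporters 60%) are hypothetical, and the population percentage is borrowed from the 2004 example. The average of the simulated sample percentages minus the population percentage estimates the bias; the spread of the simulated percentages around their own average shows the size of the chance error.

```python
import random
import statistics


def run_poll(rng, p_true, n, rate_yes, rate_no):
    """Contact people at random until n of them respond.

    Supporters respond with probability rate_yes, everyone else with
    probability rate_no (both rates are hypothetical).
    """
    responses = []
    while len(responses) < n:
        supports = rng.random() < p_true
        rate = rate_yes if supports else rate_no
        if rng.random() < rate:
            responses.append(supports)
    return sum(responses) / n


def decompose(p_true=0.511, n=400, polls=1000, rate_yes=0.5, rate_no=0.6, seed=0):
    """Return (estimated bias, typical size of the chance error)."""
    rng = random.Random(seed)
    results = [run_poll(rng, p_true, n, rate_yes, rate_no) for _ in range(polls)]
    bias = statistics.mean(results) - p_true
    chance_sd = statistics.pstdev(results)
    return bias, chance_sd


bias, chance_sd = decompose()
print(f"estimated bias: {bias:+.3f}")
print(f"typical chance error: {chance_sd:.3f}")
```

Raising n shrinks the chance error but leaves the bias where it is; setting the two response rates equal removes the bias but not the chance error.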
A Simple Example:
Explain the roles of bias and chance error
when attempting to measure the height
of a toddler.
TYPES OF BIAS THAT CAN OCCUR IN
A PUBLIC OPINION POLL
• Sample bias: the sample does not represent the population
• Response bias: the phrasing of the questions influences the answers
• Non-response bias: a high rate of nonparticipation
• Overparticipation: people who say they intend to vote but do not
• Undecided: people who have no idea how they’ll eventually vote
Imagine a pot of soup. By tasting a tablespoon full from
the pot, can you tell what the soup tastes like?
Answer: it depends. If the soup was well stirred, even a
small sample will give a good sense of the flavor of the
whole pot. On the other hand, if the soup is not stirred
well, a small sample is not likely to be representative.
It’s the same with polling. A carefully designed survey
can effectively estimate a population parameter even with
a relatively small sample. On the other hand, a survey that
includes bias can be useless even if a large proportion of
the population is polled.
Literary Digest poll of 1936
Prediction: Roosevelt - 43%; Actual - 62%
Problems:
1. The sample was not representative of America’s population:
sample bias.
2. The Digest poll was sent to 10 million people; only 2.4 million
responded. Thus, the results were subject to non-response bias.
This high rate of non-response is typical of surveys of convenience,
like call-in polls or the daily poll on CNN’s Web page. You will
often hear journalists apply the caveat “This is a nonscientific poll,
but the results are…” Whenever you hear or read such a statement,
replace the word nonscientific with worthless.
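A quick calculation shows why the Digest's miss cannot be blamed on chance error. The sketch below uses only the figures quoted above (2.4 million responses, a prediction of 43% against an actual 62%) together with the conservative standard error 0.5/√n; the variable names are illustrative.

```python
import math

respondents = 2_400_000  # responses received by the Digest
predicted = 43.0         # Digest prediction for Roosevelt, in percent
actual = 62.0            # Roosevelt's actual share, in percent

# Conservative standard error, converted to percentage points.
se_points = 100 * 0.5 / math.sqrt(respondents)
observed_error = predicted - actual
ratio = abs(observed_error) / se_points

print(f"standard error: {se_points:.3f} percentage points")
print(f"observed error: {observed_error:+.1f} percentage points")
print(f"observed error is roughly {ratio:.0f} standard errors")
```

With a sample this large the chance error should be a few hundredths of a percentage point, so an error of 19 points is hundreds of standard errors away and has to come from bias.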
Question: for a call-in poll, why isn’t the sample representative
of the population?
NON-RESPONSE BIAS
Non-respondents can be very different than respondents. In
practice, low-income and high-income people tend not to
respond to questionnaires, so the middle class is overrepresented. For this reason pollsters prefer interviews, which
have a higher response rate (65%) than questionnaires (25%).
However, non-response bias is still a concern in modern
polling, which is why (in a telephone poll) the pollster will
call back up to 3 times and call in the evening (or during the
day or on weekends).
Non-response bias — for example, people hanging up when
called to participate in a survey — has been on the rise in
the past 20 years. Why?
Gallup Poll of 1948
Truman     44%
Dewey      50%
Thurmond    2%
Wallace     4%
Problem: Quota Sampling. For example, the interviewer in St. Louis
had to survey 13 people. Of these 13 people:
6 women, 7 men
Of the 7 men: 1 black, 6 white
Monthly rent of the 6 white men:
1 of $44.01 or more
3 of $18.01-$44
2 of $18.00 or less
In this way, the sample is forced to match the characteristics of the
population (collected from the Census Bureau).
What potential bias exists in this method of sampling?
ACCURACY OF THE GALLUP POLL
Year   Winner       Final Survey   Results   Deviation
2008   Obama        55.0%          52.6%     +2.4%
2004   Bush         49.0%          51.1%     -2.1%
2000   Bush         48.0%          47.9%     +0.1%
1996   Clinton      52.0%          50.1%     +1.9%
1992   Clinton      49.0%          43.2%     +5.8%
1988   Bush         56.0%          53.9%     +2.1%
1984   Reagan       59.0%          59.1%     -0.1%
1980   Reagan       47.0%          50.8%     -3.8%
1976   Carter       48.0%          50.0%     -2.0%
1972   Nixon        62.0%          61.8%     +0.2%
1968   Nixon        43.0%          43.5%     -0.5%
1964   Johnson      64.0%          61.3%     +2.7%
1960   Kennedy      51.0%          50.1%     +0.9%
1956   Eisenhower   59.5%          57.8%     +1.7%
1952   Eisenhower   51.0%          55.4%     -4.4%
1948   Truman       44.5%          49.9%     -5.4%
1944   Roosevelt    51.5%          53.3%     -1.8%
1940   Roosevelt    52.0%          55.0%     -3.0%
1936   Roosevelt    55.7%          62.5%     -6.8%
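The deviation column is the final survey figure minus the actual result, which is easy to verify. The sketch below re-enters the Gallup figures and also averages the size of the deviations before and after 1948. Splitting at 1948 is an assumption made here for illustration, chosen because that is the election the lecture singles out for its quota sampling problem.

```python
# (year, winner, final survey %, actual result %)
gallup = [
    (2008, "Obama", 55.0, 52.6), (2004, "Bush", 49.0, 51.1),
    (2000, "Bush", 48.0, 47.9), (1996, "Clinton", 52.0, 50.1),
    (1992, "Clinton", 49.0, 43.2), (1988, "Bush", 56.0, 53.9),
    (1984, "Reagan", 59.0, 59.1), (1980, "Reagan", 47.0, 50.8),
    (1976, "Carter", 48.0, 50.0), (1972, "Nixon", 62.0, 61.8),
    (1968, "Nixon", 43.0, 43.5), (1964, "Johnson", 64.0, 61.3),
    (1960, "Kennedy", 51.0, 50.1), (1956, "Eisenhower", 59.5, 57.8),
    (1952, "Eisenhower", 51.0, 55.4), (1948, "Truman", 44.5, 49.9),
    (1944, "Roosevelt", 51.5, 53.3), (1940, "Roosevelt", 52.0, 55.0),
    (1936, "Roosevelt", 55.7, 62.5),
]

# Deviation = final survey minus actual result, rounded to one decimal.
deviations = {year: round(survey - result, 1) for year, _, survey, result in gallup}


def mean_abs_deviation(years):
    """Average size of the deviation, ignoring sign, over the given years."""
    values = [abs(deviations[y]) for y in years]
    return sum(values) / len(values)


through_1948 = [y for y in deviations if y <= 1948]
after_1948 = [y for y in deviations if y > 1948]

print(f"all years:    {mean_abs_deviation(deviations):.2f} points")
print(f"through 1948: {mean_abs_deviation(through_1948):.2f} points")
print(f"after 1948:   {mean_abs_deviation(after_1948):.2f} points")
```

The sign of each deviation matters as well as its size: a run of deviations with the same sign suggests bias rather than chance error.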
As most everyone will remember, not every major poll has
gone well in modern times.
For example, in 1994, there were three major political
contests in the New York area:
NJ governor: Florio (D) v. Whitman (R)
NYC mayor: Dinkins (D) v. Giuliani (R)
NY governor: Cuomo (D) v. Pataki (R)
The Saturday before the election, Channel 2 (WCBS) in
New York predicted that the three Democratic candidates
would all win by at least 10 percentage points. When the
election was held, all of the Republican candidates won
by more than 10 percentage points.
Ideal: Taking a simple random sample of the
population. That is, the pollster draws tickets at
random without replacement from a box of tickets.
Every ticket has an equal chance of being drawn and
the interviewer thus has no discretion as to
whom they interview.
The law of averages thus dictates that the sample
percentage is close to the population percentage.
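The box model translates directly into code. In this sketch the box of 100,000 tickets and its 51.1% share of supporters are made-up values for illustration; `random.Random.sample` draws without replacement and gives every ticket the same chance, so no interviewer discretion enters.

```python
import random


def simple_random_sample(box, n, seed=None):
    """Draw n tickets at random without replacement from the box."""
    rng = random.Random(seed)
    return rng.sample(box, n)


# Hypothetical box: each ticket is (ticket id, vote), where vote is 1
# for a supporter and 0 otherwise. 51.1% of the tickets are supporters.
box = [(i, 1 if i < 51_100 else 0) for i in range(100_000)]

sample = simple_random_sample(box, 400, seed=1)
sample_percentage = sum(vote for _, vote in sample) / len(sample)

print(f"population percentage: {51_100 / 100_000:.1%}")
print(f"sample percentage:     {sample_percentage:.1%}")
```

Rerunning with different seeds gives sample percentages that vary around 51.1% by a few points, which is the chance error for a sample of 400.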
Problem: In real life, there is no “master list” with the
names of all of the millions of Americans who will
participate in the next election. Also, even if such a
list existed, the potential respondents would be
located (actually dispersed) all over the country, and
the cost of doing personal interviews would be too
high.
Besides, even if sample bias were eliminated in this
way, there are still plenty of other types of bias to
worry about.
Richard Morin, “Choice Words,” Washington Post (1/10/99)
Wall Street Journal (2/8/99)
Exit poll mania spread through media and campaign circles Tuesday afternoon after
first wave of morning data showed Kerry competitive in key states.... National
Election Pool -- representing six major news organizations -- shows Kerry in
striking distance -- with small lead -- in Florida and Ohio, sources tell DRUDGE...
But early sample was based on a 59-41 women to men ratio... MORE... Senate races:
Thune +4 Castor +3 Burr +6 Bunning +6 Coburn +6 Demint +4 Salazar +4...
KERRY FINDS COMFORT IN FIRST
BATCH OF EXIT POLLS;
BOTH CAMPS URGE CAUTION
November 2, 2004 at 8:38 PM EST
‘I'm Going to Learn’
By Evan Thomas
January 10, 2005
It was a little after 7 p.m. on election night 2004. The network exit
polls showed John Kerry leading George Bush in both Florida and
Ohio by three points. Kerry's aides were confident that the
Democratic candidate would carry these key swing states; Bush had
not broken 48 percent in Kerry's recent tracking polls. The aides were
a little hesitant to interrupt Kerry as he was fielding satellite TV
interviews in a last get-out-the-vote push. Still, the 7 o'clock exit
polls were considered to be reasonably reliable. Time to tell the
candidate the good news.
Kerry had slept only two hours the night before. He was sitting in a
small hotel room at the Westin Copley (in a small irony of history,
next door to the hotel where his grandfather, a boom-and-bust
businessman, shot himself some 80 years ago). Bob Shrum, Kerry's
friend and close adviser, couldn't resist the moment. "May I be the
first to say 'Mr. President'?" said Shrum.
“Choice Words,” Washington Post (1/10/99)
R. Morin, “Campaign Trail Oversight,” Washington Post (2/27/00)
The media’s current infatuation with Big Picture stories, written in the
narrative voice and awash in prediction, encourages journalists and
opinion writers to simplify complex and fast-moving events, or to draw
broad conclusions from scattered, conflicting or otherwise confusing
facts. Will Michigan Democrats and independents stand by John
McCain in November? The correct answer is, of course, who knows?
But just try saying that on national network news.
Of course a good narrative needs tension and action. And that’s why
reporters love to tell stories like these:
“New Hampshire typically is keeping the country guessing
with a too-close-to-call Republican race.”
Bob Edwards, NPR, day before NH primary
“On the eve of a South Carolina primary that’s too
close to call, both candidates were beginning to focus
on the mechanics of getting voters to the polls.”
AP, day before SC primary
“The race is too close to call.”
Alison Stewart, ABC, day before MI primary
Let the record show that
• McCain comfortably beat Bush by 8 percentage points in Michigan.
• Bush beat McCain by 11 points in South Carolina, and
• McCain won by 18 points in New Hampshire.
Voters might be forgiven if they wonder: How close do you have to get
to see a wipeout coming?
…[A] word of advice for my colleagues: If you must write those Big
Picture stories awash in narrative and dramatic sweep, then remember
to use only the tastiest words. Chances are you’ll be eating them.
Dana Blanton, “FOX News Poll: Obama Leads in New Hampshire,
Clinton Slips to Second,” Fox News.com (1/7/08)
In the new post-Iowa Caucuses world of politics, Barack Obama is now
the front-runner in [tomorrow’s] New Hampshire Democratic primary,
with Hillary Clinton in second place behind the Hawkeye State winner
and John Edwards unchanged in third place, according to a poll released
by FOX News on Monday.
Obama received a nice bump from his performance in the Iowa caucuses.
Obama now captures the support of 32 percent of likely Democratic
primary voters in New Hampshire, up from 25 percent in mid-December
and Clinton receives the backing of 28 percent today, down from 34
percent (December 11-13). Edwards also received a bit of help from
Iowa and is now at 18 percent, up from 15 percent last month.
The telephone poll was conducted for FOX News by Opinion Dynamics
Corp. among 500 likely Democratic primary voters in New Hampshire
from Jan. 4 to Jan. 6. The poll has a 4-point error margin.
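The reported margin can be checked against the sample size. The sketch below assumes the usual convention that a margin of error is about 1.96 standard errors (a 95% confidence level); the article does not state which convention Opinion Dynamics used, so treat that multiplier as an assumption.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for one sample percentage.

    z = 1.96 corresponds to 95% confidence (an assumed convention);
    p = 0.5 is the conservative choice used earlier in the lecture.
    """
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(500)   # 500 likely Democratic primary voters
lead = 0.32 - 0.28           # Obama 32%, Clinton 28%

print(f"margin of error: {moe:.1%}")
print(f"reported lead:   {lead:.1%}")
print("lead exceeds margin" if lead > moe else "lead is within the margin")
```

The result is a little over 4 points, in line with the quoted 4-point margin. The margin applies to each candidate's percentage separately, and the uncertainty in the gap between two candidates is larger still, so a 4-point lead in a poll of 500 is far from decisive even before any bias is considered.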
Howard Kurtz, “Media Blow It Again,” washingtonpost.com (1/9/08)
Let's review yesterday's papers…
Boston Herald: "She's So Yesterday," with a
cover shot of the old Beatles record.
That was then. This is now…. at 10:31, MSNBC
projected Hillary as the winner. CNN and Fox
followed suit 15 minutes later, and the scrambling
began. Spin was modified, explanations revised.
"One of the greatest political upsets in American political history," Tim
Russert said.
"Bill Clinton helped her in the end," Fox's Bill Kristol said.
CNN's Anderson Cooper questioned whether people lied to pollsters
about supporting an African-American candidate. MSNBC's Joe
Scarborough raised the same issue. Washington Post columnist Gene
Robinson said he didn't think it was a major factor.
But a lot of people made up their minds in the last 24 hours — too late
to be caught by the almighty polls.
Jon Cohen and Jennifer Agiesta, “Polls Were Right About McCain
but Missed the Call on Clinton’s Primary Win,” WP (1/9/08)
While pre-election polls in New Hampshire got Sen. John McCain's
margin of victory about right on the Republican side, late polls
fundamentally mischaracterized the status of the Democratic race.
Polls released in the two days before the election had Sen. Barack
Obama (Ill.) with a five- to 13-percentage-point lead over Sen. Hillary
Rodham Clinton (N.Y.) in the Granite State, but Clinton defeated Obama,
39 percent to 36 percent. [Edwards: 17 percent.]
Most polls accurately reflected the large bloc of likely Democratic
voters yet to make up their minds or who said they were open to
switching their support in the closing days. On the network exit poll,
nearly 4 in 10 said they made their final decision within the last three
days; 17 percent said they decided how to vote yesterday. Among those
making up their minds on the day of the primary, 39 percent supported
Clinton, 36 percent Obama. Clinton did even better among the third of
the electorate who settled on their choice a month or more ago.
Evaluate the credibility of the
following poll results based on
the given news article.
Schwarzenegger Poll Validity Questioned
Wednesday, April 13, 2005
For many Californians, Republican Gov. Arnold
Schwarzenegger came to office in 2003 as something
of a knight in shining armor following the recall of
former Gov. Gray Davis.
But these days, poll numbers show that his attempts to reform state
government are not wildly popular.
A San Jose State University survey shows Schwarzenegger's approval
rating below 50 percent for the first time since he took office. Pollster
Phil Trounstine led the research.
“He's too interested in gimmicks, public relations and image.”
“I think things in California are generally going in the right
direction or are they seriously off on the wrong track.”
2004 Exit Poll, Question J
"Which ONE issue mattered most in deciding
how you voted for president?"
• Education, 4%
• Taxes, 5%
• Health Care, 8%
• Iraq, 15%
• Terrorism, 19%
• Economy and Jobs, 20%
• Moral Values, 22%
Daily Telegraph (UK)
poll of British Muslims
a week after the 7/7/05
London bombings
Roger Clemens squared off against his former trainer in nearly five hours
of testimony before a House committee on Wednesday [2/12/08]. Please
answer a few questions on what you took away from the proceedings.
Who was more convincing in his testimony in
Wednesday's hearing?
Roger Clemens
Brian McNamee
Who did you believe was telling the truth prior to the hearing?
Roger Clemens
Brian McNamee
Considering the Republicans in the hearing sided generally
with Clemens and the Democrats with McNamee, do you
believe partisan politics played a role in the questioning?
Yes
No
Should Clemens be investigated for perjury based on his
testimony under oath in the hearing and his deposition?
Yes
No
Should Andy Pettitte have been excused from the hearing?
Yes
No
Should Pettitte be suspended for his further admission
of HGH use in his deposition?
Yes
No
Do you approve of Congress spending time and taxpayer
money on investigating steroids in sports?
Yes, they have helped clean up sports
No, the congressmen are grandstanding
Would you like to see Congress continue its hands-on
oversight of MLB and other professional sports leagues?
Yes, the sports can't police themselves
No, Congress has better things to do
For future reference, the following brief article gives
good practical advice about how to conduct a survey:
Rachel Ivie and Roman Czujko, “What’s your survey
telling you?” Physics Today (November 2007), pp. 78-79.
SUMMARY
• Statistic = Parameter + Chance Error + Bias
• Bias may lead to incorrect conclusions.
• To minimize sample bias, a probability method uses an
objective chance process to construct the sample.
• Large samples do not preclude the possibility of bias.
• Relatively small samples that are properly constructed can be
used to predict the behavior of a population of millions. For
example, the Gallup poll has correctly called the
winner for 50 years.
• Even if a sample is properly chosen, bias may result when
soliciting information from the sample.
• Practicing statistics is a complicated mix of art and science.