Factors Affecting Polling Accuracy in the 2004 U.S. Presidential Election
Wayne Wanta
School of Journalism
University of Missouri
Columbia, MO 65211-1200
573-884-9689
wantaw@missouri.edu
Hans K. Meyer
School of Journalism
University of Missouri
Columbia, MO 65211-1200
hanskmeyer@gmail.com
Antonie Stam
College of Business
University of Missouri
Columbia, MO 65211
573-882-6286
stama@missouri.edu
** Paper presented at the annual conference for the World Association for Public
Opinion Research, Berlin, Germany, September 2007.
** Wanta is a professor in the School of Journalism and executive director of the Center
for the Digital Globe, Meyer is a doctoral student in the School of Journalism, and Stam
is the Leggett & Platt Distinguished Professor of MIS in the College of Business – all at
the University of Missouri.
Factors Affecting Polling Accuracy in the 2004 U.S. Presidential Election
Abstract
Using polling data from the 50 states and the District of Columbia during the 2004
U.S. presidential election, this study examined four factors that could potentially affect
how accurately a poll predicts eventual voting results. Findings show that polls were
most accurate in states where the election was very closely contested and when the
poll was conducted close to election day. The number of respondents and the number of
undecided respondents in a poll did not play a role in polling accuracy.
Factors Affecting Polling Accuracy in the 2004 U.S. Presidential Election
Every election, scores of polls are conducted, aimed at finding trends among
respondents that could lead to predictions of eventual winners. These polls are conducted
at local, regional, state and national levels.
Not all polls are created equal, however. Some past polls have been inaccurate enough
to predict the wrong winner, as in 1936, when the Literary Digest poll
predicted Alf Landon would defeat Franklin Roosevelt. In that case, the survey's
sampling methods over-selected high-income individuals, leading to a disproportionately
Republican pool of respondents.
Certainly, survey methodology has made great strides since 1936. Pre-election
public opinion polls in the 2004 U.S. presidential election, in fact, were exceptionally
accurate. As Traugott (2005) notes, based on three different polling accuracy measures,
pollsters had their best year since 1956.
According to Shelley and Hwang (1991), a number of factors influence how accurately a
poll can gauge public opinion. In a study of all the polls conducted by five major news
organizations from Jan. 1 to election day in 1988, they describe how polling trends and
the impact of events contributed to the accuracy and consistency of poll numbers. They
found that the average deviation from George H.W. Bush's final election result
was 4 percentage points, while the average deviation for Michael Dukakis was 4.4 points.
This is consistent with Crespi's 1988 finding that final local, state and presidential
primary and general election polls have had an average absolute deviation of 5.7
percentage points.
This study looks at the effects of four factors on polling accuracy for individual
states during the 2004 presidential campaign: the timing of a poll, the closeness of an
election, the number of respondents in a poll and the number of undecided voters in a
poll. Each of these variables could affect how accurate a poll would be in predicting the
eventual winner of the U.S. presidential race. Polls were identified from the Los Angeles
Times' national election website, which reported the most recent polls available from
each individual state.
Given that the 2004 polling was very accurate, this study hopes to identify factors
that contributed to these accurate poll results. By identifying factors affecting accuracy,
researchers can take these variables into consideration when examining future polling
results.
Literature Review
Since the early 1800s, journalists have done all they can to predict elections
(Frankovic, 2003). “They have used many available tools, from carrier pigeons to the
Internet, to tell the important story of the peaceful transfer of power in a democratic
process” (p. 19). One of the most common tools has been the public opinion poll, which
Crespi (1980) says can be traced back to the 1820s. In fact, he writes that newspapers have
chased public opinion through polls because no greater news scoop exists than predicting
the winner of an election before it happens. “The expression of public opinion
in the voting booth has an immediate, direct and powerful political effect: therefore, it is
news. Moreover, being able to anticipate correctly the outcome of an election before
anyone else is a form of news scoop” (p. 465).
News organizations might have had their best year of election prediction in 2004
(Traugott, 2005). Of the 19 polls he studied, Traugott (2005) found that all but
one were within plus or minus 4 percentage points of the result, with 13 showing Bush
ahead, two showing ties and five showing Kerry ahead (p. 645). One reason for this
overall accuracy is that pollsters expected a close race and adjusted their mix of study
designs accordingly.
Even their overall success in predicting the 2004 election, however, did not inoculate
them from criticism, especially in how they defined likely voters. “As the voting
procedures and devices change and more opportunities to vote early arise, pre-election
and exit pollsters will face new challenges … More public disclosure of methods and
their consequences will be required to maintain public confidence in the profession, as
well as in the basic foundation of the American electoral system and its transparency to
the public” (Traugott, 2005: 653). Before examining one source in particular, the L.A.
Times' online compendium of state pre-election polls, it is important to understand the
thinking and theory behind opinion polls, why the media participate so actively in
polling, and what factors influence a poll's ability to correctly predict an election.
Polling and Democracy
In virtually every democracy in the world, Gallup or one of its rivals conducts
and publishes public opinion surveys of voter intentions prior to a major election
(Lewis-Beck, 2005). Researchers continue to disagree on how polls affect the political process.
Sidney Verba called public opinion polling “the most significant development in
promoting the democratization of the American regime” (cited in Pearson, 2004). George
Gallup, founder of the poll that bears his name, called polls a “continuous referendum by
which majority will could be made known” (cited in Meyer, 1990). By providing
information about a likely outcome, polls allow leaders and followers alike to make the
adjustments they feel are necessary (Lewis-Beck, 2005). Good election forecasts are
information “that is intrinsically interesting in a healthy democracy” (p. 146).
While polls may promote democracy by enabling political leaders to know what
their constituents think, the picture public opinion polls create is illusory and “is
typically purchased at the price of moral reasoning, which tends not to be quantifiable in
its very nature” (Pearson, 2004: 71). Polling needs to work with democratic theory to
encourage participation, but it cannot make science the solution to all problems. “This
makes it all the more imperative that in a democracy such as ours it is the philosophical
problems that are most in need of clarity because our culture is already predisposed to
believe that science is the answer to whatever problem we face” (p. 71).
This illusion of “certainty” of who will win has the potential to affect the outcome
of elections. Critics say voters are less likely to participate if they think their vote will not
make a difference in the final tally. Polls that only predict winners, or horse-race polls,
also lead to voters knowing less about the issues and caring less about the election
(Meyer & Potter, 1998). So many public opinion polls exist today that voters have neither
the time, nor the desire, nor the ability to sort through them all (Lewis-Black, 2005). This
glut of opinion polls actually works to empower voters with better information on which
to base their decision. It can also have a “catalytic effect” that makes the contest more
interesting to citizens. “As citizens follow the candidate’s progress or lack of it, they
become interested in the race and why it is developing that way, and they start to look at
issues” (Meyer & Potter, 1998: 36).
The key to ensuring that polls work with and not against democracy is for
pollsters to disclose all of the research biases and limitations that can push their findings
away from reality (Rotfield, 2007). Even though news organizations have become
increasingly reliant on polls as the basis for most of their political news, they have failed
to adequately address their limitations. “Small shifts of public opinion have become the
lead for all major news programs with reporters giving the statistical sampling error as if
that and that alone would explain any limitations to the data” (p. 187). Newspapers
especially should recognize the need to report limitations because from their beginnings,
they have always placed truth above entertainment (Rotfield, 2007).
Polls and Journalism
A newspaper’s commitment to truth, however, presents a conundrum in reporting
and participating in public opinion polling. Traditionally, newspapers have shied away
from making news, even as they have embraced opinion polls more and more. News
organizations should not feel guilty because public opinion polling represents the kind of
“precision journalism” news organizations should practice (Meyer, 1990). In fact, it has
become a “vital component of the picture journalists draw of a political campaign”
(Stovall & Solomon, 1984).
Being able to predict the outcome of an election can only strengthen a news
organization’s credibility because it moves journalism closer to science, which states its
propositions in forms which can be tested (Meyer, 1990: 456). Polls also enhance
credibility because they remain one of the few journalistic offerings whose accuracy can
be quickly and decisively tested, “and for that reason, they tend to keep us honest”
(Meyer, 1990: 454). While these predictions may affect the outcome of elections, at least
they are more likely to be accurate than data based on rumor, speculation and the
“musings of spin doctors” (p. 458). Besides, predicting who will win an election is the
bread and butter of journalism.
“The most interesting fact about an election is who wins … Yes, of
course, you want to know about the dynamics of the campaign and
what put the front-runner where he or she is … But none of that
interesting and useful information is going to make much sense
unless you can identify the front runner” (Meyer, 1990: 455).
Journalism and Poll Accuracy
The way to ensure that polls serve democracy is to provide more information,
not less. “By keeping track of the polls’ successes and failures at predicting elections,
(news organizations) can help the information marketplace sort the good from the bad”
(Meyer, 1990). While news organizations have a long tradition of using polls, they have
not always been either the most accurate or the most forthcoming in how they obtained
the results. The initial acceptance of public opinion polls as a credible source of
information about public opinion rested primarily on the belief that pre-election polls
predict elections accurately. This belief has three components: 1) polls have generally
been very accurate; 2) pre-election polls predict how people will vote; and 3) measuring
voting behavior is comparable to measuring opinions on issues (Crespi, 1989). News
organizations did not begin to seriously question what polls and polling procedures meant
until pollsters inaccurately predicted Dewey would beat Truman in 1948. The
debacle, in fact, hastened “the wholesale adoption of probability sampling” (Bogart,
1998).
Most of the media’s navel-gazing about polls relates to their impact. As early as
the 1980 election (Stovall & Solomon, 1984), researchers examined whether newspapers
focused too much on the horse-race aspects of polls instead of the issues. What they found
is that newspapers actually play poll stories less prominently than other
campaign stories (p. 622). Meyer and Potter (1998) agreed that the media may have hurt
themselves by minimizing coverage of horse-race polls. “Without the horse race to arouse
interest, citizen attention to the expanded issue coverage was reduced – perhaps by an
amount sufficient to wash out the effect of that coverage” (p. 42).
The role the accuracy of these predictions had, however, was more specifically
examined in the 1988 presidential election. In a study of all the polls from Jan. 1 to
election day in 1988 of five major news organizations (New York Times/CBS News,
Wall Street Journal/NBC News, The Washington Post/ABC News, Newsweek/Gallup,
and Time/Yankelovich Clancy Shulman), Shelley & Hwang (1991) suggest that accurate polls
do a good job in pulling people into the election. They defined accuracy as “the
difference between the poll predictions and the actual election outcome,” and found that
the average deviation from the final election result for both the first President Bush and
Michael Dukakis was between 4 and 4.4 percentage points. Their findings suggest that
the dynamic of public opinion in the 1988 presidential campaign was more a process of
initially or temporarily undecided voters coming to a decision and less a matter of
prospective voters switching from one candidate to the other. Timing remained critical
for the ability of the poll to predict accurately because intervening events dramatically
shifted poll numbers.
Accurate election prediction was most questioned during the 2000 presidential
election when Republican George W. Bush won the presidency despite losing the popular
vote to Democrat Al Gore. For the most part, news organizations and pollsters succeeded
in calling elections before and even after 2000, but much of that success came from
elections where the outcome was relatively clear cut (Konner, 2003). “Polls are statistical
calculations, not factual realities. They are imperfect measures of voter intent and actual
voting, and their inaccuracies are especially perilous in close elections” (p. 16). The best
thing to come from the 2000 election might be that news organizations began to
understand they needed to make clear they were reporting only projections and to explain
more clearly how calls were made (Frankovic, 2003). “Calling elections is not magic, but
too often it is presented as such. Consequently, many reporters and many viewers
(including the candidates themselves) have held a mistaken belief in the news media’s
election omniscience” (p. 30). In studying Israeli polls and elections, Weimann (1990)
found that polls which carefully detail methodological problems, “thus limiting the value
of predictions based on the poll,” are more accurate in their predictions (p. 404). News
organizations were less likely to provide such detail as an election drew nearer.
“However, when this growing reliance on surveys and polls is not
accompanied by increasing familiarity and understanding of the
statistical and methodological problems involved in polling, and
when standards for reporting polls are non-existent or poorly
observed, the results of such ‘precision journalism’ would emerge
as far from accurate and valid” (p. 406).
Accuracy Measures
Describing statistical procedures and methodological problems is not as easy as
listing a few formulas. Polling is a “complex human enterprise that requires many
different steps, some mechanical, some judgmental,” Bogart (1998) writes (p. 12). All
survey statistics, he adds, arise from a series of professional judgments. One of the most
important, and one of the least commonly identified, is how to decide who will
participate. All polls must weight results to conform to the population characteristics
identified in the U.S. Census. Shoehorning data into demographic proportions, however,
has never been simple, because it is never easy to “ascertain the psychological attributes
of the many people with whom interviews are never completed” (Bogart, 1998). In
predicting a presidential election, meeting census estimates is not enough, because the
population does not choose the president; the Electoral College does. To predict the
Electoral College votes, one simply needs to use the predicted statewide popular vote to
project a winner for each state. DeSart & Holbrook (2003) found that state-wide trial
heat polls taken in September “predicted the outcome of the 2000 election well. In fact,
they did a better job of predicting the 2000 election than they did the previous two
elections.”
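
To make that projection logic concrete, here is a minimal sketch in Python. The poll shares below are invented placeholders (only the electoral-vote counts match 2004 values), and DeSart & Holbrook's actual forecasting models are more elaborate than this winner-take-all tally.

```python
# Sketch of projecting an Electoral College outcome from statewide
# trial-heat polls: award each state's electoral votes to whichever
# candidate leads the state's poll, then sum. Poll shares are
# illustrative placeholders, not data from the study.

state_polls = {
    # state: (poll % for candidate A, poll % for candidate B, electoral votes)
    "Ohio": (49.0, 47.0, 20),
    "Florida": (48.0, 49.0, 27),
    "Missouri": (51.0, 44.0, 11),
}

totals = {"A": 0, "B": 0}
for state, (pct_a, pct_b, ev) in state_polls.items():
    winner = "A" if pct_a > pct_b else "B"  # project the state's winner
    totals[winner] += ev                    # winner-take-all electoral votes

print(totals)  # {'A': 31, 'B': 27}
```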
It is not enough just to find the demographic groups. As early as 1945, researchers
said polls must focus on those who intend to vote because it ensures a better estimate of
actual voting behavior (Lazarsfeld & Franzen, 1945). Finding people who are likely to
vote is not as easy as polling those who are registered, because many registered voters do
not actually vote. There is no easy way of determining who will vote (Bogart, 1998: 11), and it
is only getting more difficult as voting procedures and devices change and more
opportunities to vote early arise (Traugott, 2005). As noted above, “more public disclosure
of methods and their consequences will be required to maintain public confidence in the
profession, as well as in the basic foundation of the American electoral system and its
transparency to the public” (Traugott, 2005: 653).
To compensate, many polling companies alter whom they sample to cover all the
bases. The Harris Poll, for example, began its 1996 presidential election polling by
reporting the opinions of the general public, then switched to sampling registered voters
in the summer (Bogart, 1998). But Harris’ switch underscores the importance timing plays in
the efficacy of pre-election polling. Generally speaking, the closer to the election a poll is
conducted, the more accurate it is, but Wolfers & Leigh (2002) found that polls taken one
month prior to the election also have substantial predictive power (p. 226). Later polling
also tends to lessen the number of undecided voters, who add another wrinkle to election
prediction. Panagakis (1999) said that undecided voters choose challengers more often
than incumbents because it takes less time to decide to vote for a known incumbent than
an unknown challenger. In addition, the less interested people were, the later they made
up their minds.
Undecided voters present an additional challenge in determining a poll’s accuracy
because some debate exists over how they are counted in the final tally. The single most
important statistic that determines which candidate will win an election and the one most
commonly reported by the media is the margin between the top two candidates
(Mitofsky, 1999). The most common ways to assess a poll's handling of this statistic come
from the systematic evaluation of polling accuracy conducted by Mosteller et al. after the
1948 presidential election debacle (Martin, Traugott & Kennedy, 2005). Mitofsky (1999) has
repeatedly supported two of Mosteller's measures as the most viable for gauging poll
accuracy: Measure 3 computes the average error across all major candidates between the
prediction and the actual results, while Measure 5 examines only the error in the margin
between the top two candidates. He also questions how to handle the percentages of undecided
voters. Panagakis (1999), like much of the literature, agrees that undecided voters must be
dealt with in some way besides simply excluding their percentages. He argues for
allocating undecided voters to the candidates in proportion to each candidate's
pre-election poll support. Mitofsky
(1999) found more consistency among different measures when the undecided voters
were allocated proportionally. To account for both accuracy and bias, as well as to
compare elections across time, Martin, Traugott & Kennedy (2005) proposed a new measure
based on an odds ratio: it compares the odds of a respondent favoring the Republican over
the Democrat in a given poll with the same odds in the official election returns.
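
These measures can be stated compactly in code. The sketch below follows the common formulations of Mosteller's Measures 3 and 5, Panagakis-style proportional allocation of undecideds, and Martin, Traugott & Kennedy's log-odds measure; the exact conventions (such as the sign of the log-odds measure) are assumptions on my part, and the sample figures are invented, not from the study.

```python
import math

def mosteller_3(poll, vote):
    """Average absolute error across the major candidates (Measure 3)."""
    return sum(abs(poll[c] - vote[c]) for c in poll) / len(poll)

def mosteller_5(poll, vote):
    """Absolute difference between the predicted and actual margins
    separating the top two candidates (Measure 5)."""
    return abs((poll["rep"] - poll["dem"]) - (vote["rep"] - vote["dem"]))

def allocate_undecided(poll, undecided):
    """Panagakis-style proportional allocation: split the undecided
    percentage in proportion to each candidate's pre-election support."""
    total = sum(poll.values())
    return {c: p + undecided * p / total for c, p in poll.items()}

def mtk_accuracy(poll, vote):
    """Martin, Traugott & Kennedy's measure: the natural log of the odds
    ratio of the poll's two-party split to the election's split. Zero
    means no bias; the sign convention here is an assumption."""
    return math.log((poll["rep"] / poll["dem"]) / (vote["rep"] / vote["dem"]))

# Illustrative numbers only:
poll = {"rep": 49.0, "dem": 47.0}
vote = {"rep": 51.0, "dem": 48.0}
print(mosteller_3(poll, vote))                  # 1.5
print(mosteller_5(poll, vote))                  # 1.0
print(allocate_undecided(poll, undecided=4.0))  # rep ~51.04, dem ~48.96
print(round(mtk_accuracy(poll, vote), 4))       # ~ -0.019
```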
In examining the polls the L.A. Times aggregated for each state on its website
during the 2004 election, this study looks most closely at the difference between
each major candidate's predicted and actual totals and at the proportion of undecided
voters. It also examines how the timing of a poll and its number of respondents affected
the poll's accuracy.
Method
Polling data were gathered from an election website produced by the Los Angeles
Times. The Times website reported on polling results for all 50 states and the District of
Columbia throughout the 2004 U.S. election campaign, updating the site nearly daily.
Poll results came from several sources, including the news media of each state.
The analysis examined one dependent variable: Polling accuracy. The variable
was determined by taking the margin of victory predicted by an individual poll from a
state and comparing it to the actual vote margin for the state. Thus, the variable had 51
observations (one for each of the 50 states and the District of Columbia) and, in effect,
measured polling error in percentage points. It ranged from zero (Connecticut, Delaware
and New Hampshire – the three states where the poll results and election results differed
by less than one-half percentage point) to 14 (District of Columbia).
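
A minimal sketch of how this accuracy (strictly, error) variable could be computed follows; the function mirrors the description above, and the sample figures are placeholders rather than the study's data.

```python
def polling_error(poll_winner_pct, poll_loser_pct, vote_winner_pct, vote_loser_pct):
    """Absolute gap, in percentage points, between the margin a poll
    predicted and the margin the state actually produced."""
    predicted_margin = poll_winner_pct - poll_loser_pct
    actual_margin = vote_winner_pct - vote_loser_pct
    return abs(predicted_margin - actual_margin)

# Two illustrative observations (the study had 51, one per state plus D.C.):
print(polling_error(51.0, 48.0, 51.5, 48.5))  # 0.0: poll margin matched the result
print(polling_error(47.0, 50.0, 44.0, 53.0))  # 6.0: poll understated the margin
```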
Independent variables: Four independent variables were included in the
analysis.
Timing: The number of days before the date of the election was recorded. This
variable had a range of 2 (Ohio, New Jersey, Florida) to 53 (Idaho). Logically, a poll
conducted closer to an election would be more accurate than a poll conducted earlier in a
campaign (Crespi, 1989).
Closeness: The percentage-point difference between the winning and losing
candidates' vote totals in each state was recorded. This variable had a range of 1 (Iowa,
New Hampshire, New Mexico, Wisconsin) to 81 (District of Columbia). How close an
election was could influence how accurate a poll is: voters in states with a wider margin
of victory may have fluctuated more in their voting patterns, with some voters not
bothering to vote because they felt their vote would not matter.
Number: This variable measured the number of respondents in a poll. It ranged
from 400 (South Dakota) to 1,345 (Wisconsin). Logically, the more respondents polled,
the more accurate the poll should be (see the sampling-error sketch following the
variable descriptions).
Undecided: This variable recorded the percentage of undecided respondents in
a poll. It ranged from 2 (New Hampshire) to 16 (Delaware). Undecided voters could be
swayed either way in an election. The more undecided voters, the less accurate a poll
might be.
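
For intuition about the role of sample size, the textbook simple-random-sampling margin of error offers a back-of-the-envelope check. The sketch below uses that standard approximation and ignores design effects and weighting, so treat it as illustrative only.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion under simple random
    sampling; p=0.5 gives the conservative (maximum) value."""
    return z * math.sqrt(p * (1 - p) / n) * 100  # in percentage points

# The study's smallest and largest samples:
print(round(margin_of_error(400), 1))   # ~4.9 points (South Dakota)
print(round(margin_of_error(1345), 1))  # ~2.7 points (Wisconsin)
```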
The data were analyzed through a path analysis model. Path coefficients from
each of the four independent variables were examined leading to the dependent variable
of polling accuracy.
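
The paper does not specify the software used. Below is a minimal sketch of how such path coefficients could be computed, under the assumption (suggested by Table 1 reporting a separate R and R-square for each predictor) that each path is a bivariate regression of the error score on one predictor; in that case the standardized beta equals the Pearson correlation. The data arrays are random placeholders, not the study's data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
accuracy = rng.normal(3, 2, 51)                  # placeholder error score per state + D.C.
timing = rng.integers(2, 54, 51).astype(float)   # days before the election (range 2-53)

def path_coefficient(x, y):
    """Unstandardized slope, standardized slope, and p-value of y
    regressed on a single predictor x."""
    model = sm.OLS(y, sm.add_constant(x)).fit()
    beta = model.params[1]
    std_beta = beta * x.std(ddof=1) / y.std(ddof=1)  # equals Pearson r here
    return beta, std_beta, model.pvalues[1]

print(path_coefficient(timing, accuracy))
```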
Results and Discussion
Overall, the polls were very accurate. More than half of the polls (54.9 percent)
came within 3 percentage points of the actual vote margin. As mentioned above, three
polls (Connecticut, Delaware and New Hampshire) predicted the actual vote results almost
exactly (less than a one-half point difference). Ten other polls were off by just one
percentage point.
Table 1 shows the path analysis coefficients for all of the variables in the study.
As the table shows, two factors predicted the accuracy of the polls: the timing of the poll
and the closeness of the race. Timing was positively related to polling error – the closer
to the election date a poll was conducted, the more accurate it was. The margin of victory
was also positively related to polling error – the closer a race was, the more accurate the
poll was. The number of respondents in a poll and the percentage of undecided voters in a
poll were not related to polling accuracy.
The lack of statistical significance for the number of respondents points to the
overall soundness of current polling methods: results were accurate even with relatively
small samples. The lack of significance for the percentage of undecided voters in a state
was surprising. Apparently, undecided voters split between the Democratic and Republican
candidates in roughly the same proportions as the eventual voting results; in other words,
undecided respondents voted similarly to the respondents who had made their choice for
president much earlier.
The two significant variables showed that (1) polling results were more accurate
the closer in time they were to the election – later polls captured voter trends more
accurately than polls conducted earlier in a campaign; and (2) it is easier to track voters'
opinions in a closely contested election – landslide elections could produce wilder swings
among voters because of lower turnout or other factors.
Both of the significant predictors of polling accuracy are good news for survey
researchers. Timing of polls is important. Early polls cannot capture voting trends as
well as later polls, suggesting that polls late in a campaign can accurately gauge voter
intentions. In addition, hotly contested states would be more important to track than
landslide states. Pollsters are less concerned about states in which one candidate has a
sizable margin. The winner can be easily predicted, though the margin of victory is
somewhat less clear. Pollsters are much more interested in attempting to predict the
winner in the hotly contested states. The results here show that they were successful in
gauging voter intentions.
Overall, the findings point to the successes of public opinion polling during the
2004 U.S. presidential election. Future studies should examine other factors that could
play a role in polling accuracy, as well as investigate whether these same factors affect
polling accuracy in other election settings – such as local elections – or in other cultures
and countries.
Table 1: Path coefficients examining factors related to polling accuracy.

Variable       Beta     Std. Beta   Sig.    R       R-square   Adj. R-square
Timing         .061     .288        .041    .288    .083       .064
Closeness      .074     .301        .032    .301    .090       .072
Respondents    .000     -.024       .869    .024    .001       -.020
Undecided      -.061    -.052       .715    .052    .003       -.018
References

Bogart, L. (1998). "Politics, Polls, and Poltergeists." Society, 35(4), 8-16.

Crespi, I. (1980). "Polls as Journalism." Public Opinion Quarterly, 44(4), 462-476.

Crespi, I. (1989). Public Opinion, Polls, and Democracy. Boulder, Colo.: Westview Press.

Frankovic, K. A. (2003). "News Organizations' Responses to the Mistakes of Election
2000." Public Opinion Quarterly, 67(1), 19-31.

Konner, J. (2003). "The Case for Caution." Public Opinion Quarterly, 67(1), 5-18.

Lazarsfeld, P. F., & Franzen, R. H. (1945). "Prediction of Political Behavior in America."
American Sociological Review, 10(2), 261-273.

Martin, E. A., Traugott, M. W., & Kennedy, C. (2005). "A Review and Proposal for a
New Measure of Poll Accuracy." Public Opinion Quarterly, 69(3), 342-369.

Meyer, P. (1990). "Polling as Political Science and Polling as Journalism." Public
Opinion Quarterly, 54(3), 451-459.

Meyer, P., & Potter, D. (1998). "Preelection Polls and Issue Knowledge in the 1996 U.S.
Presidential Election." Harvard International Journal of Press/Politics, 3(4), 35.

Mitofsky, W. J. (1999). "The Polls - Reply." Public Opinion Quarterly, 63(2), 282-284.

Panagakis, N. (1999). "The Polls - Response." Public Opinion Quarterly, 63(2), 278-281.

Shelley, M. C., II, & Hwang, H.-D. (1991). "The Mass Media and Public Opinion Polls
in the 1988 Presidential Election: Trends, Accuracy, Consistency, and Events."
American Politics Quarterly, 19(1), 59-79.

Traugott, M. W. (2005). "The Accuracy of the National Preelection Polls in the 2004
Presidential Election." Public Opinion Quarterly, 69(5), 642-654.

Tsfati, Y. (2001). "Why Do People Trust Media Pre-Election Polls? Evidence from the
Israeli 1996 Elections." International Journal of Public Opinion Research, 13(4),
433-441.

Weimann, G. (1990). "The Obsession to Forecast: Pre-election Polls in the Israeli Press."
Public Opinion Quarterly, 54(3), 396-408.

Wolfers, J., & Leigh, A. (2002). "Three Tools for Forecasting Federal Elections:
Lessons from 2001." Australian Journal of Political Science, 37(2), 223-240.