The Statistical State of the Presidential Race

September 24, 2012, 10:16 am
By NATE SILVER
With fewer than 45 days left in the presidential campaign, it’s no longer a cliché to say that every
week counts. And there are a few polling-related themes we’ll be watching especially closely
this week.
This is probably about the last week, for instance, in which Mitt Romney can reasonably hope
that President Obama’s numbers will deteriorate organically as his convention bounce fades.
That is not to say that Mr. Obama’s standing could not decline later on in the race, for any
number of reasons. But if it does, the decline will probably need to be forced by Mr. Romney’s campaign,
or by developments in the news cycle, not the mere loss of post-convention momentum.
We’ll also be looking to see if there is a greater consensus in the polls this week. In general, last
week’s numbers started out a bit underwhelming for Mr. Obama — suggesting that the
momentum from his convention was eroding — but then picked up strength as the week wore
on.
Still, there were splits among the tracking polls and among other national surveys; between state
polls that called cellphones and those that did not; and among pollsters who came to a wide
variety of conclusions about whose supporters were more enthusiastic and more likely to turn
out.
But before we get lost in the weeds, let’s consider a more basic question. What did the polling
look like at this stage in past elections, and how did it compare against the actual results?
Our polling database contains surveys going back to 1936. The data is quite thin (essentially just
the Gallup national poll and nothing else) through about 1968, but it’s nevertheless worth a look.
In the table below, I’ve averaged the polls that were conducted 40 to 50 days before the election
in each year — the time period that we find ourselves in now. (In years when there were no polls
in this precise time window, I used the nearest available survey.)
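The averaging rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the table: the function name, the (days before election, margin) layout and the sample numbers are all placeholders invented for the example, and the fallback assumes "nearest" means nearest to the edge of the window.

```python
from statistics import mean


def late_september_average(polls, lo=40, hi=50):
    """Average the margins of polls taken lo..hi days before the election.

    `polls` is a list of (days_before_election, margin) pairs. This layout
    is an assumption made for the sketch, not the actual database format.
    """
    if not polls:
        raise ValueError("need at least one poll")

    in_window = [margin for days, margin in polls if lo <= days <= hi]
    if in_window:
        return mean(in_window)

    # No poll in the window: fall back to the single nearest survey,
    # measuring distance to the closest edge of the window.
    def distance(days):
        if days < lo:
            return lo - days
        if days > hi:
            return days - hi
        return 0

    nearest = min(polls, key=lambda poll: distance(poll[0]))
    return nearest[1]


# Placeholder numbers, not real polls.
example = [(62, 3.0), (48, 5.0), (44, 3.0), (30, 6.0)]
print(late_september_average(example))  # averages the 48- and 44-day polls
```

When two surveys are equally close to the window, this sketch simply keeps the first one in the list; a real analysis would need an explicit tie-breaking rule.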
The table considers the race from the standpoint of the incumbent party (designated with the
color purple) and the challenging party (wearing the orange jerseys), without worrying about
whether they were Democrats or Republicans. Mr. Obama’s position, for instance, is probably
more analogous to that of the Republican incumbent George W. Bush in 2004 than it is to the
candidate from his own party that year, John Kerry.
This is an awful lot of data, but there are several reasonably clear themes.
First, the polling by this time in the cycle has been reasonably good, especially when it comes
to calling the winners and losers in the race. Of the 19 candidates who led in the polls at this
stage since 1936, 18 won the popular vote (Thomas E. Dewey in 1948 is the exception), and 17
won the Electoral College (Al Gore lost it in 2000, along with Mr. Dewey).
Of course, if Mr. Obama led in the race by 30 percentage points — as Lyndon B. Johnson did in
1964 — there wouldn’t be much need for such detailed analysis, and FiveThirtyEight might be
free to blog about the baseball playoffs.
If you eliminate the candidates with double-digit leads, the front-runner’s record is eight
Electoral College wins in 10 tries, or a batting average of 80 percent.
This is a simple method, to the point of being crude. But it’s interesting, nevertheless, that the 80
percent figure corresponds quite well with the FiveThirtyEight forecast, which gave Mr. Obama
a 78 percent chance of winning as of Sunday night, and with the odds on offer by bookmakers,
many of whom list Mr. Obama as about a 4-to-1 favorite.
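The three figures in this comparison can be put on a common scale with a little arithmetic. The sketch below is only an illustration: the helper names are invented, and it reads the bookmakers' "4-to-1 favorite" as four chances of winning to one of losing, ignoring whatever margin a bookmaker builds into the price.

```python
def odds_to_probability(favorable, unfavorable):
    """Turn 'favorable-to-unfavorable' odds into a win probability."""
    return favorable / (favorable + unfavorable)


def probability_to_odds(p):
    """Turn a win probability into odds in favor (e.g. 0.8 -> about 4, read as 4-to-1)."""
    if not 0 < p < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1 - p)


historical = 8 / 10                    # eight Electoral College wins in 10 tries
bookmaker = odds_to_probability(4, 1)  # "about a 4-to-1 favorite"
forecast = 0.78                        # the FiveThirtyEight figure cited in the text

print(f"historical record: {historical:.0%}")
print(f"bookmaker odds:    {bookmaker:.0%}")
print(f"forecast as odds:  {probability_to_odds(forecast):.1f}-to-1")
```

On this reading, the historical record and the bookmakers' price land on the same number, and the 78 percent forecast works out to a shade under 4-to-1.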
The second theme is one that we’ve brought up before. There has not been any tendency, at
least at this stage of the race, for the contest to break toward the challenging candidate.
Instead, it’s actually the incumbent-party candidate who has gained ground on average since
1936. On average, the incumbent candidate added 4.6 percentage points between the late
September polls and his actual Election Day result, whereas the challenger gained 2.5 percentage
points.
You can slice the data in slightly different ways if you like: by looking at only true incumbent
presidents, for instance, as opposed to those who represented the incumbent party after the sitting
president retired — or furthermore, you can restrict the sample to elected incumbents, which
would exclude cases like Gerald R. Ford in 1976. But it gets you to more or less the same
answer.
It is also important to observe, however, that the challenging party’s candidate has gained
more ground than the incumbent in each of the past four election cycles (from 1996 through
2008). Statistically speaking, this streak does not tell us all that much (the incumbent party
closed well in each year from 1988 through 1992). But perhaps this reflects the fact that the
conventions are being held later and later, meaning that the incumbent-party candidate, who
holds his convention last, could still be in the midst of a modest convention bounce at this stage
of the race. For that reason, I think we’ll need to wait until at least the end of the week to see if
Mr. Obama’s numbers hold.
But the point is not to argue for the idea that Mr. Obama is likely to gain ground so much as
against the notion that Mr. Romney will necessarily have a tail wind. In 14 of the 19 elections
since 1936, both the incumbent and the challenger added at least some points to their standing
relative to each candidate’s late September polls.
A corollary to this is that the incumbent (or the challenger, for that matter) does not need to
be at 50 percent of the vote to be a clear favorite to win: the eventual winner will probably
pick up at least some undecided voters, and at least a few votes will go to third-party candidates.
Mr. Obama’s current number in the polls — about 48 or 49 percent on average in national
surveys — is very similar to those of George W. Bush in 2004, George H.W. Bush in 1988, and
Franklin D. Roosevelt in 1944, all of whom won, some of them easily.
Harry S. Truman won the 1948 election despite being at just 39 percent at this point in the polls.
His opponent, Mr. Dewey, achieved the highest standing in the late September polls (47 percent)
of any candidate (incumbent or challenger) who failed to win the election, although John F.
Kennedy came quite close to losing in 1960 despite being at 49 percent in the Gallup poll in
September.
To the extent there’s a useful rule of thumb about a candidate achieving 50 percent in the polls, it
is this: a candidate who reaches 50 percent of the vote late in the race is almost certain to
win. Below that threshold, there are fewer guarantees. But a candidate (incumbent or challenger)
at 48 or 49 percent of the vote will normally be a clear favorite.
Nonetheless, another theme: although Mr. Obama’s raw vote share looks reasonably strong, Mr.
Obama’s margin over Mr. Romney is not that impressive for an elected incumbent. On
average, elected incumbents have led by 7.7 percentage points at this stage of the race, a larger
margin than Mr. Obama’s advantage, which is in the range of four points.
However, this also helps to explain why Mr. Obama is leading in the race despite a mediocre
economy. If an elected incumbent wins by a margin in the high single digits in an average
year, that gives him quite a bit of slack if conditions are below-average, but not terrible. The
economy is bad, but perhaps not quite bad enough to oust an elected incumbent who otherwise
has a fair number of advantages.
The next point is that large changes can occur late in the race, or at least large errors in the
polling. There were four years (1936, 1948, 1968 and 1972) in which the actual election result
diverged by at least 10 points from the late September polls, and several other years (like 1980)
when there was a shift in the mid-to-high single digits. Of these years, only 1948 reversed the
winner — but there were also a lot of close calls, like a near-comeback by Hubert H. Humphrey
in 1968, who went from 15 points down to losing to Richard M. Nixon by less than a full
percentage point.
A general rule in statistical analysis is that close calls really ought to count, at least for partial
credit. Several election years — certainly 1960, 1968 and 2000, and arguably 1976 and 2004 —
were close enough that their results could have been altered by essentially random factors.
But these late changes in the polls seem to be becoming less frequent. Since 1972, the average
change between the late September polls and the election result is 4.9 percentage points in one
direction or another, versus an average error of 7.1 percentage points between 1936 and 1968.
And the shifts have been smaller still, 3.7 percentage points on average, in the five elections
since 1992.
Does this reflect improved (or at least more abundant) polling, changing behavior in the
electorate, or both? Presumably a little of both. Gallup, for instance, had Mr. Dewey defeating
Mr. Truman in 1948, but if there had been a dozen pollsters in the field back then, would they all
have shown that same result? (Consider that, until Sunday, Gallup’s national tracking poll
showed a tied race — whereas virtually every other state and national pollster has produced
numbers consistent with Mr. Obama holding at least a small lead.)
But there should also be little doubt that Americans are tuning into the presidential race earlier,
and that they are becoming more partisan, two trends that lock them into their candidate choices
sooner and reduce late-stage volatility. And an increasing number of Americans are taking
advantage of early voting — which is already under way in some states — meaning that they
cast their ballot sooner in an entirely literal sense.
Next, and related, there are few undecided voters this year. On average among national polls,
about 7 percent of voters either say they are undecided, or that they will vote for a third-party
candidate — the same percentage as in 2004, when voters committed early to Mr. Bush or Mr.
Kerry. The figures are slightly lower than at a comparable point in 2008, and considerably lower
than in 2000.
By the way, I am intentionally lumping undecided voters and potential votes for third-party
candidates together. Some voters who are not thrilled with the major-party choices may name a
third-party candidate when a pollster gives them the option, but then grudgingly vote Democrat
or Republican for fear of wasting their votes otherwise. For this reason, polls generally overstate
the standing of third-party candidates, and for forecasting purposes it may be proper to treat
ostensible third-party voters as de facto undecideds.
The exception is when a third-party candidate is potentially more viable, like H. Ross Perot in
1992. But just as a greater number of undecided voters contributes volatility to the outcome, so
does the presence of strong third-party choices. In those years, there are three vectors along
which votes can move — between the Democrat and the independent, the Democrat and the
Republican, and the independent and the Republican — as opposed to just one. Many of the
years associated with the largest late-stage errors in the polling, like 1968 and 1980, were also
associated with third-party candidates.
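The count of "vectors" is just the number of distinct pairs of candidates, which a short sketch makes concrete. The function name and candidate labels are placeholders chosen for the illustration.

```python
from itertools import combinations


def vote_flow_pairs(candidates):
    """List every pair of candidates between whom votes can move."""
    return list(combinations(candidates, 2))


two_way = vote_flow_pairs(["Democrat", "Republican"])
three_way = vote_flow_pairs(["Democrat", "Republican", "independent"])

print(len(two_way), two_way)      # one pair in a two-candidate race
print(len(three_way), three_way)  # three pairs once an independent is viable
```

Adding one viable candidate triples the number of channels through which the polls can shift, which is consistent with the extra volatility seen in those years.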
Thus, although a shift of several percentage points in Mr. Romney’s favor is far from impossible,
or even all that unlikely, this also looks like a year in which volatility in the polls might be lower
than average. Third-party candidates are playing only a minor role this year, there are few
undecideds and the late-stage movement in the polls has been on a secular downward trend over
the past two decades.
Furthermore, there tends to be less movement in the polls in reasonably close elections than in
blowouts, when the trailing candidate can sometimes receive a dead-cat bounce, or when the
front-runner’s advantage grows from large to larger if the trailing candidate’s supporters are too
despondent to turn out, as may have been the case for Walter Mondale’s Democrats in 1984.
And indeed, volatility has been low throughout the campaign. Just as in the stock market, past
volatility seems to predict future volatility in the polls.
So this is why, despite the importance of the big picture, we will also need to sweat the small
stuff this week. It seems plausible that by seven days from now, the consensus of data could
point toward anything from Mr. Obama being a two-point favorite (about where the race was
before the conventions) to being as much as six points ahead (as some of his stronger state polls
seem to imply). Likewise, he could be at anywhere from about 47 percent of the vote (if his
numbers recede from a convention bounce) to 50 percent (if his bounce holds and he inches
forward as undecided voters commit).
This makes an enormous amount of difference. Based on the way that our forecast model
calculates it, a candidate ahead by two percentage points at this stage would be about a two-to-one
favorite to win. Mr. Romney might have to accept those odds at this stage, in the hope of improving his
position enough to make further gains later. But a candidate ahead by six points would have
around a 90 percent chance of victory.