ECONOMICS AND ELECTING THE PRESIDENT
THE WORK OF RAY FAIR
http://fairmodel.econ.yale.edu/
http://fairmodel.econ.yale.edu/vote2008/index2.htm
http://fairmodel.econ.yale.edu/RAYFAIR/PDF/2006CHTM.HTM

Presidential Election Links
http://fairmodel.econ.yale.edu/
http://www.apsanet.org/content_58382.cfm
http://www.douglashibbs.com/Election2012/2012ElectionMainPage.htm

Economic Growth and the United States Presidency: Can You Evaluate the Players Without a Scorecard?
David J. Berri
Department of Applied Economics
California State University – Bakersfield
Bakersfield, California 93311
661-654-2027
dberri@csub.edu
James Peach
P. O. Box 30001/ MSC 3CQ
Department of Economics
New Mexico State University
Las Cruces, NM 88003
505-646-2113
jpeach@nmsu.edu

ABSTRACT
In several academic papers and a book, Ray Fair (1978, 1996, 2002) has demonstrated a link between the state of the macroeconomy and the outcome of the Presidential Election in the United States. Beginning with the 1916 election, Fair’s model, based on such factors as economic growth, inflation, and incumbency, was able to accurately predict the winner in virtually every election. The purpose of this research is to take the Fair model back to the 19th century. The question we address is as follows: Can a version of Fair’s model predict accurately in an environment where economic data were not made available to the voter?

LOUIS BEAN (1948) HOW TO PREDICT ELECTIONS
“Business depressions played a powerful role in throwing the Republicans out of office in 1874, after 1908, and in 1932, and they had exactly the same influence in ousting Democrats after the panic of 1858 and during the economic setbacks of 1894 and 1920.”
“Harding in 1920, McKinley in 1896, and Cleveland in 1884 were also depression-made presidents. Had the deciding electoral vote been cast for the candidate who had the majority of the popular vote in 1876, Tilden too, would have been a depression-made President.”

THE WORK OF RAY FAIR
Fair, Ray C. 1978. “The Effect of Economic Events on Votes for President.” The Review of Economics and Statistics (Vol. LX, No. 2):159-173 May 1978.
Fair, Ray C. 1982. “The Effect of Economic Events on Votes for President: 1980 Results.” The Review of Economics and Statistics (Vol. 64, No. 2):322-25 May 1982.
Fair, Ray C. 1996. “Econometrics and Presidential Elections.” Journal of Economic Perspectives (Vol. 10, No. 3):89-102 (Summer 1996).
Fair, Ray C. 2002. “The Effect of Economic Events on Votes for President: 2000 Update.” http://fairmodel.econ.yale.edu/RAYFAIR/PDF/2002DHTM Downloaded Feb 2, 2006.
Fair, Ray C. 2002. Predicting Presidential Elections and Other Things. Stanford: Stanford Business Books.

A FAIR MODEL
VOTE = a1 + a2*GROWTH + a3*INFLATION + a4*PARTY + a5*PERSON + a6*DURATION + a7*GOODNEWS + ε

DEFINING THE VARIABLES EMPLOYED
HTTP://FAIRMODEL.ECON.YALE.EDU/RAYFAIR/PDF/2002DHTM.HTM
VOTE = Incumbent share of the two-party presidential vote.
GROWTH = annual growth rate of real per capita GDP in the first three quarters of the election year.
INFLATION = absolute value of the growth rate of the GDP deflator in the first 15 quarters of the administration (annual rate) except for 1920, 1944, and 1948, where the values are zero.
PARTY = 1 if Democrats are in power, = -1 if Republicans are in power.
PERSON = 1 if the president is running, = 0 otherwise.
DURATION = 0 if the incumbent party has been in power for one term, 1 if the incumbent party has been in power for two consecutive terms, 1.25 if the incumbent party has been in power for three consecutive terms, 1.50 for four consecutive terms, and so on.
WAR = 1 for the elections of 1920, 1944, and 1948 and 0 otherwise.
GOODNEWS = number of quarters in the first 15 quarters of the administration in which the growth rate of real per capita GDP is greater than 3.2 percent at an annual rate except for 1920, 1944, and 1948, where the values are zero.
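The equation and variable definitions above can be sketched in a few lines of Python. This is only an illustration, not Fair's own code: the coefficient values are read from the Model 1 column of Table Three in these slides, WAR is left out because the equation as written on the slide omits it, and the input values in the example call are hypothetical.

```python
# Coefficients as read from the Model 1 column of Table Three (1916-2000).
# WAR is omitted here because the slide's equation does not include it.
MODEL_1 = {
    "intercept": 49.607,
    "growth": 0.691,
    "inflation": -0.775,
    "party": -2.713,
    "person": 3.251,
    "duration": -3.628,
    "goodnews": 0.837,
}


def duration(terms_in_power: int) -> float:
    """DURATION coding from the slide: 0, 1, 1.25, 1.50, and so on."""
    if terms_in_power < 1:
        raise ValueError("the incumbent party has held office for at least one term")
    if terms_in_power == 1:
        return 0.0
    return 1.0 + 0.25 * (terms_in_power - 2)


def predict_vote(coefs, growth, inflation, party, person, duration_value, goodnews):
    """Predicted incumbent share of the two-party vote (error term set to zero)."""
    return (
        coefs["intercept"]
        + coefs["growth"] * growth
        + coefs["inflation"] * inflation
        + coefs["party"] * party
        + coefs["person"] * person
        + coefs["duration"] * duration_value
        + coefs["goodnews"] * goodnews
    )


# Hypothetical election: a Democratic president running after one term,
# 2% growth, 3% inflation, five "good news" quarters.
example_vote = predict_vote(
    MODEL_1,
    growth=2.0,
    inflation=3.0,
    party=1,
    person=1,
    duration_value=duration(1),
    goodnews=5,
)
print(f"Predicted incumbent vote share: {example_vote:.1f}")
```

A predicted share above 50 means the model picks the incumbent party's candidate; below 50, the challenger.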
Table One
The Accuracy of the Fair Model 1916-2000
(Actual and Predicted = VOTE received by the incumbent party candidate)

Year  Incumbent     Challenger   Actual  Predicted  Error  Actual       Predicted
      Party                      VOTE    VOTE              Winner       Winner
      Candidate
1916  Wilson        Hughes       51.7    50.9       -0.8   Wilson       Wilson
1920  Cox           Harding      36.1    39.2        3.1   Harding      Harding
1924  Coolidge      Davis        58.2    57.3       -1.0   Coolidge     Coolidge
1928  Hoover        Smith        58.8    57.6       -1.2   Hoover       Hoover
1932  Hoover        Roosevelt    40.8    38.8       -2.1   Roosevelt    Roosevelt
1936  Roosevelt     Landon       62.5    63.8        1.4   Roosevelt    Roosevelt
1940  Roosevelt     Willkie      55.0    55.7        0.7   Roosevelt    Roosevelt
1944  Roosevelt     Dewey        53.8    52.5       -1.2   Roosevelt    Roosevelt
1948  Truman        Dewey        52.4    50.5       -1.8   Truman       Truman
1952  Stevenson     Eisenhower   44.6    44.4       -0.2   Eisenhower   Eisenhower
1956  Eisenhower    Stevenson    57.8    57.3       -0.5   Eisenhower   Eisenhower
1960  Nixon         Kennedy      49.9    51.6        1.7   Kennedy      Nixon
1964  L. Johnson    Goldwater    61.3    61.1       -0.3   L. Johnson   L. Johnson
1968  Humphrey      Nixon        49.6    50.2        0.6   Nixon        Humphrey
1972  Nixon         McGovern     61.8    59.4       -2.4   Nixon        Nixon
1976  Ford          Carter       48.9    48.9        0.0   Carter       Carter
1980  Carter        Reagan       44.7    45.7        1.0   Reagan       Reagan
1984  Reagan        Mondale      59.2    62.0        2.9   Reagan       Reagan
1988  G. Bush       Dukakis      53.9    51.3       -2.6   G. Bush      G. Bush
1992  G. Bush       Clinton      46.5    51.7        5.1   Clinton      G. Bush
1996  Clinton       Dole         54.7    53.7       -1.0   Clinton      Clinton
2000  Gore          G.W. Bush    50.3    48.9       -1.3   G.W. Bush    G.W. Bush

Source: http://fairmodel.econ.yale.edu/RAYFAIR/PDF/2002DHTM.HTM

SUMMARIZING FAIR: 1916-2000
Only incorrect in three elections: 1960, 1968, 1992.
Average absolute error: 1.5
Results are driven by economic variables with no consideration of a candidate’s appearance, debating talents, advertisements, or general campaign skills.

TAKING FAIR BACK TO 1824
New measures of growth and inflation are needed.
Louis Johnston and Samuel H. Williamson, "The Annual Real and Nominal GDP for the United States, 1789 - Present." Economic History Services, April 2002, URL: http://www.eh.net/hmit/gdp/
This data has been updated.
Updated data did not change our general findings.

THE MODELS TO BE ESTIMATED
Model 1  Original Fair Model (1916-2000)
Model 2  Fair Model with new measures of GROWTH and INFLATION (1916-2000)
Model 3  Fair Model with new measures of GROWTH and INFLATION, no GOODNEWS (1916-2000)
Model 4  Fair Model with new measures of GROWTH and INFLATION, no GOODNEWS (1916-2004)
Model 5  Fair Model with new measures of GROWTH and INFLATION, no GOODNEWS (1824-1912)
Model 6  Fair Model with new measures of GROWTH and INFLATION, no GOODNEWS (1824-2004)

Table Three
Various Estimates of Fair’s Model
Dependent Variable is VOTE
White Heteroskedasticity-Consistent Standard Errors & Covariance

                   Model 1    Model 2    Model 3    Model 4    Model 5    Model 6
Sample:            1916-2000  1916-2000  1916-2000  1916-2004  1824-1912  1824-2004
GROWTH             0.691*     0.457*     0.409**    0.413**    -0.135     0.342
                   (6.169)    (3.526)    (2.569)    (2.649)    (-0.560)   (1.577)
INFLATION          -0.775*    -0.808*    -0.988*    -0.943*    0.438      -0.191
                   (3.915)    (4.132)    (5.405)    (5.051)    (0.511)    (0.731)
PARTY              -2.713*    -2.054**   -1.644***  -1.252     0.647      -0.594
                   (5.434)    (2.644)    (1.946)    (1.363)    (0.344)    (0.580)
PERSON             3.251***   2.145      1.701      1.578      -0.372     1.705
                   (1.837)    (1.007)    (0.698)    (0.663)    (0.083)    (0.625)
DURATION           -3.628**   -4.238**   -5.216**   -4.519***  1.414      -0.448
                   (2.517)    (2.410)    (2.315)    (1.968)    (0.611)    (0.208)
WAR                3.855      5.268      2.905      2.262      -1.881     -0.835
                   (1.213)    (1.369)    (1.102)    (0.832)    (0.102)    (0.214)
GOODNEWS           0.837**    0.539
                   (2.932)    (1.238)
INTERCEPT          49.607*    52.871*    57.769*    56.892     48.458     50.889
                   (19.550)   (15.512)   (18.169)   (17.736)   (10.618)   (15.184)
R-Square           0.923      0.845      0.828      0.769      0.093      0.116
Adjusted R-Square  0.885      0.767      0.759      0.683      -0.248     -0.020
F-Statistic        24.000*    10.888*    12.015*    8.883*     0.272      0.850
Observations       22         22         22         23         23         46

t-statistics in parentheses below each coefficient.
* - Significant at the 1% level
** - Significant at the 5% level
*** - Significant at the 10% level

Table Four
The Accuracy of the Fair Model 1824-1912
Forecast Based on Model 3
(Actual and Predicted = VOTE received by the incumbent party candidate)

Year  Incumbent       Challenger    Actual  Predicted  Error  Actual        Predicted
      Party                         VOTE    VOTE              Winner        Winner
      Candidate
1824  Jackson         J.Q. Adams    57.2    39.8       17.4   J.Q. Adams*   J.Q. Adams
1828  J.Q. Adams      Jackson       43.8    56.9       13.1   Jackson       J.Q. Adams
1832  Jackson         Clay          59.2    58.0        1.2   Jackson       Jackson
1836  Van Buren       W. Harrison   58.1    43.7       14.4   Van Buren     W. Harrison
1840  Van Buren       W. Harrison   47.0    51.4        4.4   W. Harrison   Van Buren
1844  Clay            Polk          49.3    58.6        9.4   Polk          Clay
1848  Cass            Taylor        47.3    53.5        6.2   Taylor        Cass
1852  Scott           Pierce        46.3    62.0       15.7   Pierce        Scott
1856  Buchanan        Fremont       57.8    55.2        2.6   Buchanan      Buchanan
1860  Breckinridge    Lincoln       31.2    56.4       25.2   Lincoln       Breckinridge
1864  Lincoln         McClellan     55.0    42.9       12.1   Lincoln       McClellan
1868  Grant           Seymour       52.7    48.5        4.2   Grant         Seymour
1872  Grant           Greeley       55.9    55.9        0.0   Grant         Grant
1876  Hayes           Tilden        48.5    50.3        1.8   Hayes*        Hayes
1880  Garfield        Hancock       50.0    50.0        0.2   Garfield      Hancock
1884  Blaine          Cleveland     49.9    45.1        4.8   Cleveland     Cleveland
1888  Cleveland       B. Harrison   50.4    59.1        8.7   B. Harrison*  Cleveland
1892  B. Harrison     Cleveland     48.3    61.2       13.0   Cleveland     B. Harrison
1896  Bryan           McKinley      47.8    53.7        5.9   McKinley      Bryan
1900  McKinley        Bryan         53.2    59.5        6.3   McKinley      McKinley
1904  T. Roosevelt    Parker        60.0    49.9       10.1   T. Roosevelt  Parker
1908  Taft            Bryan         54.5    46.8        7.6   Taft          Bryan
1912  Taft/Roosevelt  Wilson        54.7    50.6        4.1   Wilson*       Taft-Roosevelt

* - did not win popular vote

WHY DOES THE FAIR MODEL FARE POORLY BEFORE 1916?
Economic data did not exist.
U.S. economy not integrated.
Federal government was not held responsible for the macroeconomy.
Non-economic issues were more important in the 19th century.

Econometrics and Presidential Elections
Larry M. Bartels

OVERVIEW OF THE FAIR MODEL
One of the most interesting aspects of Fair's essay is the unusually frank and detailed description it provides of the enormous amount of exploratory research underlying published analyses of aggregate election outcomes. What is the relevant sample period? Which economic variables matter? Measured over what time span? What does one do with third party votes, war years, or an unelected incumbent? In fewer than a dozen pages, Fair raises and resolves many such questions, as any data analyst must. In the process, he makes clear how much of what Leamer (1978) has referred to as “specification uncertainty” plagues this (or any other) statistical analysis of presidential election outcomes.

CHOOSING A MODEL
… (Fair’s) choice of model specification seems to have been guided by goodness-of-fit considerations rather than by a priori political or economic considerations. His data set begins in 1916 because “some experimentation . . . using observations prior to 1916” produced results that “were not as good.” Gerald Ford is sometimes counted as an incumbent and sometimes not, depending upon which treatment “improves the fit of the equation.” Revised economic data produced significant changes in several key coefficients, prompting renewed searching “to see which set of economic variables led to the best fit,” and so on.

WHAT HAVE WE LEARNED?
What most electoral scholars really care about is what the relationship between economic conditions and election outcomes tells us about voting behavior and democratic accountability. On that score, what have we learned, and what have we yet to learn?
The clearest and most significant implication of aggregate election analyses is that objective economic conditions -- not clever television ads, debate performances, or the other ephemera of day-to-day campaigning -- are the single most important influence upon an incumbent president's prospects for reelection.
Despite a good deal of uncertainty regarding the exact form of the relationship, the relevant time horizon, and the relative importance of specific economic indicators, there can be no doubt that presidential elections are, in significant part, referenda on the state of the economy.

WHY DOES THE ECONOMY MATTER?
Do economic conditions matter because people vote their own pocketbooks, or because they respond to changes in the whole nation's economic condition?
The work of Markus (1988) and others has demonstrated that personal and national economic fortunes are both important. However, this demonstration does almost nothing to resolve the related question of whether voters' underlying motivations are selfish or altruistic. (Selfish voters could rationally base forecasts of their own future incomes on recent changes in the national economy, while altruistic voters could rationally base expectations regarding their fellow citizens' future economic fortunes on their own recent economic experience.)

MORE ON “WHY”…
Can voters untangle the complex contributions of the president and other actors to the making of government policy? Can they untangle the even more complex contributions of government policy, exogenous economic forces, and dumb luck to observed levels of economic growth, wage changes, unemployment, or inflation?
Alesina, Londregan, and Rosenthal (1993) attempted to distinguish between “rational” and “naive” economic voting by estimating separate electoral effects for economic “shocks” and economic growth that was “predictable” on the basis of previous growth, partisan effects, and military mobilization. They found no significant difference between these potentially distinct effects, a result they interpreted (1993, 23) as “consistent with the hypothesis of naive retrospective voting.”

AND MORE ON “WHY”…
…we know remarkably little about why voters reward the incumbent president for prosperity or punish him for economic distress.
Do they have any rational basis for supposing that economic conditions in the election year are indicative of future conditions if the incumbent is reelected? (As far as I know, nobody has demonstrated such a connection.) Do they know or care what, if anything, the out party would do differently? Or, as Ferejohn (1986) and others would have it, are they simply holding up their end of a simple-minded implicit contract intended to extract whatever effort a self-interested incumbent may be able to exert on their behalf -- the only sort of accountability feasible in a situation marked by massive uncertainty and asymmetric information?

MY OWN THOUGHTS…
Three voters in the election…
Republicans (vote Republican)
Democrats (vote Democrat)
Independents
The only free agents are independents. These are voters who care so little, they don’t join a party. And these are the voters that matter.
Why the economy? It is the one issue that matters to the independent.

POLITICS AND FOOTBALL…
Some football coaches believe the run sets up the pass. Others think the pass sets up the run.
Fans, though, don’t care. You win, you keep your job. You lose, you lose your job.
Applied to politics… some people believe in smaller government and low taxes. Others believe in more government to solve problems.
Independents, though, don’t care. The economy does well, you keep your job. If not, you’re fired.
What the politician believes is simply not relevant.

Writing an academic paper
1. Introduction (tell cute story)
2. Literature Review
3. Describe Data
4. Create model:
   1. Identify dependent variable
   2. Identify independent variables
   3. State hypothesis
   4. Estimate model

DATA SUMMARY AND DESCRIPTION
Population Parameters – Summary and descriptive measures for the population.
Sample Statistics – Summary and descriptive measures for a sample.
NOTE: We rarely have data for the population. Hence we need to be able to draw inferences from a sample.
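As a rough illustration of sample statistics, the 22 forecast errors reported in Table One (1916-2000) can be treated as a sample and summarized with Python's standard `statistics` module. The variable names are chosen here for illustration only; the numbers are the Error column as it appears in the table.

```python
import statistics

# Error column of Table One, 1916 through 2000, in order.
errors = [
    -0.8, 3.1, -1.0, -1.2, -2.1, 1.4, 0.7, -1.2, -1.8, -0.2, -0.5,
    1.7, -0.3, 0.6, -2.4, 0.0, 1.0, 2.9, -2.6, 5.1, -1.0, -1.3,
]

# Work with the size of each miss, ignoring its direction.
abs_errors = [abs(e) for e in errors]

mean_abs_error = statistics.mean(abs_errors)      # the "average absolute error"
median_abs_error = statistics.median(abs_errors)  # the middle observation
sample_sd = statistics.stdev(abs_errors)          # sample standard deviation (n - 1)
error_range = max(abs_errors) - min(abs_errors)   # largest minus smallest

print(f"n = {len(abs_errors)}")
print(f"mean absolute error   = {mean_abs_error:.2f}")
print(f"median absolute error = {median_abs_error:.2f}")
print(f"sample std. deviation = {sample_sd:.2f}")
print(f"range                 = {error_range:.2f}")
```

The mean here reproduces the 1.5 average absolute error quoted in the summary slide. The mean sits above the median because a few large misses (1992 in particular) pull the average up.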
MEASURES OF CENTRAL TENDENCY
Mean – The average
Issue: You must note the distribution of the sample. If it is unbalanced the mean may be misleading.
Median – “Middle” observation

SYMMETRICAL VS. SKEWNESS
Symmetrical – A balanced distribution. Median = Mean
Skewness – A lack of balance.
Skewed to the left: Median > Mean
Skewed to the right: Median < Mean
If skewness is observed one may wish to examine a sub-sample of the data.

MEASURES OF DISPERSION
Range – Difference between the largest and smallest sample observations.
Issue: Only considers the extremes of the sample
Solution: Inter-quartile or percentile range
Variance and Standard Deviation
Sample Variance – Average squared deviation from the sample mean.
Sample Standard Deviation – Square root of the sample variance.

COEFFICIENT OF VARIATION
Coefficient of variation – Standard deviation divided by the mean.
A measure that does not rely on the size of the observations or the unit of measurement. This is used to compare relative dispersion across a variety of data.

Hypothesis Testing
Hypothesis Testing – Statistical experiment used to measure the reasonableness of a given theory or premise
NOTE: WE DO NOT “PROVE” A THEORY
Type I Error – Incorrect rejection of a ‘true’ hypothesis.
Type II Error – Failure to reject a ‘false’ hypothesis.

Regression Analysis Definitions
Regression analysis – statistical method for describing the relationship between a dependent variable Y and independent variable(s) X.
Deterministic Relation – An identity; a relationship that is known with certainty.
Statistical Relation – An inexact relation

Regression Analysis Types of Data
Time series – A daily, weekly, monthly, or annual sequence of data. e.g. GDP data for the United States from 1950 to 2012
Cross-section – Data from a common point in time. e.g. GDP data for OECD nations in 1986.
Panel data – Data that combines both cross-section and time-series data. e.g. GDP data for OECD nations from 1960 to 2012.

Steps in Regression Analysis
1. Specify the dependent and independent variable(s) to be analyzed
2. Obtain reliable data.
3. Estimate the model.
4. Interpret the regression results.

Specifying the Regression Analysis
The choice of independent variables
Univariate analysis = Simple regression model - A regression model with only one independent variable.
Issue: Cannot impose ceteris paribus
Multivariate analysis = Multiple regression model - A regression model with multiple independent variables.

Univariate Analysis
Y = a + bX
Where
Y = The Dependent Variable, or what you are trying to explain (or predict).
X = The Independent Variable, or what you believe explains Y.
a = the y-intercept or constant term.
b = the slope or coefficient

The Least Squares Model
Ordinary Least Squares: a statistical method that chooses the regression line by minimizing the sum of squared distances between the data points and the regression line.
Why not sum the errors? Positive and negative errors cancel, so the sum generally equals zero.
Why not take the absolute value of the errors? We wish to emphasize large errors.

The slope coefficient
How do we interpret the slope coefficient?
Example: Winning percentage = -0.830 + 0.014*(Points per game)
Each additional point per game is associated with a 0.014 increase in winning percentage.
How many wins is this? 1.1 over an 82 game season.
Is this the ‘truth’? We never know the truth, we are simply attempting to derive estimates.
Is this a ‘good’ estimate? Clearly points alone do not explain wins.

The constant term
How do we interpret the constant term?
The constant term must be included in the regression, or else we are forcing the regression line through zero.
The constant term is used to impose a zero mean for the error term, hence it acts as a garbage collector. In other words, it captures all the factors not explicitly utilized in the equation.
The constant term is theoretically the value of Y when X is zero. Frequently this is outside the range of possibility, and therefore the constant term should not be interpreted.
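The univariate model Y = a + bX and the least squares idea can be sketched without any libraries. The five data points below are made up for illustration; the closed-form expressions for the slope and intercept are the standard textbook ones for simple regression.

```python
# Hypothetical data, invented for this example only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Slope: co-movement of X and Y divided by the variation in X.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx

# Intercept: forces the line through the point of means (x_bar, y_bar).
a = y_bar - b * x_bar

fitted = [a + b * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print(f"Y = {a:.3f} + {b:.3f} X")
# With a constant term included, the residuals sum to zero (up to rounding),
# which is why the raw sum of errors is not a useful fit criterion.
print(f"sum of residuals = {sum(residuals):.10f}")
print(f"sum of squared residuals = {sum(r ** 2 for r in residuals):.4f}")

# Reading the slope from the slide's example: 0.014 per point, 82 games.
extra_wins = 0.014 * 82
print(f"extra wins per additional point per game = {extra_wins:.1f}")
```

The last line reproduces the arithmetic behind the "1.1 over an 82 game season" figure.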
The Error Term
Error term (e) = random, included because we do not expect a perfect relationship.
Sources of error
1. Omitted variables
2. Measurement error
3. Incorrect functional form

Multivariate Analysis
Introducing the idea of ceteris paribus.
One cannot impose ceteris paribus unless all relevant variables are included in the model.

Coefficient of Determination
Coefficient of Determination – Percentage of Y-variation explained by the regression model. Also referred to as R2
R2 = (Variation Explained by Regression) / (Total Variation in Y)
R2 ranges from 0 to 1.

TSS, ESS, RSS in words
Total Sum of Squares = TSS = How much variation there is to explain.
Explained Sum of Squares = ESS = How much variation you explained.
Residual Sum of Squares = RSS = How much variation you did not explain.

Adjusted R-Squared
Adding any independent variable will never lower R2 and will almost always raise it, even when the variable is irrelevant.
To combat this problem, we often report the adjusted R2.

F Statistic
F-statistic tells us if the independent variables as a group explain a statistically significant share of the variation in the dependent variable.

Judging the significance of a variable
The t-statistic: estimated coefficient / standard error of the coefficient.
The t-statistic is used to test the null hypothesis (H0) that the coefficient is equal to zero. The alternative hypothesis (HA) is that the coefficient is different from zero.
Rule of thumb: if |t| > 2 we believe the coefficient is statistically different from zero. WHY?
Understand the difference between statistical significance and economic significance.

Multicollinearity
Multicollinearity - two or more independent variables exhibit a linear correlation.
Consequences
a. Standard errors will rise, t-stats will fall
b. Estimates will be sensitive to changes in specification
c. Overall fit of regression will be unaffected

Other Econometric Issues
Omitted Variable Bias: You cannot impose ceteris paribus if relevant independent variables are not included in the model.
Small Sample Bias: You cannot adequately assess a relationship with an inadequate sample. Remember, we are trying to learn about the underlying population. THE BIG WORDS: Heteroskedasticity and Autocorrelation
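The fit measures discussed earlier in these slides (TSS, ESS, RSS, R2, and adjusted R2) can be sketched as follows. The data and fitted values are hypothetical (the fitted values come from the least squares line Y = 0.05 + 1.99X through five made-up points). The adjusted R2 formula is the standard textbook one and is not spelled out on the slides, and the regressor counts used in the Table Three check (7 for Model 1, 6 for Model 5, excluding the intercept) are this sketch's reading of that table.

```python
def sums_of_squares(y, fitted):
    """Return (TSS, ESS, RSS) for observed values and fitted values."""
    y_bar = sum(y) / len(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)                # variation to explain
    ess = sum((fi - y_bar) ** 2 for fi in fitted)           # variation explained
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # variation left over
    return tss, ess, rss


def adjusted_r2(r2, n_obs, n_regressors):
    """Penalize R2 for the number of regressors (intercept not counted)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Hypothetical observations and their OLS fitted values (Y = 0.05 + 1.99 X).
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fitted = [2.04, 4.03, 6.02, 8.01, 10.00]

tss, ess, rss = sums_of_squares(y, fitted)
r2 = ess / tss
print(f"TSS = {tss:.3f}, ESS = {ess:.3f}, RSS = {rss:.3f}")
print(f"R2 = {r2:.4f}, adjusted R2 = {adjusted_r2(r2, len(y), 1):.4f}")

# Check against Table Three: R2 and sample size as reported there.
model_1_adj = adjusted_r2(0.923, n_obs=22, n_regressors=7)
model_5_adj = adjusted_r2(0.093, n_obs=23, n_regressors=6)
print(f"Model 1 adjusted R2 = {model_1_adj:.3f}")
print(f"Model 5 adjusted R2 = {model_5_adj:.3f}")
```

The Model 5 result is negative, matching the table up to rounding: once the penalty for six regressors is applied, the 1824-1912 model explains essentially none of the variation in VOTE.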