Comparison of methods for the estimation of voter transition rates

IOANNIS ANDREADIS* & THEODORE CHADJIPADELIS
Department of Political Sciences, Aristotle University Thessaloniki

Keywords: Voter transition rates, Ecological inference, Simulation

Paper prepared for the Elections, Public Opinion and Parties 2008 Annual Conference

* Correspondence Address: Ioannis Andreadis, Department of Political Sciences, Aristotle University Thessaloniki, 46 Egnatia Str, 54625, Greece. Email: john@polsci.auth.gr

Introduction

Estimation of voter transitions between two elections has occupied political science researchers for more than 50 years and still attracts their attention. The most recent method in the literature is presented by Andreadis & Chadjipadelis (2009). In that paper we use data from the first and the second round of the 2007 French presidential elections and we present a) the transition matrix estimated by our algorithm, b) the transition matrix estimated by the hierarchical method proposed by Rosen et al (2001) and c) exit poll estimates. Table 1 reproduces those findings.

Table 1 Voter transition rates for the French Presidential elections 2007
Second-round vote:    SARKOZY PO*   ROYAL PO*
SARKOZY               97%           1%
ROYAL                 0%            98%
BAYROU                40%           39%
LE PEN                64%           14%
Other columns (SARKOZY PM**, RO***; ROYAL PM**, RO***; Non Valid + Abstention PO*, PM**, RO***): 2% 20% 98% 28% 2% 0% 52% 99% 27% 0% 24% 2% 0% 50% 35% 23% 46% 25% 21% 18% 52% 4% 21% 83% 22% 22% 11% 57%
* Poll estimates: averages of IPSOS† and TNS-SOFRES‡. ** Estimates using the proposed method. *** Estimates of the method proposed by Rosen et al.
† Available at http://www.ipsos.fr/presidentielle-2007/pdf/ssu-2eTour.pdf
‡ Available at http://2007.tns-sofres.com/etude.php?id=394

Table 1 presents the estimated voter transition rates between the top four candidates of the first round and the two candidates of the second round of the French Presidential elections which took place in 2007. A simple comparison of the results in Table 1 indicates that the voter transition matrices estimated by the proposed method are much closer to the exit poll estimates than the estimates provided by the hierarchical method proposed by Rosen et al.

In this paper we present a more detailed comparison of the methods for the estimation of voter transition rates which are known to the authors and available in R. The two aforementioned methods use only the electoral results as input. We also deal with the entropy maximizing approach initially proposed by Johnston and Hay (1983), which is also available in R and which requires an estimate of the general voter transition matrix. Simulation results provide substantial information on the accuracy and usefulness of each method for the estimation of voter transition rates.

Simulation (based on the French Presidential Elections)

We set up the simulation in the following way. Suppose we deal with the first and the second round of an election. Four parties participate in the first round and three parties (counting "not valid + abstention" as one of these parties) participate in the second round. Elections take place in 96 departments. In the first round the general results for the four parties are as follows: n1./n=36%, n2./n=30%, n3./n=22% and n4./n=12% (for SARKOZY, ROYAL, BAYROU and LE PEN respectively). Finally, suppose that the real 4x3 general (i.e. for the entire country) transition matrix is the one given in Table 2.

Table 2 General transition matrix (nij/ni.)
First-round vote:         SARKOZY   ROYAL    BAYROU   LE PEN
Second-round vote
SARKOZY                   97%       0.5%     40%      64%
ROYAL                     1%        98%      39%      14%
Non Valid + Abstention    2%        1.5%     21%      22%

With the aforementioned details we have prepared a framework similar to the 2007 Presidential Elections in France. For the sake of simplicity we use as the size of the electorate n only the votes for the four candidates who earned the support of a significant part of the French electorate in the first round.
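As a worked illustration of how the first-round shares and Table 2 fit together, the fraction of the whole electorate that makes a given transition is the first-round share multiplied by the corresponding transition rate. The examples below use the rounded shares quoted above, so the products are approximate.

```latex
\frac{n_{ij}}{n} \;=\; \frac{n_{i\cdot}}{n}\cdot\frac{n_{ij}}{n_{i\cdot}},
\qquad\text{e.g.}\qquad
\begin{aligned}
\text{SARKOZY}\to\text{SARKOZY}: &\quad 0.36 \times 0.97 = 0.3492,\\
\text{ROYAL}\to\text{ROYAL}:     &\quad 0.30 \times 0.98 = 0.2940,\\
\text{LE PEN}\to\text{SARKOZY}:  &\quad 0.12 \times 0.64 = 0.0768.
\end{aligned}
```

Since every column of Table 2 sums to 100% and the four shares sum to 100%, the twelve products sum to 1.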
In each of the 96 departments we draw a sample from a multinomial distribution Multinomial(nk, p) with 12 possible outcomes, where nk is the size of the electorate in department k=1, 2, …, 96 and p=c(0.3503, 0.0015, 0.0874, 0.0797, 0.0036, 0.2901, 0.0852, 0.0174, 0.0072, 0.0044, 0.0459, 0.0274) is a vector of 12 probabilities, each of which corresponds to the number of voter transitions nij divided by the size n of the total electorate. Thus, for each department we have nk independent trials, each of which results in exactly one of the 12 possible outcomes, with probabilities p.

We store the sample values in a 4x3x96 array with elements nijk, where i=1, 2, 3, 4, j=1, 2, 3 and k=1, 2, …, 96. The values of nijk represent the number of voters who have moved from party i to party j in department k. Summing over all k we get the 4x3 matrix with elements nij, where nij is the number of voters who have moved from party i to party j in all departments. Summing over all j we get the 96x4 matrix F with elements ni.k, where ni.k is the number of voters who voted for party i in the first round in department k. Summing over all i we get the 96x3 matrix S with elements n.jk, where n.jk is the number of voters who voted for party j in the second round in department k. We then use the matrices F and S, along with the vector N=(n1, n2, …, n96), where nk is the size of the electorate in department k, as input to the methods we compare. This procedure was replicated 100 times and the findings are presented in the following section.

Simulation results

Markov Chain Monte Carlo Hierarchical Method

Voter transition estimation is a special case of the general ecological inference problem, so methods proposed for the general problem can be applied to it. Although some researchers have dealt with the RxC ecological inference problem, voter transition estimation remains a difficult and tedious goal.
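For concreteness, the data generation of the simulation section, which produces the inputs F, S and N handed to every method compared here, can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' R code: the function names, the department sizes and the seed are assumptions, and the ordering of p (first-round party varying fastest) is inferred rather than stated in the text.

```python
import random
from collections import Counter

# Cell probabilities p quoted in the text. Assumed order: first-round party i
# varies fastest, i.e. (1,1), (2,1), (3,1), (4,1), (1,2), ...
P = [0.3503, 0.0015, 0.0874, 0.0797,
     0.0036, 0.2901, 0.0852, 0.0174,
     0.0072, 0.0044, 0.0459, 0.0274]
N_FIRST, N_SECOND = 4, 3  # parties in the first and in the second round


def simulate_department(n_k, rng):
    """Draw n_k voters and return the 4x3 table of transition counts n_ij."""
    # random.choices rescales the weights, so P need not sum to exactly 1.
    counts = Counter(rng.choices(range(len(P)), weights=P, k=n_k))
    return [[counts[i + N_FIRST * j] for j in range(N_SECOND)]
            for i in range(N_FIRST)]


def simulate(electorates, seed=0):
    """Return (tables, F, S) for a list of department sizes n_k."""
    rng = random.Random(seed)
    tables = [simulate_department(n_k, rng) for n_k in electorates]
    # F: first-round totals n_i.k (sum over j)
    F = [[sum(row) for row in t] for t in tables]
    # S: second-round totals n_.jk (sum over i)
    S = [[sum(t[i][j] for i in range(N_FIRST)) for j in range(N_SECOND)]
         for t in tables]
    return tables, F, S


# Hypothetical electorate sizes: 96 departments of 1,000 voters each.
N = [1000] * 96
tables, F, S = simulate(N)
```

Only F, S and N would be passed on to an estimation method; the array `tables` plays the role of the "real" transitions against which the estimates are later compared.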
For instance, the MCMCHM (Markov Chain Monte Carlo Hierarchical Method), a Multinomial-Dirichlet model proposed by Rosen et al (2001), is computationally intensive and extremely demanding of computer resources. Moreover, the simpler and quicker version of the MCMCHM, which is available as R code, appears to give questionable estimates: comparisons with exit-poll estimates indicate that the output of the hierarchical method cannot be accepted as a good fit to reality.

For a more formal evaluation of the MCMCHM we present Table 3. This table includes the average voter transition rates estimated by the method over 100 replicates of the simulation procedure. The method fails to estimate the high loyalty rates of those who voted for Sarkozy or Royal in the first round of the elections. More specifically, according to the MCMCHM estimates, those who voted for Royal in the first round appear to have almost the same probability of voting for either Sarkozy or Royal in the second round. In addition, according to MCMCHM, Sarkozy appears to earn fewer Le Pen voters than Royal and his loyalty rate is less than 60%.

Table 3 Average voter transition estimates provided by Rosen et al
First-round vote:         SARKOZY   ROYAL    BAYROU   LE PEN
Second-round vote
SARKOZY                   58.42%    44.38%   56.39%   41.79%
ROYAL                     34.78%    47.77%   33.67%   43.70%
Non Valid + Abstention    6.80%     7.85%    9.94%    14.51%

Table 4 Summary statistics for Sarkozy voters
                          Min.     1st Qu.  Median   3rd Qu.  Max.
SARKOZY                   48.94%   56.09%   58.36%   60.65%   68.03%
ROYAL                     25.23%   32.54%   34.83%   37.06%   44.28%
Non Valid + Abstention    6.70%    6.77%    6.79%    6.82%    6.91%

In Table 4 we present the summary statistics of the MCMCHM estimates for the first-round voters of Sarkozy. Even the maximum value of the estimates for the loyalty rate of Sarkozy voters is almost 30 percentage points away from the "real" loyalty rate, which is 97%. Also, the minimum estimated defection rate towards Royal is 25 times greater than the "real" rate, which is 1%.
VTR algorithm proposed by Andreadis and Chadjipadelis

This method consists of five tasks: i) initial data preparation, ii) the selection of the best pair of parties, iii) ecological inference for 2x2 tables, iv) extraction of the estimated quantities of interest and data preparation for the next cycle and v) final calculations. A short outline of the method is as follows. Matrices F and S and vector N are used as input to the algorithm. Each cycle begins with the selection of the "best pair" of parties (i, j), where party i has run in the first round and party j has run in the second round of the elections. The choice of the "best pair" is essentially based on correlation coefficients. For this pair of parties we apply a 2x2 ecological inference method to estimate the voter transitions from party i to party j in each department k. Next, we calculate the remaining electorate after subtracting the estimated voter transitions and we proceed to the next cycle.

Table 5 Average voter transition estimates provided by VTR
First-round vote:         SARKOZY   ROYAL    BAYROU   LE PEN
Second-round vote
SARKOZY                   95.80%    0.52%    43.20%   62.00%
ROYAL                     1.22%     94.20%   41.80%   17.50%
Non Valid + Abstention    2.98%     5.29%    15.00%   20.40%

Table 5 includes the average voter transition rates estimated by the VTR algorithm over 100 replicates of the simulation procedure. The method provides close estimates of the high loyalty rates of those who voted for Sarkozy or Royal in the first round of the elections. It also gives good estimates of the voter transitions between these two candidates. The largest deviation from the "real" values is the underestimation (15% instead of the "real" 21%) of those who voted for Bayrou in the first round and did not vote for either of the remaining candidates in the second round.

In Table 6 we present the summary statistics of the VTR estimates for the first-round voters of Sarkozy.
Even the minimum value of the estimates for the loyalty rate of Sarkozy voters is far closer to the "real" value than the maximum value provided by MCMCHM. This means that the worst case scenario of VTR is better than the best case scenario of MCMCHM for the estimation of the loyalty rate of Sarkozy voters.

Table 6 Summary statistics for Sarkozy voters
                          Min.     1st Qu.  Median   3rd Qu.  Max.
SARKOZY                   83.30%   92.90%   96.50%   99.80%   99.90%
ROYAL                     0.00%    0.00%    0.06%    1.71%    7.52%
Non Valid + Abstention    0.00%    0.15%    2.38%    5.43%    11.50%

Comparison of the methods

As an accuracy index for the estimation of the general transition matrix we use the total absolute deviation from the "real" general transition values. The "real" general transition values are calculated as the ratio $p_{ij} = n_{ij}/n$. Let $\hat{p}_{ij}$ be the estimated voter transition rate from party i to party j. The total absolute deviation is calculated by the formula

$\mathrm{tad} = \sum_{i=1}^{4} \sum_{j=1}^{3} \left| \hat{p}_{ij} - p_{ij} \right|$

and it represents the total number of voters (as a fraction of the electorate) who were misclassified by the method. Another useful index is the average absolute deviation, which is calculated by the formula

$\mathrm{aad} = \frac{1}{12} \sum_{i=1}^{4} \sum_{j=1}^{3} \left| \hat{p}_{ij} - p_{ij} \right|$.

This index shows the average absolute deviation of the estimated value from the "real" value in each of the 12 cells of the general transition matrix. We include this index because it is helpful to remember that, in our case with a 4x3 transition matrix, a 12% total absolute deviation means that the average absolute deviation is 1%, i.e. for the average cell the difference between the estimate and the real value is 1%.

In Table 7 we present summary statistics of the absolute deviations for the MCMCHM and VTR methods. The average total number of voters (as a fraction of the electorate) who were misclassified by MCMCHM is 72.18%. This means that on average more than seven out of ten voters are misclassified when we use this method. For VTR the corresponding value is 10.89%.
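The two indices are simple to compute once the "real" and estimated matrices are available. The following minimal Python sketch illustrates them; the function names, the arrangement of the matrix (rows for first-round parties, columns for second-round parties) and the perturbed "estimate" are assumptions made purely for illustration and are not results of any of the methods.

```python
def total_absolute_deviation(p_hat, p):
    """tad: sum over all cells of |p_hat_ij - p_ij|."""
    return sum(abs(est - real)
               for est_row, real_row in zip(p_hat, p)
               for est, real in zip(est_row, real_row))


def average_absolute_deviation(p_hat, p):
    """aad: tad divided by the number of cells (12 for a 4x3 matrix)."""
    n_cells = sum(len(row) for row in p)
    return total_absolute_deviation(p_hat, p) / n_cells


# "Real" p_ij = n_ij / n, taken from the vector p of the simulation and
# arranged as rows = first-round party, columns = second-round party
# (this arrangement is an assumption).
p_real = [[0.3503, 0.0036, 0.0072],
          [0.0015, 0.2901, 0.0044],
          [0.0874, 0.0852, 0.0459],
          [0.0797, 0.0174, 0.0274]]

# Hypothetical estimate: every cell is off by 0.001, alternating in sign.
p_est = [[v + (0.001 if (i + j) % 2 == 0 else -0.001)
          for j, v in enumerate(row)]
         for i, row in enumerate(p_real)]

tad = total_absolute_deviation(p_est, p_real)
aad = average_absolute_deviation(p_est, p_real)
print(f"tad = {tad:.4f}, aad = {aad:.4f}")
```

In this made-up case all 12 cells are off by 0.1 percentage points, so tad is 1.2% and aad is 0.1%, mirroring the 12-to-1 relation between the two indices for a 4x3 matrix.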
Table 7 Summary statistics of absolute deviations
Method    Min.     1st Qu.  Median   Mean     3rd Qu.  Max.
MCMCHM    61.00%   69.18%   72.05%   72.18%   75.17%   84.15%
VTR       3.71%    7.98%    10.02%   10.89%   13.35%   22.77%

In Diagrams 1 and 2 we present histograms of the absolute deviations for MCMCHM and VTR respectively. It is evident that VTR provides estimates which are closer to reality than the estimates provided by MCMCHM. The total number of voters who are misclassified by VTR is less than 13.35% in 75% of the cases and the maximum deviation is 22.77%.

[Diagram 1. Histogram of absolute deviations of MCMCHM (x-axis: absolute deviations of MCMCHM; y-axis: frequency)]

[Diagram 2. Histogram of absolute deviations of VTR (x-axis: absolute deviations of VTR; y-axis: frequency)]

Entropy maximizing approach

Johnston and Pattie (2000) have demonstrated that the maximum entropy approach provides correct and rather accurate estimates. This was demonstrated using real data from New Zealand's 1996 general election. On the other hand, the entropy maximizing method requires an estimate of the general voter transition matrix. In some cases (e.g. historical data) this matrix may not be available. Even if such a matrix is available (e.g. from an exit poll with questions about the vote in both the current and the previous election), accepting it as an accurate and representative description of the electorate and using it as a major factor in the estimation procedure is not fully justified: studies have demonstrated that people tend to change their memory of their vote in the previous election (especially when they have changed their vote) in order to appear consistent with their previous vote (Benewick et al. 1969; Himmelweit, Biberian, and Stockdale 1978; Shachar and Eckstein 2007). Even if we accept that there is no response bias, the sample may not represent the population accurately due to sampling errors.
Thus, to evaluate the maximum entropy method we should not use the "real" general transition matrix as the sample. Instead, we need to draw a sample of size ns from the total population. To do so, we use the vector sout=(m11, m21, m31, m41, m12, m22, m32, m42, m13, m23, m33, m43), where mij corresponds to voting for i in the first round and for j in the second round, i.e. sout is the set of all possible sampling outcomes. We also use the vector ws={nij, i=1, 2, 3, 4 and j=1, 2, 3}, which is a vector of probability weights for obtaining the elements of the vector being sampled. Then we take a sample of the specified size ns from the elements of sout with replacement.

In order to define the value of ns we need to take into account that our knowledge about the general transition matrix usually comes from exit polls. A usual sample size for exit polls is about 2000 questionnaires. Thus, for testing purposes in our case we set ns=2000.

The findings of the simulation indicate that the values of the general transition matrix estimated by the entropy maximizing method are very close to the values of the general transition matrix given by the sample. Thus, if the sample represents the population, the estimated general transition matrix provided by the entropy maximizing method is very close to reality. On the other hand, if the general transition matrix given by the sample differs from the real general transition matrix (due to sampling errors), the maximum entropy (ME) estimates deviate from reality in a similar way. As an example we use the findings from one of the cycles. In this cycle the sample is not an accurate representation of the population, as the sample transition matrix indicates (Table 8). In this sample Sarkozy's gains from both Bayrou and Le Pen are underestimated.
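The sampling step described above, which mimics an exit poll of ns respondents, can be sketched in Python as follows. This is a hedged illustration only: the function name and the seed are hypothetical, and the cell probabilities of the simulation are used as a stand-in for the weights ws, which in the text are the realised counts nij.

```python
import random
from collections import Counter

# Stand-in for the weights ws (proportional to n_ij): the cell probabilities
# of the simulation, rows = first-round party, columns = second-round party.
WEIGHTS = [[0.3503, 0.0036, 0.0072],
           [0.0015, 0.2901, 0.0044],
           [0.0874, 0.0852, 0.0459],
           [0.0797, 0.0174, 0.0274]]


def sample_transition_matrix(weights, n_s=2000, seed=0):
    """Draw n_s respondents with replacement; return sample rates m_ij / m_i."""
    rng = random.Random(seed)
    # All possible sampling outcomes (i, j), the analogue of sout.
    outcomes = [(i, j) for i in range(len(weights))
                for j in range(len(weights[0]))]
    flat_weights = [weights[i][j] for i, j in outcomes]
    counts = Counter(rng.choices(outcomes, weights=flat_weights, k=n_s))
    rates = []
    for i, row in enumerate(weights):
        row_total = sum(counts[(i, j)] for j in range(len(row)))
        rates.append([counts[(i, j)] / row_total if row_total else float("nan")
                      for j in range(len(row))])
    return rates


sample = sample_transition_matrix(WEIGHTS, n_s=2000, seed=1)
for name, row in zip(["SARKOZY", "ROYAL", "BAYROU", "LE PEN"], sample):
    print(name, [f"{100 * r:.1f}%" for r in row])
```

Re-running the function with different seeds gives a feel for how much a sample transition matrix based on 2000 respondents can drift from the population values, particularly for the smaller first-round parties.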
Table 8 Sample transition estimates
First-round vote:         SARKOZY   ROYAL    BAYROU   LE PEN
Second-round vote
SARKOZY                   97.59%    0.68%    35.46%   52.90%
ROYAL                     0.54%     98.97%   47.45%   17.39%
Non Valid + Abstention    1.87%     0.34%    17.09%   29.71%

In Table 9 we present the estimated transition rates provided by ME. The estimated values are very close to the values presented in Table 8. In this case the absolute deviation is very close to the average absolute deviation of the VTR method, which does not need the additional information provided by the general transition matrix estimated from the sample. If the sample gets worse (which is quite possible with smaller sample sizes), the ME estimates will also get worse.

Table 9 Voter transition estimates provided by ME
First-round vote:         SARKOZY   ROYAL    BAYROU   LE PEN
Second-round vote
SARKOZY                   97.66%    0.76%    37.20%   53.94%
ROYAL                     0.48%     98.87%   45.03%   16.04%
Non Valid + Abstention    1.86%     0.37%    17.77%   30.01%

Conclusions

In this paper we have compared methods for the estimation of voter transitions in multiparty systems. In the aftermath of an election, political parties need to analyse their performance at the most detailed level possible. If they have gained strength, they need to know the previous electoral behaviour of their new voters. If they have suffered from defections, they need to know which political parties have gained from these defections. These methods can serve as useful tools in such cases because they provide a general table of voter transitions for the entire country and a detailed array of voter transitions for each of the departments of the country.

The simulation findings presented in this paper indicate that the VTR algorithm proposed by Andreadis and Chadjipadelis gives estimates which are considerably closer to reality than the estimates given by the Markov Chain Monte Carlo Hierarchical Method proposed by Rosen et al. The maximum entropy approach, which requires knowledge of the general transition matrix (e.g. from an exit poll), gives estimates which are strongly influenced by this matrix.
If the general transition matrix deviates from reality, the ME estimates deviate from reality in a similar way.

This paper includes preliminary results on the comparison of methods for the estimation of voter transition rates. Future research should include more extensive simulations, e.g. with more than 100 replicates/cycles and with tables larger than 4x3, and should add the logit method for voter transition estimation proposed by Thomsen (1987), which is available in a different environment (Stata) and whose assessment is beyond the scope of this paper.

References

Andreadis, I., and Th. Chadjipadelis. 2009. A method for the estimation of voter transition rates. Journal of Elections, Public Opinion and Parties (forthcoming).
Benewick, R. J., A. H. Birch, J. G. Blumler, and A. Ewbank. 1969. The floating voter and the liberal view of representation. Political Studies 17 (2): 177-95.
Himmelweit, H. T., M. J. Biberian, and J. Stockdale. 1978. Memory for past vote: Implications of a study of bias in recall. British Journal of Political Science 8 (3): 365-75.
Johnston, R., and C. Pattie. 2000. Ecological inference and entropy-maximizing: An alternative estimation procedure for split-ticket voting. Political Analysis 8 (4): 333.
Johnston, R. J., and A. M. Hay. 1983. Voter transition probability estimates: An entropy-maximizing approach. European Journal of Political Research 11 (1): 93-8.
Rosen, Ori, W. Jiang, Gary King, and M. A. Tanner. 2001. Bayesian and frequentist inference for ecological inference: The R × C case. Statistica Neerlandica 55: 134-56.
Shachar, R., and Z. Eckstein. 2007. Correcting for bias in retrospective data. Journal of Applied Econometrics 22 (3): 657.
Thomsen, S. R. 1987. Danish elections 1920-1979: A logit approach to ecological analysis and inference. Politica.