Computational Complexity of Social Choice Procedures
DIMACS Tutorial on Social Choice and Computer Science, May 2004
Craig A. Tovey, Georgia Tech

Part I: Who wins the election?
Introduction
Notation
Rationality Axioms

Social Choice
How should (normative) and how does (descriptive) a group of individuals make a collective decision?
Typical voting problem: select a decision from a finite set, given conflicting ordinal preferences of a set of agents.
No transferable utility (T.U.), no transferable good.

Case of 2 Alternatives: Majority Rule
n voters, 2 alternatives
Theorem (Condorcet): If each voter’s judgment is independent and equally good (and not worse than random), then majority rule maximizes the probability of the better alternative being chosen.

Notation
[m]: 1..m
P([m]): set of all permutations of [m]
||x||: norm of x, default Euclidean
A_1 >_i A_2: voter i prefers A_1 to A_2
Social Choice Function (SCF): chooses a winner
Social Welfare Ordering (SWO): chooses an ordering

Social Choice
What if there are ≥ 3 alternatives?
Plurality can elect one that would lose to every other (Borda).
Alternatives A_1, …, A_m

Condorcet Principle (Condorcet Winner)
IF an alternative is pairwise preferred to each other alternative by a majority,
∃ t ∈ [m] s.t. ∀ j ∈ [m], j ≠ t: |{i ∈ [n] : A_t >_i A_j}| > n/2,
THEN the group should select A_t.

Condorcet’s Voting Paradox
A Condorcet winner may fail to exist.
Example: choosing a restaurant
Craig prefers Indian to Japanese to Korean
John prefers Korean to Indian to Japanese
Mike prefers Japanese to Korean to Indian
Each alternative loses to another by a 2/3 vote.

Pairwise Relationships
∀ directed graphs G = (V,E), ∃ a population of O(|V|) voters with preferences on |V| alternatives whose pairwise majority preferences are represented by G.
Proof:
Cover the edges of K_|V| with O(|V|) Hamiltonian paths.
Create 2 voters for each path, one for each direction.
Now the tournament graph has no edges.
Assign to each ordered pair (i,j) a voter with preference ordering {…j,i,…}. Don’t re-use!
Flip i and j to create any desired edge.
Example with 5 alternatives: three Hamiltonian paths and their reverses give 6 voters whose pairwise contests are all tied:
1 2 3 4 5
5 4 3 2 1
1 3 5 2 4
4 2 5 3 1
4 1 5 3 2
2 3 5 1 4
Changing the second voter to 5 3 4 2 1 creates the edge 3 > 4; changing that voter to 5 4 2 3 1 instead creates the edge 2 > 3.

Formulation of Social Choice Problem
Alternatives A_j, j ∈ [m]
Voters i ∈ [n]
For each i, preferences P_i ∈ P([m])
Voting rule f: P([m])^n → [m]
Social Welfare Ordering (SWO): P([m])^n → P([m])
SWP: permit ties in SWO
Sometimes we permit ties in P_i

Axiomatic Viewpoint: Rationality Criteria
Properties
Anonymous: symmetric on [n]
Neutral: symmetric on A_j, j ∈ [m]
Monotone: if A_j is selected, and voter i elevates A_j in P_i (no other change), then A_j will still be selected.
Strictly monotone: ties permitted, but an elevation changes a tie to a unique selection.

Axiomatic justification of Majority Rule
Theorem (May, 1952): Let m = 2. Majority rule is the unique method that is anonymous, neutral, and strictly monotone.
(Note: for m = 2, monotonicity ⇒ strategyproof.)

So, what if there are ≥ 3 alternatives and there is no Condorcet winner?
Some (Condorcet consistent) SCFs:
Copeland: outdegree minus indegree in tournament graph
Simpson: min # votes mustered against any opponent
Dodgson: minimize the # of pairwise adjacent swaps in voter preferences needed to make an alternative a Condorcet winner
Multistage elimination tree (Shepsle & Weingast)

So, what if there are ≥ 3 alternatives and there is no Condorcet winner?
Some (Condorcet consistent) SWOs:
Copeland, Simpson, Dodgson
No scoring method (Fishburn 73)
MLE: Kemeny (1959), Young (1985), Condorcet?!
Let d(P,P’) = # pairwise disagreements between P and P’. Choose P to minimize Σ_{i ∈ [n]} d(P, P_i).

Arrow’s (im)possibility theorem
Arrow (1951, 1963): Let m ≥ 3. No SWP simultaneously satisfies:
1. Unanimity (Pareto)
2. IIA: independence of irrelevant alternatives
3. No dictator: no i ∈ [n] s.t. f(P[n]) = P_i
The original proof uses sets of voters similar to what we’ve seen.
Many combinations of properties are inconsistent.
Main point: No fully satisfactory aggregation of social preferences exists.

Maximum Likelihood Voting
Theorem (Young & Levenglick 1978): Kemeny is the only SWP that simultaneously satisfies:
1. Neutral
2. Condorcet
3. Consistent over disjoint voter set union
“The only drawback … is the difficulty in computing it ….” [Moulin 1988]

Part II: Who won the election?
Procedures that are hard to execute

Maximum Likelihood Voting
Theorem [Bartholdi Tovey Trick 89a]: Kemeny score (or winner) is NP-hard.
Proof: Use the tournament construction and reduce from feedback arc set.
Note: 1st archival result of this type (together w/ Dodgson score thm). Found earlier in Orlin letter 81; Wakabayashi thesis 86.
Corollary: If P ≠ NP, no SWP simultaneously satisfies:
1. Neutral
2. Condorcet
3. Consistent over disjoint voter set union
4. Polynomial-time computable

Maximum Likelihood Voting
Theorem [Ravi Kumar 2001]: Kemeny optimum is NP-hard for 4 voters.
Theorem [Hemaspaandra-Spakowski-Vogel ~2001]: Kemeny Winner is complete for P_||^NP (parallel access to NP).
Theorem [Kumar 2004]: “Median rank aggregation” is an O(1)-factor approximation to the Kemeny optimum.
Note: approximation may lose all rationality properties, an example of differing tastes in social choice and computer science.
Additional note: there is some work on “approximate” adherence to axioms, e.g. Nisan & Segal 2002 for almost Pareto.

Dodgson Score
Theorem [Bartholdi Tovey Trick 89a]: Dodgson score is NP-hard.
Proof: reduction from X3C.
Remark: polynomial for fixed m or fixed n.
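To make the Dodgson score concrete, here is a minimal sketch that computes it by exhaustive breadth-first search over adjacent swaps. It is not the polynomial algorithm for fixed m or fixed n mentioned in the remark, and it is only practical for tiny profiles. The function names and the profile encoding (one list per voter, most preferred first) are assumptions made for illustration.

```python
from collections import deque


def is_condorcet_winner(profile, c):
    """True if c beats every other alternative by a strict majority."""
    n = len(profile)
    for a in profile[0]:
        if a == c:
            continue
        wins = sum(1 for pref in profile if pref.index(c) < pref.index(a))
        if 2 * wins <= n:
            return False
    return True


def dodgson_score(profile, c):
    """Fewest adjacent swaps (over all voters) that make c a Condorcet winner.

    Exhaustive BFS over profiles: exponential in general, illustration only.
    """
    start = tuple(tuple(pref) for pref in profile)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, swaps = queue.popleft()
        if is_condorcet_winner(state, c):
            return swaps
        for v, pref in enumerate(state):
            for k in range(len(pref) - 1):
                swapped = list(pref)
                swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
                nxt = state[:v] + (tuple(swapped),) + state[v + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, swaps + 1))
    return None  # not reached: c can always be moved to the top of every ballot


# The restaurant example: Craig, John, Mike.
restaurants = [("I", "J", "K"), ("K", "I", "J"), ("J", "K", "I")]
scores = {c: dodgson_score(restaurants, c) for c in "IJK"}
```

In the restaurant profile every alternative is one swap away from being a Condorcet winner, so the Dodgson scores are all equal and the cyclic tie remains.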
A sharper result than NP-hardness of the Dodgson score is due to Hemaspaandra, Hemaspaandra, and Rothe [JACM 97]:
Theorem: Dodgson Winner is complete for P_||^NP.

Significance
Computational complexity should be one of the criteria by which voting procedures are evaluated.
In different recent work, Segal [2004] finds the minimally informative messages verifying that an alternative is in the Pareto choice set: communication complexity [e.g. Kushilevitz & Nisan 97].

Part III: Strategic Voting
Manipulation by Individual Voters

Strategic voting
As early as Borda, theorists noted the “nuisance of dishonest voting”.
Very common in plurality voting.
Majority voting is strategyproof when m = 2.
How about m ≥ 3? The answer is closely related to Arrow’s Theorem [see also Blair and Muller 1983].

Strategyproof
A voting rule is strategyproof if ∀ u ∈ P([m])^n, ∀ i ∈ [n], ∀ P ∈ P([m]): f(u) ≥_i f(P, u_-i), where ≥_i is voter i’s preference in u.
Equivalently, for all possible profiles of preferences, “everyone votes sincerely” is a Nash equilibrium: if everyone else is sincere, no voter benefits by being insincere.

Gibbard-Satterthwaite Theorem (1973, 1975)
Let m ≥ 3. No voting rule simultaneously satisfies:
1. Single-valued
2. No dictator
3. Strategyproof
4. ∀ j ∈ [m] ∃ a voter population profile that elects j
Proof: similar to proof of, or uses, Arrow’s theorem.

Gardenfors’s Theorem
Let m ≥ 3. No SWP simultaneously satisfies:
1. Anonymous
2. Neutral
3. Condorcet winner consistent
4. Strategyproof

Greedy Manipulation Algorithm [BTT89b]
1st inquiry into computational difficulty of manipulation
Works for voting procedures represented as polynomial-time computable candidate scoring functions s.t.
1. responsive (high score wins)
2. “monotone-iia”
i. Plurality
ii. Borda count
iii. Maximin (Simpson)
iv. Copeland (outdegree in graph of pairwise contests)
v. Monotone increasing functions of the above

Definition
Second order Copeland: sum of Copeland scores of alternatives you defeat.
Once used by NFL as tie-breaker.
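The second order Copeland definition can be illustrated with a short sketch. It assumes the Copeland score is the outdegree in the graph of pairwise contests and that there are no pairwise ties; the `beats` encoding and the function names are hypothetical, chosen for this example only.

```python
def copeland_scores(beats):
    """Copeland score as outdegree: number of alternatives each one defeats.

    beats maps each alternative to the set of alternatives it defeats
    in pairwise majority contests (assumed encoding, no pairwise ties).
    """
    return {a: len(defeated) for a, defeated in beats.items()}


def second_order_copeland(beats):
    """Sum of the Copeland scores of the alternatives each one defeats."""
    first = copeland_scores(beats)
    return {a: sum(first[b] for b in defeated) for a, defeated in beats.items()}


# A 4-alternative tournament in which A and B tie on first order Copeland.
beats = {
    "A": {"B", "C"},
    "B": {"C", "D"},
    "C": {"D"},
    "D": {"A"},
}
first_order = copeland_scores(beats)
second_order = second_order_copeland(beats)
```

Here A and B both defeat two opponents, but A's victims are stronger (one of them is B), so the second order score separates them in A's favor.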
Second order Copeland is also used by FIDE and USCF in round-robin chess tournaments (the graph is the set of results).

A New “Good” Use of Complexity: resisting manipulation
Theorem [BTT89b]: Both second order Copeland, and Copeland with second order tiebreak, satisfy:
1. Neutral
2. No dictator
3. Condorcet winner
4. Anonymous
5. Unanimity (Pareto)
6. Polynomial-time computable
7. NP-complete to manipulate (by 1 voter)
Note: 1st result of this type

Single-Valued Version
Break ties by lexicographic order.
Theorem [BTT89b]: Both second order Copeland, and Copeland with second order tiebreak, satisfy:
1. Single-valued
2. No dictator
3. Condorcet winner
4. Anonymous
5. Unanimity (Pareto)
6. Polynomial-time computable
7. NP-complete to manipulate (by 1 voter)
Note: 1st result of this type

Proof Ideas
Last-round-tournament-manipulation is NP-complete w.r.t. 2nd order Copeland.
3,4-SAT (To84)
Special candidate C_0, clause candidates C_j
Literal candidates X_i, Y_i
[Figure: clause candidate C_2 with literal candidates X_5, X_6, X_7 and Y_5, Y_6, Y_7]

Proof Ideas
All arcs in the graph are fixed except those between each literal and its complement.
Each clause candidate loses to all literals except the three it contains.
To stop each clause from gaining 3 more 2nd order Copeland points, must pick one losing (= True) literal for each clause.

Proof Ideas
Pad so each clause candidate is
1. tied with C_0 in 1st order Copeland
2. 3 behind C_0 in 2nd order Copeland
This proves last-round tournament manipulation is hard.
Then use the arbitrary graph construction to make all other contests decided by 2 votes, so one voter can’t affect other edges.

Another resistant procedure
Theorem (BO: SCW 91): Single Transferable Vote is NP-hard to manipulate (by a single voter) for a single seat.
Corollary: Non-monotonicity is NP-hard to detect in STV.
Used in elections for Parliament in Ireland, Tasmania; Senate in Australia, South Africa, N. Ireland; local authorities in Ireland, Canada, Australia; school board in NYC.

Proof ideas
Candidates with fewest votes are h_1, h_2, …, h_n and ~1, ~2, …, ~n (fewest, next fewest, …).
Most supporters of h_1 vote h_1, ~1, …
A few supporters of h_1 vote h_1, s_4, …; h_1, s_7, …; h_1, s_9, …, where (s_4, s_7, s_9) is from an X3C instance.

Proof ideas
Placing ~1 first forces h_1 to be eliminated first (and vice-versa).
Choose ~i or h_i for each i ∈ [n].
Must distribute new votes for s candidates evenly so no s_j beats your favored candidate.
Simplified, but has the main ideas.

Conitzer and Sandholm’s Universal Preround Complexifier
Give up neutrality.
Add a pre-round of ⌊m/2⌋ pairwise contests. If m is odd, one candidate gets a “bye”.
The SCF is performed on the ⌈m/2⌉ survivors.
The modified procedure is NP-hard, #P-hard, and PSPACE-hard respectively to manipulate by 1 voter, depending on whether pairing is ex ante, ex post, or interleaved with the voting.
Works for Plurality, Borda, Simpson, STV.
Tweak or Tstrong?

Implications
Gibbard-Satterthwaite, Gardenfors, and other such theorems open the door to strategic voting. This makes voting a richer phenomenon.
Both practically and theoretically, complexity can partly close the door.
Plurality voting is still widely used. Voting theory penetrates slowly into politics.
One might consider using a hard-to-compute procedure.

Part IV: Complexity of Other Kinds of Manipulation
Agenda Manipulation
Manipulating Voters
Coalitions

Agenda Control
Add small # of “spoiler” candidates (alternatives)
Disqualify small # of candidates
Partition candidates and use 2-stage sequential election
Partition candidates and use run-off election
Dates back to Roman times, at least!

Complexity of Agenda Control
Theorem [BTT 92]: Preceding types of agenda control are NP-hard for plurality voting.
Theorem [IBID]: Preceding types of agenda control are polynomially solvable for Condorcet voting (note: impossible for adding candidates).
1st inquiry into computational difficulty of election manipulation

Election Control: Manipulating Voters
Add small # of voters (Chicago voting*)
Remove small # of voters (Detroit voting**)
Partition voters into two groups.
Each group votes to nominate a candidate; then the voters as a whole decide between the candidates (if different).

Complexity of Election Control by Manipulating Voters
Theorem [BTT 92]: Preceding types of election control are NP-hard for Condorcet voting.
Theorem [IBID]: Preceding types of election control are polynomially solvable for plurality voting.
Main point: different voting procedures have different levels of computational resistance or vulnerability to various types of manipulation.
Note: agenda manipulation by adding/deleting candidates relates to IIA in Arrow’s theorem, but I think that computational complexity is not a circumvention because that rationality criterion is not principally about agenda manipulation.

Coalitions
Coalition members may coordinate their votes.
A winning coalition can force the outcome of the SCF.
Core: no coalition of voters has a safe and profitable deviation.
Core is the set of undominated candidates (undominated: no winning coalition unanimously prefers another candidate).
Example: if SCF is Condorcet, core is the Condorcet winner (if it exists) or empty.
Thm [BNT 91]: “Is an alternative dominated?” is NP-complete in the Euclidean model.

Coalitions
Core Stable: SCF has nonempty core for all preference profiles.
Theorem [Nakamura 1979]: SCF is core stable iff Nakamura number > m (minimal # winning coalitions with empty intersection).
Theorem [BNT 91]: Nakamura number ≤ m is strongly NP-complete in weighted voting games.
Theorem [Conitzer & Sandholm 2003]: Core non-empty is NP-complete for non-TU and TU cooperative games.

Coalitions
Setup: Borda voting, but voter i has weight w_i on her vote.
Question: Can a given coalition C strategically coordinate its votes to get a given candidate j to win, if all other voters are sincere? (An atypical question from voting or game theory viewpoints.)
Theorem [CS 2002]: NP-complete for 3 candidates.
Proof: put j first, then partition {w_i : i ∈ C} between the other 2 for 2nd place.
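The proof idea can be sketched as a brute-force search: every coalition member ranks j first, and the only remaining choice is which of the other two candidates gets second place, which is a partition of the coalition weights. The sketch below assumes Borda points 2/1/0 scaled by voter weight and counts a tie for first place as success (a co-winner convention, which is an assumption here). The names `borda_totals` and `coalition_can_elect` are hypothetical.

```python
from itertools import product


def borda_totals(ballots):
    """Weighted Borda totals; ballots is a list of (weight, ranking) pairs,
    each ranking listing candidates from most to least preferred."""
    totals = {}
    for weight, ranking in ballots:
        m = len(ranking)
        for pos, cand in enumerate(ranking):
            totals[cand] = totals.get(cand, 0) + weight * (m - 1 - pos)
    return totals


def coalition_can_elect(sincere, coalition_weights, target, others):
    """Try every split of the coalition between the two other candidates.

    Returns a tuple of 0/1 picks (0: others[0] is ranked second, 1: others[1])
    that makes target a (possibly tied) Borda winner, or None if no split works.
    Exponential in the coalition size: illustration only.
    """
    a, b = others
    for picks in product((0, 1), repeat=len(coalition_weights)):
        ballots = list(sincere)
        for weight, pick in zip(coalition_weights, picks):
            second, third = (a, b) if pick == 0 else (b, a)
            ballots.append((weight, (target, second, third)))
        totals = borda_totals(ballots)
        if all(totals[target] >= totals[x] for x in others):
            return picks
    return None


# Sincere voters leave a and b tied at 9 points each and j at 0.
sincere = [(3, ("a", "b", "j")), (3, ("b", "a", "j"))]
balanced = coalition_can_elect(sincere, [3, 2, 1], "j", ("a", "b"))
unbalanced = coalition_can_elect(sincere, [4, 1, 1], "j", ("a", "b"))
```

With weights 3, 2, 1 the coalition succeeds only by splitting its second-place points evenly ({3} versus {2, 1}); with weights 4, 1, 1 no even split exists and the search returns None, which is the link to the partition problem.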
Similar results hold for STV, Copeland, Simpson [IBID].

Modern Manipulation
The Ethicist (NY Times 2004): Bush supporter donates money to Nader campaign.

Related Work
Voting Schemes for which It Can Be Difficult to Tell Who Won the Election, Social Choice and Welfare 1989. Bartholdi, Tovey, Trick [BTT89a]
Aggregation of binary relations: algorithmic and polyhedral investigations, 1986, University of Augsburg Ph.D. dissertation. Y. Wakabayashi
The Computational Difficulty of Manipulating an Election, SCW 1989. Bartholdi, Tovey, Trick [BTT89b]

Related Work
Single Transferable Vote Resists Strategic Voting, SCW 1991. Bartholdi, Orlin
Universal Voting Protocol Tweaks to Make Manipulation Hard. Conitzer, Sandholm.

Part V: Spatial (Euclidean) Model

Definition of Spatial Model
Voter i has ideal (bliss) point x_i ∈ ℝ^k
Each alternative is represented by a point in ℝ^k
A_1 ≥_i A_2 iff ||x_i - A_1|| ≤ ||x_i - A_2||
Can use norms other than Euclidean, e.g. ellipsoidal indifference curves

1D spatial model
Informally used by U.S. press and many others.
Shockingly effective predictively in current U.S. politics. See Keith Poole’s website, e.g. Supreme Court.
Similar to single-peaked preferences (a little more restrictive).
For a polyhedral explanation of the “nice” behavior of single-peaked prefs, see MOR 2003.

Spatial Model
Largely descriptive role rather than normative
The workhorse of empirical studies in political science
k = 1, 2 are the most popular # of dimensions
In U.S., k = 2 gives high accuracy (~90%); k = 1 is also very accurate since the 1980s, and from the 1850s to the early 20th century.

What do the dimensions mean?
Different schools of thought:
Use expert domain knowledge or contextual information to define dimensions and/or place alternatives
Fit data (e.g. roll call) to achieve best fit
Maximize data fit in 1st dimension, then 2nd
Impute meaning to fitted model

2D is qualitatively richer than 1D
[Figure: voter ideal points x_1, x_2, x_3 and alternatives A_1, A_2, A_3 in the plane, with majority cycle A_1 > A_2 > A_3 > A_1]

Condorcet’s voting paradox in Euclidean model
[Figure: the points x_1, x_2, x_3 and A_1, A_2, A_3, with the hyperplane normal to and bisecting line segment A_1A_2]

… permitted alternatives, no Condorcet winner exists
[Figure: alternatives A_1, A_2 and voter ideal points x_1, x_2, x_3]

Chaos theorems
McKelvey [1979], Schofield [83].
Majority vote can take the agenda anywhere.
(Not precisely the meaning of chaos in system dynamics.)

Major Question: Conditions for Existence of Stable Point (Undominated, Condorcet Winner)
Plott (67): For the case of all x_i distinct
Slutsky (79): General case, not finite
Davis, DeGroot, Hinich (72): Every hyperplane through x is median, i.e. each closed halfspace contains at least half the voter ideal points.
McKelvey, Schofield (87): More general, finite, but exponential.
Are there better conditions?

Recognizing a Stable (Undominated) Point is co-NP-complete
Theorem [BNT 91]: Given x_1…x_n and x_0 in ℝ^k, determining whether x_0 is dominated is NP-complete.
Proof: use Johnson & Preparata 1978.
Algorithm [BNT 91]: In O(kn), given x_1…x_n, can find x_0 which is undominated if any point is.
Corollary: Majority-rule stability is co-NP-complete.

Implications
Puts to rest efforts to find simpler necessary and sufficient conditions. In this case complexity theory provides insight.
Computing the radius of the yolk is NP-hard.
Computing any other solution concept that coincides with the Condorcet winner when it exists is NP-hard.

Related Work
The densest hemisphere problem, Theor. Comp. Sci, 1978. Johnson, Preparata
Limiting median lines do not suffice to determine the yolk, SCW 1992. Stone, Tovey
A polynomial time algorithm for computing the yolk in fixed dimension, Math Prog 1992. Tovey
Dynamical Convergence in the Spatial Model, in Social Choice, Welfare and Ethics, eds. Barnett, Moulin, Salles, Schofield, Cambridge 1995. Tovey
Some foundations for empirical study in

Part VI: Discussion
What can we learn from each other?
Benefits of multidisciplinary meetings.

Possible Benefits
Idea to use for a real problem faced in your field.
New area to generate papers in your field. (Let’s be honest.)
Opportunity to help solve a problem in another field.
Acquire an idea or info from another field which alters a basic question in your field.