Common Voting Rules as Maximum Likelihood Estimators Vincent Conitzer

(Joint work with Tuomas Sandholm)
Early version of this work appeared in UAI-05
Voting (rank aggregation) rules
• Set of m candidates (alternatives) C
• n voters; each voter ranks the candidates (the voter’s vote)
– E.g. b > a > c > d
• Voting rule f maps every (multi-)set of votes to either:
– a winner in C, or
– a complete ranking of C
• E.g. plurality:
– every voter votes for a single candidate (equivalently, we consider only each
voter’s top-ranked candidate)
– candidate with most votes wins
• E.g. single transferable vote (STV):
– candidate ranked first by fewest voters drops out and is removed from rankings
– repeat
– final ranking is inverse of order in which they dropped out
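The two example rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the alphabetical tie-breaking, and the sample votes are all assumptions made for the example.

```python
from collections import Counter

def plurality_winner(votes):
    """Return the candidate ranked first by the most voters."""
    tally = Counter(vote[0] for vote in votes)
    # Assumption: ties are broken alphabetically.
    return max(sorted(tally), key=tally.get)

def stv_ranking(votes):
    """Return the STV ranking: the reverse of the elimination order."""
    remaining = set(votes[0])
    eliminated = []
    while remaining:
        # Each vote counts for its highest-ranked remaining candidate.
        tally = {c: 0 for c in remaining}
        for vote in votes:
            top = next(c for c in vote if c in remaining)
            tally[top] += 1
        # Assumption: ties for fewest votes are broken alphabetically.
        loser = min(sorted(remaining), key=tally.get)
        eliminated.append(loser)
        remaining.remove(loser)
    return eliminated[::-1]

# Hypothetical votes: 4 x (b > a > c > d), 3 x (a > c > b > d), 2 x (c > a > b > d)
votes = ([("b", "a", "c", "d")] * 4
         + [("a", "c", "b", "d")] * 3
         + [("c", "a", "b", "d")] * 2)

print(plurality_winner(votes))  # b has the most first-place votes
print(stv_ranking(votes))       # d, then c, then b drop out, so a ends up first
```

On these votes the two rules disagree: plurality picks b, while STV eliminates d, c, and b in turn and ranks a first.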
Two views of voting
1. Voters’ preferences are idiosyncratic; only
purpose is to find a compromise winner/ranking
2. There is some absolute sense in which some
candidates are better than others, independent of
voters’ preferences; votes are merely noisy
perceptions of candidates’ true quality
[Figure: the “correct” outcome (a winner or a ranking) generates the agents’ votes with probability P(all votes | outcome). Under the conditional independence assumption, each of vote 1, vote 2, …, vote n is generated separately with probability P(vote | outcome).]
Goal: given votes, find maximum likelihood estimate of correct outcome
Different noise model → different maximum likelihood estimator/voting rule
Marquis de Condorcet [1785]
• Condorcet was interested in the “correct ranking” model
• He assumed noise model where voter ranks any two
candidates correctly with fixed probability p > 1/2,
independently
• With some probability this gives a cycle…
– E.g. if the correct ranking is a > b > c, then with probability $p^2(1-p)$ a voter will prefer a > b, b > c, c > a
• But, it does not matter for the MLE approach as long as we
get a probability for each (acyclic) vote
– Equivalently, we can renormalize the probabilities over the
acyclic votes
– Equivalently, we can say that if a cyclic vote is drawn, it must be
redrawn
• Condorcet solved for the MLE rule for the cases of 2 and 3
candidates
The Kemeny rule [1959]
• Given a ranking r, a vote v, and two candidates a, b, let
$\delta_{ab}(r, v) = 1$ if r and v disagree on the relative ranking of
a and b, and 0 otherwise
• A Kemeny ranking r minimizes $\sum_{a,b} \sum_{v} \delta_{ab}(r, v)$
• Young [1986]’s observation: the Kemeny rule is the
solution to Condorcet’s problem!
• Drissi & Truchon [2002] extend to the case where p is
allowed to vary with the distance between two candidates
in correct ranking
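Young's observation can be checked by brute force on a small example. The sketch below is an illustration under stated assumptions: it enumerates all m! rankings (feasible only for tiny m), uses a hypothetical p = 0.7, and scores each ranking by the unnormalized Condorcet likelihood $p^{\text{agreements}}(1-p)^{\text{disagreements}}$ per vote. The function names and sample votes are made up for the example.

```python
from itertools import combinations, permutations
from math import log

def disagreements(r, v):
    """Number of candidate pairs that rankings r and v order differently."""
    pos_r = {c: i for i, c in enumerate(r)}
    pos_v = {c: i for i, c in enumerate(v)}
    return sum(
        (pos_r[a] < pos_r[b]) != (pos_v[a] < pos_v[b])
        for a, b in combinations(r, 2)
    )

def kemeny_ranking(votes):
    """Ranking minimizing the total number of pairwise disagreements."""
    return min(permutations(sorted(votes[0])),
               key=lambda r: sum(disagreements(r, v) for v in votes))

def condorcet_mle(votes, p=0.7):
    """Ranking maximizing the unnormalized log-likelihood in Condorcet's model."""
    m = len(votes[0])
    pairs = m * (m - 1) // 2

    def log_likelihood(r):
        total = 0.0
        for v in votes:
            d = disagreements(r, v)
            total += (pairs - d) * log(p) + d * log(1 - p)
        return total

    return max(permutations(sorted(votes[0])), key=log_likelihood)

# Hypothetical votes: 3 x (a > b > c), 2 x (b > c > a), 1 x (c > a > b)
votes = ([("a", "b", "c")] * 3
         + [("b", "c", "a")] * 2
         + [("c", "a", "b")])

print(kemeny_ranking(votes))
print(condorcet_mle(votes))
```

Because p > 1/2, each extra disagreement multiplies the likelihood by (1-p)/p < 1, so maximizing the likelihood and minimizing total disagreements select the same ranking.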
What is next?
• Does this suggest using Kemeny rule?
– Many other noise models possible
– Some of these may correspond to other, better-known rules
• Goal of this work: Classify which common rules are a
maximum likelihood estimator for some noise model
– Positive and negative results
– Positive results are constructive
• Motivation:
– Rules corresponding to a noise model are more natural
– Knowing a noise model can give us insight into the rule and its
underlying assumptions
– If we disagree with the noise model, we can modify it and
obtain new version of the rule
Conditional independence restriction
• Without any independence restriction, it turns out that any rule has a noise model:
– P(vote set | outcome) > 0 if and only if f(vote set) = outcome
[Figure: the “correct” outcome generates the agents’ votes jointly.]
• So, we will focus on conditionally independent votes
[Figure: the “correct” outcome generates vote 1, vote 2, …, vote n separately (conditional independence assumption).]
• If a rule has a noise model in this setup, we call it an
– MLEWIV rule if producing winner
– MLERIV rule if producing ranking
– (IV = Independent Votes)
Any scoring rule is MLEWIV and MLERIV
• Scoring rule gives a candidate $a_1$ points if it is ranked first, $a_2$ points if it is ranked second, etc.
– plurality rule: $a_1 = 1$, $a_i = 0$ otherwise
– Borda rule: $a_i = m - i$
– veto rule: $a_m = 0$, $a_i = 1$ otherwise
• MLEWIV noise model: $P(v|w) = 2^{a_{l(v,w)}}$ where $l(v,w)$ is the rank of w in v
– want to choose w to maximize $\prod_v 2^{a_{l(v,w)}} = 2^{\sum_v a_{l(v,w)}}$
• MLERIV noise model: $P(v|r) = \prod_{1 \le i \le m} (m+1-i)^{a_{l(v,r_i)}}$
where $r_i$ is the candidate ranked ith in r
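The MLEWIV construction can be checked numerically. The sketch below, with hypothetical function names and votes, computes the scoring-rule winner directly and then the winner that maximizes $\prod_v 2^{a_{l(v,w)}}$. The weights $2^{a}$ are treated as unnormalized likelihoods. This assumes that dividing by the normalizing constant would not change the maximizer, which holds because that constant is the same for every w.

```python
from math import prod

def score_winner(votes, a):
    """Winner under the scoring rule with point vector a (a[0] = first place)."""
    candidates = sorted(votes[0])
    return max(candidates, key=lambda w: sum(a[v.index(w)] for v in votes))

def mlewiv_winner(votes, a):
    """Winner maximizing the unnormalized likelihood prod_v 2**a[l(v, w)]."""
    candidates = sorted(votes[0])
    return max(candidates, key=lambda w: prod(2 ** a[v.index(w)] for v in votes))

# Hypothetical votes: 3 x (a > b > c), 2 x (b > c > a)
votes = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2

plurality = [1, 0, 0]  # a_1 = 1, a_i = 0 otherwise
borda = [2, 1, 0]      # a_i = m - i with m = 3

for name, a in [("plurality", plurality), ("Borda", borda)]:
    print(name, score_winner(votes, a), mlewiv_winner(votes, a))
```

Since $2^x$ is increasing in x, the candidate with the largest total score is also the one with the largest product, which is all the construction needs.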
Single Transferable Vote (STV) is MLERIV
• STV rule: candidate ranked first by fewest voters
drops out and is removed from rankings; repeat;
final ranking is inverse of order in which they
dropped out
• MLERIV noise model:
– Let ri be the candidate ranked ith in r
– Let $\delta_v(r_i) = 1$ if all the candidates ranked higher than $r_i$
in v are ranked lower in r (i.e. they are all contained in
$\{r_{i+1}, r_{i+2}, \ldots, r_m\}$), otherwise 0
– $P(v|r) = \prod_{1 \le i \le m} k_i^{\delta_v(r_i)}$ where $k_{i+1} \ll k_i < 1$
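A brute-force sketch of this noise model is given below. The slides only require $k_{i+1} \ll k_i < 1$; the concrete choice $\log k_i = -(n+1)^i$ (with n the number of votes) is an assumption made here so that the comparison is exact in integer arithmetic and later positions dominate earlier ones. Function names and the vote set are illustrative.

```python
from itertools import permutations

def delta(v, r, i):
    """1 if every candidate ranked above r[i] in v lies below r[i] in r, else 0."""
    above_in_v = v[: v.index(r[i])]
    return int(all(c in r[i + 1:] for c in above_in_v))

def log_likelihood(votes, r):
    """log of prod_v prod_i k_i ** delta_v(r_i), with log k_i = -(n+1)**i.

    Indices are 0-based in the code, so position i here is r_{i+1} on the slide.
    """
    n = len(votes)
    return -sum((n + 1) ** (i + 1) * delta(v, r, i)
                for v in votes for i in range(len(r)))

def mle_ranking(votes):
    """Ranking maximizing the likelihood, found by trying all m! rankings."""
    return max(permutations(sorted(votes[0])),
               key=lambda r: log_likelihood(votes, r))

# Hypothetical votes: 3 x (c > a > b), 4 x (a > b > c), 6 x (b > a > c)
votes = ([("c", "a", "b")] * 3
         + [("a", "b", "c")] * 4
         + [("b", "a", "c")] * 6)

print(mle_ranking(votes))
```

On this vote set STV eliminates c (3 first-place votes), then b (6 against 7 for a), giving a > b > c, and the maximum likelihood ranking under the model is the same.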
Lemma to prove negative results
[Figure: the correct outcome generates vote 1, …, vote n. Vote set 1 consists of vote 1, …, vote k; vote set 2 consists of vote k+1, …, vote n; vote set 3 consists of all n votes.]
• For any noise model, if there is a single outcome that
maximizes the likelihood of both vote set 1 and vote set 2,
then it must also maximize the likelihood of vote set 3
• Hence, a voting rule that produces the same outcome on
both set 1 and set 2 but a different one on set 3 cannot be
a maximum likelihood estimator
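The lemma follows in one line from conditional independence. A short derivation, writing $V_1$, $V_2$ for vote sets 1 and 2, $V_3$ for their union, and $o^*$ for the outcome that maximizes both likelihoods (this notation is introduced here for the sketch):

```latex
\begin{align*}
P(V_3 \mid o) &= \prod_{v \in V_1} P(v \mid o) \prod_{v \in V_2} P(v \mid o)
               = P(V_1 \mid o)\, P(V_2 \mid o) \\
              &\le P(V_1 \mid o^*)\, P(V_2 \mid o^*) = P(V_3 \mid o^*)
               \qquad \text{for every outcome } o .
\end{align*}
```

The inequality holds because both factors are nonnegative and each is maximized at $o^*$.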
STV rule is not MLEWIV
• STV rule: candidate ranked first by fewest voters drops out and is
removed from rankings; repeat. Final ranking is inverse of order in
which they dropped out
• First vote set:
– 3 times c > a > b
– 4 times a > b > c
– 6 times b > a > c
– c drops out first, then a wins
• Second vote set:
– 3 times b > a > c
– 4 times a > c > b
– 6 times c > a > b
– b drops out first, then a wins
• But: taking all votes together, a drops out first!
– (8 votes vs. 9 for the others)
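The counterexample can be replayed mechanically. The sketch below stores each vote set as a multiset of rankings and runs a small STV winner function on both sets and on their union. The function name and the alphabetical tie-breaking are assumptions; the tie-break matters only in the last round of the combined set and cannot make a the winner.

```python
from collections import Counter

def stv_winner(votes):
    """votes: Counter mapping ranking tuples to how many voters cast them."""
    remaining = set(next(iter(votes)))
    while len(remaining) > 1:
        tally = {c: 0 for c in remaining}
        for ranking, count in votes.items():
            top = next(c for c in ranking if c in remaining)
            tally[top] += count
        # Assumption: ties for fewest votes are broken alphabetically.
        remaining.remove(min(sorted(remaining), key=tally.get))
    return remaining.pop()

set1 = Counter({("c", "a", "b"): 3, ("a", "b", "c"): 4, ("b", "a", "c"): 6})
set2 = Counter({("b", "a", "c"): 3, ("a", "c", "b"): 4, ("c", "a", "b"): 6})
combined = set1 + set2  # Counter addition sums the multiplicities

print(stv_winner(set1), stv_winner(set2))  # a wins both vote sets
print(stv_winner(combined) == "a")         # False: a drops out first
```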
Bucklin rule is not MLEWIV/MLERIV
• Bucklin rule:
– For every candidate, consider the minimum k such that more than half of the
voters rank that candidate among the top k
– Candidates are ranked (inversely) by their minimum k
– Ties are broken by the number of voters by which the “half” mark is passed
• First vote set:
– 2 times a > b > c > d > e
– 1 time b > a > c > d > e
– gives final ranking a > b > c > d > e
• Second vote set:
– 2 times b > d > a > c > e
– 1 time c > e > a > b > d
– 1 time c > a > b > d > e
– gives final ranking a > b > c > d > e
• But: taking all votes together gives final ranking b > a > c > d > e
– (at k=2, b is ranked in the top 2 by 5 of the 7 voters and a by only 4; both pass the half mark, so the tie-break puts b first)
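The three Bucklin rankings can be recomputed with a short sketch. It follows the rule as stated on the slide: candidates are sorted by their minimum k, and ties are broken by how many voters rank the candidate in the top k. The function name is an assumption for the example.

```python
def bucklin_ranking(votes):
    """Rank candidates by minimum k, then by support at that k (larger first)."""
    n = len(votes)
    candidates = sorted(votes[0])

    def key(c):
        for k in range(1, len(candidates) + 1):
            support = sum(c in v[:k] for v in votes)
            if 2 * support > n:          # more than half of the voters
                return (k, -support)     # smaller k first, then larger support

    return sorted(candidates, key=key)

set1 = [("a", "b", "c", "d", "e")] * 2 + [("b", "a", "c", "d", "e")]
set2 = ([("b", "d", "a", "c", "e")] * 2
        + [("c", "e", "a", "b", "d")]
        + [("c", "a", "b", "d", "e")])

print(bucklin_ranking(set1))         # a first
print(bucklin_ranking(set2))         # a first
print(bucklin_ranking(set1 + set2))  # b first
```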
Pairwise election graphs
• Pairwise election: take two candidates and see which one
is ranked above the other in more votes
• Pairwise election graph has edge of weight k from a to b
if a defeats b by k votes in the pairwise election
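• E.g. votes a > b > c and b > a > c together produce
pairwise election graph:
[Figure: edges a → c and b → c, each of weight 2; a and b are tied, so there is no edge between them.]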
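A pairwise election graph can be computed directly from the votes. In this minimal sketch the graph is a dictionary from (winner, loser) pairs to the margin, and tied pairs are simply omitted; this representation and the function name are assumptions for the example.

```python
from itertools import combinations

def pairwise_graph(votes):
    """Return {(winner, loser): margin} for every pair with a nonzero margin."""
    graph = {}
    for a, b in combinations(sorted(votes[0]), 2):
        # +1 for each vote ranking a above b, -1 for each ranking b above a.
        margin = sum(1 if v.index(a) < v.index(b) else -1 for v in votes)
        if margin > 0:
            graph[(a, b)] = margin
        elif margin < 0:
            graph[(b, a)] = -margin
    return graph

votes = [("a", "b", "c"), ("b", "a", "c")]
print(pairwise_graph(votes))  # {('a', 'c'): 2, ('b', 'c'): 2}
```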
(Roughly) all pairwise election graphs can be realized
• Lemma: any graph with even weights is the pairwise
election graph for some votes
• Proof: can increase the weight of edge from a to b by
two by adding the following two votes:
– $a > b > c_1 > c_2 > \ldots > c_{m-2}$
– $c_{m-2} > c_{m-3} > \ldots > c_1 > a > b$
• Hence, from here on, we will simply show the pairwise
election graph rather than the votes that realize it
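The proof step can be verified on a small instance: the two added votes agree only on a > b, so every other pair cancels out. The helper names below are hypothetical.

```python
from itertools import combinations

def margin(votes, x, y):
    """Votes ranking x above y minus votes ranking y above x."""
    return sum(1 if v.index(x) < v.index(y) else -1 for v in votes)

def boost_votes(a, b, others):
    """The two votes from the proof: a > b > c1 > ... and c_{m-2} > ... > c1 > a > b."""
    return [(a, b, *others), (*reversed(others), a, b)]

votes = boost_votes("a", "b", ["c1", "c2", "c3"])
for x, y in combinations(votes[0], 2):
    print(x, y, margin(votes, x, y))  # only the pair (a, b) has a nonzero margin
```

Repeating this construction for each edge builds up any graph whose weights are all even.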
Copeland is not MLEWIV/MLERIV
• Copeland rule: candidate’s score = number of pairwise
victories – number of pairwise defeats
– i.e. outdegree – indegree of vertex in pairwise election graph
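[Figure: pairwise election graph 1 + pairwise election graph 2 = combined pairwise election graph. Copeland scores for each graph:]
– Graph 1: a: 3-1 = 2; b: 2-1 = 1; c: 2-2 = 0; d: 1-2 = -1; e: 1-3 = -2
– Graph 2: a: 3-1 = 2; b: 2-1 = 1; c: 2-2 = 0; d: 1-2 = -1; e: 1-3 = -2
– Combined graph: b: 2-0 = 2; a: 2-1 = 1; c: 2-2 = 0; d: 1-2 = -1; e: 0-2 = -2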
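Copeland scores depend only on which edges exist, not on their weights. A minimal sketch, assuming the graph is given as a dictionary from (winner, loser) to weight; the example graph is hypothetical and is not the one from the lost figure.

```python
def copeland_scores(graph, candidates):
    """Score = outdegree - indegree in the pairwise election graph."""
    scores = {c: 0 for c in candidates}
    for (winner, loser), weight in graph.items():
        if weight > 0:
            scores[winner] += 1
            scores[loser] -= 1
    return scores

# Hypothetical graph: a beats b by 2, a beats c by 4, b beats c by 2.
graph = {("a", "b"): 2, ("a", "c"): 4, ("b", "c"): 2}
print(copeland_scores(graph, ["a", "b", "c"]))  # {'a': 2, 'b': 0, 'c': -2}
```

Because weights are ignored, adding two graphs can cancel or reverse edges and thereby change the scores, which is what the counterexample exploits.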
Maximin is not MLEWIV/MLERIV
• maximin rule: candidate’s score = score in worst
pairwise election
– i.e. candidates are ordered inversely by weight of largest
incoming edge
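[Figure: pairwise election graph 1 + pairwise election graph 2 = combined pairwise election graph. Weight of the largest incoming edge for each candidate, in ranking order:]
– Graph 1: a: 6; b: 8; c: 10; d: 12
– Graph 2: a: 6; b: 8; c: 10; d: 12
– Combined graph: c: 2; a: 4; d: 6; b: 8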
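The maximin ordering can be read off the graph by looking at each candidate's largest incoming edge. In the sketch below, the graph is a hypothetical cycle chosen only so that its largest incoming weights are 6, 8, 10, and 12; it is not a reconstruction of the lost figure.

```python
def maximin_ranking(graph, candidates):
    """Order candidates by the weight of their largest incoming edge (smallest first)."""
    worst = {c: 0 for c in candidates}
    for (winner, loser), weight in graph.items():
        worst[loser] = max(worst[loser], weight)
    return sorted(candidates, key=lambda c: worst[c])

# Hypothetical graph: d beats a by 6, a beats b by 8, b beats c by 10, c beats d by 12.
graph = {("d", "a"): 6, ("a", "b"): 8, ("b", "c"): 10, ("c", "d"): 12}
print(maximin_ranking(graph, ["a", "b", "c", "d"]))  # ['a', 'b', 'c', 'd']
```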
Ranked pairs is not MLEWIV/MLERIV
• ranked pairs rule: pairwise election results are locked in
in decreasing order of margin of victory
– i.e. larger edges are “fixed” first, an edge is discarded if it
introduces a cycle
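[Figure: pairwise election graph 1 + pairwise election graph 2 = combined pairwise election graph. Order in which edges are processed in each graph:]
– Graph 1: b > d fixed; a > b fixed; d > a discarded; b > c fixed; c > d fixed; result: a > b > c > d
– Graph 2: a > c fixed; c > d fixed; d > a discarded; b > c fixed; a > b fixed; result: a > b > c > d
– Combined graph: d > a fixed; c > d fixed; a > c discarded; b > d fixed; a > b discarded; b > c fixed; result: b > c > d > a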
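The locking procedure can be sketched as follows. The edge weights in the example are hypothetical: they are chosen only so that the edges are processed in the order listed for the combined graph (d > a, c > d, a > c, b > d, a > b, b > c). The function name and graph representation are also assumptions.

```python
def ranked_pairs(graph, candidates):
    """graph: {(winner, loser): margin}. Returns the ranking after locking edges."""
    locked = {c: set() for c in candidates}  # locked[x] = candidates fixed below x

    def reaches(src, dst):
        """True if dst can be reached from src along locked edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(locked[node])
        return False

    # Process edges from largest to smallest margin.
    for (winner, loser), _ in sorted(graph.items(), key=lambda e: -e[1]):
        if not reaches(loser, winner):  # otherwise the edge would close a cycle
            locked[winner].add(loser)

    # Assumes the locked edges determine a total order, as in the example.
    return sorted(candidates,
                  key=lambda c: -sum(reaches(c, d) for d in candidates if d != c))

# Hypothetical weights reproducing the processing order of the combined graph.
graph = {("d", "a"): 12, ("c", "d"): 10, ("a", "c"): 8,
         ("b", "d"): 6, ("a", "b"): 4, ("b", "c"): 2}
print(ranked_pairs(graph, ["a", "b", "c", "d"]))  # ['b', 'c', 'd', 'a']
```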
Consistency & scoring rules
• A rule is consistent if, whenever it produces the same
winner on two vote sets, it produces the same winner on
the union of those sets
• Known result: A rule is consistent if and only if it
determines the winner according to a scoring rule
[Young 1975]
• Hence, the following are equivalent properties of a rule:
– Consistency
– Determining the winner according to a scoring rule
– MLEWIV
• These questions are open (as far as I know):
– What is the characterization of MLERIV rules?
– What is the characterization of “ranking-consistent” voting
rules?
– What is the relationship between these?
Conclusions
• We asked the question: which common voting rules are
maximum likelihood estimators (for some noise model)?
• If votes are not independent given outcome
(winner/ranking), any rule is MLE
• If votes are independent given outcome, some rules are
MLEWIV (MLE for winner), some are MLERIV (MLE
for ranking), some are both:
           | MLERIV                                       | not MLERIV
MLEWIV     | scoring rules (incl. plurality, Borda, veto) | hybrids of MLEWIV and (not MLERIV) rules
not MLEWIV | STV, Kemeny                                  | Bucklin, Copeland, maximin, ranked pairs, Slater
Thank you for your attention!