This report does not have a title; as in our attempt to
decide on one, we got stuck in a Condorcet Cycle
Contents
1. Introduction to Voting
A Brief History
Condorcet Paradox (cycle)
Positional voting
Voting Vectors
Examples of systems
2. Voting Theory
Unanimity Criterion
Independence of Irrelevant Alternatives
Arrow's Theorem and proof
Muller-Satterthwaite Theorem and proof
Gibbard-Satterthwaite Theorem
Example of Tactical Voting
Bertrand's Ballot theorem and proofs
Condorcet's Jury Theorem and proof
3. Other Voting Systems
Weighted Voting
Banzhaf Powers (US Election example)
Range Voting (A Perfect Voting system?)
Figure Skating Example
Example of range voting
Conclusion
1. Introduction to Voting
“We drive to the polls and vote, gossip with colleagues at work about who might win, return home for dinner,
and briefly turn on the T.V. to find the projected election outcomes, outcomes which may be predicted within
minutes after the close of the polls and even before many ballot boxes are even opened.” - Chaotic Elections!
by Donald G. Saari p.1.
The analysis of voting systems dates back hundreds of years, arguably originating towards the end of the 18th
century, around the time of the French Revolution. There were two influential voting theorists at the time, whose
opposing ideas still live on today: Jean-Charles de Borda, inventor of the Borda Count, and Nicolas
de Condorcet, who came up with many results relevant to voting theory (including the Condorcet Cycle, or
paradox, after which this project is so ingeniously titled).
Example 1.1 Condorcet Paradox (cycle)
Consider three candidates, A, B and C, and three voters with the following preferences (where a>b denotes a
preference for a over b).
Voter      Preference
Voter 1    A>B>C
Voter 2    B>C>A
Voter 3    C>A>B
Let's say A is chosen as the winner. However it would be reasonable of C to claim that C is the worthy winner
as two of the three voters (Voters 2 and 3) prefer C over A. However B could then make a similar claim as
voters 1 and 2 prefer B over C. But two voters, namely voters 1 and 3, prefer A to B. This therefore leads to a
paradox where no candidate is a winner.
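The cycle in Example 1.1 can be checked mechanically. The following is a minimal sketch in Python, not part of the original analysis; the function names are our own and are purely illustrative. It compares every pair of candidates head to head and looks for a candidate who beats all the others.

```python
from itertools import permutations

# Ballots from Example 1.1, each listed from most to least preferred.
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]

def prefers(ballot, x, y):
    """True if this ballot ranks x above y."""
    return ballot.index(x) < ballot.index(y)

def majority_prefers(ballots, x, y):
    """True if strictly more ballots rank x above y than y above x."""
    votes_for_x = sum(prefers(b, x, y) for b in ballots)
    return votes_for_x > len(ballots) - votes_for_x

def condorcet_winner(ballots):
    """Return the candidate who beats every rival head to head, or None."""
    candidates = ballots[0]
    for x in candidates:
        if all(majority_prefers(ballots, x, y) for y in candidates if y != x):
            return x
    return None

# Head-to-head result for every ordered pair of candidates.
pairwise = {(x, y): majority_prefers(ballots, x, y)
            for x, y in permutations("ABC", 2)}
winner = condorcet_winner(ballots)
print(pairwise)
print(winner)
```

Here A beats B, B beats C and C beats A, so no candidate beats both rivals and the search returns None.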
Let us now define some important voting terms.
Definition 1.2 Positional Voting is a method of voting in which candidates receive points depending on their
position in each voter's preference.
Positional voting is the primary focus of this paper and covers many different systems. The fundamental
difference between these systems can be illustrated using something called a voting vector.
Definition 1.3 A Voting Vector is a vector which describes the way in which points are awarded. For a given voting
system in which there are n candidates, voting is described by a vector (v1, v2, v3, …, vn), where vi is the
number of points each voter gives to the candidate ranked in position i on that voter's ballot.
Example 1.4 Consider an election with four candidates and nine voters with preferences as below.
Number of voters    Preference
2                   A>B>C>D
2                   A>D>C>B
2                   C>B>D>A
3                   D>B>C>A
• Here Candidate A wins with the voting vector (1, 0, 0, 0):
Candidate  Score
A          4
B          0
C          2
D          3
This voting vector is representative of the plurality vote.
Definition 1.5 In plurality, the winner is the candidate with the most first place votes. [i]
• Here B doesn't receive a single first place vote, but wins with the voting vector (1, 1, 0, 0):
Candidate  Score
A          4
B          7
C          2
D          5
• Here, with the voting vector (1, 1, 1, 0), C wins:
Candidate  Score
A          4
B          7
C          9
D          7
This is essentially an example of Anti Plurality Voting.
Definition 1.6 Anti plurality voting is a voting system in which every voter essentially votes against one
candidate and the winner is the candidate with the fewest votes against.
• D wins with the voting vector (3, 2, 1, 0):
Candidate  Score
A          12
B          14
C          13
D          15
This is a classic case of the Borda Count.
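The four tallies in Example 1.4 all come from the same calculation with a different voting vector. The sketch below is our own illustration in Python (the names `positional_scores` and `winner_of` are hypothetical); it applies each vector to the preference schedule of the example.

```python
# Preference schedule from Example 1.4: (number of voters, ranking).
schedule = [
    (2, "ABCD"),
    (2, "ADCB"),
    (2, "CBDA"),
    (3, "DBCA"),
]

def positional_scores(schedule, vector):
    """Total points per candidate when position i on a ballot earns vector[i]."""
    scores = {candidate: 0 for candidate in schedule[0][1]}
    for count, ranking in schedule:
        for position, candidate in enumerate(ranking):
            scores[candidate] += count * vector[position]
    return scores

def winner_of(scores):
    """Candidate with the highest score (no tie at the top occurs in this example)."""
    return max(scores, key=scores.get)

# The four voting vectors used in Example 1.4.
vectors = {
    "plurality": (1, 0, 0, 0),
    "vote for two": (1, 1, 0, 0),
    "anti plurality": (1, 1, 1, 0),
    "borda": (3, 2, 1, 0),
}
results = {name: positional_scores(schedule, v) for name, v in vectors.items()}
for name, scores in results.items():
    print(name, scores, "winner:", winner_of(scores))
```

The same nine ballots produce four different winners, one for each voting vector.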
Definition 1.7 A Borda Count corresponds to a voting vector of (n-1, n-2, …, 1, 0) for an election with n candidates. [ii]
This is a special case of a positional voting or (scoring) system. In the Borda count the maximum points that
can be awarded is the number of candidates minus one, giving the minimum points as 0. Borda also doesn’t
allow for the same number of points to be awarded to two different candidates. This was one of the first
modern positional voting systems.
2. Voting Theory
Arguably the most important theorem in voting theory concerns the properties of methods for ranking candidates,
including these positional voting systems. It was derived by Kenneth Arrow, a twentieth century economist, and
forms the basis of much of modern voting theory, including several theorems which we will go on to discuss.
Before we can formally state this theorem, we will give two important definitions.
Definition 2.1 Unanimity Criterion. A ranking method satisfies the unanimity criterion if it guarantees that X is
preferred to Y by society if X is preferred to Y by every voter. [iii]
This criterion is commonly satisfied: if every person in a population prefers one option over another, it is
natural for the preferred option to be ranked higher than the unfavoured option by society as a whole.
Definition 2.2 Independence of Irrelevant Alternatives Criterion. A ranking method satisfies the independence of
irrelevant alternatives (IIA) criterion if the relative societal ranking of any two candidates X, Y depends only on the
relative ranking of X, Y on each individual ballot, not on how voters rank other candidates. [iv]
This seems natural; why would an introduction of a third candidate ever influence how society views two other
candidates relative to each other? It would seem reasonable to assume that if society preferred one candidate
to another then regardless of the introduction of any other candidates, this preference would always remain
the same.
We cannot, however, assume this criterion as readily as the last. It can be demonstrated that a common voting
system does not achieve this criterion.
Example 2.3 Independence of Irrelevant Alternatives
Consider the following election result between 3 candidates with 10 voters (under the plurality voting system):
Number of people with this opinion    4    3    3
1st Choice                            A    B    C
2nd Choice                            B    A    B
3rd Choice                            C    C    A
As you can see, 6 people prefer B to A, compared to 4 people preferring A to B. Yet A is the candidate that will
get into power, which is not what you'd expect considering more people in the population prefer B to A. Now
consider this example:
Number of people with this opinion    4    3    3
1st Choice                            A    B    B
2nd Choice                            B    A    C
3rd Choice                            C    C    A
Once again, the number of people preferring B to A is the same (6 people), but this time candidate B will
reach power. This shows that the ranking of C, an irrelevant alternative to A and B, does indeed have an
influence on the final ranking of A and B. The ranking of A and B therefore cannot be considered independent
of C, hence the IIA Criterion is not satisfied.
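The failure of IIA in Example 2.3 can be reproduced directly. This is a minimal sketch in Python under our own naming (`first_place_votes`, `relative_pattern` and the schedule variables are hypothetical); it confirms that every group ranks A and B the same way in both schedules, yet plurality ranks A above B in one and B above A in the other.

```python
from collections import Counter

# The two schedules of Example 2.3: (number of voters, ranking).
first_schedule = [(4, "ABC"), (3, "BAC"), (3, "CBA")]
second_schedule = [(4, "ABC"), (3, "BAC"), (3, "BCA")]

def first_place_votes(schedule):
    """Plurality tally: the number of first place votes for each candidate."""
    tally = Counter({candidate: 0 for candidate in schedule[0][1]})
    for count, ranking in schedule:
        tally[ranking[0]] += count
    return tally

def relative_pattern(schedule, x, y):
    """For each group of voters, whether that group ranks x above y."""
    return [ranking.index(x) < ranking.index(y) for _, ranking in schedule]

def society_prefers(schedule, x, y):
    """True if plurality gives x strictly more first place votes than y."""
    tally = first_place_votes(schedule)
    return tally[x] > tally[y]

same_pattern = (relative_pattern(first_schedule, "A", "B")
                == relative_pattern(second_schedule, "A", "B"))
a_above_b_first = society_prefers(first_schedule, "A", "B")
a_above_b_second = society_prefers(second_schedule, "A", "B")
print(same_pattern, a_above_b_first, a_above_b_second)
```

Only the position of C changes between the two schedules, yet the societal ranking of A and B flips.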
We require two more definitions before we can prove the theorem.
Definition 2.4 A Polarizing candidate is a candidate that is either ranked first or last on every ballot.
Definition 2.5 A dictatorship is a voting system in which the opinion of one voter decides the opinion of society.
We can now formally state and prove Arrow's Theorem.
Theorem 2.6 Arrow's Theorem. Any method of voting for ranking n>2 candidates, which satisfies the Unanimity
Criterion and the Independence of Irrelevant Alternatives Criterion, must be a dictatorship.
Proof We start with an arbitrary candidate, A, being ranked last on every ballot:
Voter          1    2    3    ...  (N-1)  N
1st Choice     .    .    .    ...  .      .
2nd Choice     .    .    .    ...  .      .
...            ...  ...  ...  ...  ...    ...
Last Choice    A    A    A    ...  A      A
By the unanimity criterion, A must be ranked last by society as every voter prefers all other candidates to A.
Below we have altered the preference of voter 1, by moving A to first choice from last, leaving all other relative
preferences the same.
Voter          1    2    3    ...  (N-1)  N
1st Choice     A    .    .    ...  .      .
2nd Choice     .    .    .    ...  .      .
...            ...  ...  ...  ...  ...    ...
Last Choice    .    A    A    ...  A      A
It can be shown that a 'Polarizing Candidate', that is, a candidate that is ranked either first or last by each
voter in the population, must be ranked either first or last in the final ranking from the entire population
(proof omitted). Now we check to see where A ranks overall: since A is a polarizing candidate, A will be
ranked either first or last by society. If A is still ranked last, we repeat the process with voter 2:
Voter    1st Choice  2nd Choice  ...  Last Choice
1        A           .           ...  .
2        A           .           ...  .
3        .           .           ...  A
...      ...         ...         ...  ...
(N-1)    .           .           ...  A
N        .           .           ...  A
This process continues until we change the ballot of a certain voter, k, when A becomes the preference of
society. (We know this must happen at some point, since if all N ballots are changed, A would be ranked first
by all voters, and would therefore win by the unanimity criterion.)
This means there must be a stage where the following ranking is reached.
Voter    1st   2nd   ...   Last
1        A     .     ...   .
2        A     .     ...   .
...      ...   ...   ...   ...
k-1      A     .     ...   .
k        .     .     ...   A
k+1      .     .     ...   A
...      ...   ...   ...   A
(N-1)    .     .     ...   A
N        .     .     ...   A
Above, A remains last in society’s preference. Now we change the preference of voter k, placing A first on their
ballot.
Voter    1st   2nd   ...   Last
1        A     .     ...   .
2        A     .     ...   .
...      ...   ...   ...   ...
k-1      A     .     ...   .
k        A     .     ...   .
k+1      .     .     ...   A
...      ...   ...   ...   A
(N-1)    .     .     ...   A
N        .     .     ...   A
A is now ranked first by society and it is clear that voter k has the pivotal vote here, since when this ballot is
changed, the ranking of the entire society changes as well.
Now we need to prove that this "pivotal voter" k is, in fact, a dictator. To do this, we need to introduce two
other candidates, labelled B and C (where A, B and C are all distinct).
Assume voter k ranks B over C (so B>C). We now consider the preference schedule where A is ranked first by
the first k-1 voters (the same as in the schedule where A was ranked last by society), A is ranked last by voters
k+1 through to N (the same as in the schedule where A was ranked first by society) and A is ranked below B,
but above C, on ballot k, as follows:
Voter   1    2    ...  k-1  k    k+1  ...  (N-1)  N
1st     A    A    ...  A    .    .    ...  .      .
2nd     .    .    ...  .    B    .    ...  .      .
...     ...  ...  ...  ...  ...  ...  ...  ...    ...
.       ...  ...  ...  ...  A    ...  ...  ...    ...
...     ...  ...  ...  ...  ...  ...  ...  ...    ...
.       ...  ...  ...  ...  C    ...  ...  ...    ...
Last    .    .    ...  .    .    A    A    A      A
It is important to note that the relative rankings of B and C have not been altered on any of the ballots.
Firstly we will simply consider only candidates A and B. In this case, A ranks above B for the first k-1 voters,
and below B for all voters after that. As we have shown previously, when A is the first choice of only the first
k-1 voters, A ranks last in the overall ranking of society. Therefore, B>A.
In a similar process, we now only inspect candidates A and C. In this case, A ranks above C in the ballot of
voter k, so A actually ranks above C in the first k ballots, and below in all ballots after that. Again, as shown
previously, when A is ranked first by the first k voters, A is the preferred option in society and this means that
A>C.
This leads to the preference result B>A>C by society, and therefore B>C. This shows that the ballot of voter k
controls the relative ranking of candidates B and C in this schedule.
Since B and C were any two arbitrary candidates other than A, it follows that voter k's relative ranking of any
two candidates who are not candidate A is also the relative ranking of the whole society. Voter k is therefore
known as the A*-dictator: no voter apart from voter k can have any influence over the ranking of two
candidates that are not A.
To prove that voter k governs the ranking for all candidates, we introduce a candidate X – this candidate is any
candidate who is not candidate A. So we effectively need to prove that the ranking of voter k between X and
A is also that of society.
We need to introduce a third candidate, Y, distinct from both A and X. Since A was arbitrary, there must also be
a Y*-dictator: no voter apart from the Y*-dictator can ever alter the relative societal ranking of two candidates
other than Y. This means that only the Y*-dictator can alter the rankings of A and X.
As shown previously, there must be at least one situation where voter k can influence the ranking of A and X,
namely where X is candidate B or C. This means that the Y*-dictator must also be voter k.
This shows that the ballot of voter k also dictates the societal relative ranking of X and A.
The fact that voter k is both the A*-dictator and the Y*-dictator proves that this voter must in fact
dictate the entire result of the ballot; that is to say, the preference schedule of k is always that of society.
Hence, voter k is a dictator. [v]
There are two important voting theorems closely related to Arrow's theorem; the Muller-Satterthwaite Theorem
and the Gibbard-Satterthwaite Theorem.
First some more definitions are required.
Definition 2.7 Pareto-efficiency is a weaker version of the unanimity criterion. It ensures that if a candidate is
ranked first by every voter, the candidate is made the winner.
Definition 2.8 A system is monotonic if a ballot change in favour of a winning candidate cannot cause that
candidate to become a loser.
Definition 2.9 Independence of Irrelevant Comparisons (IIC). A winner selection method satisfies the
independence-of-irrelevant-comparisons (IIC) criterion if, for each candidate X, it is possible to determine
whether or not X is among the winners if one merely knows, for each voter and each candidate Y different from
X, whether the voter ranks X above or below Y. [vi]
Example 2.10 Pareto-Efficiency
Number of Voters    Preference
3                   B>C>A>D>E
5                   C>B>A>E>D
1                   B>A>C>D>E
If we know that our winner selection method is Pareto-efficient and nothing else, we cannot conclude
anything from the above schedule as there is not a single candidate that is ranked first by all voters.
If we know that our winner selection method satisfies the unanimity criterion, then we can conclude that A, D
and E cannot win as they are all ranked below B by every voter.
Any winner selection method which satisfies the unanimity criterion is also Pareto-efficient. However, there are
some methods which are Pareto-efficient but do not satisfy the unanimity criterion; an example of this is
the Smith method. We would hope that all methods are Pareto-efficient; it would be a very unusual method
that was not.
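The two conclusions drawn from the schedule in Example 2.10 can be checked with a short script. This is a sketch in Python under our own naming (`unanimous_first_choice` and `unanimously_beaten` are hypothetical helpers, not standard terms); it looks for a candidate ranked first by everyone, and lists each candidate who is ranked below some rival on every ballot.

```python
# Preference schedule from Example 2.10: (number of voters, ranking).
schedule = [(3, "BCADE"), (5, "CBAED"), (1, "BACDE")]

def unanimous_first_choice(schedule):
    """The candidate ranked first on every ballot, or None if there is none."""
    firsts = {ranking[0] for _, ranking in schedule}
    return firsts.pop() if len(firsts) == 1 else None

def unanimously_beaten(schedule):
    """Map each candidate y to the rivals x that every voter ranks above y."""
    candidates = schedule[0][1]
    beaten = {}
    for y in candidates:
        for x in candidates:
            if x != y and all(r.index(x) < r.index(y) for _, r in schedule):
                beaten.setdefault(y, []).append(x)
    return beaten

top = unanimous_first_choice(schedule)
beaten = unanimously_beaten(schedule)
print(top)
print(beaten)
```

No candidate is ranked first by every voter, so Pareto efficiency alone says nothing here, while A, D and E are each ranked below B on every ballot and so are ruled out by the unanimity criterion.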
The following theorem refers to single-winner methods, that is, selection methods which never lead to
ties. Any system can be made into a single-winner method by a convention. The one that is usually used is
that in the event of a tie, the candidate who comes first alphabetically will be the winner. This would
seem unfair if your name was Zuckerburg, though you'd be very grateful if your name was Appleby. Therefore, to
make this slightly fairer, candidates are assigned a letter randomly.
Theorem 2.11 The Muller-Satterthwaite Theorem. When there are more than two candidates, the only
Pareto-efficient, monotonic single-winner method satisfying the IIC criterion is dictatorship of the kth voter for some k
between 1 and N (the number of voters).
Proof Suppose that we are given a Pareto-efficient, monotonic single-winner method that
satisfies the IIC criterion. We will prove that the method is a dictatorship of the kth voter for some k.
Let us first prove that the method satisfies the unanimity criterion. Suppose that it does not. That is to say,
all voters rank X above Y, but Y is still the winner. By the IIC, Y would still be the winner if we were to move
X to first on all ballots, without changing the ranking of Y relative to any candidate on any ballot. However,
since X is now ranked first by all voters, by Pareto efficiency X must be the winner, so Y is no longer the
winner. This contradiction proves that the method must satisfy the unanimity criterion.
We will now pick an arbitrary candidate, A say. Let us first think of a situation in which we know what must
happen, for instance if A was ranked first on all ballots. By Pareto efficiency, A is therefore the winner.
This is indicated in the following way:
      1    ...  k-1  k    k+1  ...  N    Winner
1st   A    ...  A    A    A    ...  A    A
2nd   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
Last  .    ...  .    .    .    ...  .
(1.1)
This is not a reduced preference schedule but a full one: the numbers along the top row just label voters, and
each column refers to one voter's preference.
Here we can see clearly that A should be the winner. Step by step we will make small changes to the
preference schedule, whilst still ensuring that A is the winner. Eventually, A will be very unpopular yet still the
winner, and the only explanation will be a dictatorship. We will now bring in another candidate, say B, placed
last on every ballot. A will remain first on each ballot, and must therefore still win by Pareto efficiency.
      1    ...  k-1  k    k+1  ...  N    Winner
1st   A    ...  A    A    A    ...  A    A
2nd   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
Last  B    ...  B    B    B    ...  B
(1.2)
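We will now move B up to second on the first ballot. By the IIC criterion, A remains the winner as long as B
does not reach first position on the ballot, since the ranking of A relative to every other candidate is
unchanged. (A is also still first on every ballot, so Pareto efficiency gives the same conclusion.)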
      1    ...  k-1  k    k+1  ...  N    Winner
1st   A    ...  A    A    A    ...  A    A
2nd   B    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
Last  .    ...  B    B    B    ...  B
(1.3)
As soon as B moves above A on the first ballot it is no longer clear who the winner is.
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  A    A    A    ...  A    ?
2nd   A    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
Last  .    ...  B    B    B    ...  B
(1.4)
It is easy to assume that A would still be the winner, as it is ranked first on all ballot papers except the first,
where it is ranked second. This would be true for most single-winner methods. However, if the single-winner
method is a dictatorship of the first voter, then B would in fact be the winner, so the winner would
change from A to B.
In the above preference schedule, we can be sure that the winner is either A or B. To see this we will introduce
a third candidate, C. We will suppose that C is the winner. By the IIC, C would have to be the winner in 1.3 but
we know that the winner was A not C. So we have concluded:
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  A    A    A    ...  A    A or B
2nd   A    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
Last  .    ...  B    B    B    ...  B
(1.5)
Let us suppose that the winner is still A. So we will move B upward on the second ballot. As long as B is not
first on the second ballot then A is still the winner by the IIC. When B is placed first and A second on the
second ballot then either candidate A is still the winner or B becomes the winner by the argument above. If A
remains the winner we then move on to ballot 3, and bring B to the top and push A down to second. We
continue until the winner changes from A to B. This will definitely happen eventually, as when B is placed first
on all ballots, B must be the winner as the method is Pareto-efficient. Let’s say that B becomes the winner
when B moves above A on the kth ballot.
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  B    A    A    ...  A    A
2nd   A    ...  A    B    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
Last  .    ...  .    .    B    ...  B
(1.6)
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  B    B    A    ...  A    B
2nd   A    ...  A    A    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
Last  .    ...  .    .    B    ...  B
(1.7)
Here you can see that the choice voter k makes is very important. If voter k places A above B then A remains
the winner; as soon as voter k places B above A, B becomes the winner. This may not seem too surprising, as k
might be approximately N/2, so the winner changes from A to B as soon as A no longer has the majority.
However, we can make this seem more surprising. We will first move A to the bottom on all ballots 1 to k-1,
and to second from bottom on all ballots k+1 to N, and A will still be the overall winner.
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  B    A    .    ...  .    A
2nd   .    ...  .    B    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    A    ...  A
Last  A    ...  A    .    B    ...  B
(1.8)
First we must notice that B can definitely not be the winner, as if B were the winner in 1.8 then B would
certainly be the winner in 1.6 by IIC and we know that the winner in 1.6 is A. We must also note that the winner
of 1.8 cannot be C, for if
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  B    A    .    ...  .    C
2nd   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    A    ...  A
Last  A    ...  A    .    B    ...  B
(1.9)
then,
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  B    B    .    ...  .    C
2nd   .    ...  .    A    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    A    ...  A
Last  A    ...  A    .    B    ...  B
(1.10)
by IIC, but 1.7 implies
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  B    B    .    ...  .    B
2nd   .    ...  .    A    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    A    ...  A
Last  A    ...  A    .    B    ...  B
(1.11)
again by IIC.
So we have proved that 1.8 is in fact correct and that despite A being at or near the bottom on all ballots
except the kth ballot, A still wins.
What would happen if we were to move A to the bottom on all ballots except the kth? Would A still be the
winner overall?
      1    ...  k-1  k    k+1  ...  N    Winner
1st   B    ...  B    A    .    ...  .    ?
2nd   .    ...  .    B    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    B    ...  B
Last  A    ...  A    .    A    ...  A
(1.12)
So far all we can say is that C can’t be the winner, as if it was the winner in the above schedule it would also be
the winner in 1.8, and we have already seen that A was the winner.
To show that A wins 1.12 we will introduce a third candidate C. We will include C in such a way that it will not
change the fact that A is the winner.
      1    ...  k-1  k    k+1  ...  N    Winner
1st   C    ...  C    A    .    ...  .    A
2nd   B    ...  B    C    .    ...  .
...   .    ...  .    B    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    C    ...  C
...   .    ...  .    .    A    ...  A
Last  A    ...  A    .    B    ...  B
(1.14)
We can swap the order of A and B in the k+1 to N columns, and the winner will still be A or B because of
previous arguments. However by the unanimity criterion, as C is ranked above B on all ballots, B cannot win. So
even though A is ranked last by all voters except the kth, A is still the winner.
      1    ...  k-1  k    k+1  ...  N    Winner
1st   C    ...  C    A    .    ...  .    A
2nd   B    ...  B    C    .    ...  .
...   .    ...  .    B    .    ...  .
...   .    ...  .    .    .    ...  .
...   .    ...  .    .    C    ...  C
...   .    ...  .    .    B    ...  B
Last  A    ...  A    .    A    ...  A
(1.15)
From the above preference schedule we can make any other preference schedule in which voter k ranks A first.
We would first reorder all candidates except A (if necessary) and then move candidate A up the preference
schedule for voters 1 to k-1 as well as voters k+1 to N (if necessary). In this way any other such preference
schedule can be made, and A will still remain the winner by the IIC criterion and monotonicity.
Whenever voter k ranks A first, A is the winner overall, so voter k is an A-dictator. Voter k does not
necessarily determine the outcome of an election unless voter k ranks A first, in which case A will be the
overall winner regardless of all other ballots.
Now we should remember that at the beginning of this proof we chose A arbitrarily, so we can therefore
conclude that for any candidate X we have an X dictator. Now if we have candidate X and candidate Y, we
must determine if dictator X is different to dictator Y or if in fact they are the same.
Now if dictator X and dictator Y are two different voters, and both voters place their respective candidates first,
then both X and Y must win. However we know that we are discussing single-winner methods, therefore we
cannot have two winners, and thus dictator X and dictator Y must in fact be the same person.
This proves that there is a single dictator whose first preference determines the outcome of the election. [vii]
Theorem 2.11 The Gibbard-Satterthwaite Theorem states that if there are more than two candidates and the
system is not a dictatorship, then there are situations in which some voter should vote strategically, not sincerely,
to achieve a result in his best interests. [viii]
This essentially means that there exist certain situations in which a voter can achieve a societal ranking that is
closer to his or her personal preference by voting differently from that preference. We aren't going to prove
this theorem, but we are going to show an example of how strategic/tactical voting can work.
Example 2.12 Tactical Voting
Let us consider a hypothetical example in which tactical voting can be seen to have an effect. Take a Borda count
with four candidates, A, B, C and D (i.e. with a voting vector (3, 2, 1, 0)), and ten voters. Let's just consider one
voter, k, and his preference, say B>D>A>C, so that in particular he prefers B over A over C. Now consider the
following result without the kth voter.
Candidate  Points
A          17
B          12
C          19
D          6
Now k's preference suggests that, if he voted sincerely, he would allocate three points to candidate B, two points to
candidate D, one point to candidate A and no points to candidate C.
If he did this, the result would be as follows:
Candidate  Points
A          18
B          15
C          19
D          8
This would give C the win, followed by A, B and D in that order. However, k could vote insincerely and allocate
three points to candidate A, two points to B, one point to D and no points to C, i.e. voting with a preference
of A>B>D>C.
This would then give the following result:
Candidate  Points
A          20
B          14
C          19
D          7
Here, B still comes third and D comes last; however candidate A is now the winner rather than candidate C.
Now as we earlier said that k preferred A over C, k has successfully achieved a societal result closer to his
sincere preferences than he would if he had actually voted sincerely.
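The arithmetic of Example 2.12 can be reproduced in a few lines. The following is a minimal sketch in Python; the helper names `add_ballot` and `ranking_of` are our own. It adds voter k's ballot to the totals of the other nine voters, once sincerely and once tactically.

```python
BORDA = (3, 2, 1, 0)

# Points from the nine other voters, as given in Example 2.12.
others = {"A": 17, "B": 12, "C": 19, "D": 6}

def add_ballot(totals, ranking, vector=BORDA):
    """Return new totals after adding one ballot, listed from first to last choice."""
    updated = dict(totals)
    for position, candidate in enumerate(ranking):
        updated[candidate] += vector[position]
    return updated

def ranking_of(totals):
    """Candidates ordered from most to fewest points (no ties occur in this example)."""
    return sorted(totals, key=totals.get, reverse=True)

sincere = add_ballot(others, "BDAC")   # voter k's true preference B>D>A>C
tactical = add_ballot(others, "ABDC")  # the insincere ballot A>B>D>C
print(sincere, ranking_of(sincere))
print(tactical, ranking_of(tactical))
```

With the sincere ballot the societal ranking is C, A, B, D; with the tactical ballot it becomes A, C, B, D, which voter k prefers since he ranks A above C.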
*
Joseph Bertrand theorised the probability of a winning candidate in a two candidate election being ahead of
the losing candidate throughout the election. This is of course relying on the votes cast having a chronological
order. Below is a simple example calculating this probability in a small election.
Example 2.13 If we have a ballot with four votes to be cast, all for either A or B, the possible orders
in which the votes can be cast where A wins by three votes to one are as follows:
BAAA
ABAA
AABA
AAAB
Now of these possible situations where A wins, A is always ahead in two of the four cases (AABA and AAAB).
So the probability is
\[ \frac{2}{4} = \frac{1}{2} \]
Theorem 2.14 Bertrand's Ballot Theorem states that in an election with candidates A and B,
where A receives a votes and B receives b votes with a>b, the probability of A being ahead of B throughout the
ballot is
\[ \frac{a-b}{a+b} \]
Proof Take again a as the number of votes for A and b as the number of votes for B. The theorem clearly
holds for a>b with b=0: the probability that A will be ahead throughout the vote is 1, because the only
votes cast are for A, and (a-0)/(a+0)=1.
Observe also that the theorem holds for a=b>0 because obviously the probability will be 0 because A will
necessarily not be ahead of B when the final vote is counted because the number of votes cast for A and B
respectively are equal.
Now we assume that the theorem is true for a=k, b=l-1 and for a=k-1, b=l, with k>l. We look at the case
where a=k, b=l and consider which candidate receives the last vote. The probability that the last vote is cast
for A is k/(k+l) and for B is l/(k+l). Since k>l, A is necessarily ahead once the final vote is counted, so A is
ahead throughout exactly when A is ahead throughout the first k+l-1 votes. So, to find the
probability that A is ahead of B throughout the count, we use:
\[ \frac{k}{k+l}\,(\text{probability with } a=k-1,\ b=l) + \frac{l}{k+l}\,(\text{probability with } a=k,\ b=l-1) \]
which due to our assumptions we can take as:
\[ \frac{k}{k+l}\left(\frac{(k-1)-l}{(k-1)+l}\right) + \frac{l}{k+l}\left(\frac{k-(l-1)}{k+(l-1)}\right) = \frac{k-l}{k+l} \]
Therefore, by induction, Bertrand's Ballot theorem holds for all a,b.
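As a sanity check on the theorem, the probability can also be found by brute force for small a and b. This is a sketch in Python of our own (the name `prob_always_ahead` is hypothetical); it enumerates every order in which the votes can be counted and uses exact fractions to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

def prob_always_ahead(a, b):
    """Exact probability that A is strictly ahead after every vote counted."""
    n = a + b
    good = total = 0
    # Each counting order is fixed by which of the n positions hold votes for A.
    for a_positions in combinations(range(n), a):
        a_set = set(a_positions)
        lead = 0
        ahead_throughout = True
        for i in range(n):
            lead += 1 if i in a_set else -1
            if lead <= 0:
                ahead_throughout = False
                break
        good += ahead_throughout
        total += 1
    return Fraction(good, total)

print(prob_always_ahead(3, 1))
for a in range(1, 7):
    for b in range(0, a):
        assert prob_always_ahead(a, b) == Fraction(a - b, a + b)
```

For a=3 and b=1 this reproduces the value 1/2 found in Example 2.13, and the loop confirms the formula (a-b)/(a+b) for every a up to 6.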
Alternative Proof A second proof of Bertrand's Ballot Theorem was devised by Désiré André, in which the vote
is represented as a kind of random walk. [ix] We will show an altered version of André's proof. (André's proof
dealt with the general case of A having a multiple, k, times the votes of B; we will therefore be proving the
case with k=1.)
Every vote for A leads to one step up, and every vote for B leads to one step down. A ballot in which A is not
ahead of B throughout is shown by the line touching the x-axis (after the point at which no vote has been
cast) or going below the x-axis, as this shows that at that point at least as many votes have been cast
for B as for A. Following André's method, we shall call a path in which the line meets the axis (after the
origin) a bad path, and a step which takes the path from above the axis to meet the axis a bad step.
Consider two sets of bad paths: the set of paths whose first bad step finishes one step below the axis, and
the set of bad paths whose first bad step finishes on the axis. As we are only considering the case where k=1,
the first set will only contain the paths which start with a step downwards, i.e. the first vote is for B.
To find the number of voting permutations beginning with a vote for B, we find the number of voting
permutations of the next a+b-1 votes, which is given by the number of ways of choosing the a
votes to be cast for A from the remaining a+b-1 votes. (We could also consider choosing the remaining b-1
votes to be cast for B from the remaining votes to be cast.) This is given by
\[ \binom{a+b-1}{a} \]
The rest of the bad paths will have a first bad step that finishes on the axis. These two sets are disjoint, so the
union of the two sets is the set of all the bad paths. Finding the number of paths in the second set involves a
clever manipulation of the above diagram. Below is the same diagram, with part of the line rotated 180° (shown in red).
Here the part of the line up until the point that the line hits the x-axis for the first time is rotated 180°.
The rotated path starts with a step downwards and contains the same numbers of votes for A and B, and the
rotation can be reversed. From here it can be inferred that there are the same number of paths which become
'bad' by hitting the axis as those which start by dipping below the axis. Therefore, as these two sets are
disjoint, the union of the two sets is precisely the set of all of the bad paths.
Therefore there are exactly
\[ 2\binom{a+b-1}{a} \]
'bad paths'.
So there are
\[ \binom{a+b}{a} - 2\binom{a+b-1}{a} \]
good paths, out of \binom{a+b}{a} paths in total, each of which is equally likely.
Therefore the probability in a vote of A always remaining ahead of B throughout is given by
\[ \frac{\binom{a+b}{a} - 2\binom{a+b-1}{a}}{\binom{a+b}{a}} = 1 - 2\left(\frac{(a+b-1)!\,b!}{(a+b)!\,(b-1)!}\right) = 1 - \frac{2b}{a+b} = \frac{a+b-2b}{a+b} = \frac{a-b}{a+b} \]
[x]
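The counting step in this argument, that both sets of bad paths have the same size, can also be confirmed by enumeration for small cases. The sketch below is our own Python illustration (`classify_paths` and its dictionary keys are hypothetical names); it sorts every path into good, bad and starting with B, or bad and starting with A.

```python
from itertools import combinations
from math import comb

def classify_paths(a, b):
    """Count good paths and the two kinds of bad path for a votes for A and b for B."""
    n = a + b
    counts = {"good": 0, "starts_with_B": 0, "starts_with_A_then_meets_axis": 0}
    for a_positions in combinations(range(n), a):
        a_set = set(a_positions)
        height = 0
        bad = False
        for i in range(n):
            height += 1 if i in a_set else -1
            if height <= 0:  # the path has met or crossed the axis
                bad = True
                break
        if not bad:
            counts["good"] += 1
        elif 0 in a_set:
            counts["starts_with_A_then_meets_axis"] += 1
        else:
            counts["starts_with_B"] += 1
    return counts

for a in range(1, 7):
    for b in range(1, a):
        counts = classify_paths(a, b)
        assert counts["starts_with_B"] == comb(a + b - 1, a)
        assert counts["starts_with_A_then_meets_axis"] == comb(a + b - 1, a)
        assert counts["good"] == comb(a + b, a) - 2 * comb(a + b - 1, a)
print(classify_paths(3, 2))
```

For a=3 and b=2 there are 4 bad paths of each kind and 2 good paths out of 10, in agreement with (a-b)/(a+b) = 1/5.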
Until now we have primarily discussed voting theorems in relation to elections. As an aside, we now look at an
interesting theorem based on the decision of a jury. Condorcet's jury theorem is about the relative probability
of a group of people reaching a correct decision. Marquis de Condorcet was the first person to arrive at this
theorem in his 1785 work, Essay on the Application of Analysis to the Probability of Majority Decisions.
Theorem 2.15 Condorcet's jury theorem The theorem concerns a group wishing to reach a decision by majority
vote, where there are two possible outcomes, one of which is correct, and each voter independently has
probability p of voting for the correct decision. It asks how many voters we should include in the group to
make a correct majority decision as likely as possible.
The answer depends on whether p is greater than or less than 1/2:
• If p is greater than 1/2, i.e. each voter is more likely than not to vote for the correct decision, then the
probability that the majority decision is correct increases as the number of voters increases, and
approaches 1.
• If p is less than 1/2, i.e. each voter is more likely than not to vote incorrectly, then adding more voters
just compounds the wrong decision; this would mean that an optimal jury would contain only one voter.
Proof Let n be the number of voters, and let us assume that n is odd in order to avoid the situation in which
there is a tie.
If we start with n voters, and m of these voters vote correctly, then we should consider what effect
adding two more voters would have on the majority vote. (We add two voters so that the total number of voters
remains odd.) The majority vote would only be affected in two situations:
• m was previously one vote too small to be a majority of n, and both new voters vote correctly;
the majority vote changes from incorrect to correct.
• m was just enough to be a majority of n, and both new voters vote incorrectly; the majority vote changes
from correct to incorrect.
In all other situations, the two new votes either cancel each other out or leave the existing majority in place.
So we only need to consider the case in which a single vote, within the first n votes, separates a correct from
an incorrect majority.
Let’s imagine that the first n-1 votes cancel each other, so that the deciding vote is the nth vote. Here the
probability of the majority decision being correct is simply p.
Then we add two extra voters. The probability that they change an incorrect majority to a correct majority is
(1-p)p^2: the deciding vote was incorrect and both new votes are correct. The probability that they change a
correct majority to an incorrect majority is p(1-p)^2, i.e. (1-p)(1-p)p.
Since (1-p)p^2 > p(1-p)^2 exactly when p > 1-p, changing an incorrect majority to a correct majority is the
more likely of the two only if p>1/2. Thus this proves the theorem.
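The behaviour described by the theorem can also be seen numerically. The sketch below is our own illustration, assuming that voters decide independently: it sums the binomial probabilities of every outcome in which more than half of the n voters are correct. The function name majority_correct and the sample values of n and p are chosen only for illustration.

```python
from math import comb


def majority_correct(n: int, p: float) -> float:
    """Probability that more than half of n independent voters are correct."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that ties cannot occur")
    need = n // 2 + 1  # smallest number of correct votes forming a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


for p in (0.6, 0.4):
    # For p > 1/2 the values rise towards 1; for p < 1/2 they fall towards 0.
    print(p, [round(majority_correct(n, p), 4) for n in (1, 3, 11, 101)])
```

With p = 0.6 each extra pair of voters raises the probability of a correct majority, while with p = 0.4 the single-voter jury does best, in line with the two cases of the theorem.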
3.Other Voting Methods
Weighted Voting
Definition 3.1 Weighted voting refers to systems which give different voters different weights on the outcome of
an election because of an assumption that not all voters deserve equal say.
In the case of the US Electoral College, the states take on the role of the voters. Each state is given a
number of votes equal to its number of seats in the House of Representatives, which is proportional to the
state's population, plus two for its senators. Another example of a weighted voting
system is when shareholders are awarded different numbers of votes based on their shareholding.
The notation used to denote weighted voting systems can vary but a common way is the following:
(q; v_1, v_2, …, v_n) where q is the quota, or the number of votes required for a vote to pass, and v_i is the
number of votes, or weighting, held by voter i for i = 1, …, n, where n is the number of voters.
Different voters will clearly have different influence, or power, over the outcome of ballots or elections. We
would expect that the more votes a voter has, the more power that voter has over the outcome of any vote.
One way to quantify the power a voter holds is the Banzhaf Power Index, introduced by John
F. Banzhaf III, which is relatively simple to compute.
Definition 3.2 The Banzhaf Power Index of a voter, V, is simply the number of winning coalitions in which V is a
critical vote (without V having voted in favour of the winner, the winner would not have won).
Example 3.3 Calculating Banzhaf Power
Imagine a voting system in which there are 4 voters, A, B, C and D with weighting 7, 5, 3 and 1 respectively and
a quota of 8 (a winning outcome requires 8 votes), i.e. a voting system (8; 7, 5, 3, 1). If we treat each voter as
either being in or out of a coalition we can see that there are 2^4 = 16 possible coalitions. Below is a table of
the possible coalitions and the number of votes they each carry:
Coalition    Votes carried
A            7
B            5
C            3
D            1
AB           12
AC           10
AD           8
BC           8
BD           6
CD           4
ABC          15
ABD          13
ACD          11
BCD          9
ABCD         16
There is also, of course, the 'empty coalition' in which no one votes in the coalition, which obviously carries no
votes. From the table it is easy to see that the winning coalitions are the following: AB, AC, AD, BC, ABC, ABD,
ACD, BCD and ABCD, a total of nine out of sixteen.
To calculate the Banzhaf number of a voter we must look at the number of coalitions in which the voter is
present, the number of these coalitions which are winning coalitions and finally the number of these winning
coalitions in which the voter is critical. As an example let’s work out the Banzhaf number of A in this case. A is
present in eight of the sixteen coalitions (exactly half as we would expect). By inspection, the winning
coalitions in which A is critical are AB, AC, AD, ABD and ACD: if A were not in these coalitions, the votes
carried would fall short of the eight votes required. This leads us to conclude that the Banzhaf power index of
A here is 5 and that the probability of A affecting the outcome of the vote is 5/8 (A is critical in five of the
eight coalitions containing A).
The Banzhaf power of a single voter, taken on its own, is usually meaningless. Banzhaf powers are, however, useful to compare the
power of voters and see if the voting system satisfies certain criteria. In the above example, the Banzhaf
numbers of B, C and D respectively are 3, 3 and 1. This demonstrates a potential weakness in this voting
system. Although B and C are given different weights, implying that B should have more of an influence than C
in the result of elections, they have exactly the same power.
This shows an important property of weighted voting systems which is that the proportion of votes held by a
voter does not correspond to the proportion of influence a voter has, i.e. doubling the number of votes a voter
has may not double the influence (indeed in some cases it may not lead to an increase in power at all).
Another possible feature of flawed weighted voting systems is the presence of dummies, which are voters who
have Banzhaf powers of 0 or in other words, have no say in the outcome of the election.
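The counting in Example 3.3 can be reproduced mechanically. The following is a minimal sketch of that brute-force count, in which every coalition is listed and each member is tested for being critical; the function name banzhaf_counts is our own.

```python
from itertools import combinations


def banzhaf_counts(quota: int, weights: dict[str, int]) -> dict[str, int]:
    """For each voter, count the winning coalitions in which that voter is critical."""
    counts = {voter: 0 for voter in weights}
    voters = list(weights)
    for size in range(1, len(voters) + 1):
        for coalition in combinations(voters, size):
            total = sum(weights[v] for v in coalition)
            if total < quota:
                continue  # not a winning coalition
            for v in coalition:
                # v is critical if the coalition would lose without v's votes.
                if total - weights[v] < quota:
                    counts[v] += 1
    return counts


counts = banzhaf_counts(8, {"A": 7, "B": 5, "C": 3, "D": 1})
print(counts)  # {'A': 5, 'B': 3, 'C': 3, 'D': 1}

# Dummies are voters who are never critical; this system has none.
print([v for v, c in counts.items() if c == 0])  # []
```

Because every coalition is visited, the running time doubles with each extra voter, so this approach is only suitable for small systems such as this one.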
US Electoral System
Probably the best-known weighted voting system is the US Electoral College. Here what were described
as 'voters' above are the fifty States of the USA and Washington DC (actually they are people elected to cast
votes on behalf of the state). They are given different numbers of votes based on the population recorded in
the most recent census (Washington DC is always awarded the same number of electoral votes
as the state with the lowest number, so at present has three). The number of electoral votes varies from 55,
held by California, down to 3, held by a number of states and DC, and the quota is 270 electoral votes for a
victory. Therefore the electoral system is a weighted voting system (270; 55, … , 3).
The Banzhaf power indexes here will show the power of each state. Because there are 2^51 possible coalitions,
it is impractical to compute the exact Banzhaf numbers by listing every coalition, even using computers, so a
Monte Carlo simulation is used, which relies on random sampling of coalitions to estimate the result.[xi] Note
that this gives not the Banzhaf number but the probability that a given state affects the outcome of the
election. The results of this simulation are available on page 273 of A Mathematical Look at Politics.
The results show that there are no dummies and that, without exception, the higher the number of votes a
state holds, the higher that state's power. In this way, we can say that the US electoral system is broadly fair to
the states.
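To illustrate the idea of estimating this probability by random sampling, here is a minimal sketch applied to the small system (8; 7, 5, 3, 1) of Example 3.3, where the exact answer for A is known to be 5/8. It is our own illustration and not the simulation used in the cited book; the function name, the sample size and the fixed seed are all assumptions made for the example.

```python
import random


def estimate_critical_probability(quota, weights, voter, samples=20_000, seed=0):
    """Estimate the chance that `voter` is critical when every other voter
    joins the coalition independently with probability 1/2."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    others = [w for v, w in weights.items() if v != voter]
    own = weights[voter]
    hits = 0
    for _ in range(samples):
        # Total weight of a uniformly random coalition of the other voters.
        rest = sum(w for w in others if rng.random() < 0.5)
        # The voter is critical if the others fall short alone but win with the voter.
        if rest < quota <= rest + own:
            hits += 1
    return hits / samples


toy = {"A": 7, "B": 5, "C": 3, "D": 1}
estimate = estimate_critical_probability(8, toy, "A")
print(estimate)  # an estimate of the exact value 5/8 = 0.625
```

The same function could in principle be given the 51 electoral vote totals and a quota of 270; the cost then depends on the number of samples rather than on the number of possible coalitions.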
However, it is in fact millions of voters, rather than 51 states that actually vote in the election. It is when we
consider this that some astounding properties of the voting system are displayed.
Example 3.4 Presidential Election
It is in fact possible for a candidate to win the Presidential Election with approximately 0.00002% of the
popular vote.[xii]
This is how this might happen (however unlikely it might be!). Suppose that on election day in 2012, for some
reason, all the voters besides one decide to stay home in each of the states of California, New York, Texas,
Florida, Illinois, Ohio, Pennsylvania, Michigan, New Jersey, Georgia and North Carolina, and the one voter in
each of these states votes for the same candidate, say Obama. He would receive 270 electoral votes, the precise
number required for victory, and would therefore win re-election no matter how high the turnout in the other
states was, even if every single voter in those states voted for the Republican. So out of a turnout of tens of
millions, the winning candidate may have received only ELEVEN votes!
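The arithmetic behind this example is easy to check. The electoral vote totals below are not given in the text; they are the allocations that applied to the 2012 election, supplied here as an assumption for the sake of the check.

```python
# Electoral votes under the apportionment used for the 2012 election
# (figures supplied for this check, not taken from the text above).
electoral_votes = {
    "California": 55, "Texas": 38, "New York": 29, "Florida": 29,
    "Illinois": 20, "Pennsylvania": 20, "Ohio": 18, "Michigan": 16,
    "Georgia": 16, "North Carolina": 15, "New Jersey": 14,
}

total = sum(electoral_votes.values())
print(len(electoral_votes), total)  # 11 270
print(total >= 270)  # True: exactly the quota, with one voter per state
```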
The above also demonstrates a violation of anonymity. In other words, if certain voters were to swap votes then
the election result could be dramatically affected. In the extreme example above, if the eleven voters who
carried the election for Obama swapped votes with eleven voters from the other states, the result of the
election would swing from being 270-268 in favour of Obama to being 538-0 in favour of the Republican
candidate. A 270 electoral vote swing just from swapping eleven votes!
Although the above scenario is an extreme, hypothetical example, it does show the potential weakness in
voting systems like that of the US Elections. Grouping individual voters together in different ways can have a
big effect on the result of an election. A real example of this is the 2000 US election, one of the closest in
history, in which Florida was won by George W Bush by just 537 votes out of a total of nearly six million cast.
As Bush won by only five electoral votes (271 to 266), if just 269 Bush voters in Florida had swapped votes
with 269 Al Gore voters in another state in which Gore won by more than 538 votes, Gore would have carried
Florida's 25 electoral votes and won the election by 45 electoral votes (291 to 246).
Because of this lack of anonymity, different voters have more or less influence on the outcome of the election.
The chance that a single voter actually influences the outcome of an election is virtually zero because to do
this, the voter would have to break a tie in their own state and their state would have to be critical in the
outcome of the election. Even though this probability is very small, it can still be seen to vary for people in
different states and it is the ratios of these probabilities that we will now determine.
Example 3.5 Determining the power of a single voter
Firstly we will examine the probability of a voter affecting the outcome of the vote within their own state.
Assuming that the state consists of an odd number of voters, the probability of a voter casting the deciding
vote in the election is the probability that the rest of the population splits its votes exactly evenly. We take
the rest of the population to be 2n, as our assumption means that the population minus the single voter is even.
Now there are exactly 2^{2n} possible permutations of the remaining votes, which we take to be equally likely.
The number of permutations in which n voters vote for each of two candidates is given by
\[ \binom{2n}{n} = \frac{(2n)!}{n!\,n!} \]
Therefore the probability of a single voter having the casting vote when the rest of the population numbers 2n is
\[ \frac{(2n)!}{2^{2n}\,n!\,n!} \]
Using Stirling's formula, which gives us the approximation [xiii]
\[ r! \approx r^r e^{-r} \sqrt{2\pi r} \]
and letting p denote the population (so that 2n = p - 1), the probability that a voter will cast the deciding
vote in a state of population p is approximately
\[ \sqrt{\frac{2}{\pi (p-1)}} \]
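The quality of the Stirling approximation can be checked numerically. The sketch below is our own illustration: it compares the exact probability that 2n randomly voting people split n to n with the approximation above, writing the population as p = 2n + 1. The function names are chosen only for this example.

```python
from math import comb, pi, sqrt


def exact_tie_probability(n: int) -> float:
    """Probability that 2n voters, each voting at random, split n to n."""
    return comb(2 * n, n) / 4**n  # 4**n equals 2**(2n)


def stirling_tie_probability(n: int) -> float:
    """The approximation sqrt(2 / (pi * (p - 1))) for a population p = 2n + 1."""
    population = 2 * n + 1
    return sqrt(2 / (pi * (population - 1)))


for n in (5, 50, 500):
    # The two columns agree more closely as n grows.
    print(n, exact_tie_probability(n), stirling_tie_probability(n))
```

For state-sized populations the approximation is the only practical option, since the factorials involved in the exact expression become enormous.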
The probability that a single state is critical in an election is just the Banzhaf Power Index of the state
divided by 2^50: the number of coalitions in which the state is critical divided by the total number of possible
coalitions of the other fifty members. We then multiply this number by the probability that a voter in a specific
state is critical in their own state to get the final probability that the voter casts a crucial vote in the final
election. We see that by performing these calculations, the higher the population of a state, the more power a
person in that state has. In fact a voter in California has more than twice the 'power' of someone in Washington
DC. Therefore, despite the appearance that the electoral system is fair to the states, due largely to the two
senators that every state gets, the system is in fact not fair to the individual voters in different states.
Because of the problems surrounding the current electoral system, there is a campaign in the US to use what is
claimed to be a fairer system, range voting. Its supporters believe that through range voting people are better
able to quantify their preferences; they also claim that it eliminates ‘wasted votes’.
Range Voting
Definition 3.6 In range voting each voter assigns a number of points to each candidate; for instance, using a
scale of 0-99, each candidate is given a score within that range. The winner is the candidate with the greatest
total after all points are added together. Sometimes, voters can also choose to express no opinion about a
candidate, leaving the rating of that candidate to other voters.
It originates from Ancient Sparta, where public elections were decided by how loudly the crowd shouted;
this is considered to be a form of range voting (or approval voting).
Range voting is a modification of the Borda count. It differs from the Borda count in two major ways. Firstly,
more than one candidate can be awarded the same number of points; secondly, the range of scores available
does not depend on the number of candidates.
Range voting is very popular among internet poll sites such as hotornot.com and Internet Movie Database,
where it is used to give an average score for each person involved in the poll.
In some sports the winner of an event is determined by a panel of judges. Many people assume that range
voting is used, i.e. that the person with the highest average or highest score will rank first and therefore win
the event, but this is not the case.
Example 3.7 Figure Skating
The following is a real world example of voting in the world of sport; in this case figure skating.
‘The U.S. champion, Nicole Bobek, had skated into second place behind Chen Lu of China. In third place after
her final performance was Surya Bonaly of France. Then, a relatively unknown skater, 14 year old Michelle
Kwan, took the audience and judges by storm with a performance which catapulted her into fourth place. Ms.
Kwan’s skate did not alter any of the judges’ scores for Ms. Bobek or Ms Bonaly. But after votes were tallied,
their positions flipped. Ms. Bonaly went home with the silver, and Ms. Bobek won bronze!’ [xiv]
In 1997 something similar happened at the European Men’s Championships. Urmanov was leading,
followed by Zagorodniuk in second and Candeloro in third. They had all completed their final skate. Then another
skater, Vlascenko, took to the ice and came in sixth. Instead of the top three positions remaining as they were,
this actually caused Zagorodniuk and Candeloro to swap positions, so that Candeloro finished second and
Zagorodniuk third.
The scoring system was an ordinal system. The way in which this worked was that each judge scored each
skater on a scale of 0 to 6. These scores themselves have no real meaning except that they are used to
indicate how that particular judge ranks that skater against the other skaters. These rankings are called
ordinals.
So what matters is how a particular skater's score compares to the scores given to the other competitors by
the same judge.
Then, for each competitor, the lowest-numbered place for which the competitor has a majority of votes from
the judges is determined (that is, a majority of judges placing the competitor at that place or higher). A skater
whose majority is for a lower-numbered place is ranked ahead of a skater whose majority is for a
higher-numbered place.
When two skaters are tied for the same place the following rules are used:
• The skater with the larger majority for the position is placed ahead of the others of the same position.
• If both skaters have the same majority, then add together the ordinals (not the scores) of the judges
that gave the majority. The skater with the lowest total ordinals is placed ahead.
• If the skaters are still tied, then consider the ordinals given by all of the judges; the skater with the
lowest total ordinals is the winner.
• If skaters are still tied, then they are both given the same position.
With one skater left to compete, the ordinals were as follows, where the numbers in the columns are the
positions in which each judge placed the skater relative to the others. The last entry in each row, a/b, means
that a judges ranked that skater at position b or above.
Skater                      Positions assigned to each skater by the judges    Majority (a/b)
1. Alexei Urmanov           1  1  1  1  1  1  1  1  1                          9/1
2. Viacheslav Zagorodniuk   5  5  4  4  2  3  2  2  3                          5/3
3. Philippe Candeloro       3  2  5  2  3  2  5  5  5                          5/3
4. Ilya Kulik               2  4  2  3  5  4  3  4  4                          8/4
5. Alexei Yagudin           4  3  3  5  4  7  4  3  2                          7/4
As you can see from above, Zagorodniuk and Candeloro were only split by the rules discussed above. Neither
of them had a majority for second place, but both had a majority for third and so tied for third.
After the final skater, Andrejs Vlascenko, had skated, the result was as follows:
Skater                      Positions assigned to each skater by the judges    Majority (a/b)
1. Alexei Urmanov           1  1  1  1  1  2  1  1  1                          8/1
2. Philippe Candeloro       3  2  5  2  3  3  5  6  6                          5/3
3. Viacheslav Zagorodniuk   5  5  4  4  2  4  2  2  3                          7/4
4. Alexei Yagudin           4  3  3  6  4  8  4  3  2                          7/4
5. Ilya Kulik               2  4  2  3  6  5  3  4  5                          6/4
6. Andrejs Vlascenko        7  7  6  5  5  1  6  5  4                          5/5
There is no longer a tie between Candeloro and Zagorodniuk. Zagorodniuk no longer has a majority of votes
for third place; his lowest majority is now for fourth place, so Candeloro, who still has a majority for third, is
placed ahead of him.
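The ordinal rules described above can be written out and applied to the two tables of ordinals. The following is a minimal sketch of those rules as we read them, using surnames as keys; the function names are our own, and skaters who remain tied after every rule are simply left in input order here rather than being given a shared position.

```python
def majority_place(ordinals):
    """Return (place, count): the lowest place for which a majority of judges
    put the skater at that place or better, and the size of that majority."""
    needed = len(ordinals) // 2 + 1
    for place in range(1, max(ordinals) + 1):
        count = sum(1 for o in ordinals if o <= place)
        if count >= needed:
            return place, count
    raise ValueError("no ordinals supplied")


def rank_skaters(table):
    """Order skaters by majority place, then by each tie-break rule in turn."""
    def key(name):
        ordinals = table[name]
        place, count = majority_place(ordinals)
        majority_sum = sum(o for o in ordinals if o <= place)
        # Lower place first, larger majority first, then lower ordinal totals.
        return (place, -count, majority_sum, sum(ordinals))
    return sorted(table, key=key)


before = {
    "Candeloro":   [3, 2, 5, 2, 3, 2, 5, 5, 5],
    "Kulik":       [2, 4, 2, 3, 5, 4, 3, 4, 4],
    "Urmanov":     [1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Yagudin":     [4, 3, 3, 5, 4, 7, 4, 3, 2],
    "Zagorodniuk": [5, 5, 4, 4, 2, 3, 2, 2, 3],
}
after = {
    "Candeloro":   [3, 2, 5, 2, 3, 3, 5, 6, 6],
    "Kulik":       [2, 4, 2, 3, 6, 5, 3, 4, 5],
    "Urmanov":     [1, 1, 1, 1, 1, 2, 1, 1, 1],
    "Vlascenko":   [7, 7, 6, 5, 5, 1, 6, 5, 4],
    "Yagudin":     [4, 3, 3, 6, 4, 8, 4, 3, 2],
    "Zagorodniuk": [5, 5, 4, 4, 2, 4, 2, 2, 3],
}

print(rank_skaters(before))
print(rank_skaters(after))
```

Before the final skate, Zagorodniuk and Candeloro both have a 5/3 majority with equal majority totals, and are separated only by the total of all nine ordinals. Afterwards a single judge's first-place ordinal for Vlascenko is enough to remove Zagorodniuk's majority for third.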
Example 3.8 An example of Range Voting
Consider an election with three candidates and ten voters. Below is a table showing each voter's preferences.
The number of points assigned to each candidate is out of 99. An 'X' denotes that the voter has decided not to
express an opinion about the candidate, and so that voter is not taken into account when calculating
the candidate's average score.
Voter           Candidate A     Candidate B     Candidate C
1               60              55              10
2               80              40              50
3               X               10              99
4               80              0               20
5               99              X               0
6               40              75              20
7               10              80              10
8               50              X               0
9               5               38              90
10              66              12              X
Total Score     490             310             299
Average Score   490/9=54.44     310/8=38.75     299/9=33.22
We use the averages to decide the result, which means that A wins with an average of 54.44, B comes second
with an average of 38.75 and C comes third with an average of 33.22.
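The totals and averages in the table can be recomputed as follows. This is a minimal sketch in which None stands for an 'X'; the function name range_vote_averages is our own.

```python
# Each ballot gives a score out of 99 per candidate; None stands for an 'X'.
ballots = [
    {"A": 60, "B": 55, "C": 10},
    {"A": 80, "B": 40, "C": 50},
    {"A": None, "B": 10, "C": 99},
    {"A": 80, "B": 0, "C": 20},
    {"A": 99, "B": None, "C": 0},
    {"A": 40, "B": 75, "C": 20},
    {"A": 10, "B": 80, "C": 10},
    {"A": 50, "B": None, "C": 0},
    {"A": 5, "B": 38, "C": 90},
    {"A": 66, "B": 12, "C": None},
]


def range_vote_averages(ballots):
    """Average score per candidate, ignoring voters who expressed no opinion."""
    totals, counts = {}, {}
    for ballot in ballots:
        for candidate, score in ballot.items():
            if score is None:
                continue  # an 'X': this voter is left out of the average
            totals[candidate] = totals.get(candidate, 0) + score
            counts[candidate] = counts.get(candidate, 0) + 1
    return {c: totals[c] / counts[c] for c in totals}


averages = range_vote_averages(ballots)
winner = max(averages, key=averages.get)
print({c: round(a, 2) for c, a in averages.items()})  # {'A': 54.44, 'B': 38.75, 'C': 33.22}
print(winner)  # A
```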
Range voting, at least theoretically, can be considered in some sense a 'perfect' voting system. It appears to
contradict Arrow's theorem in that it satisfies both the Independence of Irrelevant Alternatives criterion and
the unanimity criterion without being a dictatorship. It is not considered a contradiction of Arrow's theorem,
however, as it is not a system based purely on the voters' rankings of the candidates, which is the type of
system to which Arrow's theorem applies. It takes into account not only the voters' preferences but also the
strength of these preferences. Another important property claimed for range voting is that it is immune to
strategic voting, i.e. a sincere vote cannot result in a worse result than an insincere vote.[xv]
However, range voting is currently impractical to implement in most elections, as there would be significant
difficulties in counting the votes. In the future it would be possible to introduce range voting in elections with
the use of IT, whereby voting could be carried out on a computerized system and the votes counted
automatically.
Conclusion
Through our discussion of voting theory, we have established primarily that voting systems are inherently
flawed. We have seen the many different outcomes of an election which can be achieved with the same voting
preferences, due solely to the method of voting used, sometimes reversing the result entirely. We have seen
that in all (positional) voting systems, it is always possible to have a situation in which the result can be
manipulated through tactical voting. We have seen that in trying to make an election fair for the voters, it can
end up being quite the opposite (US Election). And we have also seen that the one voting method which
promises to remove these flaws (range voting) is currently impractical to implement. It looks like
the world is stuck with fundamentally flawed voting systems, at least for now.
[i] http://www.eecs.harvard.edu/cs286r/papers/Taylor02.pdf
[ii] Chaotic Elections!: A Mathematician Looks at Voting, Donald Saari, American Mathematical Society, 2001, page 26
[iii] Mathematics of Social Choice: Voting, Compensation and Division, Christoph Börgers, Society for Industrial Mathematics, 2010, page 83
[iv] Mathematics of Social Choice: Voting, Compensation and Division, Christoph Börgers, Society for Industrial Mathematics, 2010, page 83
[v] Mathematics of Social Choice: Voting, Compensation and Division, Christoph Börgers, Society for Industrial Mathematics, 2010, page 88
[vi] Mathematics of Social Choice: Voting, Compensation and Division, Christoph Börgers, Society for Industrial Mathematics, 2010, page 62
[vii] Mathematics of Social Choice: Voting, Compensation and Division, Christoph Börgers, Society for Industrial Mathematics, 2010, page 69
[viii] Chaotic Elections!: A Mathematician Looks at Voting, Donald Saari, American Mathematical Society, 2001, page 94
[ix] http://webspace.ship.edu/msrenault/ballotproblem/Four%20Proofs%20of%20the%20Ballot%20Theorem.pdf
[x] http://webspace.ship.edu/msrenault/ballotproblem/Four%20Proofs%20of%20the%20Ballot%20Theorem.pdf
[xi] A Mathematical Look at Politics, E. Arthur Robinson Jr and Daniel H. Ullman, CRC Press, 2010, page 371-372
[xii] A Mathematical Look at Politics, E. Arthur Robinson Jr and Daniel H. Ullman, CRC Press, 2010, page 381
[xiii] A Mathematical Look at Politics, E. Arthur Robinson Jr and Daniel H. Ullman, CRC Press, 2010, page 384
[xiv] Journalist Lila Guterman
[xv] http://www.rangevote.net/