At the last general election, the percentages of votes and numbers of seats in parliament for the three main parties in Britain were as follows: Labour, 35.3% of votes, 356 seats; Conservatives, 32.3% of votes, 198 seats; Liberal Democrats, 22.1% of votes, 62 seats. In the election coming up on the 6th May, there is a distinct possibility of some quite bizarre outcomes. For example, if some recent polls give a true picture of how people will vote (which is of course far from certain), then there is a good chance that the Liberal Democrats will get more votes than Labour, but well under half the number of seats. It is also a commonplace that the Conservatives will need a higher percentage of votes than Labour to become the party with the largest number of seats. In the past there have been occasions where the party with the largest number of votes has lost the election. (Much of what I am saying applies equally to the system for electing a US president, but I shall stick with the British system in this post.)
Supporters of the first-past-the-post system argue, correctly, that it makes it much more likely that one party will have an absolute majority. They also argue, much more controversially, that this is a good idea. However, regardless of the outcome of that argument, there can be no doubt that the system has the potential to lead to anomalous results, and this potential has been thrown into sharp focus in the last week or two because it has a good chance of being realized. Here I would like to discuss whether it is correct to describe these anomalies as unfair.
In case I have just given the impression that I am about to dazzle you by arguing convincingly that they are not in fact unfair, let me come straight out and say that there is no such surprise in store: our voting system is as unfair as it looks. My purpose is to discuss in more detail what “fair” means in this context. I should also add that I do not plan to discuss a whole array of different voting systems, give an account of Arrow’s paradox, etc. etc. That has been done to death in my opinion. (The rest of what I shall say is hardly new either, but I care about it enough to want to say it anyway.)
Why bother to vote, even if you care about the result?
A question that many people have asked themselves is this: what is the point of voting? After all, my constituency is huge, so it’s inconceivable that the result here will be decided by just one vote. Therefore, my vote will make no difference.
An obvious answer would be, “If lots of people did that, then those people between them would make a difference.” But that argument won’t do. The fact is that you know that lots of people are going to vote, and given that information it really is true that your vote is almost certain to make no difference.
Note, however, that I said “almost certain” rather than “certain”. It is certainly possible that by changing your vote you could change the outcome in your constituency. It’s even possible that changing that outcome would tip the balance of power and thus have a significant impact on future policy decisions.
The justification usually given for voting is that your expected reward for doing so is just about right: there is an incredibly small chance that your vote will have a very big influence, so your expected influence is small but non-zero, just as it should be given that you are just one voter amongst millions. Let us examine this argument in more detail.
What is the expected impact of my vote?
First let us consider the probability that your vote will change the result in your constituency. For simplicity, let us suppose that there are only two candidates who have a realistic chance of being elected, and that if there is a tie then the outcome will be decided on the toss of a coin. Let us call these two candidates A and B, and let us suppose that you vote for candidate A. Then a necessary condition for your vote to make a difference to the outcome is that, when all the other votes are counted, either A has one less vote than B (note for pedants: my opinion is that because “vote” is singular there, “less” is the right word rather than “fewer”, but not everyone agrees with this) or they have the same number of votes. In each of these two cases the chance that your vote actually does make a difference is 50%: in the first case your vote produces a tie and you then need A to win the coin toss, and in the second case your vote will not have helped A if A would have won the coin toss anyway.
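The 50% figure in both cases can be checked by treating “making a difference” as the increase in A’s chance of being elected. What follows is only a minimal sketch under the assumptions above (two candidates, ties settled by a fair coin); the function names are illustrative and the count of 100 votes for B is an arbitrary choice.

```python
from fractions import Fraction


def win_probability_a(votes_a, votes_b):
    """Chance that A is elected, with ties settled by a fair coin toss."""
    if votes_a > votes_b:
        return Fraction(1)
    if votes_a < votes_b:
        return Fraction(0)
    return Fraction(1, 2)


def gain_from_voting_a(others_a, others_b):
    """Increase in A's chance of winning when you add one vote for A."""
    before = win_probability_a(others_a, others_b)
    after = win_probability_a(others_a + 1, others_b)
    return after - before


# Margin (A minus B) among the other voters, with B fixed at 100 votes.
for margin in (-2, -1, 0, 1):
    print(margin, gain_from_voting_a(100 + margin, 100))
```

Only the margins -1 and 0 give a non-zero gain, and in each of those cases the gain is one half.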
How likely is it that A and B will get almost the same number of votes? The obvious way of modelling the situation is to assume that there are $n$ voters who vote for either A or B and that each such voter has a probability $p$ of voting for A and $q$ of voting for B, where $p+q=1$. Thus, the expected number of votes for A is $pn$. For A and B to have almost exactly the same number of votes, A needs to have almost exactly $n/2$ votes. Therefore, if $p=1/2-\epsilon$, the total number of A’s votes must differ from its expectation by approximately $\epsilon n$.
Now a commonplace in probabilistic combinatorics is that the binomial distribution is highly concentrated about its mean. In particular, the probability that it differs from its mean by at least $\epsilon n$ is at most $2e^{-c\epsilon^2 n}$, where $c$ is some constant. (I think one can take $c=2$.) In qualitative terms, unless $\epsilon$ is very small (and it has to be smaller and smaller the larger $n$ is), the probability that it differs from its mean by $\epsilon n$ is vanishingly small.
Let me put that another way. The binomial distribution approximates a normal distribution with standard deviation $\sqrt{pqn}$. Now the probability that a normally distributed random variable differs from its mean by more than a few standard deviations is fabulously small. How does $\epsilon n$ compare with $\sqrt{pqn}$? Let’s suppose $n=10{,}000$ and $\epsilon=0.05$. (In reality, the typical size of a British constituency is four or five times bigger than this.) Then $\epsilon n=500$, and $\sqrt{pqn}$ is roughly $50$. So A requires the result to be 10 standard deviations away from the mean to win. The probability of this is around $10^{-23}$, which is absurdly small. It follows that the probability that your vote will make a difference is also absurdly small.
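For readers who want to see the orders of magnitude, here is a rough numerical sketch. The values of 10,000 voters and a 45% chance of voting for A are assumptions chosen to give a gap of about ten standard deviations, as in the text. The bound computed at the end is Hoeffding’s inequality, which is one standard form of the concentration estimate; the argument above does not depend on that particular choice.

```python
import math

n = 10_000        # assumed number of voters choosing between A and B
p = 0.45          # assumed probability that a given voter picks A
eps = 0.5 - p     # A's shortfall from an even split

sigma = math.sqrt(n * p * (1 - p))   # standard deviation of A's vote count
gap = eps * n                        # extra votes A needs in order to draw level
z = gap / sigma                      # the same gap measured in standard deviations

# Normal approximation to the chance that A reaches n/2 votes or more.
normal_tail = 0.5 * math.erfc(z / math.sqrt(2))

# Hoeffding's inequality for a deviation of eps * n in either direction.
hoeffding_bound = 2 * math.exp(-2 * eps**2 * n)

print(f"sigma = {sigma:.1f}, gap = {gap:.0f}, z = {z:.2f}")
print(f"normal tail ~ {normal_tail:.1e}, Hoeffding bound = {hoeffding_bound:.1e}")
```

The Hoeffding bound is cruder than the normal estimate, but both are far below anything one could regard as a realistic chance.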
Incidentally, this model shows up a rather bizarre feature of the first-past-the-post system. Suppose that people’s voting intentions did not correlate with where they lived. If the most popular party then had a lead of two or three percent over the next most popular party, with high probability it would win every single seat! The only reason we do not see this phenomenon is that votes do correlate significantly with where people live. In the UK, for example, Labour tends to do better in the north and in inner cities and the Conservatives do better in the south and in the countryside. The Liberal Democrats suffer from a vote that is more evenly spread around the country and so they have much more difficulty winning seats. (So in fact we do see the phenomenon, but not quite in such an extreme form.) Indeed, the only reason the Liberal Democrats have the seats they do have is that there are pockets of the country, such as the extreme south west, the Scottish islands, and much more recently university towns such as Oxford, Cambridge and Bristol, where they traditionally do well.
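The claim about winning every single seat can be given a rough number. The sketch below keeps the two-party simplification and assumes 45,000 voters per constituency, 650 constituencies, and a two-point lead (51% against 49%), with every voter choosing independently of where they live. All of these figures are illustrative assumptions, and the normal approximation is used in place of the exact binomial.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


n = 45_000     # assumed number of voters in each constituency
p = 0.51       # assumed chance that a voter backs the leading party
seats = 650    # assumed number of constituencies

sigma = math.sqrt(n * p * (1 - p))   # spread of the leading party's vote count
z = (p - 0.5) * n / sigma            # expected lead in standard deviations

win_one_seat = normal_cdf(z)              # chance of carrying a given seat
win_every_seat = win_one_seat ** seats    # seats treated as independent

print(f"lead is {z:.2f} standard deviations in each seat")
print(f"chance of winning all {seats} seats: {win_every_seat:.3f}")
```

Under these assumptions the leading party is more than four standard deviations ahead in each seat, so even the chance of a clean sweep is well above 98%.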
Now consider what happens if $p=q=1/2$, that is, if each voter is equally likely to vote for A or for B. Now the expected number of votes for A is $n/2$. Let’s assume for simplicity that $n$ is even. What is the probability that A gets precisely $n/2$ votes (from the $n$ votes that go to either A or B)? The answer is on the order of $1/\sqrt{n}$. This can either be calculated by estimating the size of the binomial coefficient $\binom{n}{n/2}$ or by using the following more hand-wavy argument: we know that the number of votes for A will be $n/2$ plus or minus a small number of standard deviations. The standard deviation is $\sqrt{n}/2$ so that gives us about $\sqrt{n}$ possibilities, not all equally likely but mostly of comparable likeliness, of which we want one. If $n=10{,}000$, then in this case our probability of influencing the result in the constituency is on the order of 1/100, which is quite good going.
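The binomial-coefficient calculation is easy to carry out exactly. The sketch below assumes 10,000 voters, each equally likely to vote for A or B, and compares the exact probability of a dead heat with the estimate that Stirling’s formula gives for the central binomial coefficient.

```python
import math

n = 10_000  # assumed (even) number of voters choosing between A and B

# Exact probability that A gets precisely n/2 votes when p = 1/2.
# Python integers are unbounded, so the huge numerator and denominator are fine.
exact_tie = math.comb(n, n // 2) / 2**n

# Stirling's formula gives sqrt(2 / (pi * n)) for the same quantity.
estimate = math.sqrt(2 / (math.pi * n))

print(f"exact = {exact_tie:.6f}, estimate = {estimate:.6f}, 1/sqrt(n) = {1 / math.sqrt(n):.6f}")
```

The exact value is a little under 1/100, which agrees with the hand-wavy count of about $\sqrt{n}$ comparably likely outcomes.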
One can pursue such thoughts and reach the following not very surprising conclusion: if the two top candidates in your constituency are neck and neck, then you have a small but significant chance of influencing the result, whereas if one has a noticeable lead over the other, then your chances of influencing the result (whether the candidate with the lead is the one you support or the other one) are so small as to be effectively zero. Thus, the first-past-the-post system effectively disenfranchises anybody who belongs to a non-marginal constituency.
This might seem a rather curious conclusion. Surely you need people to go out and vote, in order to make the constituency safe for one party or another. So surely the people who vote in these constituencies are not disenfranchised after all.
I find this argument rather hard to counter; maybe others have some idea about it. The best I can think of is two remarks. The first is that other people behave as they behave, and given that behaviour it really is the case that if you are in a safe seat then the probability that your vote will make a difference to the outcome in that seat is negligible. The second is that the apparent paradox is closely related to the prisoner’s dilemma: in a safe seat, any individual voter is better off staying at home (because they lose nothing as far as the outcome is concerned and save themselves the trouble of voting) but if all the people who support the more popular party cooperate and actually vote for that party, then between them they may gain considerably more (because their collective action may affect the outcome) than they lose in the small effort needed to vote.
Is the model a good one?
So far, I have assumed that each voter who votes for A or B has a probability $p$ of voting for A. Is this a reasonable model? Surely there are lots of people who are certain to vote for A and lots who are certain to vote for B and only a few waverers.
This is not a big problem with the model. To see why, consider an extreme situation where everybody who might vote for A or B will definitely vote for A or definitely vote for B. That is, there is nobody who is undecided between A and B. Nevertheless, there is a certain amount of random fluctuation, because some people who are intending to vote will not in fact get round to it. So the outcome is not completely determined, and if the variance in the number of voters is reasonably large and (contrary to what actually happens) A-supporters and B-supporters are equally likely to be non-voters, then treating each person who arrives to vote as if they were making a purely random choice with a certain probability of going for A is not too bad a model.
However, it has a more serious defect if you are considering not voting, which is that you do not know the value of $p$. Even if an opinion poll is taken in your constituency (which it almost never is) you will not necessarily get a reliable value of $p$, since there can be biases in sampling, and people can change their minds in ways that are very far from independent. For example, the surge in support for the Liberal Democrats in Britain a couple of weeks ago was not the result of millions of people independently changing their minds but of millions of people having watched the TV debates and been impressed by the leader Nick Clegg. The best way to model this is to imagine that the value of $p$ has changed.
Now if the value of $p$ can move about, and if you don’t know what it is, then how does that affect the probability, given the evidence you have available, that your vote will make a difference to the outcome? It will usually increase the probability substantially. Why? Well, let us again take a rather extreme case and suppose that what happens is this: first the probability $p$ is chosen uniformly from the interval $[0,1]$ and then people vote for A with probability $p$, their votes being independent once the choice of $p$ has been made. (Those who followed the discussion of DHJ will recognise their old friend equal-slices measure here.) Then, roughly speaking, your vote will have a reasonable chance of making a difference if $p$ is within about $1/\sqrt{n}$ of $1/2$ (with $n$ again the number of voters), and when this is the case the chance that it will make a difference is itself around $1/\sqrt{n}$. So we end up with a probability of about $1/n$, which sounds rather fair when you come to think of it.
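In this extreme case the calculation can be done exactly rather than roughly. If the probability of voting for A is uniform on the whole interval from 0 to 1, then every possible vote count for A, from none to all, turns out to be equally likely, so with $n$ voters an exact tie has probability $1/(n+1)$. The sketch below checks this with exact rational arithmetic; the function name and the choice of 1,000 voters are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial


def prob_k_votes_uniform_p(n, k):
    """P(A gets exactly k of n votes) when p is drawn uniformly from [0, 1]."""
    # Beta integral: the integral of p^k (1-p)^(n-k) over [0, 1]
    # equals k! (n-k)! / (n+1)!.
    beta_integral = Fraction(factorial(k) * factorial(n - k), factorial(n + 1))
    return comb(n, k) * beta_integral


n = 1_000  # assumed number of voters
tie = prob_k_votes_uniform_p(n, n // 2)
print(tie)
```

The answer does not depend on the vote count asked for, which is exactly the equal-slices picture: each possible total gets the same weight.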
More realistically, you will know $p$ to within some error. For example, you may be pretty confident that it lies in some fairly narrow interval that contains $1/2$. Then your vote has a reasonable chance of making a difference. If the interval of possible probabilities does not contain 1/2 (even approximately), then you might as well stay at home.
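To put rough numbers on the difference between these two situations, here is a sketch that assumes the probability of voting for A is uniformly distributed over an interval of plausible values and integrates the chance of an exact tie over that interval. The 10,000 voters and the endpoints 0.45, 0.55 and 0.65 are hypothetical, and a simple midpoint rule stands in for a proper integration routine.

```python
import math


def tie_probability(n, low, high, steps=20_000):
    """P(exact tie among n voters) when p is uniform on [low, high]."""
    k = n // 2
    # Logarithm of the binomial coefficient C(n, k), via the log-gamma function.
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    width = (high - low) / steps
    total = 0.0
    for i in range(steps):
        p = low + (i + 0.5) * width  # midpoint of the i-th strip
        log_pmf = log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)
        total += math.exp(log_pmf)
    # Average of the tie probability over the interval.
    return total * width / (high - low)


n = 10_000  # assumed number of voters
close_race = tie_probability(n, 0.45, 0.55)  # interval containing 1/2
safe_seat = tie_probability(n, 0.55, 0.65)   # interval missing 1/2

print(f"close race: {close_race:.2e}")
print(f"safe seat:  {safe_seat:.2e}")
```

When the interval contains 1/2 comfortably, the answer is roughly one over the product of $n$ and the width of the interval, so narrowing the uncertainty around 1/2 raises the chance of being decisive. When the interval misses 1/2 the answer is negligible.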
The conclusion is a fact that all politicians know: only the votes in the marginal constituencies count. This is the big unfairness in the first-past-the-post system. It means that the interests of those who live in the marginal constituencies are more likely to be taken into account by those in power than the interests of people in safe seats.
Does that matter? One might argue that the composition of people in the marginal constituencies is likely to be quite mixed, and therefore a reasonably representative sample of the country as a whole. And to some extent that does seem to be the case: a political party usually needs to court the votes in the centre and can ignore both its core supporters and the core supporters of the other party, who are likely to be more extreme. And perhaps this argument is a reasonable one as long as two parties dominate the political scene, as is the case in the US and as has been the case for a long time in the UK. It is even possible to argue that the interests of voters for minority parties are looked after, since if a major party sees that a third party is getting a lot of votes, it may decide to steal some of that party’s policies. For example, in Britain the Labour party was out of power for 18 years, during which time a new party, the Social Democratic Party, was formed by people who broke away from Labour. In the 1983 election the anti-Conservative vote was evenly split between the Labour Party and an alliance of the SDP with the old Liberal party, with the result that the Conservatives won by a landslide (though with under 50% of the vote: that is how things work in Britain). The Labour Party was forced to move to the centre, and by the time they won power in 1997 their policies were of a kind that the founders of the SDP would have had no objection to whatsoever.
Nevertheless, this kind of indirect influence feels less like true representation than the direct influence to be gained if all votes count equally.
In the interests of fairness, I should mention the criticism that is often levelled at more proportional systems, which is that smaller parties can hold larger parties to ransom. Again, let us consider an extreme case: there are two parties with 48% of the seats each and one party with 4% of the seats. Let’s suppose that the small party represents a small geographical region. Then it could simply auction itself: whichever of the larger parties offers more money to that region gets the support of the small party. It is easy to see that this will probably lead to that region getting more than its fair share of the country’s resources.
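One way to see why the small party is in such a strong position is to list the coalitions that command a majority and ask, for each one, which members it cannot afford to lose. Counting such “swings” is the idea behind the Banzhaf power index, which is brought in here purely as an illustration; the party labels are hypothetical and the seat shares are those of the example above.

```python
from itertools import combinations

# Seat shares (in percent) from the example; the labels are illustrative.
seats = {"Large party 1": 48, "Large party 2": 48, "Small party": 4}
half = sum(seats.values()) / 2  # a coalition needs strictly more than this


def is_winning(coalition):
    """True if the coalition holds more than half of the seats."""
    return sum(seats[party] for party in coalition) > half


parties = list(seats)
swing_counts = {party: 0 for party in parties}
for size in range(1, len(parties) + 1):
    for coalition in combinations(parties, size):
        if not is_winning(coalition):
            continue
        for party in coalition:
            # A party is a swing member if the coalition loses without it.
            rest = tuple(other for other in coalition if other != party)
            if not is_winning(rest):
                swing_counts[party] += 1

print(swing_counts)
```

Each party is a swing member of exactly two coalitions, so by this measure the party with 4% of the seats has just as much bargaining power as either party with 48%.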
This is another game-theoretic situation. It reminds me of a very nice problem that has been studied by cognitive scientists. Imagine that you and a friend are walking down the street and a man appears out of nowhere with a huge wad of notes in his hand. It turns out that he has $100,000. He then says that he will give the money to you and your friend if and only if you agree on how to split it. You say, “Great, let’s split it 50-50,” and to your astonishment your friend says, “No, I’m not interested unless I get $90,000.” And it becomes clear that your friend is being serious. Your choice is either to accept the deal and gain $10,000 or to refuse it and not gain $10,000. What do you do? The opportunity is not likely to come round again …
Actually, the usual presentation of this problem is different. In the usual presentation, it is just your friend who is offered the money, but on condition that your friend shares it (in some proportion) with somebody else. And you are the chosen person. Most people say that they would accept the deal as long as they get about a third of the money — it seems that two thirds is the “reward” that your friend is considered to deserve for having been the one who met the rich man.
The relevance of this is that there is sometimes an interval of outcomes that will suit two parties who are trying to make a deal, and then it is far from clear where in that interval the deal will actually be struck.
I’d like to end with a defence of the alternative vote system. (I’m not saying it’s the best possible system, but just that it is a distinct improvement on what we have now, and not as bad as some people think.) This is the system where if nobody has an absolute majority, then the second-choice votes of the least popular candidate are distributed amongst the other candidates, and this process is iterated until some candidate does have an absolute majority.
Although this system is not proportional, and still leads to marginal constituencies mattering more than safe seats, it avoids one of the big abuses of the usual first-past-the-post system, namely the need for tactical voting. If your first choice is party A but only party B has a chance of beating party C, then under the current system you have a good reason for voting B. But under AV you have nothing to lose by voting for A as your first choice and B as your second. Since many people say that they do not vote for smaller parties because their vote would be “wasted” (blissfully unaware that it will almost certainly be wasted anyway), the alternative vote system could help people vote for the party whose policies they most supported. It doesn’t sound like much to ask, and even this would be a huge improvement on the current system, which, now that there is a third party that is not tiny, is monstrously unfair.