A by-election for the Seanad Éireann Dublin University constituency, arising from the election of Ivana Bacik to Dáil Éireann, is in progress. There are seventeen candidates, eight men and nine women. Examining the ballot paper, I immediately noticed an imbalance: the top three candidates, and seven of the top ten, are men. The last six candidates listed are all women. Is there a conspiracy, or could such a lopsided distribution be a matter of pure chance?
To avoid bias, the names on the ballot paper are always listed in alphabetical order. We may assume that the name of a randomly chosen candidate is equally likely to appear at any position on the list. With 17 candidates, there is about a 6% chance (1 in 17) for each of the 17 positions, so the distribution for a single candidate is uniform. However, when several candidates are grouped, the distribution is more complicated [TM231 or search for “thatsmaths” at irishtimes.com].
How Many Combinations?
We consider the group of eight male candidates. The number of possible positions they might occupy on the ballot paper is the number of ways of choosing 8 positions out of 17. A mathematician might write this as C(17,8) and read it as “seventeen-choose-eight”. The actual value, available using a scientific calculator, is 24,310. It is reasonable to assume that each possible choice is equally likely.
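The count can also be checked without a calculator. The following is a minimal sketch using Python's standard library; the variable names are chosen here for illustration only.

```python
from math import comb

n_candidates = 17  # names on the ballot paper
n_men = 8          # size of the group whose positions we track

# Number of ways of choosing 8 positions out of 17, i.e. C(17, 8)
n_arrangements = comb(n_candidates, n_men)
print(n_arrangements)  # 24310
```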
We can compute the average of the 8 positions for each choice. The probability of each average value is the number of choices from which it arises divided by the total number of choices. The smallest possible average, 4.5, arises only when the men occupy the top 8 positions. The greatest value, 13.5, occurs only when they are all at the bottom of the list. Both of these cases are highly unlikely.
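As a rough illustration of this counting argument, the sketch below enumerates every choice of 8 positions, tallies how often each average arises and converts the tallies to probabilities. It assumes, as in the text, that all choices are equally likely; the names `sum_counts` and `prob` are introduced here and are not from the original analysis.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

positions = range(1, 18)  # ballot positions 1..17

# Tally how many choices of 8 positions give each total
# (the average is simply the total divided by 8).
sum_counts = Counter(sum(c) for c in combinations(positions, 8))
total = sum(sum_counts.values())

# Probability of each average = number of choices giving it / total choices
prob = {Fraction(s, 8): Fraction(n, total) for s, n in sum_counts.items()}

lowest, highest = min(prob), max(prob)
print(float(lowest), prob[lowest])    # smallest average and its probability
print(float(highest), prob[highest])  # greatest average and its probability
```

Each of the two extreme averages arises from exactly one choice, so each has probability 1 in 24,310.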
The Average Positions
We examined the distribution of the average positions for all 24,310 sets of 8 positions. The results can be summarised in terms of a few numbers. First, the mean value of all the averages gives the expected position. Unsurprisingly, this comes to 9, the central value of the sequence from 1 to 17. The frequencies of the possible values follow an approximately bell-shaped curve centred on the mean value. The width of the curve is measured by the standard deviation, usually denoted by the Greek letter sigma. This root-mean-square value is computed by averaging the squares of the distances from the mean and taking the square root. The computed value is 1.3.
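The mean and standard deviation quoted above can be reproduced by brute force. This is only a sketch of one way to do it, assuming every set of 8 positions is equally likely; it is not necessarily how the original figures were obtained.

```python
from itertools import combinations
from math import sqrt

# Average position for every possible set of 8 positions out of 17
averages = [sum(c) / 8 for c in combinations(range(1, 18), 8)]

# Mean of all the averages
mean = sum(averages) / len(averages)

# Root-mean-square distance from the mean (population standard deviation)
sigma = sqrt(sum((a - mean) ** 2 for a in averages) / len(averages))

print(round(mean, 2), round(sigma, 2))  # 9.0 1.3
```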
The actual positions of the male candidates were 1, 2, 3, 5, 7, 8, 9 and 11, with an average position of 5.75. This lies two-and-a-half standard deviations, or 2.5 sigma, below the mean value of 9. For the canonical bell-shaped graph (the Gaussian or normal law), the chance of a value more than 2.5 sigma below the mean is about 1 in 160. But our distribution is not exactly Gaussian, so we calculated the actual relative frequency of averages at or below the observed value, which came to 1 in 180.
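The comparison between the Gaussian estimate and the exact frequency can be sketched as follows. The code assumes that all sets of 8 positions are equally likely and counts the sets whose average is at or below the observed one; the variable names are illustrative.

```python
from itertools import combinations
from math import erfc, sqrt

men = [1, 2, 3, 5, 7, 8, 9, 11]   # observed positions of the male candidates
observed = sum(men) / len(men)    # average position, 5.75

# Totals for every possible set of 8 positions out of 17
sums = [sum(c) for c in combinations(range(1, 18), 8)]
averages = [s / 8 for s in sums]
mean = sum(averages) / len(averages)
sigma = sqrt(sum((a - mean) ** 2 for a in averages) / len(averages))

# Distance of the observed average below the mean, in units of sigma
z = (mean - observed) / sigma

# Lower-tail probability of a normal distribution at z sigma below the mean
gaussian_tail = 0.5 * erfc(z / sqrt(2))

# Exact relative frequency of averages at or below the observed one
count = sum(1 for s in sums if s <= sum(men))
exact_tail = count / len(sums)

print(f"z = {z:.2f}")
print(f"Gaussian estimate: about 1 in {1 / gaussian_tail:.0f}")
print(f"Exact frequency:   about 1 in {1 / exact_tail:.0f}")
```

The exact count uses integer totals rather than floating-point averages, which avoids any rounding issues at the threshold.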
The Improbability Principle
The diagram shows a normal distribution with a mean of 9 and a standard deviation of 1.3. The observed value of 5.75, marked by the red line, is highly unlikely. A first reaction might range from mild surprise to deep suspicion. But we must recall the Improbability Principle: this encapsulates the paradoxical idea that extremely improbable events happen frequently.
Given enough opportunities, something extraordinary is bound to occur. With regular elections throughout the globe, unusual patterns are expected somewhere, sooner or later. Moreover, changing assumptions about the underlying probability distribution can greatly affect the likelihood of extreme events. We conclude that, in this particular case, luck is on the side of the men, but we may be confident that this luck will not hold.
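To put a rough number on "given enough opportunities", the sketch below computes the chance that a 1-in-180 pattern turns up at least once among several ballots. It assumes the ballots are independent and that each has the same 1-in-180 chance; both are simplifications, and the ballot counts are hypothetical, chosen only for illustration.

```python
def prob_at_least_once(p: float, n: int) -> float:
    """Chance that an event of probability p occurs at least once in n independent trials."""
    return 1 - (1 - p) ** n

p = 1 / 180  # chance of so lopsided a ballot in a single election

for n_ballots in (1, 10, 100, 500):  # hypothetical numbers of ballots
    print(n_ballots, round(prob_at_least_once(p, n_ballots), 3))
```

Under these assumptions, a pattern that is very unlikely on any one ballot becomes more likely than not once a few hundred ballots are considered.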
* * *