Several researchers have observed that, in a wide variety of collections of numerical data, the leading (most significant) decimal digits are not uniformly distributed, but conform to a logarithmic distribution. Of the nine possible values, the digit 1 occurs more than 30% of the time while 9 is found in less than 5% of cases (see Figure above). Specifically, the probability that the leading digit $D_1$ equals $d$ is

$$P(D_1 = d) = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \dots, 9.$$

A more complete form of the law gives the probabilities for the second and subsequent digits. A full discussion of Benford’s Law is given in Berger and Hill (2015).
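
The first-digit probabilities are easy to tabulate. The following is a minimal Python sketch, assuming the standard form of Benford's Law, $P(d) = \log_{10}(1 + 1/d)$; the function name `benford_probability` is an illustrative choice and does not come from the sources.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the leading decimal digit equals d, for d in 1..9."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    # Print the nine probabilities; they telescope to a total of 1.
    for d in range(1, 10):
        print(f"{d}: {benford_probability(d):.4f}")
```

The nine values sum to 1 because the product of the factors $(d+1)/d$ for $d = 1, \dots, 9$ telescopes to 10.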

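We define the Benford sets $B_d$ for $d = 1, 2, \dots, 9$ as

$$B_d = \{\, n \in \mathbb{N} : \text{the leading decimal digit of } n \text{ is } d \,\}.$$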

The relative density of $B_1$, the set of natural numbers with leading digit 1, in the range $[1, N]$ may be written

$$\rho_1(N) = \frac{\#\{\, n \in B_1 : n \le N \,\}}{N},$$

where $\#$ denotes the number of elements of a set. This oscillates between $1/9$ and $5/9$ as $N$ increases, and does not approach a limit. In particular, the set $B_1$ does not have a natural density. However, we can assign a probability of an arbitrary number being in $B_1$, following ideas outlined in Diaconis and Skyrms (2018) and, in greater detail, in Tenenbaum (1995) [Earlier post: How many numbers begin with a 1?].
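
The oscillation can be checked by direct counting. The sketch below computes the fraction of the integers from 1 to an upper limit whose leading digit is 1; the names `leading_digit` and `relative_density` are illustrative, not taken from the sources.

```python
def leading_digit(n: int) -> int:
    """Return the most significant decimal digit of a positive integer."""
    while n >= 10:
        n //= 10
    return n


def relative_density(digit: int, upper: int) -> float:
    """Fraction of the integers 1..upper whose leading digit equals `digit`."""
    count = sum(1 for n in range(1, upper + 1) if leading_digit(n) == digit)
    return count / upper


if __name__ == "__main__":
    # Upper limits just below a power of ten, and just below twice a power of ten.
    for upper in (9, 19, 99, 199, 999, 1999, 9999, 19999):
        print(upper, round(relative_density(1, upper), 4))
```

At upper limits of the form $10^k - 1$ the fraction is exactly $1/9$, while at limits of the form $2 \times 10^k - 1$ it lies just above $5/9$ and approaches that value as $k$ grows.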

**Averaging Methods**

Different sequences behave differently. The Fibonacci numbers conform to Benford’s Law: the relative frequency of the leading digit $d$ converges to $\log_{10}(1 + 1/d)$. The density of the set of Fibonacci numbers that start with 1 is $\log_{10} 2 \approx 0.301$. The sequence of prime numbers does *not* follow Benford’s Law. For the sequence of natural numbers, the relative density $\rho_1(N)$ of numbers up to $N$ with leading digit 1 oscillates, with $\liminf \rho_1(N) = 1/9$ and $\limsup \rho_1(N) = 5/9$.
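
The Fibonacci claim can be examined empirically. The following sketch tallies the leading digits of the first `count` Fibonacci numbers (taking the sequence to start 1, 1, 2, ...) so that the frequencies can be compared with $\log_{10}(1 + 1/d)$. The function name is an illustrative choice, and a finite sample only suggests the limiting behaviour; it does not prove it.

```python
import math
from collections import Counter


def fibonacci_leading_digit_frequencies(count: int) -> dict[int, float]:
    """Relative frequency of each leading digit among the first `count` Fibonacci numbers."""
    tally = Counter()
    a, b = 1, 1
    for _ in range(count):
        # str() is fine for moderate `count`; very large integers may exceed
        # Python's default limit on int-to-str conversion.
        tally[int(str(a)[0])] += 1
        a, b = b, a + b
    return {d: tally[d] / count for d in range(1, 10)}


if __name__ == "__main__":
    observed = fibonacci_leading_digit_frequencies(1000)
    for d in range(1, 10):
        print(d, round(observed[d], 3), round(math.log10(1 + 1 / d), 3))
```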

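For a set $A \subseteq \mathbb{N}$, the density can be defined as

$$\rho(A) = \lim_{N \to \infty} \frac{1}{N} \sum_{n \le N,\ n \in A} 1, \tag{1}$$

provided the limit exists. This is an instance of the Cesàro mean, assigning the weight $1/N$ to each of the first $N$ terms.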

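There are several alternative ways to specify density. The harmonic density replaces uniform weights by the decreasing sequence $1/n$:

$$\delta(A) = \lim_{N \to \infty} \frac{1}{H_N} \sum_{n \le N,\ n \in A} \frac{1}{n}, \qquad H_N = \sum_{n=1}^{N} \frac{1}{n}. \tag{2}$$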

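The numbers $H_N = 1 + \tfrac{1}{2} + \dots + \tfrac{1}{N}$ are known as the harmonic numbers. As is well known, the harmonic series diverges, so $H_N \to \infty$ as $N \to \infty$. Diaconis and Skyrms (2018) describe a generalisation of (2), in which the weights $1/n$ are replaced by $1/n^s$:

$$\delta_s(A) = \lim_{N \to \infty} \frac{1}{H_N^{(s)}} \sum_{n \le N,\ n \in A} \frac{1}{n^s}, \qquad H_N^{(s)} = \sum_{n=1}^{N} \frac{1}{n^s}.$$

For $s > 1$, the function $H_N^{(s)}$ converges to the Riemann zeta-function $\zeta(s)$ as $N \to \infty$.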

In the Figure above, we show the relative frequency with which two different leading digits occur (blue and red curves) as the upper limit $N$ of the range varies. This illustrates that, for leading digit 1, the frequency oscillates between limits of approximately $0.11$ and $0.56$.

In the Figure below, we show the relative frequency for the first digit of a number being 1, where the logarithmic mean (2) is used. The indication is that the frequency oscillates with reducing amplitude and tends to a limit of approximately 0.3, consistent with Benford’s Law.
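
The effect of harmonic weighting can be sketched numerically. The code below assumes the weighting described above: each integer $n$ carries weight $1/n$, and the total is normalised by the harmonic number of the upper limit. The names `leading_digit` and `harmonic_frequency` are illustrative and do not come from the sources.

```python
def leading_digit(n: int) -> int:
    """Return the most significant decimal digit of a positive integer."""
    while n >= 10:
        n //= 10
    return n


def harmonic_frequency(digit: int, upper: int) -> float:
    """Harmonically weighted frequency of leading digit `digit` among 1..upper."""
    weighted = 0.0  # sum of 1/n over n with the requested leading digit
    harmonic = 0.0  # harmonic number H_upper, the normalising constant
    for n in range(1, upper + 1):
        weight = 1.0 / n
        harmonic += weight
        if leading_digit(n) == digit:
            weighted += weight
    return weighted / harmonic


if __name__ == "__main__":
    for upper in (999, 1999, 9999, 19999, 99999, 199999):
        print(upper, round(harmonic_frequency(1, upper), 4))
```

With these weights the swing between successive upper limits is far narrower than for the uniform mean, though convergence is slow because the normalising constant grows only logarithmically.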

**The Logarithmic Distribution**

We saw that the frequency of occurrence of $d$ as the leading digit follows a logarithmic law. But where does this come from? If we assume that all numbers in the range $[1, N]$ may occur with equal probability, then the uniform distribution

$$p(n) = \frac{1}{N}, \qquad n = 1, 2, \dots, N,$$

is appropriate. This leads to the conclusion that all nine nonzero decimal digits should occur with equal probability $1/9$ (since zero cannot be a leading digit). However, we could argue that smaller numbers are more probable than larger ones and assign another distribution, such as the logarithmic distribution. We recall that the harmonic numbers $H_N$ are asymptotic to the logarithmic function $\log N$. Thus, to a good approximation, the probability that a randomly chosen number is in the range $[a, b]$ is proportional to $\log b - \log a$.

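Now consider a 'decade' of numbers $[10^k, 10^{k+1})$. The probability that a random choice falls within this interval is proportional to $\log 10^{k+1} - \log 10^k = \log 10$, while numbers with leading digit 1 (in the interval $[10^k, 2 \times 10^k)$) occur with probability proportional to $\log 2$. Thus, the relative frequency of numbers with leading digit 1 is

$$\frac{\log 2}{\log 10} = \log_{10} 2,$$

or about $0.301$. This is the special case of Benford’s Law for $d = 1$. The remaining cases may be demonstrated in a similar manner.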
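
For a general leading digit $d$, the same argument can be sketched as follows, assuming the logarithmic weighting used above and taking the numbers with leading digit $d$ in the decade $[10^k, 10^{k+1})$ to be those in $[d \times 10^k, (d+1) \times 10^k)$:

$$\frac{\log\big((d+1) \times 10^k\big) - \log\big(d \times 10^k\big)}{\log 10^{k+1} - \log 10^k} = \frac{\log\dfrac{d+1}{d}}{\log 10} = \log_{10}\left(1 + \frac{1}{d}\right).$$

The result does not depend on $k$, so every decade gives the same relative frequency.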

**Sources**

Berger, Arno and Theodore P. Hill, 2015: *An Introduction to Benford’s Law*. Princeton Univ. Press, 248pp. ISBN: 978-0-691-16306-2.

Diaconis, Persi and Brian Skyrms, 2018: *Ten Great Ideas About Chance*. Princeton Univ. Press, 255pp. [See Chapter 5].

Tenenbaum, Gérald, 1995: *Introduction to Analytic and Probabilistic Number Theory.* Cambridge University Press. ISBN 0-521-41261-7.

Thatsmaths: How many numbers begin with a 1?