The Sieve of Eratosthenes and a Partition of the Natural Numbers

The sieve of Eratosthenes is a method for finding all the prime numbers less than some maximum value {M} by repeatedly removing multiples of the smallest remaining prime until no composite numbers less than or equal to {M} remain. The sieve provides a means of partitioning the natural numbers. We examine this partition and derive an expression for the densities of the constituent “Eratosthenes sets”.

TABLE 1: Arrangement of the natural numbers as multiples of the prime numbers in sequence. The {k}-th row contains the “Eratosthenes set” {E_k}.

The Sieve of Eratosthenes

The primorial, {P_K}, is defined to be the product of the first {K} primes:

\displaystyle P_K = \prod_{k=1}^K p_k \,.

The sequence of primorials is {\{2, 6, 30, 210, 2310, \dots \}} and the terms of the sequence grow as {K^K}. It is convenient to set {M = P_K}. The algorithm of Eratosthenes goes as follows: starting from the set {I_M = \{ 1, 2, 3, \dots , M\}}, eliminate all multiples of {2} greater than {2}; eliminate all remaining multiples of {3} greater than {3};  eliminate all remaining multiples of {p_K} greater than {p_K}. All that remains is the set of the first {m} prime numbers, {\{2,3,5, \dots,p_m\}}, where {p_m} is the largest prime not exceeding {M = P_K}.

The {k}-th “Eratosthenes set”, {E_k}, is the set containing {p_k} together with all the numbers removed at stage {k}. Thus, {E_1} is the set of all multiples of {2}, that is, all the even numbers; {E_2} is the set of odd multiples of {3}; {E_3} is the set of multiples of {5} not divisible by {2} or {3}; {E_4} is the set of multiples of {7} not divisible by {2}, {3} or {5}; and so on. The {k}-th row in Table 1 contains the “Eratosthenes set” {E_k}.
Let {D_k} be the set of multiples of {p_k} up to {M} and {E_{k,M} = E_k\cap I_M} the set containing all multiples of {p_k} up to {M} that are not multiples of any smaller prime. Then

\displaystyle E_{k,M} = D_{k,M} \cap (D_{1,M}^C \cap D_{2,M}^C \cap \dots \cap D_{{k-1},M}^C) \,. \ \ \ \ \ (1)

Since all primes {p_k} for {k\le K} divide {M}, the sizes of the {D}-sets are known: {| D_{k,M} | = M/p_k } and so {| D_{j,M}^C | = M - M/p_j = M(1 -1/p_j ) }.

Density

We define the density of a set {A \subseteq I_M} (relative to {M}) to be {\rho(A) = |A|/M}. Then {\rho(D_{k,M}) = 1/p_k } and {\rho(D_{j,M}^C) = (1 - 1/p_j ) }. Clearly, density is additive for disjoint sets. Division by {M} converts cardinalities to densities. Thus, for example,

\displaystyle \rho(A\cup B) = \rho(A) + \rho(B) - \rho(A\cap B) \,.

By means of the inclusion-exclusion principle, we easily show that the density of the set {E_{k,M}} in (1) is the product of the densities of the component sets on the right side:

\displaystyle \rho(E_{k,M}) = \rho(D_{k,M})\rho(D_{1,M}^C) \rho(D_{2,M}^C) \dots \rho(D_{{k-1},M}^C) \,. \ \ \ \ \ (2)

Using explicit expressions for the terms on the right, the density of the set {E_{k,N}} is

\displaystyle \rho(E_{k,M}) = \frac{1}{p_k}\frac{(p_{1}-1)}{p_{1}}\frac{(p_{2}-1)}{p_{2}}\ \dots\ \frac{(p_{k-1}-1)}{p_{k-1}} = \frac{1}{P_k} \prod_{j=1}^{k-1} (p_j-1) \ (3)

where {P_k = p_1 p_2 \dots p_{k}}. We observe that the numbers {\rho_{k,M} := \rho(E_{k,M})} are generated by a recurrence relation

\displaystyle \rho_{k+1,M} = \left( \frac{p_k-1}{p_{k+1}} \right) \rho_{k,M} \,, \ \ \ \ \ (4)

with initial value {\rho_{1,M} = \frac{1}{2}}. This enables us to compute the sequence {\{\rho_{k,M}\}}. The first eight density values are given in Table 2.

TABLE 2: Density of the Eratosthenes sets {E_k} for {k\le 8}.

Passage from {I_N} to {\mathbb{N}}

For arbitrary {N \in\mathbb{N}}, let {I_N = \{1, 2, \dots , N\}} and let {D_{k,N}} denote {D_k \cap \{1,2, ...,N\}}, the set of all multiples of {p_k} not exceeding {N}. Then {| D_{k,N} | = [ N/p_k ]} and {\rho(D_{k,N}) = [N/p_k ]/N}. The limit of { \rho(D_{k,N})} exists, so that

\displaystyle \rho(D_k) := \lim_{N\rightarrow\infty} \rho(D_{k,N}) = \frac{1}{p_k} \qquad\text{and also}\qquad \rho(D_k^c) = 1 - \rho(D_k) = 1 - \frac{1}{p_k} \,.

In this way, we can pass from {I_N} to {\mathbb{N}}, obtaining the densities of all the Eratosthenes sets in {\mathbb{N}}. In particular, the values of {\rho_k} in Table 2 are also the densities of the first eight (infinite) Eratosthenes sets relative to the natural numbers.

Equations (2) and (3) remain valid in the limit {M\rightarrow\infty}, as does the recurrence relation for {\rho_{k} := \lim_{M\rightarrow\infty} \rho_{k,M}}. Thus,

\displaystyle \rho_{k+1} = \left( \frac{p_k-1}{p_{k+1}} \right) \rho_{k} \,. \ \ \ \ \ (5)

We can show that the series {\sum \rho_n} converges. However, the simple ratio test is inadequate, as {\lim \rho_{n+1}/\rho_n = 1}, and a more subtle and discriminating test is required (Lynch, 2023). Although the series converges, the convergence rate is quite slow. Writing {\sigma_N = \sum_{k=1}^N \rho_k} we have {\sigma_{10} = 0.842}, {\sigma_{1,000} = 0.938}, and {\sigma_{100,000} = 0.960}.

Partitioning the Natural Numbers

Defining {E_0 = \{1\}}, we obtain a partition of the natural numbers {\mathbb{N}}:

\displaystyle \mathbb{N} = \biguplus_{n=0}^\infty E_n \,, \ \ \ \ \ (6)

The disjoint union in (6) contains all the positive integers, each occurring just once, providing a partition of {\mathbb{N}}.  Euler’s totient function {\varphi(n)} counts the natural numbers up to {n} that are coprime to {n}. In other words, {\varphi(n)} is the number of integers {k} in the range {1 \le k \le n} for which the greatest common divisor {\text{gcd}(k, n)} is equal to 1. Clearly, for prime numbers, {\varphi(p) = p-1}.

The number of values {x} coprime to {\prod_{j=1}^k m_j} is, by definition, given by {\varphi(m_1m_2 \dots m_k)}. But Euler’s function is multiplicative for products of mutually coprime numbers {\{m_1, m_2, \dots , m_k\}}:

\displaystyle \varphi(m_1 m_2 \dots m_k) = \prod_{i=1}^k \varphi(m_j) \,.

Thus, for {M = P_K}, we have

\displaystyle \varphi(P_K) = \varphi\left(\prod_{j=1}^K p_j \right) = \prod_{j=1}^K \varphi(p_j) = \prod_{j=1}^K (p_j-1) = P_K \prod_{j=1}^K \left(1-\frac{1}{p_j} \right) \,.

Now, using (3), we can write the density in terms of the totient function:

\displaystyle \rho_k = \frac{1}{P_k} \prod_{j=1}^{k-1} (p_j-1) = \frac{1}{P_k} \prod_{j=1}^{k-1}\varphi(p_j) = \frac{ \varphi( P_{k-1} )}{ P_k } \,. \ \ \ \ \ (7)

Defining the cumulative density {\sigma_k = \sum_{j=1}^k \rho_k} and noting that, as the sets {E_k} are mutually disjoint, {\sigma_k} must approach {1}, we obtain the relationship

\displaystyle \sum_{k=1}^\infty \left[ \frac{ \varphi( P_{k-1} )}{ P_k } \right] = 1 \,.

This result must be well known, although it has not been found in a cursory search of the literature.

Sources

Lynch, Peter, 2023: The Sieve of Eratosthenes and a Partition of the Natural Numbers. Preprint: PDF