Sieve of Eratosthenes: Algorithm to Generate Prime Numbers
This article explains the Sieve of Eratosthenes and its optimizations for efficiently generating prime numbers within a range.
In the last article about prime numbers, I discussed about different types of primality test and Trial division method to verify if a number is a prime. If you haven’t read that yet, here is the article, Prime numbers and basic primality test. In this article, we’ll focus on Sieve of Eratosthenes, which helps us to compute primes in a range \([2, n] \) efficiently and later on we’ll see how we can modify this algorithm to compute primes from \(l\) to \(r\).
Bruteforce method
One easy way to generate primes numbers in a range \([2, n]\) is to bruteforce from \(2\) to \(n\) and check if the number is a prime using the square root primality test. This is very easy to understand but it’s much slow since for each iteration we are performing a square root primality test. If n is the number of elements in the range, then the time complexity is approximately \(O(n\sqrt{n})\).
vector<int> primes;
for (int i = a; i <= b; ++i) {
// if i is a prime
if (isPrime(i)) {
// insert i into the vector
primes.push_back(i);
}
}
Seive of Eratosthenes
The previous algorithm seems to be very slow. Let’s see how we can use the algorithm Sieve of Eratosthenes to generate primes in a range. It’s very efficient and also very easy to understand. At first we take all numbers in the range \([2, n]\). We start with \(2\) and we mark all multiples of \( 2\) of the form \(2k\) where \(k > 1\) as composite.
Then we find the next number that hasn’t been marked as composite. The next such number is \(3\). Since it was not marked, it means \(3\) is prime. Again we mark all numbers of the form \(3k\) where \(k > 1\). The next unmarked number is \(5\), which is the next prime number, again we mark all the numbers of the form \(5k\) where \(k > 1\) as composite, we continue this process. The idea is that when we get a unmarked number, we can say that the number is prime. So, we can push that number into a vector where we’ll store all the primes. After the end of the process, the vector will contain all the primes in the range \([2, n]\).
vector<int> primes; // all primes will be inserted here
bool marks[1010];
void sieve(int n) {
for (int i = 2; i <= n; ++i) {
if (marks[i])
continue; // if i is composite
// otherwise 'i' is prime
// so we find all the multiples of i
// and mark those as composite
for (int j = i + i; j <= n; j += i)
marks[j] = 1;
// insert 'i' into the vector
primes.push_back(i);
}
}
Sieving till \(\sqrt{n}\)
The complexity of the previous code is \(O(n \, log \, logn)\). We can further optimize our code, although the time complexity will remain the same but the number of operations will be reduced. Notice that we are going from \(2\) to \(n\) in the first loop, but we can go only upto \(\sqrt{n}\), since every composite number has a divisor less than or equal to the square root of itself. Why is that ?
Suppose that \(n\) is a composite. Then we can write \(n=a \times b\) for integers \(a\) and \(b\) which are both greater than \(1\). If both \(a>\sqrt{n}\) and \(b>\sqrt{n}\) then we would have \(ab> \sqrt{n} \times \sqrt{n}\) thus \(ab > n\), which is a contradiction. Hence either \(a \leq \sqrt{n}\) or \(b \leq \sqrt{n}\), so if you check as far as \(\sqrt{n}\) and don’t find a divisor of \( n\) greater than \(1\), then \(n \) must be a prime. Thus, we can conclude that a composite number has a divisor less than or equal to the square root of itself.
This way we mark all the composite numbers. We push \(2\) into our primes list. After that we can start a loop from \([3, n]\) incrementing \(2\) at a time (ignoring all are even numbers). Also we can use bit set instead of regular boolean array for memory optimization.
vector<int> primes; // all primes will be saved here
bitset<10000> marks; // using bitset for memory optimization
void sieve(int n) {
// we first fill the marks
// marks[i] = 0, if i is prime
// marks[i] = 1, if i is composite
for (int i = 2; i * i <= n; ++i) {
if (marks[i])
continue; // if i is composite
// otherwise 'i' is prime
// so we find all the multiples of i
// mark those as composite
for (int j = i + i; j <= n; j += i)
marks[j] = 1;
}
if (n >= 2)
primes.push_back(2);
// we can skip all the even numbers except 2
for (int i = 3; i <= n; i += 2)
if (marks[i] == 0)
primes.push_back(i);
}
Segmented Sieve
In the previous section, we saw how we can generate prime numbers in the range \([1, n]\). But the question is can we do the same in a range \([l, r]\)? For example, \(l = 100\) and \(r = 200\) will give us all the prime numbers between \(100\) to \(200\). Segmented sieve will help us to solve this problem. It is nothing but a modification of sieve algorithm.
The idea is quite similar. We know a composite number n has a prime divisor \(d \) such that \(d \leq \sqrt{n}\). We can compute all the primes in the range \([1, \sqrt{r}]\). For prime \(p_i\), mark all the multiples of it \((kp_i)\) as composite such that \( l \leq k p_i \leq r.\) For each prime in the range \([1, \sqrt{r}]\), we repeat the same process of marking composite numbers. After the end of process, we can traverse the boolean array to find out which ones are prime.
vector<int> res; // for primes in range [l, r]
vector<int> primes; // for primes in range [1, sqrt(r)]
void segmented_sieve(int l, int r) {
// compute all the primes in range [1, sqrt(r)]
sieve(sqrt(r));
// there are total (r - l + 1) numbers in the range
// marks[i] denotes whether (l + i) is prime or not
vector<bool> marks(r - l + 1, 0);
// mark[1] as composite
if (l == 1)
marks[0] = 1;
for (int i = 0; i < primes.size(); ++i) {
// we start from the multiple of primes[i]
// such that k * primes[i] >= l
int j = (l / primes[i]) * primes[i];
if (j < l)
j += primes[i];
// if j is the current prime
// we can't mark it as composite
// so we move to the next multiple
if (j == primes[i])
j += primes[i];
while (j <= r) {
// since we'll have no primes less than l
// we can say that marks[j - l] denotes
// whether j is a prime or not
marks[j - l] = 1;
j += primes[i];
}
}
for (int i = 0; i <= r - l; ++i) {
// marks[i] denotes whether (l + i) is prime or not
if (!marks[i])
res.push_back(l + i);
}
}
Related problems
SPOJ – TDPRIMES
SPOJ – TDKPRIME
SPOJ – PRIME1
Codeforces – Noldbach problem
LightOJ – Goldbach’s Conjecture
Codeforces – Colliders