Counting Primes in a Jupyter Notebook

In a recent CTF, the following question was posed: how many primes are there between 10 million and 11 million? We consider various ways of answering this.

Python has a builtin prime test function. It's located in the sympy package. But there are other functions there that might be useful!

In [11]:
import time
import sympy
count = 0
start = time.process_time()
for i in range(10000001,11000000,2):
    if sympy.isprime(i):
        count += 1
print("number of primes is %d" % count)
print("Execution time: %f" % (time.process_time() - start))
number of primes is 61938
Execution time: 1.250000

This is a brute force approach, but at least we didn't bother to test the even numbers for primeality. The runtime isn't too bad. Can we do it faster?

In [20]:
import time
import sympy
count = 0
start = time.process_time()
x = [i for i in sympy.primerange(10000001,11000000)]
print("number of primes is %d" % len(x))
print("Execution time: %f" % (time.process_time() - start))
number of primes is 61938
Execution time: 1.031250

Improvement, but small. Any other useful functions?

In [16]:
import time
import sympy
start = time.process_time()
count = sympy.primepi(11000000)-sympy.primepi(10000000)
print("number of primes is %d" % count)
print("Execution time: %f" % (time.process_time() - start))
number of primes is 61938
Execution time: 0.046875

Probably the best we can do in Python. How well does this scale? Try billions instead of millions...

In [21]:
import time
import sympy
start = time.process_time()
count = sympy.primepi(11000000000)-sympy.primepi(10000000000)
print("number of primes is %d" % count)
print("Execution time: %f" % (time.process_time() - start))
number of primes is 43336106
Execution time: 6.781250

Not bad! Moral of the story? There are several:

  1. Brute force is sometimes good enough.
  2. Avoid for loops in Python if you can.
  3. Use builtin functions when you can.
  4. Use Wolfram Alpha when you can.
In [ ]: