Python / March 9, 2018
Speeding Up a Python Script Using Parallelization (ProcessPoolExecutor)
A quick demo of using ProcessPoolExecutor to parallelize a brute-force primality check and cut runtime by over 70%.
Introduction
I want to keep this post really short, simple, and to the point. With that being said, here's the toy problem that we'll use to illustrate how we can leverage pleasingly parallel problem types with small I/O requirements to speed up Python programs. The speeds recorded in this article were obtained using an i7-6800K processor.
We are going to deploy the absolute worst method for determining the primality of a number (i.e. brute force), and then abuse all of our CPU cores. One could argue the only way to abuse a CPU worse than this is to mine cryptocurrency with it.
Required tech
- Python 3 (I use 3.6 for this example)
Code Review
Here's the bit of code that deals with the imports and prime numbers.
from concurrent.futures import ProcessPoolExecutor
import time
# a list of numbers we want to check for primality
numbers = [2, 7, 13, 28, 99991, 188877, 1616161, 4441939, 90870847,
92525533, 94939291, 98776551, 99999999, 100030001]
def primeCheck(number):
""" return True if the number is prime, else return False """
if number % 2 == 0 and number != 2:
return False
else:
for i in range(3, number):
if number % i == 0:
return False
return True We'll be using ProcessPoolExecutor to perform the parallel processing and time to compute the time. We've also been given a list of numbers that may or may not be prime, and we'll use the function primeCheck to brute force whether a number is prime or not. Let's check out the script that does the processing now.
if __name__ == "__main__":
# no speedup
print("Sequential processing...")
start = time.time()
results1 = list(map(primeCheck, numbers))
print(results1)
finish = time.time()
print("Finished in %.3f seconds." % (finish - start))
# speedup
print("Use all CPU cores...")
start = time.time()
pool = ProcessPoolExecutor(max_workers=None)
results2 = list(pool.map(primeCheck, numbers))
print(results2)
finish = time.time()
print("Finished in %.3f seconds." % (finish - start)) Visually the code in # no speedup is very similar to # speedup. The main difference is the usage of pool, and its application to map. Note that setting max_workers=None will utilize all possible cores. ProcessPoolExecutor does a lot of work behind the scenes to allow for speedup to occur. In fact, so much work goes on behind the scenes that it is very possible to experience no speedup (or worse a decrease of performance) when using parallelization if the overhead costs exceed the parallelization gains. The moral of the story is intense mathematical computations that don't depend on each other and have low I/O requirements have a good chance of being sped up.
Speedup Results
Here is the output from the script:
>> Sequential processing...
>> [True, True, True, False, True, False, True, True, True, True, True, False, False, True]
>> Finished in 22.667 seconds.
>> Use all CPU cores...
>> [True, True, True, False, True, False, True, True, True, True, True, False, False, True]
>> Finished in 6.319 seconds. Comparing numbers we see a 72.92% decrease in runtime (roughly 3.5 times faster) when using parallelization. My CPU has 6 cores and 12 threads, so clearly it didn't scale linearly. Why might that be? Here are a couple possibilities:
- There isn't enough work to benefit from the additional processing power.
- Due to the serialization performed by the multiprocessing module.
Conclusion
We can conclude by saying it is quite possible to speed up certain Python scripts by utilizing what multiprocessing has to offer. It requires very specific conditions to achieve, but those conditions appear quite often in programming applications so it is a tool worth having in our tool belts!
Further Reading
Project Code
Full project code
from concurrent.futures import ProcessPoolExecutor
import time
# a list of numbers we want to check for primality
numbers = [2, 7, 13, 28, 99991, 188877, 1616161, 4441939, 90870847,
92525533, 94939291, 98776551, 99999999, 100030001]
def primeCheck(number):
""" return True if the number is prime, else return False """
if number % 2 == 0 and number != 2:
return False
else:
for i in range(2, number):
if number % i == 0:
return False
return True
if __name__ == "__main__":
# no speedup
print("Sequential processing...")
start = time.time()
results1 = list(map(primeCheck, numbers))
print(results1)
finish = time.time()
print("Done in %.3f seconds." % (finish - start))
# speedup
print("Use all CPU cores...")
start = time.time()
pool = ProcessPoolExecutor(max_workers=None)
results2 = list(pool.map(primeCheck, numbers))
print(results2)
finish = time.time()
print("Done in %.3f seconds." % (finish - start))