In Python, a generator is a function that returns an iterator that produces a sequence of values when iterated over. Generators are useful when we want to produce a large sequence of values – so these values do not need to be stored in memory.
In this post I will demonstrate how you can use iterator chaining using Pythons generator syntax. As an example we use the enumeration of all prime numbers up to a given maximum number.
Generators & Iterators
Here’s an example of a generator function that produces a sequence of numbers:
def integers(n: int) -> int :
for i in range(2, n):
yield i
for i in integers(10):
print(i)
In the above example, the integers() generator function takes an integer n as an argument and produces a sequence of numbers from 2 up to n.
The yield keyword is used to produce a value from the generator and pause the generator function’s execution until the next value is requested.
The for loop iterates over the generator object produced by integers(10), and the print statement prints each value produced by the generator. Note that a generator function can only used once and you can imagine that the for loop is calling function each time until it returns.
It is even possible to define generators in generator expression syntax. The above code could be condensed to just three lines of code.
integers = (i for i in range(2, 10))
for i in integers:
print(i)
Here, we have created the generator object that will produce the integers of the numbers 2 through 10 when iterated over.
And then, to iterate over the generator and get the values, we have used the for
loop.
Generators and Iterators are the Python choice when we need to handle really huge sequences.
I definitely prefer the generator or generator expression syntax as the definition of iterators can be cumbersome. However the same generator can also be built with Pythons iterator mechanism – just using a few lines more of code.
class Integers:
def __init__(self, value, n):
self.value = value
self.n = n
def __iter__(self):
return self
def __next__(self):
if self.value < self.n:
raise StopIteration
self.value += 1
return self.value
for i in Integers(1,10):
print(i)
Iterator Chaining
Suppose we have a sequence of items that need to undergo some processing. In our case the items are numbers and we want to decide if they are prime or not. We could do this with for loops at one place in scope and probably a few lines of code. However this code is not reusable and this is how you get many functions with hundred of lines of code nobody can understand.
A way of avoiding this pain is to make the items/numbers go through a simple pipeline. In each step of the pipeline we do exactly one thing and transform or filter the input sequence into an output sequence. Another advantage of this pattern is that we do not have to store the whole input and output sequence in memory – as we handle item by item.
Building a prime number pipeline
A number n is prime if and only if it does not have a prime factor smaller than sqt(n). Knowing this we can construct the following pipeline. The first step in the pipeline is enumerating numbers staring from 2. The next step in the pipeline already knows about the prime numbers smaller than the number currently arriving. So it can check if any of these known prime numbers divide the currently arriving number. If any of the known prime number does the current number is not prime and it is prime otherwise.
def integers(maximum: int) -> int:
for i in range(2, maximum):
yield i
def is_coprime(i: int, numbers: [int]) -> bool:
for p in numbers:
if p < math.sqrt(i):
if i % p == 0:
return False
else:
return True
return True
def primes(sequence: [int]) -> int:
prims: [int] = []
for i in sequence:
if is_coprime(i, prims):
prims.append(i)
yield I
# Here we print out all primes up to 10'000
for p in primes(integers(10000)):
print(p)