Bits of Py.

If the implementation is hard to explain, it's a bad idea

Mon 12 June 2017

Generators in Python

Posted by marodrig in blog   

Iterators, Generators, Yield, and Generator expressions

Iterators

Python iterator objects are what loops use to step through a collection, whether it is a list, a dictionary, or something else. The iterator protocol requires two methods to be defined:

__iter__: returns the iterator object itself. A for loop calls it once to obtain the iterator it will step through.

__next__ (spelled next in Python 2): returns the next value from the iterator. Once no items remain, it raises a StopIteration exception.

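class my_range:
    """Iterator that produces 0, 1, ..., n - 1, one value per call."""

    def __init__(self, n):
        self.i = 0  # next value to hand out
        self.n = n  # exclusive upper bound

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            # Nothing left: signal the end of iteration.
            raise StopIteration()

    next = __next__  # Python 2 spelling of the same method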
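
To make the two methods less abstract, here is a rough sketch of what a for loop does behind the scenes. The CountUpTo class and the manual_loop helper are names made up for this illustration, and the class assumes the Python 3 spelling __next__.

```python
class CountUpTo:
    """Hypothetical iterator producing 0, 1, ..., n - 1 (Python 3 spelling)."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 1
        return value


def manual_loop(iterable):
    """Roughly what `for item in iterable` does, collecting items in a list."""
    collected = []
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break                  # a for loop catches this and simply stops
        collected.append(item)
    return collected


print(manual_loop(CountUpTo(3)))  # the same items a for loop would see
```

The StopIteration exception never reaches the person writing the for loop. It is just the signal the loop machinery uses to know when to stop.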

One advantage of an iterator is that we don't need all the elements before we can start using them, and the elements never have to sit in memory all at once. This can save a lot of memory, and time too if we stop early, as long as we don't need the collection more than once.
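
A quick way to get a feel for the memory difference is to compare a list with a lazy equivalent. This is only a sketch: sys.getsizeof measures the container object alone, the exact numbers depend on the interpreter, and the variable names are made up for the example.

```python
import sys

n = 100000

as_list = [i for i in range(n)]       # every element exists right now
as_generator = (i for i in range(n))  # elements are produced on request

# getsizeof reports the container itself, not the integers it refers to,
# so the real gap is larger than these two numbers suggest.
list_size = sys.getsizeof(as_list)
generator_size = sys.getsizeof(as_generator)
print(list_size, generator_size)

# We can start consuming immediately, without building anything first.
first = next(as_generator)
```

The size of the lazy object stays the same no matter how large n gets, while the list grows with it.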

Generator Functions

A generator function uses the yield keyword. Calling it returns a generator, which is an iterator object we can use. For example:

def first_n(n):
    num = 0
    while num < n:
        yield num
        num += 1

This is a simple generator function that yields every integer from 0 up to, but not including, the input n. Let's play around with it:

>>> for i in first_n(10):
...     print(i)
...
0
1
2
3
4
5
6
7
8
9

We can use our generator function in a for loop!
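
The for loop hides the pausing and resuming that yield performs. A small sketch that drives the same first_n generator by hand with the built-in next() makes it visible. The variable names a, b, and exhausted are just for illustration.

```python
def first_n(n):
    num = 0
    while num < n:
        yield num   # pause here and hand num to the caller
        num += 1    # resume here on the next request


gen = first_n(2)    # none of the function body has run yet
a = next(gen)       # runs up to the first yield
b = next(gen)       # resumes, goes around the loop, yields again

try:
    next(gen)       # the loop condition fails and the function returns
    exhausted = False
except StopIteration:
    exhausted = True

print(a, b, exhausted)  # 0 1 True
```

Each call to next() runs the function only as far as the next yield, and the local variable num keeps its value in between.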

Generator expressions

Python allows us to create lists on the fly using a list comprehension:

>>> doubles = [2*n for n in range(10)]
>>> doubles
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

We have a similar tool for creating generators on the fly: the generator expression. We only need to use parentheses instead of square brackets, and we get a generator object:

>>> doubles = (2*n for n in range(10))
>>> doubles
<generator object <genexpr> at 0x102a3ef10>

Let's use our generator object in a loop, and see what happens:

>>> for item in doubles:
...     print(item)
...
0
2
4
6
8
10
12
14
16
18

Just as intended!
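
Generator expressions are most often handed straight to a function that consumes an iterable. A short sketch using the built-ins sum(), any(), and max(); the variable names are made up for the example.

```python
# As the only argument, a generator expression needs no extra parentheses.
total = sum(2 * n for n in range(10))

# any() stops pulling items as soon as it sees a true value.
has_big = any(2 * n > 15 for n in range(10))

# With a second argument, the expression needs its own parentheses.
largest = max((2 * n for n in range(10)), default=None)

print(total, has_big, largest)  # 90 True 18
```

In none of these calls is an intermediate list of the doubled values ever built.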

The range() and xrange() functions

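Just a quick comment about the range() and xrange() functions. In Python 2.x, the built-in function range() returns a list, while xrange() returns a lazy xrange object that produces its values on demand. In Python 3.x, xrange() is gone and range() takes over the lazy behavior: it returns a range object instead of a list. Strictly speaking, a range object is a lazy sequence rather than an iterator, so it can be looped over more than once.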
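
The difference between a lazy range object and a true iterator is easy to check. The sketch below assumes Python 3, and the variable names are made up for the example.

```python
numbers = range(3)

first_pass = list(numbers)
second_pass = list(numbers)   # still full: a range can be looped over again

length = len(numbers)         # a range knows its length
last = numbers[-1]            # and supports indexing

it = iter(numbers)            # this is the actual iterator
consumed = list(it)
leftover = list(it)           # the iterator, unlike the range, is now spent

try:
    next(numbers)             # a range itself has no __next__
    range_is_iterator = True
except TypeError:
    range_is_iterator = False

print(first_pass, second_pass, leftover, range_is_iterator)
```

So range() gives the memory benefit of laziness without the use-once restriction. The restriction only applies to the iterator obtained from it.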

Summary

A generator function lets you return an iterator that can be used in for loops. Iterators can be consumed only once, and the collection is never stored in memory.

Common uses:

  • Anything with lazy evaluation, like calculating large sets of results.
  • Replace callbacks for iteration.
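
As an illustration of the callback point, here is a sketch of the same traversal written both ways. The functions walk_with_callback and walk and the nested sample data are invented for this example, and yield from assumes Python 3.3 or later.

```python
def walk_with_callback(tree, visit):
    """Callback style: the traversal decides when visit() gets called."""
    for node in tree:
        if isinstance(node, list):
            walk_with_callback(node, visit)
        else:
            visit(node)


def walk(tree):
    """Generator style: the caller pulls leaves one at a time."""
    for node in tree:
        if isinstance(node, list):
            yield from walk(node)   # delegate to the nested generator
        else:
            yield node


nested = [1, [2, [3, 4]], 5]

seen = []
walk_with_callback(nested, seen.append)

# With the generator the caller stays in control and can stop early.
first_two = []
for leaf in walk(nested):
    first_two.append(leaf)
    if len(first_two) == 2:
        break

print(seen, first_two)
```

Stopping early with the callback version would need an extra flag or an exception, while the generator version just uses break.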

Benefits:

  • We get one item at a time, instead of the whole collection, be it a list or other type of collection.
  • No need to store all the items in memory, which keeps memory use low even for very large collections.

Best practices:

  • Be sure you don't intend to use the collection more than once! Iterators are single-use.
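
A short sketch of what goes wrong when a generator is reused, along with one possible workaround. The variable names are made up for the example.

```python
doubles = (2 * n for n in range(5))

first_total = sum(doubles)    # consumes every item
second_total = sum(doubles)   # nothing left, so this adds up nothing

# If the values are needed more than once, one option is to store them
# in a list, giving up the memory saving in exchange for reuse.
kept = list(2 * n for n in range(5))
totals = (sum(kept), sum(kept))

print(first_total, second_total, totals)  # 20 0 (20, 20)
```

Note that the second sum() does not raise an error. It quietly returns 0, which makes this kind of bug easy to miss.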