1 Part 1 Beginning Python
1.7 Functions, Modules, Packages, and Debugging
1.7.3 Iterators and generators
Concepts:
iterator
An iterator is something that satisfies the iterator protocol. Clue: If it's an iterator, you can use it in a for statement.
generator
A generator is a class or function that implements an iterator, i.e. that implements the iterator protocol.
the iterator protocol
An object satisfies the iterator protocol if it does the following:
○ It implements an __iter__ method, which returns an iterator object.
○ It implements a next method (spelled __next__ in Python 3), which returns the next item from the collection, sequence, stream, etc. of items to be iterated over.
○ It raises the StopIteration exception when the items are exhausted and the next() method is called.
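The three requirements above can be sketched as a small class. This is only an illustration: the class name Countdown and its start parameter are hypothetical, and the next = __next__ alias is an assumption made so that the same code runs under both Python 2 and Python 3.

```python
class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Return the next item, or signal exhaustion with StopIteration.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2 looks for a method named "next", so alias it.
    next = __next__


# Because Countdown satisfies the protocol, a for clause can drive it.
countdown_items = [item for item in Countdown(3)]
print(countdown_items)
```

The for clause calls __iter__ once, then calls the next method repeatedly until StopIteration is raised.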
yield
The yield statement enables us to write functions that are generators. Such
functions may be similar to coroutines, since they may "yield" multiple times and are resumed.
For more information on iterators, see the section on iterator types in the Python Library Reference: http://docs.python.org/2/library/stdtypes.html#iterator-types
For more on the yield statement, see:
http://docs.python.org/2/reference/simple_stmts.html#the-yield-statement
Actually, yield is an expression. For more on yield expressions and on the next() and send() generator methods, as well as others, see the section on yield expressions in the Python language reference manual:
http://docs.python.org/2/reference/expressions.html#yield-expressions
A function or method containing a yield statement implements a generator. Adding the yield statement to a function or method turns that function or method into one which, when called, returns a generator, i.e. an object that implements the iterator protocol.
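As a rough illustration of the paragraph above, the following sketch shows that calling a function containing yield does not run its body; it only returns a generator object, which is then driven one step at a time. The names two_items and calls are hypothetical, and the built-in next(gen) is used instead of gen.next() so that the sketch runs under both Python 2.6+ and Python 3.

```python
calls = []

def two_items():
    # Hypothetical generator function; records when its body actually runs.
    calls.append('started')
    yield 'first'
    calls.append('resumed')
    yield 'second'

gen = two_items()                    # Runs none of the body yet.
started_before_next = list(calls)

first = next(gen)                    # Runs the body up to the first yield.
started_after_next = list(calls)

is_own_iterator = iter(gen) is gen   # A generator is its own iterator.
rest = list(gen)                     # Resumes after the first yield.
```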
A generator (a function containing yield) provides a convenient way to implement a filter. But, also consider:
● The filter() builtin function
● List comprehensions with an if clause
Here are a few examples:
def simplegenerator():
    yield 'aaa'                     # Note 1
    yield 'bbb'
    yield 'ccc'

def list_tripler(somelist):
    for item in somelist:
        item *= 3
        yield item

def limit_iterator(somelist, max):
    for item in somelist:
        if item > max:
            return                  # Note 2
        yield item

def test():
    print '1.', '' * 30
    it = simplegenerator()
    for item in it:
        print item
    print '2.', '' * 30
    alist = range(5)
    it = list_tripler(alist)
    for item in it:
        print item
    print '3.', '' * 30
    alist = range(8)
    it = limit_iterator(alist, 4)
    for item in it:
        print item
    print '4.', '' * 30
    it = simplegenerator()
    try:
        print it.next()             # Note 3
        print it.next()
        print it.next()
        print it.next()
    except StopIteration, exp:      # Note 4
        print 'reached end of sequence'

if __name__ == '__main__':
    test()
Notes:
1. The yield statement returns a value to the caller and suspends the generator. When the next item is requested and the iterator is "resumed", execution continues immediately after the yield statement.
2. We can terminate the sequence generated by an iterator by using a return statement with no value.
3. To resume a generator, use the generator's next() or send() methods. send() is like next(), but provides a value to the yield expression. Calling next() ourselves is an alternative to a for statement for obtaining the items in a sequence. Since an iterator is a first-class object, we can save it in a data structure and can pass it around for use at different locations and times in our program.
4. When an iterator is exhausted or empty, it throws the StopIteration exception, which we can catch.
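The send() method mentioned in the notes can be sketched as follows. This is a minimal illustration, not part of the example above: the generator running_total is hypothetical, and the built-in next(gen) is used in place of gen.next() so that the code runs under both Python 2.6+ and Python 3.

```python
def running_total():
    # Hypothetical generator: each value passed to send() becomes the value
    # of the yield expression and is added to a running total.
    total = 0
    while True:
        received = yield total
        if received is not None:
            total += received

gen = running_total()
start = next(gen)          # Advance to the first yield, which yields 0.
after_five = gen.send(5)   # The yield expression evaluates to 5.
after_ten = gen.send(10)   # The yield expression evaluates to 10.
after_next = next(gen)     # next() resumes with None as the yield's value.
gen.close()
```

A generator must be advanced to its first yield (here with next()) before a non-None value can be sent to it.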
And here is the output from running the above example:
$ python test_iterator.py
1.
aaa
bbb
ccc
2.
0
3
6
9
12
3.
0
1
2
3
4
4.
aaa
bbb
ccc
reached end of sequence
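For comparison with the filter() builtin and the list comprehension with an if clause mentioned before the example, here is a sketch of the same filter written all three ways. The function at_most and the sample data are hypothetical. Note that, unlike limit_iterator above, this filter skips items over the limit rather than stopping at the first one. The list() call around filter() is there because filter() returns an iterator in Python 3 and a list in Python 2.

```python
def at_most(items, limit):
    # Hypothetical generator-based filter: passes along items <= limit.
    for item in items:
        if item <= limit:
            yield item

data = [7, 2, 9, 4, 1]

from_generator = list(at_most(data, 4))
from_filter = list(filter(lambda item: item <= 4, data))
from_comprehension = [item for item in data if item <= 4]
```

The generator version produces items one at a time on demand, which matters when the input is large or unbounded; the list comprehension builds the whole result at once.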
An instance of a class which implements the __iter__ method, returning an iterator, is iterable. For example, it can be used in a for statement, in a list comprehension, in a generator expression, or as an argument to the iter() built-in function. Notice, however, that the class most likely implements a generator method which can also be called directly.
Examples: The following code implements an iterator that produces all the objects in a tree of objects:
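Before the larger tree example, a minimal sketch of an iterable class may help. The class Playlist and its contents are hypothetical. Its __iter__ method is itself a generator, so each call returns a fresh iterator; the sketch uses the built-in next() so that it runs under both Python 2.6+ and Python 3.

```python
class Playlist(object):
    """Hypothetical iterable container; __iter__ is a generator method."""

    def __init__(self, titles):
        self.titles = list(titles)

    def __iter__(self):
        for title in self.titles:
            yield title

playlist = Playlist(['intro', 'theme', 'outro'])

# In a for statement:
from_for = []
for title in playlist:
    from_for.append(title)

# In a list comprehension:
from_comprehension = [title.upper() for title in playlist]

# In a generator expression:
longest = max(len(title) for title in playlist)

# As an argument to iter():
iterator = iter(playlist)
first = next(iterator)
```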
class Node:
    def __init__(self, data, children=None):
        self.initlevel = 0
        self.data = data
        if children is None:
            self.children = []
        else:
            self.children = children
    def set_initlevel(self, initlevel): self.initlevel = initlevel
    def get_initlevel(self): return self.initlevel
    def addchild(self, child):
        self.children.append(child)
    def get_data(self):
        return self.data
    def get_children(self):
        return self.children
    def show_tree(self, level):
        self.show_level(level)
        print 'data: %s' % (self.data, )
        for child in self.children:
            child.show_tree(level + 1)
    def show_level(self, level):
        print ' ' * level,
    #
    # Generator method #1
    # This generator turns instances of this class into iterable objects.
    #
    def walk_tree(self, level):
        yield (level, self, )
        for child in self.get_children():
            for level1, tree1 in child.walk_tree(level+1):
                yield level1, tree1
    def __iter__(self):
        return self.walk_tree(self.initlevel)

#
# Generator method #2
# This generator uses a support function (walk_list) which calls
# this function to recursively walk the tree.
# In effect, this iterates over the support function, which
# iterates over this function.
#
def walk_tree(tree, level):
    yield (level, tree)
    for child in walk_list(tree.get_children(), level+1):
        yield child

def walk_list(trees, level):
    for tree in trees:
        for tree in walk_tree(tree, level):
            yield tree

#
# Generator method #3
# This generator is like method #2, but calls itself (as an iterator),
# rather than calling a support function.
#
def walk_tree_recur(tree, level):
    yield (level, tree,)
    for child in tree.get_children():
        for level1, tree1 in walk_tree_recur(child, level+1):
            yield (level1, tree1, )

def show_level(level):
    print ' ' * level,

def test():
    a7 = Node('777')
    a6 = Node('666')
    a5 = Node('555')
    a4 = Node('444')
    a3 = Node('333', [a4, a5])
    a2 = Node('222', [a6, a7])
    a1 = Node('111', [a2, a3])
    initLevel = 2
    a1.show_tree(initLevel)
    print '=' * 40
    for level, item in walk_tree(a1, initLevel):
        show_level(level)
        print 'item:', item.get_data()
    print '=' * 40
    for level, item in walk_tree_recur(a1, initLevel):
        show_level(level)
        print 'item:', item.get_data()
    print '=' * 40
    a1.set_initlevel(initLevel)
    for level, item in a1:
        show_level(level)
        print 'item:', item.get_data()
    iter1 = iter(a1)
    print iter1
    print iter1.next()
    print iter1.next()
    print iter1.next()
    print iter1.next()
    print iter1.next()
    print iter1.next()
    print iter1.next()
##    print iter1.next()
    return a1

if __name__ == '__main__':
    test()
Notes:
● An instance of class Node is "iterable". It can be used directly in a for statement, a list comprehension, etc. So, for example, when an instance of Node is used in a for statement, it produces an iterator.
● We could also call the Node.walk_tree method directly to obtain an iterator.
● Method Node.walk_tree and functions walk_tree and walk_tree_recur are generators. When called, they return an iterator. They do this because they each contain a yield statement.
● These methods/functions are recursive. They call themselves. Since they are generators, they must call themselves in a context that uses an iterator, for example in a for statement.
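The last note, that a recursive generator must call itself in a context that iterates, can be seen in a smaller sketch. The functions flatten and broken_flatten are hypothetical helpers operating on nested lists rather than on Node objects. The second one shows what happens when the recursive call is not iterated over.

```python
def flatten(nested):
    # Hypothetical helper: yields the leaves of arbitrarily nested lists.
    for element in nested:
        if isinstance(element, list):
            # Calling flatten() only creates a generator; the for loop
            # drives it and re-yields each item to our own caller.
            for leaf in flatten(element):
                yield leaf
        else:
            yield element

def broken_flatten(nested):
    # Same as above, but the recursive call's result is never iterated,
    # so nothing from the nested lists is ever yielded.
    for element in nested:
        if isinstance(element, list):
            broken_flatten(element)
        else:
            yield element

flat = list(flatten([1, [2, [3, 4]], 5]))
lost = list(broken_flatten([1, [2, [3, 4]], 5]))
```

In Python 3.3 and later, the inner for loop in flatten can be replaced with "yield from flatten(element)"; the explicit loop is kept here so the sketch also runs under Python 2.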