This article needs additional citations for verification. (July 2007) |
In computer science, a generator is a special routine that can be used to control the iteration behaviour of a loop. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.
Generators can be implemented in terms of more expressive control flow constructs, such as coroutines or first-class continuations.[1]
History
Generators first appeared in CLU (1975),[2] were a prominent feature in the string manipulation language Icon (1977) and are now available in Python,[3] C#,[4] JavaScript,[5] Ruby and other languages. In CLU and C#, generators are called iterators, and in Ruby, enumerators.
Uses
Generators are usually invoked inside loops.[6] The first time that a generator invocation is reached in a loop, an iterator object is created that encapsulates the state of the generator routine at its beginning, with arguments bound to the corresponding parameters. The generator's body is then executed in the context of that iterator until a special yield action is encountered; at that time, the value provided with the yield action is used as the value of the invocation expression. The next time the same generator invocation is reached in a subsequent iteration, the execution of the generator's body is resumed after the yield action, until yet another yield action is encountered. In addition to the yield action, execution of the generator body can also be terminated by a finish action, at which time the innermost loop enclosing the generator invocation is terminated.
Because generators compute their yielded values only on demand, they are useful for representing sequences that are expensive to compute, or even infinite.
In the presence of generators, loop constructs of a language can be reduced into a single loop ... end loop construct; all the usual loop constructs can then be comfortably simulated by using suitable generators in the right way.
Python
An example Python generator:
def countfrom(n):
while True:
yield n
n += 1
# Example use: printing out the integers from 10 to 20.
# Note that this iteration terminates normally, despite
# countfrom() being written as an infinite loop.
for i in countfrom(10):
if i <= 20:
print(i)
else:
break
# Another generator, which produces prime numbers indefinitely as needed.
def primes():
yield 2
n = 3
p = []
while True:
#This works in Python 2.5+
if not any( n % f == 0 for f in
itertools.takewhile(lambda f: f*f <= n, p) ):
yield n
p.append( n )
n += 2
In Python, a generator can be thought of as an iterator that contains a frozen stack frame. Whenever the iterator's next()
method is called, Python resumes the frozen frame, which executes normally until the next yield
statement is reached. The generator's frame is then frozen again, and the yielded value is returned to the caller.
Generator Expressions
Python has a syntax modeled on that of list comprehensions, called a generator expression that aids in the creation of generators. The following extends the example above by using a generator expression to compute squares from the countfrom generator function:
squares = ( n*n for n in countfrom(10) )
for j in squares:
if j <= 20:
print(j)
else:
break
Icon
Every expression (including loops) is a generator. Printing squares from 0 to 20 can be achieved by writing
local squares, j
squares := create (seq(0) ^ 2)
every j := |@squares do
if j <= 20 then
write(j)
else
break
C#
An example C# 2.0 generator (the yield
is available since C# version 2.0):
Both of these examples utilise Generics, but this is not required.
// Method that takes an iterable input (possibly an array)
// and returns all even numbers.
public static IEnumerable<int> GetEven(IEnumerable<int> numbers) {
foreach (int i in numbers) {
if ((i % 2) == 0) {
yield return i;
}
}
}
It is possible to use multiple yield return
statements and they are applied in sequence on each iteration:
public class CityCollection : IEnumerable<string> {
public IEnumerator<string> GetEnumerator() {
yield return "New York";
yield return "Paris";
yield return "London";
}
}
XL
In XL, iterators are the basis of 'for' loops:
import IO = XL.UI.CONSOLE
iterator IntegerIterator (var out Counter : integer; Low, High : integer) written Counter in Low..High is
Counter := Low
while Counter <= High loop
yield
Counter += 1
// Note that I needs not be declared, because declared 'var out' in the iterator
// An implicit declaration of I as an integer is therefore made here
for I in 1..5 loop
IO.WriteLn "I=", I
Racket
Racket provides several related facilities for generators. First, its for-loop forms work with sequences, which are a kind of a producer:
(for ([i (in-range 10 20)])
(printf "i = ~s\n" i))
and these sequences are also first-class values:
(define 10-to-20 (in-range 10 20))
(for ([i 10-to-20])
(printf "i = ~s\n" i))
Some sequences are implemented imperatively (with private state variables) and some are implemented as (possibly infinite) lazy lists. Also, new struct definitions can have a property that specifies how they can be used as sequences.
But more directly, Racket comes with a generator library for a more traditional generator specification. For example,
#lang racket
(require racket/generator)
(define (ints-from from)
(generator ()
(for ([i (in-naturals from)]) ; infinite sequence of integers from 0
(yield i))))
(define g (ints-from 10))
(list (g) (g) (g)) ; -> '(10 11 12)
Note that the Racket core implements powerful continuation features, providing general (re-entrant) continuations that are composable, and also delimited continuations. Using this, the generator library is implemented in Racket.
Tcl
In Tcl 8.6, the generator mechanism is founded on named coroutines.
proc generator {body} {
coroutine gen[incr ::disambiguator] apply {{script} {
# Produce the result of [generator], the name of the generator
yield [info coroutine]
# Do the generation
eval $script
# Finish the loop of the caller using a 'break' exception
return -code break
}} $body
}
# Use a simple 'for' loop to do the actual generation
set count [generator {
for {set i 10} {$i <= 20} {incr i} {
yield $i
}
}]
# Pull values from the generator until it is exhausted
while 1 {
puts [$count]
}
Ruby
Ruby supports generators (starting from version 1.9) in form of built-in Enumerator class.
# Generator from an Enumerable object
chars = Enumerator.new(['A', 'B', 'C', 'Z'])
4.times { puts chars.next }
# Generator from a block
count = Enumerator.new do |yielder|
i = 0
loop { yielder.yield i += 1 }
end
100.times { puts count.next }
Haskell
In Haskell, with its lazy evaluation model, everything is a generator - every datum created with non-strict data constructor is generated on demand. For example,
fibgen (a,b) = a : fibgen (b,a+b)
-- Main> take 10 (fibgen (0,1))
-- [0,1,1,2,3,5,8,13,21,34]
where (:)
is a non-strict list constructor, cons.
Other Implementations
Java
Java does not have continuation functionality as part of the language per se; however, one may easily obtain that by providing a custom implementation of some known collection or iterator interface. This will typically be the java.lang.Iterable interface, or a Java collections framework interface, but it might even be a custom collection interface.
The continuation is implemented through a finite-state machine, namely an object, that will compute the single elements on demand, instead of retrieving them from a real collection or storage. This will be hidden behind the collection interface, and therefore will be transparent to the application, which only "sees" the iterator or collection methods of that object, and does not know about its implementation internals, thanks to encapsulation.
This means that, although continuations are not available as part of the language, you can easily get them through an OOP solution, specifically through polymorphism.
It is strongly advised that this technique be used whenever it is applicable. For example, it is often applied when implementing a very simple implementation of a model for a list or table in a Swing application: the model is implemented so that is generates its contents on the fly, as soon as there is the need for any of them to be displayed on the graphical component, and this is completely transparent to the Swing library, because it just sees a collection of elements.
It is also important to notice that, as long as the generator object implements interface java.lang.Iterable, it may be used to control a for each statement.
Let's take an example. The following code defines a subclass of the the J2SE class AbstractList, to implement an unmodifiable list[7]:
// Implements a fixed-size list, that contains all numbers
// from 0 to some given number.
public class Generator extends AbstractList<Integer> {
private int _size;
// The collection will contain all numbers
// from 0 to size-1, inclusive.
public Generator(int size) {
if (size < 0) {
throw new IllegalArgumentException("Negative size not allowed");
}
_size = size;
}
@Override
public Integer get(int index) {
if (index < 0 || index >= size()) {
throw new IndexOutOfBoundsException();
}
return Integer.valueOf(index);
}
@Override
public int size() {
return _size;
}
}
All of the collections in the Java collections framework implement java.lang.Iterable, therefore such a list can be used in a for each statement, for example:
for (int value: new Generator(20)) {
System.out.println("generator: " + value);
}
There also exist some third-party libraries, which allow for implementing generators by subclassing a given abstract class.
Perl
Perl does not natively provide generators, but support is provided by the Coro::Generator module which uses the Coro co-routine framework. Example usage:
use strict;
use warnings;
# enable generator { BLOCK } and yield
use Coro::Generator;
# array reference to iterate over
my $chars = ['A'...'Z'];
# new generator which can be called like a
# coderef.
my $letters = generator {
my $i = 0;
for my $letter (@$chars) {
# get next letter from $chars
yield $letter;
}
};
# call the generator 15 times
print $letters->(), "\n" for (0..15);
Lisp
Common Lisp also does not natively provide generators, yet various library implementations exist, such as pygen.
See also
- List comprehension for another construct that generates a sequence of values
- Iterator for the concept of producing a list one element at a time
- Lazy evaluation for producing values when needed
- Corecursion for potentially infinite data by recursion instead of yield
- Coroutine for even more generalization from subroutine
- Continuation for generalization of control flow
Notes
- ^ Kiselyov, Oleg (2004). "General ways to traverse collections in Scheme".
{{cite web}}
: Unknown parameter|month=
ignored (help) - ^ Liskov, Barbara (1992). "A History of CLU" (pdf).
{{cite web}}
: Unknown parameter|month=
ignored (help) - ^ Python Enhancement Proposals: PEP 255: Simple Generators, PEP 289: Generator Expressions, PEP 342: Coroutines via Enhanced Generators
- ^ yield (C# Reference)
- ^ "New In JavaScript 1.7". Retrieved 2006-10-10.
- ^ The Icon Programming Language utilizes generators to implement its goal directed evaluation. In Icon, generators can be invoked in contexts outside of the normal looping control structures.
- ^ java.util.AbstractList
References
- Stephan Murer, Stephen Omohundro, David Stoutamire and Clemens Szyperski: Iteration abstraction in Sather. ACM Transactions on Programming Languages and Systems, 18(1):1-15 (1996) [1]