4.10 Functions
Most
statements in a typical Python program are organized into functions.
A function is a group of statements that
executes upon request. Python provides many built-in functions and
allows programmers to define their own functions. A request to
execute a function is known as a function
call. When a function is called, it may be
passed arguments that specify data upon which the function performs
its computation. In Python, a function always returns a result value,
either None or a value that represents the results
of its computation. Functions defined within class
statements are also called methods. Issues
specific to methods are covered in Chapter 5; the
general coverage of functions in this section, however, also applies
to methods.
In Python, functions are objects (values) and are handled like other
objects. Thus, you can pass a function as an argument in a call to
another function. Similarly, a function can return another function
as the result of a call. A function, just like any other object, can
be bound to a variable, an item in a container, or an attribute of an
object. Functions can also be keys into a dictionary. For example, if
you need to quickly find a function's inverse given
the function, you could define a dictionary whose keys and values are
functions and then make the dictionary bidirectional (using some
functions from module math, covered in Chapter 15):
from math import sin, asin, cos, acos, tan, atan, log, exp
inverse = {sin:asin, cos:acos, tan:atan, log:exp}
for f in inverse.keys( ): inverse[inverse[f]] = f
The fact that functions are objects in Python is often expressed by
saying that functions are first-class objects.
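As a small illustration of handling functions like any other object, the following sketch passes a function as an argument, returns one from a call, and stores one in a dictionary. The names apply_twice, make_multiplier, and operations are hypothetical, chosen only for this example, and print is written with a single parenthesized argument so that the code should also run on later Python versions.

```python
def apply_twice(func, value):
    # func is an ordinary argument that happens to be a function
    return func(func(value))

def make_multiplier(factor):
    # Build and return a new function object on each call
    def multiply(x, factor=factor):
        return x * factor
    return multiply

triple = make_multiplier(3)            # bind the returned function to a name
operations = {'triple': triple}        # store a function as a dictionary value

print(apply_twice(triple, 5))          # prints: 45
print(operations['triple'](7))         # prints: 21
```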
4.10.1 The def Statement
The
def statement is the most common way to define a
function. def is a single-clause compound
statement with the following syntax:
def function-name(parameters):
    statement(s)
function-name is an identifier. It is a
variable that gets bound (or rebound) to the function object when
def executes.
parameters is an optional list of identifiers,
called formal parameters or
just parameters, that are used to represent values that are supplied
as arguments when the function is called. In the simplest case, a
function doesn't have any formal parameters, which
means the function doesn't take any arguments when
it is called. In this case, the function definition has empty
parentheses following function-name.
When a function does take arguments,
parameters contains one or more
identifiers, separated by commas (,). In this
case, each call to the function supplies values, known as
arguments, that correspond to the parameters
specified in the function definition. The parameters are local
variables of the function, as we'll discuss later in
this section, and each call to the function binds these local
variables to the corresponding values that the caller supplies as
arguments.
The non-empty sequence of statements, known as the
function body, does not
execute when the def statement executes. Rather,
the function body executes later, each time the function is called.
The function body can contain zero or more occurrences of the
return statement, as we'll
discuss shortly.
Here's an example of a simple function that returns
a value that is double the value passed to it:
def double(x):
    return x*2
4.10.2 Parameters
Formal parameters
that are simple identifiers indicate mandatory
parameters. Each call to the function must
supply a corresponding value (argument) for each mandatory
parameter.
In the comma-separated list of parameters, zero or more mandatory
parameters may be followed by zero or more
optional parameters, where
each optional parameter has the syntax:
identifier=expression
The def statement evaluates the
expression and saves a reference to the
value returned by the expression, called the
default value for the
parameter, among the attributes of the function object. When a
function call does not supply an argument corresponding to an
optional parameter, the call binds the parameter's
identifier to its default value for that execution of the function.
Note that the same object, the default value, gets bound to the
optional parameter whenever the caller does not supply a
corresponding argument. This can be tricky when the default value is
a mutable object and the function body alters the parameter. For
example:
def f(x, y=[ ]):
    y.append(x)
    return y
print f(23) # prints: [23]
print f(42) # prints: [23, 42]
The second print statement prints
[23,42] because the first call to
f altered the default value of
y, originally an empty list [
], by appending 23 to it. If you want
y to be bound to a new empty list object each time
f is called with a single argument, use the
following:
def f(x, y=None):
    if y is None: y = [ ]
    y.append(x)
    return y
print f(23) # prints: [23]
print f(42) # prints: [42]
At the end of the formal parameters, you may optionally use either or
both of the special forms
*identifier1 and
**identifier2. If both
are present, the one with two asterisks must be last.
*identifier1 indicates
that any call to the function may supply extra positional arguments,
while **identifier2
specifies that any call to the function may supply extra named
arguments (positional and named arguments are covered later in this
chapter). Every call to the function binds
identifier1 to a tuple whose items are the
extra positional arguments (or the empty tuple, if there are none).
identifier2 is bound to a dictionary whose
items are the names and values of the extra named arguments (or the
empty dictionary, if there are none). Here's how to
write a function that accepts any number of arguments and returns
their sum:
def sum(*numbers):
    result = 0
    for number in numbers: result += number
    return result
print sum(23,42) # prints: 65
The ** form also lets you construct a dictionary
with string keys in a more readable fashion than with the standard
dictionary creation syntax:
def adict(**kwds): return kwds
print adict(a=23, b=42) # prints: {'a':23, 'b':42}
Note that the body of function adict is just one
simple statement, and therefore we can exercise the option to put it
on the same line as the def statement. Of course,
it would be just as correct (and arguably more readable) to code
function adict using two lines instead of one:
def adict(**kwds):
    return kwds
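To see how the four kinds of formal parameters fit together, here is a minimal sketch of a function that declares one of each and simply returns what it received. The function describe and its parameter names are made up for this illustration; print is written with a single parenthesized argument so that the code should also run on later Python versions.

```python
def describe(required, optional='default', *extra, **named):
    # extra is always a tuple; named is always a dictionary
    return (required, optional, extra, named)

print(describe(1))                     # prints: (1, 'default', (), {})
print(describe(1, 2, 3, 4))            # prints: (1, 2, (3, 4), {})
print(describe(1, key='value'))        # prints: (1, 'default', (), {'key': 'value'})
```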
4.10.3 Attributes of Function Objects
The
def statement defines some attributes of a
function object. The attribute func_name, also
accessible as __name__, is a read-only attribute
(trying to rebind or unbind it raises a runtime exception) that
refers to the identifier used as the function name in the
def statement. The attribute
func_defaults, which you may rebind or unbind,
refers to the tuple of default values for the optional parameters (or
the empty tuple, if the function has no optional
parameters).
Another function
attribute is the documentation
string, also known as a
docstring. You may use or rebind a
function's docstring attribute as either
func_doc or __doc__. If the
first statement in the function body is a string literal, the
compiler binds that string as the function's
docstring attribute. A similar rule applies to classes (see Chapter 5) and modules (see Chapter 7). Docstrings most often span multiple physical
lines, and are therefore normally specified in triple-quoted string
literal form. For example:
def sum(*numbers):
    '''Accept arbitrary numerical arguments and return their sum.
    The arguments are zero or more numbers. The result is their sum.'''
    result = 0
    for number in numbers: result += number
    return result
Documentation strings should be part of any Python code you write.
They play a role similar to that of comments in any programming
language, but their applicability is wider since they are available
at runtime. Development environments and other tools may use
docstrings from function, class, and module objects to remind the
programmer how to use those objects. The doctest
module (covered in Chapter 17) makes it easy to
check that the sample code in docstrings is accurate and correct.
To make your docstrings as useful as possible, you should respect a
few simple conventions. The first line of a docstring should be a
concise summary of the function's purpose, starting
with an uppercase letter and ending with a period. It should not
mention the function's name, unless the name happens
to be a natural-language word that comes naturally as part of a good,
concise summary of the function's operation. If the
docstring is multiline, the second line should be empty, and the
following lines should form one or more paragraphs, separated by
empty lines, describing the function's expected
arguments, preconditions, return value, and side effects (if any).
Further explanations, bibliographical references, and usage examples
(to be checked with doctest) can optionally follow
toward the end of the docstring.
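The conventions just described might look as follows in practice. This is only a sketch: the function mean is a hypothetical example, not one of Python's built-ins, and the code reads the docstring back through the __doc__ attribute to show that it is available at runtime.

```python
def mean(*numbers):
    """Return the arithmetic mean of the arguments.

    The arguments are one or more numbers. The result is their sum
    divided by their count, as a floating-point value. Calling the
    function with no arguments raises ZeroDivisionError.
    """
    total = 0.0
    for number in numbers:
        total += number
    return total / len(numbers)

# The first line of the docstring is the concise summary.
summary = mean.__doc__.split('\n')[0]
print(summary)                         # prints: Return the arithmetic mean of the arguments.
print(mean(2, 4, 9))                   # prints: 5.0
```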
In addition to its predefined attributes, a function object may be
given arbitrary attributes. To create an attribute of a function
object, bind a value to the appropriate attribute references in an
assignment statement after the def statement has
executed. For example, a function could count how many times it is
called:
def counter( ):
    counter.count += 1
    return counter.count
counter.count = 0
Note that this is not common usage. More often, when you want to
group together some state (data) and some behavior (code), you should
use the object-oriented mechanisms covered in Chapter 5. However, the ability to associate arbitrary
attributes with a function can sometimes come in handy.
4.10.4 The return Statement
The return
statement in Python is allowed only inside a function body, and it
can optionally be followed by an expression. When
return executes, the function terminates and the
value of the expression is returned. A function returns
None if it terminates by reaching the end of its
body or by executing a return statement that has
no expression.
As a matter of style, you should not write a
return statement without an expression at the end
of a function body. If some return statements in a
function have an expression, all return statements
should have an expression. return
None should only be written explicitly to meet
this style requirement. Python does not enforce these stylistic
conventions, but your code will be clearer and more readable if you
follow them.
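A short sketch of this style follows. The function find_index is hypothetical and exists only to show a body in which every return statement has an expression, including the explicit return None for the case where nothing is found.

```python
def find_index(items, target):
    # Every return statement carries an expression, including the
    # explicit return None for the "not found" case.
    position = 0
    for item in items:
        if item == target:
            return position
        position += 1
    return None

print(find_index(['a', 'b', 'c'], 'b'))   # prints: 1
print(find_index(['a', 'b', 'c'], 'z'))   # prints: None
```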
4.10.5 Calling Functions
A function call is an expression
with the following syntax:
function-object(arguments)
function-object may be any reference to a
function object; it is most often the function's
name. The parentheses denote the function-call operation itself.
arguments, in the simplest case, is a
series of zero or more expressions separated by commas
(,), giving values for the
function's corresponding formal parameters. When a
function is called, the parameters are bound to these values, the
function body executes, and the value of the function-call expression
is whatever the function returns.
4.10.5.1 The semantics of argument passing
In
traditional terms, all argument passing in Python is by
value. For example, if a variable is passed as an
argument, Python passes to the function the object (value) to which
the variable currently refers, not the variable itself. Thus, a
function cannot rebind the caller's variables.
However, if a mutable object is passed as an argument, the function
may request changes to that object since Python passes the object
itself, not a copy. Rebinding a variable and mutating an object are
totally different concepts in Python. For example:
def f(x, y):
    x = 23
    y.append(42)
a = 77
b = [99]
f(a, b)
print a, b # prints: 77 [99, 42]
The print statement shows that
a is still bound to 77.
Function f's rebinding of its
parameter x to 23 has no effect
on f's caller, and in particular
on the binding of the caller's variable, which
happened to be used to pass 77 as the
parameter's value. However, the
print statement also shows that
b is now bound to [99,42].
b is still bound to the same list object as before
the call, but that object has mutated, as f has
appended 42 to that list object. In either case,
f has not altered the caller's
bindings, nor can f alter the number
77, as numbers are immutable. However,
f can alter a list object, as list objects are
mutable. In this example, f does mutate the list
object that the caller passes to f as the second
argument by calling the object's
append method.
4.10.5.2 Kinds of arguments
Arguments that are just expressions are called
positional arguments. Each
positional argument supplies the value for the formal parameter that
corresponds to it by position (order) in the function
definition.
In a function call, zero or more positional arguments may be followed
by zero or more named
arguments with the following syntax:
identifier=expression
The identifier must be one of the formal
parameter names used in the def statement for the
function. The expression supplies the
value for the formal parameter of that name.
A function call must supply, via either a positional or a named
argument, exactly one value for each mandatory parameter, and zero or
one value for each optional parameter. For example:
def divide(divisor, dividend): return dividend // divisor
print divide(12,94) # prints: 7
print divide(dividend=94, divisor=12) # prints: 7
As you can see, the two calls to divide are
equivalent. You can pass named arguments for readability purposes
when you think that identifying the role of each argument and
controlling the order of arguments enhances your
code's clarity.
A more common use of named arguments is to bind some optional
parameters to specific values, while letting other optional
parameters take their default values:
def f(middle, begin='init', end='finis'): return begin+middle+end
print f('tini', end='') # prints: inittini
Thanks to named argument end='', the caller can
specify a value, the empty string '', for
f's third parameter,
end, and still let
f's second parameter,
begin, use its default value, the string
'init'.
At the end of the arguments in a function call, you may optionally
use either or both of the special forms
*seq and
**dict. If both are
present, the one with two asterisks must be last.
*seq passes the items
of seq to the function as positional
arguments (after the normal positional arguments, if any, that the
call gives with the usual simple syntax).
seq may be any sequence or iterable.
**dict passes the items
of dict to the function as named
arguments, where dict must be a dictionary
whose keys are all strings. Each item's key is a
parameter name, and the item's value is the
argument's value.
Sometimes you want to pass an argument of the form
*seq or
**dict when the formal
parameters use similar forms, as described earlier under Section 4.10.2. For example, using the
function sum defined in that section (and shown
again here), you may want to print the sum of all the values in
dictionary d. This is easy with
*seq:
def sum(*numbers):
    result = 0
    for number in numbers: result += number
    return result
print sum(*d.values( ))
However, you may also pass arguments of the form
*seq or
**dict when calling a
function that does not use similar forms in its formal
parameters.
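For instance, the divide function shown earlier declares only two ordinary parameters, yet it can still be called with the special forms. The sketch below assumes nothing beyond that function; the names args and kwds are arbitrary, and print is written with a single parenthesized argument so that the code should also run on later Python versions.

```python
def divide(divisor, dividend):
    return dividend // divisor

args = (12, 94)
kwds = {'dividend': 94, 'divisor': 12}

print(divide(*args))                   # prints: 7
print(divide(**kwds))                  # prints: 7
print(divide(12, **{'dividend': 94}))  # prints: 7
```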
4.10.6 Namespaces
A
function's formal parameters, plus any variables
that are bound (by assignment or by other binding statements) in the
function body, comprise the function's
local namespace, also known
as local scope. Each of
these variables is called a local
variable of the
function.
Variables
that are not local are known as global
variables (in the absence of nested definitions,
which we'll discuss shortly). Global variables are
attributes of the module object, as covered in Chapter 7. If a local variable in a function has the
same name as a global variable, whenever that name is mentioned in
the function body, the local variable, not the global variable, is
used. This idea is expressed by saying that the local variable hides
the global variable of the same name throughout the function body.
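The following sketch illustrates hiding, using hypothetical names (limit, show_local, show_global). One function binds a local variable with the same name as a global variable, while the other only reads the name and therefore sees the global.

```python
limit = 10                             # global variable

def show_local():
    limit = 99                         # binds a local variable; the global is untouched
    return limit

def show_global():
    return limit                       # no local binding, so the global is used

print(show_local())                    # prints: 99
print(show_global())                   # prints: 10
print(limit)                           # prints: 10
```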
4.10.6.1 The global statement
By default, any variable that is bound
within a function body is a local variable of the function. If a
function needs to rebind some global variables, its body must declare
them, before any other use of their names and conventionally as the
first statement, with:
global identifiers
where identifiers is one or more
identifiers separated by commas (,). The
identifiers listed in a global statement refer to
the global variables (i.e., attributes of the module object) that the
function needs to rebind. For example, the function
counter that we saw in
Section 4.10.3
could be implemented using global and a global
variable rather than an attribute of the function object as follows:
_count = 0
def counter( ):
    global _count
    _count += 1
    return _count
Without the global statement, the
counter function would raise an
UnboundLocalError exception because
_count would be an uninitialized (unbound) local
variable. Note also that while the global
statement does enable this kind of programming, it is neither elegant
nor advisable. As I mentioned earlier, when you want to group
together some state and some behavior, the object-oriented mechanisms
covered in Chapter 5 are typically the best
approach.
You don't need global if the
function body simply uses a global variable, including changing the
object bound to that variable if the object is mutable. You need to
use a global statement only if the function body
rebinds a global variable. As a matter of style, you should not use
global unless it's strictly
necessary, as its presence will cause readers of your program to
assume the statement is there for some useful purpose.
4.10.6.2 Nested functions and nested scopes
A def statement within
a function body defines a nested
function, and the function whose body includes
the def is known as an outer
function to the nested one. Code in a nested
function's body may access (but not rebind) local
variables of an outer function, also known as
free variables of the
nested function. This nested-scope access is automatic in Python 2.2
and later. To request nested-scope access in Python 2.1, the first
statement of the module must be:
from __future__ import nested_scopes
The simplest way to let a nested function access a value is often not
to rely on nested scopes, but rather to explicitly pass that value as
one of the function's arguments. The
argument's value can be bound when the nested
function is defined by using the value as the default for an optional
argument. For example:
def percent1(a, b, c): # works with any version
    def pc(x, total=a+b+c): return (x*100.0) / total
    print "Percentages are", pc(a), pc(b), pc(c)
Here's the same functionality using nested scopes:
def percent2(a, b, c): # needs 2.2 or "from __future__ import"
    def pc(x): return (x*100.0) / (a+b+c)
    print "Percentages are", pc(a), pc(b), pc(c)
In this specific case, percent1 has a slight
advantage: the computation of
a+b+c
happens only once, while
percent2's inner function
pc repeats the computation three times. However,
if the outer function were rebinding its local variables between
calls to the nested function, repeating this computation might be an
advantage. It's therefore advisable to be aware of
both approaches, and choose the most appropriate one case by case.
A nested function that accesses values from outer local variables is
known as a closure. The following example shows
how to build a closure without nested scopes (using a default value):
def make_adder_1(augend): # works with any version
    def add(addend, _augend=augend): return addend+_augend
    return add
Here's the same closure functionality using nested
scopes:
def make_adder_2(augend): # needs 2.2 or "from __future__ import"
    def add(addend): return addend+augend
    return add
Closures are an exception to the general rule that the
object-oriented mechanisms covered in Chapter 5
are the best way to bundle together data and code. When you need to
construct callable objects, with some parameters fixed at object
construction time, closures can be simpler and more effective than
classes. For example, the result of
make_adder_1(7) is a function that accepts a
single argument and adds 7 to that argument (the
result of make_adder_2(7) behaves in just the same
way). You can also express the same idea as lambda x: x+7, using the
lambda form covered in the next section. A closure
is a "factory" for any member of a
family of functions distinguished by some parameters, such as the
value of argument augend in the previous
examples, and this may often help you avoid code
duplication.
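As a sketch of the factory idea, the hypothetical function make_power below builds a family of functions that differ only in the exponent fixed at construction time. It relies on nested scopes, so it assumes Python 2.2 or later.

```python
def make_power(exponent):
    def power(base):
        return base ** exponent        # exponent is a free variable of power
    return power

square = make_power(2)
cube = make_power(3)

print(square(5))                       # prints: 25
print(cube(2))                         # prints: 8
```

Each call to make_power returns a distinct function object with its own value of exponent, so square and cube do not interfere with each other.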
4.10.7 lambda Expressions
If a function body contains a single
return expression
statement, you may choose to replace the function with the special
lambda expression form:
lambda parameters: expression
A lambda expression is the anonymous equivalent of
a normal function whose body is a single return
statement. Note that the lambda syntax does not
use the return keyword. You can use a
lambda expression wherever you would use a
reference to a function. lambda can sometimes be
handy when you want to use a simple function as an argument or return
value. Here's an example that uses a
lambda expression as an argument to the built-in
filter function:
aList = [1,2,3,4,5,6,7,8,9]
low = 3
high = 7
filter(lambda x,l=low,h=high: h>x>l, aList) # returns: [4, 5, 6]
As an alternative, you can always use a local def
statement that gives the function object a name. You can then use
this name as the argument or return value. Here's
the same filter example using a local
def statement:
aList = [1,2,3,4,5,6,7,8,9]
low = 3
high = 7
def test(value, l=low, h=high):
    return h>value>l
filter(test, aList) # returns: [4, 5, 6]
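When nested scopes are available (Python 2.2 and later), a lambda inside a function can also refer to the enclosing function's variables directly, without the default-value idiom. The sketch below uses a hypothetical helper named between; the call to list is there only so that the result is a list even on versions where filter returns an iterator.

```python
def between(values, low, high):
    # The lambda reads low and high from the enclosing function's scope.
    # list(...) makes the result a list on versions where filter
    # returns an iterator instead.
    return list(filter(lambda x: low < x < high, values))

aList = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(between(aList, 3, 7))            # prints: [4, 5, 6]
```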
4.10.8 Generators
When the body
of a function contains one or more occurrences of the keyword
yield, the function is called a
generator. When a generator is called, the
function body does not execute. Instead, calling the generator
returns a special iterator object that wraps the function body, the
set of its local variables (including its parameters), and the
current point of execution, which is initially the start of the
function.
When the next method of this iterator object is
called, the function body executes up to the next
yield statement, which takes the form:
yield expression
When a yield statement executes, the function is
frozen with its execution state and local variables intact, and the
expression following yield is returned as the
result of the next method. On the next call to
next, execution of the function body resumes where
it left off, again up to the next yield statement.
If the function body ends or executes a return
statement, the iterator raises a StopIteration exception to
indicate that the iterator is finished. Note that
return statements in a generator cannot contain
expressions, as that is a syntax error.
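The following sketch steps through a generator by hand to show when the body runs. The generator countdown is hypothetical, and the code makes one assumption worth flagging: it advances the iterator with the built-in next function, which exists only in later Python versions (2.6 and up); in earlier versions you would call the iterator's own next method, as in it.next( ).

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

it = countdown(2)                      # the body has not started executing yet
first = next(it)                       # runs up to the first yield
second = next(it)                      # resumes after the yield
try:
    next(it)                           # the body ends, so the iterator is exhausted
    finished = False
except StopIteration:
    finished = True

print(first)                           # prints: 2
print(second)                          # prints: 1
print(finished)                        # prints: True
```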
yield is always a keyword in Python 2.3 and later.
In Python 2.2, to make yield a keyword in a source
file, use the following line as the first statement in the file:
from __future__ import generators
In Python 2.1 and earlier, you cannot define generators.
Generators are often handy ways to build iterators. Since the most
common way to use an iterator is to loop on it with a
for statement, you typically call a generator like
this:
for avariable in somegenerator(arguments):
For example, say that you want a sequence of numbers counting up from
1 to N and then down to
1 again. A generator helps:
def updown(N):
    for x in xrange(1,N): yield x
    for x in xrange(N,0,-1): yield x
for i in updown(3): print i # prints: 1 2 3 2 1
Here is a generator that works somewhat like the built-in
xrange function, but returns a sequence of
floating-point values instead of a sequence of integers:
def frange(start, stop, step=1.0):
    while start < stop:
        yield start
        start += step
frange is only somewhat like
xrange, because, for simplicity, it makes
arguments start and stop
mandatory, and silently assumes step is positive
(by default, like xrange,
frange makes step equal to
1).
Generators are more flexible than functions that return lists. A
generator may build an iterator that returns an infinite stream of
results that is usable only in loops that terminate by other means
(e.g., via a break statement). Further, the
generator-built iterator performs lazy
evaluation: the iterator computes each successive item
only when and if needed, just in time, while the equivalent function
does all computations in advance and may require large amounts of
memory to hold the results list. Therefore, in Python 2.2 and later,
if all you need is the ability to iterate on a computed sequence, it
is often best to compute the sequence in a generator, rather than in
a function that returns a list. If the caller needs a list that
contains all the items produced by a generator
G(arguments),
the caller can use the following code:
resulting_list = list(G(arguments))
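Here is a minimal sketch of an unbounded generator consumed by a loop that terminates with break. The generator naturals and the cutoff value 50 are made up for this example. Because evaluation is lazy, only the items the loop actually requests are ever computed.

```python
def naturals(start=1):
    # An unbounded generator: it never finishes on its own
    n = start
    while 1:
        yield n
        n += 1

squares = []
for n in naturals():
    if n * n > 50:
        break                          # the loop, not the generator, decides when to stop
    squares.append(n * n)

print(squares)                         # prints: [1, 4, 9, 16, 25, 36, 49]
```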
4.10.9 Recursion
Python supports recursion (i.e., a
Python function can call itself), but there is a limit to how deep
the recursion can be. By default, Python interrupts recursion and
raises a RuntimeError exception (covered
in Chapter 6) when it detects that the stack of
recursive calls has gone over a depth of 1,000. You can change the
recursion limit with function setrecursionlimit of
module sys, covered in Chapter 8.
However, changing this limit will still not give you unlimited
recursion; the absolute maximum limit depends on the platform,
particularly on the underlying operating system and C runtime
library, but it's typically a few thousand. When
recursive calls get too deep, your program will crash. Runaway
recursion after a call to setrecursionlimit that
exceeds the platform's capabilities is one of the
very few ways a Python program can crash—really crash, hard,
without the usual safety net of Python's exception
mechanisms. Therefore, be wary of trying to fix a program that is
getting RuntimeError exceptions for excessive recursion depth by
raising the recursion limit too high with
setrecursionlimit. Most often,
you'd be better advised to look for ways to remove
the recursion or, at least, to limit the depth of recursion that your
program needs.
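As a rough sketch of removing recursion, the two hypothetical functions below compute the same result, one recursively and one with a loop. The example assumes that the interpreter reports excessive depth as a RuntimeError (later Python versions raise a subclass of it) and that the recursion limit has been left at a modest value such as the default.

```python
import sys

def depth_recursive(n):
    # Each unit of n costs one nested call
    if n == 0:
        return 0
    return 1 + depth_recursive(n - 1)

def depth_iterative(n):
    # Same result with a loop, so the call stack stays flat
    result = 0
    while n > 0:
        result += 1
        n -= 1
    return result

limit = sys.getrecursionlimit()
too_deep = limit * 10                  # well beyond what recursion can handle

try:
    depth_recursive(too_deep)
    hit_limit = False
except RuntimeError:
    hit_limit = True

print(hit_limit)                       # prints: True
print(depth_iterative(too_deep) == too_deep)   # prints: True
```

The iterative version handles the same input without touching the recursion limit at all, which is usually a safer fix than calling setrecursionlimit.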