5.1 Classic Classes and Instances
A classic class is a Python object with several characteristics:
- You can call a class object as if it were a function. The call
  creates another object, known as an instance of the class, that
  knows what class it belongs to.
- A class has arbitrarily named attributes that you can bind and
  reference.
- The values of class attributes can be data objects or function
  objects.
- Class attributes bound to functions are known as methods of the
  class.
- A method can have a special Python-defined name with two leading and
  two trailing underscores. Python invokes such special methods, if
  they are present, when various kinds of operations take place on
  class instances.
- A class can inherit from other classes, meaning it can delegate to
  other class objects the lookup of attributes that are not found in
  the class itself.
An instance of a class is a Python object with arbitrarily named
attributes that you can bind and reference. An instance object
implicitly delegates to its class the lookup of attributes not found
in the instance itself. The class, in turn, may delegate the lookup
to the classes from which it inherits, if any.
In Python, classes are objects (values), and are handled like other
objects. Thus, you can pass a class as an argument in a call to a
function. Similarly, a function can return a class as the result of a
call. A class, just like any other object, can be bound to a variable
(local or global), an item in a container, or an attribute of an
object. Classes can also be keys into a dictionary. The fact that
classes are objects in Python is often expressed by saying that
classes are first-class
objects.
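As a minimal sketch of what first-class status allows, the following passes a class as an argument, returns one from a function, and uses classes as dictionary keys. The names Dog, Cat, make, pick, and legs are hypothetical, chosen only for illustration, and the single-argument print calls are written to run on both Python 2 and Python 3.

```python
# Hypothetical classes used only for illustration.
class Dog:
    def speak(self):
        return "Woof"

class Cat:
    def speak(self):
        return "Meow"

def make(cls):
    # A class passed as an argument is called like any other callable.
    return cls()

def pick(name):
    # A function can return a class object as its result.
    registry = {"dog": Dog, "cat": Cat}
    return registry[name]

# Classes can also be keys into a dictionary.
legs = {Dog: 4, Cat: 4}

pet = make(pick("cat"))
print(pet.speak())          # prints: Meow
print(legs[Dog])            # prints: 4
```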
5.1.1 The class Statement
The class statement
is the most common way to create a class object.
class is a single-clause compound statement with
the following syntax:
class classname[(base-classes)]:
    statement(s)
classname is an identifier. It is a
variable that gets bound (or rebound) to the class object after the
class statement finishes executing.
base-classes is an optional
comma-delimited series of expressions whose values must be class
objects. These classes are known by different names in different
languages; you can think of them as the base
classes, superclasses, or
parents of the class being created. The class
being created is said to inherit from,
derive from, extend, or
subclass its base classes, depending on what
language you are familiar with. This class is also known as a
direct subclass or
descendant of its base
classes.
The subclass relationship between
classes is transitive. If C1 subclasses
C2, and C2
subclasses C3,
C1 subclasses
C3. Built-in function
issubclass(C1,
C2) accepts two
arguments that are class objects: it returns True
if C1 subclasses
C2, otherwise it returns
False. Any class is considered a subclass of
itself; therefore
issubclass(C,
C) returns
True for any class C.
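A short sketch of these rules, using three hypothetical classes (Top, Middle, and Bottom are illustrative names, not from the text):

```python
# Bottom subclasses Middle, and Middle subclasses Top.
class Top: pass
class Middle(Top): pass
class Bottom(Middle): pass

print(issubclass(Bottom, Middle))   # prints: True
print(issubclass(Bottom, Top))      # prints: True  (transitive)
print(issubclass(Top, Bottom))      # prints: False
print(issubclass(Bottom, Bottom))   # prints: True  (a class subclasses itself)
```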
The way in which the base classes of a class affect the functionality
of the class is covered later in this chapter.
The syntax of the class statement has a small,
tricky difference from that of the def statement
covered in Chapter 4. In a def
statement, parentheses are mandatory between the
function's name and the colon. To define a function
without formal parameters, use a statement such
as:
def name( ):
    statement(s)
In a
class statement, the parentheses are mandatory if
the class has one or more base classes, but they are forbidden if the
class has no base classes. Thus, to define a class without base
classes, use a statement such as:
class name:
    statement(s)
The non-empty sequence of statements that follows the
class statement is known as the class
body. A class body executes immediately, as part of the
class statement's execution.
Until the body finishes executing, the new class object does not yet
exist and the classname identifier is not
yet bound (or rebound). Section 5.4 later in this
chapter provides more details about what happens when a
class statement executes.
Finally, note that the class statement does not
create any instances of a class, but rather defines the set of
attributes that are shared by all instances when they are created.
5.1.2 The Class Body
The body of a class is where you normally
specify the attributes of the class; these attributes can be data
objects or function objects.
5.1.2.1 Attributes of class objects
You typically
specify an attribute of a class object by binding a value to an
identifier within the class body. For example:
class C1:
    x = 23
print C1.x                  # prints: 23
Class object C1 now has an attribute named
x, bound to the value 23, and
C1.x refers to that attribute.
You can also bind or unbind class
attributes outside the class body. For example:
class C2: pass
C2.x = 23
print C2.x # prints: 23
However, your program is more readable if you bind, and thus create,
class attributes with statements inside the class body. Any class
attributes are implicitly shared by all instances of the class when
those instances are created, as we'll discuss
shortly.
The class statement implicitly defines some class attributes.
Attribute __name__ is the classname identifier string used in the
class statement. Attribute __bases__ is the tuple of class objects
given as the base classes in the class statement (or the empty tuple,
if no base classes are given). For example, using the class C1 we
just created:
print C1.__name__, C1.__bases__    # prints: C1 ()
A class also has an attribute __dict__, which is the dictionary
object that the class uses to hold all of its other attributes. For
any class object C, any object x, and any identifier S (except
__name__, __bases__, and __dict__), C.S=x is equivalent to
C.__dict__['S']=x.
For example, again referring to the class C1 we
just created:
C1.y = 45
C1.__dict__['z'] = 67
print C1.x, C1.y, C1.z      # prints: 23 45 67
There is no difference between class attributes created in the class
body, outside of the body by assigning an attribute, or outside of
the body by explicitly binding an entry in
C.__dict__.
In statements that are directly in a class's body,
references to attributes of the class must use a simple name, not a
fully qualified name. For example:
class C3:
    x = 23
    y = x + 22              # must use just x, not C3.x
However, in statements that are in methods defined in a class body,
references to attributes of the class must use a fully qualified
name, not a simple name. For example:
class C4:
    x = 23
    def amethod(self):
        print C4.x          # must use C4.x, not just x
Note that attribute references (i.e., an expression like
C.S)
have richer semantics than attribute binding. These references are
covered in detail later in this chapter.
5.1.2.2 Function definitions in a class body
Most class bodies include
def statements, as functions (called methods in
this context) are important attributes for class objects. A
def statement in a class body obeys the rules
presented in Section 4.10. In addition, a method
defined in a class body always has a mandatory first parameter,
conventionally named self, that refers to the
instance on which you call the method. The self
parameter plays a special role in method calls, as covered later in
this chapter.
Here's an example of a class that includes a method
definition:
class C5:
    def hello(self):
        print "Hello"
A class can define a variety of special methods (methods with names
that have two leading and two trailing underscores) relating to
specific operations. We'll discuss special methods
in great detail later in this chapter.
5.1.2.3 Class-private variables
When a statement in a class body (or in a method in the body) uses an
identifier starting with two underscores (but not ending with
underscores), such as __ident, the Python compiler implicitly changes
the identifier into _classname__ident, where classname is the name of
the class. This lets a class use private names for attributes,
methods, global variables, and other purposes, without the risk of
accidentally duplicating names used elsewhere.
By convention, all identifiers starting with a single underscore are
also intended as private to the scope that binds them, whether that
scope is or isn't a class. The Python compiler does
not enforce privacy conventions, however: it's up to
Python programmers to respect them.
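The renaming can be observed directly by inspecting the dictionaries involved. In this sketch the class Account and its attributes are hypothetical, introduced only to show the effect:

```python
class Account:
    __rate = 0.05                     # stored as _Account__rate

    def __init__(self, balance):
        self.__balance = balance      # stored as _Account__balance

    def interest(self):
        # Inside the class body, the short names are rewritten the same way.
        return self.__balance * self.__rate

acct = Account(100)
print(sorted(acct.__dict__))          # prints: ['_Account__balance']
print('_Account__rate' in Account.__dict__)   # prints: True
print(acct._Account__balance)         # prints: 100
```

The last line shows that the mangled name remains reachable from outside the class, which is why this mechanism guards against accidental clashes rather than enforcing privacy.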
5.1.2.4 Class documentation strings
If the first
statement in the class body is a string literal, the compiler binds
that string as the documentation string attribute for the class. This
attribute is named __doc__ and is known as the
docstring of the class. See Section 4.10.3 for more information on docstrings.
5.1.3 Instances
When you want to create an instance of
a class, call the class object as if it were a function. Each call
returns a new instance object of that class:
anInstance = C5( )
You can call built-in function
isinstance(I,C)
with a class object as argument C. In this
case, isinstance returns True
if object I is an instance of class
C or any subclass of
C. Otherwise,
isinstance returns
False.
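For instance, with two hypothetical classes Shape and Circle (names chosen for illustration), a check against a base class also succeeds for instances of its subclasses:

```python
class Shape: pass
class Circle(Shape): pass

c = Circle()
print(isinstance(c, Circle))        # prints: True
print(isinstance(c, Shape))         # prints: True  (Circle subclasses Shape)
print(isinstance(Shape(), Circle))  # prints: False
```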
5.1.3.1 __init__
When a class has or inherits a method named __init__, calling the
class object implicitly executes __init__ on the new instance to
perform any instance-specific initialization that is needed.
Arguments passed in the call must correspond to the formal parameters
of __init__. For example, consider the following class:
class C6:
    def __init__(self, n):
        self.x = n
Here's how to create an instance of the
C6 class:
anotherInstance = C6(42)
As shown in the C6 class, the __init__ method typically contains
statements that bind instance attributes. An __init__ method must
either not return a value or return the value None; any other
return value raises a TypeError exception.
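The following sketch shows the rule about return values in action. The classes Broken and Fine are hypothetical, and the exception is caught without inspecting its message, since the exact wording varies between Python versions.

```python
class Broken:
    def __init__(self):
        return 42                 # anything other than None is an error

try:
    Broken()
except TypeError:
    print("TypeError raised")     # prints: TypeError raised

class Fine:
    def __init__(self, n):
        self.x = n
        return None               # allowed, though normally omitted

print(Fine(42).x)                 # prints: 42
```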
The main purpose of __init__ is to bind, and thus create, the
attributes of a newly created instance. You may also bind or unbind
instance attributes outside __init__, as you'll see shortly. However,
your code will be more readable if you initially bind all attributes
of a class instance with statements in the __init__ method.
When __init__ is absent, you must call the class without arguments,
and the newly generated instance has no instance-specific attributes.
See Section 5.3 later in this chapter for more details about
__init__.
5.1.3.2 Attributes of instance objects
Once you have created an instance,
you can access its attributes (data and methods) using the dot
(.) operator. For example:
anInstance.hello( ) # prints: Hello
print anotherInstance.x # prints: 42
Attribute references such as these have fairly rich semantics in
Python and are covered in detail later in this section.
You can give an instance object an arbitrary attribute by binding a
value to an attribute reference. For example:
class C7: pass
z = C7( )
z.x = 23
print z.x # prints: 23
Instance object z now has an attribute named
x, bound to the value 23, and
z.x refers to that attribute. Note that the
__setattr__ special method, if present, intercepts every attempt to
bind an attribute. __setattr__ is covered in Section 5.3 later in
this chapter.
Creating an instance implicitly defines two instance attributes. For
any instance z, z.__class__ is the class object to which z belongs,
and z.__dict__ is the dictionary that z uses to hold all of its other
attributes. For example, for the instance z we just created:
print z.__class__.__name__, z.__dict__    # prints: C7 {'x': 23}
You may rebind (but not unbind) either or both of these attributes,
but this is rarely necessary.
For any instance object z, any object x, and any identifier S (except
__class__ and __dict__), z.S=x is equivalent to z.__dict__['S']=x
(unless a __setattr__ special method intercepts the binding attempt).
For example, again referring to the instance z we just created:
z.y = 45
z.__dict__['z'] = 67
print z.x, z.y, z.z         # prints: 23 45 67
There is no difference between instance attributes created in
__init__, by assigning to attributes, or by explicitly binding an
entry in z.__dict__.
5.1.3.3 The factory-function idiom
It is common to want to create
instances of different classes depending upon some condition or to
want to avoid creating a new instance if an existing one is available
for reuse. You might consider implementing these needs by having
__init__ return a particular object, but that
isn't possible because Python raises an exception
when __init__ returns any value other than
None. The best way to implement flexible object
creation is by using an ordinary function, rather than by calling the
class object directly. A function used in this role is known as a
factory function.
Calling a factory function is a more flexible solution, as such a
function may return an existing reusable instance or create a new
instance by calling whatever class is appropriate. Say you have two
almost-interchangeable classes (SpecialCase and
NormalCase) and you want to flexibly generate
either one of them, depending on an argument. The following
appropriateCase factory function allows you to do
just that (the role of the self parameters is
covered in Section 5.1.5 later in
this chapter):
class SpecialCase:
    def amethod(self): print "special"
class NormalCase:
    def amethod(self): print "normal"
def appropriateCase(isnormal=1):
    if isnormal: return NormalCase( )
    else: return SpecialCase( )
aninstance = appropriateCase(isnormal=0)
aninstance.amethod( )       # prints "special", as desired
5.1.4 Attribute Reference Basics
An
attribute reference is an expression of the form
x.name,
where x is any expression and
name is an identifier called the
attribute name. Many kinds of Python objects
have attributes, but an attribute reference has special rich
semantics when x refers to a class or
instance. Remember that methods are attributes too, so everything I
say about attributes in general also applies to attributes that are
callable (i.e.,
methods).
Say that x is an instance of class
C, which inherits from base class
B. Both classes and the instance have several
attributes (data and methods) as follows:
class B:
    a = 23
    b = 45
    def f(self): print "method f in class B"
    def g(self): print "method g in class B"
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print "method g in class C"
    def h(self): print "method h in class C"
x = C( )
x.d = 77
x.e = 88
Some attribute names are special. For example, C.__name__ is the
string 'C', the class name. C.__bases__ is the tuple (B,), the tuple
of C's base classes. x.__class__ is the class C, the class to which x
belongs. When you refer to an attribute with one of these special
names, the attribute reference looks directly into a special
dedicated slot in the class or instance object and fetches the value
it finds there. Thus, you can never unbind these attributes.
Rebinding them is allowed, so you can change the name or base classes
of a class or the class of an instance on the fly, but this is an
advanced technique and rarely necessary.
Class C and instance x each have one other special attribute, a
dictionary named __dict__. All other attributes of a class or
instance, except for the few special ones, are held as items in the
__dict__ attribute of the class or instance.
Apart from special names, when you use the syntax x.name to refer to
an attribute of instance x, the lookup proceeds in two steps:
1. When 'name' is a key in x.__dict__, x.name fetches and returns the
   value at x.__dict__['name']
2. Otherwise, x.name delegates the lookup to x's class (i.e., it works
   just the same as x.__class__.name)
Similarly, lookup for an attribute reference C.name on a class object
C also proceeds in two steps:
1. When 'name' is a key in C.__dict__, C.name fetches and returns the
   value at C.__dict__['name']
2. Otherwise, C.name delegates the lookup to C's base classes, meaning
   it loops on C.__bases__ and tries the name lookup on each
When these two lookup procedures do not find an attribute, Python
raises an AttributeError exception. However, if x's class defines or
inherits special method __getattr__, Python calls
x.__getattr__('name') rather than raising the exception.
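The two lookup procedures can be sketched as a pair of plain functions. This is an illustrative model only, not how the interpreter is implemented: the helper names class_lookup and instance_lookup are hypothetical, the sketch ignores special names, method wrapping, and __getattr__, and it uses cut-down versions of classes B and C with data attributes only.

```python
def class_lookup(cls, name):
    # Step 1: look in the class's own __dict__.
    if name in cls.__dict__:
        return cls.__dict__[name]
    # Step 2: try each base class in order, exploring each one fully
    # before moving on to the next.
    for base in cls.__bases__:
        try:
            return class_lookup(base, name)
        except AttributeError:
            pass
    raise AttributeError(name)

def instance_lookup(obj, name):
    # Step 1: look in the instance's own __dict__.
    if name in obj.__dict__:
        return obj.__dict__[name]
    # Step 2: delegate to the instance's class.
    return class_lookup(obj.__class__, name)

class B:
    a = 23
    b = 45

class C(B):
    b = 67
    c = 89
    d = 123

x = C()
x.d = 77
x.e = 88

for name in ('e', 'd', 'c', 'b', 'a'):
    print(instance_lookup(x, name))   # prints 88, 77, 89, 67, 23 on separate lines
```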
Consider the following attribute references:
print x.e, x.d, x.c, x.b, x.a    # prints: 88 77 89 67 23
x.e and x.d succeed in step 1 of the first lookup process, since 'e'
and 'd' are both keys in x.__dict__. Therefore, the lookups go no
further, but rather return 88 and 77. The other three references must
proceed to step 2 of the first process and look in x.__class__ (i.e.,
C). x.c and x.b succeed in step 1 of the second lookup process, since
'c' and 'b' are both keys in C.__dict__. Therefore, the lookups go no
further, but rather return 89 and 67. x.a gets all the way to step 2
of the second process, looking in C.__bases__[0] (i.e., B). 'a' is a
key in B.__dict__, therefore x.a finally succeeds and returns 23.
Note that the attribute lookup steps happen only when you refer to an
attribute, not when you bind an attribute. When you bind or unbind an
attribute whose name is not special, only the __dict__ entry for the
attribute is affected. In other words, in the case of attribute
binding, there is no lookup procedure involved.
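A small sketch makes the asymmetry concrete. Binding an inherited name on an instance does not touch the class where the name was found; it only adds an entry to the instance's own dictionary. The classes here are cut-down, hypothetical versions of B and C.

```python
class B:
    a = 23

class C(B):
    pass

x = C()
print(x.a)            # prints: 23  (found by lookup in B)

x.a = 99              # binds x.__dict__['a']; no lookup, B.a is untouched
print(x.__dict__)     # prints: {'a': 99}
print(B.a)            # prints: 23

del x.a               # unbinds only the instance's entry
print(x.a)            # prints: 23  (lookup reaches B again)
```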
5.1.5 Bound and Unbound Methods
Step 1 of the class attribute
reference lookup process described in the previous section actually
performs an additional task when the value found is a function. In
this case, the attribute reference does not return the function
object directly, but rather wraps the function into an
unbound method object or a bound
method object. The key difference between unbound and
bound methods is that an unbound method is not associated with a
particular instance, while a bound method is.
In the code in the previous section, attributes f,
g, and h are functions;
therefore an attribute reference to any one of them returns a method
object wrapping the respective function. Consider the
following:
print x.h, x.g, x.f, C.h, C.g, C.f
This statement outputs three bound methods, represented as strings
like:
<bound method C.h of <__main__.C instance at 0x8156d5c>>
and then three unbound ones, represented as strings like:
<unbound method C.h>
We get bound methods when the attribute reference is on instance
x, and unbound methods when the attribute
reference is on class C.
Because a bound method is already associated with a specific
instance, you call the method as follows:
x.h( ) # prints: method h in class C
The key thing
to notice here is that you don't pass the
method's first argument, self, by
the usual argument-passing syntax. Rather, a bound method of instance
x implicitly binds the self
parameter to object x. Thus, the body of the
method can access the instance's attributes as
attributes of self, even though we
don't pass an explicit argument to the method.
An unbound method, however, is not associated with a specific
instance, so you must specify an appropriate instance as the first
argument when you invoke an unbound method. For example:
C.h(x) # prints: method h in class C
You call unbound methods far less frequently than you call bound
methods. The main use for unbound methods is for accessing overridden
methods, as discussed in Section 5.1.6 later in this chapter.
5.1.5.1 Unbound method details
As we've just discussed, when an attribute reference
on a class refers to a function, a reference to that attribute
returns an unbound method that wraps the function. An unbound method
has three attributes in addition to those of the function object it
wraps: im_class is the class object supplying the
method, im_func is the wrapped function, and
im_self is always None. These
attributes are all read-only, meaning that trying to rebind or unbind
any of them raises an exception.
You can call an unbound method just as you would call its
im_func function, but the first argument in any
call must be an instance of im_class or a
descendant. In other words, a call to an unbound method must have at
least one argument, which corresponds to the first formal parameter
(conventionally named self).
5.1.5.2 Bound method details
As covered earlier in Section 5.1.4, an attribute reference on an
instance x, such as x.f, delegates the lookup to x's class when 'f'
is not a key in x.__dict__. In this case, when the lookup finds a
function object, the attribute reference operation creates and
returns a bound method that wraps the function. Note that when the
attribute reference finds a function object in x.__dict__ or any
other kind of callable object by whatever route, the attribute
reference operation does not create a bound method. The bound method
is created only when a function object is found as an attribute in
the instance's class.
A bound method is similar to an unbound method, in that it has three
read-only attributes in addition to those of the function object it
wraps. Like with an unbound method, im_class is
the class object supplying the method, and im_func
is the wrapped function. However, in a bound method object, attribute
im_self refers to x,
the instance from which the method was obtained.
A bound method is used like its im_func function,
but calls to a bound method do not explicitly supply an argument
corresponding to the first formal parameter (conventionally named
self). When you call a bound method, the bound
method passes im_self as the first argument to
im_func, before other arguments (if any) are
passed at the point of call.
Let's follow the conceptual steps in a typical method call with the
normal syntax x.name(arg). x is an instance object, name is an
identifier naming one of x's methods (a function-valued attribute of
x's class), and arg is any expression. Python checks if 'name' is a
key in x.__dict__, but it isn't. So Python finds name in x.__class__
(possibly, by inheritance, in one of its __bases__). Python notices
that the value is a function object, and that the lookup is being
done on instance x. Therefore, Python creates a bound method object
whose im_self attribute refers to x. Then, Python calls the bound
method object with arg as the only actual argument. The bound method
inserts im_self (i.e., x) as the first actual argument and arg
becomes the second one. The overall effect is just like calling:
x.__class__.__dict__['name'](x, arg)
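The steps above can be checked with a short sketch. It uses the spellings __self__ and __func__, which Python 2.6 and later provide as aliases for im_self and im_func (and which are the only spellings in Python 3); the class Greeter and function shout are hypothetical names used for illustration.

```python
class Greeter:
    def greet(self, name):
        return "Hello, " + name

g = Greeter()
bound = g.greet       # the attribute reference creates a bound method

print(bound.__self__ is g)                            # prints: True
print(bound.__func__ is Greeter.__dict__['greet'])    # prints: True

# Calling the bound method inserts g as the first argument.
print(bound("Ann"))                                   # prints: Hello, Ann
print(Greeter.__dict__['greet'](g, "Ann"))            # prints: Hello, Ann

# A function stored in the instance's own __dict__ is not wrapped:
# it is called with exactly the arguments given, with no self.
def shout(name):
    return name.upper()

g.shout = shout
print(g.shout("ann"))                                 # prints: ANN
```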
When a bound method's function body executes, it has
no special namespace relationship to either its
self object or any class. Variables referenced are
local or global, just as for any other function, as covered in Section 4.10.6. Variables do not implicitly indicate
attributes in self, nor do they indicate
attributes in any class object. When the method needs to refer to,
bind, or unbind an attribute of its self object,
it does so by standard attribute-reference syntax (e.g.,
self.name). The lack of
implicit scoping may take some getting used to (since Python differs
in this respect from many other object-oriented languages), but it
results in clarity, simplicity, and the removal of potential
ambiguities.
Bound method objects are first-class objects, and you can use them
wherever you can use a callable object. Since a bound method holds
references to the function it wraps and to the
self object on which it executes,
it's a powerful and flexible alternative to a
closure (covered in Section 4.10.6.2). An instance object
with special method __call__ (covered in Section 5.3 later in this chapter)
offers another viable alternative. Each of these constructs lets you
bundle some behavior (code) and some state (data) into a single
callable object. Closures are simplest, but limited in their
applicability. Here's the closure from Chapter 4:
def make_adder_as_closure(augend):
    def add(addend, _augend=augend): return addend+_augend
    return add
Bound methods and callable instances are richer and more flexible.
Here's how to implement the same functionality with
a bound method:
def make_adder_as_bound_method(augend):
    class Adder:
        def __init__(self, augend): self.augend = augend
        def add(self, addend): return addend+self.augend
    return Adder(augend).add
Here's how to implement it with a callable instance
(an instance with __call__):
def make_adder_as_callable_instance(augend):
    class Adder:
        def __init__(self, augend): self.augend = augend
        def __call__(self, addend): return addend+self.augend
    return Adder(augend)
From the viewpoint of the code that calls the functions, all of these
functions are interchangeable, since all return callable objects that
are polymorphic (i.e., usable in the same ways). In terms of
implementation, the closure is simplest; the bound method and
callable instance use more flexible and powerful mechanisms, but
there is really no need for that extra power in this case.
5.1.6 Inheritance
When you use an attribute reference C.name on a class object C, and
'name' is not a key in C.__dict__, the lookup implicitly proceeds on
each class object that is in C.__bases__, in order. C's base classes
may in turn have their own base classes. In this case, the lookup
recursively proceeds up the inheritance tree, stopping when 'name' is
found. The search is depth-first, meaning that it examines the
ancestors of each base class of C before considering the next base
class of C.
Consider the following example:
class Base1:
    def amethod(self): print "Base1"
class Base2(Base1): pass
class Base3:
    def amethod(self): print "Base3"
class Derived(Base2, Base3): pass
aninstance = Derived( )
aninstance.amethod( )       # prints: "Base1"
In this case, the lookup for amethod starts in
Derived. When it isn't found
there, lookup proceeds to Base2. Since the
attribute isn't found in Base2,
lookup then proceeds to Base2's
ancestor, Base1, where the attribute is found.
Therefore, the lookup stops at this point and never considers
Base3, where it would also find an attribute with
the same name.
5.1.6.1 Overriding attributes
As
we've just seen, the search for an attribute
proceeds up the inheritance tree and stops as soon as the attribute
is found. Descendent classes are examined before their ancestors,
meaning that when a subclass defines an attribute with the same name
as one in a superclass, the search finds the definition when it looks
at the subclass and stops there. This is known as the subclass
overriding the definition in the superclass.
Consider the following:
class B:
    a = 23
    b = 45
    def f(self): print "method f in class B"
    def g(self): print "method g in class B"
class C(B):
    b = 67
    c = 89
    d = 123
    def g(self): print "method g in class C"
    def h(self): print "method h in class C"
In this code, class C overrides attributes
b and g of its superclass
B.
5.1.6.2 Delegating to superclass methods
When a subclass
C overrides a method
f of its superclass
B, the body of
C.f
often wants to delegate some part of its operation to the
superclass's implementation of the method. This can
be done using an unbound method, as follows:
class Base:
    def greet(self, name): print "Welcome ", name
class Sub(Base):
    def greet(self, name):
        print "Well Met and",
        Base.greet(self, name)
x = Sub( )
x.greet('Alex')
The delegation to the superclass, in the body of Sub.greet, uses an
unbound method obtained by attribute reference Base.greet on the
superclass, and therefore passes all arguments normally, including
self. Delegating to a superclass implementation is the main use of
unbound methods.
One very common use of such delegation occurs with special method
__init__. When an instance is created in Python, the __init__ methods
of base classes are not automatically invoked, as they are in some
other object-oriented languages. Thus, it is up to a subclass to
perform the proper initialization by using delegation if necessary.
For example:
class Base:
    def __init__(self):
        self.anattribute = 23
class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.anotherattribute = 45
If the __init__ method of class Derived didn't explicitly call that
of class Base, instances of Derived would miss that portion of their
initialization, and thus such instances would lack attribute
anattribute.
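To see the consequence of omitting the call, compare a subclass that delegates with one that does not. Forgetful is a hypothetical class added for contrast; Base and Derived follow the definitions in the text.

```python
class Base:
    def __init__(self):
        self.anattribute = 23

class Derived(Base):
    def __init__(self):
        Base.__init__(self)           # explicit delegation to the base class
        self.anotherattribute = 45

class Forgetful(Base):
    def __init__(self):
        # Base.__init__ is never called here.
        self.anotherattribute = 45

print(sorted(Derived().__dict__))     # prints: ['anattribute', 'anotherattribute']
print(sorted(Forgetful().__dict__))   # prints: ['anotherattribute']
print(hasattr(Forgetful(), 'anattribute'))   # prints: False
```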
5.1.6.3 "Deleting" class attributes
Inheritance and overriding
provide a simple and effective way to add or modify class attributes
(methods) non-invasively (i.e., without modifying the class in which
the attributes are defined), by adding or overriding the attributes
in subclasses. However, inheritance does not directly support similar
ways to delete (hide) base classes' attributes
non-invasively. If the subclass simply fails to define (override) an
attribute, Python finds the base class's definition.
If you need to perform such deletion, possibilities include:
- Overriding the method and raising an exception in the method's body
- Eschewing inheritance, holding the attributes elsewhere than in the
  subclass's __dict__, and defining __getattr__ for selective
  delegation
- Using the new-style object model and overriding __getattribute__ to
  similar effect
The last two techniques here are demonstrated in "__getattribute__"
later in this chapter.
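The first technique in the list can be sketched as follows. The classes Base and ReadOnly are hypothetical, and the choice of AttributeError as the exception to raise is an assumption made for this example; the text does not prescribe a particular exception type.

```python
class Base:
    def load(self):
        return "loaded"
    def save(self):
        return "saved"

class ReadOnly(Base):
    def save(self):
        # Hide the inherited method by refusing to run it.
        raise AttributeError("ReadOnly instances do not support save")

r = ReadOnly()
print(r.load())                   # prints: loaded
try:
    r.save()
except AttributeError:
    print("save is hidden")       # prints: save is hidden
```

Note that the attribute still exists, so hasattr(r, 'save') remains true; the method fails only when it is called.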