26.1 Python's distutils
The distutils are
a rich and flexible set of tools to package Python programs and
extensions for distribution to third parties. I cover typical, simple
use of the distutils for the most common packaging
needs. For in-depth, highly detailed discussion of
distutils, I recommend two manuals that are part
of Python's online documentation:
Distributing Python Modules (available at
http://www.python.org/doc/current/dist/), and
Installing Python Modules (available at
http://www.python.org/doc/current/inst/),
both by Greg Ward, the principal author of the
distutils.
26.1.1 The Distribution and Its Root
A distribution
is the set of files to package into a single file for distribution
purposes. A distribution may include zero, one, or more Python
packages and other Python modules (as covered in Chapter 7), as well as, optionally, Python scripts,
C-coded (and other) extensions, supporting data files, and auxiliary
files containing metadata about the distribution itself. A
distribution is said to be pure if all code it
includes is Python, and non-pure if it also
includes non-Python code (most often, C-coded
extensions).
You should
normally place all the files of a distribution in a directory, known
as the distribution root directory, and in
subdirectories of the distribution root. Mostly, you can arrange the
subtree of files and directories rooted at the distribution root to
suit your own organizational needs. However, remember from Chapter 7 that a Python package must reside in its own
directory, and a package's directory must contain a
file named __init__.py (or subdirectories with
__init__.py files, for subpackages) as well as
other modules belonging to that
package.
26.1.2 The setup.py Script
The distribution root directory must
contain a Python script that by convention is named
setup.py. The setup.py
script can, in theory, contain arbitrary Python code. However, in
practice, setup.py always boils down to some
variation of:
from distutils.core import setup, Extension
setup( many keyword arguments go here )
All the action is in the parameters you supply in the call to
setup. You should not import
Extension if your setup.py
deals with a pure distribution. Extension is
needed only for non-pure distributions, and you should import it only
when you need it. It is fine to have a few statements before the call
to setup, in order to arrange
setup's arguments in clearer and
more readable ways than could be managed by having everything inline
as part of the setup call.
The distutils.core.setup function accepts only
keyword arguments, and there are a large number of such arguments
that you could potentially supply. A few deal with the internal
operations of the distutils themselves, and you
never supply such arguments unless you are extending or debugging the
distutils, an advanced subject that I do not cover
in this book. Other keyword arguments to setup
fall into two groups: metadata about the distribution, and
information about what files are in the distribution.
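As a minimal sketch of this pattern, the statements before the call can collect the two groups of arguments in named variables. The names mydist, mypkg, and standalone, and the helper run_setup, are hypothetical. The sketch assumes an interpreter on which distutils is importable (the package was removed from the standard library in Python 3.12), so the import is deferred into the helper.

```python
# setup.py sketch for a pure distribution (all names are hypothetical).

# Statements before the call keep the setup() arguments readable.
METADATA = dict(
    name="mydist",
    version="0.1",
    description="A short, one-line summary of mydist",
)
CONTENTS = dict(
    packages=["mypkg"],          # package directory under the root
    py_modules=["standalone"],   # standalone.py in the root
)


def run_setup():
    """Call setup() with the collected keyword arguments."""
    # Pure distribution: import only setup, not Extension.
    from distutils.core import setup
    setup(**METADATA, **CONTENTS)

# In a real setup.py, the last statement would simply be: run_setup()
```

Keeping the metadata apart from the contents mirrors the two groups of keyword arguments described above.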
26.1.3 Metadata About the Distribution
You should provide metadata about the
distribution by supplying some of the following keyword arguments
when you call the distutils.core.setup function.
The value you associate with each argument name you supply is a
string that is intended mostly to be human-readable; therefore, any
specifications about the string's format are just
advisory. The explanations and recommendations about the metadata
fields in the following are also non-normative, and correspond only
to common, not universal, conventions. Whenever the following
explanations refer to "this
distribution," it can be taken to refer to the
material included in the distribution, rather than to the packaging
of the distribution.
- author
-
The name(s) of the author(s) of material included in the
distribution. You should always provide this information, as the
authors deserve credit for their work.
- author_email
-
Email address(es) of the author(s) named in argument
author. You should provide this information only
if the author is willing to receive email about this work.
- contact
-
The name of the principal contact person or mailing list for this
distribution. You should provide this information if there is
somebody who should be contacted in preference to people named in
arguments author and
maintainer.
- contact_email
-
Email address of the contact named in argument
contact. You should provide this information if
and only if you supply the contact argument.
- description
-
A concise description of this distribution, preferably fitting within
one line of 80 characters or fewer. You should always provide this
information.
- fullname
-
The full name of this distribution. You should provide this
information if the name supplied as argument name
is in abbreviated or incomplete form (e.g., an acronym).
- keywords
-
A list of keywords that would likely be searched for by somebody
looking for the functionality provided by this distribution. You
should provide this information if it might be useful to index this
distribution in some kind of search engine.
- license
-
The licensing terms of this distribution, in a concise form that may
refer for details to a file in the distribution or to a URL. You
should always provide this information.
- maintainer
-
The name(s) of the current maintainer(s) of this distribution. You
should normally provide this information if the maintainer is
different from the author.
- maintainer_email
-
Email address(es) of the maintainer(s) named in argument
maintainer. You should provide this information
only if you supply the maintainer argument and if
the maintainer is willing to receive email about this work.
- name
-
The name of this distribution as a valid Python identifier (this
often requires abbreviations, e.g., by an acronym). You should always
provide this information.
- platforms
-
A list of platforms on which this distribution is known to work. You
should provide this information if you have reasons to believe this
distribution may not work everywhere. This information should be
reasonably concise, so this field may refer for details to a file in
the distribution or to a URL.
- url
-
A URL at which more information can be found about this distribution.
You should always provide this information if any such URL exists.
- version
-
The version of this distribution and/or its contents, normally
structured as major.minor or even more
finely. You should always provide this information.
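The recommendations above can be turned into a small check to run before packaging. This is only a sketch: the helper missing_metadata and all the metadata values are hypothetical, and the tuple of fields simply collects the arguments that the list says you should always provide.

```python
# Fields the list above says you should always provide.
ALWAYS_PROVIDE = ("author", "description", "license", "name", "version")


def missing_metadata(kwargs):
    """Return the always-recommended fields absent or empty in kwargs."""
    return [field for field in ALWAYS_PROVIDE if not kwargs.get(field)]


# Hypothetical metadata for an imaginary distribution.
metadata = dict(
    name="mydist",
    version="0.1",
    description="Utilities for frobnicating widgets",
    author="A. N. Author",
    author_email="author@example.com",
    url="http://www.example.com/mydist/",
)

# The license argument was left out, so the check reports it.
print(missing_metadata(metadata))
```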
26.1.4 Distribution Contents
A distribution can contain a mix of
Python source files, C-coded extensions, and other files.
setup accepts optional keyword arguments detailing
files to put in the distribution. Whenever you specify file paths,
the paths must be relative to the distribution root directory and use
/ as the path separator.
distutils adapts location and separator
appropriately when it installs the distribution. Note, however, that
the keyword arguments packages and
py_modules do not list file paths, but rather
Python packages and modules respectively. Therefore, in the values of
these keyword arguments, use no path separators or file extensions.
When you list subpackage names in argument
packages, use Python syntax (e.g.,
top_package.sub_package).
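Since packages takes names in Python syntax rather than paths, it can help to derive the names from the directory tree. The following sketch walks a distribution root and returns a dotted name for each directory that holds an __init__.py file. The function list_packages is a hypothetical helper, not part of the distutils.

```python
import os


def list_packages(root):
    """Return dotted names of all packages found under root.

    A directory counts as a package only if it holds __init__.py and,
    for a subpackage, if its parent directory is itself a package.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath != root:
            if "__init__.py" not in filenames:
                dirnames[:] = []   # not a package: do not descend
                continue
            rel = os.path.relpath(dirpath, root)
            found.append(rel.replace(os.sep, "."))
        dirnames.sort()            # deterministic traversal order
    return found
```

You could pass the result as the value of packages, for example packages=list_packages('.'), when every package under the root belongs in the distribution.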
26.1.4.1 Python source files
By default,
setup looks for Python modules (which you list in
the value of the keyword argument py_modules) in
the distribution root directory, and for Python packages (which you
list in the value of the keyword argument
packages) as subdirectories of the distribution
root directory. You may specify keyword argument
package_dir to change these defaults. However,
things are simpler when you locate files according to
setup's defaults, so I do not
cover package_dir further in this book.
The setup keyword arguments you will most
frequently use to detail what Python source files to put in the
distribution are the following.
packages=[ list of package name strings ]
For each package name string p in the
list, setup expects to find a subdirectory
p in the distribution root directory, and
includes in the distribution the file p/__init__.py,
which must be present, as well as any other file
p/*.py (i.e., all the modules of package
p). setup does not
search for subpackages of p: you must
explicitly list all subpackages, as well as top-level packages, in
the value of keyword argument packages.
py_modules=[ list of module name strings ]
For each module name string m in the list,
setup expects to find a file
m.py in the distribution root directory, and
includes m.py in the distribution.
scripts=[ list of script file path strings ]
Scripts are Python source files meant to be run as main programs
(generally from the command line). The value of the
scripts keyword lists the path strings of these
files, complete with .py extension, relative to
the distribution root directory.
Each script file should have as its first line a shebang line, that
is, a line starting with #! and containing the
substring python. When
distutils install the scripts included in the
distribution, distutils adjust each
script's first line to point to the Python
interpreter. This is quite useful on many platforms, since the
shebang line is used by the platform's shells or by
other programs that may run your scripts, such as web servers.
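A quick check of the shebang requirement might look like the following sketch. Both functions are hypothetical helpers; they test only the two conditions named above (the line starts with #! and contains the substring python).

```python
def has_python_shebang(first_line):
    """True if first_line starts with #! and mentions python."""
    return first_line.startswith("#!") and "python" in first_line


def script_is_ready(path):
    """Check the first line of the script file at path."""
    with open(path, encoding="utf-8", errors="replace") as script:
        return has_python_shebang(script.readline())
```

Running script_is_ready on each path listed in scripts before packaging catches scripts whose first line the distutils would have no reason to adjust.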
26.1.4.2 Other files
To put data files of any kind in the distribution, supply the
following keyword argument.
data_files=[ list of pairs (target_directory,[list of files]) ]
The value of keyword argument data_files is a list
of pairs. Each pair's first item is a string and
names a target directory (i.e., a directory
where distutils places data files when installing
the distribution); the second item is the list of file path strings
for files to put in the target directory. At installation time,
distutils places each target directory as a
subdirectory of Python's
sys.prefix for a pure distribution, or of
Python's sys.exec_prefix for a
non-pure distribution. distutils places the given
files directly in the respective target directory, never in
subdirectories of the target. For example, given the following
data_files usage:
data_files = [ ('miscdata', ['conf/config.txt', 'misc/sample.txt']) ]
distutils includes in the distribution the file
config.txt from subdirectory
conf of the distribution root, and the file
sample.txt from subdirectory
misc of the distribution root. At installation
time, distutils creates a subdirectory named
miscdata in Python's
sys.prefix directory (or in the
sys.exec_prefix directory, if the distribution is
non-pure), and copies the two files into
miscdata/config.txt and
miscdata/sample.txt.
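The mapping just described can be modeled in a few lines. This sketch only illustrates the rule stated above (each file lands directly in its target directory); installed_paths is a hypothetical helper, and the prefix /usr/local is an assumed stand-in for sys.prefix or sys.exec_prefix.

```python
import posixpath


def installed_paths(data_files, prefix):
    """Map each source path to where it would land under prefix."""
    result = {}
    for target_dir, sources in data_files:
        for source in sources:
            # Files go directly into the target directory, so only the
            # base name of each source path survives.
            name = posixpath.basename(source)
            result[source] = posixpath.join(prefix, target_dir, name)
    return result


data_files = [("miscdata", ["conf/config.txt", "misc/sample.txt"])]
for source, dest in installed_paths(data_files, "/usr/local").items():
    print(source, "->", dest)
```

Note that two source files with the same base name sent to the same target directory would collide, since the source subdirectories are not preserved.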
26.1.4.3 C-coded extensions
To put C-coded extensions in the distribution, supply the following
keyword argument.
ext_modules=[ list of instances of class Extension ]
All the details about each extension are supplied as arguments when
instantiating the distutils.core.Extension class.
Extension's constructor accepts
two mandatory arguments and many optional keyword arguments, as
follows.
class Extension(name, sources, **kwds)
name is the module name string for the
C-coded extension. name may include dots
to indicate that the extension module resides within a package.
sources is the list of source files that
the distutils must compile and link in order to
build the extension. Each item of sources
is a string giving a source file's path relative to
the distribution root directory, complete with file extension
.c. kwds lets you
pass other, optional arguments to Extension, as
covered later in this section.
The Extension class
also supports other file extensions besides .c,
indicating other languages you may use to code Python extensions. On
platforms having a C++ compiler, file extension .cpp
indicates C++ source files. Other file extensions that may
be supported, depending on the platform and on add-ons to the
distutils that are still in experimental stages at
the time of this writing, include .f for
Fortran, .i for SWIG, and
.pyx for Pyrex files. See Chapter 24 for information about using different
languages to extend Python.
In some cases, your extension needs no further information besides
mandatory arguments name and
sources. The distutils
implicitly perform all that is necessary to make the Python headers
directory and the Python library available for your
extension's compilation and linking, and also
provide whatever compiler or linker flags or options are needed to
build extensions on a given platform.
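For the simple case, a sketch with only the two mandatory arguments could look as follows. The module name mypkg.fast and the source path src/fast.c are hypothetical, and make_setup_kwargs assumes an interpreter on which distutils is importable, so it defers the import until it is called.

```python
# Hypothetical extension module mypkg.fast, built from one C source file.
EXTENSION_ARGS = dict(name="mypkg.fast", sources=["src/fast.c"])

# The dots in the name place the extension module inside a package.
package, _, module = EXTENSION_ARGS["name"].rpartition(".")


def make_setup_kwargs():
    """Build the ext_modules argument for the call to setup."""
    from distutils.core import Extension
    return dict(ext_modules=[Extension(**EXTENSION_ARGS)])
```

In a setup.py for a non-pure distribution, the result would be passed along with the other keyword arguments, for example setup(name='mydist', **make_setup_kwargs()).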
When it takes additional information to compile and link your
extension correctly, you can supply such information via the keyword
arguments of class Extension. Such arguments may
potentially interfere with the cross-platform portability of your
distribution. In particular, whenever you specify file or directory
paths as the values of such arguments, the paths should be relative
to the distribution root directory—using absolute paths
seriously impairs your distribution's cross-platform
portability.
Portability is not a problem when you just use the
distutils as a handy way to build your extension,
as suggested in Chapter 24. However, when you plan
to distribute your extensions to other platforms, you should examine
whether you really need to provide build information via keyword
arguments to Extension. It is sometimes possible
to bypass such needs by careful coding at the C level, and the
already mentioned Distributing Python Modules
manual provides important
examples.
The keyword arguments that you may pass when calling
Extension are the following:
- define_macros = [ ( macro_name,macro_value) ... ]
-
Each of the items macro_name and
macro_value, in the pairs listed as the
value of define_macros, is a string, respectively
the name and value for a C preprocessor macro definition, equivalent
in effect to the C preprocessor directive:
#define macro_name macro_value
macro_value can also be
None, to get the same effect as the C preprocessor
directive:
#define macro_name
- extra_compile_args = [ list of compile_arg strings ]
-
Each of the strings compile_arg listed as
the value of extra_compile_args is placed among
the command-line arguments for each invocation of the C compiler.
- extra_link_args = [ list of link_arg strings ]
-
Each of the strings link_arg listed as the
value of extra_link_args is placed among the
command-line arguments for the invocation of the linker.
- extra_objects = [ list of object_name strings ]
-
Each of the strings object_name listed as
the value of extra_objects names an object file to
add to the invocation of the linker. Do not specify the file
extension as part of the object name: distutils
adds the platform-appropriate file extension (such as
.o on Unix-like platforms and
.obj on Windows) to help you keep cross-platform
portability.
- include_dirs = [ list of directory_path strings ]
-
Each of the strings directory_path listed
as the value of include_dirs identifies a
directory to supply to the compiler as one where header files are
found.
- libraries = [ list of library_name strings ]
-
Each of the strings library_name listed as
the value of libraries names a library to add to
the invocation of the linker. Do not specify the file extension or
any prefix as part of the library name: distutils,
in cooperation with the linker, adds the platform-appropriate prefix
and file extension (such as prefix lib and extension
.a on Unix-like platforms, and extension
.lib on Windows) to help you keep cross-platform
portability.
- library_dirs = [ list of directory_path strings ]
-
Each of the strings directory_path listed
as the value of library_dirs identifies a
directory to supply to the linker as one where library files are
found.
- runtime_library_dirs = [ list of directory_path strings ]
-
Each of the strings directory_path listed
as the value of runtime_library_dirs identifies a
directory where dynamically loaded libraries are found at runtime.
- undef_macros = [ list of macro_name strings ]
-
Each of the strings macro_name listed as
the value of undef_macros is the name for a C
preprocessor macro definition, equivalent in effect to the C
preprocessor directive:
#undef macro_name
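To make the macro and library arguments concrete, the following sketch renders them the way a Unix-style C compiler command line would conventionally express them (-D and -U flags) and applies the library naming described above. It is not the code the distutils use internally; macro_flags, library_filename, and all the values in ext_kwargs are hypothetical.

```python
def macro_flags(define_macros=(), undef_macros=()):
    """Render macro settings as Unix-style compiler flags."""
    flags = []
    for name, value in define_macros:
        if value is None:
            flags.append("-D%s" % name)            # #define name
        else:
            flags.append("-D%s=%s" % (name, value))  # #define name value
    flags.extend("-U%s" % name for name in undef_macros)  # #undef name
    return flags


def library_filename(name, windows=False):
    """Static library file name implied by a bare library name."""
    return "%s.lib" % name if windows else "lib%s.a" % name


# Hypothetical keyword arguments for an Extension.
ext_kwargs = dict(
    define_macros=[("NDEBUG", None), ("MAX_ITEMS", "64")],
    undef_macros=["TRACE"],
    include_dirs=["include"],   # relative to the distribution root
    libraries=["m"],            # no prefix, no file extension
)

print(macro_flags(ext_kwargs["define_macros"], ext_kwargs["undef_macros"]))
print(library_filename("m"), library_filename("m", windows=True))
```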
26.1.5 The setup.cfg File
The
distutils let the user who is installing your
distribution specify many options at installation time. Most often
the user will simply enter the following command at a command line:
C:\> python setup.py install
but the already mentioned manual Installing Python
Modules explains many alternatives in detail. If you wish
to provide suggested values for some installation options, you can
put a setup.cfg file in your distribution root
directory. setup.cfg can also provide
appropriate defaults for options you can supply to build-time
commands. For copious details on the format and contents of file
setup.cfg, see the already mentioned manual
Distributing Python Modules.
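As a rough illustration of the file's shape, setup.cfg groups options in one section per command. The sample below is an assumption made for illustration (it suggests the inplace option for build_ext and the optimize option for install), and the helper command_defaults merely reads it back with the standard configparser module to show the structure.

```python
import configparser

# Hypothetical setup.cfg contents: one section per distutils command.
SAMPLE_SETUP_CFG = """\
[build_ext]
inplace=1

[install]
optimize=1
"""


def command_defaults(text):
    """Return {command: {option: value}} from setup.cfg-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


print(command_defaults(SAMPLE_SETUP_CFG))
```

All values come back as strings, which is how they appear in the file.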
26.1.6 The MANIFEST.in and MANIFEST Files
When you run:
python setup.py sdist
to produce a packaged-up source distribution (typically a
.zip file on Windows, or a
.tar.gz file, also known as a tarball, on Unix),
the distutils by default insert the following in
the distribution:
- All Python and C source files, as well as data files, explicitly
mentioned or directly implied by your setup.py
file's options, as covered earlier in
this chapter
- Test files, located at test/test*.py under the
distribution root directory
- Files README.txt (if any), setup.cfg
(if any), and setup.py
You can add yet more files in the source distribution
.zip file or tarball by placing in the
distribution root directory a manifest template
file named MANIFEST.in, whose lines are rules,
applied sequentially, about files to add (include)
or subtract (prune) from the overall list of files
to place in the distribution. The sdist command of
the distutils also produces an exact list of the
files placed in the source distribution as a text file named
MANIFEST in the distribution root directory.
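The sequential nature of the rules can be sketched with a much-simplified model that handles only include and prune. It is not the algorithm the distutils implement: the helper apply_rules, the rules, and the file names are hypothetical, and the matching relies on fnmatch, whose * also crosses directory separators.

```python
import fnmatch


def apply_rules(rules, candidates):
    """Apply 'include' and 'prune' rules, in order, to relative paths."""
    selected = []
    for rule in rules:
        if not rule.strip():
            continue                     # skip blank lines
        command, *args = rule.split()
        if command == "include":
            # Add every candidate matching any of the patterns.
            for path in candidates:
                if path not in selected and any(
                        fnmatch.fnmatchcase(path, pat) for pat in args):
                    selected.append(path)
        elif command == "prune":
            # Drop everything selected so far under the directory.
            prefix = args[0].rstrip("/") + "/"
            selected = [p for p in selected if not p.startswith(prefix)]
        else:
            raise ValueError("rule not handled by this sketch: %r" % rule)
    return selected


rules = ["include *.txt docs/*.html", "prune docs/drafts"]
candidates = ["NOTES.txt", "docs/index.html",
              "docs/drafts/old.html", "setup.py"]
print(apply_rules(rules, candidates))
```

Because rules apply in order, a prune placed before the include it is meant to trim would have no effect in this model.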
26.1.7 Creating Prebuilt Distributions with distutils
The packaged
source distributions you create with python
setup.py sdist are the most
widely useful files you can produce with
distutils. However, you can make life even easier
for users with specific platforms by also creating prebuilt forms of
your distribution with the command python
setup.py bdist.
For a pure distribution, supplying prebuilt forms is merely a matter
of convenience for the users. You can create prebuilt pure
distributions for any platform, including ones different from those
on which you work, as long as you have available on your path the
needed commands (such as zip,
gzip, bzip2, and
tar). Such commands are freely available on the
Net for all sorts of platforms, so you can easily stock up on them in
order to provide maximum convenience to users who want to install
your distribution.
For a non-pure distribution, making prebuilt forms available may be
more than just an issue of convenience. A non-pure distribution, by
definition, includes code that is not pure Python, generally C code.
Unless you supply a prebuilt form, users need to have the appropriate
C compiler installed in order to build and install your distribution.
This is not a terrible problem on platforms where the appropriate C
compiler is the free and ubiquitous gcc.
However, on other platforms, the C compiler needed for normal
building of Python extensions is commercial and costly. For example,
on Windows, the normal C compiler used by Python and its C-coded
extensions is Microsoft Visual C++ (Release 6, at the time of this
writing). It is possible to substitute other compilers, including
free ones such as the mingw32 and
cygwin versions of gcc, and
Borland C++ 5.5, whose command-line version you can download from the
Net at no cost. However, the process of using such alternative
compilers, as documented in the Python online manuals, is rather
complex and intricate, particularly for end users who may not be
experienced programmers.
Therefore, if you want your non-pure distribution to be widely
adopted on such platforms as Windows, it's highly
advisable to make your distribution also available in prebuilt form.
However, unless you have developed or purchased advanced
cross-compilation environments, building a non-pure distribution and
packaging it up in prebuilt form is only feasible on the target
platform. You also need to have the necessary C compiler installed.
When those conditions are satisfied, however, the
distutils make the procedure quite simple. In
particular, the command:
python setup.py bdist_wininst
creates an .exe file that is a Windows installer
for your distribution. If your distribution is non-pure, the prebuilt
distribution is dependent on the specific Python version. The
distutils reflect this fact in the name of the
.exe installer they create for you. Say, for
example, that your distribution's
name metadata is mydist, your
distribution's version metadata
is 0.1, and the Python version you use is
2.2. In this case, the
distutils build a Windows installer named
mydist-0.1.win32-py2.2.exe.
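The naming pattern in this example can be written out as a one-line function. This is a sketch of the non-pure case described above only; wininst_name is a hypothetical helper.

```python
def wininst_name(name, version, python_version):
    """Installer file name for a non-pure distribution."""
    return "%s-%s.win32-py%s.exe" % (name, version, python_version)


print(wininst_name("mydist", "0.1", "2.2"))
```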