9.3 String Formatting
In Python, a string-formatting
expression has the syntax:
format % values
where format is a plain or Unicode string
containing format specifiers and values is
any single object or a collection of objects in a tuple or
dictionary. Python's string-formatting operator has
roughly the same set of features as the C language's
printf and operates in a similar way. Each format
specifier is a substring of format that
starts with a percent sign (%) and ends with one
of the conversion characters shown in Table 9-1.
Table 9-1. String-formatting conversion characters
Character | Output format | Notes
d, i | Signed decimal integer | Value must be number
u | Unsigned decimal integer | Value must be number
o | Unsigned octal integer | Value must be number
x | Unsigned hexadecimal integer (lowercase letters) | Value must be number
X | Unsigned hexadecimal integer (uppercase letters) | Value must be number
e | Floating-point value in exponential form (lowercase e for exponent) | Value must be number
E | Floating-point value in exponential form (uppercase E for exponent) | Value must be number
f, F | Floating-point value in decimal form | Value must be number
g, G | Like e or E when exp is less than -4 or greater than or equal to the precision; otherwise like f or F | exp is the exponent of the number being converted
c | Single character | Value can be integer or single-character string
r | String | Converts any value with repr
s | String | Converts any value with str
% | Literal % character | Consumes no value
Between the % and the conversion character, you
can specify a number of optional modifiers, as we'll
discuss shortly.
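As a rough illustration of Table 9-1, the following sketch applies each conversion character to a small sample value. The variable names are chosen only for this example, and the snippet avoids print so that it reads the same under older and newer Python versions.

```python
# One small conversion per kind of entry in Table 9-1.
# Variable names are illustrative only; expected results are in comments.
decimal = '%d' % 42                 # '42'
octal = '%o' % 8                    # '10'
hex_lower = '%x' % 255              # 'ff'
hex_upper = '%X' % 255              # 'FF'
exp_lower = '%e' % 1234.5           # '1.234500e+03'
exp_upper = '%E' % 1234.5           # '1.234500E+03'
fixed = '%f' % 3.14                 # '3.140000' (six decimals by default)
general_small = '%g' % 1234.5       # '1234.5' (decimal form)
general_large = '%g' % 1234567.0    # '1.23457e+06' (exponential form)
char_from_int = '%c' % 65           # 'A'
char_from_str = '%c' % 'A'          # 'A'
via_repr = '%r' % 'hi'              # "'hi'" (repr keeps the quotes)
via_str = '%s' % 'hi'               # 'hi'
percent = '%d%%' % 50               # '50%' (%% consumes no value)
```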
The result of a formatting expression is a string that is a copy of
format where each format specifier is
replaced by the corresponding item of
values converted to a string according to
the specifier. Here are some simple examples:
x = 42
y = 3.14
z = "george"
print 'result = %d' % x # prints: result = 42
print 'answers are: %d %f' % (x,y) # prints: answers are: 42 3.140000
print 'hello %s' % z # prints: hello george
9.3.1 Format Specifier Syntax
A format specifier can include numerous
modifiers that control how the corresponding item in
values is converted to a string. The
components of a format specifier, in order, are:
1. The mandatory leading % character that marks the
   start of the specifier
2. An optional item name in parentheses (e.g.,
   (name))
3. Zero or more optional conversion flags:
     #, which indicates that the conversion uses an
        alternate form (if any exists for its type)
     0, which indicates that the conversion is
        zero-padded
     -, which indicates that the conversion is
        left-justified
     a space, which indicates that a space is placed before
        a positive number
     +, which indicates that the numeric sign
        (+ or -) is included before any
        numeric conversion
4. An optional minimum width of the conversion, specified using
   one or more digits or an asterisk (*), which means that
   the width is taken from the next item in values
5. An optional precision for the conversion, specified with a dot
   (.) followed by zero or more digits or a *, which means
   that the precision is taken from the next item in values
6. A mandatory conversion type from Table 9-1
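The flags, width, and precision components listed above can be sketched as follows. The sample values and variable names are assumptions made for illustration; a trailing | is added in one case only to make the padding visible.

```python
# Width and flags applied to an integer (names are illustrative only).
right = '%5d' % 42            # '   42'  minimum width 5, right-justified
left = '%-5d|' % 42           # '42   |' the - flag left-justifies
zeros = '%05d' % 42           # '00042'  the 0 flag pads with zeros
plus = '%+d' % 42             # '+42'    the + flag always shows the sign
minus = '%+d' % -42           # '-42'
space = '% d' % 42            # ' 42'    a space before a positive number
alt_hex = '%#x' % 255         # '0xff'   alternate form for hexadecimal

# Width and precision combined on a floating-point value.
both = '%8.3f' % 3.14159      # '   3.142' width 8, 3 digits after the dot
```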
Item names must be given either in all format specifiers in
format or in none of them. When item names
are present, values must be a mapping
(often the dictionary of a namespace, e.g.,
vars()), and each item
name is a key in values. In other words,
each format specifier corresponds to the item in
values keyed by the
specifier's item name. When item names are present,
you cannot use * in any format specifier.
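A minimal sketch of formatting with item names follows. The dictionary person and the helper function describe are hypothetical names introduced only for this example.

```python
# values is a mapping; each specifier names the key it wants.
person = {'name': 'george', 'age': 42, 'unused': None}
by_name = '%(name)s is %(age)d' % person       # 'george is 42'

# The same key may be used more than once, and unused keys are ignored.
repeated = '%(name)s, %(name)s!' % person      # 'george, george!'

def describe():
    # Hypothetical helper: vars() with no argument returns the local
    # namespace as a dictionary, so local names can serve as item names.
    owner = 'george'
    count = 3
    return '%(owner)s has %(count)d items' % vars()

summary = describe()                           # 'george has 3 items'
```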
When item names are absent, values must be
a tuple; when there is just one item,
values may be the item itself instead of a
tuple. Each format specifier corresponds to an item in
values by position, and
values must have exactly as many items as
format has specifiers (plus one extra for
each width or precision given by *). When the
width or precision component of a specifier is given by
*, the * consumes one item in
values, which must be an integer and is
taken as the number of characters to use as minimum width or
precision of the conversion.
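The positional rules above, including the extra items consumed by *, might look like this in practice. The sample values are assumptions for illustration. Note one consequence of the single-item shortcut: when the single item is itself a tuple, it must be wrapped in a one-item tuple.

```python
# A single item may stand alone; several items need a tuple.
single = 'value: %s' % 42                 # 'value: 42'
pair = '%s and %s' % ('a', 'b')           # 'a and b'

# A tuple to be formatted as one item is wrapped in a one-item tuple.
wrapped = '%s' % ((1, 2),)                # '(1, 2)'

# Each * consumes one integer item before the value being converted.
star_width = '%*d' % (6, 42)              # '    42'
star_prec = '%.*f' % (2, 3.14159)         # '3.14'
star_both = '%*.*f' % (8, 3, 3.14159)     # '   3.142'

# Too few (or too many) items for the specifiers raises TypeError.
mismatch_raised = False
try:
    '%d %d' % (1,)
except TypeError:
    mismatch_raised = True
```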
9.3.2 Common String-Formatting Idioms
It
is quite common for format to contain
several occurrences of %s and for
values to be a tuple with exactly as many
items as format has occurrences of
%s. The result is a copy of
format where each %s is
replaced with str applied to the corresponding
item of values. For
example:
'%s+%s is %s'%(23,45,68) # results in: '23+45 is 68'
You can think of %s as a fast and concise way to
put together a few values, converted to string form, into a larger
string. For example:
oneway = 'x' + str(j) + 'y' + str(j) + 'z'
another = 'x%sy%sz' % (j, j)
After this code is executed, variables oneway and
another will always be equal, but the computation
of another, done via string formatting, is
measurably faster. Which way is clearer and simpler is a matter of
habit: get used to the string-formatting idiom, and it will come to
look simpler and clearer.
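Relative speed can vary with the interpreter version and the values involved, so it may be worth measuring rather than assuming. The sketch below uses the standard timeit module; the value 23 for j and the repetition count are arbitrary choices for this example, and no particular timing result is implied.

```python
import timeit

j = 23
oneway = 'x' + str(j) + 'y' + str(j) + 'z'
another = 'x%sy%sz' % (j, j)
same = (oneway == another)      # True: both are 'x23y23z'

# Time each approach; each call returns elapsed seconds as a float.
concat_time = timeit.timeit("'x' + str(j) + 'y' + str(j) + 'z'",
                            setup='j = 23', number=100000)
format_time = timeit.timeit("'x%sy%sz' % (j, j)",
                            setup='j = 23', number=100000)
```

Comparing concat_time and format_time on the interpreter you actually use gives a more reliable answer than a general rule.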
Apart from %s,
other reasonably common format specifiers are those used to format
floating-point values: %f for decimal formatting,
%e for exponential formatting, and
%g for either decimal or exponential formatting,
depending on the number's magnitude. When formatting
floating-point values, you normally specify width and/or precision
modifiers. A width modifier is a number right after the
% that gives the minimum width for the resulting
conversion; you generally use a width modifier if
you're formatting a table for display in a
fixed-width font. A precision modifier is a number following a dot
(.) right before the conversion type letter; you
generally use a precision modifier in order to fix the number of
decimal digits displayed for a number, to avoid giving a misleading
impression of excessive precision and wasting display space. For
example:
'%.2f'%(1/3.0) # results in: '0.33'
'%s'%(1/3.0) # results in: '0.333333333333'
With %s, you cannot specify how many digits to
display after the decimal point. It is important to avoid giving a
mistaken impression of very high precision when you know that your
numeric results are only accurate to a few digits. Displaying high
precision values might mislead people examining those results into
believing the results are much more accurate than is in fact the
case.
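As a small sketch of the width and precision idioms just described, the following lines format a two-column table and show one value under several precision settings. The rows data and the column widths are made up for illustration.

```python
# Hypothetical data: a name and a numeric value per row.
rows = [('apple', 1.5), ('kiwi', 12.25)]

# %-8s left-justifies the name in 8 columns;
# %8.2f right-justifies the number in 8 columns with 2 decimal digits.
table = ['%-8s%8.2f' % row for row in rows]
# table[0] is 'apple       1.50'
# table[1] is 'kiwi       12.25'

third = 1 / 3.0
as_fixed = '%.2f' % third       # '0.33'
as_exp = '%.2e' % third         # '3.33e-01'
as_general = '%.2g' % third     # '0.33' (2 significant digits)
```

Because every formatted row has the same length, the columns line up when the rows are displayed in a fixed-width font.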