13.2 Restricted Execution
Python
code executed dynamically normally suffers no special restrictions.
Python's general philosophy is to give the
programmer tools and mechanisms that make it easy to write good, safe
code, and trust the programmer to use them appropriately. Sometimes,
however, trust might not be warranted. When code to execute
dynamically comes from an untrusted source, the code itself is
untrusted. In such cases it's important to
selectively restrict the execution environment so that such code
cannot accidentally or maliciously inflict damage. If you never need
to execute untrusted code, you can skip this section. However, Python
makes it easy to impose appropriate restrictions on untrusted code if
you ever do need to execute
it.
When the
__builtins__ item in the global namespace
isn't the standard __builtin__
module (or the latter's dictionary), Python knows
the code being run is restricted. Restricted code executes in a
sandbox environment, previously prepared by the trusted code that
requests the restricted code's execution. Standard
modules rexec and Bastion help
you prepare an appropriate sandbox. To ensure that restricted code
cannot escape the sandbox, a few crucial internals (e.g., the
__dict__ attributes of modules, classes, and
instances) are not directly available to restricted code.
There is no
special protection against restricted code raising exceptions. On the
contrary, Python diagnoses any attempt by restricted code to violate
the sandbox restrictions by raising an exception. Therefore, you
should generally run restricted code in the try
clause of a try/except
statement, as covered in Chapter 6. Make sure you
catch all exceptions and handle them appropriately if your program
needs to keep running in such cases.
There is no built-in protection against untrusted code attempting to
inflict damage by consuming large amounts of memory or time
(so-called denial-of-service attacks). If you need to guard against
such attacks, you can run untrusted code in a separate process. The
separate process uses the mechanisms described in this section to
restrict the untrusted code's execution, while the
main process monitors the separate one and terminates it if and when
resource consumption becomes excessive. Processes are covered in
Chapter 14. Resource monitoring is currently
supported by the standard Python library only on Unix-like platforms
(by platform-specific module resource), and this
book covers only cross-platform
Python.
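A minimal sketch of the monitoring arrangement described above, limited to elapsed time. It assumes Python 3 and its standard subprocess module, neither of which this section covers; the function name run_with_time_limit is hypothetical, and the child process here applies no sandbox restrictions of its own.

```python
import subprocess
import sys


def run_with_time_limit(source, seconds):
    """Run source in a child Python process, killing it after `seconds`.

    Returns the child's exit status, or None if the child was terminated
    for exceeding the time limit. Only elapsed time is monitored here,
    not memory consumption.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            timeout=seconds,  # run() kills the child when this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


print(run_with_time_limit("print('hello')", 30))
print(run_with_time_limit("while True: pass", 1))
```

The main process stays responsive because the runaway loop executes in the child, which is the only process that gets terminated.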
As a final note, you need to know that there are known, exploitable
security weaknesses in the restricted-execution mechanisms, even in
the most recent versions of Python. Although restricted execution is
better than nothing, at the time of this writing there are no known
ways to execute untrusted code that are suitable for
security-critical situations.
13.2.1 The rexec Module
The rexec module supplies the
RExec class, which you can instantiate to prepare
a typical restricted-execution sandbox environment in which to run
untrusted code.
class RExec(hooks=None,verbose=False)
Returns an instance of the RExec class, which
corresponds to a new restricted-execution environment, also known as
a sandbox. hooks, if not
None, lets you exert fine-grained control on
import statements executed in the sandbox. This is
an advanced and rarely used functionality, and I do not cover it
further in this book. verbose, if true,
causes additional debugging output to be sent to standard output for
many kinds of operations in the sandbox.
13.2.1.1 Methods
An instance r of RExec
provides the following methods. Versions of
RExec's methods whose names start
with s_ rather than r_ are also
available. An r_ method and its
s_ variant are equivalent, but the latter also
ensures that untrusted code can call only safe methods on standard
file objects sys.stdin,
sys.stdout, and sys.stderr.
This is needed only in the unusual case in which you have replaced
the standard file objects with file-like objects that also expose
additional, unsafe methods or attributes.
r.add_module(modname)
Adds and returns a new empty module if
no module yet corresponds to name modname
in the sandbox. If the sandbox already contains a module object that
corresponds to name modname,
add_module returns that module object.
r.r_eval(expr)
r.s_eval(expr)
r_eval executes expr,
which must be an expression or a code object, in the restricted
environment and returns the expression's result.
r.r_exec(code)
r.s_exec(code)
r_exec executes code,
which must be a string of code or a code object, in the restricted
environment.
r.r_execfile(filename)
r.s_execfile(filename)
r_execfile executes the file identified by
filename, which must contain Python code,
in the restricted environment.
r.r_import(modname[,globals[,locals[,fromlist]]])
r.s_import(modname[,globals[,locals[,fromlist]]])
Imports the module modname into the
restricted environment. All parameters are just like for built-in
function __import__, covered in Chapter 7. r_import raises
ImportError if the module is considered unsafe. A
subclass of RExec may override
r_import, to change the set of modules available
to import statements in untrusted code and/or to
otherwise change import functionality for the
sandbox.
r.r_open(filename[,mode[,bufsize]])
Executes when restricted code calls the built-in
open. All parameters are just like for the
built-in open, covered in Chapter 10. The version of r_open in
class RExec opens any file for reading, but none
for writing or appending. A subclass may ease or tighten these
restrictions.
r.r_reload(module)
r.s_reload(module)
Reloads the module object module in the
restricted-execution environment, similarly to built-in function
reload, covered in Chapter 7.
r.r_unload(module)
r.s_unload(module)
Unloads the module object module from the
restricted-execution environment (i.e., removes it from the
dictionary sys.modules as seen by untrusted code
executing in the sandbox).
13.2.1.2 Attributes
When RExec's defaults
don't fully correspond to your
application's specific needs, you can easily
customize the restricted-execution sandbox. Class
RExec has several attributes that are tuples of
strings. The items of these tuples are names of functions, modules,
or directories to be specifically allowed or disallowed, as follows:
- nok_builtin_names: Built-in functions not to be supplied in the sandbox
- ok_builtin_modules: Built-in modules that the sandbox can import
- ok_path: Used as sys.path for the sandbox's import statements
- ok_posix_names: Attributes of os that the sandbox may import
- ok_sys_names: Attributes of sys that the sandbox may import
When you instantiate RExec, the new instance uses
class attributes to prepare the sandbox. If you need to customize the
sandbox, subclass RExec and instantiate the
subclass. Your subclass can override
RExec's attributes, typically by
copying the value that each attribute has in RExec
and selectively adding or removing specific items.
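The copy-and-adjust idiom for these tuple attributes might look like the sketch below. To keep it self-contained, it uses a hypothetical stand-in class instead of rexec.RExec, and the tuple values shown are made up for illustration; they are not RExec's actual defaults.

```python
class StandInRExec(object):
    # Hypothetical stand-in for rexec.RExec; these values are illustrative
    # only and do not reproduce the real class's defaults.
    ok_builtin_modules = ('math', 'time', 'marshal')
    nok_builtin_names = ('open', 'reload', '__import__')


class MyRExec(StandInRExec):
    # Copy the inherited tuple, drop one module name, and add another.
    ok_builtin_modules = tuple(
        name for name in StandInRExec.ok_builtin_modules if name != 'marshal'
    ) + ('cmath',)
    # Copy the inherited tuple and forbid one more built-in name.
    nok_builtin_names = StandInRExec.nok_builtin_names + ('compile',)


print(MyRExec.ok_builtin_modules)
print(MyRExec.nok_builtin_names)
```

Because tuples are immutable, the subclass builds new tuples and the base class's attributes stay untouched.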
13.2.1.3 Using rexec
In the simplest case, you can
instantiate RExec and call the
instance's r_exec and
r_eval methods instead of using statement
exec and built-in function
eval. For example, here's a
somewhat safer variant of built-in function input:
import rexec
rex = rexec.RExec()
def rexinput(prompt):
    # read a line of text, then evaluate it inside the sandbox
    expr = raw_input(prompt)
    return rex.r_eval(expr)
Function rexinput in this example is roughly
equivalent to built-in function input, covered in
Chapter 8. However, rexinput
guards against some of the abuses that are possible if you
don't trust the user who's
supplying input. For example, with the normal, unrestricted
eval, an expression such as
__import__('os').system('xx')
lets the interactive user run any external program
xx. Built-in function
input implicitly uses normal, unrestricted
eval on the user's input.
Function rexinput uses restricted execution
instead, so that the same expression fails and raises
AttributeError, claiming that module
os has no attribute named
system. This example does not use a
try/except around the
r_eval call, but of course your application code
that calls rexinput should use
try/except if you need your
program to keep executing when the user makes mistakes or
unsuccessful attempts to break security. Mistakes and attempts to
break security both get diagnosed through exceptions.
This example's usefulness comes from the fact that a
restricted-execution sandbox can hide some functionality from
untrusted code, so that untrusted code cannot take advantage of that
functionality to wreak havoc. Function os.system
is a prime example of functionality that should always be prohibited
to untrusted code, so class RExec forbids it by
default.
After creating a new restricted-execution environment
r with
r = rexec.RExec(), you
can optionally complete
r's initialization by
inserting modules into
r's sandbox with
add_module, then inserting attributes in those
modules with built-in function setattr. Simple
assignment statements also work just fine if the attributes have
names that you know at the time you're writing your
sandbox-preparation code. Here's how to enrich the
previous example to let the user-entered expressions use all
functions from module math (covered in Chapter 15) as if they were built-ins, since you know
that none of the functions presents any security risk:
import rexec, math
rex = rexec.RExec()
burex = rex.add_module('__builtins__')
# expose every public name of module math as if it were a built-in
for function in dir(math):
    if function[0] != '_':
        setattr(burex, function, getattr(math, function))
def rich_input(prompt):
    expr = raw_input(prompt)
    return rex.r_eval(expr)
Function rich_input in this example is now both
richer and safer than the built-in input.
It's richer because the user can now also input
expressions such as sin(1.0).
It's safer, just like rexinput in
the previous example, because it uses restricted execution to limit
untrusted code.
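The namespace-enrichment step can also be tried without rexec. The sketch below assumes a Python 3 interpreter, where a plain eval with an emptied __builtins__ dictionary stands in for the sandbox. The function name eval_with_math is hypothetical, and emptying __builtins__ only hides names; it is not a security boundary.

```python
import math

# Copy every public name from module math into the evaluation namespace.
# The empty __builtins__ dictionary hides the normal built-in names, but
# it does not make evaluation safe for hostile input.
base_namespace = {'__builtins__': {}}
for name in dir(math):
    if not name.startswith('_'):
        base_namespace[name] = getattr(math, name)


def eval_with_math(expr):
    """Evaluate expr with math's public names visible as if built in."""
    # Evaluate in a copy so that one expression cannot alter the namespace
    # seen by later ones.
    return eval(expr, dict(base_namespace))


print(eval_with_math('sin(0.0)'))
print(eval_with_math('sqrt(16)'))
```

An expression such as open('somefile') fails here with NameError, since open is no longer among the visible names.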
Normally, you use add_module, and then add
attributes, only for the modules named '__main__'
and '__builtins__'. If the
untrusted code needs other modules that it is allowed to import
(based on the ok_builtin_modules and
ok_path attributes of the RExec
subclass you instantiated), the untrusted code can import those other
modules normally, usually with an import statement
or a call to built-in function __import__.
However, you can also choose to use add_module for
other module names in order to synthesize, restrict, or otherwise
modify modules that later get imported by the untrusted code.
Once you have populated the sandbox, untrusted code can call the
functions and other callables that you added to the sandbox. When
called, such functions and other callables execute in the normal
(non-sandbox) environment, without constraints. You should therefore
ensure that untrusted code cannot cause damage by misusing such
callables. Module Bastion, covered in the next
section, deals with the specific task of selectively exposing object
methods.
13.2.2 The Bastion Module
The Bastion
module supplies a class, each of whose instances wraps an object and
selectively exposes some of the wrapped object's
methods, but no other attributes.
class Bastion(obj,filter=lambda n: n[:1]!='_',name=None)
A Bastion instance b
wrapping object obj exposes only those
methods of obj for whose name
filter returns true. An access
b.attr
works like:
if filter('attr'): return obj.attr
else: raise AttributeError, 'attr'
plus a check that
b.attr
is a method, not an attribute of any other type.
The default filter accepts all method
names that do not start with an underscore (_)
(i.e., all methods that are neither private nor special methods).
When name is not None,
repr(b)
is the string '<Bastion for name>'. When
name is None,
repr(b)
is '<Bastion for %s>' % repr(obj).
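The filtering behavior just described can be approximated in a few lines. This is a sketch of the idea only, not the Bastion module's implementation: it assumes Python 3, the names make_filtered_proxy and Account are hypothetical, and the wrapped object remains reachable through introspection, so the proxy offers no real protection.

```python
import inspect


def make_filtered_proxy(obj, name_filter=lambda n: n[:1] != '_'):
    """Return a proxy exposing only methods of obj whose names pass name_filter."""

    class FilteredProxy(object):
        def __getattr__(self, attr):
            # __getattr__ runs only for names that normal lookup does not
            # find, so names inherited from object bypass this check.
            if name_filter(attr):
                value = getattr(obj, attr)
                if inspect.ismethod(value):  # expose methods, nothing else
                    return value
            raise AttributeError(attr)

    return FilteredProxy()


class Account(object):
    # Hypothetical class used only to exercise the proxy.
    def __init__(self):
        self.balance = 10

    def get_balance(self):
        return self.balance

    def _reset(self):
        self.balance = 0


proxy = make_filtered_proxy(Account())
print(proxy.get_balance())
```

Accessing proxy.balance or proxy._reset raises AttributeError: the first is not a method, and the second is rejected by the default filter.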
Suppose, for example, that your application supplies a class
MyClass whose public methods are all safe, while
private and special methods, as well as attributes that are not
methods, should be hidden from untrusted code. In the sandbox, you
can provide a factory function that supplies safely wrapped instances
of MyClass to untrusted code as follows:
import rexec, Bastion
rex = rexec.RExec()
burex = rex.add_module('__builtins__')
def SafeMyClassFactory(*args, **kwds):
    # wrap each new instance so only its public methods are exposed
    return Bastion.Bastion(MyClass(*args, **kwds))
burex.MyClass = SafeMyClassFactory
Now, untrusted code that you run with rex.r_exec
can instantiate and use safely wrapped instances of
MyClass:
m = MyClass(1,2,3)
m.somemethod(4,5)
However, any attempt by the untrusted code to access private or
special methods, even indirectly (e.g.,
m[6]=7 indirectly tries
to use special method __setitem__), raises
AttributeError, whether the real
MyClass supplies such methods or not. Suppose you
want a slightly less tight wrapping, allowing untrusted code to use
special method __getitem__, as well as normal
public methods, but no other. You just need to provide a custom
filter function when you instantiate
Bastion:
import rexec, Bastion
rex = rexec.RExec()
burex = rex.add_module('__builtins__')
def SafeMyClassFactory(*args, **kwds):
    # accept __getitem__ and any name that doesn't start with an underscore
    def is_safe(n): return n == '__getitem__' or n[0] != '_'
    return Bastion.Bastion(MyClass(*args, **kwds), is_safe)
burex.MyClass = SafeMyClassFactory
Now, untrusted code that is run in sandbox rex can
get, but not set, items of the instances of
MyClass it builds with the factory function
(assuming, of course, that your class MyClass does
supply method __getitem__).