6.6 Error-Checking Strategies

Most programming languages that support exceptions are geared to raise exceptions only in very rare cases. Python's emphasis is different. In Python, exceptions are considered appropriate whenever they make a program simpler and more robust. A common idiom in other languages, sometimes known as "look before you leap" (LBYL), is to check in advance, before attempting an operation, for all circumstances that might make the operation invalid. This is not ideal, for several reasons:

The checks may diminish the readability and clarity of the common, mainstream cases where everything is okay.
The work needed for checking may duplicate a substantial part of the work done in the operation itself.
The programmer might easily err by omitting some needed check.
The situation might change between the moment the checks are performed and the moment the operation is attempted.

The preferred idiom in Python is generally to attempt the operation in a try clause and handle the exceptions that may result in except clauses. This idiom is known as "it's easier to ask forgiveness than permission" (EAFP), a motto widely credited to Admiral Grace Murray Hopper, co-inventor of COBOL, and shares none of the defects of "look before you leap." Here is a function written using the LBYL idiom:

def safe_divide_1(x, y):
    if y=  =0:
        print "Divide-by-0 attempt detected"
        return None
    else:
        return x/y

With LBYL, the checks come first, and the mainstream case is somewhat hidden at the end of the function.

Here is the equivalent function written using the EAFP idiom:

def safe_divide_2(x, y):
    try:
        return x/y
    except ZeroDivisionError:  
        print "Divide-by-0 attempt detected"
        return None

With EAFP, the mainstream case is up front in a try clause, and the anomalies are handled in an except clause.

EAFP is most often the preferable error-handling strategy, but it is not a panacea. In particular, you must be careful not to cast too wide a net, catching errors that you did not expect and therefore did not mean to catch. The following is a typical case of such a risk (built-in function getattr is covered in Chapter 8):

def trycalling(obj, attrib, default, *args, **kwds):
    try: return getattr(obj, attrib)(*args, **kwds)
    except AttributeError: return default

The intention of function trycalling is to try calling a method named attrib on object obj, but to return default if obj has no method thus named. However, the function as coded does not do just that. It also hides any error case where AttributeError is raised inside the implementation of the sought-after method, silently returning default in those cases. This may hide bugs in other code. To do exactly what is intended, the function must take a little bit more care:

def trycalling(obj, attrib, default, *args, **kwds):
    try: method = getattr(obj, attrib)
    except AttributeError: return default
    else: return method(*args, **kwds)

This implementation of trycalling separates the getattr call, placed in the try clause and therefore watched over by the handler in the except clause, from the call of the method, placed in the else clause and therefore free to propagate any exceptions it may need to. Using EAFP in the most effective way involves frequent use of the else clause on try/except statements.

6.6.1 Handling Errors in Large Programs

In large programs, it is especially easy to err by making your try/except statements too wide, particularly once you have convinced yourself of the power of EAFP as a general error-checking strategy. A try/except is too wide when it catches too many different errors or an error that can occur in too many different places. The latter is a problem if you need to distinguish exactly what happened and where, and the information in the traceback is not sufficient to pinpoint such details (or you discard some or all of the information in the traceback object). For effective error handling, you have to keep a clear distinction between errors and anomalies that you expect (and thus know exactly how to handle), and unexpected errors and anomalies, which indicate a bug somewhere in your program.

Some errors and anomalies are not really erroneous, and perhaps not even all that anomalous: they are just special cases, perhaps rare but nevertheless quite expected, which you choose to handle via EAFP rather than via LBYL to avoid LBYL's many intrinsic defects. In such cases, you should just handle the anomaly, in most cases without even logging or reporting it. Be very careful, under these circumstances, to keep the relevant try/except constructs as narrow as feasible. Use a small try clause that doesn't call too many other functions, and very specific exception-class lists in the except clauses.

Errors and anomalies that depend on user input or other external conditions not under your control are always expected to some extent, precisely because you have no control on their underlying causes. In such cases, you should concentrate your effort on handling the anomaly gracefully, normally reporting and logging its exact nature and details, and generally keep your program running with undamaged internal and persistent states. The width of try/except clauses under such circumstances should also be reasonably narrow, although this is not quite as crucial as when you use EAFP to structure your handling of not-really-erroneous special cases.

Lastly, entirely unexpected errors and anomalies indicate bugs in your program's design or coding. In most cases, the best strategy regarding such errors is to avoid try/except and just let the program terminate with error and traceback messages. (You might even want to log such information and/or display it more suitably with an application-specific hook in sys.excepthook, as we'll discuss shortly.) If your program must keep running at all costs, even under such circumstances, try/except statements that are quite wide may be appropriate, with the try clause guarding function calls that exercise vast swaths of program functionality and broad except clauses.

In the case of a long-running program, make sure all details of the anomaly or error are logged to some persistent place for later study (and that some indication gets displayed, too, so that you know such later study is necessary). The key is making sure that the program's persistent state can be reverted to some undamaged, internally consistent point. The techniques that enable long-running programs to survive some of their own bugs are known as checkpointing and transactional behavior, but they are not covered further in this book.

6.6.2 Logging Errors

When Python propagates an exception all the way to the top of the stack without finding an applicable handler, the interpreter normally prints an error traceback to the standard error stream of the process (sys.stderr) before terminating the program. You can rebind sys.stderr to any file-like object usable for output in order to divert this information to a destination more suitable for your purposes.

When you want to change the amount and kind of information output on such occasions, rebinding sys.stderr is not sufficient. In such cases, you can assign your own function to sys.excepthook, and Python will call it before terminating the program due to an unhandled exception. In your exception-reporting function, you can output whatever information you think will later help you diagnose and debug the problem to whatever destinations you please. For example, you might use module traceback (covered in Chapter 17) to help you format stack traces. When your exception-reporting function terminates, so does your program.

6.6.3 The assert Statement

The assert statement allows you to introduce debugging code into a program. assert is a simple statement with the following syntax:

assert condition[,expression]

When you run Python with the optimize flag (-O, as covered in Chapter 3), assert is a null operation: the compiler generates no code. Otherwise, assert evaluates condition. If condition is satisfied, assert does nothing. If condition is not satisfied, assert instantiates AssertionError with expression as the argument (or without arguments, if there is no expression) and raises the resulting instance.

assert statements are an effective way to document your program. When you want to state that a significant condition C is known to hold at a certain point in a program's execution, assert C is better than a comment that just states C. The advantage of assert is that when the condition does not in fact hold, assert alerts you to the problem by raising AssertionError.

6.6.4 The _ _debug_ _ Built-in Variable

When you run Python without option -O, the _ _debug_ _ built-in variable is True. When you run Python with option -O, _ _debug_ _ is False. Also, with option -O, the compiler generates no code for an if statement whose condition is _ _debug_ _.

To exploit this optimization, surround the definitions of functions that you call only in assert statements with if _ _debug_ _. This technique makes compiled code smaller and faster when Python is run with -O, and enhances program clarity by showing that the functions exist only to perform sanity checks.