The Thread Model (Programming Perl)

17.2. The Thread Model

The thread model of multiprocessing was first introduced to Perl as an experimental feature in version 5.005. (By "thread model", we mean threads that share data resources by default, not the new ithreads of version 5.6.) In some senses, this thread model is still an experimental feature even in 5.6, because Perl is a rich language and multithreading can make a muddle of even the simplest language. There are still various nooks and crannies of Perl semantics that don't interact very well with the notion of everything being shared. The new ithreads model is an attempt to bypass these problems, and at some future point, the current thread model may be subsumed under the ithread model (when we get an interface to ithreads that says "share everything you can by default"). But despite its warts, the current "experimental" thread model continues to be useful in many real-world situations where the only alternative to being a guinea pig is even less desirable. Reasonably robust applications can be written in threaded Perl, but you have to be very careful. You should at least consider using fork instead, if you can think of a way to solve your problem with pipes instead of shared data structures.

But some algorithms are easier to express if multiple tasks have easy and efficient access to the same pool of data.[2] This makes for code that can be smaller and simpler. And because the kernel does not have to copy page tables for data (even if doing copy-on-write) at thread creation time, it should be faster to start a task this way. Likewise, context switches can be faster if the kernel doesn't need to swap page tables. (In fact, for user-level threads, the kernel doesn't get involved at all--though of course user-level threads have issues that kernel threads don't.)

[2] The System V shared memory model discussed in the last chapter does not exactly qualify as "easy and efficient".

That's the good news. Now for some more disclaimers. We already mentioned that threading is somewhat experimental in Perl, but even if it weren't, programming with threads is treacherous. The ability of one execution stream to poke holes willy-nilly into the data space of another exposes more opportunity for disaster than you can possibly imagine. You might say to yourself, "That's easy to fix, I'll just put locks on any shared data." Okay, locking of shared data is indispensable, but getting the locking protocols correct is notoriously difficult, with errors producing deadlock or nondeterministic results. If you have timing problems in your program, using threads will not only exacerbate them, but it will make them harder to locate.

Not only are you responsible for keeping your own shared data straight, but you are also required to keep straight the data of all the Perl modules and C libraries you call into. Your Perl code can be 100% threadsafe, but if you call into a nonthreadsafe module or C subroutine without providing your own semaphore protection, you're toast. You should assume any module is not threadsafe until proven otherwise. That even includes some of the standard modules. Maybe even most of them.

Have we discouraged you yet? No? Then we'll point out that you're pretty much at the mercy of your operating system's threading library when it comes to scheduling and preemption policies. Some thread libraries only do thread switching on blocking system calls. Some libraries block the whole process if a single thread makes a blocking system call. Some libraries only switch threads on quantum expiration (either thread or process). Some libraries only switch threads explicitly.

Oh, and by the way, if your process receives a signal, which thread the signal is delivered to is completely system dependent.

To do thread programming in Perl, you must build a special version of Perl following the directions given in the README.threads file in the Perl source directory. This special Perl is pretty much guaranteed to run a bit slower than your standard Perl executable.

Do not assume that just because you know how threads are programmed in other models (POSIX, DEC, Microsoft, etc.) you know how threads work with Perl. As with other things in Perl, Perl is Perl, not C++ or Java or whatnot. For example, there are no real-time thread priorities (and no way to work around their absence). There are also no mutexes. Just use regular locking or perhaps the Thread::Semaphore module or the cond_wait facilities.

Still not discouraged? Good, because threads are really cool. You're scheduled to have some fun.

17.2.1. The Thread Module

The current interface for Perl threads is defined by the Thread module. Additionally, one new Perl keyword was added, the lock operator. We'll talk about lock later in this chapter. Other standard thread modules build on this basic interface.

The Thread module provides these class methods:

Method	Use
`new`	Construct a new `Thread`.
`self`	Return my current `Thread` object.
`list`	Return list of `Thread` objects.

And, for Thread objects, it provides these object methods:

Method	Use
`join`	Harvest a thread (propagate errors).
`eval`	Harvest a thread (trap errors).
`equal`	Compare two threads for identity.
`tid`	Return the internal thread ID.

In addition, the Thread module provides these importable functions:

Function	Use
`yield`	Tell the scheduler to run a different thread.
`async`	Construct a `Thread` via closure.
`cond_signal`	Wake up exactly one thread that is `cond_wait()`ing on a variable.
`cond_broadcast`	Wake up all threads that may be `cond_wait()`ing on a variable.
`cond_wait`	Wait on a variable until awakened by a `cond_signal()` or `cond_broadcast()` on that variable.

17.2.1.1. Thread creation

You can spawn a thread in one of two ways, either by using the Thread->new class method or by using the async function. In either case, the returned value is a Thread object. Thread->new takes a code reference indicating a function to run and arguments to pass to that function:

use Thread;
...
$t = Thread->new( \&func, $arg1, $arg2);

Often you'll find yourself wanting to pass a closure as the first argument without supplying any additional arguments:

my $something;
$t = Thread->new( sub { say($something) } );

For this special case, the async function provides some notational relief (that is, syntactic sugar):

use Thread qw(async);
...
my $something;
$t = async {
    say($something);
};

You'll note that we explicitly import the async function. You may, of course, use the fully qualified name Thread::async instead, but then your syntactic sugar isn't so sweet. Since async takes only a closure, anything you want to pass to it must be a lexical variable in scope at the time.

17.2.1.2. Thread destruction

Once begun--and subject to the whims of your threading library--the thread will keep running on its own until its top-level function (the function you passed to the constructor) returns. If you want to terminate a thread early, just return from within that top-level function.[3]

[3] Don't call exit! That would try to take down your entire process, and possibly succeed. But the process won't actually exit until all threads exit, and some of them may refuse to exit on an exit. More on that later.

Now it's all very well for your top-level subroutine to return, but who does it return to? The thread that spawned this thread has presumably gone on to do other things and is no longer waiting at a method call for a response. The answer is simple enough: the thread waits until someone issues a method call that does wait for a return value. That method call is called join, because it conceptually joins two threads back into one:

$retval = $t->join();    # harvest thread $t

The operation of join is reminiscent of waitpid on a child process. If the thread has already shut down, the join method returns immediately with the return value of the thread's top-level subroutine. If the thread is not done, join acts as a blocking call that suspends the calling thread indefinitely. (There is no time-out facility.) When the thread eventually completes, the join returns.

Unlike waitpid, however, which can only harvest the process's own children, any thread can join any other thread within the process. That is, the joining thread need not be the main thread or the parent thread. The only restrictions are that a thread can't join itself (which would be like officiating at your own funeral), and a thread can't join a thread that has already been joined (which would be like two funeral directors fighting each other over the body). If you try to do either of those things, an exception will be raised.

The return value of join doesn't have to be a scalar value--it can also be a list:

use Thread 'async';

$t1 = async {
    my @stuff = getpwuid($>);
    return @stuff;
};

$t2 = async {
    my $motd = `cat /etc/motd`;
    return $motd;
};

@retlist = $t1->join();
$retval  = $t2->join();

print "1st kid returned @retlist\n";
print "2nd kid returned $retval\n";

In fact, the return expression of a thread is always evaluated in list context, even if join is called in a scalar context, in which case the last value of the list is returned.

17.2.1.3. Catching exceptions from join

If a thread terminates with an uncaught exception, this does not immediately kill the whole program. That would be naughty. Instead, when a join is run on that thread, the join itself raises the exception. Using join on a thread indicates a willingness to propagate any exceptions raised by that thread. If you'd rather trap the exception right then and there, use the eval method, which, like its built-in counterpart, causes the exception to be put into $@:

$retval = $t->eval();   # catch join errors
if ($@) {
    warn "thread failed: $@";
}
else {
    print "thread returned $retval\n";
}

Although there's no rule to this effect, you might want to adopt a practice of joining a thread only from within the thread that created the one you're joining. That is, you harvest a child thread only from the parent thread that spawned it. This makes it a little easier to keep track of which exceptions you might need to handle where.

17.2.1.4. The detach method

As an alternative way of shutting down threads, if you don't plan to join a thread later to get its return value, you can call the detach method on it so that Perl will clean it up for you. It can no longer be joined. It's a little bit like when a process is inherited by the init program under Unix, except that the only way to do that under Unix is for the parent process to die.

The detach method does not "background" the thread; if you try to exit the main program and a detached thread is still running, the exit will hang until the thread exits on its own. Rather, detach just spares you from cleanup. It merely tells Perl not to keep the return value and exit status of the thread after it finishes. In a sense, detach tells Perl to do an implicit join when the thread finishes and then throw away the results. That can be important: if you neither join nor detach a thread that returns some very large list, that storage will be lost until the end of the program, because Perl would have to hang onto it on the off chance (very off, in this case) that someone would want to join that thread sometime in the future.

An exception raised in a detached child thread also no longer propagates up through a join, since there will never be one. Use eval {} wisely in the top-level function, and find some other way to report errors.
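One way to follow that advice is sketched below: the detached thread traps its own exception with eval {} and reports it through a Thread::Queue (a module covered later in this chapter). The risky_work subroutine and the $reports queue are invented for illustration, and this is only one of several ways you might report errors.

```perl
use Thread qw(async);
use Thread::Queue;

my $reports = Thread::Queue->new();      # hypothetical error channel

sub risky_work { die "disk on fire\n" }  # hypothetical task that fails

my $t = async {
    # Trap the exception here, because no join will ever see it.
    eval { risky_work(); 1 }
        or $reports->enqueue("worker failed: $@");
    $reports->enqueue(undef);            # sentinel: the worker is finished
};
$t->detach();

# Drain the queue until the sentinel arrives.
my @messages;
while (defined(my $msg = $reports->dequeue)) {
    push @messages, $msg;
}
print @messages;
```

The undef sentinel tells the main thread that nothing more is coming, so it doesn't block forever on an empty queue.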

17.2.1.5. Identifying threads

Every Perl thread has a distinguishing thread identification number, which the tid object method returns:

$his_tidno = $t1->tid();

A thread can access its own thread object through the Thread->self call. Don't confuse that with the thread ID: to figure out its own thread ID, a thread does this:

$mytid = Thread->self->tid();   # $$ for threads, as it were.

To compare one thread object with another, do any of these:

Thread::equal($t1, $t2)
$t1->equal($t2)
$t1->tid() == $t2->tid()
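As a small sanity check of how a thread object relates to its ID, the following sketch has a thread report its own ID through its return value and compares that with what the parent sees. The variable names are ours, and join is used here only as a convenient way to get the value back.

```perl
use Thread;

# The child asks for its own thread object, then for that object's ID.
my $t = Thread->new(sub { return Thread->self->tid() });

my $seen_by_parent = $t->tid();    # ask the object from the outside
my $seen_by_child  = $t->join();   # what the thread said about itself

print "same thread\n" if $seen_by_parent == $seen_by_child;
```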

17.2.1.6. Listing current threads

You can get a list of current thread objects in the current process using the Thread->list class method call. The list includes both running threads and threads that are done but haven't been joined yet. You can do this from any thread.

for my $t (Thread->list()) {
    printf "$t has tid = %d\n", $t->tid();
}

17.2.1.7. Yielding the processor

The Thread module supports an importable function named yield. Its job is to cause the calling thread to surrender the processor. Unfortunately, details of what this really does are completely dependent on which flavor of thread implementation you have. Nevertheless, it's considered a nice gesture to relinquish control of the CPU occasionally:

use Thread 'yield';
yield();

You don't have to use parentheses. This is even safer, syntactically speaking, because it catches the seemingly inevitable "yeild" typo:

use strict;
use Thread 'yield';
yeild;          # Compiler wails, then bails.
yield;          # Ok.

17.2.2. Data Access

What we've gone over so far isn't really too hard, but we're about to fix that. Nothing we've done has actually exercised the parallel nature of threads. Accessing shared data changes all that.

Threaded code in Perl has the same constraints regarding data visibility as any other bit of Perl code. Globals are still accessed via global symbol tables, and lexicals are still accessed via some containing lexical scope (scratchpad).

However, the fact that multiple threads of control exist in the program throws a clinker into the works. Two threads can't be allowed to access the same global variable simultaneously, or they may tromp on each other. (The result of the tromping depends on the nature of the access.) Similarly, two threads can't be allowed to access the same lexical variable simultaneously, because lexical variables also behave like globals if they are declared outside the scope of closures being used by threads. Starting threads via subroutine references (using Thread->new) rather than via closures (using async) can help limit access to lexicals, if that's what you want. (Sometimes it isn't, though.)

Perl solves the problem for certain built-in special variables, like $! and $_ and @_ and the like, by making them thread-specific data. The bad news is that all your basic, everyday package variables are unprotected from tromping.

The good news is that you don't generally have to worry about your lexical variables at all, presuming they were declared inside the current thread, since each thread will instantiate its own lexical scope upon entry, separate from any other thread. You only have to worry about lexicals if they're shared between threads, by passing references around, for example, or by referring to lexicals from within closures running under multiple threads.

17.2.2.1. Synchronizing access with lock

When more than one agent can access the same item at the same time, collisions happen, just like at an intersection. Careful locking is your only defense.

The built-in lock function is Perl's red-light/green-light mechanism for access control. Although lock is a keyword of sorts, it's a shy one, in that the built-in function is not used if the compiler has already seen a sub lock {} definition in user code. This is for backward compatibility. CORE::lock is always the built-in, though. (In a perl not built for threading, calling lock is not an error; it's a harmless no-op, at least in recent versions.)

Just as the flock operator only blocks other instances of flock, not the actual I/O, so too the lock operator only blocks other instances of lock, not regular data access. They are, in effect, advisory locks. Just like traffic lights.[4]

[4] Some railroad crossing signals are mandatory (the ones with gates), and some folks think locks should be mandatory too. But just picture a world in which every intersection has arms that go up and down whenever the lights change.

You can lock individual scalar variables and entire arrays and hashes as well.

lock $var;
lock @values;
lock %table;

However, using lock on an aggregate does not implicitly lock all that aggregate's scalar components:

lock @values;       # in thread 1
...
lock $values[23];   # in thread 2 -- won't block!

If you lock a reference, this automatically locks access to the referent. That is, you get one dereference for free. This is handy because objects are always hidden behind a reference, and you often want to lock objects. (And you almost never want to lock references.)

The problem with traffic lights, of course, is that they're red half the time, and then you have to wait. Likewise, lock is a blocking call--your thread will hang there until the lock is granted. There is no time-out facility. There is no unlock facility, either, because locks are dynamically scoped. They persist until their block, file, or eval has finished. When they go out of scope, they are freed automatically.

Locks are also recursive. That means that if you lock a variable in one function, and that function recurses while holding the lock, the same thread can successfully lock the same variable again. The lock is finally dropped when all frames owning the locks have exited.

Here's a simple demo of what can happen if you don't have locking. We'll force a context switch using yield to show the kind of problem that can also happen accidentally under preemptive scheduling:

use Thread qw/async yield/;
my $var = 0;
sub abump {
    if ($var == 0) {
        yield;
        $var++;
    }
}

my $t1 = new Thread \&abump;
my $t2 = new Thread \&abump;

for my $t ($t1, $t2) { $t->join }
print "var is $var\n";

That code always prints 2 (for some definition of always) because we decided to do the bump after seeing its value was 0, but before we could do so, another thread decided the same thing.

We can fix that collision by the trivial addition of a lock before we examine $var. Now this code always prints 1:

sub abump {
    lock $var;
    if ($var == 0) {
        yield;
        $var++;
    }
}

Remember that there's no explicit unlock function. To control unlocking, just add another, nested scoping level so the lock is released when that scope terminates:

sub abump {
    {
        lock $var;
        if ($var == 0) {
            yield;
            $var++;
        }
    }  # lock released here!
    # other code with unlocked $var
}

17.2.2.2. Deadlock

Deadlock is the bane of thread programmers because it's easy to do by accident and hard to avoid even when you try to. Here's a simple example of deadlock:

my $t1 = async {
    lock $a; yield; lock $b;
    $a++; $b++
};
my $t2 = async {
    lock $b; yield; lock $a;
    $b++; $a++
};

The solution here is for all parties who need a particular set of locks to grab them in the same order.
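Applied to the example above, that advice might look like the following sketch. It assumes the shared-by-default thread model described in this section, so that $a and $b are the same variables in both threads. The import line and the joins are added only so that the fragment can run on its own.

```perl
use Thread qw(async yield);

# Same two variables as before, but both threads now take the
# locks in the same order: $a first, then $b.
my $t1 = async {
    lock $a; yield; lock $b;
    $a++; $b++;
};
my $t2 = async {
    lock $a; yield; lock $b;
    $b++; $a++;
};

$_->join for $t1, $t2;
print "a is $a, b is $b\n";
```

Whichever thread gets the lock on $a first can always go on to get $b, because the other thread is still waiting for $a and holds nothing.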

It's also good to minimize the duration of time you hold locks. (At least, it's good to do so for performance reasons. But if you do it to reduce the risk of deadlock, all you're doing is making it harder to reproduce and diagnose the problem.)

17.2.2.3. Locking subroutines

You can also put a lock on a subroutine:

lock &func;

Unlike locks on data, which are advisory only, subroutine locks are mandatory. No one else but the thread with the lock may enter the subroutine.

Consider the following code, which contains race conditions involving the $done variable. (The yields are for demonstration purposes only.)

use Thread qw/async yield/;
my $done = 0;
sub frob {
    my $arg = shift;
    my $tid = Thread->self->tid;
    print "thread $tid: frob $arg\n";
    yield;
    unless ($done) {
        yield;
        $done++;
        frob($arg + 10);
    }
}

If you run it this way:

my @t;
for my $i (1..3) {
    push @t, Thread->new(\&frob, $i);
}
for (@t) { $_->join }
print "done is $done\n";

here's the output (well, sometimes--it's not deterministic):

thread 1: frob 1
thread 2: frob 2
thread 3: frob 3
thread 1: frob 11
thread 2: frob 12
thread 3: frob 13
done is 3

However, if you run it this way:

for my $i (1..3) {
    push @t, async {
        lock &frob;
        frob($i);
    };
}
for (@t) { $_->join }
print "done is $done\n";

here's the output:

thread 1: frob 1
thread 1: frob 11
thread 2: frob 2
thread 3: frob 3
done is 1

17.2.2.4. The locked attribute

Although obeying a subroutine lock is mandatory, nothing forces anyone to lock the subroutine in the first place. You could say that the placement of the lock is advisory. But some subroutines would really like to be able to require that they be locked before being called.

The locked subroutine attribute addresses this. It's faster than calling lock &sub because it's known at compile time, not just at run time. But the behavior is the same as when we locked it explicitly earlier. The syntax is as follows:

sub frob : locked {
    # as before
}

If you have a function prototype, it comes between the name and any attributes:

sub frob ($) : locked {
    # as before
}

17.2.2.5. Locking methods

Automatic locking on a subroutine is really cool, but sometimes it's overkill. When you're invoking an object method, it doesn't generally matter if multiple methods are running simultaneously as long as they're all running on behalf of different objects. So you'd really like to lock the object that the method is being called on instead. Adding a method attribute to the subroutine definition does this:

sub frob : locked method {
    # as before
}

If called as a method, the invoking object is locked, providing serial access to that object, but allowing the method to be called on other objects. If the method isn't called on an object, the attribute still tries to do the right thing: if you call a locked method as a class method (Package->new rather than $obj->new) the package's symbol table is locked. If you call a locked method as a normal subroutine, Perl will raise an exception.

17.2.2.6. Condition variables

A condition variable allows a thread to give up the processor until some criterion is satisfied. Condition variables are meant as points of coordination between threads when you need more control than a mere lock provides. On the other hand, you don't really need more overhead than the lock provides, and condition variables are designed with this in mind. You just use ordinary locks plus ordinary conditionals. If the condition fails, then you'll have to take extraordinary measures via the cond_wait function; but we optimize for success, since in a well-designed application, we shouldn't be bottlenecking on the current condition anyway.

Besides locking and testing, the basic operations on condition variables consist of either sending or receiving a "signal" event (not a real signal in the %SIG sense). Either you suspend your own execution to wait for an event to be received, or you send an event to wake up other threads waiting for the particular condition. The Thread module provides three importable functions to do this: cond_wait, cond_signal, and cond_broadcast. These are the primitive mechanisms upon which more abstract modules like Thread::Queue and Thread::Semaphore are based. It's often more convenient to use those abstractions, when possible.

The cond_wait function takes a variable already locked by the current thread, unlocks that variable, and then blocks until another thread does a cond_signal or cond_broadcast for that same locked variable.

When cond_wait returns, the variable it was waiting on has been relocked by the calling thread. If multiple threads are cond_waiting on the same variable and are awakened together, all but one reblock because they can't regain the lock on the variable. Therefore, if you're only using cond_wait for synchronization, give up the lock as soon as possible.

The cond_signal function takes a variable already locked by the current thread and unblocks one thread that's currently in a cond_wait on that variable. If more than one thread is blocked in a cond_wait on that variable, only one is unblocked, and you can't predict which one. If no threads are blocked in a cond_wait on that variable, the event is discarded.

The cond_broadcast function works like cond_signal, but unblocks all threads blocked in a cond_wait on the locked variable, not just one. (Of course, it's still the case that only one thread can have the variable locked at a time.)

The cond_wait function is intended to be a last-resort kind of thing that a thread does only if the condition it wants isn't met. The cond_signal and cond_broadcast indicate that the condition is changing. The scheme is supposed to be this: lock, then check to see whether the condition you want is met; if it is, fine, and if it isn't, cond_wait until it is fine. The emphasis should be on avoiding blocking if at all possible. (Generally a good piece of advice when dealing with threads.)
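In its simplest form, that scheme might look like the following sketch. It assumes the shared-by-default thread model described in this section, and the $ready flag and $waiter thread are names made up for the illustration.

```perl
use Thread qw(async cond_wait cond_broadcast);

my $ready = 0;                   # the condition being waited on

my $waiter = async {
    lock $ready;
    # Block only if the condition is not already true.
    cond_wait($ready) until $ready;
    return "saw ready = $ready";
};

{
    lock $ready;
    $ready = 1;                  # change the condition...
    cond_broadcast($ready);      # ...then announce the change
}                                # lock released here

print $waiter->join(), "\n";
```

It doesn't matter which thread gets the lock first. If the main thread wins, the waiter finds $ready already set and never calls cond_wait at all. If the waiter wins, cond_wait gives up the lock so that the main thread can set the flag.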

Here's an example of passing control back and forth between two threads. Don't be fooled by the fact that the actual conditions are over on the right in statement modifiers; cond_wait is never called unless the condition we're waiting for is false.

use Thread qw(async cond_wait cond_signal);
my $wait_var = 0;
async {
    lock $wait_var;
    $wait_var = 1;
    cond_signal($wait_var);   # in case the other thread is already waiting for 1
    cond_wait $wait_var  until $wait_var == 2;
    cond_signal($wait_var);
    $wait_var = 1;
    cond_wait $wait_var  until $wait_var == 2;
    $wait_var = 1;
    cond_signal($wait_var);
};

async {
    lock $wait_var;
    cond_wait $wait_var  until $wait_var == 1;
    $wait_var = 2;
    cond_signal($wait_var);
    cond_wait $wait_var  until $wait_var == 1;
    $wait_var = 2;
    cond_signal($wait_var);
    cond_wait $wait_var  until $wait_var == 1;
};

17.2.3. Other Thread Modules

Several modules are built on top of the cond_wait primitive.

17.2.3.1. Queues

The standard Thread::Queue module provides a way to pass objects between threads without worrying about locks or synchronization. This interface is much easier:

Method	Use
`new`	Construct a new `Thread::Queue`.
`enqueue`	Push one or more scalars on to the end of the queue.
`dequeue`	Shift the first scalar off the front of the queue. The `dequeue` method blocks if there are no items present.

Notice how similar a queue is to a regular pipe, except that instead of sending bytes, you get to pass around full scalars, including references and blessed objects!

Here's an example derived from the perlthrtut manpage:

use Thread qw/async/;
use Thread::Queue;

my $Q = Thread::Queue->new();
my $thr = async {
    while (defined(my $datum = $Q->dequeue)) {
       print "Pulled $datum from queue\n";
    }
};

$Q->enqueue(12);
$Q->enqueue("A", "B", "C");
$Q->enqueue($thr);
sleep 3;
$Q->enqueue(\%ENV);
$Q->enqueue(undef);

Here's what you get for output:

Pulled 12 from queue
Pulled A from queue
Pulled B from queue
Pulled C from queue
Pulled Thread=SCALAR(0x8117200) from queue
Pulled HASH(0x80dfd8c) from queue

Notice how $Q was in scope when the asynchronous thread was launched via an async closure. Threads are under the same scoping rules as anything else in Perl. The example above would not have worked had $Q been declared after the call to async.
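The same undef-as-terminator idea extends to several consumers. The following sketch uses one queue to hand out work and a second to collect results. The queue names, the pool size, and the squaring "work" are all invented for illustration.

```perl
use Thread qw(async);
use Thread::Queue;

my $jobs    = Thread::Queue->new();
my $results = Thread::Queue->new();
my $WORKERS = 3;                         # arbitrary pool size

my @pool;
for (1 .. $WORKERS) {
    push @pool, async {
        # Each worker stops when it pulls an undef off the queue.
        while (defined(my $n = $jobs->dequeue)) {
            $results->enqueue($n * $n);
        }
    };
}

$jobs->enqueue(1 .. 10);
$jobs->enqueue(undef) for 1 .. $WORKERS; # one terminator per worker
$_->join for @pool;

# Every result is already queued, so these dequeues never block.
my $sum = 0;
$sum += $results->dequeue for 1 .. 10;
print "sum of squares is $sum\n";
```

Each worker consumes exactly one undef, so you need as many terminators as workers. With fewer, some workers would block in dequeue forever and the joins would never return.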

17.2.3.2. Semaphores

Thread::Semaphore provides you with threadsafe, counting semaphore objects to implement your favorite p() and v() operations. Because most of us don't associate these operations with the Dutch words passeer ("pass") and verlaat ("leave"), the module calls these operations "down" and "up" respectively. (In some of the literature, they're called "wait" and "signal".) The following methods are supported:

Method	Use
`new`	Construct a new `Thread::Semaphore`.
`down`	Allocate one or more items.
`up`	Deallocate one or more items.

The new method creates a new semaphore and initializes its count to the specified number. If no number is specified, the semaphore's count is set to 1. (The number represents some pool of items that can "run out" if they're all allocated.)

use Thread::Semaphore;
$mutex = Thread::Semaphore->new($MAX);

The down method decreases the semaphore's count by the specified number, or by 1 if no number is given. It can be interpreted as an attempt to allocate some or all of a resource. If the decrement would take the semaphore's count below zero, this method blocks until the semaphore's count is equal to or larger than the amount you're requesting. Call it like this:

$mutex->down();

The up method increases the semaphore's count by the specified number, or by 1 if no number is given. It can be interpreted as freeing up some quantity of a previously allocated resource. This unblocks at least one thread that was blocked trying to down the semaphore, provided that the up raises the semaphore count to at least the amount the down is trying to decrement it by. Call it like this:

$mutex->up();
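Put together, the three methods might be used as in the following sketch, in which five threads compete for a pool of two items. The $slots semaphore, the pool size, and the report strings are hypothetical, and the "work" done while holding a slot is only a placeholder.

```perl
use Thread qw(async);
use Thread::Semaphore;

my $slots = Thread::Semaphore->new(2);   # hypothetical pool of two items

my @kids;
for my $id (1 .. 5) {
    push @kids, async {
        $slots->down();                  # take one item, or block until one is free
        my $report = "thread $id got a slot";   # ...use the resource here...
        $slots->up();                    # hand the item back
        return $report;
    };
}

# Join in creation order, so the reports come back in that order too.
my @reports = map { $_->join() } @kids;
print "$_\n" for @reports;
```

At most two threads are ever between the down and the up at the same time. The rest wait in down until somebody calls up.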

17.2.3.3. Other standard threading modules

Thread::Signal allows you to start up a thread that is designated to receive your process's %SIG signals. This addresses the still-vexing problem that signals are unreliable as currently implemented in Perl and their imprudent use can cause occasional core dumps.

These modules are still in development and may not produce the desired results on your system. Then again, they may. If they don't, it's because someone like you hasn't fixed them yet. Perhaps someone like you should pitch in and help.