Chapter 11 - C API Reference Guide, Part II / Launching Subprocesses Introduction
The last topic we discuss is the API for launching subprocesses. While we don't like to encourage the creation of subprocesses because of the load they impose on a server, there are certain modules that need to do so. In fact, for certain modules, such as mod_cgi, launching subprocesses is their entire raison d'être.
Because Apache is a complex beast, calling fork() to spawn a new process within a server process is not something to be done lightly. There are a variety of issues to contend with, including, but not limited to, signal handlers, alarms, pending I/O, and listening sockets. For this reason, you should use Apache's published API to implement fork and exec, rather than trying to roll your own with the standard C functions.
In addition to discussing the subprocess API, this section covers a number of function calls that help in launching CGI scripts and setting up the environment for subprocesses.
void ap_add_cgi_vars (request_rec *r)
void ap_add_common_vars (request_rec *r)
(Declared in the header file util_script.h.) By convention, modules that need to launch subprocesses copy the contents of the current request record's subprocess_env table into the child process's environment first. This table starts out empty, but modules are free to add to it. For example, mod_env responds to the PassEnv, SetEnv, and UnsetEnv directives by setting or unsetting variables in an internal table. Then, during the request fixup phase, it copies these values into subprocess_env so that the variables are exposed to the environment by any content handler that launches a subprocess.
These two routines are called by mod_cgi to fill up the subprocess_env table with the standard CGI environment variables in preparation for launching a CGI script. You may want to use one or both yourself in order to initialize the environment to a standard state.
ap_add_cgi_vars() sets up the environment variables that are specifically called for by the CGI/1.1 protocol. This includes GATEWAY_INTERFACE, QUERY_STRING, REQUEST_METHOD, PATH_INFO, and PATH_TRANSLATED, among others.
ap_add_common_vars() adds other common CGI environment variables to subprocess_env. This includes various HTTP_ variables that hold incoming HTTP headers from the request, such as HTTP_USER_AGENT and HTTP_REFERER, as well as such useful variables as PATH, SERVER_NAME, SERVER_PORT, SERVER_ROOT, and SCRIPT_FILENAME.
char **ap_create_environment (pool *p, table *t)
(Declared in the header file util_script.h.) Among the arguments you need when execing a program with the ap_call_exec() function is an environment array. This function takes the key/value pairs contained in an Apache table and turns them into a suitable array. Usually you'll want to use the subprocess_env table for this purpose in order to be compatible with mod_cgi and mod_env.
char **env = ap_create_environment(r->pool, r->subprocess_env);
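To make the conversion concrete, here is a minimal standalone sketch of the kind of transformation described above: key/value pairs become a NULL-terminated array of "KEY=value" strings, the form expected by the exec family of calls. It does not use Apache's pool or table types. The names struct env_pair, build_env_array(), and free_env_array() are hypothetical, and malloc() stands in for pool allocation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry of an Apache table. */
struct env_pair {
    const char *key;
    const char *val;
};

/* Build a NULL-terminated array of "KEY=value" strings from n pairs.
 * Returns NULL on allocation failure. Real Apache code would allocate
 * from a pool instead of calling malloc(). */
static char **build_env_array(const struct env_pair *pairs, size_t n)
{
    char **env = malloc((n + 1) * sizeof *env);
    size_t i;

    if (env == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        /* key + '=' + value + terminating NUL */
        size_t len = strlen(pairs[i].key) + 1 + strlen(pairs[i].val) + 1;
        env[i] = malloc(len);
        if (env[i] == NULL) {
            while (i > 0)
                free(env[--i]);
            free(env);
            return NULL;
        }
        snprintf(env[i], len, "%s=%s", pairs[i].key, pairs[i].val);
    }
    env[n] = NULL;   /* exec-style arrays end with a NULL pointer */
    return env;
}

/* Release an array returned by build_env_array(). */
static void free_env_array(char **env)
{
    size_t i;

    if (env == NULL)
        return;
    for (i = 0; env[i] != NULL; i++)
        free(env[i]);
    free(env);
}
```

With pool allocation, as in the Apache call, no explicit free step is needed because the array is released when the pool is destroyed.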
int ap_can_exec (const struct stat*)
(Declared in the header file httpd.h.) This utility routine checks whether a file is executable by the current process user and/or group ID. You pass it the pointer to a stat structure, often the finfo field of the current request record. It returns a true value if the file is executable, false otherwise:
if (!ap_can_exec(&r->finfo)) {
    /* ... log nasty error message ... */
    return HTTP_FORBIDDEN;
}
int ap_bspawn_child (pool *p, int (*)(void *, child_info *), void *data,
enum kill_conditions, BUFF **pipe_in, BUFF **pipe_out, BUFF **pipe_err)
(Declared in the header file buff.h.) The ap_bspawn_child() function is a mixture of the Unix fork() and popen() calls. It can be used to open up a pipe to a child process or just to fork off a child process to execute in the background.
This function has many arguments. The first argument, p, is a pool pointer. The current request's resource pool is the usual choice. The second argument is a function pointer with the following prototype:
int child_routine (void *data, child_info *pinfo);
After forking, Apache will immediately call child_routine() with a generic data pointer (copied from the third argument to ap_bspawn_child(), which we discuss next) and a child_info pointer, a data type needed for the Win32 port. For all intents and purposes, the child_info argument is an opaque pointer that you pass to ap_call_exec(). It has no other use at present. The child routine should return a nonzero value on success or a zero value on failure.
The third argument to ap_bspawn_child() is data, a generic void pointer. Whatever you use for this argument will be passed to the child routine, and it is a simple way to pass information from the parent process to the child process. Since the child process usually requires access to the current request, it is common to pass a pointer to the current request_rec in this field.
The fourth argument is kill_conditions, an enumerated data type that affects what Apache does with the spawned child when the server is terminating or restarting. The possibilities, which are defined in alloc.h, are kill_never, to never send a signal to the child; kill_always, to send the child a SIGKILL signal; kill_after_timeout, to send the child a SIGTERM, wait 3 seconds, and then send a SIGKILL; just_wait, to wait forever for the child to complete; and kill_only_once, to send a SIGTERM and wait for the child to complete. The usual value is kill_after_timeout, which is the same scheme that Apache uses for the listening servers it spawns.
The last three arguments are pipe_in, pipe_out, and pipe_err. If they are non-NULL, ap_bspawn_child() fills them in with BUFF pointers attached to the standard input, output, and error of the spawned child process. By writing to pipe_in, the parent process will be able to send data to the standard input of the spawned process. By reading from pipe_out and pipe_err, you can retrieve data that the child has written to its standard output and error. Pass NULL for any or all of these arguments if you are not interested in talking to the child.
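For readers who want to see the underlying mechanics, the following sketch shows the plain POSIX pattern that a fork-plus-pipe call of this kind wraps: create a pipe, fork, attach the child's standard output to the pipe, and run a callback in the child. It is not Apache code and omits everything the server adds (pools, kill conditions, BUFF streams, Win32 support). The names child_fn, spawn_with_stdout(), and say() are hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Callback run in the child; returns nonzero on success. */
typedef int (*child_fn)(void *data);

/* Fork a child whose standard output is connected to a pipe.
 * In the child, child_routine(data) runs and the child then exits.
 * In the parent, *out_fd receives the read end of the pipe and the
 * child's pid is returned; -1 is returned on failure. */
static pid_t spawn_with_stdout(child_fn child_routine, void *data, int *out_fd)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                         /* child */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) < 0)
                _exit(127);
            close(fds[1]);
        }
        /* exit status 0 if the routine reported success (nonzero) */
        _exit(child_routine(data) ? 0 : 1);
    }
    close(fds[1]);                          /* parent keeps the read end */
    *out_fd = fds[0];
    return pid;
}

/* Example child routine: writes its data string to standard output. */
static int say(void *data)
{
    const char *msg = data;
    size_t len = strlen(msg);

    return write(STDOUT_FILENO, msg, len) == (ssize_t)len;
}
```

The parent reads the child's output from the returned descriptor and then reaps the child with waitpid(). The Apache call takes over that bookkeeping through the pool and the kill condition.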
int ap_spawn_child (pool *p, int (*)(void *, child_info *), void *data,
enum kill_conditions, FILE **pipe_in, FILE **pipe_out, FILE **pipe_err)
(Declared in the header file alloc.h.) This function works exactly like ap_bspawn_child() but
uses more familiar FILE streams rather than BUFF streams for the I/O connection
between the parent and the child. This function is rarely a good choice, however,
because it is not compatible with the Win32 port, whereas ap_bspawn_child() is.
void ap_error_log2stderr (server_rec *s)
Once inside a spawned child, this function will rehook the standard error file descriptor
back to the server's error log. You may want to do this after calling ap_bspawn_child()
and before calling ap_call_exec() so that any error messages produced by the subprocess
show up in the server error log:
ap_error_log2stderr(r->server);
void ap_cleanup_for_exec (void)
(Declared in the header file alloc.h.) You should call this function just before invoking
ap_call_exec(). Its main duty is to run all the cleanup handlers for all the main resource
pools and all subpools.
int ap_call_exec (request_rec *r, child_info *pinfo, char *argv0, char **env, int shellcmd)
(Declared in the header file util_script.h.) After calling ap_bspawn_child() or ap_spawn_child(), your program will most probably call ap_call_exec() to replace the current process with a new one. The name of the command to run is specified in the request record's filename field, and its command-line arguments, if any, are specified in args. If successful, the new command is run and the call never returns. If preceded by a call to ap_bspawn_child(), the new process's standard input, output, and error will be attached to the BUFF*s created by that call.
This function takes five arguments. The first, r, is the current request record. It is used to set up the argument list for the command. The second, pinfo, is the child_info pointer passed to the function specified by ap_bspawn_child().
The third, argv0, is the command name that will appear as the first item in the launched command's argv[] array. Although this argument is usually the same as the path of the command to run, this is not a necessary condition. It is sometimes useful to lie to a command about its name, particularly when dealing with oddball programs that behave differently depending on how they're invoked.
The fourth argument, env, is a pointer to an environment array. This is typically the
pointer returned by ap_create_environment(). The last argument, shellcmd, is a flag indicating
whether Apache should pass any arguments to the command. If shellcmd is true, then
Apache will not pass any arguments to the command (this is counterintuitive). If shellcmd
is false, then Apache will use the value of r->args to set up the arguments passed to the
command. The contents of r->args must be in the old-fashioned CGI argument form in
which individual arguments are separated by the + symbol and other funny characters
are escaped as %XX hex escape sequences. args may not contain the unescaped = or &
symbols. If it does, Apache will interpret it as a new-style CGI query string and refuse to
pass it to the command. We'll see a concrete example of setting up the arguments for an
external command shortly.
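The old-fashioned argument form is easy to misread, so here is a small standalone sketch of the rule as stated in the text: reject anything containing an unescaped = or &, split the rest on +, and decode %XX escapes in each piece. This is not Apache's own code. The function names unescape() and split_plus_args() are hypothetical, and the string is modified in place for brevity.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode %XX escapes in place; returns 0 on success, -1 on a bad escape. */
static int unescape(char *s)
{
    char *out = s;

    while (*s != '\0') {
        if (*s == '%') {
            char hex[3];

            if (!isxdigit((unsigned char)s[1]) ||
                !isxdigit((unsigned char)s[2]))
                return -1;
            hex[0] = s[1];
            hex[1] = s[2];
            hex[2] = '\0';
            *out++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
    return 0;
}

/* Split args (modified in place) on '+', decode each piece, and store
 * pointers to the pieces in argv[0..max-1]. Returns the number of
 * pieces, or -1 if args contains a literal '=' or '&' (a new-style
 * query string), contains a bad escape, or has more than max pieces. */
static int split_plus_args(char *args, char **argv, int max)
{
    int n = 0;
    char *p = args;

    if (strchr(args, '=') != NULL || strchr(args, '&') != NULL)
        return -1;
    while (p != NULL && *p != '\0') {
        char *plus = strchr(p, '+');

        if (plus != NULL)
            *plus = '\0';
        if (n >= max || unescape(p) < 0)
            return -1;
        argv[n++] = p;
        p = (plus != NULL) ? plus + 1 : NULL;
    }
    return n;
}
```

Given "foo+bar%20baz", the sketch yields the two arguments "foo" and "bar baz". Given "name=value&x=y", it refuses to produce any arguments.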
There are a few other precautionary steps ap_call_exec() will take. If SUEXEC is
enabled, the program will be run through the setuid wrapper. If any of the RLimitCPU,
RLimitMEM, or RLimitNPROC directives are enabled, setrlimit will be called underneath
to limit the given resource to the configured value.
Finally, for convenience, under OS/2 and Win32 systems ap_call_exec() will implement
the "shebang" Unix shell-ism. That is, if the first line of the requested file contains the
#! sequence, the remainder of the string is assumed to be the program interpreter which
will execute the script.
On Unix platforms, successful calls to ap_call_exec() will not return because the current process has been terminated and replaced by the command. On failure, ap_call_exec() will return -1 and errno will be set. On Win32 platforms, successful calls to ap_call_exec() will return the process ID of the launched process and not terminate the current code. The upcoming example shows how to deal with this.
void ap_child_terminate (request_rec *r)
If for some reason you need to terminate the current child (perhaps because an attempt to exec a new program has failed), this function causes the child server process to terminate cleanly after the current request. It does this by setting the child's MaxRequestsPerChild configuration variable to 1 and clearing the keepalive flag so that the current connection is broken after the request is serviced.
ap_child_terminate(r);
int ap_scan_script_header_err_buff (request_rec *r, BUFF *fb, char *buffer)
This function is useful when launching CGI scripts. It will scan the BUFF* stream fb for HTTP headers. Typically the BUFF* is the pipe_out pointer returned from a previous call to ap_bspawn_child(). Provided that the launched script outputs a valid header format, the headers will be added to the request record's headers_out table. The same special actions are taken on certain headers as were discussed in Chapter 9, Perl API Reference Guide, when we covered the Perl cgi_header_out() method (see "Server Response Methods" in "The Apache Request Object").
If the headers were properly formatted and parsed, the return value will be OK. Otherwise, HTTP_INTERNAL_SERVER_ERROR or some other error code will be returned. In addition, the function will log errors to the error log.
The buffer argument should be an empty character array of size MAX_STRING_LEN or longer. If an error occurs during processing, this buffer will be set to contain the portion of the incoming data that generated the error. This may be useful for logging.
char buffer[MAX_STRING_LEN];
if (ap_scan_script_header_err_buff(r, fb, buffer) != OK) {
    /* ... log nasty error message ... */
}
int ap_scan_script_header_err (request_rec *r, FILE *f, char *buffer)
This function does exactly the same as ap_scan_script_header_err_buff(), except that it
reads from a FILE* stream rather than a BUFF* stream. You would use this with the
pipe_out FILE* returned by ap_spawn_child().
int ap_scan_script_header_err_core (request_rec *r, char *buffer,
int (*getsfunc) (char *, int, void *), void *getsfunc_data)
The tongue-twisting ap_scan_script_header_err_core() function is the underlying routine
which implements ap_scan_script_header_err() and ap_scan_script_header_err_buff(). The key
component here is the function pointer, getsfunc(), which is called upon to return a line
of data in the same way that the standard fgets() function does. For example, here's how
ap_scan_script_header_err() works, using the standard fgets() function:
static int getsfunc_FILE(char *buf, int len, void *f)
{
    return fgets(buf, len, (FILE *) f) != NULL;
}
API_EXPORT(int) ap_scan_script_header_err(request_rec *r, FILE *f,
                                          char *buffer)
{
    return ap_scan_script_header_err_core(r, buffer, getsfunc_FILE, f);
}
Your module could replace getsfunc_FILE() with an implementation to read from a string or other resource.
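As a minimal sketch of that idea, the callback below reads lines from an in-memory string with fgets()-like semantics and has the same prototype as getsfunc_FILE(). The names struct string_source and getsfunc_STRING() are hypothetical. The sketch assumes the source is a NUL-terminated string that outlives the scan, and that the caller passes a pointer to the struct as the getsfunc_data argument.

```c
#include <string.h>

/* Hypothetical cursor over an in-memory, NUL-terminated string. */
struct string_source {
    const char *pos;    /* next unread character */
};

/* fgets()-like reader: copies at most len-1 characters, up to and
 * including the next newline, into buf and NUL-terminates it.
 * Returns 1 if a line (or partial line) was produced, 0 at end of
 * input or if buf cannot hold at least one character. */
static int getsfunc_STRING(char *buf, int len, void *data)
{
    struct string_source *src = data;
    const char *nl;
    size_t avail, n;

    if (len < 2 || *src->pos == '\0')
        return 0;
    nl = strchr(src->pos, '\n');
    avail = (nl != NULL) ? (size_t)(nl - src->pos) + 1 : strlen(src->pos);
    n = (avail < (size_t)(len - 1)) ? avail : (size_t)(len - 1);
    memcpy(buf, src->pos, n);
    buf[n] = '\0';
    src->pos += n;      /* advance the cursor past what was returned */
    return 1;
}
```

Inside a module, a call along the lines of ap_scan_script_header_err_core(r, buffer, getsfunc_STRING, &src) would then parse headers held in memory, following the prototype shown earlier in this section.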
Copyright © 1999 by O'Reilly & Associates, Inc.