The Module Structure (Apache: The Definitive Guide)

15.3. The Module Structure

Now we will look in detail at each entry in the module structure. We examine the entries in the order in which they are used, which is not the order in which they appear in the structure, and also show how they are used in the standard Apache modules.

15.3.1. Create Per-Server Config Structure

void *module_create_svr_config(pool *pPool, server_rec *pServer)

This function creates the per-server configuration structure for the module. It is called once for the main server and once per virtual host. It allocates and initializes the memory for the per-server configuration and returns a pointer to it. pServer points to the server_rec for the current server.

15.3.1.1. Example

From mod_env.c:

typedef struct {
    table *vars;
    char *unsetenv;
    int vars_present;
} env_server_config_rec;

void *create_env_server_config (pool *p, server_rec *dummy)
{
    env_server_config_rec *new =
      (env_server_config_rec *) palloc (p, sizeof(env_server_config_rec));
    new->vars = make_table (p, 50);
    new->unsetenv = "";
    new->vars_present = 0;
    return (void *) new;
}

All this code does is allocate and initialize a copy of env_server_config_rec, which gets filled in during configuration.

15.3.2. Create Per-Directory Config Structure

void *module_create_dir_config(pool *pPool,char *szDir)

This function is called once per module, with szDir set to NULL, when the main host's configuration is initialized, and again for each <Directory>, <Location>, or <Files> section in the Config files containing a directive from this module, with szDir set to the directory. Any per-directory directives found outside <Directory>, <Location>, or <Files> sections end up in the NULL configuration. It is also called when .htaccess files are parsed, with the name of the directory in which they reside. Because this function is used for .htaccess files, it may also be called after the initializer is called. Also, the core caches per-directory configurations arising from .htaccess files for the duration of a request, so this function is called only once per directory with an .htaccess file.

If a module does not support per-directory configuration, any directives that appear in a <Directory> section override the per-server configuration unless precautions are taken. The usual way to avoid this is to set the req_override member of the directive's command table entry appropriately.

The purpose of this function is to allocate and initialize the memory required for any per-directory configuration. It returns a pointer to the allocated memory.

15.3.2.1. Example

From mod_rewrite.c:

static void *config_perdir_create(pool *p, char *path)
{
    rewrite_perdir_conf *a;
    a = (rewrite_perdir_conf *)pcalloc(p, sizeof(rewrite_perdir_conf));

    a->state           = ENGINE_DISABLED;
    a->rewriteconds    = make_array(p, 2, sizeof(rewritecond_entry));
    a->rewriterules    = make_array(p, 2, sizeof(rewriterule_entry));
    a->directory       = pstrdup(p, path);
    a->baseurl         = NULL;
    return (void *)a;
}

This function allocates memory for a rewrite_perdir_conf structure (defined elsewhere in mod_rewrite.c) and initializes it. Since this function is called for every <Directory> section, regardless of whether it contains any rewriting directives, the initialization makes sure the engine is disabled unless specifically enabled later.

15.3.3. Per-Server Merger

void *module_merge_server(pool *pPool, void *base_conf, void *new_conf)

Once the Config files have been read, this function is called once for each virtual host, with base_conf pointing to the main server's configuration (for this module), and new_conf pointing to the virtual host's configuration. This gives you the opportunity to inherit any unset options in the virtual host from the main server or to merge the main server's entries into the virtual server, if appropriate. It returns a pointer to the new configuration structure for the virtual host (or it just returns new_conf, if appropriate).

It is possible that future changes to Apache will allow merging of hosts other than the main one, so don't rely on base_conf pointing to the main server.
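One common way to inherit unset options is to reserve a sentinel value meaning "not set" and fall back to the base configuration wherever the virtual host still holds it. The following is a minimal sketch of that idea rather than code from Apache: demo_server_conf, UNSET, and demo_merge_server() are hypothetical names, and malloc() stands in for a pool allocation such as palloc().

```c
#include <assert.h>
#include <stdlib.h>

#define UNSET (-1)   /* hypothetical sentinel meaning "not configured" */

/* Hypothetical per-server configuration for an imaginary module. */
typedef struct {
    int timeout;      /* UNSET, or a value in seconds */
    int logging;      /* UNSET, 0 (off), or 1 (on) */
} demo_server_conf;

/* Merge a virtual host's configuration with the one it inherits from.
 * A fresh structure is returned; neither input is modified.
 * malloc() stands in for a pool allocation. */
void *demo_merge_server(void *base_conf, void *new_conf)
{
    demo_server_conf *base = (demo_server_conf *)base_conf;
    demo_server_conf *add = (demo_server_conf *)new_conf;
    demo_server_conf *merged;

    assert(base != NULL && add != NULL);
    merged = malloc(sizeof(*merged));
    if (merged == NULL)
        return NULL;

    /* Values set in the virtual host win; unset ones are inherited. */
    merged->timeout = (add->timeout != UNSET) ? add->timeout : base->timeout;
    merged->logging = (add->logging != UNSET) ? add->logging : base->logging;
    return merged;
}
```

Because the result is a new structure, the same base configuration can be merged into several virtual hosts without one merge affecting another.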

15.3.3.1. Example

From mod_env.c:

void *merge_env_server_configs (pool *p, void *basev, void *addv)
{
    env_server_config_rec *base = (env_server_config_rec *)basev;
    env_server_config_rec *add = (env_server_config_rec *)addv;
    env_server_config_rec *new =
      (env_server_config_rec *)palloc (p, sizeof(env_server_config_rec));
    table *new_table;
    table_entry *elts;
    int i;
    char *uenv, *unset;

    new_table = copy_table( p, base->vars );
    elts = (table_entry *) add->vars->elts;
    for ( i = 0; i < add->vars->nelts; ++i ) {
        table_set( new_table, elts[i].key, elts[i].val ); 
    }
    unset = add->unsetenv;
    uenv = getword_conf( p, &unset );
    while ( uenv[0] != '\0' ) {
        table_unset( new_table, uenv );
        uenv = getword_conf( p, &unset );
    }
    new->vars = new_table;
    new->vars_present = base->vars_present || add->vars_present;
    return new;
}

This function creates a new configuration into which it then copies the base vars table (a table of environment variable names and values). It then runs through the individual entries of the addv vars table, setting them in the new table. It does this rather than use overlay_tables() because overlay_tables() does not deal with duplicated keys. Then the addv configuration's unsetenv (which is a space-separated list of environment variables to unset) unsets any variables specified to be unset for addv's server.

15.3.4. Per-Directory Merger

void *module_dir_merge(pool *pPool, void *base_conf, void *new_conf)

Like the per-server merger, this is called once for each virtual host (not for each directory). It is handed the per-server document root per-directory Config (that is, the one that was created with a NULL directory name).

Whenever a request is processed, this function merges all relevant <Directory> sections and then merges .htaccess files (interleaved, starting at the root and working downward), then <Files> and <Location> sections, in that order.

Unlike the per-server merger, per-directory merger is called as the server runs, possibly with different combinations of directory, location, and file configurations for each request, so it is important that it copies the configuration (in new_conf) if it is going to change it.

15.3.4.1. Example

Now the reason we chose mod_rewrite.c for the per-directory creator becomes apparent, as it is a little more interesting than most:

static void *config_perdir_merge(pool *p, void *basev, void *overridesv)
{
    rewrite_perdir_conf *a, *base, *overrides;
    a     = (rewrite_perdir_conf *)pcalloc(p, sizeof(rewrite_perdir_conf));
    base  = (rewrite_perdir_conf *)basev;
    overrides = (rewrite_perdir_conf *)overridesv;

    a->state           = overrides->state;
    a->options         = overrides->options;
    a->directory       = overrides->directory;
    a->baseurl         = overrides->baseurl;
    if (a->options & OPTION_INHERIT) {
        a->rewriteconds = append_arrays(p, overrides->rewriteconds,     
             base->rewriteconds);
        a->rewriterules = append_arrays(p, overrides->rewriterules,
             base->rewriterules);
    }
    else {
        a->rewriteconds = overrides->rewriteconds;
        a->rewriterules = overrides->rewriterules;
    }
    return (void *)a;
}

As you can see, this merges the configuration from the base conditionally, depending on whether the new configuration specified an INHERIT option or not.

15.3.5. Command Table

command_rec aCommands[]

This structure points to an array of directives that configure the module. Each entry names a directive, specifies a function that will handle the command, and specifies which AllowOverride directives must be in force for the command to be permitted. Each entry then specifies how the directive's arguments are to be parsed and supplies an error message in case of syntax errors (such as the wrong number of arguments, or a directive used where it shouldn't be).

The definition of command_rec can be found in http_config.h:

typedef struct command_struct {
  char *name;               /* Name of this command */
  char *(*func)();          /* Function invoked */
  void *cmd_data;           /* Extra data, for functions that
                             * implement multiple commands...
                             */
  int req_override;         /* What overrides need to be allowed to
                             * enable this command
                             */
  enum cmd_how args_how;    /* What the command expects as arguments */
  
  char *errmsg;             /* 'usage' message, in case of syntax errors */
} command_rec;

cmd_how is defined as follows:

enum cmd_how {
  RAW_ARGS,                     /* cmd_func parses command line itself */
  TAKE1,                        /* one argument only */
  TAKE2,                        /* two arguments only */
  ITERATE,                      /* one argument, occurring multiple times
                                 * (e.g., IndexIgnore)
                                 */
  ITERATE2,                     /* two arguments, 2nd occurs multiple times
                                 * (e.g., AddIcon)
                                 */
  FLAG,                         /* One of 'On' or 'Off' */
  NO_ARGS,                      /* No args at all, e.g. </Directory> */
  TAKE12,                       /* one or two arguments */
  TAKE3,                        /* three arguments only */
  TAKE23,                       /* two or three arguments */
  TAKE123,                      /* one, two, or three arguments */
  TAKE13                        /* one or three arguments */
};

These options determine how the function func is called when the matching directive is found in a Config file, but first we must look at one more structure, cmd_parms:

typedef struct {
    void *info;                /* Argument to command from cmd_table */
    int override;              /* Which allow-override bits are set */
    int limited;               /* Which methods are <Limit>ed */

    char *config_file;         /* Filename cmd read from */
    int config_line;           /* Line cmd read from */
    FILE *infile;              /* fd for more lines (not currently used) */

    pool *pool;                /* Pool to allocate new storage in */
    pool *temp_pool;           /* Pool for scratch memory; persists during
                                * configuration, but wiped before the first
                                * request is served...
                                */
    server_rec *server;        /* server_rec being configured for */
    char *path;                /* If configuring for a directory,
                                * pathname of that directory
                                */
    command_rec *cmd;          /* Configuration command */
} cmd_parms;

This structure is filled in and passed to the function associated with each directive. Note that cmd_parms.info is filled in with the value of command_rec.cmd_data, allowing arbitrary extra information to be passed to the function. The function is also passed its per-directory configuration structure, if there is one, shown in the following definitions as mconfig. The per-server configuration is accessed by a call similar to:

get_module_config(parms->server->module_config, &module_struct)

replacing module_struct with your own module's module structure. Extra information may also be passed, depending on the value of args_how:

RAW_ARGS

func(cmd_parms *parms, void *mconfig, char *args)

args is simply the rest of the line (that is, excluding the directive).

NO_ARGS

func(cmd_parms *parms, void *mconfig)

TAKE1

func(cmd_parms *parms, void *mconfig, char *w)

w is the single argument to the directive.

TAKE2, TAKE12

func(cmd_parms *parms, void *mconfig, char *w1, char *w2)

w1 and w2 are the two arguments to the directive. TAKE12 means the second argument is optional. If absent, w2 is NULL.

TAKE3, TAKE13, TAKE23, TAKE123

func(cmd_parms *parms, void *mconfig, char *w1, char *w2, char *w3)

w1, w2, and w3 are the three arguments to the directive. TAKE13, TAKE23, and TAKE123 mean that the directive takes one or three, two or three, and one, two, or three arguments, respectively. Missing arguments are NULL.

ITERATE

func(cmd_parms *parms, void *mconfig, char *w)

func is called repeatedly, once for each argument following the directive.

ITERATE2

func(cmd_parms *parms, void *mconfig, char *w1, char *w2)

There must be at least two arguments. func is called once for each argument, starting with the second. The first is passed to func every time.

FLAG

func(cmd_parms *parms, void *mconfig, int f)

The argument must be either On or Off. If On, then f is nonzero; if Off, f is zero.
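As a rough illustration of the ITERATE2 calling pattern described above, the following sketch splits a directive's arguments and calls a handler once for each argument after the first. It does not use Apache's own parser: dispatch_iterate2(), iterate2_func, pair_log, and record_pair() are hypothetical names, and arguments are split on spaces only, with no support for quoting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handler type: receives the fixed first argument and one
 * of the following arguments. Returns NULL on success or an error
 * message, in the style of a directive handler. */
typedef const char *(*iterate2_func)(void *mconfig, const char *w1,
                                     const char *w2);

/* Hypothetical configuration that simply records what it was given. */
typedef struct {
    char text[128];
} pair_log;

/* Call func once for each word after the first, passing the first word
 * every time. args is modified in place by strtok(), so the handler
 * must not call strtok() itself. Returns the number of calls made, or
 * -1 if fewer than two arguments were supplied or the handler failed. */
int dispatch_iterate2(iterate2_func func, void *mconfig, char *args)
{
    char *w1, *w2;
    int calls = 0;

    assert(func != NULL && args != NULL);

    w1 = strtok(args, " ");
    if (w1 == NULL)
        return -1;
    while ((w2 = strtok(NULL, " ")) != NULL) {
        if (func(mconfig, w1, w2) != NULL)
            return -1;
        ++calls;
    }
    return calls > 0 ? calls : -1;
}

/* Example handler: appends "w2=w1;" to the pair_log passed as mconfig. */
const char *record_pair(void *mconfig, const char *w1, const char *w2)
{
    pair_log *out = (pair_log *)mconfig;
    size_t used = strlen(out->text);
    size_t room = sizeof(out->text) - used;
    int n = snprintf(out->text + used, room, "%s=%s;", w2, w1);

    if (n < 0 || (size_t)n >= room)
        return "pair_log is full";
    return NULL;
}
```

Given the argument string "text/html html htm", the handler is called twice, first with html and then with htm, and text/html is passed as w1 both times.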

req_override can be any combination of the following (ORed together):

#define OR_NONE 0
#define OR_LIMIT 1
#define OR_OPTIONS 2
#define OR_FILEINFO 4
#define OR_AUTHCFG 8
#define OR_INDEXES 16
#define OR_UNSET 32
#define ACCESS_CONF 64
#define RSRC_CONF 128
#define OR_ALL (OR_LIMIT|OR_OPTIONS|OR_FILEINFO|OR_AUTHCFG|OR_INDEXES)

This field defines the circumstances under which a directive is permitted. The logical AND of this field and the current override state must be nonzero for the directive to be allowed. In configuration files, the current override state is:

RSRC_CONF|OR_OPTIONS|OR_FILEINFO|OR_INDEXES

when outside a <Directory> section, and is:

ACCESS_CONF|OR_LIMIT|OR_OPTIONS|OR_FILEINFO|OR_AUTHCFG|OR_INDEXES

when inside a <Directory> section.

In .htaccess files, the state is determined by the AllowOverride directive.
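The permission test can be sketched in a few lines. The override bits and the two configuration-file states below are copied from the text above; demo_command, directive_permitted(), and the STATE_ macros are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Override bits, as listed in the text above. */
#define OR_NONE      0
#define OR_LIMIT     1
#define OR_OPTIONS   2
#define OR_FILEINFO  4
#define OR_AUTHCFG   8
#define OR_INDEXES   16
#define OR_UNSET     32
#define ACCESS_CONF  64
#define RSRC_CONF    128

/* Override states in force in the configuration files (from the text). */
#define STATE_OUTSIDE_DIRECTORY \
    (RSRC_CONF|OR_OPTIONS|OR_FILEINFO|OR_INDEXES)
#define STATE_INSIDE_DIRECTORY \
    (ACCESS_CONF|OR_LIMIT|OR_OPTIONS|OR_FILEINFO|OR_AUTHCFG|OR_INDEXES)

/* Hypothetical, cut-down stand-in for command_rec. */
typedef struct {
    const char *name;
    int req_override;
} demo_command;

/* Nonzero if cmd may appear where override_state is in force, that is,
 * if the two bit masks have at least one bit in common. */
int directive_permitted(const demo_command *cmd, int override_state)
{
    assert(cmd != NULL);
    return (cmd->req_override & override_state) != 0;
}
```

Under this rule a directive declared with RSRC_CONF is accepted outside a <Directory> section but rejected inside one, while a directive declared with OR_FILEINFO is accepted in both places.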

15.3.5.1. Example

From mod_mime.c:

command_rec mime_cmds[] = {
{ "AddType", add_type, NULL, OR_FILEINFO, ITERATE2,
    "a mime type followed by one or more file extensions" },
{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, ITERATE2,
    "an encoding (e.g., gzip), followed by one or more file extensions" },
{ "AddLanguage", add_language, NULL, OR_FILEINFO, ITERATE2,
    "a language (e.g., fr), followed by one or more file extensions" },
{ "AddHandler", add_handler, NULL, OR_FILEINFO, ITERATE2,
    "a handler name followed by one or more file extensions" },
{ "ForceType", set_string_slot, (void*)XtOffsetOf(mime_dir_config, type),
    OR_FILEINFO, TAKE1, "a media type" },
{ "SetHandler", set_string_slot, (void*)XtOffsetOf(mime_dir_config,
    handler), OR_FILEINFO, TAKE1, "a handler name" },
{ "TypesConfig", set_types_config, NULL, RSRC_CONF, TAKE1,
    "the MIME types config file" },
{ NULL }
};

Note the use of set_string_slot(). This standard function uses the offset defined in cmd_data, using XtOffsetOf to set a char* in the per-directory configuration of the module.
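The offset technique can be shown in isolation. This is a minimal sketch, not Apache's implementation: demo_dir_config and demo_set_string_slot() are hypothetical, and the standard offsetof() macro is used here in place of XtOffsetOf.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-directory configuration with two string slots. */
typedef struct {
    const char *type;
    const char *handler;
} demo_dir_config;

/* Store arg in the string member found at the given byte offset inside
 * the structure. One function can therefore serve several directives,
 * each supplying a different offset as its extra data. */
void demo_set_string_slot(void *struct_ptr, size_t offset, const char *arg)
{
    assert(struct_ptr != NULL);
    *(const char **)((char *)struct_ptr + offset) = arg;
}
```

A call such as demo_set_string_slot(&conf, offsetof(demo_dir_config, handler), "cgi-script") fills in only the handler member and leaves type untouched.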

15.3.6. Initializer

void module_init(server_rec *pServer, pool *pPool)

This function is called after the server configuration files have been read but before any requests are handled. Like the configuration functions, it is called each time the server is reconfigured, so care must be taken to make sure it behaves correctly on the second and subsequent calls. This is the last function to be called before Apache forks the request-handling children. pServer is a pointer to the server_rec for the main host. pPool is a pool that persists until the server is reconfigured. Note that, at least in the current version of Apache:

pServer->server_hostname

may not yet be initialized. If the module is going to add to the version string with ap_add_version_component(), then this is a good place to do it.

It is possible to iterate through all the server configurations by following the next member of pServer, as in the following:

for( ; pServer ; pServer=pServer->next)
    ;

15.3.6.1. Example

From mod_mime.c:

#define MIME_HASHSIZE 27
#define hash(i) (isalpha(i) ? (tolower(i)) - 'a' : 26)

static table *hash_buckets[MIME_HASHSIZE];

void init_mime (server_rec *s, pool *p)
{
    FILE *f;
    char l[MAX_STRING_LEN];
    int x;
    char *types_confname = get_module_config (s->module_config, 
               &mime_module);

    if (!types_confname) types_confname = TYPES_CONFIG_FILE;

    types_confname = server_root_relative (p, types_confname);

    if(!(f = fopen(types_confname,"r"))) {
        fprintf(stderr,"httpd: could not open mime types file %s\n",
               types_confname);
        perror("fopen");
        exit(1);
    }

    for(x=0;x<27;x++) 
        hash_buckets[x] = make_table (p, 10);

    while(!(cfg_getline(l,MAX_STRING_LEN,f))) {
        char *ll = l, *ct;

        if(l[0] == '#') continue;
        ct = getword_conf (p, &ll);

        while(ll[0]) {
            char *ext = getword_conf (p, &ll);
            str_tolower (ext);  /* ??? */
            table_set (hash_buckets[hash(ext[0])], ext, ct);
        }
    }
    fclose(f);
}

15.3.7. Child Initialization

static void module_child_init(server_rec *pServer, pool *pPool)

An Apache server may consist of many processes (on Unix, for example) or a single process with many threads (on Win32) or, in the future, a combination of the two. module_child_init() is called once for each instance of a heavyweight process, that is, whatever level of execution corresponds to a separate address space, file handles, etc. In the case of Unix, this is once per child process, but on Win32 it is called only once in total, not once per thread. This is because threads share address space and other resources. There is not currently a corresponding per-thread call, but there may be in the future. There is a corresponding call for child exit, described later in this chapter.

15.3.7.1. Example

From mod_unique_id.c:

static void unique_id_child_init(server_rec *s, pool *p)
{
    pid_t pid;
#ifndef NO_GETTIMEOFDAY
    struct timeval tv;
#endif

    pid = getpid();
    cur_unique_id.pid = pid;

    if (cur_unique_id.pid != pid) {
        ap_log_error(APLOG_MARK, APLOG_NOERRNO|APLOG_CRIT, s,
                    "oh no! pids are greater than 32-bits!  I'm broken!");
    }

    cur_unique_id.in_addr = global_in_addr;

#ifndef NO_GETTIMEOFDAY
    if (gettimeofday(&tv, NULL) == -1) {
        cur_unique_id.counter = 0;
    }
    else {
        cur_unique_id.counter = tv.tv_usec / 10;
    }
#else
    cur_unique_id.counter = 0;
#endif

    cur_unique_id.pid = htonl(cur_unique_id.pid);
    cur_unique_id.counter = htons(cur_unique_id.counter);
}

mod_unique_id.c's purpose in life is to provide an ID for each request that is unique across all web servers everywhere (or at least at a particular site). To do this, it uses various bits of uniqueness, including the process ID of the child and the time at which it was forked, which is why it uses this hook.

15.3.8. Post Read Request

static int module_post_read_request(request_rec *pReq)

This function is called immediately after the request headers have been read, or, in the case of an internal redirect, synthesized. It is not called for subrequests. It can return OK, DECLINED, or a status code. If something other than DECLINED is returned, no further modules are called. This can be used to make decisions based purely on the header content. Currently the only standard Apache module to use this hook is the proxy module.

15.3.8.1. Example

From mod_proxy.c:

/* Detect if an absolute URI should be proxied or not. Note that we
 * have to do this during this phase because later phases are
 * "short-circuiting"... i.e., translate_names will end when the first
 * module returns OK. So for example, if the request is something like:
 *
 * GET http://othervhost/cgi-bin/printenv HTTP/1.0
 *
 * mod_alias will notice the /cgi-bin part and ScriptAlias it and
 * short-circuit the proxy... just because of the ordering in the
 * configuration file.
 */
static int proxy_detect(request_rec *r)
{
    void *sconf = r->server->module_config;
    proxy_server_conf *conf;

    conf = (proxy_server_conf *) ap_get_module_config(sconf, &proxy_module);

    if (conf->req && r->parsed_uri.scheme) {
        /* but it might be something vhosted */
        if (!(r->parsed_uri.hostname
              && !strcasecmp(r->parsed_uri.scheme, ap_http_method(r))
              && ap_matches_request_vhost(r, r->parsed_uri.hostname,
                     r->parsed_uri.port_str ? r->parsed_uri.port
                                            : ap_default_port(r)))) {
            r->proxyreq = 1;
            r->uri = r->unparsed_uri;
            r->filename = ap_pstrcat(r->pool, "proxy:", r->uri, NULL);
            r->handler = "proxy-server";
        }
    }
    /* We need special treatment for CONNECT proxying: it has no scheme part */
    else if (conf->req && r->method_number == M_CONNECT
        && r->parsed_uri.hostname
        && r->parsed_uri.port_str) {
        r->proxyreq = 1;
        r->uri = r->unparsed_uri;
        r->filename = ap_pstrcat(r->pool, "proxy:", r->uri, NULL);
        r->handler = "proxy-server";
    }
    return DECLINED;
}

This code checks for a request that includes a hostname that does not match the current virtual host (which, since it will have been chosen on the basis of the hostname in the request, means it doesn't match any virtual host), or a CONNECT method (which only proxies use). If either of these conditions is true, the handler is set to proxy-server, and the filename is set to proxy:uri so that the later phases will be handled by the proxy module.

15.3.9. Translate Name

int module_translate(request_rec *pReq)

This function's task is to translate the URL in a request into a filename. The end result of its deliberations should be placed in pReq->filename. It should return OK, DECLINED, or a status code. The first module that doesn't return DECLINED is assumed to have done the job, and no further modules are called. Since the order in which modules are called is not defined, it is a good thing if the URLs handled by the modules are mutually exclusive. If all modules return DECLINED, a configuration error has occurred. Obviously, the function is likely to use the per-directory and per-server configurations (but note that at this stage, the per-directory configuration refers to the root configuration of the current server) in order to determine whether it should handle the request, as well as the URL itself (in pReq->uri). If a status is returned, the appropriate headers for the response should also be set in pReq->headers_out.
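The "first module that doesn't return DECLINED wins" rule can be sketched as a small dispatcher. Nothing here is taken from the Apache source: demo_request, demo_hook, run_first(), the two toy translators, the DEMO_ status values, and the filenames they assign are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in status values for this sketch only. */
#define DEMO_OK        0
#define DEMO_DECLINED  (-1)

/* Hypothetical, cut-down stand-in for request_rec. */
typedef struct {
    const char *uri;
    const char *filename;
} demo_request;

typedef int (*demo_hook)(demo_request *r);

/* Call each hook in turn and stop at the first that does not decline.
 * Returns that hook's result, or DEMO_DECLINED if every hook declined. */
int run_first(demo_hook *hooks, size_t nhooks, demo_request *r)
{
    size_t i;

    assert(hooks != NULL && r != NULL);
    for (i = 0; i < nhooks; ++i) {
        int rc = hooks[i](r);
        if (rc != DEMO_DECLINED)
            return rc;
    }
    return DEMO_DECLINED;
}

/* Two toy translators that handle mutually exclusive URL prefixes. */
int translate_icons(demo_request *r)
{
    if (strncmp(r->uri, "/icons/", 7) != 0)
        return DEMO_DECLINED;
    r->filename = "/srv/icons";
    return DEMO_OK;
}

int translate_docs(demo_request *r)
{
    if (strncmp(r->uri, "/docs/", 6) != 0)
        return DEMO_DECLINED;
    r->filename = "/srv/docs";
    return DEMO_OK;
}
```

Because the two prefixes cannot both match, the result does not depend on the order in which the hooks are listed, which is the property the text recommends.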

15.3.9.1. Example

Naturally enough, this comes from mod_alias.c:

char *try_alias_list (request_rec *r, array_header *aliases, int doesc)
{
    alias_entry *entries = (alias_entry *)aliases->elts;
    int i;
    
    for (i = 0; i < aliases->nelts; ++i) {
        alias_entry *p = &entries[i];
        int l = alias_matches (r->uri, p->fake);
        if (l > 0) {
            if (p->handler) { /* Set handler and leave a note for mod_cgi */
                r->handler = pstrdup(r->pool, p->handler);
                table_set (r->notes, "alias-forced-type", p->handler);
            }
            if (doesc) {
                char *escurl;
                escurl = os_escape_path(r->pool, r->uri + l, 1);
                return pstrcat(r->pool, p->real, escurl, NULL);
            } else
                return pstrcat(r->pool, p->real, r->uri + l, NULL);
        }
    }
    return NULL;
}

int translate_alias_redir(request_rec *r)
{
    void *sconf = r->server->module_config;
    alias_server_conf *serverconf =
        (alias_server_conf *)get_module_config(sconf, &alias_module);
    char *ret;
#ifdef __EMX__
    /* Add support for OS/2 drive names */
    if ((r->uri[0] != '/' && r->uri[0] != '\0') && r->uri[1] != ':')
#else    
    if (r->uri[0] != '/' && r->uri[0] != '\0')
#endif    
        return DECLINED;
    if ((ret = try_alias_list (r, serverconf->redirects, 1)) != NULL) {
        table_set (r->headers_out, "Location", ret);
        return REDIRECT;
    }

    if ((ret = try_alias_list (r, serverconf->aliases, 0)) != NULL) {
        r->filename = ret;
        return OK;
    }

    return DECLINED;
}

First of all, this example tries to match a Redirect directive. If it does, the Location header is set in headers_out, and REDIRECT is returned. If not, it translates into a filename. Note that it may also set a handler (in fact, the only handler it can possibly set is cgi-script, which it does if the alias was created by a ScriptAlias directive). An interesting feature is that it sets a note for mod_cgi.c, namely alias-forced-type. This is used by mod_cgi.c to determine whether the CGI script is invoked via a ScriptAlias, in which case Options ExecCGI is not needed.[83] For completeness, here is the code from mod_cgi.c that makes the test:

[83]This is a backward-compatibility feature.

int is_scriptaliased (request_rec *r)
{
    char *t = table_get (r->notes, "alias-forced-type");
    return t && (!strcmp (t, "cgi-script"));
}

15.3.10. An Interjection

At this point, the filename is known as well as the URL, and Apache reconfigures itself to hand subsequent module functions the relevant per-directory configuration (actually composed of all matching directory, location, and file configurations, merged with each other via the per-directory merger, in that order).[84]

[84]In fact, some of this is done before the Translate Name phase, and some after, since the location information can be used before name translation is done, but filename information obviously cannot be. If you really want to know exactly what is going on, probe the behavior with mod_reveal.c.

15.3.11. Header Parser

static int module_header_parser(request_rec *pReq)

This routine is similar in intent to the Post Read Request phase. It can return OK, DECLINED, or a status code. If something other than DECLINED is returned, no further modules are called. The intention was to make decisions based on the headers sent by the client. However, its use has been superseded by Post Read Request (which was introduced later in the development process) and it is not currently used by any standard module. For that reason, it is not possible to illustrate it with an example.

15.3.12. Check Access

int module_check_access(request_rec *pReq)

This routine checks access, in the allow/deny sense. It can return OK, DECLINED, or a status code. All modules are called until one of them returns something other than DECLINED or OK. If all modules return DECLINED, it is considered a configuration error. At this point, the URL and the filename (if relevant) are known, as are the client's address, user agent, and so forth. All of these are available through pReq. As long as everything says DECLINED or OK, the request can proceed.
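Here the polling rule differs from that of Translate Name, because an OK does not end the search. The following sketch shows one way such a loop could look. It is an assumption-laden illustration, not Apache's dispatcher: run_all(), the toy hooks, and the DEMO_ status values are hypothetical, and returning OK when at least one hook said OK is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in status values for this sketch only. */
#define DEMO_OK         0
#define DEMO_DECLINED   (-1)
#define DEMO_FORBIDDEN  403

typedef int (*demo_access_hook)(const char *remote_ip);

/* Poll every hook. Any result other than DEMO_OK or DEMO_DECLINED ends
 * the loop and is returned at once. Otherwise the result is DEMO_OK if
 * at least one hook said OK, and DEMO_DECLINED if all of them declined
 * (which the caller would treat as a configuration error). */
int run_all(demo_access_hook *hooks, size_t nhooks, const char *remote_ip)
{
    size_t i;
    int result = DEMO_DECLINED;

    assert(hooks != NULL && remote_ip != NULL);
    for (i = 0; i < nhooks; ++i) {
        int rc = hooks[i](remote_ip);
        if (rc == DEMO_OK)
            result = DEMO_OK;
        else if (rc != DEMO_DECLINED)
            return rc;
    }
    return result;
}

/* Toy hooks: one has no opinion, one allows everyone, and one refuses
 * clients whose address starts with "10.". */
int no_opinion(const char *remote_ip)
{
    (void)remote_ip;
    return DEMO_DECLINED;
}

int allow_all(const char *remote_ip)
{
    (void)remote_ip;
    return DEMO_OK;
}

int deny_private(const char *remote_ip)
{
    return strncmp(remote_ip, "10.", 3) == 0 ? DEMO_FORBIDDEN
                                             : DEMO_DECLINED;
}
```

Note that deny_private() still gets its say even though allow_all() has already returned OK, so a single refusal is enough to stop the request.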

15.3.12.1. Example

The only example available in the standard modules is, unsurprisingly, from mod_access.c:

int find_allowdeny (request_rec *r, array_header *a, int method)
{
    allowdeny *ap = (allowdeny *)a->elts;
    int mmask = (1 << method);
    int i, gothost=0;
    const char *remotehost=NULL;

    for (i = 0; i < a->nelts; ++i) {
        if (!(mmask & ap[i].limited))
            continue;
        if (ap[i].from && !strcmp(ap[i].from, "user-agents")) {
            char * this_agent = table_get(r->headers_in, "User-Agent");
            int j;

            if (!this_agent) return 0;

            for (j = i+1; j < a->nelts; ++j) {
                if (strstr(this_agent, ap[j].from)) return 1;
            }
            return 0;
        }

        if (!strcmp (ap[i].from, "all"))
            return 1;
        if (!gothost)
        {
            remotehost = get_remote_host(r->connection, r->per_dir_config,
                                         REMOTE_HOST);
            gothost = 1;
        }
        if (remotehost != NULL && isalpha(remotehost[0]))
            if (in_domain(ap[i].from, remotehost))
                return 1;
        if (in_ip (ap[i].from, r->connection->remote_ip))
            return 1;
    }
    return 0;
}

int check_dir_access (request_rec *r)
{
    int method = r->method_number;
    access_dir_conf *a =
        (access_dir_conf *)
           get_module_config (r->per_dir_config, &access_module);
    int ret = OK;

    if (a->order[method] == ALLOW_THEN_DENY) {
        ret = FORBIDDEN;
        if (find_allowdeny (r, a->allows, method))
            ret = OK;
        if (find_allowdeny (r, a->denys, method))
            ret = FORBIDDEN;
    } else if (a->order[method] == DENY_THEN_ALLOW) {
        if (find_allowdeny (r, a->denys, method))
            ret = FORBIDDEN;
        if (find_allowdeny (r, a->allows, method))
            ret = OK;
    }
    else {
        if (find_allowdeny(r, a->allows, method) 
            && !find_allowdeny(r, a->denys, method))
            ret = OK;
        else
            ret = FORBIDDEN;
    }

    if (ret == FORBIDDEN)
        log_reason ("Client denied by server configuration", r->filename, r);

    return ret;
}

Pretty straightforward stuff. in_ip() and in_domain() check whether an IP address or domain name, respectively, matches the IP or domain of the client.

15.3.13. Check User ID

int module_check_user_id(request_rec *pReq)

This function is responsible for acquiring and checking a user ID. The user ID should be stored in pReq->connection->user. The function should return OK, DECLINED, or a status code. Of particular interest is HTTP_UNAUTHORIZED (formerly known as AUTH_REQUIRED), which should be returned if the authorization fails (either because the user agent presented no credentials, or because those presented were not correct). All modules are polled until one returns something other than DECLINED. If all decline, a configuration error is logged, and an error returned to the user agent. When HTTP_UNAUTHORIZED is returned, an appropriate header should be set to inform the user agent of the type of credentials to present when it retries. Currently the appropriate header is WWW-Authenticate (see the HTTP/1.1 specification for details). Unfortunately, Apache's modularity is not quite as good as it might be in this area, so this hook usually provides alternate ways of accessing the user/password database, rather than changing the way authorization is actually done, as evidenced by the fact that the protocol side of authorization is currently dealt with in http_protocol.c, rather than in the module. Note that this function checks the validity of the username and password, and not whether the particular user has permission to access the URL.
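For Basic authentication, the WWW-Authenticate header value consists of the scheme name followed by a quoted realm. The sketch below only builds that string. format_basic_challenge() is a hypothetical helper, not an Apache function, and it assumes a realm that needs no escaping; in the example that follows, that side of the work is left to note_basic_auth_failure().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the value of a WWW-Authenticate header for
 * the Basic scheme into buf. Returns 0 on success, or -1 if buf is too
 * small or the realm contains a double quote or backslash, which this
 * sketch makes no attempt to escape. */
int format_basic_challenge(char *buf, size_t size, const char *realm)
{
    int n;

    assert(buf != NULL && realm != NULL);
    if (strchr(realm, '"') != NULL || strchr(realm, '\\') != NULL)
        return -1;
    n = snprintf(buf, size, "Basic realm=\"%s\"", realm);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With a realm of Staff Only, the resulting header value is Basic realm="Staff Only".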

15.3.13.1. Example

An obvious user of this hook is mod_auth.c:

int authenticate_basic_user (request_rec *r)
{
    auth_config_rec *sec =
      (auth_config_rec *)get_module_config (r->per_dir_config, &auth_module);
    conn_rec *c = r->connection;
    char *sent_pw, *real_pw;
    char errstr[MAX_STRING_LEN];
    int res;

    if ((res = get_basic_auth_pw (r, &sent_pw))) return res;

    if(!sec->auth_pwfile) 
        return DECLINED;

    if (!(real_pw = get_pw(r, c->user, sec->auth_pwfile))) {
        sprintf(errstr,"user %s not found",c->user);
        log_reason (errstr, r->uri, r);
        note_basic_auth_failure (r);
        return AUTH_REQUIRED;
    }

    if(strcmp(real_pw,(char *)crypt(sent_pw,real_pw))) {
        sprintf(errstr,"user %s: password mismatch",c->user);
        log_reason (errstr, r->uri, r);
        note_basic_auth_failure (r);
        return AUTH_REQUIRED;
    }

    return OK;
}

15.3.14. Check Auth

int module_check_auth(request_rec *pReq)

This hook is called to check whether the authenticated user (found in pReq->connection->user) is permitted to access the current URL. It normally uses the per-directory configuration (remembering that this is actually the combined directory, location, and file configuration) to determine this. It must return OK, DECLINED, or a status code. Again, the usual status to return is HTTP_UNAUTHORIZED if access is denied, thus giving the user a chance to present new credentials. Modules are polled until one returns something other than DECLINED.

15.3.14.1. Example

Again, the natural example to use is from mod_auth.c:

int check_user_access (request_rec *r) {
    auth_config_rec *sec =
      (auth_config_rec *)get_module_config (r->per_dir_config, &auth_module);
    char *user = r->connection->user;
    int m = r->method_number;
    int method_restricted = 0;
    register int x;
    char *t, *w;
    table *grpstatus;
    array_header *reqs_arr = requires (r);
    require_line *reqs;

    if (!reqs_arr)
        return (OK);
    reqs = (require_line *)reqs_arr->elts;

    if(sec->auth_grpfile)
        grpstatus = groups_for_user (r->pool, user, sec->auth_grpfile);
    else
        grpstatus = NULL;

    for(x=0; x < reqs_arr->nelts; x++) {

        if (! (reqs[x].method_mask & (1 << m))) continue;

        method_restricted = 1;

        t = reqs[x].requirement;
        w = getword(r->pool, &t, ' ');
        if(!strcmp(w,"valid-user"))
            return OK;
        if(!strcmp(w,"user")) {
            while(t[0]) {
                w = getword_conf (r->pool, &t);
                if(!strcmp(user,w))
                    return OK;
            }
        }
        else if(!strcmp(w,"group")) {
            if(!grpstatus) 
                return DECLINED;        /* DBM group?  Something else? */
            
            while(t[0]) {
                w = getword_conf(r->pool, &t);
                if(table_get (grpstatus, w))
                    return OK;
            }
        }
    }

    if (!method_restricted)
        return OK;

    note_basic_auth_failure (r);

    return AUTH_REQUIRED;
}
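The test on method_mask in the loop above may look cryptic. Each require line carries one bit per request method, and the line is skipped unless the bit for the current method is set. A minimal sketch, assuming the usual method numbering from httpd.h (the enum values here are stand-ins) and using a hypothetical cut-down structure, mini_require, in place of require_line:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the method numbers defined in httpd.h. */
enum { M_GET = 0, M_PUT = 1, M_POST = 2, M_DELETE = 3 };

/* Hypothetical cut-down version of require_line. */
typedef struct {
    int method_mask;          /* one bit per method number */
    const char *requirement;  /* e.g. "user ben peter" */
} mini_require;

/* Nonzero if this require line applies to the given method number. */
int applies_to_method(const mini_require *req, int method_number)
{
    assert(req != NULL);
    return (req->method_mask & (1 << method_number)) != 0;
}
```

A require line limited to GET and POST would have a mask of (1 << M_GET) | (1 << M_POST). A PUT request skips that line, and if it skips every line, method_restricted stays 0 and access is allowed.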

15.3.15. Type Checker

int module_type_checker(request_rec *pReq)

At this stage, we have almost finished processing the request. All that is left to decide is who actually handles it. This is done in two stages: first, by converting the URL or filename into a MIME type or handler string, a language, and an encoding; and second, by calling the appropriate function for the type. This hook deals with the first part. If it generates a MIME type, it should be stored in pReq->content_type. Alternatively, if it generates a handler string, it should be stored in pReq->handler. The languages go in pReq->content_languages, and the encoding in pReq->content_encoding. Note that there is no defined way of generating a unique handler string. Furthermore, handler strings and MIME types are matched to the request handler through the same table, so the handler string should probably not be a MIME type.[85]

[85]Old hands may recall that earlier versions of Apache used "magic" MIME types to cause certain request handlers to be invoked, such as the CGI handler. Handler strings were invented to remove this kludge.

15.3.15.1. Example

One obvious place where this must happen is mod_mime.c:

int find_ct(request_rec *r)
{
    char *fn = strrchr(r->filename, '/');
    mime_dir_config *conf =
      (mime_dir_config *)get_module_config(r->per_dir_config, &mime_module);
    char *ext, *type, *orighandler = r->handler;

    if (S_ISDIR(r->finfo.st_mode)) {
        r->content_type = DIR_MAGIC_TYPE;
        return OK;
    }

    if(fn == NULL) fn = r->filename;

    /* Parse filename extensions, which can be in any order */
    while ((ext = getword(r->pool, &fn, '.')) && *ext) {
      int found = 0;

      /* Check for Content-Type */
      if ((type = table_get (conf->forced_types, ext))
          || (type = table_get (hash_buckets[hash(*ext)], ext))) {
          r->content_type = type;
          found = 1;
      }

      /* Check for Content-Language */
      if ((type = table_get (conf->language_types, ext))) {
          r->content_language = type;
          found = 1;
      }

      /* Check for Content-Encoding */
      if ((type = table_get (conf->encoding_types, ext))) {
          if (!r->content_encoding)
              r->content_encoding = type;
          else
              r->content_encoding = pstrcat(r->pool, r->content_encoding,
                                            ", ", type, NULL);
          found = 1;
      }

      /* Check for a special handler, but not for proxy request */
      if ((type = table_get (conf->handlers, ext)) && !r->proxyreq) {
          r->handler = type;
          found = 1;
      }

      /* This is to deal with cases such as foo.gif.bak, which we want
       * to not have a type. So if we find an unknown extension, we
       * zap the type/language/encoding and reset the handler.
       */

      if (!found) {
        r->content_type = NULL;
        r->content_language = NULL;
        r->content_encoding = NULL;
        r->handler = orighandler;
      }
    }

    /* Check for overrides with ForceType/SetHandler */

    if (conf->type && strcmp(conf->type, "none"))
        r->content_type = pstrdup(r->pool, conf->type);
    if (conf->handler && strcmp(conf->handler, "none"))
        r->handler = pstrdup(r->pool, conf->handler);

    if (!r->content_type) return DECLINED;

    return OK;
}

Another example can be found in mod_negotiation.c, but it is rather more complicated than is needed to illustrate the point.
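The effect of the extension loop in find_ct() is easier to see in a standalone form. The following is a simplified sketch, not mod_mime's code: the two lookup tables, mini_result, and classify() are hypothetical, languages and handlers are left out, and a second encoding replaces the first rather than being appended. It does keep the important behavior, which is that an unknown extension wipes out whatever was found so far.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *ext; const char *value; } ext_map;

/* Hypothetical tables; mod_mime builds its own from the configuration. */
static const ext_map type_map[] = {
    { "html", "text/html" }, { "gif", "image/gif" }, { NULL, NULL }
};
static const ext_map encoding_map[] = {
    { "gz", "x-gzip" }, { NULL, NULL }
};

/* Look up an extension of the given length; NULL if it is unknown. */
static const char *lookup(const ext_map *map, const char *ext, size_t len)
{
    for (; map->ext; map++)
        if (strlen(map->ext) == len && strncmp(map->ext, ext, len) == 0)
            return map->value;
    return NULL;
}

typedef struct {
    const char *content_type;
    const char *content_encoding;
} mini_result;

/* Walk the dot-separated extensions of the last path component. */
mini_result classify(const char *filename)
{
    mini_result res = { NULL, NULL };
    const char *base, *p;

    assert(filename != NULL);
    base = strrchr(filename, '/');
    base = base ? base + 1 : filename;
    p = strchr(base, '.');            /* skip the name before the first dot */
    while (p) {
        const char *ext = p + 1;
        const char *end = strchr(ext, '.');
        size_t len = end ? (size_t) (end - ext) : strlen(ext);
        const char *v;
        int found = 0;

        if (len == 0)                 /* empty extension: stop, as find_ct does */
            break;
        if ((v = lookup(type_map, ext, len)) != NULL) {
            res.content_type = v;
            found = 1;
        }
        if ((v = lookup(encoding_map, ext, len)) != NULL) {
            res.content_encoding = v;
            found = 1;
        }
        if (!found) {                 /* unknown extension: zap everything */
            res.content_type = NULL;
            res.content_encoding = NULL;
        }
        p = end;
    }
    return res;
}
```

With these tables, classify("/www/foo.html.gz") gives a type of text/html and an encoding of x-gzip, while classify("/www/foo.gif.bak") gives no type at all, because the unknown .bak resets the result.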

15.3.16. Prerun Fixups

int module_fixups(request_rec *pReq)

Nearly there! This is your last chance to do anything that might be needed before the request is finally handled. At this point, all processing that is going to be done before the request is handled has been completed, the request is going to be satisfied, and all that is left to do is anything the request handler won't do. Examples of what you might do here include setting environment variables for CGI scripts, adding headers to pReq->headers_out, or even setting something in pReq->notes to modify the behavior of another module's handler. There are many things you probably shouldn't do at this stage; most importantly, you should leave anything security-related alone, including, but certainly not limited to, the URL, the filename, and the username. Most modules won't use this hook because they do their real work elsewhere.

15.3.16.1. Example

As an example, we will set the environment variables for a shell script. Here's where it's done in mod_env.c:

int fixup_env_module(request_rec *r)
{
    table *e = r->subprocess_env;
    server_rec *s = r->server;
    env_server_config_rec *sconf = get_module_config (s->module_config,
                                                      &env_module);
    table *vars = sconf->vars;
    if ( !sconf->vars_present ) return DECLINED;
    r->subprocess_env = overlay_tables( r->pool, e, vars );
    return OK;  
}

Notice that this doesn't directly set the environment variables; that would be pointless because a subprocess's environment variables are created anew from pReq->subprocess_env. Also notice that, as is often the case in computing, considerably more effort is spent in processing the configuration for mod_env.c than is spent at the business end.

Another example can be found in mod_pics_simple.c:

static int pics_simple_fixup (request_rec *r) {
    char **stuff = (char **)get_module_config (r->per_dir_config,
                                               &pics_simple_module);
    if (!*stuff) return DECLINED;
    table_set (r->headers_out, "PICS-label", *stuff);
    return DECLINED;
}

This has such a simple configuration (just a string) that it doesn't even bother with a configuration structure.[86] All it does is set the PICS-label header with the string derived from the directory, location, and file relevant to the current request.

[86]Not a technique we particularly like, but there we are.

15.3.17. Handlers

handler_rec aModuleHandlers[];

The definition of a handler_rec can be found in http_config.h:

typedef struct {
    char *content_type;
    int (*handler)(request_rec *);
} handler_rec;

Finally, we are ready to handle the request. The core now searches through the modules' handler entries, looking for an exact match for either the handler type or the MIME type, in that order (that is, if a handler type is set, that is used; otherwise, the MIME type is used). When a match is found, the corresponding handler function is called. This will do the actual business of serving the user's request. Often you won't want to do this, because you'll have done the work of your module earlier, but this is the place to run your Java, translate to Swedish, or whatever you might want to do to serve actual content to the user. Most handlers either send some kind of content directly (in which case, they must remember to call send_http_header() before sending the content) or use one of the internal redirect methods (e.g., internal_redirect()).
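The lookup order just described (handler string first, MIME type otherwise) can be sketched as follows. This is a simplified illustration under our own assumptions, not the core's code: mini_request, mini_handler_rec, the two demonstration handlers, and invoke_handler() are hypothetical, only one handler table is searched, and the search stops at the first exact match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for Apache's constants; the real ones live in httpd.h. */
#define OK 0
#define DECLINED (-1)

/* Hypothetical, cut-down request. */
typedef struct {
    const char *handler;       /* handler string, or NULL if none was set */
    const char *content_type;  /* MIME type, or NULL */
    const char *served_by;     /* filled in by the demonstration handlers */
} mini_request;

/* Same shape as handler_rec, but for the cut-down request. */
typedef struct {
    const char *content_type;
    int (*handler)(mini_request *);
} mini_handler_rec;

int status_handler(mini_request *r) { r->served_by = "status"; return OK; }
int html_handler(mini_request *r)   { r->served_by = "html";   return OK; }

const mini_handler_rec demo_handlers[] = {
    { "server-status", status_handler },
    { "text/html", html_handler },
    { NULL, NULL }
};

/* Use the handler string if one is set, otherwise the MIME type, and
 * call the first table entry that matches it exactly. */
int invoke_handler(const mini_handler_rec *table, mini_request *r)
{
    const char *key;

    assert(table != NULL && r != NULL);
    key = r->handler ? r->handler : r->content_type;
    if (key == NULL)
        return DECLINED;
    for (; table->content_type; table++)
        if (strcmp(table->content_type, key) == 0)
            return table->handler(r);
    return DECLINED;
}
```

A request with its handler set to "server-status" reaches status_handler even if its MIME type is text/html. With no handler string set, the same request reaches html_handler.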

15.3.17.1. Example

mod_status.c only implements a handler; here's the handler's table:

handler_rec status_handlers[] =
{
{ STATUS_MAGIC_TYPE, status_handler },
{ "server-status", status_handler },
{ NULL }
};

We don't show the actual handler here, because it is big and boring. All it does is trawl through the scoreboard (which records details of the various child processes) and generate a great deal of HTML. The user invokes this handler with either a SetHandler or an AddHandler; however, since the handler makes no use of a file, SetHandler is the more natural way to do it. Notice the reference to STATUS_MAGIC_TYPE. This is a "magic" MIME type, the use of which is now deprecated, but we must retain it for backward compatibility in this particular module.

15.3.18. Logger

int module_logger(request_rec *pReq)

Now that the request has been processed and the dust has settled, you may want to log the request in some way. Here's your chance to do that. Although the core stops running the logger function as soon as a module returns something other than OK or DECLINED, that is rarely done, as there is no way to know whether another module needs to be able to log something.
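Unlike the authentication hooks, where the first definite answer ends the polling, the logger hook keeps going as long as modules return OK or DECLINED. A minimal sketch of that rule, with hypothetical names throughout (mini_request, the four stub loggers, and run_all_loggers()) and stand-in constants:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for Apache's constants; the real ones live in httpd.h. */
#define OK 0
#define DECLINED (-1)
#define HTTP_INTERNAL_SERVER_ERROR 500

/* Hypothetical request that just counts how many loggers wrote a line. */
typedef struct { int lines_logged; } mini_request;

typedef int (*logger_hook)(mini_request *);

int agent_logger(mini_request *r)    { r->lines_logged++; return OK; }
int referer_logger(mini_request *r)  { r->lines_logged++; return OK; }
int disabled_logger(mini_request *r) { (void) r; return DECLINED; }
int broken_logger(mini_request *r)   { (void) r; return HTTP_INTERNAL_SERVER_ERROR; }

/* Run every logger in turn; stop only when one returns something
 * other than OK or DECLINED. */
int run_all_loggers(logger_hook *hooks, size_t n, mini_request *r)
{
    size_t i;

    assert(hooks != NULL && r != NULL);
    for (i = 0; i < n; i++) {
        int res = hooks[i](r);
        if (res != OK && res != DECLINED)
            return res;
    }
    return OK;
}
```

If broken_logger sits between the other two, referer_logger never runs, which is exactly why a well-behaved logger sticks to OK or DECLINED.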

15.3.18.1. Example

Although mod_log_agent.c is more or less out of date since mod_log_config.c was introduced, it makes a nice, compact example:

int agent_log_transaction(request_rec *orig)
{
    agent_log_state *cls = get_module_config (orig->server->module_config,
                                              &agent_log_module);

    char str[HUGE_STRING_LEN];
    char *agent;
    request_rec *r;
    if(cls->agent_fd <0)
      return OK;

    for (r = orig; r->next; r = r->next)
        continue;
    if (*cls->fname == '\0')    /* Don't log agent */
        return DECLINED;

    agent = table_get(orig->headers_in, "User-Agent");
    if(agent != NULL) 
      {
        sprintf(str, "%s\n", agent);
        write(cls->agent_fd, str, strlen(str));
      }

    return OK;
}

This is not a good example of programming practice. With its fixed-size buffer str, it leaves a gaping security hole. It wouldn't be enough to simply split the write into two parts to avoid this problem. Because the log file is shared among all server processes, the write must be atomic or the log file could get mangled by overlapping writes. mod_log_config.c carefully avoids this problem.
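One way to avoid both problems is to format the line into a bounded buffer, truncating if necessary, and then hand the whole line to a single write() call. The sketch below is our own illustration, not what mod_log_config.c does: format_agent_line(), log_agent(), and LOG_LINE_MAX are hypothetical, and we assume the log file was opened with O_APPEND so that each write() lands at the end of the file as a unit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define LOG_LINE_MAX 256   /* hypothetical limit on one log line */

/* Format one log line into buf, truncating the agent string so that the
 * result always fits and always ends in a newline. Returns the length
 * of the line, or 0 on a formatting error. */
size_t format_agent_line(char *buf, size_t size, const char *agent)
{
    int n;
    size_t len;

    assert(buf != NULL && agent != NULL && size >= 2);
    /* Leave one byte free for the newline that is added afterward. */
    n = snprintf(buf, size - 1, "%s", agent);
    if (n < 0)
        return 0;
    len = (size_t) n;
    if (len > size - 2)      /* snprintf truncated the string */
        len = size - 2;
    buf[len] = '\n';
    buf[len + 1] = '\0';
    return len + 1;
}

/* Write the whole line with a single write() call.
 * Returns 0 on success and -1 on failure or a short write. */
int log_agent(int fd, const char *agent)
{
    char line[LOG_LINE_MAX];
    size_t len = format_agent_line(line, sizeof line, agent);

    if (len == 0)
        return -1;
    return write(fd, line, len) == (ssize_t) len ? 0 : -1;
}
```

An overlong User-Agent header is simply cut short here. Whether truncation is acceptable, or whether the line should be dropped or the buffer grown instead, is a design choice this sketch does not settle.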

15.3.19. Child Exit

void child_exit(server_rec *pServer, pool *pPool)

This function is called immediately before a particular child exits. See "Child Initialization," earlier in this chapter, for an explanation of what "child" means in this context. Typically, this function will be used to release resources that are persistent between connections, such as database or file handles.

15.3.19.1. Example

From mod_log_config.c:

static void flush_all_logs(server_rec *s, pool *p)
{
    multi_log_state *mls;
    array_header *log_list;
    config_log_state *clsarray;
    int i;

    for (; s; s = s->next) {
        mls = ap_get_module_config(s->module_config, &config_log_module);
        log_list = NULL;
        if (mls->config_logs->nelts) {
            log_list = mls->config_logs;
        }
        else if (mls->server_config_logs) {
            log_list = mls->server_config_logs;
        }
        if (log_list) {
            clsarray = (config_log_state *) log_list->elts;
            for (i = 0; i < log_list->nelts; ++i) {
                flush_log(&clsarray[i]);
            }
        }
    }
}

This routine is only used when BUFFERED_LOGS is defined. Predictably enough, it flushes all the buffered logs, which would otherwise be lost when the child exited.
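To see why the flush matters, consider a toy buffered log. The sketch below is not the BUFFERED_LOGS code from mod_log_config.c; mini_log, mini_log_line(), mini_flush(), and the deliberately tiny LOG_BUFSIZE are all hypothetical. Lines accumulate in memory and reach the file descriptor only when the buffer fills or when someone calls the flush function, as a child-exit hook would.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define LOG_BUFSIZE 64   /* hypothetical; deliberately tiny */

typedef struct {
    int fd;                  /* where the log eventually goes */
    size_t used;             /* bytes currently waiting in buf */
    char buf[LOG_BUFSIZE];
} mini_log;

/* Write out whatever is buffered. Returns 0 on success, -1 on failure. */
int mini_flush(mini_log *log)
{
    assert(log != NULL);
    if (log->used > 0) {
        ssize_t n = write(log->fd, log->buf, log->used);
        if (n != (ssize_t) log->used)
            return -1;
        log->used = 0;
    }
    return 0;
}

/* Buffer one line, flushing first if it would not fit. */
int mini_log_line(mini_log *log, const char *line)
{
    size_t len;

    assert(log != NULL && line != NULL);
    len = strlen(line);
    if (len >= LOG_BUFSIZE) {
        /* Too big to buffer: flush what we have, then write directly. */
        if (mini_flush(log) != 0)
            return -1;
        return write(log->fd, line, len) == (ssize_t) len ? 0 : -1;
    }
    if (log->used + len > LOG_BUFSIZE && mini_flush(log) != 0)
        return -1;
    memcpy(log->buf + log->used, line, len);
    log->used += len;
    return 0;
}
```

After one call to mini_log_line() nothing has been written yet. If the process exited at that point without calling mini_flush(), the line would be lost.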