Writing Apache Modules with Perl and C

By:	Lincoln Stein and Doug MacEachern
Published:	O'Reilly & Associates, Inc. - March 1999

Chapter 8 - Customizing the Apache Configuration Process / The Apache Configuration Directive API
Specifying Configuration Directive Syntax

Most configuration-processing callbacks declare function prototypes that describe how they are intended to be called. Although the current implementation of Perl does not check callbacks' prototypes at runtime, the prototypes are still very useful: command_table() can use them to choose the correct syntax for a directive. If no args_how key is present in the definition of the directive, command_table() will pull in the .pm file containing the callback definitions and attempt to generate the args_how field automatically, using the Perl prototype() built-in function. By specifying the correct prototype, you can forget about args_how entirely and let command_table() choose the correct directive parsing method for you.

If both an args_how and a function prototype are provided, command_table() will use the value of args_how in case of a disagreement. If neither an args_how nor a function prototype is present, command_table() will choose a value of TAKE123, which is a relatively permissive parsing rule.
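
The lookup and precedence rules just described can be sketched in plain Perl. This is only an illustration, not the actual Apache::ExtUtils code: guess_args_how() and %ARGS_HOW_FOR are hypothetical names, the table is copied from the list in this section, and two points are assumptions, namely that ($$$) resolves to TAKE1 (FLAG shares that prototype, so it has to be requested explicitly) and that an unrecognized prototype falls back to TAKE123.

```perl
use strict;
use warnings;

# Prototype => parsing method, as listed in this section.
# FLAG is absent because it shares ($$$) with TAKE1 (assumption: TAKE1 wins).
my %ARGS_HOW_FOR = (
    '$$'     => 'NO_ARGS',
    '$$$'    => 'TAKE1',
    '$$$$'   => 'TAKE2',
    '$$$$$'  => 'TAKE3',
    '$$$;$'  => 'TAKE12',
    '$$$$;$' => 'TAKE23',
    '$$$;$$' => 'TAKE123',
    '$$@'    => 'ITERATE',
    '$$@;@'  => 'ITERATE2',
    '$$$;*'  => 'RAW_ARGS',
);

# Hypothetical helper: an explicit args_how wins, then the prototype,
# then the permissive TAKE123 default.
sub guess_args_how {
    my ($callback, $explicit) = @_;
    return $explicit if defined $explicit;
    my $proto = prototype($callback);        # undef if the sub has no prototype
    return 'TAKE123' unless defined $proto;
    # Assumption: unknown prototypes also fall back to TAKE123.
    return exists $ARGS_HOW_FOR{$proto} ? $ARGS_HOW_FOR{$proto} : 'TAKE123';
}

sub TrafficCopLimits ($$$$)   { }
sub TrafficCopLoose           { }
sub TrafficCopRoadBlock ($$$) { }

print guess_args_how(\&TrafficCopLimits), "\n";               # TAKE2
print guess_args_how(\&TrafficCopLoose), "\n";                # TAKE123
print guess_args_how(\&TrafficCopRoadBlock, 'FLAG'), "\n";    # FLAG
print guess_args_how(\&TrafficCopRoadBlock), "\n";            # TAKE1
```

The last two lines show why a FLAG directive still needs an args_how entry in this sketch: its prototype alone is indistinguishable from TAKE1.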

Apache supports a total of 11 different directive parsing methods. This section lists their symbolic constants and the Perl prototypes to use if you wish to take advantage of configuration definition shortcuts.

NO_ARGS ($$) or no prototype at all

The directive takes no arguments. The callback will be invoked once each time the directive is encountered.

    sub TrafficCopOn ($$) {
        shift->{On}++;
    }

TAKE1 ($$$)

The directive takes a single argument. The callback will be invoked once each time the directive is encountered, and the argument of the directive will be passed to the callback as the third argument.

    sub TrafficCopActiveSergeant ($$$) {
        my($cfg, $parms, $arg) = @_;
        $cfg->{Sergeant} = $arg;
    }

TAKE2 ($$$$)

The directive takes two arguments. They are passed to the callback as the third and fourth arguments.

    sub TrafficCopLimits ($$$$) {
        my($cfg, $parms, $minspeed, $maxspeed) = @_;
        $cfg->{Min} = $minspeed;
        $cfg->{Max} = $maxspeed;
    }

TAKE3 ($$$$$)

This is like TAKE1 and TAKE2, but the directive takes three mandatory arguments.

TAKE12 ($$$;$)

In this interesting variant, the directive takes one mandatory argument and a second optional one. This can be used when the second argument has a default value that the user may want to override.

    sub TrafficCopWarningLevel ($$$;$) {
        my($cfg, $parms, $severity_level, $msg) = @_;
        $cfg->{severity} = $severity_level;
        $cfg->{msg} = $msg ||
            "You have exceeded the speed limit. Your license please?";
    }

TAKE23 ($$$$;$)

TAKE23 is just like TAKE12, except now there are two mandatory arguments and an optional third one.

TAKE123 ($$$;$$)

In the TAKE123 variant, the first argument is mandatory and the other two are optional. This is useful for providing defaults for two arguments.

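
TAKE23 and TAKE123 callbacks follow the same pattern as the TAKE12 example: test the optional arguments and fall back to defaults. Here is a minimal sketch; the TrafficCopFine and TrafficCopPatrol directives, their arguments, and their default values are all invented for illustration, and the calls at the bottom merely stand in for what Apache would do while reading httpd.conf.

```perl
use strict;
use warnings;

# Hypothetical TAKE23 directive:  TrafficCopFine violation amount [currency]
sub TrafficCopFine ($$$$;$) {
    my($cfg, $parms, $violation, $amount, $currency) = @_;
    $currency = 'USD' unless defined $currency;    # default for the optional third
    $cfg->{Fine}{$violation} = [$amount, $currency];
}

# Hypothetical TAKE123 directive:  TrafficCopPatrol district [watch [officers]]
sub TrafficCopPatrol ($$$;$$) {
    my($cfg, $parms, $district, $watch, $officers) = @_;
    $cfg->{Patrol}{$district} = {
        watch    => defined $watch    ? $watch    : 'day',
        officers => defined $officers ? $officers : 1,
    };
}

# Simulate the calls Apache would make; $parms is not used here, so pass undef.
my %cfg;
TrafficCopFine(\%cfg, undef, 'speeding', 75);
TrafficCopPatrol(\%cfg, undef, 'charlestown');
TrafficCopPatrol(\%cfg, undef, 'back_bay', 'night', 3);

print "$cfg{Patrol}{charlestown}{watch}\n";    # day
print "$cfg{Patrol}{back_bay}{officers}\n";    # 3
print "$cfg{Fine}{speeding}[1]\n";             # USD
```

Using defined rather than || for the defaults lets a user pass a legitimate false value such as 0.
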
ITERATE ($$@)

ITERATE is used when a directive can take an unlimited number of arguments. For example, the mod_autoindex IndexIgnore directive specifies a list of one or more file extensions to ignore in directory listings:

    IndexIgnore .bak .sav .hide .conf

Although the function prototype suggests that the callback's third argument will be a list, this is not the case. In fact, the callback is invoked repeatedly with a single argument, once for each argument in the list. It's done this way for interoperability with the C API. The callback should be prepared to be called once for each argument in the directive argument list, and to be called again each time the directive is repeated. For example:

    sub TrafficCopRightOfWay ($$@) {
        my($cfg, $parms, $domain) = @_;
        $cfg->{RightOfWay}{$domain}++;
    }

ITERATE2 ($$@;@)

ITERATE2 is an interesting twist on the ITERATE theme. It is used for directives that take a mandatory first argument followed by a list of arguments to be applied to the first. A familiar example is the AddType directive, in which a series of file extensions are applied to a single MIME type:

    AddType image/jpeg JPG JPEG JFIF jfif

As with ITERATE, the callback function prototype for ITERATE2 is there primarily to provide a unique signature that can be recognized by command_table(). Apache will invoke your callback once for each item in the list. Each time Apache runs your callback, it passes the routine the constant first argument (image/jpeg in the example) and the current item in the list (JPG the first time around, JPEG the second time, and so on). In the example above, the configuration processing routine will be run a total of four times.

Let's say Apache::TrafficCop needs to ticket cars parked on only the days when it is illegal, such as street sweeping day:

    TrafficCopTicket street_sweeping monday wednesday friday

The ITERATE2 callback that handles this directive is run three times. Each time it receives street_sweeping as its third argument and one of monday, wednesday, and friday as its fourth.

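
A callback for the TrafficCopTicket directive might look something like the following sketch. The Ticket key and the choice to collect the days in an array are assumptions made for illustration, and the loop at the bottom only imitates the way Apache dispatches an ITERATE2 directive.

```perl
use strict;
use warnings;
# Perl warns that nothing useful can follow @ in a prototype; the ($$@;@)
# signature exists only for command_table(), so silence that warning.
no warnings 'illegalproto';

# Called once per day, always with the same violation as the third argument.
sub TrafficCopTicket ($$@;@) {
    my($cfg, $parms, $violation, $day) = @_;
    push @{ $cfg->{Ticket}{$violation} }, $day;
}

# Imitate Apache processing:
#   TrafficCopTicket street_sweeping monday wednesday friday
my %cfg;
my($violation, @days) = qw(street_sweeping monday wednesday friday);
TrafficCopTicket(\%cfg, undef, $violation, $_) for @days;

print join(' ', @{ $cfg{Ticket}{street_sweeping} }), "\n";
# prints "monday wednesday friday"
```
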
RAW_ARGS ($$$;*)

An args_how of RAW_ARGS instructs Apache to turn off parsing altogether. Instead, it simply passes your callback function the line of text following the directive. Leading and trailing whitespace is stripped from the text, but it is not otherwise processed. Your callback can then do whatever processing it wishes to perform.

This callback receives four arguments, the third of which is a string-valued scalar containing the text following the directive. The last argument is a filehandle tied to the configuration file. This filehandle can be used to read data from the configuration file starting on the line following the configuration directive. It is most common to use a RAW_ARGS prototype when processing a "container" directive. For example, let's say our TrafficCop needs to build a table of speed limits for a given district:

    <TrafficCopSpeedLimits charlestown>
      Elm St.     20
      Payson Ave. 15
      Main St.    25
    </TrafficCopSpeedLimits>

By using the RAW_ARGS prototype, the third argument passed in will be charlestown>; it's up to the handler to strip the trailing >. Now the handler can use the tied filehandle to read the following configuration lines, until it hits the container end token, </TrafficCopSpeedLimits>. For each configuration line that is read in, leading and trailing whitespace is stripped, as is the trailing newline. The handler can then apply any parsing rules it wishes to the line of data:

    my $EndToken = "</TrafficCopSpeedLimits>";

    sub TrafficCopSpeedLimits ($$$;*) {
        my($cfg, $parms, $district, $cfg_fh) = @_;
        $district =~ s/>$//;    # strip the trailing > from "charlestown>"
        # Stop at end of file as well, so a missing end token cannot loop forever
        while (defined(my $line = <$cfg_fh>)) {
            last if $line =~ m:^$EndToken:o;
            my($road, $limit) = ($line =~ /(.*)\s+(\S+)$/);
            $cfg->{SpeedLimits}{$district}{$road} = $limit;
        }
    }

There is a trick to making configuration containers work. In order to be recognized as a valid directive, the name entry passed to command_table() must contain the leading <. This token will be stripped by Apache::ExtUtils when it maps the directive to the corresponding subroutine callback.

    my @directives = (
        { name         => '<TrafficCopSpeedLimits',
          errmsg       => 'a district speed limit container',
          args_how     => 'RAW_ARGS',
          req_override => 'OR_ALL'
        },
    );

One other trick, which is not required but can provide some more user friendliness, is to provide a handler for the container end token. In our example, the Apache configuration gears will never see the </TrafficCopSpeedLimits> token, as our RAW_ARGS handler will read in that line and stop reading when it is seen. However, in order to catch cases in which the </TrafficCopSpeedLimits> text appears without a preceding <TrafficCopSpeedLimits> opening section, we need to turn the end token into a directive that simply reports an error and exits. command_table() includes special tests for directives whose names begin with </. When it encounters a directive like this, it strips the leading </ and trailing > characters from the name and tacks _END onto the end. This allows us to declare an end token callback like this one:

    my $EndToken = "</TrafficCopSpeedLimits>";

    sub TrafficCopSpeedLimits_END () {
        die "$EndToken outside a <TrafficCopSpeedLimits> container\n";
    }

which corresponds to a directive definition like this one:

    my @directives = (
        ...
        { name         => '</TrafficCopSpeedLimits>',
          errmsg       => 'end of speed limit container',
          args_how     => 'NO_ARGS',
          req_override => 'OR_ALL',
        },
    );

Now, should the server admin misplace the container end token, the server will not start, complaining with this error message:

    Syntax error on line 89 of httpd.conf:
    </TrafficCopSpeedLimits> outside a <TrafficCopSpeedLimits> container

FLAG ($$$)

When the FLAG prototype is used, Apache will only allow the argument to be one of two values, On or Off. This string value will be converted into an integer: 1 if the flag is On, 0 if it is Off. If the configuration argument is anything other than On or Off, Apache will complain:

    Syntax error on line 90 of httpd.conf:
    TrafficCopRoadBlock must be On or Off

Here's an example:

    # Makefile.PL
    my @directives = (
        ...
        { name         => 'TrafficCopRoadBlock',
          errmsg       => 'On or Off',
          args_how     => 'FLAG',
          req_override => 'OR_ALL',
        },
    );

    # TrafficCop.pm
    sub TrafficCopRoadBlock ($$$) {
        my($cfg, $parms, $arg) = @_;
        $cfg->{RoadBlock} = $arg;
    }

On successfully processing a directive, its handler should simply return. If an error occurs while processing the directive, the routine should die() with a string describing the source of the error. There is also a third possibility. The configuration directive handler can return DECLINE_CMD, a constant that must be explicitly imported from Apache::Constants. This is used in the rare circumstance in which a module redeclares another module's directive in order to override it. The directive handler can then return DECLINE_CMD when it wishes the directive to fall through to the original module's handler.
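
The three outcomes can be seen side by side in the following sketch. It runs outside Apache, so DECLINE_CMD is defined here as a placeholder constant; a real module would import it with use Apache::Constants qw(DECLINE_CMD) instead. The Override key and the rule for when to decline are invented purely for illustration.

```perl
use strict;
use warnings;

# Placeholder so the sketch runs standalone. In a real module, write:
#   use Apache::Constants qw(DECLINE_CMD);
use constant DECLINE_CMD => 'DECLINE_CMD (placeholder value)';

sub TrafficCopLimits ($$$$) {
    my($cfg, $parms, $minspeed, $maxspeed) = @_;

    # Hypothetical policy: let the original module handle the directive
    # unless our override has been switched on.
    return DECLINE_CMD unless $cfg->{Override};

    # Error: die with a string describing the problem.
    die "minimum speed must not exceed maximum speed\n"
        if $minspeed > $maxspeed;

    # Success: store the values and simply return.
    @{$cfg}{qw(Min Max)} = ($minspeed, $maxspeed);
    return;
}

my %cfg = (Override => 1);
TrafficCopLimits(\%cfg, undef, 10, 20);
print "limits: $cfg{Min}-$cfg{Max}\n";    # limits: 10-20

# Trap the error here only to display it; Apache would report it itself.
eval { TrafficCopLimits(\%cfg, undef, 30, 20) };
print "error: $@";    # error: minimum speed must not exceed maximum speed
```

Ending the die() string with a newline keeps Perl from appending its own "at ... line ..." text to the message.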
