Writing Apache Modules with Perl and C

By:	Lincoln Stein and Doug MacEachern
Published:	O'Reilly & Associates, Inc. - March 1999

Chapter 10 - C API Reference Guide, Part I / Processing Requests
The BUFF API

All the I/O functions that were described in the previous two sections took the request record as an argument. Internally, these functions make calls to a lower-level I/O API that operates on the BUFF* stored in the connection record in the client field. There is a close parallelism between the request-oriented I/O functions and the connection-oriented ones. They have almost identical names, but the prefix ap_r is replaced by ap_b, and instead of taking a request record as their argument, they take a BUFF pointer. So, for instance, instead of calling:

ap_rputs("<H1>In the Beginning</H1>", r);

you could call:

ap_bputs("<H1>In the Beginning</H1>", r->connection->client);

You will probably never have to use the BUFF API in the ordinary course of events. The only exception is if your module needs to open a pipe to another process. In this case, the ap_bspawn_child() routine returns a BUFF stream connected to the external process.

In most cases, the function prototypes for the BUFF functions are similar to the prototypes of their corresponding request-oriented calls, except that the request_rec* is replaced by a BUFF*. But be wary: in several cases the arguments are swapped so that the BUFF* comes first in the argument list rather than last.
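The layering can be pictured with a small self-contained model. This is only a sketch: the `sketch_` types and functions are hypothetical stand-ins, not the Apache structures, and the real request-level functions do more work than plain forwarding. It shows the request-level call reaching the buffer through r->connection->client, using the same argument order as ap_rputs() and ap_bputs().

```c
#include <string.h>

/* Stand-in types, NOT the real Apache structures. They keep only the
 * r->connection->client chain described in the text. */
typedef struct sketch_buff {
    char data[256];   /* captured output, always NUL-terminated */
    size_t used;      /* bytes stored so far */
} sketch_buff;

typedef struct sketch_conn { sketch_buff *client; } sketch_conn;
typedef struct sketch_request { sketch_conn *connection; } sketch_request;

/* Connection-level call, same argument order as ap_bputs(x, fb).
 * Returns the number of bytes stored, or -1 if they do not fit. */
static int sketch_bputs(const char *x, sketch_buff *fb)
{
    size_t len = strlen(x);
    if (len > sizeof(fb->data) - 1 - fb->used)
        return -1;
    memcpy(fb->data + fb->used, x, len);
    fb->used += len;
    fb->data[fb->used] = '\0';
    return (int)len;
}

/* Request-level call, same argument order as ap_rputs(x, r).
 * In this model it only forwards to the connection's buffer. */
static int sketch_rputs(const char *x, sketch_request *r)
{
    return sketch_bputs(x, r->connection->client);
}
```

With a zero-initialized sketch_buff wired into a sketch_conn and sketch_request, both calls append to the same buffer.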

The buffer functions are defined in the header file include/buff.h:

int ap_bwrite (BUFF *fb, const void *buf, int nbyte)
int ap_bputs (const char *x, BUFF *fb)
int ap_bvputs (BUFF *fb,...)
int ap_bputc (int c, BUFF *fb)
int ap_bprintf (BUFF *fb, const char *fmt,...)
long ap_send_fb (BUFF *fb, request_rec *r)
long ap_send_fb_length (BUFF *fb, request_rec *r, long length)
int ap_bflush (BUFF *fb)

These output functions are identical to their ap_r counterparts but take a BUFF* as their argument. Usually, this argument will be retrieved from the connection record as r->connection->client, assuming that r is the current request record.

Note that ap_send_fb() and ap_send_fb_length() correspond to ap_send_fd() and ap_send_fd_length() and are responsible for sending the contents of the file or process pointed to by the first argument.

int ap_bread (BUFF *fb, void *buf, int nbyte)

ap_bread() is a low-level input function that is used beneath the *_client_block() routines described in the previous section. It acts like the standard C library fread() function to read nbyte bytes from the BUFF pointed to by fb. If successful, the data is placed in buf and the byte count is returned as the function result. In case of an error, the function will return EOF (-1).
This function should never be used by a handler to read the incoming request body because it will not deal correctly with chunked data. However, it is useful when reading from a pipe created with the ap_bspawn_child() function.

int n = ap_bread(fb, buffer, len);

int ap_bgets (char *buf, int n, BUFF *fb)

ap_bgets() can be used like the C standard library function fgets() to read a line of data into a string. It will read data into the char* buf until an EOF occurs, a newline is encountered, a carriage return/linefeed sequence occurs, or n-1 bytes have been read. The string is always NUL-terminated.

If successful, the function returns the number of bytes read, or 0 on an EOF condition. If an error occurs, the function returns -1.

char buffer[MAX_STRING_LEN];
while(ap_bgets(buffer, sizeof(buffer), fb) > 0) {
  ...
}
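To make the stopping conditions and return values concrete, here is a minimal model that runs outside Apache. Everything named `sketch_` is a hypothetical stand-in: the stream is an in-memory string rather than a BUFF, and storing a CRLF pair as a single newline is a choice made for this sketch, not something the text specifies.

```c
#include <stddef.h>

/* Stand-in for a BUFF: an in-memory byte stream, NOT the Apache type. */
typedef struct sketch_stream {
    const char *data;   /* bytes available to read */
    size_t len;         /* total number of bytes */
    size_t pos;         /* next byte to hand out */
} sketch_stream;

/* Reads one line in the style described for ap_bgets(): stops at EOF,
 * after a newline, or once n-1 bytes are stored, and always terminates
 * buf. Returns the bytes stored, 0 at EOF, or -1 on bad arguments. */
static int sketch_bgets(char *buf, int n, sketch_stream *s)
{
    int stored = 0;
    if (buf == NULL || s == NULL || n < 1)
        return -1;
    while (stored < n - 1 && s->pos < s->len) {
        char c = s->data[s->pos++];
        if (c == '\r' && s->pos < s->len && s->data[s->pos] == '\n')
            continue;               /* drop the CR of a CRLF pair */
        buf[stored++] = c;
        if (c == '\n')
            break;                  /* end of line reached */
    }
    buf[stored] = '\0';
    return stored;
}

/* The loop shape from the text: runs until EOF (0) or an error (-1). */
static int sketch_count_lines(sketch_stream *s)
{
    char line[64];
    int count = 0;
    while (sketch_bgets(line, (int)sizeof(line), s) > 0)
        count++;
    return count;
}
```

Because the loop tests for a result greater than 0, it ends on both EOF and error without separate checks.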

The Timeout API

The timeout API allows you to set an alarm that will be triggered after the time configured by the Timeout configuration directive. You should do this before starting any series of read or write operations in order to handle any of the myriad things that can go wrong during network I/O: the client hangs or crashes, the network goes down, or the user hits the stop button before the page is completely downloaded.

There are two types of timeout. A "hard" timeout causes the transaction to be aborted immediately. The currently executing handler is exited, and Apache immediately enters the logging phase. A "soft" timeout does not abort the transaction but does mark the connection record as being in an aborted state (by setting the aborted field to true). The current handler continues to run, but all calls to client input or output routines are ignored. This allows the handler to do any additional processing or cleanup that it requires. In either case, a message will be sent to the ErrorLog, labeled with the name of the handler along these lines:

[Tue Jul 28 17:02:36 1998] [info] mod_hello timed out for 127.0.0.1

or:

[Tue Jul 28 17:02:36 1998] [info] 127.0.0.1 client stopped connection before mod_hello completed

Many content handlers will do a series of I/O, do some processing, then do some more I/O. Every time a series of read or write operations is completed, the timeout should be reset by calling ap_reset_timeout(). This sets the internal timer back to zero. When your handler has finished all I/O operations successfully, it should call ap_kill_timeout() in order to cancel the timeout for good:

ap_soft_timeout("mod_hello", r);
while(...) {
   ... do I/O ...
   ap_reset_timeout(r);
}
ap_kill_timeout(r);

The various resource pools are deallocated correctly when a timeout occurs, so you should not have to worry about memory leaks so long as you have been careful to allocate all your data structures from resource pools. Should you have non-pool resources that you need to deallocate after a timeout, you can install a cleanup handler. See "The Cleanup API" section later in this chapter for details. You may also protect critical sections of your code with ap_block_alarms() and ap_unblock_alarms() to prevent a timeout from occurring at an inconvenient time.

void ap_hard_timeout (char *name, request_rec *r)

ap_hard_timeout() starts a timeout. The first argument contains an arbitrary string used to identify the current handler when the abort message is printed to the error log. If the alarm times out, the current handler will be exited, the transaction will be aborted, and Apache will immediately enter the logging phase of the request cycle.

ap_hard_timeout("mod_hello", r);

void ap_soft_timeout (char *name, request_rec *r)

ap_soft_timeout() works in the same way as ap_hard_timeout(), except that when the timeout occurs the transaction is placed into an aborted state in which all requested I/O operations are silently ignored. This allows the current handler to continue to its normal conclusion.
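The effect of a soft timeout on a running handler can be modeled in a few lines. This is a sketch under stated assumptions, not Apache code: `soft_conn`, `sketch_puts()`, and the other names are hypothetical, the timeout is triggered by an ordinary function call rather than an alarm, and returning -1 for an ignored write is this sketch's convention.

```c
#include <string.h>

/* Stand-in connection, NOT the Apache conn_rec. */
typedef struct soft_conn {
    int aborted;       /* mirrors the "aborted" field named in the text */
    char out[128];     /* captured output, always NUL-terminated */
    size_t used;       /* bytes stored so far */
} soft_conn;

/* In this model a soft timeout only marks the connection. */
static void sketch_soft_timeout_fired(soft_conn *c)
{
    c->aborted = 1;
}

/* Output routine that ignores writes once the connection is aborted.
 * Returns bytes written, or -1 if the write was ignored or too long. */
static int sketch_puts(const char *x, soft_conn *c)
{
    size_t len = strlen(x);
    if (c->aborted || len > sizeof(c->out) - 1 - c->used)
        return -1;
    memcpy(c->out + c->used, x, len);
    c->used += len;
    c->out[c->used] = '\0';
    return (int)len;
}

/* Handler that writes three chunks. The simulated timeout fires after
 * chunk number timeout_after_chunk, yet the handler still runs to its
 * end and reaches its cleanup step. Returns 1 once cleanup has run. */
static int sketch_handler(soft_conn *c, int timeout_after_chunk)
{
    int i;
    int cleaned_up = 0;
    for (i = 0; i < 3; i++) {
        sketch_puts("chunk;", c);
        if (i == timeout_after_chunk)
            sketch_soft_timeout_fired(c);
    }
    cleaned_up = 1;    /* reached even though the connection was aborted */
    return cleaned_up;
}
```

A hard timeout would differ at exactly this point: the handler would be exited at the moment the timeout fired and the cleanup line would never run.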

void ap_reset_timeout (request_rec *r)

This function resets the timeout to its initial state. You should call this function after any series of I/O operations.

void ap_kill_timeout (request_rec *r)

ap_kill_timeout() cancels the pending timeout. You should be sure to call this function before your handler exits to avoid the risk of the alarm going off during a subsequent part of the transaction.

void ap_block_alarms (void)

void ap_unblock_alarms (void)

These two functions are used to block off sections of code where you do not want an alarm to occur. After a call to ap_block_alarms(), the pending timeout is blocked until ap_unblock_alarms() is called.

ap_block_alarms();
... critical section ...
ap_unblock_alarms();
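One way to picture the deferral is a counter plus a pending flag. The following is a simplified model, not Apache's source: all names are hypothetical, the alarm is simulated by calling sketch_alarm_fired() directly, and support for nested block/unblock pairs is an assumption of the sketch.

```c
static int blocked_depth = 0;     /* how many block calls are outstanding */
static int alarm_pending = 0;     /* an alarm arrived while blocked */
static int alarms_delivered = 0;  /* alarms that actually took effect */

/* Stand-in for whatever the timeout does when it takes effect. */
static void deliver_alarm(void)
{
    alarms_delivered++;
}

/* Enter a critical section. */
static void sketch_block_alarms(void)
{
    blocked_depth++;
}

/* Simulated alarm: deferred while blocked, delivered otherwise. */
static void sketch_alarm_fired(void)
{
    if (blocked_depth > 0)
        alarm_pending = 1;
    else
        deliver_alarm();
}

/* Leave a critical section. A deferred alarm is delivered only when
 * the outermost section ends. */
static void sketch_unblock_alarms(void)
{
    if (blocked_depth > 0 && --blocked_depth == 0 && alarm_pending) {
        alarm_pending = 0;
        deliver_alarm();
    }
}
```

In this model an alarm that arrives inside the critical section is not lost. It takes effect as soon as the section is left.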

Status Code Constants

The various status codes that handlers might want to return are defined in httpd.h. In addition to the Apache-specific status codes OK, DECLINED, and DONE, there are several dozen HTTP status codes to choose from.

In addition to the constants, Apache provides some handy macros for testing the range of a status code. Among other things, these macros can be used to check the status code returned by a subrequest (as described in the next section).

int ap_is_HTTP_INFO (int status_code)

Returns true if the status code is greater than or equal to 100 and less than 200. These codes are used to flag events in the HTTP protocol that are neither error codes nor success codes.

int ap_is_HTTP_SUCCESS (int status_code)

Returns true if the status code is greater than or equal to 200 and less than 300. This range is used for HTTP success codes, such as HTTP_OK.

int ap_is_HTTP_REDIRECT (int status_code)

Returns true if the status code is greater than or equal to 300 and less than 400. This range is used for redirects of various sorts, as well as the HTTP_NOT_MODIFIED result code.

int ap_is_HTTP_ERROR (int status_code)

Returns true for any of the HTTP error codes, which occupy the range greater than or equal to 400.

int ap_is_HTTP_CLIENT_ERROR (int status_code)

Returns true if the status code is greater than or equal to 400 and less than 500, which is the range reserved for client errors such as HTTP_NOT_FOUND.

int ap_is_HTTP_SERVER_ERROR (int status_code)

Returns true if the status code is greater than or equal to 500 and less than 600, which are used for server errors such as HTTP_INTERNAL_SERVER_ERROR.
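The ranges above can be written as simple comparisons. The macros below are stand-ins that follow the ranges as this section states them, so the sketch compiles without httpd.h. They are not copied from Apache, and `sketch_status_class()` is a hypothetical helper of the kind you might use on a subrequest's status.

```c
/* Stand-in range tests mirroring the descriptions in the text. */
#define SKETCH_IS_INFO(s)         ((s) >= 100 && (s) < 200)
#define SKETCH_IS_SUCCESS(s)      ((s) >= 200 && (s) < 300)
#define SKETCH_IS_REDIRECT(s)     ((s) >= 300 && (s) < 400)
#define SKETCH_IS_ERROR(s)        ((s) >= 400)  /* as the text states it */
#define SKETCH_IS_CLIENT_ERROR(s) ((s) >= 400 && (s) < 500)
#define SKETCH_IS_SERVER_ERROR(s) ((s) >= 500 && (s) < 600)

/* Maps a status code to a short label, testing the narrow error
 * ranges before falling back to "other". */
static const char *sketch_status_class(int status)
{
    if (SKETCH_IS_INFO(status))         return "info";
    if (SKETCH_IS_SUCCESS(status))      return "success";
    if (SKETCH_IS_REDIRECT(status))     return "redirect";
    if (SKETCH_IS_CLIENT_ERROR(status)) return "client error";
    if (SKETCH_IS_SERVER_ERROR(status)) return "server error";
    return "other";
}
```

In a real module you would use the ap_is_HTTP_* macros from httpd.h instead of redefining the tests.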

Footnotes

3 Before you do roll your own, be sure to have a look at http://modperl.com/linapreq/ for a C library that provides routines for manipulating client request data via the Apache API. This library was released after this book's final manuscript submission.
