Chapter 1 - Server-Side Programming with Apache
The Apache Project
This book is devoted to developing applications with the Apache web server API, so we turn our attention now to the short history of the Apache project.
The Apache project began in 1995 when a group of eight volunteers, seeing that web software was becoming increasingly commercialized, got together to create a supported open source web server. Apache began as an enhanced version of the public-domain NCSA server but steadily diverged from the original. Many new features have been added to Apache over the years: significant features include the ability for a single server to host multiple virtual web sites, a smorgasbord of authentication schemes, and the ability for the server to act as a caching proxy. In some cases, Apache is well ahead of the commercial vendors in the feature wars. For example, at the time this book was written only the Apache web server had implemented the HTTP/1.1 Digest Authentication scheme.
Internally the server has been completely redesigned to use a modular and extensible architecture, turning it into what the authors describe as a "web server toolkit." In fact, there's very little of the original NCSA httpd source code left within Apache. The main NCSA legacy is the configuration files, which remain backward-compatible with NCSA httpd.
Apache's success has been phenomenal. In less than three years, Apache has
risen from relative obscurity to the position of market leader. Netcraft, a
British market research company that monitors the growth and usage of the web,
estimates that Apache servers now run on over 50 percent of the Internet's web
sites, making it by far the most popular web server in the world. Microsoft,
its nearest rival, holds a mere 22 percent of the market.3
This is despite the fact that Apache has lacked some of the conveniences that
common wisdom holds to be essential, such as a graphical user interface for
configuration and administration.
Apache has been used as the code base for several commercial server products. The most successful of these, C2Net's Stronghold, adds support for secure communications with Secure Socket Layer (SSL) and a form-based configuration manager. There is also WebTen by Tenon Intersystems, a Macintosh PowerPC port, and the Red Hat Secure Server, an inexpensive SSL-supporting server from the makers of Red Hat Linux.
Another milestone was reached in November of 1997 when the Apache Group announced its port of Apache to the Windows NT and 95 operating systems (Win32). A fully multithreaded implementation, the Win32 port supports all the features of the Unix version and is designed with the same modular architecture as its brother. Freeware ports to OS/2 and the AmigaOS are also available.
In the summer of 1998, IBM announced its plans to join with the Apache volunteers to develop a version of Apache to use as the basis of its secure Internet commerce server system, supplanting the servers that it and Lotus Corporation had previously developed.
Why use Apache? Many web sites run Apache by accident. The server software is small, free, and well documented and can be downloaded without filling out pages of licensing agreements. The person responsible for getting his organization's web site up and running downloads and installs Apache just to get his feet wet, intending to replace Apache with a "real" server at a later date. But that date never comes. Apache does the job and does it well.
However, there are better reasons for using Apache. Like other successful open source products such as Perl, the GNU tools, and the Linux operating system, Apache has some big advantages over its commercial rivals.
- It's fast and efficient
The Apache web server core consists of 25,000 lines of highly tuned
C code. It uses many tricks to eke every last drop of performance out
of the HTTP protocol and, as a result, runs faster and consumes fewer system
resources than many commercial servers. Its modular architecture allows
you to build a server that contains just the functionality that you need
and no more.
- It's portable
Apache runs on all Unix variants, including the popular freeware Linux
operating system. It also runs on Microsoft Windows systems (95, 98, and
NT), OS/2, and even the bs2000 mainframe architecture.
- It's well supported
Apache is supported by a cast of thousands. Beyond the core Apache Group
developers, who respond to bug reports and answer technical questions
via email, Apache is supported by a community of webmasters with hundreds
of thousands of hours of aggregate experience behind them. Questions posted
to the Usenet newsgroup comp.infosystems.www.servers.unix
are usually answered within hours. If you need a higher level of support,
you can purchase Stronghold or another commercial version of Apache and
get all the benefits of the freeware product, plus trained professional help.
- It won't go away
In the software world, a vendor's size or stock market performance is
no guarantee of its staying power. Companies that look invincible one
year become losers the next. In 1988, who would have thought the Digital
Equipment whale would be gobbled up by the Compaq minnow just 10 years
later? Good community software projects don't go away. Because the source
code is available to all, someone is always there to pick up the torch
when a member of the core developer group leaves.
- It's stable and reliable
All software contains bugs. When a commercial server contains a bug
there's an irresistible institutional temptation for the vendor to cover
up the problem or offer misleading reassurances to the public. With Apache,
the entire development process is open to the public. The source code
is all there for you to review, and you can even eavesdrop on the development
process by subscribing to the developers' mailing list. As a result, bugs
don't remain hidden for long, and they are usually fixed rapidly once
uncovered. If you get really desperate, you can dig into the source code
and fix the problem yourself. (If you do so, please send the fix back
to the community!)
- It's got features to burn
Because of its modular architecture and many contributors, Apache has
more features than any other web server on the market. Some of its features
you may never use. Others, such as its URL rewriting facility,
are peerless and powerful.
- It's extensible
Apache is open and extensible. If it doesn't already have a feature
you want, you can write your own server module to implement it. In the
unlikely event that the server API doesn't support what you want to do,
you can dig into the source code for the server core itself. The entire
system is open to your inspection; there are no black boxes or precompiled
libraries for you to work around.
- It's easy to administer
Apache is configured with plain-text configuration files and controlled
with a simple command-line tool. This sounds like a deficiency when compared
to the fancy graphical user interfaces supplied with commercial servers,
but it does have some advantages. You can save old copies of the configuration
files or even commit them to a source code control system, allowing you
to keep track of all the configuration changes you've made and to return
to an older version if something breaks. You can easily copy the configuration
files from one host machine to another, effectively cloning the server.
Lastly, the ability to control the server from the command line lets you
administer the server from anywhere that you can telnet from--you don't
even need web connectivity.
This being said, Apache does provide simple web-based interfaces for viewing
the current configuration and server status. A number of people are working
on administrative GUIs, and there is already a web interface for remotely
managing web user accounts (the user_manage tool).
- It makes you part of a community
When you install an Apache server you become part of a large virtual
community of Apache webmasters, authors, and developers. You will never
feel that the software is something whose use has been grudgingly granted
to you by a corporate entity. Instead, the Apache server is owned by its
community. By using the Apache server, you automatically own a bit of
it too and are contributing, if even in only a small way, to its continued
health and development. Welcome to the club!
3 Impressive as they are, these numbers should
be taken with a grain or two of salt. Netcraft's survey techniques count only
web servers connected directly to the Internet. The number of web servers running
intranets is not represented in these counts, which might inflate or deflate
Apache's true market share.
The Apache C and Perl APIs
The Apache module API gives you access to nearly all of the server's internal processing. You can inspect what it's doing at each step of the HTTP transaction cycle and intervene at any of the steps to customize the server's behavior. You can arrange for the server to take custom actions at startup and exit time, add your own directives to its configuration files, customize the process of translating URLs into file names, create custom authentication and authorization systems, and even tap into the server's logging system. This is all done via modules--self-contained pieces of code that can either be linked directly into the server executable, or loaded on demand as a dynamic shared object (DSO).
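The idea of modules stepping in at individual phases of a transaction can be pictured with a small toy model. The sketch below is not the Apache API: every name in it (`toy_request`, `toy_module`, `run_phase`, the phase constants, and the file path) is hypothetical and chosen only for illustration. It assumes a simplified rule in which each module may supply a handler for any phase, and the first handler that does not decline settles that phase.

```c
/* Toy model of phase-based request handling.
 * NOT the real Apache API: every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_OK = 0, TOY_DECLINED = -1 };

typedef enum { PHASE_TRANSLATE, PHASE_AUTH, PHASE_LOG, PHASE_COUNT } toy_phase;

typedef struct {
    const char *uri;       /* requested URL path */
    const char *filename;  /* filled in by a translation handler */
    int log_entries;       /* incremented by a logging handler */
} toy_request;

typedef int (*toy_handler)(toy_request *r);

typedef struct {
    const char *name;
    toy_handler handlers[PHASE_COUNT];  /* NULL: module skips that phase */
} toy_module;

/* Translation handler: claims only URLs under /docs/. */
int docs_translate(toy_request *r)
{
    if (strncmp(r->uri, "/docs/", 6) != 0)
        return TOY_DECLINED;
    r->filename = "/tmp/toy-docs/index.html";  /* made-up path */
    return TOY_OK;
}

/* Logging handler: records that the request was seen. */
int counter_log(toy_request *r)
{
    r->log_entries++;
    return TOY_OK;
}

/* Each module registers handlers only for the phases it cares about. */
const toy_module docs_module    = { "docs",    { [PHASE_TRANSLATE] = docs_translate } };
const toy_module counter_module = { "counter", { [PHASE_LOG] = counter_log } };

/* Offer one phase to each module in order; the first handler
 * that does not decline decides the outcome of that phase. */
int run_phase(const toy_module *const *mods, size_t n,
              toy_phase phase, toy_request *r)
{
    assert(phase < PHASE_COUNT);
    for (size_t i = 0; i < n; i++) {
        toy_handler h = mods[i]->handlers[phase];
        if (h != NULL) {
            int rc = h(r);
            if (rc != TOY_DECLINED)
                return rc;
        }
    }
    return TOY_DECLINED;
}
```

A caller would build an array of module pointers and call `run_phase` once per phase, in order, for each request. The real server's dispatch rules are richer than this, but the shape is similar: a module is essentially a table of optional callbacks, one slot per phase.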
The Apache module API was intended for C programmers. To write a traditional compiled module, you prepare one or more C source files with a text editor, compile them into object files, and either link them into the server binary or move them into a special directory for DSOs. If the module is implemented as a DSO, you'll also need to edit the server configuration file so that the module gets loaded at the appropriate time. You'll then launch the server and begin the testing and debugging process.
This sounds like a drag, and it is. It's even more of a drag because you have to worry about details of memory management and configuration file processing that are tangential to the task at hand. A mistake in any one of these areas can crash the server.
For this reason, the Apache server C API has generally been used only for substantial modules which need high performance, tiny modules that execute very frequently, or anything that needs access to server internals. For small to medium applications, one-offs, and other quick hacks, developers have used CGI scripts, FastCGI, or some other development system.
Things changed in 1996 when Doug MacEachern introduced mod_perl, a complete Perl interpreter wrapped within an Apache module. This module makes almost the entire Apache API available to Perl programmers as objects and method calls. The parts that it doesn't export are C-specific routines that Perl programmers don't need to worry about. Anything that you can do with the C API you can do with mod_perl with less fuss and bother. You don't have to restart the server to add a new mod_perl module, and a buggy module is less likely to crash the server.
We have found that for the vast majority of applications mod_perl is all you need. For those cases when you need the raw processing power or the small memory footprint that a compiled module gives you, the C and Perl forms of the API are close enough that you can prototype the application in mod_perl first and port it to C later. You may well be surprised to find that the "prototype" is all you really need!
This book uses mod_perl to teach you the Apache API. This keeps the examples short and easy to understand, and shows you the essentials without bogging down in detail. Toward the end of the book we show you how to port Apache modules written in Perl into C to get the memory and execution efficiency of a compiled language.
Copyright © 1999 by O'Reilly & Associates, Inc.