2.3. Setting Up a Unix Server

We can point httpd at our site with the -d flag (notice the full pathname to the site.toddle directory, which will probably be different on your machine):

    % httpd -d /usr/www/APACHE3/site.toddle

Since you will be typing this a lot, it's sensible to copy it into a script called go. This can go in /usr/local/bin or in each local site. We have done the latter since it is convenient to change it slightly from time to time. Create it by typing:

    % cat > /usr/local/bin/go
    test -d logs || mkdir logs
    httpd -f `pwd`/conf/httpd$1.conf -d `pwd`
    ^d

^d is shorthand for Ctrl-D, which ends the input and gets your prompt back. This go will work on every site. It creates a logs directory if one does not exist, and it explicitly specifies paths for the ServerRoot directory (-d) and the Config file (-f). The command `pwd` finds the current directory with the Unix command pwd. The back-ticks are essential: they substitute pwd's value into the script. In other words, we will run Apache with whatever configuration is in our current directory.

To accommodate sites where we have more than one Config file, we have used ...httpd$1... where you might expect to see ...httpd... The symbol $1 copies the first argument (if any) given to the command go. Thus ./go 2 will run the Config file called httpd2.conf, and ./go by itself will run httpd.conf.

Remember that you have to be in the site directory. If you try to run this script from somewhere else, pwd's return will be nonsense, and Apache will complain that it 'could not open document config file ...'.

Make go runnable, and run it by typing the following (note that you have to be in the directory .../site.toddle when you run go):

    % chmod +x go
    % go

If you get the error message:

    go: command not found

you need to type:

    % ./go

This launches Apache in the background.
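The way go assembles its command line can be checked without starting the server. The following is a minimal sketch, not the authors' script: the function name go_command is hypothetical, and it only prints the httpd command that go would run from the current directory, assuming the conf/httpd$1.conf naming described above.

```shell
#!/bin/sh
# Hypothetical helper: print the httpd command line that the go script
# would run from the current directory, without launching Apache.
# $1 is the optional Config file suffix (for example, 2 for httpd2.conf).
go_command() {
    site=$(pwd)
    echo "httpd -f $site/conf/httpd${1:-}.conf -d $site"
}

# Dry run for the default Config file and for httpd2.conf.
go_command
go_command 2
```

Running it from a different directory shows at once why go has to be started from the site directory: both paths follow wherever pwd points.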
Check that it's running by typing something like this (arguments to ps vary from Unix to Unix): % ps -aux This Unix utility lists all the processes running, among which you should find several httpds.[14]
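Picking the httpd lines out of a long process listing can be left to awk. This is a sketch under assumptions: the function names are hypothetical, and it assumes a BSD-style column layout in which the user is field 1, the PID is field 2, and the command starts at field 11. ps output differs between Unix versions, so the field numbers may need adjusting.

```shell
#!/bin/sh
# Hypothetical helpers that read "ps -aux" style output on standard input.
# Assumed layout: USER is field 1, PID is field 2, COMMAND starts at field 11.
# If your ps shows a full path such as /usr/sbin/httpd, adjust the match.

# Count the httpd processes in the listing.
count_httpd() {
    awk '$11 == "httpd" { n++ } END { print n + 0 }'
}

# Print the owner and PID of each httpd process.
list_httpd() {
    awk '$11 == "httpd" { print $1, $2 }'
}

# Typical use on a live system (arguments to ps vary from Unix to Unix):
#   ps -aux | count_httpd
#   ps -aux | list_httpd
```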
Sooner or later, you have finished testing and want to stop Apache. To do this, you have to get the process identity (PID) of the program httpd using ps -aux:

    USER     PID %CPU %MEM   VSZ  RSS TT   STAT STARTED    TIME COMMAND
    root     701  0.0  0.8   396  240 v0   R+   2:49PM  0:00.00 ps -aux
    root       1  0.0  0.9   420  260 ??   Is   8:13AM  0:00.02 /sbin/init --
    root       2  0.0  0.0     0    0 ??   DL   8:13AM  0:00.04 (pagedaemon)
    root       3  0.0  0.0     0    0 ??   DL   8:13AM  0:00.00 (vmdaemon)
    root       4  0.0  0.0     0    0 ??   DL   8:13AM  0:02.24 (syncer)
    root      35  0.0  0.3   204   84 ??   Is   8:13AM  0:00.00 adjkerntz -i
    root      98  0.0  1.8   820  524 ??   Is   7:13AM  0:00.43 syslogd
    daemon   107  0.0  1.3   820  384 ??   Is   7:13AM  0:00.00 /usr/sbin/portma
    root     139  0.0  2.1   888  604 ??   Is   7:13AM  0:00.07 inetd
    root     142  0.0  2.0   980  592 ??   Ss   7:13AM  0:00.27 cron
    root     146  0.0  3.2  1304  936 ??   Is   7:13AM  0:00.25 sendmail: accept
    root     209  0.0  1.0   500  296 con- I    7:13AM  0:00.02 /bin/sh /usr/loc
    root     238  0.0  5.8 10996 1676 con- I    7:13AM  0:00.09 /usr/local/libex
    root     239  0.0  1.1   460  316 v0   Is   7:13AM  0:00.09 -csh (csh)
    root     240  0.0  1.2   460  336 v1   Is   7:13AM  0:00.07 -csh (csh)
    root     241  0.0  1.2   460  336 v2   Is   7:13AM  0:00.07 -csh (csh)
    root     251  0.0  1.7  1052  484 v0   S    7:14AM  0:00.32 bash
    root     576  0.0  1.8  1048  508 v1   I    2:18PM  0:00.07 bash
    root     618  0.0  1.7  1040  500 v2   I    2:22PM  0:00.04 bash
    root     627  0.0  2.2   992  632 v2   I+   2:22PM  0:00.02 mince demo_test
    root     630  0.0  2.2   992  636 v1   I+   2:23PM  0:00.06 mince home
    root     694  0.0  6.7  2548 1968 ??   Ss   2:47PM  0:00.03 httpd -d /u
    webuser  695  0.0  7.0  2548 2044 ??   I    2:47PM  0:00.00 httpd -d /u
    webuser  696  0.0  7.0  2548 2044 ??   I    2:47PM  0:00.00 httpd -d /u
    webuser  697  0.0  7.0  2548 2044 ??   I    2:47PM  0:00.00 httpd -d /u
    webuser  698  0.0  7.0  2548 2044 ??   I    2:47PM  0:00.00 httpd -d /u
    webuser  699  0.0  7.0  2548 2044 ??   I    2:47PM  0:00.00 httpd -d /u

To kill Apache, you need to find the PID of the main copy of httpd and then do kill <PID>; the child processes will die with it. In the previous example the process to kill is 694, the copy of httpd that belongs to root.
The command is this:

    % kill 694

If ps -aux produces more printout than will fit on a screen, you can tame it with ps -aux | more: hit Return to see another line or Space to see another screen. It is important to make sure that the Apache process is properly killed because you can quite easily kill a child process by mistake and then start a new copy of the server with its children (and a different Config file or Perl scripts) and so get yourself into a royal muddle. To get just the lines from ps that you want, you can use:

    ps awlx | grep httpd

On Linux:

    killall httpd

Alternatively and better, since it is less prone to finger trouble, Apache writes its PID in the file ... /logs/httpd.pid (by default; see the PidFile directive), and you can write yourself a little script, as follows:

    kill `cat /usr/www/APACHE3/site.toddle/logs/httpd.pid`

You may prefer to put more generalized versions of these scripts somewhere on your path. stop looks like this:

    path=`pwd`
    kill `cat $path/logs/httpd.pid`

Or, if you don't plan to mess with many different configurations, use .../src/support/apachectl to start and stop Apache in the default directory. You might want to copy it into /usr/local/bin to get it onto the path, or add $apacheinstalldir/bin to your path. It uses the following flags:

    usage: ./apachectl (start|stop|restart|fullstatus|status|graceful|configtest|help)
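The PID-file approach described above can be made a little more defensive. The sketch below is one possible stop script, not the one shipped with Apache; the function names read_pid and stop_site are hypothetical, and it assumes the default logs/httpd.pid location under the site directory. It refuses to call kill unless the file holds a plain number.

```shell
#!/bin/sh
# Hypothetical, more cautious variant of the stop script.
# Assumes Apache wrote its PID to logs/httpd.pid under the site directory.

# Print the PID stored in the file named by $1, or fail if the file is
# missing or does not contain a plain number.
read_pid() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$pid"
}

# Stop the server whose site directory is $1 (default: current directory).
stop_site() {
    pid=$(read_pid "${1:-.}/logs/httpd.pid") || {
        echo "no usable PID file in ${1:-.}/logs" >&2
        return 1
    }
    kill "$pid"
}

# Typical use, from the site directory:
#   stop_site
```

Because the PID comes from the file that the parent httpd wrote, this avoids killing a child process by mistake.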
When we typed ./go, nothing appeared to happen, but when we looked in the logs subdirectory, we found a file called error_log with the entry:

    [<date>]:'mod_unique_id: unable to get hostbyname ("myname.my.domain")

In our case, this problem was due to the odd way we were running Apache, and it will only affect you if you are running on a host with no DNS or on an operating system that has difficulty determining the local hostname. The solution was to edit the file /etc/hosts and add the line:

    10.0.0.2 myname.my.domain myname

where 10.0.0.2 is the IP address we were using for testing.

However, our troubles were not yet over. When we reran httpd, we received the following error message:

    [<date>]--couldn't determine user name from uid

This means more than might at first appear. We had logged in as root. Because of the security worries of letting outsiders log in with superuser powers, Apache, having been started with root permissions so that it can bind to port 80, has attempted to change its user ID to -1. On many Unix systems, this ID corresponds to the user nobody: a supposedly harmless user. However, it seems that FreeBSD does not understand this notion, hence the error message.[15] In any case, it really isn't a great idea to allow Apache to run as nobody (or any other shared user), because if you are running several different services (ftp, mail, etc.) on the same machine, you run the risk of an attacker exploiting the fact that they all share the same user.
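Whether a hosts file already maps the machine's name can be checked before editing it. This is a minimal sketch, assuming the usual /etc/hosts layout of an IP address followed by one or more names; the function name hosts_has_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: does the hosts file $1 contain the name $2?
# Assumes the usual layout: IP address, then one or more names, with
# anything after a # treated as a comment.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Typical use:
#   hosts_has_name /etc/hosts myname.my.domain || echo "entry missing"
```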
2.3.1. webuser and webgroup

The remedy is to create a new user, called webuser, belonging to webgroup. The names are unimportant. The main thing is that this user should be in a group of its own and should not actually be used by anyone for anything else. On most Unix systems, create the group first by running adduser -group webgroup then the user by running adduser. You will be asked for passwords for both. If the system insists on a password, use some obscure non-English string like cQuycn75Vg. Ideally, you should make sure that the newly created user cannot actually log in; how this is achieved varies according to operating system: you may have to replace the encrypted password in /etc/passwd, or remove the home directory, or perhaps something else.

Having told the operating system about this user, you now have to tell Apache. Edit the file httpd.conf to include the following lines:

    User webuser
    Group webgroup

The following are the interesting directives.

2.3.1.1. User

The User directive sets the user ID under which the server will run when answering requests.

    User unix-userid
    Default: User #-1
    Server config, virtual host

In order to use this directive, the standalone server must be run initially as root. unix-userid is one of the following:
The user should have no privileges that allow access to files not intended to be visible to the outside world; similarly, the user should not be able to execute code that is not meant for httpd requests. However, the user must have access to certain things: the files it serves, for example, or mod_proxy's cache, when enabled (see the CacheRoot directive in Chapter 9).

TIP: If you start the server as a non-root user, it will fail to change to the lesser-privileged user and will instead continue to run as that original user. If you start the server as root, then it is normal for the parent process to remain running as root.

WARNING: Don't set User (or Group) to root unless you know exactly what you are doing and what the dangers are.

2.3.1.2. Group

The Group directive sets the group under which the server will answer requests.

    Group unix-group
    Default: Group #-1
    Server config, virtual host

To use this directive, the standalone server must be run initially as root. unix-group is one of the following:
It is recommended that you set up a new group specifically for running the server. Some administrators use group nobody, but this is not always possible or desirable, as noted earlier.

TIP: If you start the server as a non-root user, it will fail to change to the specified group and will instead continue to run as the group of the original user.

Now, when you run httpd and look for the PID, you will find that one copy belongs to root, and several others belong to webuser. Kill the root copy and the others will vanish.

2.3.2. "Out of the Box" Default Problems

We found that when we built Apache "out of the box" using a GNU layout, some file defaults were not set up properly. If, when you run ./go, you get the rather odd error message on the screen:

    fopen: No such file or directory
    httpd: could not open error log file <path to site.toddle>site.toddle/var/httpd/log/error_log

you need to add the line:

    ErrorLog logs/error_log

to ...conf/httpd.conf. If, having done that, Apache fails to start and you get a message in .../logs/error_log:

    .... No such file or directory.: could not open mime types log file <path to site.toddle>/site.toddle/etc/httpd/mime.types

you need to add the line:

    TypesConfig conf/mime.types

to ...conf/httpd.conf. And if, having done that, Apache fails to start and you get a message in .../logs/error_log:

    fopen: no such file or directory
    httpd: could not log pid to file <path to site.toddle>/site.toddle/var/httpd/run/httpd.pid

you need to add the line:

    PIDFile logs/httpd.pid

to ...conf/httpd.conf.

2.3.3. Running Apache Under Unix

When you run Apache now, you may get the following error message:

    httpd: cannot determine local hostname
    Use ServerName to set it manually.

What Apache means is that you should put this line in the httpd.conf file:

    ServerName <yourmachinename>

Finally, before you can expect any action, you need to set up some documents to serve. Apache's default document directory is ...
/httpd/htdocs, which you don't want to use because you are at /usr/www/APACHE3/site.toddle, so you have to set it explicitly. Create ... /site.toddle/htdocs, and then in it create a file called 1.txt containing the immortal words "hullo world." Then add this line to httpd.conf:

    DocumentRoot /usr/www/APACHE3/site.toddle/htdocs

The complete Config file, .../site.toddle/conf/httpd.conf, now looks like this:

    User webuser
    Group webgroup
    ServerName my586
    DocumentRoot /usr/www/APACHE3/site.toddle/htdocs/
    #fix 'Out of the Box' default problems--remove leading #s if necessary
    #ServerRoot /usr/www/APACHE3/site.toddle
    #ErrorLog logs/error_log
    #PIDFile logs/httpd.pid
    #TypesConfig conf/mime.types

When you fire up httpd, you should have a working web server. To prove it, start up a browser to access your new server, and point it at http://<yourmachinename>/.[16]
As we know, http means use the HTTP protocol to get documents, and / on the end means go to the DocumentRoot directory you set in httpd.conf. Lynx is the text browser that comes with FreeBSD and other flavors of Unix; if it is available, type:

    % lynx http://<yourmachinename>/

You see:

    INDEX OF /
    * Parent Directory
    * 1.txt

If you move to 1.txt with the down arrow, you see:

    hullo world

If you don't have Lynx (or Netscape, or some other web browser) on your server, you can use telnet:[17]
    % telnet <yourmachinename> 80

You should see something like:

    Trying 192.168.123.2
    Connected to my586.my.domain
    Escape character is '^]'

Then type:

    GET / HTTP/1.0 <CR><CR>

You should see:

    HTTP/1.0 200 OK
    Date: Sat, 24 Aug 1996 23:49:02 GMT
    Server: Apache/1.3
    Connection: close
    Content-Type: text/html

    <HEAD><TITLE>Index of /</TITLE></HEAD><BODY>
    <H1>Index of </H1>
    <UL><LI> <A HREF="/"> Parent Directory</A>
    <LI> <A HREF="1.txt"> 1.txt</A>
    </UL></BODY>
    Connection closed by foreign host.

This is a rare opportunity to see a complete HTTP message. The first lines are headers that are normally hidden by your browser. The stuff between the < and > is HTML, written by Apache, which, if viewed through a browser, produces the formatted message shown by Lynx earlier, and by Netscape or Microsoft Internet Explorer in the next chapter.

2.3.4. Several Copies of Apache

To get a display of all the processes running, run:

    % ps -aux

Among a lot of Unix stuff, you will see one copy of httpd belonging to root and a number that belong to webuser. They are similar copies, waiting to deal with incoming queries. The root copy is still attached to port 80 (thus its children will be as well) but it is not listening. This is because it is root and has too many powers for this to be safe. It is necessary for this "master" copy to remain running as root because under the (slightly flawed) Unix security doctrine, only root can open ports below 1024. Its job is to monitor the scoreboard where the other copies post their status: busy or waiting. If there are too few waiting (default 5, set by the MinSpareServers directive in httpd.conf), the root copy starts new ones; if there are too many waiting (default 10, set by the MaxSpareServers directive), it kills some off. If you note the PID (shown by ps -ax, or ps -aux for a fuller listing; also to be found in ... /logs/httpd.pid) of the root copy and kill it with:

    % kill PID

you will find that the other copies disappear as well.
It is better, however, to use the stop script described in Section 2.3 earlier in this chapter, since it leaves less to chance and is easier to do.

2.3.5. Unix Permissions

If Apache is to work properly, it's important to correctly set the file-access permissions. In Unix systems, there are three kinds of permissions: read, write, and execute. They attach to each object in three levels: user, group, and other or "rest of the world." If you have installed the demonstration sites, go to ... /site.cgi/htdocs, and type:

    % ls -l

You see:

    -rw-rw-r-- 5 root bin 1575 Aug 15 07:45 form_summer.html

The first - indicates that this is a regular file. It is followed by three permission fields, each of three characters. They mean, in this case:

    User (root):  rw-  read and write, but not execute
    Group (bin):  rw-  read and write, but not execute
    Other:        r--  read only
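The mode string printed by ls -l can be split mechanically into its user, group, and other fields. The sketch below does this for the example listing; the function name mode_field is hypothetical, and it relies only on printf and cut.

```shell
#!/bin/sh
# Hypothetical helper: print one permission field of an ls -l mode string.
# $1 is the mode string (for example -rw-rw-r--); $2 is user, group, or other.
mode_field() {
    case "$2" in
        user)  range=2-4 ;;    # characters 2-4 follow the file-type character
        group) range=5-7 ;;
        other) range=8-10 ;;
        *)     return 1 ;;
    esac
    printf '%s\n' "$1" | cut -c"$range"
}

# Decode the listing for form_summer.html.
for who in user group other; do
    echo "$who: $(mode_field -rw-rw-r-- "$who")"
done
```

For the example file this prints rw- for user and group and r-- for other, which is the field that governs what a process running as webuser can do.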
When the permissions apply to a directory, the x execute permission means scan: the ability to see the contents and move down a level. The permission that interests us is other, because the copy of Apache that tries to access this file belongs to user webuser and group webgroup. These were set up to have no affinities with root and bin, so that copy can gain access only under the other permissions, and the only one set is "read." Consequently, a Bad Guy who crawls under the cloak of Apache cannot alter or delete our precious form_summer.html; he can only read it. We can now write a coherent doctrine on permissions. We have set things up so that everything in our web site, except the data vulnerable to attack, has owner root and group wheel. We did this partly because it is a valid approach, but also because it is the only portable one. The files on our CD-ROM with owner root and group wheel have owner and group numbers 0 that translate into similar superuser access on every machine. Of course, this only makes sense if the webmaster has root login permission, which we had. You may have to adapt the whole scheme if you do not have root login, and you should perhaps consult your site administrator. In general, on a web site everything should be owned by a user who is not webuser and a group that is not webgroup (assuming you use these terms for Apache configurations). There are four kinds of files to which we want to give webuser access: directories, data, programs, and shell scripts. webuser must have scan permissions on all the directories, starting at root down to wherever the accessible files are. If Apache is to access a directory, that directory and all in the path must have x permission set for other. You do this by entering: % chmod o+x <each-directory-in-the-path> To produce a directory listing (if this is required by, say, an index), the final directory must have read permission for other. 
You do this by typing:

    % chmod o+r <final-directory>

It probably should not have write permission set for other:

    % chmod o-w <final-directory>

To serve a file as data (and this includes files like .htaccess; see Chapter 3), the file must have read permission for other:

    % chmod o+r <file>

And, as before, deny write permission:

    % chmod o-w <file>

To run a program, the file must have execute permission set for other:

    % chmod o+x <program>

To execute a shell script, the file must have read and execute permission set for other:

    % chmod o+rx <script>

For complete safety:

    % chmod a=rx <script>

If the user is to edit the script, but it is to be safe otherwise:

    % chmod u=rwx,og=rx <script>

2.3.6. A Local Network

Emboldened by the success of site.toddle, we can now set about a more realistic setup, without as yet venturing out onto the unknown waters of the Web. We need to get two things running: Apache under some sort of Unix and a GUI browser. There are two main ways this can be achieved:
We cannot hope to give detailed explanations for all possible variants of these situations. We expect that many of our readers will already be webmasters familiar with these issues, who will want to skip the following sidebar. Those who are new to the Web may find it useful to know what we did.

Copyright © 2003 O'Reilly & Associates. All rights reserved.