Chapter 7. Indexing
As we saw with site.first (see Chapter 3), if there is no index.html file in ... /htdocs and no DirectoryIndex directive, Apache concocts an index called "Index of /", where "/" means the DocumentRoot directory. For many purposes this will, no doubt, be enough. But since this jury-rigged index is the first thing a client sees, you may want to do more.
7.1. Making Better Indexes in Apache
There is a wide range of possibilities; some are demonstrated at ... /site.fancyindex/httpd1.conf:
User webuser
Group webgroup
ServerName www.butterthlies.com
DocumentRoot /usr/www/APACHE3/site.fancyindex/htdocs
<Directory /usr/www/APACHE3/site.fancyindex/htdocs>
IndexOptions FancyIndexing
AddDescription "One of our wonderful catalogs" catalog_summer.html catalog_autumn.html
IndexIgnore *.jpg
IndexIgnore ..
IndexIgnore icons HEADER README
AddIconByType (CAT,icons/bomb.gif) text/*
DefaultIcon icons/burst.gif
</Directory>
When you type ./go 1 on the server and access http://www.butterthlies.com/ on the browser, you should see a rather fancy display:
Index of /
Name                        Last Modified      Size  Description
--------------------------------------------------------------------
<bomb>catalog_autumn.html   23-Jul-1998 09:11  1k    One of our wonderful catalogs
<bomb>catalog_summer.html   25-Jul-1998 10:31  1k    One of our wonderful catalogs
<burst>index.html.ok        23-Jul-1998 09:11  1k
--------------------------------------------------------------------
In the previous listing, <bomb> and <burst> stand in for standard graphic icons Apache has at its disposal. How does all this work? As you can see from the Config file, this smart formatting is configured directory by directory. The key directive is IndexOptions.
IndexOptions option [option] ... (Apache 1.3.2 and earlier)
IndexOptions [+|-]option [[+|-]option] ... (Apache 1.3.3 and later)
Server config, virtual host, directory, .htaccess
This directive is somewhat complicated, and its syntax varies drastically depending on your version of Apache.
The +/- syntax and the merging of multiple IndexOptions directives are only available with Apache 1.3.3 and later; the FoldersFirst and DescriptionWidth options are only available with Apache 1.3.10 and later; the TrackModified option is only available with Apache 1.3.15 and later.
The IndexOptions directive specifies the behavior of the directory indexing. option can be one of the following:
DescriptionWidth=[n | *] (Apache 1.3.10 and later)
The DescriptionWidth keyword allows you to specify the width of the description column in characters. If the keyword value is *, then the column is automatically sized to the length of the longest description in the display. See AddDescription for dangers inherent in truncating descriptions.
FancyIndexing
This turns on fancy indexing, which gives users more control over how the information is sorted.
Note that in versions of Apache prior to 1.3.2, the FancyIndexing and IndexOptions directives will override each other. You should use IndexOptions FancyIndexing in preference to the standalone FancyIndexing directive. As of Apache 1.3.2, a standalone FancyIndexing directive is combined with any IndexOptions directive already specified for the current scope.
FoldersFirst (Apache 1.3.10 and later)
If this option is enabled, subdirectories in a FancyIndexed listing will always appear first, followed by normal files in the directory. The listing is basically broken into two components, the files and the subdirectories, and each is sorted separately and then displayed with the subdirectories first. For instance, if the sort order is descending by name, and FoldersFirst is enabled, subdirectory Zed will be listed before subdirectory Beta, which will be listed before normal files Gamma and Alpha. This option only has an effect if FancyIndexing is also enabled.
IconHeight[=pixels] (Apache 1.3 and later)
IconWidth[=pixels] (Apache 1.3 and later)
If these two options are used together, the server will include HEIGHT and WIDTH attributes in the IMG HTML tag for the file icon. This allows the browser to precalculate the page layout without waiting for all the images to load. If no value is given for an option, it defaults to the standard height or width of the icons supplied with the Apache software.
IconsAreLinks
This makes the icons part of the anchor for the filename for fancy indexing.
NameWidth=[n | *] (Apache 1.3.2 and later)
The NameWidth keyword allows you to specify the width of the filename column in bytes. If the keyword value is *, then the column is automatically sized to the length of the longest filename in the display.
ScanHTMLTitles
This enables the extraction of the title from HTML documents for fancy indexing. If the file does not have a description given by AddDescription, then httpd will read the document for the value of the TITLE tag. This is CPU and disk intensive.
SuppressColumnSorting (Apache 1.3 and later)
If specified, Apache will not make the column headings in a FancyIndexed directory listing into links for sorting. The default behavior is for them to be links; selecting the column heading will sort the directory listing by the values in that column.
SuppressDescription
This will suppress the file description in fancy-indexing listings.
SuppressHTMLPreamble (Apache 1.3 and later)
If the directory actually contains a file specified by the HeaderName directive, the module usually includes the contents of the file after a standard HTML preamble (<HTML>, <HEAD>, etc.). The SuppressHTMLPreamble option disables this behavior, causing the module to start the display with the header-file contents. The header file must contain appropriate HTML instructions in this case. If there is no header file, the preamble is generated as usual.
SuppressLastModified
This will suppress the display of the last modification date in fancy-indexing listings.
SuppressSize
This will suppress the file size in fancy-indexing listings.
TrackModified (Apache 1.3.15 and later)
This returns the Last-Modified and ETag values for the listed directory in the HTTP header. It is only valid if the operating system and filesystem return legitimate stat( ) results. Most Unix systems do so, as do OS/2's JFS and Win32's NTFS volumes. OS/2 and Win32 FAT volumes, for example, do not. Once this feature is enabled, the client or proxy can track changes to the list of files when it performs a HEAD request. Note that some operating systems correctly track new and removed files, but do not track changes to the sizes or dates of the files within the directory.
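As a rough illustration of how several of these options combine, the following sketch assumes Apache 1.3.10 or later and a hypothetical /web/docs directory. The description width of 40 is arbitrary, and 22 and 20 pixels are assumed to be the height and width of the icons shipped with Apache:

```apache
<Directory /web/docs>
# Subdirectories listed first, name column sized to the longest
# filename, description column limited to 40 characters
IndexOptions FancyIndexing FoldersFirst NameWidth=* DescriptionWidth=40 IconHeight=22 IconWidth=20
</Directory>
```

All the keywords are given on a single unprefixed IndexOptions line, so the result does not depend on how a particular version merges multiple directives.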
There are some noticeable differences in the behavior of this directive in recent (post-1.3.0) versions of Apache.
For Apache 1.3.2 and Earlier
The default is that no options are enabled. If multiple IndexOptions could apply to a directory, then the most specific one is taken complete; the options are not merged. For example:
<Directory /web/docs>
IndexOptions FancyIndexing
</Directory>
<Directory /web/docs/spec>
IndexOptions ScanHTMLTitles
</Directory>
In this case, only ScanHTMLTitles will be set for the /web/docs/spec directory.
For Apache 1.3.3 and Later
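Apache 1.3.3 introduced some significant changes in the handling of IndexOptions directives. In particular, multiple IndexOptions directives that apply to a directory are now merged rather than the most specific one being taken alone, and a keyword can be prefixed with + or - to add it to, or remove it from, the settings already in force.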
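A minimal sketch of the incremental syntax, assuming Apache 1.3.3 or later and reusing the /web/docs directories from the earlier example:

```apache
<Directory /web/docs>
IndexOptions FancyIndexing
</Directory>
<Directory /web/docs/spec>
# The + prefix adds to the options inherited from /web/docs
# instead of replacing them
IndexOptions +ScanHTMLTitles
</Directory>
```

Here /web/docs/spec should end up with both FancyIndexing and ScanHTMLTitles. Writing the second directive without the + prefix would discard the inherited FancyIndexing setting, much as in the earlier versions.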
IndexOrderDefault Ascending|Descending Name|Date|Size|Description
Server config, virtual host, directory, .htaccess
IndexOrderDefault is only available in Apache 1.3.4 and later.
The IndexOrderDefault directive is used in combination with the FancyIndexing index option. By default, FancyIndexed directory listings are displayed in ascending order by filename; IndexOrderDefault allows you to change this initial display order.
IndexOrderDefault takes two arguments. The first must be either Ascending or Descending, indicating the direction of the sort. The second argument must be one of the keywords Name, Date, Size, or Description and identifies the primary key. The secondary key is always the ascending filename.
You can force a directory listing to be displayed only in a particular order by combining this directive with the SuppressColumnSorting index option; this will prevent the client from requesting the directory listing in a different order.
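For instance, a listing fixed in newest-first order might be sketched like this, assuming Apache 1.3.4 or later and a hypothetical /web/docs directory:

```apache
<Directory /web/docs>
# Column headings become plain text, so the client cannot re-sort
IndexOptions FancyIndexing SuppressColumnSorting
# Initial (and here, only) order: most recently modified first
IndexOrderDefault Descending Date
</Directory>
```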
ReadmeName filename
Server config, virtual host, directory, .htaccess
Some features only available after 1.3.6; see text
The ReadmeName directive sets the name of the file that will be appended to the end of the index listing. filename is the name of the file to include and is taken to be relative to the location being indexed.
The filename argument is treated as a stub filename in Apache 1.3.6 and earlier, and as a relative URI in later versions. Details of how it is handled may be found under the description of the HeaderName directive, which uses the same mechanism and changed at the same time as ReadmeName.
See also HeaderName.
FancyIndexing on_or_off
Server config, virtual host, directory, .htaccess
FancyIndexing turns fancy indexing on. The user can click on a column title to sort the entries by value. Clicking again will reverse the sort. Sorting can be turned off with the SuppressColumnSorting keyword for IndexOptions (see earlier in this chapter). See also the FancyIndexing option for IndexOptions.
IndexIgnore file1 file2 ...
Server config, virtual host, directory, .htaccess
We can specify a description for individual files or for a list of them. We can exclude files from the listing with IndexIgnore.
IndexIgnore is followed by a list of filenames or wildcard patterns describing the files to hide. As we see in the following example, multiple IndexIgnore directives add to the list rather than replacing each other. By default, the list includes ".".
You might well want to ignore .ht* files so that the Bad Guys can't look at the actual .htaccess files. Here we want to ignore the *.jpg files (which are not much use without the .html files that display them and explain what they show) and the parent directory, known to Unix and to Win32 as "..":
...
<Directory /usr/www/APACHE3/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg ..
</Directory>
You might want to use IndexIgnore for security reasons as well: what the eye doesn't see, the mouse finger can't steal. You can put in extra IndexIgnore lines, and the effects are cumulative, so we could just as well write:
<Directory /usr/www/APACHE3/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
</Directory>
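The suggestion about .ht* files could be sketched as follows. This is only an illustration: the *~ pattern for editor backup files is an assumption added here, not part of the site's actual Config file:

```apache
<Directory /usr/www/APACHE3/fancyindex.txt/htdocs>
FancyIndexing on
# Hide .htaccess and similar control files, plus editor backups
IndexIgnore .ht* *~
# Cumulative with the line above
IndexIgnore *.jpg ..
</Directory>
```

Bear in mind that IndexIgnore only removes entries from the listing. A client who guesses a filename can still request it unless access is denied by other means.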
AddIcon icon_name name
Server config, virtual host, directory, .htaccess
We can add visual sparkle to our page by giving icons to the files with the AddIcon directive. Apache has more icons than you can shake a stick at in its ... /icons directory. Without spending some time exploring, one doesn't know precisely what each one looks like, but bomb.gif will do for an example. The icons directory needs to be specified relative to the DocumentRoot directory, so we have made a subdirectory ... /htdocs/icons and copied bomb.gif into it. We can attach the bomb icon to all displayed .html files with this:
... AddIcon icons/bomb.gif .html
AddIcon expects the URL of an icon, followed by a file extension, wildcard expression, partial filename, or complete filename to describe the files to which the icon will be added. We can iconify subdirectories off the DocumentRoot with ^^DIRECTORY^^, or make blank lines format properly with ^^BLANKICON^^. Since we have the convenient icons directory to practice with, we can iconify it with this:
AddIcon /icons/burst.gif ^^DIRECTORY^^
Or we can make it disappear with this:
... IndexIgnore icons ...
AddIcon ("DIR",/icons/burst.gif) ^^DIRECTORY^^
This line will print the word DIR where the burst icon would have appeared to mark a directory (that is, the text is used as the ALT description in the link to the icon). You could, if you wanted, print the word "Directory" or "This is a directory." The choice is yours.
Here are several examples of uses of AddIcon:
AddIcon (IMG,/icons/image.xbm) .gif .jpg .xbm
AddIcon /icons/dir.xbm ^^DIRECTORY^^
AddIcon /icons/backup.xbm *~
AddIconByType should be used in preference to AddIcon, when possible.
AddAlt string file file ...
Server config, virtual host, directory, .htaccess
AddDescription string file1 file2 ...
Server config, virtual host, directory, .htaccess
<Directory /usr/www/APACHE3/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
AddIcon (CAT,icons/bomb.gif) .html
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
AddIcon icons/blank.gif ^^BLANKICON^^
DefaultIcon icons/blank.gif
</Directory>
Having achieved these wonders, we might now want to be a bit more sensible and choose our icons by MIME type using the AddIconByType directive.
DefaultIcon url
Server config, virtual host, directory, .htaccess
AddIconByType icon mime_type1 mime_type2 ...
Server config, virtual host, directory, .htaccess
...
text/html html htm
text/plain text
text/richtext rtx
text/tab-separated-values tsv
text/x-setext text
...
So, we could have one icon for all text files by including the line:
AddIconByType (TXT,icons/bomb.gif) text/*
Or we could be more specific, using four icons, a.gif, b.gif, c.gif, and d.gif:
AddIconByType (TXT,/icons/a.gif) text/html
AddIconByType (TXT,/icons/b.gif) text/plain
AddIconByType (TXT,/icons/c.gif) text/tab-separated-values
AddIconByType (TXT,/icons/d.gif) text/x-setext
Let's try out the simpler case:
<Directory /usr/www/APACHE3/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
AddIconByType (CAT,icons/bomb.gif) text/*
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
</Directory>
For a further refinement, we can use AddIconByEncoding to give a special icon to encoded files.
AddAltByType string mime_type1 mime_type2 ...
Server config, virtual host, directory, .htaccess
AddIconByEncoding icon mime_encoding1 mime_encoding2 ...
Server config, virtual host, directory, .htaccess
... AddIconByEncoding (COMP,/icons/d.gif) x-compress ...
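The encoding named in AddIconByEncoding has to be one the server actually assigns to files, which is normally done with mod_mime's AddEncoding directive. A minimal sketch, assuming the usual .Z and .gz extensions and an icon file called compressed.gif in the icons directory (the icon name is an assumption):

```apache
# Tell Apache which extensions carry which content encoding
AddEncoding x-compress Z
AddEncoding x-gzip gz
# Give every file with either encoding the same icon and ALT text
AddIconByEncoding (CMP,/icons/compressed.gif) x-compress x-gzip
```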
AddAltByEncoding string mime_encoding1 mime_encoding2 ...
Server config, virtual host, directory, .htaccess
HeaderName filename
Server config, virtual host, directory, .htaccess
This directive inserts a header, read from filename, at the top of the index. The name of the file is taken to be relative to the directory being indexed. In Apache 1.3.6 and earlier, Apache will look first for filename.html and, if that is not found, then for filename.
Apache Versions After 1.3.6
filename is treated as a URI path relative to the one used to access the directory being indexed and must resolve to a document with a major content type of "text" (e.g., text/html, text/plain, etc.). This means that filename may refer to a CGI script if the script's actual file type (as opposed to its output) is marked as text/html, such as with the following directive:
AddType text/html .cgi
Content negotiation will be performed if the MultiViews option is enabled. If filename resolves to a static text/html document (not a CGI script) and the Includes option is enabled, the file will be processed for server-side includes (see the mod_include documentation).
If the file specified by HeaderName contains the beginnings of an HTML document (<HTML>, <HEAD>, etc.), then you will probably want to set IndexOptions +SuppressHTMLPreamble, so that these tags are not repeated. (See also ReadmeName.)
<Directory /usr/www/APACHE3/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore .. icons HEADER README
AddIconByType (CAT,icons/bomb.gif) text/*
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
HeaderName HEADER
ReadmeName README
</Directory>
Since HEADER and README can be HTML documents, you can wrap the directory listing up in a whole lot of fancy interactive stuff if you want.
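If HEADER is a complete HTML page opening of your own, the SuppressHTMLPreamble option described earlier keeps Apache from emitting a second set of tags. The following is a sketch only, assuming Apache 1.3.3 or later for the + syntax and a HEADER file that begins with its own <HTML> and <HEAD> tags:

```apache
<Directory /usr/www/APACHE3/fancyindex.txt/htdocs>
# HEADER supplies the <HTML>, <HEAD>, and <BODY> tags itself
IndexOptions +FancyIndexing +SuppressHTMLPreamble
HeaderName HEADER
ReadmeName README
# Keep the two wrapper files out of the listing they decorate
IndexIgnore HEADER README
</Directory>
```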
On the whole, however, FancyIndexing is just a cheap and cheerful way of getting something up on the Web. For a more elegant solution, study the next section.
Copyright © 2003 O'Reilly & Associates. All rights reserved.