8.8. Rewrite

The preceding section described the alias module and its allies. Everything these directives can do, and more, can be done instead by mod_rewrite.c, an extremely compendious module that is almost a complete software product in its own right.[53] The documentation is thorough, and the reader is referred to http://www.engelschall.com/pw/apache/rewriteguide/ for any serious work. This section is intended for orientation only.
Rewrite takes a rewriting pattern and applies it to the URL. If it matches, a rewriting substitution is applied to the URL. The patterns are regular expressions familiar to us all in their simplest form; for example, mod.*\.c, which matches any module filename. The complete science of regular expressions is somewhat extensive, and the reader is referred to ... /src/regex/regex.7, a manpage that can be read with nroff -man regex.7 (on FreeBSD, at least). Regular expressions are also described in the POSIX specification and in Jeffrey Friedl's Mastering Regular Expressions (O'Reilly & Associates).

The essence of regular expressions is that a number of special characters can be used to match parts of incoming URLs. The substitutions can include mapping functions that take bits of the incoming URL and look them up in databases or even apply programs to them. The rules can be applied repetitively and recursively to the evolving URL. It is possible (as the documentation says) to create "rewriting loops, rewriting breaks, chained rules, pseudo if-then-else constructs, forced redirects, forced MIME-types, forced proxy module throughput."

The functionality is so extensive that it is probably impossible to master it in the abstract. When and if you have a problem of this sort, it looks as if mod_rewrite can solve it, given enough intellectual horsepower on your part!

The module can be used in four situations:
The directives look simple enough.

8.8.1. RewriteEngine

    RewriteEngine on_or_off
    Server config, virtual host, directory

Enables or disables the rewriting engine. If off, no rewriting is done at all. Use this directive to switch off functionality rather than commenting out RewriteRule lines.

8.8.2. RewriteLog

    RewriteLog filename
    Server config, virtual host

Sends logging to the specified filename. If the name does not begin with a slash, it is taken to be relative to the server root. This directive should appear only once in a Config file.

8.8.3. RewriteLogLevel

    RewriteLogLevel number
    Default number: 0
    Server config, virtual host

Controls the verbosity of the logging: 0 means no logging, and 9 means that almost every action is logged. Note that a number above 2 slows Apache down.

8.8.4. RewriteMap

    RewriteMap mapname {txt,dbm,prg,rnd,int}:filename
    Server config, virtual host

Defines an external map, known to the rules as mapname, that supplies substitution strings through key lookup. A substitution queries the map with a construct of the form:

    ${mapname:Lookupkey|DefaultValue}

If the Lookupkey value is not found, DefaultValue is returned. The type of mapname must be specified by the next argument:
8.8.5. RewriteBase

    RewriteBase BaseURL
    Directory, .htaccess

The effects of this command can be fairly easily achieved by using the rewrite rules, but it may sometimes be simpler to encapsulate the process. It explicitly sets the base URL for per-directory rewrites. If RewriteRule is used in an .htaccess file, it is passed a URL that has had the local directory stripped off so that the rules act only on the remainder. When the substitution is finished, RewriteBase supplies the necessary prefix. To quote the manual's example:

    RewriteBase /xyz
    RewriteRule ^oldstuff\.html$ newstuff.html

In this example, a request to /xyz/oldstuff.html gets rewritten to the physical file /abc/def/newstuff.html. (The manual's example assumes that the URL /xyz is mapped to the physical directory /abc/def by an Alias directive.) Internally, the following happens:
8.8.6. RewriteCond

    RewriteCond TestString CondPattern
    Server config, virtual host, directory

One or more RewriteCond directives can precede a RewriteRule directive to define conditions under which it is to be applied. CondPattern is a regular expression matched against the value retrieved for TestString, which contains server variables of the form %{NAME_OF_VARIABLE}, where NAME_OF_VARIABLE can be one of the following list:
These variables all correspond to the similarly named HTTP MIME headers, C variables of the Apache server, or the current time. If the regular expression does not match, the RewriteRule following it does not apply.

8.8.7. RewriteRule

    RewriteRule Pattern Substitution [flags]
    Server config, virtual host, directory

This directive can be used as many times as necessary. Each occurrence applies the rule to the output of the preceding one, so the order matters. Pattern is matched against the incoming URL; if the match succeeds, the Substitution is made. An optional argument, flags, can be given. The flags, which follow, can be abbreviated to one or two letters:
For example, say we want to rewrite URLs of the form:

    /Language/~Realname/.../File

into:

    /u/Username/.../File.Language

We take the rewrite map file given previously and save it under /anywhere/map.real-to-user. Then we only have to add the following lines to the Apache server Config file:

    RewriteLog /anywhere/rewrite.log
    RewriteMap real-to-user txt:/anywhere/map.real-to-user
    RewriteRule ^/([^/]+)/~([^/]+)/(.*)$ /u/${real-to-user:$2|nobody}/$3.$1

8.8.8. A Rewrite Example

The Butterthlies salespeople seem to be taking their jobs more seriously. Our range has increased so much that the old catalog based around a single HTML script is no longer workable because there are too many cards. We have built a database of cards and a utility called cardinfo that accesses it using the arguments:

    cardinfo cardid query

where cardid is the number of the card, and query is one of the following words: "price," "artist," or "size." The problem is that the salespeople are too busy to remember the syntax, so we want to let them log onto the card database as if it were a web site. For instance, going to http://sales.butterthlies.com/info/2949/price would return the price of card number 2949. The Config file is in ... /site.rewrite :

    User webuser
    Group webgroup
    # Apache requires this server name, although in this case it will
    # never be used.
    # This is used as the default for any server that does not match a
    # VirtualHost section.
    ServerName www.butterthlies.com
    NameVirtualHost 192.168.123.2
    <VirtualHost www.butterthlies.com>
    ServerAdmin sales@butterthlies.com
    DocumentRoot /usr/www/site.rewrite/htdocs/customers
    ServerName www.butterthlies.com
    ErrorLog /usr/www/site.rewrite/logs/customers/error_log
    TransferLog /usr/www/site.rewrite/logs/customers/access_log
    </VirtualHost>
    <VirtualHost sales.butterthlies.com>
    ServerAdmin sales_mgr@butterthlies.com
    DocumentRoot /usr/www/site.rewrite/htdocs/salesmen
    Options ExecCGI indexes
    ServerName sales.butterthlies.com
    ErrorLog /usr/www/site.rewrite/logs/salesmen/error_log
    TransferLog /usr/www/site.rewrite/logs/salesmen/access_log
    RewriteEngine on
    RewriteLog logs/rewrite
    RewriteLogLevel 9
    RewriteRule ^/info/([^/]+)/([^/]+)$ /cgi-bin/cardinfo?$2+$1 [PT]
    ScriptAlias /cgi-bin /usr/www/cgi-bin
    </VirtualHost>

In real life cardinfo would be an elaborate program. However, here we just have to show that it could work, so it is extremely simple:

    #!/bin/sh
    #
    # A CGI response needs a blank line between its headers and its body;
    # the bare echo supplies it.
    echo "content-type: text/html"
    echo
    echo "You made the query $1 on the card $2"

To make sure everything is in order before we do it for real, we turn RewriteEngine off and access http://sales.butterthlies.com/info/2949/price. We get back the following message:

    The requested URL /info/2949/price was not found on this server.

This is not surprising. We now turn RewriteEngine on and look at the crucial line in the Config file, which is:

    RewriteRule ^/info/([^/]+)/([^/]+)$ /cgi-bin/cardinfo?$2+$1 [PT]

Translated into English this means the following: at the start of the string, match /info/, followed by one or more characters that aren't "/", and put those characters into the variable $1 (the parentheses do this; $1 because they are the first set). Then match a "/", then one or more characters that aren't "/", and put those characters into $2. Then match the end of the string and pass the result through [PT] to the next rule, which is ScriptAlias.
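As a rough illustration of the capture-and-substitute step alone, the same pattern can be tried outside Apache with sed. This is only a sketch and not how mod_rewrite works internally: the rewrite function is a hypothetical helper, sed -E (extended regular expressions, as accepted by GNU and BSD sed) stands in for the module's regex engine, and sed writes the captured groups as \1 and \2 where RewriteRule writes $1 and $2.

```shell
#!/bin/sh
# Hypothetical helper (not part of Apache): applies the pattern and the
# substitution from the RewriteRule above to a single URL path.
rewrite() {
    # '#' is used as the sed delimiter so the slashes in the path
    # do not need escaping.
    printf '%s\n' "$1" |
        sed -E 's#^/info/([^/]+)/([^/]+)$#/cgi-bin/cardinfo?\2+\1#'
}

rewrite /info/2949/price        # prints /cgi-bin/cardinfo?price+2949
rewrite /info/123/price/fred    # no match: prints the path unchanged
```

A path that does not match comes out exactly as it went in, which mirrors the way a RewriteRule whose pattern fails leaves the URL alone.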
We end up as if we had accessed http://sales.butterthlies.com/cgi-bin/cardinfo?<query>+<card ID>. If the CGI script is on a different web server for some reason, we could write:

    RewriteRule ^/info/([^/]+)/([^/]+)$ http://somewhere.else.com/cgi-bin/cardinfo?$2+$1 [PT]

Note that this pattern won't match /info/123/price/fred, because it has too many slashes in it. If we run all this with ./go, and access http://sales.butterthlies.com/info/2949/price from the client, we see the following message:

    You made the query price on the card 2949

Copyright © 2001 O'Reilly & Associates. All rights reserved.