8.7 Other Information Services

In addition to the World Wide Web services described in the previous section, several other information services are popular, including Gopher, WAIS, and Archie.

8.7.1 Gopher

Gopher is a menu-driven text-based tool for browsing through files and directories across the Internet. When a user selects a Gopher menu item, Gopher retrieves the specified file and displays it appropriately. This means that if a file is compressed, Gopher automatically uncompresses it; if it's a GIF image, Gopher automatically runs a GIF viewer.

The security concerns for Gopher are essentially the same - from both a server and a client point of view - as those described in the preceding section for HTTP servers and clients.

For servers, you have to worry about what a malicious client can trick you into running. Like HTTP servers, some Gopher servers use auxiliary programs to generate Gopher pages on the fly. Gopher servers are therefore susceptible to the same kinds of problems as HTTP servers:

Can an attacker trick the auxiliary program?
Can the attacker upload his own auxiliary program and cause it to be run?

For clients, you have to worry about what a malicious server can trick you into doing. Gopher uses the same kinds of extensible data type and auxiliary program mechanisms as HTTP, so the concerns are similar to those described for HTTP.

Like HTTP servers, Gopher servers sometimes live on nonstandard ports, so the concerns about such ports are similar to those for HTTP as well. Some Gopher clients support transparent proxying (via SOCKS or other mechanisms), but many don't.

Most of the common WWW browsers, such as Mosaic and Netscape Navigator, are Gopher clients as well as HTTP clients. These browsers generally support the proxying of Gopher via SOCKS or via transparent HTTP proxy servers such as the CERN HTTP server described in the preceding section. Even if you can't find a dedicated Gopher client that does proxying, you can probably use one of these WWW browsers (and an appropriate proxy server) instead. Your users may even prefer this approach to using a separate Gopher client, because it means they have only one application - a client for HTTP, Gopher, and several other protocols - to learn and configure, rather than a separate application for each protocol.
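
Because a browser addresses a Gopher server by URL, a site that restricts its users to the standard port could check the port named in the URL before allowing a request. The following is a minimal Python sketch of that check, not taken from any particular proxy; the function name `uses_standard_gopher_port` and the host `gopher.example.org` are assumptions made for illustration.

```python
from urllib.parse import urlsplit

GOPHER_PORT = 70  # standard Gopher server port


def uses_standard_gopher_port(url: str) -> bool:
    """Return True if a gopher:// URL targets the standard port."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a Gopher URL: " + url)
    # urlsplit reports the port as None when the URL does not name one;
    # in that case the client falls back to the standard port.
    port = parts.port if parts.port is not None else GOPHER_PORT
    return port == GOPHER_PORT


# Hypothetical server names, used only to exercise the function.
standard = uses_standard_gopher_port("gopher://gopher.example.org/")
nonstandard = uses_standard_gopher_port("gopher://gopher.example.org:7070/1/docs")
```

A check like this only tells you which requests fall outside a port-70-only policy; it does not by itself decide whether a nonstandard port should be allowed.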

8.7.1.1 Packet filtering characteristics of Gopher

Gopher is a TCP-based service. Gopher clients use ports above 1023. Most Gopher servers use port 70, but some don't; see the discussion of nonstandard server ports above, in the section called "Packet Filtering Characteristics of HTTP."

Direction	Source Addr.	Dest. Addr.	Protocol	Source Port	Dest. Port	ACK Set	Notes
In	Ext	Int	TCP	>1023	70[21]	[22]	Incoming session, client to server
Out	Int	Ext	TCP	70[21]	>1023	Yes	Incoming session, server to client
Out	Int	Ext	TCP	>1023	70[21]	[22]	Outgoing session, client to server
In	Ext	Int	TCP	70[21]	>1023	Yes	Outgoing session, server to client

[21] 70 is the standard port number for Gopher servers, but some servers run on different port numbers.

[22] ACK is not set on the first packet of this type (establishing connection) but will be set on the rest.
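
The table above can be read as a default-deny rule list. The sketch below, in Python, encodes its four rows and matches a packet against them. It illustrates the logic only and is not the syntax of any real packet filter; the names `RULES` and `match` are assumptions, and the sketch covers only servers on the standard port (70).

```python
GOPHER_PORT = 70  # standard port; see footnote [21]


def high(port):
    """Client side: any port above 1023."""
    return port > 1023


def gopher(port):
    """Server side: the standard Gopher port."""
    return port == GOPHER_PORT


RULES = [
    # (direction, source port test, dest port test, ACK must be set, note)
    ("in",  high,   gopher, False, "Incoming session, client to server"),
    ("out", gopher, high,   True,  "Incoming session, server to client"),
    ("out", high,   gopher, False, "Outgoing session, client to server"),
    ("in",  gopher, high,   True,  "Outgoing session, server to client"),
]


def match(direction, protocol, src_port, dst_port, ack):
    """Return the note of the first matching rule, or None (default deny)."""
    if protocol != "tcp":
        return None
    for rule_dir, src_ok, dst_ok, ack_required, note in RULES:
        if rule_dir != direction:
            continue
        if not (src_ok(src_port) and dst_ok(dst_port)):
            continue
        if ack_required and not ack:
            continue  # e.g. a connection attempt coming from port 70
        return note
    return None
```

With these rules, an internal client's first packet to an external server (no ACK) matches the third row, while a packet from external port 70 without the ACK bit matches nothing and is dropped.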

8.7.1.2 Proxying characteristics of Gopher

The TIS FWTK http-gw proxy server can serve Gopher as well as HTTP. SOCKS does not include a modified Gopher client, but Gopher clients are, in general, not difficult to modify to use SOCKS; many of the Gopher clients freely available on the Internet support SOCKS as either a compile-time or run-time option. Using a Web browser that supports proxying, like Netscape Navigator or Mosaic, will give you proxy support for Gopher automatically.

8.7.1.3 Summary of Gopher recommendations

These recommendations are basically the same as for HTTP:

If you're going to run a Gopher server, use a dedicated bastion host if possible.
If you're going to run a Gopher server, carefully configure the Gopher server to control what it has access to; in particular, watch out for ways that someone could upload a program to the Gopher system (via mail or FTP, for example), and then execute it via the Gopher server.
Carefully control the external programs your Gopher server can access.
You can't allow internal hosts to access all Gopher servers without allowing them to access all TCP ports, because some Gopher servers use nonstandard port numbers. If you don't mind allowing your users access to all TCP ports, you can use packet filtering to examine the ACK bit to allow outgoing connections to those ports (but not incoming connections from those ports). If you do mind, then either restrict your users to servers on the standard port (70), or use proxying.
Configure your Gopher clients carefully, and warn your users not to reconfigure them based on external advice.
If possible, use a Web browser such as Mosaic or Netscape Navigator for your Gopher client, rather than a dedicated client. Your users are probably going to demand WWW access before Gopher access anyway, so you might as well figure out and secure just one application.

8.7.2 Wide Area Information Servers (WAIS)

WAIS indexes large text databases so that they can be searched efficiently by simple keyword or more complicated Boolean expressions. For example, you can ask for all the documents that mention "firewalls" or all the documents that mention "firewalls" but don't mention "fire marshals". (You might do this to make sure you don't get documents about literal firewalls.) WAIS was originally developed at Thinking Machines as a prototype information service, and is now widely used on the Internet for things like mailing-list archives and catalogs of various text-based information (library card catalogs, for example).
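
The "firewalls" but not "fire marshals" query can be illustrated with a few lines of Python. This is only a sketch of the Boolean condition itself: it scans a small in-memory dictionary rather than consulting an index as WAIS does, and the document names and contents are invented for the example.

```python
def matches(text: str, include: str, exclude: str) -> bool:
    """True if text mentions `include` and does not mention `exclude`."""
    lowered = text.lower()
    return include.lower() in lowered and exclude.lower() not in lowered


# Hypothetical documents, for illustration only.
documents = {
    "doc1": "Packet filtering firewalls and proxy servers",
    "doc2": "Fire marshals inspect firewalls in commercial buildings",
    "doc3": "Library card catalog",
}

hits = [
    name
    for name, text in documents.items()
    if matches(text, "firewalls", "fire marshals")
]
```

Here `hits` contains only `doc1`: `doc2` mentions both phrases and is excluded, and `doc3` mentions neither.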

WAIS servers present the same basic security concerns as the servers for all of the other common Internet services, such as FTP and HTTP: Can an attacker use this server to access something he shouldn't?

You address this problem just as you do with other servers: Secure and restrict the environment the server runs in. In this way, you ensure that even if the attacker gets out of the server, there's nothing else of interest to be found on the machine. Further, nothing else should trust the machine enough to significantly facilitate further break-ins.

Generally, WAIS servers do not run auxiliary programs the way HTTP and Gopher servers do, so with WAIS, you don't need to worry that attackers will trick your servers or upload auxiliary programs as they can with the HTTP and Gopher servers. Because WAIS clients are not generally extensible with new data types and auxiliary programs, as HTTP and Gopher clients are, you also don't have that can of worms to worry about.

WAIS servers do share one characteristic of HTTP and Gopher servers, however: while there is a standard port number for WAIS servers (TCP port 210), not all of them use it. See the discussion of this in the section on HTTP above for an understanding of the problems this can cause.

WAIS clients are generally standalone programs. They help you find WAIS servers, submit queries to those servers, display the results, submit follow-on queries based on previous results, and so on. WAIS information is generally text-based, so WAIS clients generally don't have the problems that HTTP and Gopher clients do regarding the safety of the data they retrieve.

Some WWW browsers include limited WAIS client support.

8.7.2.1 Packet filtering characteristics of WAIS

WAIS is a TCP-based service. WAIS clients use random ports above 1023. WAIS servers usually use port 210, but sometimes don't; see the discussion of nonstandard server ports above, in the section about HTTP.

Direction	Source Addr.	Dest. Addr.	Protocol	Source Port	Dest. Port	ACK Set	Notes
In	Ext	Int	TCP	>1023	210[23]	[24]	Incoming session, client to server
Out	Int	Ext	TCP	210[23]	>1023	Yes	Incoming session, server to client
Out	Int	Ext	TCP	>1023	210[23]	[24]	Outgoing session, client to server
In	Ext	Int	TCP	210[23]	>1023	Yes	Outgoing session, server to client

[23] 210 is the standard port number for WAIS servers, but some servers run on different port numbers.

[24] ACK is not set on the first packet of this type (establishing connection) but will be set on the rest.

8.7.2.2 Proxying characteristics of WAIS

If you use a proxying Web browser like Netscape Navigator or Mosaic to access WAIS, you automatically get proxy support in the client. As a straightforward single-connection protocol with plenty of user-specified information, WAIS lends itself to both modified-client and modified-procedure proxying. SOCKS support is commonly available in standalone WAIS clients.

8.7.2.3 Gateways to WAIS

A number of sites on the Internet provide WAIS gateways for HTTP clients, allowing people using WWW browsers to access WAIS servers indirectly. Gateways include:

http://www.ai.mit.edu/the-net/wais.html
http://www.wais.com/

8.7.2.4 Summary of WAIS recommendations

If you want to run a WAIS server, run it on a bastion host and on the standard port for WAIS (TCP port 210). Carefully configure the server to control what information it has access to.
You can't allow internal hosts to access all WAIS servers without allowing them to access all TCP ports, because some WAIS servers use nonstandard port numbers. If you don't mind allowing your users access to all TCP ports, you can use packet filtering to examine the ACK bit to allow outgoing connections to those ports (but not incoming connections from those ports). If you do mind, then either restrict your users to servers on the standard port (TCP port 210) or use proxying.
If possible, use a Web browser such as Mosaic or Netscape Navigator for your WAIS client, rather than a dedicated client. Your users are probably going to demand WWW access before WAIS access anyway, so you might as well figure out and secure just one application.
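
The two packet filtering choices in the recommendations above (all TCP ports with ACK-bit checking, or the standard port only) can be sketched as a single decision function. This Python fragment is an illustration under stated assumptions, not a real filter configuration: the function name `allow` and the `standard_port_only` switch are invented for the example, and it covers only sessions initiated from the internal network.

```python
WAIS_PORT = 210  # standard WAIS server port


def allow(direction, src_port, dst_port, ack, standard_port_only=False):
    """Decide whether a TCP packet belongs to an internally initiated session.

    direction is "out" for internal-to-external packets, "in" for the reverse.
    """
    def server_ok(port):
        # Either any server port, or only the standard one.
        return port == WAIS_PORT if standard_port_only else True

    if direction == "out":
        # Internal client talking to an external server, with or without ACK.
        return src_port > 1023 and server_ok(dst_port)
    if direction == "in":
        # Only replies are accepted: ACK must be set, so an outside host
        # cannot open a new connection from one of those ports.
        return ack and dst_port > 1023 and server_ok(src_port)
    return False
```

With `standard_port_only` left off, an internal client can reach a server on any port, but an inbound packet without the ACK bit is still refused; turning the switch on limits both directions to port 210.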

8.7.3 Archie

Archie is an Internet service that lets users search through indexes of anonymous FTP servers for strings or regular expressions that match file and directory names in the FTP archives. Archie continually polls public anonymous FTP servers to keep up to date with what's available at the sites.

8.7.3.1 Packet filtering characteristics of Archie

Archie is a UDP-based service. Dedicated Archie clients use ports above 1023; Archie servers use port 1525. As we discuss in Chapter 6, UDP-based services in general are a problem for packet filtering firewalls. Alternative ways to access Archie (other than directly via the Archie protocol) are discussed in the sections below.

Direction	Source Addr.	Dest. Addr.	Protocol	Source Port	Dest. Port	ACK Set	Notes
Out	Int	Ext[25]	UDP	>1023	1525	[26]	Outgoing query, client to server
In	Ext[25]	Int	UDP	1525	>1023	[26]	Incoming response, server to client

[25] The external address should be one of the well-known Archie servers.

[26] UDP packets do not have ACK bits.

Sites with packet filtering systems typically filter out all UDP traffic, and then open specific, restricted peepholes between their internal hosts and their bastion host (not the whole outside world) for key UDP-based services.
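
Because UDP packets carry no ACK bit, the Archie rules in the table above have to rely on addresses and ports alone. The Python sketch below shows one way to express that: an allow-list of well-known Archie servers (footnote [25]) combined with the port tests. The addresses are placeholders from the 192.0.2.0/24 documentation range, not real Archie servers, and the name `allow_archie` is an assumption.

```python
import ipaddress

ARCHIE_PORT = 1525

# Placeholder addresses standing in for the well-known Archie servers.
ARCHIE_SERVERS = {
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("192.0.2.20"),
}


def allow_archie(direction, protocol, src_addr, dst_addr, src_port, dst_port):
    """Apply the two Archie rules; everything else is denied."""
    if protocol != "udp":
        return False
    src = ipaddress.ip_address(src_addr)
    dst = ipaddress.ip_address(dst_addr)
    if direction == "out":
        # Outgoing query, client to server.
        return dst in ARCHIE_SERVERS and src_port > 1023 and dst_port == ARCHIE_PORT
    if direction == "in":
        # Incoming response, server to client.
        return src in ARCHIE_SERVERS and src_port == ARCHIE_PORT and dst_port > 1023
    return False
```

Note that the inbound rule cannot tell a genuine response from an unsolicited packet with a forged source address and port, which is why the address restriction matters here.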

8.7.3.2 Proxying characteristics of Archie

Because Archie is a UDP-based service, it is not supported by SOCKS; however, it can be proxied by the UDP Packet Relayer. TIS FWTK does not provide an Archie proxy server and does not provide a generic proxy server for UDP (plug-gw supports only TCP). Using one of the HTTP-Archie gateways (discussed below) gives you the proxying of your Web browser, and provides a better user interface than many dedicated Archie clients.

8.7.3.3 Providing Archie service to your users

There are several ways to let your users access Archie servers, including Telnet, email, WWW gateways, and the dedicated Archie protocol. Sites running Archie servers generally prefer people to access them via the dedicated Archie protocol (or via WWW gateways, which in turn access the Archie servers via the dedicated Archie protocol) because this is more efficient for them and allows them to serve a greater number of users.

Archie access via Telnet

If you allow outgoing Telnet, you can reach an Archie server by opening a Telnet session to it and logging in as "archie" (no password required). This gives you access to a command-line-oriented Archie query program. You can then either capture a log of the Telnet session to save your results or tell the server to email you the results when you're done. Issue the command "help" for a description of the server's command language.

Archie access via email

Generally, Archie servers can also be accessed by email. If you send an email message to "archie" at one of the Archie sites (e.g., archie@archie.ans.net), it will treat the mail as a query to be processed and will email back the results. Send the query "help" to find out how to use the service.
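
As a rough illustration, the "help" query could be composed with Python's standard `email` package as shown below. The sketch only builds the message and does not send it; the sender address is a placeholder, the helper name `build_archie_query` is invented, and placing the query in the message body is an assumption rather than something the Archie sites document here.

```python
from email.message import EmailMessage


def build_archie_query(sender: str, server_address: str, query: str = "help") -> EmailMessage:
    """Compose (but do not send) an Archie query as a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server_address
    # Assumption: the query text goes in the body of the message.
    msg.set_content(query)
    return msg


# user@example.com is a placeholder sender address.
msg = build_archie_query("user@example.com", "archie@archie.ans.net")
```

Delivering the message would go through whatever mail path your site already allows, which is part of the appeal of this approach: no new firewall openings are needed.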

Archie access via WWW gateways

Like many services, Archie servers can also be accessed via a variety of WWW gateways; simply point your Web browser at one of the following:

http://www.lerc.nasa.gov/Doc/archieplex-httpd.html
http://hoohoo.ncsa.uiuc.edu/archie.html
http://www.nexor.co.uk/archie.html

Archie access via packet filtering and the dedicated Archie protocol

Archie servers can be accessed directly with dedicated Archie clients. The problem, as discussed above, is that Archie is UDP-based, and most sites block all but a very restricted set of UDP packets at their firewall in order to avoid security problems with RPC-based services like NFS and NIS/YP.

What makes it possible to provide direct Archie access via packet filtering is that there are so few Archie servers. Many sites consider it safe enough to open a peephole in the packet filtering for Archie to the well-known Archie servers their users are likely to use. There are several reasons for this:

The list of servers is small and fairly stable.
For an attack to get through this peephole, it would have to come (or at least convincingly appear to come) from one of these sites. Given that these are all well-known and very heavily used sites, any break-in to them is probably going to be noticed very quickly, particularly if it takes down the Archie server in order to use its port to attack some other site. Even forging packets is likely to result in widespread service outages and quick detection.
Even if someone did break into one of the Archie servers, what are the chances that the attacker would then go after your site in particular?

You may also decide that you are willing to take a chance and open your packet filtering sufficiently to allow your users to access the handful of major Archie servers in the world, or at least the default servers you've configured into the client you're distributing.

8.7.3.4 Running an Archie server

Providing an Archie server is a major undertaking, requiring the dedication of a fairly substantial amount of computing power, disk space, and network bandwidth. Because of this, there are only about 20 Archie servers in the world; telling you how to become the 21st is beyond the scope of this book. In any case, it's unlikely to be beneficial to the world for you to do it. If there are too many servers, they are more likely to be incomplete or out of date and to consume bandwidth people need to access the FTP servers.

8.7.3.5 Summary of Archie recommendations

Don't run an Archie server.
Teach your users to access Archie via WWW gateways.