DNS is a distributed database system that translates hostnames to IP addresses and IP addresses to hostnames (e.g., it translates hostname miles.somewhere.net to IP address 192.168.244.34). DNS is also the standard Internet mechanism for storing and accessing several other kinds of information about hosts; it provides information about a particular host to the world at large. For example, if a host cannot receive mail directly, but another machine will receive mail for it and pass it on, that information is communicated with an MX record in DNS.
DNS clients include any program that needs to do any of the following:
Fundamentally, any program that uses hostnames can be a DNS client. This includes essentially every program that has anything to do with networking, including both client and server programs for Telnet, SMTP, FTP, and almost any other network service. DNS is thus a fundamental networking service, upon which other network services rely.
Other protocols may be used to provide this kind of information. For example, NIS/YP is used to provide host information within a network. However, DNS is the service used for this purpose across the Internet, and clients that need to access Internet hosts will have to use DNS, directly or indirectly. On networks that use NIS/YP or other methods internally, the server for the other protocol usually acts as a DNS proxy for the client. Many clients can also be configured to use multiple services, so that if a host lookup fails, they will retry using another method. Thus, a client might start by looking in NIS/YP, which will show only local hosts, but try DNS if that fails, or it might start by looking in DNS, and then try a file on its own disk if that fails (so that you can put in personal favorite names, for example).
In UNIX, DNS is implemented by the Berkeley Internet Name Domain (BIND). On the client side is the resolver, a library of routines called by network processes. On the server side is a daemon called named (also known as in.named on some systems).
DNS is designed to forward queries and responses between clients and servers, so that servers may act on behalf of clients or other servers. This capability is very important to your ability to build a firewall that handles DNS services securely.
How does DNS work? Essentially, when a client needs a particular piece of information (e.g., the IP address of host ftp.somewhere.net), it asks its local DNS server for that information. The local DNS server first examines its own cache to see if it already knows the answer to the client's query. If not, the local DNS server asks other DNS servers, in turn, to discover the answer to the client's query. When the local DNS server gets the answer (or decides that it can't for some reason), it caches any information it got and answers the client. For example, to find the IP address for ftp.somewhere.net, the local DNS server first asks one of the public root nameservers which machines are nameservers for the net domain. It then asks one of those net nameservers which machines are nameservers for the somewhere.net domain, and then it asks one of those nameservers for the IP address of ftp.somewhere.net.
This asking and answering is all transparent to the client. As far as the client is concerned, it has communicated only with the local server. It doesn't know or care that the local server may have contacted several other servers in the process of answering the original question.
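The walk from the root servers down to the answer can be sketched as a toy model. This is only an illustration, not how any real nameserver is written: there is no network traffic, and the server labels ("root", "net-server", "somewhere-server") and the address given for ftp.somewhere.net are made up for the example.

```python
# Toy model of the lookup walk; every name and address here is hypothetical.

# Which server each server refers you to for a given domain suffix.
REFERRALS = {
    "root": {"net": "net-server"},
    "net-server": {"somewhere.net": "somewhere-server"},
}

# Final answers held by the server responsible for the domain.
ANSWERS = {
    "somewhere-server": {"ftp.somewhere.net": "192.168.244.35"},
}


class LocalServer:
    """Stands in for the local DNS server that clients talk to."""

    def __init__(self):
        self.cache = {}   # hostname -> address learned from earlier lookups
        self.asked = []   # servers contacted so far, in order

    def resolve(self, hostname):
        # 1. Check the cache first.
        if hostname in self.cache:
            return self.cache[hostname]
        # 2. Otherwise start at the root and follow referrals.
        server = "root"
        while True:
            self.asked.append(server)
            answers = ANSWERS.get(server, {})
            if hostname in answers:
                self.cache[hostname] = answers[hostname]
                return answers[hostname]
            next_server = None
            for suffix, target in REFERRALS.get(server, {}).items():
                if hostname == suffix or hostname.endswith("." + suffix):
                    next_server = target
                    break
            if next_server is None:
                return None   # nobody can answer this query
            server = next_server
```

Calling `LocalServer().resolve("ftp.somewhere.net")` contacts the three servers in turn; a second call for the same name on the same object is answered from the cache without contacting anyone, which is all the client ever sees.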
There are two types of DNS network activities: lookups and zone transfers. Lookups occur when a DNS client (or a DNS server acting on behalf of a client) queries a DNS server for information, e.g., the IP address for a given hostname, the hostname for a given IP address, the name server for a given domain, or the mail exchanger for a given host. Zone transfers occur when a DNS server (the secondary server) requests from another DNS server (the primary server) everything the primary server knows about a given piece of the DNS naming tree (the zone). Zone transfers happen only among servers that are supposed to be providing the same information; a server won't try to do a zone transfer from a random other server under normal circumstances. People occasionally do zone transfers in order to gather information (this is OK when they're calculating what the most popular hostname on the Internet is, but bad when they're trying to find out what hosts to attack at your site).
For performance reasons, DNS lookups are usually executed using UDP. If some of the data is lost in transit by UDP (remember that UDP doesn't guarantee delivery), the lookup will be redone using TCP. There may be other exceptions. Figure 8.13 shows a DNS name lookup.
A DNS server uses well-known port 53 for all its UDP activities and as its server port for TCP. It uses a random port above 1023 for TCP requests. A DNS client uses a random port above 1023 for both UDP and TCP. You can thus differentiate between the following:
DNS zone transfers are performed using TCP. The connection is initiated from a random port above 1023 on the secondary server (which requests the data) to port 53 on the primary server (which sends the data requested by the secondary). A secondary server must also do a regular DNS query of a primary server to decide when to do a zone transfer. Figure 8.14 shows a DNS zone transfer.
DNS is structured so that servers always act as proxies for clients. It's also possible to use a DNS feature called forwarding so that a DNS server is effectively a proxy for another server. The remainder of this DNS discussion describes the use of these built-in proxying features of DNS.
In most implementations, it would be possible to modify the DNS libraries to use a modified-client proxy. On machines that do not support dynamic linking, using a modified-client proxy for DNS would require recompiling every network-aware program. Because users don't directly specify server information for DNS, modified-procedure proxies seem nearly impossible.
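The port conventions just described are what a packet filter has to work with. As a rough sketch (the function name and the labels it returns are our own invention, and a real filter would also look at addresses, direction, and TCP flags), the distinctions can be written as:

```python
DNS_PORT = 53


def classify_dns_packet(protocol, src_port, dst_port):
    """Return a coarse label for a packet, using only protocol and ports."""
    high_src = src_port > 1023
    high_dst = dst_port > 1023
    if protocol == "udp":
        # Servers use port 53 for all their UDP activity.
        if src_port == DNS_PORT and dst_port == DNS_PORT:
            return "server-to-server query or response"
        if high_src and dst_port == DNS_PORT:
            return "client-to-server query"
        if src_port == DNS_PORT and high_dst:
            return "server-to-client response"
    elif protocol == "tcp":
        # Ports alone cannot tell a TCP lookup from a zone transfer.
        if high_src and dst_port == DNS_PORT:
            return "TCP query or zone transfer request"
        if src_port == DNS_PORT and high_dst:
            return "TCP response or zone transfer data"
    return "not DNS by these rules"
```

Note that the TCP cases are ambiguous on purpose: by ports alone, a zone transfer request looks the same as a lookup that was retried over TCP.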
In fact, there are two separate DNS data trees: one for obtaining information by hostname (such as the IP address, CNAME record, HINFO record, or TXT record that corresponds to a given hostname), and one for obtaining information by IP address (the hostname for a given address).
For example, here is a sample of the DNS data for a fake domain somebody.net:
somebody.net.   IN  SOA   tiger.somebody.net. root.tiger.somebody.net. (
                          1001     ; serial number
                          36000    ; refresh (10 hr)
                          3600     ; retry (1 hr)
                          3600000  ; expire (1000 hr)
                          36000    ; default ttl (10 hr)
                          )
                IN  NS    tiger.somebody.net.
                IN  NS    lion.somebody.net.
tiger           IN  A     192.168.2.34
                IN  MX    5 tiger.somebody.net.
                IN  MX    10 lion.somebody.net.
                IN  HINFO INTEL-486 BSDI
ftp             IN  CNAME tiger.somebody.net.
lion            IN  A     192.168.2.35
                IN  MX    5 lion.somebody.net.
                IN  MX    10 tiger.somebody.net.
                IN  HINFO SUN-3 SUNOS
www             IN  CNAME lion.somebody.net.
wais            IN  CNAME lion.somebody.net.
alaska          IN  NS    bear.alaska.somebody.net.
bear.alaska     IN  A     192.168.2.81
This domain would also need a corresponding set of PTR records to map IP addresses back to hostnames. To translate an IP address to a hostname, you reverse the components of the IP address, append .IN-ADDR.ARPA, and look up the DNS PTR record for that name. For example, to translate IP address 192.168.2.34, you would look up the PTR record for 34.2.168.192.IN-ADDR.ARPA.
2.168.192.IN-ADDR.ARPA.  IN  SOA  tiger.somebody.net. root.tiger.somebody.net. (
                                  1001     ; serial number
                                  36000    ; refresh (10 hr)
                                  3600     ; retry (1 hr)
                                  3600000  ; expire (1000 hr)
                                  36000    ; default ttl (10 hr)
                                  )
                         IN  NS   tiger.somebody.net.
                         IN  NS   lion.somebody.net.
34                       IN  PTR  tiger.somebody.net.
35                       IN  PTR  lion.somebody.net.
81                       IN  PTR  bear.alaska.somebody.net.
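The reverse-and-append rule for building the name used in an address-to-hostname lookup is mechanical, so it is easy to sketch. The helper name below is our own; it handles IPv4 addresses only, and uses the standard ipaddress module just to reject malformed input.

```python
import ipaddress


def reverse_lookup_name(address):
    """Build the IN-ADDR.ARPA name whose PTR record holds the hostname."""
    # IPv4Address() raises ValueError for anything that is not a valid address.
    octets = str(ipaddress.IPv4Address(address)).split(".")
    # Reverse the four components and append the reverse-lookup domain.
    return ".".join(reversed(octets)) + ".IN-ADDR.ARPA"
```

For tiger's address in the sample data, `reverse_lookup_name("192.168.2.34")` gives `34.2.168.192.IN-ADDR.ARPA`, which is the owner name of the "34" PTR record in the zone above. Python's `ipaddress` objects also offer a `reverse_pointer` attribute that builds the same name in lowercase.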
The first security problem with DNS is that many DNS servers and clients can be tricked by an attacker into believing bogus information. Many clients and servers don't check to see whether all the answers they get relate to questions they actually asked, or whether the answers they get are coming from the server they asked. Servers, in particular, may cache these "extra" answers without really thinking about it, and answer later queries with this bogus cached data. This lack of checking can allow an attacker to give false data to your clients and servers. For example, an attacker could use this capability to load your server's cache with information that says that his IP address maps to the hostname of a host you trust for password-less access via rlogin. (This is only one of several reasons you shouldn't allow the BSD "r" commands across your firewall; see the full discussion of these commands earlier in this chapter.)
The attack described in the previous section points out the problem of mismatched data between the hostname and IP address trees in the DNS. In a case like the one we've described, if you look up the hostname corresponding to the attacker's IP address (this is called a reverse lookup), you get back the name of a host you trust. If you then look up the IP address of this hostname (which is called a double-reverse lookup), you should see that the IP address doesn't match the one the attacker is using. This should alert you that something suspicious is going on. Reverse and double-reverse lookups are described in more detail in the section called "Set up a `fake' DNS server on the bastion host for the outside world to use" later in this DNS discussion.
Any program that makes authentication or authorization decisions based on the hostname information it gets from DNS should be very careful to validate the data with this reverse lookup/double-reverse lookup method. In some operating systems (for example, SunOS 4.x and later), this check is automatically done for you by the gethostbyaddr() library function. In most other operating systems, you have to do the check yourself. Make sure that you know which approach your own operating system takes and make sure that the daemons that are making such decisions in your system do the appropriate validation. (And be sure you're preserving this functionality if you modify or replace the vendor's libc .) Better yet, don't do any authentication or authorization based solely on hostname or even on IP address; there is no way to be sure that a packet comes from the IP address it claims to come from, unless there is some kind of cryptographic authentication within the packet that only the true source could have generated.
Some implementations of double-reverse lookup fail on hosts with multiple addresses, e.g., dual-homed hosts used for proxying. If both addresses are registered at the same name, a DNS lookup by name will return both of them, but many programs will read only the first. If the connection happened to come from the second address, the double-reverse will incorrectly fail even though the host is correctly registered. Although you should avoid using double-reverse implementations that have this flaw, you may also want to ensure that on your externally visible multi-homed hosts, lookup by address returns a different name for each address, and that those names have only one address returned when it is looked up. For example, for a host named "foo" with two interfaces named "e0" and "e1", have lookups of "foo" return both addresses, lookups of "foo-e0" and "foo-e1" return only the address of that interface, and lookups by IP address return "foo-e0" or "foo-e1" (but not simply "foo") as appropriate.
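A minimal sketch of a double-reverse check that avoids the multiple-address flaw is shown below. It assumes Python's standard socket.gethostbyaddr() and socket.gethostbyname_ex() as the lookup routines; the function name and the by_addr/by_name parameters are our own, included so the check can be exercised against canned data. As discussed above, passing this check is a consistency test, not proof of identity.

```python
import socket


def double_reverse_ok(address, by_addr=socket.gethostbyaddr,
                      by_name=socket.gethostbyname_ex):
    """Return the verified hostname for address, or None if the check fails."""
    try:
        # Reverse lookup: address -> hostname.
        hostname, _aliases, _addrs = by_addr(address)
        # Double-reverse lookup: hostname -> list of addresses.
        _name, _aliases, forward_addrs = by_name(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    # Compare against every address returned, not just the first one,
    # so that correctly registered multi-homed hosts still pass.
    return hostname if address in forward_addrs else None
```

A program would call `double_reverse_ok(peer_address)` on the address of an incoming connection and treat a `None` result as a reason to distrust whatever name the reverse lookup claimed.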
Another problem you may encounter when supporting DNS with a firewall is that it may reveal information that you don't want revealed. Some organizations view internal hostnames (as well as other information about internal hosts) as confidential information. They want to protect these host names much as they do their internal telephone directories. They're nervous because internal hostnames may reveal project names or other product intelligence, or because these names may reveal the type of the hosts (which may make an attack easier). For example, it's easy to guess what kind of system something is if its name is "lab-sun" or "cisco-gw".
Even the simplest hostname information can be helpful to an attacker who wants to bluff his way into your site, physically or electronically. Using information in this way is an example of what is commonly called a social engineering attack. The attacker first examines the DNS data to determine the name of a key host or hosts at your site. Such hosts will often be listed as DNS servers for the domain, or as MX gateways for lots of other hosts. Next, the attacker calls or visits your site, posing as a service technician, and says he needs to work on these hosts. He'll ask for the passwords for the hosts (if he calls on the telephone), or ask to be shown to the machine room (if he visits the site). Because the attacker seems legitimate, and seems to have inside information about the site - after all, he knows the names of your key hosts - he'll often gain access. Social engineering attacks like this take a lot of brazenness on the part of the attacker, particularly if he actually visits your site, but you'd be amazed at how often such attacks succeed.
Besides internal hostnames, other information is often placed within the DNS that is useful locally, but that you'd really rather an attacker not have access to. DNS HINFO and TXT resource records are particularly revealing:
Attackers will often obtain DNS information about your site wholesale by contacting your DNS server and asking for a zone transfer, as if they were a secondary server for your site. You can prevent this either with packet filtering (by blocking TCP-based DNS queries, which will unfortunately block more than just zone transfers) or through the xfernets directive of current implementations of BIND (see the BIND documentation for more information).
The question to keep in mind when considering what DNS data to reveal is, "Why give attackers any more information than necessary?" The following sections provide some suggestions to help you reveal only the data you want people to have.
We've mentioned that DNS has a query-forwarding capability. By taking advantage of this capability, you can give internal hosts an unrestricted view of both internal and external DNS data, while restricting external hosts to a very limited ("sanitized") view of internal DNS data. You might want to do this for such reasons as:
Figure 8.15 shows how to set up DNS to hide information; the following sections describe all the details.
The first step in hiding DNS information from the external world is to set up a fake DNS server on a bastion host. This server claims to be authoritative for your domain. Make it the server for your domain that is named by the Name Server records maintained by your parent domain. If you have multiple such servers for the outside world to talk to (which you should - some or all of the rest may belong to your service provider), make your fake server the primary server of the set of authoritative servers; make the others secondaries of this primary server.
As far as this fake server on the bastion host is aware, it knows everything about your domain. In fact, though, all it knows about is whatever information you want revealed to the outside world. This information typically includes only basic hostname and IP address information about the following hosts:
In addition, you'll need to publish MX records for any host or domain names that are used as part of email addresses in email messages and Usenet news postings, so that people can reply to these messages. Keep in mind that people may reply to messages days, weeks, months, or even years after they were sent. If a given host or domain name has been widely used as part of an email address, you may need to preserve an MX record for that host or domain forever, or at least until well after it's dead and gone, so that people can still reply to old messages. If it has appeared in print, "forever" may be all too accurate; sites still receive electronic mail for machines decommissioned five and 10 years ago.
You will also need to publish fake information for any machines that can contact the outside world directly. Many servers on the Internet (for example, most major anonymous FTP servers) insist on knowing the hostname (not just the IP address) of any machines that contact them, even if they do nothing with the hostname but log it. In the DNS resource records, A (name-to-address mapping) records and PTR (address-to-name mapping) records handle lookups for names and addresses.
As we've mentioned earlier, machines that have IP addresses and need hostnames do reverse lookups. With a reverse lookup, the server starts with the remote IP address of the incoming connection, and looks up the hostname that the connection is coming from. It takes the IP address (for example, 172.16.19.67), permutes it in a particular way (reverses the parts and adds .IN-ADDR.ARPA to get 67.19.16.172.IN-ADDR.ARPA), and looks up a PTR record for that name. The PTR record should return the hostname for the host with that address (e.g., mycroft.somewhere.net), which the server then uses for its logs or whatever.
How can you deal with these reverse lookups? If all these servers wanted was a name to log, you could simply create a wildcard PTR record. That record would indicate that a whole range of addresses belongs to an unknown host in a particular domain. For example, you might have a lookup for *.19.16.172.IN-ADDR.ARPA return unknown.somewhere.net. Returning this information would be fairly helpful; it would at least tell the server administrator whose machine it was (somewhere.net's). Anyone who had a problem with the machine could pursue it through the published contacts for the somewhere.net domain.
There is a problem with doing only this, however. In an effort to validate the data returned by the DNS , more and more servers (particularly anonymous FTP servers) are now doing a double-reverse lookup, and won't talk to you unless the double-reverse lookup succeeds. This is the same kind of lookup we mentioned above; it's certainly necessary for people who provide a service where they need any degree of authentication of the requesting host. Whether or not anonymous FTP is such a service is another question. Some people believe that once you put a file up for anonymous FTP , you no longer have reason to try to authenticate hosts; after all, you're trying to give information away. People running anonymous FTP servers that do double-reverse lookup argue that people who want services have a responsibility to be members of the network community and that requires being identifiable. Whichever side of the argument you're on, it is certainly true that the maintainers of several of the largest and best-known anonymous FTP servers are on the side that favors double-reverse lookup, and will not provide service to you unless double-reverse lookup succeeds.
In a double-reverse lookup, a DNS client:
Your fake server needs to provide consistent fake data for all hosts in your domain whose IP addresses are going to be seen by the outside world. For every IP address you own, the fake server must publish a PTR record with a fake hostname, as well as a corresponding A record that maps the fake hostname back to the IP address. For example, for address 172.16.1.2, you might publish a PTR record with the name host-172-16-1-2.somewhere.net and a corresponding A record which maps host-172-16-1-2.somewhere.net back to the corresponding IP address (172.16.1.2). When you connect to some remote system that attempts to do a reverse lookup of your IP address (e.g., 172.16.1.2) to determine your hostname, that system will get back the fake hostname (e.g., host-172-16-1-2). If the system then attempts to do a double-reverse lookup to translate that hostname to an IP address, it will get back 172.16.1.2, which matches the original IP address and satisfies the consistency check.
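Because the fake PTR and A records must agree for every address, it is safer to generate them than to type them. The following is a sketch only: the function name is hypothetical, the host-a-b-c-d naming scheme is the one used in the example above, it handles IPv4 networks only, and the lines it produces use fully qualified owner names so that they do not depend on the zone file's origin.

```python
import ipaddress


def fake_records(network, domain):
    """Yield (ptr_line, a_line) pairs, one matching pair per host address."""
    for addr in ipaddress.ip_network(network).hosts():
        octets = str(addr).split(".")
        # Fake hostname derived from the address, e.g. host-172-16-1-2.
        fake_name = "host-" + "-".join(octets) + "." + domain + "."
        # Owner name of the PTR record: reversed address under IN-ADDR.ARPA.
        ptr_owner = ".".join(reversed(octets)) + ".IN-ADDR.ARPA."
        yield (f"{ptr_owner} IN PTR {fake_name}",
               f"{fake_name} IN A {addr}")
```

For example, iterating over `fake_records("172.16.1.0/30", "somewhere.net")` produces pairs for 172.16.1.1 and 172.16.1.2; the PTR lines belong in the reverse zone and the A lines in the forward zone, so that a double-reverse lookup of either address comes back to where it started.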
If you are strictly using proxying to connect internal hosts to the external world, you don't need to set up the fake information for your internal hosts; you simply need to put up information for the host or hosts running the proxy server. The external world will see only the proxy server's address. For a large network, this by itself may make using proxy service for FTP worthwhile.
Your internal machines need to use the real DNS information about your hosts, not the fake information presented to the outside world. You do this through a standard DNS server setup on some internal system. Your internal machines may also want to find out about external machines, though, e.g., to translate the hostname of a remote anonymous FTP site to an IP address.
One way to accomplish this is to provide access to external DNS information by configuring your internal DNS server to query remote DNS servers directly, as appropriate, to resolve queries from internal clients about external hosts. Such a configuration, however, would require opening your packet filtering to allow your internal DNS server to talk to these remote DNS servers (which might be on any host on the Internet). This is a problem because DNS is UDP-based, and as we discuss in Chapter 6, you need to block UDP altogether in order to block outside access to vulnerable RPC-based services like NFS and NIS/YP.
Fortunately, the most common DNS server (the UNIX named program) provides a solution to this dilemma: the forwarders directive in the /etc/named.boot server configuration file. The forwarders directive tells the server that, if it doesn't know the information itself already (either from its own zone information or from its cache), it should forward the query to a specific server and let this other server figure out the answer, rather than try to contact servers all over the Internet in an attempt to determine the answer itself. In the /etc/named.boot configuration file, you set up the forwarders line to point to the fake server on the bastion host; the file also needs to contain a "slave" line, to tell it to only use the servers on the forwarders line, even if the forwarders are slow in answering.
The use of the forwarders mechanism doesn't really have anything to do with hiding the information in the internal DNS server; it has everything to do with making the packet filtering as strict as possible (i.e., applying the principle of least privilege), by making it so that the internal DNS server need only be able to talk to the bastion host DNS server, not to DNS servers throughout the whole Internet.
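The decision an internal server makes under this arrangement can be sketched in a few lines. This models the behavior only and is not how named is implemented; the class name and the idea of passing the forwarder in as a function are our own conventions for the example.

```python
class ForwardOnlyServer:
    """Toy internal server: own zone, then cache, then one forwarder."""

    def __init__(self, zone, forwarder):
        self.zone = dict(zone)      # data this server is authoritative for
        self.cache = {}             # answers learned from the forwarder
        self.forwarder = forwarder  # callable: hostname -> address or None

    def lookup(self, hostname):
        # Real internal data never leaves the internal server.
        if hostname in self.zone:
            return self.zone[hostname]
        if hostname in self.cache:
            return self.cache[hostname]
        # Everything else goes to the bastion host's server and nowhere
        # else, so packet filtering only has to allow that one path.
        answer = self.forwarder(hostname)
        if answer is not None:
            self.cache[hostname] = answer
        return answer
```

The point of the sketch is the last step: the internal server never walks the Internet's nameservers itself, so the only DNS conversation the packet filters must permit is the one with the bastion host.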
If internal hosts can't contact external hosts, you may not want to bother setting things up so that they can resolve external host names. SOCKS proxy clients can be set up to use the external name server directly. This simplifies your name service configuration somewhat, but it complicates your proxying configuration, and some users may want to resolve hostnames even though they can't reach them (for example, they may be interested in knowing whether the hostname in an email address is valid).
The next step is to configure your internal DNS clients to ask all their queries of the internal server. On UNIX systems, you do this through the /etc/resolv.conf file. There are two cases:
In either case, as far as the client is concerned, it asked a question of the internal server and got an answer from the internal server. The client doesn't know whether the internal server already knew the answer or had to obtain the answer from other servers (indirectly, via the bastion server). Therefore, the /etc/resolv.conf file will look perfectly standard on internal clients.
The key to this whole information-hiding configuration is that DNS clients on the bastion host must query the internal server for information, not the server on the bastion host. This way, DNS clients on the bastion host (such as Sendmail, for example) can use the real hostnames and so on for internal hosts, but clients in the outside world can't access the internal data.
DNS server and client configurations are completely separate. Many people assume that they must have configuration files in common, that the clients will automatically know about the local server, and that pointing them elsewhere will also point the server elsewhere. In fact, there is no overlap. Clients never read /etc/named.boot, which tells the server what to do, and the server never reads /etc/resolv.conf, which tells the clients what to do.
Again, there are two cases:
DNS clients on the bastion host could obtain information about external hosts more directly by asking the DNS server on the bastion host instead of the one on the internal host. However, if they did that, they'd be unable to get the "real" internal information, which only the server on the internal host has. They're going to need that information, because they're talking to the internal hosts as well as the external hosts.
The approach we've described above is not the only option. Suppose that you don't feel it's necessary to hide your internal DNS data from the world. In this case, your DNS configuration is similar to the one we've described above, but it's somewhat simpler. Figure 8.18 shows how DNS works without information hiding.
With this alternate approach, you should still have a bastion host DNS server and an internal DNS server; however, one of these can be a secondary server of the other. Generally, it's easier to make the bastion DNS server a secondary of the internal DNS server, and to maintain your DNS data on the internal server. You should still configure the internal DNS server to forward queries to the bastion host DNS server, but the bastion host DNS clients can be configured to query the bastion host server instead of the internal server.
You need to configure any packet filtering system between the bastion host and the internal server to allow the following (see the table below for details):
If the bastion host is also a DNS secondary server and the internal host is the corresponding DNS primary server, you also have to allow the following: