Chapter 8. Growing Your Domain

8.1. How Many Name Servers?

We set up two name servers in Chapter 4, "Setting Up BIND". Two servers are as few as you'll ever want to run. Depending on the size of your network, you may need to run many more than just two servers. It is not uncommon to run from five to seven servers, with one of them off-site. How many name servers are enough? You'll have to decide that based on your network. Here are some guidelines to help out:

Run at least one name server on each network or subnet you have. This removes routers as a point of failure. Make the most of any multihomed hosts you have, since they are (by definition) attached to more than one network.
If you have a file server and some diskless nodes, run a name server on the file server to serve this group of machines.
Run name servers near, but not necessarily on, large multiuser computers. The users and their processes probably generate a lot of queries, and as an administrator, you will work harder to keep a multiuser host up. But balance their needs against the risk of running a name server -- a security-critical server -- on a system that lots of people have access to.
Run at least one name server off-site. This makes your data available when your network isn't. You might argue that it's useless to look up an address when you can't reach the host. Then again, the off-site name server may be available if your network is reachable, but your other name servers are down. If you have a close relationship with an organization on the Internet -- say another university or a business partner -- they may be willing to run a slave for you.

Figure 8-1 shows a sample topology and a brief analysis to show you how this might work.

Figure 8-1. Sample network topology

Notice that if you follow our guidelines, there are still a number of places you could choose to run a name server. Host d, the file server for hosts a, b, c, and e, could run a name server. Host g, a big, multiuser host, is another good candidate. But probably the best choice is host f, the smaller host with interfaces on both networks. You'll need to run only one name server instead of two, and it'll run on a closely watched host. If you want more than one name server on either network, you can also run one on d or g.

8.1.1. Where Do I Put My Name Servers?

In addition to giving you a rough idea of how many name servers you'll need, these criteria should also help you decide where to run name servers (e.g., on file servers and multihomed hosts). But there are other important considerations when choosing the right host.

Other factors to keep in mind are the host's connectivity, the software it runs (BIND and otherwise), maintaining the homogeneity of your name servers, and security:

Connectivity

It's important that name servers be well-connected. Having a name server running on the fastest, most reliable host on your network won't do you any good if the host is mired in some backwater subnet of your network behind a slow, flaky serial line. Try to find a host close to your link to the Internet (if you have one), or find a well-connected Internet host to act as a slave for your zone. And on your own network, try to run name servers near the hubs of your network.

It's doubly important that your primary master name server be well connected. The primary needs good connectivity to all the slaves that update from it, for reliable zone transfers. And, like any name server, it'll benefit from fast, reliable networking.

Software

Another factor to consider in choosing a host for a name server is the software the host runs. Software-wise, the best candidate for a name server is a host running a vendor-supported version of BIND 8.2.3 or 9.1.0 and a robust implementation of TCP/IP (preferably based on 4.3 or 4.4 BSD Unix's networking -- we're Berkeley snobs). You can compile your own 8.2.3 or 9.1.0 BIND from the sources -- it's not hard, and the latest versions are very reliable -- but you'll probably have a tough time getting your vendor to support it. If you don't absolutely need a feature of BIND 8, you may be able to get away with running your vendor's port of older BIND code, like 4.9.7, which will give you the benefit of your vendor's support, for what that's worth.

Homogeneity

One last thing to take into account is the homogeneity of your name servers. As much as you might believe in "open systems," hopping between different versions of Unix can be frustrating and confusing. Avoid running name servers on lots of different platforms, if you can. You can waste a lot of time porting your scripts (or ours!) from one operating system to another or looking for the location of nslookup or named.conf on three different Unixes. Moreover, different vendors' versions of Unix tend to support different versions of BIND, which can cause all sorts of frustration. If you need the security features of BIND 8 or 9 on all your name servers, for example, choose a platform that supports BIND 8 or 9 for all your name servers.

Security

Since you would undoubtedly prefer that hackers not commandeer your name server to assist them in attacking your own hosts or other networks across the Internet, it's important to run your name server on a secure host. Don't run a name server on a big, multiuser system if you can't trust its users. If you have certain computers that are dedicated to hosting network services but don't permit general logins, those are good candidates for running name servers. If you have only one or a few really secure hosts, consider running the primary master name server on one of those, since its compromise would be more significant than the compromise of the slaves.

Though these are really secondary considerations -- it's more important to have a name server on a given subnet than to have it running on the perfect host -- do keep these criteria in mind when making a choice.

8.1.2. Capacity Planning

If you have heavily populated networks or users who do a lot of name server-intensive work, you may find that you need more name servers than we've recommended to handle the load. Or our recommendations may be fine for a little while, but as people add hosts to your nets or install new name server-intensive programs, you may find your name servers bogged down by queries.

Just which tasks are "name server-intensive"? Surfing the Web can be name server-intensive. Sending electronic mail, especially to large mailing lists, can be name server-intensive. Programs that make lots of remote procedure calls to different hosts can be name server-intensive. Even running certain graphical user environments can tax your name server. X Windows-based user environments, for example, query the name server to check access lists (among other things).

The astute (and precocious) among you may be asking, "But how do I know when my name servers are overloaded? What do I look for?" An excellent question!

Memory utilization is probably the most important aspect of a name server's operation to monitor. named can get very large on a name server that is authoritative for many zones. If named's size, plus the size of the other processes you run, exceeds the size of your host's real memory, your host may swap furiously ("thrash") and not get anything done. Even if your host has more than enough memory to run all its processes, large name servers are slow to start and slow to spawn new named processes (e.g., to handle zone transfers). Another problem, peculiar to BIND 4: since a BIND 4 name server creates new named processes to handle zone transfers, it's quite possible to have more than one named process running at one time -- one answering queries and one or more servicing zone transfers. If your BIND 4 master name server already consumes 5 or 10 megabytes of memory, count on two or three times that amount being used occasionally.

Another criterion you can use to measure the load on your name server is the load the named process places on the host's CPU. Correctly configured name servers don't use much CPU time, so high CPU usage is often symptomatic of a configuration error. Programs such as top can help you characterize your name server's average CPU utilization.[54] Unfortunately, there are no absolute rules when it comes to acceptable CPU utilization. We offer a rough rule of thumb, though: 5% average CPU utilization is probably acceptable; 10% is a bit high, unless the host is dedicated to providing name service.

[54] top is a very handy program, written by Bill LeFebvre, that gives you a continuous report of which processes are sucking up the most CPU time on your host. The most recent version of top is available via anonymous FTP from ftp://eecs.nwu.edu as file /pub/top/top-3.4.tar.Z.

To get an idea of what normal figures are, here's what top might show for a relatively quiet name server:

last pid: 14299; load averages: 0.11, 0.12, 0.12       18:19:08
68 processes: 64 sleeping, 3 running, 1 stopped
Cpu states: 11.3% usr, 0.0% nice, 15.3% sys, 73.4% idle, 0.0% intr, 0.0% ker
Memory: Real: 8208K/13168K act/tot Virtual: 16432K/30736K act/tot Free: 4224K

  PID USERNAME PRI NICE   SIZE   RES STATE  TIME   WCPU    CPU COMMAND
   89 root       1    0  2968K 2652K sleep  5:01  0.00%  0.00% named

Okay, that's really quiet. Here's what top shows on a busy (though not overloaded) name server:

load averages: 0.30, 0.46, 0.44                  system: relay 16:12:20
39 processes: 38 sleeping, 1 waiting
Cpu states: 4.4% user, 0.0% nice, 5.4% system, 90.2% idle, 0.0% unk5, 0.0% unk6, 
0.0% unk7, 0.0% unk8
Memory: 31126K (28606K) real, 33090K (28812K) virtual, 54344K free Screen #1/ 3

   PID USERNAME PRI NICE  SIZE   RES   STATE   TIME  WCPU   CPU  COMMAND
 21910 root       1    0  2624K  2616K sleep 146:21  0.00% 1.42% /etc/named
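
As a rough illustration, the rule of thumb given earlier can be applied to the named lines in the two listings above. This is only a sketch: the column order is assumed to match these listings (other versions of top may differ), the function names are ours, and the "borderline" label is our own, since the rule of thumb doesn't say how to read values between 5% and 10%.

```python
def named_cpu_percent(top_line):
    """Return the CPU column (as a float) from one process line of top output."""
    fields = top_line.split()
    # Column order assumed from the listings above:
    # PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
    return float(fields[9].rstrip("%"))


def assess_cpu(percent, dedicated=False):
    """Apply the rough rule of thumb: about 5% is fine, about 10% is a bit high."""
    if percent <= 5.0:
        return "acceptable"
    if percent < 10.0:
        return "borderline"  # our label; the text gives no verdict in this range
    # 10% or more is a bit high unless the host does nothing but name service.
    return "acceptable for a dedicated host" if dedicated else "high"


quiet = "   89 root       1    0  2968K 2652K sleep  5:01  0.00%  0.00% named"
busy = " 21910 root       1    0  2624K  2616K sleep 146:21  0.00% 1.42% /etc/named"

for line in (quiet, busy):
    cpu = named_cpu_percent(line)
    print(cpu, assess_cpu(cpu))
```

Both sample servers land comfortably in the "acceptable" range.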

Another statistic to look at is the number of queries the name server receives per minute (or second, if you have a busy name server). Again, there are no absolutes here: a fast Pentium III running NetBSD can probably handle thousands of queries per second without breaking a sweat, while an older Unix host might have problems with more than a few queries a second.

To check the volume of queries your name server is receiving, it's easiest to look at the name server's internal statistics, which you can configure the server to write to syslog at regular intervals.[55] For example, you could configure your name server to dump statistics every hour (actually, that's the default for BIND 8 servers), and compare the number of queries received between hours:

[55] Some older BIND name servers need coercion to dump their statistics: the ABRT signal (IOT on older systems). BIND 4.9 name servers automatically dump stats every hour, but 4.9.4 through 4.9.7 name servers, once again, need to be coerced with ABRT.

options {
        statistics-interval 60;
};

BIND 9 name servers don't support the statistics-interval substatement, but you can use rndc to tell a BIND 9 name server to dump statistics on the hour, for example from a crontab entry:

0 * * * *  /usr/local/sbin/rndc stats

You should pay special attention to peak periods. Monday morning is often busy, because many people like to respond to mail they've received over the weekend first thing on Mondays.

You might also want to take a sample starting just after lunch, when people are returning to their desks and getting back to work -- all at about the same time. Of course, if your organization is spread across several time zones, you'll have to use your own good judgment to determine a busy time.

Here's a snippet from the syslog file on a BIND 8.2.3 name server:

Aug  1 11:00:49 terminator named[103]: NSTATS 965152849 959476930 A=8 NS=1
SOA=356966 PTR=2 TXT=32 IXFR=9 AXFR=204
Aug  1 11:00:49 terminator named[103]: XSTATS 965152849 959476930 RR=3243 RNXD=0
RFwdR=0 RDupR=0 RFail=20 RFErr=0 RErr=11 RAXFR=204 RLame=0 ROpts=0 SSysQ=3356
SAns=391191 SFwdQ=0 SDupQ=1236 SErr=0 RQ=458031 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101316
SFwdR=0 SFail=0 SFErr=0 SNaAns=34482 SNXD=0 RUQ=0 RURQ=0 RUXFR=10 RUUpd=34451
Aug  1 12:00:49 terminator named[103]: NSTATS 965156449 959476930 A=8 NS=1
SOA=357195 PTR=2 TXT=32 IXFR=9 AXFR=204
Aug  1 12:00:49 terminator named[103]: XSTATS 965156449 959476930 RR=3253 RNXD=0
RFwdR=0 RDupR=0 RFail=20 RFErr=0 RErr=11 RAXFR=204 RLame=0 ROpts=0 SSysQ=3360
SAns=391444 SFwdQ=0 SDupQ=1244 SErr=0 RQ=458332 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101388
SFwdR=0 SFail=0 SFErr=0 SNaAns=34506 SNXD=0 RUQ=0 RURQ=0 RUXFR=10 RUUpd=34475

The number of queries received is dumped in the RQ field of each XSTATS entry. To calculate the number of queries received in the hour, just subtract the first RQ value from the second one: 458332 - 458031 = 301.
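
If you'd rather not do the subtraction by hand, a few lines of script will do it. The following is a minimal sketch, not a polished tool: the function name is ours, and the sample text is abbreviated to the parts of the two XSTATS entries above that carry the RQ and SAns counters.

```python
import re


def counter_values(log_text, name):
    """Return every value of the counter `name` (e.g. "RQ") in log_text, in order."""
    # \b keeps "RQ" from matching inside a longer counter name such as "RURQ".
    pattern = r"\b" + re.escape(name) + r"=(\d+)"
    return [int(value) for value in re.findall(pattern, log_text)]


# Abbreviated from the syslog snippet above.
sample = """
Aug  1 11:00:49 terminator named[103]: XSTATS 965152849 959476930 RR=3243 RNXD=0
SAns=391191 SFwdQ=0 SDupQ=1236 SErr=0 RQ=458031 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101316
Aug  1 12:00:49 terminator named[103]: XSTATS 965156449 959476930 RR=3253 RNXD=0
SAns=391444 SFwdQ=0 SDupQ=1244 SErr=0 RQ=458332 RIQ=25 RFwdQ=0 RDupQ=0 RTCP=101388
"""

rq = counter_values(sample, "RQ")
# Difference between each dump and the one before it.
hourly = [later - earlier for earlier, later in zip(rq, rq[1:])]
print(hourly)  # [301]
```

The same function works for any other counter in the dump, such as SAns.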

Even if your host is fast enough to handle the number of queries it receives, you should make sure that the DNS traffic isn't placing an undue load on your network. On most LANs, DNS traffic is too small a proportion of the network's bandwidth to worry about. Over slow leased lines or dialup connections, though, DNS traffic could consume enough bandwidth to merit concern.

For a rough estimate of the volume of DNS traffic on your LAN, multiply the number of queries received (RQ) plus the number of answers sent (SAns) in an hour by 800 bits (100 bytes, a rough average size for a DNS message), and divide by 3600 (seconds per hour) to find the bandwidth utilized. This should give you a feeling for how much of your network's bandwidth is consumed by DNS traffic.[56]

[56] For a nice package that automates the analysis of BIND's statistics, look for Nigel Campbell's bindgraph in the DNS Resources Directory's tools page, http://www.dns.net/dnsrd/tools.html.
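
Here's the bandwidth estimate worked through for the hour covered by the syslog snippet shown earlier. Treat it as a sketch: the function and parameter names are ours, the 100-byte average is the rough figure suggested above, and the 56 kbit/s link speed is purely an assumed example of a slow line.

```python
def dns_bits_per_second(queries, answers, avg_message_bytes=100, seconds=3600):
    """Estimate average DNS bandwidth from message counts over an interval."""
    return (queries + answers) * avg_message_bytes * 8 / seconds


# Differences between the two hourly dumps in the syslog snippet.
rq_delta = 458332 - 458031    # queries received in the hour
sans_delta = 391444 - 391191  # answers sent in the hour

bps = dns_bits_per_second(rq_delta, sans_delta)
print(round(bps, 1))  # 123.1

# Share of an assumed 56 kbit/s link (hypothetical; substitute your own speed).
assumed_link_bps = 56000
print(f"{bps / assumed_link_bps:.2%}")
```

At about 123 bits per second, this particular name server's traffic is negligible even on a slow link.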

To give you an idea of what's normal, the last NSFNET traffic report (in April 1995) showed that DNS traffic constituted just over 5% of the total traffic volume (in bytes) on their backbone. The NSFNET's figures were based upon actual traffic sampling, not calculations like ours using the name server's statistics.[57] If you want to get a more accurate idea of the traffic your name server is receiving, you can always do your own traffic sampling with a LAN protocol analyzer.

[57] We're not sure how representative of the current state of the Internet these numbers are, but it's extremely difficult to wheedle equivalent numbers out of the commercial backbone providers that succeeded the NSFNET.

Once you've found that your name servers are overworked, what then? First, it's a good idea to make sure your name servers aren't being bombarded with queries by a misbehaving program. To do that, you'll need to find out where all the queries are coming from.

If you're running a BIND 4.9 or 8.1.2 name server, you can find out which resolvers and name servers are querying your name server just by dumping the statistics. These name servers keep statistics on a host-by-host basis, which is really useful in tracking down heavy users of your name server. BIND 8.2 or newer name servers don't keep these statistics by default; to induce them to keep host-by-host statistics, use the host-statistics substatement in your options statement, like this:[58]

[58] BIND 9 doesn't support the host-statistics substatement -- or keeping per-host statistics, for that matter -- as of 9.1.0.

options {
	host-statistics yes;
};

For example, take these statistics:

+++ Statistics Dump +++ (829373099) Fri Apr 12 23:24:59 1996
970779    time since boot (secs)
471621    time since reset (secs)
0    Unknown query types
185108    A queries
6    NS queries
69213    PTR queries
669    MX queries
2361    ANY queries
++ Name Server Statistics ++
(Legend)
    RQ      RR      RIQ      RNXD      RFwdQ
    RFwdR   RDupQ   RDupR    RFail     RFErr
    RErr    RTCP    RAXFR    RLame     ROpts
    SSysQ   SAns    SFwdQ    SFwdR     SDupQ
    SFail   SFErr   SErr     RNotNsQ   SNaAns
    SNXD
(Global)
    257357 20718 0 8509 19677  19939 1494 21 0 0  0 7 0 1 0
    824 236196 19677 19939 7643  33 0 0 256064 49269  155030
[15.17.232.4]
    8736 0 0 0 717  24 0 0 0 0  0 0 0 0 0  0 8019 0 717 0
    0 0 0 8736 2141  5722
[15.17.232.5]
    115 0 0 0 8  0 21 0 0 0  0 0 0 0 0  0 86 0 1 0  0 0 0 115 0  7
[15.17.232.8]
    66215 0 0 0 6910  148 633 0 0 0  0 5 0 0 0  0 58671 0 6695 0
    15 0 0 66215 33697  6541
[15.17.232.16]
    31848 0 0 0 3593  209 74 0 0 0  0 0 0 0 0  0 28185 0 3563 0
    0 0 0 31848 8695  15359
[15.17.232.20]
    272 0 0 0 0  0 0 0 0 0  0 0 0 0 0  0 272 0 0 0  0 0 0 272 7  0
[15.17.232.21]
    316 0 0 0 52  14 3 0 0 0  0 0 0 0 0  0 261 0 51 0  0 0 0 316 30  30
[15.17.232.24]
    853 0 0 0 65  1 3 0 0 0  0 2 0 0 0  0 783 0 64 0  0 0 0 853 125  337
[15.17.232.33]
    624 0 0 0 47  1 0 0 0 0  0 0 0 0 0  0 577 0 47 0  0 0 0 624 2  217
[15.17.232.94]
    127640 0 0 0 1751  14 449 0 0 0  0 0 0 0 0  0 125440 0 1602 0
    0 0 0 127640 106  124661
[15.17.232.95]
    846 0 0 0 38  1 0 0 0 0  0 0 0 0 0  0 809 0 37 0  0 0 0 846 79  81
-- Name Server Statistics --
--- Statistics Dump --- (829373099) Fri Apr 12 23:24:59 1996

After the Global entry, each host is broken out by IP address in brackets. Looking at the legend, you can see that the first field in each record is RQ, or queries received. That gives us a good reason to look at hosts 15.17.232.8, 15.17.232.16, and 15.17.232.94, which appear to be responsible for about 88% of our queries.
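
With more than a handful of hosts, picking out the heavy hitters by eye gets tedious. The sketch below shows one way to pull the RQ field for each host out of a dump and rank the results. It assumes the layout shown above (a bracketed address followed by one or more lines of counters, with RQ first); the function name is ours, and the sample is cut down to four of the hosts in the dump.

```python
def rq_by_host(dump_text):
    """Map each bracketed IP address in a statistics dump to its RQ counter."""
    counts = {}
    host = None
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            host = line[1:-1]
        elif host is not None and host not in counts and line[:1].isdigit():
            # The first counter line after the address begins with RQ.
            counts[host] = int(line.split()[0])
    return counts


# Four host records copied from the dump above.
dump = """\
[15.17.232.5]
    115 0 0 0 8  0 21 0 0 0  0 0 0 0 0  0 86 0 1 0  0 0 0 115 0  7
[15.17.232.8]
    66215 0 0 0 6910  148 633 0 0 0  0 5 0 0 0  0 58671 0 6695 0
    15 0 0 66215 33697  6541
[15.17.232.16]
    31848 0 0 0 3593  209 74 0 0 0  0 0 0 0 0  0 28185 0 3563 0
    0 0 0 31848 8695  15359
[15.17.232.94]
    127640 0 0 0 1751  14 449 0 0 0  0 0 0 0 0  0 125440 0 1602 0
    0 0 0 127640 106  124661
"""
GLOBAL_RQ = 257357  # first field of the (Global) entry

counts = rq_by_host(dump)
top3 = sorted(counts.items(), key=lambda item: item[1], reverse=True)[:3]
share = sum(rq for _, rq in top3) / GLOBAL_RQ

for host, rq in top3:
    print(host, rq)
print(f"{share:.1%}")  # 87.7%
```

The three busiest hosts account for 87.7% of the global total, which is where the "about 88%" figure comes from.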

If you're running an older name server, the only way to find out which resolvers and name servers are sending all those darned queries is to turn on name server debugging. (We'll cover this in depth in Chapter 13, "Reading BIND Debugging Output".) All you're really interested in is the source IP addresses of the queries your name server is receiving. When poring over the debugging output, look for hosts sending repeated queries, especially for the same or similar information. That may indicate a misconfigured or buggy program running on the host, or a foreign name server pelting your name server with queries.

If all the queries appear legitimate, add a new name server. Don't put the name server just anywhere, though; use the information from the debugging output to help you decide where best to run one. In cases where DNS traffic is gobbling up your Ethernet, it won't help to choose a host at random and create a name server there. You need to consider which hosts are sending all the queries, then figure out how to best provide them name service. Here are some hints to help you decide:

Look for queries from resolvers on hosts that share the same file server. You could run a name server on the file server.
Look for queries from resolvers on large, multiuser hosts. You could run a name server there.
Look for queries from resolvers on another subnet. Those resolvers should be configured to query a name server on their local subnet. If there isn't one on that subnet, create one.
Look for queries from resolvers on the same bridged segment (assuming you use bridging). If you run a name server on the bridged segment, the traffic won't need to be bridged to the rest of the network.
Look for queries from hosts connected to each other via another, lightly loaded network. You could run a name server on the other network.
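
To apply the subnet hint, it can help to total the queries by the network they came from. This is only a sketch under stated assumptions: the per-host counts are made up, the addresses are documentation addresses rather than real hosts, and the /24 prefix length is a guess you'd replace with your own subnet mask.

```python
import ipaddress
from collections import Counter


def queries_by_subnet(rq_by_host, prefix_len=24):
    """Total per-host query counts by the subnet each host belongs to."""
    totals = Counter()
    for host, rq in rq_by_host.items():
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{host}/{prefix_len}", strict=False)
        totals[str(network)] += rq
    return totals


# Hypothetical query counts, keyed by the querier's IP address.
example_counts = {
    "192.0.2.10": 500,
    "192.0.2.11": 300,
    "198.51.100.7": 9000,
}

totals = queries_by_subnet(example_counts)
for network, rq in totals.most_common():
    print(network, rq)
```

A subnet near the top of this list that has no name server of its own is a natural candidate for one.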
