Chapter 5. TCP/IP Name Services

Contents:
Host Files
The majority of the conversations between computers these days take place using a protocol called the Transmission Control Protocol running over a lower layer called the Internet Protocol.[1] These two protocols are commonly lumped together into the acronym TCP/IP. Every machine that participates on a TCP/IP network must be assigned at least one unique numeric identifier, called an IP address. IP addresses are usually written in dotted-decimal form, four numbers separated by periods, e.g., 192.168.1.9.
While machines are content to call each other by strings of dot-separated numbers, most people are less enamored of this idea. TCP/IP would have fallen flat on its face as a protocol if users had to remember a unique 12-digit sequence for every machine they wanted to contact. Mechanisms had to be invented to manage and distribute mappings between IP addresses and human-friendly names. This chapter describes the evolution of the network name services that allow us to access data at www.oog.org instead of at 192.168.1.9, and what takes place behind the scenes. Along the way we combine a dash of history with a healthy serving of practical advice on how Perl can help to manage this crucial part of any networking infrastructure.

5.1. Host Files

The first approach used to solve the problem of mapping IP addresses to names was the most obvious and simple one: a standard file was created to hold a table of IP addresses and their corresponding computer names. This file can be found as /etc/hosts on Unix systems, Macintosh HD:System Folder:Preferences:hosts on Macs, and %SystemRoot%\System32\Drivers\Etc\hosts on NT/2000 machines. On NT/2000 there is also an lmhosts file that serves a slightly different purpose, which we'll talk about later. Here's an example Unix-style host file:

    127.0.0.1       localhost
    192.168.1.1     everest.oog.org    everest
    192.168.1.2     rivendell.oog.org  rivendell

The limitations of this approach become clear very quickly. If oog.org's network manager has two machines on a TCP/IP network that communicate with each other, and she wants to add a third which will be addressed by name, she's got to edit the correct file on all of her machines. If oog.org buys yet another machine, there are now four separate host files to be maintained (one on each machine).

As untenable as this may seem, this is what actually happened during the early days of the Internet/ARPAnet. As new sites were connected, every site on the net that wished to talk with the new site needed to update its host files. The central host repository, known as the Network Information Center (NIC) (or more precisely the SRI-NIC, since it was housed at SRI at the time), updated and published a host file for the entire network called HOSTS.TXT. System administrators would retrieve this file via anonymous FTP from SRI-NIC's NETINFO directory on a regular basis.

Host files are still in use today, despite their limitations and the replacements we'll be talking about later in this chapter. There are some situations where host files are even mandatory. For example, under SunOS, a machine consults its /etc/hosts file to determine its own IP address. Host files also solve the "chicken and egg" problem encountered while a machine boots. If the network name servers that machine will be using are specified by name, there must be some way to determine their IP addresses. But if the network name service isn't operational yet, there's no way (unless it broadcasts for help) to receive this information. The usual solution is to place a stub file (with just a few hosts) in place for booting purposes.

On a small network, having an up-to-date host file that includes all of the hosts on that network is useful. It doesn't even have to reside on each machine in that network to be helpful (since the other mechanisms we'll describe later do a much better job of distributing this information). Just having one around that can be consulted is handy for quick manual lookups and address allocation purposes.
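Perl can perform this kind of quick lookup for you as well. Here's a minimal sketch (the name being looked up is just an example) that uses Perl's built-in resolver calls, which on most systems consult the host file along with whatever other name services are configured:

    use Socket;    # for inet_ntoa()

    # ask the system resolver for an address; in scalar context
    # gethostbyname() returns the packed address, or undef on failure
    my $packed_address = gethostbyname('localhost')
        or die "Can't look up localhost\n";

    print inet_ntoa($packed_address), "\n";   # prints 127.0.0.1 on a typical machine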
Since these files are still a part of everyday administration, let's look at better ways to manage them. Perl and host files are a natural match, given Perl's predilection for text file processing. Given their affinity for each other, we're going to use the simple host file as a springboard for a number of different explorations.

Let's start with the parsing of host files. Parsing a host file can be as simple as this:

    open(HOSTS, "/etc/hosts") or die "Unable to open host file:$!\n";
    while (defined ($_ = <HOSTS>)) {
        next if /^#/;     # skip comment lines
        next if /^$/;     # skip empty lines
        s/\s*#.*$//;      # delete in-line comments and preceding whitespace
        ($ip, @names) = split;
        die "The IP address $ip already seen!\n" if (exists $addrs{$ip});
        $addrs{$ip} = [@names];
        for (@names){
            die "The host name $_ already seen!\n" if (exists $names{$_});
            $names{$_} = $ip;
        }
    }
    close(HOSTS);

The previous code walks through an /etc/hosts file (skipping blank lines and comments), creating two data structures for later use. The first data structure is a hash of lists of hostnames, keyed by IP address. For the host file above, the data structure created would look like this:

    $addrs{'127.0.0.1'}   = ['localhost'];
    $addrs{'192.168.1.2'} = ['rivendell.oog.org','rivendell'];
    $addrs{'192.168.1.1'} = ['everest.oog.org','everest'];

The second is a hash of IP addresses, keyed by hostname. For the same file, the %names hash would look like this:

    $names{'localhost'} = '127.0.0.1';
    $names{'everest'} = '192.168.1.1';
    $names{'everest.oog.org'} = '192.168.1.1';
    $names{'rivendell'} = '192.168.1.2';
    $names{'rivendell.oog.org'} = '192.168.1.2';

Note that in the simple process of parsing this file, we've also added some additional functionality. Our code checks for duplicate host names and IP addresses (both bad news on a TCP/IP network). When dealing with network-related data, use every opportunity possible to check for errors and bad information. It is always better to catch problems early in the game than to be bitten by them once the data has been propagated to your entire network. Because it is so important, I'll return to this topic later in the chapter.

5.1.1. Generating Host Files

Now we turn to the more interesting topic of generating host files. Let's assume we have the following host database file for the hosts on our network:

    name: shimmer
    address: 192.168.1.11
    aliases: shim shimmy shimmydoodles
    owner: David Davis
    department: software
    building: main
    room: 909
    manufacturer: Sun
    model: Ultra60
    -=-
    name: bendir
    address: 192.168.1.3
    aliases: ben bendoodles
    owner: Cindy Coltrane
    department: IT
    building: west
    room: 143
    manufacturer: Apple
    model: 7500/100
    -=-
    name: sulawesi
    address: 192.168.1.12
    aliases: sula su-lee
    owner: Ellen Monk
    department: design
    building: main
    room: 1116
    manufacturer: Apple
    model: 7500/100
    -=-
    name: sander
    address: 192.168.1.55
    aliases: sandy micky mickydoo
    owner: Alex Rollins
    department: IT
    building: main
    room: 1101
    manufacturer: Intergraph
    model: TD-325
    -=-

The format is simple: fieldname: value, with -=- used as a separator between records. You might find you need other fields than those listed above, or that you have too many records to make it practical to keep them in a single flat file. Though we are using a single flat file, the concepts we'll show in this chapter are not backend-specific.
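Adding a machine to a database in this format is simply a matter of appending one more record. Here's a small sketch; the host details are invented for illustration:

    # append one more record (and its separator) to the flat-file host database
    open(DATA, ">> ./database") or die "Unable to append to database:$!\n";
    print DATA "name: newhost\n",
               "address: 192.168.1.77\n",
               "aliases: newbie\n",
               "owner: Some Person\n",
               "department: IT\n",
               "building: main\n",
               "room: 100\n",
               "manufacturer: Sun\n",
               "model: Ultra5\n",
               "-=-\n";
    close(DATA);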
Here's some code that will parse a file like this to generate a host file:

    $datafile  = "./database";
    $recordsep = "-=-\n";

    open(DATA,$datafile) or die "Unable to open datafile:$!\n";
    $/ = $recordsep;    # prepare to read in database file one record at a time

    print "#\n\# host file - GENERATED BY $0\n# DO NOT EDIT BY HAND!\n#\n";

    while (<DATA>) {
        chomp;          # remove the record separator
        # split into key1,value1,...bingo, hash of record
        %record = split /:\s*|\n/m;
        print "$record{address}\t$record{name} $record{aliases}\n";
    }
    close(DATA);

Here's the output:

    #
    # host file - GENERATED BY createhosts
    # DO NOT EDIT BY HAND!
    #
    192.168.1.11    shimmer shim shimmy shimmydoodles
    192.168.1.3     bendir ben bendoodles
    192.168.1.12    sulawesi sula su-lee
    192.168.1.55    sander sandy micky mickydoo

Let's look at a few of the more interesting Perl techniques in this small code sample. The first unusual thing we do is set $/. From that point on, Perl treats chunks of text that end in -=-\n as a single record. This means the while statement will read in an entire record at a time and assign it to $_.

The second interesting tidbit is the split-assign technique. Our goal is to get each record into a hash with the field name as the key and the field contents as the value. You'll see why we go to this trouble later as we develop this example further. The first step is to break $_ into component parts using split( ). The array we get back from split( ) for the first record in our database is shown in Table 5-1.

Table 5-1. The Array Returned by split( )

    Element        Value
    $record[0]     name
    $record[1]     shimmer
    $record[2]     address
    $record[3]     192.168.1.11
    $record[4]     aliases
    $record[5]     shim shimmy shimmydoodles
    $record[6]     owner
    $record[7]     David Davis
    $record[8]     department
    $record[9]     software
    $record[10]    building
    $record[11]    main
    $record[12]    room
    $record[13]    909
    $record[14]    manufacturer
    $record[15]    Sun
    $record[16]    model
    $record[17]    Ultra60
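If you'd like to see this flattening-and-assignment step in isolation, here's a tiny self-contained sketch (the record text is hardcoded purely for illustration):

    # one record, as it appears in $_ after chomp() has removed "-=-\n"
    my $rec = "name: shimmer\naddress: 192.168.1.11\naliases: shim shimmy shimmydoodles";

    # splitting on either "colon plus optional whitespace" or a newline
    # leaves a flat key, value, key, value, ... list ...
    my %record = split /:\s*|\n/, $rec;

    # ... which the hash assignment pairs back up for us
    print "$record{name} is at $record{address}\n";   # "shimmer is at 192.168.1.11"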
Now take a good look at the contents of that list. Starting at $record[0], we have a key-value pair list (i.e., key='name', value='shimmer', key='address', value='192.168.1.11', and so on) which we can just assign to a hash to populate it. Once this hash is created, we can print the parts we need.

5.1.2. Error Checking the Host File Generation Process

Printing the parts we need is just the beginning of what we can do. One very large benefit of using a separate database that gets converted into another form is the ability to insert error checking into the conversion process. As we mentioned before, this can prevent simple typos from becoming a problem before they get a chance to propagate or be put into production use. Here's the previous code with some simple additions to check for typos:

    $datafile  = "./database";
    $recordsep = "-=-\n";

    open(DATA,$datafile) or die "Unable to open datafile:$!\n";
    $/ = $recordsep;    # prepare to read in database file one record at a time

    print "#\n\# host file - GENERATED BY $0\n# DO NOT EDIT BY HAND!\n#\n";

    while (<DATA>) {
        chomp;          # remove the record separator
        # split into key1,value1,...bingo, hash of record
        %record = split /:\s*|\n/m;

        # check for bad hostnames
        if ($record{name} =~ /[^-.a-zA-Z0-9]/) {
            warn "!!!! $record{name} has illegal host name characters, skipping...\n";
            next;
        }

        # check for bad aliases
        if ($record{aliases} =~ /[^-.a-zA-Z0-9\s]/) {
            warn "!!!! $record{name} has illegal alias name characters, skipping...\n";
            next;
        }

        # check for missing address
        if (!$record{address}) {
            warn "!!!! $record{name} does not have an IP address, skipping...\n";
            next;
        }

        # check for duplicate address
        if (defined $addrs{$record{address}}) {
            warn "!!!! Duplicate IP addr: $record{name} & $addrs{$record{address}}, skipping...\n";
            next;
        }
        else {
            $addrs{$record{address}} = $record{name};
        }

        print "$record{address}\t$record{name} $record{aliases}\n";
    }
    close(DATA);

5.1.3. Improving the Host File Output

Let's borrow from Chapter 9, "Log Files", and add some analysis to the conversion process. We can automatically add useful headers, comments, and separators to the data. Here's an example of the output using the exact same database:

    #
    # host file - GENERATED BY createhosts3
    # DO NOT EDIT BY HAND!
    #
    # Converted by David N. Blank-Edelman (dnb) on Sun Jun 7 00:43:24 1998
    #
    # number of hosts in the design department: 1.
    # number of hosts in the software department: 1.
    # number of hosts in the IT department: 2.
    # total number of hosts: 4
    #

    # Owned by Cindy Coltrane (IT): west/143
    192.168.1.3     bendir ben bendoodles

    # Owned by Alex Rollins (IT): main/1101
    192.168.1.55    sander sandy micky mickydoo

    # Owned by Ellen Monk (design): main/1116
    192.168.1.12    sulawesi sula su-lee

    # Owned by David Davis (software): main/909
    192.168.1.11    shimmer shim shimmy shimmydoodles

Here's the code that produced that output, followed by some commentary:

    $datafile  = "./database";
    $recordsep = "-=-\n";

    # get username on either WinNT/2000 or Unix
    $user = ($^O eq "MSWin32") ?
                $ENV{USERNAME} :
                (getpwuid($<))[6]." (".(getpwuid($<))[0].")";

    open(DATA,$datafile) or die "Unable to open datafile:$!\n";
    $/ = $recordsep;    # read in database file one record at a time

    while (<DATA>) {
        chomp;          # remove the record separator
        # split into key1,value1
        @record = split /:\s*|\n/m;

        $record = {};            # create a reference to an empty hash
        %{$record} = @record;    # populate that hash with @record

        # check for bad hostname
        if ($record->{name} =~ /[^-.a-zA-Z0-9]/) {
            warn "!!!! " . $record->{name} .
" has illegal host name characters, skipping...\n"; next; } # check for bad aliases if ($record->{aliases} =~ /[^-.a-zA-Z0-9\s]/) { warn "!!!! ".$record->{name} . " has illegal alias name characters, skipping...\n"; next; } # check for missing address if (!$record->{address}) { warn "!!!! ".$record->{name} . " does not have an IP address, skipping...\n"; next; } # check for duplicate address if (defined $addrs{$record->{address}}) { warn "!!!! Duplicate IP addr:".$record->{name}. " & ".$addrs{$record->{address}}.", skipping...\n"; next; } else { $addrs{$record->{address}} = $record->{name}; } $entries{$record->{name}} = $record; # add this to a hash of hashes } close(DATA); # print a nice header print "#\n\# host file - GENERATED BY $0\n# DO NOT EDIT BY HAND!\n#\n"; print "# Converted by $user on ".scalar(localtime)."\n#\n"; # count the number of entries in each department and then report on it foreach my $entry (keys %entries){ $depts{$entries{$entry}->{department}}++; } foreach my $dept (keys %depts) { print "# number of hosts in the $dept department: $depts{$dept}.\n"; } print "# total number of hosts: ".scalar(keys %entries)."\n#\n\n"; # iterate through the hosts, printing a nice comment and the entry itself foreach my $entry (keys %entries) { print "# Owned by ",$entries{$entry}->{owner}," (", $entries{$entry}->{department},"): ", $entries{$entry}->{building},"/", $entries{$entry}->{room},"\n"; print $entries{$entry}->{address},"\t", $entries{$entry}->{name}," ", $entries{$entry}->{aliases},"\n\n"; } The most significant difference between this code example and the previous one is the data representation. Because there was no need in the previous example to retain the information from a record after it had been printed, we could use the single hash %record. But for this code, we chose to read the file into a slightly more complex data structure (a hash of hashes) so we could do some simple analysis of the data before printing it. We could have kept a separate hash table for each field (similar to our needspace example in Chapter 2, "Filesystems"), but the beauty of this approach is its maintainability. If we decide later on to add a serial_number field to the database, we do not need to change our program's parsing code; it will just magically appear as $record->{serial_number}. The downside is that Perl's syntax probably makes our code look more complex than it is. Here's an easy way to look at it: we're parsing the file in precisely the same way we did in the last example. The difference is this time we are storing each record in a newly-created anonymous hash. Anonymous hashes are just like normal hash variables except they are accessed through a reference, instead of a name. To create our larger data structure (a hash of hashes), we link this new anonymous hash back into the main hash table %entries. We created a key with an associated value that is the reference to the anonymous hash we've just populated. Once we are done, %entries has a key for each machine's name and a value that is a reference to a hash table containing all of the fields associated with that machine name (IP address, room, etc.). Perhaps you'd prefer to see the output sorted by IP address? 
No problem; just include a custom sort routine by changing:

    foreach my $entry (keys %entries) {

to:

    foreach my $entry (sort byaddress keys %entries) {

and adding:

    sub byaddress {
        @a = split(/\./,$entries{$a}->{address});
        @b = split(/\./,$entries{$b}->{address});
        ($a[0] <=> $b[0]) ||
        ($a[1] <=> $b[1]) ||
        ($a[2] <=> $b[2]) ||
        ($a[3] <=> $b[3]);
    }

Here's the relevant portion of the output, now nicely sorted:

    # Owned by Cindy Coltrane (IT): west/143
    192.168.1.3     bendir ben bendoodles

    # Owned by David Davis (software): main/909
    192.168.1.11    shimmer shim shimmy shimmydoodles

    # Owned by Ellen Monk (design): main/1116
    192.168.1.12    sulawesi sula su-lee

    # Owned by Alex Rollins (IT): main/1101
    192.168.1.55    sander sandy micky mickydoo

Make the output look good to you. Let Perl support your professional and aesthetic endeavors.

5.1.4. Incorporating a Source Code Control System

In a moment we're going to move on to the next approach to the IP address-to-name mapping problem. Before we do, we'll want to add another twist to our host file creation process, because this single file suddenly takes on network-wide importance. A mistake in this file will affect an entire network of machines. To give us a safety net, we'll want a way to back out of bad changes, essentially going back in time to a prior configuration state. The most elegant way to build a time machine like this is to add a source control system to the process. Source control systems are typically used by developers to keep an archive of previous revisions of their files, to record who changed what and when, and to back out of changes that turn out to be mistakes.
This functionality is extremely useful to a system administrator. The error-checking code we added to the conversion process earlier, in Section 5.1.2, "Error Checking the Host File Generation Process", can help with certain kinds of typographical and syntax errors, but it does not offer any protection against semantic errors (e.g., deleting an important hostname, assigning the wrong IP address to a host, misspelling a hostname). You could add semantic error checks into the conversion process, but you probably won't catch all of the possible errors. As we've quoted before, nothing is foolproof, since fools are so ingenious. You might think it would be better to apply source control system functionality to the initial database editing process, but there are two good reasons why it is also important to apply it to the resultant output:
My source control system of choice is the Revision Control System (RCS). RCS has some Perl- and system administration-friendly features: it is freely available, its command set is small and easy to learn, and, as we'll see in a moment, there is a Perl module that lets you drive it from your own programs.
If you've never dealt with RCS before, please take a moment to read Appendix A, "The Five-Minute RCS Tutorial". The rest of this section assumes a cursory knowledge of the RCS command set. Craig Freter has written an object-oriented module called Rcs which makes using RCS from Perl easy. The steps are: load the module, tell it where your RCS binaries are located, create a new Rcs object and configure it with the name of the file you want to manage, and then call object methods named after their RCS command-line counterparts (co, ci, and so on).
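Here's a minimal sketch of those steps in isolation. The binary directory and filename are only examples, and the file needs to have been checked into RCS at least once already:

    use Rcs;

    # tell the module where the RCS binaries live
    Rcs->bindir('/usr/local/bin');

    # create an object representing a single RCS-controlled file
    my $rcsobj = Rcs->new;
    $rcsobj->file('hosts');

    $rcsobj->co('-l');      # check out a locked working copy

    # ... modify the working copy of the file here ...

    $rcsobj->ci('-u', '-m' . 'updated by my script');   # check in, keep an unlocked copy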
Let's add this to our host file generation code so you can see how the module works. Besides the Rcs module code, we've also changed things so the output is sent to a specific file and not STDOUT as in our previous versions. Only the code that has changed is shown. Refer to the previous example for the omitted lines represented by "...":

    $outputfile = "hosts.$$";   # temporary output file
    $target     = "hosts";      # where we want the converted data stored

    ...

    open(OUTPUT,"> $outputfile") or die "Unable to write to $outputfile:$!\n";

    print OUTPUT "#\n\# host file - GENERATED BY $0\n# DO NOT EDIT BY HAND!\n#\n";
    print OUTPUT "# Converted by $user on ".scalar(localtime)."\n#\n";

    ...

    foreach my $dept (keys %depts) {
        print OUTPUT "# number of hosts in the $dept department: $depts{$dept}.\n";
    }
    print OUTPUT "# total number of hosts: ".scalar(keys %entries)."\n#\n\n";

    # iterate through the hosts, printing a nice comment and the entry
    foreach my $entry (sort byaddress keys %entries) {
        print OUTPUT "# Owned by ",$entries{$entry}->{owner}," (",
                     $entries{$entry}->{department},"): ",
                     $entries{$entry}->{building},"/",
                     $entries{$entry}->{room},"\n";
        print OUTPUT $entries{$entry}->{address},"\t",
                     $entries{$entry}->{name}," ",
                     $entries{$entry}->{aliases},"\n\n";
    }
    close(OUTPUT);

    use Rcs;

    # where our RCS binaries are stored
    Rcs->bindir('/usr/local/bin');

    # create a new RCS object
    my $rcsobj = Rcs->new;

    # configure it with the name of our target file
    $rcsobj->file($target);

    # check it out of RCS (must be checked in already)
    $rcsobj->co('-l');

    # rename our newly created file into place
    rename($outputfile,$target) or
        die "Unable to rename $outputfile to $target:$!\n";

    # check it in
    $rcsobj->ci("-u","-m"."Converted by $user on ".scalar(localtime));

This code assumes the target file has been checked in at least once already. To see the effect of this code addition, we can look at three entries excerpted from the output of rlog hosts:

    revision 1.5
    date: 1998/05/19 23:34:16;  author: dnb;  state: Exp;  lines: +1 -1
    Converted by David N. Blank-Edelman (dnb) on Tue May 19 19:34:16 1998
    ----------------------------
    revision 1.4
    date: 1998/05/19 23:34:05;  author: eviltwin;  state: Exp;  lines: +1 -1
    Converted by Divad Knalb-Namlede (eviltwin) on Tue May 19 19:34:05 1998
    ----------------------------
    revision 1.3
    date: 1998/05/19 23:33:35;  author: dnb;  state: Exp;  lines: +20 -0
    Converted by David N. Blank-Edelman (dnb) on Tue May 19 19:33:16 1998

The previous example doesn't show much of a difference between file versions (see the lines: part of each entry), but you can see that we are tracking the changes every time the file gets created. If we needed to, we could use the rcsdiff command to see exactly what changed. Under dire circumstances, we would be able to revert to a previous version if one of these changes had wreaked unexpected havoc on the network.
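If a change ever did cause that sort of havoc, the recovery itself can be scripted too. Here's a small sketch that shells out to the standard RCS command-line tools; the revision numbers are purely illustrative, and it assumes the RCS binaries are on your path:

    # show exactly what changed between two stored revisions of the hosts file
    system('rcsdiff', '-r1.4', '-r1.5', 'hosts');

    # force the working copy back to an earlier, known-good revision
    system('co', '-f', '-r1.4', 'hosts') == 0
        or die "co failed: $?\n";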