15.1 Understanding NFS
Using NFS, clients can
mount partitions of a server as if they were physically connected to
the client. In addition to allowing remote access to files over the
network, NFS allows many (relatively) low-cost computer systems to
share the same high-capacity disk drive at the same time. NFS clients
and servers have been written for many different operating systems.
NFS is nearly transparent. In practice, a workstation user simply
logs into the workstation and begins working, accessing files as if
they were stored locally. In many environments, workstations are set
up to mount the server's disks automatically at boot time. NFS also
provides an automounting facility that mounts a remote disk
automatically when files stored on it are first referenced.
There are several basic security problems
with NFS:
- NFS is built on top of Sun's RPC (Remote Procedure Call), and in
most cases uses RPC for user authentication. Unless a secure form of
RPC is used, NFS can be easily spoofed.
- Even when Secure RPC is used, information sent by NFS over the
network is not encrypted, and is thus subject to monitoring and
eavesdropping. The data can be intercepted and replaced (thereby
corrupting or Trojaning files being imported via NFS).
- NFS uses the standard Unix filesystem for access control, opening the
networked filesystem to many of the same problems as a local
filesystem.
One of the key design features behind NFS is the concept of
server
statelessness. Unlike other systems, there is no
"state" kept on a server to
indicate that a client is performing a remote file operation. Thus,
if the client crashes and is rebooted, there is no state in the
server that needs to be recovered. Likewise, if the server
crashes and is rebooted, the client can continue operating on the
remote file as if nothing really happened—there is no
server-side state to recreate. We'll discuss this
concept further in later sections.
15.1.1 NFS History
NFS was developed inside Sun Microsystems
in the early 1980s. Since that time, NFS has undergone three major
revisions:
- NFS Version 1
-
NFS Version 1 was Sun's prototype network
filesystem. This version was never released to the outside world.
- NFS Version 2
-
NFS Version 2 was first distributed with Sun's SunOS
2 operating system in 1985. Version 2 was widely licensed to numerous
Unix workstation vendors. A freely distributable, compatible version
was developed in the late 1980s at the University of California at
Berkeley.
During its 10-year life, many subtle, undocumented changes were made
to the NFS Version 2 specification. Some vendors allowed NFS version
2 to read or write more than 4 KB at a time; others increased the
number of groups provided as part of the RPC authentication from 8 to
16. Although these minor changes created occasional incompatibilities
between different NFS implementations, NFS Version 2 provided a
remarkable degree of compatibility between systems made by different
vendors.
- NFS Version 3
-
The NFS Version 3 specification was developed during a series of
meetings in Boston in July 1992. Working code for NFS Version 3 was
introduced by some
vendors in 1995, and became widely available. Version 3 incorporated
many performance improvements over Version 2, but did not
significantly change the way that NFS works or the security model
used by the network filesystem.
- NFS Version 4
-
NFS Version 4 is described in RFC 3010, published in December of 2000
as a draft standard. Version 4 will be a departure from previous
versions of NFS by being stateful, and by including the locking and
mounting operations as part of the basic protocol. NFSv4 is also
being designed with stronger security considerations. However,
because development of version 4 is ongoing as this book goes to
press, we provide only a brief discussion.
NFS is based on two similar but distinct protocols: MOUNT and NFS.
Both make use of a data object known as a file
handle. There is also a distributed protocol for file
locking, which is not technically part of NFS, and which does not
have any obvious security ramifications, so we won't describe
the file-locking protocol here.
15.1.2 File Handles
Each object on the NFS-mounted
filesystem is referenced by a unique object called a
file handle. A file handle is viewed by the client as
being opaque—the client cannot interpret
the contents. However, to the server, the contents have considerable
meaning. The file handles uniquely identify every file and directory
on the server computer.
The Unix NFS server stores three pieces of information inside each
file handle.
- Filesystem identifier
-
Refers to the partition containing the file (file identifiers such as
inode numbers are usually unique only within a partition).
- File identifier
-
Can be something as simple as an inode number, used to refer to a
particular item on a partition.
- Generation count
-
A number that is incremented each time a
file is unlinked and recreated. The generation count ensures that
when a client references a file on the server, that file is, in fact,
the same file that the server thinks it is. Without a generation
count, two clients accessing the same file on the same server could
produce erroneous results if one client deleted the file and created
a new file with the same inode number. The generation count prevents
such situations from occurring: when the file is recreated, the
generation number is incremented, and the second client gets an error
message when it attempts to access the older, now nonexistent, file.
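These three fields can be pictured as a small record packed into an opaque, fixed-size handle. The following Python sketch illustrates the idea; the field layout, sizes, and function names here are our own assumptions for illustration, not Sun's actual encoding:

```python
import struct

# Illustrative layout: 4-byte filesystem id, 4-byte inode number,
# 4-byte generation count, padded to the fixed 32-byte NFSv2 handle
# size. The real encoding is server-private; this is only a sketch.
HANDLE_FMT = "!III20x"          # network byte order, 20 pad bytes

def make_handle(fsid, inode, generation):
    """Build an opaque 32-byte file handle from server-side values."""
    return struct.pack(HANDLE_FMT, fsid, inode, generation)

def parse_handle(handle):
    """Server-side view: recover (fsid, inode, generation)."""
    return struct.unpack(HANDLE_FMT, handle)

fh = make_handle(fsid=3, inode=1947, generation=2)
assert len(fh) == 32            # NFSv2 handles are a fixed 32 bytes
assert parse_handle(fh) == (3, 1947, 2)
```

The client simply stores and returns these 32 bytes; only the server ever interprets them.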
To better understand the role of the
generation count, imagine a situation in which you are writing a
steamy love letter to a colleague with whom you are having a
clandestine affair. You start by opening a new editor file on your
workstation. Unbeknownst to you, your editor creates the file in the
/tmp directory, which happens to be on the NFS
server. The server allocates an inode from the free list on that
partition, constructs a file handle for the new file, and sends the
file handle to your workstation (the client). You begin editing the
file. "My darling chickadee, I remember last
Thursday in your office . . . " you start to write,
only to be interrupted by a long phone call.
You aren't aware of it, but as you are talking on
the phone, there is a power flicker in the main computer room, and
the server crashes and reboots. As part of the reboot, the temporary
file for your mail is deleted along with everything else in the
/tmp directory, and its inode is added back to
the free list on the server. While you are still talking on the
phone, your manager starts to compose a letter to the president of
the company, recommending a raise and promotion for you. He also
opens a file in the /tmp directory, and his
diskless workstation is allocated a file handle for the
same inode that you were using (it is free now,
after all)!
You finally finish your call and return to your letter. Of course,
you notice nothing out of the ordinary because of the stateless
nature of NFS. You put the finishing touches on your letter
("... and I can't wait until this
weekend; my wife suspects nothing!") and save it.
Your manager finishes his letter at the same moment:
"... as a reward for his hard work and serious
attitude, I recommend a 50% raise." Your manager and
you hit the Send key simultaneously.
Without a generation count, the results might be less than amusing.
The object of your affection could get a letter about you deserving a
raise. Or, your manager's boss could get a letter
concerning a midday dalliance on the desktop. Or, both recipients
might get a mixture of the two versions, with each version containing
one file record from one file and one from another. The problem is
that the system can't distinguish between the two
files because the file handles are the same.
This kind of thing occasionally happened before the
generation-count code was working properly and consistently in the
Sun NFS server. With the generation-count software working as it
should, you will now instead get an error message stating
"Stale NFS File Handle" when you
try to access the (now deleted) file. That's because
the server increments the generation-count value in the inode when
the inode is returned to the free list. Later, whenever the server
receives a request from a client that has a valid file handle
except for the generation count, the server
rejects the operation and returns an error.
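The server-side check can be sketched as follows; the helper names and the in-memory inode table are illustrative assumptions, not actual server code:

```python
# Sketch of the stale-handle check described above.
ESTALE = "Stale NFS File Handle"

class Inode:
    def __init__(self, number, generation):
        self.number = number
        self.generation = generation

def lookup(inode_table, handle_inode, handle_generation):
    """Reject handles whose generation no longer matches the inode's."""
    inode = inode_table.get(handle_inode)
    if inode is None or inode.generation != handle_generation:
        raise IOError(ESTALE)
    return inode

table = {99: Inode(99, generation=7)}
lookup(table, 99, 7)                    # current handle: succeeds

# The file is deleted and the inode reused: generation becomes 8.
table[99] = Inode(99, generation=8)
try:
    lookup(table, 99, 7)                # old handle: rejected
except IOError as e:
    assert str(e) == ESTALE
```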
Some older NFS servers ignore the generation count in the file
handle. These versions of NFS are considerably less secure, as they
enable an attacker to easily create valid file handles for
directories on the server. They can also lead to the corruption of
user files.
Note that the file handle doesn't include a
pathname; a pathname is not necessary and is, in fact, subject to
change while a file is being accessed.
15.1.3 The MOUNT Protocol
The MOUNT protocol is used for the
initial negotiation between the NFS client and the NFS server. Using
MOUNT, a client can determine which filesystems are available for
mounting and can obtain a token (the file handle) that is used to
access the root directory of a particular filesystem. After that file
handle is returned, it can thereafter be used to retrieve file
handles for other directories and files on the server.
Another benefit of the MOUNT protocol is that you can export only a
portion of a local partition to a remote client. By specifying that
the root is a directory on the partition, the MOUNT service will
return its file handle to the client. To the client, this file handle
behaves exactly like one for the root of a partition: reads, writes,
and directory lookups all behave the same way.
MOUNT
is an RPC service. The service is provided by the
mountd or
rpc.mountd daemon, which is started
automatically at boot time. (On
Solaris systems, for example,
mountd is located in
/usr/lib/nfs/mountd, and is started by the
script /etc/rc3.d/S15nfs.server.) MOUNT is often
given the RPC program number 100005. The standard
mountd normally responds to six different
requests:
- NULL
-
Does nothing
- MNT
-
Returns a file handle for a filesystem; advises the mount daemon that
a client has mounted the filesystem
- DUMP
-
Returns the list of mounted filesystems
- UMNT
-
Removes the mount entry for this client for a particular filesystem
- UMNTALL
-
Removes all mount entries for this client
- EXPORT
-
Returns the server's export list to the client
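The bookkeeping behind these requests can be modeled with a toy, in-process simulation. The class and method names below are ours, and this is not a wire-level MOUNT client; it only illustrates the semantics of EXPORT, MNT, DUMP, and UMNT:

```python
# Toy model of mountd's bookkeeping; not a real RPC service.
class MountDaemon:
    def __init__(self, exports):
        self.exports = exports          # path -> opaque file handle
        self.mounts = []                # (client, path) pairs

    def export(self):
        """EXPORT: list the filesystems available for mounting."""
        return sorted(self.exports)

    def mnt(self, client, path):
        """MNT: hand out the root file handle and record the mount."""
        if path not in self.exports:
            raise PermissionError(path)
        self.mounts.append((client, path))
        return self.exports[path]

    def dump(self):
        """DUMP: return the list of recorded mounts."""
        return list(self.mounts)

    def umnt(self, client, path):
        """UMNT: drop this client's entry for one filesystem."""
        self.mounts.remove((client, path))

mountd = MountDaemon({"/export/home": b"fh-home"})
assert mountd.export() == ["/export/home"]
assert mountd.mnt("clientA", "/export/home") == b"fh-home"
assert mountd.dump() == [("clientA", "/export/home")]
mountd.umnt("clientA", "/export/home")
assert mountd.dump() == []
```

Note that the mount list kept by MNT and UMNT is purely advisory: a client that retains a copy of a file handle can keep using it after unmounting, which is one reason the MOUNT protocol by itself provides little security.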
Although the MOUNT protocol provides useful information within an
organization, the information that it provides could be used by those
outside an organization to launch an attack. For this reason, you
should prevent people outside your organization from accessing your
computer's mount daemon. The
best way to do this is by using a host-based or network-based
firewall. See Chapter 11 for further information.
The MOUNT protocol is based on
Sun Microsystems' RPC and External Data
Representation (XDR) protocols. For a complete description of the
MOUNT protocol, see RFC 1094.
15.1.4 The NFS Protocol
The
NFS protocol takes over where the MOUNT protocol leaves off. With the
NFS protocol, a client can list the contents of an exported
filesystem's directories; obtain file handles for
other directories and files; and even create, read, or modify files
(as permitted by Unix permissions).
Here is a list of the RPC functions that perform operations on
directories:
- CREATE
-
Creates (or truncates) a file in the directory
- LINK
-
Creates a hard link
- LOOKUP
-
Looks up a file in the directory
- MKDIR
-
Makes a directory
- READDIR
-
Reads the contents of a directory
- REMOVE
-
Removes a file in the directory
- RENAME
-
Renames a file in the directory
- RMDIR
-
Removes a directory
- SYMLINK
-
Creates a symbolic link
These RPC functions can be used with files:
- GETATTR
-
Gets a file's attributes (owner, length, etc.)
- SETATTR
-
Sets some of a file's attributes
- READLINK
-
Reads a symbolic link's path
- READ
-
Reads from a file
- WRITE
-
Writes to a file
NFS Version 3 added a number of additional RPC functions. With the
exception of MKNOD, these new functions simply allow improved
performance:
- ACCESS
-
Determines if a user has the permission to access a particular file
or directory
- FSINFO
-
Returns static information about a filesystem
- FSSTAT
-
Returns dynamic information about a filesystem
- MKNOD
-
Creates a device or special file on the remote filesystem
- READDIRPLUS
-
Reads a directory and returns the file attributes for each entry in
the directory
- PATHCONF
-
Returns POSIX pathconf information about a file
- COMMIT
-
Commits the NFS write cache to disk
All communication between the NFS client and the NFS server is based
upon Sun's RPC system (described in Chapter 13), which lets programs running on one computer
call subroutines that are executed on another. RPC uses
Sun's XDR system to allow the exchange of
information between different kinds of computers (see Figure 15-1). Sun built NFS upon the Internet
User Datagram Protocol (UDP), believing
that UDP was faster and more efficient than TCP. However, NFS
required reliable transmission and, as time went on, many tuning
parameters were added that made NFS resemble TCP in many respects.
NFS Version 3 allows the use of TCP, which actually improves
performance over low-bandwidth, high-latency links such as
modem-based PPP connections because TCP's backoff
and retransmission algorithms are significantly better than those in
NFS.
15.1.4.1 How NFS creates a reliable filesystem from a best-effort protocol
UDP is fast but only best-effort.
"Best effort" means that the
protocol does not guarantee that UDP packets transmitted will ever be
delivered, or that they will be delivered in order. NFS works around
this problem by requiring the NFS server to acknowledge every RPC
command with a result code that indicates whether the command was
successfully completed. If the NFS client does not get an
acknowledgment within a certain amount of time, it retransmits the
original command.
If the NFS client does not receive an acknowledgment, that indicates
that UDP lost either the original RPC command or the RPC
acknowledgment. If the original RPC command was lost, there is no
problem—the server sees it for the first time when it is
retransmitted. But if the acknowledgment was lost, the server will
actually get the same NFS command twice.
For most NFS commands, this duplication of requests presents no
problem. With READ, for example, the same block of data can be read
once or a dozen times, without consequence. Even with the WRITE
command, the same block of data can be written twice to the same
point in the file, without consequence, so long as there is not more
than one process writing to the file at the same time.
Other commands, however, cannot be executed twice in a row. MKDIR,
for example, will fail the second time that it is executed because
the requested directory will already exist. For commands that cannot
be repeated, some NFS servers maintain a cache of the last few
commands that were executed. When the server receives a MKDIR
request, it first checks the cache to see if it has already received
the MKDIR request. If so, the server merely retransmits the
acknowledgment (which must have been lost).
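A duplicate-request cache of this kind can be sketched in a few lines of Python. The class and names below are illustrative, not an actual server implementation; real servers key the cache on the RPC transaction ID (xid) and age entries out, as hinted here:

```python
# Sketch of a duplicate-request cache for non-idempotent calls such
# as MKDIR, keyed by the RPC transaction id (xid).
class DuplicateRequestCache:
    def __init__(self):
        self.replies = {}               # xid -> saved reply

    def handle(self, xid, execute):
        if xid in self.replies:         # retransmission: replay reply
            return self.replies[xid]
        reply = execute()               # first time: run the command
        self.replies[xid] = reply
        return reply

made = []
def mkdir():
    if "proj" in made:
        return "EEXIST"                 # would fail the second time
    made.append("proj")
    return "OK"

cache = DuplicateRequestCache()
assert cache.handle(xid=17, execute=mkdir) == "OK"
# The acknowledgment was lost, so the client resends the same xid;
# the cached reply is replayed instead of re-running MKDIR.
assert cache.handle(xid=17, execute=mkdir) == "OK"
```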
15.1.4.2 Hard, soft, and spongy mounts
If the NFS client still
receives no acknowledgment, it will retransmit the request again and
again, each time doubling the time that it waits between retries. If
the network filesystem was mounted with the soft
option, the request will eventually time out. If the network
filesystem is mounted with the hard option, the
client continues sending the request until the client is rebooted or
gets an acknowledgment. Some BSD-derived versions of Unix also have
a spongy option that is similar to
hard, except that the stat,
lookup, fsstat,
readlink, and readdir
operations behave as if they have a soft MOUNT.
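The client's retry loop can be sketched as follows. The function names and timer values are illustrative assumptions; real clients tune the initial timeout per operation and actually sleep between retries:

```python
# Sketch of the NFS client retry loop: double the interval between
# retries, and give up only if the filesystem was mounted soft.
def rpc_call(send, soft, timeo=1.0, max_wait=16.0):
    wait = timeo
    while True:
        reply = send()
        if reply is not None:
            return reply                # acknowledged: done
        if soft and wait > max_wait:
            raise TimeoutError("RPC timed out")  # soft mount gives up
        wait *= 2   # back off; actual sleeping omitted in this sketch

attempts = []
def flaky_send():
    attempts.append(1)
    return "ack" if len(attempts) >= 3 else None  # answers on try 3

assert rpc_call(flaky_send, soft=True) == "ack"
assert len(attempts) == 3
```

With `soft=False` (a hard mount), the loop above never gives up, which is exactly the behavior described in the text: the client retries until the server answers or the client is rebooted.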
NFS uses the mount command to specify whether a filesystem
is mounted with the hard or
soft option. To mount a filesystem soft, specify
the soft option. For example:
/etc/mount -o soft zeus:/big /zbig
This command mounts the directory /big stored on
the server called zeus locally in the directory
/zbig. The option -o soft
tells the mount program that you wish the
filesystem mounted soft.
To mount a filesystem hard, do not specify the
soft option:
/etc/mount zeus:/big /zbig
On some systems you need to be explicit that this is an NFS mount.
You may also be able to use a URL format for the path and server.
Here are examples of each:
mount -F nfs zeus:/big /zbig
mount nfs://zeus/big /zbig
Deciding whether to mount a filesystem hard or soft can be difficult
because there are advantages and disadvantages to each option.
Diskless workstations often hard-mount the directories that they use
to keep system programs; if a server crashes, the workstations wait
until the server is rebooted, then continue file access with no
problem. Filesystems containing home directories are usually
hard-mounted so that all disk writes to those filesystems will be
performed correctly.
On the other hand, if you mount many filesystems with the
hard option, you will discover that your
workstation may stop working every time any server crashes and
won't work again until it reboots. If there are many
libraries and archives that you keep mounted on your system, but that
are not critical, you may wish to mount them soft. You may also wish
to specify the intr option, which is like the
hard option except that the user can interrupt
it by typing the kill character (usually Ctrl-C).
As a general rule of thumb, read-only filesystems can be mounted soft
without any chance of accidental loss of data. An alternative to
using soft mounts is to mount everything hard (or spongy, when
available) but avoid mounting your nonessential NFS partitions
directly in the root directory. This practice
will prevent the Unix getcwd( ) function from
hanging when a server is down.
15.1.4.3 Connectionless and stateless
As we've
mentioned, NFS servers are stateless by
design. Stateless means that all of the
information that the client needs to mount a remote filesystem is
kept on the client, instead of having additional information with the
mount stored on the server. After a file handle is issued for a file,
that file handle will remain good even if the server is shut down and
rebooted as long as the file continues to exist and no major changes
are made to the configuration of the server that would change the
values (e.g., a filesystem rebuild or restore from tape).
Early NFS servers were also
connectionless. Connectionless means that the
server program does not keep track of every client that has remotely
mounted the filesystem. When offering NFS over a TCP connection, however, NFS is
not connectionless: there is one TCP connection for each mounted
filesystem.
The advantage of a stateless, connectionless system is that such
systems are easier to write and debug. The programmer does not need
to write any code for re-establishing connections after the network
server crashes and restarts because there is no connection that must
be re-established. If a client crashes (or if the network becomes
disconnected), valuable resources are not tied up on the server
maintaining a connection and state for that client.
A second advantage of this approach is that it scales. That is, a
connectionless, stateless NFS server works equally well if 10 clients
are using a filesystem or if 10,000 are using it. Although system
performance suffers under extremely heavy use, every file request
made by a client using NFS should eventually be satisfied, and there
is no performance penalty if a client mounts a filesystem but never
uses it.
15.1.4.4 NFS and root
Because the superuser can do so much
damage on the typical Unix system, NFS takes special precautions in
how it handles the superuser running on client computers.
Instead of giving the client superuser unlimited privileges on the
NFS server, NFS gives the superuser on the clients virtually no
privileges: the superuser is mapped to the UID of the
nobody user—usually a UID of 32767 or
60001 (although occasionally -1 or -2 on pre-POSIX
systems). Some versions of NFS allow you to specify at
mount time the UID to which to map
root's accesses, with the UID
of the nobody user as the default.
Thus, superusers on NFS client machines actually have fewer
privileges (with respect to the NFS server) than ordinary users.
However, this lack of privilege isn't usually much
of a problem for would-be attackers who have
root access because the superuser can simply
su to a different UID such as
bin or sys. On the other
hand, treating the superuser in this way can protect other files on
the NFS server.
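The mapping rule can be summarized in a short sketch. The function names are ours, and the UID chosen for nobody is one of the common values mentioned above; both are illustrative:

```python
# Sketch of the credential mapping a server applies to client requests:
# root is remapped to nobody, every other UID passes through unchanged.
UID_ROOT = 0
UID_NOBODY = 60001                      # "nobody"; varies by system

def map_credentials(uid, gid, root_map_uid=UID_NOBODY):
    """Map a client's root to nobody; pass other UIDs through."""
    if uid == UID_ROOT:
        return root_map_uid, gid
    return uid, gid                     # no remapping for other users

assert map_credentials(0, 1) == (60001, 1)   # root -> nobody
assert map_credentials(2, 2) == (2, 2)       # e.g., bin passes through
```

The pass-through branch is the weakness discussed next: because only UID 0 is remapped, a client superuser can simply assume any other identity.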
Most implementations of NFS do no remapping of any other UID, nor any
remapping of any GID values. Thus, if a server exports any file or
directory with access permissions for some user or group, the
superuser on a client machine can take on an identity to access that
information. This rule implies that the exported file can be read or
copied by someone remote or, worse, modified without
authorization.
15.1.5 NFS Version 3
During the 10 years of the life of NFS
Version 2, a number of problems were discovered with it. These
problems included:
- NFS was originally based on AUTH_UNIX RPC security. As such, it
provided almost no protection against spoofing. AUTH_UNIX simply used
the stated UID and GID of the client user to determine access.
- The packets transmitted by NFS were not encrypted, and were thus open
to eavesdropping, alteration, or forging on a network.
- NFS had no provisions for files larger than 4 GB. This was not a
problem in 1985, but many Unix users now have bigger disks and bigger
files.
- NFS suffered serious performance problems on high-speed networks
because of the maximum 8-KB data-size limitation on READ and WRITE
procedures, and because of the need to separately request the file
attributes on each file when a directory was read.
NFS Version 3 (NFS 3) was the first major revision to NFS since the
protocol was commercially released. As such, NFS 3 was designed to
correct many of the problems that had been experienced with NFS. But
NFS 3 was not a total rewrite. According to Pawlowski et al., there
were three guiding principles in designing NFS 3:
Thus, while NFS 3 allows for improved performance and access to files
larger than 4 GB, it does not make any fundamental changes to the
overall NFS architecture. (That has been relegated to NFS Version 4.)
As a result of the design criteria, there are relatively few changes
between the NFS 2 and NFS 3 protocols:
- File handle size was increased from a fixed-length 32-byte block of
data to a variable-length array with a maximum length of 64 bytes.
- The maximum size of data that can be transferred using READ and WRITE
procedures is now determined dynamically by the values returned by
the FSINFO function. The maximum lengths for filenames and pathnames
are now similarly specified.
- File lengths and offsets were extended from four bytes to eight
bytes.
- RPC errors can now return data (such as file attributes) in addition
to returning codes.
- Additional file types are now supported for character- and
block-device files, sockets, and FIFOs. In some cases, this actually
increases the potential vulnerability of the NFS server.
- An ACCESS procedure was added to allow an NFS client to explicitly
check to see if a particular user can or cannot access a file.
Because RPC allows a server to respond to more than one version of a
protocol at the same time, NFS 3 servers are potentially able to
support the NFS 2 and 3 protocols simultaneously so that they can
serve older NFS 2 clients while allowing easy upgradability to NFS 3.
Likewise, most NFS 3 clients could continue to support the NFS 2
protocol as well so that they can speak with old and new
servers.
This need for backward compatibility effectively prevented the NFS 3
designers from adding new security features to the protocols. If NFS
3 had more security features, an attacker could avoid them by
resorting to NFS 2. On the other hand, by changing a site from
unsecure RPC to secure RPC, a site can achieve secure NFS for all of
its NFS clients and servers, whether they are running NFS 2 or NFS 3.
If your system supports NFS over TCP
links, you should configure it to use TCP and not UDP unless there
are significant performance reasons for not doing so. TCP-based
service is more immune to denial of service problems, spoofed
requests, and several other potential problems inherent in the
current use of UDP packets.