Chapter 7. Network File System Design and Operation

It's possible to configure and use the Network File System without much knowledge of how it is implemented or why various design decisions were made. But if you need to debug problems, or analyze patterns of NFS usage to suggest performance optimizations, you will need to know more about the inner workings of the NFS protocol and the daemons that implement it. With an understanding of how and why NFS does the things it does, you can more readily determine why it is broken or slow -- probably the two most common complaints in any large NFS network.

Like NIS, NFS is implemented as a set of RPC procedures that use eXternal Data Representation (XDR) encoding to pass arguments between client and server. A filesystem mounted using NFS provides two levels of transparency:

The filesystem appears to be resident on a disk attached to the local system, and all of the filesystem entries -- files and directories -- are viewed the same way, whether local or remote. NFS hides the location of the file on the network.
NFS-mounted filesystems contain no information about the file server from which they are mounted. The NFS file server may be of a different architecture or running an entirely different operating system with a radically different filesystem structure. For example, a Sun machine running Solaris can mount an NFS filesystem from a Windows NT system or an IBM MVS mainframe, using NFS server implementations for each of these systems. NFS hides differences in the underlying remote filesystem structure and makes the remote filesystem appear to be of the exact same structure as that of the client.

NFS achieves the first level of transparency by defining a generic set of filesystem operations that are performed on a Virtual File System (VFS). The second level comes from the definition of virtual nodes, which are related to the more familiar Unix filesystem inode structures but hide the actual structure of the physical filesystem beneath them. The set of all procedures that can be performed on files is the vnode interface definition. The vnode and VFS specifications together define the NFS protocol.

7.1. Virtual filesystems and virtual nodes

The Virtual File System allows a client system to access many different types of filesystems as if they were all attached locally. VFS hides the differences in implementations under a consistent interface. On a Unix NFS client, the VFS interface makes all NFS filesystems look like Unix filesystems, even if they are exported from IBM MVS or Windows NT servers. The VFS interface is really nothing more than a switchboard for filesystem- and file-oriented operations, as shown in Figure 7-1.

Figure 7-1. Virtual File System interfaces

Actions that operate on entire filesystems, such as getting the amount of free space left in the filesystem, are called VFS operations; calls that operate on files or directories are vnode operations. On the server side, implementing a VFS entails taking the generic VFS and vnode operations and converting them into the appropriate actions on the real, underlying filesystem. This conversion happens invisibly to the NFS client process. The client process simply made a system call; the client-side VFS turned that call into a vnode operation, and the server then converted the vnode operation into an equivalent operation on its own filesystem.
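The split between VFS operations and vnode operations can be sketched as a small dispatch table. The following is only an illustrative model in Python, not kernel code: the class names (`Vfs`, `Vnode`, `ToyFs`, `Switch`), the method names, and the sample values are all assumptions made for this example.

```python
from abc import ABC, abstractmethod


class Vnode(ABC):
    """File-level interface: one vnode per file or directory."""

    @abstractmethod
    def getattr(self):
        """Return the attributes of this file as a dict."""


class Vfs(ABC):
    """Filesystem-level interface: one instance per mounted filesystem."""

    @abstractmethod
    def statfs(self):
        """Return filesystem-wide information, such as free space."""

    @abstractmethod
    def lookup(self, name):
        """Return the vnode for a name in this filesystem."""


class ToyVnode(Vnode):
    def __init__(self, attrs):
        self._attrs = attrs

    def getattr(self):
        return dict(self._attrs)  # copy, so callers cannot alter our state


class ToyFs(Vfs):
    """Hypothetical filesystem backed by a dictionary."""

    def __init__(self, fstype, free_blocks, files):
        self.fstype = fstype
        self._free_blocks = free_blocks
        self._files = {name: ToyVnode(attrs) for name, attrs in files.items()}

    def statfs(self):
        return {"fstype": self.fstype, "free_blocks": self._free_blocks}

    def lookup(self, name):
        return self._files[name]


class Switch:
    """The 'switchboard': routes each call to whichever filesystem is mounted."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, fs):
        self._mounts[mount_point] = fs

    def statfs(self, mount_point):          # a VFS operation
        return self._mounts[mount_point].statfs()

    def getattr(self, mount_point, name):   # a vnode operation
        return self._mounts[mount_point].lookup(name).getattr()


switch = Switch()
switch.mount("/", ToyFs("local", 1000, {"motd": {"size": 42}}))
switch.mount("/home", ToyFs("nfs", 500, {"notes": {"size": 7}}))
```

A caller uses `switch.statfs("/home")` and `switch.getattr("/", "motd")` in exactly the same way for both mounts; nothing in the call reveals which filesystem type sits behind the mount point.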

For example, the chown( ) system call has an analogous operation in the vnode interface that sets the attributes of a file, as does the stat( ) system call that retrieves these attributes. There is not a strict one-to-one relationship between Unix system calls and vnode operations. The write( ) system call uses several filesystem calls to get a file's attributes and to append or modify blocks in the file. Some vnode operations are not defined on certain types of filesystems. The FAT filesystem, for example, doesn't have an equivalent of symbolic links, so an NFS file server running on a Windows NT machine rejects any attempt to use the vnode operation that creates a symbolic link.
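The relationship between system calls and vnode operations described above can be modeled in a few lines. This is a simplified sketch under stated assumptions: `ToyUnixFs`, `ToyFatFs`, `OperationNotSupported`, and the `sys_*` wrappers are hypothetical names, file contents are reduced to a size counter, and the error raised for an unsupported operation is a stand-in rather than any real NFS status code.

```python
class OperationNotSupported(Exception):
    """Raised when a filesystem has no equivalent of a vnode operation."""


class ToyUnixFs:
    """Toy in-memory filesystem exposing a few vnode-style operations."""

    supports_symlinks = True

    def __init__(self):
        self.attrs = {}   # path -> attribute dict
        self.links = {}   # path -> symbolic link target

    def create(self, path, owner):
        self.attrs[path] = {"owner": owner, "size": 0}

    def getattr(self, path):
        return dict(self.attrs[path])

    def setattr(self, path, **changes):
        self.attrs[path].update(changes)

    def symlink(self, path, target):
        if not self.supports_symlinks:
            raise OperationNotSupported("symlink")
        self.links[path] = target


class ToyFatFs(ToyUnixFs):
    """Same interface, but the underlying format has no symbolic links."""

    supports_symlinks = False


def sys_stat(fs, path):
    # stat() maps onto the "get attributes" operation.
    return fs.getattr(path)


def sys_chown(fs, path, owner):
    # chown() maps onto the "set attributes" operation.
    fs.setattr(path, owner=owner)


def sys_write(fs, path, data):
    # One write() needs more than one operation: fetch the attributes,
    # then extend the file (modeled here as a size update only).
    size = fs.getattr(path)["size"]
    fs.setattr(path, size=size + len(data))
    return len(data)
```

With a `ToyUnixFs` instance, `sys_chown` followed by `sys_stat` shows the new owner, while calling `symlink` on a `ToyFatFs` instance raises `OperationNotSupported`, which mirrors a server rejecting an operation its filesystem cannot express.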

So far we have defined an interface to some filesystem objects, but not the mechanism used to "name" objects in the system. In a local Unix system call, these object names are file descriptors, which uniquely identify a file within the scope of a process. The counterparts of file descriptors in NFS are filehandles, which are opaque "pointers" to files on the remote system. The contents of an opaque handle mean nothing to the client, because the handle can only be interpreted in the context of the remote filesystem. When you want to make a system call on a file, you first get a file descriptor for it. To make an NFS call (in the kernel), you must first get a filehandle for the vnode. It is up to the virtual filesystem layer to translate user-level file descriptors into kernel-level filehandles. Filehandles and their creation will be covered in more depth in the next section.
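The division of labor between file descriptors and filehandles can be illustrated with a toy model. Everything here is an assumption made for the example: `ToyServer` and `ToyClient` are hypothetical classes, and the random 16-byte handle is only a stand-in, since this sketch says nothing about how a real server builds its filehandles.

```python
import os


class ToyServer:
    """Hypothetical server that hands out opaque handles for its files."""

    def __init__(self):
        self._by_path = {}     # server-side path -> handle
        self._by_handle = {}   # handle -> server-side path

    def lookup(self, path):
        # Return the same handle every time a given path is looked up.
        if path not in self._by_path:
            handle = os.urandom(16)  # stand-in for server-chosen contents
            self._by_path[path] = handle
            self._by_handle[handle] = path
        return self._by_path[path]

    def getattr(self, handle):
        # Only the server can turn a handle back into a file.
        return {"path": self._by_handle[handle]}


class ToyClient:
    """Maps per-process file descriptors onto server filehandles."""

    def __init__(self, server):
        self._server = server
        self._fd_table = {}   # file descriptor -> filehandle
        self._next_fd = 3     # 0, 1, 2 are conventionally the standard streams

    def open(self, path):
        handle = self._server.lookup(path)
        fd = self._next_fd
        self._next_fd += 1
        self._fd_table[fd] = handle
        return fd

    def fstat(self, fd):
        # The client never inspects the handle; it stores it and sends it back.
        return self._server.getattr(self._fd_table[fd])


server = ToyServer()
client = ToyClient(server)
fd = client.open("/export/home/notes")
```

The process sees only the small integer `fd`. The client side stores the handle bytes without interpreting them, and `client.fstat(fd)` succeeds only because the server recognizes the handle it issued earlier.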
