6.5. Replication

Solaris 2.6 introduced the concept of replication to NFS clients. This feature is known as client-side failover. Client-side failover is useful whenever you have read-only data that you need to be highly available. An example will illustrate this. Suppose your user community needs to access a collection of historical data on the last 200 national budgets of the United States. This is a lot of data, and so it is a good candidate to store on a central NFS server. However, because your users' jobs depend on it, you do not want a single point of failure, and so you keep the data on several NFS servers. (Keeping the data on several NFS servers also gives you the opportunity to load balance.) Suppose you have three NFS servers, named hamilton, wolcott, and dexter, each exporting a copy of the data. Then each server might have an entry like this in its dfstab file:

    share -o ro /export/budget_stats

Now, without client-side failover, each NFS client might have one of the following vfstab entries:

    hamilton:/export/budget_stats - /stats/budget nfs - yes ro
    wolcott:/export/budget_stats - /stats/budget nfs - yes ro
    dexter:/export/budget_stats - /stats/budget nfs - yes ro

Suppose an NFS client is mounting /stats/budget from NFS server hamilton, and hamilton stops responding. The user on that client will want to mount a different server. In order to do this, he'll have to do all of the following: stop any applications that have files open under /stats/budget, unmount /stats/budget, point the vfstab entry at a different server, and mount /stats/budget again. With client-side failover, the client instead names all three servers in a single vfstab entry:

    hamilton,wolcott,dexter:/export/budget_stats - /budget_stats nfs - yes ro

This vfstab entry defines a replicated NFS filesystem. When this vfstab entry is mounted, the NFS client binds the mount to the first server in the list that responds; if that server later becomes unavailable, the client switches to another server in the list. You can see which server a replicated mount is currently bound to with nfsstat -m:

    % nfsstat -m
    ...
    /budget_stats from hamilton,wolcott,dexter:/export/budget_stats
     Flags: vers=3,proto=tcp,sec=sys,hard,intr,llock,link,symlink,acl,rsize=32768,wsize=32768,retrans=5
     Failover:noresponse=1, failover=1, remap=1, currserver=wolcott

The currserver value tells us that NFS traffic for the /budget_stats mount point is bound to server wolcott. Apparently hamilton stopped responding at one point, because we see non-zero values for the counters noresponse, failover, and remap. The counter noresponse counts the number of times a remote procedure call to the currently bound NFS server timed out. The counter failover counts the number of times the NFS client has "failed over," or switched to another NFS server, due to a timed-out remote procedure call. The counter remap counts the number of files that were "mapped" to another NFS server after a failover. For example, if an application on the NFS client had /budget_stats/1994/deficit open, and then the client failed over to another server, the next time the application went to read data from /budget_stats/1994/deficit, the open file reference would be re-mapped to the corresponding 1994/deficit file on the newly bound NFS server. Solaris will also notify you when a failover happens. Expect a message like:

    NOTICE: NFS: failing over from hamilton to wolcott

on both the NFS client's system console and in its /var/adm/messages file. By the way, it is not required that each server export the same pathname. The mount command will let you mount replica servers with different directories. For example:

    # mount -o ro serverX:/q,serverY:/m /mnt

As long as the contents of serverX:/q and serverY:/m are the same, the top-level directory name does not have to be. The next section discusses rules for the content of replicas.
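The Failover: line of the nfsstat -m output shown earlier is easy to pick apart from a script, for instance to alert an administrator when a mount has drifted off its preferred server. A small sketch, assuming the currserver= field format shown above (the exact output format may vary across Solaris releases):

```shell
# current_server: print the server a replicated mount is currently
# bound to, by parsing the "currserver=" field of "nfsstat -m" output.
# The field format is the one shown in the example above and may
# differ across Solaris releases.
current_server() {
    nfsstat -m "$1" | sed -n 's/.*currserver=\([^,]*\).*/\1/p'
}

# e.g.: current_server /budget_stats
```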
6.5.1. Properties of replicas

Replicas on each server in the replicated filesystem have to be the same in content. For example, if on an NFS client we have done:

    # mount -o ro serverX,serverY:/export /mnt

then /export on both servers needs to be an exact copy. One way to generate such a copy would be:

    # rlogin serverY
    serverY # cd /export
    serverY # rm -rf ../export
    serverY # mount serverX:/export /mnt
    serverY # cd /mnt
    serverY # find . -print | cpio -dmp /export
    serverY # umount /mnt
    serverY # exit
    #

The third command invoked here, rm -rf ../export, is somewhat curious. What we want to do is remove the contents of /export in a manner that is as fast and safe as possible. We could do rm -rf /export, but that has the side effect of removing /export as well as its contents. Since /export is exported, any NFS client that is currently mounting serverY:/export will experience stale filehandles (see Section 18.8, "Stale filehandles"). Recreating /export immediately with the mkdir command does not suffice because of the way NFS servers generate filehandles for clients: the filehandle contains, among other things, the inode number (a file's or directory's unique identification number), and the new directory's inode number is almost guaranteed to be different. So we want to remove just what is under /export. A commonly used method for doing that is:

    # cd /export ; find . -print | xargs rm -rf

but the problem there is that if someone has placed a filename like "foo /etc/passwd" (i.e., a file with an embedded space character) in /export, then the xargs rm -rf command will remove a file called foo and a file called /etc/passwd, which on Solaris may prevent anyone from logging into the system. Doing rm -rf ../export avoids this, and /export itself survives because rm will not remove the current working directory. Note that this behavior may vary with other systems, so test it on something unimportant to be sure. At any rate, the aforementioned sequence of commands will create a replica whose contents exactly match the original's.
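Where the local find and xargs support the NUL-termination options -print0 and -0 (GNU extensions, not present in older Solaris releases), embedded whitespace in filenames stops being a hazard. A sketch of that alternative:

```shell
# empty_dir: remove everything under a directory without removing the
# directory itself, so NFS filehandles for the directory stay valid.
# -print0/-0 pass NUL-terminated names, so a file named "foo /etc/passwd"
# is handled as a single name. -mindepth, -print0, and -0 are GNU
# extensions and may be absent from older Solaris find/xargs.
empty_dir() {
    (cd "$1" && find . -mindepth 1 -print0 | xargs -0 rm -rf)
}

# e.g., on serverY:  empty_dir /export
```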
6.5.2. Rules for mounting replicas

In order to use client-side failover, the filesystem must be mounted with the suboptions ro (read-only) and hard. The reason it has to be mounted read-only is that if NFS clients could write to the replica filesystem, the replicas would no longer be synchronized, producing undesirable effects: the modified replica would diverge from the others, and a client that failed over would suddenly see different file contents, or even missing files, under the same pathnames. It is also necessary that every server in the list serve the mount with a common NFS protocol version; you can force a particular version with the vers suboption. For example:

    # mount -o vers=2 serverA,serverB,serverC:/export /mnt

Note that it is not a requirement that all the NFS servers in the replicated filesystem support the same transport protocol (TCP or UDP).
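Putting these rules together for the budget example, the equivalent mount command (rather than a vfstab entry) would look like this sketch; server names and paths are the ones from the example earlier in the chapter:

```shell
# Replicated mount obeying the rules above: read-only and hard.
# (hard is the Solaris default, but spelling it out documents the
# requirement.)
mount -o ro,hard hamilton,wolcott,dexter:/export/budget_stats /budget_stats
```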
6.5.3. Managing replicas

In Solaris, the onus for creating, distributing, and maintaining replica filesystems is on the system administrator; there are no tools to manage replication. The techniques used in the example in Section 6.5.1, "Properties of replicas", can be applied, although the example command sequence given there for generating a replica may cause stale filehandle problems when used to update an existing replica; we will address this in Section 18.8, "Stale filehandles". You will also want to automate the replica distribution procedure, replacing the interactive rlogin session with a script driven by a non-interactive remote shell.
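As a sketch of such automation (hostnames and paths are the chapter example's; tar stands in for the cpio pass-through shown earlier purely as a portable equivalent, and rsh could be replaced by ssh on current systems):

```shell
#!/bin/sh
# Hypothetical refresh script for a replica server (serverY in the
# example), intended to run non-interactively, e.g., from cron.

# sync_tree: copy the contents of directory $1 into directory $2,
# preserving the tree. tar replaces the cpio pass-through used in the
# interactive example, as a portable equivalent.
sync_tree() {
    (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}

# The steps from the interactive session, ready to be driven remotely:
# mount serverX:/export /mnt          # bring in the master copy
# (cd /export && rm -rf ../export)    # empty /export, keep the directory
# sync_tree /mnt /export              # refresh the replica's contents
# umount /mnt
```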
6.5.4. Replicas and the automounter

Replication is best combined with use of the automounter. The integration of the two is described in Section 9.5.1, "Replicated servers".
Copyright © 2002 O'Reilly & Associates. All rights reserved.