15.4. Incorrect directory content caching
A user of a Solaris NFS client reported having
intermittent problems
accessing files mounted from a non-Unix NFS server. The Solaris NFS
client
tarsus was apparently able to list files
that had previously been removed by another NFS client, but was
unable to access the contents of the files. The files would
eventually disappear. The NFS client that initially removed the files
did not experience any problems and the user reported that the files
had indeed been removed from the server's directory. He
verified this by logging into the NFS server and listing the contents
of the exported directory.
We suspected the client
tarsus was not
invalidating its cached information, and proceeded to try to
reproduce the problem while capturing the NFS packets to analyze the
network traffic:
[1] tarsus$ ls -l /net/inchun/export/folder
total 8
-rw-rw-rw- 1 labiaga staff 2883 Apr 10 20:03 data1
-rw-rw-rw- 1 root other 12 Apr 10 20:01 data2
[1] protium$ rm /net/inchun/export/folder/data2
[2] tarsus$ ls /net/inchun/export/folder
data1 data2
[3] tarsus$ ls -l /net/inchun/export/folder
/net/inchun/export/folder/data2: Stale NFS file handle
total 6
-rw-rw-rw- 1 labiaga staff 2883 Apr 10 20:03 data1
The first directory listing on
tarsus correctly
displayed the contents of the NFS directory
/net/inchun/export/folder before anything was
removed. The problems began after the NFS client
protium removed the file
data2. The second directory listing on
tarsus continued showing the recently removed
data2 file as part of the directory, although
the extended directory listing reported a "Stale NFS
filehandle" for
data2.
This was a typical case of inconsistent caching of information by an
NFS client. Solaris NFS clients cache the directory content and
attribute information in memory at the time the directory contents
are first read from the NFS server. Subsequent client accesses to the
directory first validate the cached information, comparing the
directory's cached modification time to the modification time
reported by the server. A match in modification times indicates that
the directory has not been modified since the last time the client
read it, therefore it can safely use the cached data. On the other
hand, if the modification times are different, the NFS client purges
its cache, and issues a new NFS
Readdir
request to the server to obtain the updated directory contents and
attributes. Some non-Unix NFS servers are known for not updating the
modification time of directories when files are removed, leading to
directory caching problems. We used
snoop to
capture the NFS packets between our client and server while the
problem was being reproduced. The analysis of the snoop output should
help us determine if we're running into this caching problem.
To facilitate the discussion, we list the
snoop
packets preceded by the commands that generated them. This shows the
correlation between the NFS traffic and the Unix commands that
generate the traffic:
[1] tarsus $ ls -l /net/inchun/export/folder
total 8
-rw-rw-rw- 1 labiaga staff 2883 Apr 10 20:03 data1
-rw-rw-rw- 1 root other 12 Apr 10 20:01 data2
7 0.00039 tarsus -> inchun NFS C GETATTR2 FH=FA14
8 0.00198 inchun -> tarsus NFS R GETATTR2 OK
9 0.00031 tarsus -> inchun NFS C READDIR2 FH=FA14 Cookie=0
10 0.00220 inchun -> tarsus NFS R READDIR2 OK 4 entries (No more)
11 0.00033 tarsus -> inchun NFS C LOOKUP2 FH=FA14 data2
12 0.00000 inchun -> tarsus NFS R LOOKUP2 OK FH=F8CD
13 0.00000 tarsus -> inchun NFS C GETATTR2 FH=F8CD
14 0.00000 inchun -> tarsus NFS R GETATTR2 OK
15 0.00035 tarsus -> inchun NFS C LOOKUP2 FH=FA14 data1
16 0.00211 inchun -> tarsus NFS R LOOKUP2 OK FH=F66F
17 0.00032 tarsus -> inchun NFS C GETATTR2 FH=F66F
18 0.00191 inchun -> tarsus NFS R GETATTR2 OK
Packets 7 and 8 contain the request and reply for attributes for the
/net/inchun/export/folder directory. The
attributes can be displayed by using the
-v
directive:
Excerpt from: snoop -i /tmp/capture -p 7,8 -v
ETHER: ----- Ether Header -----
ETHER:
ETHER: Packet 8 arrived at 20:45:9.75
...
NFS: ----- Sun NFS -----
NFS:
NFS: Proc = 1 (Get file attributes)
NFS: Status = 0 (OK)
NFS: File type = 2 (Directory)
NFS: Mode = 040777
NFS: Type = Directory
NFS: Setuid = 0, Setgid = 0, Sticky = 0
NFS: Owner's permissions = rwx
NFS: Group's permissions = rwx
NFS: Other's permissions = rwx
NFS: Link count = 2, UID = 0, GID = -2, Rdev = 0x0
NFS: File size = 512, Block size = 512, No. of blocks = 1
NFS: File system id = 7111, File id = 161
NFS: Access time = 11-Apr-00 12:50:18.000000 GMT
NFS: Modification time = 11-Apr-00 12:50:18.000000 GMT
NFS: Inode change time = 31-Jul-96 09:40:56.000000 GMT
Packet 8 shows the
/net/inchun/export/folder
directory was last modified on April 11, 2000 at 12:50:18.000000 GMT.
tarsus caches this timestamp to later determine
when the cached directory contents need to be updated. Packet 9
contains the request made by
tarsus for the
directory listing from
inchun. Packet 10
contains
inchun's reply with four entries
in the directory. A detailed view of the packets shows the four
directory entries: ".", "..",
"
data1", and
"
data2". The EOF indicator notifies
the client that all existing directory entries have been listed, and
there is no need to make another NFS
Readdir
call:
Excerpt from: snoop -i /tmp/capture -p 9,10 -v
ETHER: ----- Ether Header -----
ETHER:
ETHER: Packet 10 arrived at 20:45:9.74
...
NFS: ----- Sun NFS -----
NFS:
NFS: Proc = 16 (Read from directory)
NFS: Status = 0 (OK)
NFS: File id Cookie Name
NFS: 137 50171 .
NFS: 95 50496 ..
NFS: 199 51032 data1
NFS: 201 51706 data2
NFS: 4 entries
NFS: EOF = 1
NFS:
The directory contents are cached by
tarsus, so
that they may be reused in a future directory listing. The NFS
Lookup and NFS
Getattr
requests, along with their corresponding replies in packets 11 thru
18, result from the long listing of the directory requested by
ls -l. An NFS
Lookup
obtains the filehandle of a directory component. The NFS
Getattr requests the file attributes of the file
identified by the previously obtained filehandle.
NFS Version 2 filehandles are 32 bytes long. Instead of displaying a
long and cryptic 32-byte number,
snoop generates
a shorthand version of the filehandle and displays it when invoked in
summary mode. This helps you associate filehandles with file objects
more easily. You can obtain the exact filehandle by displaying the
network packet in verbose mode by using the -v option. The packet 7
filehandle FH=FA14 is really:
Excerpt from: snoop -i /tmp/capture -p 7 -v
NFS: ----- Sun NFS -----
NFS:
NFS: Proc = 1 (Get file attributes)
NFS: File handle = [FA14]
NFS: 0204564F4C32000000000000000000000000A10000001C4DFF20A00000000000
Next,
protium, a different NFS client comes into
the picture, and removes one file from the directory previously
cached by
tarsus:
[1] protium $ rm /net/inchun/export/folder/data2
22 0.00000 protium -> inchun NFS C GETATTR2 FH=FA14
23 0.00000 inchun -> protium NFS R GETATTR2 OK
24 0.00000 protium -> inchun NFS C REMOVE2 FH=FA14 data2
25 0.00182 inchun -> protium NFS R REMOVE2 OK
Packets 22 and 23 update the cached attributes of the
/net/inchun/export/folder directory on
protium. Packet 24 contains the actual NFS
Remove request sent to
inchun, which in turn acknowledges the
successful removal of the file in packet 25.
tarsus then lists the directory in question, but
fails to detect that the contents of the directory have changed:
[2] tarsus $ ls /net/inchun/export/folder
data1 data2
39 0.00000 tarsus -> inchun NFS C GETATTR2 FH=FA14
40 0.00101 inchun -> tarsus NFS R GETATTR2 OK
This is where the problem begins. Notice that two NFS
Getattr network packets are generated as a
result of the directory listing but no
Readdir
request. In this case, the client issues the NFS
Getattr operation to request the
directory's modification time:
Excerpt from: snoop -i /tmp/capture -p 39,40 -v
ETHER: ----- Ether Header -----
ETHER:
ETHER: Packet 40 arrived at 20:45:10.88
...
NFS: ----- Sun NFS -----
NFS:
NFS: Proc = 1 (Get file attributes)
NFS: Status = 0 (OK)
NFS: File type = 2 (Directory)
NFS: Mode = 040777
NFS: Type = Directory
NFS: Setuid = 0, Setgid = 0, Sticky = 0
NFS: Owner's permissions = rwx
NFS: Group's permissions = rwx
NFS: Other's permissions = rwx
NFS: Link count = 2, UID = 0, GID = -2, Rdev = 0x0
NFS: File size = 512, Block size = 512, No. of blocks = 1
NFS: File system id = 7111, File id = 161
NFS: Access time = 11-Apr-00 12:50:18.000000 GMT
NFS: Modification time = 11-Apr-00 12:50:18.000000 GMT
NFS: Inode change time = 31-Jul-96 09:40:56.000000 GMT
The modification time of the directory is the same as the
modification time before the removal of the file!
tarsus compares the cached modification time of
the directory with the modification time just obtained from the
server, and determines that the cached directory contents are still
valid since the modification times are the same. The directory
listing is therefore satisfied from the cache instead of forcing the
NFS client to read the updated directory contents from the server.
This explains why the removed file continues to show up in the
directory listing:
[3] tarsus $ ls -l /net/inchun/export/folder
/net/inchun/export/folder/data2: Stale NFS file handle
total 6
-rw-rw-rw- 1 labiaga staff 2883 Apr 10 20:03 data1
44 0.00000 tarsus -> inchun NFS C GETATTR2 FH=FA14
45 0.00101 inchun -> tarsus NFS R GETATTR2 OK
46 0.00032 tarsus -> inchun NFS C GETATTR2 FH=F66F
47 0.00191 inchun -> tarsus NFS R GETATTR2 OK
48 0.00032 tarsus -> inchun NFS C GETATTR2 FH=F8CD
49 0.00214 inchun -> tarsus NFS R GETATTR2 Stale NFS file handle
The directory attributes reported in packet 45 are the same as those
seen in packet 40, therefore
tarsus assumes that
it can safely use the cached filehandles associated with the cached
entries of this directory. In packet 46,
tarsus
requests the attributes of filehandle F66F, corresponding to the
data1 file. The server replies with the
attributes in packet 47.
tarsus then proceeds to
request the attributes of filehandle F8CD, which corresponds to the
data2 file. The server replies with a
"Stale NFS filehandle" error because there is no file on
the server associated with the given filehandle. This problem would
never have occurred had the server updated the modification time
after removing the file causing
tarsus to detect
that the directory had been changed.
Directory caching works nicely when the NFS server obeys Unix
directory semantics. Many non-Unix NFS servers provide such semantics
even if they have to submit themselves to interesting contortions.
Having said this, there is nothing in the NFS protocol specification
that requires the modification time of a directory to be updated when
a file is removed. You may therefore need to disable Solaris NFS
directory caching if you're running into problems interacting
with non-Unix servers. To permanently disable NFS directory caching,
add this line to
/etc/system:
set nfs:nfs_disable_rddir_cache = 0x1
The Solaris kernel reads
/etc/system at startup
and sets the value of
nfs_disable_rddir_cache to
0x1 in the
nfs kernel module. The change takes
effect only after reboot. Use
adb to disable
caching during the current session, postponing the need to reboot.
You still need to set the tunable in
/etc/system
to make the change permanent through reboots:
aqua# adb -w -k /dev/ksyms /dev/mem
physmem 3ac8
nfs_disable_rddir_cache/W1
nfs_disable_rddir_cache: 0x0 = 0x1
adb is an interactive assembly level debugger
that enables you to consult and modify the kernel's memory
contents. The
-k directive instructs
adb to perform kernel memory mapping accessing
the kernel's memory via
/dev/mem, and
obtaining the kernel's symbol table from
/dev/ksyms. The
-w directive allows
you to modify the kernel memory contents. A word of caution:
adb is a power tool that will cause serious data
corruption and potential system panics
when misused.
| | |
15.3. Boot parameter confusion | | 15.5. Incorrect mount point permissions |