18.3. Adjusting for network reliability problems

Even a lightly loaded network can suffer from reliability problems if older bridges or routers joining the network segments routinely drop parts of long packet trains. Older bridges and routers are most likely to affect NFS performance if their network interfaces cannot keep up with the packet arrival rates generated by the NFS clients and servers on each side.

Some NFS experts believe it is a bad idea to micro-manage NFS to compensate for network problems, arguing instead that these problems should be handled by the transport layer. We encourage you to use NFS over TCP, and allow the TCP implementation to dynamically adapt to network glitches and unreliable networks. TCP does a much better job of adjusting transfer sizes, handling congestion, and generating retransmissions to compensate for network problems. Having said this, there may still be times when you choose to use UDP instead of TCP to handle your NFS traffic. In such cases, you will need to determine the impact that an old bridge or router is having on your network. This requires another look at the client-side RPC statistics:

    % nfsstat -rc

    Client rpc:
    Connection-oriented:
    calls      badcalls   badxids    timeouts   newcreds   badverfs
    1753569    1412       3          64         0          0
    timers     cantconn   nomem      interrupts
    0          1317       0          18
    Connectionless:
    calls      badcalls   retrans    badxids    timeouts   newcreds
    12252      41         334        5          166        0
    badverfs   timers     nomem      cantsend
    0          4321       0          206

When timeouts is high and badxid is close to zero, it implies that the network or one of the network interfaces on the client, server, or any intermediate routing hardware is dropping packets. Some older host Ethernet interfaces are tuned to handle page-sized packets and do not reliably handle larger packets; similarly, many older Ethernet bridges cannot forward long bursts of packets. Older routers or hosts acting as IP routers may have limited forwarding capacity, so reducing the number of packets sent for any request reduces the probability that these routers will drop packets that build up behind their network interfaces.

The NFS buffer size determines how many packets are required to send a single, large read or write request. The Solaris default buffer size is 8KB for NFS Version 2 and 32KB for NFS Version 3. Linux uses a default buffer size of 1KB. The buffer size can be negotiated down, at mount time, if the client determines that the server prefers a smaller transfer size.
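The rule of thumb for reading these counters (many timeouts, very few badxid replies) can be sketched as a small check. This is only an illustration: the function name and both thresholds are hypothetical choices made for the example, not values taken from nfsstat or from any NFS implementation, and the numbers fed in are the Connectionless counters from the sample output.

```python
def suggests_packet_loss(calls, timeouts, badxid,
                         min_timeout_rate=0.01, max_badxid_share=0.10):
    """Return True when the counters fit the 'network is dropping packets' pattern.

    Both thresholds are ASSUMED values for illustration only; tune them
    against what is normal on your own network.
    """
    if calls == 0 or timeouts == 0:
        return False
    timeout_rate = timeouts / calls      # how often a call timed out
    badxid_share = badxid / timeouts     # how many timeouts later drew a duplicate reply
    return timeout_rate >= min_timeout_rate and badxid_share <= max_badxid_share


# Connectionless (UDP) counters from the sample nfsstat -rc output.
udp_counters = {"calls": 12252, "timeouts": 166, "badxid": 5}
print(suggests_packet_loss(**udp_counters))
```

With the sample counters, about 1.4% of calls timed out and only about 3% of those timeouts were matched by a badxid reply, so the check reports the packet-loss pattern.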
The buffer sizes are set with the rsize and wsize mount options, for example:

    # mount -o rsize=2048,wsize=2048 wahoo:/export/home /mnt

Decreasing the NFS buffer size has the undesirable effect of increasing the load on the server and sending more packets on the network to read or write a given amount of data. The size of the actual packets on the network does not change, but the number of IP packets composing a single NFS buffer decreases as the rsize and wsize are decreased. For example, an 8KB NFS buffer is divided into five IP packets of about 1500 bytes, and a sixth packet with the remaining data bytes. If the write size is set to 2048 bytes, only two IP packets are needed. The problem lies in the number of packets required to transfer the same amount of data. Table 18-2 shows the number of IP packets required to copy a file for various NFS read buffer sizes.
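The packet counts quoted above (six packets for an 8KB buffer, two for a 2048-byte buffer) can be checked with a little arithmetic. The sketch below assumes a 1500-byte Ethernet MTU, a 20-byte IPv4 header without options, an 8-byte UDP header, and an RPC/NFS header allowance of 128 bytes; that last figure is a round number assumed for the example, since the real overhead varies by operation and NFS version.

```python
import math

MTU = 1500            # Ethernet MTU in bytes
IP_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8        # UDP header, carried once per datagram
RPC_OVERHEAD = 128    # ASSUMED RPC + NFS header size, for illustration only


def ip_packets_per_buffer(buffer_size, rpc_overhead=RPC_OVERHEAD):
    """IP packets needed to carry one NFS buffer over UDP on Ethernet."""
    datagram = buffer_size + UDP_HEADER + rpc_overhead
    payload_per_packet = MTU - IP_HEADER   # 1480 bytes of datagram per fragment
    return math.ceil(datagram / payload_per_packet)


for size in (1024, 2048, 4096, 8192):
    print(f"{size:5d}-byte buffer -> {ip_packets_per_buffer(size)} IP packet(s)")
```

Under these assumptions a 4096-byte buffer takes three packets, which is why two 4096-byte transfers put the same number of packets on the wire as one 8192-byte transfer.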
Table 18-2. IP packets and RPC requests as a function of NFS buffer size
As the file size increases, transfers with smaller NFS buffer sizes send more IP packets to the server. The number of packets will be the same for 4096- and 8192-byte buffers, but for file sizes over 4KB, setting rsize=4096 requires up to twice as many RPC calls to the server. The increased network traffic adds to the very problem for which the buffer size change was compensating, and the additional RPC calls further load the server. Due to the increased server load, it is sometimes necessary to increase the RPC timeout parameter when decreasing NFS buffer sizes. Again, we encourage you to use NFS over TCP when possible and avoid having to worry about the NFS buffer sizes.
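The trade-off between RPC calls and IP packets can be estimated for a whole file. The following is a rough model rather than a reproduction of Table 18-2: it counts only the data-carrying packets, ignores the small request packets, and uses an assumed 128-byte RPC/NFS header allowance on top of the 8-byte UDP header. The names read_cost and fragments are hypothetical helpers for this example.

```python
import math

FRAGMENT_PAYLOAD = 1500 - 20   # Ethernet MTU minus the IPv4 header
PER_CALL_OVERHEAD = 8 + 128    # UDP header plus an ASSUMED RPC/NFS header size


def fragments(data_bytes):
    """IP packets needed for one NFS transfer carrying data_bytes of file data."""
    return math.ceil((data_bytes + PER_CALL_OVERHEAD) / FRAGMENT_PAYLOAD)


def read_cost(file_size, rsize):
    """Return (rpc_calls, ip_packets) to read file_size bytes with the given rsize."""
    full_calls, tail = divmod(file_size, rsize)
    rpc_calls = full_calls + (1 if tail else 0)
    ip_packets = full_calls * fragments(rsize) + (fragments(tail) if tail else 0)
    return rpc_calls, ip_packets


for rsize in (1024, 2048, 4096, 8192):
    calls, packets = read_cost(64 * 1024, rsize)
    print(f"rsize={rsize:5d}  rpc_calls={calls:3d}  ip_packets={packets:3d}")
```

For a 64KB file this model gives the same packet count for rsize=4096 and rsize=8192 but twice the RPC calls for the smaller size, while rsize=2048 and rsize=1024 increase both figures.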
Copyright © 2002 O'Reilly & Associates. All rights reserved.