
NFS tuning for fast server, slow client
We have a Sequent S81 which NFS-mounts /usr/spool/mail from a much
speedier Sun 670. Under even moderate loads on the Sequent, we get
"NFS server not responding" timeouts, and we strongly suspect that the
Sequent is losing response packets from the Sun.
Indeed. The problem is the ethernet controller on the S81's SCED card.
It cannot handle packet trains - a handful of packets back to back on
the wire. Check the number of receiver overflows - packets for the S81
seen by the SCED but dropped - reported by sestat.
Your Sun will reply to an NFS read request from the S81 by sending an
8K (say) block as 5 or 6 ethernet packets. These packets go out on the
ethernet one after the other, separated by the minimum inter-packet
delay given in the Blue Book. The Sequent drops one of these packets
now and again: its controller can't keep up with this rate of traffic.
The S81 kernel tries to re-assemble the NFS reply, finds a packet
missing and waits for a while in case it turns up. After a while, it
discards the reply - what the hell, it was a datagram anyway - and
sends the NFS read request again. When the Sun sees this, it fires off
the same 8K block as 5 or 6 packets and the cycle repeats itself.
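To put rough numbers on the "5 or 6 packets" figure, here is a small Python sketch of the arithmetic. It assumes NFS over UDP on standard Ethernet (1500-byte MTU, 20-byte IP header, 8-byte UDP header). The RPC_OVERHEAD value and the 2% drop rate are made-up illustrative numbers, not measurements from an S81, and treating drops as independent is a simplification: the whole point above is that the controller loses packets when they arrive back to back.

```python
import math

ETHERNET_MTU = 1500   # bytes of IP packet carried by one Ethernet frame
IP_HEADER = 20        # bytes, assuming no IP options
UDP_HEADER = 8        # bytes
RPC_OVERHEAD = 128    # ASSUMED size of the RPC + NFS reply headers (illustrative)


def fragments_per_reply(read_size, rpc_overhead=RPC_OVERHEAD):
    """Number of Ethernet frames needed to carry one NFS read reply."""
    datagram = UDP_HEADER + rpc_overhead + read_size
    per_fragment = ETHERNET_MTU - IP_HEADER  # 1480 payload bytes per IP fragment
    return math.ceil(datagram / per_fragment)


def reply_loss_probability(read_size, frame_drop_rate):
    """Chance that at least one fragment is dropped, so the whole reply is lost.

    Assumes each frame is dropped independently, which understates the
    problem for a controller that chokes on back-to-back frames.
    """
    n = fragments_per_reply(read_size)
    return 1.0 - (1.0 - frame_drop_rate) ** n


if __name__ == "__main__":
    for size in (8192, 4096, 1024):
        n = fragments_per_reply(size)
        p = reply_loss_probability(size, 0.02)  # 2% is a hypothetical drop rate
        print(f"read size {size}: {n} frames, reply lost with probability {p:.3f}")
```

With these assumed header sizes an 8K read reply needs 6 frames and a 4K reply needs 3. Losing any one frame costs the whole reply, so larger blocks are both more likely to be lost and more expensive to resend.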
The very useful NFS and NIS book from O'Reilly gives helpful advice
about how to deal with the combination of slow server/fast client,
but doesn't have anything to say about our situation. Can anyone
offer suggestions on how we can make our Sun server more considerate
of its slower client?
Experiment with some of the NFS mount options such as rsize and wsize
(the NFS read and write block sizes) and timeo (NFS timeout) and retrans
(retransmission count). The default values for these will probably
have been tuned for a Sun with a decent ethernet controller. Dropping
the read and write block size to 4K for this NFS mounted filesystem
might do the trick. There are no hard and fast rules: you'll need to
experiment to see what works best in your environment.
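As an aid to that experimenting, the Python sketch below shows how timeo and retrans interact. It assumes the common behaviour where timeo is given in tenths of a second and the timeout doubles after each retransmission. The 60-second cap and the example values are assumptions for illustration only; check the mount documentation on the S81 for what its NFS client actually does.

```python
def retransmit_schedule(timeo_tenths, retrans, max_timeout_s=60.0):
    """Return the wait, in seconds, before each retransmission.

    ASSUMPTION: the timeout starts at timeo tenths of a second and doubles
    after every retransmission, capped at max_timeout_s. Real clients differ.
    """
    waits = []
    timeout = timeo_tenths / 10.0
    for _ in range(retrans):
        waits.append(timeout)
        timeout = min(timeout * 2, max_timeout_s)
    return waits


def time_to_not_responding(timeo_tenths, retrans):
    """Seconds of silence before the client would complain about the server."""
    return sum(retransmit_schedule(timeo_tenths, retrans))


def mount_options(rsize, wsize, timeo, retrans):
    """Build the comma-separated option string to hand to mount's -o flag."""
    return f"rsize={rsize},wsize={wsize},timeo={timeo},retrans={retrans}"


if __name__ == "__main__":
    # Hypothetical starting point: 4K blocks, longer timeout, more retries.
    print(mount_options(4096, 4096, 15, 5))
    print(retransmit_schedule(15, 5))
    print(time_to_not_responding(15, 5))
```

Raising timeo or retrans only hides the symptom for longer. Smaller rsize and wsize values attack the cause, because each reply then arrives as a shorter train of packets.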
(Unfortunately, not mounting /usr/spool/mail
is not an option, because the users are already {*filter*}ed.)
This should be an option: your argument is not a good excuse. NFS
mounting the mailbox partition is a Very Bad Idea. First of all, you
may run into protection problems because user and delivery agents may
be running as root when accessing a mailbox. NFS requests made as root
normally get mapped to the unprivileged UID nobody. The server will then
refuse the request, which makes the user or delivery agent break. Secondly,
mailbox locking is a Big Problem. It is all too easy for two processes
running on different hosts to simultaneously access an NFS mounted
mailbox and trash it. Consider someone writing back their mailbox as
the mail system is delivering mail to it....
The lock protocols used by UNIX mail systems can easily get upset by
NFS. Schemes that create a lock file or rename the mailbox can be
fooled by NFS directory caching: there is a window during which NFS
clients and their servers see different pictures of the same directory.
Thus a mail program may not see a lock file which really does exist.... The
other option is to use the NFS lock daemon to "lock" the mailbox, but
this is not exactly a reliable piece of software.
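For readers who have not seen a lock-file protocol, here is a minimal Python sketch of one commonly described variant. It creates a uniquely named file, hard-links it to the lock name, then trusts the link count rather than the result of the link call. The ".lock" suffix and the temporary file naming are assumptions for illustration, not what any particular mail program does. It only works if every program touching the mailbox follows the same scheme, and it is no argument for keeping the mailbox on NFS.

```python
import os
import socket


def acquire_lock(mailbox_path):
    """Try once to take mailbox_path + '.lock'; return True on success."""
    lock_path = mailbox_path + ".lock"          # ASSUMED lock name convention
    directory = os.path.dirname(mailbox_path) or "."
    # A name unique to this host and process, so creating it cannot collide.
    unique = os.path.join(
        directory, f".lk.{socket.gethostname()}.{os.getpid()}"
    )
    fd = os.open(unique, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    try:
        try:
            os.link(unique, lock_path)
        except OSError:
            pass  # do not trust the result of link; check the link count
        # Two links means our file is now also known under the lock name.
        return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)


def release_lock(mailbox_path):
    """Give up a lock previously taken with acquire_lock."""
    os.unlink(mailbox_path + ".lock")
```

A caller would loop on acquire_lock with a short sleep, give up after some number of attempts, and always call release_lock when finished with the mailbox.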
What you can do is arrange remote access to mailboxes using the Post
Office Protocol (POP). This allows users to read and send mail without
having the mailbox located on the host they are using (or NFS
mounted). Some user agents like MH can be configured to support POP
transparently: the users don't even need to know where their mailbox
lives or that POP is being used.
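To give a feel for what a POP client does, here is a minimal Python sketch using the standard poplib module. The host name, user name and password are placeholders, and a real user agent such as MH does a good deal more (deleting retrieved messages, filing them into folders, and so on). Note that plain POP3 as used here sends the password unencrypted.

```python
import poplib


def fetch_all(conn):
    """Download every message from an already authenticated POP3 connection."""
    count, _mailbox_size = conn.stat()
    messages = []
    for number in range(1, count + 1):          # POP numbers messages from 1
        _response, lines, _octets = conn.retr(number)
        messages.append(b"\r\n".join(lines))
    return messages


def read_mailbox(host, user, password):
    """Log in to the POP server on host and return all waiting messages."""
    conn = poplib.POP3(host)                    # default POP3 port, 110
    try:
        conn.user(user)
        conn.pass_(password)
        return fetch_all(conn)
    finally:
        conn.quit()


if __name__ == "__main__":
    # Placeholder values: substitute your own mail host and account.
    for message in read_mailbox("mailhost.example.com", "jim", "secret"):
        print(message.decode("latin-1"))
        print("-" * 40)
```

The mailbox itself never leaves the server's local disk, so delivery and locking happen on one host only and the NFS problems described above do not arise.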
Hope this helps
Jim