
Very slow NFS performance between AIX client and FreeBSD server
Hello all --
I am hoping someone can help me debug a performance problem. We have
two AIX clients mounting from a FreeBSD server, and while one sees
fine performance, the other performs abominably (e.g. 2+ minutes for a
200K copy). Both clients are running the same version of AIX (4.3)
and appear to have all the NFS variables set to the same values (checked
with 'nfso'). The server is running FreeBSD 4.4. I have checked the
network status of the slow client, and all of its other network
connections (e.g. ftp to other sites or to the NFS server itself)
behave normally.
I did a tcpdump (capturing full packets) while copying files to the
NFS mounts on both clients, and one of the main differences seems to
be that the slow client issues many more GETATTR calls (and gets a
reply to each one). Can someone help me figure out why this would be?
None of the calls are failing. While one of the copies is in progress,
an 'ls' on the server shows the destination file as either empty or
8192 bytes in size, so the delay seems to be in some exchange of
information rather than in the actual copying of data.
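
To put a number on the GETATTR difference, the decoded captures from the two clients could be tallied per procedure. This is only a minimal sketch: it assumes the decode has been saved as text in the same layout as the frames enclosed below ("Frame N (" headers, "Message Type:" and "Procedure:" lines), and the function name count_procedures is made up for illustration.

```python
import re
from collections import Counter

# Patterns assume the text layout of the decode enclosed in this post.
FRAME_RE = re.compile(r"^Frame \d+ \(", re.MULTILINE)
TYPE_RE = re.compile(r"Message Type: (Call|Reply)")
PROC_RE = re.compile(r"Procedure: (\w+) \(\d+\)")


def count_procedures(decode_text):
    """Tally (message type, procedure) pairs, counting each frame once."""
    counts = Counter()
    # Split the decode into per-frame chunks at each "Frame N (" header.
    starts = [m.start() for m in FRAME_RE.finditer(decode_text)]
    for begin, end in zip(starts, starts[1:] + [len(decode_text)]):
        chunk = decode_text[begin:end]
        msg_type = TYPE_RE.search(chunk)
        proc = PROC_RE.search(chunk)
        # Frames without an RPC layer (plain TCP ACKs, etc.) are skipped.
        if msg_type and proc:
            counts[(msg_type.group(1), proc.group(1))] += 1
    return counts
```

Running it over the decode from each client and comparing the two results would show whether GETATTR is the only procedure that differs, or whether other calls are inflated as well.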
I have tried tuning various variables, including rsize, wsize, and
timeout, but have not seen any noticeable improvement. Two
oddities I have noticed: I am seeing a high number of CRC errors and
alignment errors (though both the card and the switch are set to
full duplex, and non-NFS network activity is not affected), and the
NFS slowness is sporadic (if I do continuous copies of 100-900K files,
the copies eventually speed up to normal before slowing back down to
a minutes-long crawl).
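
The sporadic pattern could be recorded by timing each copy in a loop. The following is a rough sketch only; the function name time_copies and the "copytest_N" file names are made up for illustration, and dest_dir is assumed to be a directory on the NFS mount.

```python
import os
import shutil
import time


def time_copies(src, dest_dir, rounds=5):
    """Copy src into dest_dir `rounds` times; return seconds taken per copy."""
    timings = []
    for i in range(rounds):
        # Hypothetical scratch file name; removed again after each round.
        target = os.path.join(dest_dir, "copytest_%d" % i)
        start = time.perf_counter()
        shutil.copyfile(src, target)
        timings.append(time.perf_counter() - start)
        os.remove(target)
    return timings
```

Logging the returned timings alongside the wall-clock time would show how long the fast and slow phases last and whether they line up with anything else happening on the network.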
I'm enclosing the tcpdump output from two of the captured packets (the
GETATTR call and reply). If you can help, I would appreciate a reply
to my e-mail address (a...@brni.com).
Thanks!
Andria
P.S. I'm new to AIX, so if you have suggestions, I'd really
appreciate it if you could send the appropriate commands as well.
------------------------------------------------
Frame 10 (182 on wire, 182 captured)
Arrival Time: Apr 24, 2002 14:06:19.921447000
Time delta from previous packet: 1.801991000 seconds
Time relative to first packet: 1.840613000 seconds
Frame Number: 10
Packet Length: 182 bytes
Capture Length: 182 bytes
Ethernet II
Destination: 08:00:36:b1:88:03 (FreeBSD-server)
Source: 00:04:ac:e4:9c:b7 (AIX-client)
Type: IP (0x0800)
Internet Protocol, Src Addr: AIX-client (XX.XX.XX.8), Dst Addr: FreeBSD-server (XX.XX.XX.2)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 168
Identification: 0x98e6
Flags: 0x00
.0.. = Don't fragment: Not set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 60
Protocol: TCP (0x06)
Header checksum: 0x2547 (correct)
Source: AIX-client (XX.XX.XX.8)
Destination: FreeBSD-server (XX.XX.XX.2)
Transmission Control Protocol, Src Port: 49300 (49300), Dst Port: nfsd (2049), Seq: 2915623760, Ack: 2475944560
Source port: 49300 (49300)
Destination port: nfsd (2049)
Sequence number: 2915623760
Next sequence number: 2915623888
Acknowledgement number: 2475944560
Header length: 20 bytes
Flags: 0x0018 (PSH, ACK)
0... .... = Congestion Window Reduced (CWR): Not set
.0.. .... = ECN-Echo: Not set
..0. .... = Urgent: Not set
...1 .... = Acknowledgment: Set
.... 1... = Push: Set
.... .0.. = Reset: Not set
.... ..0. = Syn: Not set
.... ...0 = Fin: Not set
Window size: 60000
Checksum: 0x3d10 (correct)
Remote Procedure Call
Last Fragment: Yes
Fragment Length: 124
XID: 0x378947fb (931743739)
Message Type: Call (0)
RPC Version: 2
Program: NFS (100003)
Program Version: 3
Procedure: GETATTR (1)
Credentials
Flavor: AUTH_UNIX (1)
Length: 52
Stamp: 0x3cc6c9f7
Machine Name: AIX-client
length: 15
contents: AIX-client
fill bytes: opaque data
UID: 521
GID: 20
Auxiliary GIDs
GID: 20
GID: 102
GID: 101
GID: 100
Verifier
Flavor: AUTH_NULL (0)
Length: 0
Network File System
Program Version: 3
Procedure: GETATTR (1)
object
length: 28
type: unknown
data: 66DFBE3C2996226C0C0000003E2E2200699D32770000000000000000
Frame 11 (170 on wire, 170 captured)
Arrival Time: Apr 24, 2002 14:06:19.921926000
Time delta from previous packet: 0.000479000 seconds
Time relative to first packet: 1.841092000 seconds
Frame Number: 11
Packet Length: 170 bytes
Capture Length: 170 bytes
Ethernet II
Destination: 00:04:ac:e4:9c:b7 (AIX-client)
Source: 08:00:36:b1:88:03 (FreeBSD-server)
Type: IP (0x0800)
Internet Protocol, Src Addr: FreeBSD-server (XX.XX.XX.2), Dst Addr: AIX-client (XX.XX.XX.8)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 156
Identification: 0xb646
Flags: 0x04
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 64
Protocol: TCP (0x06)
Header checksum: 0xc3f2 (correct)
Source: FreeBSD-server (XX.XX.XX.2)
Destination: AIX-client (XX.XX.XX.8)
Transmission Control Protocol, Src Port: nfsd (2049), Dst Port: 49300 (49300), Seq: 2475944560, Ack: 2915623888
Source port: nfsd (2049)
Destination port: 49300 (49300)
Sequence number: 2475944560
Next sequence number: 2475944676
Acknowledgement number: 2915623888
Header length: 20 bytes
Flags: 0x0018 (PSH, ACK)
0... .... = Congestion Window Reduced (CWR): Not set
.0.. .... = ECN-Echo: Not set
..0. .... = Urgent: Not set
...1 .... = Acknowledgment: Set
.... 1... = Push: Set
.... .0.. = Reset: Not set
.... ..0. = Syn: Not set
.... ...0 = Fin: Not set
Window size: 33176
Checksum: 0x7e7b (correct)
Remote Procedure Call
Last Fragment: Yes
Fragment Length: 112
XID: 0x378947fb (931743739)
Message Type: Reply (1)
This is a reply to a request in frame 10
Program: NFS (100003)
Program Version: 3
Procedure: GETATTR (1)
Reply State: accepted (0)
Verifier
Flavor: AUTH_NULL (0)
Length: 0
Accept State: RPC executed successfully (0)
Network File System
Program Version: 3
Procedure: GETATTR (1)
Status: OK (0)
obj_attributes
Type: Directory (2)
mode: 0755
0... .... .... = not SUID
.0.. .... .... = not SGID
..0. .... .... = not save swapped text
...1 .... .... = Read permission for owner
.... 1... .... = Write permission for owner
.... .1.. .... = Execute permission for owner
.... ..1. .... = Read permission for group
.... ...0 .... = no Write permission for group
.... .... 1... = Execute permission for group
.... .... .1.. = Read permission for others
.... .... ..0. = no Write permission for others
.... .... ...1 = Execute permission for others
nlink: 23
uid: 521
gid: 20
size: 1536
used: 2048
rdev: 108,9175075
specdata1: 108
specdata2: 9175075
fsid: 134428
fileid: 2240062
atime: Apr 24, 2002 13:56:35.000000000
seconds: 1019656595
nano seconds: 0
mtime: Apr 24, 2002 12:59:22.000000000
seconds: 1019653162
nano seconds: 0
ctime: Apr 24, 2002 12:59:22.000000000
seconds: 1019653162
nano seconds: 0
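
In the two frames above, the reply follows the call by 0.000479 seconds, while the call itself arrives 1.801991 seconds after the packet before it. To see how often gaps like that occur across a whole capture, the "Time delta from previous packet" lines could be scanned for large values. This is a minimal sketch under the assumption that the decode is saved as text in the layout shown above; the function name slow_gaps and the 1-second default threshold are arbitrary choices for illustration.

```python
import re

# Patterns assume the text layout of the decode enclosed above.
HEADER_RE = re.compile(r"^Frame (\d+) \(")
DELTA_RE = re.compile(r"Time delta from previous packet: ([\d.]+) seconds")


def slow_gaps(decode_text, threshold=1.0):
    """Return (frame number, delta in seconds) for frames whose delta
    from the previous packet is at least `threshold` seconds."""
    gaps = []
    frame = None
    for line in decode_text.splitlines():
        line = line.strip()
        header = HEADER_RE.match(line)
        if header:
            # Remember which frame the following delta line belongs to.
            frame = int(header.group(1))
            continue
        delta = DELTA_RE.search(line)
        if delta and frame is not None:
            seconds = float(delta.group(1))
            if seconds >= threshold:
                gaps.append((frame, seconds))
    return gaps
```

For the two frames enclosed here, this would report only frame 10, i.e. the pause comes before the client sends its call rather than while the server is answering it.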