Thursday, September 11, 2008

NFS performance issue

NFS ops on the filer increase to abnormally high levels, raising performance concerns.

Step 1:
See the sysstat output below. NFS OPS are ~90000 (column 2).

Begin: Tue Sep 09 17:52:30 GMT 2008
 CPU   NFS  CIFS  HTTP   Total    Net kB/s   Disk kB/s     Tape kB/s Cache Cache  CP   CP Disk    FCP iSCSI   FCP  kB/s
                                  in   out   read  write  read write   age   hit time  ty util                 in   out
 61% 93233     0     0   93233 16978 17120      0      0     0     0   >60  100%   0%  -    0%      0     0     0     0
 61% 94057     0     0   94057 17171 17170     24     24     0     0   >60  100%   0%  -    3%      0     0     0     0
 61% 93384     0     0   93384 16998 16997      0      0     0     0   >60  100%   0%  -    0%      0     0     0     0
 60% 93020     0     0   93020 16932 16930      0      0     0     0   >60  100%   0%  -    0%      0     0     0     0
 59% 87212     0     0   87212 24549 16171     32     32     0     0   >60  100%   0%  -    3%      0     0     0     0
 62% 89947     0     0   89947 23317 16618      0      0     0     0   >60  100%   0%  -    0%      0     0     0     0
 61% 90205     1     0   90206 19157 16517      4      0     0     0   >60   99%   0%  -    1%      0     0     0     0
 59% 87641     0     0   87641 18877 16021   3728  22480     0     0   >60  100%  93%  Hf  33%      0     0     0     0
 61% 91401     0     0   91401 19355 16697    428   1012     0     0   >60  100%  13%  :    8%      0     0     0     0
 62% 92058     3     0   92061 19027 16943    392      8     0     0   >60  100%   0%  -
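Output in this form is typically collected on the filer console with the sysstat command, for example at a one-second interval (the interval here is just an example):

filer> sysstat -x 1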

Step 2:

TIME: Tue Sep 09 18:00:02 GMT 2008
TIME_DELTA: 07:33 (453s)
nfsv3:nfs:nfsv3_read_latency:1426.57us
nfsv3:nfs:nfsv3_read_ops:16/s
nfsv3:nfs:nfsv3_write_latency:267.78us
nfsv3:nfs:nfsv3_write_ops:74/s

Read latency is ~1.4 ms and write latency ~0.27 ms, so there is no NFS latency problem on the filer side.
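Counters in this object:instance:counter:value form can also be checked on demand on the filer with the stats command, for example:

filer> stats show nfsv3

This dumps all nfsv3 counters, including the read/write latency and ops counters shown above.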

Step 3:

We can see that nearly 100% of the requests from the clients are GETATTR calls, so most of the ~90000 NFS OPS are GETATTR. This can be due to your application's behaviour, or some of your clients may be misbehaving in this environment.
Server nfs V3: (43308280 calls)
null getattr setattr lookup access readlink read
1 0% 43181988 100% 5302 0% 49543 0% 10529 0% 6 0% 3673 0%
write create mkdir symlink mknod remove rmdir
47953 0% 11 0% 1 0% 6 0% 0 0% 15 0% 1 0%
rename link readdir readdir+ fsstat fsinfo pathconf
1 0% 0 0% 5068 0% 3945 0% 129 0% 0 0% 108 0%
commit
0 0%
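A per-operation breakdown like this comes from the filer's nfsstat command. To see what the clients are doing right now, the counters can be zeroed first and then re-checked after letting the workload run for a while (assuming a 7-mode console):

filer> nfsstat -z
filer> nfsstat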


Step 4:

The per-client NFS op counts below show how the load is distributed across the clients.

192.168.4.101 NFSOPS = 413629159 ( 1%)
192.168.4.95 NFSOPS = 402901550 ( 1%)
192.168.4.78 NFSOPS = 402401562 ( 1%)
192.168.4.79 NFSOPS = 271381208 ( 1%)
192.168.4.83 NFSOPS = 267008241 ( 1%)
192.168.4.102 NFSOPS = 256459663 ( 1%)
192.168.4.66 NFSOPS = 247692988 ( 1%)
192.168.4.84 NFSOPS = 207956370 ( 0%)
192.168.4.100 NFSOPS = 205229829 ( 0%)
192.168.4.80 NFSOPS = 200581076 ( 0%)
192.168.4.97 NFSOPS = 196879823 ( 0%)
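Per-client counts like the ones above normally require per-client NFS statistics to be enabled on the filer; on 7-mode that is roughly:

filer> options nfs.per_client_stats.enable on
filer> nfsstat -l

(nfsstat -l lists clients by NFS op count; the exact flags may vary by Data ONTAP release, so treat this as a sketch.)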

Summary: Based on the perfstat, we do not see any NFS latency on the filer end. CPU and disk utilization on the filer are normal, but we do see abnormally high NFS OPS. Most of the NFS OPS are GETATTR calls, which means either your application or some of your clients are misbehaving, so checking your application and your clients is recommended. Also, please check the client mount options and make sure you are not using "noac": the "noac" option disables attribute caching on the client, forcing a GETATTR to the filer for every attribute lookup. An example mount line is shown below.
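As an illustration only (the filer name and paths are placeholders), a Linux NFSv3 mount that keeps attribute caching enabled might look like this; the key point is simply that "noac" is absent, and "actimeo" can be used to lengthen the attribute cache timeout if the application can tolerate slightly stale attributes:

mount -t nfs -o rw,hard,intr,nfsvers=3,actimeo=60 filer01:/vol/data /mnt/data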

Please also refer to:

https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb36582

2 comments:

Swaminathan S.A. said...

steps to reduce NFS ops

https://now.netapp.com/Knowledgebase/solutionarea.asp?id=ntapcs4694

Swaminathan S.A. said...

There are numerous conditions which could cause NFS clients to report retransmissions or server not responding. This solution is specifically for when
the filer's active request limit is exceeded. To confirm that this is the case, check the "num msg" count reported by the "nfsstat -d" command.

Each time the limit is exceeded, the "num msg" count reported by "nfsstat -d" increments. For example:

num msg=1, too many mbufs=14, rpcErr=12, svrErr=0

In the interval since the NFS statistics were last reset, the filer has dropped exactly one NFS request because of the limit. If the num msg count is zero, then this solution does not apply.

Data ONTAP processes NFS operations directly from the network layer, without any NFS daemons, so unlike on operating systems with NFS daemons, the limit cannot be tuned directly.
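A quick way to spot-check this counter from an admin host, assuming remote shell access to the filer is set up (the filer name is a placeholder):

ssh filer01 nfsstat -d | grep "num msg"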