

2    Monitoring Your System

Before you start to monitor your system to identify a performance problem, you should understand your user environment, the applications you are running and how they use the various subsystems, and what is acceptable performance.

The source of the performance problem may not be obvious. For example, if your disk I/O subsystem is swamped with activity, the problem may be in either the virtual memory subsystem or the disk I/O subsystem. In general, obtain as much information as possible about the system before you attempt to tune it.

In addition, how you decide to tune your system depends on how your users and applications utilize the system. For example, if you are running CPU-intensive applications, the virtual memory subsystem may be more important than the unified buffer cache (UBC).

This chapter contains the following information:

- An overview of the tools you can use to monitor your system (Section 2.1)
- How to use those tools to determine which subsystem or component is causing a performance problem (Section 2.2)




2.1    Monitoring Tools Overview

Numerous system monitoring tools are available. You may have to use various tools in combination with each other in order to get an accurate picture of your system. In addition to obtaining information about your system when it is running poorly, it is also important for you to obtain information about your system when it is running well. By comparing the two sets of data, you may be able to pinpoint the area that is causing the performance problem.
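
For example, you might capture a set of statistics while the system is performing well and keep it for comparison with data collected during a slowdown (a minimal sketch; the intervals, sample counts, and file names are arbitrary):

# collect 10 samples at 5-second intervals while performance is good
vmstat 5 10 > /var/tmp/vmstat.baseline
iostat 5 10 > /var/tmp/iostat.baseline

# when performance degrades, collect a comparable set and compare
vmstat 5 10 > /var/tmp/vmstat.slow
iostat 5 10 > /var/tmp/iostat.slow
diff /var/tmp/vmstat.baseline /var/tmp/vmstat.slow | more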

The primary monitoring tools are described in Table 2-1.

Table 2-1: Primary Monitoring Tools

Tool Description
iostat Reports I/O statistics for terminals, disks, and the system. See Section 2.2.5 for more information on using the iostat command to diagnose system performance problems.
netstat Displays network statistics. The netstat command symbolically displays the contents of network-related data structures. The output format varies with the options supplied to netstat. A common usage is to supply the netstat command with a time interval so that you can track the number of incoming and outgoing packets, as well as packet collisions, on a given interface. See Section 2.2.11 for more information on using the netstat command to diagnose system performance problems.
nfsstat Displays Network File System (NFS) and Remote Procedure Call (RPC) statistics for clients and servers. The output includes the number of packets that had to be retransmitted (retrans) and the number of times a reply transaction ID did not match the request transaction ID (badxid). See Section 2.2.12 for more information on using the nfsstat command to diagnose system performance problems.
ps Displays the current status of the system processes. Although ps is a fairly accurate snapshot of the system, it cannot begin and finish a snapshot as fast as some processes change state. As a result, the output may contain some inaccuracies. The ps command includes information about how the processes use the CPU and virtual memory. See Section 2.2.1 for more information on using the ps command to diagnose system performance problems.
uptime Shows how long a system has been running and the system load average. The load average numbers give the number of jobs in the run queue for the last 5 seconds, the last 30 seconds, and the last 60 seconds. See Section 2.2.2 for more information on using the uptime command to diagnose system performance problems.
vmstat Shows information about process threads, virtual memory, interrupts, and CPU usage for a specified time interval. See Section 2.2.3 for more information on using the vmstat command to diagnose system performance problems.

Other tools can also provide you with important monitoring information. These secondary monitoring tools are described in Table 2-2.

Table 2-2: Secondary Monitoring Tools

Tool Description
atom Serves as a general-purpose framework for creating sophisticated program analysis tools. It includes numerous unsupported prepackaged tools and the following supported tools: third, hiprof, and pixie.

The third tool performs memory access checks and detects memory leaks in an application.

The hiprof tool produces either a flat or hierarchical profile of an application. The flat profile shows the execution time spent in a given procedure, and the hierarchical profile shows the execution time spent in a given procedure and all of its descendants.

The pixie tool partitions an application into basic blocks and counts the number of times each basic block is executed.

For details, see the Programmer's Guide or atom(1).

dbx Analyzes running kernels and dump files. The dbx command invokes a source-level debugger. You can use dbx with code produced by the cc compiler and the as assembler, and with machine code. After invoking the dbx debugger, you issue dbx commands that allow you to examine source files, control program execution, display the state of the program, and debug at the machine-code level. To analyze kernels, use the -k option. See Section 2.2.10 for more information on using the dbx command to diagnose system performance problems.
dumpfs Displays UFS file system information. This command is useful for getting information about the file system block and fragment size and the minimum free space percentage. See Section 2.2.6 for more information on using the dumpfs command to diagnose system performance problems.
gprof Displays call graph profile data showing the effects of called routines. Similar to the prof utility.

For details, see the Programmer's Guide or gprof(1).

ipcs Reports interprocess communication (IPC) statistics. The ipcs command displays information about currently active message queues, shared-memory segments, semaphores, remote queues, and local queue headers. Information provided in the following fields by the ipcs -a command can be especially useful:

- QNUM, the number of messages currently outstanding in the
  associated message queue

- CBYTES, the number of bytes in messages currently
  outstanding in the associated message queue

- QBYTES, the maximum number of bytes allowed in messages
  outstanding in the associated message queue

- SEGSZ, the size of the associated shared memory segment

- NSEMS, the number of semaphores in the set associated with
  the semaphore entry

See ipcs(1) for details.

kdbx Analyzes running kernels and dump files. The kdbx debugger is an interactive program that lets you examine either the running kernel or dump files created by the savecore utility. In either case, you will be examining an object file and a core file. For running systems, these files are usually /vmunix and /dev/mem, respectively. Dump files created by savecore are saved in the directory specified by the /sbin/init.d/savecore script which is, by default, /var/adm/crash. All dbx commands are available in kdbx using the dbx option.

See the manual Kernel Debugging or kdbx(8) for details.

kprofile Profiles the kernel using the performance counters in the hardware. See the manual Kernel Debugging or kprofile(1) for details.
nfswatch Monitors all NFS network traffic and divides it into several categories. The number and percentage of packets received in each category appears on the screen in a continuously updated display. Your kernel must be configured with the packetfilter option. See nfswatch(8) and packetfilter(7) for details.
pixie Provides basic block counting data when used with prof.
prof Displays statistics on where time is being spent - at the routine level, basic block level, or instruction level - during the execution of a program. This information will help you to determine where to concentrate your efforts to optimize source code.
showfdmn Displays the attributes of an AdvFS file domain and detailed information about each volume in the file domain.
showfile Displays the full storage allocation map (extent map) for files in an Advanced File System (AdvFS). An extent is a contiguous area of disk space that the file system allocates to a file.
showfsets Displays the filesets (or clone filesets) and their characteristics in a specified domain.
swapon Specifies additional disk space for paging and swapping and displays swap space utilization, including the total amount of allocated swap space, the amount of swap space that is being used, and the amount of free swap space. See Section 2.2.4 for more information on using the swapon command to diagnose system performance.
tcpdump Displays network traffic. The tcpdump command prints out the headers of packets on a network interface that match the Boolean expression. Your kernel must be configured with the packetfilter option. See tcpdump(8) and packetfilter(7) for details.
uprofile Profiles user code using performance counters in the hardware. See uprofile(1) for details.
voldg Displays, with the list option, information about an LSM diskgroup's attributes. See voldg(8) for details.
voldisk Displays, with the list option, a disk's configuration and attribute information. See voldisk(8) for details.
volprint Displays information from records in the LSM configuration database. See volprint(8) for more information.
volstat Displays Logical Storage Manager statistics for LSM volumes, plexes, subdisks, or disks. See volstat(8) for details.
voltrace Prints records from an event log. Sets event trace masks to determine what type of events will be tracked. See voltrace(8) for more information.
volwatch Monitors LSM for failure events and sends mail to the specified user. See volwatch(8) for more information.
w Displays a summary of current system activity. The system summary shows the current time, the amount of time since the system was last started, the number of users logged in to the system, and the load averages. The load average numbers give the number of jobs in the run queue for the last 5 seconds, the last 30 seconds, and the last 60 seconds. See w(1) for details.
xload Displays the system load average for X. The xload command displays a periodically updating histogram of the system load average. See xload(1X) for details.

POLYCENTER Performance Solution, a layered product, is also available as a monitoring tool. It can monitor many Digital UNIX nodes simultaneously. A single-node version of the product is included with the operating system at no extra charge.

POLYCENTER Performance Solution has a graphical user interface (GUI) called Performance Manager. Performance Manager is a real-time performance monitor that allows you to detect and correct performance problems. Graphs and charts can show hundreds of different system values, including CPU performance, memory usage, disk transfers, file-system capacity, network efficiency, and AdvFS and cluster-specific metrics.

Thresholds can be set to alert you to or correct a problem when it occurs, and archives of data can be kept for high-speed playback or long-term trend analysis.

Performance Manager has performance analysis and system management scripts, as well as cluster-specific and AdvFS-specific scripts. These scripts can be run simultaneously on multiple nodes from the GUI.

Performance Manager automatically discovers cluster members when a single cluster member node is specified, and it can monitor both individual cluster members and an entire cluster concurrently.

For details on POLYCENTER Performance Solution, see the manual POLYCENTER Performance Solution for UNIX Systems: User's Guide.




2.2    Determining the Problem

The following sections describe how to use monitoring tools to identify the system component or subsystem that is causing a performance degradation. Once you determine which subsystem or component is causing the problem and you are sure that you understand your system environment and the needs of your users, refer to the appropriate section in Chapter 3 for information on tuning the particular subsystem or component.




2.2.1    Monitoring Processes - ps Command

The ps command displays the current status of the system processes. You can use it to determine the current running processes, their state, and how they utilize system memory. The command lists processes in order of decreasing CPU usage, so you can easily determine which processes are using the most CPU time. Be aware that ps is only a snapshot of the system; by the time the command finishes executing, the system state has probably changed. For example, one of the first lines of the command may refer to the ps command itself.

An example of the ps command follows:


ps aux

USER  PID  %CPU %MEM   VSZ   RSS  TTY S    STARTED      TIME  COMMAND
chen  2225  5.0  0.3  1.35M  256K p9  U    13:24:58  0:00.36  cp /vmunix /tmp
root  2236  3.0  0.5  1.59M  456K p9  R  + 13:33:21  0:00.08  ps aux
sorn  2226  1.0  0.6  2.75M  552K p9  S  + 13:25:01  0:00.05  vi met.ps
root   347  1.0  4.0  9.58M  3.72 ??  S      Nov 07 01:26:44  /usr/bin/X11/X -a
root  1905  1.0  1.1  6.10M  1.01 ??  R    16:55:16  0:24.79  /usr/bin/X11/dxpa
sorn  2228  0.0  0.5  1.82M  504K p5  S  + 13:25:03  0:00.02  more
sorn  2202  0.0  0.5  2.03M  456K p5  S    13:14:14  0:00.23  -csh (csh)
root     0  0.0 12.7   356M  11.9 ??  R <  Nov 07 3-17:26:13  [kernel idle]
            [1]  [2]   [3]    [4]    [5]               [6]

The ps command includes the following information that you can use to diagnose CPU and virtual memory problems:

  1. Percent CPU time usage (%CPU).

  2. Percent real memory usage (%MEM).

  3. Process virtual address size (VSZ) - This is the total amount of virtual memory allocated to the process.

  4. Real memory (resident set) size of the process (RSS) - This is the total amount of physical memory mapped to virtual pages (that is, the total amount of memory that the application has physically used). Shared memory is included in the resident set size figures; as a result, the total of these figures may exceed the total amount of physical memory available on the system.

  5. Process status or state (S) - This specifies whether a process is runnable (R), uninterruptible sleeping (U), sleeping (S), idle (I), stopped (T), or halted (H). It also indicates whether the process is swapped out (W), whether the process is exceeding a soft limit on memory requirements (>), whether the process is a process group leader with a controlling terminal (+), and whether the process priority has been reduced (N) or raised (<) with the nice or renice command.

  6. Current CPU time used (TIME).
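
Because the listing is ordered by CPU usage, the busiest processes appear first. To rank processes by memory consumption instead, you can re-sort the output (a minimal sketch; it assumes %MEM is the fourth column, as in the layout above, and uses the traditional sort field syntax):

ps aux | sed 1d | sort -rn +3 -4 | head -5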

From the output of the ps command, you can determine which processes are consuming most of your system's CPU time and memory and whether processes are swapped out. Concentrate on processes that are runnable or paging. Here are some concerns to keep in mind:

- A process that consumes a disproportionate amount of CPU time may be looping, or it may be a candidate for a lower priority (see the nice and renice commands).
- Processes with large RSS values are consuming physical memory; if processes are marked as swapped out (W), the system may be short of memory.

For information about memory tuning, see Section 3.4. For information about improving the performance of your applications, see the Programmer's Guide.




2.2.2    Measuring the System Load - uptime Command

The uptime command shows how long a system has been running and the load average. The load average counts jobs that are waiting for disk I/O and also applications whose priorities have been changed with either the nice or renice command. The load average numbers give the average number of jobs in the run queue for the last 5 seconds, the last 30 seconds, and the last 60 seconds.

An example of the uptime command follows:

uptime

1:48pm  up 7 days,  1:07,  35 users,  load average: 7.12, 10.33, 10.31

Note whether the load is increasing or decreasing. An acceptable load average depends on your type of system and how it is being used. In general, for a large system, a load of 10 is high, and a load of 3 is low. Workstations should have a load of 1 or 2. If the load is high, look at what processes are running with the ps command. You may want to run some applications during off-peak hours. You can also lower the priority of applications with the nice or renice command to conserve CPU cycles.
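
For example, a cron job or shell script could flag an unusually high load average (a minimal sketch; the threshold of 8 is arbitrary, and the load averages are assumed to be the last three fields of the uptime output, as shown above):

uptime | awk '{ load = $(NF-2); sub(",", "", load)
        if (load + 0 > 8) print "high load average:", load }'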

See Section 3.2 for additional information on how to reduce the load on your system.




2.2.3    Monitoring Virtual Memory and CPU Usage - vmstat Command

The vmstat command shows the virtual memory, process, and total CPU statistics for a specified time interval. The first line of the output is for all time since a reboot, and each subsequent report is for the last interval. Because the CPU operates faster than the rest of the system, performance bottlenecks usually exist in the memory or I/O subsystems.


An example of the vmstat command follows:


vmstat 1

Virtual Memory Statistics: (pagesize = 8192)
procs        memory            pages                       intr        cpu
r  w  u  act  free wire  fault cow zero react pin pout   in  sy  cs  us sy  id
2 66 25  6417 3497 1570  155K  38K  50K    0  46K    0    4 290 165   0  2  98
4 65 24  6421 3493 1570   120    9   81    0    8    0  585 865 335  37 16  48
2 66 25  6421 3493 1570    69    0   69    0    0    0  570 968 368   8 22  69
4 65 24  6421 3493 1570    69    0   69    0    0    0  554 768 370   2 14  84
4 65 24  6421 3493 1570    69    0   69    0    0    0  865  1K 404   4 20  76
               [1]                                  [2]     [3]         [4]

The vmstat command includes information that you can use to diagnose CPU and virtual memory problems. The following fields are particularly important:

  1. Virtual memory information, including the number of pages that are active (act), the number of pages on the free list (free), and the number of pages wired down (wire). See Section 1.3.1 for more information.

  2. The number of pages that have been paged out (pout).

  3. Interrupt information, including the number of nonclock device interrupts per second (in), the number of system calls called per second (sy), and the number of task and thread context switches per second (cs).

  4. CPU usage information, including the percentage of user time for normal and priority processes (us), the percentage of system time (sy), and the percentage of idle time (id).
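
For example, to flag intervals in which the system is actively paging out, you can filter the vmstat output (a minimal sketch; it skips the three header lines and the since-boot summary line, and assumes pout is the twelfth column, as in the layout above):

vmstat 5 | awk 'NR > 4 && $12 > 0 { print "paging out:", $12, "pages" }'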

While diagnosing a bottleneck situation, keep the following issues in mind:

- If the number of pages paged out (pout) is consistently high, the system may have insufficient memory; see Section 3.4 for information about memory tuning.
- If idle time (id) is consistently near zero, the CPU is saturated; if idle time is high while performance is still poor, the bottleneck is probably in the memory or I/O subsystems.

See Chapter 3 for information on improving CPU usage and I/O operations and for information on tuning virtual memory, disks, and file systems.




2.2.4    Displaying the Swap Space Configuration - swapon Command

Use the swapon command with the -s option to display your swap device configuration. For each swap partition, the command displays the total amount of allocated swap space, the amount of swap space that is being used, and the amount of free swap space. This information should help you determine how your swap space is being utilized. For example:

swapon -s

Swap partition /dev/rz2b (default swap):
    Allocated space:        16384 pages (128MB)
    In-use space:               1 pages (  0%)
    Free space:             16383 pages ( 99%)

 
Swap partition /dev/rz12c:
    Allocated space:       128178 pages (1001MB)
    In-use space:               1 pages (  0%)
    Free space:            128177 pages ( 99%)

Total swap allocation:
    Allocated space:       144562 pages (1129MB)
    Reserved space:          2946 pages (  2%)
    In-use space:               2 pages (  0%)
    Available space:       141616 pages ( 97%)
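
The megabyte figures follow directly from the page counts and the 8 KB page size reported by vmstat (pagesize = 8192); for example, 16384 pages x 8 KB = 128 MB.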

See Section 3.4.2.1 for information on how to tune your swap space configuration. Use the iostat command to determine which disks are being used the most.




2.2.5    Monitoring Disk I/O - iostat Command

The iostat command reports I/O statistics for terminals, disks, and the CPU. The first line of the output is the average since boot time, and each subsequent report is for the last interval. An example of the iostat command is as follows:

iostat 1

      tty     rz1      rz2      rz3      cpu
 tin tout bps tps  bps tps  bps tps  us ni sy id
  0    3   3   1    0   0    8   1   11 10 38 40
  0   58   0   0    0   0    0   0   46  4 50  0
  0   58   0   0    0   0    0   0   68  0 32  0
  0   58   0   0    0   0    0   0   55  2 42  0

The iostat command reports I/O statistics that you can use to diagnose disk I/O performance problems. For example, the command displays information about the following:

- The number of characters read and written per second for terminals (tin and tout)
- For each disk, the amount of data transferred per second (bps) and the number of transfers per second (tps)
- CPU time percentages: user mode (us), user mode running at a reduced (nice) priority (ni), system mode (sy), and idle time (id)


Note the following when you use the iostat command:

- The first line of the output shows averages since boot time; base your analysis on the subsequent interval reports.
- Check whether the I/O load is evenly distributed across the disks; if most transfers (tps) go to a single disk, see Section 3.6 for ways to balance the load.

See Section 3.6 for information on how to improve your disk I/O performance.




2.2.6    Displaying UFS Information - dumpfs Command

The dumpfs command dumps UFS information. The command prints out the super block and cylinder group information. The command is useful for getting information about the file system block and fragment sizes and the minimum free space percentage.

The following example shows part of the output of the dumpfs command:

dumpfs /dev/rrz3g | more

magic   11954   format  dynamic time    Tue Sep 14 15:46:52 1993
nbfree  21490   ndir    9       nifree  99541   nffree  60
ncg     65      ncyl    1027    size    409600  blocks  396062
bsize   8192    shift   13      mask    0xffffe000
fsize   1024    shift   10      mask    0xfffffc00
frag    8       shift   3       fsbtodb 1
cpg     16      bpg     798     fpg     6384    ipg     1536
minfree 10%     optim   time    maxcontig 8     maxbpg  2048
rotdelay 0ms    headswitch 0us  trackseek 0us   rps     60
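
To extract just the tuning-related fields from this display, you can filter the output (a minimal sketch):

dumpfs /dev/rrz3g | egrep 'bsize|fsize|minfree'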

The information contained in the first lines of the output is relevant for tuning. Of specific interest are the following fields:

- bsize, the file system block size, in bytes (8 KB in this example)
- fsize, the fragment size, in bytes (1 KB in this example)
- minfree, the minimum free space percentage (10 percent in this example)

Keep the following issues in mind:

- A larger block size can improve throughput for file systems that hold mostly large files, but it wastes space if most of the files are small.
- UFS performance degrades as a file system fills; the minfree reserve keeps enough free space available for the file system to allocate blocks efficiently.

For information about tuning UFS file system configuration parameters and sysconfigtab configuration attributes to improve your disk I/O performance, see Section 3.6.1.2.




2.2.7    Monitoring AdvFS - advscan, showfdmn, showfile, showfsets

You can use the advscan, showfdmn, showfile, and showfsets commands to display information about AdvFS.

See Section 3.6.1.3 for information about tuning AdvFS.




2.2.7.1    The advscan Command

The advscan command locates pieces of AdvFS domains on disk partitions and in LSM disk groups. Use the advscan command when you have moved disks to a new system, have moved disks around in a way that has changed device numbers, or have lost track of where the domains are. The command is also used for repair if you delete /etc/fdmns, delete a directory domain under /etc/fdmns, or delete some links from a domain directory under /etc/fdmns.

The advscan command accepts a list of volumes or disk groups and searches all partitions and volumes in each. It determines which partitions on a disk are part of an AdvFS file domain. You can run the advscan command to rebuild all or part of your /etc/fdmns directory or you can rebuild it by hand by supplying the names of the partitions in a domain.

The following example scans devices rz0 and rz5 for AdvFS partitions:

advscan rz0 rz5


 
Scanning disks  rz0 rz5
Found domains:

usr_domain
        Domain Id       2e09be37.0002eb40
        Created         Thu Jun 23 09:54:15 1994

        Domain volumes  2
        /etc/fdmns links        2

        Actual partitions found:
                        rz0c
                        rz5c

For the following example, the rz6 domains were removed from /etc/fdmns. The advscan command scans device rz6 and re-creates the missing domains.

advscan -r rz6


 
Scanning disks  rz6
Found domains:

*unknown*
        Domain Id       2f2421ba.0008c1c0
        Created         Mon Jan 23 13:38:02 1995

        Domain volumes  1
        /etc/fdmns links        0

        Actual partitions found:
                        rz6a*

*unknown*
        Domain Id       2f535f8c.000b6860
        Created         Tue Feb 28 09:38:20 1995

        Domain volumes  1
        /etc/fdmns links        0

        Actual partitions found:
                        rz6b*

Creating /etc/fdmns/domain_rz6a/
        linking rz6a

Creating /etc/fdmns/domain_rz6b/
        linking rz6b

See advscan(8) for details on the advscan command.




2.2.7.2    The showfdmn Command

The showfdmn command displays the attributes of an AdvFS file domain and detailed information about each volume in the file domain. The following example of the showfdmn command displays domain information for the /usr file domain:

showfdmn usr


 
               Id              Date Created  LogPgs  Domain Name
2b5361ba.000791be  Tue Jan 12 16:26:34 1993     256  usr

  Vol   512-Blks   Free  % Used  Cmode  Rblks  Wblks  Vol Name
   1L     820164 351580     57%     on    256    256  /dev/rz0d

See showfdmn(8) for information about the output of the command.




2.2.7.3    The showfile Command

The showfile command displays the full storage allocation map (extent map) for files in an Advanced File System (AdvFS). An extent is a contiguous area of disk space that the file system allocates to a file. The following example of the showfile command displays the AdvFS-specific attributes for all of the files in the current working directory:

showfile *


 
        Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  Log  Perf  File
   22a.001    1    16      1    simple    **     **  off   50%  Mail
     7.001    1    16      1    simple    **     **  off   20%  bin
   1d8.001    1    16      1    simple    **     **  off   33%  c
  1bff.001    1    16      1    simple    **     **  off   82%  dxMail
   218.001    1    16      1    simple    **     **  off   26%  emacs
   1ed.001    1    16      0    simple    **     **  off  100%  foo
   1ee.001    1    16      1    simple    **     **  off   77%  lib
   1c8.001    1    16      1    simple    **     **  off   94%  obj
   23f.003    1    16      1    simple    **     **  off  100%  sb
  170a.008    1    16      2    simple    **     **  off   35%  t
     6.001    1    16     12    simple    **     **  off   16%  tmp

The following example of the showfile command shows the attributes and extent information for the tutorial file, which is a simple file:

showfile -x tutorial


 
        Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  Log  Perf  File
 4198.800d    2    16     27    simple    **     **  off   66%  tutorial

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
              0          5      2      781552          80
              5         12      2      785776         192
             17         10      2      786800         160
        extentCnt: 3

See showfile(8) for information about the output of the command.




2.2.7.4    The showfsets Command

The showfsets command displays the filesets (or clone filesets) and their characteristics in a specified domain.

The following is an example of the showfsets command:

showfsets dmn


 
mnt
        Id           : 2c73e2f9.000f143a.1.8001
        Clone is     : mnt_clone
        Files        :       79,  limit =     1000
        Blocks (1k)  :      331,  limit =    25000
        Quota Status : user=on   group=on

mnt_clone
        Id           : 2c73e2f9.000f143a.2.8001
        Clone of     : mnt
        Revision     : 1

See showfsets(8) for information about the output of the command.




2.2.8    Monitoring the Logical Storage Manager (LSM)

A number of commands are available to display LSM-related information and to monitor LSM-related activity:

- voldg, which displays information about the attributes of LSM disk groups (Section 2.2.8.1)
- voldisk, which displays disk configuration and attribute information (Section 2.2.8.2)
- volprint, which displays information from records in the LSM configuration database (Section 2.2.8.3)
- volstat, which displays statistics for LSM volumes, plexes, subdisks, and disks (Section 2.2.8.4)
- voltrace, which prints records from the LSM event log (Section 2.2.8.5)
- volwatch, which monitors LSM for failure events and sends mail to a specified user (Section 2.2.8.6)

In addition, you can use the Analyze menu in LSM's graphical interface (dxlsm) to monitor activity on volumes, LSM disks, and subdisks.

See the manual Logical Storage Manager for more information about monitoring LSM and for information about LSM performance management.




2.2.8.1    The voldg Command

The voldg list command displays brief information about the attributes of LSM disk groups. If you specify a particular disk group, the command displays more detailed information on the status and configuration of the specified group.

The following example uses the voldg list command to display information about the rootdg disk group:

voldg list rootdg

Group:     rootdg
dgid:      795887625.1025.system32
import-id: 0.1
flags:
config:    seqno=0.1351 permlen=347 free=316 templen=9 loglen=52
config disk rz9 copy 1 len=347 state=clean online
config disk rz10 copy 1 len=347 state=clean online
config disk rz12 copy 1 len=347 state=clean online
config disk rz15 copy 1 len=347 state=clean online
config disk rz11 copy 1 len=347 state=clean online
config disk rz13 copy 1 len=347 state=clean online
log disk rz8 copy 1 len=200
log disk rz8 copy 2 len=200
log disk rz9 copy 1 len=52
log disk rz10 copy 1 len=52
log disk rz12 copy 1 len=52
log disk rz15 copy 1 len=52
log disk rz11 copy 1 len=52
log disk rz13 copy 1 len=52
log disk rz3 copy 1 len=200
log disk rz3 copy 2 len=200

For more information, see the voldg(8) reference page.




2.2.8.2    The voldisk Command

The voldisk list command displays the device names for all recognized disks, the disk names, the disk group names associated with each disk, and the status of each disk.

The following example uses the voldisk list command to display information about the rz15 disk:

voldisk list rz15

Device:    rz15
devicetag: rz15
type:      sliced
hostid:    system32
disk:      name=rz15 id=795887633.1049.system32
group:     name=rootdg id=795887625.1025.system32
flags:     online ready private imported
pubpaths:  block=/dev/rz15g char=/dev/rrz15g
privpaths: block=/dev/rz15h char=/dev/rrz15h
version:   1.1
iosize:    512
public:    slice=6 offset=0 len=2697533
private:   slice=7 offset=0 len=512
update:    time=795888426 seqno=0.18
headers:   0 248
configs:   count=1 len=347
logs:      count=1 len=52
Defined regions:
 config   priv     17-   247[   231]: copy=01 offset=000000
 config   priv    249-   364[   116]: copy=01 offset=000231
 log      priv    365-   416[    52]: copy=01 offset=000000

For more information, see the voldisk(8) reference page.




2.2.8.3    The volprint Command

The volprint command displays information from records in the LSM configuration database. You can select the records to be displayed by name or using special search expressions. In addition, you can display record association hierarchies, so that the structure of records is more apparent.

Use the volprint command to display disk group, disk media, volume, plex, and subdisk records. Use the voldisk list command to display disk access records, or physical disk information.

The following example uses the volprint command to show the status of the voldev1 volume:

volprint -ht voldev1

DG NAME        GROUP-ID
DM NAME        DEVICE       TYPE     PRIVLEN  PUBLEN   PUBPATH
V  NAME        USETYPE      KSTATE   STATE    LENGTH   READPOL  PREFPLEX
PL NAME        VOLUME       KSTATE   STATE    LENGTH   LAYOUT   ST-WIDTH MODE
SD NAME        PLEX         PLOFFS   DISKOFFS LENGTH   DISK-NAME    DEVICE

 
v  voldev1      fsgen        ENABLED  ACTIVE   804512   SELECT   -
pl voldev1-01   voldev1      ENABLED  TEMP     804512   CONCAT   -        WO
sd rz8-01       voldev1-01   0        0        804512   rz8      rz8
pl voldev1-02   voldev1      ENABLED  ACTIVE   804512   CONCAT   -        RW
sd dev1-01      voldev1-02   0        2295277  402256   dev1     rz9
sd rz15-02      voldev1-02   402256   2295277  402256   rz15     rz15

For more information, see the volprint(8) reference page.




2.2.8.4    The volstat Command

The volstat command provides information about activity on volumes, plexes, subdisks, and disks under LSM control. It reports statistics that reflect the activity levels of LSM objects since boot time.

The amount of information displayed depends on the options you specify to volstat. For example, you can display statistics for a specific LSM object, or you can display statistics for all objects at one time. You can also specify a disk group, in which case only statistics for objects in that disk group are displayed; if you do not specify a particular disk group, volstat displays statistics for the default disk group (rootdg).

The volstat command can also be used to reset the statistics information to zero. This can be done for all objects or for only specified objects. Resetting just prior to a particular operation makes it possible to measure the subsequent impact of that particular operation.
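
For example, to measure the I/O caused by a single operation, you might reset the counters immediately beforehand (a minimal sketch; see volstat(8) for the reset option and its exact syntax):

volstat -r        # reset the statistics
                  # ...perform the operation to be measured...
volstat           # the statistics now reflect only that operation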

The following example shows statistics on LSM volumes.

volstat

OPERATIONS       BLOCKS        AVG TIME(ms)
TYP NAME        READ   WRITE    READ    WRITE   READ   WRITE
vol archive      865     807    5722     3809   32.5    24.0
vol home        2980    5287    6504    10550   37.7   221.1
vol local      49477   49230  507892   204975   28.5    33.5
vol src        79174   23603  425472   139302   22.4    30.9
vol swapvol    22751   32364  182001   258905   25.3   323.2

For more information, see the volstat(8) reference page.




2.2.8.5    The voltrace Command

The voltrace command reads an event log (/dev/volevent) and prints formatted event log records to standard output. Using voltrace, you can set event trace masks to determine which type of events will be tracked. For example, you can trace I/O events, configuration changes, or I/O errors.

The following sample voltrace command shows status on all new events.

voltrace -n -e all

18446744072623507277 IOTRACE 439: req 3987131 v:rootvol p:rootvol-01 \
  d:root_domain s:rz3-02 iot write lb 0 b 63120 len 8192 tm 12
18446744072623507277 IOTRACE 440: req 3987131 \
  v:rootvol iot write lb 0 b 63136 len 8192 tm 12

For more information, see the voltrace(8) reference page.




2.2.8.6    The volwatch Command

The volwatch command monitors LSM for failure events and sends mail to the specified user.

For more information, see the volwatch(8) reference page.




2.2.8.7    The Analyze Menu

LSM's graphical interface (dxlsm) includes an Analyze menu. The Analyze menu allows you to display statistics about volumes, LSM disks, and subdisks. The information is displayed graphically, using colors and patterns on the disk icons, and numerically, using the Analysis Statistics form. You can use the Analysis Parameters form to tailor the information that will be displayed.

See the manual Logical Storage Manager for information about dxlsm.




2.2.9    Monitoring System Parameter Settings

System parameters are global variables. You can monitor the settings of these variables by using the Kernel Tuner or the sysconfig command. As explained in Section 2.2.10, you can also monitor them with dbx.




2.2.9.1    Using the Kernel Tuner to Monitor Settings

The Kernel Tuner (dxkerneltuner) is available through the Common Desktop Environment (CDE) graphical user interface. To access the Kernel Tuner, click on the Application Manager icon in the CDE menu bar and then select the Monitoring/Tuning category. When you then select the Kernel Tuner, a pop-up window containing a list of subsystems appears. Selecting a subsystem generates a display of the subsystem's attributes and their values. See Appendix B for descriptions of the attributes displayed by the Kernel Tuner or the sysconfig command.




2.2.9.2    Using the sysconfig Command to Monitor Settings

The sysconfig command is part of a system utility that allows you to modify most of the global variables that affect system performance without needing to rebuild the kernel to put the new values permanently in effect. (Section 3.3 explains how to modify global variables in this way.)

The sysconfig -q command monitors the values of attributes. Each attribute corresponds to a global variable; however, not all global variables have corresponding attributes (that is, only a subset of the global variables in a system have corresponding attributes). Attribute names usually differ slightly from the names that are used for their corresponding global variables in the system configuration file (/usr/sys/conf/system_name) and the param.c file (/usr/sys/system_name/param.c), but they are always very similar.

To examine the current setting of a particular global variable, issue a sysconfig command with the name of the subsystem that owns the variable and the name of the attribute that corresponds to the particular variable:

sysconfig -q subsystem_name [ attribute_name ]

If you omit attribute_name, the values for all of the attributes for the named subsystem are displayed.

Use the following command to list the subsystem names that you can specify in a sysconfig command:

sysconfig -s

For example:

sysconfig -s

Cm: loaded and configured
Generic: loaded and configured
Proc: loaded and configured

.
.
.
Xpr: loaded and configured
Rt: loaded and configured
Net: loaded and configured
#


Use the following command to list the values of all of the attributes associated with a particular subsystem:

sysconfig -q subsystem_name

For example:

sysconfig -q vm

ubc-minpercent = 10
ubc-maxpercent = 100

.
.
.
vm-syswiredpercent = 80
vm-inswappedmin = 1

sysconfig -q vfs

name-cache-size = 1029
name-cache-hash-size = 256

.
.
.
max-ufs-mounts = 1000
vnode-deallocation-enable = 1
#


Note that a global variable's value in the system configuration file or the param.c file can differ from the value assigned to the global variable's attribute established in the sysconfigtab file (/etc/sysconfigtab) by the sysconfigdb command or in a running kernel by the sysconfig -r command. In a running system, values established by the sysconfig -r command override values established in the sysconfigtab file, and values in the sysconfigtab file override values in the system configuration file or the param.c file.

If an attribute is not defined in the sysconfigtab file, the sysconfig -q command returns the value of the corresponding parameter in the system configuration file or param.c.

To display the minimum and maximum values that can be given to attributes, issue the following command:

sysconfig -Q subsystem_name [ attribute_list ]

See sysconfig(8) or the Kernel Debugging and Configuration Management Guide for details on the sysconfig command. See Section 3.3 for information on how to tune the values of configuration attributes using the sysconfigdb and sysconfig -r commands.

For descriptions of the configuration attributes that have an effect on system performance, see Appendix B. Note that not all subsystems displayed by a sysconfig -s command are covered in Appendix B. Only those subsystems that have tunable attributes affecting performance are covered.




2.2.10    Using dbx to Monitor Subsystems

You can use dbx to examine source files, control program execution, display the state of the program, and debug at the machine-code level. To examine the values of variables and data structures, use the dbx print command.

To examine a running system with dbx, issue the following command:

dbx -k /vmunix /dev/mem

The following sections describe how to use dbx to examine various subsystems of the Digital UNIX operating system.




2.2.10.1    Checking Virtual Memory with dbx

You can check virtual memory by using dbx and examining the vm_perfsum structure. Note the vpf_pagefaults field (number of hardware page faults) and the vpf_swapspace field (number of pages of swap space not reserved):

(dbx)  p vm_perfsum

struct {
	vpf_pagefaults = 6732100

.
.
.
    vpf_swapspace = 29230
}
(dbx)

See Section 3.4 for information on how to tune the virtual memory subsystem.




2.2.10.2    Checking UFS with dbx

To check UFS using dbx, examine the ufs_clusterstats structure to see how efficiently the system is performing cluster read and write transfers. You can examine the cluster reads and writes separately with the ufs_clusterstats_read and ufs_clusterstats_write structures.

The following example shows a system that is not clustering efficiently:

(dbx)  p ufs_clusterstats

struct {
    full_cluster_transfers = 3130
    part_cluster_transfers = 9786
    non_cluster_transfers = 16833
    sum_cluster_transfers = {
        [0] 0
        [1] 24644
        [2] 1128
        [3] 463
        [4] 202
        [5] 55
        [6] 117
        [7] 36
        [8] 123
        [9] 0
    }
}
(dbx)


The preceding example shows 24644 single-block transfers and no 9-block transfers. The trend of the data shown in the example is the reverse of what you want to see: a large number of single-block transfers and a declining number of multiblock (2 through 9 block) transfers. However, if the files are all small, this may be the best blocking that you can achieve.

See Section 3.6.1.2 for information on how to tune the UFS file system.




2.2.10.3    Checking the UFS Namei Cache with dbx

The UFS namei cache stores recently used file system pathname/inode number pairs. It also stores inode information for files that were referenced but not found. Having this information in the cache substantially reduces the amount of searching that is needed to perform pathname translations.

To check the namei cache, use dbx and look at the nchstats data structure. In particular, look at the ncs_goodhits, ncs_neghits, and ncs_miss fields to determine the hit rate. The hit rate should be above 80 percent (ncs_goodhits plus ncs_neghits, divided by the sum of ncs_goodhits, ncs_neghits, and ncs_miss).

For example:

(dbx)  p nchstats

struct {
    ncs_goodhits = 9748603   -found a pair
    ncs_neghits = 888729     -found a pair that didn't exist
    ncs_badhits = 23470
    ncs_falsehits = 69371
    ncs_miss = 1055430       -did not find a pair
    ncs_long = 4067          -name was too long to fit in the cache
    ncs_pass2 = 127950
    ncs_2passes = 195763
    ncs_dirscan = 47
}
(dbx)
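
In the preceding example, the hit rate is (9748603 + 888729) / (9748603 + 888729 + 1055430), or about 91 percent, which is above the 80 percent guideline.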

For information on how to improve the namei cache hit rate, see Section 3.6.1. For information on how to improve namei cache lookup speeds, see Section 3.4.1.3.




2.2.10.4    Checking the UBC with dbx

To check the UBC, use dbx to examine the vm_perfsum structure. In particular, look at the vpf_pgiowrites field (the number of I/O operations for page outs generated by the page-stealing daemon) and the vpf_ubcalloc field (the number of times the UBC had to allocate a page from the virtual memory free page list to satisfy memory demands). For example:

(dbx)  p vm_perfsum

struct {
    vpf_pagefaults = 6732100
    vpf_kpagefaults = 119865
    vpf_cowfaults = 926159
    vpf_cowsteals = 192703
    vpf_zfod = 2720195
    vpf_kzfod = 119865
    vpf_pgiowrites = 1882
    vpf_pgwrites = 4747
    vpf_pgioreads = 1874108
    vpf_pgreads = 1412
    vpf_swapreclaims = 4
    vpf_taskswapouts = 0
    vpf_taskswapins = 0
    vpf_vplmsteal = 1411
    vpf_vplmstealwins = 1365
    vpf_vpseqdrain = 0
    vpf_ubchit = 3851
    vpf_ubcalloc = 103378
    vpf_ubcpushes = 0
    vpf_ubcpagepushes = 0
    vpf_ubcdirtywra = 0
    vpf_ubcreclaim = 0
    vpf_reactivate = 1973
    vpf_allocatedpages = 16177
    vpf_wiredpages = 2805
    vpf_ubcpages = 5494
    vpf_freepages = 3384
    vpf_swapspace = 29230
}
(dbx)

The vpf_ubcpages field gives the number of pages of physical memory that the UBC is using to cache file data. If the UBC is using significantly more than half of physical memory and the paging rate (vpf_pgiowrites field) is high, you should probably reduce ubc-maxpercent to 50 percent. This should cause a decrease in the paging activity.
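
For example, because the vm subsystem exports ubc-maxpercent as a configuration attribute (see Section 2.2.9.2), the value could be lowered on a running system with the sysconfig -r command (a sketch; see Section 3.4.1 for the recommended tuning procedure):

sysconfig -r vm ubc-maxpercent=50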

You can also monitor the UBC by examining the ufs_getapage_stats kernel data structure. You can calculate the hit rate by dividing the value for read_hits by the value for read_looks. A good hit rate is a rate above 95 percent.

(dbx)  p ufs_getapage_stats

struct {
    read_looks = 2059022
    read_hits = 2022488
    read_miss = 36506
}
(dbx)
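
In this example, the hit rate is 2022488 / 2059022, or about 98 percent, which is above the 95 percent guideline.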

In addition, you can check the UBC by examining the vm_tune structure and the vt_ubcseqpercent and vt_ubcseqstartpercent fields. These values are used to prevent a large file from completely filling the UBC, thus limiting the amount of memory available to the virtual memory subsystem.

For example:

(dbx)  p vm_tune

struct {
    vt_cowfaults = 4
    vt_mapentries = 200
    vt_maxvas = 1073741824
    vt_maxwire = 16777216
    vt_heappercent = 7
    vt_anonklshift = 17
    vt_anonklpages = 1
    vt_vpagemax = 16384
    vt_segmentation = 1
    vt_ubcpagesteal = 24
    vt_ubcdirtypercent = 10
    vt_ubcseqstartpercent = 50
    vt_ubcseqpercent = 10
    vt_csubmapsize = 1048576
    vt_ubcbuffers = 256
    vt_syncswapbuffers = 128
    vt_asyncswapbuffers = 4
    vt_clustermap = 1048576
    vt_clustersize = 65536
    vt_zone_size = 0
    vt_kentry_zone_size = 16777216
    vt_syswiredpercent = 80
    vt_inswappedmin = 1
}

When copying large files, the source and destination objects in the UBC can grow very large (up to all available physical memory). Reducing the value of vt_ubcseqpercent decreases the number of pages that are used to cache sequentially accessed files (that is, files being moved in memory). The value represents the percentage of memory that a sequentially accessed file can grow to before it starts stealing memory from itself; in effect, it imposes a resident set size limit on a file.

See Section 3.4.1 for information on how to tune the UBC.




2.2.10.5    Checking the Metadata Buffer Cache with dbx

The metadata buffer cache contains file metadata - superblocks, inodes, indirect blocks, directory blocks, and cylinder group summaries. To check the metadata buffer cache, use dbx to examine the bio_stats structure:

(dbx)  p bio_stats

struct {
    getblk_hits = 4590388
    getblk_misses = 17569
    getblk_research = 0
    getblk_dupbuf = 0
    getnewbuf_calls = 17590
    getnewbuf_buflocked = 0
    vflushbuf_lockskips = 0
    mntflushbuf_misses = 0
    mntinvalbuf_misses = 0
    vinvalbuf_misses = 0
    allocbuf_buflocked = 0
    ufssync_misses = 0
}
(dbx)

If the miss rate is high, you may want to raise the value of the bufcache attribute. The number of block misses (getblk_misses) divided by the sum of block misses and block hits (getblk_hits) should not be more than 3 percent.
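
In the preceding example, the miss rate is 17569 / (17569 + 4590388), or about 0.4 percent, so the metadata buffer cache is performing well.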

See Section 3.4.1.3 for information on how to tune the metadata buffer cache.




2.2.10.6    Monitoring CAM Data Structures with dbx

The operating system uses the Common Access Method (CAM) as the operating system interface to the hardware. CAM maintains the following data structures:

- xpt_qhead, which describes the pool of CAM control blocks (CCBs); its xpt_nfree and xpt_nbusy fields show how many CCBs are free and busy, and xpt_wait_cnt counts requests that had to wait for a CCB
- ccmn_bp_head, which describes the CAM buffer pool; its bp_wait_cnt field counts requests that had to wait for buffer pool space
- xpt_cb_queue, which holds completed I/O operations that are waiting to be passed back to the peripheral device drivers

Use dbx to examine the three structures:

(dbx)  p xpt_qhead

struct {
    xws = struct {
        x_flink = 0xffffffff81f07400
        x_blink = 0xffffffff81f03000
        xpt_flags = 2147483656
        xpt_ccb = (nil)
        xpt_nfree = 300
        xpt_nbusy = 0
    }
    xpt_wait_cnt = 0
    xpt_times_wait = 2
    xpt_ccb_limit = 1048576
    xpt_ccbs_total = 300
    x_lk_qhead = struct {
        sl_data = 0
        sl_info = 0
        sl_cpuid = 0
        sl_lifms = 0
    }
}
(dbx)  p ccmn_bp_head
struct {
    num_bp = 50
    bp_list = 0xffffffff81f1be00
    bp_wait_cnt = 0
}
(dbx)  p xpt_cb_queue
struct {
    flink = 0xfffffc00004d6828
    blink = 0xfffffc00004d6828
    flags = 0
    initialized = 1
    count = 0
    cplt_lock = struct {
        sl_data = 0
        sl_info = 0
        sl_cpuid = 0
        sl_lifms = 0
    }
}
(dbx)

If the values for xpt_wait_cnt or bp_wait_cnt are nonzero, CAM has run out of buffer pool space. If this situation persists, you may be able to eliminate the problem by changing one or more of CAM's I/O attributes (see Section B.8).

The count field in xpt_cb_queue is the number of I/O operations that have been completed and are ready to be passed back to a peripheral device driver. Normally, the value of count should be zero or one. A value greater than one could indicate either a problem or a temporary situation in which a large number of I/O operations are completing simultaneously. If repeated testing shows that the value is consistently greater than one, one or more subsystem components may require tuning.




2.2.11    Monitoring the Network - netstat Command

To check network statistics, use the netstat command (or the nfsstat command; see Section 2.2.12). Some problems to look for are as follows:

- A high rate of input errors (Ierrs), output errors (Oerrs), or collisions (Coll) in the netstat -i display
- Bad checksums, dropped fragments, or frequent retransmissions in the per-protocol netstat -s display

Most of the information provided by netstat is used to diagnose network hardware or software failures, not to analyze tuning opportunities. See the manual Network Administration for additional information on how to diagnose failures.

The following example shows the output produced by the -i option of the netstat command:

netstat -i

Name  Mtu   Network     Address         Ipkts Ierrs    Opkts Oerrs  Coll
ln0   1500  DLI         none           133194     2    23632     4  4881
ln0   1500  <Link>                     133194     2    23632     4  4881
ln0   1500  red-net     node1          133194     2    23632     4  4881
sl0*  296   <Link>                          0     0        0     0     0
sl1*  296   <Link>                          0     0        0     0     0
lo0   1536  <Link>                        580     0      580     0     0
lo0   1536  loop        localhost         580     0      580     0     0
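
In this example, the ln0 interface reports 4881 collisions against 23632 output packets, a collision rate of roughly 21 percent (4881/23632), which suggests a heavily loaded Ethernet segment.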

Use the following command to determine the causes of the input errors (Ierrs) and output errors (Oerrs) shown in the preceding example:

netstat -is


 
ln0 Ethernet counters at Fri Jan 14 16:57:36 1994
 
         4112 seconds since last zeroed
     30307093 bytes received
      3722308 bytes sent
       133245 data blocks received
        23643 data blocks sent
     14956647 multicast bytes received
       102675 multicast blocks received
        18066 multicast bytes sent
          309 multicast blocks sent
         3446 blocks sent, initially deferred
         1130 blocks sent, single collision
         1876 blocks sent, multiple collisions
            4 send failures, reasons include:
                Excessive collisions
            0 collision detect check failure
            2 receive failures, reasons include:
                Block check error
                Framing Error
            0 unrecognized frame destination
            0 data overruns
            0 system buffer unavailable
            0 user buffer unavailable


The -s option for the netstat command displays statistics for each protocol:

netstat -s

ip:
        67673 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with header length < data size
        0 with data length < header length
        8616 fragments received
        0 fragments dropped (dup or out of space)
        5 fragments dropped after timeout
        0 packets forwarded
        8 packets not forwardable
        0 redirects sent
icmp:
        27 calls to icmp_error
        0 errors not generated 'cuz old message was icmp
        Output histogram:
                echo reply: 8
                destination unreachable: 27
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        Input histogram:
                echo reply: 1
                destination unreachable: 4
                echo: 8
        8 message responses generated
igmp:
        365 messages received
        0 messages received with too few bytes
        0 messages received with bad checksum
        365 membership queries received
        0 membership queries received with invalid field(s)
        0 membership reports received
        0 membership reports received with invalid field(s)
        0 membership reports received for groups to which we belong
        0 membership reports sent
tcp:
        11219 packets sent
                7265 data packets (139886 bytes)
                4 data packets (15 bytes) retransmitted
                3353 ack-only packets (2842 delayed)
                0 URG only packets
                14 window probe packets
                526 window update packets
                57 control packets
        12158 packets received
                7206 acks (for 139930 bytes)
                32 duplicate acks
                0 acks for unsent data
                8815 packets (1612505 bytes) received in-sequence
                432 completely duplicate packets (435 bytes)
                0 packets with some dup. data (0 bytes duped)
                14 out-of-order packets (0 bytes)
                1 packet (0 bytes) of data after window
                0 window probes
                1 window update packet
                5 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
        19 connection requests
        25 connection accepts
        44 connections established (including accepts)
        47 connections closed (including 0 drops)
        3 embryonic connections dropped
        7217 segments updated rtt (of 7222 attempts)
        4 retransmit timeouts
                0 connections dropped by rexmit timeout
        0 persist timeouts
        0 keepalive timeouts
                0 keepalive probes sent
                0 connections dropped by keepalive
udp:
        12003 packets sent
        48193 packets received
        0 incomplete headers
        0 bad data length fields
        0 bad checksums
        0 full sockets
        12943 for no port (12916 broadcasts, 0 multicasts)

See netstat(1) for information about the output produced by the various options supported by the netstat command.




2.2.12    Displaying NFS Statistics - nfsstat Command

To check NFS statistics, use the nfsstat command. For example:

nfsstat


 
Server rpc:
calls      badcalls   nullrecv   badlen     xdrcall
38903      0          0          0          0

Server nfs:
calls      badcalls
38903      0

Server nfs V2:
null       getattr    setattr    root       lookup     readlink   read
5 0%       3345 8%    61 0%      0 0%       5902 15%   250 0%     1497 3%
wrcache    write      create     remove     rename     link       symlink
0 0%       1400 3%    549 1%     1049 2%    352 0%     250 0%     250 0%
mkdir      rmdir      readdir    statfs
171 0%     172 0%     689 1%     1751 4%

Server nfs V3:
null       getattr    setattr    lookup     access     readlink   read
0 0%       1333 3%    1019 2%    5196 13%   238 0%     400 1%     2816 7%
write      create     mkdir      symlink    mknod      remove     rmdir
2560 6%    752 1%     140 0%     400 1%     0 0%       1352 3%    140 0%
rename     link       readdir    readdir+   fsstat     fsinfo     pathconf
200 0%     200 0%     936 2%     0 0%       3504 9%    3 0%       0 0%
commit
21 0%

Client rpc:
calls      badcalls   retrans    badxid     timeout    wait       newcred
27989      1          0          0          1          0          0
badverfs   timers
0          4

Client nfs:
calls      badcalls   nclget     nclsleep
27988      0          27988      0

Client nfs V2:
null       getattr    setattr    root       lookup     readlink   read
0 0%       3414 12%   61 0%      0 0%       5973 21%   257 0%     1503 5%
wrcache    write      create     remove     rename     link       symlink
0 0%       1400 5%    549 1%     1049 3%    352 1%     250 0%     250 0%
mkdir      rmdir      readdir    statfs
171 0%     171 0%     713 2%     1756 6%

Client nfs V3:
null       getattr    setattr    lookup     access     readlink   read
0 0%       666 2%     9 0%       2598 9%    137 0%     200 0%     1408 5%
write      create     mkdir      symlink    mknod      remove     rmdir
1280 4%    376 1%     70 0%      200 0%     0 0%       676 2%     70 0%
rename     link       readdir    readdir+   fsstat     fsinfo     pathconf
100 0%     100 0%     468 1%     0 0%       1750 6%    1 0%       0 0%
commit
10 0%

The ratio of timeouts to calls (which should not exceed 1 percent) is the most important thing to look for in the NFS statistics. A timeout-to-call ratio greater than 1 percent can have a significant negative impact on performance. See Section 3.6.3 for information on how to tune your system to avoid timeouts.
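
In the preceding example, the client made 27989 RPC calls and experienced only 1 timeout, a ratio of about 0.004 percent (1/27989), which is well below the 1 percent threshold.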

If you are attempting to monitor an experimental situation with nfsstat, it may be advisable to reset the NFS counters to zero before you begin the experiment. The nfsstat -z command can be used to clear the counters.
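
For example, a measurement sequence might look like this (a minimal sketch; clearing the counters typically requires root privileges):

nfsstat -z        # clear the NFS and RPC counters
                  # ...run the workload you want to measure...
nfsstat           # the statistics now reflect only that workload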