4    Improving System Performance

You may be able to improve Tru64 UNIX performance by tuning the operating system or performing other tasks. You may need to tune the system under the following circumstances:

To help you improve system performance, this chapter describes how to perform the following tasks:

In order to effectively tune a system, you must identify the area in which performance is deficient. See Chapter 3 for information about monitoring the system.

4.1    Tuning Special Configurations

Large configurations or configurations that run memory-intensive or network-intensive applications may require special tuning. The following sections provide information about tuning these special configurations:

In addition, your application product documentation may include specific configuration and tuning recommendations.

4.1.1    Tuning Internet Servers

Internet servers (including Web, proxy, firewall, and gateway servers) run network-intensive applications that usually require significant system resources. If you have an Internet server, it is recommended that you modify the default values of some kernel attributes.

Follow the recommendations in Table 4-1 to help you tune an Internet server.

Table 4-1:  Internet Server Tuning Recommendations

Action Reference
Increase the system resources available to processes. Section 5.1
Increase the available address space. Section 5.3.1
Increase the number of memory-mapped files in a user address space. Section 5.3.2
Increase the number of pages with individual protections. Section 5.3.3
Ensure that the Unified Buffer Cache (UBC) has sufficient memory. Section 9.2.3
Increase the size of the hash table that the kernel uses to look up TCP control blocks. Section 10.2.1
Increase the number of TCP hash tables. Section 10.2.2
Increase the limits for partial TCP connections on the socket listen queue. Section 10.2.3
For proxy servers only, increase the maximum number of concurrent nonreserved, dynamically allocated ports. Section 10.2.4
Disable use of a path maximum transmission unit (PMTU). Section 10.2.16
Increase the number of IP input queues. Section 10.2.19
For proxy servers only, enable mbuf cluster compression. Section 10.2.21

4.1.2    Tuning Large-Memory Systems

Large memory systems often run memory-intensive applications, such as database programs, that usually require significant system resources. If you have a large memory system, it is recommended that you modify the default values of some kernel attributes.

Follow the recommendations in Table 4-2 to help you tune a large-memory system.

Table 4-2:  Large-Memory System Tuning Recommendations

Action Reference
Increase the system resources available to processes. Section 5.1
Increase the size of a System V message and queue. Section 5.4.1
Increase the maximum size of a single System V shared memory region. Section 5.4.4
Increase the minimum size of a System V shared memory segment. Section 5.4.5
Increase the available address space. Section 5.3.1
Increase the maximum number of memory-mapped files that are available to a process. Section 5.3.2
Increase the maximum number of virtual pages within a process' address space that can have individual protection attributes. Section 5.3.3
Reduce the size of the AdvFS buffer cache. Section 6.4.5
Increase the number of AdvFS buffer hash chains, if you are using AdvFS. Section 9.3.4.2
Increase the memory reserved for AdvFS access structures, if you are using AdvFS. Section 9.3.4.3
Increase the size of the metadata buffer cache to more than 3 percent of main memory, if you are using UFS. Section 9.4.3.1
Increase the size of the metadata hash chain table, if you are using UFS. Section 9.4.3.2

4.1.3    Tuning NFS Servers

NFS servers run only a few small user-level programs, which consume few system resources. File system tuning is important because processing NFS requests consumes the majority of CPU and wall clock time. See Chapter 9 for information on file system tuning.

In addition, if you are running NFS over TCP, tuning TCP may improve performance if there are many active clients. If you are running NFS over UDP, network tuning is not needed. See Section 10.2 for information on network subsystem tuning.

Follow the recommendations in Table 4-3 to help you tune a system that is only serving NFS.

Table 4-3:  NFS Server Tuning Recommendations

Action Reference
Set the value of the maxusers attribute to the number of server NFS operations that are expected to occur each second. Section 5.1
Increase the size of the namei cache. Section 9.2.1
Increase the number of AdvFS access structures, if you are using AdvFS. Section 9.3.4.3
Increase the size of the metadata buffer cache, if you are using UFS. Section 9.4.3.1

4.2    Checking the Configuration by Using the sys_check Utility

After you apply any configuration-specific tuning recommendations, as described in Section 4.1, run the sys_check utility to check your system configuration.

The sys_check utility creates an HTML file that describes the system configuration, and can be used to diagnose problems. The utility checks kernel attribute settings and memory and CPU resources, provides performance data and lock statistics for SMP systems and for kernel profiles, and outputs any warnings and tuning recommendations.

Consider applying the sys_check utility's configuration and tuning recommendations before applying any advanced tuning recommendations.

Note

You may experience impaired system performance while running the sys_check utility. Invoke the utility during off-peak hours to minimize the performance impact.

You can invoke the sys_check utility from the SysMan graphical user interface or from the command line. If you specify sys_check without any command-line options, it performs a basic system analysis and creates an HTML file with configuration and tuning recommendations. Options that you can specify at the command line include the following:

See sys_check(8) for more information.
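For example, a basic analysis redirects its HTML report to a file that you can then view in a Web browser. This is a sketch only; the exact options available depend on your sys_check version, so verify them against sys_check(8) before use:

```shell
# Perform a basic system analysis and save the HTML report
sys_check > /var/tmp/sys_check_basic.html

# Request a more complete check (the -all option, if supported
# by your sys_check version, includes additional subsystem data)
sys_check -all > /var/tmp/sys_check_full.html
```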

4.3    Using the Advanced Tuning Recommendations

If system performance is still deficient after applying the initial tuning recommendations for your configuration (see Section 4.1) and considering the sys_check recommendations (see Section 4.2), you may be able to improve performance by using the advanced tuning recommendations.

Before using the advanced tuning recommendations, you must:

Use the advanced tuning guidelines shown in Table 4-4 to help you tune your system. In addition, Section 4.5 describes some solutions to common performance problems. Before implementing any tuning recommendation, you must ensure that it is appropriate for your configuration and workload and also consider its benefits and tradeoffs.

Table 4-4:  Advanced Tuning Guidelines

If your workload consists of: You can improve performance by:
Applications requiring extensive system resources Increasing resource limits (Chapter 5)
Memory-intensive applications

Increasing the memory available to processes (Section 6.4)

Modifying paging and swapping operations (Section 6.5)

Reserving shared memory (Section 6.6)

CPU-intensive applications Freeing CPU resources (Section 7.2)
Disk I/O-intensive applications Distributing the disk I/O load (Section 8.1)
File system-intensive applications Modifying AdvFS, UFS, or NFS operation (Chapter 9)
Network-intensive applications Modifying network operation (Section 10.2)
Non-optimized or poorly-written applications Optimizing or rewriting the applications (Chapter 11)

4.4    Modifying the Kernel

The operating system includes various subsystems that are used to define or extend the kernel. Kernel variables control subsystem behavior or track subsystem statistics since boot time.

Kernel variables are assigned default values at boot time. For certain configurations and workloads, especially on memory-intensive or network-intensive systems, the default values of some attributes may not be appropriate, and you may need to modify these values to obtain optimal performance.

There are several methods that you can use to modify kernel variable values:

Each system includes different subsystems, depending on the configuration and the installed kernel options. For example, all systems include the mandatory subsystems, such as the generic, vm, and vfs subsystems. Other subsystems are optional, such as presto.

Most subsystems include one or more attributes. These attributes control or monitor some part of the subsystem. For example, the vm subsystem includes the vm-page-free-swap attribute, which controls when swapping starts. The socket subsystem includes the sobacklog_hiwat attribute, which monitors the maximum number of pending socket requests.

Kernel subsystem attributes are documented in reference pages. For example, sys_attrs_advfs(5) includes definitions for all the advfs subsystem attributes. See sys_attrs(5) for more information.

Subsystem attributes are managed by the configuration manager server, cfgmgr. You access subsystem attributes by using the sysconfig and sysconfigdb commands and by using the Kernel Tuner, dxkerneltuner, which is provided by the Common Desktop Environment (CDE).

You permanently modify an attribute value by including the modification in the /etc/sysconfigtab database file, using a special format. In some cases, you can modify attributes while the system is running. However, these run-time modifications are lost when the system reboots.
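For reference, entries in the /etc/sysconfigtab database use a stanza format: the subsystem name followed by a colon, then one attribute assignment per line. The following sketch shows the general shape; the attribute values are illustrative, not tuning recommendations:

```
vfs:
    name-cache-size = 2048

socket:
    somaxconn = 1024
```

Use the sysconfigdb command, rather than a text editor, to add entries in this format (see Section 4.4.5).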

The following sections describe how to perform these tasks:

4.4.1    Displaying Kernel Subsystems

Use one of the following methods to display the kernel subsystems currently configured in your operating system:

The following example shows how to use the sysconfig -s command to display all the subsystems configured in the operating system:

# sysconfig -s
cm: loaded and configured
hs: loaded and configured
ksm: loaded and configured
generic: loaded and configured
io: loaded and configured
ipc: loaded and configured
proc: loaded and configured
sec: loaded and configured
socket: loaded and configured
rt: loaded and configured
bsd_tty: loaded and configured
xpr: loaded and configured
kdebug: loaded and configured
dli: loaded and configured
ffm_fs: loaded and configured
atm: loaded and configured
atmip: loaded and configured
lane: loaded and configured
atmifmp: loaded and configured
atmuni: loaded and configured
atmilmi3x: loaded and configured
uni3x: loaded and configured
bparm: loaded and configured
advfs: loaded and configured
net: loaded and configured
 .
 .
 .

4.4.2    Displaying Current Attribute Values

Use one of the following methods to display the current (run-time) value of an attribute:

The following example shows how to use the sysconfig -q command to display the current values of the vfs subsystem:

# sysconfig -q vfs
vfs:
name-cache-size = 1029
name-cache-hash-size = 256
buffer-hash-size = 512
special-vnode-alias-tbl-size = 64
bufcache = 3
bufpages = 238
path-num-max = 64
sys-v-mode = 0
ucred-max = 256
nvnode = 468
max-vnodes = 6558
min-free-vnodes = 468
vnode-age = 120
namei-cache-valid-time = 1200
max-free-file-structures = 0
max-ufs-mounts = 1000
vnode-deallocation-enable = 1
pipe-maxbuf-size = 65536
pipe-single-write-max = -1
pipe-databuf-size = 8192
pipe-max-bytes-all-pipes = 81920000
noadd-exec-access = 0
fifo-do-adaptive = 1
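You can also query a single attribute instead of an entire subsystem by naming the attribute on the command line:

```shell
# Display only the name-cache-size attribute of the vfs subsystem
sysconfig -q vfs name-cache-size
```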

Note

The current value of an attribute may not reflect a legal value if you are not actually using the subsystem. For example, if you do not have an AdvFS fileset mounted, the current value of the advfs subsystem attribute AdvfsPreallocAccess will be 0 (zero), even though the minimum value is 128. After you mount an AdvFS fileset, the current value changes to 128.

4.4.3    Displaying Minimum and Maximum Attribute Values

Each subsystem attribute has a minimum and maximum value. If you modify an attribute, the new value must fall within this range. However, use the minimum and maximum values with caution. Instead, use the tuning recommendations described in this manual to determine an appropriate attribute value for your configuration.

Use one of the following methods to display the minimum and maximum allowable values for an attribute:
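For example, the sysconfig -Q command reports attribute information, including the data type, the operations the attribute supports (such as run-time reconfiguration), and the minimum and maximum values:

```shell
# Show attribute information (type, supported operations, min/max)
# for the name-cache-size attribute of the vfs subsystem
sysconfig -Q vfs name-cache-size
```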

4.4.4    Modifying Run-Time Attribute Values

Modifying an attribute's current (run-time) value allows the change to occur immediately, without rebooting the system. Not all attributes support run-time modifications.

Modifications to run-time values are lost when you reboot the system and the attribute values return to their permanent values. To make a permanent change to an attribute value, see Section 4.4.5.

To determine if an attribute can be tuned at run time, use one of the following methods:

To modify an attribute's run-time value, use one of the following methods:

Note

Do not specify erroneous values for subsystem attributes, because system behavior may be unpredictable. If you want to modify an attribute, use only the recommended values described in this manual.

To return to the original attribute value, either modify the run-time value or reboot the system.
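As a sketch, a run-time modification with the sysconfig -r command looks like the following; the attribute value shown is illustrative, not a tuning recommendation:

```shell
# Reconfigure the vfs subsystem at run time; this change is
# lost at the next reboot
sysconfig -r vfs name-cache-size=2048
```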

4.4.5    Permanently Modifying Attribute Values

To permanently change the value of an attribute, you must include the new value in the /etc/sysconfigtab file, using the required format. Do not edit the file manually.

Note

Before you permanently modify a subsystem attribute, it is recommended that you maintain a record of the original value, in case you need to return to this value.

Use one of the following methods to permanently modify the value of an attribute:

Note

Do not specify erroneous values for subsystem attributes, because system behavior may be unpredictable. If you want to modify an attribute, use only the recommended values described in this manual.
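For example, instead of editing /etc/sysconfigtab manually, you can place the new value in a stanza file and merge it into the database with the sysconfigdb command. The file name and attribute value below are illustrative:

```shell
# Create a stanza file containing the new attribute value
cat > /tmp/vfs_tune.stanza <<'EOF'
vfs:
    name-cache-size = 2048
EOF

# Merge the stanza into /etc/sysconfigtab; the new value
# takes effect at the next reboot
sysconfigdb -a -f /tmp/vfs_tune.stanza vfs
```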

4.4.6    Displaying and Modifying Kernel Variables by Using the dbx Debugger

Use the dbx debugger to examine the values of kernel variables and data structures and to modify the current (run-time) values of kernel variables.

The following example of the dbx print command displays the current (run-time) value of the vm_page_free_count kernel variable:

# /usr/ucb/dbx -k /vmunix /dev/mem 
(dbx) print vm_page_free_count
248
(dbx)

The following example of the dbx print command displays the current (run-time) values of the kernel variables in the vm_perfsum data structure:

# /usr/ucb/dbx -k /vmunix /dev/mem 
(dbx) print vm_perfsum
struct {
    vpf_pagefaults = 1689166
    vpf_kpagefaults = 13690
    vpf_cowfaults = 478504
    vpf_cowsteals = 638970
    vpf_zfod = 255372
    vpf_kzfod = 13654
    vpf_pgiowrites = 3902
    .
    .
    .
 
    vpf_vmwiredpages = 440
    vpf_ubcwiredpages = 0
    vpf_mallocpages = 897
    vpf_totalptepages = 226
    vpf_contigpages = 3
    vpf_rmwiredpages = 0
    vpf_ubcpages = 2995
    vpf_freepages = 265
    vpf_vmcleanpages = 237
    vpf_swapspace = 7806
}
(dbx)

Use the dbx patch command to modify the current (run-time) values of kernel variables. The values you assign by using the dbx patch command are lost when you rebuild the kernel.

Notes

If possible, use the sysconfig command or the Kernel Tuner to modify subsystem attributes instead of using dbx to modify kernel variables. Do not specify erroneous values for kernel variables, because system behavior may be unpredictable. If you want to modify a variable, use only the recommended values described in this manual.

The following example of the dbx patch command changes the current value of the cluster_consec_init variable to 8:

# /usr/ucb/dbx -k /vmunix /dev/mem 
(dbx) patch cluster_consec_init = 8
32767
(dbx)

To ensure that the system is utilizing a new kernel variable value, reboot the system. See the Programmer's Guide for detailed information about the dbx debugger.

You can also use the dbx assign command to modify run-time kernel variable values. However, the modifications are lost when you reboot the system.
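For example, assuming a dbx session like the ones shown above, an assign operation looks like the following; the variable and value are illustrative only:

```
# /usr/ucb/dbx -k /vmunix /dev/mem
(dbx) assign vm_page_free_target = 512
```

Unlike the patch command, assign modifies only the running kernel in memory, so the change does not survive a reboot.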

4.5    Solving Common Performance Problems

The following sections provide examples of some common performance problems and solutions:

Each section describes how to detect the problem, the possible causes of the problem, and how to eliminate or diminish the problem.

4.5.1    Application Completes Slowly

Use the following table to detect a slow application completion time and to diagnose the performance problem:

How to detect

Check application log files.

Use the ps command to display information about application processing times and whether an application is swapped out. See Section 6.3.1.

Use process accounting commands to obtain information about process completion times. See accton(8).

Cause Application is poorly written.
Solution Rewrite the application so that it runs more efficiently. See Chapter 7. Use profiling and debugging commands to analyze applications and identify inefficient areas of code. See Section 11.1.
Cause Application is not optimized.
Solution Optimize the application. See Chapter 7.
Cause Application is being swapped out.
Solution

Delay swapping processes. See Section 6.5.1.

Increase the memory available to processes. See Section 6.4.

Reduce an application's use of memory. See Section 11.2.6.

Cause Application requires more memory resources.
Solution

Increase the memory available to processes. See Section 6.4.

Reduce an application's use of memory. See Section 11.2.6.

Cause Insufficient swap space.
Solution Increase the swap space and distribute it across multiple disks. See Section 4.5.3.
Cause Application requires more CPU resources.
Solution Provide more CPU resources to processes. See Section 4.5.4.
Cause Disk I/O bottleneck.
Solution Distribute disk I/O efficiently. See Section 4.5.6.

4.5.2    Excessive Memory Paging

A high rate of paging or a low free page count may indicate that you have inadequate memory for the workload. Avoid paging if you have a large memory system. Use the following table to detect insufficient memory and to diagnose the performance problem:

How to detect

Use the vmstat command to display information about paging and memory consumption. See Section 6.3.2 for more information.

Use the dbx vm_perfsum data structure to display the number of page faults and other memory information. See Section 6.3.4.

Check the UBC paging information by using the dbx vm_perfsum data structure. See Section 6.3.5.

Cause Insufficient memory resources available to processes.
Solution

Reduce an application's use of memory. See Section 11.2.6.

Increase the memory resources that are available to processes. See Section 6.4.
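For example, sampling virtual memory statistics at a fixed interval while the workload runs helps you spot sustained paging activity. The interval syntax is common to most UNIX systems; the Tru64 UNIX column layout is described in Section 6.3.2:

```shell
# Report virtual memory statistics every 5 seconds; watch the
# free-page count and the page-out columns for sustained paging
vmstat 5
```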

4.5.3    Insufficient Swap Space

If you consume all the available swap space, the system will display messages on the console indicating the problem. Use the following table to detect if you have insufficient swap space and to diagnose the performance problem:

How to detect

Invoke the swapon -s command while you are running a normal workload. See Section 6.3.3.
Cause Insufficient swap space for your configuration.
Solution Configure enough swap space for your configuration and workload. See Section 2.3.2.3.
Cause Swap space not distributed.
Solution Distribute the swap load across multiple swap devices to improve performance. See Section 6.2.
Cause Applications are utilizing excessive memory resources.
Solution

Increase the memory available to processes. See Section 6.4.

Reduce an application's use of memory. See Section 11.2.6.

4.5.4    Insufficient CPU Cycles

Although a low CPU idle time can indicate that the CPU is being fully utilized, performance can suffer if the system provides an insufficient number of CPU cycles to processes. Use the following table to detect insufficient CPU cycles and to diagnose the performance problem:

How to detect

Use the vmstat command to display information about CPU system, user, and idle times. See Section 6.3.2 for more information.

Use the kdbx cpustat extension to check CPU usage. See Section 7.1.4.

Cause Excessive CPU demand from applications.
Solution

Optimize applications. See Section 11.2.4.

Use hardware RAID to relieve the CPU of disk I/O overhead. See Section 8.4.

4.5.5    Processes Swapped Out

Swapped out (suspended) processes will decrease system response time and application completion time. Avoid swapping if you have a large memory system or large applications. Use the following table to detect if processes are being swapped out and to diagnose the performance problem:

How to detect

Use the ps command to determine if your system is swapping processes. See Section 6.3.1.

Cause Insufficient memory resources.
Solution

Increase the memory available to processes. See Section 6.4.

Reduce an application's use of memory. See Section 11.2.6.

Cause Swapping occurs too early during page reclamation.
Solution Decrease the rate of swapping. See Section 6.5.1.

4.5.6    Disk Bottleneck

Excessive I/O to only one or a few disks may cause a bottleneck at the over-utilized disks. Use the following table to detect an uneven distribution of disk I/O and to diagnose the performance problem:

How to detect

Use the iostat command to display which disks are being used the most. See Section 8.2.1.

Use the swapon -s command to display the utilization of swap disks. See Section 6.3.3.

Use the volstat command to display information about the LSM I/O workload. See Section 8.3.4.2 for more information.

Use the advfsstat command to display AdvFS disk usage information. See Section 9.3.3.1.

Cause Disk I/O not evenly distributed.
Solution

Use disk striping. See Section 2.5.2.

Distribute disk, swap, and file system I/O across different disks and, optimally, multiple buses. See Section 8.1.
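For example, to see whether I/O is concentrated on one disk, sample the disk statistics at an interval and compare the per-device transfer rates. This is a sketch; device naming and column layout vary by system (see Section 8.2.1):

```shell
# Report I/O statistics every 5 seconds; an over-utilized disk
# shows consistently higher transfer rates than its peers
iostat 5
```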

4.5.7    Poor Disk I/O Performance

Because disk I/O operations are much slower than memory operations, the disk I/O subsystem is often the source of performance problems. Use the following table to detect poor disk I/O performance and to diagnose the performance problem:

How to detect

Use the iostat command to determine if you have a bottleneck at a disk. See Section 8.2.1 for more information.

Check for disk fragmentation by using the AdvFS defragment utility with the -v and -n options.

Check the hit rate of the namei cache with the dbx nchstats data structure. See Section 9.1.2.

Monitor the memory allocated to the UBC by using the dbx vm_perfsum, ufs_getapage_stats, and vm_tune data structures. See Section 6.3.5.

Check UFS clustering with the dbx ufs_clusterstats data structure. See Section 6.3.5.

Check the hit rate of the metadata buffer cache by using the dbx bio_stats data structure. See Section 9.4.2.3.

Use the advfsstat command to monitor the performance of AdvFS domains and filesets. See Section 9.3.3.1.

Cause Disk I/O is not efficiently distributed.
Solution

Use disk striping. See Section 2.5.2.

Distribute disk, swap, and file system I/O across different disks and, optimally, multiple buses. See Section 8.1.

Cause File systems are fragmented.
Solution Defragment file systems. See Section 9.3.4.4 and Section 9.4.3.3.
Cause Maximum open file limit is too small.
Solution Increase the maximum number of open files. See Section 5.5.1.
Cause The namei cache is too small.
Solution Increase the size of the namei cache. See Section 9.2.1.

4.5.8    Poor AdvFS Performance

Use the following table to detect poor AdvFS performance and to diagnose the performance problem:

How to detect

Use the advfsstat command to monitor the performance of AdvFS domains and filesets. See Section 9.3.3.1.

Check for disk fragmentation by using the AdvFS defragment command with the -v and -n options. See defragment(8) for more information.

Cause Single-volume domains are being used.
Solution Use multiple-volume file domains. See Section 9.3.2.1.
Cause File system is fragmented.
Solution Defragment the file system. See Section 9.3.4.4.
Cause There are too few AdvFS buffer cache hits.
Solution

Allocate sufficient memory to the AdvFS buffer cache. See Section 9.3.4.1.

Increase the number of AdvFS buffer hash chains. See Section 9.3.4.2.

Increase the dirty data caching threshold. See Section 9.3.4.5.

Modify the AdvFS device queue limit. See Section 9.3.4.6.

Cause The advfsd daemon is running unnecessarily.
Solution Stop the daemon. See Section 7.2.5.

4.5.9    Poor UFS Performance

Use the following table to detect poor UFS performance and to diagnose the performance problem:

How to detect

Use the dumpfs command to display UFS information. See Section 9.4.2.1.

Check the hit rate of the namei cache with the dbx nchstats data structure. See Section 9.1.2.

Monitor the memory allocated to the UBC by using the dbx vm_perfsum, ufs_getapage_stats, and vm_tune data structures.

Check how effectively the system is clustering and check fragmentation by using the dbx print command to examine the ufs_clusterstats, ufs_clusterstats_read, and ufs_clusterstats_write data structures. See Section 9.4.2.2.

Check the hit rate of the metadata buffer cache by using the dbx bio_stats data structure. See Section 9.4.2.3.

Cause The UBC is too small.
Solution Increase the amount of memory allocated to the UBC. See Section 9.2.3.
Cause The metadata buffer cache is too small.
Solution Increase the size of metadata buffer cache. See Section 9.4.3.1.
Cause The file system fragment size is incorrect.
Solution Make the file system fragment size equal to the block size. See Section 9.4.1.1.
Cause File system is fragmented.
Solution Defragment the file system. See Section 9.4.3.3.

4.5.10    Poor NFS Performance

Use the following table to detect poor NFS performance and to diagnose the performance problem:

How to detect

Use the nfsstat command to display the number of NFS requests and other information. See Section 9.5.1.1.

Use the dbx print nchstats command to determine the namei cache hit rate. See Section 9.1.2.

Use the dbx print bio_stats command to determine the metadata buffer cache hit rate. See Section 9.4.2.3.

Use the dbx print vm_perfsum command to check the UBC hit rate. See Section 6.3.5.

Use the dbx print nfs_sv_active_hist command to display a histogram of the active NFS server threads. See Section 4.4.6.

Use the ps axlmp command to display the number of idle threads. See Section 9.5.2.2 and Section 9.5.2.3.

Cause NFS server threads busy.
Solution Reconfigure the server to run more threads. See Section 9.5.2.2.
Cause Memory resources are not focused on file system caching.
Solution

Increase the number of vnodes on the free list. See Section 9.2.11.

If you are using AdvFS, increase the memory allocated for AdvFS buffer caching. See Section 9.3.4.1.

If you are using AdvFS, increase the memory reserved for AdvFS access structures. See Section 9.3.4.3.

Cause UFS metadata buffer cache hit rate is low.
Solution

Increase the size of the metadata buffer cache. See Section 9.4.3.1.

Increase the size of the namei cache. See Section 9.2.1.

Cause CPU idle time is low.
Solution Use UFS, instead of AdvFS. See Section 9.4.

4.5.11    Poor Network Performance

Use the following table to detect poor network performance and to diagnose the performance problem:

How to detect

Use the netstat command to display information about network collisions and dropped network connections. See Section 10.1.1.

Check the socket listen queue statistics to check the number of pending requests and the number of times the system dropped a received SYN packet. See Section 10.1.2.

Cause The TCP hash table is too small.
Solution Increase the size of the hash table that the kernel uses to look up TCP control blocks. See Section 10.2.1.
Cause The limit for the socket listen queue is too low.
Solution Increase the limit for partial TCP connections on the socket listen queue. See Section 10.2.3.
Cause There are too few outgoing network ports.
Solution Increase the maximum number of concurrent nonreserved, dynamically allocated ports. See Section 10.2.4.
Cause Network connections are becoming inactive too quickly.
Solution Enable TCP keepalive functionality. See Section 10.2.6.
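For example, the per-protocol statistics reported by netstat can reveal pressure on the listen queue; the exact field names vary by system (see Section 10.1.1):

```shell
# Display per-protocol statistics; under the tcp section, look for
# dropped connections, retransmissions, and listen-queue overflows
netstat -s
```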