3    Monitoring Systems and Diagnosing Performance Problems

To identify performance problems or areas where performance is deficient, you must gather a wide variety of performance information.

Some symptoms or indications of performance problems are obvious. For example, applications complete slowly or messages appear on the console indicating that the system is out of resources. Other problems or performance deficiencies are not obvious and can be detected only by monitoring system performance.

This chapter describes how to perform the following tasks:

-  Obtain information about system events (Section 3.1)
-  Use system accounting and disk quotas (Section 3.2)
-  Continuously monitor performance (Section 3.3)
-  Gather performance information (Section 3.4)
-  Profile and debug kernels (Section 3.5)
-  Access and modify kernel subsystems (Section 3.6)

After you identify a performance problem or an area in which performance is deficient, you can identify an appropriate solution. See Chapter 4 for information about improving system performance.

3.1    Obtaining Information About System Events

It is recommended that you set up a routine to continuously monitor system events and to alert you when serious problems occur. Periodically examining event and log files allows you to correct a problem before it affects performance or availability, and helps you diagnose performance problems.

The system event logging facility and the binary event logging facility log system events. The system event logging facility uses the syslog function to log events in ASCII format. The syslogd daemon collects the messages logged from the various kernel, command, utility, and application programs. This daemon then writes the messages to a local file or forwards the messages to a remote system, as specified in the /etc/syslog.conf event logging configuration file. You should periodically monitor these ASCII log files for performance information.
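
For example, an /etc/syslog.conf entry similar to the following directs kernel messages to a local log file (the selector and file name shown are typical defaults but may differ on your system; see syslogd(8) for the configuration syntax):

kern.debug                      /var/adm/syslog.dated/kern.log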

The binary event logging facility detects hardware and software events in the kernel and logs detailed information in binary format records. The binary event logging facility uses the binlogd daemon to collect various event log records. The daemon then writes these records to a local file or forwards the records to a remote system, as specified in the /etc/binlog.conf default configuration file.
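
For example, a default /etc/binlog.conf typically contains an entry similar to the following, which logs all event types at all severity levels to a local file (illustrative; see binlogd(8) for the configuration syntax):

*.*                             /usr/adm/binary.errlog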

You can examine the binary event log files by using the following methods:

-  Use Event Manager (EVM) to obtain and review event information (see Section 3.1.1).
-  Use the DECevent utility to translate, monitor, and analyze the logged events (see Section 3.1.2).

In addition, it is recommended that you configure crash dump support into the system. Significant performance problems may cause the system to crash, and crash dump analysis tools can help you diagnose performance problems.

See the System Administration manual for more information about event logging and crash dumps.

The following sections describe Event Manager and the DECevent utility.

3.1.1    Using Event Manager

Event Manager (EVM) allows you to obtain event information and communicate this information to interested parties for immediate or later action. Event Manager provides the following features:

See the System Administration manual for more information about EVM.

3.1.2    Using DECevent

The DECevent utility continuously monitors system events through the binary event logging facility, decodes events, and tracks the number and the severity of events logged by system devices. DECevent attempts to isolate failing device components and provides a notification mechanism that can warn of potential problems.

DECevent determines if a threshold has been crossed, according to the number and severity of events reported. Depending on the type of threshold crossed, DECevent analyzes the events and notifies users of the events (for example, through mail).

To use DECevent's analysis and notification features, you must register a license; these features may also be available as part of your service agreement. A license is not needed to use DECevent to translate the binary log file to ASCII format.
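
For example, the following command translates the default binary event log file to ASCII and pages through the output (a minimal sketch; see dia(8) for the reporting options that your DECevent version supports):

# dia | more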

See the DECevent Translation and Reporting Utility manual for more information.

3.2    Using System Accounting and Disk Quotas

It is recommended that you set up system accounting, which allows you to obtain information about the resources consumed by each user. Accounting can track the amount of CPU usage and connect time, the number of processes spawned, memory and disk usage, the number of I/O operations, and the number of print operations.

In addition, you should establish Advanced File System (AdvFS) and UNIX File System (UFS) disk quotas to track and control disk usage. Disk quotas allow you to limit the disk space available to users and to monitor disk space usage.
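
For example, the following commands set the quota limits for a hypothetical user named user1 and then report quota usage for all mounted file systems with quotas enabled (a brief UFS sketch; see edquota(8) and repquota(8) for details):

# edquota user1
# repquota -a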

See the System Administration manual for information about system accounting and UFS disk quotas. See the AdvFS Administration manual for information about AdvFS quotas.

3.3    Continuously Monitoring Performance

You may want to set up a routine to continuously monitor system performance. Some monitoring tools will alert you when serious problems occur (for example, by sending mail). Choose a monitoring tool with low overhead, so that the tool itself does not distort the performance information you gather.

The following tools allow you to continuously monitor performance:

Table 3-1:  Tools for Continuous Performance Monitoring

Name Description

Performance Manager

Simultaneously monitors multiple Tru64 UNIX systems, detects performance problems, and performs event notification. See Section 3.3.1 for more information.

Performance Visualizer

Graphically displays the performance of all significant components of a parallel system. Using Performance Visualizer, you can monitor the performance of all the member systems in a cluster. See Section 3.3.2 for more information.

monitor

Collects a variety of performance data on a running system and either displays the information or saves it to a binary file. The monitor utility is available on the Tru64 UNIX Freeware CD-ROM. See ftp://gatekeeper.dec.com/pub/DEC for information.

top

Provides continuous reports on the state of the system, including a list of the processes using the most CPU resources. The top command is available on the Tru64 UNIX Freeware CD-ROM. See ftp://eecs.nwu.edu/pub/top for information.

tcpdump

Continuously monitors the network traffic associated with a particular network service and allows you to identify the source of a packet. See tcpdump(8) for information, and the example following this table.

nfswatch

Continuously monitors all incoming network traffic to a Network File System (NFS) server, and displays the number and percentage of packets received. See nfswatch(8) for information.

xload

Displays the system load average in a histogram that is periodically updated. See xload(1X) for information.

volstat

Provides information about activity on volumes, plexes, subdisks, and disks under LSM control. The volstat utility reports statistics that reflect the activity levels of LSM objects since boot time or since you reset the statistics. See Section 8.4.7.2 for information.

volwatch

Monitors LSM for failures in disks, volumes, and plexes, and sends mail if a failure occurs. See Section 8.4.7.4 for information.
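
For example, the following tcpdump command monitors NFS traffic (port 2049) on one network interface (the interface name tu0 is an assumption; substitute the name of an interface on your system):

# tcpdump -i tu0 port 2049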

The following sections describe the Performance Manager and Performance Visualizer products.

3.3.1    Using Performance Manager

Performance Manager (PM) for Tru64 UNIX allows you to simultaneously monitor multiple Tru64 UNIX systems, so you can detect and correct performance problems. PM can operate in the background, alerting you to performance problems. Monitoring only a local node does not require a PM license. However, a license is required to monitor multiple nodes and clusters.

Performance Manager (PM) is located on Volume 2 of the Associated Products CD-ROM in your distribution kit. To use PM, you must be running the pmgrd daemon. To start PM, invoke the /usr/bin/pmgr command.

PM gathers and displays Simple Network Management Protocol (SNMP) and Extensible SNMP (eSNMP) data for the systems you choose, and allows you to detect and correct performance problems from a central location. PM has a graphical user interface (GUI) that runs locally and displays data from the monitored systems. Use the GUI to choose the systems and data that you want to monitor.

You can customize and extend PM to create and save performance monitoring sessions. Graphs and charts can show hundreds of different system values, including CPU performance, memory usage, disk transfers, file-system capacity, network efficiency, database performance, and AdvFS and cluster-specific metrics. Data archives can be used for high-speed playback or long-term trend analysis.

PM provides comprehensive thresholding, rearming, and tolerance facilities for all displayed metrics. You can set a threshold on every key metric, and specify the PM reaction when a threshold is crossed. For example, you can configure PM to send mail, to execute a command, or to display a notification message.

PM also has performance analysis and system management scripts, as well as cluster-specific and AdvFS-specific scripts. Run these scripts separately to target specific problems, or run them simultaneously to check the overall system performance. The PM analyses include suggestions for eliminating problems. PM can monitor both individual cluster members and an entire cluster concurrently.

See http://www.zso.dec.com/unix/pm/pmweb/index.html for information about Performance Manager.

3.3.2    Using Performance Visualizer

Performance Visualizer is a valuable tool for developers of parallel applications. Because it monitors the performance of several systems simultaneously, it allows you to see the impact of a parallel application on all the systems, and to ensure that the application is balanced across all systems. When problems are identified, you can change the application code and use Performance Visualizer to evaluate the impact of these changes. Performance Visualizer is a Tru64 UNIX layered product and requires a license.

Performance Visualizer also helps you identify overloaded systems, underutilized resources, active users, and busy processes. You can monitor the following:

You can choose to look at all of the hosts in a parallel system or at individual hosts. See the Performance Visualizer documentation for more information.

3.4    Gathering Performance Information

There are various commands and utilities that you can use to gather system performance information. It is important that you gather statistics under a variety of conditions. Comparing sets of data will help you to diagnose performance problems.

For example, to determine how an application affects system performance, you can gather performance statistics without the application running, start the application, and then gather the same statistics. Comparing different sets of data will enable you to identify whether the application is consuming memory, CPU, or disk I/O resources.
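
For example, you could capture the same virtual memory statistics in both states and compare the output files (vmstat is shown here as one possibility; the interval and count values are arbitrary):

# vmstat 5 20 > stats.baseline
(start the application)
# vmstat 5 20 > stats.application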

In addition, you must gather information at different stages of application processing to obtain accurate performance information. For example, an application may be I/O-intensive during one stage and CPU-intensive during another.

To obtain a basic understanding of system performance, invoke the following commands while under a normal workload:

There are many tools that you can use to query subsystems, profile the system kernel and applications, and collect CPU statistics. See the following tables for information:

Kernel profiling and debugging Table 3-2
Memory resource monitoring Table 6-2
CPU monitoring Table 7-1
Disk I/O distribution monitoring Table 8-1
Logical Storage Manager (LSM) monitoring Table 8-7
Advanced File System (AdvFS) monitoring Table 9-3
UNIX File System (UFS) monitoring Table 9-7
Network File System (NFS) monitoring Table 9-9
Network subsystem monitoring Table 10-1
Application profiling and debugging Table 11-1

3.5    Profiling and Debugging Kernels

Table 3-2 describes the tools that you can use to profile and debug the kernel. Detailed information about these profiling and debugging tools is located in the Kernel Debugging manual and in the tools' reference pages.

Table 3-2:  Kernel Profiling and Debugging Tools

Name Use Description

prof

Analyzes profiling data

Analyzes profiling data and produces statistics showing which portions of code consume the most time and where the time is spent (for example, at the routine level, the basic block level, or the instruction level).

The prof command uses as input one or more data files generated by the kprofile, uprofile, or pixie profiling tools. The prof command also accepts profiling data files generated by programs linked with the -p switch of compilers such as cc (see the example following this table). See prof(1) for more information.

kprofile

Produces a program counter profile of a running kernel

Profiles a running kernel using the performance counters on the Alpha chip. You analyze the performance data collected by the tool with the prof command. See kprofile(1) for more information.

dbx

Debugs running kernels, programs, and crash dumps, and examines and temporarily modifies kernel variables

Provides source-level debugging for C, Fortran, Pascal, assembly language, and machine code. The dbx debugger allows you to analyze crash dumps, trace problems in a program object at the source-code level or at the machine code level, control program execution, trace program logic and flow of control, and monitor memory locations.

Use dbx to debug kernels, debug stripped images, examine memory contents, debug multiple threads, analyze user code and applications, display the value and format of kernel data structures, and temporarily modify the values of some kernel variables. See dbx(8) for more information.

kdbx

Debugs running kernels and crash dumps

Allows you to examine a running kernel or a crash dump. The kdbx debugger, a frontend to the dbx debugger, is used specifically to debug kernel code and display kernel data in a readable format. The debugger is extensible and customizable, allowing you to create commands that are tailored to your kernel debugging needs.

You can also use extensions to check resource usage (for example, CPU usage). See kdbx(8) for more information.

ladebug

Debugs kernels and applications

Debugs programs and the kernel and helps locate run-time programming errors. The ladebug symbolic debugger is an alternative to the dbx debugger and provides both command-line and graphical user interfaces and support for debugging multithreaded programs. See the Ladebug Debugger Manual and ladebug(1) for more information.
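
For example, the following commands compile a program with profiling enabled, run it to produce a profiling data file (mon.out), and then display the profile (a minimal sketch of the -p mechanism; the program and file names are illustrative):

# cc -p -o myprog myprog.c
# ./myprog
# prof myprog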

3.6    Accessing and Modifying Kernel Subsystems

The operating system includes various subsystems that are used to define or extend the kernel. Kernel variables control subsystem behavior or track subsystem statistics since boot time.

Kernel variables are assigned default values at boot time. For certain configurations and workloads, especially memory- or network-intensive systems, the default values of some attributes may not be appropriate, so you may need to modify these values to obtain optimal performance.

Although you can use the dbx debugger to directly change variable values on a running kernel, Compaq recommends that you use kernel subsystem attributes to access the kernel variables.

Subsystem attributes are managed by the configuration manager server, cfgmgr. You can display and modify attributes by using the sysconfig and sysconfigdb commands and by using the Kernel Tuner, dxkerneltuner, which is provided by the Common Desktop Environment (CDE). In some cases, you can modify attributes while the system is running. However, these run-time modifications are lost when the system reboots.

The following sections describe how to perform these tasks:

3.6.1    Displaying the Subsystems Configured in the Kernel

Each system includes different subsystems, depending on the configuration and the installed kernel options. For example, all systems include the mandatory subsystems, such as the generic, vm, and vfs subsystems. Other subsystems are optional, such as the Prestoserve subsystem presto.

Use one of the following methods to display the kernel subsystems currently configured in your operating system:

-  Enter the sysconfig -s command.
-  Use the Kernel Tuner, dxkerneltuner, which lists the configured subsystems.

The following example shows how to use the sysconfig -s command to display the subsystems configured in the kernel:

# sysconfig -s
cm: loaded and configured
hs: loaded and configured
ksm: loaded and configured
generic: loaded and configured
io: loaded and configured
ipc: loaded and configured
proc: loaded and configured
sec: loaded and configured
socket: loaded and configured
rt: loaded and configured
bsd_tty: loaded and configured
xpr: loaded and configured
kdebug: loaded and configured
dli: loaded and configured
ffm_fs: loaded and configured
atm: loaded and configured
atmip: loaded and configured
lane: loaded and configured
atmifmp: loaded and configured
atmuni: loaded and configured
atmilmi3x: loaded and configured
uni3x: loaded and configured
bparm: loaded and configured
advfs: loaded and configured
net: loaded and configured
 .
 .
 .

3.6.2    Displaying Current Subsystem Attribute Values

Most kernel subsystems include one or more attributes. These attributes control or monitor some part of the subsystem and are assigned a value at boot time. For example, the vm subsystem includes the vm_page_free_swap attribute, which controls when swapping starts. The socket subsystem includes the sobacklog_hiwat attribute, which monitors the maximum number of pending socket requests.

Kernel subsystem attributes are documented in the reference pages. For example, sys_attrs_advfs(5) includes definitions for all the advfs subsystem attributes. See sys_attrs(5) for more information.

Use one of the following methods to display the current value of an attribute:

-  Enter the sysconfig -q command, specifying the subsystem name and, optionally, one or more attribute names.
-  Use the Kernel Tuner, dxkerneltuner, and select the subsystem whose attribute values you want to display.

The following example shows how to use the sysconfig -q command to display the current values of the vfs subsystem attributes:

# sysconfig -q vfs
vfs:
name_cache_size = 3141
name_cache_hash_size = 512
buffer_hash_size = 2048
special_vnode_alias_tbl_size = 64
bufcache = 3
bufpages = 1958
path_num_max = 64
sys_v_mode = 0
ucred_max = 256
nvnode = 1428
max_vnodes = 49147
min_free_vnodes = 1428
vnode_age = 120
namei_cache_valid_time = 1200
max_free_file_structures = 0
max_ufs_mounts = 1000
vnode_deallocation_enable = 1
pipe_maxbuf_size = 262144
pipe_databuf_size = 8192
pipe_max_bytes_all_pipes = 134217728
noadd_exec_access = 0
fifo_do_adaptive = 1
nlock_record = 10000
smoothsync_age = 30
revoke_tty_only = 1
strict_posix_osync = 0
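
You can limit the query to specific attributes by including their names on the command line. For example (using a value from the previous display):

# sysconfig -q vfs bufcache
vfs:
bufcache = 3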

Note

If you are not actually using a subsystem, the current value of one of its attributes may not reflect a legal value.

3.6.3    Displaying Minimum and Maximum Attribute Values

Each subsystem attribute has a minimum and a maximum value. If you modify an attribute, the new value must fall within this range. However, use the minimum and maximum values with caution; instead of simply choosing an extreme value, use the tuning guidelines described in this manual to determine an appropriate attribute value for your configuration.

Use one of the following methods to display the minimum and maximum allowable values for an attribute:

-  Enter the sysconfig -Q command, specifying the subsystem name and, optionally, one or more attribute names.
-  Use the Kernel Tuner, dxkerneltuner, which displays the minimum and maximum values for each attribute.
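
For example, a command like the following displays attribute metadata, including the minimum and maximum values, for the bufcache attribute of the vfs subsystem (the output format shown is abbreviated and illustrative; see sysconfig(8) for the meaning of each field):

# sysconfig -Q vfs bufcache
vfs:
bufcache - type=INT op=CQ min_val=0 max_val=50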

3.6.4    Modifying Attribute Values at Run Time

Modifying an attribute's current value at run time allows the change to occur immediately, without rebooting the system. Not all attributes support run-time modifications.

Modifications to run-time values are lost when you reboot the system; the attribute values then return to their permanent values. To make a permanent change to an attribute value, see Section 3.6.6.

To determine if an attribute can be tuned at run time, use one of the following methods:

-  Enter the sysconfig -Q command and examine the operation codes reported for the attribute; see sysconfig(8) for the meaning of these codes.
-  See the sys_attrs(5) reference pages, which describe the subsystem attributes.

To modify an attribute's value at run time, use one of the following methods:

-  Enter the sysconfig -r command, specifying the subsystem name and the attribute=value pair.
-  Use the Kernel Tuner, dxkerneltuner, to change the current value.
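
For example, the following command changes the current value of the socket subsystem's somaxconn attribute (this mirrors the boot-time example in Section 3.6.5 and assumes that somaxconn supports run-time modification on your system):

# sysconfig -r socket somaxconn=65535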

Note

Do not specify erroneous values for subsystem attributes, because system behavior may be unpredictable. If you want to modify an attribute, use only the recommended values described in this manual.

To return to the original attribute value, either modify the attribute's current value again or reboot the system.

3.6.5    Modifying Attribute Values at Boot Time

You can set the value of a subsystem attribute at the boot prompt. This will modify the value of the attribute only for the next system boot.

To do this, enter the b -fl i command at the console boot prompt. You are then prompted for the kernel name and options; specify each attribute setting in the form attribute=value. For example, enter the following to set the value of the somaxconn attribute to 65535:

Enter kernel_name [option_1 ... option_n]: vmunix somaxconn=65535

3.6.6    Permanently Modifying Attribute Values

To permanently change the value of an attribute, you must add the new value to the /etc/sysconfigtab file. Do not edit this file manually; use one of the methods described in this section, which maintain the required format.

Note

Before you permanently modify a subsystem attribute, it is recommended that you maintain a record of the original value, in case you need to return to this value.

Use one of the following methods to permanently modify the value of an attribute:

-  Use the sysconfigdb command to add or update the attribute's entry in the /etc/sysconfigtab file.
-  Use the Kernel Tuner, dxkerneltuner, which saves permanent attribute values in the /etc/sysconfigtab file.
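
For example, you might place the new value in a stanza-format file and merge it into the /etc/sysconfigtab file with sysconfigdb (a minimal sketch; the file name is arbitrary, and sysconfigdb(8) describes the exact options):

# cat /tmp/socket_attrs
socket:
        somaxconn = 65535
# sysconfigdb -m -f /tmp/socket_attrs socket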

Note

Do not specify erroneous values for subsystem attributes, because system behavior may be unpredictable. If you want to modify an attribute, use only the recommended values described in this manual.

3.6.7    Displaying and Modifying Kernel Variables by Using the dbx Debugger

Use the dbx debugger to examine the values of kernel variables and data structures, and to modify the current (run-time) values of kernel variables.

Note

In some cases, you must specify the processor number with the dbx print command. For example, to examine the nchstats data structure on a single-processor system, use the dbx print processor_ptr[0].nchstats command.

The following example of the dbx print command displays the current (run-time) value of the maxusers kernel variable:

# /usr/ucb/dbx -k /vmunix /dev/mem 
(dbx) print maxusers
512
(dbx)

Use the dbx patch command to modify the current (run-time) values of kernel variables. The values you assign by using the dbx patch command are lost when you rebuild the kernel.

Notes

If possible, use the sysconfig command or the Kernel Tuner to modify subsystem attributes instead of using dbx to modify kernel variables. Do not specify erroneous values for kernel variables, because system behavior may be unpredictable. If you want to modify a variable, use only the recommended values described in this manual.

The following example of the dbx patch command changes the current value of the cluster_consec_init variable to 8:

# /usr/ucb/dbx -k /vmunix /dev/mem 
(dbx) patch cluster_consec_init = 8
8
(dbx)

To ensure that the system is utilizing a new kernel variable value, reboot the system. See the Programmer's Guide for detailed information about the dbx debugger.

You can also use the dbx assign command to modify run-time kernel variable values. However, the modifications are lost when you reboot the system.
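
For example, the following commands use assign to change the same variable shown in the patch example; the new value takes effect in the running kernel but is lost at the next reboot (a sketch paralleling the earlier examples):

# /usr/ucb/dbx -k /vmunix /dev/mem
(dbx) assign cluster_consec_init = 8
8
(dbx)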