7 Managing CPU Performance

You may be able to improve performance by optimizing CPU resources. This chapter describes how to perform the following tasks:

Obtain information about CPU performance (Section 7.1)

Improve CPU performance (Section 7.2)

7.1 Gathering CPU Performance Information

Table 7-1 describes the tools you can use to gather information about CPU usage.

Table 7-1: CPU Monitoring Tools

Name	Use	Description
`sys_check`	Analyzes system configuration and displays statistics (Section 4.2)	Creates an HTML file that describes the system configuration, and can be used to diagnose problems. The utility checks kernel variable settings and memory and CPU resources, and provides performance data and lock statistics for SMP systems and kernel profiles. The `sys_check` utility performs a basic analysis of your configuration and kernel variable settings and provides warnings and tuning recommendations if necessary. See `sys_check`(8) for more information.
`ps`	Displays CPU and virtual memory usage by processes (Section 6.3.1)	Displays current statistics for running processes, including CPU usage, the processor and processor set, and the scheduling priority. The `ps` command also displays virtual memory statistics for a process, including the number of page faults, page reclamations, and pageins; the percentage of real memory (resident set) usage; the resident set size; and the virtual address size.
`Process Tuner`	Displays CPU and virtual memory usage by processes	Displays current statistics for running processes. Invoke the Process Tuner from the CDE Application Manager to display a list of processes and their characteristics, display the processes running for yourself or all users, display and modify process priorities, or send a signal to a process. While monitoring processes, you can select parameters to view (percent of CPU usage, virtual memory size, state, and `nice` priority) and also sort the view.
`vmstat`	Displays virtual memory and CPU usage statistics (Section 6.3.2)	Displays information about process threads, virtual memory usage (page lists, page faults, pageins, and pageouts), interrupts, and CPU usage (percentages of user, system, and idle times). The first report shows the statistics since boot time; each subsequent report shows the statistics for the preceding interval.
`monitor`	Collects performance data	Collects a variety of performance data on a running system and either displays the information in a graphical format or saves it to a binary file. The `monitor` command is available on the Tru64 UNIX Freeware CD-ROM. See `ftp://gatekeeper.dec.com/pub/DEC` for information.
`top`	Provides a continuous report on the system	Provides continuous reports on the state of the system, including a list of the processes using the most CPU resources. The `top` command is available on the Tru64 UNIX Freeware CD-ROM. See `ftp://eecs.nwu.edu/pub/top` for information.
`ipcs`	Displays IPC statistics	Displays interprocess communication (IPC) statistics for currently active message queues, shared-memory segments, semaphores, remote queues, and local queue headers. The information provided in the following fields reported by the `ipcs` `-a` command can be especially useful: `QNUM`, `CBYTES`, `QBYTES`, `SEGSZ`, and `NSEMS`. See `ipcs`(1) for more information.
`uptime`	Displays the system load average (Section 7.1.3)	Displays the number of jobs in the run queue for the last 5 seconds, the last 30 seconds, and the last 60 seconds. The `uptime` command also shows the number of users logged into the system and how long a system has been running.
`w`	Reports system load averages and user information	Displays the current time, the amount of time since the system was last started, the users logged in to the system, and the number of jobs in the run queue for the last 5 seconds, 30 seconds, and 60 seconds. The `w` command also displays information about system users, including login and process information. See `w`(1) for more information.
`xload`	Monitors the system load average	Displays the system load average in a histogram that is periodically updated. See `xload`(1X) for more information.
`(kdbx) cpustat`	Reports CPU statistics (Section 7.1.4)	Displays CPU statistics, including the percentages of time the CPU spends in various states.
`(kdbx) lockstats`	Reports lock statistics (Section 7.1.5)	Displays lock statistics for each lock class on each CPU in the system.

The following sections describe some of these commands in detail.

7.1.1 Monitoring CPU Usage by Using the ps Command

The ps command displays a snapshot of the current status of the system processes. You can use it to determine which processes are currently running (and which users own them), their state, and how they use system memory. The command lists processes in order of decreasing CPU usage so you can identify which processes are using the most CPU time.

See Section 6.3.1 for detailed information about using the ps command to diagnose CPU performance problems.

7.1.2 Monitoring CPU Statistics by Using the vmstat Command

The vmstat command shows the virtual memory, process, and CPU statistics for a specified time interval. The first line of output displays statistics since boot time; each subsequent line displays statistics for the preceding interval.

See Section 6.3.2 for detailed information about using the vmstat command to diagnose performance problems.

7.1.3 Monitoring the Load Average by Using the uptime Command

The uptime command shows how long a system has been running and the load average. The load average counts jobs that are waiting for disk I/O, and applications whose priorities have been changed with either the nice or the renice command. The load average numbers give the average number of jobs in the run queue for the last 5 seconds, the last 30 seconds, and the last 60 seconds.

An example of the uptime command is as follows:

# /usr/ucb/uptime
1:48pm  up 7 days,  1:07,  35 users,  load average: 7.12, 10.33, 10.31

The command output displays the current time, the amount of time since the system was last started, the number of users logged into the system, and the load averages for the last 5 seconds, the last 30 seconds, and the last 60 seconds.

From the command output, you can determine whether the load is increasing or decreasing. An acceptable load average depends on your type of system and how it is being used. In general, for a large system, a load of 10 is high, and a load of 3 is low. Workstations should have a load of 1 or 2.
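The trend check described above can be sketched in a few lines of shell. This is a minimal illustration, not a supported tool: the function name `load_trend` is hypothetical, and it assumes the output format shown in the example, where the first load figure is the most recent and the last is the oldest.

```shell
# Hypothetical helper: classify the load trend from one line of uptime output.
# Assumes the line contains "load average: a, b, c" as in the example above.
load_trend() {
    # Drop everything up to the load figures, then compare the most
    # recent value (first field) with the oldest value (third field).
    sed 's/.*load average: *//' | awk -F'[, ]+' '{
        if ($1 + 0 > $3 + 0)      print "increasing"
        else if ($1 + 0 < $3 + 0) print "decreasing"
        else                      print "steady"
    }'
}

echo "1:48pm  up 7 days,  1:07,  35 users,  load average: 7.12, 10.33, 10.31" | load_trend
```

For the sample line, the most recent figure (7.12) is lower than the oldest (10.31), so the sketch reports a decreasing load. On a live system you could pipe the output of `/usr/ucb/uptime` into the function instead.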

If the load is high, use the ps command to see which processes are running. You may want to run some applications during offpeak hours.

You can also lower the priority of applications with the nice or renice command to conserve CPU cycles. See nice(1) and renice(8) for more information.

7.1.4 Checking CPU Usage by Using the kdbx Debugger

The kdbx debugger cpustat extension displays CPU statistics, including the percentages of time the CPU spends in the following states:

Running user-level code

Running system-level code

Running at a priority set with the nice function

Idle

Waiting (idle with input or output pending)

The cpustat extension to the kdbx debugger can help application developers determine how effectively they are achieving parallelism across the system.

By default, the kdbx cpustat extension displays statistics for all CPUs in the system. For example:

# /usr/bin/kdbx -k /vmunix /dev/mem 
(kdbx) cpustat
 Cpu   User (%)    Nice (%) System (%)  Idle (%)   Wait (%)
===== ========== ========== ========== ========== ==========
    0       0.23       0.00       0.08      99.64       0.05
    1       0.21       0.00       0.06      99.68       0.05

See the Kernel Debugging manual and kdbx(8) for more information.
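If you save the cpustat output to a file, a short filter can reduce it to one line per CPU. The following is a sketch only: the function name `cpu_busy` is hypothetical, "busy" is defined here as the sum of the user, nice, and system percentages, and the column order is assumed to match the example above.

```shell
# Hypothetical helper: summarize busy and idle time per CPU from cpustat output.
# Assumes the column order shown above: Cpu, User, Nice, System, Idle, Wait.
cpu_busy() {
    # Process only data rows: six fields, the first of which is a CPU number.
    awk '$1 ~ /^[0-9]+$/ && NF == 6 {
        printf "cpu %d busy %.2f%% idle %.2f%%\n", $1, $2 + $3 + $4, $5
    }'
}

cpu_busy <<'EOF'
 Cpu   User (%)    Nice (%) System (%)  Idle (%)   Wait (%)
===== ========== ========== ========== ========== ==========
    0       0.23       0.00       0.08      99.64       0.05
    1       0.21       0.00       0.06      99.68       0.05
EOF
```

A large spread in busy time between CPUs may suggest that the workload is not spreading evenly across processors.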

7.1.5 Checking Lock Usage by Using the kdbx Debugger

The kdbx debugger lockstats extension displays lock statistics for each lock class on each CPU in the system, including the following information:

Address of the structure

Class of the lock for which lock statistics are being recorded

CPU for which the lock statistics are being recorded

Number of instances of the lock

Number of times that processes have tried to get the lock

Number of times that processes have tried to get the lock and missed

Percentage of time that processes miss the lock

Total time that processes have spent waiting for the lock

Maximum amount of time that a single process has waited for the lock

Minimum amount of time that a single process has waited for the lock

For example:

# /usr/bin/kdbx -k /vmunix /dev/mem 
(kdbx) lockstats

See the Kernel Debugging manual and kdbx(8) for more information.

7.2 Improving CPU Performance

A system must be able to allocate the available CPU cycles efficiently among competing processes to meet the performance needs of users and applications. You may be able to improve performance by optimizing CPU usage.

Table 7-2 describes the recommendations for improving CPU performance.

Table 7-2: Primary CPU Performance Improvement Guidelines

Recommendations	Performance Benefit	Tradeoff
Add processors (Section 7.2.1)	Increases CPU resources	Applicable only to multiprocessing systems; may affect virtual memory performance
Use the Class Scheduler (Section 7.2.2)	Allocates CPU resources to critical applications	None
Prioritize jobs (Section 7.2.3)	Ensures that important applications have the highest priority	None
Schedule jobs at offpeak hours (Section 7.2.4)	Distributes the system load	None
Stop the `advfsd` daemon (Section 7.2.5)	Decreases demand for CPU power	Applicable only if you are not using the AdvFS graphical user interface
Use hardware RAID (Section 7.2.6)	Relieves the CPU of disk I/O overhead and provides disk I/O performance improvements	Increases costs

The following sections describe how to optimize your CPU resources. If optimizing CPU resources does not solve the performance problem, you may have to upgrade your CPU to a faster processor.

7.2.1 Adding Processors

Multiprocessing systems allow you to expand the computing power of a system by adding processors. Workloads that benefit most from multiprocessing have multiple processes or multiple threads of execution that can run concurrently, such as database management system (DBMS) servers, Internet servers, mail servers, and compute servers.

You may be able to improve the performance of a multiprocessing system that has only a small percentage of idle time by adding processors. Before you add processors, you must ensure that a performance problem is not caused by the virtual memory or I/O subsystems. For example, increasing the number of processors will not improve performance in a system that lacks sufficient memory resources.

In addition, increasing the number of processors may increase the demands on your I/O and memory subsystems and could cause bottlenecks.

If you add processors and your system is metadata-intensive (that is, it opens large numbers of small files and accesses them repeatedly), you can improve the performance of synchronous write operations by using Prestoserve (see Section 2.4.8), or by using a RAID controller with a write-back cache (see Section 8.4).

7.2.2 Using the Class Scheduler

Use the Class Scheduler to allocate a percentage of CPU time to specific tasks or applications. This allows you to reserve CPU time for important processes, while limiting CPU usage by less critical processes.

To use class scheduling, group processes into classes and assign each class a percentage of CPU time. You can also manually assign a class to any process.

The Class Scheduler allows you to display statistics on the actual CPU usage for each class.

See class_scheduling(4), class_admin(8), runclass(1), and classcntl(2) for more information about the Class Scheduler.

7.2.3 Prioritizing Jobs

You can prioritize jobs so that important applications are run first. Use the nice command to specify the priority for a command. Use the renice command to change the priority of a running process.

See nice(1) and renice(8) for more information.
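As an illustration, the following sketch starts a command at reduced priority. It assumes the POSIX `nice -n increment` form, in which a larger increment means a lower scheduling priority; the wrapper name `run_low_priority` and the increment of 10 are arbitrary choices for this example. Check nice(1) and renice(8) for the forms your system accepts.

```shell
# Hypothetical wrapper: start a command at lower priority.
# Assumes the POSIX "nice -n increment" form; 10 is an arbitrary increment.
run_low_priority() {
    nice -n 10 "$@"
}

run_low_priority echo "started at reduced priority"

# To lower the priority of a process that is already running, the
# POSIX form of renice is:
#   renice -n 5 -p PID
# where PID is a placeholder for the process ID reported by ps.
```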

7.2.4 Scheduling Jobs at Offpeak Hours

You can schedule jobs so that they run at offpeak hours (use the at and cron commands) or when the load level permits (use the batch command). This can relieve the load on the CPU and the memory and disk I/O subsystems.

See at(1) and cron(8) for more information.
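For recurring jobs, a crontab entry consists of five time fields (minute, hour, day of month, month, day of week) followed by the command. The sketch below prints an entry that runs a command once a day at a chosen hour; the helper name `offpeak_entry` and the path `/usr/local/bin/nightly_report` are hypothetical.

```shell
# Hypothetical helper: print a crontab entry that runs a command daily
# at minute 0 of the given hour.
offpeak_entry() {
    hour=$1
    shift
    # Fields: minute hour day-of-month month day-of-week command
    printf '0 %s * * * %s\n' "$hour" "$*"
}

# Example: run a (hypothetical) report job at 2:00 a.m. every day.
offpeak_entry 2 /usr/local/bin/nightly_report
```

You could add the printed line to a user's crontab file with the `crontab -e` command.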

7.2.5 Stopping the advfsd Daemon

The advfsd daemon allows Simple Network Management Protocol (SNMP) clients such as Netview or Performance Manager (PM) to request AdvFS file system information. If you are not using the AdvFS graphical user interface (GUI), you can free CPU resources and prevent the advfsd daemon from periodically scanning disks by stopping the advfsd daemon.

To prevent the advfsd daemon from starting at boot time, rename /sbin/rc3.d/S53advfsd to /sbin/rc3.d/T53advfsd. To immediately stop the daemon, use the kill -9 pid command, where pid specifies the daemon's process identification number (PID).
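The rename step can be sketched as a small function. The function name `disable_advfsd_start` is hypothetical, and the directory is taken as a parameter only so that the sketch can be tried in a scratch directory before it is run (as root) against `/sbin/rc3.d`.

```shell
# Hypothetical helper: rename the advfsd start script so that it no
# longer runs at boot time (S53advfsd becomes T53advfsd).
disable_advfsd_start() {
    rc_dir=${1:-/sbin/rc3.d}
    if [ -f "$rc_dir/S53advfsd" ]; then
        mv "$rc_dir/S53advfsd" "$rc_dir/T53advfsd"
    fi
}

# Demonstration in a scratch directory, not the real /sbin/rc3.d.
scratch=${TMPDIR:-/tmp}/rc3.d.demo.$$
mkdir -p "$scratch"
: > "$scratch/S53advfsd"
disable_advfsd_start "$scratch"
ls "$scratch"
rm -rf "$scratch"

# To stop a running daemon, find its PID with ps and then, as root:
#   kill -9 PID
```

Renaming the file back to `S53advfsd` restores the original boot-time behavior.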

7.2.6 Using Hardware RAID to Relieve the CPU of I/O Overhead

RAID controllers can relieve the CPU of the disk I/O overhead, in addition to providing many disk I/O performance-enhancing features. See Section 8.4 for more information about hardware RAID.