7    Managing CPU Performance

You may be able to improve performance by optimizing CPU resources. This chapter describes how to gather CPU performance information (Section 7.1) and how to improve CPU performance (Section 7.2).

7.1    Gathering CPU Performance Information

Table 7-1 describes the tools you can use to gather information about CPU usage.

Table 7-1:  CPU Monitoring Tools

Each entry lists the tool name, its use, and a description.

sys_check

Analyzes system configuration and displays statistics (Section 4.3)

Creates an HTML file that describes the system configuration, and can be used to diagnose problems. This utility checks kernel variable settings and memory and CPU resources, and provides performance data and lock statistics for SMP systems and kernel profiles.

The sys_check utility performs a basic analysis of your configuration and kernel variable settings, and provides warnings and tuning guidelines if necessary. See sys_check(8) for more information.

ps

Displays CPU and virtual memory usage by processes (Section 6.3.2 and Section 7.1.1)

Displays current statistics for running processes, including CPU usage, the processor and processor set, and the scheduling priority.

The ps command also displays virtual memory statistics for a process, including the number of page faults, page reclamations, and page ins; the percentage of real memory (resident set) usage; the resident set size; and the virtual address size.

Process Tuner

Displays CPU and virtual memory usage by processes

Displays current statistics for running processes. Invoke the Process Tuner graphical user interface (GUI) from the CDE Application Manager to display a list of processes and their characteristics, display the processes running for yourself or all users, display and modify process priorities, or send a signal to a process.

While monitoring processes, you can select parameters to view (percent of CPU usage, virtual memory size, state, and nice priority) and also sort the view.

vmstat

Displays virtual memory and CPU usage statistics (Section 7.1.2)

Displays information about process threads, virtual memory usage (page lists, page faults, page ins, and page outs), interrupts, and CPU usage (percentages of user, system, and idle times). The first report shows the statistics since boot time; each subsequent report shows the statistics for the specified time interval.

monitor

Collects performance data

Collects a variety of performance data on a running system and either displays the information in a graphical format or saves it to a binary file. The monitor command is available on the Tru64 UNIX Freeware CD-ROM. See ftp://gatekeeper.dec.com/pub/DEC for information.

top

Provides continuous reports on the system

Provides continuous reports on the state of the system, including a list of the processes using the most CPU resources. The top command is available on the Tru64 UNIX Freeware CD-ROM. See ftp://eecs.nwu.edu/pub/top for information.

ipcs

Displays IPC statistics

Displays interprocess communication (IPC) statistics for currently active message queues, shared-memory segments, semaphores, remote queues, and local queue headers. The following fields reported by the ipcs -a command can be especially useful: QNUM, CBYTES, QBYTES, SEGSZ, and NSEMS. See ipcs(1) for more information.

uptime

Displays the system load average (Section 7.1.3)

Displays the number of jobs in the run queue for the last 5 seconds, the last 30 seconds, and the last 60 seconds. The uptime command also shows the number of users logged into the system and how long a system has been running.

w

Reports system load averages and user information

Displays the current time, the amount of time since the system was last started, the users logged in to the system, and the number of jobs in the run queue for the last 5 seconds, 30 seconds, and 60 seconds.

The w command also displays information about system users, including login and process information. See w(1) for more information.

xload

Monitors the system load average

Displays the system load average in a histogram that is periodically updated. See xload(1X) for more information.

(kdbx) cpustat

Reports CPU statistics (Section 7.1.4)

Displays CPU statistics, including the percentages of time the CPU spends in various states.

(kdbx) lockstats

Reports lock statistics (Section 7.1.5)

Displays lock statistics for each lock class on each CPU in the system.

The following sections describe some of these commands in detail.

7.1.1    Monitoring CPU Usage by Using the ps Command

The ps command displays a snapshot of the current status of the system processes. You can use it to determine which processes are currently running (and which users own them), their state, and how they use system memory. The command lists processes in order of decreasing CPU usage, so you can identify which processes are using the most CPU time.

See Section 6.3.2 for detailed information about using the ps command to diagnose CPU performance problems.
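
Where ps output is not already ordered by CPU usage, or has been saved to a file, the ordering described above can be reproduced with standard tools. The following is a minimal sketch, not part of the ps command itself. It assumes BSD-style "ps aux" output in which %CPU is the third column; the function name top_cpu is hypothetical.

```shell
# top_cpu [N]: read "ps aux"-style output on standard input and print the
# header line followed by the N processes with the highest %CPU value.
# Assumption: the layout is USER PID %CPU %MEM ..., so %CPU is column 3.
top_cpu() {
    n="${1:-5}"                      # default to the top 5 processes
    input=$(cat)                     # buffer the input so it can be read twice
    printf '%s\n' "$input" | head -n 1
    printf '%s\n' "$input" | tail -n +2 | sort -k3,3nr | head -n "$n"
}

# Hypothetical usage: ps aux | top_cpu 5
```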

7.1.2    Monitoring CPU Statistics by Using the vmstat Command

The vmstat command shows the virtual memory, process, and CPU statistics for a specified time interval. The first line of output displays statistics since boot time; each subsequent line displays statistics for the specified time interval.

An example of the vmstat command is as follows; output is provided in one-second intervals:

# /usr/ucb/vmstat 1
Virtual Memory Statistics: (pagesize = 8192)
procs        memory            pages                       intr        cpu
r  w  u  act  free wire  fault cow zero react pin pout   in  sy  cs  us sy  id
2 66 25  6417 3497 1570  155K  38K  50K    0  46K    0    4 290 165   0  2  98
4 65 24  6421 3493 1570   120    9   81    0    8    0  585 865 335  37 16  48
2 66 25  6421 3493 1570    69    0   69    0    0    0  570 968 368   8 22  69
4 65 24  6421 3493 1570    69    0   69    0    0    0  554 768 370   2 14  84
4 65 24  6421 3493 1570    69    0   69    0    0    0  865  1K 404   4 20  76
 

The following fields are particularly important for CPU monitoring:

See Section 6.3.1 for detailed information about using the vmstat command to diagnose performance problems.

To use the vmstat command to diagnose a CPU performance problem, check the user (us), system (sy), and idle (id) time split. You must understand how your applications use the system to determine the appropriate values for these times. The goal is to keep the CPU as productive as possible. Idle CPU cycles occur when no runnable processes exist or when the CPU is waiting to complete an I/O or memory request.
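
A single vmstat line can be misleading, so it may help to average the user, system, and idle percentages over several intervals. The following is a minimal sketch that assumes us, sy, and id are the last three columns, as in the sample output above, and that the first data line holds the since-boot totals. The function name vmstat_cpu_avg is hypothetical.

```shell
# vmstat_cpu_avg: read vmstat output on standard input and print the average
# us, sy, and id percentages over the interval lines.
# Assumption: us, sy, and id are the last three columns of each data line.
vmstat_cpu_avg() {
    awk '
        $1 !~ /^[0-9]+$/ { next }    # skip the title and column-header lines
        !seen { seen = 1; next }     # skip the first data line (since boot)
        { us += $(NF-2); sy += $(NF-1); id += $NF; n++ }
        END {
            if (n > 0)
                printf "us=%.2f sy=%.2f id=%.2f\n", us / n, sy / n, id / n
        }
    '
}

# Hypothetical usage, assuming vmstat accepts an interval and a count:
#   /usr/ucb/vmstat 1 6 | vmstat_cpu_avg
```

For the sample output shown earlier in this section, the four interval lines average to us=12.75 sy=18.00 id=69.25.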

The following list describes how to interpret the values for user, system, and idle time:

7.1.3    Monitoring the Load Average by Using the uptime Command

The uptime command shows how long a system has been running and the load average. The load average counts the jobs that are waiting for disk I/O, and applications whose priorities have been changed with either the nice or the renice command. The load average numbers give the average number of jobs in the run queue for the last 5 seconds, the last 30 seconds, and the last 60 seconds.

An example of the uptime command is as follows:

# /usr/ucb/uptime
1:48pm  up 7 days,  1:07,  35 users,  load average: 7.12, 10.33, 10.31

The command output displays the current time, the amount of time since the system was last started, the number of users logged into the system, and the load averages for the last 5 seconds, the last 30 seconds, and the last 60 seconds.

From the command output, you can determine whether the load is increasing or decreasing. An acceptable load average depends on your type of system and how it is being used. In general, for a large system, a load of 10 is high, and a load of 3 is low. Workstations should have a load of 1 or 2.
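
The comparison of the most recent and oldest load averages can be sketched as follows. This is an illustration only: it assumes the output contains the text "load average:" (or "load averages:") followed by three comma-separated numbers, as in the example above. The function name load_trend is hypothetical.

```shell
# load_trend: read one line of uptime output on standard input and report
# whether the load is increasing, decreasing, or steady, by comparing the
# first (most recent) load average with the third (oldest).
load_trend() {
    awk -F'load averages?: ' '
        NF >= 2 {
            split($2, avg, /, */)    # avg[1], avg[2], avg[3]
            if (avg[1] + 0 > avg[3] + 0)      print "increasing"
            else if (avg[1] + 0 < avg[3] + 0) print "decreasing"
            else                              print "steady"
        }
    '
}

# Hypothetical usage: /usr/ucb/uptime | load_trend
```

For the example output above (7.12, 10.33, 10.31), the function reports that the load is decreasing.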

If the load is high, use the ps command to see which processes are running. You may want to run some applications during offpeak hours. See Section 6.3.2 for information about the ps command.

You can also lower the priority of applications with the nice or renice command to conserve CPU cycles. See nice(1) and renice(8) for more information.

7.1.4    Checking CPU Usage by Using the kdbx Debugger

The kdbx debugger cpustat extension displays CPU statistics, including the percentages of time the CPU spends in the following states: user, nice, system, idle, and wait.

The cpustat extension to the kdbx debugger can help application developers determine how effectively they are achieving parallelism across the system.

By default, the kdbx cpustat extension displays statistics for all CPUs in the system. For example:

# /usr/bin/kdbx -k /vmunix /dev/mem
(kdbx) cpustat
 Cpu   User (%)    Nice (%) System (%)  Idle (%)   Wait (%)
===== ========== ========== ========== ========== ==========
    0       0.23       0.00       0.08      99.64       0.05
    1       0.21       0.00       0.06      99.68       0.05

See the Kernel Debugging manual and kdbx(8) for more information.
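
When comparing how evenly work is spread across processors, it can be convenient to reduce the cpustat output to one non-idle percentage for each CPU. The following is a minimal sketch that assumes the six-column layout shown above (Cpu, User, Nice, System, Idle, Wait) and output that has been saved to a file. The function name cpu_busy and the file name cpustat.out are hypothetical.

```shell
# cpu_busy: read saved cpustat output on standard input and print, for each
# CPU, the percentage of time that was not idle (100 minus the Idle column).
# Assumption: data lines have six fields and Idle is the fifth.
cpu_busy() {
    awk '
        NF == 6 && $1 ~ /^[0-9]+$/ {
            printf "cpu %d non-idle %.2f%%\n", $1, 100 - $5
        }
    '
}

# Hypothetical usage: cpu_busy < cpustat.out
```

Large differences between the values for individual CPUs may suggest that the workload is not running in parallel as effectively as intended.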

7.1.5    Checking Lock Usage by Using the kdbx Debugger

The kdbx debugger lockstats extension displays lock statistics for each lock class on each CPU in the system, including the following information:

For example:

# /usr/bin/kdbx -k /vmunix /dev/mem
(kdbx) lockstats

See the Kernel Debugging manual and kdbx(8) for more information.

7.2    Improving CPU Performance

A system must be able to efficiently allocate the available CPU cycles among competing processes to meet the performance needs of users and applications. You may be able to improve performance by optimizing CPU usage.

Table 7-2 describes the guidelines for improving CPU performance.

Table 7-2:  Primary CPU Performance Improvement Guidelines

Guideline | Performance Benefit | Tradeoff
Add processors (Section 7.2.1) | Increases CPU resources | Applicable only for multiprocessing systems, and may affect virtual memory performance
Use the Class Scheduler (Section 7.2.2) | Allocates CPU resources to critical applications | None
Prioritize jobs (Section 7.2.3) | Ensures that important applications have the highest priority | None
Schedule jobs at offpeak hours (Section 7.2.4) | Distributes the system load | None
Stop the advfsd daemon (Section 7.2.5) | Decreases demand for CPU power | Applicable only if you are not using the AdvFS graphical user interface
Use hardware RAID (Section 7.2.6) | Relieves the CPU of disk I/O overhead and provides disk I/O performance improvements | Increases costs

The following sections describe how to optimize your CPU resources. If optimizing CPU resources does not solve the performance problem, you may have to upgrade your CPU to a faster processor.

7.2.1    Adding Processors

Multiprocessing systems allow you to expand the computing power of a system by adding processors. Workloads that benefit most from multiprocessing have multiple processes or multiple threads of execution that can run concurrently, such as database management system (DBMS) servers, Internet servers, mail servers, and compute servers.

If a multiprocessing system has only a small percentage of idle time, you may be able to improve its performance by adding processors. See Section 7.1.2 for information about checking idle time.

Before you add processors, you must ensure that a performance problem is not caused by the virtual memory or I/O subsystems. For example, increasing the number of processors will not improve performance in a system that lacks sufficient memory resources.

In addition, increasing the number of processors may increase the demands on your I/O and memory subsystems and could cause bottlenecks.

If you add processors and your system is metadata-intensive (that is, it opens large numbers of small files and accesses them repeatedly), you can improve the performance of synchronous write operations by using Prestoserve (see Section 2.4.8), or by using a RAID controller with a write-back cache (see Section 8.5).

7.2.2    Using the Class Scheduler

Use the Class Scheduler to allocate a percentage of CPU time to specific tasks or applications. This allows you to reserve CPU time for important processes, while limiting CPU usage by less critical processes.

To use class scheduling, group processes into classes and assign each class a percentage of CPU time. You can also manually assign a class to any process.

The Class Scheduler allows you to display statistics on the actual CPU usage for each class.

See the System Administration manual and class_scheduling(4), class_admin(8), runclass(1), and classcntl(2) for more information about the Class Scheduler.

7.2.3    Prioritizing Jobs

You can prioritize jobs so that important applications are run first. Use the nice command to specify the priority for a command. Use the renice command to change the priority of a running process.

See nice(1) and renice(8) for more information.
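
The two commands can be wrapped as shown in the following sketch. It uses the POSIX forms "nice -n increment" and "renice -n increment -p pid"; the function names and the increments of 10 and 5 are hypothetical choices for illustration. Confirm the exact syntax in nice(1) and renice(8) on your system, because some implementations treat the renice value as an absolute priority instead of an increment.

```shell
# run_low_priority CMD [ARGS...]: start a command with its nice value raised
# by 10, so that it yields the CPU to more important work.
run_low_priority() {
    nice -n 10 "$@"
}

# lower_priority PID: raise the nice value of an already running process.
# Assumption: renice accepts the POSIX "-n increment -p pid" form.
lower_priority() {
    renice -n 5 -p "$1"
}

# Hypothetical usage:
#   run_low_priority /usr/local/bin/nightly_report
#   lower_priority 1234
```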

7.2.4    Scheduling Jobs at Offpeak Hours

You can schedule jobs so that they run at offpeak hours (use the at and cron commands) or when the load level permits (use the batch command). This can relieve the load on the CPU and the memory and disk I/O subsystems.

See at(1) and cron(8) for more information.
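
For a job that should run at the same offpeak time every day, a crontab entry is one option. The following sketch builds such an entry from an hour, a minute, and a command, using the standard five-field crontab time format. The function name offpeak_cron_entry and the path in the usage example are hypothetical.

```shell
# offpeak_cron_entry HOUR MINUTE COMMAND...: print a crontab line that runs
# COMMAND every day at HOUR:MINUTE (24-hour clock).
offpeak_cron_entry() {
    hour="$1"; minute="$2"; shift 2
    if [ "$hour" -lt 0 ] || [ "$hour" -gt 23 ] ||
       [ "$minute" -lt 0 ] || [ "$minute" -gt 59 ]; then
        echo "offpeak_cron_entry: time out of range" >&2
        return 1
    fi
    # crontab field order: minute hour day-of-month month day-of-week command
    printf '%s %s * * * %s\n' "$minute" "$hour" "$*"
}

# Hypothetical usage:
#   offpeak_cron_entry 2 30 /usr/local/bin/nightly_report
# prints: 30 2 * * * /usr/local/bin/nightly_report
```

The printed line can then be added to a crontab file by using the crontab command.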

7.2.5    Stopping the advfsd Daemon

The advfsd daemon allows Simple Network Management Protocol (SNMP) clients such as Netview or Performance Manager (PM) to request AdvFS file system information. If you are not using the AdvFS graphical user interface (GUI), you can free CPU resources and prevent the advfsd daemon from periodically scanning disks by stopping the advfsd daemon.

To prevent the advfsd daemon from starting at boot time, rename /sbin/rc3.d/S53advfsd to /sbin/rc3.d/T53advfsd.

To immediately stop the daemon, use the following command:

# /sbin/init.d/advfsd stop
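
The rename step described above can be sketched as a small helper. This is an illustration under stated assumptions, not a supplied utility: the function name disable_boot_script is hypothetical, and the directory defaults to /sbin/rc3.d, as named in this section.

```shell
# disable_boot_script NAME [DIR]: rename a startup script so that it no
# longer runs at boot time, by changing its leading "S" to "T".
# DIR defaults to /sbin/rc3.d.
disable_boot_script() {
    name="$1"
    dir="${2:-/sbin/rc3.d}"
    case "$name" in
        S*) ;;                       # only names that start with S qualify
        *)  echo "disable_boot_script: $name does not start with S" >&2
            return 1 ;;
    esac
    mv "$dir/$name" "$dir/T${name#S}"
}

# Hypothetical usage (as root): disable_boot_script S53advfsd
```

To start the daemon at boot time again, rename the file back to its original name.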

7.2.6    Using Hardware RAID to Relieve the CPU of I/O Overhead

RAID controllers can relieve the CPU of disk I/O overhead, in addition to providing many features that enhance disk I/O performance. See Section 8.5 for more information about hardware RAID.