Chapter 8 Commands

A command is any executable program, such as a shell script or binary file. Performance Manager can execute commands on remote nodes or the local GUI node, and display the output back to the local GUI node.

Performance Manager comes with several performance analysis, AdvFS analysis, cluster analysis and system management commands. You can execute these as they are or modify them to suit your needs. Performance Manager commands can be found below the /var/opt/pm directory.

You can also execute your own commands from Performance Manager by adding commands to the Execute menu, and you can organize your commands in categories. The Configure dialog box is used to integrate your commands with Performance Manager.

Performance Analysis Commands

Performance analysis commands can execute on one node, but analyze data collected from other nodes. Performance Manager's performance analysis commands are scripts that detect performance problems and offer corrective advice in four areas: CPU, memory, network, and disk I/O.

CPU Commands
CPU Analysis

This script determines how efficiently a computer's CPU is being used. High idle time during a heavy load indicates an I/O bottleneck. High system time under a heavy load indicates excessive overhead. If inefficiency is discovered, other scripts can reveal the cause; try the Virtual Memory, Swapping, and Device I/O scripts.

Load Average

This script determines a computer's load average for the last minute, last 5 minutes, and last 15 minutes. The load average is the number of jobs in the run queue. An acceptable load average is 3 to 7 jobs for a large system, 1 to 2 jobs for a workstation. This script also reports if a computer is consumed by a small number of user processes, and lists the top CPU-using processes.
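
The ranges quoted above can be applied to the three load-average figures as in the following minimal Python sketch. It is not the Load Average script itself; the function names and the "light", "acceptable", and "heavy" labels are illustrative assumptions.

```python
import os

# Acceptable run-queue ranges quoted in the text (jobs in the run queue).
ACCEPTABLE_LOAD = {"large": (3.0, 7.0), "workstation": (1.0, 2.0)}


def classify_load(load, system_kind):
    """Return 'light', 'acceptable', or 'heavy' for one load-average figure."""
    low, high = ACCEPTABLE_LOAD[system_kind]
    if load > high:
        return "heavy"
    if load < low:
        return "light"
    return "acceptable"


def report(loads, system_kind):
    """Label the 1-, 5-, and 15-minute averages (hypothetical helper)."""
    labels = ("1 min", "5 min", "15 min")
    return {label: classify_load(value, system_kind)
            for label, value in zip(labels, loads)}


if __name__ == "__main__":
    # os.getloadavg() is available on UNIX-like systems only.
    print(report(os.getloadavg(), "workstation"))
```

A figure above the upper bound is the case worth investigating; the labels for the lower cases are only a convenience of this sketch.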

Memory Commands
Buffer Cache

This script determines if a computer's buffer cache is too large or too small. A too-small cache causes excessive I/O. A too-large cache causes excessive paging and swapping.

Excessive Paging

This script determines if there is excessive paging on a computer by checking the number of free pages, paged out pages, and page faults. Excessive paging can be caused by a new process trying to allocate pages, or by active virtual memory being too large relative to active real memory.

Excessive Swapping

This script displays virtual memory and swap space usage and detects excessive usage.

Memory Shortage

This script determines if a computer has a memory shortage. If there is much swapping during paging, and runnable processes are swapped out while the free list increases, lack of memory could cause desperation swapping (also called thrashing) to occur.

Virtual Memory

This script determines if a computer has virtual memory problems. This script displays swap configurations and the number of free pages, and compares the amounts of physical and virtual memory.

Network Commands
Gateway Errors

This script determines if a computer has excessive gateway errors by looking at the number of bad checksum fields for IP, ICMP, TCP and UDP. Gateway errors should be less than one hundredth of a percent of the total number of packets received.
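
As a rough illustration of the limit above, the following Python sketch sums the bad-checksum counts and compares them with the number of packets received. The function names and the dictionary layout are assumptions made for this example; how the Gateway Errors script gathers and totals its counts is not shown here.

```python
# Limit from the text: bad checksums should stay below 0.01% of packets received.
GATEWAY_ERROR_LIMIT = 0.0001  # one hundredth of a percent, expressed as a fraction


def gateway_error_ratio(bad_checksums, packets_received):
    """Sum the per-protocol bad-checksum counts and divide by total packets."""
    if packets_received <= 0:
        return 0.0
    return sum(bad_checksums.values()) / packets_received


def has_excessive_gateway_errors(bad_checksums, packets_received):
    """True when the ratio reaches or exceeds the documented limit."""
    return gateway_error_ratio(bad_checksums, packets_received) >= GATEWAY_ERROR_LIMIT


# Hypothetical counts for IP, ICMP, TCP, and UDP.
counts = {"ip": 3, "icmp": 0, "tcp": 5, "udp": 2}
print(has_excessive_gateway_errors(counts, 1_000_000))
```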

Network Errors

This script determines if a network node (a computer in a network) has exceeded the acceptable number of network output errors and collisions. This script examines the length of the send queue for all connections, and displays the number of output errors, input errors, and collisions, as well as the number of in and out packets.

Packet Retransmissions

This script determines if a node has excessive network packet retransmissions by looking at the number of retransmissions and bad xids. (Bad xids are packets that return an xid different from the one sent.) Packet retransmissions should be less than 1% of the total number of client NFS calls. Retransmissions increase when you are working with network hardware or all your computers boot at the same time.

Disk I/O Commands
Excessive Transactions

This script displays the transactions per second (tps) and total transactions on each device and reports excessive activity.

File System Analysis

This script determines if there are sufficient inode and file table entries to support the number of system processes. If inode or open file usage is more than 80%, increase the corresponding system parameter until the usage is less than 80%.
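
The 80% rule can be turned into a quick calculation of how large a table must be for a given number of entries in use. The Python sketch below is only an arithmetic illustration; the function names are hypothetical and it does not read or change any system parameter.

```python
USAGE_LIMIT_PERCENT = 80  # threshold quoted in the text


def usage_percent(in_use, table_size):
    """Percentage of table entries currently in use."""
    return 100.0 * in_use / table_size


def minimum_table_size(in_use):
    """Smallest integer size for which in_use / size is strictly below 80%."""
    # in_use / size < 0.8  is equivalent to  size > in_use * 5 / 4
    return in_use * 5 // 4 + 1


# Example: 850 of 1000 entries in use.
print(usage_percent(850, 1000) > USAGE_LIMIT_PERCENT, minimum_table_size(850))
```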

System Management Commands

System management commands perform tasks on the node they are executing on. Performance Manager provides the following system management scripts. To execute one, from the main window's Execute menu choose System Management, then one of the following scripts:

CleanFilesystems

This script cleans full file systems of core files and other user-specified unneeded files.

FileModification

This script determines if files have been modified or accessed.

GrowthOfFiles

This script determines if files are growing faster than a certain rate.

MaintainFiles

This script allows you to perform the following file management tasks:

  • Move files to new file systems
  • Change files' permissions
  • Copy files to new file systems or tapes
  • Change files' user and group ownership
  • Make symbolic links
  • Undelete AdvFS files
  • Delete files
PMArchiver

This script allows you to capture all metric data on one or more nodes without having to monitor the nodes. The archived data can be replayed using Microsoft® Excel or any other graphing tool you create an interface for. PMArchiver also provides you with running averages. You can choose the sample interval for measurement granularity, the number of intervals to average over, and total sample time. The lower limit of the interval (-i) is bound by the time it takes to query the metrics.

Performance Manager will wait while these scripts run, only closing after they have reached completion. If you set a duration longer than the time you wish to run the PM GUI, you can run the scripts outside PM, from a command line.
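
To make the relationship between the sample interval, the number of intervals averaged over, and the total sample time concrete, here is a minimal Python sketch of a running average. It only illustrates the idea; the function names are assumptions and PMArchiver's own calculation may differ.

```python
from collections import deque


def sample_count(total_seconds, interval_seconds):
    """Number of whole samples that fit in the total sample time."""
    return total_seconds // interval_seconds


def running_averages(samples, window):
    """Yield the mean of the most recent `window` samples after each new sample."""
    recent = deque(maxlen=window)  # older samples fall off automatically
    for value in samples:
        recent.append(value)
        yield sum(recent) / len(recent)


# Four samples averaged over a window of two intervals.
print(list(running_averages([10, 20, 30, 40], 2)))
```

Until the window fills, this sketch averages over the samples seen so far.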

PMDeltaArchiver

This script is similar to PMArchiver, but it tracks the delta of COUNTER type metrics, rather than the raw values of GAUGE type metrics.
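
The difference between raw values and deltas can be seen in a short Python sketch. The function name is hypothetical, and the sketch assumes the counter never wraps around or resets between samples.

```python
def counter_deltas(samples):
    """Return the change between consecutive readings of a COUNTER metric."""
    # Pair each reading with the one that follows it and subtract.
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


# Four raw readings give three deltas.
print(counter_deltas([100, 140, 190, 190]))
```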

RCArchiver

The rc_archiver will archive metrics from the snmpd, pmgrd, advfsd, and clstrmond daemons. It assumes the ports for the daemons are 161, 1167, 1163, and 1165 respectively. You will need to modify the script if your daemons run on different ports.

This demonstration script archives the rate in seconds or count per sample of data for a tabular metric that you specify on the command line. You can choose the sample interval, sample duration, archive field delimiter character, the port number of the daemon from which the metrics will be retrieved, and the directory where the archive files will be written.

PingNode

This script pings a node at intervals you set. When the roundtrip ping time between the initiating node and the node specified on the command line exceeds the set threshold, you are notified.

impact_diskmon and impact_procmon

These scripts monitor disks and processes, sending traps when a capacity threshold is crossed or a process has failed. If they are run from the PM GUI, they will close upon completion. If you wish to monitor over a period of time, run them from a command line.

  • impact_diskmon monitors disk partitions for fill percent thresholds.
  • impact_procmon monitors process names that should exist on node_list.
SignalProcess

This script sends the user-specified SIGNAL, in alphabetic or numeric form, to one or more processes. This script allows you to set the following flags:

  • Signal a process directly by entering a process ID.
  • Display all processes for a user and choose which to signal.
  • Display all processes containing a given string and choose which to signal.

If only one process matches your entry when using the grep or user flag, it will be signaled directly.
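
The following Python sketch shows one way to accept a signal in alphabetic or numeric form and to apply the single-match rule described above. The names parse_signal and pick_targets are hypothetical; the sketch is not the SignalProcess script.

```python
import signal


def parse_signal(spec):
    """Accept '15', 'TERM', or 'SIGTERM' and return the matching signal."""
    spec = spec.strip()
    if spec.isdigit():
        return signal.Signals(int(spec))  # raises ValueError if unknown
    name = spec.upper()
    if not name.startswith("SIG"):
        name = "SIG" + name
    return signal.Signals[name]  # raises KeyError if unknown


def pick_targets(matches, choose):
    """Signal a lone match directly; otherwise ask `choose` which to signal."""
    if len(matches) == 1:
        return list(matches)
    return choose(matches)


print(parse_signal("TERM") == parse_signal("15"))
```

The selected process IDs could then be passed one at a time to os.kill(pid, sig).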

DiskUsage

This script creates a report displaying the disk usage of each user on the file system specified. By default the display will be written to standard out. This script allows you to set the following optional flags:

  • Mail the usage report to a user.
  • Write the report to a file.
AddSwapFile

This script allows you to add a UFS partition as additional swap space. The script prompts you for a block special device (such as rz4c on a 4.0x system or dsk1a on a 5.x system), creates an additional swap entry in /etc/fstab, and starts swapping to the newly created swap file. You will be asked to confirm items that alter your current system configuration. The script assumes that the disk is configured into the kernel, has a device special file, and that the in-memory disk label can be read.

Renice

This script alters the scheduling priority of one or more running processes. It allows you to do the following:

  • Set the scheduling priority.
  • Alter the priority of a process ID.
  • Alter the priority of all processes for a given user.
  • Alter the priority of all processes for a given process group ID.
ProcessTree

This script parses the output of the UNIX ps command to give a tree of all processes with child processes tab indented underneath their parents.
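
The technique can be sketched in Python as follows. The sketch assumes ps output with three columns (process ID, parent process ID, and command) preceded by a header line, such as that produced by ps -e -o pid,ppid,comm where the platform supports it; the function names are illustrative and this is not the ProcessTree script.

```python
def parse_ps(output):
    """Parse three-column ps output (pid, ppid, command; header line first)."""
    rows = []
    for line in output.splitlines()[1:]:
        fields = line.split(None, 2)
        if len(fields) == 3:
            rows.append((int(fields[0]), int(fields[1]), fields[2]))
    return rows


def process_tree(processes):
    """Render (pid, ppid, command) tuples as tab-indented lines."""
    pids = {pid for pid, _, _ in processes}
    children = {}
    for pid, ppid, command in processes:
        if ppid != pid:  # a process listed as its own parent is a root
            children.setdefault(ppid, []).append((pid, command))

    lines = []

    def walk(pid, command, depth):
        lines.append("\t" * depth + f"{pid} {command}")
        for child_pid, child_command in sorted(children.get(pid, [])):
            walk(child_pid, child_command, depth + 1)

    # Roots are processes whose parent does not appear in the listing.
    for pid, ppid, command in sorted(processes):
        if ppid not in pids or ppid == pid:
            walk(pid, command, 0)
    return lines


sample = [(1, 0, "init"), (120, 1, "sh"), (130, 120, "ps"), (90, 1, "cron")]
print("\n".join(process_tree(sample)))
```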

filesize_thresh

This script makes an entry in cron to periodically check if a given file or directory has exceeded the specified threshold size. When a threshold is exceeded, mail will be sent to the address given with the -m flag and the cron entry will be removed automatically. Because of cron entry restrictions, the interval is limited to 1, 5, 10, 15, 20, 30, or 60, or to a time_of_day (hh:mm) in 24-hour format.
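
The restriction on intervals follows from how a cron entry names the minutes at which it fires. The Python sketch below assumes the numeric intervals are minutes and shows how each permitted value maps onto the minute and hour fields of a crontab line; the function name is hypothetical and the entry that filesize_thresh actually writes may look different.

```python
ALLOWED_MINUTE_INTERVALS = (1, 5, 10, 15, 20, 30, 60)


def cron_time_fields(interval):
    """Return the 'minute hour' fields of a crontab entry for an interval."""
    if ":" in interval:
        # time_of_day form, hh:mm in 24-hour format
        hour, minute = (int(part) for part in interval.split(":"))
        if not (0 <= hour <= 23 and 0 <= minute <= 59):
            raise ValueError("time of day must be hh:mm in 24-hour format")
        return f"{minute} {hour}"
    minutes = int(interval)  # assumption: numeric intervals are in minutes
    if minutes not in ALLOWED_MINUTE_INTERVALS:
        raise ValueError(f"interval must be one of {ALLOWED_MINUTE_INTERVALS}")
    if minutes == 1:
        return "* *"
    # Each allowed value divides 60, so the firing minutes repeat every hour.
    return ",".join(str(m) for m in range(0, 60, minutes)) + " *"


print(cron_time_fields("15"))
print(cron_time_fields("14:30"))
```

An interval such as 7 minutes cannot be expressed this way, because the firing minutes would not line up from one hour to the next.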

pm_fax

This script faxes a message created from the threshold environmental variables to the specified phone number. This script relies on a properly configured and functioning version of HylaFAX (see http://www.vix.com/hylafax/ for source distribution and build information). The script was tested with hylafax-v3.0pl1. This script relies on the hylafax environmental variables being set.

pm_mail

This script will mail a threshold message read from the threshold environmental variables to the user specified on the command line. If no user is specified, the message will be mailed to root.

pm_pager

This script will send a message based on the threshold environmental variables to the specified pager phone number. This script assumes that you have a properly configured and functioning version of HylaFAX™ (see http://www.vix.com/hylafax/ for source distribution and build information). The script was tested with hylafax-v3.0pl1. This script relies on the hylafax environmental variables being set. The pager of HylaFAX does not appear to work with the SkyTel® SkyPager® service.

pm_shutdown

This script is a wrapper for the UNIX shutdown command that takes a list of machines that will be shut down simultaneously. If a message is not given, a default one will be included in the shutdown invocation.

pm_broadcast

This script is a wrapper for the UNIX rwall command. It writes a message to all users logged on the node(s) specified in the space-separated node list.

Cluster Performance Analysis Commands

Performance Manager provides the following Cluster Performance Analysis commands. To execute one, from the main window's Execute menu choose Cluster Performance Analysis, then one of the following commands:

ClusterLoadAverage

This script determines if a cluster is working under an extreme load (3 jobs in the run queue by default) using metrics retrieved from pmgrd for the last 5 seconds, last 30 seconds, and the last 60 seconds. It also reports if the cluster is consumed by a small number of user processes and lists the top process.

ClusterNodeStatus

This script lists the node members of a cluster maintained by the Connection Manager. When the -s switch is specified, it will list the state of each node in the cluster and notify the user when a node is down or not working properly.

DLMdeadlocks

This script checks to see if the Distributed Lock Manager (DLM) locks and deadlocks exceed thresholds acceptable for a cluster system. It also compares the number of locks received with the number of locks sent to see if they are within a specified percentage of each other.

DLMlocks

This script checks to see if the Distributed Lock Manager (DLM) lock requests and messages are within a certain specified percentage of each other. The lock metrics received are compared to the number of lock metrics sent to see if the result exceeds a specified percentage.
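
A "within a specified percentage" comparison can be sketched as follows. The Python example assumes the difference is measured against the larger of the two counts, which is only one possible definition; the function names are hypothetical and the DLM scripts may compute the percentage differently.

```python
def percent_difference(sent, received):
    """Difference between two counts as a percentage of the larger one."""
    larger = max(sent, received)
    if larger == 0:
        return 0.0
    return 100.0 * abs(sent - received) / larger


def within_percent(sent, received, allowed_percent):
    """True when the two counts differ by no more than allowed_percent."""
    return percent_difference(sent, received) <= allowed_percent


# 1000 sent and 950 received, checked against a 10% allowance.
print(within_percent(1000, 950, 10))
```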

DLMresources

This script checks to see if the Distributed Lock Manager (DLM) resources and locks exceed thresholds acceptable for a cluster system. Threshold checks made include: too many processes currently attached to the DLM, too many locks currently allocated, and too many resources currently allocated.

DRDblockingServerClient

This script checks to see if the Distributed Raw Disk (DRD) block shipping server and client operations exceed thresholds acceptable for a cluster system. These operations include number of opens, closes, reads, writes, and ioctls.

DRDmemoryChannel

This script checks to see if the Distributed Raw Disk (DRD) block shipping client memory channel (MC) operations exceed thresholds acceptable for a cluster system. These operations include the number of reads, writes, and waits over the MC, as well as the number of unaligned reads and writes.

cmon

Wrapper for executing the TruCluster Version 1.0 cmon utility.

asemgr

Wrapper for executing the TruCluster Version 1.0 asemgr utility.

Threshold Management Commands

Threshold management commands can be executed when a threshold is crossed. Performance Manager provides the following threshold management commands. To execute one, from the main window's Execute menu choose Threshold Management, then one of the following commands:

SendFax

This script faxes a message created from the threshold environmental variables to the specified phone number. This script relies on a properly configured and functioning version of HylaFAX; see http://www.vix.com/hylafax/ for source distribution and build information. The script was tested with hylafax-v3.0pl1. This script relies on the hylafax environmental variables being set.

SendPage

This script will send a message based on the threshold environmental variables to the specified pager phone number. This script assumes that you have a properly configured and functioning version of HylaFAX. See http://www.vix.com/hylafax/ for source distribution and build information. The script was tested with hylafax-v3.0pl1. This script relies on the hylafax environmental variables being set. The pager of HylaFAX does not appear to work with the SkyTel SkyPage service.

SendMail

This script will mail a threshold message read from the threshold environmental variables to the user specified on the command line. If no user is specified, the message will be mailed to root.

AdvFS Performance Analysis Commands

Performance Manager provides the following AdvFS Performance Analysis scripts. To execute one, from the main window's Execute menu choose AdvFS Performance Analysis, then one of the following scripts:

AdvFSDomain

This script determines if AdvFS performance can be improved by tuning some parameters. It looks at the percentage of volumes used and checks if there is any uneven usage. The balance command should be used to do any necessary balancing. The AdvFSDomain script can limit the number of volumes if necessary.

AdvFSIO

This script determines if the node has excessive AdvFS I/O problems. It looks at the number of maximum read/write blocks and the I/O write flush threshold value and checks if any of these parameters need tuning.

AdvFSTuner

This script determines if AdvFS performance can be improved by tuning some parameters. It looks at the percentage of volumes used and the buffer cache hit ratio. It checks whether the log needs to be moved to a less used volume and whether the cache needs any tuning.

Command Operations

You can execute, configure, move, add, and delete commands from the Performance Manager GUI. The example Execute dialog box for CPUAnalysis, shown later in this section, illustrates the controls you can set for command execution.

Executing Commands

To run a command on one or more nodes, follow these steps:

  1. Before running scripts on remote nodes, you must have a login ID, and the /.rhosts file on each remote node must give root access to the node running the Performance Manager GUI. Specify both a node alias and a fully qualified domain name. For example:

    gui_node root
    gui_node.usc.edu.com root

  2. If the command does not exist on a remote node, Performance Manager does the following when the command is executed:
    • Copies the command from the node running the GUI to the remote node.
    • Executes the command.
    • Deletes the command on the remote node.
    • Sends any output back to the node running the GUI for display in an output window.
  3. In the main window's nodes area, select the nodes you want to run a command on. (If no nodes are selected, the command runs on the node on which the GUI is running.)
  4. From the main window's Execute menu, choose a command to run. (You can modify these commands and add your own; from the main window's Commands menu, choose Configure.)
  5. If the command takes any flags or arguments, an Execute window opens. Specify the flags and arguments you wish, then click on the OK or Apply button to run the command.
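
Before running commands remotely, it can be useful to confirm that a remote node's /.rhosts file lists the GUI node under both names. The Python sketch below checks the text of such a file; the function names are hypothetical, and the sketch handles only simple "host user" lines, not wildcard or netgroup entries.

```python
def rhosts_allows(rhosts_text, host, user="root"):
    """Return True if a line of the file grants `user` on `host` access."""
    for line in rhosts_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        entry_host = fields[0]
        # A line with no user name is treated here as applying to the
        # account that owns the file.
        entry_user = fields[1] if len(fields) > 1 else None
        if entry_host == host and entry_user in (None, user):
            return True
    return False


def missing_entries(rhosts_text, alias, fqdn, user="root"):
    """List which of the two recommended host names lack an entry."""
    return [name for name in (alias, fqdn)
            if not rhosts_allows(rhosts_text, name, user)]


text = "gui_node root\ngui_node.usc.edu.com root\n"
print(missing_entries(text, "gui_node", "gui_node.usc.edu.com"))
```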

Command Execution Dialog Box
Adding Commands to the Execute Menu

To add your own commands to the Execute menu:

  1. From the main window's Commands menu, choose Configure, which opens the Configure dialog box.

    (Figure: Configure Dialog Box)

  2. From the Category option menu, choose a command category, or choose New to create a new one. Choosing New opens the Script Category Mgmt dialog box (you must click on the word New even if it is already visible). Choose Add Category from the option menu, type a new category in that dialog box, and click on OK. The category you choose is the category the new command will belong to.

    (Figure: Category Management Dialog Box)

  3. From the Operation option menu, choose New Command.
  4. Click in the Command field and type a command name. Use no more than 50 characters consisting of letters, numbers, spaces, commas, underscores (_), and percent signs (%).
  5. Click in the Executable field and type the full path of the command's executable file; for example, /staff3/bin/print_page. Use no more than 50 characters consisting of letters, numbers, commas, periods, slashes (/), underscores (_), and percent signs (%).
  6. Choose Yes or No for displaying the command's output. If you choose Yes, a window opens containing the command's output when the command is run. Click on your choice and the radio button will change to another color.
  7. If the command takes flags, click on the Flag button to open the Flag dialog box.
  8. If the command takes arguments, click on the Argument button to open the Argument dialog box.

The Apply button applies any changes you made. The Reset button clears all the fields in the Configure window. The Close button closes the dialog box without applying any changes.
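
The character and length rules for the Command and Executable fields can be checked before you type an entry into the dialog box. The Python sketch below encodes those rules as regular expressions; it assumes ASCII letters and digits, the function names are hypothetical, and it does not reproduce whatever validation the Configure dialog box itself performs.

```python
import re

# Character sets as listed for the Command and Executable fields.
COMMAND_NAME = re.compile(r"[A-Za-z0-9 ,_%]{1,50}")
# A full path: a leading slash plus up to 49 more permitted characters.
EXECUTABLE = re.compile(r"/[A-Za-z0-9,./_%]{0,49}")


def valid_command_name(name):
    """True if the name uses only permitted characters and fits in 50."""
    return COMMAND_NAME.fullmatch(name) is not None


def valid_executable(path):
    """True if the path is absolute, permitted, and at most 50 characters."""
    return EXECUTABLE.fullmatch(path) is not None


print(valid_command_name("Print Page"), valid_executable("/staff3/bin/print_page"))
```

Note that a hyphen is not in either permitted set, so a path such as /opt/my-tool would be rejected by this check.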

Deleting Commands from the Execute Menu

Follow this procedure to delete commands:

  1. From the main window's Commands menu, choose Configure, which opens the Configure dialog box.
  2. From the Category option menu, choose the command category containing the command to be deleted.
  3. From the Command List, select the command to be deleted.
  4. From the Operation option menu, choose Delete Command.
  5. Click on the Apply button to delete the command.
Modifying Commands

Follow this procedure to modify a command:

  1. From the main window's Commands menu, choose Configure, which opens the Configure window.
  2. From the Category option menu, choose the command category containing the command to be modified.
  3. From the Command List, select the command to be modified.
  4. From the Operation option menu, choose Modify Command. Make the changes to modify the command.
  5. Click on the Apply button to modify the command.
Adding Command Categories

Follow this procedure to add a command category:

  1. From the main window's Commands menu, choose Script Category Mgmt, which opens the Script Category Mgmt dialog box.
  2. From the option menu, choose Add Category .
  3. Click in the Enter Category field and type the name of the new category.
  4. Click on the OK button.
Deleting Command Categories

Follow this procedure to delete a category:

  1. From the main window's Commands menu, choose Script Category Mgmt, which opens the Script Category Mgmt dialog box.
  2. From the option menu, choose Delete Category.
  3. Click in the Enter Category field and type the name of the category to be deleted.
  4. Click on the OK button.
Moving Commands Between Categories

Follow this procedure to move commands:

  1. From the main window's Commands menu, choose Move, which opens the Move Command dialog box.

    (Figure: Move Dialog Box)

  2. Choose a category from the From menu. The commands in this category will appear in the Command List.
  3. In the Command List, select a command to be moved.
  4. Choose a category from the To menu. This is the category the selected command will be moved into.
  5. Click on the OK or Apply button.
