Chapter 7 Thresholds

A threshold is a limit (high or low) placed on a specific monitored metric. When the limit is exceeded for more than a specified number of consecutive sampling intervals (the threshold's tolerance), the threshold is crossed.

For example, you could set a threshold of 5% maximum CPU time on system processes on all nodes, and give the threshold a tolerance of three. Then, if a node had more than 5% of its CPU time used for system processes for more than 3 consecutive sampling intervals, that threshold would be crossed.
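
The tolerance rule in this example can be sketched as a small shell function. This is only an illustration of the counting logic, not Performance Manager code: the function name `first_crossing` and its argument order are hypothetical, values are whole-number percentages, and only a high limit is handled.

```shell
#!/bin/sh
# Hypothetical helper, not part of Performance Manager.
# Usage: first_crossing LIMIT TOLERANCE SAMPLE...
# Prints the 1-based index of the sample at which a high threshold is
# crossed, or "none" if the samples never cross it.
first_crossing() {
    limit=$1
    tolerance=$2
    shift 2
    run=0      # consecutive samples above the limit so far
    index=0
    for sample in "$@"; do
        index=$((index + 1))
        if [ "$sample" -gt "$limit" ]; then
            run=$((run + 1))
        else
            run=0      # a sample at or below the limit resets the run
        fi
        # Crossed only when the run is longer than the tolerance.
        if [ "$run" -gt "$tolerance" ]; then
            echo "$index"
            return 0
        fi
    done
    echo "none"
}

# 5% limit, tolerance of three, one system-CPU sample per interval.
first_crossing 5 3 4 6 7 8 9    # prints 5
```

With a tolerance of three, the fourth consecutive sample above 5% (the fifth sample overall) is the one that crosses the threshold. A single sample at or below the limit would restart the count.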

You can set thresholds to notify you when they are crossed. The Threshold Notifications dialog box is the default method of notification and provides you with detailed information.

Caution:  Executing resource-intensive commands when a threshold is crossed causes the system load to increase. The increased load can cause more frequent threshold crossings, and in some cases, the threshold crossings are due solely to command execution. This can result in an excessive and continually growing system load.

To avoid this situation, increase the tolerance for the expression being monitored. The command does not execute until the limit has been exceeded for more consecutive sampling intervals than the tolerance allows.

Some other examples of thresholds:

When a threshold is crossed, the following occurs:

The session window displays threshold data along with monitoring data. The displays are managed in the same way, and the type is designated at the beginning of the title bar with a D for displays and a T for thresholds.

Threshold Notifications

The Threshold Notifications dialog box has a list view of threshold activity and a reporting window for information on selected thresholds. There are three action buttons:

Threshold Notifications Dialog Box

Setting Thresholds

Follow this procedure to set a threshold:

  1. Select a node, cluster, or group in the main window's node area.
  2. Click on the Threshold button in the work area.
  3. Select a metric category.
  4. Select the specific metrics for monitoring from the list.
  5. Set the value of the threshold.
  6. Set the rearm point. The rearm point occurs when the metric drops a specified amount below the threshold. If it recrosses the threshold after rearming, another alert will be sent.
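
The rearm behavior in step 6 can be pictured as a two-state cycle: armed, then crossed, then armed again. The sketch below assumes a high threshold, ignores tolerance for brevity, and assumes that the threshold rearms when the metric is at or below the rearm value. The function name `rearm_events` and the output format are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of Performance Manager.
# Usage: rearm_events LIMIT REARM SAMPLE...
# Prints one event per line: "crossed N" or "rearmed N", where N is the
# 1-based index of the sample that caused the event.
rearm_events() {
    limit=$1
    rearm=$2
    shift 2
    state=armed
    index=0
    for sample in "$@"; do
        index=$((index + 1))
        if [ "$state" = armed ] && [ "$sample" -gt "$limit" ]; then
            state=crossed
            echo "crossed $index"
        elif [ "$state" = crossed ] && [ "$sample" -le "$rearm" ]; then
            # Assumption: reaching the rearm value is enough to rearm.
            state=armed
            echo "rearmed $index"
        fi
    done
}

# Limit 90, rearm point 80: the dip to 88 does not rearm the threshold,
# so the later 95 sends no second alert; the drop to 79 does rearm it.
rearm_events 90 80 85 92 88 95 79 91
```

The gap between the limit and the rearm point keeps a metric that hovers around the limit from sending a new alert on every small fluctuation.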

These are the metric categories displayed by default in the threshold work area:

Threshold metric categories
Selecting the More button for a specific metric opens another dialog box for advanced settings (notification methods and additional information).
More... Button
CPU Thresholds

You can set the thresholds on the following CPU metrics:

  • Average Job Loads over Last 5 Seconds
  • Average Job Loads over Last 30 Seconds
  • Average Job Loads over Last 60 Seconds
  • Percentage of CPU Time in User State
  • Percentage of CPU Time in System State
  • Percentage of CPU Time in Idle State
System Thresholds

You can set thresholds for the following system metrics:

  • Rate of Context Switches
  • Rate of Device Interrupts
Processes Thresholds

You can set thresholds for the following processes metrics:

  • Percentage of CPU Use by Top Processes
  • Percentage of CPU Use by Top Users
Buffer Cache Thresholds

You can set thresholds for the following buffer cache metrics:

  • Percentage of Read Misses
Network Thresholds

You can set thresholds for the following network metrics:

  • Percentage of Timeouts for Calls
  • Rate of IP Datagrams Discarded
  • Rate of Ethernet Collisions
  • Rate of ICMP Errors
  • Percentage of Erroneous Outbound Packets
  • Rate of TCP Errors
  • Percentage of Erroneous Inbound Packets
  • Rate of UDP Errors
File System Thresholds

You can set thresholds for the following file system metrics:

  • Percentage of Available File Space
  • Percentage of Free Inodes
Memory Thresholds

You can set thresholds for the following memory metrics:

  • Percentage of Free Paging Memory
  • Number of Free Pages
  • Rate of Page Faults
  • Rate of Processes Swapped Out
  • Rate of Pages Paged Out
  • Percentage of Free Swap Space
AdvFS Thresholds

You can set thresholds for the following AdvFS metrics:

  • AdvFS Agent is Down
  • Percentage of Free Space in Fileset
  • Percentage of Free Space in AdvFS Domains
  • Percentage of Free Space in Domain Volume
  • Percentage of Free Space in Domain
TruCluster Thresholds

You can set thresholds for the following TruCluster metrics:

  • TCR Agent is Down
  • Deadlock Queue
Environmental Thresholds

You can set thresholds for the following environmental metrics:

  • High Temperature Reading
  • Status of Fans
  • Status of Thermal Sensor
  • Status of Power Supplies
Advanced Threshold Dialog (more...) Box

The advanced threshold (more...) dialog box has two sections. Use them for these tasks:

Threshold Notification Methods
  • Choose one or more notification methods by selecting their checkboxes.
    • Threshold Notification Dialog Box (default selection). This displays a dialog box on your screen when a threshold is crossed.
    • Send Email to: Type an address in this field.
    • Execute: Command - Set the Execute toggle. Choose Command to open a pull-down list of command categories, then choose a command from the submenu to open a command execution dialog box.
  • Use the Notification Message text entry field to create your own notification message.
Additional Threshold Information
  • Set the tolerance for this threshold. This is the number of consecutive threshold crossings permitted before a violation is reported.
  • Set the interval for this threshold. This is the sampling rate, or time specified between samples.

Click on OK to save the settings and return to the main window, click on Reset to return the settings to their defaults, or click on Cancel to close the dialog box without saving the settings.

Threshold Environment Variables

Performance Manager sets these environment variables internally so that commands you create can retrieve threshold information. For example, the ./var/opt/pm/Smscripts/pm_mailer script uses this information to send detailed mail about the crossed threshold. Your own shell script can access a value by putting the $ symbol in front of the variable name, for example, $PMTHRESH_DESCRIPTION. These variables are helpful for writing a logging script that tracks threshold crossings and rearms of Performance Manager's metrics.

Environment variable        Description
PMTHRESH_DESCRIPTION        Description of the expression in the database.
PMTHRESH_CURRENT_VALUE      Value that triggered the threshold.
PMTHRESH_THRESHOLD_VALUE    Value that had to be passed to trigger the threshold.
PMTHRESH_NODE               Node on which the triggered threshold was detected.
PMTHRESH_USER_MESSAGE       User message from the Advanced Threshold dialog box.
PMTHRESH_UPDATE_TIME        Update time value from the triggered expression.
PMTHRESH_REARM_VALUE        Value at which the threshold will be rearmed.
PMTHRESH_TOLERANCE_VALUE    Tolerance of the triggered threshold.
PMTHRESH_STATE              String, either crossed or rearmed, corresponding to the triggered event.
PMTHRESH_INSTANCE           Additional information about the triggered threshold, such as which file system or CPU crossed.
PMTHRESH_OPERATOR           Whether the comparison is greater than or less than the threshold value.
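
A logging script of the kind described above might look like the following sketch. The PMTHRESH_ variables are the ones listed in the table. Everything else is an assumption made for illustration: the function name `format_threshold_event`, the layout of the log line, the PM_LOG_FILE variable, and the default log path. How the script is registered as a command for the Execute notification method is not shown here.

```shell
#!/bin/sh
# Hypothetical logging script, not shipped with Performance Manager.

# Builds one log line from the PMTHRESH_ variables. Unset variables
# are logged as "unknown" so the line always has the same fields.
format_threshold_event() {
    printf 'node=%s state=%s metric="%s" instance="%s" value=%s limit=%s rearm=%s\n' \
        "${PMTHRESH_NODE:-unknown}" \
        "${PMTHRESH_STATE:-unknown}" \
        "${PMTHRESH_DESCRIPTION:-unknown}" \
        "${PMTHRESH_INSTANCE:-unknown}" \
        "${PMTHRESH_CURRENT_VALUE:-unknown}" \
        "${PMTHRESH_THRESHOLD_VALUE:-unknown}" \
        "${PMTHRESH_REARM_VALUE:-unknown}"
}

# Write to the log only when a threshold event is present, so running
# the script by hand has no side effects.
# PM_LOG_FILE and the /tmp default are assumptions of this sketch.
if [ -n "${PMTHRESH_STATE:-}" ]; then
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(format_threshold_event)" \
        >> "${PM_LOG_FILE:-/tmp/pm_thresholds.log}"
fi
```

Because PMTHRESH_STATE is either crossed or rearmed, the resulting log shows each crossing and its matching rearm on separate lines, which makes it easy to see how long a threshold stayed crossed.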