About the Monitoring Cycle
Agent execution is controlled by the cron daemon on each server. The high-level steps of a monitoring cycle are as follows.
- Verify that the agent is idle.
- If the previous run of the agent has not finished, allow it to finish. Only one instance of the monitoring agent should be running at any time.
- Load and execute the appropriate device modules, which generate instrumentation reports and health-related events.
- The system generates instrumentation reports by probing the device for all relevant information, and it saves this information. The system then compares the report data to previous reports and evaluates the differences to determine whether health-related events need to be generated.
- Events are also created from information found in log files. For example, all errors and warnings are translated into a log event without further analysis. Most events are generated because a rule or policy in the software concluded that a problem exists, but if the storage array indicates issues in the syslog file, an event is immediately generated.
- Send the generated health-related events to the master agent if they were generated by a slave agent, or send them to all interested parties if they were generated by the master agent.
- The master agent is responsible for generating its own events and collecting events from the slaves. Events can also be aggregated by the master agent before dissemination.
- Note: Aggregated events and events that require action by service personnel (known as actionable events) are also referred to as alarms.
- Store instrumentation reports for future comparison.
- Event logs are accessible from the Administration tab of the user interface. The software updates the database with the necessary statistics. Some events are generated only after a certain threshold is reached. For example, an increase of one in the cyclic redundancy check (CRC) error count of a switch port is not sufficient to trigger an event, because the count must first reach the threshold.
- The Sun Storage Automated Diagnostic Environment monitoring and diagnostic software supports email thresholds that prevent the generation of multiple emails about the same component of the same device. The software keeps track of the number of events already sent in a specified timeframe and suppresses redundant email alarms. Other (non-email) notification recipients do not support this feature.
- Send the events or alarms to interested parties.
- Events are sent only to recipients that have been set up for notification. The types of events can be filtered so that only pertinent events are sent to each individual.
- Note: If they are enabled, the email providers and the Sun Network Storage Command Center (NSCC) receive notification of all events.
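
The idle check in the first step can be illustrated with a lock file. This is only a minimal sketch: the lock path, the function names, and the use of `flock` are assumptions made for illustration, and this document does not describe how the agent actually detects a running instance.

```python
import fcntl
import os
import tempfile

# Hypothetical lock location, chosen only for this example.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "monitoring_agent.lock")


def try_acquire_lock(path=LOCK_PATH):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail at once instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def run_cycle(path=LOCK_PATH):
    """Run one monitoring cycle unless a previous run is still active."""
    lock = try_acquire_lock(path)
    if lock is None:
        return "skipped"  # the previous run has not finished; let it finish
    try:
        # Placeholder for the real work: load modules, probe devices, send events.
        return "ran"
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
        lock.close()
```

A cron job that calls `run_cycle()` on every tick would then skip a cycle whenever the previous one is still holding the lock.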
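
The report comparison and the threshold rule described above can be sketched as follows. The report layout (a dictionary of counters), the key names, and the threshold values are all hypothetical; the real report format and rules are not given in this document.

```python
def compare_reports(previous, current, thresholds):
    """Return events for counters whose increase reaches their threshold.

    previous, current: dicts mapping a counter name to its value.
    thresholds: dict mapping a counter name to the minimum increase
                that should produce an event (defaults to 1).
    """
    events = []
    for key, new_value in current.items():
        if key not in previous:
            continue  # no earlier report to compare against
        delta = new_value - previous[key]
        if delta >= thresholds.get(key, 1):
            events.append({"component": key, "delta": delta})
    return events


# Hypothetical CRC error counters for two switch ports.
previous = {"port1.crc": 10, "port2.crc": 4}
current = {"port1.crc": 11, "port2.crc": 9}
thresholds = {"port1.crc": 5, "port2.crc": 5}

events = compare_reports(previous, current, thresholds)
```

Here `port1.crc` rose by only one, so no event is produced for it, while `port2.crc` rose by five and produces one event.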
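
The email threshold can be pictured as a per-component counter over a sliding time window. The class below is an illustrative sketch only; the class name, its parameters, and the sliding-window approach are assumptions, not the product's implementation.

```python
from collections import defaultdict, deque


class EmailThrottle:
    """Limit emails per (device, component) within a time window."""

    def __init__(self, max_emails, window_seconds):
        self.max_emails = max_emails
        self.window_seconds = window_seconds
        # (device, component) -> timestamps of emails already sent
        self._sent = defaultdict(deque)

    def allow(self, device, component, now):
        """Return True and record the send if an email may go out at time `now`."""
        sent = self._sent[(device, component)]
        # Forget sends that have fallen out of the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_emails:
            return False
        sent.append(now)
        return True
```

For example, `EmailThrottle(2, 3600)` would let two emails about the same switch port through in an hour and suppress further ones until the oldest send ages out of the window.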
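
Event routing and per-recipient filtering might look like the sketch below. The role strings, the event and recipient dictionaries, and the `route_events` function are hypothetical, and the special handling of email providers and the NSCC mentioned in the note above is left out for brevity.

```python
def route_events(events, role, recipients):
    """Return (destination, event) pairs for one monitoring cycle.

    A slave agent forwards everything to the master agent. The master agent
    sends each event only to recipients subscribed to that event type.
    """
    if role == "slave":
        return [("master", event) for event in events]
    deliveries = []
    for event in events:
        for recipient in recipients:
            if event["type"] in recipient["event_types"]:
                deliveries.append((recipient["name"], event))
    return deliveries
```

Keeping the filter on the master side means slave agents need no knowledge of who the recipients are.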