7.3 Automatic Analysis
Automatic analysis is the analysis of an event as soon as the system generates it (or shortly thereafter), once SEA has captured and decomposed the event, regardless of any interfaces that may be running. No user intervention is required. Automatic analysis is always enabled while the Director is running. The Director is always running unless it has been manually stopped or, during installation, you chose not to start the Director when the system reboots (Tru64 UNIX, HP-UX, Linux, or OpenVMS systems).
Problem reports resulting from automatic analysis are sent to all interfaces and to all recipients that are set up to be notified.
See Chapter 10 for information about setting up notification services.
7.3.1 Scavenge
Automatic analysis processes events as they occur. However, when the Director is stopped, SEA records in the analysis database the last event from the binary log file that was processed. When the system is restarted, SEA checks the database to see which events have been processed and processes all the events that occurred after that point. This operation is referred to as scavenging. The scavenge operation finds events that are still pending processing and ensures that no events are missed, even when the system is restarted. The first time scavenge occurs, it processes the entire event log. Once this is complete, new events are processed as they occur. The scavenge operation occurs four minutes after the Director is started. If the Director is started and stopped within four minutes, no scavenge occurs.
Initially, the entire system event log is read to find any events that can be analyzed. A filter is then applied to the analyzable events. All analyzable events that occurred within a week of the current time are processed.
If there are no analyzable events, the scavenge feature becomes dormant and a marker representing an unsupported system is stored in the automatic analysis database. As long as the unsupported system marker is present on the system, no scavenging occurs. If there is at least one recognized event, scavenging occurs every time the Director is stopped and started.
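The scavenge behavior described above can be pictured with a minimal Python sketch. It is not SEA's implementation: the `Event` and `AnalysisDb` types, the field names, and the `scavenge` function are hypothetical, and the sketch assumes that the one-week filter applies to the first pass over the log while later passes resume from the stored marker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

ONE_WEEK = timedelta(weeks=1)


@dataclass
class Event:
    index: int           # position in the binary event log (hypothetical field)
    timestamp: datetime
    analyzable: bool     # True if analysis rules recognize this event


@dataclass
class AnalysisDb:
    last_processed: Optional[int] = None  # scavenging marker
    unsupported_system: bool = False      # set when no analyzable events exist


def scavenge(log: List[Event], db: AnalysisDb, now: datetime) -> List[Event]:
    """Return the events to analyze in this pass and update the markers."""
    if db.unsupported_system:
        return []  # scavenging stays dormant while the marker is present

    if db.last_processed is None:
        # First scavenge: read the entire log, then keep recent events only.
        candidates = [e for e in log if e.analyzable]
        if not candidates:
            db.unsupported_system = True
            return []
        pending = [e for e in candidates if now - e.timestamp <= ONE_WEEK]
    else:
        # Later scavenges: pick up everything logged after the marker.
        pending = [e for e in log
                   if e.index > db.last_processed and e.analyzable]

    if log:
        db.last_processed = log[-1].index  # remember the last event read
    return pending
```

In this sketch, a first pass over a log containing a ten-day-old event and a two-day-old event returns only the newer one, and a second pass returns only events appended after the marker.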
Scavenging and the Web Interface
If you connect to the Web Interface before scavenging begins, events that arrive while the Web Interface is running will appear in the Real-Time Monitoring view. All the events that arrive before scavenging starts are processed once scavenging begins and any problem reports that result from scavenging also appear in the Real-Time Monitoring view. However, any events that were added to the event log before the Web Interface was started will not appear in the Real-Time Monitoring view.
7.3.2 Reset
Resetting the automatic analysis database can significantly impact the results seen from future analysis.
In rare cases, you may be asked to reset the automatic analysis database as part of troubleshooting an operational problem with SEA. Be aware that resetting the database erases all active callouts and stored analysis data. After resetting, the database only retains the following:
- FRU configuration data for the hardware present
- A scavenging marker indicating the last event read from the system binary event log
Follow these steps to reset the automatic analysis database. For the procedure to work, the database must be uncorrupted and functioning properly:
1. Stop the Director (see Section 3.8).
2. Issue the wsea reset command (only available in the new common syntax).
3. Restart the Director (see Section 3.7).
Why a Reset Affects Future Analysis
A reset clears all active problem reports and storage units. Storage units are records of past events that some rules use for thresholding and multiple event analysis. After a reset, the lack of these records can significantly change analysis results.
For example, SEA can accumulate storage units that count toward satisfaction of a threshold filter. When a reset erases the units, problem reports that occur at the threshold may be delayed (because the count started over) or even completely suppressed.
The scenario usually involves correctable events. SEA generally reports uncorrectable faults when they occur, but correctable events such as intermittent disk read errors may be subject to threshold filtering. In other words, SEA only sends a problem report when enough correctable events occur within a specified time frame. This allows SEA to signal that a device is suspect even though a hard fault has not happened yet.
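To make the effect of a reset on thresholding concrete, here is a small Python sketch of a threshold filter. The `ThresholdFilter` class, the threshold of three events, and the one-hour window are illustrative assumptions, not SEA's actual rules or data structures.

```python
from collections import deque
from datetime import datetime, timedelta


class ThresholdFilter:
    """Report only when enough correctable events fall inside a time window."""

    def __init__(self, threshold: int, window: timedelta):
        self.threshold = threshold
        self.window = window
        self.storage_units = deque()  # timestamps of remembered events

    def record(self, when: datetime) -> bool:
        """Store one event; return True if a problem report would be sent."""
        self.storage_units.append(when)
        # Drop stored events that are older than the window.
        while when - self.storage_units[0] > self.window:
            self.storage_units.popleft()
        return len(self.storage_units) >= self.threshold

    def reset(self) -> None:
        """Erase all stored events, as a database reset would."""
        self.storage_units.clear()


t0 = datetime(2024, 1, 1, 12, 0)
ten_minutes = timedelta(minutes=10)

# Without a reset, the third event in the window crosses the threshold.
no_reset = ThresholdFilter(threshold=3, window=timedelta(hours=1))
no_reset.record(t0)
no_reset.record(t0 + ten_minutes)
reported = no_reset.record(t0 + 2 * ten_minutes)

# With a reset after two events, the same third event starts the count over.
with_reset = ThresholdFilter(threshold=3, window=timedelta(hours=1))
with_reset.record(t0)
with_reset.record(t0 + ten_minutes)
with_reset.reset()
delayed = with_reset.record(t0 + 2 * ten_minutes)
```

Here `reported` is `True` and `delayed` is `False`: the same sequence of correctable events produces a report only when the earlier storage units survive.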
To reduce the impact of resetting, first review recent events (the minimum recommendation is to review the past 24 hours). During the review, look for recurring events, typically correctable errors, that involve any device that has not already been called out in problem reports. These events can indicate suspect devices.
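The review described above could be approximated with a short Python sketch, assuming the recent events have already been exported into simple records. The `LoggedEvent` fields, the `suspect_devices` function, and the minimum count of two are hypothetical and are not part of SEA.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, NamedTuple, Set


class LoggedEvent(NamedTuple):
    timestamp: datetime
    device: str
    correctable: bool


def suspect_devices(
    events: Iterable[LoggedEvent],
    called_out: Set[str],
    now: datetime,
    lookback: timedelta = timedelta(hours=24),
    min_count: int = 2,
) -> Dict[str, int]:
    """Count recurring correctable events per device within the lookback
    period, skipping devices already called out in problem reports."""
    cutoff = now - lookback
    counts = Counter(
        e.device
        for e in events
        if e.correctable and e.timestamp >= cutoff
        and e.device not in called_out
    )
    return {device: n for device, n in counts.items() if n >= min_count}
```

Devices returned by such a check are candidates to note before the reset, because the stored history that would eventually have triggered a problem report for them is about to be erased.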
7.3.3 Disable
If necessary, automatic analysis can be disabled from the CLI as described in Chapter 5. You may want to disable automatic analysis if SEA is running on a platform such as HP-UX or Linux, where a native error log is not currently analyzed.