Summary:
This document describes the overall Storage Automated Diagnostic Environment, including the use of daemons and crons, the probing techniques used to monitor devices, the Notification providers and the event generation structure. This document is intended for system administrators and requires some knowledge of Unix (Solaris). It can be used with the 'Getting Started Guide', which describes in detail the functions of the Graphical User Interface. This document refers to Sun storage products using abbreviations; the abbreviation list in Appendix A maps these abbreviations to the full product names.
What is the Storage Automated Diagnostic Environment:
It is a distributed application used to monitor and diagnose Sun storage products, Sun-supported switches and Sun Virtualization products. The main functions are device monitoring, event generation, topology discovery and presentation, diagnostics, revision checking, device/FRU reports and configuration (System Edition). The application depends on agents installed in-band (on the data path) and out-of-band (via ethernet) to do its monitoring. Installing the package on a server adds a cron entry to the server and adds an agent-specific http service to the list of services handled by inetd on that same server. The cron wakes up the agent periodically (the interval is tunable) to probe devices and monitor log files. A configuration file maintained through the graphical user interface (GUI) holds the list of devices that the agent(s) should monitor. One of these agents is designated the master agent; all slave agents report their findings (alerts and events) to the master agent for further processing. Events are generated with Service Advisor content, such as probable cause and recommended action, to help isolate the problem to a single FRU.
The main function of the master agent is to expose this monitoring database (including configuration, instrumentation reports, events, health, topology, etc.) through a GUI and to send all messages to event consumers (called Notification Providers in the GUI) such as SRS. The master GUI centralizes all configuration functions for both master and slave agents; there is no need to point a browser to a slave server to configure that slave agent. Events can be sent as local email to administrators of the site or as alerts and events back to SRS, NetConnect, and the Sun Network Storage Command Center (NSCC). The NSCC is a statistical database used by Sun engineering to discover trends and problems with Sun storage products. Configuring email and providers is also done using the GUI and stored in the configuration file.
The following diagram is an example of a configuration in which a master and a slave work together to monitor two Sun T3 partner groups, one switch and three Sun A5000 arrays:
There are two editions of the Storage Automated Diagnostic Environment: a Device Edition that includes all functions (except for a few Sun-Solution-specific functions) and a Sun-Solution Edition used on the service processor of the 3900/6900/6320 products. The Device Edition (package name SUNWstade) is really a 'SAN' edition since it includes all the topology and SAN aggregation functions. The Sun-Solution Edition (package name SUNWstads) is pre-installed on the service processor of the solution products and includes different features than the Device Edition; it also includes specialized management functions such as the '3900/6900 Configuration' function. These two packages are created from the same code base.
The SUNWstade Installation Life-Cycle:
A typical SUNWstade installation consists of the following steps:
Install SUNWstade on a set of servers. One of them is selected as the master agent, usually because it is already a management station or because it has access to email, is registered with the name server and is easily accessible. The master agent is the one providing the user interface; it is called 'master' even when no slave is present. Each instance of an agent, either master or slave, can monitor devices. Devices can be monitored in-band (usually by slave agents installed on the appropriate server) or out-of-band (from any agent). When log files are available (as in the case of the T3/T4 and the 3310 (Minnow)), it is usually best to install an agent on the server where these logfiles are replicated and monitor the devices out-of-band from this agent. This configuration allows the same agent to see the logfile information, probe the device and correlate the information found. After pkgadd, run /opt/SUNWstade/bin/ras_install to set up the inetd services and crons. ras_install asks a few basic questions such as 'Is this a master or a slave?', 'Where is the master?', 'Do you want SSL security on?', etc.
Initialize the configuration. Access the application by pointing a browser to the host, including the proper port number: 7654 (non-secure) or 7443 (secure). NOTE: The initial login is always username=ras, password=agent, which can be changed after the initial login. Additional users with varied permissions, locale and browser preferences can also be created. The initial configuration consists of entering site information, discovering devices, adding storage devices manually to the configuration, adding local email addresses for event reception and adding notification providers for transmission of events to SRS, SSRR, NetConnect, etc. Most of these functions can also be done from CLI commands for convenience and automation purposes. A 'review-config' report can be executed from the GUI to make a sanity check against the configuration.
Device Discovery. The application monitors the devices included in its configuration file (/opt/SUNWstade/DATA/rasagent.conf). Devices can be added to this file using 'Add Device', 'Discover Devices' or the 'ras_admin' CLI command (/opt/SUNWstade/bin/ras_admin). 'Add Device' is straightforward and usually involves entering the IP of the device. Before the application can add a device to its configuration, it must be able to access and identify the device. Identification usually means finding the port WWN of the device along with the enclosure ID. Device discovery can be automated using the /etc/deviceIP.conf file. This file has a syntax similar to /etc/hosts and is maintained by the system administrator. It contains a list of all devices that should be monitored by the application. See Appendix D for an example of this file. Both the CLI (ras_admin discover_deviceIP) and the GUI can be used to discover devices based on the /etc/deviceIP.conf file.
Topology Discovery. This is really just one more configuration step, but it is a little more complicated. To do a complete topology discovery, every agent (master and slave) must discover its section of the SAN, both in-band and out-of-band, merge this information into a single topology and send this topology to the master agent for further aggregation. The master agent merges all received topologies with its own topology to create a single 'MASTER' topology. The topology created by the application is primarily a physical topology; it includes enclosure information, partner-group information, in-band path information, WWNs, etc. It is saved as the current SAN 'snapshot' and is used in all SAN-related operations until a new SAN topology snapshot is created by the customer. This is available from Admin -> Topo.Maintenance -> Topology Snapshot.
Start the agents. When the package is installed and ras_install is done, the agents may not yet be running. Agents are started from the GUI, usually after device discovery and notification provider initialization. Starting the agents really means that the crons are now active on all agents (master and slaves). This function is available from Admin->general_maintenance->start_agents (refer to the site map, Figure 1). Refer to the online help or the Getting Started Guide for more details about device and provider initialization.
When a device alert occurs, the application notifies the site administrator using email (if configured) and also sends a notification to Sun using one of the remote services (SRS, SSRR, etc.) that were originally configured. Emails sent to the administrator are often sufficient to isolate the problem since they include a probable cause and recommended action. To get a better overall picture of the problem, the site administrator or Sun personnel may want to access the GUI (or CLI) and review the email information in context. This can be done by looking at the device itself (Monitor->Devices), at the topology (Monitor->Topology) or at the complete event log (Monitor->EventLog). See Figures 2, 3 and 4 for examples of these functions. Figure 5 shows a sample email. After reviewing this information, diagnostics can be executed to further isolate the cause of the problem.
Isolate the problem. Diagnostics can be executed from the CLI or from the GUI. The GUI allows users to execute tests remotely using the slave agents. This feature allows the user to start and control tests from one centralized GUI on the master server even when the actual diagnostic test is running on a slave server.
Once the problem is fixed, the user can clear the health of the device in the GUI, recreate a topology if new storage devices were added, and go back to step 5.
Monitoring Strategy:
The monitoring is done by master and slave agents installed on a set of servers. These servers are selected for the following reasons:
The server has in-band access to storage devices (for example, the Sun StorEdge A5000).
The server has access to logfiles such as /var/adm/messages or storage device log files such as /var/adm/messages.t3.
The server has out-of-band access to storage devices that can be monitored out-of-band, such as Sun T3 arrays and Sun switches.
Multiple servers are used to distribute the monitoring load. For example, not all Sun StorEdge T3 arrays need to be monitored from the same agent. Sun StorEdge T3s are often installed in groups and replicate their logfiles (messages.t3) to more than one server. In this case, it is best to install a slave agent on each such server so that the logfile and the corresponding T3s are accessible from the same agent. Refer to the Installation and Configuration Planning guide for more details about configurations.
Monitoring Cycle:
Agent execution is controlled by the cron daemon on each server. The main steps of a monitoring cycle are:
Verify that the agent is running alone; if the previous run of the agent has not finished, allow it to finish. Only one instance of the monitoring agent (/opt/SUNWstade/bin/rasagent) should be running at any one time.
Load and execute all appropriate device modules to generate instrumentation reports and health-related events. Instrumentation reports are generated by probing the device for all relevant information and saving this information in a report stored in /var/opt/SUNWstade/DATA. These reports are compared from one run of the agent to the next to generate health-related events (a minimal sketch of this comparison appears after this list). Events are also created by relaying information found in logfiles. For example, all Errors and Warnings found in /var/adm/messages.t3 are translated into a 'LogEvent' without further analysis. Most events are generated because a rule or policy concluded that a problem exists, but if the T3 indicates a problem in its syslog file, an event is generated immediately. See Appendix B for more details about the commands used to monitor devices.
Send these events to the master agent if they were generated by a slave, or send them to all interested parties if the agent is the master. The master agent is responsible for generating its own events and collecting events from the slaves. Events can also be aggregated on the master before dissemination.
Store instrumentation reports in the DATA directory. NOTE: Event logs are accessible from the GUI under Monitor->Logs (/opt/SUNWstade/DATA/Events.log). The application then updates the state database with the necessary statistics. Some events require that a certain threshold be reached before an event is generated; for example, the CRC count of a switch port going up by one is not sufficient to trigger an event since a certain threshold is required. Another example is email: the application supports email thresholds that can be used to prevent the generation of multiple emails about the same component of the same device. By keeping track of how many events were already sent in a specified timeframe, redundant email alerts can be prevented. NOTE: Other (non-email) providers do not support this feature since it may be important to track all indications that are sent. Most events do not have this problem since they are only sent when the initial state change occurs. For example, if a power supply is lost, an alert is sent for this transition (i.e. power failure) and no more events are sent until the state returns to a good state (i.e. power returned) or enters another state (i.e. power supply removed).
Send the appropriate events to the interested parties. Not all events are sent to everybody. Local administrators can choose what kind of events they want: an administrator can choose the device types of interest, the types of events of interest (e.g. loss of communication) and the level of alerts to receive (e.g. warnings and errors only). NOTE: The Sun SRS provider only receives actionable events (see the event structure below), but the Sun Network Storage Command Center (NSCC, via NetConnect) receives all events.
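As an illustration of the report comparison described in the list above, the following Python sketch shows how two consecutive instrumentation reports could be diffed to produce state-change events. This is not the product's Perl implementation; the report layout and component names are hypothetical.

# Minimal sketch (not the actual SUNWstade code): compare two instrumentation
# reports and emit a StateChangeEvent for every component whose state changed
# between runs.  Report layout and component names are hypothetical.

def compare_reports(previous, current):
    """Return events for every component whose state changed between runs."""
    events = []
    for component, new_state in current.items():
        old_state = previous.get(component)
        if old_state is not None and old_state != new_state:
            # Transition detected, e.g. 'online' -> 'offline'
            events.append({
                "event_type": "StateChangeEvent",
                "component": component,
                "from": old_state,
                "to": new_state,
            })
    return events

previous = {"port.1": "online", "port.2": "online"}
current = {"port.1": "offline", "port.2": "online"}
for event in compare_reports(previous, current):
    print(event)   # only port.1 generates an event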
Event Life-Cycle:
Most events are based on health transitions. When, for example, the state of a device goes from 'online' to 'offline', a health transition occurs. It is the transition from 'online' to 'offline' that generates an event, not the 'offline' value itself. If the state alone were used to generate events, the same events would be generated over and over. Transitions cannot be used when monitoring logfiles, so LogEvents can be very repetitive. This problem is minimized by attaching thresholds to entries in the logfiles: thresholds ensure that a minimum number of logfile entries must occur within a certain time period before an event is generated. The application also includes an 'event-maximums' database that keeps track of the number of events generated about the same subject in the same 8-hour period. This database is used to stop the generation of repetitive events when there is no other way to do it. For example, if the port of a switch were toggling between offline and online every few minutes, the event-maximums database would ensure that this toggling is reported only once every 8 hours instead of every 5 minutes.
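The two throttling mechanisms described above (logfile thresholds and the 8-hour event-maximums window), like the email thresholds mentioned in the monitoring cycle, can be pictured with the following Python sketch. This is illustrative only; the real agent keeps this state in its DATA directory and applies its own rules.

import time

EIGHT_HOURS = 8 * 60 * 60

class EventMaximums:
    """Sketch of an 8-hour suppression window per subject (hypothetical names)."""
    def __init__(self):
        self.last_sent = {}              # subject -> timestamp of last event

    def allow(self, subject, now=None):
        now = now or time.time()
        last = self.last_sent.get(subject)
        if last is not None and now - last < EIGHT_HOURS:
            return False                 # already reported in this 8-hour period
        self.last_sent[subject] = now
        return True

class LogThreshold:
    """Sketch of a logfile threshold: N matching entries within a time window."""
    def __init__(self, count, window):
        self.count, self.window = count, window
        self.hits = []                   # timestamps of matching log entries

    def record(self, now=None):
        now = now or time.time()
        self.hits = [t for t in self.hits if now - t <= self.window] + [now]
        return len(self.hits) >= self.count

maximums = EventMaximums()
print(maximums.allow("switch:sw1/port.1"))   # True: first report
print(maximums.allow("switch:sw1/port.1"))   # False: suppressed for 8 hours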
Events are usually generated using the following rules:
The very first time a device is monitored, a discovery event is generated. It is not actionable and is used to set a monitoring baseline, primarily for NSCC. This event describes in detail the components of the storage device. Every week after discovery, an audit event is generated; it has the same content as the discovery event.
A LogEvent can be generated when interesting information is found in host or storage logfiles. This information is tied to the corresponding storage device when possible and sent to all consumers. These events can be made actionable based on thresholds and sent to SRS, SSRR, NetConnect, etc.
Events are generated when changes are seen between the instrumentation report generated by probing the device and the previous instrumentation report (usually one monitoring cycle old). This is where most events come from: stateChangeEvent, TopologyEvent, alarmEvent, etc. See Appendix E for a complete list of events by device. In the GUI, use Report-> Service Advisor -> Event Advisor to read more about events.
When possible, related events are combined by the master agent to generate Aggregated Events. NOTE: Event aggregation is not enabled by default; when enabled, it automatically aggregates multiple events into a single email that shows the aggregated event as well as the original events used to derive this conclusion.
All events include the following fields:
An Event_type describing the kind of event this is: discovery, LogEvent, stateChangeEvent etc...
A Device_category that corresponds to a device class: a5k, t3, a3500fc, switch, brocade, etc.
An Event_severity: the severity is 0=Notice/Normal, 1=Warning, 2=Error, 3=Error/Critical.
Severities are shown in the GUI using the following icons:
An Actionable flag: an event is actionable when actions are required; these events are sent to Sun SRS, for example. Error and Error/Critical events are actionable, some warnings are actionable but most are not, and notices are not actionable.
An Event_topic: the topic is device specific. For example, on a switch, the event_type could be stateChangeEvent and the topic 'port.1'; this event/topic combination is used to signal, for example, that a switch port went from 'online' to 'offline'.
An Event_Code: This code is a set of 3 numbers separated by periods. It is used to identify the events and is equivalent to device_category.event_type.event_topic but is much shorter. See the Event Advisor for a complete list of codes. Event Codes are often visible in the GUI and allow easy access to the Service Advisor database where causes and actions are stored.
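Taken together, the fields above can be pictured roughly as the following structure. This is an illustrative Python sketch; the field names mirror the list above, and the example values (including the event code) are hypothetical.

from dataclasses import dataclass

@dataclass
class Event:
    # Fields described in the list above; the values used below are illustrative.
    event_type: str        # e.g. "stateChangeEvent"
    device_category: str   # e.g. "switch"
    severity: int          # 0=Notice/Normal, 1=Warning, 2=Error, 3=Error/Critical
    actionable: bool       # True when action is required (these go to SRS)
    topic: str             # device specific, e.g. "port.1"
    event_code: str        # short form of category.event_type.topic

example = Event(
    event_type="stateChangeEvent",
    device_category="switch",
    severity=2,
    actionable=True,
    topic="port.1",
    event_code="7.11.2",   # hypothetical code; see the Event Advisor for real ones
)
print(example)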
Alternate Master:
The application supports the concept of an alternate master. An alternate master is a slave that, on every run of the cron, verifies that the real master is still alive and, when the real master does not respond, takes over some of the responsibilities of the real master. All slaves, including the alternate master, have a copy of the complete configuration. This configuration describes where all the agents are located (IP address, etc.). This information allows the alternate master to contact the slaves and temporarily redirect the flow of events from the real master to the alternate master.
Since the real master is responsible for sending events and email, one of the main functions of the alternate master is to alert the administrator that the master server is no longer operational; this event would otherwise never be sent. The alternate master does not try to become the real master; it remembers which agent is the real master and relinquishes its role as temporary master once communication with the real master is regained. This architecture is meant to deal with a temporary loss of the master agent: if the master agent is permanently removed from the site, a different server should be made the permanent master (by running ras_install again).
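The liveness check performed by the alternate master on each cron run can be sketched as follows, assuming only that the master answers HTTP on the non-secure port 7654. The real agents exchange richer status information than this simple probe, and the hostname used here is hypothetical.

import urllib.error
import urllib.request

def master_is_alive(master_host, port=7654, timeout=10):
    """Return True if the master agent's HTTP service answers at all (sketch only)."""
    url = "http://%s:%d/" % (master_host, port)
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True        # the service answered, even if with an HTTP error code
    except (urllib.error.URLError, OSError):
        return False       # no answer: take over as temporary master

if not master_is_alive("master-host.example.com"):
    print("master unreachable: redirect slave events to the alternate master")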
Product Footprint:
The application was designed to have a very small footprint and to be invisible when not in use. It consists of a cron and an on-demand http service used for browser/slave/master communication.
The software includes a cron that runs every 5 minutes. Every time the cron program starts, it checks the configuration file to see whether it is time to execute the agent. The actual agent frequency can be changed from agent to agent through the GUI. If, for example, the agent frequency is changed to 30 minutes, the cron aborts 5 times out of 6. This cron agent (/opt/SUNWstade/bin/rasagent) runs on both master and slave agents and is a Perl program that can grow to approximately 15 MB of memory. The software package does not include Perl, so a version of Perl (5.005 or later) must be present on the server for the application to work. When running, the cron agent stores device-specific information in the /opt/SUNWstade/DATA directory, and its process size is not affected by the number of devices being monitored: once the monitoring of a device is completed, instrumentation data is stored on disk and erased from memory.
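The frequency gate described above (the cron fires every 5 minutes, but the agent runs only at its configured frequency) can be sketched as follows. The state-file location and the 30-minute value are illustrative only, not the application's actual mechanism.

import os
import time

STATE_FILE = "/var/tmp/rasagent.lastrun"     # hypothetical location of the last-run marker
FREQUENCY = 30 * 60                          # configured agent frequency, e.g. 30 minutes

def should_run(now=None):
    """Return True only when the configured interval has elapsed since the last run."""
    now = now or time.time()
    try:
        last_run = os.path.getmtime(STATE_FILE)
    except OSError:
        last_run = 0                         # never ran before
    if now - last_run < FREQUENCY:
        return False                         # abort this cron invocation
    with open(STATE_FILE, "w"):              # touch the marker file
        pass
    return True

if should_run():
    print("time to probe devices")           # the real cron would run the agent here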
The cron agent is only used to probe devices and generate events; it does not provide access to the GUI. That is done by an http service usually installed on ports 7654 and 7443 (secure). This program, /opt/SUNWstade/rashttp, is started from inetd and stays in memory for as long as a user requires the GUI. rashttp has a timeout period (default 30 seconds) after which it exits; this minimizes the number of processes present on the servers. This http service is also a Perl program and its footprint is similar to the cron agent's. It is used to answer http requests coming from browsers or from slaves. Master and slaves use http to share configuration information, topology information, new events, etc.
Security Options:
The package can be installed with security turned on by executing ras_install and answering 'Yes' to the security question. This means that SSL (Secure Socket Layer) is used for transmission of information between the master agent and the browser and between the master agent and the slave agents. The package includes a default certificate that expires in 2008 (located in /opt/SUNWstade/System/certificate.pem), which uses high-grade encryption (RC4 with a 128-bit secret key). When secure mode is used, the URL used to access the master agent is https://hostname:7443; the non-secure URL is http://hostname:7654. Site-specific certificates can be created with the openssl utilities (part of the public-domain OpenSSL product) using a command similar to the following: /usr/local/ssl/bin/openssl req -days 200 -new -nodes -x509 -out new_certificate.pem -keyout new_certificate.pem2. See Appendix C for certificate details.
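To check when a certificate expires, the openssl utility mentioned above can be queried, for example from a small Python wrapper like the one below. The certificate path is the default one shipped with the package; adjust it for site-specific certificates.

import subprocess

CERT = "/opt/SUNWstade/System/certificate.pem"   # default certificate shipped with the package

# 'openssl x509 -noout -enddate' prints the certificate's expiry date.
result = subprocess.run(
    ["openssl", "x509", "-in", CERT, "-noout", "-enddate"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())    # e.g. "notAfter=..." (the default certificate expires in 2008)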
For further security, the application supports multiple logins. These logins can be added by the 'root' login (login 'ras', default password 'agent') along with specific capabilities (guest, admin, expert, test). This allows different users to log in with their own login/password and have a restricted set of functions available in the GUI.
Sun-Solutions:
Sun StorEdge Solutions products, including the Sun 3900/6900 (Indy) and Sun 6320 (Maserati midrange), are logical storage devices created from Sun switches, Sun T3/6120 arrays, Sun Virtualization Engines and a Service Processor. These components are pre-configured into a single product that includes a version of SUNWstads on the service processor (System Edition). The version of SUNWstads in the solution rack can be accessed like any other master agent by pointing a browser to the IP address of the service processor. NOTE: To the outside world (including an external instance of the application), this solution rack is treated as a single device.
In past releases, it was possible to configure the agent on a Sun Solution rack as a slave agent, but this option was removed in release 2.2 for scalability and maintainability purposes. When an agent installed outside of the rack needs to monitor this rack, it discovers the Sun Solution rack as a single device with its own unique icon. In the following graph, the two switches of the Sun Solution rack both have a current error (in red). These errors are represented in the rack icon by two small red boxes, one for each switch slot. To see the detailed topology inside the rack, the user must access the version of the software on the service processor of the 3900 or use the link-and-launch facility available on the master agent (outside the rack). See Figure 3 for an example of a topology with a Sun Solution shown as a single icon.
Notification Providers:
The application supports a variety of Notification providers, including local email, SRS, NetConnect, SNMP traps and SSRR. These providers must be activated manually; this can be done using the GUI or the ras_admin CLI. Information is sent to the providers each time the agent completes its cycle. NOTE: Slave agents send events to the master, and the master sends events to the providers.
Local Email: Local email is used primarily to send event information to local administrators. Multiple email addresses can be entered in the GUI, each with a different filter. When emails are generated, they are aggregated by event severity and by email address (a minimal sketch of this grouping appears after the provider list). This means that one email can contain more than one event, but these events must be of the same severity level: an error and a warning are never combined into a single email. Along with the main event information, the email includes Service Advisor information (information, probable cause and recommended action). Events also include an EventCode that can be used as a lookup key into the Event Advisor database (also available from the GUI).
SRS Provider: This module sends a basic monitoring topology and all actionable events to the SRS console installed at the customer site. SRS only receives actionable events and does not receive all the event information available to local email, for example: only the event source, event description and eventCode are sent. The SRS console must be visible to the master for this provider to work. Communication with the console is done using http.
SSRR Provider: The SSRR provider relies on Unix for communication. When events are available, the sendToSupport program (/va/remote.support/scripts/sendtosupport) is executed with a file containing these events as its argument. Both actionable and non-actionable events are sent to SSRR; actionable events are sent separately and are marked 'beep' (as opposed to 'no-beep' for the rest).
NetConnect Provider: The NetConnect module relies on the SHUTTLE file (/opt/SUNWstade/DATA/SHUTTLE) to communicate with the NetConnect product. There are two SHUTTLE files (SHUTTLE.1 and SHUTTLE.3) to separate actionable events from non-actionable ones. When available, the ncsend program is also executed (package_base/SUNWnc/bin/ncsend). All events are sent to NetConnect. NSCC uses NetConnect to populate its database with events coming from clients.
SunMC. Activating the SunMC module allows the application to send topology and alert information to a SunMC agent. These alerts are visible from the SunMC console. A special 'rasagent' module must be installed on the SunMC agent to receive these alerts. This module is included with SUNWstade (/opt/SUNWstade/System/SunMC/SUNWesraa.tar.gz).
SNMP Traps. SNMP traps can be sent for actionable events and can be received by any management application that can receive traps.
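As noted in the Local Email description above, emails are aggregated by severity and by email address. The following Python sketch illustrates that grouping only; the event structure is simplified and the per-address filtering is not shown.

from collections import defaultdict

def aggregate_for_email(events):
    """Group events so that one email never mixes severities (sketch only).

    Each event is a dict with 'severity' and the list of addresses whose
    filters matched it; the filtering itself is not shown here.
    """
    batches = defaultdict(list)                  # (address, severity) -> events
    for event in events:
        for address in event["recipients"]:
            batches[(address, event["severity"])].append(event)
    return batches

events = [
    {"severity": 2, "summary": "switch port.1 offline", "recipients": ["admin@site"]},
    {"severity": 1, "summary": "t3 battery warning", "recipients": ["admin@site"]},
    {"severity": 2, "summary": "t3 disk failed", "recipients": ["admin@site"]},
]
for (address, severity), batch in aggregate_for_email(events).items():
    print(address, "severity", severity, "->", len(batch), "event(s) in one email")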
FIGURES
Figure1: Site Map:
This page shows all available functions. The page is generated dynamically and can change based on the edition of the application and the capabilities of the user who is logged in to the application (e.g. a user without permission to run diagnostic tests will not see help information about diagnostics).
Figure2: Monitor Devices:
This page shows the content of the application when 3 frames are used. The top frame is for navigation. The left frame shows the list of monitored devices with their respective health level ('Sev' for severity). The right frame can show 5 pages ([ Summary | Health | Log | Report | Graph ]). The Graph page shows an icon of the selected device (in this case a switch) and all immediate neighbors of this device in the SAN (as a png graphics file). This graph is followed by a list of current health problems with this switch.
Figure3: Topology Graph:
This page can display the topologies generated by each slave and by the master separately, or the combined topology (called MASTER). Topologies can be filtered and grouped for easy access. Icons of the topology can be moved and saved in their new position to generate a more readable layout. Right-clicking on an icon exposes a menu of functions that can be performed on this icon (e.g. use the right mouse button to display a device report or run a diagnostic). Right-clicking away from any icon allows the user to change the zoom level of the diagram. Holding the shift key allows the user to highlight multiple icons at the same time, which is useful when moving icons. In this graph, both devices and links can be marked and clicked. Just like devices, links can be selected (right-click) to see more details about the link status. This topology graph is produced by an applet, but the [ Print ] function can generate a png (a graphic format like gif) representation for easy printing.
Figure 3a: Inside a Sun Solution:
This topology, visible from the Service Processor of a Sun Solution, shows the SP itself along with the external switches, the Virtualization Engines, the internal switches and the storage arrays, in this case three T3s. It also shows the SAN connections between the components of the rack. In this figure, the Sun Solution is called 'wst31'; in the previous figure, it was a different rack called 'sp87'. Sun Solutions come in a variety of models, so the type and the number of components can vary.
Figure 4: Monitor Log:
This Event Log page can be used to display a subset of the event log stored in DATA/Events.log. Events are shown with a link to the Service Advisor for more information about a particular event.
Figure 6: Local Email Notification:
Emails are generated by the master agent and sent to the email addresses entered in the configuration using the GUI. Each email address can have different event filters. Email information can include 'description', 'information', 'probable cause' and 'recommended action'; this example has no 'probable cause' section.
Appendix A: Abbreviation List:
Appendix B: Commands used for monitoring:
This section describes the commands and techniques used to monitor the storage devices supported by the application.
3310 (Minnow): This agent uses the CLI command '/opt/SUNWstade/bin/sccli show <option>'. This command works both in-band and out-of-band; the application uses the same interface in both cases. This command extracts enclosure information and the content of the 3310 message log.
A3500FC: This agent uses the commands of the rm6 package (healthck, lad, rdacutil etc.) . These commands function in-band.
A5000: The luxadm command is used to monitor A5K. It is important to make sure that the latest luxadm patches are installed before installing the package to monitor A5000.
Brocade: The application uses the snmp library (snmpget, snmpwalk) to extract information from brocade switches out-of-band.
D2: luxadm, along with other in-band CLI commands (disk_inquiry, rdbuf, identify and vpd), is used to monitor the D2.
Host: The Host agent also uses luxadm to read LUN and HBA status. It also uses Unix commands (df, showrev, pkginfo) to extract host information.
McData: The application also uses SNMP for McData switches.
Sun switches: For 1-Gb switches, the application uses the sanbox CLI command. For the more recent 2-Gb switches, SNMP is used.
T3: The application uses HTTP queries to extract properties (also called tokens) from the T3 arrays. Sun StorEdge T3 arrays come with a webserver that can be used to monitor the status of the array. The content of the T3 tokens is similar to the output of the 'fru stat', 'fru list', 'vol stat', etc. telnet commands. The content of the messages.t3/messages.6120 logfile is also used: warnings (W: ), errors (E: ) and important notices are monitored by the application.
6120 (T4): Uses the same technique as T3.
FC-TAPE: Luxadm is also used to monitor Fibre Channel Tapes.
V880Disk: The application uses luxadm display to monitor V880Disk in-band.
Message Files: A separate module monitors the /var/adm/messages file. This module saves the 'seek' value of the end of the file and reads only the new entries on the next run (see the sketch after this list). When these new entries are deemed important from a storage point of view, LogEvents are generated. HBA drivers write to this log file.
Sun Virtualization (VE). The Sun Virtualization devices (formerly Vicom) are monitored out-of-band using VE specific commands (showmap, slicview, svstat, mpdrive). Virtualization devices are included with the Sun Solutions 6900 series.
Sun StorEdge 39xx/69xx solution racks: The application monitors a solution rack by communicating with the agent on the service processor of the rack. This communication is HTTP based.
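The seek-based logfile technique described in the Message Files entry above can be sketched as follows. The offset-file location and the filtering strings are illustrative only; the real module applies its own storage-related rules.

import os

LOGFILE = "/var/adm/messages"
OFFSET_FILE = "/var/tmp/messages.offset"     # hypothetical place to remember the seek value

def read_new_entries():
    """Return log lines appended since the previous run, remembering the file offset."""
    try:
        with open(OFFSET_FILE) as f:
            offset = int(f.read().strip() or 0)
    except (OSError, ValueError):
        offset = 0
    size = os.path.getsize(LOGFILE)
    if size < offset:
        offset = 0                           # logfile was rotated or truncated: start over
    with open(LOGFILE, errors="replace") as log:
        log.seek(offset)
        new_lines = log.readlines()
        new_offset = log.tell()
    with open(OFFSET_FILE, "w") as f:
        f.write(str(new_offset))
    return new_lines

for line in read_new_entries():
    if "WARNING" in line:                    # illustrative filter only
        print("candidate LogEvent:", line.rstrip())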
Appendix C: Certificate Details
Appendix D: /etc/deviceIP.conf
This file can only be used with devices that are accessible out-of-band using an IP number. Switches, Sun T3, Sun 6120, Sun 3510 and Sun Solution racks are currently supported.
#IPNO NAME TYPE(optional)
10.10.10.1 t3-b1
10.10.10.2 t3-b2
10.10.10.3 switch-s1
10.10.10.4 switch-s2
10.10.10.5 minnow1 3510
10.10.10.6 indy-1 rack
10.10.10.7 6120-1
10.10.10.8
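A minimal Python sketch of how a file with this syntax could be parsed is shown below; comment and blank lines are skipped and the third column (type) is optional, matching the example above. This is illustrative only, not the application's own parser.

def parse_device_ip(path="/etc/deviceIP.conf"):
    """Return a list of (ip, name, type) tuples; name and type are None when omitted."""
    devices = []
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()   # drop comments and blank lines
            if not line:
                continue
            fields = line.split()
            ip = fields[0]
            name = fields[1] if len(fields) > 1 else None
            dev_type = fields[2] if len(fields) > 2 else None
            devices.append((ip, name, dev_type))
    return devices

for ip, name, dev_type in parse_device_ip():
    print(ip, name, dev_type or "(type determined by probing)")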
Appendix E: Event List
############################ 3310.grid: Sun 3310/3510 ############################ 3310 AlarmEvent Revision 3310 AlarmEvent channel 3310 AlarmEvent enclosure 3310 AlarmEvent fan 3310 AlarmEvent firmware_version 3310 AlarmEvent part 3310 AlarmEvent power 3310 AlarmEvent raid_level 3310 AlarmEvent size 3310 AlarmEvent temperature 3310 AlarmEvent volume 3310 CommunicationEstablishedEvent ib 3310 CommunicationEstablishedEvent oob 3310 CommunicationLostEvent e 3310 CommunicationLostEvent ib 3310 ComponentInsertEvent disk 3310 ComponentInsertEvent power 3310 ComponentRemoveEvent disk 3310 DeviceLostEvent aggregate 3310 DiscoveryEvent enclosure 3310 LocationChangeEvent enclosure 3310 LogEvent cpu 3310 QuiesceEndEvent enclosure 3310 QuiesceStartEvent enclosure 3310 StateChangeEvent+ disk 3310 StateChangeEvent+ volume 3310 StateChangeEvent- disk 3310 StateChangeEvent- volume ############################ 6120.grid: StorEdge 6120 ############################ 6120 AlarmEvent+ power.temp 6120 AlarmEvent- disk.pathstat 6120 AlarmEvent- disk.port 6120 AlarmEvent- disk.temperature 6120 AlarmEvent- interface.loopcard.cable 6120 AlarmEvent- power.battery 6120 AlarmEvent- power.fan 6120 AlarmEvent- power.output 6120 AlarmEvent- power.temp 6120 AlarmEvent cacheMode 6120 AlarmEvent cacheModeBehind 6120 AlarmEvent initiators 6120 AlarmEvent log 6120 AlarmEvent lunPermission 6120 AlarmEvent revision 6120 AlarmEvent system_reboot 6120 AlarmEvent sysvolslice 6120 AlarmEvent time_diff 6120 AlarmEvent volCount 6120 AlarmEvent volOwner 6120 AuditEvent enclosure 6120 CommunicationEstablishedEvent ib 6120 CommunicationEstablishedEvent oob 6120 CommunicationLostEvent ib 6120 CommunicationLostEvent oob 6120 ComponentInsertEvent controller 6120 ComponentInsertEvent disk 6120 ComponentInsertEvent interface.loopcard 6120 ComponentInsertEvent power 6120 ComponentRemoveEvent controller 6120 ComponentRemoveEvent disk 6120 ComponentRemoveEvent interface.loopcard 6120 ComponentRemoveEvent power 6120 DeviceLostEvent aggregate 6120 DiagnosticTest- 6120test 6120 DiagnosticTest- 6120volverify 6120 DiscoveryEvent enclosure 6120 LocationChangeEvent enclosure 6120 LogEvent array_error 6120 LogEvent array_warning 6120 LogEvent controller.port 6120 LogEvent disk 6120 LogEvent disk.log 6120 LogEvent disk.senseKey 6120 LogEvent driver.SSD_WARN 6120 LogEvent power 6120 LogEvent power.refreshBattery 6120 LogEvent power.replaceBattery 6120 LogEvent temp_threshold 6120 QuiesceEndEvent enclosure 6120 QuiesceStartEvent enclosure 6120 StateChangeEvent+ controller 6120 StateChangeEvent+ disk 6120 StateChangeEvent+ interface.loopcard 6120 StateChangeEvent+ power 6120 StateChangeEvent+ volume 6120 StateChangeEvent- controller 6120 StateChangeEvent- disk 6120 StateChangeEvent- interface.loopcard 6120 StateChangeEvent- power 6120 StateChangeEvent- volume 6120 Statistics enclosure ############################ a3500fc.grid: Sun A3500FC ############################ a3500fc AlarmEvent- battery a3500fc AuditEvent enclosure a3500fc CommunicationEstablishedEvent ib a3500fc CommunicationLostEvent ib a3500fc ComponentInsertEvent controller a3500fc ComponentInsertEvent disk a3500fc ComponentRemoveEvent controller a3500fc ComponentRemoveEvent disk a3500fc DeviceLostEvent aggregate a3500fc DiagnosticTest- a3500fctest a3500fc DiscoveryEvent enclosure a3500fc LocationChangeEvent enclosure a3500fc StateChangeEvent+ disk a3500fc StateChangeEvent- controller a3500fc StateChangeEvent- disk ############################ a5k.grid: Sun A5000 ############################ 
a5k AlarmEvent- backplane a5k AlarmEvent- backplane.fan a5k AlarmEvent- disk a5k AlarmEvent- interface.gbic a5k AlarmEvent- interface.iboard a5k AuditEvent enclosure a5k CommunicationEstablishedEvent ib a5k CommunicationLostEvent ib a5k ComponentInsertEvent disk a5k ComponentRemoveEvent disk a5k DeviceLostEvent aggregate a5k DiagnosticTest- a5ksestest a5k DiagnosticTest- a5ktest a5k DiscoveryEvent enclosure a5k LocationChangeEvent enclosure a5k StateChangeEvent+ disk a5k StateChangeEvent+ interface.iboard a5k StateChangeEvent+ power a5k StateChangeEvent- disk a5k StateChangeEvent- interface.iboard a5k StateChangeEvent- power a5k logEvent driver ############################ agent.grid: ############################ agent AgentDeinstallEvent enclosure agent AgentInstallEvent enclosure agent AlarmEvent system_errors agent AlternateMaster+ enclosure agent AlternateMaster- enclosure agent CommunicationEstablishedEvent oob agent CommunicationLostEvent ntc agent CommunicationLostEvent oob agent HeartbeatEvent enclosure ############################ brocade.grid: Brocade switch ############################ brocade AlarmEvent sensor.fan brocade AlarmEvent sensor.power brocade AlarmEvent sensor.temperature brocade AlarmEvent system_reboot brocade AuditEvent enclosure brocade CommunicationEstablishedEvent oob brocade CommunicationLostEvent oob brocade ConnectivityLostEvent aggregate brocade DeviceLostEvent aggregate brocade DiagnosticTest- switchtest brocade DiscoveryEvent enclosure brocade LocationChangeEvent enclosure brocade LogEvent PhysState brocade LogEvent port.statistics brocade StateChangeEvent+ port brocade StateChangeEvent- port brocade Statistics enclosure ############################ d2.grid: Sun D2 ############################ d2 AlarmEvent- fan d2 AlarmEvent- power d2 AlarmEvent esm.revision d2 AlarmEvent midplane.revision d2 AlarmEvent slot_count d2 AlarmEvent temperature d2 AuditEvent enclosure d2 CommunicationEstablishedEvent ib d2 CommunicationLostEvent ib d2 ComponentRemoveEvent esm d2 ComponentRemoveEvent midplane d2 DeviceLostEvent aggregate d2 DiagnosticTest- d2test d2 DiscoveryEvent enclosure d2 LocationChangeEvent enclosure d2 StateChangeEvent+ disk d2 StateChangeEvent- disk ############################ host.grid: Host ############################ host AlarmEvent+ hba host AlarmEvent- hba host AlarmEvent- lun.T300 host AlarmEvent- lun.VE host AlarmEvent disk_capacity host AlarmEvent disk_capacity_okay host DiagnosticTest- ifptest host DiagnosticTest- qlctest host DiagnosticTest- socaltest host LogEvent array_error host LogEvent array_warning host LogEvent driver.ELS_RETRY host LogEvent driver.Fabric_Warning host LogEvent driver.Firmware_Change host LogEvent driver.LOOP_OFFLINE host LogEvent driver.LOOP_ONLINE host LogEvent driver.MPXIO host LogEvent driver.MPXIO_offline host LogEvent driver.PFA host LogEvent driver.QLC_LOOP_OFFLINE host LogEvent driver.QLC_LOOP_ONLINE host LogEvent driver.SCSI_ASC host LogEvent driver.SCSI_TRAN_FAILED host LogEvent driver.SCSI_TR_READ host LogEvent driver.SCSI_TR_WRITE host LogEvent driver.SFOFFTOWARN host LogEvent driver.SF_CRC_ALERT host LogEvent driver.SF_CRC_WARN host LogEvent driver.SF_DMA_WARN host LogEvent driver.SF_OFFLALERT host LogEvent driver.SF_OFFLINE host LogEvent driver.SF_RESET host LogEvent driver.SF_RETRY host LogEvent driver.SSD_ALERT host LogEvent driver.SSD_WARN host LogEvent error host LogEvent warning host PatchInfo enclosure host backup enclosure host patchInfo enclosure ############################ internal.grid: 
############################ internal AuditEvent enclosure internal CommunicationEstablishedEvent ib internal CommunicationLostEvent ib internal ComponentInsertEvent disk internal ComponentRemoveEvent disk internal DiagnosticTest- fcdisktest internal DiscoveryEvent enclosure ############################ mcdata.grid: McData switch ############################ mcdata AlarmEvent fan mcdata AlarmEvent power mcdata AlarmEvent system_reboot mcdata AuditEvent enclosure mcdata CommunicationEstablishedEvent oob mcdata CommunicationLostEvent oob mcdata ConnectivityLostEvent aggregate mcdata DeviceLostEvent aggregate mcdata DiscoveryEvent enclosure mcdata LocationChangeEvent enclosure mcdata LogEvent PhysState mcdata LogEvent port.statistics mcdata StateChangeEvent+ port mcdata StateChangeEvent- port mcdata Statistics enclosure ############################ san.grid: ############################ san LinkEvent_CRC Any|Any san LinkEvent_CRC host|storage san LinkEvent_CRC host|switch san LinkEvent_CRC switch|a3500fc san LinkEvent_CRC switch|a5k san LinkEvent_CRC switch|storage san LinkEvent_CRC switch|switch san LinkEvent_CRC switch|t3 san LinkEvent_CRC ve|switch san LinkEvent_ITW Any|Any san LinkEvent_ITW host|storage san LinkEvent_ITW host|switch san LinkEvent_ITW switch|a3500fc san LinkEvent_ITW switch|a5k san LinkEvent_ITW switch|storage san LinkEvent_ITW switch|switch san LinkEvent_ITW switch|t3 san LinkEvent_ITW ve|switch san LinkEvent_SIG Any|Any san LinkEvent_SIG host|storage san LinkEvent_SIG host|switch san LinkEvent_SIG switch|a3500fc san LinkEvent_SIG switch|a5k san LinkEvent_SIG switch|storage san LinkEvent_SIG switch|switch san LinkEvent_SIG switch|t3 san LinkEvent_SIG ve|switch ############################ se.grid: Sun 3900/6900 ############################ se AggregatedEvent POWERSEQ1 se AlarmEvent- lun se AlarmEvent- remove_lun se CommunicationLostEvent oob se ComponentInsertEvent lun se ComponentRemoveEvent lun se ComponentRemoveEvent slot se DeviceLostEvent aggregate se StateChangeEvent links se StateChangeEvent port se StateChangeEvent slot se StateChangeEvent sp ############################ se2.grid: Sun 6320 ############################ se2 AggregatedEvent POWERSEQ1 se2 AlarmEvent- lun se2 AlarmEvent- power_sequencer se2 ComponentInsertEvent lun se2 ComponentRemoveEvent lun se2 DeviceLostEvent aggregate ############################ switch.grid: Sun Switch ############################ switch AlarmEvent chassis.fan switch AlarmEvent chassis.power switch AlarmEvent chassis.temperature switch AlarmEvent port.statistics switch AlarmEvent system_reboot switch AlarmEvent zone_change switch AuditEvent enclosure switch CommunicationEstablishedEvent oob switch CommunicationLostEvent oob switch ConnectivityLostEvent aggregate switch DeviceLostEvent aggregate switch DeviceLostEvent ib switch DiagnosticTest- switchtest switch DiscoveryEvent enclosure switch LocationChangeEvent enclosure switch LogEvent port.statistics switch StateChangeEvent+ port switch StateChangeEvent- port switch Statistics enclosure ############################ switch2.grid: Sun Switch2 ############################ switch2 AlarmEvent- chassis.board switch2 AlarmEvent- chassis.fan switch2 AlarmEvent chassis.power switch2 AlarmEvent port.statistics switch2 AlarmEvent system_reboot switch2 AuditEvent enclosure switch2 CommunicationEstablishedEvent oob switch2 CommunicationLostEvent fsa switch2 CommunicationLostEvent oob switch2 ConnectivityLostEvent aggregate switch2 DeviceLostEvent aggregate switch2 DiagnosticTest- 
switch2test switch2 DiscoveryEvent enclosure switch2 LocationChangeEvent enclosure switch2 StateChangeEvent+ port switch2 StateChangeEvent- port switch2 Statistics enclosure ############################ t3.grid: Sun T3 ############################ t3 AlarmEvent+ power.temp t3 AlarmEvent- disk.pathstat t3 AlarmEvent- disk.port t3 AlarmEvent- disk.temperature t3 AlarmEvent- interface.loopcard.cable t3 AlarmEvent- power.battery t3 AlarmEvent- power.fan t3 AlarmEvent- power.output t3 AlarmEvent- power.temp t3 AlarmEvent add_initiators t3 AlarmEvent backend_loop t3 AlarmEvent cacheMode t3 AlarmEvent cacheModeBehind t3 AlarmEvent device_path t3 AlarmEvent initiators t3 AlarmEvent log t3 AlarmEvent loop.statistics t3 AlarmEvent lunPermission t3 AlarmEvent remove_initiators t3 AlarmEvent revision t3 AlarmEvent system_reboot t3 AlarmEvent sysvolslice t3 AlarmEvent time_diff t3 AlarmEvent volCount t3 AlarmEvent volOwner t3 AuditEvent enclosure t3 CommunicationEstablishedEvent ib t3 CommunicationEstablishedEvent oob t3 CommunicationLostEvent ib t3 CommunicationLostEvent oob t3 ComponentInsertEvent controller t3 ComponentInsertEvent disk t3 ComponentInsertEvent interface.loopcard t3 ComponentInsertEvent power t3 ComponentRemoveEvent controller t3 ComponentRemoveEvent disk t3 ComponentRemoveEvent interface.loopcard t3 ComponentRemoveEvent power t3 DeviceLostEvent aggregate t3 DiagnosticTest- t3test t3 DiagnosticTest- t3volverify t3 DiscoveryEvent enclosure t3 LocationChangeEvent enclosure t3 LogEvent array_error t3 LogEvent array_warning t3 LogEvent controller.port t3 LogEvent disk t3 LogEvent disk.error t3 LogEvent disk.log t3 LogEvent disk.senseKey t3 LogEvent power.battery t3 LogEvent power.battery.refresh t3 LogEvent power.battery.replace t3 LogEvent temp_threshold t3 QuiesceEndEvent enclosure t3 QuiesceStartEvent enclosure t3 RemovalEvent enclosure t3 StateChangeEvent+ controller t3 StateChangeEvent+ disk t3 StateChangeEvent+ interface.loopcard t3 StateChangeEvent+ power t3 StateChangeEvent+ volume t3 StateChangeEvent- controller t3 StateChangeEvent- disk t3 StateChangeEvent- interface.loopcard t3 StateChangeEvent- power t3 StateChangeEvent- volume t3 Statistics enclosure ############################ tape.grid: FC-Tape ############################ tape AuditEvent enclosure tape CommunicationEstablishedEvent ib tape CommunicationLostEvent ib tape DeviceLostEvent aggregate tape DiagnosticTest- fctapetest tape DiscoveryEvent enclosure tape LocationChangeEvent enclosure tape StateChangeEvent+ port tape StateChangeEvent- port ############################ v880disk.grid: Sun V880 Disk ############################ v880disk AlarmEvent- backplane v880disk AlarmEvent- loop v880disk AlarmEvent- temperature v880disk AuditEvent enclosure v880disk CommunicationEstablishedEvent ib v880disk CommunicationLostEvent ib v880disk ComponentInsertEvent disk v880disk ComponentRemoveEvent disk v880disk DeviceLostEvent aggregate v880disk DiagnosticTest- daktest v880disk DiscoveryEvent enclosure v880disk LocationChangeEvent enclosure ############################ ve.grid: Vicom VE ############################ ve AlarmEvent log ve AlarmEvent volume ve AlarmEvent volume_add ve AlarmEvent volume_delete ve AuditEvent enclosure ve CommunicationEstablishedEvent oob ve CommunicationLostEvent oob.command ve CommunicationLostEvent oob.ping ve CommunicationLostEvent oob.slicd ve DeviceLostEvent aggregate ve DiagnosticTest- ve_diag ve DiagnosticTest- veluntest ve DiscoveryEvent enclosure ve LocationChangeEvent enclosure