Storage Automated Diagnostic Environment 2.x (StorADE)
 Administration Overview

Summary:

This document describes the overall StorADE environment, including the use of daemons and crons, the probing techniques used to monitor devices, the Notification Providers and the event generation structure. It is intended for system administrators and requires some knowledge of Unix (Solaris). It can be used with the 'User Guide', which describes in detail the functions of the Graphical User Interface. This document refers to Sun storage products by abbreviation; the abbreviation list in Appendix A gives the full product names.



What is StorADE:

StorADE is a distributed application used to monitor and diagnose Sun storage products, Sun-supported switches and Sun virtualization products. The main functions of StorADE are device monitoring, event generation, topology discovery and presentation, diagnostics, revision checking, device/FRU reports and configuration (system edition). StorADE depends on agents installed in-band (on the data path) and out-of-band (via Ethernet) to do its monitoring. Installing the StorADE package on a server adds a cron entry to the server and adds a StorADE-specific http service to the list of services handled by inetd on the same server. The cron wakes up the StorADE agent periodically (tunable) to probe devices and monitor log files. A configuration file maintained through the StorADE graphical user interface (GUI) holds the list of devices that the agent(s) should monitor. One of these agents is considered the master agent; all slave agents report their findings (alerts and events) to the master agent for further processing. Events are generated with Service Advisor content such as probable cause and recommended action to help isolate the problem to a single FRU.

The main function of the master agent is to expose this monitoring database (including configuration, instrumentation reports, events, health, topology etc.) through a GUI and to send all messages to event consumers (called Notification Providers in the GUI) such as SRS. The master GUI centralizes all configuration functions for both master and slave agents. There is no need to point a browser at a slave server to configure that slave agent. Events can be sent as local email to administrators of the site or as alerts and events back to SRS, NetConnect, and the Sun Network Storage Command Center (NSCC). The NSCC is a statistical database used by Sun engineering to discover trends and problems with Sun storage products. Email and providers are also configured using the StorADE GUI, and the settings are stored in the configuration file.

The following diagram is an example of a configuration where a master and a slave work together to monitor 2 Sun T3 partner-groups, 1 Switch and 3 Sun A5000:

 


 

There are 2 editions of StorADE: a Device Edition that includes all StorADE functions (except for a few Sun-Solution-specific functions) and a Sun-Solution Edition used on the service processor of the 3900/6900/6320 products. The Device Edition (package name SUNWstade) is really a 'SAN' edition since it includes all the topology and SAN aggregation functions of StorADE. The Sun-Solution Edition (package name SUNWstads) is pre-installed on the service processor of the solution products and includes different features than the Device Edition; it also includes specialized management functions such as the '3900/6900 Configuration' function. These two packages are created from the same code base.

The StorADE Installation Life-Cycle:

A typical StorADE installation consists of the following steps:

  1. Install StorADE on a set of servers. One of them is selected as the master agent, usually because it is already a management station, or because it has access to email, is registered with the name server and is easily accessible. The master agent is the one providing the user interface; it is called 'master' even when no slave is present. Each instance of an agent, master or slave, can monitor devices. Devices can be monitored in-band (usually by slave agents installed on the appropriate server) or out-of-band (from any agent). When log files are available (as with the t3/t4 and the 3310 (minnow)), it is usually best to install an agent on the server where these logfiles are replicated and to monitor the devices out-of-band from this agent. This configuration allows the same agent to see the logfile information, probe the device and correlate the information found. After pkgadd, run /opt/SUNWstade/bin/ras_install to set up the inetd services and crons. ras_install asks a few basic questions such as 'Is this a master or a slave', 'where is the master' and 'do you want SSL security on'.

  2. Initialize the configuration. Access StorADE by pointing a browser at the host, including the proper port number in the URL. StorADE port numbers are 7654 (non-secure) and 7443 (secure). NOTE: The initial login is always username=ras password=agent; the password can be changed after the initial login. Additional users with varied permissions, locale and browser preferences can also be created. The initial configuration consists of entering site information, discovering devices, adding storage devices manually to the StorADE configuration, adding local email addresses for event reception and adding notification providers for transmission of events to SRS, SSRR, NetConnect etc. Most of these functions can also be performed with CLI commands for convenience and automation. A 'review-config' report can be run from the GUI as a sanity check of the configuration.

  3. Device Discovery. StorADE monitors the devices included in its configuration file (/opt/SUNWstade/DATA/rasagent.conf). Devices can be added to this file using 'Add Device', 'Discover Devices' or the 'ras_admin' CLI command (/opt/SUNWstade/bin/ras_admin). 'Add Device' is straightforward and usually involves entering the IP address of the device. Before StorADE can add a device to its configuration, it must be able to access and identify the device. Identification usually means finding the port WWN of the device along with the enclosure ID. Device discovery can be automated using the /etc/deviceIP.conf file. This file has a syntax similar to /etc/hosts and is maintained by the system administrator. It contains a list of all devices that should be monitored by StorADE. See Appendix D for an example of this file. Both the CLI (ras_admin discover_deviceIP) and the GUI can be used to discover devices based on the /etc/deviceIP.conf file.

  4. Topology Discovery. This is really just one more configuration step, but it is a little more complicated. For a complete StorADE topology discovery, every agent (master and slave) must discover its section of the SAN, both in-band and out-of-band, merge this information into a single topology and send this topology to the master agent for further aggregation. The master agent merges all received topologies with its own topology to create a single StorADE 'MASTER' topology. The topology created by StorADE is primarily a physical topology; it includes enclosure information, partner-group information, in-band path information, WWNs etc. It is saved as the current SAN 'snapshot' and is used in all SAN-related operations until a new SAN topology snapshot is created by the customer. This function is available from Admin -> Topo.Maintenance -> Topology Snapshot.

  5. Start the agents. When StorADE is installed and ras_install has completed, the agents for each device may not be running. Agents are started from the GUI, usually after device discovery and notification provider initialization. Starting agents really means that the StorADE crons are now active on all agents (master and slaves). This function is available from Admin->general_maintenance->start_agents (refer to the site map, figure 1). Refer to the StorADE User Guide for more details about device and provider initialization.

  6. When a device alert occurs, StorADE notifies the site administrator by email (if configured) and also sends a notification to Sun using one of the remote services (SRS, SSRR etc.) that were originally configured. Emails sent to the administrator are often sufficient to isolate the problem since they include a probable cause and a recommended action. To get a better overall picture of the problem, the site administrator or Sun personnel may want to access the StorADE GUI (or CLI) and review the email information in context. This can be done by looking at the device itself (Monitor->Devices), at the topology (Monitor->Topology) or at the complete StorADE event log (Monitor->EventLog). See figures 2, 3 and 4 for examples of these functions. Figure 5 shows a sample email. After reviewing this information, diagnostics can be executed to further isolate the cause of the problem.

  7. Isolate the problem. Diagnostics can be executed from the CLI or from the GUI. The StorADE GUI allows users to execute tests remotely using the slave agents. This feature allows the user to start and control tests from one centralized GUI on the master server even when the actual diagnostic test is running on a slave server.

  8. Once the problem is fixed, the user can clear the health of the device in the StorADE GUI, recreate the topology if new storage devices were added and go back to step 5.
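
The topology merge described in step 4 can be pictured with a small sketch. It is only an illustration, assuming each agent reports its topology as a set of nodes keyed by WWN plus a set of links between WWNs. The names Topology, merge_topologies and link, and the data layout itself, are hypothetical and do not reflect StorADE's internal format.

```python
from dataclasses import dataclass, field


@dataclass
class Topology:
    """Hypothetical per-agent topology: nodes keyed by WWN, links as WWN pairs."""
    nodes: dict = field(default_factory=dict)   # wwn -> descriptive attributes
    links: set = field(default_factory=set)     # frozenset({wwn_a, wwn_b})


def link(a, b):
    """Undirected link between two WWNs."""
    return frozenset((a, b))


def merge_topologies(topologies):
    """Union several agent topologies into one 'MASTER' view."""
    master = Topology()
    for topo in topologies:
        for wwn, attrs in topo.nodes.items():
            # A device seen by two agents collapses into one node; attributes
            # from later agents fill in or override earlier ones.
            master.nodes.setdefault(wwn, {}).update(attrs)
        master.links |= topo.links
    return master


# A slave sees the host-to-switch path in-band; the master sees the switch
# and a T3 out-of-band. All WWNs and addresses here are made up.
slave = Topology(
    nodes={"wwn-host1": {"type": "host"}, "wwn-sw1": {"type": "switch"}},
    links={link("wwn-host1", "wwn-sw1")},
)
master_local = Topology(
    nodes={"wwn-sw1": {"type": "switch", "ip": "10.10.10.3"},
           "wwn-t3": {"type": "t3"}},
    links={link("wwn-sw1", "wwn-t3")},
)
merged = merge_topologies([slave, master_local])
print(sorted(merged.nodes))
print(len(merged.links))
```

Keying nodes by WWN is what lets the switch seen by both agents appear only once in the merged view.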

Monitoring Strategy:

The monitoring is done by master and slave agents installed on a set of servers. These servers are selected for the following reasons:

  1. Server has access to storage devices in-band (Examples might be the Sun StorEdge A5K).

  2. Server has access to logfiles like /var/adm/messages or storage device log files like /var/adm/messages.t3 .

  3. Server has out-of-band access to storage devices that can be monitored out-of-band like Sun T3 and Sun Switches.

  4. Multiple servers are used to distribute the monitoring load. For example, not all Sun StorEdge T3 arrays need to be monitored from the same agent. Often, Sun StorEdge T3s are installed in groups and replicate their logfiles (messages.t3) to more than one server. In this case, it is best to install a slave agent on each server so that the same agent has access to both the logfile and the corresponding T3s. Refer to the Installation and Configuration Planning guide for more details about StorADE configurations.



Monitoring Cycle:

Agent execution is controlled by the cron daemon on each server. The main steps of a monitoring cycle are:

  1. Verify that the agent is alone. If the previous run of the agent has not finished, allow it to finish. Only one instance of the monitoring agent (/opt/SUNWstade/bin/rasagent) should be running at any one time.

  2. Load and execute all appropriate device modules used to generate instrumentation reports and health-related events. Instrumentation reports are generated by probing the device for all relevant information and saving this information in a report stored in /var/opt/SUNWstade/DATA. These reports are compared from one run of the agent to the next to generate health-related events. Events are also created by relaying information found in logfiles. For example, all Errors and Warnings found in /var/adm/messages.t3 are translated into a 'LogEvent' event without further analysis. Most events are generated because a rule or policy in StorADE concluded that a problem exists, but if the T3 indicates issues in the syslog file, an event is generated immediately. See Appendix B for more details about the commands used to monitor devices.

  3. Send these events to the master agent if they were generated by a slave, or send the events to all interested parties if the agent is the master agent. The master agent is responsible for generating its own events and collecting events from the slaves. Events can also be aggregated on the master before dissemination.

  4. Store instrumentation reports in the DATA directory. NOTE: Event logs are accessible from the GUI under Monitor->Logs (/opt/SUNWstade/DATA/Events.log). StorADE then updates the state database with the necessary statistics. Some events require that a certain threshold be reached before an event is generated; for example, the CRC count of a switch port going up by one is not sufficient to trigger an event. Email is another example. StorADE supports email thresholds, which can be used to prevent the generation of multiple emails about the same component of the same device. By keeping track of how many events were already sent in a specified timeframe, redundant email alerts can be prevented. NOTE: Other (non-email) providers do not support this feature since it may be important to track all indications that are sent. Most events do not have this problem since they are only sent when the initial state change occurs. For example, if a battery supply is lost, an alert is sent for this transition (e.g. power failure) and no more events are sent until the state returns to a good state (e.g. power returned) or enters another state (e.g. power supply removed).

  5. Send the appropriate events to the interested parties. Not all events are sent to everybody. For example, a local administrator can choose the device types of interest, the types of events of interest (e.g. loss of communication) and the level of alerts to receive (e.g. Warnings and Errors only). NOTE: The Sun SRS provider only receives actionable events (see Event structure), but the Sun Network Storage Command Center (NSCC via NetConnect) receives all events.
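
The filtering in step 5 can be sketched as follows. This is a minimal illustration, not StorADE's implementation: the Event and EmailSubscriber classes, the route function, the severity names and their ordering are all assumptions made for the example. Only the rule that SRS receives actionable events while NSCC receives everything comes from the text above.

```python
from dataclasses import dataclass

# Assumed severity scale; the real product may use different levels.
SEVERITY = {"Info": 0, "Warning": 1, "Error": 2}


@dataclass
class Event:
    device_type: str
    event_type: str
    severity: str
    actionable: bool


@dataclass
class EmailSubscriber:
    address: str
    device_types: set          # empty set means "all device types"
    event_types: set           # empty set means "all event types"
    min_severity: str = "Info"

    def wants(self, event):
        """Apply the device-type, event-type and severity filters in turn."""
        if self.device_types and event.device_type not in self.device_types:
            return False
        if self.event_types and event.event_type not in self.event_types:
            return False
        return SEVERITY[event.severity] >= SEVERITY[self.min_severity]


def route(event, subscribers):
    """Return the destinations that should receive this event."""
    targets = [s.address for s in subscribers if s.wants(event)]
    if event.actionable:
        targets.append("SRS")      # SRS receives actionable events only
    targets.append("NSCC")         # NSCC (via NetConnect) receives all events
    return targets


admin = EmailSubscriber("admin@example.com", {"t3"}, set(), "Warning")
lost = Event("t3", "CommunicationLostEvent", "Error", actionable=True)
audit = Event("switch", "AuditEvent", "Info", actionable=False)
print(route(lost, [admin]))
print(route(audit, [admin]))
```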



Event Life-Cycle:

Most StorADE events are based on health transitions. When, for example, the state of a device goes from 'Online' to 'Offline', a health transition occurs. It is the transition from 'Online' to 'Offline' that generates an event, not the actual value 'Offline'. If the state alone were used to generate events, the same events would be generated all the time. Transitions cannot be used when monitoring logfiles, so LogEvents can be very repetitive. This problem is minimized by attaching thresholds to entries in the logfiles. A threshold ensures that a minimum number of logfile entries must occur within a certain time period before an event is generated. StorADE also includes an 'event-maximums' database that keeps track of the number of events generated about the same subject in the same 8-hour period. This database is used to stop the generation of repetitive events when there is no other way to do it. For example, if the port of a switch were toggling between offline and online every few minutes, the event-maximums database would ensure that this toggling is reported only once every 8 hours instead of every 5 minutes.
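
A rough sketch of these two mechanisms, transition detection and the event-maximums cap, is given here. It assumes that device state is a simple mapping from component name to state string and that the cap is one event per subject per 8-hour window. The class EventMaximums, the function detect_transitions and the component name are hypothetical.

```python
EIGHT_HOURS = 8 * 60 * 60   # window length in seconds


class EventMaximums:
    """Hypothetical stand-in for the 'event-maximums' database."""

    def __init__(self, window=EIGHT_HOURS, maximum=1):
        self.window = window
        self.maximum = maximum
        self.sent = {}   # subject -> timestamps of events already reported

    def allow(self, subject, now):
        """Return True if another event about this subject may be reported."""
        recent = [t for t in self.sent.get(subject, []) if now - t < self.window]
        allowed = len(recent) < self.maximum
        if allowed:
            recent.append(now)
        self.sent[subject] = recent
        return allowed


def detect_transitions(previous, current):
    """Compare two state snapshots; yield (component, old, new) on change."""
    for component, new_state in current.items():
        old_state = previous.get(component)
        if old_state is not None and old_state != new_state:
            yield component, old_state, new_state


# A switch port toggles on every 5-minute agent run for one hour.
port = "switch-s1.port3"
maximums = EventMaximums()
reported = []
state = {port: "online"}
for minute in range(5, 65, 5):
    new = {port: "offline" if state[port] == "online" else "online"}
    for component, old, cur in detect_transitions(state, new):
        if maximums.allow(component, now=minute * 60):
            reported.append((minute, component, old, cur))
    state = new
print(reported)
```

In this run all twelve agent cycles see a transition, but only the first one is reported; the next report for the same port becomes possible once the 8-hour window has elapsed.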

Events are usually generated using the following rules:

  1. The very first time a device is monitored, a discovery event is generated. It is not actionable and is used to set a monitoring baseline, primarily for NSCC. This event describes in detail the components of the storage device. Every week after discovery, an audit event is generated. It has the same content as the discovery event.

  2. A LogEvent can be generated when interesting information is found in host or storage logfiles. This information is usually tied to the right storage devices when possible and sent to all consumers. These events can be made actionable based on thresholds and sent to SRS, SSRR, NetConnect etc.

  3. Events are generated when changes are seen in the content of the instrumentation report generated by probing the device, compared with the last instrumentation report (usually x minutes old). This is where most StorADE events are generated: StateChangeEvent, TopologyEvent, AlarmEvent etc. See Appendix E for a complete list of events by device. In the StorADE GUI, use Report -> Service Advisor -> Event Advisor to read more about events.

  4. When possible, related events are combined by the StorADE master agent to generate AggregatedEvents. Note: event aggregation is not enabled by default but can be used for automated aggregation of multiple events into a single email which shows the aggregated event as well as the original events which were used to derive this conclusion.
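
The logfile thresholds mentioned in rule 2 and in the paragraph on health transitions can be illustrated with a sliding-window counter. This is only a sketch under assumed semantics (fire once a given number of matching entries fall within a given period, then start counting again); the class LogThreshold and its parameters are hypothetical.

```python
from collections import deque


class LogThreshold:
    """Fire when `count` matching log entries occur within `period` seconds."""

    def __init__(self, count, period):
        self.count = count
        self.period = period
        self.times = deque()

    def record(self, timestamp):
        """Record one matching log entry; return True when the threshold is met."""
        self.times.append(timestamp)
        # Drop entries that have fallen out of the time window.
        while self.times and timestamp - self.times[0] > self.period:
            self.times.popleft()
        if len(self.times) >= self.count:
            self.times.clear()   # start counting afresh after firing
            return True
        return False


# Three entries within 10 minutes are required before a LogEvent is raised.
threshold = LogThreshold(count=3, period=600)
results = [threshold.record(t) for t in (0, 100, 900, 950, 1000)]
print(results)
```

The first two entries expire before a third arrives, so nothing fires until the burst at 900 to 1000 seconds.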

All events include the following fields:



Alternate Master:

StorADE supports the concept of an alternate master. An alternate master is a slave that, on every run of the cron, verifies that the real master is still alive and, when the real master does not respond, takes over some of the responsibilities of the real master. All slaves, including the alternate master, have a copy of the complete StorADE configuration. This configuration describes where all the agents are located (IP address etc.). This information allows the alternate master to call the slaves and temporarily redirect the flow of events from the real master to the alternate master.

Since the real master is responsible for sending events and email, one of the main functions of the alternate master is to alert the administrator that the master server is no longer operational; this event would otherwise never be sent. The alternate master does not try to become the real master. It remembers which agent is the real master and relinquishes its role as temporary master once communication with the real master is regained. This architecture is meant to deal with a temporary loss of the master agent: if the master agent is removed from the site, a different server should be made the permanent master (by running ras_install again).
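
The take-over and hand-back behavior can be modeled as a small state machine that runs once per cron cycle. The sketch below is an assumption-laden illustration: the class AlternateMaster and the injected callables is_alive, redirect and notify are hypothetical placeholders for whatever mechanism the agents actually use to probe each other and redirect event flow.

```python
class AlternateMaster:
    """Hypothetical model of an alternate master's per-cron-run check."""

    def __init__(self, name, real_master, slaves, is_alive, redirect, notify):
        self.name = name
        self.real_master = real_master   # remembered so the role can be handed back
        self.slaves = slaves
        self.is_alive = is_alive         # callable(host) -> bool
        self.redirect = redirect         # callable(slave, new_target)
        self.notify = notify             # callable(message)
        self.acting = False              # True while acting as temporary master

    def on_cron_run(self):
        alive = self.is_alive(self.real_master)
        if not alive and not self.acting:
            # Real master just went silent: take over and raise the alarm.
            self.acting = True
            for slave in self.slaves:
                self.redirect(slave, self.name)
            self.notify(f"master {self.real_master} is not responding")
        elif alive and self.acting:
            # Real master is back: relinquish the temporary role.
            self.acting = False
            for slave in self.slaves:
                self.redirect(slave, self.real_master)


# Simulate four cron runs: master up, down, still down, back up.
responses = iter([True, False, False, True])
log = []
alt = AlternateMaster(
    "alt1", "master1", ["slave2"],
    is_alive=lambda host: next(responses),
    redirect=lambda slave, target: log.append(("redirect", slave, target)),
    notify=lambda message: log.append(("notify", message)),
)
for _ in range(4):
    alt.on_cron_run()
print(log)
```

Tracking the acting flag keeps the alert from being repeated on every cron run while the master stays down.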



Product Footprint:

StorADE was designed to have a very small footprint and to be invisible when not in use. It includes a cron and an on-demand http service used for browser/slave/master communication.

The StorADE software includes a cron that runs every 5 minutes. Every time the cron program starts, it checks the StorADE configuration file to see whether it is time to execute the agents. The real agent frequency can be changed from agent to agent through the GUI. If, for example, the agent frequency is changed to 30 minutes, the cron will abort 5 times out of 6. This cron agent (/opt/SUNWstade/bin/rasagent) runs on both master and slave agents and is a Perl program that can grow to approximately 15 MB of memory. StorADE does not include Perl, so a version of Perl (5.005 or later) must be present on the server for StorADE to work. When running, the cron agent stores device-specific information in the /opt/SUNWstade/DATA directory. Its process size is not affected by the number of devices being monitored: once the monitoring of a device is completed, instrumentation data is stored on disk and erased from memory.
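
The "5 times out of 6" arithmetic can be checked with a short simulation. This is a sketch of the gating idea only, assuming the agent decides by comparing the time since its last real run with the configured frequency; the function names are hypothetical and the actual rasagent logic may differ.

```python
CRON_INTERVAL_MIN = 5   # the cron wakes the agent every 5 minutes


def should_run(minutes_since_last_run, agent_frequency_min):
    """Decide whether this cron wake-up should perform a real monitoring run."""
    return minutes_since_last_run >= agent_frequency_min


def simulate(agent_frequency_min, total_minutes):
    """Count real runs and aborted wake-ups over a period of cron activity."""
    runs = aborts = 0
    since_last = agent_frequency_min   # assume the first wake-up is due
    for _ in range(0, total_minutes, CRON_INTERVAL_MIN):
        if should_run(since_last, agent_frequency_min):
            runs += 1
            since_last = 0
        else:
            aborts += 1
        since_last += CRON_INTERVAL_MIN
    return runs, aborts


print(simulate(30, 60))   # 30-minute frequency over one hour
print(simulate(5, 60))    # default 5-minute frequency over one hour
```

With a 30-minute frequency, one hour contains 12 wake-ups, of which 2 run and 10 abort, which matches the ratio given in the text.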

The cron agent is only used to probe devices and generate events; it does not provide access to the StorADE GUI. That is done by an http service usually installed on ports 7654 (non-secure) and 7443 (secure). This program, called /opt/SUNWstade/rashttp, is started from inetd and stays in memory for as long as a user requires the GUI. Rashttp has a timeout period (default is 30 seconds) after which it exits. This was done to minimize the number of processes present on the servers. This http service is also a Perl program and its footprint is similar to that of the cron agent. It is used to answer http requests coming from browsers or from slaves. Master and slaves use http to share configuration information, topology information, new events etc.

Security Options:

StorADE can be installed with security turned on by executing ras_install and answering 'Yes' to the security question. This means that SSL (Secure Sockets Layer) is used for transmission of information between the master agent and the browser and between the master agent and the slave agents. The StorADE package includes a default certificate that expires in 2008 (located in /opt/SUNWstade/System/certificate.pem) and uses the highest-grade encryption (RC4 with a 128-bit secret key). When secure mode is used, the URL used to access the master agent is https://hostname:7443. The non-secure URL is http://hostname:7654. Site-specific certificates can be created with the openssl utilities (part of the open-source OpenSSL product). A command similar to the following would be used: /usr/local/ssl/bin/openssl req -days 200 -new -nodes -x509 -out new_certificate.pem -keyout new_certificate.pem2. See Appendix C for certificate details.

For further security, StorADE supports multiple logins. These logins can be added by the 'root' login (login 'ras', default password 'agent') along with specific capabilities (guest, admin, expert, test). This allows different users to log in with their own login/password and have a restricted set of functions available in the GUI.

 

Sun-Solutions:

Sun StorEdge Solutions products including Sun 3900/6900 (Indy) and Sun 6320 (Maserati midrange) are logical storage devices created from Sun Switches, Sun T3/6120, Sun Virtualization engines and a Service Processor. These components are pre-configured into a single product which includes a version of StorADE on the service processor (StorADE system Edition). The version of StorADE in the solution rack can be accessed like any other StorADE master agent with a browser pointing to the IPaddress of the service processor. NOTE: To the outside world (including an external instance of StorADE), this solution rack is treated as a single device.

In past releases, it was possible to configure the agent on a Sun Solution rack as a slave agent, but this option was removed in StorADE 2.2 for scalability and maintainability purposes. When a StorADE agent installed outside of the rack needs to monitor this rack, it discovers the Sun Solution rack as a single device with its own unique icon. In the following graph, the 2 switches of the Sun Solution rack both have a current error (in red). These errors are represented in the rack icon by 2 small red boxes, one for each switch slot. To see the detailed topology inside the rack, the user must look at the StorADE version on the service processor of the 3900 or use the link-and-launch facility available on the master agent (outside the rack). See figure 3 for an example of a topology with a Sun Solution as a single icon.



 Notification Providers:

StorADE supports a variety of Notification Providers including local email, SRS, NetConnect, Trap and SSRR. These providers must be activated manually, using either the GUI or the ras_admin CLI. Information is sent to the providers each time the agent completes its cycle. NOTE: Slave agents send events to the 'master' and the 'master' sends events to the providers.



FIGURES 

Figure 1: Site Map:

This page shows all available functions. It is generated dynamically and can change based on the edition of StorADE and the capabilities of the user logged in to the application (e.g. a user without permission to run diagnostic tests will not see help information about diagnostics).

Figure 2: Monitor Devices:

This page shows the content of StorADE when 3 frames are used. The top frame is for navigation. The left frame shows a list of monitored devices with their respective health level ('Sev' for severity). The right frame can show 5 pages ([ Summary | Health | Log | Report | Graph ]). The graph page shows an icon of the selected device (in this case a switch) and all immediate neighbors of this device in the SAN (png graphics file). This graph is followed by a list of current health problems with this switch.


 

Figure 3: Topology Graph:

This page displays the topologies generated by each slave and by the master separately, or the combined topology (called MASTER). Topologies can be filtered and grouped for easy access. Icons of the topology can be moved and saved in their new position to generate a more readable layout. Right-clicking on an icon exposes a menu of functions that can be performed on this icon (e.g. use the right mouse button to display a device report or run a diagnostic). Right-clicking away from any icon allows the user to change the zoom level of the diagram. Holding the shift key allows the user to highlight multiple icons at the same time, which is useful when moving icons. In this graph, both devices and links can be marked and clicked. Just like devices, links can be selected (right-click) to see more details about the link status. This topology graph is produced by an applet, but the [ Print ] function generates a png (a graphic format like gif) representation for easy printing.

Figure 3a: Inside a Sun Solution:

This topology, visible from the Service Processor of a Sun Solution, shows the SP itself along with the external switches, the Virtualization Engines, the internal switches and the storage arrays, in this case 3 T3s. It also shows the SAN connections between the components of the rack. In this figure, the Sun Solution is called 'wst31'; in the previous figure, it was a different rack called 'sp87'. Sun Solutions come in a variety of models, so the type and number of components can vary.

Figure 4: Monitor Log:

This Event Log page can be used to display a subset of the event log stored in DATA/Events.log. Events are shown with a link to the Service Advisor for more information about a particular event.


 

Figure 6: Local Email Notification:

Emails are generated by the master agent and sent to the email addresses entered in the StorADE configuration using the GUI. Each email address can have different event filters. Email information can include 'description', 'information', 'probable cause' and 'recommended action'; this example has no 'probable cause' section.


 

 

Appendix A: Abbreviation List:

 

Appendix B: Commands used for monitoring:

This section describes the commands and techniques used to monitor the storage devices supported by StorADE.

Appendix C : Certificate Details







Appendix D: /etc/deviceIP.conf

This file can only be used with devices that are accessible out-of-band using an IP address. Switches, Sun T3, Sun 6120, Sun 3510 and Sun Solution racks are currently supported.



#IPNO NAME TYPE(optional)
10.10.10.1 t3-b1
10.10.10.2 t3-b2
10.10.10.3 switch-s1
10.10.10.4 switch-s2
10.10.10.5 minnow1 3510
10.10.10.6 indy-1 rack
10.10.10.7 6120-1
10.10.10.8
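
For automation purposes, a file in this format can be read with a few lines of code. The parser below is a sketch based only on the layout shown in this appendix (IP address, name, optional type, '#' comments). The names DeviceEntry and parse_device_ip_conf are hypothetical, and the handling of inline comments and of lines without a name is an assumption rather than documented StorADE behavior.

```python
import ipaddress
from typing import NamedTuple, Optional


class DeviceEntry(NamedTuple):
    ip: str
    name: str
    dev_type: Optional[str]   # None when the optional TYPE column is absent


def parse_device_ip_conf(text):
    """Parse deviceIP.conf-style text into a list of DeviceEntry tuples."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue                          # assumed: skip entries without a name
        ip, name = fields[0], fields[1]
        ipaddress.ip_address(ip)              # raises ValueError on a malformed IP
        dev_type = fields[2] if len(fields) > 2 else None
        entries.append(DeviceEntry(ip, name, dev_type))
    return entries


SAMPLE = """\
#IPNO NAME TYPE(optional)
10.10.10.1 t3-b1
10.10.10.2 t3-b2
10.10.10.3 switch-s1
10.10.10.4 switch-s2
10.10.10.5 minnow1 3510
10.10.10.6 indy-1 rack
10.10.10.7 6120-1
"""

entries = parse_device_ip_conf(SAMPLE)
for entry in entries:
    print(entry.ip, entry.name, entry.dev_type or "(auto)")
```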







Appendix E: Event List

############################
3310.grid: Sun 3310/3510 
############################
3310       AlarmEvent                     Revision
3310       AlarmEvent                     channel
3310       AlarmEvent                     enclosure
3310       AlarmEvent                     fan
3310       AlarmEvent                     firmware_version
3310       AlarmEvent                     part
3310       AlarmEvent                     power
3310       AlarmEvent                     raid_level
3310       AlarmEvent                     size
3310       AlarmEvent                     temperature
3310       AlarmEvent                     volume
3310       CommunicationEstablishedEvent  ib
3310       CommunicationEstablishedEvent  oob
3310       CommunicationLostEvent         e
3310       CommunicationLostEvent         ib
3310       ComponentInsertEvent           disk
3310       ComponentInsertEvent           power
3310       ComponentRemoveEvent           disk
3310       DeviceLostEvent                aggregate
3310       DiscoveryEvent                 enclosure
3310       LocationChangeEvent            enclosure
3310       LogEvent                       cpu
3310       QuiesceEndEvent                enclosure
3310       QuiesceStartEvent              enclosure
3310       StateChangeEvent+              disk
3310       StateChangeEvent+              volume
3310       StateChangeEvent-              disk
3310       StateChangeEvent-              volume
############################
6120.grid: StorEdge 6120 
############################
6120       AlarmEvent+                    power.temp
6120       AlarmEvent-                    disk.pathstat
6120       AlarmEvent-                    disk.port
6120       AlarmEvent-                    disk.temperature
6120       AlarmEvent-                    interface.loopcard.cable
6120       AlarmEvent-                    power.battery
6120       AlarmEvent-                    power.fan
6120       AlarmEvent-                    power.output
6120       AlarmEvent-                    power.temp
6120       AlarmEvent                     cacheMode
6120       AlarmEvent                     cacheModeBehind
6120       AlarmEvent                     initiators
6120       AlarmEvent                     log
6120       AlarmEvent                     lunPermission
6120       AlarmEvent                     revision
6120       AlarmEvent                     system_reboot
6120       AlarmEvent                     sysvolslice
6120       AlarmEvent                     time_diff
6120       AlarmEvent                     volCount
6120       AlarmEvent                     volOwner
6120       AuditEvent                     enclosure
6120       CommunicationEstablishedEvent  ib
6120       CommunicationEstablishedEvent  oob
6120       CommunicationLostEvent         ib
6120       CommunicationLostEvent         oob
6120       ComponentInsertEvent           controller
6120       ComponentInsertEvent           disk
6120       ComponentInsertEvent           interface.loopcard
6120       ComponentInsertEvent           power
6120       ComponentRemoveEvent           controller
6120       ComponentRemoveEvent           disk
6120       ComponentRemoveEvent           interface.loopcard
6120       ComponentRemoveEvent           power
6120       DeviceLostEvent                aggregate
6120       DiagnosticTest-                6120ofdg
6120       DiagnosticTest-                6120test
6120       DiagnosticTest-                6120volverify
6120       DiscoveryEvent                 enclosure
6120       LocationChangeEvent            enclosure
6120       LogEvent                       array_error
6120       LogEvent                       array_warning
6120       LogEvent                       controller.port
6120       LogEvent                       disk
6120       LogEvent                       disk.log
6120       LogEvent                       disk.senseKey
6120       LogEvent                       driver.SSD_WARN
6120       LogEvent                       power
6120       LogEvent                       power.refreshBattery
6120       LogEvent                       power.replaceBattery
6120       LogEvent                       temp_threshold
6120       QuiesceEndEvent                enclosure
6120       QuiesceStartEvent              enclosure
6120       StateChangeEvent+              controller
6120       StateChangeEvent+              disk
6120       StateChangeEvent+              interface.loopcard
6120       StateChangeEvent+              power
6120       StateChangeEvent+              volume
6120       StateChangeEvent-              controller
6120       StateChangeEvent-              disk
6120       StateChangeEvent-              interface.loopcard
6120       StateChangeEvent-              power
6120       StateChangeEvent-              volume
6120       Statistics                     enclosure
############################
a3500fc.grid: Sun A3500FC 
############################
a3500fc    AlarmEvent-                    battery
a3500fc    AuditEvent                     enclosure
a3500fc    CommunicationEstablishedEvent  ib
a3500fc    CommunicationLostEvent         ib
a3500fc    ComponentInsertEvent           controller
a3500fc    ComponentInsertEvent           disk
a3500fc    ComponentRemoveEvent           controller
a3500fc    ComponentRemoveEvent           disk
a3500fc    DeviceLostEvent                aggregate
a3500fc    DiagnosticTest-                a3500fctest
a3500fc    DiscoveryEvent                 enclosure
a3500fc    LocationChangeEvent            enclosure
a3500fc    StateChangeEvent+              disk
a3500fc    StateChangeEvent-              controller
a3500fc    StateChangeEvent-              disk
############################
a5k.grid: Sun A5000 
############################
a5k        AlarmEvent-                    backplane
a5k        AlarmEvent-                    backplane.fan
a5k        AlarmEvent-                    disk
a5k        AlarmEvent-                    interface.gbic
a5k        AlarmEvent-                    interface.iboard
a5k        AuditEvent                     enclosure
a5k        CommunicationEstablishedEvent  ib
a5k        CommunicationLostEvent         ib
a5k        ComponentInsertEvent           disk
a5k        ComponentRemoveEvent           disk
a5k        DeviceLostEvent                aggregate
a5k        DiagnosticTest-                a5ksestest
a5k        DiagnosticTest-                a5ktest
a5k        DiscoveryEvent                 enclosure
a5k        LocationChangeEvent            enclosure
a5k        StateChangeEvent+              disk
a5k        StateChangeEvent+              interface.iboard
a5k        StateChangeEvent+              power
a5k        StateChangeEvent-              disk
a5k        StateChangeEvent-              interface.iboard
a5k        StateChangeEvent-              power
a5k        logEvent                       driver
############################
agent.grid:  
############################
agent      AgentDeinstallEvent            enclosure
agent      AgentInstallEvent              enclosure
agent      AlarmEvent                     system_errors
agent      AlternateMaster+               enclosure
agent      AlternateMaster-               enclosure
agent      CommunicationEstablishedEvent  oob
agent      CommunicationLostEvent         ntc
agent      CommunicationLostEvent         oob
agent      HeartbeatEvent                 enclosure
############################
brocade.grid: Brocade switch 
############################
brocade    AlarmEvent                     sensor.fan
brocade    AlarmEvent                     sensor.power
brocade    AlarmEvent                     sensor.temperature
brocade    AlarmEvent                     system_reboot
brocade    AuditEvent                     enclosure
brocade    CommunicationEstablishedEvent  oob
brocade    CommunicationLostEvent         oob
brocade    ConnectivityLostEvent          aggregate
brocade    DeviceLostEvent                aggregate
brocade    DiagnosticTest-                switchtest
brocade    DiscoveryEvent                 enclosure
brocade    LocationChangeEvent            enclosure
brocade    LogEvent                       PhysState
brocade    LogEvent                       port.statistics
brocade    StateChangeEvent+              port
brocade    StateChangeEvent-              port
brocade    Statistics                     enclosure
############################
d2.grid: Sun D2 
############################
d2         AlarmEvent-                    fan
d2         AlarmEvent-                    power
d2         AlarmEvent                     esm.revision
d2         AlarmEvent                     midplane.revision
d2         AlarmEvent                     slot_count
d2         AlarmEvent                     temperature
d2         AuditEvent                     enclosure
d2         CommunicationEstablishedEvent  ib
d2         CommunicationLostEvent         ib
d2         ComponentRemoveEvent           esm
d2         ComponentRemoveEvent           midplane
d2         DeviceLostEvent                aggregate
d2         DiagnosticTest-                d2test
d2         DiscoveryEvent                 enclosure
d2         LocationChangeEvent            enclosure
d2         StateChangeEvent+              disk
d2         StateChangeEvent-              disk
############################
host.grid: Host 
############################
host       AlarmEvent+                    hba
host       AlarmEvent-                    hba
host       AlarmEvent-                    lun.T300
host       AlarmEvent-                    lun.VE
host       AlarmEvent                     disk_capacity
host       AlarmEvent                     disk_capacity_okay
host       DiagnosticTest-                ifptest
host       DiagnosticTest-                qlctest
host       DiagnosticTest-                socaltest
host       LogEvent                       array_error
host       LogEvent                       array_warning
host       LogEvent                       driver.ELS_RETRY
host       LogEvent                       driver.Fabric_Warning
host       LogEvent                       driver.Firmware_Change
host       LogEvent                       driver.LOOP_OFFLINE
host       LogEvent                       driver.LOOP_ONLINE
host       LogEvent                       driver.MPXIO
host       LogEvent                       driver.MPXIO_offline
host       LogEvent                       driver.PFA
host       LogEvent                       driver.QLC_LOOP_OFFLINE
host       LogEvent                       driver.QLC_LOOP_ONLINE
host       LogEvent                       driver.SCSI_ASC
host       LogEvent                       driver.SCSI_TRAN_FAILED
host       LogEvent                       driver.SCSI_TR_READ
host       LogEvent                       driver.SCSI_TR_WRITE
host       LogEvent                       driver.SFOFFTOWARN
host       LogEvent                       driver.SF_CRC_ALERT
host       LogEvent                       driver.SF_CRC_WARN
host       LogEvent                       driver.SF_DMA_WARN
host       LogEvent                       driver.SF_OFFLALERT
host       LogEvent                       driver.SF_OFFLINE
host       LogEvent                       driver.SF_RESET
host       LogEvent                       driver.SF_RETRY
host       LogEvent                       driver.SSD_ALERT
host       LogEvent                       driver.SSD_WARN
host       LogEvent                       error
host       LogEvent                       warning
host       PatchInfo                      enclosure
host       backup                         enclosure
host       patchInfo                      enclosure
############################
internal.grid:  
############################
internal   AuditEvent                     enclosure
internal   CommunicationEstablishedEvent  ib
internal   CommunicationLostEvent         ib
internal   ComponentInsertEvent           disk
internal   ComponentRemoveEvent           disk
internal   DiagnosticTest-                fcdisktest
internal   DiscoveryEvent                 enclosure
############################
mcdata.grid: McData switch 
############################
mcdata     AlarmEvent                     fan
mcdata     AlarmEvent                     power
mcdata     AlarmEvent                     system_reboot
mcdata     AuditEvent                     enclosure
mcdata     CommunicationEstablishedEvent  oob
mcdata     CommunicationLostEvent         oob
mcdata     ConnectivityLostEvent          aggregate
mcdata     DeviceLostEvent                aggregate
mcdata     DiscoveryEvent                 enclosure
mcdata     LocationChangeEvent            enclosure
mcdata     LogEvent                       PhysState
mcdata     LogEvent                       port.statistics
mcdata     StateChangeEvent+              port
mcdata     StateChangeEvent-              port
mcdata     Statistics                     enclosure
############################
san.grid:  
############################
san        LinkEvent_CRC                  Any|Any
san        LinkEvent_CRC                  host|storage
san        LinkEvent_CRC                  host|switch
san        LinkEvent_CRC                  switch|a3500fc
san        LinkEvent_CRC                  switch|a5k
san        LinkEvent_CRC                  switch|storage
san        LinkEvent_CRC                  switch|switch
san        LinkEvent_CRC                  switch|t3
san        LinkEvent_CRC                  ve|switch
san        LinkEvent_ITW                  Any|Any
san        LinkEvent_ITW                  host|storage
san        LinkEvent_ITW                  host|switch
san        LinkEvent_ITW                  switch|a3500fc
san        LinkEvent_ITW                  switch|a5k
san        LinkEvent_ITW                  switch|storage
san        LinkEvent_ITW                  switch|switch
san        LinkEvent_ITW                  switch|t3
san        LinkEvent_ITW                  ve|switch
san        LinkEvent_SIG                  Any|Any
san        LinkEvent_SIG                  host|storage
san        LinkEvent_SIG                  host|switch
san        LinkEvent_SIG                  switch|a3500fc
san        LinkEvent_SIG                  switch|a5k
san        LinkEvent_SIG                  switch|storage
san        LinkEvent_SIG                  switch|switch
san        LinkEvent_SIG                  switch|t3
san        LinkEvent_SIG                  ve|switch
############################
se.grid: Sun 3900/6900 
############################
se         AggregatedEvent                POWERSEQ1
se         AlarmEvent-                    lun
se         AlarmEvent-                    remove_lun
se         CommunicationLostEvent         oob
se         ComponentInsertEvent           lun
se         ComponentRemoveEvent           lun
se         ComponentRemoveEvent           slot
se         DeviceLostEvent                aggregate
se         StateChangeEvent               links
se         StateChangeEvent               port
se         StateChangeEvent               slot
se         StateChangeEvent               sp
############################
se2.grid: Sun 6320 
############################
se2        AggregatedEvent                POWERSEQ1
se2        AlarmEvent-                    lun
se2        AlarmEvent-                    power_sequencer
se2        ComponentInsertEvent           lun
se2        ComponentRemoveEvent           lun
se2        DeviceLostEvent                aggregate
############################
switch.grid: Sun Switch 
############################
switch     AlarmEvent                     chassis.fan
switch     AlarmEvent                     chassis.power
switch     AlarmEvent                     chassis.temperature
switch     AlarmEvent                     port.statistics
switch     AlarmEvent                     system_reboot
switch     AlarmEvent                     zone_change
switch     AuditEvent                     enclosure
switch     CommunicationEstablishedEvent  oob
switch     CommunicationLostEvent         oob
switch     ConnectivityLostEvent          aggregate
switch     DeviceLostEvent                aggregate
switch     DeviceLostEvent                ib
switch     DiagnosticTest-                switchtest
switch     DiscoveryEvent                 enclosure
switch     LocationChangeEvent            enclosure
switch     LogEvent                       port.statistics
switch     StateChangeEvent+              port
switch     StateChangeEvent-              port
switch     Statistics                     enclosure
############################
switch2.grid: Sun Switch2 
############################
switch2    AlarmEvent-                    chassis.board
switch2    AlarmEvent-                    chassis.fan
switch2    AlarmEvent                     chassis.power
switch2    AlarmEvent                     port.statistics
switch2    AlarmEvent                     system_reboot
switch2    AuditEvent                     enclosure
switch2    CommunicationEstablishedEvent  oob
switch2    CommunicationLostEvent         fsa
switch2    CommunicationLostEvent         oob
switch2    ConnectivityLostEvent          aggregate
switch2    DeviceLostEvent                aggregate
switch2    DiagnosticTest-                switch2test
switch2    DiscoveryEvent                 enclosure
switch2    LocationChangeEvent            enclosure
switch2    StateChangeEvent+              port
switch2    StateChangeEvent-              port
switch2    Statistics                     enclosure
############################
t3.grid: Sun T3 
############################
t3         AlarmEvent+                    power.temp
t3         AlarmEvent-                    disk.pathstat
t3         AlarmEvent-                    disk.port
t3         AlarmEvent-                    disk.temperature
t3         AlarmEvent-                    interface.loopcard.cable
t3         AlarmEvent-                    power.battery
t3         AlarmEvent-                    power.fan
t3         AlarmEvent-                    power.output
t3         AlarmEvent-                    power.temp
t3         AlarmEvent                     add_initiators
t3         AlarmEvent                     backend_loop
t3         AlarmEvent                     cacheMode
t3         AlarmEvent                     cacheModeBehind
t3         AlarmEvent                     device_path
t3         AlarmEvent                     initiators
t3         AlarmEvent                     log
t3         AlarmEvent                     loop.statistics
t3         AlarmEvent                     lunPermission
t3         AlarmEvent                     remove_initiators
t3         AlarmEvent                     revision
t3         AlarmEvent                     system_reboot
t3         AlarmEvent                     sysvolslice
t3         AlarmEvent                     time_diff
t3         AlarmEvent                     volCount
t3         AlarmEvent                     volOwner
t3         AuditEvent                     enclosure
t3         CommunicationEstablishedEvent  ib
t3         CommunicationEstablishedEvent  oob
t3         CommunicationLostEvent         ib
t3         CommunicationLostEvent         oob
t3         ComponentInsertEvent           controller
t3         ComponentInsertEvent           disk
t3         ComponentInsertEvent           interface.loopcard
t3         ComponentInsertEvent           power
t3         ComponentRemoveEvent           controller
t3         ComponentRemoveEvent           disk
t3         ComponentRemoveEvent           interface.loopcard
t3         ComponentRemoveEvent           power
t3         DeviceLostEvent                aggregate
t3         DiagnosticTest-                t3ofdg
t3         DiagnosticTest-                t3test
t3         DiagnosticTest-                t3volverify
t3         DiscoveryEvent                 enclosure
t3         LocationChangeEvent            enclosure
t3         LogEvent                       array_error
t3         LogEvent                       array_warning
t3         LogEvent                       controller.port
t3         LogEvent                       disk
t3         LogEvent                       disk.error
t3         LogEvent                       disk.log
t3         LogEvent                       disk.senseKey
t3         LogEvent                       power.battery
t3         LogEvent                       power.battery.refresh
t3         LogEvent                       power.battery.replace
t3         LogEvent                       temp_threshold
t3         QuiesceEndEvent                enclosure
t3         QuiesceStartEvent              enclosure
t3         RemovalEvent                   enclosure
t3         StateChangeEvent+              controller
t3         StateChangeEvent+              disk
t3         StateChangeEvent+              interface.loopcard
t3         StateChangeEvent+              power
t3         StateChangeEvent+              volume
t3         StateChangeEvent-              controller
t3         StateChangeEvent-              disk
t3         StateChangeEvent-              interface.loopcard
t3         StateChangeEvent-              power
t3         StateChangeEvent-              volume
t3         Statistics                     enclosure
############################
tape.grid: FC-Tape 
############################
tape       AuditEvent                     enclosure
tape       CommunicationEstablishedEvent  ib
tape       CommunicationLostEvent         ib
tape       DeviceLostEvent                aggregate
tape       DiagnosticTest-                fctapetest
tape       DiscoveryEvent                 enclosure
tape       LocationChangeEvent            enclosure
tape       StateChangeEvent+              port
tape       StateChangeEvent-              port
############################
v880disk.grid: Sun V880 Disk 
############################
v880disk   AlarmEvent-                    backplane
v880disk   AlarmEvent-                    loop
v880disk   AlarmEvent-                    temperature
v880disk   AuditEvent                     enclosure
v880disk   CommunicationEstablishedEvent  ib
v880disk   CommunicationLostEvent         ib
v880disk   ComponentInsertEvent           disk
v880disk   ComponentRemoveEvent           disk
v880disk   DeviceLostEvent                aggregate
v880disk   DiagnosticTest-                daktest
v880disk   DiscoveryEvent                 enclosure
v880disk   LocationChangeEvent            enclosure
############################
ve.grid: Vicom VE 
############################
ve         AlarmEvent                     log
ve         AlarmEvent                     volume
ve         AlarmEvent                     volume_add
ve         AlarmEvent                     volume_delete
ve         AuditEvent                     enclosure
ve         CommunicationEstablishedEvent  oob
ve         CommunicationLostEvent         oob.command
ve         CommunicationLostEvent         oob.ping
ve         CommunicationLostEvent         oob.slicd
ve         DeviceLostEvent                aggregate
ve         DiagnosticTest-                ve_diag
ve         DiagnosticTest-                veluntest
ve         DiscoveryEvent                 enclosure
ve         LocationChangeEvent            enclosure
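
The listing above is regular enough to be processed by a script, for example to check which events a given device category can generate. The following is a minimal sketch in Python, not part of StorADE. It assumes the layout shown above: rule lines made only of '#', header lines of the form "name.grid: description", and data lines with three whitespace-separated fields (category, event type, component). The names parse_grid_listing, split_suffix and SAMPLE are hypothetical and exist only for this illustration. The trailing '+' or '-' on some event types is split off as-is, without assigning it a meaning.

```python
import re
from collections import defaultdict

# Header lines look like "t3.grid: Sun T3"; the description may be empty.
HEADER = re.compile(r"^(\w+)\.grid:\s*(.*)$")


def parse_grid_listing(text):
    """Parse an event-grid listing into (descriptions, events).

    descriptions maps a category name to its header description.
    events maps a category name to a list of (event_type, component) pairs.
    """
    descriptions = {}
    events = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the '#' rule lines around each header.
        if not line or set(line) == {"#"}:
            continue
        match = HEADER.match(line)
        if match:
            descriptions[match.group(1)] = match.group(2).strip()
            continue
        fields = line.split()
        if len(fields) != 3:
            continue  # not a data line in the assumed three-column layout
        category, event_type, component = fields
        events[category].append((event_type, component))
    return descriptions, dict(events)


def split_suffix(event_type):
    """Split a trailing '+' or '-' from an event type, if present."""
    if event_type.endswith(("+", "-")):
        return event_type[:-1], event_type[-1]
    return event_type, ""


# A short excerpt in the same layout as the listing above.
SAMPLE = """\
############################
agent.grid:
############################
agent      HeartbeatEvent                 enclosure
############################
tape.grid: FC-Tape
############################
tape       StateChangeEvent+              port
tape       StateChangeEvent-              port
"""

if __name__ == "__main__":
    descriptions, events = parse_grid_listing(SAMPLE)
    for category, rows in events.items():
        print(category, descriptions.get(category, ""), len(rows))
```

Event and component names are kept exactly as they appear, including case, so entries such as "logEvent" and "LogEvent" remain distinct keys.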