A typical Storage Automated Diagnostic Environment installation consists of the following steps:
One server is the master agent (usually because it is already a management station or because it has access to email and is registered with the name-server and easily accessible). The master agent is the agent that provides the user interface. It is called 'master' even when there is no slave present. Each instance of an agent, either master or slave, can monitor devices.
Devices can be monitored in-band (usually by slave agents installed on the appropriate server) or out-of-band (from any agent). When log files are available (as in the case of Sun StorEdge T3, T3+, and 6120 arrays and Sun StorEdge 3310/3510/3511 Fibre Channel arrays), it is usually best to install an agent on the server where these log files are replicated and monitor the devices out-of-band from this agent. This configuration enables the same agent to see log file information, probe the device, and correlate the information found.
7654 (non-secure) and 7443 (secure). The initial configuration consists of the following steps:
Most of these functions can also be performed using CLI commands for convenience and automation purposes.
username=<ras>, password=<agent>. After the initial login, you can change the password with the software's Root Password feature.
The Storage Automated Diagnostic Environment monitors the devices included in its configuration file (/opt/SUNWstade/DATA/rasagent.conf). Devices can be added to this file using 'Add Device', 'Discover Devices', or the ras_admin(1M) CLI command (/opt/SUNWstade/bin/ras_admin). 'Add Device' is straightforward and usually involves entering the IP address of the device.
Before the Storage Automated Diagnostic Environment can add a device to its configuration, it must be able to access and identify the device. Identification usually means finding the World Wide Name (WWN) of the device along with the enclosure ID. Device discovery can be automated using the /etc/deviceIP.conf file.
Note - JBOD devices cannot be discovered using the /etc/deviceIP.conf file.
The /etc/deviceIP.conf file has a syntax similar to /etc/hosts and is maintained by the system administrator. It contains a list of all devices that should be monitored by the Storage Automated Diagnostic Environment software.
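As an illustration of a hosts-style device list, the following is a minimal Python sketch of a reader for such a file. It is not part of the product. The function name `parse_device_list`, the sample entries, and the handling of `#` comments are assumptions modeled on /etc/hosts; any additional columns the real file may accept are ignored here.

```python
from typing import List, Tuple


def parse_device_list(text: str) -> List[Tuple[str, str]]:
    """Parse hosts-style lines into (ip, name) pairs.

    Assumption: each entry is "IP name", with '#' starting a comment,
    as in /etc/hosts. Anything after the first two fields is ignored.
    """
    devices = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue  # blank or comment-only line
        fields = line.split()
        if len(fields) < 2:
            continue  # no name column; skip rather than guess
        devices.append((fields[0], fields[1]))
    return devices


# Hypothetical file contents, for illustration only.
sample = """\
# devices to monitor
10.0.0.1 switch1
10.0.0.2 array1   # trailing comment
"""
print(parse_device_list(sample))
```

A reader like this can be useful for checking a device list for malformed entries before running discovery against it.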
Both the CLI (ras_admin(1M) discover_deviceIP) and the GUI can be used to discover devices based on the /etc/deviceIP.conf file.
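For automation, the CLI discovery step could be wrapped in a script. The sketch below only assembles and runs the command named above, using the ras_admin path given earlier in this section. The function names are hypothetical, no additional options are passed, and nothing is assumed about the command's output or exit codes.

```python
import os
import subprocess
from typing import List, Optional

RAS_ADMIN = "/opt/SUNWstade/bin/ras_admin"  # path as given in this guide


def discover_command() -> List[str]:
    """Return the argument list for discovery based on /etc/deviceIP.conf."""
    return [RAS_ADMIN, "discover_deviceIP"]


def run_discovery() -> Optional[int]:
    """Run the discovery command and return its exit status.

    Returns None when ras_admin is not installed on this host, so the
    sketch can be tried safely on a machine without the software.
    """
    if not os.path.exists(RAS_ADMIN):
        return None
    return subprocess.run(discover_command(), check=False).returncode


print(" ".join(discover_command()))
```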
Note - Device discovery for the Sun StorEdge 3510 Fibre Channel array cannot be automated using the /etc/deviceIP.conf file, as it can with other devices. The valid format for the Sun StorEdge 3510 FC array is IP Name (for example, 10.0.0.1 switch1).
Discovering a topology is slightly more complicated than the other steps. To do a complete topology discovery, every agent (master and slave) must discover its section of the SAN, both in-band and out-of-band, merge this information into a single topology, and send this topology to the master agent for further aggregation. The master agent merges all received topologies with its own topology to create a single Storage Automated Diagnostic Environment 'MASTER' topology.
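The two-stage merge described above can be sketched in Python. This is an illustration only, not the software's actual data model: the `Topology` type (a mapping from a device identifier to the identifiers it is directly linked to), the function `merge_topologies`, and the device names are all hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical model: device id (for example, a WWN) -> ids of linked devices.
Topology = Dict[str, Set[str]]


def merge_topologies(topologies: Iterable[Topology]) -> Topology:
    """Union several partial topologies into one, keeping links symmetric."""
    merged: Topology = {}
    for topo in topologies:
        for node, links in topo.items():
            merged.setdefault(node, set()).update(links)
            for peer in links:
                # Record the reverse direction so each link appears on both ends.
                merged.setdefault(peer, set()).add(node)
    return merged


# Stage 1: one agent merges its in-band and out-of-band views.
in_band = {"host-a": {"switch1"}}
out_of_band = {"switch1": {"array1"}}
agent_view = merge_topologies([in_band, out_of_band])

# Stage 2: the master merges its own view with what the agents sent.
master_own = {"host-m": {"switch1"}}
master = merge_topologies([master_own, agent_view])
print(sorted(master["switch1"]))
```

Because the merge is a set union, a device reported by several agents appears once in the result, with the links from every report combined.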
The topology created by the Storage Automated Diagnostic Environment is primarily a physical topology. It includes enclosure information, partner-group information, in-band path information, and the World Wide Name (WWN). It is saved as the current SAN snapshot and is used in all SAN-related operations until the customer creates a new SAN topology snapshot. The snapshot function is available from Admin -> Topo.Maintenance -> Topology Snapshot.
When the Storage Automated Diagnostic Environment package is installed and ras_install has completed, the agents for each device may not be running. Agents are started from the GUI, usually after device discovery and notification provider initialization. Starting agents means that the Storage Automated Diagnostic Environment cron jobs become active on all agents (master and slaves). This function is available from Admin->general_maintenance->start_agents initialization.
When a device alert occurs, the Storage Automated Diagnostic Environment software notifies the site administrator by email (if configured). The software also sends notification to Sun using one of the remote services (for example, Net Connect) that were originally configured. Email messages sent to the administrator are often sufficient to isolate the problem, since they include a probable cause and recommended action.
See the Storage Automated Diagnostic Environment online help for the following information:
When a device alert occurs, the Storage Automated Diagnostic Environment software can send remote notification, in addition to local email notification. Remote notification consists of alert and alarm information, as well as telemetry data about the devices that are monitored. See the online help for details about remote notification.
To get a broad view of the problem, the site administrator or Sun personnel can review the email information in context. This can be done by:
For many Alarms, information regarding the Probable Cause and Recommended Action can be accessed from the Alarm view. This information should allow the user to isolate the source of the problem. In cases where the problem is still undetermined, diagnostic tests should be executed.
Diagnostics can be executed from the CLI or from the GUI. The Storage Automated Diagnostic Environment GUI enables users to execute tests remotely using the slave agents. This feature allows the user to start and control tests from one centralized GUI on the master server even when the actual diagnostic test is running on a slave server.