Installation Life-Cycle

A typical Storage Automated Diagnostic Environment installation consists of the following steps:

  1. Install the software on a set of servers.
  2. Designate one server as the master agent. This is usually a server that is already a management station, or one that has access to email, is registered with the name server, and is easily accessible. The master agent is the agent that provides the user interface. It is called 'master' even when no slave agent is present. Each agent instance, whether master or slave, can monitor devices.

    Devices can be monitored in-band (usually by slave agents installed on the appropriate server) or out-of-band (from any agent). When log files are available (as in the case of Sun StorEdge T3, T3+, and 6120 arrays and Sun StorEdge 3310/3510/3511 Fibre Channel arrays), it is usually best to install an agent on the server where these log files are replicated and to monitor the devices out-of-band from this agent. This configuration enables the same agent to see the log file information, probe the device, and correlate the information found.

  3. Initialize the configuration.
    1. Access the Storage Automated Diagnostic Environment by pointing a browser at the host, including the proper port number in the address. The Storage Automated Diagnostic Environment port numbers are 7654 (non-secure) and 7443 (secure).
    2. Perform the initial configuration, which consists of the following steps:

      • Entering site information
      • Discovering devices
      • Adding storage devices manually to the software configuration
      • Adding local email addresses for event reception
      • Adding notification providers for transmission of events

      Most of these functions can also be performed using CLI commands for convenience and automation purposes.

    3. Check your configuration using the Review Configuration feature. This feature is in the GUI's Administration->General Maintenance section.

    The initial login is always username=<ras>, password=<agent>. After the initial login, you can change the password with the software's Root Password feature. In addition, you can set up users, assign roles and permissions, and customize window options using the User Roles feature. Both of these features are in the GUI's Administration->System Utilities section.
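
As an illustration of the browser address used to reach the user interface, the following Python sketch builds the address from a host name and the two port numbers given above. The http/https schemes are an assumption (the text states only the port numbers), and the function name and host name are hypothetical.

```python
# Ports stated in the text: 7654 (non-secure) and 7443 (secure).
NON_SECURE_PORT = 7654
SECURE_PORT = 7443


def ui_url(host: str, secure: bool = False) -> str:
    """Return a browser address for the master agent's user interface.

    Assumption: the non-secure port is served over http and the secure
    port over https. The text gives only the port numbers.
    """
    if secure:
        scheme, port = "https", SECURE_PORT
    else:
        scheme, port = "http", NON_SECURE_PORT
    return f"{scheme}://{host}:{port}/"


if __name__ == "__main__":
    # "master.example.com" is a placeholder host name.
    print(ui_url("master.example.com"))
    print(ui_url("master.example.com", secure=True))
```

Keeping the port numbers in named constants makes it easy to adjust the sketch if a site uses different values.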
  4. Discover devices.
  5. The Storage Automated Diagnostic Environment monitors the devices included in its configuration file (/opt/SUNWstade/DATA/rasagent.conf). Devices can be added to this file using 'Add Device', 'Discover Devices', or the ras_admin(1M) CLI command (/opt/SUNWstade/bin/ras_admin). 'Add Device' is straightforward and usually involves entering the IP address of the device.

    Before the Storage Automated Diagnostic Environment can add a device to its configuration, it must be able to access and identify the device. Identification usually means finding the World Wide Name (WWN) of the device along with the enclosure ID. Device discovery can be automated using the /etc/deviceIP.conf file.


    Note - JBOD devices cannot be discovered using the /etc/deviceIP.conf file.


    This file has a syntax similar to /etc/hosts and is maintained by the system administrator. It contains a list of all devices that should be monitored by Storage Automated Diagnostic Environment software.

    Both the CLI (ras_admin(1M) discover_deviceIP) and the GUI can be used to discover devices based on the /etc/deviceIP.conf file.


    Note - Device discovery for the Sun StorEdge 3510 Fibre Channel array cannot be automated using the /etc/deviceIP.conf file, as it can with other devices. The valid format for the Sun StorEdge 3510 FC array is IP Name (for example, 10.0.0.1 switch1).
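
Because /etc/deviceIP.conf is described as having a syntax similar to /etc/hosts, a site script could sanity-check the file before discovery is run. The Python sketch below parses hosts-style lines of the form "IP name". It rests on assumptions the text does not confirm: that '#' starts a comment, that blank lines are ignored, and that any fields after the name can be skipped. The names DeviceEntry and parse_device_list are hypothetical and are not part of the product.

```python
from typing import List, NamedTuple


class DeviceEntry(NamedTuple):
    """One line of a hosts-style device list (hypothetical model)."""
    ip: str
    name: str


def parse_device_list(text: str) -> List[DeviceEntry]:
    """Parse hosts-style lines of the form '<IP> <name>'.

    Assumptions: '#' starts a comment, blank lines are ignored, and
    fields after the name are not needed for this check.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            raise ValueError(f"expected '<IP> <name>', got: {raw!r}")
        entries.append(DeviceEntry(ip=fields[0], name=fields[1]))
    return entries


if __name__ == "__main__":
    # Hypothetical sample content; a real script would read the file itself.
    sample = "# devices to monitor\n10.0.0.1   switch1\n10.0.0.2   array1\n"
    for entry in parse_device_list(sample):
        print(entry.ip, entry.name)
```

A check like this only validates the layout of the file. Discovery itself is still performed by the GUI or by ras_admin(1M) discover_deviceIP.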


  6. Discover topology.
  7. Discovering a topology is slightly more complicated than the other steps. To do a complete topology discovery, each agent (master and slave) must discover its section of the SAN, both in-band and out-of-band, merge this information into a single topology, and send this topology to the master agent for further aggregation. The master agent merges all received topologies with its own topology to create a single Storage Automated Diagnostic Environment 'MASTER' topology.

    The topology created by the Storage Automated Diagnostic Environment is primarily a physical topology. It includes enclosure information, partner-group information, in-band path information, and the World Wide Name (WWN). The topology is saved as the current SAN snapshot and is used in all SAN-related operations until the customer creates a new SAN topology snapshot. This function is available from Admin->Topo.Maintenance->Topology Snapshot.
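
The two-level merge described above (each agent merges its in-band and out-of-band views, then the master merges everything it receives with its own view) can be pictured as a union of sets. The Python sketch below is only a conceptual model and does not reflect the software's actual data format: the Topology class, the merge function, and the WWN strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, Set


@dataclass
class Topology:
    """Hypothetical model: devices keyed by WWN, links as unordered WWN pairs."""
    devices: Set[str] = field(default_factory=set)
    links: Set[FrozenSet[str]] = field(default_factory=set)

    def add_link(self, wwn_a: str, wwn_b: str) -> None:
        """Record both endpoints and the link between them."""
        self.devices.update((wwn_a, wwn_b))
        self.links.add(frozenset((wwn_a, wwn_b)))


def merge(topologies: Iterable[Topology]) -> Topology:
    """Union devices and links; duplicates collapse because both are sets."""
    merged = Topology()
    for topo in topologies:
        merged.devices |= topo.devices
        merged.links |= topo.links
    return merged


if __name__ == "__main__":
    # One agent merges what it sees in-band and out-of-band.
    in_band, out_of_band = Topology(), Topology()
    in_band.add_link("wwn-host", "wwn-switch")
    out_of_band.add_link("wwn-switch", "wwn-array")
    agent_view = merge([in_band, out_of_band])

    # The master merges the received view with its own.
    master_own = Topology()
    master_own.add_link("wwn-switch", "wwn-host")  # same link, other direction
    master_view = merge([master_own, agent_view])
    print(sorted(master_view.devices), len(master_view.links))
```

Storing each link as an unordered pair means a link reported by two agents from opposite ends is counted once in the merged result.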

  8. Start the agents.
  9. When the Storage Automated Diagnostic Environment package is installed and ras_install has completed, the agents for each device may not be running. Agents are started from the GUI, usually after device discovery and notification provider initialization. Starting the agents means that the Storage Automated Diagnostic Environment cron jobs are active on all agents (master and slaves). This function is available from Admin->general_maintenance->start_agents initialization.

  10. Set up local email notification.
  11. When a device alert occurs, the Storage Automated Diagnostic Environment software notifies the site administrator by email (if configured). The software also sends notification to Sun using one of the remote services (for example, Net Connect) that was configured earlier. Email messages sent to the administrator are often sufficient to isolate the problem, because they include a probable cause and a recommended action.
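
To make the shape of such a notification concrete, the Python sketch below composes an email that carries a probable cause and a recommended action. It is not the software's actual message format: the function name, field labels, addresses, and sample text are all hypothetical.

```python
from email.message import EmailMessage


def build_alert_email(sender: str, recipient: str, device: str,
                      description: str, probable_cause: str,
                      recommended_action: str) -> EmailMessage:
    """Compose an alert email with cause and action (hypothetical layout)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Alert: {device}"
    msg.set_content(
        f"Device: {device}\n"
        f"Description: {description}\n"
        f"Probable cause: {probable_cause}\n"
        f"Recommended action: {recommended_action}\n"
    )
    return msg


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    message = build_alert_email(
        sender="agent@example.com",
        recipient="admin@example.com",
        device="array1",
        description="Device stopped responding to out-of-band probes",
        probable_cause="Network cable or switch port failure",
        recommended_action="Check the Ethernet connection to the device",
    )
    print(message)
```

The resulting message object could be handed to the standard library's smtplib.SMTP.send_message for delivery. The sketch stops short of that so it can be run without a mail server.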

    See the Storage Automated Diagnostic Environment online help for more information.

  12. Set up remote email notification.
  13. When a device alert occurs, the Storage Automated Diagnostic Environment software can send remote notification, in addition to local email notification. Remote notification consists of alert and alarm information, as well as telemetry data about the devices that are monitored. See the online help for details about remote notification.

  14. Monitor the devices.
  15. To get a broad view of the problem, the site administrator or Sun personnel can review the email information in context. This can be done by:

    • Viewing the device itself (Monitor->Devices)
    • Displaying the topology (Monitor->Topology)
    • Analyzing the device's event log (Monitor->Event Log)
  16. Isolate the problem.
  17. For many alarms, information about the Probable Cause and Recommended Action can be accessed from the Alarm view. This information should allow the user to isolate the source of the problem. If the problem remains undetermined, run diagnostic tests.

    Diagnostics can be executed from the CLI or from the GUI. The Storage Automated Diagnostic Environment GUI enables users to execute tests remotely using the slave agents. This feature allows the user to start and control tests from one centralized GUI on the master server even when the actual diagnostic test is running on a slave server.

  18. Once the problem is fixed, the user can clear the health of the device in the Storage Automated Diagnostic Environment GUI, recreate the topology if new storage devices were added, and go back to step 5.

Related Topics