Appendix A
Modules Appendix
This appendix covers the following topics:
The Sun Management Center agent is based on Tcl and TOE technologies. This section provides background information about the development environment of the Sun Management Center agent.
Tcl (Tool Command Language) is an interpreted command-oriented language that can be used to connect building blocks built in system programming languages like C. Commands can be added to the interpreter using a clean C interface, and these commands co-exist with built-in Tcl commands.
Tcl has both simple variables and associative arrays, and all values (including procedure bodies) are represented as strings.
For more information about the Tcl language, refer to Tcl and Tk Toolkit.
The Tcl Object Extension (TOE) is a simple modification to the Tcl language that provides an object-oriented environment. This environment supports a rich set of object-oriented (OO) features and remains backward compatible with conventional Tcl code.
The premise behind the TOE modifications is simple. It was observed that all Tcl hash table access is channelled through two C macros, one to create hash entries and one to locate them.
Using this knowledge, these macros were overridden to call a set of recursive hash table operators that can locate commands or data in a more sophisticated manner. This redirection of the hash table operators requires only a one-line modification to the Tcl source code and is completely transparent to all users of these functions.
Using the modified hash table behavior, an object system was built that capitalizes on this new hash table scoping algorithm. A simple data structure, known as a TOE object, was created that is simply a pair of hash tables (one for commands, one for data) and a set of pointers to other TOE objects. The hash tables store procedures and data (properties) local to that object, while the pointers reference parent objects. Parent objects can be recursed to locate commands or data not found in the local hash tables.
To complete the system, a pointer to the current TOE object is placed in the global command hash table of the interpreter. When a command is executed, the Tcl system uses the low level Tcl hash operators to find the body of the command. These modified operators detect an active TOE context, and delegate the hash lookup to the hash tables of the current TOE object. Failure to locate the target key in that object triggers recursion into each of the parent pointers until the key is hit or all ancestors have been searched.
This transparent recursion makes all hash entries in all parents of an object appear to be local to that object. This behavior corresponds to inheritance in an object oriented environment. Other key object oriented features, such as polymorphism and dynamic binding, also fall out of the design, as the function performed by a procedure depends entirely on the object in which it was invoked.
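The lookup described above can be sketched conceptually in Tcl. This is a minimal illustration only; the real mechanism lives in C inside the overridden hash macros, and the helper commands objTable and objParents are hypothetical accessors for an object's local hash table and parent pointers:

# Conceptual sketch of TOE hash lookup (not the actual C implementation).
proc toeLookup {obj key} {
    set table [objTable $obj]                ;# hypothetical: local hash table
    if {[dict exists $table $key]} {
        return [dict get $table $key]        ;# found in the local table
    }
    foreach parent [objParents $obj] {       ;# hypothetical: parent pointers
        set hit [toeLookup $parent $key]     ;# recurse into each ancestor
        if {$hit ne ""} {
            return $hit                      ;# entry inherited from a parent
        }
    }
    return ""                                ;# key not found anywhere
}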
A TOE object is a data structure consisting of a command hash table, a dictionary hash table (for object property storage), parent object pointers (for ancestral relationships) and a superior object pointer (for structural relationships).
Because a TOE object contains its own command and dictionary hash tables, objects can support their own command vocabulary and properties. The command names are local to the object, so commands bearing the same name can coexist in different objects. The dictionary properties are independent of the Tcl variable system, so variable use need not alter or conflict with object properties.
The TOE system supports ancestral and structural object relationships.
Ancestral Relationships
These relationships define the parent/child relationships of objects. This defines the object-oriented inheritance characteristics of an object, with the child object inheriting commands and data from the parent object.
In this relationship, child objects can see all of the commands and dictionary data in the parent object. This visibility is implemented by referencing the parent objects on every hash table lookup. This parental referencing becomes a parent tree traversal if the parents themselves have parents.
Structural Relationships
These relationships define the superior/inferior relationships of objects in a tree structure. Objects can be organized into tree structures where each object has a superior (the object up the tree) and zero or more inferiors (the objects down the tree).
By independently supporting these two types of relationships, trees of objects can be constructed where the structural aspects of the tree (defined by the overall purpose of the objects) are independent of the inheritance of the nodes in the tree (defined by the functions performed by the individual objects).
In this example, the structure of the tree is related to the overall purpose of the objects (in this case, a model of a file system), while the ancestry of each object determines what that object does and how it behaves (in this case, the primitive data types the object represents).
Every TOE object contains a set of properties. In the TOE environment, object properties are stored in a dictionary. Each object contains a dictionary that stores properties relevant to that object instance. In the implementation of TOE, a dictionary is a hash table that stores information using logical keys.
The TOE object dictionaries use a two-key paradigm, where two logical names are used to reference any one data entity. This allows dictionaries to be partitioned into separate sections, with the division being based on the purpose, source, or volatility of the data being stored. These dictionary partitions are referred to as slices in the TOE system, and the pieces of data within each slice are named using what is referred to as the dictionary key. Dictionary slices can be thought of as property classes when used to configure object instances.
TABLE A-1 Dictionary Example

Slice        Key              Value
value        refreshCommand   "df -kF ufs"
value        refreshInterval  "60"
alarmlimit   warning          "10000"
alarmlimit   error            "5000"
data         1                "95000"
The object's dictionary has three partitions or slices:
Value--The value slice contains configuration information, in this case the refresh command and interval of the file system entity.

Alarmlimit--The alarmlimit slice contains the error and warning level alarm limits.

Data--The data slice contains the dynamic data of the object, in this case the current floating point value of the managed property, free.
This is a typical example of data partitioning using slices, where the slices are based on the purposes and sources of the dictionary entries and are directly related to the classes of properties used by an object instance.
The dictionaries define certain operations that can be performed on entire slices. These operations include the ability to list all the currently defined keys in a slice and to undefine an entire slice. Hence maintenance of dictionary keys is simplified if the slices are properly configured, and a certain amount of accountability can be achieved if the dictionaries are partitioned along functional boundaries.
The TOE object dictionaries have the inherent ability to import and export themselves as formatted text. The format of this representation is referred to as the .x file format. In this format, the slices and keys of a dictionary are represented using a well-defined, unambiguous syntax.
Dictionary entries can be described using the following syntax:
[slice:]key = value
Using this syntax, the dictionary entries in the preceding example table can be represented as:
value:refreshCommand = "df -kF ufs"
value:refreshInterval = "60"
alarmlimit:warning = "10000"
alarmlimit:error = "5000"
data:1 = 95000
In this example, all of the slices of the object's dictionary were exported together, and all keys are prefixed by their slice name. In actuality, slices can be exported and imported individually, and if there is only one slice present, the slice prefix is optional. This can be thought of as slice relative, since the keys are placed in whatever slice is specified at the time of import. For example, the alarmlimit slice of the dictionary can be exported slice relative as follows:
warning = "10000"
error = "5000"
The dictionaries of many objects can be exported or imported in a single operation. In such operations, the tree structure of the objects is maintained in the .x file output. The .x file syntax for an object is as follows:
object1 = {
    key1 = "value 1"
    key2 = "value 2"
}
In this notation, the opening curly brace indicates that the key-value pairs that follow belong to the object named object1. Such a representation is generated if an export is performed from the superior object of the object1 object. This hierarchical representation can be nested as deep as the object tree, supporting arbitrarily nested .x file representations. The following is an example of an .x file representation that is two levels deep:
object1 = {
    key1 = "value 1"
    key2 = "value 2"
    object2 = {
        key3 = "value 3"
        key4 = "value 4"
    }
}
The .x file format supports the specification of actions, or logical operations, to be performed during initialization on objects described in the object tree. The general form of an action is:
[ action args ... ]
This syntax is simply a pair of square brackets enclosing the action command line; optional arguments can be specified. The actual actions supported by the .x file parser depend on the application using the object tree, but several actions are always valid, such as:
Inherit--Adds the named object(s) to the object's parent list, thus altering the ancestral relationships of the current object. This action is the primary way of creating parent and child relationships within trees that are specified using module configuration files.
mount = {
    [ inherit primitives.string ]
    ...
}
Load--Loads the named .x file into the current object. This is the primary mechanism for combining multiple module configuration files into a single object tree. In the following example, the .x file named primitives.x is loaded into the primitives object.
primitives = { [ load primitives.x ] }
Source--Loads and executes a Tcl/TOE source file into the current object. This is the primary means of extending and overriding an object's command set from an .x file. In the example, a Tcl/TOE file named primitives.prc is loaded and executed into the proc object.
proc = { [ source primitives.prc ] }
By using the nested nature of module configuration files and the inherit action, both the ancestral and structural aspects of an object tree can be represented.
The following .x file can be used to describe the file system subtree in FIGURE A-5:
filesystem = {
    mount = {
        [ inherit primitives.string ]
    }
    size = {
        [ inherit primitives.float ]
    }
    free = {
        [ inherit primitives.float ]
    }
}
TOE object classes are the primary mechanism employed to extend the command vocabulary of TOE objects. TOE object classes encapsulate a set of commands that provide a well defined function. TOE objects can then inherit these classes to gain the desired functionality of the command set.
Examples of TOE object classes used by the Sun Management Center agent include the MIB node class and the SNMP class. The MIB node class enables TOE objects to gather and store data periodically and perform alarm checks on the data. The SNMP class encapsulates SNMP communication capabilities.
The agent framework consists of a single tree structure within the agent that contains global services, configuration data, classes and templates that can be used by any object within the agent.
The following is a general structure of an agent's TOE object tree:
The agent framework provides the core agent services and functions that include SNMP communications, command execution, and module management.
This framework exists to support the realization of managed objects, properties and other modeling elements that perform the actual monitoring and management functions of the agent. The managed objects, properties, and other modeling elements are encapsulated in management modules and are also loaded in this tree.
The shell service object (.services.io.sh) provides a mechanism for the Sun Management Center agent to execute commands (scripts and programs) and obtain the results of the command. This service is commonly used by module MIB nodes for data acquisition and for executing alarm actions.
The shell service supports the queuing of commands to be executed. It also supports the spawning of multiple shells to allow commands to be executed in parallel.
This service involves the agent opening pipes to one or more captive Bourne shell processes. The maximum number of shells to run is configurable.
When interfacing with the shell services, the caller specifies the shell command and the callback to process the command results.
The shell command to be executed can be specified with or without a full path. If the command is not specified with a full path, the command is searched for in the directories specified by the PATH environment variable of the agent.
The callback specification is comprised of a TOE object identifier and a callback method. The TOE ID specifies the TOE object context in which the callback method should be executed. The callback method must be specified with the %result argument (for example, callbackMethod %result) that is substituted with a result specification every time the callback is invoked.
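As a rough sketch, a callback method matching this specification might look like the following; the method name, the assumption that a return code of 0 indicates success, and the setValue call are all illustrative, not part of the documented interface:

# Hypothetical callback method, registered as "processDf %result".
# %result is substituted with {returnCode transactionId data} on each call.
proc processDf {result} {
    foreach {code txid data} $result break   ;# unpack the three-element list
    if {$code == 0} {                        ;# assume 0 indicates success
        setValue 0 $data                     ;# store the command output
    }
}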
The result specification returned to the callback is in the form of a three element list comprised of a return code, a transaction identifier, and corresponding data. The possible results are as follows:
A very simple shell protocol defines the interaction between the agent and the shell.
For each command to be executed, the agent sends the command to be executed to the shell, followed by echo EOT, where EOT is the terminating character. The shell executes the commands so that the command result is returned followed by EOT. The reception of the terminating character indicates the end of the transaction, implying that the next command can be sent to the shell.
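For illustration, assuming EOT is the configured terminating string and df -k is the queued command, the exchange looks roughly like this:

# agent writes to the captive shell:
df -k
echo EOT
# shell output read back by the agent:
(output of df -k)
EOT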
The icmp object (.services.io.icmp) enables the Sun Management Center agent to ping hosts to determine whether they are up or down. Ping uses the ICMP protocol ECHO_REQUEST datagram to elicit an ICMP ECHO_RESPONSE from the specified host. A host is assumed to be up if it responds. By default, a host is assumed to be down if it does not respond after three retries, each with a timeout of 10 seconds. The number of retries and the timeout can be overridden by specifying the maxRetries and retryInterval parameters, respectively, in the .services.io.icmp object.
The ping service is used by the SNMP interface to determine the status of a host whose agent does not respond to an SNMP request. This service is also used by the Topology module in the Topology agent when monitoring entities as IP-based devices.
The mel object (.services.mel) provides timer services to other objects. It allows other nodes to register and cancel time based events.
The default service (.services.io.default) is a default shell service provided for general use by any object. In general, however, modules that require a shell service should specify their own shell service to guarantee availability of access to a shell.
This service (.services.history) maintains a table of all current data logging requests. This table is automatically updated whenever the data logging specifications of a managed property change.
This table is queried by the data logging registry module using the listRegistry method. This module allows console users to view information about all managed properties whose values are currently being logged.
The logging information includes the following fields:
Note - The data logging registry service does not perform the actual addition or removal of data logging requests; it maintains a table that reflects the current data logging requests.
The configuration of data logging is supported through shadow SNMP requests to the appropriate MIB node.
This service (.services.fscan) allows MIB objects to subscribe for file scanning services. Conceptually, MIB objects subscribe by specifying a filename, a regular expression pattern, and a callback. The service incrementally scans the file for the regular expression pattern, and when the pattern is detected, the callback is called with the match results. When the MIB object is no longer interested in the scanning of the pattern, it can perform an unsubscription request.
This service is used by MIB objects whose alarm check involves log rules.
To subscribe for the detection of a pattern in a file, the fsSubscribe method is used:
fsSubscribe <filename> <pattern> <callback spec> ?<node template>?
where:
If the subscription is successful, the TOE object ID of the file scanning node is returned. If the subscription fails, -1 is returned.
To remove an existing subscription, the fsUnsubscribe method can be used:
fsUnsubscribe <filename> <pattern> <callback>
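A module might use these methods roughly as follows; the file name, pattern, and callback specification are hypothetical, with the callback form patterned after the shell service callbacks:

# Hypothetical subscription from a MIB object (sketch only).
set fsid [fsSubscribe /var/adm/messages {panic} {logMatch %result}]
if {$fsid == -1} {
    # handle subscription failure
}

# Later, when the pattern no longer needs to be scanned:
fsUnsubscribe /var/adm/messages {panic} {logMatch %result}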
Module management is a fundamental function provided by the Sun Management Center agent framework. It enables the agent to load and unload the management modules that define the monitoring and management functions performed by the agent.
Modules consist of a set of managed objects and properties that focus on a particular aspect of system or application condition and performance.
The discussion of module management in the Sun Management Center agent is divided into the following topics:
The Sun Management Center agent supports SNMP contexts to identify MIB modules that can have multiple instances. Each SNMP context is represented by a separate MIB subtree.
The .iso subtree represents the default SNMP context; all modules that can be instantiated only once are loaded into this subtree. The standard MIB objects that are not part of modules are also loaded into this subtree.
In general, the .iso subtree for the default SNMP context contains two main branches, the standard management branch (mgmt) and the private enterprises branch.
The standard SNMP management MIB objects are loaded in the mgmt subtree. An example of a standard SNMP MIB is the MIB for Network Management of TCP/IP-based internets (MIB-II).
The enterprises branch contains enterprise specific subtrees.
For instance, the Sun Management Center agent always instantiates a core module loader in the .iso*enterprises.sun.prod.sunsymon.agent.base.mibman object in the default SNMP context. Sun Management Center modules that can only have a single instance are also loaded under the enterprises branch in the default SNMP context.
Each instance of a module that can be multi-instantiated is assigned an SNMP context. The name of the module instance corresponds to the SNMP context name. Each nondefault SNMP context is represented by a separate <context name>.iso.* subtree under the .contexts object.
For example, loading a Topology module whose instance name is view-1 creates the .contexts.view-1.iso.* subtree that represents the view-1 SNMP context.
By convention, the Sun Management Center agent modules developed by Sun are loaded within the sun branch in the appropriate SNMP context. Similarly, Sun Management Center agent modules developed by Halcyon are loaded in a subtree under the appropriate SNMP context.
The preferred location of Sun Management Center modules can be specified in an .x file (base-oids-<enterprise>-d.dat) that maps the logical object names to object identifiers. The Sun Management Center agent loads this file on start up. It can also be specified in the parameter file of the module.
The location where modules are loaded is important for hierarchical summarization and for general module management. Hierarchical summarization groups the alarm statuses of all managed child objects to generate an overall status of the managed objects for that portion of the MIB tree. Organizing modules into groups allows modules to be managed as a group.
Sun Management Center specific modules loaded by the agent are classified into the following module types:
operatingSystem--Monitors operating system related entities associated with the local host system (for example, CPU usage, swap, processes, file systems, and so forth).
Each Sun Management Center agent module is loaded into its corresponding module type branch under the appropriate modules subtree and SNMP context. The following diagram shows a module subtree.
Classifying modules by these categories is important for hierarchical summarization. This classification of modules separates the various alarm summary lists, enabling the alarm summary of managed objects in the MIB to reflect the status of the respective category.
When a Sun Management Center agent starts up, the agent loads the set of modules specified in its module configuration file (base-modules-d.dat). Once the agent is running, a Sun Management Center console user can load additional modules or unload loaded modules. The loaded modules are saved to the module configuration file (that is, /var/opt/SUNWsymon/cfg/base-modules-d.dat) so that the same set of modules is automatically reloaded if the agent is restarted.
The module configuration file contains entries for each module to be loaded. For each module to be loaded, its location in the MIB tree hierarchy, name, and parameters must be specified. Each entry in the file has the following format:
<module spec> = "<MIB location> <enterprise> <module name> <module parameters>"
where:
- module spec specifies the module name and module instance name (if one exists) concatenated with a + sign (for example, fscan+syslog, mib2-system).
- MIB location specifies the full TOE object path to the root node of the module. For example, the mib2 system module location is:
.iso.org.dod.internet.mgmt.mib-2.system.
- enterprise specifies the name of the enterprise MIB that the module resides in. For example, a module developed by Sun should reside in the sun enterprise. A module that is not enterprise specific (for example, mib2-system) should specify a blank enterprise.
- module name specifies the actual name of the module without the module instance specification (for example, mib2-system, fscan).
- module parameters specifies the module parameters in the form of a list containing key-value pairs terminated by semicolons (that is, `;'). All string values containing white space should be enclosed in backslashed double quotes (that is, \"aaa bbb\"). For example, to specify module parameters a and b whose values are 123 and "1 2 3", respectively, use the following specification: {a = 123; b = \"1 2 3\";}.
The module parameters that can be specified correspond to those parameters specified in the module's parameter file (that is, <module>-m.x).
Common module parameters include:
For modules that can be instantiated multiple times, the instance and instanceName parameters should also be defined. In addition, modules can specify additional parameters that are specific for the module.
This file contains three module entries: mib2-system, agent-stats, and fscan+syslog. The mib2-system entry demonstrates the loading of a non-enterprise specific module. The agent-stats entry shows how to load a simple Sun Enterprise module. The fscan+syslog entry shows how to load a Sun Enterprise module that can be instantiated multiple times. This module also contains module specific parameters.
Note - Each entry must be specified on one line only. To improve readability, each entry has been divided into multiple lines in the following example.
In the following table, note that each row is a continuous string of syntax.
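Although the table itself is not reproduced here, based on the entry format above a mib2-system entry would look roughly like the following sketch (divided onto multiple lines for readability; the exact quoting of the blank enterprise field and the parameter list are assumptions based on the loader example later in this appendix):

mib2-system = ".iso.org.dod.internet.mgmt.mib-2.system \"\" mib2-system
    {moduleName = \"MIB2 System\"; version = 1.0; console = mib2-system;
    moduleType = localApplication;}"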
Before loading or unloading a module on the platform agent, stop the platform agent. Then load or unload a module.
Note - To load and unload a module on a platform agent, you need to edit the /var/opt/SUNWsymon/cfg/platform-modules-d.dat file. You must create this file if it does not already exist.
To Stop the Platform Agent

Enter one of the following commands:

es-stop -l
To Load a Module in the Platform Agent

1. Copy the platform-modules-d.dat file to /var/opt/SUNWsymon/cfg.

2. Restart the platform agent:

a. Stop the agent.

b. Enter one of the following commands to start the agent:
To Unload a Module in the Platform Agent

1. Remove the applicable entries from the /var/opt/SUNWsymon/cfg/platform-modules-d.dat file.

2. Restart the platform agent:

a. Stop the agent.

b. Enter one of the following commands to start the agent:
The MIB manager provides general MIB related services to external entities through SNMP. Sun Management Center agents instantiate the MIB manager in the .iso*enterprises.sun.prod.sunsymon.agent.base.mibman object.
The MIB manager is comprised of MIB objects that provide the following services:
A procedures (that is, _procedures) TOE object also exists as a peer object of the MIB objects listed above. This object is not a MIB node object and only serves as a repository for MIB manager related procedures that can be inherited by the MIB nodes that need to execute the procedures.
The finder object is used to resolve the SNMP URL of a currently loaded MIB object to its object identifier (OID).
When an SNMP URL is set into the finder object, the finder object locates the MIB object identified by the URL and returns its OID in the form of an OID URL.
The OID URL has the following general format:
snmp://<host>:<port>/oid/[<context>]/<oids>[/<subid>][?<shadow spec>]#<instance spec>
Subsequently, the OID can be determined from the OID URL and used to access directly the MIB object identified by the SNMP URL.
To Convert an OID URL to an Actual OID

1. Parse off the OID portion of the URL.

2. Extract the context if one is specified.

3. If the OID includes a shadow specification, extract it.

4. If the instance spec is a non-integer, it can be comprised of one or more comma-separated instance data types (int, ip, str, +str, oid, or +oid).
These data types define how to convert the textual instance to a numeric instance. The `+' indicates that the actual length of the instance must be prepended to the instance, since its length is not implied. The values of the int, ip, and oid instance types are integers, so these values map directly to the subid values. The str instance types indicate that the instance values are alphanumeric and must be converted to their corresponding decimal ASCII values and concatenated with a period (.) (for example, abc --> 97.98.99).

5. If it is a shadow OID, append the instance length and append the shadow specification.

6. Replace all slash (/), hash (#), and question mark (?) characters with a period (.).
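For the simple case (no context and no shadow specification), the procedure reduces to stripping the prefix and normalizing separators, as in this illustrative Tcl sketch:

# Sketch of the simple conversion case (steps 1 and 6 only).
proc oidUrlToOid {url} {
    regexp {/oid/(.*)$} $url -> tail          ;# step 1: keep the OID portion
    return [string map {/ . # . ? .} $tail]   ;# step 6: separators become dots
}
# Example: oidUrlToOid snmp://host:161/oid/1.3.6.1.2.1.1/1#0
# returns 1.3.6.1.2.1.1.1.0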
For example, the SNMP URL for the system description property in the mib-2 system module is:
snmp://<host>:<port>/mod/mib2-system/system/sysDescr#0
When this value is set to the finder node, the resulting response is the OID URL:
snmp://<host>:<port>/oid/1.3.6.1.2.1.1/1#0
The actual OID can be extracted from the OID URL as follows:

a. Parse off the portion after the /oid/ substring (that is, 1.3.6.1.2.1.1/1#0).

b. Substitute all `/' and `#' characters with `.' (that is, 1.3.6.1.2.1.1.1.0).
To Access the fulldes Shadow Attribute of the Same MIB Property

Set the following URL to the finder:

snmp://<host>:<port>/mod/mib2-system/system/sysDescr?fulldes#0
The resulting OID URL is:
snmp://<host>:<port>/oid/2.3.6.1.2.1.1/1?7.1#0
To Convert the Shadow OID URL to a Valid OID
The OID URL for a shadow OID contains a `?' that signifies the start of the shadow attribute index specification. The `#' signifies the start of the instance specification. To convert the shadow OID URL to a valid OID, do the following:
1. Parse off the portion after /oid/ (for example, 2.3.6.1.2.1.1/1?7.1#0).

2. From the parsed string, extract the shadow index specification that is enclosed by `?' and `#', and replace the `/' and `?' with a `.' (that is, the shadow index specification is 7.1 and the OID is 2.3.6.1.2.1.1.1#0).

3. Since the instance is an integer, simply append the length of the instance specification to the OID and replace the `#' with `.'. The instance is `0', so the length is 1, giving 2.3.6.1.2.1.1.1.0.1.

4. Append the shadow index specification to the OID (2.3.6.1.2.1.1.1.0.1.7.1).
This OID can then be used to get the full description shadow attribute for the mib-2 system description property.
To Access a Table Property in a Module

The following example gets the scan pattern for a specific row (the unix_error row instance) in the file scanning module (the syslog module instance).

Send the following SNMP URL to the finder:

snmp://<host>:<port>/mod/fscan+syslog/fscanstats/scanTable/scanEntry/pattern#unix_error
The resulting OID URL is:
snmp://<host>:<port>/oid/syslog/1.3.6.1.4.1.42.2.12.2.2.24/1/3/1/4#+str
To Convert the OID URL to an OID

1. Parse off the OID portion (that is, syslog/1.3.6.1.4.1.42.2.12.2.2.24/1/3/1/4#+str).

2. Extract the context (that is, syslog).

3. Since the instance specification is +str, the textual instance name must be converted to a numeric instance with the length prepended (unix_error --> 10.117.110.105.120.95.101.114.114.111.114).

4. Append the instance to the OID and replace the `/' and `#' with `.' (1.3.6.1.4.1.42.2.12.2.2.24.1.3.1.4.10.117.110.105.120.95.101.114.114.111.114).
This OID can then be used to request the data via SNMP. If using SNMPv2c or SNMPv2u, specify the context in the contextName field of the SNMP PDU. If using SNMPv1, specify the context name in the community field as <community>:<context> (for example, if the community name is public and the context is syslog, use public:syslog as the community field).
The loader MIB object is a leaf node that permits modules to be loaded by SNMP. Only users with sufficient security privileges are permitted to load modules (refer to the Sun Management Center Security SDS for more details about SNMP security).
The module loader input specifies the module parameters as key-value pairs separated by `;'. These parameters are based on the same information specified in the module configuration file described earlier.
For example, to load the mib-2 system module, the following string can be set to the loader node.
module = mib2-system; moduleName = "MIB2 System"; version = 1.0; console = mib2-system; location = .iso.org.dod.internet.mgmt.mib-2.system; enterprise = ""; moduleType = localApplication; instance = ""; desc = "The MIB2 System module monitors MIB2 system information.";
The checker MIB object is a leaf node that provides an SNMP interface for checking the status of a module. Given a module name and an optional module instance, it determines whether the module is currently loaded, not loaded, or not installed on the agent machine.
The following responses can be returned by the checker node:
The browser root MIB object is a leaf node whose value can be retrieved via SNMP. The value of the node is an SNMP URL that represents the root object of the MIB hierarchy tree. This value is used by the Sun Management Center console to determine the root of the MIB hierarchy of an agent's MIB for browsing purposes.
The default browser root URL is:
snmp://<host>:<port>/sym/base/mibman/modules
This MIB object is a leaf node that supports the retrieval of information about modules that are currently loaded by the agent via SNMP.
Specifically, when a module name is set to this MIB node, the node returns the module name, module version, and number of loaded instances of the specified module. For example, setting the value fscan returns fscan 2.0 1, where fscan is the module name, 2.0 is the version, and 1 is the number of loaded instances.
Alternatively, by setting a blank value to the MIB node, the module name, module version, and number of loaded instances for all the modules currently loaded are returned as a list of sublists ({fscan 2.0 1} {mib2-system 2.0 1}).
The modules object is a branch MIB object that contains five module tables corresponding to the five module types: hardware, operatingSystem, localApplication, remoteSystem, and serverSupport. Each table contains the currently loaded modules, classified by their module type.
Each table contains the following columns:
In addition to the mibman branch in the .iso*base subtree, every Sun Management Center agent component MIB contains the info, trapInfo, trapForward, and control branches. This section describes these MIB branches.
The .iso*base.info branch contains nodes that provide general information about the host system, the agent, and modules installed on the system.
The system branch contains leaf nodes that provide the following information:
The agent branch contains leaf nodes that provide the following information:
The modules branch contains a table listing all the modules that can be loaded by the agent. The table contains the following columns:
The module count can be -1, 0, or some positive integer. A value of -1 implies that the module is currently loaded and cannot be instantiated multiple times. A value of 0 implies that the module is not currently loaded. A positive integer reflects the current number of loaded instances of the module and implies that the module can be loaded multiple times.
The .iso*base.trapInfo branch contains MIB objects whose values are included in the variable bindings of various enterprise specific traps that can be generated by the agent. The branch contains the following nodes:
The setTrapInfo method is the primary interface for sending the statusChange and valueRefresh traps whose variable bindings include the statusOID and refreshOID objects, respectively. This method takes the trap type (statusOID or refreshOID) as an argument to specify the type of trap to send. The method must be called from the context of the MIB object whose OID must be included in the trap message variable binding.
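For example, a MIB object that detects a status change might send the trap as follows; the exact invocation is a sketch based on the description above:

# Executed in the context of the affected MIB object (sketch).
setTrapInfo statusOID    ;# send a statusChange trap for this node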
The other traps have more specific functions and are intended to be used only by their respective users (eventInfo is used by event infrastructure, userConfig is used by usmUser MIB, and moduleInfo is used by the module load and unload methods).
The .iso*base.trapForward branch contains nodes that support trap subscription. Specifically, this branch contains the following nodes:
The subscription specifications supported by these nodes are described in Appendix G.
The .iso*base.control branch object contains the action and cache leaf nodes. The set security access of these nodes is restricted to users with administrative security privileges.
The action node supports the ability to shut down the agent. To shut down the agent, set the value to 2. The ability to restart the agent using this node is not supported in Sun Management Center 2.1.
The cache node is included in Sun Management Center 2.1 software. This node supports the ability to retrieve and manage the agent's current SNMP finder cache via SNMP. The current contents of the finder cache can be retrieved through a get request to the cache node.
Setting the cache node value to * clears all the entries in the finder cache. Setting the node value to a host name clears all the cache entries associated with the specified host. Setting the node value to a host name and port (host:port) clears all the cache entries associated with the specified host and port.
The Tcl/TOE commands and procedures that follow are available in all nodes for use as refresh commands or filters.
This function takes the name of a managed property as its only argument and returns the value of the managed property. This function must be executed in the node that is the superior of <node name>.
This function must be executed in a leaf node and returns the value stored for the specified <index>. If the leaf node is a scalar, <index> is always 0. If the leaf node is a vector (within a table), <index> can be any value from 1 to the number of rows stored in the table.
This function can be used to return all data stored in a leaf node. Like getValue, this function must be executed in a leaf node.
This function can be used to return data from a table. This function must be executed from a node that inherits from the MANAGED-OBJECT-TABLE-ENTRY primitive. If no <rowname> is specified, the function returns all the data. If <rowname> is specified, the data for the row referenced by that name is returned.
This function returns the number of rows stored in a table. This function must be called from a node that inherits from the MANAGED-OBJECT-TABLE-ENTRY primitive.
The getFilter qualifier specifies a Tcl command or procedure that is used to convert the data from how it is stored (in the data slice) to how it is returned after a 'get' operation. To function properly, the type of the object needs to match the type of the output of the getFilter.
This function can be used to set the value of a managed property. <index> is 0 for all scalar leaf nodes, and 1 or higher for a table property indicating which element in the vector is to be set. <value> is the value to be set.
Typically, this function is used together with toe_send to enable the evaluation of a command in the context of another object. This function recursively searches up the MIB tree for <node name> and returns the unique TOE ID of that object if it is found. <node name> can be an absolute path to the node (starting from .iso) or a relative path to the node <node1>.<node2>...
This function allows a command to be evaluated in the context of another object. The <toeid> is the TOE ID of the object in which the command is to be evaluated. The TOE ID of an object is typically determined using the locate command. <command> can be any Tcl/TOE command that is valid in the context of the node. For example, toe_send [ locate node1 ] getValue 0 retrieves the data value stored in node1.
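Put together, a refresh command or filter can read a peer property like this; the node path and variable names are illustrative:

set nodeId [locate system.sysDescr]       ;# find the peer node
set value [toe_send $nodeId getValue 0]   ;# run getValue in its context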
A useful data filter is the transposeFilter, which can be used to transpose a table of data.
This function accepts the name of a managed property and returns the rate of change per second for the managed property since the previous sample.
Same as rateFilter except for 64-bit integer values.
This function is similar to the rateFilter function, except that it operates on a list of data instead of a scalar.
Same as tableRateFilter except for 64-bit integer values.
This function computes the value of a named managed property as a percentage of another managed property.
This function accepts the name of two managed property peers, each of which contains the same number of values. The list of values associated with the first property is computed as a percentage of the list of values associated with the second property. The function returns a list of percentages.
This function is used to compute the slope of the line that best fits through a set of data values. This function accepts a single numerical argument. This value is stored along with previous values passed into this function. The number of data points stored internally is specified by the refreshParams qualifier.
This function provides a multiply and accumulate function to provide digital filtering capabilities. This function accepts a single numerical argument that is stored along with other values passed into the function.
The refreshParams qualifier specifies the coefficients of the filter. The sum of the coefficients must be one so that the result does not have to be normalized. The number of coefficients indicates the number of data points to store internally.
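For example (the qualifier value shown is a hypothetical sketch), a four-sample moving average can be expressed with four equal coefficients that sum to one:

refreshParams = "0.25 0.25 0.25 0.25"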
A status string can be retrieved for any node via SNMP through the shadowmap.
The status string is a sequence of tab-separated fields. It is constructed from the state and name of the node and other relevant information. The exact format of this status string may change as Sun Management Center software development progresses.
This section contains examples of status strings as they currently exist. The purpose of these examples is to show how the node state contributes to the status string, and how the status of underlying child objects is represented in the status of a parent branch object.
Consider a managed object, CPU, with managed properties of idle time and busy average. If there is no alarm condition on either of the managed properties, the shadowmap status strings are displayed as:
Idle Time Status:
{INF-0 fly Solaris Example CPU Idle Time OK snmp://204.225.247.154:161/mod/solaris/cpu/idle 0 882368193}

Busy Average Status:
{INF-0 fly Solaris Example Average CPU Usage OK snmp://204.225.247.154:161/mod/solaris/cpu/average 0 882368193}

CPU Status:
{INF-0 fly Solaris Example CPU Usage OK snmp://204.225.247.154:161/mod/solaris/cpu 0 882368193}
Now suppose that the idle time is in alarm because the system is less than 10% idle, and the busy average is in alarm because the system is more than 50% busy. Now the shadowmap status strings are displayed as:
Idle Time Status:
{ERR-5 fly Solaris Example CPU Idle Time < 10% snmp://204.225.247.154:161/mod/solaris/cpu/idle 25 882368193}

Busy Average Status:
{ERR-5 fly Solaris Example Average CPU Usage > 50% snmp://204.225.247.154:161/mod/solaris/cpu/average 25 882368193}

CPU Status:
{ERR-5 fly Solaris Example CPU Idle Time < 10% snmp://204.225.247.154:161/mod/solaris/cpu/idle 25 882368193} {ERR-5 fly Solaris Example Average CPU Usage > 50% snmp://204.225.247.154:161/mod/solaris/cpu/average 25 882368193}
Note - The overall CPU status is a list of the alarm statuses of the underlying properties.
In general, the content of a status string is given by a tab-separated string:
<alarm state>-<alarm severity>\t<host>\t<module name>\t<medium description>\t<alarm message>\t<snmp url>\t<alarm level>\t<timestamp>
where:
- <alarm state> is the alarm state value in nickname form (see TABLE A-2). This value drives the icon that is displayed in the console.
- <alarm severity> is a value from 0 to 9 that is used to rank alarms within each state.
- <host> is the name of the host that is generating this alarm.
- <module name> is the name of the module that is generating this alarm.
- <medium description> is the mediumDesc value of the node that is generating the alarm.
- For nodes using the rCompare rule, <alarm message> is <alarm check> <alarm limit> [<unit>]. In the preceding examples, <alarm check> is > or <, <alarm limit> is 10 or 50, and <unit> is %. Other alarm rules can set this message text explicitly.
- <snmp url> is the SNMP URL that corresponds to the node that is generating the alarm.
- <alarm level> is the numeric representation of <alarm state>-<alarm severity>. The conversion is <alarm state value> * 10 + <alarm severity>. For example, ERR-5 has an <alarm level> of 25. TABLE A-2 lists the default values for <alarm state value> and <alarm severity>.
- <timestamp> is the epoch time when the alarm limit was last evaluated.
TABLE A-2 Alarm Levels

Alarm State   State Value   Default Severity
OK            0             0
OFF           0             1
DIS           0             1
INF           0             5
WRN           1             5
ERR           2             5
IRR           2             7
DWN           2             9
When a module is loaded into the agent and viewed through the Sun Management Center console, information is cached and also saved to files. This is done for performance and to allow the information to be persistent across restarts of the agent. As a result, there are issues to consider when testing changes to a module:
Module definition files adhere to the following naming conventions:
<module><-subspec>-<descriptor>.<extension>

where:
- <module> is the module name.
- <subspec> is an optional qualifier for the module name.
- <descriptor> is one of a set of standard descriptors indicating the purpose of the file.
- <extension> is one of a set of standard file extensions indicating the file type.
By convention, the <module> and <subspec> portions of the filename are common for all files associated with a specific module. This allows related module files to be easily grouped together while eliminating the chances of filename contention with the definition files of other modules. The following are standard descriptors for module definition files:
-d            Daemon file
-models-d     Model file
-m            Parameter file
-ruletext-d   Rule message text file
-ruleinit-d   Rule initialization file
Additional standard descriptors are:
-j   Java console file
-s   Java server file
The following are standard extensions for module definition files:
.x            File in module configuration file format
.def          Default file
.flt          Tcl/TOE filter file
.prc          Tcl/TOE procedure file
.tcl          Tcl commands and procedures
.sh           Executable shell scripts
.dat          Data file
.rul          Tcl/TOE rule file
.properties   Internationalization text file
For example, the module definition files for the Solaris Example module are named as follows:

solaris-example-m.x      Solaris Example parameter file
solaris-example-d.x      Solaris Example agent file
solaris-example-d.def    Solaris Example alarm file
solaris-example-d.flt    Solaris Example filter file
The following lists the required definition files.
TABLE A-3 Mandatory Module Files

<module><-subspec>-m.x                                Parameter file
<module><-subspec>-models-d.x, <object*>-models-d.x   Model files (may be multiple files)
<module><-subspec>-d.x                                Agent file
The following optional files can be defined for each module, depending on the module implementation requirements:
TABLE A-4 Optional Module Files

<module><-subspec>-d.flt          Filter file
<module><-subspec>-d.prc          Procedure file
<module><-subspec>-*.sh           Shell scripts (can be multiple files)
<module><-subspec>-d.rul          Rule file
<module><-subspec>-ruleinit-d.x   Rule initialization file
<module><-subspec>-ruletext-d.x   Rule message text file
<module><-subspec>.properties     Properties file
ServerOverrideBundle.properties   Server override properties file
<module><-subspec>-oids-d.dat     Module OIDs file
<module><-subspec>-traps-d.x      Traps file
<name>16x16-j.gif                 Standard icon file
<name>32x32-j.gif                 Topology icon file
<module><-subspec>-d.def          Alarm file
If binary extensions or packages are used by a module to facilitate or optimize data acquisition and alarm processing, one or more of the following files can exist also:
TABLE A-5 Binary Extension Files

<module><-subspec>-shell.tcl   Package load commands
pkg<module><-subspec>.so       Standard Tcl package shared object
lib<module><-subspec>.so       Standard UNIX shared object
Each of the files listed above is discussed in detail in the following sections.
All module files except the following must be installed in the /opt/SUNWsymon/modules/cfg directory of the agent host. The exceptions to this rule are:
The list of modules available in the Load Module console window is determined only when the agent is first started. When a new module has been added, this list can be updated by forcing the agent to redetermine the list of available modules. To do so, right-click in the Load Module window and select the Refresh menu option. The update of the list can take a while, depending on the number of modules available. The list can also be updated by restarting the agent.
This section covers the following topics:
This section describes the concepts and techniques used in Sun Management Center software to construct models of the entities to be managed. It also describes the mechanisms employed by the Sun Management Center agent to enable these models to gather data, determine status, and perform actions on the managed entities.
For more information about management modules, refer to Chapter 5.
This section describes how the entities to be managed by the Sun Management Center agent are modeled using TOE objects and primitive classes. It also describes how alarm conditions associated with the managed entities are represented.
Sun Management Center software is based on the object oriented paradigm, in which objects are used to model the various aspects of a system for the purpose of managing that system. The physical and logical components of a system that are being managed are referred to as managed entities. Managed entities can be disks, boards, hosts, clusters and networks. Managed entities that are host platforms are referred to as managed nodes.
The various types of managed entities are modeled using managed object classes, and these classes are combined to form a meta model for a particular system, the structure of which accurately models the structure of the managed entities it represents. To perform management functions, models must be realized in a process running on a managed node, at which time each managed object class in the model is instantiated into a managed object.
Because of the hierarchical nature of the components of a system, managed entities can be the aggregation of other managed entities. Similarly, managed objects that are instantiated during the realization of a model are considered to be the aggregation of all the subordinate managed objects in that model.
An example of this would be a host that is composed of a power supply, boards, a chassis and other components. The host and all subcomponents are considered managed entities, even though the host entity collectively includes the others. In the model of such a system, the managed object class representing the host is an aggregation of the classes representing the other entities. In a realization of this model, the host managed object is an aggregation of managed objects representing the power supply, boards, a chassis and other components.
In Sun Management Center software, models of managed entities usually take the form of a management module, and the tree structure of the managed objects and properties within a module is often referred to as a Management Information Base, or MIB.
Managed entities are modeled using managed objects, which are instances of managed object classes. The managed properties of the managed entities directly correspond to the properties of the managed object classes used to build the model. In a realization of a model, it is these managed properties that contain the information pertinent to the monitoring and management of the managed entity.
When realizing a model, a tree of TOE objects is created that implements the structure and functions of the model. In this realization, a MIB node object is created for every managed object and every managed property in the model.
These objects are derived from a set of primitives, which in turn are derived from the TOE MIB node class. This class implements much of the required management functionality, including timed data acquisition, alarm status checking, rule execution, and alarm creation. The object instances are therefore quite adept at general management functions, and the model that describes them is responsible for configuring them for their specific management purpose.
One subtlety of this approach must be understood. The use of a TOE object to represent a managed object would be very straightforward were it not for the fact that the properties of the managed object cannot be modeled directly by the properties of the TOE object. If they were modeled directly, the set of properties available to be managed by a TOE object would be limited to the set of properties not used by the TOE object internally to perform its management function. In other words, there could be contention between the object properties and those of the managed object in the model.
It is for this reason, as well as for the simplification of the TOE object implementation, that the properties of managed objects are represented using separate TOE objects. This is a natural function for these objects, which exist primarily to acquire data and take management actions. This means that objects in the realization can correspond to properties of the managed entity, and the properties of the TOE object can, in fact, correspond to qualifiers of the managed entity. This remapping in the realization is necessary given the realization mechanism used.
For example, a file system can be modeled as a managed object represented by a TOE object in the Sun Management Center agent. The file system size, however, would be modeled as a managed property and represented by another TOE object, instead of by a property of the TOE object that represents the file system.
The construction of management models involves the use of management primitives, which are object classes that exhibit specific management behavior. These primitives correspond to the following model elements:
Managed properties are divided into specific primitives based on the data type of the property and the types of alarm checks to be performed on that object (such as integer type with high limits or string type with regular expression checks).
Primitives are composed of several property classes. This means that the type, function, and behavior of the primitive is defined by several broad categories of properties. TABLE A-6 lists the five property classes used to define primitives.
TABLE A-6 Property Classes Used to Define Primitives

Property Class       Description
Structural           Object tree structure properties
Technique specific   Properties pertaining to security and communication protocols
Realization          Properties defining the data acquisition operations of the object
Management           Properties specifying operational ranges and alarm actions
Management Rules     Inference-based rule specification properties
FIGURE A-16 shows the composition of object primitives using these five property classes.
A set of object primitives that covers all the major object types and alarm check scenarios is available for constructing models. These primitives intrinsically define all properties pertinent to SNMP access and ASN.1 description, including the ASN.1 type and the access communities.
Using primitives when constructing models will therefore define most of the properties in the structural and technique-specific property classes. Properties of the other classes, such as refresh information (for data acquisition) and alarm limits (for status determination) can then be added to the model.
All of the properties associated with a MIB object are effectively defined in the TOE object that represents the MIB object. Most of these properties are accessible through SNMP and the shadow MIB.
One of the primary purposes of management models is to detect system events. There are two types of events that can be detected, hard events, which are specific occurrences within the system (such as a disk crash or a process termination), and soft events, which correspond to a managed property going into or out of an arbitrary range. Hard events can be detected in a very objective way, usually through the presence of a message in a log file or a specific indication in a data acquisition operation. Soft events, on the other hand, are very subjective, and their occurrence is purely a function of the operational limits associated with the related property or properties.
The nodes of a MIB tree attempt to ascertain the condition of the managed system entities with which they correspond. All changes in an entity's condition correspond to a system event, and the detection of a system event typically leads to a change in the status of a managed object or managed property. Changes in status lead to the creation of an alarm event, which is passed through the system as an indication that the event occurred. It is the creation of these alarms that is of primary importance in the monitoring process.
Alarms contain all of the information useful to clients interested in a particular event. This information includes the identity of the managed node on which the event was detected, a readable portion describing the nature of the event or of the current condition of the entity, a severity number, the time of detection of the event, and the URL of the managed object or property which detected the event. Alarms are intended to be globally valid, and thus all fields, including the readable portions and the URL, are sufficiently qualified to make them completely unambiguous in a global context.
Alarm objects can contain the following fields:
The fields in the alarm object are separated by tabs.
Sun Management Center agents manage objects by autonomously collecting and monitoring data. The agents use simple alarm checks and/or rules based technology to determine the status of the managed objects. The agent can then automatically generate alarms or perform actions based on the detected conditions, thereby providing predictive failure capabilities and automanagement. The agents make data and status of the managed objects available to the Sun Management Center server and Sun Management Center console layers.
A fully realized model will perform monitoring and management operation at regular intervals or on demand. The objects within the model perform certain operations to achieve this, and the results of these operations are well defined.
In a typical management scenario, the following sequence of events occurs for a managed object or property:
Essentially, the nodes in the tree autonomously gather data, place it in the appropriate objects or properties, check limits, fire rules, and take action on state changes. In a normal scenario, no interaction is required between the manager and the agent in order to perform management operations, and the only communication required is the trapping of alarms on state changes.
To refresh the information in the MIB tree, data acquisition operations must be performed. In Sun Management Center agents, this is generally referred to as the refresh operation. Typical refresh operations manifest themselves as the invocation of a refresh command in the context of a refresh service. A refresh service is an object within the agent that can be used for data acquisition. A refresh command is a service-dependent command that defines the specific operation to perform. Conceptually, the refresh command is sent to the refresh service each time a refresh is triggered.
Refresh services can be any object supporting the service interface. Typically, refresh services can include such things as:
Services are discussed in detail in the Agent Framework chapter in this document.
Note - Since the agent is single-threaded, it is blocked while running Tcl commands in the internal service. If a Tcl command is expected to take a significant amount of time to return its result, a Tcl subshell service should be employed to execute the command. The Tcl subshell process can load the required Tcl extension(s) so that it can execute Tcl commands and return the results to the agent asynchronously.
The data cascade is the dissemination of a buffer of data into a tree of managed objects or properties. Because the rules governing data updates are strictly defined, a wide variety of data acquisition scenarios are possible. Data can be acquired one piece at a time and placed into managed properties, or larger amounts of information can be acquired in a single data acquisition operation and cascaded into several managed properties or even several managed objects.
In general, all data acquisition operations are initiated by an active node. An active node is a managed object or property that has refresh information associated with it. Active nodes can be managed objects, managed property classes, or managed properties, depending on the desired cascade scenario. Also, properties can be either scalar or vector in dimension, and this affects the data update operation.
Conceptually, the data cascade consists of a tree node acquiring information and either placing the information in itself (in the case of managed properties) or passing it down to its inferiors in the tree, which in turn either consume it or pass it on. Data left over from one inferior subtree is passed on to the next inferior subtree until all the data is consumed. Failure to consume all the information, or insufficient information to fill the tree, constitutes an overflow or underflow condition, respectively.
Overflow conditions are not detected by the agent since extra data is discarded. Underflow conditions are not directly detected by the affected nodes. However, other nodes or external clients that query the node for its value detect the absence of data and flag the condition.
Because the structure of the object tree can vary infinitely, the manifestations of the data cascade can vary as well. In practice, however, there are only a few common cascade scenarios, which lend themselves to several broad categories of tree structure. These cascade scenarios are described in the following sections.
In this scenario, a property node (which is always a leaf of the tree) is the active node. It initiates a data acquisition (DAQ) operation, receives the results, and places the information in itself. The property is scalar in dimension, meaning it represents one datum; hence the DAQ operation must return one and only one piece of data.
An example of an active scalar node is the system uptime managed property. The refresh command of this node computes the system uptime and the uptime value is stored in the node.
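For illustration, the refresh information of such an active scalar property might be specified in a module configuration file along the following lines. This fragment is a sketch only: the node and command names are hypothetical, an internal Tcl service named _internal is assumed, and the key names are modeled on the refresh parameters described in this section.

sysUptime = {
    type            = INT
    access          = RO
    refreshService  = _internal
    refreshCommand  = "getSystemUptime"
    refreshInterval = 60
}

Each time the interval expires, the refresh command is sent to the refresh service, and the single returned datum is stored in the node.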
As in the active scalar case, active vector cascades result from a single property being the active node. In this case, however, the property is a vector, meaning it represents zero or more pieces of information. The DAQ operation must return zero or more pieces of data, all of which are placed, in order, into the managed property.
An example of an active vector node is a managed property that stores the list of files in a directory. The refresh command runs the UNIX ls command, and the resulting list of files is stored in the node.
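A vector refresh command could be a Tcl procedure like the following sketch (the procedure name is hypothetical; as noted earlier, the agent blocks while internal Tcl commands run, so slow commands belong in a Tcl subshell service):

proc listDirectory { dir } {
    # Hypothetical vector refresh command: returns zero or more file
    # names, all of which are placed, in order, into the vector
    # managed property.
    set out [exec /usr/bin/ls $dir]
    if { $out == "" } { return {} }
    return [split $out "\n"]
}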
In this scenario, a branch of the node tree is the active object. This branch can be a managed object, managed object table, or a managed property class, but it is never a managed property (managed properties are always leaves). Under this branch are several scalar leaves (managed properties), each requiring one datum per refresh. The DAQ operation in the branch returns several pieces of data; the data is passed first to the first leaf node, which consumes one piece, and then on to the subsequent leaf nodes, each of which consumes another piece. The amount of data returned by the refresh operation must match the number of leaves under the active node, or an overflow or underflow condition occurs.
An example of a compound scalar is a set of nodes modeling the one, five, and fifteen minute load averages of a system. A load managed object is the active branch; under this branch are the one, five, and fifteen minute load average managed properties. The refresh command of the active branch returns the three load average values, and these values are cascaded into the three child nodes.
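A sketch of such a refresh command follows (the procedure name and the parsing of uptime(1) output are hypothetical). The essential point is that the command returns exactly as many values as there are scalar leaves under the active branch:

proc getLoadAverages {} {
    # Hypothetical compound scalar refresh command: returns exactly
    # three values, which cascade, in order, into the one, five, and
    # fifteen minute leaf properties.
    set out [exec /usr/bin/uptime]
    return [lrange [split [string map {, ""} $out]] end-2 end]
}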
This scenario, also known as a table cascade, arises when a branch of the node tree contains several property leaves, all of which are vector in dimension. In this case, data cascading down the tree is placed into the vector leaves in equal amounts, with the data order interpreted as row-major and the property leaves treated as columns of a table. If there are N leaves in the tree, then the DAQ operation must return exactly M*N pieces of data, where M is the resulting table depth.
An example of a compound vector is a set of nodes modeling a file system table that contains information for each file system partition. Possible columns in this table are the partition mount point and size. The refresh command of the branch then returns the mount point name and corresponding size of each file system partition.
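The following sketch shows the shape of such a table refresh command; the procedure name and the returned values are fabricated for illustration. With two column leaves (mount point and size), the six values below fill a table of depth three:

proc getFileSystemTable {} {
    # Hypothetical table refresh command: returns M*N values in
    # row-major order (mount point, size, mount point, size, ...),
    # which are distributed into the two vector column leaves.
    return {/ 8192 /usr 16384 /var 4096}
}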
The complex case is a mixture of the preceding scenarios. In the complex case, the information is passed down through the tree using the general mechanism described above. Scalar leaves consume one piece of data, tables consume M*N pieces of information, and simple vectors consume all they are given.
An example of a complex cascade scenario involves augmenting the file system table example described earlier with an additional managed property that stores the number of file system partitions. In this case, the active object's refresh command returns the number of partitions, followed by the mount points and sizes of each file system partition.
In this scenario, active nodes are placed under other active nodes in the node tree. As a rule, active nodes do not accept information from higher-level cascades. Hence, in this case, the higher-level cascade bypasses the nested active node, and the nested object is responsible for refreshing itself and/or the tree of nodes below it.
An example of a nested heterogeneous cascade is a set of nodes modeling the process usage of a system. The managed properties consist of two passive nodes (the number of active processes and the number of sleeping processes) and an active node (the maximum number of available process slots). The active branch object's refresh command returns the numbers of active and sleeping processes; the active leaf node's refresh command returns the maximum number of available process slots.
Derived Heterogeneous
Similar to the nested heterogeneous case, this scenario involves a derived node placed under an active node (or another derived node) in the node tree. Like active nodes, derived nodes do not accept information from above. In this case, however, the DAQ operation of the derived node may depend on, and hence be triggered by, the update of the objects around it. The firing of the refresh operation of the derived node is therefore intrinsically linked to the data cascade from the superior object.
An example of a derived heterogeneous cascade is a set of nodes modeling the swap usage of a system. The managed properties consist of two passive nodes (current swap usage and total swap) and a derived node (percentage of swap used). The active object's refresh command returns the current and total swap values; the percentage of swap used is then computed from the two returned values.
A derived node is a member of the MIB tree that uses other MIB nodes as the service(s) for its refresh. In other words, its value is a function of the values or qualifiers of one or more other managed properties. Through the use of derived variables, it is possible to create nodes whose value represents averages, rates of change, specific digital filters (for example, high pass, low pass, or band pass) or other useful calculated information.
Derived nodes establish dependency relationships with the nodes on which they rely through the use of the refresh triggers specification. Nodes can be triggered by a change in the value or status of one or more other nodes, and refresh automatically when any of the specified events occur. Derived nodes can also update at an interval, although this is usually unnecessary if the triggers are specified properly.
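As a sketch of the swap example above, the refresh command of a derived node could be a Tcl procedure like the following. The argument-passing convention is an assumption for illustration; in practice, the dependency and trigger specifications follow the module configuration file format described previously.

proc computeSwapPct { used total } {
    # Hypothetical derived-node refresh command: computes the
    # percentage of swap used from the two managed properties on
    # which this node depends; it fires when a refresh trigger
    # reports that either property has changed.
    if { $total <= 0 } { return 0 }
    return [expr { 100.0 * $used / $total }]
}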
After completion of the full refresh operation (the refresh request and the subsequent data cascade), a set of refresh actions occur. For nodes in a MIB tree, these actions include the alarm rule checks, which involve checking the data values of the managed properties against a set of alarm criteria.
These alarm checks determine the current status of the managed entities being monitored, as described in the information model. Alarm checks can be classified as simple comparison checks or more complex rule evaluations.
Simple Comparison Checks
Simple comparison checks apply only to single data entries of managed properties and are usually dyadic relational operations involving numeric limits, regular expressions, or comparison strings. The output of these checks is a status code; the status produced corresponds to the state associated with the most severe alarm check that tests positive. If none of the checks are satisfied, the node is considered to be in the ok state, and nodes with no alarm checks are always considered ok.
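For illustration, simple comparison checks might be attached to a managed property as alarm limit qualifiers in its configuration. This fragment is a sketch: the qualifier names alarmlimit:error-gt and alarmlimit:warning-gt are assumptions modeled on the shadow attribute conventions, and the node name is hypothetical.

pctUsed = {
    type                  = FLOAT
    access                = RO
    alarmlimit:error-gt   = 95
    alarmlimit:warning-gt = 90
}

With these limits, a value above 95 satisfies the most severe check and yields the error state, a value above 90 yields the warning state, and any other value leaves the node ok.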
Rule Evaluation
Rules provide a mechanism for specifying customized alarm checks in place of the standard alarm checks that perform simple comparisons. Rules are potentially complex expressions involving the values or status of one or more MIB nodes, and they generate values or status corresponding to the outcome of their computations. Unlike simple comparison alarm checks, rules can embody complex comparisons, computations, and relationships, and the status they produce can represent a very informed decision.
Each rule in the agent has a corresponding MIB node, and this node triggers the evaluation of the rule, maintains any rule-specific qualifiers, and acts as a repository for the resulting data or status.
Having a one-to-one correspondence between rules and MIB nodes facilitates both the triggering of the rule and the generation of alarm objects, as the identity of the MIB node generating the alarm must be placed in the alarm. The URL in the alarm can then point back to a node that represents the rule. Acknowledgment of the alarms generated by a rule and the editing of rule-specific qualifiers can be done through the use of the rule's URL.
Using this approach, the technology that evaluates rules is independent of the triggering mechanism and the alarm generation. Because the rule is fired by the standard triggering mechanisms, and because the values or status of all nodes on which the rule depends can be passed to the rule at the time of triggering, the rule need only implement the relevant computation or comparison and return the ensuing data or status. Taking advantage of this, a simple Tcl-based rule mechanism is available for implementing the body of a rule, and support for rules based on commercial, third-party inference engines can be added easily in the future.
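A schematic Tcl rule body follows. Everything here is an assumption for illustration: the calling convention (dependent node values passed as arguments) and the status tokens returned are stand-ins for the actual rule interface.

proc swapShortageRule { used total reserved } {
    # Schematic rule body: combines several dependent values into one
    # informed status decision, which a simple dyadic check cannot do.
    if { $total <= 0 } { return indeterminate }
    set pct [expr { 100.0 * ($used + $reserved) / $total }]
    if { $pct > 95 } { return error }
    if { $pct > 90 } { return warning }
    return ok
}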
If a change in alarm status is detected, an alarm object (as described earlier in the information model) is generated and the following alarm actions are triggered:
Status Propagation
Detection of a system event causes a change in the status of the corresponding managed property in the MIB tree. This change in status must be reported to the property's superiors in the MIB tree, as these superiors correspond to the managed objects or managed property classes to which the property exhibiting the status change belongs. By propagating status at the time of a status change, managed objects and properties at every level of the MIB tree remain in sync with the current state of their inferiors.
This upward passing of status information is typically referred to as hierarchical summarization, and is very important to the operation of both the agents and the management layer. By permitting managed objects at all levels to describe their own status, the determination of status at the server and console levels is greatly simplified.
Status lists are placed in objects at each level of the MIB tree. As an example, consider a tree in which a resources object contains a cpu object, which in turn contains count, load, usage, and threads objects, with idle and busy properties under usage.
Each object in the tree can be queried for its status using shadow SNMP operations. Leaf objects such as idle and busy contain only their own statuses, generated from the alarm checks.
Branch objects reflect the status of all their children by containing a list of all exceptional (not ok) status conditions. For example, the status of the usage object contains any alarm status conditions of the idle and busy properties. Similarly, the cpu object's status is based on the count, load, usage, and threads objects. Finally, the resources object's status contains the statuses of all the objects in this example tree.
The ability to query the status of any managed object in the MIB tree allows agents to logically combine the status of many disjoint, structurally unrelated managed objects into a single logical element group. Logical element groups can then be used to extend the managed object hierarchy beyond a single agent.
Alarm Status Change and Event Traps
Alarm objects are passed to the management layers through the transmission of SNMP traps. Specifically, two SNMP traps are generated when the status of a managed object changes:
These traps are used by the management layers to facilitate centralized event management and alarm correlation.
Event Propagation
When a change in the alarm status of an object is detected, event information is written to a circular log file on the local host and an event trap is sent to the event manager.
The default event log destination is specified through the agent's status channel output specification (that is, statusOutput in the file base-config.x). For example, the default event log destination for the Sun Management Center agent is specified as follows:
statusOutput = "clog://localhost/../log/agentStatus.log;lines=250;width=200;flags=rw+;mode=644"
The event trap sent to the Event Manager causes the Event Manager to request all of the event information that it has not previously retrieved from the agent. This is accomplished by tracking the last known line number in the event file and the file creation time.
The Event Manager then stores the retrieved events in the event database where it can be accessed from the Sun Management Center console.
If an event trap is lost, the Event Manager does not immediately request the event information corresponding to the trap. However, upon reception of the next event trap, it will retrieve all events not previously retrieved.
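The retrieval bookkeeping described above can be sketched as follows; the procedure name and calling convention are hypothetical:

proc eventsToFetch { lastLine lastCtime curCtime totalLines } {
    # Schematic: if the event log file was recreated (its creation
    # time changed), fetch everything from line 1; otherwise fetch
    # only the lines after the last known line number.
    if { $curCtime != $lastCtime } {
        return [list 1 $totalLines]
    }
    return [list [expr { $lastLine + 1 }] $totalLines]
}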
Alarm Logging
All statusChange traps received by the Trap Handler are logged to the trap output channel. By default, the log destination is defined in the file base-config.x to be a circular log file.
trapOutput = "clog://localhost/../log/alarms.log;lines=1000;width=200;flags=rw+;mode=644"
User-Defined Alarm Actions
If any user-defined alarm actions were specified for the managed property and the detected alarm condition, those actions are performed. The execution of the alarm action is logged in the agent's circular log file; however, the user must explicitly redirect output from the script to a file if this information is required.
User-defined alarm actions are entered through the Actions tab in the Attribute Editor, and at present are applicable only to leaf nodes. An alarm action can be specified for each of the possible alarm levels, namely critical, alert, caution, indeterminate, and close, as well as for any change in alarm state.
The alarm action entered is the name (without a path) of a user-defined Bourne shell script placed in the bin subdirectory of the directory named by the environment variable ESDIR. The script must be owned by root and must be executable. Command line arguments can also be specified following the name of the script. The requirement of root ownership provides an added measure of security, ensuring that only privileged users can create scripts that run automatically.
Special command line arguments that have the following significance can be specified.
TABLE A-7 Special Command Line Arguments

Argument            Significance
%rowname            row name
%state              current alarm state
%prevstate          previous alarm state
%value              current value
%statusstringfmt    formatted status string (similar to the message in the console tooltip)
There is also a special script for sending email to specified users. The script name to use is simply email, followed by one or more space-separated UNIX user names. This script causes an email message to be sent to the specified users with text such as the following:
SyMON alarm action notification ... statusstring: Critical yangtze Solaris /var Space Used > 90%
The Management Information Base (MIB) is the realization of the managed objects and properties that comprise the management modules currently loaded by the Sun Management Center agent. The MIB is embodied by the ISO subtree described previously.
The MIB makes all the managed objects and properties accessible to other Sun Management Center components through SNMP. The MIB also contains infrastructure for loading management modules and arbitrating user interactions with managed objects and properties.
Modules are the lowest level of granularity of management models. They embody a set of managed objects and their corresponding properties, and are designed to fulfill a particular management requirement. The scope of a module is typically broad enough that a loaded module completely satisfies that requirement.
Modules are defined using the module configuration file format, described previously. This specification represents a model that, when loaded, creates a tree of TOE objects configured to perform the functions defined by that module. The act of loading an X file into a running agent corresponds directly to the realization of the object model, since the relationship between the information model and the underlying object technology is very close.
The management functions of modules can be enabled and disabled through an SNMP request or through the specification of the module's active time window. Disabling a module simply deactivates the autonomous data acquisition normally performed by the module's nodes.
The agent supports the concept of a shadow MIB, which provides SNMP access to attributes associated with the managed objects and properties in the agent MIB. These attributes are also referred to as qualifiers.
The default shadow attributes for all managed objects and properties are specified in the file base-shadowmap-d.x. These shadow attribute specifications can be overridden for specific managed objects and properties by specifying the relevant parameters in the appropriate object's configuration file.
Some of the default attributes that are accessible through shadow operations include:
The Sun Management Center agent MIB supports the specification of MIB objects that gather data or execute actions only on demand. These MIB objects are accessible through SNMP, and their execution would normally be triggered by ad-hoc SNMP requests originating from a Sun Management Center GUI client.
Note - These ad-hoc MIB objects do not gather data autonomously and hence are not intended for monitoring entities or determining their statuses.
The commands executed by these MIB objects must be synchronous so that the command result can be returned in the SNMP response. Examples of synchronous commands include Tcl command extensions and Tcl procedures. Shell commands are not permitted because they are asynchronous. Note that the agent process is blocked while a synchronous command executes; this blocking is a very important consideration when designing synchronous commands.
Examples of ad-hoc SNMP requests include:
For example, the managed property related to file statistics can be associated with the ad-hoc operation to retrieve the file contents.
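As a sketch of such a synchronous command (the procedure name and the size cap are hypothetical), an ad-hoc Tcl procedure returning file contents might look like this:

proc readFileContents { path } {
    # Hypothetical ad-hoc command: executes synchronously so that its
    # result can be returned in the SNMP response. The read is capped
    # because the agent blocks for the duration of the call.
    set f [open $path r]
    set data [read $f 4096]
    close $f
    return $data
}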
Other MIB objects can be associated with one or more ad-hoc operations by specifying the appropriate ad-hoc MIB objects in their ad-hoc command shadow attribute. This list of ad-hoc command specifications is accessible through shadow SNMP operations. For example, the managed property related to process statistics can be associated with the ad-hoc operation that retrieves the process table.
The Sun Management Center agent MIB also supports the specification of MIB objects that facilitate the establishment of a stream-based connection between a probe client and an agent-spawned probe server. These connection-based operations are referred to as probe operations and are typically initiated on an ad-hoc basis by the probe client (for example, a Sun Management Center GUI client connected to the Sun Management Center server). The involvement of the Sun Management Center agent permits the use of a consistent security model (namely, SNMP usec security) when executing probe requests.
Ad-hoc probe operations are used to support:
For example, the managed property related to scanning a logfile can be associated with the ad-hoc operation to view the logfile.
Probe operations are facilitated by the probe server that the Sun Management Center agent runs when servicing probe requests.
The probe server is a generic process that does the following:
Establishing a Probe Connection
To establish the stream connection between a probe client and agent spawned probe server, the following mechanism is employed:
The data logging interface allows module developers to specify certain values to be logged at regular intervals. These logs can be used at a later time for processing, statistical analysis, diagnosis, and similar functions.
The Data Logging Interface for the Sun Management Center 3.0 Developer Environment offers an enhanced way of retrieving data log information compared to the previous version. This section details the configuration needed for data logging.
You can configure the SunMC agent to log data in the conventional format described in version 2.1.1; however, this conventional interface is deprecated and will be unavailable in a future release of SunMC. Until then, you can make a one-line change in the configuration file to view data logs in either the 2.1.1 format or the 3.0 format. The 3.0 format eliminates unnecessary categories of information and provides clear delimiters.
Registry of Current Data Logging Requests

The Sun Management Center agent can be configured to periodically log any managed property in the agent MIB to an internal data buffer and/or to an interface URL for persistent storage.
Each agent maintains a persistent registry of the current data logging requests to ensure that data logging continues when the agent is restarted. The agent can also load a data logging module that allows a console user to view the contents of the data logging registry.
The logging of the value of any managed property to an internal history buffer can be enabled or disabled through shadow SNMP operations. The length and logging interval of the internal history buffer are configurable through shadow SNMP operations.
The buffered data is accessible through shadow SNMP operations. This data is not persistent and is used for such things as graphing.
The logging of the value of any managed property to a circular or regular log file can be enabled or disabled through shadow SNMP operations. The logging destination and logging interval are also configurable through shadow SNMP operations.
Configuration

To access a specific version of the format, define the configuration parameter in the agent file for the module. This specification applies to data that is logged to files.
A new configuration parameter can be defined to force logging in the new format.
historyVersion = 1|2
This parameter is optional and defaults to 1 (the previous version) if not defined. If the value is 1, the data is logged in the 2.1.1 format shown below; if the value is 2, the data is logged in the new format.
By default, the data is logged in the following format:
<channel> <date> <component> <alarm code> <host> <module instance> <module name> <managed property> = <value> <units> <URL> <alarm severity> <timestamp>
where:
The data log format will be different in the following cases:
The following is a sample line logged by the agent in the 2.1.1 format:
historylog1 Mar 22 08:37:51 agent INF-0 tushara MIB-II Instrumentation number of interfaces = 2 snmp://129.146.53.61:1161/mod/mib2-instr/interfaces/ifNumber 0 953690072
The new format for data log is as follows:
<version>{<channel>} {<timestamp>} {<module name>[+<instance>]} {<managed property>} {<value>} {[<units>]}
where
- The fields, except for the version, are surrounded by {} and are separated by zero or more white space characters.
- <version> is an integer that defines the version of the log format. The rest of the fields are fixed for a given version. For this format, the version is always 2.
- <channel> is the name of the diagnostic channel this message was logged under.
- <timestamp> is the time in seconds that have elapsed since midnight January 1, 1970 (GMT).
- <module name> specifies the module name.
- <instance> specifies the module instance, for modules that can be instantiated multiple times. This is an optional field.
- <managed property> is the full name of the managed property being logged. For example, for the system uptime in the MIB-II module, it would be:
.iso.org.dod...mib2.system.sysUptime
- <value> is the value of the managed property.
- <units> specifies the units of the property value. This is an optional field.
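For illustration only, the 2.1.1 sample line shown earlier might appear in this format roughly as follows. The module name, the abbreviated full property name, and the empty units field are hypothetical renderings, not output captured from an agent:

2 {historylog1} {953690072} {mib2-instr} {.iso.org.dod...mib2.interfaces.ifNumber} {2} {}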
Note - The above format applies only to the data that is logged to files. The interface for data that is logged to the cache is unchanged.
Each managed property can be logged to a standard log file or to a circular log file to conserve disk space.
Logging to files can be considered short-term storage. Conversely, logging to a database can be considered long-term storage. Data logged to files can be transferred to a database in batch fashion; this functionality is not within the scope of standard agent data logging.
If more than one managed property is logged to the same destination, the logged data is interleaved. This should not pose a problem since each logged data entry is tagged with its name and timestamp.
The current design of the Sun Management Center agent does not include facilities to retrieve the data logged to a URL through shadow SNMP operations.
The data logging registry maintains a table containing information about the data currently being logged. This functionality is implemented in the form of a service and makes data logging requests persistent.
The registry contains a table to store the following information for each data element to be logged:
This table is accessible by a data logging registry module to allow Sun Management Center console users to view the data currently being logged by a single agent. The module can be extended to allow console users to do such things as add, edit, and delete data logging request entries. These additional functions are not supported in Sun Management Center software.