
caa(4)

NAME

caa - Cluster Application Availability (CAA) information

SYNOPSIS

Application resource profile:

    NAME=resource_name
    TYPE=application
    [ACTION_SCRIPT=action_script]
    [ACTIVE_PLACEMENT={0|1}]
    [AUTO_START={0|1}]
    [CHECK_INTERVAL=check_interval]
    [DESCRIPTION=description]
    [FAILOVER_DELAY=failover_delay]
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]
    [HOSTING_MEMBERS=member_list]
    [OPTIONAL_RESOURCES=resource_list]
    [PLACEMENT=placement_policy]
    [REQUIRED_RESOURCES=resource_list]
    [RESTART_ATTEMPTS=restart_attempts]
    [SCRIPT_TIMEOUT=script_timeout]

Network resource profile:

    NAME=resource_name
    TYPE=network
    [DESCRIPTION=description]
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]
    SUBNET=subnet_addr

Tape resource profile:

    NAME=resource_name
    TYPE=tape
    [DESCRIPTION=description]
    DEVICE_NAME=device_name
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]

Media Changer resource profile:

    NAME=resource_name
    TYPE=changer
    [DESCRIPTION=description]
    DEVICE_NAME=device_name
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]

OPERANDS

TYPE={application|network|tape|changer}
    Resource type. Specify application, network, tape, or changer.

NAME=resource_name
    Resource name. Specify resource_name as a string made up of the characters a-z, A-Z, 0-9, period (.), and underscore (_). The resource name may not start with a period (.).

[DESCRIPTION=description]
    Resource description string.

[CHECK_INTERVAL=check_interval]
    Interval (in seconds) at which the check entry point of the application's action script runs. The check interval is the maximum amount of time an application can be unavailable to clients before CAA attempts to restart it. A check interval of 0 means never check the resource.

[FAILURE_THRESHOLD=failure_threshold]
    Number of times CAA may detect a resource failure within the failure interval before it marks the resource unavailable and stops monitoring it. If you do not specify a failure threshold, CAA uses a default failure threshold of 0 (zero), which turns off failure threshold monitoring.

[FAILURE_INTERVAL=failure_interval]
    Time (in seconds) during which the failure threshold is tallied and applied. If you do not specify a failure interval, CAA uses a default failure interval of 0 (zero), which turns off failure threshold monitoring. Specifying a nonzero failure interval is meaningless unless the failure threshold is also nonzero.

[REQUIRED_RESOURCES=resource_list]
    Ordered list of resources, separated by white space, on which the application depends. These resources must be active on any member on which the application is running, or must be application resources that may be started on the cluster member. If you do not specify a required resources list, CAA imposes no required dependencies upon the application resource.

    CAA uses the required resources list, in conjunction with the placement policy and hosting members list, to determine which members are eligible to host the application resource. It also uses the required resources list to start required application resources when the caa_start command is run with the -f option.

    A failure of a required resource on the hosting member causes CAA to initiate relocation of the application if the failed resource is available or can be started on another member. This could cause CAA to fail the application resource over to another member that provides the resource, or to stop the application if there is no member that provides the resource. In the latter case, CAA continues to monitor the required resources and restarts the application when the resource is again available in the cluster.

[OPTIONAL_RESOURCES=resource_list]
    Ordered list of optional resources, separated by white space. CAA uses the optional resources list, in conjunction with the required resources list, placement policy, and hosting members list, to determine the optimal member to host the application resource when more than one member is eligible to host the resource.

    Optional resources must be in the state ONLINE on a cluster member to affect resource placement. The cluster member with the most optional resources is used to run the application. If the hosting members list is not empty, the cluster member in the list with the most optional resources is used. If the number of optional resources on cluster members is equal, the member running the resource with the earliest placement in the list is used to run the application. The number of optional resources per resource is limited to 58.

    A failure of an optional resource on the hosting member does not initiate application relocation.

[PLACEMENT=placement_policy]
    Policy according to which CAA selects the member on which to start or restart the application resource. CAA uses the placement policy in conjunction with the resource's required resources list. You can specify any one of the following as a placement policy:

    balanced
        CAA favors starting or restarting the application resource on a member chosen according to the optional resources listed (see OPTIONAL_RESOURCES). If no optional resources are listed, CAA chooses the member currently running the fewest application resources, so that application resources with a balanced policy are distributed equally among all active members.

    favored
        CAA refers to the hosting members list before starting or restarting the application resource. First, a member on the hosting members list is chosen based on optional resources (see OPTIONAL_RESOURCES). If a member cannot be chosen based on optional resources, the first member on the list is most favored to run the application resource. If that member is unavailable, the second member on the list is the most favored, and so on. If all members on the hosting members list are unavailable, CAA favors placing the application resource on the member currently running the fewest application resources. You must specify a hosting members list when you select a favored placement policy.

    restricted
        Similar to the favored placement policy, except that if all members on the hosting members list are unavailable, CAA will not start or restart the application resource. A restricted placement policy ensures that the resource will never run on a member that is not on the list, even if you attempt to explicitly relocate it to that member. You must specify a hosting members list when you select a restricted placement policy.

    If you do not specify a placement policy, CAA uses a balanced placement policy for the application resource by default.

[HOSTING_MEMBERS=member_list]
    Hosting members list. Specify an ordered list of members, separated by white space, that can host the application resource. If you specify a placement policy of favored or restricted, you must also specify a hosting members list. CAA uses the hosting members list in conjunction with the application resource's placement policy. After optional resources are considered, applications are placed on hosts in the order in which they are listed in the hosting members list.

[ACTIVE_PLACEMENT={0|1}]
    When set to 1, CAA reevaluates the placement of an application resource when a cluster member joins the cluster.

[RESTART_ATTEMPTS=restart_attempts]
    Number of times CAA attempts to restart the resource on the current member before attempting to relocate it elsewhere. The default number of restart attempts is 1.

[FAILOVER_DELAY=failover_delay]
    Number of seconds CAA waits before attempting to relocate the application resource due to a host failure. If the original cluster member becomes available to run the application resource within the FAILOVER_DELAY time, the application will restart on that member. The default failover delay is 0 (zero) seconds.

[AUTO_START={0|1}]
    When set to 1, start the application resource automatically after a cluster reboot, regardless of whether it had been stopped or running before the reboot. When set to 0, start the application resource automatically only if it had been running before the reboot. The default is 0.

[ACTION_SCRIPT=action_script]
    User-written action script for the application resource. The format of CAA action scripts is similar to that of system init files located in the /sbin/init.d directory. The script file performs user-defined tasks and can invoke other scripts and executable programs. An action script has the following entry points:

    start
        Called by CAA to start or restart the application resource. The start entry point executes all commands necessary to start the application and must return 0 (zero) for success and a nonzero value for failure.

    stop
        Called by CAA to stop a running application resource. It is not called when stopping an UNKNOWN application resource (see caa_stop(8) for details). The stop entry point executes all commands necessary to stop the application and must return 0 (zero) for success and a nonzero value for failure. The stop entry point should treat an attempt to stop an application that is not running as a success and return 0 (zero).

    check
        Called by CAA periodically (according to the resource's check interval) to determine the health of the application resource. The check entry point executes all commands necessary to determine whether the application is still running and must return 0 (zero) for success and a nonzero value for failure. If the check entry point returns failure, CAA initiates relocation for the application resource.

    You can specify either a full pathname for the script file, or its filename (in which case CAA looks for the file in the /var/cluster/caa/script directory). If you do not specify an action script, CAA looks for an action script named /var/cluster/caa/script/resource_name.scr.

[SCRIPT_TIMEOUT=script_timeout]
    The maximum time (in seconds) for an action script to execute. An error message is returned if the script does not finish executing within the specified time. The timeout applies to all action script entry points (start, stop, and check). If this value is not specified, CAA assumes a default value of 60 seconds.

SUBNET=subnet_addr
    Subnet address of a network resource. Specify the subnet address in xxx.xxx.xxx.xxx format (for example, 16.140.112.0). The subnet address is the bitwise AND of an IP address and its netmask. For example, an IP address of 16.69.225.12 with a netmask of 255.255.255.0 gives a subnet of 16.69.225.0.

DEVICE_NAME=device_name
    Device name of a tape or media changer device. Specify either the full path of the device (for example, /dev/tape/tape1) or just the device name.
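
The bitwise AND described under SUBNET can be worked out in the shell. The following is a minimal sketch in POSIX sh; the function name subnet_of is hypothetical and is not part of CAA.

```shell
# Hypothetical helper (not a CAA command): print the subnet address for a
# dotted-quad IPv4 address and netmask by ANDing each pair of octets.
subnet_of() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both values on "." so that $1-$4 hold the address octets
    # and $5-$8 hold the netmask octets.
    set -- $ip $mask
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

subnet_of 16.69.225.12 255.255.255.0   # prints 16.69.225.0
```

The printed value is the form expected for the SUBNET keyword of a network resource profile.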

DESCRIPTION

CAA tracks the state of the members in a cluster and of resources in a cluster (such as networks and applications). CAA monitors the requirements of application resources in a cluster and ensures that applications run on members that meet their requirements. If the cluster member on which an application is running fails, or if a particular resource that another resource requires fails, CAA relocates the application to another member that has the required resources available.

CAA can start a group of application resources with one call of the caa_start command. CAA will start all required application resources in the order they are listed in the resource profile if they are available to be started.

CAA allows you to enhance overall application performance by balancing application execution among a set of available cluster members.

CAA manages application, network, tape, and changer resources.

You must have root privileges to use most CAA commands. Only the caa_stat command does not require root privileges.

CAA consists of components that work together to make application resources highly available and monitor other resources:

· A resource manager comprised of the run-time CAA daemons (caad) on all cluster members. The resource manager starts, stops, relocates, and restarts application resources when failure conditions occur.

· Resource monitors that are used to check on the state of a particular type of resource. Resource monitors are located in the directory /var/cluster/caa/monitors.

· A user interface that allows you to manage application and network resources in a cluster. The commands available with the command-line interface are listed in the SEE ALSO section of this reference page. The SysMan menu provides a graphical user interface (GUI) for performing system management tasks for the cluster, cluster members, and CAA applications. For more information on using SysMan, see sysman(8) and the online help available for the sysman application.

· Resources that are managed and monitored by CAA. A resource is defined by its resource profile. A resource profile defines to CAA how an application or network resource should run in a cluster. The caa_profile command creates new resource profiles, either with default values or fully customized according to command-line values. It can also validate, update, or delete profiles.

· Action scripts associated with resources that are used by CAA to start and stop the application resources.

A resource profile is an ASCII text file that assigns values to attributes that define how a resource should be managed or monitored in a cluster. The attributes described in the SYNOPSIS and OPERANDS sections of this reference page make up a profile.

Create a resource profile by using the caa_profile(8) command, the Cluster Application Availability (CAA) Management branch under TruCluster Specific on the SysMan Menu, or a text editor. The type of resource (application, network, tape, or changer) determines which keywords and operands you can specify in its profile. Profiles are written to the /var/cluster/caa/profile directory by caa_profile. CAA expects all resource profiles to be in the /var/cluster/caa/profile directory.

When you create a resource profile with a text editor, you can omit optional operands and keywords. There are default values for most keywords.

Resources must be registered with CAA, using the caa_register command, after a profile is created. CAA can only begin to monitor and manage a resource after it has been registered. After a resource has been registered, the information in the profile is contained in the Registry Database located at /var/cluster/caa/registry/caa.reg. This file should never be edited with a text editor.

You can also update a resource profile with a text editor. Any time you edit a profile by hand, you should validate the profile with the caa_profile -validate command to check that the profile is syntactically correct. Using the caa_register -u command, you can then update the resource while the resource remains online. If you change the profile and do not update the registration, the Registry Database will not contain the new profile information.

Only certain keyword settings can be updated. You cannot update the NAME or TYPE of any resource. You can update the following:

ACTION_SCRIPT
    Changes to the action script location and contents take effect the next time CAA uses the script.

DESCRIPTION
    Changes to the description take effect immediately.

HOSTING_MEMBERS
    Changes to the hosting members list take effect the next time the placement policy is executed.

REQUIRED_RESOURCES
    Changes to the required resource list take effect the next time the placement policy is executed.

OPTIONAL_RESOURCES
    Changes to the optional resource list take effect the next time the placement policy is executed.

PLACEMENT
    Changes to the placement policy take effect the next time the placement policy is executed.

AUTO_START
    Changes to auto-start take effect after the next cluster reboot.

CHECK_INTERVAL
    Changes to the check interval take effect immediately and reset the check interval timer.

FAILURE_THRESHOLD
    Changes to the failure threshold take effect at the next failure.

FAILURE_INTERVAL
    Changes to the failure interval take effect at the next failure.

RESTART_ATTEMPTS
    Changes to the restart attempts take effect at the next failure.

FAILOVER_DELAY
    Changes to the failover delay take effect at the next failure.

CAA does extensive logging of its actions to both the command line and the EVM event management system. To monitor CAA-related EVM events, see the EXAMPLES section. See the EVM(5) reference page for details on how to use the EVM event management system.
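
The sequence described earlier in this section for a hand-edited profile (validate it, then update the registration) could be wrapped in a small shell function. This is a sketch only: the function name update_resource is hypothetical, and it assumes that both caa_profile -validate and caa_register -u take the resource name as their argument; check caa_profile(8) and caa_register(8) for the exact syntax.

```shell
# Hypothetical wrapper (not part of CAA): validate an edited profile and,
# only if validation succeeds, update the registered resource.
# Assumes both commands accept the resource name as their argument.
update_resource() {
    resource=$1
    # Check that the hand-edited profile is syntactically correct.
    caa_profile -validate "$resource" || {
        echo "profile for $resource failed validation; not updating" >&2
        return 1
    }
    # Refresh the Registry Database while the resource remains online.
    caa_register -u "$resource"
}
```

For example, after editing the profile of a resource named clock, you might run update_resource clock as root. Stopping at a failed validation keeps an invalid profile from being passed to caa_register.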

EXAMPLES

The following is an example of an application resource profile:

    TYPE = application
    NAME = clock
    CHECK_INTERVAL = 60
    FAILURE_THRESHOLD = 0
    FAILURE_INTERVAL = 0
    REQUIRED_RESOURCES =
    OPTIONAL_RESOURCES =
    HOSTING_MEMBERS =
    PLACEMENT = balanced
    RESTART_ATTEMPTS = 1
    FAILOVER_DELAY = 0
    AUTO_START = 0
    ACTION_SCRIPT = clock.scr
    SCRIPT_TIMEOUT = 60
    ACTIVE_PLACEMENT = 0

The following is an example of a network resource profile:

    TYPE = network
    NAME = net1
    CHECK_INTERVAL = 60
    FAILURE_THRESHOLD = 0
    FAILURE_INTERVAL = 0
    SUBNET = 16.140.112.0

For examples of action scripts, see the directory /var/cluster/caa/script or /var/cluster/caa/examples.

To monitor CAA events on the console, use the following command:

    # evmwatch | evmshow -f "[name *.caa.*]"

To view events related to CAA that have been sent to the EVM Event Management System:

    # evmget | evmshow -f "[name *.caa.*]"
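
The start, stop, and check entry points described under ACTION_SCRIPT can be sketched as a small shell script. This is an illustrative skeleton, not one of the scripts shipped in /var/cluster/caa/examples: APP_CMD, PIDFILE, and the function names are hypothetical, and tracking the application through a PID file is an assumption made for the example rather than anything CAA requires.

```shell
#!/bin/sh
# Illustrative action script skeleton. APP_CMD and PIDFILE are placeholders
# chosen for this sketch; CAA does not define them.
APP_CMD=${APP_CMD:-"sleep 300"}
PIDFILE=${PIDFILE:-/tmp/caa_example_app.pid}

is_running() {
    # True if the PID file exists and names a live process.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

app_start() {
    # Already running counts as success; otherwise launch in the background.
    is_running && return 0
    # APP_CMD is left unquoted on purpose so that its arguments are split.
    $APP_CMD >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    is_running
}

app_stop() {
    # Stopping an application that is not running is treated as success.
    if is_running; then
        kill "$(cat "$PIDFILE")" 2>/dev/null
    fi
    rm -f "$PIDFILE"
    return 0
}

app_check() {
    # Return 0 if the application is still running, nonzero otherwise.
    is_running
}

main() {
    case "$1" in
        start) app_start ;;
        stop)  app_stop ;;
        check) app_check ;;
        *)     echo "usage: $0 {start|stop|check}" >&2; return 1 ;;
    esac
}

# Dispatch only when an entry point name is passed on the command line.
if [ $# -gt 0 ]; then
    main "$@"
    exit $?
fi
```

Each entry point returns 0 for success and a nonzero value for failure, and stop returns 0 even when the application is not running, as the ACTION_SCRIPT description requires. A real script should also finish each entry point within the resource's SCRIPT_TIMEOUT.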

SEE ALSO

Commands: caa_profile(8), caa_register(8), caa_relocate(8), caa_start(8), caa_stat(1), caa_stop(8), caa_unregister(8)

Daemon: caad(8)

Files: /var/cluster/caa/script, /var/cluster/caa/profile, /var/cluster/caa/registry, /var/cluster/caa/monitors, /var/cluster/caa/examples

TruCluster Server Cluster Administration
