Maintenance Commands                                        nhpmd(1M)


NAME

 nhpmd - process monitor daemon

SYNOPSIS

 /opt/SUNWcgha/lib/sparcv9/nhpmd

DESCRIPTION

The nhpmd daemon provides the Daemon Monitor service. The nhpmd daemon runs at the multiuser level on all nodes in the cluster. The nhpmd daemon monitors other Foundation Services daemons, many Solaris operating system daemons, and some companion product daemons. If a daemon that provides a critical service fails, the nhpmd daemon detects the failure and triggers a recovery response. The recovery response is specific to the daemon that failed.

The nhpmd daemon operates at a higher priority than the other Foundation Services daemons.

Foundation Services daemons and Solaris operating system daemons are launched by startup scripts. A nametag is assigned to the daemon or group of daemons that is launched by each startup script. In some cases, such as for syslogd, a nametag is assigned to only one daemon. In other cases, such as for nfs.client, a nametag is assigned to a group of daemons. If one of the daemons covered by a nametag fails, the nhpmd daemon performs the recovery response on all of the daemons covered by that nametag. If the recovery response is to restart the failed daemon, all of the daemons grouped under that nametag are killed and restarted. For a list of monitored daemons and their associated recovery responses, see MONITORED DAEMONS.
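The grouping rule above can be illustrated with a minimal Python sketch. It is not part of nhpmd. The mapping reuses three nametags from MONITORED DAEMONS, and the function name daemons_to_restart is hypothetical.

```python
# Nametag -> daemons, copied from the MONITORED DAEMONS lists in this page.
NAMETAGS = {
    "syslog": ["syslogd"],
    "dhcpd": ["dsvclokd", "in.dhcpd"],
    "nfs.server": ["mountd", "nfsd", "nfslogd"],
}


def daemons_to_restart(failed_daemon, nametags=NAMETAGS):
    """Return the nametag of the failed daemon and every daemon it covers.

    Hypothetical helper: it only models the rule that a restart applies
    to all daemons grouped under the same nametag.
    """
    for nametag, daemons in nametags.items():
        if failed_daemon in daemons:
            return nametag, list(daemons)
    raise KeyError(f"{failed_daemon} is not covered by any nametag")


if __name__ == "__main__":
    # A failure of nfsd affects every daemon under the nfs.server nametag.
    print(daemons_to_restart("nfsd"))
```

In this model, a failure of nfsd selects mountd, nfsd, and nfslogd together, whereas a failure of syslogd selects only syslogd.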

Information about monitored daemons can be collected using the nhpmdadm command, as described in the nhpmdadm(1M) man page.

This man page lists the Foundation Services, Solaris operating system, and companion product daemons that are monitored by the nhpmd daemon and describes the recovery action taken by the nhpmd daemon on the node on which the monitored daemon failed.

EXTENDED DESCRIPTION

Note the following before using the nhpmd daemon:

  • The initialization process of the Foundation Services alters the /etc/inittab file by replacing the rc2 and rc3 strings with rc2.HA and rc3.HA strings. Do not modify or overwrite rc2.HA or rc3.HA.

  • The nhpmd daemon is started automatically when the system starts up at init level 2 (multi-user mode).

  • The nhpmd daemon is a 64-bit application. It cannot run on a 32-bit kernel.

  • Files in the /var/run/SUNWcgha/pmd directory, and the directory itself, must not be removed while the nhpmd daemon is running.

  • The only signal to which the nhpmd daemon responds is SIGTERM. Provided that the nhpmd daemon was started by the superuser, when the SIGTERM signal is sent, the nhpmd daemon stops all monitoring and exits. Previously monitored processes can then be traced or debugged.

  • The script provided as an action program to any nhpmdadm command must not be removed; it must exist when the nhpmd daemon attempts to execute it. If the system runs out of essential resources (memory or processes), the nhpmd daemon might not be able to launch or relaunch any executables.

  • To avoid collisions with other controlling processes, truss(1) does not trace a process that it detects as being controlled by another process by way of the /proc interface. The nhpmd daemon uses the /proc interface to monitor processes and their descendants. Therefore, processes that are submitted to the nhpmd daemon using the nhpmdadm tool cannot be traced or debugged.

  • When you list the processes that are running on the Foundation Services, you see the Foundation Services daemons. Some of the daemons delivered with the Foundation Services are part of the Foundation Services internal subsystem and cannot be publicly accessed. Some daemons run only on the master and vice-master nodes, and some run on all peer nodes.

  • When you list the running processes, the name of the Node Management Agent daemon does not appear as nma. To see the process name for the Node Management Agent daemon, use the ps command. The Process ID (PID) of this daemon is in /var/run/SUNWcgha/nma.pid.
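The last note above can be sketched in Python as follows. This is an illustration only: it assumes that the PID file contains a single decimal PID, and the helper names read_pid, ps_command, and show_process are hypothetical. The file path is the one given in this man page.

```python
import subprocess
from pathlib import Path

# Path of the Node Management Agent PID file, as given in this man page.
NMA_PID_FILE = "/var/run/SUNWcgha/nma.pid"


def read_pid(pid_file=NMA_PID_FILE):
    """Return the PID stored in pid_file as an int.

    Assumption: the file holds a single decimal PID, possibly followed
    by whitespace.
    """
    text = Path(pid_file).read_text().strip()
    return int(text.split()[0])


def ps_command(pid):
    """Build the ps(1) invocation that lists the process with this PID."""
    return ["ps", "-f", "-p", str(pid)]


def show_process(pid_file=NMA_PID_FILE):
    """Run ps for the PID in pid_file and return its output as text."""
    result = subprocess.run(
        ps_command(read_pid(pid_file)),
        capture_output=True,
        text=True,
        check=False,  # ps exits non-zero if the process no longer exists
    )
    return result.stdout
```

Calling show_process() on a node where the Node Management Agent is running returns the ps listing for that PID, which shows the name under which the daemon actually appears.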

MONITORED DAEMONS

The following lists give the nametag and associated recovery response of the Foundation Services, Solaris operating system, and companion product daemons that are monitored by the nhpmd daemon. The recovery responses listed are the default values. You can specify the number of times the nhpmd daemon tries to restart a daemon if you create the nhpmd.conf file. For a description of these daemons, see their man pages. For information about the nhpmd.conf file, see the nhpmd.conf(4) man page.

Monitored Daemons in the Foundation Services

The following list gives the nametag and recovery response of the monitored daemons in the Foundation Services.

Daemon - nhcrfsd

Nametag - nhcrfsd

Recovery response - relaunches the daemon up to three times. In some cases, the nhcrfsd daemon detects a fatal error and reboots the node.

Daemon - nhcmmd

Nametag - nhcmmd

Recovery response - does not restart the daemon; reboots the node on which it failed.

Daemon - nhprobed

Nametag - cgha_probe

Recovery response - does not restart the daemon; reboots the node on which it failed

Daemon - nma

Nametag - nma

Recovery response - relaunches the daemon up to 10 times then exits

Daemon - nhwdtd

Nametag - cgha_nhwdt

Recovery response - relaunches the daemon up to three times then reboots the node on which it failed

Monitored Daemons in the Companion Products

The following list gives the nametag and recovery response of the monitored daemons in the companion products.

Daemon - nskernd

Nametag - sndr.nskernd

Recovery response - does not restart the daemon; reboots the node on which it failed

Daemon - sndrd

Nametag - sndr.sndrd

Recovery response - does not restart the daemon; reboots the node on which it failed

Monitored Daemons in the Solaris Operating System

The following list gives the nametag and recovery response of the monitored daemons in the Solaris operating system.

Daemon - cron

Nametag - cron

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemons - dsvclokd, in.dhcpd

Nametag - dhcpd

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemons - fnsypd, keyserv, nis_cachemgr, rpcbind, rpc.nisd, rpc.nispasswdd, rpc.yp, ypbind, ypserv, ypxfrd

Nametag - rpc

Recovery response - sends an error message

Daemons - inetd, in.named

Nametag - inetsvc

Recovery response - reboots the node on which it failed

Daemon - in.routed

Nametag - inetinit.routed

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - in.rdisc

Nametag - inetinit.rdisc

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - in.rdisc

Nametag - nfs.client when no nhcrfsd daemon is running on the local node

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - lockd

Nametag - nfs.client.lockd when an nhcrfsd daemon is running on the local node

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemons - mountd, nfsd, nfslogd

Nametag - nfs.server

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - nscd

Nametag - nscd

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - slpd

Nametag - slpd

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - statd

Nametag - nfs.client.statd.crfs on the master node

Nametag - nfs.client.statd on the vice-master node

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - syslogd

Nametag - syslog

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - utmpd

Nametag - utmpd

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails

Daemon - xntpd

Nametag - xntpd

Recovery response - relaunches the daemon up to two times and logs an error message if the second relaunch fails
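The default recovery responses listed above follow one pattern: relaunch up to a fixed number of times, then fall back to a final action. The following Python sketch models that pattern under stated assumptions. RecoveryPolicy, DEFAULTS, and recover are hypothetical names, the three entries are taken from the lists above, and the sketch does not reproduce how the nhpmd daemon counts or times its relaunch attempts.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RecoveryPolicy:
    max_relaunches: int  # 0 means the daemon is never relaunched
    fallback: str        # action taken once relaunching is exhausted


# Default responses for three nametags, taken from the lists above.
DEFAULTS = {
    "nhcmmd": RecoveryPolicy(0, "reboot"),
    "cgha_nhwdt": RecoveryPolicy(3, "reboot"),
    "cron": RecoveryPolicy(2, "log_error"),
}


def recover(policy: RecoveryPolicy, relaunch: Callable[[], bool]) -> str:
    """Try to relaunch up to policy.max_relaunches times.

    relaunch is a caller-supplied function that returns True when the
    daemon came back. If every attempt fails, or no attempt is allowed,
    the fallback action name is returned.
    """
    for attempt in range(1, policy.max_relaunches + 1):
        if relaunch():
            return f"relaunched on attempt {attempt}"
    return policy.fallback
```

With these values, recover(DEFAULTS["nhcmmd"], ...) never calls the relaunch function and returns the reboot fallback immediately, whereas the cron policy allows two relaunch attempts before it falls back to logging an error.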

DIAGNOSTICS

Diagnostic messages are logged to the console or in a file, depending on the system's syslog local0 facility settings.
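To see where local0 messages are routed on a given node, the syslog configuration can be inspected. The following Python sketch is a simplified reader for syslog.conf-style text. It is an illustration, not a full parser: it ignores m4 macros and severity thresholds, it treats a trailing "none" level as an exclusion, and the function name local0_destinations is hypothetical.

```python
def local0_destinations(conf_text):
    """Return the actions of syslog.conf lines whose selector covers local0.

    Simplifications (assumptions of this sketch): m4 constructs are not
    expanded, severity levels are not compared, and within one selector
    the last clause that names local0 or * decides the outcome.
    """
    destinations = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # selector and action, whitespace-separated
        if len(parts) != 2:
            continue
        selector, action = parts
        matched = False
        for clause in selector.split(";"):
            facilities, _, level = clause.partition(".")
            names = facilities.split(",")
            if "local0" in names or "*" in names:
                matched = level != "none"
        if matched:
            destinations.append(action.strip())
    return destinations
```

For example, given a configuration in which one line reads "local0.info" followed by a tab and a file name, the function returns that file name along with the action of any "*" selector that is not cancelled by "local0.none".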

ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE         ATTRIBUTE VALUE
Interface Stability    Evolving
Availability           SUNWnhpma, SUNWnhpmb, SUNWnhpms

SEE ALSO

nhpmdadm(1M), nhpmd.conf(4)


Netra HAS FS 2.1                          Last Changed September 2004