NAME
nhpmd - process monitor daemon
SYNOPSIS
/opt/SUNWcgha/lib/sparcv9/nhpmd
DESCRIPTION
The nhpmd daemon provides the Daemon Monitor service.
The nhpmd daemon runs at the multiuser level on all nodes
in the cluster. The nhpmd daemon monitors other Foundation Services
daemons, many Solaris operating system daemons, and some companion
product daemons. If a daemon that provides a critical service fails,
the nhpmd daemon detects the failure and triggers a recovery response.
The recovery response is specific to the daemon that failed.
The nhpmd daemon operates at a higher priority than
the other Foundation Services daemons.
Foundation Services daemons and Solaris operating system daemons are
launched by startup scripts. A nametag is assigned to the daemon or group
of daemons that is launched by each startup script. In some cases, such as
for syslogd, a nametag covers only one daemon. In other cases, such as for
nfs.client, a nametag covers a group of daemons. If one of the daemons
covered by a nametag fails, the nhpmd daemon performs the recovery response
on all of the daemons covered by that nametag. If the recovery response is
to restart the failed daemon, all of the daemons grouped under that nametag
are killed and restarted. For a list of monitored daemons and their
associated recovery responses, see MONITORED DAEMONS.
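The group behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not part of nhpmd: the nametag-to-daemon mapping is copied from the MONITORED DAEMONS lists in this man page, and the names NAMETAG_GROUPS and daemons_affected_by are hypothetical.

```python
# Nametag groups copied from the MONITORED DAEMONS lists in this man page.
# The dictionary and the helper below are illustrative; nhpmd does not
# expose such an interface.
NAMETAG_GROUPS = {
    "syslog": ["syslogd"],
    "dhcpd": ["dsvclokd", "in.dhcpd"],
    "nfs.server": ["mountd", "nfsd", "nfslogd"],
}


def daemons_affected_by(failed_daemon, groups=NAMETAG_GROUPS):
    """Return (nametag, daemons) for the group that contains failed_daemon.

    Every daemon in the returned list is subject to the recovery response,
    not only the daemon that failed.
    """
    for nametag, daemons in groups.items():
        if failed_daemon in daemons:
            return nametag, list(daemons)
    raise KeyError(f"{failed_daemon} is not covered by any known nametag")


nametag, affected = daemons_affected_by("nfsd")
print(nametag, affected)
```

In this model, a failure of nfsd alone returns the whole nfs.server group, which mirrors the statement that all daemons under a nametag are killed and restarted together.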
Information about monitored daemons can be collected using the nhpmdadm command, as described in the nhpmdadm(1M) man page.
This man page lists the Foundation Services, Solaris operating system, and companion product
daemons that are monitored by the nhpmd daemon and describes
the recovery action taken by the nhpmd daemon on the node
on which the monitored daemon failed.
Note the following before using the nhpmd daemon:
- The initialization process of the Foundation Services alters the
  /etc/inittab file by replacing the rc2 and rc3 strings with the rc2.HA
  and rc3.HA strings. Do not modify or overwrite rc2.HA or rc3.HA.
- The nhpmd daemon is started automatically when the system starts up at
  init level 2 (multiuser mode).
- The nhpmd daemon is a 64-bit application. It cannot run on a 32-bit
  kernel.
- Files in the /var/run/SUNWcgha/pmd directory, and the directory itself,
  must not be removed while the nhpmd daemon is running.
- The only signal to which the nhpmd daemon responds is SIGTERM. Provided
  that the nhpmd daemon was started by superuser, when the SIGTERM signal
  is sent, the nhpmd daemon stops all monitoring and exits. Previously
  monitored processes can then be traced or debugged.
- The script provided as an action program to any nhpmdadm command must
  not be removed; it must exist when the nhpmd daemon attempts to execute
  it. If the system is out of main resources (memory or processes), the
  nhpmd daemon might not be able to launch or relaunch any executables.
- To avoid collisions with other controlling processes, truss(1) does not
  trace a process that it detects as being controlled by another process
  by way of the /proc interface. The nhpmd daemon uses the /proc interface
  to monitor processes and their descendants. Therefore, processes that
  are submitted to the nhpmd daemon using the nhpmdadm tool cannot be
  traced or debugged.
- When you list the processes that are running on a node that runs the
  Foundation Services, you see the Foundation Services daemons. Some of
  the daemons delivered with the Foundation Services are part of the
  Foundation Services internal subsystem and cannot be publicly accessed.
  Some daemons run only on the master and vice-master nodes, and some run
  on all peer nodes.
- When you list the running processes, the name of the Node Management
  Agent daemon does not appear as nma. To see the process name for the
  Node Management Agent daemon, use the ps command. The process ID (PID)
  of this daemon is in /var/run/SUNWcgha/nma.pid.
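As a rough illustration of the last note, the following Python sketch reads the Node Management Agent PID so that it can be passed to ps. The file path comes from this man page; the assumption that the file holds a single decimal PID as text is not confirmed here, and read_pid is a hypothetical helper name.

```python
from pathlib import Path

# Path taken from this man page.
NMA_PID_FILE = Path("/var/run/SUNWcgha/nma.pid")


def read_pid(pid_file=NMA_PID_FILE):
    """Return the PID stored in pid_file, or None if it cannot be read.

    Assumes the file contains one decimal PID, possibly followed by
    whitespace. That layout is an assumption, not documented behavior.
    """
    try:
        text = Path(pid_file).read_text()
    except OSError:
        return None
    try:
        pid = int(text.strip())
    except ValueError:
        return None
    return pid if pid > 0 else None


if __name__ == "__main__":
    pid = read_pid()
    if pid is None:
        print("nma PID file is missing or unreadable")
    else:
        print(f"nma PID is {pid}; inspect it with: ps -p {pid}")
```

Returning None instead of raising keeps the helper usable on nodes where the Node Management Agent is not running.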
MONITORED DAEMONS
The following lists give the nametag and associated recovery response
of the Foundation Services, Solaris operating system, and companion product
daemons that are monitored by the nhpmd daemon. The recovery responses
listed are the default values. To specify the number of times that the
nhpmd daemon tries to restart a daemon, create the nhpmd.conf file.
For a description of these daemons, see their man pages. For information
about the nhpmd.conf file, see the nhpmd.conf(4) man page.
Monitored Daemons in the Foundation Services
The following list gives the nametag and recovery response of the monitored
daemons in the Foundation Services.
- Daemon - nhcrfsd
  Nametag - nhcrfsd
  Recovery response - relaunches the daemon up to three times. In some
  cases, the nhcrfsd daemon detects a fatal error and reboots the node.
- Daemon - nhcmmd
  Nametag - nhcmmd
  Recovery response - does not restart the daemon; reboots the node on
  which it failed.
- Daemon - nhprobed
  Nametag - cgha_probe
  Recovery response - does not restart the daemon; reboots the node on
  which it failed.
- Daemon - nma
  Nametag - nma
  Recovery response - relaunches the daemon up to 10 times, then exits.
- Daemon - nhwdtd
  Nametag - cgha_nhwdt
  Recovery response - relaunches the daemon up to three times, then
  reboots the node on which it failed.
Monitored Daemons in the Companion Products
The following list gives the nametag and recovery response of the monitored
daemons in the companion products.
- Daemon - nskernd
  Nametag - sndr.nskernd
  Recovery response - does not restart the daemon; reboots the node on
  which it failed.
- Daemon - sndrd
  Nametag - sndr.sndrd
  Recovery response - does not restart the daemon; reboots the node on
  which it failed.
Monitored Daemons in the Solaris Operating System
The following list gives the nametag and recovery response of the monitored
daemons in the Solaris operating system.
- Daemon - cron
  Nametag - cron
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemons - dsvclokd, in.dhcpd
  Nametag - dhcpd
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemons - fnsypd, keyserv, nis_cachemgr, rpcbind, rpc.nisd,
  rpc.nispasswdd, rpc.yp, ypbind, ypserv, ypxfrd
  Nametag - rpc
  Recovery response - sends an error message.
- Daemons - inetd, in.named
  Nametag - inetsvc
  Recovery response - reboots the node on which it failed.
- Daemon - in.routed
  Nametag - inetinit.routed
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - in.rdisc
  Nametag - inetinit.rdisc
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - in.rdisc
  Nametag - nfs.client when no nhcrfsd daemon is running on the local node
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - lockd
  Nametag - nfs.client.lockd when an nhcrfsd daemon is running on the
  local node
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemons - mountd, nfsd, nfslogd
  Nametag - nfs.server
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - nscd
  Nametag - nscd
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - slpd
  Nametag - slpd
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - statd
  Nametag - nfs.client.statd.crfs on the master node
  Nametag - nfs.client.statd on the vice-master node
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - syslogd
  Nametag - syslog
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - utmpd
  Nametag - utmpd
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
- Daemon - xntpd
  Nametag - xntpd
  Recovery response - relaunches the daemon up to two times and logs an
  error message if the second relaunch fails.
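The default recovery responses follow a common pattern: relaunch up to a fixed number of times, then take a final action. The Python sketch below models that pattern for a few of the nametags listed in this man page. It is a simplified counting model under stated assumptions: RecoveryPolicy, DEFAULT_POLICIES, and next_action are hypothetical names, and the real daemon may apply rules (for example, time windows) that this man page does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPolicy:
    max_relaunches: int  # relaunch attempts allowed before the final action
    final_action: str    # response once the relaunch attempts are used up


# Default values as listed in this man page, keyed by nametag.
DEFAULT_POLICIES = {
    "nhcmmd": RecoveryPolicy(0, "reboot node"),
    "cgha_probe": RecoveryPolicy(0, "reboot node"),
    "cgha_nhwdt": RecoveryPolicy(3, "reboot node"),
    "cron": RecoveryPolicy(2, "log error"),
    "syslog": RecoveryPolicy(2, "log error"),
}


def next_action(policy, failures):
    """Return the response to the failures-th failure (counting from 1)."""
    if failures < 1:
        raise ValueError("failures must be at least 1")
    if failures <= policy.max_relaunches:
        return "relaunch"
    return policy.final_action


for count in (1, 2, 3):
    print("cron failure", count, "->", next_action(DEFAULT_POLICIES["cron"], count))
```

In this model, a nametag with zero relaunches (such as nhcmmd) goes straight to its final action on the first failure, and cron reaches "log error" only when the second relaunch has also failed.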
DIAGNOSTICS
Diagnostic messages are logged to the console or in a file, depending
on the system's syslog local0 facility settings.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
ATTRIBUTE TYPE      | ATTRIBUTE VALUE
Interface Stability | Evolving
Availability        | SUNWnhpma, SUNWnhpmb, SUNWnhpms
SEE ALSO
nhpmdadm(1M), nhpmd.conf(4)
Netra HAS FS 2.1 | Last Changed September 2004
Copyright 2004 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.