Sun Microsystems

Monitoring Execution Hosts With qhost

Use the qhost command to retrieve a quick overview of the execution host status:

% qhost

This command produces output that is similar to the following example:

Example 1-1 Sample qhost Output

HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
arwen                   aix43           1     -       -       -       -       -
baumbart                irix65          2  0.00    1.1G   91.5M  128.0M     0.0
boromir                 hp11            1     -  128.0M       -  256.0M       -
carc                    lx24-amd64      2  0.00    3.8G  989.8M    1.0G     0.0
denethor                aix51           1 4.54G       -       -       -       -
durin                   lx24-x86        1  0.37  123.1M   46.5M  213.6M   26.6M
eomer                   sol-sparc64     1  0.13  256.0M  248.0M  513.0M   93.0M
lolek                   tru64           1  0.02    1.0G  790.0M    1.0G    8.0K
mungo                   lx22-alpha      1  1.00  248.9M   78.8M  129.8M    2.5M
nori                    sol-x86         2  0.38 1023.0M  372.0M  512.0M   37.0M
pippin                  darwin          1  0.00  640.0M  264.0M     0.0     0.0
smeagol                 hp11            1  0.35  512.0M  425.0M    1.0G   95.0M

See the qhost(1) man page for a description of the output format and for more options.
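If you want to process the overview in a script, the column layout shown in Example 1-1 can be split on whitespace. The following is a minimal sketch, not part of the grid engine software. It assumes the default eight-column layout of the sample output, and the function names parse_qhost and hosts_without_load are hypothetical.

```python
from typing import Dict, List

# Column names as they appear in the header line of Example 1-1.
COLUMNS = ["HOSTNAME", "ARCH", "NCPU", "LOAD",
           "MEMTOT", "MEMUSE", "SWAPTO", "SWAPUS"]

# A few rows copied from Example 1-1.
SAMPLE = """\
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
arwen                   aix43           1     -       -       -       -       -
baumbart                irix65          2  0.00    1.1G   91.5M  128.0M     0.0
boromir                 hp11            1     -  128.0M       -  256.0M       -
"""


def parse_qhost(text: str) -> List[Dict[str, str]]:
    """Turn qhost output into one dictionary per host row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header line, and the dashed separator.
        if len(fields) != len(COLUMNS) or fields[0] == "HOSTNAME":
            continue
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def hosts_without_load(rows: List[Dict[str, str]]) -> List[str]:
    """Return the hosts (other than global) whose LOAD column shows "-"."""
    return [row["HOSTNAME"] for row in rows
            if row["LOAD"] == "-" and row["HOSTNAME"] != "global"]
```

For the sample rows, hosts_without_load(parse_qhost(SAMPLE)) returns arwen and boromir. All values are kept as strings because the columns mix numbers, unit suffixes, and the "-" placeholder.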

Invalid Host Names

The following host names are invalid, reserved, or otherwise not allowed:

  • global

  • template

  • all

  • default

  • unknown

  • none
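A provisioning script can check a candidate host name against this list before the host is added. The following is a minimal sketch. The function name is hypothetical, and comparing names without regard to case or surrounding whitespace is a cautious assumption, not documented behavior.

```python
# The reserved names listed above.
RESERVED_HOST_NAMES = frozenset(
    {"global", "template", "all", "default", "unknown", "none"}
)


def is_reserved_host_name(name: str) -> bool:
    """Return True if name matches one of the reserved host names.

    Assumption: the comparison ignores case and surrounding whitespace,
    so that "Global" is also rejected.
    """
    return name.strip().lower() in RESERVED_HOST_NAMES
```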

Killing Daemons From the Command Line

To kill grid engine system daemons from the command line, use one of the following commands:

% qconf -ke[j] {hostname,... | all}
% qconf -ks
% qconf -km

You must have manager or operator privileges to use these commands. See Chapter 4, Managing User Access, for more information about manager and operator privileges.

  • The qconf -ke command shuts down the execution daemons but does not cancel active jobs. Jobs that finish while no sge_execd is running on a system are not reported to sge_qmaster until sge_execd is restarted. The job reports are not lost.

    The qconf -kej command kills all currently active jobs and brings down all execution daemons.

    Use a comma-separated list of the execution hosts you want to shut down, or specify all to shut down all execution hosts in the cluster.

  • The qconf -ks command shuts down the scheduler sge_schedd.

  • The qconf -km command forces the sge_qmaster process to terminate.
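If you drive these commands from a script, it can help to build the argument lists in one place. The following is a minimal sketch that uses only the options described above. The function names are hypothetical, and the sketch builds the commands without running them.

```python
from typing import List, Sequence


def kill_execd_args(hosts: Sequence[str], kill_jobs: bool = False) -> List[str]:
    """Build the qconf arguments for shutting down execution daemons.

    hosts is a sequence of execution host names, or ["all"] for every
    execution host in the cluster. With kill_jobs=True the -kej form is
    used, which also kills the currently active jobs.
    """
    if not hosts:
        raise ValueError("name at least one execution host, or 'all'")
    flag = "-kej" if kill_jobs else "-ke"
    # qconf expects the host names as one comma-separated argument.
    return ["qconf", flag, ",".join(hosts)]


def kill_scheduler_args() -> List[str]:
    """Build the qconf arguments for shutting down sge_schedd."""
    return ["qconf", "-ks"]


def kill_qmaster_args() -> List[str]:
    """Build the qconf arguments for terminating sge_qmaster."""
    return ["qconf", "-km"]
```

For example, kill_execd_args(["arwen", "durin"]) yields the arguments for qconf -ke arwen,durin. A list such as this can be passed to subprocess.run by a user who has manager or operator privileges.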

If you want to wait for any active jobs to finish before you run the shutdown procedure, use the qmod -dq command for each cluster queue, queue instance, or queue domain before you run the qconf sequence described earlier. For information about cluster queues, queue instances, and queue domains, see Configuring Queues.

% qmod -dq {cluster-queue | queue-instance | queue-domain}

The qmod -dq command prevents new jobs from being scheduled to the disabled queue instances. You should then wait until no jobs are running in the queue instances before you kill the daemons.
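The disable, wait, and shut down order described above can be sketched as follows. This is an illustration under stated assumptions, not a supported tool: the run and jobs_running callables are hypothetical hooks that you supply (for example, a wrapper around subprocess.run and your own check for running jobs), and the shutdown commands are passed in by the caller.

```python
import time
from typing import Callable, List, Sequence


def drain_then_shutdown(
    queues: Sequence[str],
    shutdown_commands: Sequence[List[str]],
    run: Callable[[List[str]], None],
    jobs_running: Callable[[], bool],
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Disable queues, wait for running jobs to finish, then shut down."""
    # Step 1: prevent new jobs from being scheduled to these queues.
    for queue in queues:
        run(["qmod", "-dq", queue])
    # Step 2: wait until the caller-supplied check reports no running jobs.
    while jobs_running():
        sleep(poll_seconds)
    # Step 3: run the qconf shutdown commands chosen by the caller.
    for command in shutdown_commands:
        run(command)
```

Passing the checks in as callables keeps the sketch independent of any particular way of detecting running jobs, and makes it possible to try the sequence with stand-in functions before it is used on a live cluster.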

Restarting Daemons From the Command Line

Log in as root on the machine on which you want to restart grid engine system daemons.

Type the following commands to run the startup scripts:

% sge-root/cell/common/sgemaster
% sge-root/cell/common/sgeexecd

These scripts look for the daemons that normally run on this host and then start the corresponding ones.
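The script locations follow the pattern sge-root/cell/common shown above. The following minimal sketch builds both paths from a root directory and a cell name. The function name and the example values /opt/sge and default are hypothetical; substitute the values for your installation.

```python
import posixpath
from typing import List


def startup_scripts(sge_root: str, cell: str) -> List[str]:
    """Return the paths of the sgemaster and sgeexecd startup scripts."""
    common = posixpath.join(sge_root, cell, "common")
    return [
        posixpath.join(common, "sgemaster"),
        posixpath.join(common, "sgeexecd"),
    ]


# Hypothetical installation root and cell name, for illustration only.
EXAMPLE_SCRIPTS = startup_scripts("/opt/sge", "default")
```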

Basic Cluster Configuration

The basic cluster configuration is a set of information that is configured to reflect site dependencies and to influence grid engine system behavior. Site dependencies include valid paths for programs such as mail or xterm. A global configuration is provided for the master host as well as for every host in the grid engine system pool. In addition, you can configure the system to use a configuration local to each host to override particular entries in the global configuration.

The cluster administrator should adapt the global configuration and local host configurations to the site's needs immediately after the installation. The configurations should be kept up to date afterwards.

The sge_conf(5) man page contains a detailed description of the configuration entries.
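The relationship between the global configuration and a local host configuration can be pictured as a simple override: the local entries replace the global entries of the same name, and all other global entries stay in effect. The following is a conceptual sketch only. It does not read or write any grid engine files, the function name is hypothetical, and the entry names and paths are illustrative values rather than recommended settings.

```python
from typing import Dict, Mapping


def effective_configuration(
    global_conf: Mapping[str, str],
    local_conf: Mapping[str, str],
) -> Dict[str, str]:
    """Combine a global configuration with a host's local overrides."""
    merged = dict(global_conf)
    # Entries in the local configuration override the global ones.
    merged.update(local_conf)
    return merged


# Illustrative values only.
GLOBAL_CONF = {"mailer": "/bin/mail", "xterm": "/usr/bin/X11/xterm"}
LOCAL_CONF = {"xterm": "/usr/openwin/bin/xterm"}
EFFECTIVE = effective_configuration(GLOBAL_CONF, LOCAL_CONF)
```

In this example the host keeps the global mailer entry and uses its own xterm entry.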

Displaying a Cluster Configuration With QMON

On the QMON Main Control window, click the Cluster Configuration button. The Cluster Configuration dialog box appears.

Figure 1-6 Cluster Configuration Dialog Box

Dialog box titled Cluster Configuration. Shows Host and Configuration lists. Shows Add, Modify, Delete, Done, and Help buttons.

In the Host list, select the name of a host. The current configuration for the selected host is displayed under Configuration.

Displaying the Global Cluster Configuration With QMON

On the QMON Main Control window, click the Cluster Configuration button.

In the Host list, select global.

The configuration is displayed in the format that is described in the sge_conf(5) man page.

Adding and Modifying Global and Host Configurations With QMON

In the Cluster Configuration dialog box (Figure 1-6), select a host name or the name global, and then click Add or Modify. The Cluster Settings dialog box appears.

Dialog box titled Cluster Settings. Shows General Settings tab with global configuration parameters you can set. Shows Ok and Cancel buttons.

The Cluster Settings dialog box enables you to change all parameters of a global configuration or a local host configuration.

All fields of the dialog box are accessible only if you are modifying the global configuration. If you are modifying a local host configuration, the dialog box shows that host's current settings, and you can modify only those parameters that can be set for a local host.

If you are adding a new local host configuration, the dialog box fields are empty.

The Advanced Settings tab provides access to more rarely used cluster configuration parameters. It behaves in the same way as the General Settings tab, depending on whether you are modifying a configuration or adding a new one.

Dialog box titled Cluster Settings. Shows Advanced Settings tab with parameters you can set. Shows Ok and Cancel buttons.

When you finish making changes, click OK to save your changes and close the dialog box. Click Cancel to close the dialog box without saving changes.

See the sge_conf(5) man page for a complete description of all cluster configuration parameters.

Deleting a Cluster Configuration With QMON

On the QMON Main Control window, click the Cluster Configuration button.

In the Host list, select the name of a host whose configuration you want to delete, and then click Delete.
