Example 4-2 Example of qstat Output
Controlling Jobs With qdel and qmod

To control jobs from the command line, use the qdel or qmod command with the appropriate arguments.
Use the qdel command to cancel jobs, regardless of whether the jobs are running or are spooled. Use the qmod command to suspend and resume (unsuspend) jobs already running. For both commands, you need to know the job identification number, which is displayed in response to a successful qsub command. If you forget the number, you can retrieve it with qstat. See Monitoring Jobs With qstat. Here are several examples of the qdel and qmod commands:
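A minimal sketch of such invocations follows. The `run` wrapper and the helper function names are illustrative assumptions, not part of grid engine, and the job ID is hypothetical. By default the wrapper only prints each command, so the sketch can be tried on a machine without a cluster; set DRY_RUN=0 to execute the commands for real.

```shell
#!/bin/sh
# Hypothetical job ID; a real one is reported by qsub or listed by qstat.
JOB_ID=1234

# Illustrative helper: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

suspend_job() { run qmod -s "$1"; }   # suspend a running job
resume_job()  { run qmod -us "$1"; }  # resume (unsuspend) a suspended job
cancel_job()  { run qdel "$1"; }      # cancel a running or spooled job

suspend_job "$JOB_ID"
resume_job "$JOB_ID"
cancel_job "$JOB_ID"
```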
To delete, suspend, or resume a job, you must be the owner of the job or a grid engine manager or operator. See Managers, Operators, and Owners.

You can use the -f (force) option with both commands to register a job status change at sge_qmaster without contacting sge_execd. You might want to use the force option in cases where sge_execd is unreachable, for example, due to network problems. The -f option is intended for use only by the administrator. In the case of qdel, however, users can force deletion of their own jobs if the flag ENABLE_FORCED_QDEL in the cluster configuration qmaster_params entry is set. See the sge_conf(5) man page for more information.

Monitoring Jobs by Email

From the command line, use the qsub command with the appropriate arguments.
The qsub -m command requests email to be sent to the user who submitted a job or to the email addresses specified by the -M flag if certain events occur. See the qsub(1) man page for a description of the flags. An argument to the -m option specifies the events. The following arguments are available:
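As a sketch, assuming the event letters documented in the qsub(1) man page (b for the beginning of a job, e for the end, a for an aborted or rescheduled job, s for a suspended job, and n for no mail), the -m argument can be assembled from readable event names. The function name, the event names, the email address, and the script name job.sh are all hypothetical; the example prints the resulting qsub command rather than submitting anything.

```shell
#!/bin/sh
# Illustrative helper: translate event names into the letters that -m expects.
mail_events() {
  flags=""
  for ev in "$@"; do
    case "$ev" in
      begin)   flags="${flags}b" ;;
      end)     flags="${flags}e" ;;
      abort)   flags="${flags}a" ;;
      suspend) flags="${flags}s" ;;
      *) echo "unknown event: $ev" >&2; return 1 ;;
    esac
  done
  echo "$flags"
}

# Placeholder address and script; prints the command instead of running it.
echo qsub -m "$(mail_events begin end)" -M user@example.com job.sh
# prints: qsub -m be -M user@example.com job.sh
```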
Use a string made up of one or more of the letter arguments to specify several of these options with a single -m option. For example, -m be sends email at the beginning and at the end of a job.

You can also use the Submit Job dialog box to configure these mail events. See Submitting Advanced Jobs With QMON.

Monitoring and Controlling Queues

As described in Displaying Queues and Queue Properties, the owners of queues have permission to suspend and resume queues, and to disable and enable queues. Owners might want to suspend or disable queues if certain machines are needed for important work, and those machines are strongly affected by jobs running in the background.

You can control queues in two ways: with the QMON Queue Control dialog box, or from the command line with the qmod command.
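From the command line, queue state changes go through qmod. The following is a minimal sketch, assuming the -d, -e, -s, and -us options described in the qmod(1) man page and a hypothetical queue named all.q. The `run` wrapper and the function names are illustrative only; the wrapper prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumed queue name; substitute a queue that exists in your cluster.
QUEUE=all.q

# Illustrative helper: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

disable_queue() { run qmod -d "$1"; }   # close the queue; running jobs continue
enable_queue()  { run qmod -e "$1"; }   # make the queue eligible for jobs again
suspend_queue() { run qmod -s "$1"; }   # suspend the queue and its running jobs
resume_queue()  { run qmod -us "$1"; }  # resume the queue

# Drain the queue for maintenance, then reopen it.
disable_queue "$QUEUE"
enable_queue "$QUEUE"
```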
Monitoring and Controlling Queues With QMON

In the QMON Main Control window, click the Queue Control button. The Cluster Queues dialog box appears.

Monitoring and Controlling Cluster Queues

The Cluster Queue tab provides a quick overview of all cluster queues that are defined for the cluster. The Cluster Queue tab also provides the means to suspend and resume cluster queues, to disable and enable cluster queues, as well as to configure them.

Information displayed in the Cluster Queue dialog box is updated periodically. Click Refresh to force an update.

To select a cluster queue, click it. Click Delete, Suspend, Resume, Disable, or Enable to execute the corresponding operation on cluster queues that you select.

The suspend/resume and disable/enable operations require notification of the corresponding sge_execd. If notification is not possible, you can force an sge_qmaster internal status change by clicking Force. For example, notification might not be possible because a host is down.

The suspend/resume and disable/enable operations require cluster queue owner permission, grid engine manager permission, or operator permission. See Managers, Operators, and Owners for details.

Suspended cluster queues are closed for further jobs. The jobs already running in suspended queues are also suspended, as described in Monitoring and Controlling Jobs With QMON. The cluster queue and its jobs are unsuspended as soon as the queue is resumed.

Note - If a job in a suspended cluster queue was suspended explicitly, the job is not resumed when the queue is resumed. The job must be resumed explicitly.

Disabled cluster queues are closed. However, the jobs that are running in those queues are allowed to continue. The disabling of a cluster queue is commonly used to "drain" a queue. After the cluster queue is enabled, it is eligible to run jobs again. No action on currently running jobs is performed.

Error states are displayed using a red font in the queue list. Click Clear Error to remove an error state from a queue.

Click Reschedule to reschedule all jobs currently running in the selected cluster queues.

To configure cluster queues and queue instances, click Add or Modify on the Cluster Queue dialog box. See "Configuring Queues With QMON" in N1 Grid Engine 6 Administration Guide for details.

Click Done to close the dialog box.

Cluster Queue Status

Each row in the cluster queue table represents one cluster queue. For each cluster queue, the table lists the following information: