Sun Microsystems
Products & Services
 
Support & Training
 
 

Previous Previous     Contents     Index     Next Next

The -p 100 priority of job L4_RR supersedes the license-based urgency, which results in the following prioritization:

job-ID  prior    name
---------------------
   3127 1.08000 L4_RR
   3128 0.10500 L5_RR
   3129 0.00500 L1_RR

In this case, traces of these jobs can be found in the schedule file for 6 schedule intervals:

::::::::
      3127:1:STARTING:1077903416:30:G:global:license:4.000000
      3127:1:STARTING:1077903416:30:Q:all.q@carc:slots:1.000000
      3128:1:RESERVING:1077903446:30:G:global:license:5.000000
      3128:1:RESERVING:1077903446:30:Q:all.q@bilbur:slots:1.000000
      3129:1:RESERVING:1077903476:31:G:global:license:1.000000
      3129:1:RESERVING:1077903476:31:Q:all.q@es-ergb01-01:slots:1.000000
      ::::::::
      3127:1:RUNNING:1077903416:30:G:global:license:4.000000
      3127:1:RUNNING:1077903416:30:Q:all.q@carc:slots:1.000000
      3128:1:RESERVING:1077903446:30:G:global:license:5.000000
      3128:1:RESERVING:1077903446:30:Q:all.q@es-ergb01-01:slots:1.000000
      3129:1:RESERVING:1077903476:31:G:global:license:1.000000
      3129:1:RESERVING:1077903476:31:Q:all.q@es-ergb01-01:slots:1.000000
      ::::::::
      3128:1:STARTING:1077903448:30:G:global:license:5.000000
      3128:1:STARTING:1077903448:30:Q:all.q@carc:slots:1.000000
      3129:1:RESERVING:1077903478:31:G:global:license:1.000000
      3129:1:RESERVING:1077903478:31:Q:all.q@bilbur:slots:1.000000
      ::::::::
      3128:1:RUNNING:1077903448:30:G:global:license:5.000000
      3128:1:RUNNING:1077903448:30:Q:all.q@carc:slots:1.000000
      3129:1:RESERVING:1077903478:31:G:global:license:1.000000
      3129:1:RESERVING:1077903478:31:Q:all.q@es-ergb01-01:slots:1.000000
      ::::::::
      3129:1:STARTING:1077903480:31:G:global:license:1.000000
      3129:1:STARTING:1077903480:31:Q:all.q@carc:slots:1.000000
      ::::::::
      3129:1:RUNNING:1077903480:31:G:global:license:1.000000
      3129:1:RUNNING:1077903480:31:Q:all.q@carc:slots:1.000000

Each section shows, for a schedule interval, all resource usage that was taken into account. RUNNING entries show usage of jobs that were already running at the start of the interval. STARTING entries show the immediate uses that were decided within the interval. RESERVING entries show uses that are planned for the future, that is, reservations.

The format of the schedule file is as follows:

jobID

The job ID

taskID

The array task ID, or 1 in the case of nonarray jobs

state

Can be RUNNING, SUSPENDED, MIGRATING, STARTING, RESERVING

start-time

Start time in seconds after 1.1.1070

duration

Assumed job duration in seconds

level-char

Can be P (for parallel environment), G (for global), H (for host), or Q (for queue)

object-name

The name of the parallel environment, host, or queue

resource-name

The name of the consumable resource

usage

The resource usage incurred by the job

The line :::::::: marks the beginning of a new schedule interval.


Note - The schedule file is not truncated. Be sure to turn monitoring off if you do not have an automated procedure that is set up to truncate the file.


What Happens in a Scheduler Interval

The Scheduler schedules work in intervals. Between scheduling actions, the grid engine system keeps information about significant events such as the following:

  • Job submission

  • Job completion

  • Job cancellation

  • An update of the cluster configuration

  • Registration of a new machine in the cluster

When scheduling occurs, the scheduler first does the following:

  • Takes into account all significant events

  • Sorts jobs and queues according to the administrator's specifications

  • Takes into account all the jobs' resource requirements

  • Reserves resources for jobs in a forward-looking schedule

Then the grid engine system does the following tasks, as needed:

  • Dispatches new jobs

  • Suspends running jobs

  • Increases or decreases the resources allocated to running jobs

  • Maintains the status quo

If share-based scheduling is used, the calculation takes into account the usage that has already occurred for that user or project.

If scheduling is not at least in part share-based, the calculation ranks all the jobs running and waiting to run. The calculation then takes the most important job until the resources in the cluster (CPU, memory, and I/O bandwidth) are used as fully as possible.

Previous Previous     Contents     Index     Next Next