Jobs and Queues

In a grid engine system, jobs correspond to bank customers. Jobs wait in a computer holding area instead of a lobby. Queues, which provide services for jobs, correspond to bank employees. As in the case of bank customers, the requirements of each job, such as available memory, execution speed, available software licenses, and similar needs, can be very different. Only certain queues might be able to provide the corresponding service. To continue the analogy, grid engine software arbitrates available resources and job requirements in the following way:
Usage Policies

The administrator of a cluster can define high-level usage policies that are customized according to whatever is appropriate for the site. Four usage policies are available:
Policy management automatically controls the use of shared resources in the cluster to best achieve the goals of the administration. High-priority jobs are dispatched preferentially. Such jobs receive higher CPU entitlements if the jobs compete for resources with other jobs. The grid engine software monitors the progress of all jobs and adjusts their relative priorities correspondingly and with respect to the goals defined in the policies.

Using Tickets to Administer Policies

The functional, share-based, and override policies are defined through a grid engine system concept that is called tickets. Tickets are like shares of a public company's stock. The more stock shares that you own, the more important you are to the company. If shareholder A owns twice as many shares as shareholder B, A also has twice the votes of B. Therefore shareholder A is twice as important to the company. The more tickets a job has, the more important the job is. If job A has twice the tickets of job B, job A is entitled to twice the resource usage of job B.

Jobs can retrieve tickets from the functional, share-based, and override policies. The total number of tickets, as well as the number retrieved from each ticket policy, often changes over time.

The administrator controls the number of tickets that are allocated to each ticket policy in total. Just as ticket allocation does for jobs, this allocation determines the relative importance of the ticket policies among each other. Through the ticket pool that is assigned to particular ticket policies, the administration can run a grid engine system in different ways. For example, the system can run in a share-based mode only. Or the system can run in a combination of modes, for example, 90% share-based and 10% functional.

Using the Urgency Policy to Assign Job Priority

The urgency policy can be used in combination with two other job priority specifications:
A job can be assigned an urgency value, which is derived from three sources:
The administrator can separately weight the importance of each of these sources in order to arrive at a job's overall urgency value. For more information, see Chapter 5, "Managing Policies and the Scheduler," in N1 Grid Engine 6 Administration Guide. Figure 1-2 shows the correlation among policies.

Figure 1-2 Correlation Among Policies in a Grid Engine System

Grid Engine System Components

The following sections explain the functions of the most important grid engine system components.

Hosts

Four types of hosts are fundamental to the grid engine system:
Master Host

The master host is central to the overall cluster activity. The master host runs the master daemon sge_qmaster and the scheduler daemon sge_schedd. Both daemons control all grid engine system components, such as queues and jobs. The daemons maintain tables about the status of the components, about user access permissions, and the like.

By default, the master host is also an administration host and a submit host. See the sections that describe those hosts.

Execution Hosts

Execution hosts are systems that have permission to execute jobs. Therefore execution hosts have queue instances attached to them. Execution hosts run the execution daemon sge_execd.

Administration Hosts

Administration hosts are hosts that have permission to carry out any kind of administrative activity for the grid engine system.

Submit Hosts

Submit hosts allow users to submit and control batch jobs only. In particular, a user who is logged in to a submit host can submit jobs with the qsub command, can monitor the job status with the qstat command, and can use the grid engine system OSF/1 Motif graphical user interface QMON, which is described in QMON, the Grid Engine System's Graphical User Interface.

Note - A system can act as more than one type of host.

Daemons

Three daemons provide the functionality of the grid engine system.

sge_qmaster - the Master Daemon

The center of the cluster's management and scheduling activities, sge_qmaster maintains tables about hosts, queues, jobs, system load, and user permissions. sge_qmaster receives scheduling decisions from sge_schedd and requests actions from sge_execd on the appropriate execution hosts.

sge_schedd - the Scheduler Daemon

The scheduling daemon maintains an up-to-date view of the cluster's status with the help of sge_qmaster. The scheduling daemon makes the following scheduling decisions:
The daemon then forwards these decisions to sge_qmaster, which initiates the required actions.
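
The proportional nature of tickets, described under Using Tickets to Administer Policies, can be illustrated with a short Python sketch. This is only an illustration of the arithmetic, not the algorithm that sge_schedd actually uses. The function names entitlement_shares and split_ticket_pool, the job names, and the ticket counts are all hypothetical and do not correspond to any grid engine command or configuration parameter.

```python
from typing import Dict


def entitlement_shares(tickets: Dict[str, int]) -> Dict[str, float]:
    """Return each job's fraction of the total tickets (hypothetical helper).

    A job that holds twice the tickets of another job gets twice the share.
    """
    total = sum(tickets.values())
    if total == 0:
        # No tickets were handed out, so no job has any entitlement.
        return {job: 0.0 for job in tickets}
    return {job: count / total for job, count in tickets.items()}


def split_ticket_pool(total_tickets: int, percent: Dict[str, int]) -> Dict[str, int]:
    """Divide a ticket pool among policies by whole percentages (hypothetical helper)."""
    if sum(percent.values()) != 100:
        raise ValueError("policy percentages must add up to 100")
    return {policy: total_tickets * pct // 100 for policy, pct in percent.items()}


if __name__ == "__main__":
    # Job A holds twice the tickets of job B (made-up numbers).
    shares = entitlement_shares({"job_a": 200, "job_b": 100})
    for job, share in shares.items():
        print(f"{job}: {share:.1%} of contended resources")

    # A combined mode: 90% share-based and 10% functional, as in the text.
    pools = split_ticket_pool(10000, {"share_based": 90, "functional": 10})
    for policy, count in pools.items():
        print(f"{policy}: {count} tickets")
```

In this sketch, the ratio between the two jobs' shares equals the ratio between their ticket counts, which mirrors the shareholder analogy. The pool split shows how an administrator's allocation of tickets to each policy sets the relative weight of the policies.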