Sun Microsystems

Installing the N1 Grid Engine 6 Software

If you plan a new installation of the N1 Grid Engine 6 software, or if you are only adding packages that were not previously installed to your N1 Grid Engine cluster (such as Windows support, ARCo, or GEMM), see N1GE6Update4_Installation_Guide.pdf, the N1 Grid Engine 6 Installation Guide included on the distribution CD. If you have already installed the N1 Grid Engine 6 software packages, install the patches available at http://sunsolve.sun.com. The README document in each patch describes how to install it.

The patch matrix below lists the N1 Grid Engine 6 patches available as of May 2005. Newer revisions of these patches, or additional patches, may become available later. Check http://sunsolve.sun.com for the current N1 Grid Engine 6 patches.

Table 1 Patches For Packages in Sun pkgadd Format

Package Name [1]   Operating System         Architecture [2]   Patch-ID
SUNWsgee           Solaris, SPARC, 32-bit   sol-sparc          118094-04
SUNWsgeex          Solaris, SPARC, 64-bit   sol-sparc64        118130-04
SUNWsgeei          Solaris, x86             sol-x86            118131-04
SUNWsgeec          all                      common             118132-04
SUNWsgeea          all                      arco               118133-04
SUNWsgeed          all                      doc                119846-01

  1. See pkginfo(1)

  2. N1 Grid Engine binary architecture string, or one of: common (architecture-independent packages), arco (Accounting and Reporting Console), or doc (documentation)

Table 2 Patches for Packages in tar.gz Format

Operating System              Architecture   Patch-ID
Solaris, SPARC, 32-bit        sol-sparc      118082-04
Solaris, SPARC, 64-bit        sol-sparc64    118083-04
Solaris, x86                  sol-x86        118084-04
Linux kernel 2.4/2.6, x86     lx24-x86       118085-04
Linux kernel 2.4/2.6, AMD64   lx24-amd64     118086-04
IBM AIX 4.3                   aix43          118087-04
IBM AIX 5.1                   aix51          118088-04
Apple Mac OS X                darwin         118089-04
HP-UX 11                      hp11           118090-04
SGI IRIX 6.5                  irix65         118091-04
all                           common         118092-04
all                           arco           118093-04
all                           doc            119861-01
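
When patching several hosts, it can help to look up the Table 2 patch IDs by architecture string. The following is a minimal sketch, not part of the N1 Grid Engine distribution. The helper name patches_for and its parameters are hypothetical, and the sketch assumes that each host needs its binary architecture patch plus the common patch, with arco and doc being optional. The patch IDs are copied from Table 2 and reflect the May 2005 revisions only.

```python
# Patch IDs copied from Table 2 (tar.gz format, May 2005 revisions).
TAR_GZ_PATCHES = {
    "sol-sparc": "118082-04",
    "sol-sparc64": "118083-04",
    "sol-x86": "118084-04",
    "lx24-x86": "118085-04",
    "lx24-amd64": "118086-04",
    "aix43": "118087-04",
    "aix51": "118088-04",
    "darwin": "118089-04",
    "hp11": "118090-04",
    "irix65": "118091-04",
    "common": "118092-04",
    "arco": "118093-04",
    "doc": "119861-01",
}

# Entries that are not binary architecture strings.
NON_BINARY = ("common", "arco", "doc")


def patches_for(arch, with_arco=False, with_doc=False):
    """Return patch IDs for one binary architecture (hypothetical helper).

    Assumption: a host needs its architecture patch plus the common patch;
    the arco and doc patches are added only on request.
    """
    if arch in NON_BINARY or arch not in TAR_GZ_PATCHES:
        raise ValueError(f"unknown binary architecture: {arch!r}")
    wanted = [arch, "common"]
    if with_arco:
        wanted.append("arco")
    if with_doc:
        wanted.append("doc")
    return [TAR_GZ_PATCHES[name] for name in wanted]
```

Calling patches_for("lx24-amd64") returns the lx24-amd64 and common patch IDs, in that order. Check http://sunsolve.sun.com for newer revisions before relying on these numbers.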

Changes in N1 Grid Engine 6 Update 4 Software

Along with many bug fixes, N1 Grid Engine 6 Update 4 includes the following changes.

Support for Microsoft Windows Operating Systems

N1 Grid Engine 6 Update 4 Windows client functionality (submit, administration and execution host) is now available for Microsoft Windows 2000 SP3 (or higher), Windows XP Professional SP1 (or higher) and Windows Server 2003. The N1 Grid Engine command line tools and execution host functionality are almost fully supported on these operating systems.

N1 Grid Engine support for Windows allows users to fully integrate Windows hosts into an existing N1 Grid Engine environment. Users can submit and monitor their jobs through the command line tools. Administrators can have full control over an N1 Grid Engine cluster from a Windows host. The execution host functionality allows you to use Windows desktop machines and dedicated Windows compute servers to execute batch workloads and interactive jobs.

Installing N1 Grid Engine 6 U4 on Windows hosts requires Microsoft Services For UNIX (SFU) 3.5, which provides tools and libraries to integrate Windows with UNIX. SFU 3.5 is available without a license fee and is supported by Microsoft. See http://www.microsoft.com/windows/sfu/default.asp for information about SFU, its requirements, and how to obtain it.

New Grid Engine Management Module (GEMM) for Sun Control Station

GEMM is a new addition to N1GE6 that provides a web-based interface for deployment, monitoring, and diagnostics of an N1 Grid Engine installation. It operates in the framework provided by Sun Control Station 2.2, a product that must be purchased separately.

Sun Control Station (SCS) 2.2 provides overall life-cycle management of servers, from bare-metal OS provisioning, to software and patch deployment, to basic health, inventory, and hardware monitoring, all in an easy-to-use web interface. GEMM adds the following capabilities:

  • Deploy N1GE -- Install and configure N1GE software on grid hosts, including a master host, compute hosts, and access hosts. Key features include:

    • deploy any supported version of N1GE. Initially, N1GE6u4 is supported. Future versions will be qualified for support.

    • work with a previously-installed master host.

  • Monitoring -- Provide high level monitoring of N1GE jobs, queues, and hosts. Key features include:

    • drill down for details on jobs, queues, and hosts

    • selectively filter job display to focus on jobs of interest

    • provide monitoring even for Grid hosts outside the SCS/GEMM framework.

  • Diagnostics -- Provide tools for doing first-level diagnostics of N1GE problems within the web interface. Key features include:

    • display job scheduling information

    • view spool files for running jobs

    • view messages files for qmaster and execd daemons.

Support for Solaris 10 x64 (Solaris on Opteron systems 64-bit)

Solaris 10 x64 (on AMD Opteron hardware) is now fully supported with this release.

Other Functionality Delivered With This Update Release

This list summarizes new and improved functionality which has been added to the N1 Grid Engine 6 software since it was released in June 2004.

  • The Accounting and Reporting Console (ARCo) now uses the GUI elements of the Sun Web Console. This update improves the look and feel as well as the scalability of the web interface of ARCo.

  • The scalability, submit rate, scheduling speed, status query speed of qstat, job turnaround times, and PE job start have been significantly improved for many typical use cases in comparison to the original N1 Grid Engine 6 release.

  • The execution daemon installation is supported in Solaris 10 Containers (Zones).

  • The DRMAA Java language binding is now available. The binding library is the file <sge_root>/lib/drmaa.jar, and its documentation is in <sge_root>/doc/javadocs.

  • The qping utility has been significantly improved to diagnose N1 Grid Engine daemon communication. See the qping(1) man page for more information.

  • The auto installation process has been improved. Automatic installation and uninstallation of the Berkeley DB (BDB) RPC server is now possible. The backup and restore procedure is now supported for the classic spooling option and the Berkeley DB RPC spooling option.

  • Berkeley DB spooling on NFSv4 under Solaris 10 is supported.

  • The BDB database can now be installed on an NFSv4-mounted file system on Solaris 10.

    For performance reasons, it is recommended that you use NFSv4 BDB spooling only when the NFSv4 mount provides an excellent high speed connection to the file server.

  • On Linux platforms, an LSB conforming "lock" file is created by the daemon startup script.

  • New man pages for the utility binaries gethostbyaddr, gethostbyname, gethostname, getservbyname and qping have been added. The man page sge_h_aliases(5) has been renamed to host_aliases(5).

  • It is now possible to prevent jobs from inheriting the execution daemon environment. It is also possible to prevent only the variable LD_LIBRARY_PATH from being inherited from the execution daemon environment (this feature can solve NFS issues in certain setups). See sge_conf(5) for more information.

  • New options reduce the memory overhead and improve the speed of qstat. See the -u and -s flags in sge_qstat(5) and qstat(1) for more information.

  • The new qconf -purge switch makes it easy to remove all references to a host or hostgroup from a cluster queue. See the -purge switch in qconf(1) for more information.

  • The spooling interval for the sharetree usage can now be configured to reduce I/O and improve performance on NFS-mounted qmaster spool directories. See the STREE_SPOOL_INTERVAL parameter in sge_conf(5) for more information.

  • Execution daemons reconnect faster in the Certificate Security Protocol (CSP) installation mode.

  • Execution daemons can now reconnect to qmaster faster if the execution daemon or qmaster daemon has been restarted.
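
The environment-inheritance item in the list above can be pictured in general process terms. The sketch below is only an illustration of the idea and is not how N1 Grid Engine implements it. The function job_environment and its parameters are hypothetical names; the real switches are described in sge_conf(5).

```python
def job_environment(daemon_env, inherit=True, drop_ld_library_path=False):
    """Build a job environment from a daemon environment (illustration only).

    daemon_env            mapping of variable names to values, standing in
                          for the execution daemon environment
    inherit               if False, the job starts from an empty environment
    drop_ld_library_path  if True, LD_LIBRARY_PATH is not passed to the job
    """
    # Copy so that the daemon's own environment is never modified.
    env = dict(daemon_env) if inherit else {}
    if drop_ld_library_path:
        # pop with a default does nothing if the variable is not set.
        env.pop("LD_LIBRARY_PATH", None)
    return env
```

A dictionary built this way could be passed as the env argument of subprocess.run to start a child process with the filtered environment.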
