Installing the N1 Grid Engine 6 Software
If you plan to add a new installation of the N1 Grid Engine 6 software, or if
you are only adding packages that have not been installed previously to your
N1 Grid Engine cluster (such as Windows support, ARCo, or GEMM), see
N1GE6Update4_Installation_Guide.pdf, the N1 Grid Engine 6 Installation
Guide that is included on the distribution CD. If you have already installed the
N1 Grid Engine 6 software packages, install the patches that are available at
http://sunsolve.sun.com. The patch README documents describe how to install
the patches.
The patch matrix below lists the patches for N1 Grid Engine 6 that are
available as of May 2005. Newer revisions of these patches or additional patches
may become available later. Check http://sunsolve.sun.com for
the current N1 Grid Engine 6 patches.
Table 1 Patches For Packages in Sun pkgadd Format
Package Name [1] | Operating System | Architecture [2] | Patch-ID |
SUNWsgee | Solaris, Sparc, 32bit | sol-sparc | 118094-04 |
SUNWsgeex | Solaris, Sparc, 64bit | sol-sparc64 | 118130-04 |
SUNWsgeei | Solaris, x86 | sol-x86 | 118131-04 |
SUNWsgeec | all | common | 118132-04 |
SUNWsgeea | all | arco | 118133-04 |
SUNWsgeed | all | doc | 119846-01 |
[1] See pkginfo(1).
[2] N1 Grid Engine binary architecture string, or common (architecture-independent
packages), arco (Accounting and Reporting Console), or doc (documentation).
Table 2 Patches for Packages in tar.gz Format
Operating System | Architecture | Patch-ID |
Solaris, Sparc, 32bit | sol-sparc | 118082-04 |
Solaris, Sparc, 64bit | sol-sparc64 | 118083-04 |
Solaris, x86 | sol-x86 | 118084-04 |
Linux kernel 2.4/2.6, x86 | lx24-x86 | 118085-04 |
Linux kernel 2.4/2.6, AMD64 | lx24-amd64 | 118086-04 |
IBM AIX 4.3 | aix43 | 118087-04 |
IBM AIX 5.1 | aix51 | 118088-04 |
Apple Mac OS X | darwin | 118089-04 |
HP-UX 11 | hp11 | 118090-04 |
SGI Irix 6.5 | irix65 | 118091-04 |
all | common | 118092-04 |
all | arco | 118093-04 |
all | doc | 119861-01 |
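The two patch tables above can be read as a lookup from architecture string to patch ID. The following is a minimal Python sketch of that lookup, not a Sun-provided tool: the function name patch_id and the dictionary names are hypothetical, and the IDs are the May 2005 revisions copied from Table 1 and Table 2, so newer revisions on http://sunsolve.sun.com may supersede them.

```python
# Patch IDs copied from Table 1 (pkgadd format) and Table 2 (tar.gz format).
# These are the May 2005 revisions; newer revisions may exist.
PKGADD_PATCHES = {
    "sol-sparc": "118094-04",
    "sol-sparc64": "118130-04",
    "sol-x86": "118131-04",
    "common": "118132-04",
    "arco": "118133-04",
    "doc": "119846-01",
}

TARGZ_PATCHES = {
    "sol-sparc": "118082-04",
    "sol-sparc64": "118083-04",
    "sol-x86": "118084-04",
    "lx24-x86": "118085-04",
    "lx24-amd64": "118086-04",
    "aix43": "118087-04",
    "aix51": "118088-04",
    "darwin": "118089-04",
    "hp11": "118090-04",
    "irix65": "118091-04",
    "common": "118092-04",
    "arco": "118093-04",
    "doc": "119861-01",
}


def patch_id(architecture, package_format="tar.gz"):
    """Return the patch ID for an architecture string (hypothetical helper).

    package_format is "pkgadd" (Table 1) or "tar.gz" (Table 2).
    """
    tables = {"pkgadd": PKGADD_PATCHES, "tar.gz": TARGZ_PATCHES}
    if package_format not in tables:
        raise ValueError(f"unknown package format: {package_format!r}")
    table = tables[package_format]
    if architecture not in table:
        raise KeyError(f"no {package_format} patch listed for {architecture!r}")
    return table[architecture]


if __name__ == "__main__":
    # Look up the binary patch and the architecture-independent patch
    # for a 64-bit Linux host installed from the tar.gz distribution.
    print(patch_id("lx24-amd64"), patch_id("common"))
```

The pkgadd table only covers the Solaris architectures, so asking for a Linux architecture with package_format="pkgadd" raises an error instead of guessing.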
Changes in N1 Grid Engine 6 Update 4 Software
Along with many bug fixes, N1 Grid Engine 6 Update 4 includes the following
changes.
Support for Microsoft Windows Operating Systems
N1 Grid Engine 6 Update 4 Windows client functionality (submit, administration
and execution host) is now available for Microsoft Windows 2000 SP3 (or higher), Windows
XP Professional SP1 (or higher) and Windows Server 2003. The N1 Grid Engine command
line tools and execution host functionality are almost fully supported on these operating
systems.
The support of N1 Grid Engine for Windows allows users to fully integrate Windows
hosts into an existing N1 Grid Engine environment. Users are able to submit and monitor
their jobs through the command line tools. Administrators can have full control over
an N1 Grid Engine cluster from a Windows host. The execution host functionality allows
you to use Windows desktop machines and dedicated Windows compute servers for the
execution of batch workload and interactive jobs.
Installation of N1 Grid Engine 6 U4 requires Microsoft Services For UNIX (SFU)
3.5, which provides tools and libraries to integrate Windows with UNIX. SFU 3.5 is
available at no license fee and is supported by Microsoft. See
http://www.microsoft.com/windows/sfu/default.asp for information about SFU, its
requirements, and how to obtain it.
New Grid Engine Management Module (GEMM) for Sun Control Station
GEMM is a new addition to N1GE6 which provides a web-based interface for deployment,
monitoring, and diagnostics of an N1 Grid Engine installation. It operates in the
framework provided by Sun Control Station 2.2, a product which must be purchased separately.
Sun Control Station (SCS) 2.2 provides overall life-cycle management of servers,
from bare-metal OS provisioning, to software and patch deployment, to basic health,
inventory, and hardware monitoring, all in an easy-to-use web interface. GEMM adds
to this the following capabilities:
Deploy N1GE -- Install and configure N1GE software on grid hosts,
including a master host, compute hosts, and access hosts. Key features include:
    deploy any supported version of N1GE. Initially, N1GE6u4 is supported. Future versions will be qualified for support.
    work with a previously-installed master host.
Monitoring -- Provide high level monitoring of N1GE jobs, queues,
and hosts. Key features include:
    drill down for details on jobs, queues, and hosts
    selectively filter job display to focus on jobs of interest
    provide monitoring even for Grid hosts outside the SCS/GEMM framework.
Diagnostics -- Provide tools for doing first-level diagnostics
of N1GE problems within the web interface. Key features include:
    display job scheduling information
    view spool files for running jobs
    view messages files for qmaster and execd daemons.
Support for Solaris 10 x64 (Solaris on Opteron systems 64-bit)
Solaris 10 x64 (on AMD Opteron hardware) is now fully supported with this release.
Other Functionality Delivered With This Update Release
This list summarizes new and improved functionality which has been added to
the N1 Grid Engine 6 software since it was released in June 2004.
The Accounting and Reporting Console (ARCo) now uses the GUI elements
of the Sun Web Console. This update improves the look and feel as well as the scalability
of the web interface of ARCo.
The scalability, submit rate, scheduling speed, status query speed
of qstat, job turnaround times, and PE job start have been significantly
improved for many typical use cases in comparison to the original N1 Grid Engine 6
release.
The execution daemon installation is supported in Solaris 10 Containers
(Zones).
The DRMAA Java language binding is now available. The DRMAA Java language
binding library is located at <sge_root>/lib/drmaa.jar.
The documentation is in <sge_root>/doc/javadocs.
The qping utility has been significantly improved
to diagnose N1 Grid Engine daemon communication. See the qping(1) man
page for more information.
The auto installation process has been improved. Automatic installation
and uninstallation of the Berkeley DB (BDB) RPC server are now possible. The backup
and restore procedure is now supported for the classic spooling option and the Berkeley
DB RPC spooling option.
Berkeley DB spooling on NFSv4 under Solaris 10 is supported.
The BDB database can now be installed on an NFSv4-mounted file system
on Solaris 10.
For performance reasons, it is recommended that you use
NFSv4 BDB spooling only when the NFSv4 mount provides an excellent high speed connection
to the file server.
On Linux platforms, an LSB-conforming "lock" file is created by the
daemon startup script.
New man pages for the utility binaries gethostbyaddr, gethostbyname, gethostname, getservbyname and qping have been added. The man page sge_h_aliases(5) has
been renamed to host_aliases(5).
It is now possible to prevent the job environment from inheriting the execution
daemon environment. It is also possible to exclude the variable LD_LIBRARY_PATH
from the environment inherited from the execution daemon (this feature can solve
NFS issues in certain setups). See sge_conf(5) for more information.
New options reduce the memory overhead and improve the speed of qstat.
See sge_qstat(5) and qstat(1) (the -u and -s flags) for more information.
The new qconf -purge switch makes it easy to remove
all references to a host or hostgroup from a cluster queue. See the description
of the -purge switch in qconf(1) for more information.
The spooling interval for the sharetree usage can now be configured
to reduce I/O and improve performance on NFS mounted qmaster spool
directories. See sge_conf(5) for more information in the section
about the STREE_SPOOL_INTERVAL parameter.
Faster execution daemon reconnect in the Certificate Security Protocol
(CSP) installation mode.
Execution daemons now can reconnect faster to qmaster if
the execution daemon or qmaster daemon has been restarted.
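Two of the command-line additions listed above, the qstat -u and -s flags and the qconf -purge switch, might be invoked as in the following Python sketch. It only builds and prints the command lines and does not contact a cluster. The helper names, the user name, the queue name, and the host name are hypothetical, and the argument order shown for -purge (object type, attribute list, then queue@host or queue@@hostgroup) is an assumption based on qconf(1) that should be checked against the installed man page.

```python
import shlex


def qstat_command(user, states=None):
    """Build a qstat command restricted to one user and, optionally, job states.

    Hypothetical helper. -u takes a user list and -s takes a state selector
    such as "r" (running) or "p" (pending); see qstat(1).
    """
    argv = ["qstat", "-u", user]
    if states:
        argv += ["-s", states]
    return argv


def purge_command(queue, host_or_group, attributes="*"):
    """Build a qconf -purge command for one queue instance or queue domain.

    Hypothetical helper. Assumes the form
        qconf -purge queue <attr_nm,...> <queue>@<host or @hostgroup>
    where "*" stands for all attributes; verify against qconf(1).
    """
    return ["qconf", "-purge", "queue", attributes, f"{queue}@{host_or_group}"]


if __name__ == "__main__":
    # Example values only: user "alice", queue "all.q", host "node17",
    # and hostgroup "@oldhosts" are made up for illustration.
    commands = [
        qstat_command("alice", states="r"),
        purge_command("all.q", "node17"),
        purge_command("all.q", "@oldhosts", attributes="slots"),
    ]
    for argv in commands:
        # shlex.join quotes arguments such as "*" so the printed line
        # can be pasted into a shell safely.
        print(shlex.join(argv))
```

Building the argument list first and quoting it only for display keeps the "*" wildcard from being expanded by a shell. To run a command, the same list could be passed to subprocess.run on a host where the N1 Grid Engine binaries are in the PATH.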