This chapter contains information about using AlphaServer GS80/160/320 hard partitions in a TruCluster Server Version 5.1A configuration with Tru64 UNIX Version 5.1A. The chapter discusses the following topics:
An overview of the use of hard partitions in an AlphaServer GS80, GS160, or GS320 TruCluster Server configuration (Section 7.1).
The hardware requirements for using an AlphaServer GS80, GS160, or GS320 hard partition in a cluster (Section 7.2).
How to reconfigure a single partition AlphaServer GS80, GS160, or GS320 as multiple hard partitions in a TruCluster Server configuration (Section 7.3).
How to determine an AlphaServer GS80, GS160, or GS320 system configuration (Section 7.4).
How to update AlphaServer GS80, GS160, or GS320 firmware (Section 7.5).
An AlphaServer GS80/160/320 system provides the capability to define individual subsets of the system's computing resources. Each subset is capable of running an operating system.
The Tru64 UNIX Version 5.1A operating system supports hard partitions, which are partitions that are defined by a quad building block (QBB) boundary. All the CPUs, memory, and I/O resources in a QBB are part of a hard partition; you cannot split the components across multiple hard partitions, and resources cannot be shared between hard partitions. A partition can include multiple QBBs.
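The rule that a QBB belongs to exactly one hard partition can be checked on paper before any hardware is touched. The following is a minimal sketch, not part of any AlphaServer tooling; the function name and the dictionary layout are assumptions made for illustration.

```python
from collections import Counter

def find_shared_qbbs(partitions):
    """Return the sorted QBB numbers that appear in more than one partition.

    `partitions` maps a partition number to the QBB numbers planned for it
    (a hypothetical planning structure, not an SCM data format).
    """
    counts = Counter(qbb for qbbs in partitions.values() for qbb in set(qbbs))
    return sorted(qbb for qbb, n in counts.items() if n > 1)

# A valid plan: two partitions, each made of two whole QBBs.
plan = {0: [0, 1], 1: [2, 3]}
print(find_shared_qbbs(plan))  # → []
```

An empty result means that no QBB is split across partitions; any number in the result identifies a QBB that the plan assigns twice.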
The TruCluster Server Version 5.1A product supports the use of an AlphaServer GS80/160/320 hard partition as a cluster member system. You can compose a cluster entirely of the partitions on a system, or of AlphaServer GS80/160/320 partitions and other AlphaServer systems. You can view an AlphaServer GS80/160/320 hard partition as a separate, standalone system.
The AlphaServer GS80/160/320 systems use the same switch technology, the same CPU, memory, and power modules, and the same I/O riser modules. The GS160 and GS320 systems house the modules in up to two system boxes, each with two QBBs, in a cabinet. The GS320 requires two cabinets for the system boxes.
The GS80 is a rack system with the system modules for each QBB in a drawer. An 8-processor GS80 uses two drawers for the CPU, memory, and I/O riser modules.
All the systems use the same type of PCI drawers for I/O. They are located in the GS160/GS320 power cabinet or in the GS80 RETMA cabinet. Additional PCI drawers are mounted in expansion cabinets.
7.2 Hardware Requirements for a Hard Partition in a Cluster
The TruCluster Server hardware requirements are the same for an AlphaServer GS80/160/320 hard partition as for any other system in a cluster. You must have:
A supported host bus adapter connected to shared storage. This may be a KZPBA-CB for parallel SCSI, or a KGPSA-CA for Fibre Channel.
One or more network connections.
A Memory Channel interface. The AlphaServer GS80/160/320 system supports only the MC2 products.
Each AlphaServer GS80/160/320 hard partition that is used in a cluster must contain at least one QBB with a minimum of one CPU and one memory module. Additionally, there must be:
At least one local I/O riser module in the partition. Figure 7-1 shows a portion of an AlphaServer GS160 QBB with an I/O riser module with a BN39B cable that is connected to port 0.
At least one I/O riser in the partition must be connected to a primary PCI drawer that provides the console terminal and operating system boot disk. For example, the portion of the cable on port 0 of the local I/O riser shown in Figure 7-1 could be connected to the I/O Riser 0 (0-R) connector in Figure 2-1 and Figure 7-3. A primary PCI drawer contains a standard I/O module that provides both System Reference Manual (SRM) and system control manager (SCM) firmware. You can connect additional I/O risers in the partition to expansion PCI drawers.
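The partition-level requirements listed above lend themselves to a simple checklist. The sketch below models them under stated assumptions: the `QBB` class, its field names, and the message strings are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class QBB:
    cpus: int
    memory_modules: int
    local_io_risers: int = 0
    # True if a local I/O riser in this QBB is cabled to a primary PCI drawer.
    primary_drawer_attached: bool = False

def unmet_requirements(qbbs):
    """Return a list of requirements that the planned partition does not meet."""
    problems = []
    if not any(q.cpus >= 1 and q.memory_modules >= 1 for q in qbbs):
        problems.append("no QBB with at least one CPU and one memory module")
    if not any(q.local_io_risers >= 1 for q in qbbs):
        problems.append("no local I/O riser module")
    if not any(q.primary_drawer_attached for q in qbbs):
        problems.append("no I/O riser connected to a primary PCI drawer")
    return problems

# One QBB with four CPUs, two memory modules, and a riser cabled to a primary drawer.
print(unmet_requirements([QBB(4, 2, 1, True)]))  # → []
```

The check covers only the QBB and I/O riser requirements; the host bus adapter, network, and Memory Channel requirements still have to be verified separately.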
Figure 7-1: Portion of QBB Showing I/O Riser Modules
Notes
You can have up to two I/O riser modules in a QBB, but you cannot split them across partitions.
Each I/O riser has two cable connections (Port 0 and Port 1). Ensure that both cables from one I/O riser are connected to the same PCI drawer (0-R and 1-R in Figure 2-1).
A QBB I/O riser (local) is connected to a PCI I/O riser (remote) by BN39B cables. These cables are the same cables that are used with MC2 hardware. Ensure that you connect the BN39B cable from a QBB I/O riser to the 0-R (I/O Riser 0) or 1-R (I/O Riser 1) connector in a PCI drawer and not to a Memory Channel module.
We recommend that you connect I/O riser 0 (local I/O riser ports 0 and 1) to the primary PCI drawer that will be the master system control manager (SCM).
The BA54A-AA PCI drawer (the bottom PCI drawer in Figure 7-2 and Figure 7-3) is a primary PCI drawer. See Figure 2-1 for PCI drawer slot layout. A primary PCI drawer contains:
A standard I/O module in slot 0-0/1 that has EEPROMs for the system control manager (SCM) and system reference manual (SRM) firmware. The SCM is powered by the Vaux output of the PCI power supply whenever AC power is applied to the PCI drawer.
The master SCM uses the console serial bus (CSB) to:
Control system power-up
Monitor and configure the system
Halt and reset the system
Update firmware
Operating system disk
Two remote I/O riser modules (for connection to the QBB local I/O riser module)
Two PCI backplanes: Each PCI backplane (Figure 2-1) has two PCI buses. PCI bus 0 has three slots. PCI bus 1 has four slots. A primary PCI drawer has a standard I/O module in PCI bus 0 slot 0-0/1.
CD-ROM
Two power supplies (providing a redundant power supply)
Console serial bus (CSB) interface module: The console serial bus consists of a network of microprocessors that the master SCM controls in a master/slave relationship. Each node is programmed to control and monitor the subsystem in which it resides, in response to commands from, or polling by, the master SCM.
The CSB network consists of the following nodes:
One to eight SCMs. The SCM in the primary PCI drawer that is connected to the operator control panel (OCP) and has the lowest node ID (usually 0) is the default master SCM upon initial power-up. The remaining SCMs are slaves. You can designate one slave SCM as a standby to the master. The primary PCI drawer with the slave SCM that you designate to be the standby must also be connected to the OCP. The OCP has two connectors for this purpose. The standby SCM must have a node ID (usually set to 1) that is higher than the master SCM. Both the master SCM and standby SCM must have the scm_csb_master_eligible SCM environment variable set.
Note
We recommend that you put the primary PCI drawers that contain the master and standby SCM in the power cabinet. They both must be connected to the OCP.
One to 16 PCI backplane managers (PBMs), one for each PCI backplane
A hierarchical switch power manager (HPM), if the H-switch is present
Local terminal/COM1 port (on the standard I/O module): Connect a cable from the local terminal port on the standard I/O module to the terminal server for each partition. The terminal server is connected to the system management console (PC) that provides a terminal emulator window for each console.
Modem port (on the standard I/O module)
Two universal serial bus (USB) ports (on standard I/O module)
Keyboard port
Mouse port
Operator Control Panel (OCP) port
Parallel port
Communication port (COM2)
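The master and standby SCM rule described in the CSB node list above (connected to the OCP, eligible, lowest node ID wins) can be expressed as a short selection function. This is a simplified model for planning purposes, not a description of the SCM firmware; the `Scm` class and its field names are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scm:
    node_id: int           # PCI drawer node ID switch setting
    ocp_connected: bool    # primary PCI drawer is cabled to the OCP
    master_eligible: bool  # scm_csb_master_eligible is set for this SCM

def pick_master_and_standby(scms):
    """Return (master, standby); either may be None if no candidate qualifies."""
    candidates = sorted(
        (s for s in scms if s.ocp_connected and s.master_eligible),
        key=lambda s: s.node_id,
    )
    master = candidates[0] if candidates else None
    standby = candidates[1] if len(candidates) > 1 else None
    return master, standby

drawers = [Scm(2, False, False), Scm(1, True, True), Scm(0, True, True)]
master, standby = pick_master_and_standby(drawers)
print(master.node_id, standby.node_id)  # → 0 1
```

Modeling the rule this way makes one consequence explicit: an SCM with a low node ID that is not cabled to the OCP is never a candidate.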
The BA54A-BA PCI drawer is an expansion PCI drawer (top PCI drawer in Figure 7-2 and Figure 7-3) and contains:
Two I/O riser modules (for connection to a QBB I/O riser module)
Two power supplies (which provide a redundant power supply)
Two PCI backplanes. Each PCI backplane has two PCI buses with a total of seven available slots.
Console serial bus interface module
Figure 7-2 shows the front view of an expansion and a primary PCI drawer. The primary PCI drawer is on the bottom. You can easily recognize it because of the CD-ROM, keyboard and mouse ports, COM2 and parallel ports, and connection to the OCP.
Figure 7-3 shows the rear view of both types of PCI drawers. It is harder to distinguish the type of PCI drawer from the rear, but slot 1 provides the key. The primary PCI drawer has a standard I/O module in slot 1, and the console and modem ports and USB connections are visible on the module.
Figure 7-2: Front View of Expansion and Primary PCI Drawers
Figure 7-3: Rear View of Expansion and Primary PCI Drawers
7.3 Configuring Partitioned GS80, GS160, or GS320 Systems in a TruCluster Configuration
An AlphaServer GS80/160/320 system can be a member of a TruCluster Server configuration. Alternatively, any AlphaServer GS80/160/320 hard partition can participate as a member system, provided that the partition meets the hardware requirements that Section 7.2 describes.
The following section covers configuring a single-partition AlphaServer GS80/160/320 system as multiple hard partitions in a TruCluster Server configuration. The description covers the case of a newly installed system that is to be used as two member systems in a TruCluster Server configuration.
7.3.1 Repartitioning a Single-Partition AlphaServer GS80/160/320 as Two Partitions in a Cluster
The information in this section assumes that this is a new AlphaServer GS80/160/320 system with hardware installed, the system management console is connected for the first partition, a terminal emulator window is open for the first partition, and that the system has been powered up and tested as a single partition. Also, this section assumes that you have determined which QBBs to use in each partition. Although the procedure specifies two hard partitions, the maximum for a GS80 system, it will work equally well with any number of partitions (as supported by the system type) by modifying the amount and placement of hardware and the SCM environment variable values.
Notes
View each partition as a separate system.
Ensure that the system comes up as a single partition the first time that you turn power on. Do not turn the key switch on. Only turn on the AC circuit breakers. Use the SCM set hp_count 0 command to ensure that the system comes up as a single partition. Then turn the key switch on to provide power to the system.
To repartition an AlphaServer GS80/160/320 system into two partitions to be used as TruCluster Server member systems, follow this procedure:
If necessary, install a primary PCI drawer for each additional hard partition beyond partition 0. Install any expansion PCI drawers as needed to provide additional PCI slots. Ensure that the system already has a primary PCI drawer for the first partition.
Note
We recommend that you install the primary PCI drawers that contain the master and standby SCM (if there is to be a standby SCM) in the power cabinet of a GS160 or GS320 or RETMA cabinet for a GS80; they both must be connected to the OCP.
Install the following hardware, as appropriate for your TruCluster Server configuration, in the primary (or expansion) PCI drawer of each partition and make all cable connections. Keep your configuration as symmetrical as possible to make troubleshooting and reconfiguration tasks easier.
Each system in a TruCluster Server configuration requires at least one Memory Channel adapter. Ensure that you abide by the restrictions described in Section 2.2, and that you connect the cables for Memory Channel interconnects to the Memory Channel modules and not to the I/O risers. The BN39B cables that are used for the Memory Channel interconnect are also used to connect the local I/O risers (on the QBB) to the remote I/O risers (on the PCI drawers).
Shared storage that is connected to KZPBA-CB (parallel SCSI) or KGPSA-CA (Fibre Channel) host bus adapters.
Network controllers.
Install BN39B cables between the local I/O risers on the QBBs in the partition (see Figure 7-1) and the remote I/O risers in the primary and expansion PCI drawer (see Figure 2-1 and Figure 7-3). Use BN39B-01 cables (1-meter; 3.3-foot) for a PCI drawer in the GS80 RETMA cabinet. Use BN39B-04 cables (4-meter; 13.1-foot) if the PCI drawer is in a GS160 or GS320 power cabinet. Use BN39B-10 cables (10-meter; 32.8-foot) if the PCI drawer is in an expansion cabinet. Ensure that you connect the cables to the 0-R and 1-R (remote I/O riser) connections in the PCI drawer and not to a Memory Channel module.
Note
We recommend that you connect I/O riser 0 (local I/O riser ports 0 and 1) to the primary PCI drawer that will be the master system control manager (SCM).
If you require more than two PCI drawers in a hard partition, you need more than one QBB in the partition. Each QBB supports two PCI drawers (2 cables between a local I/O riser and a PCI drawer).
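The cabling rules in this step reduce to a lookup and a small calculation. The sketch below is illustrative only: the location keys are hypothetical labels, while the cable part numbers and the limit of two PCI drawers per QBB come from the text above.

```python
import math

# Cable variant by PCI drawer location (location keys are hypothetical labels).
CABLE_BY_LOCATION = {
    "gs80_retma_cabinet": "BN39B-01",  # 1 meter
    "power_cabinet": "BN39B-04",       # 4 meters, GS160 or GS320
    "expansion_cabinet": "BN39B-10",   # 10 meters
}

def min_qbbs_for_drawers(pci_drawers):
    """Smallest number of QBBs a partition needs to cable this many PCI drawers.

    Each QBB supports two PCI drawers, and a partition always has at least one QBB.
    """
    return max(1, math.ceil(pci_drawers / 2))

print(CABLE_BY_LOCATION["power_cabinet"])  # → BN39B-04
print(min_qbbs_for_drawers(3))             # → 2
```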
Set the PCI drawer node ID with the pushbutton up-down counter on the CSB node ID module at the rear of each PCI drawer (see Figure 7-3). Set the node ID of the primary PCI drawer with the master SCM to zero. Set the node ID of the primary PCI drawer with the standby SCM (if applicable) to one. Increment the PCI drawer node ID for successive PCI drawers.
Ensure that the primary PCI drawer that contains the master SCM is connected to the OCP. Connect the primary PCI drawer with the standby SCM (if applicable) to the OCP.
Connect an H8585-AA connector to the terminal port on the standard I/O module for the new partition. Connect a BN25G-07 cable between the H8585-AA connector and the terminal server to provide the console terminal connection to the system management console.
Use the system management console terminal emulator to create a new terminal window for the partition.
Turn on the AC circuit breakers for each of the QBBs. Doing so provides power to the console serial bus (CSB) and SCM. Do not turn on the OCP key switch; you do not have to go through the lengthy power-up sequence to partition the system.
Notes
If the OCP key switch is in the On or Secure position, the system will go through the power-up sequence. In this case, when the power-up sequence terminates, power down the system with the power off SCM command, then partition the system.
If the auto_quit_scm SCM environment variable is set (equal to 1), control will be passed to the SRM console firmware at the end of the power-up sequence. Use the escape sequence ([Esc] [Esc] scm) to transfer control to the SCM firmware. If the auto_quit_scm SCM environment variable is not set (equal to 0), the SCM retains control.
If you execute the power off command at the master SCM, without designating a partition, power is turned off to the entire system. To turn power off to a partition, use the SCM power off -par n command, where n is the partition number.
A slave SCM can only control power for its own partition.
When the power-up self tests (POST) have completed, and the system has been powered down, use the master SCM to set the SCM environment variables to define the partitions.
The hp_count SCM environment variable defines the number of hard partitions. The hp_qbb_maskn SCM environment variables define which QBBs, by bit position, will be part of partition n. Example 7-1 shows how to set up two partitions, with each partition containing two QBBs. Partition 0 includes QBBs 0 and 1; partition 1 includes QBBs 2 and 3. Use the show nvr SCM command to display the SCM environment variables.
Example 7-1: Defining Hard Partitions with SCM Environment Variables
scm_e0> set hp_count 2   [1]
scm_e0> set hp_qbb_mask0 3   [2]
scm_e0> set hp_qbb_mask1 c   [3]
scm_e0> show nvr   [4]
com1_print_en            1
hp_count                 2   [5]
hp_qbb_mask0             3   [5]
hp_qbb_mask1             c   [5]
hp_qbb_mask2             0
hp_qbb_mask3             0
hp_qbb_mask4             0
hp_qbb_mask5             0
hp_qbb_mask6             0
hp_qbb_mask7             0
srom_mask                ff f
xsrom_mask               ff ff ff ff ff ff ff ff ff 1 0 0
primary_cpu              ff
primary_qbb0             ff
auto_quit_scm            1   [6]
fault_to_sys             0
dimm_read_dis            0
scm_csb_master_eligible  1   [7]
perf_mon                 20
scm_force_fsl            0
ocp_text                 as gs160
auto_fault_restart       1
scm_sizing_time          c
[1] Sets the number of hard partitions to 2.
[2] Sets bits 0 and 1 of the mask (0011) to select QBB 0 and QBB 1 for hard partition 0.
[3] Sets bits 2 and 3 of the mask (1100) to select QBB 2 and QBB 3 for hard partition 1.
[4] Displays the SCM environment variables (non-volatile RAM) to verify that the hard partition variables are set correctly.
[5] Verifies that the hard partition environment variables are correct.
[6] Indicates that control will be transferred to the SRM console firmware at the end of a power-up sequence. If you want to execute SCM commands, use the escape sequence ([Esc] [Esc] scm) to transfer control to the SCM firmware. If you want to ensure that control stays with the SCM at the end of a power-up sequence, set the auto_quit_scm SCM environment variable to zero.
[7] Indicates that the SCM on this primary PCI drawer is eligible to be selected as the master SCM on subsequent power-ups. It will be selected if it is connected to the OCP, its CSB node ID is the lowest of the SCMs that are eligible to become master, and the scm_csb_master_eligible environment variable is set.
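The hp_qbb_maskn values in Example 7-1 are bit masks written in hexadecimal, with bit position q standing for QBB q. The following sketch shows one way to compute and decode such a value; the function names are hypothetical, and the QBB range of 0 through 7 is taken from the hard QBB IDs described in Section 7.4.

```python
def qbb_mask(qbbs):
    """Return the hexadecimal mask string that selects the given QBB numbers."""
    mask = 0
    for q in qbbs:
        if not 0 <= q <= 7:
            raise ValueError(f"QBB number out of range: {q}")
        mask |= 1 << q  # bit position q selects QBB q
    return format(mask, "x")

def qbbs_in_mask(mask_hex):
    """Return the QBB numbers selected by a hexadecimal mask string."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(8) if (mask >> bit) & 1]

print(qbb_mask([0, 1]))   # → 3
print(qbb_mask([2, 3]))   # → c
print(qbbs_in_mask("c"))  # → [2, 3]
```

The first two results match the values that Example 7-1 assigns to hp_qbb_mask0 and hp_qbb_mask1.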
Select one primary PCI drawer to be the master SCM and, if desired, another primary PCI drawer to be a standby SCM by setting the scm_csb_master_eligible environment variable. The master and standby SCM must be connected to the OCP. The master SCM must have the lowest node ID. Use the node ID address obtained from the show csb SCM command (see Example 7-4). If multiple primary PCI drawers are eligible, the SCM on the PCI drawer with the lowest node ID is chosen as master. The other SCM will be a standby in case of a problem with the master SCM.
If the node ID switch is set to zero, the CSB node ID will be 10 (Example 7-4). If the node ID switch is set to one, the CSB node ID will be 11.
For example, the following command enables the SCMs in the primary PCI drawers at node IDs 10 and 11 (switch settings of 0 and 1) to be master (and standby) of the console serial bus.
SCM_E0> set scm_csb_master_eligible 10,11
Note
The system will hang if the master SCM is not connected to the OCP.
At the standby SCM, set the hp_count and hp_qbb_maskn SCM environment variables to match the setting at the master SCM:
SCM_E0> set hp_count 2
SCM_E0> set hp_qbb_mask0 3
SCM_E0> set hp_qbb_mask1 c
Turn the On/Off switch to the On or Secure position, then power on each of the partitions with the master SCM. After the power-up sequence completes, transfer control to the SRM console firmware as shown in Example 7-2.
Example 7-2: Turning Partition Power On
SCM_E0> power on -par 0 [1]
.
.
.
SCM_E0> power on -par 1 [2]
.
.
.
SCM_E0> quit [3]
[1] Turns on power to partition 0.
[2] Turns on power to partition 1.
[3] Transfers control from the SCM firmware to the SRM console firmware.
Note
If the auto_quit_scm SCM environment variable is set, control is passed to the SRM console firmware automatically at the end of the power-up sequence.
Obtain a copy of the latest firmware release notes for the AlphaServer system (see Section 7.5). Compare the present firmware revisions (see Example 7-4) with the required revisions that are indicated in the release notes. Update the firmware if necessary (see Section 7.5).
The SRM console firmware includes the ISP1020/1040-based PCI option firmware, which includes the KZPBA-CB. When you update the SRM console firmware, you are enabling the KZPBA-CB firmware to be updated. On a power-up reset, the SRM console loads PCI option firmware from the console system flash ROM into NVRAM for all Qlogic ISP1020/1040-based PCI options, including the KZPBA-CB PCI-to-Ultra SCSI adapter.
At the terminal emulator for each partition, access the SRM console firmware and complete each of the following as necessary:
If applicable, set the KZPBA-CB SCSI IDs and ensure that you have access to all the shared storage.
Run the Memory Channel diagnostics mc_diag and mc_cable to verify that the Memory Channel adapters are operational (Section 5.6).
Install the Tru64 UNIX operating system (see the Tru64 UNIX Installation Guide).
Install the TruCluster Server software (see the TruCluster Server Cluster Installation manual).
If you are using Fibre Channel storage, follow the procedures in Chapter 6, Using Fibre Channel Storage.
Set up highly available applications or services as required.
7.4 Determining AlphaServer GS80/160/320 System Configuration
You may be required to reconfigure an AlphaServer GS80/160/320 system that is not familiar to you. Before you start to reconfigure any system, you need to determine:
The number of partitions in the system
Which QBBs are in each partition
Which PCI drawers are used by each partition
Which PCI drawer is connected to each QBB
The console serial bus (CSB) addresses
Determine the necessary information with the following system control manager (SCM) commands: show nvr (Example 7-1), show system (Example 7-3), and show csb (Example 7-4).
If you are at the SRM prompt, use the escape sequence ([Esc] [Esc] scm) to transfer control to the SCM firmware.
Example 7-3 shows the display for the show system SCM command for an AlphaServer GS160 system.
Example 7-3: Displaying AlphaServer GS160 System Information
SCM_E0> show system
System Primary QBB0 : 2
System Primary CPU  : 0 on QBB2

[1]  [2]     [3]   [4]   [5]                  [6]  [7]  [8]  [9]  [10]
Par  hrd/sft CPU   Mem   IOR3 IOR2 IOR1 IOR0  GP   QBB  Dir  PS   Temp
     QBB#    3210  3210  (pci_box.rio)        Mod  BP   Mod  321  (ºC)

(0)  0/30    PPPP  --PP  --.- --.- P0.1 P0.0  P    P    P    -PP  27.0
(0)  1/31    PPPP  --PP  --.- --.- --.- --.-  P    P    P    -PP  26.0
(1)  2/32    PPPP  --PP  --.- --.- P1.1 P1.0  P    P    P    PP-  26.0
(1)  3/33    PPPP  --PP  --.- --.- --.- --.-  P    P    P    PP-  27.0

HSwitch  Type    Cables 7 6 5 4 3 2 1 0   Temp(ºC)
HPM40    8-port         - - - - P P P P   29.0       [11]

[12]  [13]                                  [14] [15] [16]
PCI   Rise1-1  Rise1-0  Rise0-1  Rise0-0    RIO  PS   Temp
Cab   7 6 5 4  3 2 1    7 6 5 4  3 2 1      1 0  21   (ºC)

10    L L L M  - M -    M L L L  L L S      * *  PP   30.5
11    L L L M  - M -    M L L L  L L S      * *  PP   30.0
[1] Hard partition number. There are two hard partitions in this example (0 and 1).
[2] QBB number and console serial bus (CSB) node ID. QBB 0 and 1 (CSB node IDs 30 and 31) are in partition 0. QBB 2 and 3 (CSB node IDs 32 and 33) are in partition 1.
[3] Status of the CPU module, which is present, powered up, and has passed self test (P). A dash (-) indicates an empty slot. An F indicates a self test failure. In this example, each QBB contains four CPU modules, each of which has passed self test.
[4] Status of the memory module, which is present, powered up, and has passed self test (P). A dash (-) indicates an empty slot. An F indicates a self test failure. In this example, each QBB contains two memory modules, both of which have passed self test.
[5] Status of the PCI drawer I/O risers that are plugged into the QBB I/O risers, in the form Xm.n. X can be a "P", "p", "F", or a dash (-). QBB local I/O risers are IOR0 (Port 0), IOR1 (Port 1), IOR2 (Port 2), and IOR3 (Port 3). A P (uppercase) indicates that power is on and self test passed. A p (lowercase) indicates that power is off and self test passed, and an F indicates a self test failure.
The m.n numbers for each QBB indicate which PCI drawer (m = 0 through f) and which PCI drawer I/O riser (n = 0, 1) the local I/O riser is connected to. For example, QBB0 Port 0 (IOR0) is connected to PCI drawer 0 I/O riser 0 (P0.0); QBB0 Port 1 (IOR1) is connected to PCI drawer 0 I/O riser 1 (P0.1).
Dashes (-) in place of m.n signify that the I/O riser module is not installed. The display always shows two sequences of --.- (for example --.- --.-) because there are two ports on a local I/O riser module.
The other sequence you may observe is Px.x, which indicates that the I/O riser module is installed, powered up, and has passed self test, but a cable is not connected to the port. For example, a status of Px.x P2.0 indicates that the local I/O riser is installed, but only one cable is connected.
[6] Status of the global port module, which passed self test.
[7] Status of the QBB backplane power system manager (PSM), which passed self test.
[8] Status of the QBB directory module, which passed self test.
[9] QBB power supply status. Each of these QBBs has two power supplies. A dash (-) indicates that there is no power supply in that position.
[10] QBB backplane temperature in degrees Celsius.
[11] Hierarchical switch (H-switch) type, status, temperature, and a report of which QBBs are connected to the H-switch. In this example, QBBs 0, 1, 2, and 3 are connected to the H-switch.
[12] Console serial bus node ID for PCI drawers. In this example, the first PCI drawer has node ID 10. The second PCI drawer has node ID 11. Note that in this case, the node ID switches are set to 0 and 1.
[13] Status of each of the four PCI buses in a PCI drawer. An S indicates that a standard I/O module is present. Other modules present in a slot are identified by their power dissipation:
L: Lower power dissipation
M: Medium power dissipation
H: High power dissipation
Dash (-): There is no module in that slot.
In this example, the PCI modules with M (medium) power dissipation are Memory Channel and Fibre Channel-to-PCI host bus adapters.
[14] An indication of the presence or absence of the I/O riser modules in the PCI drawer. An asterisk (*) indicates that a module is present.
[15] Status of the PCI drawer power supplies as follows:
A P (uppercase) indicates that the power supply is powered on and passed self test.
A p (lowercase) indicates that the power supply passed self test but has been powered off.
An F (uppercase) indicates that the power supply is powered on and failed self test.
An f (lowercase) indicates that the power supply failed self test and has been powered off.
An asterisk (*) indicates that the SCM has detected the presence of the power supply, but that there has been no attempt to power on the power supply.
[16] PCI drawer temperature in degrees Celsius.
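When the show system display has been captured to a text file, the questions at the start of this section (which QBBs are in each partition, and which PCI drawer each QBB is cabled to) can be answered with a small parser. The sketch below assumes that each QBB row looks like the rows in Example 7-3, with whitespace-separated fields; the function name and regular expressions are illustrative and have not been checked against other firmware revisions.

```python
import re

# "(partition) qbb/csb_node" followed by the remaining status fields.
ROW = re.compile(r"^\((\d+)\)\s+(\d+)/([0-9a-fA-F]+)\s+(.*)$")
# Xm.n where X is P, p, or F; m is the PCI drawer (hex); n is the remote riser.
IOR = re.compile(r"^([PpF])([0-9a-fA-F])\.([01])$")

def parse_qbb_rows(text):
    """Map partition number -> {QBB number: sorted PCI drawer numbers cabled to it}."""
    partitions = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, H-switch, and PCI drawer rows are ignored
        par, qbb = int(m.group(1)), int(m.group(2))
        fields = m.group(4).split()
        drawers = set()
        for field in fields[2:6]:  # IOR3, IOR2, IOR1, IOR0 follow CPU and Mem
            cabled = IOR.match(field)
            if cabled:  # "--.-" and "Px.x" mean no cable, so they do not match
                drawers.add(int(cabled.group(2), 16))
        partitions.setdefault(par, {})[qbb] = sorted(drawers)
    return partitions

sample = """\
(0) 0/30 PPPP --PP --.- --.- P0.1 P0.0 P P P -PP 27.0
(0) 1/31 PPPP --PP --.- --.- --.- --.- P P P -PP 26.0
(1) 2/32 PPPP --PP --.- --.- P1.1 P1.0 P P P PP- 26.0
(1) 3/33 PPPP --PP --.- --.- --.- --.- P P P PP- 27.0
"""
print(parse_qbb_rows(sample))
# → {0: {0: [0], 1: []}, 1: {2: [1], 3: []}}
```

For the system in Example 7-3, the result shows that partition 0 reaches PCI drawer 0 through QBB 0, and partition 1 reaches PCI drawer 1 through QBB 2.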
Example 7-4 shows the display for the show csb SCM command for an AlphaServer GS160 system.
Example 7-4: Displaying Console Serial Bus Information
SCM_E0> show csb
[1]  [2]         [3]                  [4]           [5]          [6]
CSB  Type        Firmware Revision    FSL Revision  Power State
10   PBM         T05.4 (03.24/01:14)  T4.2 (09.08)  ON
11   PBM         T05.4 (03.24/01:14)  T4.2 (09.08)  ON
30   PSM         T05.4 (03.24/01:09)  T4.0 (07.06)  ON           SrvSw: NORMAL
30   XSROM       T05.4 (03.24/02:10)
C0   CPU0/SROM   V5.0-7                             ON
C1   CPU1/SROM   V5.0-7                             ON
C2   CPU2/SROM   V5.0-7                             ON
C3   CPU3/SROM   V5.0-7                             ON
C0   IOR0                                           ON
C1   IOR1                                           ON
31   PSM         T05.4 (03.24/01:09)  T4.0 (07.06)  ON           SrvSw: NORMAL
31   XSROM       T05.4 (03.24/02:10)
C4   CPU0/SROM   V5.0-7                             ON
C5   CPU1/SROM   V5.0-7                             ON
C6   CPU2/SROM   V5.0-7                             ON
C7   CPU3/SROM   V5.0-7                             ON
32   PSM         T05.4 (03.24/01:09)  T4.0 (07.06)  ON           SrvSw: NORMAL
32   XSROM       T05.4 (03.24/02:10)
C8   CPU0/SROM   V5.0-7                             ON
C9   CPU1/SROM   V5.0-7                             ON
CA   CPU2/SROM   V5.0-7                             ON
CB   CPU3/SROM   V5.0-7                             ON
C8   IOR0                                           ON
C9   IOR1                                           ON
33   PSM         T05.4 (03.24/01:09)  T4.0 (07.06)  ON           SrvSw: NORMAL
33   XSROM       T05.4 (03.24/02:10)
CC   CPU0/SROM   V5.0-7                             ON
CD   CPU1/SROM   V5.0-7                             ON
CE   CPU2/SROM   V5.0-7                             ON
CF   CPU3/SROM   V5.0-7                             ON
40   HPM         T05.4 (03.24/01:18)  X4.1 (08.18)  ON
E0   SCM MASTER  T05.4 (03.24/01:21)  T4.2 (09.08)  ON
E1   SCM SLAVE   T05.4 (03.24/01:21)  T4.2 (09.08)  ON           Ineligible
[1] Console serial bus (CSB) node ID, or in the case of a QBB, the CPU number in the QBB. The CSB node address ranges are as follows:
10 to 1f: PCI backplane manager (PBM) -- The CSB node ID is based on the PCI drawer node ID setting.
e0 to e7: System control manager (SCM) -- The CSB node ID is also based on the PCI drawer node ID setting.
30 to 37: Power system manager (PSM) -- Based on the hard QBB ID (QBB 0 - 7)
40: Hierarchical switch power manager (HPM)
C0 to CF: In response to the SCM show csb command, the PSM provides CSB node addresses for the CPUs and I/O risers even though they are not on the console serial bus. This enables SCM commands to be directed at any specific CPU, for instance power off -cpu c4. The PSM responds to SCM commands and powers the CPU on or off.
[2] Type of CSB node:
PBM (PCI backplane manager)
PSM (Power system manager)
HPM (Hierarchical switch power manager)
SCM master: This PCI primary drawer has the master SCM.
SCM slave: The SCM on this PCI primary drawer is a slave and has not been designated as a backup to the master.
CPUn/SROM: Each CPU module has SROM firmware that is executed as part of the power-up sequence.
XSROM: Each CPU executes this extended SROM firmware on the PSM module after executing the SROM firmware.
[3] Revision level of the firmware and compilation date.
[4] Revision level of the fail-safe loader (FSL) firmware. Each microprocessor on the CSB has both a normal firmware image in its flash ROM and a fail-safe loader image in a backup ROM. The fail-safe loader firmware is executed when the system is reset. It performs a checksum on the normal firmware image, and then passes control to the normal firmware image.
[5] State of power for each CPU, I/O riser, and each node on the CSB.
[6] An indication that power is normal (NORMAL), or that the QBB power is off and can be serviced (SERVICE). The Ineligible field for the slave SCM indicates that the SCM is not a backup to the master SCM.
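The CSB node address ranges listed for Example 7-4 can be turned into a lookup that names the kind of node behind any address in the first column. This is a minimal sketch built only from the ranges stated above; the function name and the returned labels are assumptions.

```python
def classify_csb_node(node_hex):
    """Return the kind of CSB node for a two-digit hexadecimal address string."""
    n = int(node_hex, 16)
    if 0x10 <= n <= 0x1F:
        return "PBM"
    if 0x30 <= n <= 0x37:
        return "PSM"
    if n == 0x40:
        return "HPM"
    if 0xC0 <= n <= 0xCF:
        return "CPU or I/O riser (address provided by the PSM)"
    if 0xE0 <= n <= 0xE7:
        return "SCM"
    return "unknown"

for node in ("10", "32", "40", "CA", "E1"):
    print(node, classify_csb_node(node))
```

Addresses in the C0 to CF range are ambiguous between a CPU and an I/O riser, so the Type column is still needed to tell them apart.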
Occasionally you must update the AlphaServer GS80/160/320 firmware or the PCI host bus adapter firmware. To determine the need for a firmware update, compare the current firmware versions with the versions available on the latest AlphaServer firmware update CD-ROM. The firmware release notes for the system provide a list of current firmware versions.
See Section 4.2 for two methods of obtaining the firmware release notes.
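The comparison of installed and required revisions can be done by hand from a show csb display, or with a short script such as the sketch below. It assumes rows shaped like those in Example 7-4, and it tests revisions only for equality because the ordering rules for revision strings are not stated here. The function names are hypothetical, and the required revision in the usage example is a placeholder, not a real release level.

```python
import re

# node address, node type, then the firmware revision (rows that go straight
# to "ON", such as I/O riser rows, carry no revision and are skipped).
ROW = re.compile(r"^([0-9A-Fa-f]{2})\s+(SCM (?:MASTER|SLAVE)|\S+)\s+(?!ON\b)(\S+)")

def firmware_revisions(text):
    """Map (node address, node type) -> firmware revision string."""
    revisions = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            node, kind, rev = m.groups()
            revisions[(node.upper(), kind)] = rev
    return revisions

def out_of_date(revisions, required):
    """Return nodes whose revision differs from the required one for their type."""
    return sorted(
        key for key, rev in revisions.items()
        if key[1] in required and rev != required[key[1]]
    )

display = """\
10 PBM T05.4 (03.24/01:14) T4.2 (09.08) ON
30 PSM T05.4 (03.24/01:09) T4.0 (07.06) ON SrvSw: NORMAL
C0 CPU0/SROM V5.0-7 ON
C0 IOR0 ON
E0 SCM MASTER T05.4 (03.24/01:21) T4.2 (09.08) ON
"""
required = {"PBM": "T05.4", "PSM": "PLACEHOLDER-REV"}  # placeholder, not a real revision
print(out_of_date(firmware_revisions(display), required))  # → [('30', 'PSM')]
```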
The following section provides an overview of how to update the firmware.
7.5.1 Updating AlphaServer GS80/160/320 Firmware
You can update the AlphaServer GS80/160/320 firmware with the loadable firmware update (LFU) utility by booting the AlphaServer Firmware Update CD-ROM.
You can use the LFU to update the following firmware:
System Reference Manual (SRM) flash ROM on the standard I/O module
The flash ROMs for the following console serial bus (CSB) microprocessors:
SCM: One on the standard I/O module of each primary PCI drawer
Power system manager (PSM): One on the PSM module in each QBB
PCI backplane manager (PBM): One on each PCI backplane
Hierarchical switch power manager (HPM): One on the H-switch
PCI host bus adapter EEPROMs
To update the AlphaServer GS80/160/320 firmware with the LFU utility, follow these steps:
At the console for each partition, shut down the operating system.
At the master SCM, turn power off to the system:
SCM_E0> power off
You can turn power off to individual partitions if you want. Ensure that power is turned off to all partitions.
SCM_E0> power off -par 0
SCM_E0> power off -par 1
Use the show nvr SCM command to display SCM environment variables. Record the hp_count and hp_qbb_maskn environment variables as a record of the hardware partition configuration. You do not change the hp_qbb_maskn environment variables, but record the variables anyway.
SCM_E0> show nvr
COM1_PRINT_EN            1
HP_COUNT                 2
HP_QBB_MASK0             3
HP_QBB_MASK1             c
HP_QBB_MASK2             0
HP_QBB_MASK3             0
HP_QBB_MASK4             0
HP_QBB_MASK5             0
HP_QBB_MASK6             0
HP_QBB_MASK7             0
.
.
.
Remove all hardware partitions:
SCM_E0> set hp_count 0
Note
You do not need to zero the hp_qbb_maskn environment variables, only the hp_count.
Turn power on to the system to allow SRM console firmware execution. The SRM code is copied to memory on the partition primary QBB during the power-up initialization sequence. SRM code is executed out of memory, not the SRM EEPROM on the standard I/O module.
SCM_E0> power on
Transfer control from the SCM to SRM console firmware (if the auto_quit_scm SCM environment variable is not set):
SCM_E0> quit
P00>>>
Use the console show device command to determine which device is the CD-ROM.
Place the AlphaServer Firmware Update CD-ROM in the drive and boot:
P00>>> boot dqa0
The boot sequence provides firmware update overview information. Press Return to scroll the text, or press Ctrl/C to skip the text.
After the overview information has been displayed, the name of the default boot file is provided. If it is the correct boot file, press Return at the Bootfile: prompt. Otherwise, enter the name of the file from which you want to boot.
The LFU help message shown in the following example is displayed:
          *****Loadable Firmware Update Utility*****
-------------------------------------------------------------
 Function    Description
-------------------------------------------------------------
 Display     Displays the system's configuration table.
 Exit        Done exit LFU (reset).
 List        Lists the device, revision, firmware name and update revision
 Readme      Lists important release information.
 Update      Replaces current firmware with loadable data image.
 Verify      Compares loadable and hardware images.
 ? or Help   Scrolls this function table.
The list command indicates, in the device column, which devices it can update. It also shows the present firmware revision and the update revision on the CD-ROM.
Use the update command to update all firmware, or you can designate a specific device to update; for example, SRM console firmware:
UPD> update srm
Caution
Do not abort the update -- doing so can cause a corrupt flash image in a firmware module.
A complete firmware update for a QBB can take from 5 minutes for a PCI with no updatable devices to over 30 minutes for a PCI with many updatable devices. The length of time increases proportionally with the number of PCI adapters that you have.
After you update the firmware, use the verify command to verify the firmware update, then transfer control back to the SCM and reset the system:
P00>>> [Esc][Esc] scm
SCM_E0> reset
Set the hard partitions back to the original configuration:
SCM_E0> set hp_count 2
At the master SCM, turn system power on:
SCM_E0> power on
At the master SCM, transfer control to the SRM console firmware. Then, using the SRM at the console of each partition, boot the operating system.
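Because the procedure removes the hard partitions before the update and restores them afterward, it helps to keep the recorded show nvr values in a form that is easy to compare later. The following sketch parses a captured display into a dictionary and extracts the partition settings. It assumes one name and value pair per line, as in the display in step 3; the function names are hypothetical.

```python
def parse_nvr(text):
    """Map lowercased SCM environment variable names to their value strings."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0].lower()] = parts[1].strip()
    return settings

def partition_record(settings):
    """Return (hp_count, {partition number: hp_qbb_mask value}) from parsed settings."""
    count = int(settings["hp_count"])
    masks = {n: settings[f"hp_qbb_mask{n}"] for n in range(count)}
    return count, masks

captured = """\
COM1_PRINT_EN 1
HP_COUNT 2
HP_QBB_MASK0 3
HP_QBB_MASK1 c
HP_QBB_MASK2 0
"""
print(partition_record(parse_nvr(captured)))  # → (2, {0: '3', 1: 'c'})
```

Comparing this record with a second capture taken after the update confirms that hp_count was restored and that the masks were left unchanged.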