1    Administering Hardware

This chapter provides an overview of the hardware management model and describes the resources that are available to you. It discusses the following topics:

The hwmgr command also enables you to perform service tasks such as hot-swapping CPUs. For information on this feature, see hwmgr_ops(8) and the Managing Online Addition and Removal manual.

1.1    Understanding Hardware

A hardware component is any discrete part of the system, such as a CPU, a networking card, or a hard disk. The system is organized in a hierarchy with the central processing unit (CPU) at the top and peripheral components, such as disks and tapes, at the bottom. This hierarchy is sometimes referred to as the system topology. The following components are typical of the device hierarchy of most computer systems, although it is not a definitive list:

Hardware management involves understanding how all the components relate to each other, how they are logically and physically located in the system topology, and how the system software recognizes and communicates with components. To better understand the component hierarchy of a system, refer to the System Administration manual for an introduction to the SysMan Station. This is a graphical user interface that displays topological views of the system component hierarchy and allows you to manipulate such views.

The majority of hardware management tasks are automated. When you add a supported SCSI disk to a system and reboot the system, the disk is automatically detected and configured into the system. The operating system dynamically loads required drivers and creates the device special files. You need only to partition the disk and create file systems on the partitions (described in the System Administration manual) before you use it to store data. However, you must periodically perform some hardware management tasks manually, such as when a disk crashes and you need to bring a replacement disk online at the same logical location. You might also need to manually add components to a running system or redirect I/O from one disk to another disk.

Many other hardware management tasks are part of regular system operations and maintenance, such as repartitioning a disk or adding an adapter to a bus. Often, such tasks are fully described in the hardware documentation that accompanies the component itself, but you often need to perform tasks such as checking the system for the optimum (or preferred) physical and logical locations for the new component.

Another important aspect of hardware management is preventative maintenance and monitoring. Use the following operating system features to maintain a healthy system environment:

The organization of this manual reflects the hardware components and devices that you manage as follows:

Another way to think of this is that with a generic tool you can perform a task on many components, while with a targeted tool you can perform a task on only a single component. Unless stated otherwise, most operations are specific to a single system or to a cluster. See the TruCluster Server documentation for additional information on managing cluster hardware.

1.2    Reference Information

The following sections contain reference information related to documentation, system files, and related software tools. Some tools described here are obsolete and scheduled for removal in a future release. Consult the Release Notes for a list of operating system features that are scheduled for retirement, and migrate to their replacements as soon as possible. Check your site-specific shell scripts for any calls that might invoke an obsolete command.

1.2.1    Documentation

The following documentation contains information about hardware management:

The command line and graphical user interfaces also provide extensive online help.

1.2.2    Web Resources

Many procedures described in this manual concern the administration of system hardware and peripherals such as storage devices. Consult the owner's manual for any hardware device, particularly if you need information on using hardware-specific application software.

The following Web sites provide information and resources such as driver updates:

1.2.3    Software and Applications

Depending on your local system configuration, you might use one or more applications to manage system components. These applications often enable you to manage other devices connected to the system and its local components, such as Fibre Channel switches. The availability of a particular application depends on which storage components and which versions of Tru64 UNIX it supports. Some programs run only on a networked PC system and use a Web connection to access storage. The programs might also require a separate purchase under license, rather than bundled software provided with the hardware.

Often, a complex administrative procedure requires that you use one or more of these applications together with Tru64 UNIX commands and utilities such as hwmgr. An example of such a complex procedure is configuring a redundant Fibre Channel storage array for multipath failover.

You can manage system components by using the following applications:

The system console

The system console is a command interface that is first visible when you power on your AlphaServer. The system displays the console prompt (>>>) after the initial power-on tests complete successfully. This interface is the System Reference Manual (SRM) console, a term that is used synonymously with the phrase system firmware. (The firmware includes other components that are not part of the SRM.)

The console enables you to display information about system components. For example, the following command displays information about the hardware components in the system, such as the bus, target, and logical unit number (lun) address of a disk drive attached to a local SCSI bus:

>>>show device

Your AlphaServer owner's manual documents the SRM console commands. Minor variations in SRM command options exist between different AlphaServer systems. If you do not have the system's hardware manual, you can download a printable PDF version from the following Alpha Systems Technology Web site:

http://www.compaq.com/alphaserver/technology/index.html

Refer to the Tru64 UNIX Release Notes for important information on console-specific restrictions for your processor.

Array Controller Software and Hierarchical Storage Operating Firmware

The following software runs on certain models of storage array controllers, enabling you to perform administrative tasks:

New PCMCIA program cards are provided for each controller model whenever there is an update to the ACS or HSOF operating firmware. You can purchase update cards separately for each release, or obtain them automatically as part of an update service contract. Go to the following Web site for firmware update information:

http://www.compaq.com/products/storageworks/softwaredrivers/acs/index.html

Go to the following Web site to access ACS and HSOF online documentation:

http://www.compaq.com/products/storageworks/array-and-scsi-controllers/HSxuserdocs.html

StorageWorks Command Console (SWCC)

StorageWorks Command Console (SWCC) is a graphical storage configuration and monitoring software tool for arrays such as the EMA12000 and EMA16000 RAID Arrays. It reduces the task of storage management to simple point-and-click and enables you to configure and monitor storage graphically from a single management console. SWCC 2.3 provides an agent for Tru64 UNIX Version 4.0F and later releases. For information on the SWCC, go to the following Web site:

http://www.compaq.com/products/storageworks/swcc/index.html

Storage Area Networks (SAN)

HP supports SAN management through a growing number of software applications. Table 1-1 lists some of the applications available at the time of publication. Some applications run on a networked SAN appliance and might have dependencies on other software components. Other SAN tools provide only agents for specific versions of the operating system. This means that you manage your hardware from an interface running on an appliance (or a PC) that passes commands to an agent program running on Tru64 UNIX.

Table 1-1:  SAN Software

Title Description
SANworks Command Scripter Provides you with command control of HSJ80, HSG60, HSG80, HSZ70 and HSZ80 Array Controllers. You can create, edit, and run script files that contain Command Line Interpreter (CLI) commands. This product provides both local and LAN connections and a browser-based interface for remote connections. This application works with the SWCC.
SANworks Data Replication Manager (DRM) Provides disaster tolerance through hardware redundancy and data replication across multiple sites that are separated by some distance. The DRM sites are connected via fiber optic cable or asynchronous transfer mode (ATM). Data Replication Manager uses Fibre Channel gigabit switches to send the data between the sites.
SANworks Element Manager for StorageWorks HSG Provides a Web-based graphical configuration and monitoring tool to centralize storage management on HSG controllers.
SANworks Enterprise Volume Manager This Web-based application software enables you to manage controller-based clone and snapshot operations.
SANworks Network View Provides a browser-based application to automatically map SAN topology, monitor its availability, and display a map of the storage environment.
SANworks Open SAN Manager Provides centralized, appliance-based monitoring and management interface for the Open SAN. It enables you to organize, visualize, configure and monitor storage from a single navigation point on the SAN. It also provides a launch point for other SANworks applications and links to directly manage storage components on the SAN.

For SAN software and driver information, go to the following Web site:

http://www.compaq.com/storage/sanworks-support.html

Media Robot Utility (MRU)

The Media Robot Utility (MRU) is an application that enables you to control robotic loaders such as the DLT and 4mm (DAT) or TKZ-series loaders.

1.2.4    Related Commands and Utilities

The following commands are also available to you for use in managing devices:

1.3    Identifying Hardware Management System Files

The following system files contain static or dynamic information that the system uses to configure the component into the kernel. Do not edit these files manually even if they are ASCII text files. Some files are context-dependent symbolic links (CDSLs), as described in the System Administration manual. If the links are accidentally broken, clustered systems cannot access the files until you verify and recreate the links. See cdslinvchk(8) and mkcdsl(8) for information on recreating CDSLs.

Caution

Although some hardware databases are in text format, do not edit them. Any errors introduced into the databases might prevent your system from accessing devices, cause data corruption, or prevent your system from booting. Use only the appropriate commands and utilities to manage devices.

1.4    WWIDs and Shared Devices

SCSI device naming is based on the logical identifier (ID) of a device. This means that the device special filename has no correlation to the physical location of a SCSI device. UNIX uses information from the device to create an identifier called a worldwide identifier, which is usually written as WWID.

Ideally, the WWID for a device is unique, enabling the identification of every SCSI device attached to the system. However, some legacy disks (and even some new disks available from third-party vendors) do not provide the information required to create a unique WWID for a specific device. For such devices, the operating system attempts to generate a WWID, and in the extreme case uses the device nexus (its SCSI bus/target/LUN) to create a WWID for the device.

Consequently, do not use devices that do not have a unique WWID on a shared bus. If a device that does not have a unique WWID is put on a shared bus, a different device special file is created for each different path to the device. This can lead to data corruption if the operating system uses two different device special files to access the same device at the same time. To determine if a device has a cluster-unique WWID, use the following command:

#  /sbin/hwmgr show components

If a device has the c flag set in the FLAGS field, then it has a cluster-unique WWID and you can place it on a shared bus. Such devices are referred to as cluster-shareable because you can put them on a shared bus within a cluster.

Note

Exceptions to this rule are HSZ devices. Although an HSZ device might be marked as cluster shareable, some firmware revisions on the HSZ preclude multi-initiators from probing the device at the same time. See the owner's manual for the HSZ device and the Tru64 UNIX Release Notes for any current restrictions.

The following example displays all the hardware components that have cluster-unique WWIDs:

# /sbin/hwmgr show component -cshared
HWID: HOSTNAME FLAGS SERVICE COMPONENT NAME
-----------------------------------------------
35:   pmoba    rcd-- iomap   SCSI-WWID:0410004c:"DEC  RZ28     ..."
36:   pmoba    -cd-- iomap   SCSI-WWID:04100024:"DEC  RZ25F    ..."
42:   pmoba    rcd-- iomap   SCSI-WWID:0410004c:"DEC  RZ26L    ..."
43:   pmoba    rcds- iomap   SCSI-WWID:0410003a:"DEC  RZ26L    ..."
48:   pmoba    rcd-- iomap   SCSI-WWID:0c000008:0000-00ff-fe00-0000
49:   pmoba    rcd-- iomap   SCSI-WWID:04100020:"DEC  RZ29B    ..."
50:   pmoba    rcd-- iomap   SCSI-WWID:04100026:"DEC  RZ26N    ..."
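The FLAGS check described above can be scripted. The following is a minimal sketch, not a supported utility. It assumes that the first three fields of each row are HWID, HOSTNAME, and FLAGS, as in the preceding example output. The function name not_cluster_shareable is hypothetical, and the row with HWID 60 is an invented entry with the c flag cleared, included only so that the sample input produces a result.

```shell
#!/bin/sh
# not_cluster_shareable (hypothetical helper): read a component listing on
# standard input and print the HWID of each row whose FLAGS field lacks
# the c flag. Assumes fields 1-3 are HWID, HOSTNAME, and FLAGS.
not_cluster_shareable() {
    awk '$1 ~ /^[0-9]+:$/ && $3 !~ /c/ { sub(/:$/, "", $1); print $1 }'
}

# Illustrative input only; HWID 60 is a made-up row with the c flag cleared.
not_cluster_shareable <<'EOF'
HWID: HOSTNAME FLAGS SERVICE COMPONENT NAME
-----------------------------------------------
35:   pmoba    rcd-- iomap   SCSI-WWID:0410004c:"DEC  RZ28     ..."
36:   pmoba    -cd-- iomap   SCSI-WWID:04100024:"DEC  RZ25F    ..."
60:   pmoba    r-d-- iomap   SCSI-WWID:ffff0000:"EXAMPLE DISK  ..."
EOF
```

On a live system you would pipe the output of the hwmgr command shown earlier into the function instead of using sample text. Any HWID that it prints identifies a component that you should not place on a shared bus.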

You might have a requirement to make a device available on a shared bus even though it does not have a unique WWID. Using such devices on a shared bus is not recommended, but there is a method that enables you to create such a configuration. See Chapter 3 for a description of how you use the hwmgr edit scsi command option to create a unique WWID.

1.5    Device Naming and Device Special Files

Devices are made available to the rest of the system through device special files located in the /dev directory. A device special file enables an application (such as a database application) to access a device through its device driver, which is a kernel module that controls one or more hardware components of a particular type, such as network controllers, graphics controllers, and disks (including CD-ROM devices).

The system also uses device special files to access pseudodevice drivers, which do not control a hardware component. An example is the pseudoterminal (pty) terminal driver, which simulates a terminal device. The pty terminal driver is a character driver typically employed by remote logins; it is described in Chapter 4. See the device driver documentation for detailed information on device drivers at the following URL:

http://www.tru64unix.com/docs/

Normally, device special file management is performed automatically by the system. For example, when you install a new version of the UNIX operating system, there is a point at which the system probes all buses and controllers and all the system devices are found. The system then builds databases that describe the devices and creates device special files that make devices available to users. The most common way that you use a device special file is to specify it as the location of a UFS file system in the system /etc/fstab file.
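As an illustration of how device special files appear in /etc/fstab, the following sketch lists the device special file and mount point of each UFS entry. It assumes the conventional six-field fstab layout (device, mount point, type, options, dump frequency, pass number). The function name ufs_devices and the sample entries are hypothetical and do not come from a real system.

```shell
#!/bin/sh
# ufs_devices (hypothetical helper): print the device special file and
# mount point of every UFS entry in fstab-format text on standard input.
# Lines whose first field begins with # are treated as comments.
ufs_devices() {
    awk '$1 !~ /^#/ && $3 == "ufs" { print $1, $2 }'
}

# Invented sample entries, shown only to exercise the function.
ufs_devices <<'EOF'
# device            mount  type    options  dump  pass
/dev/disk/dsk0a     /      ufs     rw       1     1
/dev/disk/dsk0g     /usr   ufs     rw       1     2
/proc               /proc  procfs  rw       0     0
EOF
```

To examine a real system, redirect standard input from /etc/fstab instead of using the sample text.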

You need to perform manual operations on device special files only when there are problems with the system or when you need to support a device that the system cannot handle automatically. The following sections describe the way that devices and device special files are named and organized in Version 5.0 or higher.

The following considerations apply:

Legacy device names and device special files will be maintained for some time and their retirement schedule will be announced in a future release.

1.5.1    Related Documentation and Commands

The following documents contain information about device names:

1.5.2    Device Special File Directories

To contain the device special files, a /devices directory exists under the root directory (/). This directory contains subdirectories that each contain device special files for a class of devices. A class of device corresponds to related types of devices, such as disks or nonrewind tapes. For example, the /dev/disk directory contains files for all supported disks, and the /dev/ntape directory contains device special files for nonrewind tape devices. In this release, only the subdirectories for certain classes are created. For all operations you must specify paths by using the /dev directory and not the /devices directory.

Note

Some device special file directories are CDSLs, which enable devices to be available cluster wide when a system is part of a cluster. You should be familiar with the file system hierarchy described in the System Administration manual, in particular the implementation of CDSLs.

From the /dev directory, there are symbolic links to the corresponding subdirectories of the /devices directory. For example:

lrwxrwxrwx 1 root system 25 Nov 11 13:02 ntape -> ../../../../devices/ntape

lrwxrwxrwx 1 root system 25 Nov 11 13:02 rdisk -> ../../../../devices/rdisk

lrwxrwxrwx 1 root system 24 Nov 11 13:02 tape -> ../../../../devices/tape

This structure enables certain devices to be host-specific when the system is a member of a cluster. It enables other devices to be shared between all members of a cluster. In addition, new classes of devices might be added by device driver developers and component vendors.

1.5.2.1    Pseudodevices and Non-storage Devices

To manage devices and their associated device special files, it is useful to know what is in the /dev directory on your system. In the /dev directory, you will find many devices listed, as shown in the following example output:

# ls -l
total 47
drwx------   2 root     system           512 Jun  5 15:55 .audit
-rwxr-xr-x   1 bin      bin            34777 Jun 19 00:02 MAKEDEV
-rw-r--r--   1 root     system          2238 Jun 28 16:42 MAKEDEV.log
-rwxr-xr-x   1 bin      bin             1418 Jun 19 00:02 SYSV_PTY
-rwxr-xr-x   1 bin      bin             1418 Aug  1  2001 SYSV_PTY.PreUPD
crw-------   1 root     system    47,      0 May 30 17:16 atm_cmm
crw-rw-rw-   1 root     system    87,      0 May 30 17:16 aud97
cr--------   1 root     system    17,      0 May 30 16:45 audit
srw-rw----   1 root     system             0 Jul 25 13:40 binlogdmb
crw-------   1 root     system    30,      0 May 30 16:45 cam
lrwxrwxrwx   1 root     system            27 May 30 17:16 changer -> \
../../../../devices/changer
crw--w--w-   1 root     daemon     0,      0 Jul 25 13:44 console
-rw-r--r--   1 root     system            12 Jul 25 13:39 console.gen
lrwxrwxrwx   1 root     system            25 May 30 17:16 cport -> \
../../../../devices/cport
lrwxrwxrwx   1 root     system            24 May 30 17:16 disk -> \
../../../../devices/disk
lrwxrwxrwx   1 root     system            25 May 30 17:16 dmapi -> \
../../../../devices/dmapi
crw-------   1 root     system    31,      0 Jul 26 13:43 kbinlog
crw-------   1 root     system     3,      1 May 30 16:45 kcon
crw-------   1 root     system    82,      0 May 30 17:16 kevm
crw-------   1 root     system    82,      2 May 30 17:16 kevm.pterm
crw-rw----   1 root     system    25,      1 May 30 16:45 keyboard0
crw-------   1 root     system     3,      0 May 30 16:45 klog
cr--r-----   1 root     mem        2,      1 May 30 16:45 kmem
drwxr-xr-x   2 root     system           512 May 12 18:14 lat
cr--r-----   1 root     mem       45,      0 May 30 16:45 lockdev
srw-rw-rw-   1 root     system             0 Jul 25 13:40 log
crw-rw-rw-   1 root     system    34,      0 May 30 17:16 lp0
cr--r-----   1 root     mem        2,      0 May 30 16:45 mem
crw-rw-rw-   1 root     system    88,      0 May 30 17:16 mmsess0
crw-rw----   1 root     system    23,      1 May 30 16:45 mouse0
crw-rw-rw-   1 root     system    52,      0 May 30 17:16 msb0
lrwxrwxrwx   1 root     system            15 May 30 17:16 none -> \
../devices/none
lrwxrwxrwx   1 root     system            25 May 30 17:16 ntape -> \
../../../../devices/ntape
crw-rw-rw-   1 root     system     2,      2 Jul 26 13:28 null
cr--r--r--   1 root     system    26,      0 May 30 16:45 pfcntr
crw-rw-rw-   1 root     system    32,     58 May 30 17:16 pipe
crw-rw-rw-   1 root     system    46,      0 May 30 16:45 poll
crw-------   1 root     system    37,      0 May 30 16:45 prf
srwxrwxrwx   1 root     system             0 Jul 25 13:43 printer
crw-rw-rw-   2 root     system    32,      7 May 12 18:12 ptm
crw-rw-rw-   1 root     system    32,     65 Jun 28 17:14 ptmx
crw-rw-rw-   2 root     system    32,      7 May 12 18:12 ptmx_bsd
drwxr-xr-x   2 root     system           512 May 12 18:12 pts
crw-rw-rw-   1 root     system     7,     31 May 30 16:45 ptyqf
crw-rw-rw-   1 root     system    89,      0 May 30 17:16 random
lrwxrwxrwx   1 root     system            25 May 30 17:16 rdisk -> \
../../../../devices/rdisk
drwxr-xr-x   2 root     system           512 May 30 17:16 sad
crw-------   1 root     system    80,1048575 May 30 17:16 scp_scsi
crw-rw-rw-   2 root     system    32,     49 May 12 18:12 snmpinfo
drwxr-xr-x   3 root     system           512 May 30 17:21 streams
crw-r--r--   1 root     system    15,      0 May 30 16:45 sysdev0
lrwxrwxrwx   1 root     system            24 May 30 17:16 tape -> \
../../../../devices/tape
crw-rw-rw-   1 root     system     1,      0 May 30 16:45 tty
crw-rw-rw-   1 root     system    35,      0 May 30 17:16 tty00
crw-rw-rw-   1 root     system    89,      1 Jul 25 13:39 urandom
crw-r-----   1 root     mem        2,      5 May 30 16:45 vmzcore
crw-rw----   1 root     system    33,      1 May 30 16:45 ws0
crw-rw-rw-   1 root     system    38,      0 May 30 16:45 zero
 
 

The preceding list is edited to remove duplicate device types. The directory usually contains many devices that are terminals (ttys), pseudoterminals (ptys), and disk storage devices (dsk). The number and variety of devices depend on the system's hardware and software configuration. Most of these devices are required by the operating system and applications, and do not relate to storage. You most frequently manage devices that involve data I/O, such as disks, tapes, and networking cards.
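To get a quick picture of what a /dev listing contains, you can tally the entries by file type. The sketch below relies only on the standard meaning of the first character of the ls -l mode field (c for character device, b for block device, l for symbolic link, d for directory, s for socket, p for FIFO, and - for regular file). The function name count_dev_types is hypothetical, and the sample input is a shortened copy of the preceding listing.

```shell
#!/bin/sh
# count_dev_types (hypothetical helper): tally ls -l entries by the first
# character of the mode field. The "total" line and the wrapped link
# targets (lines that start with ../) do not match and are ignored.
count_dev_types() {
    awk '$1 ~ /^[-bcdlps]/ && NF >= 8 { n[substr($1, 1, 1)]++ }
         END { for (t in n) print t, n[t] }' | sort
}

# Shortened sample taken from the listing above.
count_dev_types <<'EOF'
total 47
drwx------   2 root     system           512 Jun  5 15:55 .audit
-rwxr-xr-x   1 bin      bin            34777 Jun 19 00:02 MAKEDEV
crw-------   1 root     system    30,      0 May 30 16:45 cam
lrwxrwxrwx   1 root     system            24 May 30 17:16 disk -> \
../../../../devices/disk
crw-rw-rw-   1 root     system     2,      2 Jul 26 13:28 null
srw-rw-rw-   1 root     system             0 Jul 25 13:40 log
EOF
```

On a live system, the equivalent would be to pipe the output of ls -l /dev into the function.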

Use the hwmgr view devices command to list the devices and controllers that you manage most often. The output from this command excludes I/O devices that you cannot manage by using the dsfmgr or hwmgr commands. The output also includes entries such as /dev/cport/scp2, which relate to devices such as SCSI cards, Fibre Channel switches, and storage array controllers. Entries for /dev/changer/mc* are tape changer devices.

To obtain a list of such devices, use the following hwmgr command:

# hwmgr view dev
 HWID: Device Name          Mfg     Model           Location
 ----------------------------------------------------------------------
    6: /dev/dmapi/dmapi
    7: /dev/scp_scsi
    8: /dev/kevm
   53: /dev/cport/scp5              SWXCR            xcr0
   54: /dev/disk/dsk1197c           SWXCR            ctlr-0-unit-0
   69: /dev/disk/floppy53c          3.5in floppy     fdi0-unit-0
   46: /dev/disk/dsk1155c   DEC     HSG80            bus-4-targ-2-lun-17
   79: /dev/disk/dsk6c      DEC     HSZ22    (C) DEC bus-0-targ-0-lun-5
   81: /dev/random
   82: /dev/urandom
   84: /dev/ntape/tape11    COMPAQ  SDT-10000        bus-5-targ-0-lun-0
   90: /dev/disk/dsk1194c   COMPAQ  BD009122C6       bus-2-targ-0-lun-0
   91: /dev/disk/dsk1195c   COMPAQ  BD009122BA       bus-2-targ-1-lun-0
  871: /dev/disk/dsk1180c   DEC     HSG80            bus-4-targ-2-lun-13
  122: /dev/changer/mc1             TL800    (C) DEC bus-4-targ-0-lun-10
  127: /dev/cport/scp2              DATA ROUTER      bus-4-targ-0-lun-0
  128: /dev/cport/scp3              HSG80CCL         bus-4-targ-2-lun-0
  129: /dev/cport/scp4              HSV110 (C)COMPAQ bus-4-targ-4-lun-0
  926: /dev/ntape/tape0     COMPAQ  DLT8000          bus-4-targ-0-lun-8
  927: /dev/ntape/tape1     COMPAQ  DLT8000          bus-4-targ-0-lun-9
  942: /dev/disk/dsk1199c   COMPAQ  HSV110 (C)COMPAQ IDENTIFIER=1001
  687: /dev/disk/cdrom53c   COMPAQ  CDR-8435         bus-6-targ-0-lun-0

The preceding output is edited to remove many similar entries for disk storage devices. In this output, you can see that some kernel subsystem pseudodevices such as /dev/kevm (kernel event manager) are listed, but the numerous terminals and pseudoterminals are not listed. Information about kernel subsystems such as /dev/kevm is provided in the system attributes (sys_attrs*) reference pages. See sys_attrs(5). Other pseudodevices, such as /dev/random, are described in their own reference pages. Use the apropos command to locate the reference pages associated with a particular device as shown in the following examples:

# apropos random
.
.
random, urandom (4)     - Kernel random number source devices
# apropos disk
.
.
disk, dsk, cdrom, rz (7)  - SCSI disk interface
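Returning to the hwmgr device listing shown earlier in this section, you can narrow a long listing to a single class of device by matching on the class directory. The following is a minimal sketch that assumes the HWID is the first field and the device special file name is the second, as in that listing. The function name list_by_class is hypothetical.

```shell
#!/bin/sh
# list_by_class (hypothetical helper): print the device special file names
# in a device listing that belong to one class directory under /dev.
# $1 is the class directory name, for example disk, ntape, or cport.
list_by_class() {
    awk -v prefix="/dev/$1/" \
        '$1 ~ /^[0-9]+:$/ && index($2, prefix) == 1 { print $2 }'
}

# Sample rows copied from the listing earlier in this section.
list_by_class disk <<'EOF'
 HWID: Device Name          Mfg     Model           Location
 ----------------------------------------------------------------------
    8: /dev/kevm
   53: /dev/cport/scp5              SWXCR            xcr0
   90: /dev/disk/dsk1194c   COMPAQ  BD009122C6       bus-2-targ-0-lun-0
  926: /dev/ntape/tape0     COMPAQ  DLT8000          bus-4-targ-0-lun-8
  687: /dev/disk/cdrom53c   COMPAQ  CDR-8435         bus-6-targ-0-lun-0
EOF
```

Because the match is anchored to the start of the device name, a class of disk does not also select entries under /dev/rdisk.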

1.5.2.2    Legacy Device Special File Names

According to legacy device naming conventions, all device special files are stored in the /dev directory. The device special file names indicate the device type, its physical location, and other device attributes. Examples of the file name format for disk and tape device special file names that use the legacy conventions are /dev/rz14f for a SCSI disk and /dev/rmt0a for a tape device. The name contains the following information:

Path   Prefix           Type         Number (Instance)   Suffix
/dev   r (raw)          rz (disk)    0                   c (partition)
/dev   (rewind)         rmt (tape)   4                   a (density)
/dev   n (non-rewind)   rmt (tape)   12                  h (density)

This information is interpreted as follows:

The path is the directory for device special files. All device special files are placed in the /dev directory.

The prefix differentiates one set of device special files for the same physical device from another set, as follows:

The type is the two or three-character driver name, such as rz for SCSI disk devices or rmt for tape devices.

The number is the unit number of the device, as follows:

The suffix differentiates multiple device special files for the same physical device, as follows:

Legacy device naming conventions are supported so that scripts continue to work as expected. However, features available with the current device naming convention might not work with the legacy naming convention. When Version 5.0 or higher is installed, none of the legacy device special files (such as rz13d) are created during the installation. If you determine that legacy device special file naming is required, you must create the legacy device names by using the appropriate commands described in dsfmgr(8). Some devices do not support legacy device special files.
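The structure of a legacy name (prefix, type, number, suffix) can be made concrete with a short script. The sketch below handles only the rz and rmt types used as examples in this section. It assumes a single-letter suffix, with a through h accepted for disk partitions and any lowercase letter accepted for tape density. The function name parse_legacy_name is hypothetical.

```shell
#!/bin/sh
# parse_legacy_name (hypothetical helper): split a legacy device special
# file name into its prefix, type, number, and suffix. Only the rz (disk)
# and rmt (tape) types are recognized; other names print nothing.
parse_legacy_name() {
    printf '%s\n' "${1#/dev/}" | sed -n -E \
        -e 's/^(r?)(rz)([0-9]+)([a-h])$/prefix=\1 type=\2 number=\3 suffix=\4/p' \
        -e 's/^(n?)(rmt)([0-9]+)([a-z])$/prefix=\1 type=\2 number=\3 suffix=\4/p'
}

parse_legacy_name /dev/rrz0c
parse_legacy_name /dev/rmt4a
parse_legacy_name /dev/nrmt12h
```

An empty prefix field in the output corresponds to the block disk device or the rewind tape device.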

1.5.2.3    Current Device Special File Names

Current device special files imply abstract device names and convey no information about the device architecture or logical path to the device. The current device naming convention consists of a descriptive name for the device and an instance number. These two elements form the basename of the device as shown in Table 1-3.

Table 1-3:  Sample Current Device Special File Names

Location in /dev   Device Name   Instance   Basename
/disk              dsk           0          dsk0
/rdisk             dsk           0          dsk0
/disk              cdrom         1          cdrom1
/tape              tape          0          tape0

The combination of the device name and a system-assigned instance number creates a basename such as dsk0.

The current device special files are named according to the basename of the devices, and include a suffix that conveys more information about the addressed device. This suffix differs depending on the type of device, as follows:

1.5.2.4    Converting Device Special File Names

If you have shell scripts that use commands that act on device special files, be aware that any command or utility supplied with the operating system operates on current and legacy file names in one of the following ways:

No device can use both forms of device names simultaneously. Test your shell scripts for compliance with the device naming methods. Refer to the individual reference pages or the online help for a command.

If you want to update scripts, translating legacy names to the equivalent current name is a simple process. Table 1-4 lists some examples of legacy device names and corresponding current device names. There is no relationship between the instance numbers. A device associated with legacy device special file /dev/rz10b might be associated with /dev/disk/dsk2b under the current system.

Using these names as examples, you can translate device names that appear in your scripts. You can also use the dsfmgr(8) command to convert device names.

Table 1-4:  Sample Device Name Translations

Legacy Device Special File Name   New Device Special File Name
/dev/rmt0a                        /dev/tape/tape0
/dev/rmt1h                        /dev/tape/tape1_d1
/dev/nrmt0a                       /dev/ntape/tape0_d0
/dev/nrmt3m                       /dev/ntape/tape3_d2
/dev/rz0a                         /dev/disk/dsk0a
/dev/rz10g                        /dev/disk/dsk10g
/dev/rrz0a                        /dev/rdisk/dsk0a
/dev/rrz10b                       /dev/rdisk/dsk10b
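Before you translate names, it helps to locate every legacy device name that your scripts use. The following sketch uses grep to flag candidate lines for manual review. The pattern covers only the rz and rmt name forms that appear in the table, and the function name find_legacy_names is hypothetical. It reports line numbers but does not rewrite anything, because the instance numbers of legacy and current names are unrelated.

```shell
#!/bin/sh
# find_legacy_names (hypothetical helper): print, with line numbers, each
# line that appears to contain a legacy disk (rz, rrz) or tape (rmt, nrmt)
# device special file name. Reads the named files, or standard input if
# no file is given.
find_legacy_names() {
    grep -n -E '/dev/(r?rz|n?rmt)[0-9]+[a-z]' "$@"
}

# Invented script fragment used as sample input.
find_legacy_names <<'EOF'
dump 0f /dev/nrmt0h /usr
mount /dev/disk/dsk2b /mnt
fsck /dev/rrz10b
EOF
```

Lines that already use current names, such as /dev/disk/dsk2b or /dev/rdisk/dsk0a, do not match the pattern.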

1.5.3    Managing Device Special Files

In most cases, the system itself manages device special files. During the initial full installation of the operating system, device special files are created for every SCSI disk and SCSI tape device found on the system. If you updated the operating system from a previous version by using the update installation procedure, both the current device special files and the legacy device special files might exist. However, if you subsequently add new SCSI devices, the dsfmgr command creates only the current (not the legacy) device special files by default. When the system is rebooted, the dsfmgr command is called automatically during the boot sequence to create the device special files for the new device. The system also automatically creates the device special files that it requires for pseudodevices such as ptys (pseudoterminals).

When you add a SCSI disk or tape device to the system, the new device is found and recognized automatically, added to the hardware management databases, and its device special files created. On the first reboot after installation of the new device, the dsfmgr command is called automatically during the boot sequence to create the new device special files for that device.

To support applications that work only with legacy device names, you might need to manually create the legacy device special files, either for every existing device or for recently added devices only. Some recent devices that support features such as Fibre Channel can use only the current device special file naming convention.

The following sections describe some typical uses of the dsfmgr command. See dsfmgr(8) for detailed information on the command syntax. The system script file /sbin/dn_setup, which runs at boot time to create device special files, provides an example of a script that uses dsfmgr command options.

1.5.3.1    Using dn_setup to Perform Generic Operations

The /sbin/dn_setup script runs automatically at system startup to create device special file names. The /sbin/bcheckrc utility also verifies device special files during system startup. If /sbin/bcheckrc discovers any problems, it displays the following message at the console:

bcheckrc: Device Naming failed initial check.
    Run dsfmgr -v, and if no errors are 
    reported, exit or type CTRL-D to continue
    booting normally.

Normally, you do not need to use the dn_setup command. It is useful if you need to troubleshoot device name problems or restore a damaged device special file directory or database files. (See also Section 1.5.3.3.) If you frequently change your system configuration or install different versions of the operating system, you might see device-related error messages at the system console during system startup. These messages might indicate that the system is unable to assign device special file names. This problem can occur when the saved configuration does not map to the current configuration. Adding or removing devices between installations can also cause the problem.

Of the dn_setup command options, only -sanity_check is generally useful to administrators. Enter the following command to verify the device name database:

# /sbin/dn_setup -sanity_check
Passed.

If you see messages other than Passed, use the dsfmgr command to verify and fix the device name database. If the problem is serious and prevents a successful boot, you might be instructed to use other command options for debugging and problem solving under the guidance of your technical support office. See dn_setup(8).
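The check lends itself to scripting. The following is a minimal sketch, not part of the operating system: the function name sanity_passed is hypothetical, and the sketch assumes that a healthy database produces exactly the single line Passed. shown in the preceding example.

```shell
# Hypothetical helper (not an operating system command): succeeds only if
# standard input is exactly the single line "Passed." shown above.
sanity_passed() {
    output=$(cat)
    [ "$output" = "Passed." ]
}

# Demonstration on captured text. On a live system you would instead pipe
# the real command:  /sbin/dn_setup -sanity_check | sanity_passed
if printf 'Passed.\n' | sanity_passed; then
    echo "device name database: OK"
else
    echo "device name database: run /sbin/dsfmgr -v"
fi
```

Any output other than Passed. makes the helper fail, which is the point at which you would turn to the dsfmgr command.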

1.5.3.2    Displaying Device Classes and Categories

Any individual type of device on the system is identified in the Category to Class-Directory, Prefix Database file, /etc/dccd.dat. You can display information in these databases by using the dsfmgr command. This information enables you to find out what devices are on a system, and obtain device identification attributes that you can use with other dsfmgr command options. For example, you can find a class of devices that have related physical characteristics, such as being disk devices. Each class of devices has its own directory in /dev such as /dev/ntape for nonrewind tape devices. Device classes are stored in the Device Class Directory Default Database file, /etc/dcdd.dat.

Use the following command to view the entries in the databases:

# /sbin/dsfmgr -s

dsfmgr: show all datum for system at /
 
Device Class Directory Default Database:
     # scope mode  name
    --  ---  ----  -----------
     1   l   0755  .
     2   c   0755  disk
     3   c   0755  rdisk
     4   c   0755  tape
     5   c   0755  ntape
     6   l   0755  none
 
Category to Class-Directory, Prefix Database:
 #   category       sub_category   type        directory  iw  t mode prefix
--   -------------- -------------- ----------  ---------  --  - ---- --------
 1   disk           cdrom          block       disk        1  b 0600 cdrom
 2   disk           cdrom          char        rdisk       1  c 0600 cdrom
 3   disk           floppy         block       disk        1  b 0600 floppy
 4   disk           floppy         char        rdisk       1  c 0600 floppy
 5   disk           floppy_fdi     block       disk        1  b 0666 floppy
 6   disk           floppy_fdi     char        rdisk       1  c 0666 floppy
 7   disk           generic        block       disk        1  b 0600 dsk
 8   disk           generic        char        rdisk       1  c 0600 dsk
 9   parallel_port  printer        *           .           1  c 0666 lp
10   pseudo         kevm           *           .           0  c 0600 kevm
11   tape           *              norewind    ntape       1  c 0666 tape
12   tape           *              rewind      tape        1  c 0666 tape
13   terminal       hardwired      *           .           2  c 0666 tty
14   *              *              *           none        1  c 0000 unknown
 
Device Directory Tree:
   12800    2 drwxr-xr-x  6 root system 2048 May 23 09:38 /dev/.
     166    1 drwxr-xr-x  2 root system  512 Apr 25 15:58 /dev/disk
    6624    1 drwxr-xr-x  2 root system  512 Apr 25 11:37 /dev/rdisk
     180    1 drw-r--r--  2 root system  512 Apr 25 11:39 /dev/tape
    6637    1 drw-r--r--  2 root system  512 Apr 25 11:39 /dev/ntape
     181    1 drwxr-xr-x  2 root system  512 May  8 16:48 /dev/none
 
Dev Nodes:
 13100  0 crw-------  1 root system 79,  0 May  8 16:47 /dev/kevm
 13101  0 crw-------  1 root system 79,  2 May  8 16:47 /dev/kevm.pterm
 13102  0 crw-r--r--  1 root system 35,  0 May  8 16:47 /dev/tty00
 13103  0 crw-r--r--  1 root system 35,  1 May  8 16:47 /dev/tty01
 13104  0 crw-r--r--  1 root system 34,  0 May  8 16:47 /dev/lp0
   169  0 brw-------  1 root system 19, 17 May  8 16:47 /dev/disk/dsk0a
  6627  0 crw-------  1 root system 19, 18 May  8 16:47 /dev/rdisk/dsk0a
   170  0 brw-------  1 root system 19, 19 May  8 16:47 /dev/disk/dsk0b
  6628  0 crw-------  1 root system 19, 20 May  8 16:47 /dev/rdisk/dsk0b
   171  0 brw-------  1 root system 19, 21 May  8 16:47 /dev/disk/dsk0c
    
.
.
.

This display provides you with information that you can use with other dsfmgr commands. (See dsfmgr(8) for a complete description of the fields in the databases.) For example:
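As an illustration only, the following sketch reads dsfmgr -s output and lists the /dev directory and name prefix used for one device category. The function name class_dirs_for is hypothetical, and the sketch assumes that the Category to Class-Directory table has the nine-column layout shown in the preceding output.

```shell
# Hypothetical helper: print "directory prefix" pairs for one category,
# reading "dsfmgr -s" output on standard input.
class_dirs_for() {
    awk -v category="$1" '
        /^Category to Class-Directory/ { in_table = 1; next }  # table starts
        /^Device Directory Tree:/      { in_table = 0 }        # table ends
        in_table && NF == 9 && $1 ~ /^[0-9]+$/ && $2 == category {
            print "/dev/" $5, $9
        }
    '
}

# Demonstration on rows copied from the display above. On a live system:
#   /sbin/dsfmgr -s | class_dirs_for tape
class_dirs_for tape <<'EOF'
Category to Class-Directory, Prefix Database:
 #   category       sub_category   type        directory  iw  t mode prefix
--   -------------- -------------- ----------  ---------  --  - ---- --------
 7   disk           generic        block       disk        1  b 0600 dsk
11   tape           *              norewind    ntape       1  c 0666 tape
12   tape           *              rewind      tape        1  c 0666 tape
Device Directory Tree:
EOF
```

For the rows shown, the sketch reports that tape devices use the prefix tape in both /dev/ntape and /dev/tape.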

1.5.3.3    Verifying and Fixing the Databases

Under unusual circumstances, the device databases might be corrupted or device special files might be accidentally removed from the system. You might see errors indicating that a device is no longer available, but the device itself does not appear to be faulty. If you suspect that there might be a problem with the device special files, you can check the databases by using the dsfmgr -v (verify) command option.

Caution

If you see error messages at system startup that indicate a device naming problem, use the verify command option so that you can proceed with the boot. Check your system configuration before and after verifying the databases. The verification procedure fixes most errors and enables you to proceed, but it does not cure any underlying device or configuration problems.

Such problems are rare and usually arise when performing unusual operations such as switching between boot disks. Errors generally mean that the system is unable to recover and use a good copy of the previous configuration, and errors usually arise because the current system configuration no longer matches the database.

As for all potentially destructive system operations, ensure that you are able to restore the system to its identical previous configuration, and to restore the previous version of the operating system from your backup.

For example, if you attempt to configure the floppy disk device to use the mtools commands, and you find that you cannot access the device, use the following dsfmgr command to help diagnose the problem:

# /sbin/dsfmgr -v
dsfmgr: verify all datum for system at /
 
Device Class Directory Default Database:
    OK.
 
Device Category to Class Directory Database:
    OK.
 
Dev directory structure:
    OK.
 
Dev Nodes:
    ERROR: device node does not exist: /dev/disk/floppy0a
    ERROR: device node does not exist: /dev/disk/floppy0c
  Errors:   2
 
Total errors:   2

This output shows that the device special files for the floppy disk device are missing. To correct the errors, use the same command with the -F (fix) flag as follows:

# /sbin/dsfmgr -v -F
 
dsfmgr: verify all datum for system at /
 
Device Class Directory Default Database:
    OK.
 
Device Category to Class Directory Database:
    OK.
 
Dev directory structure:
    OK.
 
Dev Nodes:
    WARNING: device node does not exist: /dev/disk/floppy0a
    WARNING: device node does not exist: /dev/disk/floppy0c
    OK.
 
Total warnings:   2

In the preceding output, the ERROR changes to a WARNING, indicating that the device special files for the floppy disk are created automatically. If you repeat the dsfmgr -v command, no further errors are displayed.
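A script can decide whether a fix pass is worth running by reading the summary line of the verify output. This is a minimal sketch under stated assumptions: the function name total_errors is hypothetical, and the sketch assumes that the absence of a "Total errors:" line (as in the -F output above) means that no errors were found.

```shell
# Hypothetical helper: print the number from the "Total errors:" line of
# "dsfmgr -v" output read on standard input, or 0 if there is no such line.
total_errors() {
    awk '/^Total errors:/ { n = $3 } END { print n + 0 }'
}

# Demonstration on the error summary shown above. On a live system:
#   errors=$(/sbin/dsfmgr -v | total_errors)
sample='Dev Nodes:
    ERROR: device node does not exist: /dev/disk/floppy0a
    ERROR: device node does not exist: /dev/disk/floppy0c
  Errors:   2

Total errors:   2'

errors=$(printf '%s\n' "$sample" | total_errors)
if [ "$errors" -gt 0 ]; then
    echo "$errors error(s) found: consider running /sbin/dsfmgr -v -F"
fi
```

The sketch only reports; it deliberately leaves the decision to run the fix to the administrator, in keeping with the caution earlier in this section.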

1.5.3.4    Deleting Device Special Files

If a device is permanently removed from the system, you can remove its device special file to reassign the file to another type of device. Use the dsfmgr -D command option to remove device special files, as shown in the following example:

# ls /dev/disk
cdrom0a   dsk0a     dsk0c     dsk0e     dsk0g     floppy0a
cdrom0c   dsk0b     dsk0d     dsk0f     dsk0h     floppy0c
 
#  /sbin/dsfmgr -D /dev/disk/cdrom0*
 -cdrom0a -cdrom0a -cdrom0c -cdrom0c
#  ls /dev/disk
dsk0a     dsk0c     dsk0e     dsk0g     floppy0a
dsk0b     dsk0d     dsk0f     dsk0h     floppy0c

The output from the first ls command shows that there are device special files for cdrom0. Running the dsfmgr -D command option on all cdrom0 files, as selected by the wildcard symbol (*), causes all of those device special files to be permanently deleted. The message that follows lists each deleted name twice, because the command also deletes the corresponding device special files from the /dev/rdisk directory, where the raw (character) device special files are located.

If device special files are deleted in error, and no hardware changes are made, recreate the files as follows:

#  /sbin/dsfmgr -n cdrom0a
 
  +cdrom0a +cdrom0a
#  /sbin/dsfmgr -n cdrom0c
 
  +cdrom0c +cdrom0c
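If you record the directory listing before a deletion, you can work out afterward which names to recreate. The following sketch is illustrative only: the function name removed_names is hypothetical, and it prints the dsfmgr -n commands instead of running them.

```shell
# Hypothetical helper: print each name that appears in the first listing
# but not in the second. Both arguments are whitespace-separated names.
removed_names() {
    after_list=" $(printf '%s ' $2)"        # " name1 name2 ... "
    for name in $1; do
        case "$after_list" in
            *" $name "*) ;;                 # still present
            *) printf '%s\n' "$name" ;;     # no longer present
        esac
    done
}

# Demonstration on the two listings shown above. On a live system you
# might capture them with:  before=$(ls /dev/disk)
before="cdrom0a dsk0a dsk0c dsk0e dsk0g floppy0a cdrom0c dsk0b dsk0d dsk0f dsk0h floppy0c"
after="dsk0a dsk0c dsk0e dsk0g floppy0a dsk0b dsk0d dsk0f dsk0h floppy0c"

for missing in $(removed_names "$before" "$after"); do
    echo "/sbin/dsfmgr -n $missing"
done
```

For the listings in this section, the sketch prints the same two dsfmgr -n commands used in the preceding example.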

1.5.3.5    Moving and Exchanging Device Special File Names

You might want to move (reassign) the device special files between devices by using the dsfmgr -m (move) command option. You can also exchange the device special files of one device for those of another device by using the -e option. For example:

#  /sbin/dsfmgr -m dsk0 dsk10
#  /sbin/dsfmgr -e dsk1 15
 
 

The following procedure provides an example of how you use these command options when replacing a tape device. The example uses a desktop TLZ06 (DAT) SCSI drive as an emergency replacement for a failed internal tape drive. The procedure also applies to replacing an internal tape drive or adding and removing a drive mounted in a tape changer (jukebox).

This procedure applies to all configurations that have an available SCSI address and (if required) an appropriate free SCSI port. If you do not have an available SCSI bus, you might need to first remove the failed tape and then add the replacement. Adding a tape drive requires a quick reboot of the system. Advise users that the file system will be unavailable for whatever time it typically takes for your system to reboot.

The scenario described in the following procedure assumes that you want to preserve the identity of a device by transferring an existing device name to the newly installed device instead of using the name that the system automatically assigns to the new device. For example, assume you create a backup script that uses tape device tape0, one of three tape devices attached to the system. The other tape devices are tape1 and tape2. If tape0 fails and you replace it with a new device, the new device will be assigned the name tape3. Your backup script will then fail, because it addresses a missing device. You must either rewrite your scripts to address the new device name, or rename the device so that it takes on the identity of the failed device.

Alternatively, consider taking advantage of automatic device name creation by making your local scripts and utilities independent of device names. Such scripts continue to function regardless of hardware configuration changes. The dsfmgr and hwmgr commands enable you to poll the system configuration and dynamically determine the characteristics and attributes of a device. This strategy is more robust because it enables you to fail over to a healthy device if your original target is unavailable.
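One way to make a script independent of a fixed device name is to look the name up when the script runs. The following is a minimal sketch, assuming the nine-column hwmgr show scsi layout that appears in this section; the function name first_device_of_type is hypothetical.

```shell
# Hypothetical helper: print the DEVICE FILE of the first device of the
# given DEVICE TYPE, reading "hwmgr show scsi" output on standard input.
first_device_of_type() {
    awk -v type="$1" '
        # Data rows start with "NN:" and have nine fields when a device
        # special file name is present.
        $1 ~ /^[0-9]+:$/ && NF == 9 && $4 == type { print $8; exit }
    '
}

# Demonstration on sample rows in that layout. On a live system:
#   tape=$(/sbin/hwmgr show scsi | first_device_of_type tape)
tape=$(first_device_of_type tape <<'EOF'
       SCSI             DEVICE DEVICE  DRIVER NUM  DEVICE FIRST
HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
--------------------------------------------------------------------
 32:  0        f2394    disk   none    2      1    dsk0   [0/0/0]
 34:  2        f2394    cdrom  none    0      1    cdrom0 [0/4/0]
 41:  4        f2394    tape   none    0      1    tape0  [1/3/0]
EOF
)
echo "backup target: ${tape:-none found}"
```

A backup script written this way picks up whatever tape device the system currently reports, at the cost of not controlling which drive is used when several are attached.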

Before you use the procedure you must do the following:

Start the procedure as follows:

  1. Use the hwmgr command to display information about all SCSI devices as follows:

    #  /sbin/hwmgr show scsi
           SCSI             DEVICE DEVICE  DRIVER NUM  DEVICE FIRST
    HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
    --------------------------------------------------------------------
     32:  0        f2394    disk   none    2      1    dsk0   [0/0/0]
     33:  1        f2394    disk   none    2      1    dsk1   [0/1/0]
     34:  2        f2394    cdrom  none    0      1    cdrom0 [0/4/0]
     35:  3        f2394    disk   none    0      1    dsk4   [0/2/0]
     41:  4        f2394    tape   none    0      1    tape0  [1/3/0]
    

    Pipe the output to a file to create a record of the existing SCSI devices.

    The output from the preceding command provides the following information, which is relevant to this procedure:

  2. Use the sysman shutdown or shutdown command to notify users and shut down the system. Specify that you want to stop the operating system by using the -h (halt) option. For example:

    # shutdown -h +10
    "System shutting down to replace backup device \
    Back in 15 minutes"
    

  3. At the console prompt, determine the current device address assignments by using the following command:

    >>> show config
    

    The output from this command includes a list of devices attached to each SCSI bus. For example, MKB300.3.0.13.0 is the entry for a TLZ06 tape device attached at target 3 on bus B. The last three numbers are the bus, target, LUN address.

    If you were unable to determine an available bus address in Step 1, you can determine it based on the output from the show config command.

  4. Switch off power to the system using the front panel switch and wait for at least 20 seconds before removing cables.

  5. Complete this step only if either of the following conditions apply. Otherwise, go to Step 6:

    If either condition applies, proceed as follows:

    1. Switch off the power to the tape drive and physically remove it from the system as described in its owner's manual.

    2. At the console prompt, reboot your system to single-user mode as follows:

      >>> boot -flags s
      

    3. Use the following command to delete the tape:

      # /sbin/hwmgr delete scsi -did NN
      

      The value specified for the -did option is the SCSI DEVICEID that you determined in Step 1.

    4. Under rare circumstances, the deletion may not be complete. You should verify that the device is gone by reentering the hwmgr show scsi command as described in Step 1. You should also verify the content of the /dev/tape and /dev/ntape directories to ensure that the device special files are gone.

    If any record of the device remains on the system, see the problem solving information later in this procedure.

  6. Proceed with this step only if you intend to replace the tape device.

    Connect the new tape drive as described in the owner's manual. Switch on the power to the tape drive and to your system in the appropriate order.

    Reboot the system to single-user mode as follows:

    >>> boot -flags s
    

  7. Watch the boot procedure for messages. During the reboot, the new tape drive is automatically detected and a message is displayed at the console that is similar to the following:

    dsfmgr:  Note creating device special files
    +tape16...
    

    This message confirms that the new tape device was discovered and the device special files are being created in the /dev/tape or /dev/ntape directory. If you do not see such a message, you must create the device special files manually. (However, you should first attempt to complete this procedure.)

  8. When your system has rebooted to multiuser level, enter the following command to display information about the newly added tape drive:

    # /sbin/hwmgr show scsi
          SCSI              DEVICE DEVICE  DRIVER NUM  DEVICE FIRST
    HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
    --------------------------------------------------------------------
    48:  5        f2394     tape   none    0      1    tape16 [1/4/0]
    

    Comparing this command output to the output in Step 1, you can see that the new tape drive is now listed as tape16, HWID 48. Notice that the DEVICE FILE (device special file) instance for the new tape drive is 16, which is considerably higher than that of the original tape drive (0).

  9. When you replace a device, you might also want to take one of the following actions:

    1. Transfer the device name from an existing (perhaps failed) tape drive to a replacement drive. Use the dsfmgr command with the -e (exchange) option, specifying the DEVICE FILE as follows:

      # /sbin/dsfmgr -e tape16 tape0
      tape16<==>tape0       tape16_d0<==>tape0_d0 tape16_d1<==>tape0_d1
      tape16_d2<==>tape0_d2 tape16_d3<==>tape0_d3 tape16_d4<==>tape0_d4
      tape16_d5<==>tape0_d5 tape16_d6<==>tape0_d6 tape16_d7<==>tape0_d7
      tape16c<==>tape0c     tape16<==>tape0       tape16_d0<==>tape0_d0
      tape16_d1<==>tape0_d1 tape16_d2<==>tape0_d2 tape16_d3<==>tape0_d3
      tape16_d4<==>tape0_d4 tape16_d5<==>tape0_d5 tape16_d6<==>tape0_d6
      tape16_d7<==>tape0_d7 tape16c<==>tape0c
      

      The output contains bidirectional (<==>) arrows, indicating that the device special file names are exchanged. Verify the exchange as follows:

      # /sbin/hwmgr show scsi
            SCSI              DEVICE DEVICE  DRIVER NUM  DEVICE FIRST
      HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
      -------------------------------------------------------------------
      48:   5        f2394    tape   none    0      1    tape0  [1/4/0]
      

      The tape drive at SCSI address 1/4/0 (the newly added drive) is now tape0.

    2. Change the device special file to a lower number or 0. For example, the dump command defaults to tape0_d0 (tape 0 with default density). If you are using a tape changer (jukebox) with multiple tape drives, you might need to renumber the devices so that your local scripts work as expected.

      Use the dsfmgr command with the -m (move) option, specifying the DEVICE FILE as follows:

      # /sbin/dsfmgr -m tape16 tape1
       
      tape16=>tape1       tape16_d0=>tape1_d0 tape16_d1=>tape1_d1
      tape16_d2=>tape1_d2 tape16_d3=>tape1_d3 tape16_d4=>tape1_d4
      tape16_d5=>tape1_d5 tape16_d6=>tape1_d6 tape16_d7=>tape1_d7
      tape16c=>tape1c     tape16=>tape1       tape16_d0=>tape1_d0
      tape16_d1=>tape1_d1 tape16_d2=>tape1_d2 tape16_d3=>tape1_d3
      tape16_d4=>tape1_d4 tape16_d5=>tape1_d5 tape16_d6=>tape1_d6
      tape16_d7=>tape1_d7 tape16c=>tape1c
      

      The output contains one-way arrows (=>), indicating that the device special file names are moved. Verify the move as follows:

      # /sbin/hwmgr show scsi
            SCSI              DEVICE DEVICE  DRIVER NUM  DEVICE FIRST
      HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
      --------------------------------------------------------------------
      46:   4        f2394    tape   none    0      1    tape1  [0/5/0]
      48:   5        f2394    tape   none    0      1    tape0  [0/6/0]
       
       
      

      The tape drive at SCSI address 0/5/0 is now named tape1.
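The comparison between the record saved in Step 1 and the output in Step 8 can be sketched as a small script. This is an illustration under stated assumptions, not a supported tool: the function name new_devices and the file names are hypothetical, and the sketch assumes that both files contain hwmgr show scsi output in the layout shown in this procedure.

```shell
# Hypothetical helper: print "HWID: device-file" for every device listed
# in the second file whose HWID does not appear in the first file.
new_devices() {
    awk '
        $1 !~ /^[0-9]+:$/   { next }                 # skip headings and rules
        FILENAME == ARGV[1] { seen[$1] = 1; next }   # saved record
        !($1 in seen)       { print $1, $8 }         # current output
    ' "$1" "$2"
}

# Demonstration with rows taken from Step 1 and Step 8.
workdir=$(mktemp -d)
cat > "$workdir/scsi.before" <<'EOF'
HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
 34:  2        f2394    cdrom  none    0      1    cdrom0 [0/4/0]
 41:  4        f2394    tape   none    0      1    tape0  [1/3/0]
EOF
cat > "$workdir/scsi.after" <<'EOF'
HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
 34:  2        f2394    cdrom  none    0      1    cdrom0 [0/4/0]
48:  5        f2394     tape   none    0      1    tape16 [1/4/0]
EOF
new_devices "$workdir/scsi.before" "$workdir/scsi.after"
rm -rf "$workdir"
```

The device file name that the sketch reports is the one you would pass to the dsfmgr -e or dsfmgr -m command in Step 9.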

The checkpoints embedded in the procedure should ensure correct completion. Under rare circumstances, the addition of a new tape drive or the removal of an existing tape drive might fail. You can resolve this problem by using the following troubleshooting procedure:

  1. Keep a careful record of the options that you attempt in case you need to contact your field service representative.

    Avoid a trial-and-error approach of trying various combinations of hwmgr and dsfmgr commands. Most likely, the problem is that the device's database records are in an indeterminate state, and trying various command options might cause more problems.

  2. Start the Event Manager (EVM) viewer and ensure that it is configured to display events of priority 300 and greater:

    # /usr/sbin/sysman event_viewer
    

    Use the event viewer to verify hardware events generated by the procedure or to trap errors. The event viewer displays events confirming that the tape drive was recognized and assigned a new basename (tape*) and a hardware identifier (HWID). It may take some time for the system to post these events, so you might need to refresh the event viewer before the current events are displayed.

  3. Use the following command to scan the SCSI buses and assign identifiers to all new devices that are found:

    # /sbin/hwmgr scan scsi
    hwmgr: Scan request successfully initiated
    

    A hardware scan is asynchronous and, although the system prompt is returned almost immediately, it does not signify the end of a scan. A scan can take several minutes on a system with many SCSI devices. You must wait for the scan to complete. Because a scan is normally performed automatically on system startup, it does not display a message when it completes.

  4. When the scan is complete, use the following command to verify whether the tape drive was detected:

    # /sbin/hwmgr show scsi
          SCSI              DEVICE DEVICE  DRIVER NUM  DEVICE FIRST
    HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
    -------------------------------------------------------------------
    46:   4        f2394    tape   none    0      1    tape1  [0/5/0]
    48:   5        f2394    tape   none    0      1           [1/4/0]
     
     
    

  5. If you notice a missing device special file in the hwmgr output in troubleshooting Step 4, verify whether or not the device special files were created for the newly added device. Verify the content of the /dev/tape or /dev/ntape directories as appropriate.

    If there is no device special file, use the dsfmgr command (see dsfmgr(8)) with the -K option to create the device special files as follows:

    # /sbin/dsfmgr -K
    +tape16 +tape16_d0 +tape16_d1...
    

  6. If the preceding steps are unsuccessful, the device database might be in an indeterminate state. Look for an entry for the tape drive in the /etc/dfsc.dat file by searching for the HWID or device special file name. For example, if the tape drive is listed as tape0, enter the following command:

    # grep "tape0" /etc/dfsc.dat
    

    The output from the preceding command is similar to the following:

    A: 0 130003e    48    9    6   c  ""   /dev/tape  tape 0
    

    In the preceding output, the fourth column contains the hardware identifier (HWID). In this case, it is 48.

  7. Use the following command to confirm the status of the device, using the HWID 48 from the preceding step:

    # /sbin/hwmgr show component -id 48
    

    If there is output from the command, contact your technical support representative. Otherwise, proceed to the next step.

  8. If you still cannot access the device, reboot your system to reset the device database. Contact your technical support representative if you are unable to reboot the system.
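The check in troubleshooting Step 4 can also be scripted. The following sketch is illustrative only: the function name devices_without_dsf is hypothetical, and the sketch assumes that a row with an empty DEVICE FILE column leaves the bracketed path as its eighth field, as in the Step 4 output.

```shell
# Hypothetical helper: print the HWID of every device whose DEVICE FILE
# column is empty, reading "hwmgr show scsi" output on standard input.
devices_without_dsf() {
    awk '
        # With no device file name, the "[bus/target/lun]" path moves up
        # into the eighth field.
        $1 ~ /^[0-9]+:$/ && substr($8, 1, 1) == "[" {
            sub(/:$/, "", $1)
            print $1
        }
    '
}

# Demonstration on the rows from Step 4. On a live system:
#   /sbin/hwmgr show scsi | devices_without_dsf
devices_without_dsf <<'EOF'
HWID: DEVICEID HOSTNAME TYPE   SUBTYPE OWNER  PATH FILE   VALID PATH
-------------------------------------------------------------------
46:   4        f2394    tape   none    0      1    tape1  [0/5/0]
48:   5        f2394    tape   none    0      1           [1/4/0]
EOF
```

Each HWID that the sketch prints identifies a device for which you would check the /dev/tape and /dev/ntape directories, as described in Step 5.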

1.5.3.6    Renumbering Device Special Files

When you reconfigure components on your system (such as by deleting devices), the instance number assigned to device special files is incremented. In time, you might find that the device special files appear to be randomly numbered. Also, if you back up the system by using a clone-copy-delete procedure, the instance value increases at each backup and device names can become large and difficult to manage. Because clone backup programs must also determine the new device instances at each backup, the changing device special file names will increase the time required for a backup.

Using the dsfmgr -vI command option minimizes (or resets) the instance number of each device to the lowest possible value. If your system has a fixed configuration except for the backup procedure, using this option ensures that each new set of cloned backup devices always has the same new names, simplifying your backup procedure. The following procedure demonstrates how the dsfmgr -vI command resets the device instances for all devices to the lowest possible value:

  1. The following (truncated) output shows how a tape's device special file is initially numbered on a newly installed system:

    # /sbin/hwmgr view device
    61: /dev/ntape/tape0    COMPAQ  SDT-10000   bus-5-targ-0-lun-0
    

  2. To simulate the effect of many changes to the system's configuration, the following command renumbers the tape's device special file to 40:

    # /sbin/dsfmgr -m tape0 tape40
    tape0=>tape40  tape0_d0=>tape40_d0  tape0_d1=>tape40_d1
    tape0_d2=>tape40_d2  tape0_d3=>tape40_d3  tape0_d4=>tape40_d4.......
    

  3. The following (truncated) output shows that the tape's device special file is now numbered tape40:

    # /sbin/hwmgr view device
    61: /dev/ntape/tape40    COMPAQ   SDT-10000        bus-5-targ-0-lun-0
    

  4. To simulate the effect of removing devices from the system, the following command renumbers the tape's device special file to 10:

    # /sbin/dsfmgr -m tape40 tape10
    tape40=>tape10.
    

    This means that there are a number of unused device special file instances in the range 11 through 39. However, the system is currently unaware that these instances are free, and it will continue to number newly installed devices starting at instance 41.

  5. To cause the system to allocate the lowest available instance, enter the following command:

    # /sbin/dsfmgr -vI
    dsfmgr: verify all datum for system (version) at /
     
    Default File Tree:
        OK.
     
    Device Class Directory Default Database:
        OK.
     
    Device Category to Class Directory Database:
        OK.
     
    Dev directory structure:
        OK.
     
    Device Status Files:
        OK.
     
    Dev Nodes:
        OK.
     
    Minimize next instance values:
          tape     41  =>   11
    

    The message "Minimize next instance values:" shows that the lowest available instance is now 11.

  6. To test that the operation is successful, the following commands delete and rescan the tape device to cause the system to allocate a new device special file at instance 11:

    # /sbin/hwmgr delete scsi -did 3
    hwmgr: The delete operation was successful
    # /sbin/hwmgr scan scsi
    hwmgr: Scan request successfully initiated
    # /sbin/hwmgr show scsi
    84:  3        argo1      tape      none    0      1    tape11 [5/0/0]
    

    The tape's device special file is now numbered 11, indicating that the system used the lowest available instance.

To use this option on your system, you need only amend your backup script, or create a new script that runs the dsfmgr -vI command option at the appropriate time. Alternatively, run the command after performing any hardware configuration changes.
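A backup script that runs the dsfmgr -vI command might also log which next-instance values changed. This is a minimal sketch, assuming the "Minimize next instance values:" section has the format shown in Step 5; the function name minimized_instances is hypothetical.

```shell
# Hypothetical helper: print "name old new" for each entry listed under
# "Minimize next instance values:" in "dsfmgr -vI" output on standard input.
minimized_instances() {
    awk '
        /Minimize next instance values:/ { in_section = 1; next }
        in_section && $3 == "=>"         { print $1, $2, $4 }
    '
}

# Demonstration on the tail of the output from Step 5. On a live system:
#   /sbin/dsfmgr -vI | minimized_instances
minimized_instances <<'EOF'
Dev Nodes:
    OK.

Minimize next instance values:
      tape     41  =>   11
EOF
```

For the output in Step 5, the sketch reports that the next tape instance moved from 41 to 11.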