G    Error Messages and Troubleshooting

This appendix describes the error messages you might see if there is a problem during an installation. Every attempt was made to make the error list complete; however, you may encounter errors that are not described here. Logical Storage Manager (LSM) troubleshooting information is also included.

This appendix is organized by topic rather than by error message:

  o  Full Installation error messages (Section G.1)
  o  Update Installation error messages (Section G.2)
  o  Software configuration error messages (Section G.3)

G.1    Full Installation Error Messages

This section describes the general error messages that you might encounter during a Full Installation.

The firmware revision on this system could not be detected

Some processor types do not allow the Full Installation technology to detect the currently installed firmware level. See the Alpha AXP Systems Firmware Update Release Notes Overview for information regarding the recommended revision numbers and how to update your system firmware.

The following error occurred, causing the "ping" command to fail for disk_name: <Specific error>

While using the text-based interface to the Full Installation, you entered the ping command to determine the device name associated with a physical disk by blinking the input/output light. For some reason, the ping command failed. Failure of the ping command does not mean that you cannot use this disk; it simply means the disk cannot be identified in this way.

No free partitions are remaining on this system, therefore the installation process cannot continue with this configuration. Either choose the "Default File System Layout", or choose a new "Custom File System Layout." If you cannot alter the layout, you may need to edit your disk partitions. Refer to the "Installation Guide -- Advanced Topics" for more information about editing disk partitions from the UNIX shell.

During the Full Installation, the system ran out of free, unused disk partitions. Perhaps a disk has a disk label where partition a consumes the entire disk, and there are no other disks available. In that case, you have to repartition the disk into individual segments of the right size. If you are using the graphical user interface, invoke the Disk Configuration utility to perform this task. If you are using the text-based interface, exit to the UNIX shell and use the disklabel command to repartition the disk.
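For example, from the UNIX shell you might open a disk label in an editor as follows (the disk name dsk1 is illustrative; substitute the name of the disk you want to repartition):

# disklabel -e dsk1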

Some cases, however, might be more difficult to remedy. For example, if a system has two RZ25 disks (too small for the recommended disk label), and both disks have CD type labels on them (that is, partition a and partition c are the same large partition, and no other partitions are assigned), one disk could be used for the / file system and the other for usr, leaving no place for anything else. The solution in this case is to edit the disk labels.

The partition you selected for [usr] takes up the entire disk, and leaves no space for the LSM private region. Therefore, you must choose a different partition.

You chose to install and configure the Logical Storage Manager (LSM). Each disk needs an LSM private region partition, and no room was left on the disk in question. You have to repartition the disk to create a partition at least 2 MB in size to hold the LSM private region. If that is not possible, select another disk.

Unable to get a list of software subsets available for installation.

This message can occur if the distribution media is corrupted. Contact your support representative to obtain a new set of operating system CD-ROMs. If you are performing the installation from a RIS server, the RIS environment may have to be re-created. Contact your RIS administrator about the problem.

Unable to determine the available software due to the following error: <Specific error>

This message can occur if the distribution media is corrupted. Contact your support representative to obtain a new set of operating system CD-ROMs. If you are performing the installation from a RIS server, the RIS environment may have to be re-created. Contact your RIS administrator about the problem.

G.1.1    Disk Label, File System and LSM Configuration Error Messages

If errors are encountered while configuring a disk label, a file system, or the Logical Storage Manager (LSM) during a Full Installation, you are instructed as follows:

Please inspect the file /var/tmp/install.FS.log to identify the
source of the failure

The /var/tmp/install.FS.log file is written in the /var memory file system (MFS) and is deleted upon system reboot. Use the more /var/tmp/install.FS.log command to view the contents of this file.

The corrective action depends upon the error message returned from the failed command. As a general procedure, ensure that the installation target disk is connected and is operating properly. If it is, restart the installation procedure and select a different disk (if possible). If the problem persists, contact your support representative to diagnose the problem with the disk.

Errors encountered during the LSM configuration phase also can be the result of specific problems with an existing LSM configuration. If possible, analyze the error message returned from the failing command, check and correct the existing configuration, then restart the installation. If you need more information about fixing LSM problems, refer to the Logical Storage Manager guide or the related LSM command reference pages.

The following topics are covered in this section:

  o  Fixing LSM configuration errors (Section G.1.1.1)
  o  Fixing multiple hostids errors on existing LSM configurations (Section G.1.1.2)
  o  Restarting LSM (Section G.1.1.3)

G.1.1.1    Fixing LSM Configuration Errors

Details on any LSM install error, including the actual error message, can be found in the /var/tmp/install.FS.log file. This file exists in the /var memory file system and will be available only until the system is rebooted. Typically, the source of the error will be the last entry in the log.

Additional information on the current state of the LSM configuration can be displayed by using the commands shown in Table G-1:

Table G-1:  LSM Display Commands

Display Command      LSM Components Displayed
-----------------    -------------------------------------
voldisk list         disk, disk media
voldisk -s list      expanded voldisk list output
volprint             volume, plex, subdisk, private region
volprint -t          expanded volprint output
voldg list rootdg    disk group, private region
disklabel -r name    partition fstype

The various components of an LSM configuration can be removed manually once the source of the problem has been identified. The commands to remove specific LSM components are shown in Table G-2:

Table G-2:  LSM Remove Commands

Remove Command         LSM Components Removed
-------------------    --------------------------
voldisk rm name        disk
voldg rmdisk name      disk media, private region
voledit -rf rm name    volume, plex, subdisk
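For example, assuming a failed volume with the illustrative name vol01 in the rootdg disk group, the volume and its plexes and subdisks could be removed as follows:

# voledit -rf rm vol01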

The voldisk rm command removes the LSM disk and updates the partition fstype on the disk label to unused. If, for some reason, the disk label is not updated after executing this command, the fstype can be set manually to unused by entering the following command:

# disklabel -s unused /dev/disk/dsk0

Once the source of the problem has been removed, the installation can be restarted by entering the restart command or by rebooting the installation media.

If you do not need to preserve any existing LSM information, LSM can be removed completely from the system by issuing the disklabel -z command against each disk on the system before starting the installation procedure. This method is suggested if you are unsure of the integrity of an existing LSM configuration.
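For example, to zero out the disk label of the disk dsk0 (repeat for each disk on the system):

# disklabel -z dsk0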

Caution

Be aware that the original configuration cannot be restored once the existing disk labels have been removed. All existing data on the system is lost.

This information is specific to LSM Full Installations, and it is not intended to be an overview of general LSM topics. If you need more information, refer to the Logical Storage Manager guide or the related LSM reference pages.

G.1.1.2    Fixing Multiple hostids Error on Existing LSM Configurations

LSM requires that a single hostid be defined for the rootdg disk group. For various reasons (for example, swapping disks between LSM systems without properly exporting them from their original system and importing them into the new system), an existing LSM configuration can be left in a state where multiple hostids exist for the rootdg disk group.

The following message is displayed during a Full Installation if you attempt to install LSM on a system with an LSM configuration that has multiple hostids:

LSM could not be initialized on this system due to the
following error:
lsm:voldctl: ERROR: enable failed: 
Multiple hostid's found for rootdg
 
o Choose "Continue" if you want the install process 
  to create all file systems without LSM.  Refer 
  to the Logical Storage Manager documentation for 
  instructions on how to configure LSM after 
  the installation.
o Choose "Exit Installation" if you would like to 
  attempt recovery from the LSM failure.  You will 
  be placed in single-user mode at the UNIX
  shell  with superuser privileges.  Refer to the 
  Logical Storage Manager documentation for any 
  additional information regarding the error.
 
  1) Continue
  2) Exit Installation
 
Enter your choice:

LSM cannot be selected during a Full Installation on a system where multiple hostids are found for the rootdg disk group. This problem, which is not specific to the installation environment, cannot be resolved in an automated fashion by the Full Installation process. If you want to install LSM, you must resolve the problem manually before restarting the Full Installation.

You may have to try several different methods to fix the problem:

  1. The first way to resolve the problem is to boot the existing system to determine the current state of LSM and remove the invalid hostids. On systems where multiple hostids exist, messages similar to the following are displayed when LSM is initialized during system boot:

    starting LSM in boot mode
    lsm:vold: WARNING: Disk dsk1d names group rootdg, but group ID differs
    lsm:vold: WARNING: Disk dsk2d names group rootdg, but group ID differs
    lsm:vold: WARNING: Disk dsk4h names group rootdg, but group ID differs
    lsm:vold: WARNING: Disk dsk5h names group rootdg, but group ID differs
    lsm:vold: WARNING: Disk dsk6h names group rootdg, but group ID differs

    In this example, the system has five LSM private regions located on partitions dsk1d, dsk2d, dsk4h, dsk5h, and dsk6h.

    Once the system is running, use the voldisk -s list command to view detailed information on each disk under LSM control. This information will include the hostid for each private region listed. Once the erroneous hostids have been identified, remove the private region containing these hostids and restart the Full Installation process. See Section G.1.1.1 for more information about the commands that can be used to interrogate the existing LSM configuration and how to manually remove sources of problems.

    Caution

LSM private regions contain information that is critical to the existing LSM configuration. Removing LSM private regions should be performed with the utmost care and only by someone who understands both LSM and the details of the existing configuration. If you are unsure about performing this task, ask your system administrator for assistance or refer to the Logical Storage Manager guide.

  2. If the system is in a state where it cannot be booted, or if the system boots but LSM cannot be enabled because the existing LSM configuration is corrupted, the problem will have to be rectified by physically removing the disk that contains the erroneous private region. In this case, a working knowledge of the system itself and of the activities recently performed on it will help. There is a good chance that the erroneous private region exists on a disk that was recently added to the system from another system. For example, an administrator might have swapped disks between systems without realizing that existing LSM information from the previous system was left on the disk.

  3. If removing suspect disks fails to rectify the problem because the proper disk cannot be identified, your only recourse is to completely remove the existing LSM configuration from the system before restarting the Full Installation. This can be performed by booting the distribution media, exiting to the UNIX shell, and using the disklabel -z command to zero out the disk label of every disk on the system that contains an LSM private region.

    You can determine which disks contain an LSM private region by analyzing the partition fstype values from the disk label of each disk, as shown in the example after this list. Refer to the disklabel(8) reference page for more information regarding LSM fstype values. If you cannot determine which disks contain an LSM private region, zero out the disk label of every disk on the system. If zeroing out the disk label of every disk on the system is not feasible, then you can cycle through a process of zeroing out the disk label of a single suspect disk and then restarting the Full Installation process until LSM can be selected. See Section G.1.1.1 for more information about using the disklabel -z command to remove an existing LSM configuration.
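    For example, to display the disk label of a disk with the illustrative name dsk2 and examine its partition fstype values:

    # disklabel -r dsk2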

G.1.1.3    Restarting LSM

If the LSM daemons vold and voliod fail to restart when your system is rebooted, or if the LSM configuration database is corrupted, the LSM volume on which the / file system exists will not be accessible, and your system cannot be brought up to multiuser mode. Use the LSM commands in the following procedure to repair possible problems in /etc/vol/volboot or the rootdg disk group.

Use this procedure to restart LSM if it fails to start during system boot:

  1. Create LSM device special files:

    # volinstall

  2. Start the LSM configuration daemon in disable mode:

    # vold -k -r reset -m disable -x nohostid
    

  3. Initialize the /etc/vol/volboot file:

    # voldctl init
    

  4. Put vold in the enabled mode and import all LSM diskgroups:

    # voldctl enable
    

  5. Get a list of all disks known to LSM:

    # voldisk list

    Make sure that all disks have device special files in /dev/disk.

  6. Execute the volprint command to obtain information about the LSM configuration:

    # volprint -htA

  7. Start the LSM volumes:

    # volume -g diskgroup start volume_name

    The value of the diskgroup parameter is most likely rootdg, which represents the system disk.
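    For example, assuming the default names, the following command starts the root volume:

    # volume -g rootdg start rootvol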

  8. To rectify problems in a file, the volume needs to be mounted. For example, the / file system may have to be mounted to fix a file such as /etc/vol/volboot or /etc/inittab.

    If the / file system was using AdvFS as the file system type, enter commands similar to the following to mount it:

    # mkdir -p /etc/fdmns/root_domain
    # cd /etc/fdmns/root_domain
    # ln -s /dev/vol/rootdg/rootvol rootvol
    # mount root_domain#root /mnt
    

    If the / file system was using UFS as the file system type, the LSM volume rootvol is mounted as follows:

    # fsck /dev/rvol/rootdg/rootvol
    # mount /dev/vol/rootdg/rootvol /mnt
    

Refer to the Logical Storage Manager guide for more information about how to correct problems encountered while enabling LSM or starting LSM volumes.

G.1.2    Configuration Description File (CDF) Validation Errors

The following message is displayed when the installation procedure encounters an error while validating an install.cdf file before beginning a cloned installation:

The Configuration Description File (CDF) validation procedure has found the following errors: <List of specific errors>

This error causes the Full Installation to stop. The list of CDF validation errors will include one or more messages that discuss the errors encountered in the CDF.

The corrective action depends on the validation errors returned from the install procedure. If you are performing the installation from a RIS server, you should confirm with your RIS server administrator that your system is registered for the proper CDF. To continue with the cloned installation, the RIS server administrator must either reregister the system with the correct CDF or correct the current CDF based upon the validation error messages.

After you have corrected the problem, restart the Full Installation by entering the restart command or by rebooting the installation media.

The error message is saved in the /var/tmp/install.log file for your reference until you reboot this system.

G.1.3    Software Subset Load Errors

The software load procedure can fail for a number of reasons, including software inventory problems resulting from a corrupted distribution media, network errors during a RIS install, and CD-ROM read errors during a CD-ROM install. To handle potential problems, the software load procedure makes two attempts to load software. If the initial attempt fails, a second attempt is made to load the specific software subsets that were not loaded during the first attempt.

The installation procedure was not able to correctly install the product_name software subsets. This may be the result of a corrupted distribution. Another attempt will be made to install this software.

This message is displayed if one or more subsets fail during the first load attempt. The subset load procedure will then attempt to reload the failed subsets. Check the /var/adm/smlogs/install.log file for more information.

The installation procedure successfully installed the mandatory software subsets. One or more optional subsets did not install correctly. This may be the result of a corrupted distribution. The installation will continue.

This message is displayed if an optional subset fails to load after two attempts. The corrective action depends on the software load errors returned from the install procedure. You can find more information about the problem when the installation has completed. Use the more command to view the contents of the /var/adm/smlogs/fverify.log and /var/adm/smlogs/setld.log log files to review the software load errors. Once the problem has been resolved, use the setld utility to load the failed subsets. Refer to Chapter 9 for more information about installing optional subsets after an installation.
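For example, assuming the distribution is mounted at /mnt with the subsets in the ALPHA/BASE directory, and the failed subset is OSFINET505 (the mount point, directory, and subset name are all illustrative; substitute the actual location of your distribution and the name of the failed subset):

# setld -l /mnt/ALPHA/BASE OSFINET505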

The installation procedure was not able to correctly install the mandatory software subsets. This may be the result of a corrupted distribution. This error is fatal, and causes the installation procedure to stop. Additional information regarding this error can be found in the following log files: /var/tmp/install.log and /var/tmp/fverify.log

This message is displayed if a mandatory subset fails to load after two attempts. A failed mandatory subset load is a fatal error and the installation process will not be able to continue until the problem has been resolved. Use the more command to view the contents of the /var/tmp/install.log and /var/tmp/fverify.log log files to review the software load errors. The files are written in the /var memory file system and will be available only until the installation is restarted or the system is rebooted.

G.2    Update Installation Error Messages

The following sections describe error messages that you might encounter during an Update Installation.

G.2.1    Update Installation Startup

The following error messages may display after you invoke the installupdate command, but before the Update Installation interface (graphical or text-based) is displayed.

You must have superuser privileges to run installupdate. You must be the user root to run the update install.

The /sbin/installupdate command must be run from the root login in single-user mode. Shut down the system to single-user mode and restart the Update Installation.

*** WARNING: Incorrect system state detected. *** Please shut down system to single user mode before attempting an update installation.

The system must be in single-user mode in order for the update to run. Single-user mode can be reached by using the shutdown command. Refer to shutdown(8) for more information.
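For example, the following command brings the system to single-user mode immediately (see shutdown(8) for additional options, such as scheduling the shutdown):

# shutdown now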

G.2.1.1    CD-ROM Update

The errors described in this section may occur during an Update Installation from CD-ROM.

Please specify a block-special device file.

The argument to the installupdate command must be a block-special file. Block special file names in systems running operating system versions earlier than Version 5.0A begin with rz.
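For example, on a system using the current device naming scheme, the command might look like the following (the CD-ROM device name is illustrative; substitute the block-special device for your distribution media):

# /sbin/installupdate /dev/disk/cdrom0c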

<mount point> is an invalid update installation mount point.

The Update Installation could not find the installation information at the specified location either because the location is incorrect or the media at that location is not the operating system media.

Cannot locate update information on /updmnt

The Update Installation script could not be found in the default location, /updmnt, because the media or the media location is invalid or incorrect.

Update installation mount point already mounted: <mount-point search listing> Please unmount /updmnt manually.

The Update Installation detected that there is already something at the Update Installation mount point. Enter the following commands:

# cd /
# umount /updmnt

Reenter the installupdate command.

Could not unmount: <umount attempt error listing> Please unmount /updmnt manually.

If for any reason you exit the Update Installation before completion, the Update Installation may be unable to unmount the distribution media properly. Enter the following commands:


# cd /
# umount /updmnt

Cannot mount <media location supplied by user> on /updmnt. Check with the system manager of your host server.

The distribution media location given as an argument to installupdate is incorrect. Check the location and retry installupdate.

G.2.1.2    Remote Installation Services (RIS) Update

The errors described in this section may occur during an Update Installation from a RIS server.

Cannot find <RIS client> in risdb file. Check with the system manager of <RIS server>.

The client system is not registered on the RIS server. Register the client machine on the RIS server, then restart the Update Installation.

Could not retrieve RIS area information on <RIS_server>. Exiting procedure...

Either the Update Installation could not start the network, or the targeted RIS area is no longer accessible. In the first case, check the client machine's network settings. In the second case, check with your RIS administrator.

Error starting <inet route gateway> Cannot continue with update installation.

The network daemons could not start and the distribution media was specified as a network file system (NFS) or RIS. Check to ensure that the network is accessible to the system you want to update, then restart the Update Installation.

G.2.2    Analysis Phase Error Messages

The following messages can display during the Update Installation analysis phases.

Cannot create directory /var/tmp/update/risupdinfo

The Update Installation makes local copies of the subset inventory files in the /var/tmp/update/risupdinfo directory to improve performance. This error may indicate that the file system is 99% or more full and that the update process could not create the directory. Perform disk space recovery procedures, such as deleting core files, extra kernel files, and all other unnecessary files, to free up some disk space.
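For example, the following command reports the capacity of the file system containing the named directory (the directory shown is illustrative):

# df -k /var/tmp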

Cannot locate the product mapping file <RIS product directory/rp_mapping> on the RIS server <RIS server name>. Check with the system manager of the RIS server.

The rp_mapping file maps a product name to a mount point. Without this file Update Installation cannot find the product for which it is registered. This message usually indicates a corrupt RIS area.

<Operating system version> is currently installed on this system.

This message indicates that you are attempting to update the operating system to the exact version of the operating system that is already installed. If you receive this message, you are given the option to continue the update or exit.

You must have one of the following products loaded on your system before you can update to <new operating system version>. <List of products> Please refer to the Installation Guide for additional information on these releases.

The Update Installation cannot update your current version of the operating system to the new version. To get to the new version, you may need to perform successive updates or perform a Full Installation.

No installable subsets for this system found.

The Update Installation could not find installable software subsets on the distribution media, which is most likely corrupted. If you are using a CD-ROM, contact your support representative for another one. If you are using a RIS server, notify the administrator, who will have to create another RIS environment for the product you want to install.

Error: Unable to retrieve the /tmp/updinfo file

This file contains data used by the install scripts. Bring your system back to multiuser mode, and restart the Update Installation process.

A non-zero return status <actual return code> has been detected after execution of the <user supplied script name> program. This is fatal and causes the Update Installation procedure to stop.

If you supplied a script within an update_preinstall or update_postload file, this message indicates that the script failed with a value other than 0 (zero). Check your script, and retry the Update Installation.

Error opening file: updpblock.dat

The updpblock.dat file contains the three-letter prefixes of blocking layered products. This error results from corrupted distribution media. If you are using a CD-ROM, contact your support representative for another one. If you are using a RIS server, notify the administrator, who will have to create another RIS environment for the product you want to install.

The following errors may occur if your system does not have enough disk space to complete an Update Installation:

The Update Installation cannot save your kernel option selections. The kernel will be built with all mandatory and all optional kernel components. Use the /usr/sbin/doconfig command to select the desired optional kernel components and rebuild the kernel after the Update Installation completes.

There is not enough space to save your optional component selections. Use the doconfig command after the installation is complete to build optional components into the kernel.

An error occurred when attempting to write to the INIT_OPS_FILE. This is most likely due to insufficient disk space in the root file system

Check the / (root) file system for core files, extra kernel files, and any unnecessary files that are taking up disk space and remove them to free up space. Perform any other disk space recovery steps that are appropriate for your site. When you have freed up space, restart the Update Installation.

An error occurred while trying to open the input or output files in the call to updmore.

This error message displays only in the text-based interface and indicates that either the input file (the list of available selections) or the output file (the list of user selections) could not be opened.

G.3    Software Configuration Error Messages

Software configuration occurs after system reboot and is a process common to both Full and Update Installation. The following error messages can display during software configuration.

c_install: Cannot find /sbin/it.d/data/cinst.data

The data file that contains the list of installed software subsets to be configured could not be found. This file is written by the setld program and indicates a possible corruption of the setld program or the system disk on which the software was installed, or that the system disk has run out of space. This error is fatal.

dn_fix: name database [/etc/dfsc.dat] does not exist
dn_fix: name database [/etc/dfsl.dat] does not exist

The device file status [cluster, local] files, which contain hardware device attributes, including the mapping of old-style device names (for example, rz*) to new-style device names (for example, dsk*), could not be found. These files are created dynamically by the dsfmgr program; their absence indicates a possible corruption of the dsfmgr program or the system disk on which the software was installed, or that the system disk has run out of space.

dn_fix: cannot create copy of <file> for recovery purposes

The dn_fix program was not able to successfully create backup copies of the listed file prior to performing old-to-new device name conversions on the devices contained in this file. This may be the result of a corrupted system disk, or the system disk may have run out of space.

dn_fix: cannot create copy of <directory> for recovery purposes

The dn_fix program was not able to successfully create a backup copy of the listed directory prior to performing old-to-new device name conversions on the devices contained in this directory. This may be the result of a corrupted system disk, or the system disk may have run out of space.

An error was encountered when trying to convert from the old device names to the new device names in file: <file_name>. No modifications have been made to the file. If you wish to use the new device names the file will have to be converted manually. The sed conversion script has been saved as: <file_name>.CNVTsed The failed conversion file has been saved as: <file_name>.CNVTfail The error which occurred was: <specific_error>

The dn_fix program could not successfully convert all the references to old device names in the listed file to the new device names. The error that occurred is listed for reference. This error is not fatal, and the old device names will continue to be used. After the installation process completes, the conversion of the failing file should be reattempted manually.

An error was encountered when trying to convert from the old device names to the new device names in directory: <directory_name>. No modifications have been made to this directory. If you wish to use the new device names the directory will have to be converted manually. The failed conversion file has been saved as: <file_name>.CNVTfail The error which occurred was: <error>

The dn_fix program could not successfully convert all the references to old device names in the listed directory to the new device names. The error that occurred is listed for reference. This error is not fatal, and the old device names contained in this directory will continue to be used. After the installation process completes, the conversion of the failing directory should be reattempted manually.

The merge routines which have failed are logged in /var/adm/smlogs/dnconvert.FailCNVT. To re-execute any of these routines, change directory to <script_location> and re-execute all the routines in this directory.

This message is output upon any failure of a conversion script to update the old device names to the new device names. The /var/adm/smlogs/dnconvert.FailCNVT file contains a list of the conversion scripts that failed; these should be reattempted manually after the installation procedure completes and the source of the failure has been identified.

The version switch from "active-version" to "new-version" could not be completed successfully. This error is not fatal, and the installation will continue. The following message was received from /usr/sbin/versw -switch: <details of error>

The command /usr/sbin/versw -switch, which sets the version identifier for the newly installed version of the operating system, has failed. The text of the failure message is included. After the installation completes and the source of the failure is identified, this command should be reattempted manually.
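For example, once the installation completes and the underlying failure has been resolved, reattempt the switch by entering:

# /usr/sbin/versw -switch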

Failed to configure one or more mailserver files. Please verify that the 'pop' and 'imap' user accounts have been set up in either the local or NIS password database and run this command again to complete the installation phase of the mailserver setup.

This message indicates a failure in the attempt to update the modes, permissions, and inventory entries of POP and IMAP specific files. These files are shipped in the optional Additional Networking Services (OSFINET505) subset. This error is not fatal, and the following command should be reattempted after the installation process completes and you log in for the first time:

# setld -c OSFINET505 MAILSERVERSETUP