Recovering from an Event ID 1034 on a server cluster (280425)



The information in this article applies to:

  • Microsoft Windows Server 2003, Enterprise Edition
  • Microsoft Windows Server 2003, Datacenter Edition
  • Microsoft Windows 2000 Datacenter Server
  • Microsoft Windows 2000 Advanced Server

This article was previously published under Q280425

SYMPTOMS

A physical disk resource may fail to come online, or the Cluster service may fail to start. The following message is logged in the system event log:

Event ID: 1034
Source: ClusDisk
Description: The disk associated with cluster disk resource DriveLetter could not be found. The expected signature of the disk was DiskSignature.
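
The description carries the two values that the resolution steps rely on: the resource name and the expected signature. The following is a minimal Python sketch of pulling them out of the description text. The pattern is based only on the wording quoted in this article; the function name parse_event_1034 is hypothetical, and the optional single quotes around the resource name are an assumption.

```python
import re

# Pattern derived from the description text quoted in this article.
# Assumption: the resource name may or may not be wrapped in single quotes.
EVENT_1034 = re.compile(
    r"cluster disk resource\s+'?(?P<resource>.+?)'?\s+could not be found\."
    r"\s+The expected signature of the disk was\s+(?P<signature>[0-9A-Fa-f]{1,8})"
)


def parse_event_1034(description):
    """Return (resource, signature) from an Event ID 1034 description.

    The signature is returned as 8 uppercase hexadecimal digits.
    Returns None if the text does not look like the quoted message.
    """
    match = EVENT_1034.search(description)
    if match is None:
        return None
    signature = match.group("signature").upper().zfill(8)
    return match.group("resource"), signature


if __name__ == "__main__":
    text = (
        "The disk associated with cluster disk resource 'Disk Q:' could not "
        "be found. The expected signature of the disk was 12345678."
    )
    print(parse_event_1034(text))
```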

CAUSE

These issues typically occur if either of the following conditions is true:
  • A disk has become unavailable or inaccessible, and therefore, the Cluster service cannot find it.
  • The signature on the disk has been changed.
The Cluster service recognizes and identifies disks by their disk signatures. Disk signatures are stored on the physical disk in the master boot record (MBR). The Cluster service keeps a record of the signatures of all the disks that it manages and uses those signatures to track the disks. During Cluster service operations (start, restart, failover, and so forth), if the Cluster service cannot find a disk that has a particular signature, it cannot bring that disk online. The cluster component that detects this condition and logs the error is the cluster disk filter driver (Clusdisk.sys). The error message identifies the "missing disk" but does not indicate why the condition occurred.
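
To illustrate where the signature lives on the disk, the following Python sketch extracts it from a raw copy of the first sector. In a classic MBR the signature is the four-byte little-endian value at offset 0x1B8. The function name mbr_disk_signature is hypothetical, and the sketch assumes that you already have the 512-byte sector as bytes; it does not read from a physical disk, and it is not how Clusdisk.sys is implemented.

```python
import struct

MBR_SIZE = 512
SIGNATURE_OFFSET = 0x1B8  # four-byte disk signature in a classic MBR
BOOT_MARKER = b"\x55\xaa"  # last two bytes of a valid MBR sector


def mbr_disk_signature(sector):
    """Return the disk signature of an MBR sector as 8 hex digits."""
    if len(sector) < MBR_SIZE:
        raise ValueError("need the full 512-byte first sector")
    if sector[510:512] != BOOT_MARKER:
        raise ValueError("missing 0x55AA marker; sector is not an MBR")
    (value,) = struct.unpack_from("<I", sector, SIGNATURE_OFFSET)
    return f"{value:08X}"


if __name__ == "__main__":
    # Synthetic sector for demonstration only.
    sector = bytearray(MBR_SIZE)
    sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 4] = bytes.fromhex("78563412")
    sector[510:512] = BOOT_MARKER
    print(mbr_disk_signature(bytes(sector)))  # prints 12345678
```

Because the value is stored little-endian, the bytes 78 56 34 12 on disk correspond to the signature 12345678.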

RESOLUTION

To resolve this problem, follow these steps:
  1. Make sure that the disk is actually exposed through the shared interconnects and is visible to the operating system. To do this:
    1. Click Start, click Run, type CompMgmt.msc, and then click OK.
    2. In Computer Management, under System Tools, click Device Manager, and then expand Disk drives to view all the disks that are being presented to the node.

      All nodes in a cluster should see the same number of disk drives for disks that are managed by the cluster. For example, if there are 10 disks that are managed by the cluster, all 10 should be visible to every node in the cluster. If you know the Target ID and LUN of the disk, you can validate them by clicking Properties for each disk.
    If the count does not match, the disk is not accessible to that node. Troubleshoot your storage solution to make sure that the disk is accessible and can be mounted by the operating system. When the storage solution is functioning correctly, you can rescan the bus by right-clicking Disk drives in Device Manager.

    If the count does match, and if the Cluster service is up and running, reduce the complexity, if possible, by moving all the disk resources (the groups that host the resources) to a single node. If the Cluster service has failed, shut down all nodes and restart one node.
  2. If the disk signatures have changed, use Dumpcfg.exe to write the expected signature back to the disk.

    The signatures of the disks as enumerated by Dumpcfg.exe should match the list that is derived from the following registry subkey:

    HKLM\System\CurrentControlSet\Services\Clusdisk\Parameters

    Clusdisk uses this information to bind to disks that are managed by the Cluster service.
  3. If the signatures that Dumpcfg.exe lists do not match the signatures in the registry subkey, you must correctly identify the disks that have had their signatures changed and reset them to the expected signatures. To do this:
    1. Power down all but one node.
    2. Document the disk number:
      1. Open Computer Management, double-click Storage, and then click Disk Management.
      2. In Logical Disk Manager, note the disk number and label that are associated with the failing disk. This information is to the left of the partition information. For example: Disk 0.
      Compare the information that is displayed with the message in the "Description" section of the Event ID 1034.

      For example: "The disk associated with cluster disk resource 'Disk Q:\'". The disk label should not change even if the signature has changed, so the label helps you correctly identify the problem disk. After the disk has been correctly identified, you can check its signature again to validate the mismatch.
    3. If you cannot see the disks in DiskMgmt.msc, set the Cluster service and Cluster Disk device to Disabled, and then restart the node (all other nodes should remain shut down). To do this, follow these steps:

      NOTE: This step may not be necessary.
      1. Click Start, point to Programs, point to Administrative Tools, and then click Computer Management.
      2. Click Device Manager in the left pane, and then click Show Hidden Devices on the View menu.
      3. In the right pane, expand the Non-Plug and Play Drivers section, and then double-click the Clusdisk driver.
      4. On the Driver tab, change the Startup type option from System to Disabled.
      5. In the left pane, double-click "Services and Applications", and then click "Services".
      6. In the right pane, double-click the Cluster service, and then click Disabled in the Startup type box.
      7. Restart the node, and then repeat step 2 if necessary.
    4. Write the signature that the Cluster service expects to the disk:
      1. Obtain the expected signature from the "Description" section of the Event ID 1034 error message. For example: "The expected signature of the disk was 12345678."
      2. Copy DumpCfg.exe from the Windows 2000 Resource Kit to the local node. At the command prompt, type dumpcfg.exe. Under the [DISKS] section, the disk number and signature for all available disks are displayed. Compare the actual disk signature with the signature that the Cluster service expects.
      3. Write the expected signature to the disk by using the following command, where 12345678 is the disk signature in hexadecimal, and 0 is the disk number of the affected disk (which was obtained from the previous step):

        dumpcfg.exe -s 12345678 0

        For more information about using Dumpcfg.exe, type dumpcfg /? at the command prompt.
    5. Set the Cluster service back to Automatic, and set the Cluster Disk device back to System on the node. Start the Cluster Disk device, and then start the Cluster service.
    6. Open Cluster Administrator, and then bring the disk online.
    7. Turn on all other nodes, one at a time, and then test failover.
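
The comparison in steps 2 through 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not a replacement for the procedure: the expected signatures (from the registry subkey or the event description) and the per-disk signatures (from the [DISKS] section of the Dumpcfg.exe output) are typed in by hand, and the function names normalize, signature_report, and dumpcfg_command are hypothetical. The sketch only prints the command line in the form shown in step 4; it does not run Dumpcfg.exe and does not decide which disk receives which signature.

```python
def normalize(signature):
    """Return a signature as 8 uppercase hexadecimal digits."""
    return f"{int(signature.strip(), 16):08X}"


def signature_report(expected, found_by_disk):
    """Compare expected signatures with the signatures found on disks.

    expected:      iterable of signatures that the Cluster service expects
    found_by_disk: dict that maps disk number to the signature on that disk

    Returns (missing, unexpected):
      missing    - expected signatures not present on any disk
      unexpected - {disk number: signature} for disks whose signature
                   is not in the expected list
    """
    expected_set = {normalize(s) for s in expected}
    found = {disk: normalize(s) for disk, s in found_by_disk.items()}
    missing = sorted(expected_set - set(found.values()))
    unexpected = {
        disk: sig for disk, sig in sorted(found.items())
        if sig not in expected_set
    }
    return missing, unexpected


def dumpcfg_command(signature, disk_number):
    """Build the command line used in step 4 of this article."""
    return f"dumpcfg.exe -s {normalize(signature)} {int(disk_number)}"


if __name__ == "__main__":
    # Example values only; substitute the values from your own node.
    expected = ["12345678", "9abcdef0"]
    found = {0: "0A0B0C0D", 1: "9ABCDEF0", 2: "DEADBEEF"}
    missing, unexpected = signature_report(expected, found)
    print("Expected but not found:", missing)
    print("Disks with other signatures:", unexpected)
    # After identifying the problem disk by its label (here, disk 2):
    print(dumpcfg_command("12345678", 2))
```

Disks in the second list are not necessarily faulty. Local disks that are not managed by the cluster, such as the system disk, also appear there, which is why the disk label is used to confirm the problem disk before a signature is written.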

MORE INFORMATION

If you are having problems bringing disks online on a Windows NT 4.0 cluster, click the following article number to view the article in the Microsoft Knowledge Base:

243195 Event ID 1034 for MSCS Shared Disk After Disk Replacement



Malfunctioning multi-path software is a common cause of disk signature changes. For Windows 2000 clusters, contact Microsoft support to obtain the hotfix that is described in the following Microsoft Knowledge Base article:

293778 Multiple-path software may cause disk signature to change


Modification Type: Minor
Last Reviewed: 3/24/2005
Keywords:kberrmsg kbprb KB280425