Software FT Sets Are Not Supported in Microsoft Cluster Server (171052)



The information in this article applies to:

  • Microsoft Windows NT Server, Enterprise Edition 4.0
  • Microsoft Cluster Server

This article was previously published under Q171052

SUMMARY

The software fault tolerance included in Windows NT Server (FTDISK) is not supported in Microsoft Cluster Server (MSCS) 1.0 for cluster disk resources on the cluster shared SCSI bus. This restriction covers mirror sets, volume sets, and stripe sets with and without parity. FTDISK continues to be supported for local disk resources.

This includes local disk resources on Windows NT Server, Enterprise Edition servers running MSCS 1.0. For example, you could create an FTDISK RAID 5 stripe set on a server and use it for non-cluster purposes, or place an application that is not used on the cluster on an FTDISK volume. For MSCS disk resources on a shared SCSI bus, however, the only RAID supported by Microsoft is hardware-level RAID.

The two key facts about this situation are:
  • MSCS still supports RAID on all disks in a cluster, to protect your data in the event of a disk failure. However, disks on a shared SCSI bus must be protected by hardware RAID, while disks that are local to each server may be protected by either hardware or software RAID.

  • Windows NT Server software RAID is still fully supported for all disks connected to a non-clustered server. The technical reasons that prevent Microsoft from supporting software RAID on shared SCSI disks in a cluster are uniquely related to the way MSCS does server failover.
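The support rule stated in the two points above can be written as a small lookup. The following Python sketch is illustrative only: the names Bus, Raid, and is_supported are hypothetical and are not part of MSCS or any Microsoft tool.

```python
from enum import Enum


class Bus(Enum):
    LOCAL = "local"              # disk attached to a single server only
    SHARED_SCSI = "shared_scsi"  # disk on the cluster shared SCSI bus


class Raid(Enum):
    HARDWARE = "hardware"        # RAID implemented by the controller or array
    SOFTWARE = "software"        # FTDISK sets (mirror, volume, stripe)


def is_supported(bus: Bus, raid: Raid) -> bool:
    """Return True if the combination is supported under MSCS 1.0,
    as described in this article (hypothetical helper)."""
    if bus is Bus.SHARED_SCSI:
        # Shared SCSI bus disks must be protected by hardware RAID.
        return raid is Raid.HARDWARE
    # Local disks may use either hardware or software RAID.
    return True
```

For example, is_supported(Bus.SHARED_SCSI, Raid.SOFTWARE) evaluates to False, while both RAID types are accepted for Bus.LOCAL.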

MORE INFORMATION

There are two key technical reasons why FTDISK is not supported on the shared SCSI bus in Microsoft Cluster Server 1.0.

The first reason is that RAID metadata cannot be reliably recovered by MSCS in all server failover scenarios. FTDISK stores metadata about all disk members in the registry on the local machine. (The location of this information is HKEY_LOCAL_MACHINE\System\Disk.) Therefore, the only way to get to the disk metadata is to mount the file system on the disk members. This presents no difficulty on non-clustered servers because, by default, they always have access to their local storage devices.

However, within a cluster, certain failure and boot sequences can occasionally leave a computer unable to start with all of the volumes necessary for the FTDISK disk set. In such a case, a data set could be orphaned or rolled back to a previous version, because the information needed to identify the disk ownership is contained on the disk that is to be mounted. In a cluster, it is theoretically possible for server failures to result in unknown states for disks managed by the current FTDISK. The inability to safely recover the RAID disk state until the disks are already back online could also expose the disk members to data corruption, data loss, stale data, and other problems for a given data volume.

The other technical issue preventing support of the current FTDISK for shared SCSI disks in a cluster is the lack of a fully automated method of recovery from disk problems. For example, in the event of a failover, CHKDSK would need to be run on the FT volume to assess the integrity of the volume itself. At this time, there is no automatic means of doing this, leaving the responsibility of running CHKDSK to the user.
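Because CHKDSK would have to be started by hand, an administrator might wrap the command in a small script. The following Python sketch only builds the command line and does not run it. The function name build_chkdsk_command and the drive letter used in the example are hypothetical; /F is the standard CHKDSK switch for fixing errors on the disk.

```python
import re
from typing import List


def build_chkdsk_command(drive: str, fix: bool = False) -> List[str]:
    """Build a CHKDSK argument list for a drive such as "E:".

    Hypothetical helper: it only assembles the arguments and does not
    execute CHKDSK.
    """
    if not re.fullmatch(r"[A-Za-z]:", drive):
        raise ValueError(f"expected a drive letter like 'E:', got {drive!r}")
    command = ["chkdsk", drive.upper()]
    if fix:
        command.append("/F")  # ask CHKDSK to fix the errors it finds
    return command
```

For a volume at E:, build_chkdsk_command("e:", fix=True) yields the arguments chkdsk E: /F, which could then be passed to subprocess.run on a Windows server with administrative rights.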

For additional information, please see the following articles in the Microsoft Knowledge Base:

160963 CHKNTFS.EXE: What You Can Use It For

158675 How to Cancel CHKDSK After It Has Been Scheduled

The above information is only relevant to implementing software fault tolerance in Microsoft Cluster Server 1.0. The current FTDISK software RAID remains a supported, reliable, and excellent disk protection solution for Windows NT Server when running on a single server.

Modification Type: Minor
Last Reviewed: 1/5/2006
Keywords: kbinfo kbsetup KB171052