Cannot Create Software Mirror If Disk Contains Bad Blocks (325615)

CAUSE

If you look in the system event log, you see event messages similar to the following, posted by DMIO, that report read or write errors from the affected disk:

Event Type:	Information
Event Source:	dmio
Event Category:	None
Event ID:	29
Computer:       Computer_name
Description:    dmio: Harddisk1 read error at block 3328: status 0xc000009c

Event Type:	Information
Event Source:	dmio
Event Category:	None
Event ID:	26
Computer:	Computer_name
Description:    dmio: Found a bad block on disk Harddisk1 at block number 3328

where the status of 0xC000009C is STATUS_DEVICE_DATA_ERROR.
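
The status value can be read field by field. The following Python sketch splits a 32-bit NTSTATUS value into its standard bit fields (severity, customer flag, facility, and code). The function name decode_ntstatus is illustrative only and is not part of any Windows tool.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    severity_names = ["SUCCESS", "INFORMATIONAL", "WARNING", "ERROR"]
    return {
        "severity": severity_names[(status >> 30) & 0x3],  # bits 31-30
        "customer": bool((status >> 29) & 0x1),            # bit 29
        "facility": (status >> 16) & 0xFFF,                # bits 27-16
        "code": status & 0xFFFF,                           # bits 15-0
    }


fields = decode_ntstatus(0xC000009C)
print(fields["severity"], hex(fields["code"]))
```

For 0xC000009C, the two high bits are set, so the severity is ERROR, the facility is 0, and the code is 0x9C.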

If you run Chkdsk against the source volume with the /F and /R switches to locate bad sectors and recover readable information, Chkdsk successfully marks those sectors (clusters) as "bad", as seen in the Bad Sectors section of the Chkdsk output.

EXAMPLE:

C:\>chkdsk D: /f /r
The type of the file system is NTFS.
Volume label is MASTER.

CHKDSK is verifying files (stage 1 of 5)...
File verification completed.
CHKDSK is verifying indexes (stage 2 of 5)...
Index verification completed.
CHKDSK is verifying security descriptors (stage 3 of 5)...
Security descriptor verification completed.
CHKDSK is verifying file data (stage 4 of 5)...
File data verification completed.
CHKDSK is verifying free space (stage 5 of 5)...
Free space verification is complete.
Adding 1 bad clusters to the Bad Clusters File.  <-- bad sectors marked.
Correcting errors in the Volume Bitmap.
Windows has made corrections to the file system.

   2047999 KB total disk space.
        20 KB in 2 files.
         4 KB in 9 indexes.
         2 KB in bad sectors.  <-- note bad sector(s)
     12851 KB in use by the system.
     12288 KB occupied by the log file.
   2035122 KB available on disk.

      2048 bytes in each allocation unit.
   1023999 total allocation units on disk.
   1017561 allocation units available on disk.

As a result of Chkdsk finding bad blocks, the following event message is also posted:

Event Type:	Warning
Event Source:	dmio
Event Category:	None
Event ID:	35
Computer:	Computer_name
Description: dmio: Disk Harddisk1 block 3168 (mountpoint D:): Uncorrectable read error

Although Chkdsk instructs the file system not to use those sectors, when you try to establish the mirror again, you still receive the same error message from Logical Disk Manager (LDM) and the same DMIO system event log messages.

When it establishes software mirrors on dynamic disks, DMIO does a sector-by-sector copy of the source disk to the destination disk. DMIO does not know or care which sectors contain data or which sectors may have been marked "bad" by Chkdsk. Chkdsk marks those bad sectors only in the file system (FAT, FAT32, or NTFS), so that the file system does not try to use them. DMIO operates below the file system, and if it finds I/O errors while reading from a sector on the source disk or while trying to write the data to the destination disk, it aborts the mirroring operation.
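
The effect of copying below the file system can be illustrated with a small model. The following Python sketch is not DMIO code. It assumes a disk represented as a list of sectors and a set of unreadable sector numbers, and all names in it are hypothetical. The file system's bad cluster list is passed in but deliberately never consulted, which is why marking clusters with Chkdsk does not help.

```python
class SectorReadError(Exception):
    """Raised when the simulated disk cannot read a sector."""


def read_sector(disk, bad_sectors, number):
    # Simulated device read: physically bad sectors always fail.
    if number in bad_sectors:
        raise SectorReadError(number)
    return disk[number]


def mirror_copy(source, source_bad_sectors, fs_bad_clusters=frozenset()):
    """Copy every sector in order and abort on the first I/O error.

    fs_bad_clusters stands for the file system's bad cluster list.
    It is intentionally ignored, because the copy runs below the
    file system and has no knowledge of it.
    """
    destination = []
    for number in range(len(source)):
        try:
            destination.append(read_sector(source, source_bad_sectors, number))
        except SectorReadError:
            return ("aborted", number)
    return ("completed", len(destination))


disk = [b"data"] * 8
print(mirror_copy(disk, {5}, fs_bad_clusters={1}))
print(mirror_copy(disk, set()))
```

In this model the first call aborts at sector 5 even though the containing cluster is marked bad in the file system, and the second call completes because no sector fails.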

MORE INFORMATION

SECTOR SPARING:

Sector sparing technology permits software to communicate directly with a disk to mark a defective sector (block) as "bad." The SCSI specification defines a command (REASSIGN BLOCKS, operation code 0x07) for reassigning bad blocks. When Windows comes across a bad disk block on a dynamic disk, Dmio.sys may call IOCTL_DISK_REASSIGN_BLOCKS for any bad sectors found.

The IOCTL_DISK_REASSIGN_BLOCKS operation maps defective blocks to a new location on the disk. This request instructs the device to reassign the bad block address to a good block from its spare-block pool, and the SCSI drive then returns a status of either "failed" or "success." If the request succeeds, the SCSI drive remaps that sector to a spare, so that whenever a read or write operation is performed on the original bad sector number (or numbers), the SCSI drive redirects the I/O to the newly remapped sector (or sectors).
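
The remapping described above can be pictured as a lookup table kept by the drive. The following Python sketch is a simplified model under stated assumptions, not drive firmware and not the Windows API. The class SparingDisk, its methods, and the spare block number 9000 are hypothetical.

```python
class SparingDisk:
    """Toy model of a drive that reassigns bad blocks to spares."""

    def __init__(self, spare_blocks):
        self.spares = list(spare_blocks)  # unused spare-block pool
        self.remap = {}                   # bad block -> spare block

    def reassign(self, bad_block):
        # Mirrors the two outcomes named in the text.
        if not self.spares:
            return "failed"
        self.remap[bad_block] = self.spares.pop(0)
        return "success"

    def resolve(self, block):
        # Later I/O to a reassigned block is redirected to its spare.
        return self.remap.get(block, block)


drive = SparingDisk(spare_blocks=[9000])
print(drive.reassign(3328), drive.resolve(3328))
print(drive.reassign(3168))
```

The second request fails in this model because the single spare has already been used, which corresponds to the "failed" status a drive returns when it cannot reassign a block.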

Although IDE drives do not support this sector sparing functionality and do not accept external software commands to tell the disk that it contains a bad block, some IDE drives do support internal remapping of sectors as they start going bad. This functionality is handled by the internal firmware on the IDE disk itself. If internal remapping is aggressive enough, the operating system (and therefore the end user) never knows that the disk has experienced bad blocks.

Although sector sparing is supported on SCSI disks, it is not performed immediately when a read error occurs. It is performed only during a write operation that follows a read failure.

After a read failure, DMIO records the offset of the bad sector in a bad sector list. The next write that hits one of the recorded bad sectors triggers IOCTL_DISK_REASSIGN_BLOCKS. A sector can stay in the bad sector list for five minutes without being written to. If no write reaches the sector within that time, the reassignment is not triggered.
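
The record-then-reassign-on-write behavior can be sketched as follows. This Python model is an illustration only. The class BadSectorList and its method names are hypothetical, times are passed in as plain seconds, and the treatment of a write that arrives at exactly five minutes is an assumption.

```python
TIMEOUT_SECONDS = 5 * 60  # five-minute lifetime of a list entry


class BadSectorList:
    """Toy model of the bad sector list described in the text."""

    def __init__(self):
        self._recorded = {}  # sector number -> time of the read failure

    def record_read_failure(self, sector, now):
        self._recorded[sector] = now

    def on_write(self, sector, now):
        """Return True if this write would trigger a reassignment."""
        recorded_at = self._recorded.pop(sector, None)
        if recorded_at is None:
            return False  # sector was never recorded as bad
        # Assumption: a write at exactly the timeout still counts.
        return now - recorded_at <= TIMEOUT_SECONDS


bad_list = BadSectorList()
bad_list.record_read_failure(3328, now=0)
print(bad_list.on_write(3328, now=60))   # write within five minutes
bad_list.record_read_failure(3168, now=0)
print(bad_list.on_write(3168, now=301))  # write after the timeout
```

In this model only the first write triggers a reassignment. The second arrives after the entry has expired, and a write to a sector that was never recorded has no effect.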

This behavior is most effective for healthy mirrors. When a read from one plex fails, the bad sector is recorded. DMIO then reads from the other plex and writes back to the first plex. The bad sector is reassigned during that write operation to maintain data integrity.
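
The healthy-mirror case can be sketched as a read that falls back to the other plex and then writes the good data back. The following Python model is a simplification under stated assumptions: each plex is a list of sectors, each plex has its own set of bad sector numbers, and a successful write-back is treated as a successful reassignment. The function name and event labels are hypothetical.

```python
def read_with_recovery(plexes, bad_sectors, sector):
    """Read a sector from a mirror and repair a plex that failed.

    plexes: list of lists, one list of sector data per plex.
    bad_sectors: list of sets, the bad sector numbers for each plex.
    """
    events = []
    failed = []
    for index, plex in enumerate(plexes):
        if sector in bad_sectors[index]:
            # Read failed: record the sector and try the next plex.
            events.append(("read_error_recorded", index, sector))
            failed.append(index)
            continue
        data = plex[sector]
        for bad_index in failed:
            # The write-back is the write that triggers reassignment.
            bad_sectors[bad_index].discard(sector)
            plexes[bad_index][sector] = data
            events.append(("reassigned_on_write_back", bad_index, sector))
        return data, events
    raise IOError("no plex could supply sector %d" % sector)


plexes = [[b"stale"] * 4, [b"good"] * 4]
bad = [{2}, set()]
data, events = read_with_recovery(plexes, bad, 2)
print(data, events)
```

After the call, both plexes hold the same data for the sector and the first plex no longer lists it as bad. During Add Mirror there is no second plex to read from, so this recovery path is not available.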

In the Add Mirror case where the bad sector is on the source disk, the bad sector number from the source disk is recorded in the bad sector list but is never written to. Therefore, it is never reassigned (unless the user finds a way to force a write to that bad sector within five minutes of the read failure), and the mirror operation is aborted.

In the Add Mirror case where the bad sector is on the destination disk, the write failure does not add the bad sector number to the bad sector list. Therefore, it is not reassigned and the mirror operation is aborted.

Therefore, even though the code dealing with bad sectors is always active, the critical part (calling IOCTL_DISK_REASSIGN_BLOCKS) occurs only when a read operation and then a write-back operation are run on an already established mirror. This current design prevents a mirror from being established if a read or write error occurs on either the source or destination disk.

ADDITIONAL NOTE:

When DMIO successfully performs sector sparing (reassigns bad blocks) during a write operation on an already established mirror, the following event messages are posted.

Event Type:	Information
Event Source:	dmio
Event Category:	None
Event ID:	23
Computer:	Computer_name
Description: dmio: Reassigning bad block number 3328 on disk Harddisk1 

Event Type:	Information
Event Source:	dmio
Event Category:	None
Event ID:	24
Computer:	Computer_name
Description: dmio: Reassign bad block(s) on disk Harddisk1 succeeded

By design, Autochk.exe and Chkdsk.exe do not perform sector sparing. These utilities only record the defective sectors (blocks) in the bad cluster tables managed by the (FAT, FAT32, or NTFS) file system on the volume. If Chkdsk finds any clusters that contain one or more bad blocks, the clusters are marked as "bad" so that the file system does not try to use those clusters.
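
Because the file system tracks clusters rather than sectors, one bad sector removes a whole cluster from use. The following Python sketch shows the arithmetic. It uses the 2048-byte allocation unit from the sample Chkdsk output, and it assumes 512-byte sectors, volume-relative sector numbers, and a cluster 0 that starts at volume sector 0. The function names and the sector numbers are hypothetical.

```python
BYTES_PER_SECTOR = 512     # assumption, not stated in the article
BYTES_PER_CLUSTER = 2048   # allocation unit size from the Chkdsk output


def cluster_for_sector(volume_sector):
    # Each cluster covers BYTES_PER_CLUSTER // BYTES_PER_SECTOR sectors.
    return (volume_sector * BYTES_PER_SECTOR) // BYTES_PER_CLUSTER


def bad_clusters(bad_sectors):
    # Several bad sectors in one cluster mark that cluster only once.
    return sorted({cluster_for_sector(s) for s in bad_sectors})


def kb_in_bad_sectors(bad_sectors):
    return len(bad_clusters(bad_sectors)) * BYTES_PER_CLUSTER // 1024


print(bad_clusters([3168, 3169]))
print(kb_in_bad_sectors([3168]))
```

Under these assumptions, a single bad sector accounts for one 2048-byte cluster, which matches the "2 KB in bad sectors" line in the sample output.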
