XADM: Hot Split Snapshot Backups of Exchange (311898)



The information in this article applies to:

  • Microsoft Exchange 2000 Server
  • Microsoft Exchange Server 5.5

This article was previously published under Q311898

SUMMARY

This article provides information about vendor "hot split" snapshot backup solutions for Microsoft Exchange.

MORE INFORMATION

General Information

The Microsoft Exchange backup application programming interface (API) allows backups of running Exchange databases. Backing up running Exchange databases avoids service interruptions. Many third-party backup software vendors implement this API by using add-on software agents or modules. For example, the Microsoft Windows Backup utility is automatically extended to support Exchange online backups when Exchange modules are installed on a computer.

Some commercial Exchange backup solutions deliberately bypass the Exchange backup API that Microsoft provides. Such solutions back up the database files by using other methods. These backups are commonly known as "snapshot" backups. The primary advantage of these solutions, compared to using the online backup API, is speed of restoration. Typically, snapshot backup solutions depend on specialized hardware that can both back up and restore very large files very quickly.

Most vendors of snapshot backup solutions shut down Exchange databases before the solutions back up the Exchange databases. This shutdown brings database files to a consistent and steady state. This shutdown also requires a short interruption of service. To avoid even this short service outage, a "hot split" snapshot backup makes a file copy of a database while it is running.

Compared to a snapshot of the database that is taken after the database is stopped, a hot split backup introduces the following complexities:
  • The possibility of file-level damage to the database increases. During a hot split backup, the database file is open. The database file is changing from moment to moment at the time that the backup is taken. It is more complicated and risky to take a valid snapshot of the database file when the file is open than it is if the file is closed and stable.
  • After the database is restored, recovery is required to restart the database. Exchange soft recovery is a robust crash recovery mechanism. After an unmanaged shutdown of the database, soft recovery "replays" transaction logs. Transaction log replay brings the database into the same state that the database would be in after a typical shutdown (a consistent state). An inconsistent database file must go through successful soft recovery before it can be started again.

    A hot split backup captures the database in the same state as a database that is suddenly stopped without going through a typical shutdown. Recovery must be completed successfully after such a backup is restored. The vendor must preserve all of the required transaction logs. The vendor must also meet all of the other conditions that are required for a successful soft recovery. If soft recovery does not work, your only other option to make the database startable is to use the Eseutil utility to repair the database.

    NOTE: If a backup is restored by using the Exchange online backup API, that restoration also requires recovery of the database. This process is known as "hard recovery" because it includes more capabilities than soft recovery provides.

    Hard recovery integrates .pat (patch) file data into the database. Hard recovery can recover transaction logs, even if the databases were moved to different file paths after the backup was taken. Soft recovery cannot take into account path changes. For soft recovery of a hot split backup, the databases must be restored to the same file system paths that the databases were backed up from.

The Supportability of Hot Split Backup Solutions and Snapshot Backup Solutions

Microsoft recommends that you use a program that implements the Microsoft online backup API to back up and restore Exchange data for the following reasons:
  • The API ensures that all of the relevant files are backed up.
  • The API implements several checks and safeguards during backup and restoration.
  • The API has been thoroughly tested.
This does not mean that you cannot perform hot split and other snapshot backups safely and successfully. However, the backup solution vendor or Exchange administrator is responsible for making sure that all of the required files are backed up and restored correctly. The backup solution vendor or Exchange administrator is also responsible for making sure that the integrity of all of the files is preserved at each stage of the process.

If you implement a snapshot or hot split backup solution for Exchange, your vendor is your primary support provider for backup and recovery issues. Microsoft Product Support Services (PSS) can help you diagnose or analyze issues with the available database and transaction log file sets. However, Microsoft does not test, certify, or debug scripts and procedures for hot split backups or snapshot backups of Exchange data files. Microsoft PSS assistance is limited to advice about how best to proceed with recovery of the available file set.

Information for Troubleshooters and Support Personnel

This article (Q311898) is intended for readers who have a good understanding of how recovery and transaction log file replay work with offline snapshot backups. This article covers only the issues and differences that are introduced by trying to recover the database when it is not already consistent. The information in this article assumes that the reader is familiar with analyzing database and log file header information.

For the purposes of this article, a "soft crash" is defined as a crash that occurs when a process stops unexpectedly, but the operating system and hardware continue to function normally. A "hard crash" occurs when the operating system or the hardware suddenly becomes unavailable or fails. After a hard crash, the failure rate of the database varies widely. This failure rate depends on the hardware and the effects of the crash on the file system.

The chances that an Exchange 2000 database will not start after a soft crash are well under 1 in 1,000. The Exchange soft recovery mechanism is very reliable. Exchange soft recovery is designed to recover the database perfectly, regardless of the operation that the information store process was performing at the time that the process was suddenly stopped.

However, in two circumstances, recovery is unsuccessful:
  • The database suddenly stopped because of a logical problem in the database. That problem still exists after recovery and is likely to cause the database to stop again.
  • The crash occurred because of, or resulted in, corrupted Exchange transaction log files or databases at the file system level. Soft recovery may not work if critical pages in the database files or transaction log files are damaged.
A hot split backup is designed to capture a copy of the database in a "soft crash" state. However, the methods that are required to capture the database during a hot split backup are similar to the circumstances of a hard crash. Therefore, the success of a hot split backup solution depends on carefully designed hardware. The hot split backup vendor must create methods that reliably capture Exchange database files in a "soft crash equivalent" state, and then validate that the capture succeeded without damage to the files or file system.

Determining Which Log Files Are Required for a Soft Recovery in Exchange Server 5.5

To safely run soft recovery on a hot split backup database, you must know the exact sequence of log files that are required. If any log file in the required sequence is missing, soft recovery does not work. This section describes how to determine the first and last log files that are required for Exchange Server 5.5.

When an Exchange database is shut down before a snapshot backup, all of the outstanding operations from the transaction logs are applied to the database. The database header's Last Consistent value is also updated to reflect the current sequence number of the transaction log file. Therefore, the Last Consistent value lists the first log file that is required to recover (or roll forward) from that snapshot. You can roll forward an indefinite stream of log files from this point, as long as you start with this log as the first one.

However, if a snapshot is taken while the database is running, the Last Consistent value is out of date. The Last Consistent value reflects the value at the time that the database was last started. The Last Consistent value may even point to a log file that is several weeks or months old. This value does not reflect the actual log file that is required. However, there is no other value in the database file itself that reveals your exact log file replay starting point.

In such cases, you can use the checkpoint value that is recorded in the header of the Edb.chk file. At any given moment while the database is running, the checkpoint value always reflects the actual first log file that is required for replay. (The value that is recorded in the Edb.chk file often lags behind the actual checkpoint. However, the recorded value is always a safe and relatively recent value.)

The checkpoint value must be read and saved at some point in time before the database hot split backup occurs. If the recorded checkpoint is older than the actual checkpoint, no damage is done. However, if the recorded checkpoint is even one log file too new, the effect on soft recovery is catastrophic. The database suffers massive corruption from transaction log replay. The database becomes unstartable or unstable.

As a general rule, you do not have to restore the checkpoint file when you restore a hot split backup. Preserve the file or its header information for informational purposes only.

If you force recovery to continue even though log files are missing at the beginning of the required sequence, one of the most common errors that may occur is JET_errDiskIO (-1022, 0xfffffc02, or 4294966274). This error occurs if the size of the database was extended as one of the operations in a missing log, and a subsequent operation tries to access a database page that does not exist. Many other errors may occur during recovery or after a database is started. Which errors occur depends on the exact nature of the operations in the missing transaction logs.

You must determine both the first log file and the last log file that are required for recovery after a hot split backup. You can replay additional log files beyond the last required file, but you must replay every log file from the first required log through the last required log to complete soft recovery.
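The requirement that every log from the first through the last required generation be present can be checked mechanically before recovery is attempted. The following Python sketch is illustrative only and is not a Microsoft tool. The function name missing_generations and the sample generation numbers are hypothetical, and the sketch assumes that the generation numbers of the available log files have already been read from the log file headers.

```python
def missing_generations(available, first_required, last_required):
    """Return the required log generations that are absent from `available`.

    `available` is any iterable of integer log generation numbers.
    Generations outside the required range are ignored, because logs
    beyond the last required log are optional for soft recovery.
    """
    if first_required > last_required:
        raise ValueError("first_required must not exceed last_required")
    have = set(available)
    return [g for g in range(first_required, last_required + 1) if g not in have]


# Hypothetical file set: generation 103 was not captured.
print(missing_generations([100, 101, 102, 104, 105, 106], 101, 105))
```

An empty list means that the required sequence is complete. Any other result lists the generations that must be located before soft recovery is run.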

To determine the last log file that is required for soft recovery, you must capture a copy of the current transaction log file (the Edb.log file) some time after the hot split backup of the database file is completed. If you capture a version of Edb.log that is even one second older than the database, soft recovery does not work. In most cases, the error JET_errDatabaseInconsistent (-550, 0xfffffdda, or 4294966746) occurs.
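The error codes in this section are each quoted in three forms: a signed decimal value, a 32-bit hexadecimal value, and an unsigned decimal value. The following Python sketch shows how the three forms relate, which can help when a log or event entry shows only one of them. The helper name jet_error_forms is hypothetical; the two codes are the ones that are quoted in this article.

```python
def jet_error_forms(code):
    """Return (signed decimal, 32-bit hex string, unsigned decimal) for a code."""
    # Masking with 0xFFFFFFFF gives the 32-bit two's-complement view
    # of a negative value.
    unsigned = code & 0xFFFFFFFF
    return code, f"0x{unsigned:08x}", unsigned


for name, code in [("JET_errDiskIO", -1022),
                   ("JET_errDatabaseInconsistent", -550)]:
    print(name, *jet_error_forms(code))
```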

Determining Which Log Files Are Required for a Soft Recovery in Exchange 2000

In Exchange 2000, you can determine which log files are required to complete recovery from the Log Required value range that is recorded in the database header. An example of this value is:

Log Required: 2341-2345

In this example, the log files with decimal sequence numbers from 2341 through 2345 are required. Log file names in Exchange Server 5.5 have an EDB prefix, but log files in Exchange 2000 have an E00, E01, E02, or E03 prefix. The Exchange 2000 log file prefix depends on which of the four possible storage groups the logs belong to. Log files are also numbered in hexadecimal format, so you must convert the Log Required values from decimal to hexadecimal.

The Log Required value does not record the log prefix. You must also match the Log Signature value that is recorded in the database header with the log signature that is recorded in each log file header. This confirms that the logs match the database. Merely matching the sequence numbers is not sufficient.

In this example, the database is from the second storage group on the server. Therefore, the actual log files that are required are E0100925.log through E0100929.log.
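The decimal-to-hexadecimal conversion in this example can be reproduced with a short Python sketch. This is an illustration only, not a Microsoft tool. The function name required_log_files is hypothetical, and the sketch assumes a five-digit hexadecimal sequence number written in uppercase after the storage group prefix, which is consistent with the file names in this example. The caller supplies the prefix because the database header does not record it.

```python
import re


def required_log_files(header_line, prefix):
    """Expand a 'Log Required: A-B' header line into log file names.

    `prefix` is the storage group log prefix (for example "E01").
    """
    match = re.fullmatch(r"\s*Log Required:\s*(\d+)-(\d+)\s*", header_line)
    if match is None:
        raise ValueError(f"unrecognized header line: {header_line!r}")
    first, last = int(match.group(1)), int(match.group(2))
    if first > last:
        raise ValueError("Log Required range is reversed")
    # Assumption: five uppercase hexadecimal digits follow the prefix.
    return [f"{prefix}{gen:05X}.log" for gen in range(first, last + 1)]


print(required_log_files("Log Required: 2341-2345", "E01"))
```

For the header value in this example, the sketch lists E0100925.log through E0100929.log. The file names alone do not prove that the logs belong to the database, so the log signatures must still be compared.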

In Exchange 2000, there are new safeguards to prevent playing an incomplete set of log files into the database. If you make a mistake and you do not actually have all of the log files that are required, recovery stops immediately before log file replay begins.

Renaming Transaction Log Files

If you backed up the current log file after you took a hot split backup of the database, the current log file is probably still named Edb.log (for Exchange Server 5.5) or E0n.log (for Exchange 2000). After the current Exchange transaction log is filled, it is closed and renamed with a numeric sequence number that corresponds to the lGeneration value in its header. If you captured the current transaction log before it was renamed, you captured a copy of this file before it reached its final state.

IMPORTANT: As a general rule, never rename Edb.log (for Exchange Server 5.5) or E0n.log (for Exchange 2000).

If you want to play logs forward past these files, you must instead locate a copy of the log file that Exchange has already renamed with its final numeric specifier.

However, there are some circumstances in which you can rename a numeric log to Edb.log or E0n.log. You can rename a numbered transaction log file to Edb.log or E0n.log if the log file in question was created after the minimum log sequence that is required for soft recovery.

For soft recovery to run, the sequence of available logs must end with a log named Edb.log or E0n.log. If the most current log that was generated by the database is unavailable, you can rename the highest-numbered log file that you have, subject to the rule in the preceding paragraph.

If all of the required log files are present, soft recovery of a hot split backup can continue according to the same rules that apply to offline snapshot backups. If all of the required and appropriate log files are available, the recovery process makes no distinction based on whether the current database file is consistent or inconsistent. Recovery applies all of the transactions that are not already in the database to bring the database up to date as far as possible.


Modification Type: Minor
Last Reviewed: 11/21/2005
Keywords:kbinfo KB311898