FIX: Losing DTC Service May Leave Orphaned Transactions (195542)
The information in this article applies to:
This article was previously published under Q195542
BUG #: 17882 (SQLBUG_65)
SYMPTOMS
When a connection is involved in a distributed transaction, SQL Server
relies on the transaction coordinator (Microsoft Distributed Transaction
Coordinator, or DTC) to notify it regarding transaction outcome so that it
knows whether to commit or abort the transaction. DTC uses a two phase
commit protocol among all enlisted servers and only commits if all servers
have indicated that they are prepared to commit.
Any time before the beginning of the two phase commit process, you can use
the Transact-SQL KILL statement to close a SQL Server connection that is
causing blocking or other undesirable behavior. However, after the two
phase commit protocol begins and SQL Server has indicated that it has the
transaction in a prepared state, it will no longer respond to the KILL
statement.
CAUSE
According to the two-phase commit protocol, any resource manager (that is,
SQL Server) that has successfully responded to the transaction coordinator
that it is in a prepared state must guarantee that it can successfully
commit or roll back the transaction after the transaction coordinator has
determined the transaction outcome. To do this, SQL Server will hold all
locks acquired during the transaction and will not respond to the
Transact-SQL KILL statement.
If DTC unexpectedly fails, or fails to notify SQL Server regarding the
transaction outcome, the resources used by the transaction will be tied up
until manual intervention occurs.
STATUS
Microsoft has confirmed this to be a problem in SQL Server
6.5. This problem has been corrected in U.S. Service Pack 5
for Microsoft SQL Server 6.5. For information about
how to download and install the latest SQL Server Service Pack, see
the following Microsoft Web site:
For more information, contact your primary support provider.
MORE INFORMATION
There are no automated methods to restart DTC if it fails, so you must
manually restart the service. When DTC is restarted and is recovering its
transaction log, it will detect that a transaction was pending and some
resource managers may not have been notified. It will then notify the
enlisting servers of the transaction outcome.
After SQL Server has responded that it is prepared, it will wait on
notification from DTC regarding the transaction outcome. If it has not
received any notification within 60 seconds, it will print a message in the
error log similar to the following:
98/10/20 09:15:43.76 spid11 Awaiting DTC message.
UOW: e2a10172-bda9-11d1-bf4c-00600893905d State: PREPARED
The UOW value (a Unit Of Work ID) is a unique identifier for the
transaction. This message will repeat every 60 seconds until notification
has been received from DTC. If you see this message, open the DTC
Administrative Console, or the DTC service in SQL Enterprise Manager. On
the Transaction tab, you will see a list of DTC transactions and their
current state, and you should see a transaction with a Unit of Work ID
value that matches the one in the error log message. From this console, you
can select the transaction, right-click, and force the transaction to
commit or abort.
If using the DTC Administrative Console does not alleviate the condition,
and stopping and restarting the DTC service doesn't either, the only way to
remedy the situation is to stop and restart SQL Server. During recovery,
SQL Server will interrogate DTC to determine the transaction outcome.
There may be rare circumstances where, even after restarting SQL Server,
the transaction outcome cannot be determined. This may occur if there is a
permanent network failure between the SQL Server and the commit coordinator, or it may occur in situations where a hardware failure
resulted in the loss of the DTC log file. In such circumstances, SQL Server
will log an error 3437 and stop recovery of the database:
Msg 3437: Error recovering database 'pubs' - could not connect to
the DTC to check completion status of xact: Rid pageid is
0x132f and row num is 0x11
Beginning in SQL Server 6.5 Service Pack 5, SQL Server has two enhancements
that may be useful in resolving situations such as this. First, a new
sysprocesses waittype value, 0x0411, has been added. This value can be used
to confirm that SQL Server is waiting on DTC. Second, if the above methods
have failed to resolve the problem, the Transact-SQL KILL statement has
been enhanced to allow an optional WITH COMMIT or WITH ABORT clause, as
shown in the following syntax:
KILL spid [WITH {COMMIT | ABORT}]
A warning will be written to the error log, indicating the UOW ID and the
forced transaction outcome. Take caution when using this new KILL statement
syntax, because it may result in inconsistencies among the various servers
enlisted in the transaction. This syntax should only be used when the other
methods have proved unable to resolve the issue. If the KILL ... WITH
COMMIT/ABORT syntax is used and the target spid is not waiting on DTC,
error 6108 will be raised.
If you find that this issue recurs, open a support incident to pursue the
underlying issues with SQL Server or DTC that are causing this situation.
Modification Type: | Major | Last Reviewed: | 7/19/2006 |
---|
Keywords: | kbBug kbfix KB195542 |
---|
|