FIX: The internal deadlock monitor may not detect a deadlock between two or more sessions in SQL Server 2005 (915918)



The information in this article applies to:

  • Microsoft SQL Server 2005 Developer Edition
  • Microsoft SQL Server 2005 Enterprise Edition
  • Microsoft SQL Server 2005 Enterprise X64 Edition
  • Microsoft SQL Server 2005 Standard X64 Edition
  • Microsoft SQL Server 2005 Standard Edition

Bug #: 459 (SQL Hotfix)
Microsoft distributes Microsoft SQL Server 2005 fixes as one downloadable file. Because the fixes are cumulative, each new release contains all the hotfixes and all the security fixes that were included with the previous SQL Server 2005 fix release.

SUMMARY

This article describes the following about this hotfix release:
  • The issues that are fixed by the hotfix package
  • The prerequisites for applying the hotfix package
  • Whether you must restart the computer after you apply the hotfix package
  • Whether the hotfix package is replaced by any other hotfix package
  • Whether you must make any registry changes after you apply the hotfix package
  • The files that are contained in the hotfix package

SYMPTOMS

In Microsoft SQL Server 2005, a deadlock may occur between two or more sessions. In very specific, rare circumstances, the internal deadlock monitor may not detect the deadlock. The sessions and their current transactions may stay blocked until you manually intervene or until a time-out terminates one of the blocked transactions.

This undetected deadlock problem occurs only when all the following conditions are true:
  • The server is running SQL Server 2005 Service Pack 1 (SP1) or the original release version of SQL Server 2005.
  • The server has multiple processors.
  • SQL Server is configured to run queries in parallel.
  • One of the deadlocked statements runs in parallel across multiple processors.
  • Typically, the execution plan of this deadlocked statement performs a sort operation or a hash join operation.
  • The scan operation or the seek operation under this sort operation or under this hash join operation waits for a lock.
  • This lock is incompatible with a lock that is held by a separate update statement in a different session.
  • This different session does not have to be running in parallel.

RESOLUTION

Hotfix information

A supported hotfix is now available from Microsoft, but it is intended to correct only the problem that is described in this article. Apply it only to systems that are experiencing this specific problem. This hotfix may receive additional testing. Therefore, if you are not severely affected by this problem, we recommend that you wait for the next SQL Server 2005 service pack that contains this hotfix.

To resolve this problem immediately, contact Microsoft Product Support Services to obtain the hotfix. For a complete list of Microsoft Product Support Services telephone numbers and information about support costs, visit the Microsoft support Web site.

Note In special cases, charges that are ordinarily incurred for support calls may be canceled if a Microsoft Support Professional determines that a specific update will resolve your problem. The usual support costs will apply to additional support questions and issues that do not qualify for the specific update in question.

This hotfix is also included in the cumulative hotfix package (build 2153) for SQL Server 2005 that is described in Microsoft Knowledge Base article 918222. For more information, click the following article number to view the article in the Microsoft Knowledge Base:

918222 Cumulative hotfix package (build 2153) for SQL Server 2005 is available

Prerequisites

There are no prerequisites for this hotfix.

Restart information

You do not have to restart your computer after you apply this hotfix.

Registry information

To use one of the hotfixes in this package, you do not have to make any changes to the registry.

Hotfix file information

This hotfix may not contain all the files that you must have to fully update a product to the latest build. This hotfix contains only the files that you must have to correct the issues that are listed in this article.

The English version of this hotfix has the file attributes (or later file attributes) that are listed in the following tables. The dates and times for these files are listed in Coordinated Universal Time (UTC). When you view the file information, it is converted to local time. To find the difference between UTC and local time, use the Time Zone tab in the Date and Time item in Control Panel.

SQL Server 2005, 32-bit versions

File name                            File version    File size   Date         Time   Platform
Microsoft.sqlserver.replication.dll  2005.90.1531.0  1,608,408   03-Mar-2006  16:16  x86
Replrec.dll                          2005.90.1531.0  781,016     03-Mar-2006  16:16  x86
Sbmsmdlocal.dll                      9.0.1531.0      15,588,568  03-Mar-2006  16:17  x86
Sbmsmdredir.dll                      9.0.1531.0      3,927,256   03-Mar-2006  16:16  x86
Sqlaccess.dll                        2005.90.1531.0  349,400     03-Mar-2006  16:16  x86
Sqldiag.exe                          2005.90.1531.0  960,216     03-Mar-2006  16:16  x86
Sqlservr.exe                         2005.90.1531.0  28,778,256  03-Mar-2006  16:17  x86

SQL Server 2005, 64-bit version

File name                            File version    File size   Date         Time   Platform
Microsoft.sqlserver.replication.dll  2005.90.1531.0  1,813,720   03-Mar-2006  16:17  x64
Osql.exe                             2005.90.1531.0  83,672      03-Mar-2006  16:17  x64
Replrec.dll                          2005.90.1531.0  1,007,320   03-Mar-2006  16:17  x64
Sbmsmdlocal.dll                      9.0.1531.0      15,588,568  03-Mar-2006  16:17  x86
Sbmsmdredir.dll                      9.0.1531.0      3,927,256   03-Mar-2006  16:16  x86
Sqlaccess.dll                        2005.90.1531.0  356,568     03-Mar-2006  16:17  x86
Sqldiag.exe                          2005.90.1531.0  1,127,640   03-Mar-2006  16:17  x64
Sqlservr.exe                         2005.90.1531.0  39,483,096  03-Mar-2006  16:17  x64

WORKAROUND

Manually detect a long-term deadlock

To work around this problem, manually detect the long-term deadlock. Then, terminate one of the sessions that appears to be in the deadlock state. To do this, follow these steps:
  1. To determine the current blocking session, use one of the following methods:
    • In SQL Server Management Studio, click the instance name in Object Explorer, click the Summary tab, and then click Activity - All Blocking Transactions in the Report list.
    • In SQL Server Management Studio, expand Management, right-click Activity Monitor, and then click View locks by Process.
  2. To determine the last batch that ran on each session, run the following line of code.
    DBCC INPUTBUFFER (<session_id>)
  3. To terminate the session that is causing the deadlock, run the following line of code.
    KILL <session_id>
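
The steps above can also be carried out entirely in Transact-SQL. The following is a minimal sketch, assuming that the sys.dm_exec_requests dynamic management view is an acceptable substitute for the Management Studio reports in step 1. The session ID 53 is a hypothetical placeholder and does not come from this article.

```sql
-- Step 1: list requests that are currently blocked by another session.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,        -- milliseconds spent in the current wait
       r.wait_resource,
       r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;
GO

-- Step 2: show the last batch that was sent by a suspect session.
-- 53 is a hypothetical session ID; substitute a value returned by the query above.
DBCC INPUTBUFFER (53);
GO

-- Step 3: terminate the session that you choose as the victim.
-- KILL accepts only a literal session ID, so the statement is left
-- commented out here to prevent accidental execution.
-- KILL 53;
```

Review the output of step 2 for both sessions before you run KILL, because the terminated session's open transaction is rolled back.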
For more information, see the DBCC INPUTBUFFER and KILL topics in SQL Server 2005 Books Online.

Another method to manually detect the long-term deadlock is to configure the blocked process threshold. To do this, use the sp_configure stored procedure together with the blocked process threshold option. Then, monitor the Blocked Process Report event class in SQL Server Profiler, or use the sp_trace_create stored procedure and the sp_trace_setevent stored procedure for server-side tracing. For more information, see the topics for these stored procedures and for the Blocked Process Report event class in SQL Server 2005 Books Online.
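
The blocked process threshold configuration might look like the following. This is a minimal sketch: the 20-second threshold is an illustrative assumption, not a value that this article recommends.

```sql
-- 'blocked process threshold' is an advanced option, so expose advanced options first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO

-- Raise a Blocked Process Report event for any task that stays blocked
-- longer than 20 seconds. The value 20 is an illustrative assumption.
EXEC sp_configure 'blocked process threshold', 20;
RECONFIGURE;
GO

-- Display the configured value and the running value of the option.
EXEC sp_configure 'blocked process threshold';
GO
```

The option only causes the event to be generated. A SQL Server Profiler trace or a server-side trace that includes the Blocked Process Report event class must also be running to capture it.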

Reduce the delays that are caused by an undetected deadlock

To reduce the delays that are caused by an undetected deadlock, you can use the following techniques:
  • Set a reasonable command time-out value in the application that sends commands to SQL Server. When the application waits longer than the command time-out value, the query that is running in the deadlocked session is automatically canceled to avoid additional delays. To set the command time-out value, use the command time-out property of your data access API, for example, the CommandTimeout property of the ADO.NET SqlCommand class.
  • Use the SET LOCK_TIMEOUT Transact-SQL statement in the calling application to specify the number of milliseconds that any statement waits for a lock to be released. After the lock time-out occurs, one of the two statements in the long-term deadlock is canceled with error 1222. SQL Server does not roll back that statement's transaction automatically, so the application must trap the error and roll back the transaction to release its locks. Then, the other query obtains the required lock and runs until it is completed. To examine the lock time-out setting for a connection, use the following query.
    SELECT @@LOCK_TIMEOUT
    For more information, see the SET LOCK_TIMEOUT topic in SQL Server 2005 Books Online.
  • Set the query wait option by using the sp_configure stored procedure. The query wait option is a server-wide option that defines the maximum time in seconds that any query on the server waits for a resource before the query times out. However, this option may have an adverse effect on long-running queries or on long-running batch jobs that you expect to take a long time to finish. For more information, see the query wait option topic in SQL Server 2005 Books Online.
  • Use the OPTION (MAXDOP 1) query hint in the problem query or in the stored procedure so that the statement does not run in parallel. If you cannot change the query text, use the sp_create_plan_guide stored procedure to apply the hint through a plan guide.

    Note The MAXDOP 1 option may reduce query performance because the query may not be divided to run on multiple processors.
  • Disable parallelism for the instance of SQL Server by limiting the degree of parallelism to one degree. Because max degree of parallelism is an advanced option, enable the show advanced options setting first. Use the following code example.
    exec sp_configure 'show advanced options', 1
    reconfigure
    go
    exec sp_configure 'max degree of parallelism', 1
    reconfigure with override
    go

    Note If the server has multiple processors and multiple high-cost queries that regularly use parallelism, disabling parallelism may have an adverse effect on the performance of those queries. For more information, see the max degree of parallelism option topic in SQL Server 2005 Books Online.
  • Use the best-practice techniques that are outlined in SQL Server 2005 Books Online to write queries that may prevent deadlocks. Additionally, use the best-practice techniques to tune affected queries by using supporting indexes to reduce blocking and to avoid the lock conflicts that lead to deadlocks. For more information about how to minimize deadlocks, see the Minimizing Deadlocks topic in SQL Server 2005 Books Online.
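
Two of the techniques in this list, the lock time-out and the MAXDOP 1 hint, can be combined in one batch. The following is a minimal sketch under stated assumptions: the table dbo.T, the columns col1 and col2, the 5-second time-out, and the plan guide name are all hypothetical.

```sql
-- Wait at most 5 seconds (5000 milliseconds) for any lock on this connection.
SET LOCK_TIMEOUT 5000;

BEGIN TRY
    BEGIN TRANSACTION;

    -- dbo.T, col1, and col2 are hypothetical names.
    -- MAXDOP 1 keeps this statement from running in parallel.
    UPDATE dbo.T
    SET col2 = col2 + 1
    WHERE col1 > 100
    OPTION (MAXDOP 1);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Roll back explicitly so that the locks held by this transaction are released.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- Error 1222 is the lock request time-out error.
    IF ERROR_NUMBER() = 1222
        PRINT 'Lock time-out: transaction rolled back; the caller can retry.';
    ELSE
    BEGIN
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR ('%s', 16, 1, @msg);
    END
END CATCH;
GO

-- If the query text cannot be changed, a plan guide can attach the same hint.
-- The @stmt text must match the statement that the application submits.
EXEC sp_create_plan_guide
    @name = N'PG_T_Update_MaxDop1',
    @stmt = N'UPDATE dbo.T SET col2 = col2 + 1 WHERE col1 > 100',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints = N'OPTION (MAXDOP 1)';
GO
```

The sketch rolls back in the CATCH block instead of relying on any automatic behavior, so the other session can obtain its lock as soon as the time-out fires.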

STATUS

Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.

MORE INFORMATION

To verify that the blocking sessions are experiencing a true deadlock, run the following line of code.
    SELECT * FROM sys.dm_os_waiting_tasks
Review the information in the sys.dm_os_waiting_tasks dynamic management view. In a true deadlock, one execution context of a session (identified by the exec_context_id column) is blocked by another session, so the blocking_session_id column for that row is populated. In turn, that blocking session is blocked by any one of the execution contexts of the first session. This creates a circular dependency of locks that can never be obtained.
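
The circular dependency described above can be looked for directly instead of reading the full view by eye. The following is a minimal sketch, assuming that a two-session cycle is the case of interest; it does not find cycles that involve three or more sessions, and the column aliases are illustrative.

```sql
-- Return waiting tasks whose blocking session is itself waiting on
-- the first session, which indicates a two-session cycle.
SELECT w1.session_id       AS blocked_session_id,
       w1.exec_context_id  AS blocked_exec_context_id,
       w1.blocking_session_id,
       w1.wait_type,
       w1.wait_duration_ms,
       w1.resource_description
FROM sys.dm_os_waiting_tasks AS w1
WHERE w1.blocking_session_id <> w1.session_id   -- skip waits between threads of one parallel query
  AND EXISTS (SELECT 1
              FROM sys.dm_os_waiting_tasks AS w2
              WHERE w2.session_id = w1.blocking_session_id
                AND w2.blocking_session_id = w1.session_id)
ORDER BY w1.wait_duration_ms DESC;
```

Rows that keep appearing with a growing wait_duration_ms value suggest a cycle that the deadlock monitor has not resolved.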

Similarly, the information in the sys.dm_tran_locks dynamic management view or from the sp_lock stored procedure should show sessions that are waiting for locks. One session has the WAIT value or the CONVERT value in the request_status column. The opposite session already has the GRANT value on the lock that is incompatible with the required lock. Therefore, a circle of blocking occurs in which neither session can win without intervention. The members of the sysadmin role or of the processadmin role can manually detect the deadlock. For more information about deadlocks and how to detect deadlocks, see SQL Server 2005 Books Online.

This undetected deadlock problem is not caused by the deadlock monitor algorithm itself. Instead, it is caused by the way that a SQL Server 2005 SOS task reports information back to the deadlock monitor thread. The information describes which other tasks are blocking the task's progress.

In the rare circumstances that are mentioned in this section, the SOS task reports that a task is blocked by a task that is null. This hotfix changes the behavior so that the blocked task reports the task address of the main execution context for the real blocking request ID or for the real blocking session ID.

The following factors make this undetected deadlock problem occur very rarely:
  • The events in the following steps must occur exactly in this order. Otherwise, a deadlock does not occur, or a deadlock occurs and is detected.
  • This undetected deadlock problem requires a certain type of parallel scan that is nondeterministic in nature. One thread in the parallel scan must obtain the locked data pages in a specific order, and that order is not guaranteed to be the same every time. Therefore, this undetected deadlock problem may not occur every time that you run the same statements with the same timing.
The following events must occur in this order for this undetected deadlock problem to occur:
  1. Two transactions start. For example, transaction A and transaction B start.
  2. Transaction A updates a row in table T first. However, transaction A does not commit yet, so it continues to hold a row lock.
  3. Transaction B scans table T. Transaction B acquires locks that are incompatible with transaction A. For example, this event may occur when the scan is a part of a larger update query.
  4. The scan in transaction B is a parallel scan that is running in a special part of a parallel query. The special part of the parallel query is called the exchange segment. Typically, the exchange segment falls after a sort or after a hash join.
  5. One of the scanning threads in transaction B becomes blocked behind a lock that is held by transaction A. All the other threads finish their scans.

    Notice that in transaction B, each worker thread is assigned a logical order that is nondeterministic and that is based on a first-come first-served basis. The blocked thread must not be the last thread.
  6. As soon as transaction B becomes blocked, transaction A runs another update on table T. This time, transaction A becomes blocked behind a lock that is held by transaction B.
If these events occur with this exact timing and under these conditions, the deadlock may be detected, or the deadlock may not be detected. This is because of the nature of this problem. If the deadlock is not detected, transaction A is blocked by transaction B, and transaction B is blocked by transaction A. This ties up resources until one of the following events occurs:
  • You manually intervene.
  • A time-out terminates one of the blocked transactions.
  • You restart the SQL Server service.
For more information about software update terminology, click the following article number to view the article in the Microsoft Knowledge Base:

824684 Description of the standard terminology that is used to describe Microsoft software updates


Modification Type: Minor
Last Reviewed: 7/26/2006
Keywords: kbtshoot kbQFE kbhotfixserver kbpubtypekc KB915918 kbAudITPRO kbAudDeveloper