FIX: Latch Timeout Warnings Appear in Error Log and Slow SQL Server Response Occurs with AWE Enabled (303640)



The information in this article applies to:

  • Microsoft SQL Server 2000 (all editions)

This article was previously published under Q303640
BUG #: 354751 (SHILOH_BUGS)

SYMPTOMS

Multiple-processor SQL Server installations that run with Address Windowing Extensions (AWE) enabled may experience Latch Timeout warnings, 845 errors, and slow response times. The error messages that may occur include:
2001-06-28 13:12:17.62 spid92 Time out occurred while waiting for buffer latch type 4, bp 0x1c69dfc0, page 6:24196), stat 0xb, object ID 2:3:1, EC 0x3AB69570 : 0, waittime 300. Not continuing to wait.

-and/or-

2001-06-28 14:24:16.95 spid6 Error: 845, Severity: 17, State: 1
2001-06-28 14:24:16.95 spid6 Time-out occurred while waiting for buffer latch type 1 for page (27:124576), database ID 7.

CAUSE

A SQL Server worker thread enters a non-yielding loop, which starves other worker threads.

RESOLUTION

To resolve this problem, obtain the latest service pack for Microsoft SQL Server 2000. For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

290211 INF: How to Obtain the Latest SQL Server 2000 Service Pack

Hotfix

NOTE: The following hotfix was created prior to Microsoft SQL Server 2000 Service Pack 2.

The English version of this fix should have the following file attributes or later:
   File name       Platform
   ------------------------
   S80414i.exe     Intel
				
NOTE: Due to file dependencies, the most recent hotfix or feature that contains the preceding files may also contain additional files.

WORKAROUND

If possible, do not enable AWE for SQL Server until you can apply the fix.
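
As a minimal sketch of this workaround, the following Transact-SQL reports the current AWE setting and then turns it off. It assumes that you have sysadmin rights and that the instance can run without memory above the normal virtual address space. The restart requirement noted in the comments reflects the documented behavior of the 'awe enabled' option in SQL Server 2000; verify it against SQL Server Books Online for your build.

```sql
-- 'awe enabled' is an advanced option, so make advanced options visible first.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO

-- Report the configured value (config_value) and the value in use (run_value).
EXEC sp_configure 'awe enabled'
GO

-- Turn AWE off. The new value is used only after the SQL Server
-- service is restarted.
EXEC sp_configure 'awe enabled', 0
RECONFIGURE
GO
```

After the restart, run sp_configure 'awe enabled' again and confirm that run_value is 0.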

STATUS

Microsoft has confirmed this to be a problem in SQL Server 2000. This problem was first corrected in Microsoft SQL Server 2000 Service Pack 2.
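
To see whether an instance already runs at the service pack level that contains the correction, a query along the following lines may help. This is a sketch that relies only on the SERVERPROPERTY function and @@VERSION; the column aliases are illustrative.

```sql
-- ProductLevel returns values such as RTM, SP1, or SP2.
-- ProductVersion returns the build number of the running instance.
SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,
       SERVERPROPERTY('ProductLevel')   AS ProductLevel

-- @@VERSION returns the full version string, including the build number.
SELECT @@VERSION AS VersionString
```

A ProductLevel of SP2 or later indicates that the service pack named in this article has been applied.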

MORE INFORMATION

The condition corrected by this build of SQL Server is extremely rare. For the problem to occur, separate threads on a multiple-processor computer must collide simultaneously on the allocation of a given procedure cache page. At the same time, the computer must also require AWE unmap/map activity while traversing a data chain, and the thread that eventually ends up in the CPU spin must be assigned to the same UMS scheduler.

NOTE: The following DBCC command (DBCC STACKDUMP) is unsupported and may cause unexpected behavior. Microsoft cannot guarantee that you can solve problems that result from the incorrect use of this DBCC command. Use this DBCC command at your own risk. This DBCC command may not be available in future versions of SQL Server. For a list of the supported DBCC commands, see the "DBCC" topic in the Transact-SQL Reference section of SQL Server Books Online.


To identify if you are encountering the problem, execute the following script:
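-- First snapshot: UMS scheduler statistics, then stack traces written to the error log
dbcc sqlperf(umsstats)
dbcc stackdump

-- Wait one minute so that the two snapshots can be compared
waitfor delay '00:01:00'

-- Second snapshot
dbcc sqlperf(umsstats)
dbcc stackdump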
				
After the script runs, the error log contains the stack traces for all current worker threads. The thread that is holding the latch has yielded, and its stack is similar to the following:
77F82152 Module(ntdll+00002152) (NtWaitForSingleObject+0000000B)            
410714C4 Module(UMS+000014C4) (UmsThreadScheduler::Switch+00000058)         
4107176A Module(UMS+0000176A) (UmsScheduler::IdleLoop+00000122)             
410718E9 Module(UMS+000018E9) (UmsScheduler::Suspend+0000007E)              
41071813 Module(UMS+00001813) (UmsEvent::Wait+00000095)                     
004011C1 Module(sqlservr+000011C1) (ExecutionContext::WaitForSignal+000001B5)  
0040F16F Module(sqlservr+0000F16F) (upwait0+0000017C)                       
00401A3C Module(sqlservr+00001A3C) (CMemThread::TsGetAccess+0000008F)       
00401AE2 Module(sqlservr+00001AE2) (CMemThread::Free+00000036)              
00401B19 Module(sqlservr+00001B19) (commondelete+0000001B)                  
005E2104 Module(sqlservr+001E2104) (CSql::~CSql+00000021)                   
005E205B Module(sqlservr+001E205B) (CSqlMgr::DerefSql+00000065)             
005CFD60 Module(sqlservr+001CFD60) (CCompPlan::~CCompPlan+00000051)         
0044CB99 Module(sqlservr+0004CB99) (CCompPlan::`vector deleting destructor'+0000000B)
004432C7 Module(sqlservr+000432C7) (CCacheObject::Release+000000D8)         
005CF499 Module(sqlservr+001CF499) (CCache::FRemoveOne+00000316)            
00813ABC Module(sqlservr+00413ABC) (BPool::BulkUnmap+0000009F)              
00813C7D Module(sqlservr+00413C7D) (BPool::Map+00000098) 
.
.     
				
The thread that is spinning in NewPage has a stack similar to the following:
0043058E Module(sqlservr+0003058E) (BPool::NewPage+000000D4)               
00430378 Module(sqlservr+00030378) (PageRef::FormatBase+000000F6)           
0043097E Module(sqlservr+0003097E) (PageRef::Format+00000065)               
00841DF3 Module(sqlservr+00441DF3) (AllocateFirstWorkTblPage+00000167)      
007FB3F1 Module(sqlservr+003FB3F1) (AllocateRootPage+0000007F)              
0042F650 Module(sqlservr+0002F650) (ncinsert+00000062)                      
0041BB11 Module(sqlservr+0001BB11) (rowinsert+00000101)                     
0041A624 Module(sqlservr+0001A624) (insert+00000013)                               
.
. 
				

Modification Type: Major
Last Reviewed: 10/9/2003
Keywords:kbBug kbfix kbSQLServ2000preSP2Fix KB303640