6 Stack Limits in Multithreaded Execution Environments

This chapter discusses the following topics:

Stack limit checking
Stack overflow handling

The focus of these discussions is on dealing with stack limits in a multithreaded environment; however, the same information applies to single-threaded environments. Although this calling standard is compatible with a multithreaded execution environment, the detailed mechanisms, data structures, and procedures that support this capability are not specified in the standard.

For a multithreaded environment, the following characteristics are assumed:

There can be one or more threads executing within a single process.

The state of a thread is represented in a thread environment block (TEB).

The TEB of a thread contains information that determines a stack limit, below which the stack pointer must not be decremented by the executing code, except for the code that implements the multithreaded mechanism itself.

Exception handling is fully reentrant and multithreaded.

There are three ways to terminate a thread correctly:
- By a call to exc_unwind() or exc_unwind_rfp(), specifying a null target environment

6.1 Stack Limit Checking

A program that is otherwise correct can fail because of stack overflow. A stack overflow occurs when extension of the stack (accomplished by decrementing the stack pointer, SP) allocates addresses not currently reserved for the current thread's stack.

Detection of a stack overflow condition is important. If a stack overflow is not detected, a thread that is writing into what it considers to be stack storage could modify data allocated in that memory for some other purpose. The results of such a situation are unpredictable and undesirable. In some cases, the overflow could result in application failures that cannot be reproduced.

Checking for stack overflow is a requirement for procedures that might execute in a multithreaded environment.

6.1.1 Stack Region Definitions

The various stack regions are defined as follows:

new stack region

This region of the stack extends from one byte below the old value of SP (SP - 1) down to the new value of SP.

stack guard region

In a multithreaded environment, the memory beyond the limit of each thread's stack is protected by contiguous guard pages, which form the stack's guard region.

stack reserve region

In some cases, it is desirable to maintain a stack reserve region. This region is a minimum-sized region that is immediately above a thread's guard region. A reserve region is useful to ensure that the following conditions exist:

Exceptions or asynchronous signals have stack space to execute on a thread's stack
The exception dispatcher and any exception handler it might call have stack space to execute after an invalid attempt to extend the stack has been detected

The Digital UNIX calling standard does not require a reserve region.

6.1.2 Methods for Stack Limit Checking

Because memory can be accessible at addresses lower than those occupied by the guard region, compilers must generate code to ensure that the stack is never extended past the guard pages into accessible memory not allocated to the thread's stack.

The general strategy is to access each page of memory down to, and possibly including, the page corresponding to the intended new value for the stack pointer (SP). If the stack is to be extended by an amount larger than the size of a memory page, a series of accesses is required that works from higher-addressed pages to lower-addressed pages. Any access that results in a memory access violation indicates that the code has made an invalid attempt to extend the stack of the current thread.

Note
An access can be performed using a load operation or a store operation; however, care must be taken to use an instruction that is guaranteed to make an access to memory. For example, do not use an ldq $31,* instruction because the Alpha architecture allows it to result in no memory access at all, rather than a memory read access whose result is discarded because of the $31 destination.
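
The page-by-page strategy can be illustrated with a small simulation. This is a minimal sketch in C, not part of the calling standard: the address layout, the constants, and the names sim_access_ok and probe_extension are hypothetical. Real code would issue actual loads or stores and rely on the memory-management hardware to raise the access violation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only:
 *   [STACK_LIMIT, STACK_BASE)                 the thread's stack
 *   [STACK_LIMIT - GUARD_SIZE, STACK_LIMIT)   protected guard region
 *   below the guard region                    accessible, but not stack */
#define PAGE_SIZE   8192u
#define GUARD_SIZE  8192u
#define STACK_BASE  0x100000u
#define STACK_LIMIT 0x0F0000u

/* Simulated access: returns false where real hardware would raise a
 * memory access violation, that is, inside the guard region. */
static bool sim_access_ok(uintptr_t addr)
{
    return !(addr >= STACK_LIMIT - GUARD_SIZE && addr < STACK_LIMIT);
}

/* Touch one address per page, working from higher addresses to lower,
 * down to and including new_sp (assumes new_sp <= old_sp).
 * Returns false at the first access that would fault. */
static bool probe_extension(uintptr_t old_sp, uintptr_t new_sp)
{
    uintptr_t addr = old_sp;

    while (addr - new_sp > PAGE_SIZE) {
        addr -= PAGE_SIZE;
        if (!sim_access_ok(addr))
            return false;          /* invalid attempt to extend the stack */
    }
    return sim_access_ok(new_sp);  /* final access at the new SP value */
}
```

In this sketch, a single access at an SP value three pages below STACK_LIMIT lands in accessible memory and goes unnoticed, whereas probe_extension() reports the overflow because one of its accesses falls inside the guard region.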

There are two methods for stack-limit checking: implicit and explicit. In addition, the stack reserve region can be checked. The following sections describe each type of checking.

6.1.2.1 Implicit Stack Limit Checking

There are two mutually exclusive strategies for implicit stack limit checking:

If the lowest addressed byte of the new stack region is guaranteed to be accessed prior to any further stack extension, the stack can be extended by an increment that is equal in size to the guard region without any further accesses.

If some byte (not necessarily the lowest) of the new stack region is guaranteed to be accessed prior to any further stack extension, the stack can be extended by an increment that is equal in size to one-half the guard region without any further accesses.

Generally, the stack frame layout (shown in Section 3.1.2) and entry code rules (described in Section 3.2.6.1) do not make it feasible to guarantee access to the lowest address of a new stack region without introducing an extra access solely for that purpose. Consequently, this calling standard uses the second strategy. Although the maximum amount of implicit stack extension is smaller, the check is achieved at no additional cost.

This calling standard requires the minimum guard region size to be 8192 bytes, which is the size of the smallest memory protection granularity allowed by the Alpha architecture.
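
One way to see why the second strategy limits each extension to half the guard region is to compute the worst-case distance between the guaranteed accesses of two consecutive extensions. The following C sketch is illustrative only; the function names are hypothetical, and any reserve region is ignored.

```c
#include <stddef.h>

/* Minimum guard region size required by this calling standard. */
#define MIN_GUARD_SIZE 8192u

/* Worst-case distance between the guaranteed accesses of two consecutive
 * extensions of `extension` bytes each: the earlier access touches the
 * highest byte of its region and the later access touches the lowest
 * byte of its region. */
static size_t worst_case_access_gap(size_t extension)
{
    return 2 * extension - 1;
}

/* Implicit checking is safe when even the worst-case gap is smaller than
 * the guard region, so that no access can step over the guard pages. */
static int implicit_check_is_safe(size_t extension)
{
    return extension > 0 &&
           worst_case_access_gap(extension) < MIN_GUARD_SIZE;
}
```

For an extension of 4096 bytes the worst-case gap is 8191 bytes, which is just under the 8192-byte guard region; for 4097 bytes it is 8193 bytes, which is not.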

These factors are the basis for the following rule: If the stack is being extended by an amount less than or equal to 4096 bytes and no reserve region is required, no explicit stack-limit checking is required.

However, because asynchronous interrupts and calls to other procedures can also cause stack extension without explicit stack limit checking, stack extension with implicit limit checking must follow a strict set of conventions:

Explicit stack limit checking must be performed unless the amount by which SP is decremented is known to be less than or equal to 4096 bytes and no reserve region is required.

Some byte in the new stack region must be accessed before SP can be decremented for a subsequent stack extension. This access can be performed before or after SP is decremented for this stack extension, but it must be done before SP can be decremented again.

No standard procedure call can be made before some byte in the new stack region is accessed.

The system exception dispatcher ensures that the lowest addressed byte in the new stack region is accessed if any kind of asynchronous interrupt occurs after SP is decremented, but before the access in the new stack region occurs.

These conventions ensure that the stack pointer will not be decremented so far that it points to accessible storage beyond the stack limit without having the error detected by one of the following:

The guard region being accessed by the thread

An explicit stack limit check failure occurring

As a matter of practice, the system can provide multiple guard pages in the guard region. When a stack overflow is detected as a result of access to the guard region, one or more guard pages can be unprotected for use by the exception handling facility, and one or more guard pages can remain protected to provide implicit stack limit checking during exception processing. Note that the size of the guard region and the number of guard pages is defined by the system, not by this calling standard.

6.1.2.2 Explicit Stack Limit Checking

If the stack is being extended by an amount that is unknown at compile time or of a known size greater than the maximum implicit check size (4096 bytes), a code sequence that follows the rules for implicit stack limit checking can be executed in a loop to access the new stack region incrementally in segments smaller than or equal to the minimum page size (8192 bytes). At least one access must occur in each such segment.

The first access must occur between SP and SP-4096 because, in the absence of more specific information, the previous guaranteed access relative to the current stack pointer might be as much as 4096 bytes greater than the current stack pointer address. The last access must be within 4096 bytes of the intended new value of the stack pointer. These accesses must occur in order, starting with the highest-addressed segment and working toward the lowest-addressed segment.

A simple algorithm that satisfies these rules (but can result in twice the minimum number of accesses) calls for performing a sequence of accesses in a loop starting with the previous value of SP and then decrementing by the minimum no-check extension size (4096) up to, but not including, the first value that is less than the new value for the stack pointer.
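
The simple algorithm can be sketched in C as a function that lists the addresses to be accessed. This is an illustrative sketch only: the name simple_probe_addresses, the output array, and the constant name are hypothetical, and real code would perform a load or store at each address instead of recording it.

```c
#include <stddef.h>
#include <stdint.h>

/* Extension size (in bytes) that needs no explicit check. */
#define NO_CHECK_SIZE 4096u

/* Records the probe addresses, highest first: start at the previous SP and
 * step down by NO_CHECK_SIZE, stopping before the first value that is less
 * than the new SP. Stores at most `cap` addresses in `out` and returns the
 * total number of probes the algorithm calls for. */
static size_t simple_probe_addresses(uintptr_t old_sp, uintptr_t new_sp,
                                     uintptr_t *out, size_t cap)
{
    size_t n = 0;
    uintptr_t addr = old_sp;

    while (addr >= new_sp) {
        if (n < cap)
            out[n] = addr;
        n++;
        if (addr < NO_CHECK_SIZE)   /* avoid unsigned wraparound */
            break;
        addr -= NO_CHECK_SIZE;
    }
    return n;
}
```

For example, with a previous SP of 0x20000 and an extension of 10000 bytes, the sketch yields three probes, at 0x20000, 0x1F000, and 0x1E000. The last probe lies 1808 bytes above the new SP, which satisfies the requirement that the final access be within 4096 bytes of the new value.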

The stack must not be extended incrementally in procedure prologues. A procedure prologue that needs to extend the stack by an amount that is unknown at compile time or of a known size greater than the maximum implicit check size (4096 bytes) must test new stack segments (as described previously) in a loop that does not modify SP. The procedure prologue must then update the stack with one instruction that copies the new stack pointer value into SP.

Note
An explicit stack limit check can be performed either by inline code that is part of a prologue or by a run-time support routine that is specially tailored to be called from a procedure prologue.

6.1.3 Stack Reserve Region Checking

The size of the stack reserve region, if one exists, must be included in the increment size used for stack limit checks. However, the size is not included in the amount by which the stack is actually extended. Depending on the reserve size, stack reserve region checking could completely eliminate the ability to use implicit stack limit checking.
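
The effect of a reserve region on the choice between implicit and explicit checking can be sketched as follows. The function names and the constant name are hypothetical; the 4096-byte threshold is the implicit check limit described in Section 6.1.2.1.

```c
#include <stddef.h>

/* Largest increment that can rely on implicit stack limit checking. */
#define MAX_IMPLICIT_SIZE 4096u

/* Increment used for the stack limit check: the reserve size is added to
 * the extension, although SP itself moves only by `extension` bytes. */
static size_t check_increment(size_t extension, size_t reserve)
{
    return extension + reserve;
}

/* Nonzero when the increment is too large for implicit checking. */
static int needs_explicit_check(size_t extension, size_t reserve)
{
    return check_increment(extension, reserve) > MAX_IMPLICIT_SIZE;
}
```

Under these assumptions, a 3000-byte extension needs no explicit check when there is no reserve region, but it does need one with a 2048-byte reserve. With a reserve of 4096 bytes or more, every nonzero extension needs an explicit check.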

6.2 Stack Overflow Handling

If a stack overflow is detected, one of the following conditions occurs:

The system transparently extends the thread's stack, resets the TEB stack limit value appropriately, and continues execution of the thread.

The signal SIGSEGV is generated. If exc_raise_signal_exception() is installed as the handler for SIGSEGV, the corresponding exception is raised. (See Section 5.1.13 for information on exception and signal-handling coexistence.) To provide enough stack space to execute the exception dispatcher and handlers, specify that the signal stack be used when SIGSEGV is delivered. (See the sigstack(2) reference page.)

Note that if a transparent stack extension is performed, a stack overflow that occurs in a called procedure might cause the stack to be extended. Therefore, the TEB stack limit value must be considered volatile and potentially modified by external procedure calls as well as by the handling of exceptions.
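
The consequence for code that reads the stack limit can be sketched in C. The TEB layout is not specified by this standard, so the structure, its field, and both function names below are hypothetical; the callee merely stands in for any external procedure whose stack overflow triggers a transparent extension.

```c
#include <stdint.h>

/* Hypothetical TEB fragment: only the stack limit is modeled, and it is
 * declared volatile so that every use reloads it from memory. */
struct teb {
    volatile uintptr_t stack_limit;
};

/* Stand-in for an external call during which the system transparently
 * extends the stack and lowers the limit by one 8192-byte page. */
static void callee_that_extends_stack(struct teb *t)
{
    t->stack_limit -= 8192u;
}

/* Returns nonzero if the limit changed across the call, showing that a
 * value read before the call must not be reused afterward. */
static int limit_changed_across_call(struct teb *t)
{
    uintptr_t before = t->stack_limit;

    callee_that_extends_stack(t);

    uintptr_t after = t->stack_limit;   /* reload; do not reuse `before` */
    return after != before;
}
```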