6    Symmetric Multiprocessing and Locking Methods

Symmetric multiprocessing (SMP) describes a computer environment that uses two or more central processing units (CPUs). In an SMP environment, software applications and the associated kernel modules can operate on two or more of these CPUs. To ensure the integrity of the data manipulated by kernel modules in this multiprocessor environment, you must perform additional design and implementation tasks beyond those discussed in Writing Device Drivers. One of these tasks involves choosing a locking method. Tru64 UNIX provides you with two methods to write SMP-safe kernel modules: simple locks and complex locks.

This chapter discusses the information you need to decide which items (variables, data structures, and code blocks) must be locked in the kernel module and to choose the appropriate method (simple locks or complex locks). Specifically, the chapter describes the following topics associated with designing and developing a kernel module that can operate safely in an SMP environment: hardware issues related to synchronization (Section 6.1), locking in an SMP environment (Section 6.2), a comparison of simple locks and complex locks (Section 6.3), choosing a locking method (Section 6.4), and choosing the resources to lock in the module (Section 6.5).

The following sections discuss each of these topics. You do not need an intimate understanding of kernel threads to learn about writing kernel modules in an SMP environment. Chapter 9 of this book discusses kernel threads and the associated routines that kernel modules use to create and manipulate them.

6.1    Understanding Hardware Issues Related to Synchronization

Alpha CPUs provide several features to assist with hardware-level synchronization. Even though all instructions that access memory are noninterruptible, no single one performs an atomic read-modify-write operation. A kernel-mode thread of execution can raise the interrupt priority level (IPL) in order to block other kernel threads on that CPU while it performs a read-modify-write sequence or while it executes any other group of instructions. Code that runs in any access mode can execute a sequence of instructions that contains load-locked (LDx_L) and store-conditional (STx_C) instructions to perform a read-modify-write sequence that appears atomic to other kernel threads of execution.

Memory barrier instructions order a CPU's memory reads and writes from the viewpoint of other CPUs and I/O processors. The locking mechanisms (simple and complex locks) provided in the operating system take care of the idiosyncrasies related to read-modify-write sequences and memory barriers on Alpha CPUs. Therefore, you need not be concerned about these hardware issues when implementing SMP-safe kernel modules that use simple and complex locks.

The rest of this section describes the following hardware-related issues: atomicity (Section 6.1.1), alignment (Section 6.1.2), and granularity (Section 6.1.3).

6.1.1    Atomicity

Software synchronization refers to the coordination of events in such a way that only one event happens at a time. This kind of synchronization is a serialization or sequencing of events. Serialized events are assigned an order and processed one at a time in that order. While a serialized event is being processed, no other event in the series is allowed to disrupt it.

By imposing order on events, software synchronization allows reading and writing of several data items indivisibly, or atomically, to obtain a consistent set of data. For example, all of process A's writes to shared data must happen before or after process B's writes or reads, but not during process B's writes or reads. In this case, all of process A's writes must happen indivisibly for the operation to be correct. This includes process A's updates -- reading of a data item, modifying it, and writing it back (read-modify-write sequence). Other synchronization techniques ensure the completion of an asynchronous system service before the caller tries to use the results of the service.

Atomicity is a type of serialization that refers to the indivisibility of a small number of actions, such as those occurring during the execution of a single instruction or a small number of instructions. When an operation comprises more than one action, no single action can occur by itself: if one action occurs, then all the actions occur. Atomicity must be qualified by the viewpoint from which the actions appear indivisible: an operation that is atomic for kernel threads running on the same CPU can appear as multiple actions to a kernel thread of execution running on a different CPU.

An atomic memory reference results in one indivisible read or write of a data item in memory. No other access to any part of that data can occur during the course of the atomic reference. Atomic memory references are important for synchronizing access to a data item that is shared by multiple writers or by one writer and multiple readers. References need not be atomic to a data item that is not shared or to one that is shared but is only read.

6.1.2    Alignment

Alignment refers to the placement of a data item in memory. For a data item to be naturally aligned, its lowest-addressed byte must reside at an address that is a multiple of the size of the data item (in bytes). For example, a naturally aligned longword has an address that is a multiple of 4. The term naturally aligned is usually shortened to "aligned."

An Alpha CPU allows atomic access only to an aligned longword or an aligned quadword. Reading or writing an aligned longword or quadword of memory is atomic with respect to any other kernel thread of execution on the same CPU or on other CPUs.

6.1.3    Granularity

The phrase granularity of data access refers to the size of neighboring units of memory that can be written independently and atomically by multiple CPUs. Regardless of the order in which two such neighboring units are written by different CPUs, the results must be identical.

Alpha systems have longword and quadword granularity. That is, only adjacent aligned longwords or quadwords can be written independently. Because Alpha systems support only instructions that load or store longword-sized and quadword-sized memory data, the manipulation of byte-sized and word-sized data on Alpha systems requires that the entire longword or quadword that contains the byte- or word-sized item be manipulated. Thus, simply because of its proximity to an explicitly shared data item, neighboring data might become shared unintentionally. Manipulation of byte-sized and word-sized data on Alpha systems requires multiple instructions that:

  1. Fetch the longword or quadword that contains the byte or word

  2. Mask the nontargeted bytes

  3. Manipulate the target byte or word

  4. Store the entire longword or quadword

Because this sequence is interruptible, operations on byte and word data are not atomic on Alpha systems. Also, this change in the granularity of memory access can affect the determination of which data is actually shared when a byte or word is accessed.

The absence of byte and word granularity on Alpha systems has important implications for access to shared data. In effect, any memory write of a data item other than an aligned longword or quadword must be done as a multiple-instruction read-modify-write sequence. Also, because the amount of data read and written is an entire longword or quadword, you must ensure that all accesses to fields within the longword or quadword are synchronized with each other.

6.2    Locking in a Symmetric Multiprocessing Environment

In a single-processor environment, kernel modules need not protect the integrity of a resource from activities resulting from the actions of another CPU. However, in an SMP environment, the kernel module must protect the resource from simultaneous access by multiple CPUs to prevent corruption. A resource, from the kernel module's standpoint, is data that more than one kernel thread can manipulate. A resource can be stored in global variables and in data structure fields. The top half of Figure 6-1 shows a typical problem that could occur in an SMP environment. The figure shows that the resource called i is a global variable whose initial value is 1.

Furthermore, the figure shows that the kernel threads emanating from CPU1 and CPU2 increment resource i. A kernel thread is a single sequential flow of control within a kernel module or other systems-based program. The kernel module or other systems-based program uses the kernel threads routines (instead of a threads library package such as DECthreads) to start, terminate, delete, and perform other kernel threads-related operations. These kernel threads must not increment this resource simultaneously. Without some way to lock the global variable while one kernel thread is incrementing it, the integrity of the data stored in this resource is compromised in the SMP environment.

To protect the integrity of the data, you must enforce order on the accesses of the data by multiple CPUs. One way to establish the order of CPU access to the resource is to establish a lock. As the bottom half of the figure shows, the kernel thread emanating from CPU1 locks access to resource i, thus preventing access by kernel threads emanating from CPU2. This guarantees the integrity of the value stored in this resource.

Figure 6-1:  Why Locking Is Needed in an SMP Environment

The vertical line in the bottom half of the figure represents a barrier that prevents the kernel thread emanating from CPU2 from accessing resource i until the kernel thread emanating from CPU1 unlocks it. For simple locks, this barrier indicates that the lock is exclusive. That is, no other kernel thread can gain access to the lock until the kernel thread currently controlling it has released (unlocked) it.

For complex write locks, this barrier represents a wait hash queue that collects all of the kernel threads waiting to gain write access to a resource. With complex read locks, all kernel threads have read-only access to the same resource at the same time.

6.3    Comparing Simple Locks and Complex Locks

The operating system provides two ways to lock specific resources (global variables and data structures) referenced in code blocks in the kernel module: simple locks and complex locks. Simple and complex locks allow kernel modules to:

The following sections briefly describe simple locks and complex locks.

6.3.1    Simple Locks

A simple lock is a general-purpose mechanism for protecting resources in an SMP environment. Figure 6-2 shows that simple locks are spin locks. That is, the routines used to implement the simple lock do not return until the lock has been obtained.

As the figure shows, the CPU1 kernel thread obtains a simple lock on resource i. Once the CPU1 kernel thread obtains the simple lock, it has exclusive access to the resource and can perform read and write operations on it. The figure also shows that the CPU2 kernel thread spins while waiting for the CPU1 kernel thread to unlock (free) the simple lock.

Figure 6-2:  Simple Locks Are Spin Locks

You need to understand the tradeoffs in performance and realtime preemption latency associated with simple locks before you use them. However, sometimes kernel modules must use simple locks. For example, kernel modules must use simple locks and spl routines to synchronize with interrupt service routines. Section 6.4 provides guidelines to help you choose between simple locks and complex locks.

Table 6-1 lists the data structure and routines associated with simple locks. Chapter 7 discusses how to use the data structure and routines to implement simple locks in a kernel module.

Table 6-1:  Data Structure and Routines Associated with Simple Locks

Structure/Routines Description
slock Contains simple lock-specific information.
decl_simple_lock_data Declares a simple lock structure.
simple_lock Asserts a simple lock.
simple_lock_init Initializes a simple lock structure.
simple_lock_terminate Terminates use of a simple lock.
simple_lock_try Tries to assert a simple lock.
simple_unlock Releases a simple lock.

6.3.2    Complex Locks

A complex lock is a mechanism for protecting resources in an SMP environment. A complex lock achieves the same results as a simple lock. However, kernel modules should use complex locks (not simple locks) if there are blocking conditions.

The routines that implement complex locks synchronize access to kernel data between multiple kernel threads. The following describes characteristics associated with complex locks:

Figure 6-3 shows that complex locks are not spin locks, but blocking (sleeping) locks. That is, the routines that implement the complex lock block (sleep) until the lock is released. Thus, unlike simple locks, you should not use complex locks to synchronize with interrupt service routines. Because of their blocking characteristic, complex locks are effective on both single-processor and multiprocessor systems for serializing access to data between kernel threads.

As the figure shows, the CPU1 kernel thread asserts a complex lock with write access on resource i. The CPU2 kernel thread also asserts a complex lock with write access on resource i. Because the CPU1 kernel thread asserts the write complex lock on resource i first, the CPU2 kernel thread blocks, waiting until the CPU1 kernel thread unlocks (frees) the complex write lock.

Figure 6-3:  Complex Locks Are Blocking Locks

Like simple locks, complex locks present tradeoffs in performance and realtime preemption latency that you should understand before you use them. However, sometimes kernel modules must use complex locks. For example, kernel modules must use complex locks when there are blocking conditions in the code block. On the other hand, you must not take a complex lock while holding a simple lock or when using the timeout routine. Section 6.4 provides guidelines to help you choose between simple locks and complex locks.

Table 6-2 lists the data structure and routines associated with complex locks. Chapter 8 discusses how to use the data structure and routines to implement complex locks in a kernel module.

Table 6-2:  Data Structure and Routines Associated with Complex Locks

Structure/Routines Description
lock Contains complex lock-specific information.
lock_done Releases a complex lock.
lock_init Initializes a complex lock.
lock_read Asserts a complex lock with read-only access.
lock_terminate Terminates use of a complex lock.
lock_try_read Tries to assert a complex lock with read-only access.
lock_try_write Tries to assert a complex lock with write access.
lock_write Asserts a complex lock with write access.

6.4    Choosing a Locking Method

You can make your kernel modules SMP-safe by implementing a simple or complex locking method.

This section provides guidelines to help you choose the appropriate locking method (simple or complex). In choosing a locking method, consider the following SMP characteristics: who has access to a particular resource (Section 6.4.1), whether access to the resource must be prevented while a kernel thread sleeps (Section 6.4.2), the length of time the lock is held (Section 6.4.3), execution speed (Section 6.4.4), and the size of code blocks (Section 6.4.5).

The following sections discuss each of these characteristics. See Section 6.4.6 for a summary comparison table of the locking methods that you can use to determine which items to lock in your kernel modules.

6.4.1    Who Has Access to a Particular Resource

To choose the appropriate locking method, you must identify the entities that can access a particular resource: kernel threads, interrupt service routines, and exceptions. If you need a lock for resources accessed by multiple kernel threads, use simple or complex locks. Use a combination of spl routines and simple locks to lock resources that kernel threads and interrupt service routines access.

For exceptions, use complex locks if the exception involves blocking conditions. If the exception does not involve blocking conditions, you can use simple locks.

6.4.2    Prevention of Access to a Resource While a Kernel Thread Sleeps

You must determine whether it is necessary to prevent access to the resource while a kernel thread blocks (sleeps), for example, while it waits for disk I/O to a buffer. If you must prevent such access and the code that holds the lock contains no blocking conditions, you can use simple or complex locks. If the code that holds the lock can itself block, use complex locks.

6.4.3    Length of Time the Lock Is Held

You must estimate the length of time the lock is held to determine the appropriate lock method. In general, use simple locks when the entity accesses are bounded and small. One example of a bounded and small access is some entity accessing a system time variable. Use complex locks when the entity accesses could take a long time or a variable amount of time. One example of a variable amount of time is some entity scanning linked lists.

6.4.4    Execution Speed

You must account for execution speed in choosing the appropriate lock method. The following factors influence execution speed:

6.4.5    Size of Code Blocks

In general, use complex locks for resources contained in long code blocks. Also, use complex locks in cases where the resource must be prevented from changing when a kernel thread blocks (sleeps).

Use simple locks for resources contained in short, nonblocking code blocks or when synchronizing with interrupt service routines.

6.4.6    Summary of Locking Methods

Table 6-3 summarizes the SMP characteristics for choosing the appropriate lock method to make your kernel module SMP safe. The first column of the table presents an SMP characteristic and the second and third columns present the lock methods.

The following list describes the possible entities that can appear in the second and third columns:

(The numbers before each Characteristic item appear for easy reference in later descriptions.)

Table 6-3:  SMP Characteristics for Locking

Characteristic Simple Lock Complex Lock
1. Kernel threads will access this resource. Yes Yes
2. Interrupt service routines will access this resource. Yes No
3. Exceptions will access this resource. Yes Yes
4. Need to prevent access to this resource while a kernel thread blocks and there are no blocking conditions. Yes Yes
5. Need to prevent access to this resource while a kernel thread blocks and there are blocking conditions. No Yes
6. Need to protect resource between kernel threads and interrupt service routines. Yes No
7. Need to have maximum execution speed for this kernel module. Yes No
8. The module references and updates this resource in long code blocks (implying that the length of time the lock is held on this resource is not bounded and long). Worse Better
9. The module references and updates this resource in short nonblocking code blocks (implying that the length of time the lock is held on this resource is bounded and short). Better Worse
10. Need to minimize memory usage by the lock-specific data structures. Yes No
11. Need to synchronize with interrupt service routines. Yes No
12. The module can afford busy wait time. Yes No
13. The module implements realtime preemption. Worse Better

Use the following steps to analyze your kernel module to determine which items to lock and which locking method to choose:

  1. Identify all of the resources in your kernel module that you could potentially lock. Section 6.5 discusses some of these resources.

  2. Identify all of the code blocks in your kernel module that manipulate the resource.

  3. Determine which locking method is appropriate. Use Table 6-3 as a guide to help you choose the locking method. Section 6.5.5 shows how to use this table for choosing a locking method for the example device register offset definition resources.

  4. Determine the granularity of the lock. Section 6.5.5 shows how to determine the granularity of the locks for the example device register offset definitions.

6.5    Choosing the Resources to Lock in the Module

Section 6.4 presents the SMP characteristics you must consider when choosing a locking method. You need to analyze each section of the kernel module (in device drivers, for example, the open and close device section, the read and write device section, and so forth) and apply those SMP characteristics to the following resource categories:

The following sections discuss each of these categories. See Section 6.5.5 for an example that walks you through the steps for analyzing a kernel module to determine which resources to lock.

6.5.1    Read-Only Resources

Analyze each section of your kernel module to determine if the access to a resource is read only. In this case, resource refers to module and system data stored in global variables or data structure fields. You do not need to lock resources that are read only because there is no way to corrupt the data in a read-only operation.

6.5.2    Device Control Status Register Addresses

Analyze each section of your kernel module to determine accesses to a device's control status register (CSR) addresses. Many kernel modules based on the UNIX operating system use the direct method; that is, they access a device's CSR addresses directly through a device register structure. This method involves declaring a device register structure that describes the device's characteristics, which include a device's control status register. After declaring the device register structure, the kernel module accesses the device's CSR addresses through the field that maps to it.

Some CPU architectures do not allow you to access the device CSR addresses directly. Kernel modules that need to operate on these types of CPUs should use the indirect method. In fact, kernel modules operating on Alpha systems must use the indirect method. Thus, the discussion of locking a device's CSR addresses focuses on the indirect method.

The indirect method involves defining device register offset definitions (instead of a device register structure) that describe the device's characteristics, which include a device's control status register. The method also includes the use of the following categories of routines:

Using these routines makes your kernel module more portable across different bus architectures, different CPU architectures, and different CPU types within the same architecture. For examples of how to use these routines when writing device drivers, see Writing Device Drivers. The following example shows the device register offset definitions that some xx kernel module defines for some XX device:


.
.
.
#define XX_ADDER 0x0 /* 32-bit read/write DMA address register */
#define XX_DATA  0x4 /* 32-bit read/write data register */
#define XX_CSR   0x8 /* 16-bit read/write CSR/LED register */
#define XX_TEST  0xc /* Go bit register. Write sets. Read clears */
.
.
.

6.5.3    Module-Specific Global Resources

Analyze the declarations and definitions sections of your kernel module to identify the following global resources: module-specific global variables and module-specific data structures.

Module-specific global variables can store a variety of information, including flag values that control execution of code blocks and status information. The following example shows the declaration and initialization of some typical module-specific global variables. Use this example to help you locate similar module-specific global variables in your kernel module.


.
.
.
int num_xx = 0;
.
.
.
int xx_is_dynamic = 0;
.
.
.

Module-specific data structures contain fields that can store such information as whether a device is attached, whether it is opened, the read/write mode, and so forth. The following example shows the declaration and initialization of some typical module-specific data structures. Use this example to help you locate similar module-specific data structures in your kernel modules.


.
.
.
struct driver xxdriver = {
.
.
.
};
.
.
.
cfg_subsys_attr_t xx_attributes[] = {
.
.
.
};
.
.
.
struct xx_kern_str {
.
.
.
} xx_kern_str[NXX];
.
.
.
struct cdevsw xx_cdevsw_entry = {
.
.
.
};

After you identify the module-specific global variables and module-specific data structures, locate the code blocks in which the kernel module references them. Use Table 6-3 to determine which locking method is appropriate. Also, determine the granularity of the lock.

6.5.4    System-Specific Global Resources

Analyze the declarations and definitions sections of your kernel module to identify the following global resources: system-specific global variables and system-specific data structures.

System-specific variables include the global variables hz, cpu, and lbolt. The following example shows the declaration of one system-specific global variable:


.
.
.
extern int hz;
.
.
.

System-specific data structures include controller, buf, and ihandler_t. The following example shows the declaration of some system-specific data structures:


.
.
.
struct controller *info[NXX];
.
.
.
struct buf cbbuf[NCB];
.
.
.

After you identify the system-specific global variables and system-specific data structures, locate the code blocks in which the module references them. Use Table 6-3 to determine which locking method is appropriate. Also, determine the granularity of the lock.

Note

To lock buf structure resources, use the BUF_LOCK and BUF_UNLOCK routines instead of the simple and complex lock routines. For descriptions of these routines, see the BUF_LOCK(9) and BUF_UNLOCK(9) reference pages.

6.5.5    How to Determine the Resources to Lock

Use the following steps to determine which resources to lock in your kernel modules:

  1. Identify all resources that you might lock.

  2. Identify all of the code blocks in the kernel module that manipulate each resource.

  3. Determine which locking method is appropriate.

  4. Determine the granularity of the lock.

The following example walks you through an analysis of which resources to lock for the xx module.

Step 1: Identify All Resources That You Might Lock

Table 6-4 summarizes the resources that you might lock in your kernel module according to the following categories: device control status register (CSR) addresses, module-specific global variables, module-specific global data structures, system-specific global variables, and system-specific global data structures.

Table 6-4:  Kernel Module Resources for Locking

Category Associated Resources
Device control status register (CSR) addresses. N/A
Module-specific global variables. Variables that store flag values to control execution of code blocks. Variables that store status information.
Module-specific global data structures. dsent, cfg_subsys_attr_t, driver, and the kernel module's kern_str structure.
System-specific global variables cpu, hz, lbolt, and page_size.
System-specific global data structures controller and buf.

One resource that the xx module must lock is the device CSR addresses. This module also needs to lock the hz global variable. The example analysis focuses on the following device register offset definitions for the xx module:


.
.
.
#define XX_ADDER 0x0 /* 32-bit read/write DMA address register */
#define XX_DATA  0x4 /* 32-bit read/write data register */
#define XX_CSR   0x8 /* 16-bit read/write CSR/LED register */
#define XX_TEST  0xc /* Go bit register. Write sets. Read clears */
.
.
.

Step 2: Identify All of the Code Blocks in the Module That Manipulate the Resource

Identify all of the code blocks that manipulate the resource. If the code block accesses the resource read only, you may not need to lock the resources that it references. However, if the code block writes to the resource, you need to lock the resource by calling the simple or complex lock routines.

The xx module accesses the device register offset definition resources in the open and close device section and the read and write device section.

Step 3: Determine Which Locking Method Is Appropriate

Table 6-5 shows how to analyze the locking method that is most suitable for the device register offset definitions for some xx module. (The numbers before each Characteristic item appear for easy reference in later descriptions.)

Table 6-5:  Locking Device Register Offset Definitions

Characteristic Applies to This Module Simple Lock Complex Lock
1. Kernel threads will access this resource. Yes Yes Yes
2. Interrupt service routines will access this resource. No N/A N/A
3. Exceptions will access this resource. No N/A N/A
4. Need to prevent access to this resource while a kernel thread blocks and there are no blocking conditions. Yes Yes Yes
5. Need to prevent access to this resource while a kernel thread blocks and there are blocking conditions. No N/A N/A
6. Need to protect resource between kernel threads and interrupt service routines. Yes Yes No
7. Need to have maximum execution speed for this kernel module. Yes Yes No
8. The module references and updates this resource in long code blocks (implying that the length of time the lock is held on this resource is not bounded and long). No N/A N/A
9. The module references and updates this resource in short nonblocking code blocks (implying that the length of time the lock is held on this resource is bounded and short). Yes Better Worse
10. Need to minimize memory usage by the lock-specific data structures. Yes Yes No
11. Need to synchronize with interrupt service routines. No N/A N/A
12. The module can afford busy wait time. Yes Yes No
13. The module implements realtime preemption. No N/A N/A

The locking analysis table for the device register offset definitions shows that characteristics 1, 4, 6, 7, 9, 10, and 12 apply to the xx module and that, for each characteristic on which the two methods differ (6, 7, 9, 10, and 12), the simple lock method rates Yes or Better while the complex lock method rates No or Worse.

Based on the previous analysis, the xx module uses the simple lock method.

Step 4: Determine the Granularity of the Lock

After choosing the appropriate locking method for the resource, determine the granularity of the lock. For example, in the case of the device register offset resource, you can determine the granularity by answering the following questions:

  1. Is a simple lock needed for each device register offset definition?

  2. Is one simple lock needed for all of the device register offset definitions?

Table 6-5 shows that the need to minimize memory usage is important to the xx module; therefore, creating one simple lock for all of the device register offset definitions would save the most memory. The following code fragment shows how to declare a simple lock for all of the device register offset definitions:


.
.
.
#include <kern/lock.h>
.
.
.
decl_simple_lock_data( , slk_xxdevoffset);
.
.
.

If the preservation of memory were not important to the xx module, declaring a simple lock for each device register offset definition might be more appropriate. The following code fragment shows how to declare a simple lock structure for each of the example device register offset definitions:


.
.
.
#include <kern/lock.h>
.
.
.
decl_simple_lock_data( , slk_xxaddr);
decl_simple_lock_data( , slk_xxdata);
decl_simple_lock_data( , slk_xxcsr);
decl_simple_lock_data( , slk_xxtest);
.
.
.

After declaring a simple lock structure for an associated resource, you must initialize it (only once) by calling simple_lock_init. You then use the simple lock routines in code blocks that access the resource. Chapter 7 discusses the simple lock-related routines.