Symmetric multiprocessing (SMP) describes a computer environment that uses two or more central processing units (CPUs). In an SMP environment, software applications and the associated device drivers can operate on two or more of these CPUs. To ensure the integrity of the data manipulated by device drivers in this multiprocessor environment, you must perform additional design and implementation tasks beyond those discussed in Writing Device Drivers: Tutorial. One of these tasks involves choosing a locking method. Digital UNIX provides you with three methods to write SMP-safe device drivers: simple locks, complex locks, and funnels.
This chapter discusses the information you need to decide which items (variables, data structures, and code blocks) must be locked in the device driver and to choose the appropriate method or methods (simple locks, complex locks, and funnels).
Specifically, the chapter describes the following topics associated with designing and developing a device driver that can operate safely in an SMP environment:
The following sections discuss each of these topics. You do not need an intimate understanding of kernel threads to learn about writing device drivers in an SMP environment. Part 3 of this book discusses kernel threads and the associated interfaces that device drivers use to create and manipulate them.
Alpha CPUs provide several features to assist with hardware-level synchronization. Even though all instructions that access memory are noninterruptible, no single instruction performs an atomic read-modify-write operation. A kernel-mode thread of execution can raise the interrupt priority level (IPL) in order to block other kernel threads on that CPU while it performs a read-modify-write sequence or while it executes any other group of instructions. Code that runs in any access mode can execute a sequence of instructions that contains load-locked (LDx_L) and store-conditional (STx_C) instructions to perform a read-modify-write sequence that appears atomic to other kernel threads of execution.
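The load-locked/store-conditional pattern is not specific to Alpha assembly; the same retry shape can be sketched in portable C11, with a weak compare-exchange standing in for the conditional store. This is an illustrative analog, not Alpha code:

```c
#include <stdatomic.h>

/* Retry loop with the same shape as an Alpha LDx_L/STx_C sequence:
 * read the current value (the "load-locked" step), compute the new
 * value, and attempt a conditional store that succeeds only if no
 * other thread wrote the location in between; on failure, retry. */
void atomic_increment(atomic_int *p)
{
    int old = atomic_load(p);
    /* On failure, atomic_compare_exchange_weak reloads 'old' with the
     * current value, so the loop simply retries with fresh data. */
    while (!atomic_compare_exchange_weak(p, &old, old + 1)) {
        /* another thread intervened; retry the read-modify-write */
    }
}
```

The loop makes the increment appear atomic to other threads of execution, which is exactly the guarantee the LDx_L/STx_C pair provides on Alpha.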
Memory barrier instructions order a CPU's memory reads and writes from the viewpoint of other CPUs and I/O processors. The locking mechanisms (simple and complex locks) that Digital provides take care of the idiosyncrasies related to read-modify-write sequences and memory barriers on Alpha CPUs. Therefore, you need not be concerned about these hardware issues when implementing SMP-safe device drivers that use simple and complex locks.
The rest of this section describes the following hardware-related issues:
Software synchronization refers to the coordination of events in such a way that only one event happens at a time. This kind of synchronization is a serialization or sequencing of events. Serialized events are assigned an order and processed one at a time in that order. While a serialized event is being processed, no other event in the series is allowed to disrupt it.
By imposing order on events, software synchronization allows reading and writing of several data items indivisibly, or atomically, to obtain a consistent set of data. For example, all of process A's writes to shared data must happen before or after process B's writes or reads, but not during them. In this case, all of process A's writes must happen indivisibly for the operation to be correct. This includes updates in which process A reads a data item, modifies it, and writes it back (a read-modify-write sequence). Other synchronization techniques ensure the completion of an asynchronous system service before the caller tries to use the results of the service.
Atomicity is a type of serialization that refers to the indivisibility of a small number of actions, such as those occurring during the execution of a single instruction or a small number of instructions. With more than one action, no single action can occur by itself. If one action occurs, then all the actions occur. Atomicity must be qualified by the viewpoint from which the actions appear indivisible: an operation that is atomic for kernel threads running on the same CPU can appear as multiple actions to a kernel thread of execution running on a different CPU.
An atomic memory reference results in one indivisible read or write of a data item in memory. No other access to any part of that data can occur during the course of the atomic reference. Atomic memory references are important for synchronizing access to a data item that is shared by multiple writers or by one writer and multiple readers. References need not be atomic to a data item that is not shared or to one that is shared but is only read.
Alignment refers to the placement of a data item in memory. For a data item to be naturally aligned, its lowest-addressed byte must reside at an address that is a multiple of the size of the data item (in bytes). For example, a naturally aligned longword has an address that is a multiple of 4. The term naturally aligned is usually shortened to ``aligned.''
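The natural-alignment rule can be expressed as a one-line check. The function below is an illustrative sketch of the definition just given:

```c
#include <stdint.h>
#include <stddef.h>

/* A data item is naturally aligned when its lowest-addressed byte sits
 * at an address that is a multiple of the item's size in bytes.  For
 * example, a naturally aligned longword (4 bytes) has an address that
 * is a multiple of 4. */
int is_naturally_aligned(uintptr_t addr, size_t size)
{
    return addr % size == 0;
}
```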
An Alpha CPU allows atomic access only to an aligned longword or an aligned quadword. Reading or writing an aligned longword or quadword of memory is atomic with respect to any other kernel thread of execution on the same CPU or on other CPUs.
The phrase granularity of data access refers to the size of neighboring units of memory that can be written independently and atomically by multiple CPUs. Regardless of the order in which two neighboring units are written, the results must be identical.
Alpha systems have longword and quadword granularity. That is, only adjacent aligned longwords or quadwords can be written independently. Because Alpha systems support only instructions that load or store longword-sized and quadword-sized memory data, the manipulation of byte-sized and word-sized data on Alpha systems requires that the entire longword or quadword that contains the byte- or word-sized item be manipulated. Thus, simply because of its proximity to an explicitly shared data item, neighboring data might become shared unintentionally. Manipulation of byte-sized and word-sized data on Alpha systems requires multiple instructions that:

1. Fetch the longword or quadword that contains the byte or word

2. Mask and shift the data to modify the target byte or word

3. Store the entire longword or quadword back to memory
Because this sequence is interruptible, operations on byte and word data are not atomic on Alpha systems. Also, this change in the granularity of memory access can affect the determination of which data is actually shared when a byte or word is accessed.
The absence of byte and word granularity on Alpha systems has important implications for access to shared data. In effect, any memory write of a data item other than an aligned longword or quadword must be done as a multiple-instruction read-modify-write sequence. Also, because the amount of data read and written is an entire longword or quadword, you must ensure that all accesses to fields within the longword or quadword are synchronized with each other.
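The mask-and-merge step of that multiple-instruction sequence can be sketched in portable C. The function below shows only the "modify" portion of the read-modify-write; nothing about it is atomic, which is the point of the preceding discussion. The little-endian byte numbering is an illustrative assumption:

```c
#include <stdint.h>

/* Sketch of the modify step implied by byte access on a longword-granular
 * machine: the containing 32-bit longword has been fetched, the target
 * byte is masked out and replaced, and the whole longword is then written
 * back.  byte_index 0 is the low-order (lowest-addressed, little-endian)
 * byte of the longword. */
uint32_t merge_byte(uint32_t longword, unsigned byte_index, uint8_t value)
{
    unsigned shift = byte_index * 8;
    uint32_t mask  = (uint32_t)0xff << shift;
    return (longword & ~mask) | ((uint32_t)value << shift);
}
```

Because the fetch, merge, and store are separate steps, a neighboring byte written by another CPU between the fetch and the store would be silently overwritten, which is why accesses to fields within the same longword must be synchronized with each other.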
In a single processor environment, device drivers need not protect the integrity of a resource from activities resulting from the actions of another CPU. However, in an SMP environment, the device driver must protect the resource from multiple CPU access to prevent corruption. A resource, from the device driver's standpoint, is data that more than one kernel thread can manipulate. You can store the resource in variables (global) and data structure members. The top half of Figure 2-1 shows a typical problem that could occur in an SMP environment. The figure shows that the resource called i is a global variable whose initial value is 1.
Furthermore, the figure shows that the kernel threads emanating from CPU1 and CPU2 increment resource i. A kernel thread is a single sequential flow of control within a device driver or other systems-based program. The device driver or other systems-based program makes use of the kernel interfaces (instead of a threads library package such as DECthreads) to start, terminate, delete, and perform other kernel thread-related operations. These kernel threads must not increment this resource simultaneously. Without some way to lock the global variable while one kernel thread is incrementing it, the integrity of the data stored in this resource is compromised in the SMP environment.
To protect the integrity of the data, you must enforce order on the accesses of the data by multiple CPUs. One way to establish the order of CPU access to the resource is to establish a lock. As the bottom half of the figure shows, the kernel thread emanating from CPU1 locks access to resource i, thus preventing access by kernel threads emanating from CPU2. This guarantees the integrity of the value stored in this resource.
The vertical line in the bottom half of the figure represents a barrier that prevents the kernel thread emanating from CPU2 from accessing resource i until the kernel thread emanating from CPU1 unlocks it. For simple locks, this barrier indicates that the lock is exclusive. That is, no other kernel thread can gain access to the lock until the kernel thread currently controlling it has released (unlocked) it.
For complex write locks, this barrier represents a wait hash queue that collects all of the kernel threads waiting to gain write access to a resource. With complex read locks, all kernel threads can access the same resource read-only at the same time.
For funnels, this barrier indicates that processing of the device driver subsystem is forced onto a single CPU.
Digital UNIX provides three ways to lock specific resources (global variables and data structures) referenced in code blocks in the device driver: simple locks, complex locks, and funnels. Simple and complex locks allow device drivers to:
A funnel allows device drivers to force execution onto a single CPU. In this case, the locking is achieved by simulating a single-processor environment.
The following sections briefly describe simple locks, complex locks, and funnels.
A simple lock is a general-purpose mechanism for protecting resources in an SMP environment. Figure 2-2 shows that simple locks are spin locks. That is, the kernel interfaces used to implement the simple lock do not return until the lock has been obtained.
As the figure shows, the CPU1 kernel thread obtains a simple lock on resource i. Once the CPU1 kernel thread obtains the simple lock, it has exclusive access over the resource to perform read and write operations on the resource. The figure also shows that the CPU2 kernel thread spins while waiting for the CPU1 kernel thread to unlock (free) the simple lock.
There are tradeoffs in performance and realtime preemption latency associated with simple locks that you should understand before you use them. However, there are times when device drivers must use simple locks. For example, device drivers must use simple locks (together with the spl interfaces) to synchronize with interrupt service interfaces. Section 2.4 provides guidelines to help you choose among simple locks, complex locks, and funnels.
Table 2-1 lists the kernel interfaces and data structure associated with simple locks. Chapter 3 discusses how to use these interfaces and data structure to implement simple locks in a device driver.
Kernel Interfaces/Structure | Description |
decl_simple_lock_data | Declares a simple lock structure. |
simple_lock | Asserts a simple lock. |
simple_lock_init | Initializes a simple lock structure. |
simple_lock_terminate | Terminates use of a simple lock. |
simple_lock_try | Tries to assert a simple lock. |
simple_unlock | Releases a simple lock. |
slock | Contains simple lock-specific information. |
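As an orientation before Chapter 3, the following fragment sketches how the interfaces in Table 2-1 fit together. The lock name slk_xx is an illustrative assumption, and the fragment depends on the kernel environment, so it is a sketch rather than standalone code:

```c
#include <kern/lock.h>

decl_simple_lock_data( , slk_xx);   /* declare the simple lock structure */

/* Once, typically at driver configuration time: */
simple_lock_init(&slk_xx);

/* In any code block that writes the shared resource: */
simple_lock(&slk_xx);               /* spins until the lock is obtained */
/* ...read and write the shared resource... */
simple_unlock(&slk_xx);             /* release the lock */

/* Where spinning would waste too much CPU time, test for the lock instead: */
if (simple_lock_try(&slk_xx)) {
        /* lock obtained; access the resource, then release it */
        simple_unlock(&slk_xx);
} else {
        /* lock busy; take a recovery path instead of spinning */
}
```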
A complex lock is a mechanism for protecting resources in an SMP environment. A complex lock achieves the same results as a simple lock. However, device drivers should use complex locks (not simple locks) if there are blocking conditions.
The kernel interfaces that implement complex locks synchronize access to kernel data between multiple kernel threads. The following describes characteristics associated with complex locks:
Figure 2-3 shows that complex locks are not spin locks, but blocking (sleeping) locks. That is, the kernel interfaces used to implement the complex lock block (sleep) until the lock is released. Thus, unlike simple locks, you should not use complex locks to synchronize with interrupt service interfaces. Because of the blocking characteristic of complex locks, they are active on both single and multiple CPUs to serialize access to data between kernel threads.
As the figure shows, the CPU1 kernel thread asserts a complex lock with write access on resource i. The CPU2 kernel thread also asserts a complex lock with write access on resource i. Because the CPU1 kernel thread asserts the write complex lock on resource i first, the CPU2 kernel thread blocks, waiting until the CPU1 kernel thread unlocks (frees) the complex write lock.
Like simple locks, complex locks present tradeoffs in performance and realtime preemption latency that you should understand before you use them. However, there are times when device drivers must use complex locks. For example, device drivers must use complex locks when there are blocking conditions in the code block. On the other hand, you must not take a complex lock while holding a simple lock or when using the timeout interface. Section 2.4 provides guidelines to help you choose among simple locks, complex locks, and funnels.
Table 2-2 lists the kernel interfaces and data structure associated with complex locks.
Chapter 4 discusses how to use these interfaces and data structure to implement complex locks in a device driver.
Kernel Interfaces/Structure | Description |
lock | Contains complex lock-specific information. |
lock_done | Releases a complex lock. |
lock_init | Initializes a complex lock. |
lock_read | Asserts a complex lock with read-only access. |
lock_terminate | Terminates use of a complex lock. |
lock_try_read | Tries to assert a complex lock with read-only access. |
lock_try_write | Tries to assert a complex lock with write access. |
lock_write | Asserts a complex lock with write access. |
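The following fragment sketches the read/write usage of the interfaces in Table 2-2. The lock name xx_lk is illustrative, the TRUE argument to lock_init reflects the blocking behavior described above, and the fragment depends on the kernel environment, so it is a sketch rather than standalone code:

```c
#include <kern/lock.h>

struct lock xx_lk;        /* complex lock structure */

/* Once, at configuration time; TRUE indicates callers can block (sleep): */
lock_init(&xx_lk, TRUE);

/* Many kernel threads can hold read-only access at the same time: */
lock_read(&xx_lk);
/* ...read the shared resource... */
lock_done(&xx_lk);        /* release the lock */

/* Only one kernel thread at a time holds write access; others block: */
lock_write(&xx_lk);
/* ...update the shared resource... */
lock_done(&xx_lk);        /* release the lock */
```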
A funnel is a mechanism for protecting resources in an SMP environment. A funnel achieves the same results as simple and complex locks, not by locking resources and code blocks but by forcing execution of a subsystem (for example, a device driver) onto a single CPU. You funnel a device driver onto a single CPU by setting the d_funnel member of the associated dsent structure to the value DEV_FUNNEL. This value causes the kernel to make the appropriate calls to unix_master and unix_release to force execution of the driver onto the master CPU. Device drivers should not directly call the unix_master and unix_release interfaces.

One exception to this recommendation is when you want a device driver's kernel threads to run only on the master CPU. This situation occurs when your driver creates and starts its own kernel threads and you set the d_funnel member of the associated dsent structure to the value DEV_FUNNEL. In this case, each kernel thread must call unix_master once to ensure that the kernel thread runs only on the master CPU. Remember to make a corresponding call to unix_release.
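The two funnel cases just described can be sketched as follows; the xx_dsent name and the thread body are illustrative assumptions, and the fragment depends on the kernel environment, so it is a sketch rather than code from a real driver:

```c
/* Usual case: let the kernel do the funneling by marking the driver's
 * dsent entry; the kernel then calls unix_master and unix_release itself. */
xx_dsent.d_funnel = DEV_FUNNEL;

/* Exception: a kernel thread that the funneled driver creates itself
 * must pin itself to the master CPU explicitly, exactly once. */
unix_master();        /* run only on the master CPU from here on */
/* ...kernel thread work... */
unix_release();       /* allow execution on any CPU again */
```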
Figure 2-4 shows this feature of funnels. As the figure shows, the CPU1 kernel thread emanating from some device driver causes the kernel to call the unix_master interface, which forces execution onto the master CPU (also called the boot CPU). This action assumes that the device driver initialized the d_funnel member of its dsent structure to DEV_FUNNEL.
Using funnels is not optimal for performance on CPUs running in an SMP environment. However, you can use funnels as an alternative to simple and complex locks to write SMP-safe third-party device drivers. Like simple and complex locks, funnels present tradeoffs in performance and realtime preemption latency that you should understand before you use them. Section 2.4 provides guidelines to help you choose among simple locks, complex locks, and funnels.
You can make your device drivers SMP safe by implementing a locking method (simple or complex) or by using a funnel. Use a funnel if it is acceptable to force execution of the device driver onto a single CPU and optimal performance is not a consideration. Otherwise, choose one of the locking methods.
This section provides guidelines to help you choose the appropriate locking method (simple or complex). In choosing a locking method, consider the following SMP-related characteristics:
The following sections discuss each of these characteristics. See Section 2.4.6 for a summary comparison table of the locking methods that you can use to determine which items to lock in your device drivers.
To choose the appropriate lock method, you must understand the entity that has access to a particular resource. Possible entities that can access a resource are kernel threads, interrupt service interfaces, and exceptions. If you need a lock for resources accessed by multiple kernel threads, use simple or complex locks. Use a combination of spl interfaces and simple locks to lock resources that kernel threads and interrupt service interfaces access.
For exceptions, use complex locks if the exception involves blocking conditions. If the exception does not involve blocking conditions, you can use simple locks.
You must determine if it is necessary to prevent access to the resource while a kernel thread blocks (sleeps). One example is waiting for disk I/O to a buffer. If you need a lock to prevent access to the resource while a kernel thread blocks (sleeps) and there are no blocking conditions, use simple or complex locks. Otherwise, if there are blocking conditions, use complex locks.
You must estimate the length of time the lock is held to determine the appropriate lock method. In general, use simple locks when the entity accesses are bounded and small. One example of a bounded and small access is some entity accessing a system time variable. Use complex locks when the entity accesses could take a long time or a variable amount of time. One example of a variable amount of time is some entity scanning linked lists.
You must account for execution speed in choosing the appropriate lock method. The following factors influence execution speed:
Complex locks are slightly more than twice as expensive (in terms of execution speed) as simple locks. The reason for this is that complex locks use the simple lock interfaces to implement the lock. Thus, it takes two lock and unlock pairs to protect a resource or code block with a complex lock as opposed to one pair for the simple lock.
Complex locks use more memory space than simple locks. The reason for this is the complex lock structure, lock, contains a pointer to a simple lock structure in addition to other data to implement the complex lock.
Busy wait time is the amount of CPU time expended on waiting for a simple lock to become free. If the driver initiates a simple lock on a resource and the code block is long (or there are numerous interrupts), a lot of CPU time could be wasted waiting for the simple lock to become free. If this is the case, use complex locks to allow the current kernel thread to block (sleep) on the busy resource. This action allows the CPU to execute a different kernel thread.
Realtime preemption cannot occur when a simple lock is held. Use of complex locks (which can block) improves the performance associated with realtime preemption.
In general, use complex locks for resources contained in long code blocks. Also, use complex locks in cases where the resource must be prevented from changing when a kernel thread blocks (sleeps).
Use simple locks for resources contained in short, nonblocking code blocks or when synchronizing with interrupt service interfaces.
Table 2-3 summarizes the SMP-related characteristics for choosing the appropriate lock method to make your device driver SMP safe. The first column of the table presents an SMP-related characteristic and the second and third columns present the lock methods. The following list describes the possible entities that can appear in the second and third columns:
Yes: Indicates that the lock method is suitable for the characteristic

No: Indicates that the lock method is not suitable for the characteristic

Better: Indicates that this lock method is most suitable for the characteristic

Worse: Indicates that this lock method is not the most suitable for the characteristic
(The numbers under Characteristic appear for easy reference in later descriptions.)
Characteristic | Simple Lock | Complex Lock |
1. Kernel threads will access this resource. | Yes | Yes |
2. Interrupt service interfaces will access this resource. | Yes | No |
3. Exceptions will access this resource. | Yes | Yes |
4. Need to prevent access to this resource while a kernel thread blocks and there are no blocking conditions. | Yes | Yes |
5. Need to prevent access to this resource while a kernel thread blocks and there are blocking conditions. | No | Yes |
6. Need to protect resource between kernel threads and interrupt service interfaces. | Yes | No |
7. Need to have maximum execution speed for this device driver. | Yes | No |
8. The driver references and updates this resource in long code blocks (implying that the length of time the lock is held on this resource is not bounded and long). | Worse | Better |
9. The driver references and updates this resource in short nonblocking code blocks (implying that the length of time the lock is held on this resource is bounded and short). | Better | Worse |
10. Need to minimize memory usage by the lock-specific data structures. | Yes | No |
11. Need to synchronize with interrupt service interfaces. | Yes | No |
12. The driver can afford busy wait time. | Yes | No |
13. The driver implements realtime preemption. | Worse | Better |
Use the following steps to analyze your device driver to determine which items to lock and which locking method to choose:
Section 2.4 presents the SMP-related characteristics you must consider when choosing a locking method. You need to analyze each section of the driver (for example, the open and close device section, the read and write device section, and so forth) and apply those SMP-related characteristics to the following resource categories:
The following sections discuss each of these categories. See Section 2.5.5 for an example that walks you through the steps for analyzing a device driver to determine which resources to lock.
Analyze each section of your device driver to determine if the access to a resource is read only. In this case, resource refers to driver and system data stored in global variables or data structure members. You do not need to lock resources that are read only because there is no way to corrupt the data in a read-only operation.
Analyze each section of your device driver to determine accesses to a device's control status register (CSR) addresses. Many device drivers based on the UNIX operating system use the direct method; that is, they access a device's CSR addresses directly through a device register structure. This method involves declaring a device register structure that describes the device's characteristics, which include a device's control status register. After declaring the device register structure, the driver accesses the device's CSR addresses through the member that maps to it.
There are some CPU architectures that do not allow you to access the device CSR addresses directly. Device drivers that need to operate on these types of CPUs should use the indirect method. In fact, device drivers operating on Alpha systems must use the indirect method. Thus, the discussion of locking a device's CSR addresses focuses on the indirect method.
The indirect method involves defining device register offset definitions (instead of a device register structure) that describe the device's characteristics, which include a device's control status register. The method also includes the use of the following categories of kernel interfaces:
read_io_port: Reads data from a device register

write_io_port: Writes data to a device register

io_copyin: Copies data from bus address space to system memory

io_copyio: Copies data from bus address space to bus address space

io_copyout: Copies data from system memory to bus address space
Using these interfaces makes your driver more portable across different bus architectures, different CPU architectures, and different CPU types within the same architecture. See Writing Device Drivers: Tutorial for examples of how to use these interfaces.
The following example shows the device register offset definitions that some /dev/xx driver defines for some XX device:
.
.
.
#define XX_ADDER 0x0   /* 32-bit read/write DMA address register */
#define XX_DATA  0x4   /* 32-bit read/write data register */
#define XX_CSR   0x8   /* 16-bit read/write CSR/LED register */
#define XX_TEST  0xc   /* Go bit register. Write sets. Read clears */
.
.
.
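Assuming a driver that keeps its I/O handle in a variable named xx_base (an illustrative name, not part of the example above), an access to the CSR through the generic interfaces might look like the following sketch. The width argument is in bytes, and the fragment depends on the kernel environment, so it is not standalone code:

```c
/* Clear the 16-bit CSR/LED register, then read it back, using the
 * device register offset definitions shown above. */
write_io_port(xx_base + XX_CSR, 2, 0, 0);       /* 2-byte write, no flags */
csr_val = read_io_port(xx_base + XX_CSR, 2, 0); /* 2-byte read */
```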
Analyze the declarations and definitions sections of your device driver to identify the following global resources:
Driver-specific variables can store a variety of information, including flag values that control execution of code blocks and status information. The following example shows the declaration and initialization of some typical driver-specific global variables. Use this example to help you locate similar driver-specific global variables in your device drivers.
.
.
.
int num_xx = 0;
.
.
.
int xx_is_dynamic = 0;
.
.
.
Driver-specific data structures contain members that can store such information as whether the device is attached, whether the device is opened, the read/write mode, and so forth. The following example shows the declaration and initialization of some typical driver-specific data structures. Use this example to help you locate similar driver-specific data structures in your device drivers.
.
.
.
struct driver xxdriver = {
.
.
.
};
.
.
.
cfg_subsys_attr_t xx_attributes[] = {
.
.
.
};
.
.
.
struct xx_softc {
.
.
.
} xx_softc[NXX];
.
.
.
struct cdevsw xx_cdevsw_entry = {
.
.
.
};
After you identify the driver-specific global variables and driver-specific data structures, locate the code blocks in which the driver references them. Use Table 2-3 to determine which locking method is appropriate. Also, determine the granularity of the lock.
Analyze the declarations and definitions sections of your device driver to identify the following global resources:
System-specific variables include the global variables hz, cpu, and lbolt. The following example shows the declaration of one system-specific global variable:
.
.
.
extern int hz;
.
.
.
System-specific data structures include controller, buf, and ihandler_t. The following example shows the declaration of some system-specific data structures:
.
.
.
struct controller *info[NXX];
.
.
.
struct buf cbbuf[NCB];
.
.
.
After you identify the system-specific global variables and system-specific data structures, locate the code blocks in which the driver references them. Use Table 2-3 to determine which locking method is appropriate. Also, determine the granularity of the lock.
Note
To lock buf structure resources, use the BUF_LOCK and BUF_UNLOCK interfaces instead of the simple and complex lock interfaces. See Writing Device Drivers: Reference for descriptions of these interfaces.
Use the following steps to determine which resources to lock in your device drivers:
The following example walks you through an analysis of which resources to lock for the /dev/xx driver.
Table 2-4 summarizes the resources that you might lock in your driver according to the following categories:
Category | Associated Resources |
Device control status register (CSR) addresses | N/A |
Device driver-specific global variables | Variables that store flag values to control execution of code blocks; variables that store status information. |
Device driver-specific global data structures | dsent, cfg_subsys_attr_t, driver, and the driver's softc structure. |
System-specific global variables | cpu, hz, lbolt, and page_size. |
System-specific global data structures | controller and buf. |
One resource that the /dev/xx driver must lock is the device CSR addresses. This driver also needs to lock the hz global variable. The example analysis focuses on the following device register offset definitions for the /dev/xx driver:
.
.
.
#define XX_ADDER 0x0   /* 32-bit read/write DMA address register */
#define XX_DATA  0x4   /* 32-bit read/write data register */
#define XX_CSR   0x8   /* 16-bit read/write CSR/LED register */
#define XX_TEST  0xc   /* Go bit register. Write sets. Read clears */
.
.
.
Identify all of the code blocks that manipulate the resource. If the code block accesses the resource read-only, you may not need to lock the resources that it references. However, if the code block writes to the resource, you need to lock the resource by calling the simple or complex lock interfaces.
The /dev/xx driver accesses the device register offset definition resources in the Open and Close Device Section and the Read and Write Device Section.
Table 2-5 shows how to analyze the locking method that is most suitable for the device register offset definitions for some /dev/xx driver. (The numbers under Characteristic appear for easy reference in later descriptions.)
Characteristic | Applies to This Driver | Simple Lock | Complex Lock |
1. Kernel threads will access this resource. | Yes | Yes | Yes |
2. Interrupt service interfaces will access this resource. | No | N/A | N/A |
3. Exceptions will access this resource. | No | N/A | N/A |
4. Need to prevent access to this resource while a kernel thread blocks and there are no blocking conditions. | Yes | Yes | Yes |
5. Need to prevent access to this resource while a kernel thread blocks and there are blocking conditions. | No | N/A | N/A |
6. Need to protect resource between kernel threads and interrupt service interfaces. | Yes | Yes | No |
7. Need to have maximum execution speed for this device driver. | Yes | Yes | No |
8. The driver references and updates this resource in long code blocks (implying that the length of time the lock is held on this resource is not bounded and long). | No | N/A | N/A |
9. The driver references and updates this resource in short nonblocking code blocks (implying that the length of time the lock is held on this resource is bounded and short). | Yes | Better | Worse |
10. Need to minimize memory usage by the lock-specific data structures. | Yes | Yes | No |
11. Need to synchronize with interrupt service interfaces. | No | N/A | N/A |
12. The driver can afford busy wait time. | Yes | Yes | No |
13. The driver implements realtime preemption. | No | N/A | N/A |
The locking analysis table for the device register offset definitions shows the following:
Based on the previous analysis, the /dev/xx driver uses the simple lock method.
After choosing the appropriate locking method for the resource, determine the granularity of the lock. For example, in the case of the device register offset resource you can determine the granularity by answering the following questions:
Table 2-5 shows that the need to minimize memory usage is important to the /dev/xx driver; therefore, creating one simple lock for all of the device register offset definitions would save the most memory. The following code fragment shows how to declare a simple lock for all of the device register offset definitions:
.
.
.
#include <kern/lock.h>
.
.
.
decl_simple_lock_data( , slk_xxdevoffset);
.
.
.
If the preservation of memory were not important to the /dev/xx driver, declaring a simple lock for each device register offset definition might be more appropriate. The following code fragment shows how to declare a simple lock structure for each of the example device register offset definitions:
.
.
.
#include <kern/lock.h>
.
.
.
decl_simple_lock_data( , slk_xxaddr);
decl_simple_lock_data( , slk_xxdata);
decl_simple_lock_data( , slk_xxcsr);
decl_simple_lock_data( , slk_xxtest);
.
.
.
After declaring a simple lock structure for an associated resource, you must initialize it (only once) by calling simple_lock_init. You then use the simple lock-related interfaces in code blocks that access the resource. Chapter 3 discusses the simple lock-related interfaces.
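Putting the chapter's pieces together for the slk_xxdevoffset lock declared earlier, the lifecycle looks like the following sketch. The placement of the calls is an assumption for illustration, and the fragment depends on the kernel environment, so it is not standalone code:

```c
/* Once, typically at driver configuration time: */
simple_lock_init(&slk_xxdevoffset);

/* In each code block that accesses XX_ADDER, XX_DATA, XX_CSR, or XX_TEST: */
simple_lock(&slk_xxdevoffset);      /* spins until the lock is obtained */
/* ...access the device registers... */
simple_unlock(&slk_xxdevoffset);    /* release the lock */
```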