9 Implementing Block Device Driver Interfaces

Block device drivers can contain the following sections:

A dump section, which contains a dump interface
A psize section, which contains a psize interface
A strategy section, which contains a strategy interface

The following sections explain how to implement or set up each of these interfaces.

9.1 Implementing the dump Interface

A device driver's dump interface performs the tasks necessary to copy system memory to the dump device. The code associated with a dump interface resides in the dump section of the device driver. You specify the entry point for a driver's dump interface in a dsent structure. Section 5.4 describes the dsent structure. Writing Device Drivers: Reference provides a reference page that gives additional information on the arguments and tasks associated with a dump interface.

To implement a dump interface you must understand the dev_t data type, which is discussed in Section 8.1.1. The following code shows you how to set up a dump interface, using the /dev/xx driver as an example:

xxdump(dumpdev)
dev_t dumpdev;  [1]
{

.
.
.

[1] Declares an argument that specifies the device on which the dump operation should be performed. The values in this argument are passed to the driver's dump interface. Typically, these values specify the disk unit number and partition. The dump device is often specified explicitly in the target (system) configuration file. If the dump device is not explicitly specified in the system configuration file, the default is to dump to the dump partition of the system's boot device.

Note
Digital UNIX does not currently support a dump interface's ability to copy the contents of system memory to the specified device. Device driver writers should not provide a dump interface for this version of Digital UNIX.

9.2 Implementing the psize Interface

A device driver's psize interface performs the tasks necessary to return the size of a disk partition. Typically, only a disk device driver would implement a psize interface. The code associated with a psize interface resides in the psize section of the device driver. You specify the entry point for a driver's psize interface in a dsent structure. Section 5.4 describes the dsent structure. Writing Device Drivers: Reference provides a reference page that gives additional information on the arguments and tasks associated with a psize interface. See Writing Device Drivers: Advanced Topics for an example implementation of a psize interface.

To implement a psize interface you must understand the dev_t data type, which is discussed in Section 8.1.1. The following code shows you how to set up a psize interface, using the /dev/xx driver as an example:

xxpsize(dev)
dev_t dev;  [1]
{

.
.
.

[1] Declares an argument that specifies the device and partition for which the size is being requested.

9.3 Implementing the strategy Interface

A device driver's strategy interface carries out block I/O for block devices and initiates read and write operations for character devices. The code associated with a strategy interface resides in the strategy section of the device driver. You specify the entry point for a driver's strategy interface in a dsent structure. Section 5.4 describes the dsent structure. Writing Device Drivers: Reference provides a reference page that gives additional information on the arguments and tasks associated with a strategy interface.

Although the strategy section applies to block device drivers, character device drivers can also contain a strategy interface that is called by the character driver's read and write interfaces.

The following list describes some typical tasks that you perform when implementing a strategy interface:

Use the systemwide pool of buf structures
Use locally defined buf structures
Use the buf structure
Use buffer cache management
Set up the strategy interface

Your strategy interface will probably perform most of these tasks and, possibly, some additional ones. The following sections describe each of these tasks, using the /dev/xx driver as an example.

9.3.1 Using the Systemwide Pool of buf Structures

The buf structure describes arbitrary I/O, but is usually associated with block I/O and physio. The buf structure does not contain data. Instead, it contains information about where the data resides and information about the types of I/O operations. A systemwide pool of buf structures exists for block I/O; however, many device drivers also include locally defined buf structures for use with the physio kernel interface. The following code shows how the /dev/xx driver could use the systemwide pool of buf structures with the xxminphys interface.

xxminphys(bp)
register struct buf *bp; [1]
{

.
.
.

[1] Declares a pointer to a buf structure called bp. The xxminphys interface references the systemwide pool of buf structures to perform a variety of tasks, including checking the size of the requested transfer.
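As an illustration of the size check mentioned above, the following user-space sketch clamps a request to a maximum transfer size. The struct buf shown here is a stand-in that contains only the b_bcount member, the XX_MAXPHYS limit is a hypothetical value, and the function uses an ANSI prototype so that it compiles outside the kernel. It is not the actual /dev/xx implementation.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's buf structure; only the member
 * that this sketch needs is declared. */
struct buf {
    int b_bcount;               /* size of the requested transfer, in bytes */
};

/* Hypothetical per-transfer limit for the /dev/xx device. */
#define XX_MAXPHYS (64 * 1024)

/* Clamp the request so that one transfer never exceeds XX_MAXPHYS bytes. */
void xxminphys(struct buf *bp)
{
    assert(bp != NULL);
    if (bp->b_bcount > XX_MAXPHYS)
        bp->b_bcount = XX_MAXPHYS;
}
```

A request that is already within the limit is left unchanged; a larger request is reduced to the limit.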

9.3.2 Declaring Locally Defined buf Structures

The following code shows how the /dev/xx driver could declare an array of locally defined buf structures and reference it with the controller attach interface:


.
.
.

struct xx_unit {

.
.
.

struct buf *xxbuf;

.
.
.

} xx_unit[NXX];

.
.
.

#define NXX TC_OPTION_SLOTS [1]

.
.
.

struct buf xxbuf[NXX]; [2]

.
.
.

xxcattach(ctlr)
struct controller *ctlr; [3]
{
struct xx_unit *xxunit; [4]

.
.
.

xxunit->xxbuf = &xxbuf[ctlr->ctlr_num]; [5]

[1] The NXX constant is used to allocate the buf structures associated with the XX devices that currently exist on the system.
[2] Declares an array of buf structures called xxbuf. The NXX constant is used to allocate the buf structures for the maximum number of XX devices that currently exist on the system. Thus, there is one buf structure per XX device.
[3] Declares a pointer to a controller structure associated with a specific XX device. The ctlr_num member of this pointer is used as an index to obtain a specific XX device's associated buf structure.
[4] Declares a pointer to the xx_unit data structure associated with this XX device. It contains members that store such information as whether the XX device is opened and the XX device's TC slot number. It also declares a pointer to the xxbuf structure.
[5] Sets the buffer structure address (the xxbuf member of this XX device's xx_unit structure) to the address of this XX device's buf structure. The ctlr_num member is used as an index into the array of buf structures associated with this XX device.
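The fragments above can be assembled into a compilable user-space sketch. The struct buf and struct controller definitions are stand-ins with only the members needed here, NXX is given a small hypothetical value in place of TC_OPTION_SLOTS, and the attached member, the range check, the return values, and the way xxunit is located (indexing xx_unit by ctlr_num) are assumptions; the original listing elides those details.

```c
#include <assert.h>
#include <stddef.h>

#define NXX 3                         /* hypothetical; the text uses TC_OPTION_SLOTS */

struct buf { int b_flags; };          /* stand-in for the kernel's buf */
struct controller { int ctlr_num; };  /* stand-in; only ctlr_num is shown */

struct xx_unit {
    int attached;                     /* hypothetical bookkeeping member */
    struct buf *xxbuf;                /* this device's locally defined buf */
} xx_unit[NXX];

struct buf xxbuf[NXX];                /* one buf structure per XX device */

/* Returns 1 on success and 0 if ctlr_num is out of range (an assumed
 * convention for this sketch). */
int xxcattach(struct controller *ctlr)
{
    struct xx_unit *xxunit;

    assert(ctlr != NULL);
    if (ctlr->ctlr_num < 0 || ctlr->ctlr_num >= NXX)
        return 0;

    xxunit = &xx_unit[ctlr->ctlr_num];        /* assumed lookup by ctlr_num */
    xxunit->xxbuf = &xxbuf[ctlr->ctlr_num];   /* bind the unit to its buf */
    xxunit->attached = 1;
    return 1;
}
```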

9.3.3 Using buf Structure Members Related to Device Drivers

Table 9-1 lists the members of the buf structure that device drivers might reference, along with their data types.

Table 9-1: Members of the buf Structure

Member Name	Data Type
`b_flags`	`int`
`b_forw`	`struct buf *`
`b_back`	`struct buf *`
`av_forw`	`struct buf *`
`av_back`	`struct buf *`
`b_bcount`	`int`
`b_error`	`short`
`b_dev`	`dev_t`
`b_un.b_addr`	`caddr_t`
`b_lblkno`	`daddr_t`
`b_blkno`	`daddr_t`
`b_resid`	`int`
`b_iodone`	`void (*b_iodone) ()`
`b_proc`	`struct proc *`

Writing Device Drivers: Reference provides a reference page description of this data structure. The following sections discuss all of these members.

9.3.3.1 The b_flags Member

The b_flags member specifies binary status flags. These flags indicate how a request is to be handled and the current status of the request. These status flags are defined in buf.h and get set by various parts of the kernel. The flags supply the device driver with information about the I/O operation.

The device driver can also send information back to the kernel by setting b_flags. Table 9-2 lists the binary status flags applicable to device drivers.

Table 9-2: Binary Status Flags Applicable to Device Drivers

Flag	Meaning
`B_READ`	This flag is set if the operation is read and cleared if the operation is write.
`B_DONE`	This flag is cleared when a request is passed to a driver `strategy` interface. The device driver writer must call `iodone` to mark a buffer as completed.
`B_ERROR`	This flag specifies that an error occurred on this data transfer. Device drivers set this flag if an error occurs.
`B_BUSY`	This flag indicates that the buffer is in use.
`B_PHYS`	This flag indicates that the associated data is in user address space.
`B_WANTED`	If this flag is set, it indicates that some process is waiting for this buffer. The device driver should issue a call to the `wakeup` interface when the buffer is freed by the current process. The driver passes the address of the buffer as an argument to `wakeup`.
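The following user-space sketch shows how a driver might test and set these flags. The numeric flag values are illustrative only (the real values come from buf.h), the struct buf is a stand-in with a single member, and the helper names xx_is_read and xx_mark_error are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; a real driver takes these from buf.h. */
#define B_READ  0x0001
#define B_DONE  0x0002
#define B_ERROR 0x0004

struct buf { int b_flags; };   /* stand-in for the kernel's buf */

/* Nonzero if the request is a read; zero if it is a write. */
int xx_is_read(const struct buf *bp)
{
    assert(bp != NULL);
    return (bp->b_flags & B_READ) != 0;
}

/* Report a failed transfer back to the kernel through b_flags. */
void xx_mark_error(struct buf *bp)
{
    assert(bp != NULL);
    bp->b_flags |= B_ERROR;
}
```

Because the flags are independent bits, setting B_ERROR with a bitwise OR leaves the direction bit untouched.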

9.3.3.2 The b_forw and b_back Members

The b_forw and b_back members specify a file system buffer hash chain. When the kernel performs an I/O operation on a buffer, the buf structure is not on any list. Device driver writers sometimes use these members to link buf structures to lists.

9.3.3.3 The av_forw and av_back Members

The av_forw and av_back members specify the position on the free list if the b_flags member is not set to B_BUSY. The kernel initializes these members. However, once a buf structure has been handed to the driver, these members are available for local use by the device driver.
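One common local use is a driver-private queue of pending requests linked through av_forw. The following user-space sketch shows that idea under stated assumptions: the struct buf is a stand-in, and the xx_queue type and the xx_enqueue and xx_dequeue helpers are hypothetical names. A real driver would also need to protect the queue against concurrent access, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

struct buf {                    /* stand-in for the kernel's buf */
    struct buf *av_forw;        /* reused as the driver's queue link */
    long b_blkno;
};

struct xx_queue {               /* hypothetical driver-private FIFO */
    struct buf *head;
    struct buf *tail;
};

/* Append a request to the tail of the queue. */
void xx_enqueue(struct xx_queue *q, struct buf *bp)
{
    assert(q != NULL && bp != NULL);
    bp->av_forw = NULL;
    if (q->tail == NULL)
        q->head = bp;
    else
        q->tail->av_forw = bp;
    q->tail = bp;
}

/* Remove and return the oldest request, or NULL if the queue is empty. */
struct buf *xx_dequeue(struct xx_queue *q)
{
    struct buf *bp;

    assert(q != NULL);
    bp = q->head;
    if (bp != NULL) {
        q->head = bp->av_forw;
        if (q->head == NULL)
            q->tail = NULL;
        bp->av_forw = NULL;
    }
    return bp;
}
```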

9.3.3.4 The b_bcount and b_error Members

The b_bcount member specifies the size of the requested transfer (in bytes). This member is initialized by the kernel as the result of an I/O request. The driver writer references this member to determine the size of the I/O request. This member is often used in the driver's strategy interface.

The b_error member specifies the error that occurred on this data transfer. This member is set to an error code if the B_ERROR bit in the b_flags member was set. The driver writer sets this member with the errors defined in the file errno.h.
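As a sketch of how these two members work together, the following user-space example rejects a transfer whose size is not a whole number of sectors. The B_ERROR value is illustrative (the real value comes from buf.h), the struct buf is a stand-in, and XX_SECTOR_SIZE and xx_check_count are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define B_ERROR        0x0004   /* illustrative; real value is in buf.h */
#define XX_SECTOR_SIZE 512      /* hypothetical sector size for /dev/xx */

struct buf {                    /* stand-in for the kernel's buf */
    int   b_flags;
    int   b_bcount;
    short b_error;
};

/* Returns 0 if the transfer size is acceptable. Otherwise marks the
 * request as failed with EINVAL and returns -1. */
int xx_check_count(struct buf *bp)
{
    assert(bp != NULL);
    if (bp->b_bcount <= 0 || bp->b_bcount % XX_SECTOR_SIZE != 0) {
        bp->b_flags |= B_ERROR;
        bp->b_error = EINVAL;
        return -1;
    }
    return 0;
}
```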

9.3.3.5 The b_dev Member

The b_dev member specifies the special device to which the transfer is directed. The data type for this member is dev_t, which maps to major and minor construction macros. The device driver writer should not access the dev_t bits directly. Instead, the driver writer should use the major and minor interfaces to obtain the major and minor numbers for a special device. Section 8.1.1.1 and Section 8.1.1.2 provide examples of how to call these interfaces.
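The following user-space sketch decodes a device number with the major and minor macros instead of touching the dev_t bits directly. It assumes a Linux-style host where these macros (and makedev) come from sys/sysmacros.h; kernel code on Digital UNIX obtains them from its own headers. The split of the minor number into a unit and a 3-bit partition field is a hypothetical layout for /dev/xx, not one defined by this guide.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/sysmacros.h>   /* major(), minor(), makedev() on this host */

/* Hypothetical minor-number layout for /dev/xx: the low 3 bits select
 * the partition and the remaining bits select the unit. */
#define XX_PART_BITS 3
#define XX_PART_MASK ((1u << XX_PART_BITS) - 1)

/* Unit number encoded in the minor number (assumed layout). */
unsigned int xx_unit_number(dev_t dev)
{
    return minor(dev) >> XX_PART_BITS;
}

/* Partition number encoded in the minor number (assumed layout). */
unsigned int xx_partition(dev_t dev)
{
    return minor(dev) & XX_PART_MASK;
}
```

Only the layout of the minor number is driver-specific here; the driver never needs to know how major and minor numbers are packed into a dev_t.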

A device driver writer can specify device special file information (including major and minor numbers) in the sysconfigtab file fragment. Section 13.4 describes the sysconfigtab file fragment and Section 14.1.5 describes the syntax used to populate a sysconfigtab file fragment.

9.3.3.6 The b_un.b_addr Member

The b_un.b_addr member specifies the address at which to pull or push the data. This member is set by the kernel and is the main memory address where the I/O occurs. Driver writers use this member when their drivers need to perform DMA operations. It tells the driver where the data comes from and goes to in memory.

9.3.3.7 The b_lblkno and b_blkno Members

The b_lblkno member specifies the logical block number. The b_blkno member specifies the block number on the partition of a disk or on the file system. The b_blkno member is set by the kernel and it indicates the starting block number on the device where the I/O operation is to begin. Device drivers use this member only with block devices. For disk devices, this member is the block number relative to the start of the partition.
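Because b_blkno is relative to the start of the partition, a disk driver typically adds the partition's starting block before addressing the device. The following user-space sketch shows that arithmetic together with a bounds check. The xx_part structure, the XX_SECTOR_SIZE value, and the xx_absolute_block helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define XX_SECTOR_SIZE 512      /* hypothetical sector size, in bytes */

struct xx_part {                /* hypothetical partition description */
    long start;                 /* first block of the partition on the disk */
    long nblocks;               /* number of blocks in the partition */
};

/* Translate a partition-relative block number into a device block number.
 * Returns -1 if the transfer would run past the end of the partition. */
long xx_absolute_block(const struct xx_part *pp, long b_blkno, int b_bcount)
{
    long nblks;

    assert(pp != NULL);
    /* Round the byte count up to a whole number of blocks. */
    nblks = ((long)b_bcount + XX_SECTOR_SIZE - 1) / XX_SECTOR_SIZE;
    if (b_blkno < 0 || b_blkno + nblks > pp->nblocks)
        return -1;
    return pp->start + b_blkno;
}
```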

9.3.3.8 The b_resid and b_iodone Members

The b_resid member specifies (in bytes) the data not transferred because of some error. The b_iodone member specifies the interface called by iodone. The device driver calls the iodone interface at the completion of an I/O operation, and iodone in turn calls the interface pointed to by the b_iodone member. The driver writer does not need to know anything about the interface pointed to by this member.
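The completion sequence can be sketched in user space as follows. The mock_iodone function is a simplified stand-in for the kernel's iodone interface, not its actual implementation; the B_DONE value is illustrative; and xx_complete and xx_count_done are hypothetical names for a driver completion path and a sample callback.

```c
#include <assert.h>
#include <stddef.h>

#define B_DONE 0x0002           /* illustrative; real value is in buf.h */

struct buf {                    /* stand-in for the kernel's buf */
    int b_flags;
    int b_bcount;               /* bytes requested */
    int b_resid;                /* bytes not transferred */
    void (*b_iodone)(struct buf *);
};

int xx_callbacks;               /* counts calls made through b_iodone */

/* Sample callback; the driver does not need to know what it does. */
void xx_count_done(struct buf *bp)
{
    (void)bp;
    xx_callbacks++;
}

/* Simplified stand-in for iodone: mark the buffer done, then call the
 * interface that b_iodone points to, if there is one. */
void mock_iodone(struct buf *bp)
{
    assert(bp != NULL);
    bp->b_flags |= B_DONE;
    if (bp->b_iodone != NULL)
        bp->b_iodone(bp);
}

/* Hypothetical driver completion path: record the residual count and
 * hand the buffer back. */
void xx_complete(struct buf *bp, int bytes_moved)
{
    assert(bp != NULL && bytes_moved >= 0 && bytes_moved <= bp->b_bcount);
    bp->b_resid = bp->b_bcount - bytes_moved;
    mock_iodone(bp);
}
```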

9.3.3.9 The b_proc Member

The b_proc member specifies a pointer to the proc structure that represents the process performing the I/O. A device driver might pass b_proc in a call to the vtop interface in order to translate a virtual address to a physical address.

9.3.4 Using Buffer Cache Management

When the file system deals with regular files, directories, and block devices, the I/O requests are serviced through the buffer cache system. Because the buffer cache deals with fixed-size buffers, it is often necessary to translate the user's request for I/O into buffer-size pieces called blocks. Only in the case where the size of the user's I/O matches a block and aligns to a block boundary will the underlying request match the user's size. A large I/O request is broken down into many block requests, with each block request going to the buffer cache system separately. Both read and write requests smaller than a block force the file system to request a read of the entire block and deal with the small read or write in the buffer.
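The translation of a user request into blocks is simple arithmetic, sketched below in user space. The 8192-byte block size and the helper names are assumptions made for this illustration; the actual buffer size depends on the file system.

```c
#include <assert.h>

#define BLOCK_SIZE 8192L        /* hypothetical buffer cache block size */

/* Index of the first block touched by a request starting at offset. */
long first_block(long offset)
{
    return offset / BLOCK_SIZE;
}

/* Number of block requests needed to cover length bytes at offset. */
long block_count(long offset, long length)
{
    long last;

    if (length <= 0)
        return 0;
    last = (offset + length - 1) / BLOCK_SIZE;
    return last - first_block(offset) + 1;
}

/* Nonzero only if the request starts on a block boundary and covers a
 * whole number of blocks, so no partial-block handling is needed. */
int is_whole_blocks(long offset, long length)
{
    return length > 0 && offset % BLOCK_SIZE == 0 && length % BLOCK_SIZE == 0;
}
```

For example, a 400-byte request at offset 8000 straddles a block boundary and therefore becomes two block requests, even though it is much smaller than one block.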

Regular files and directories go through an extra translation process to map their logical block number into the physical blocks of the disk device. This mapping process itself can generate block requests to the buffer cache system to deal with file extensions or to obtain or modify indirect file system blocks.

Buffer reads and buffer writes do not necessarily cause I/O to occur. In the case of buffer reads, the request can be satisfied by data already in the cache. On the other hand, buffer writes can modify or replace data in the cache, but the physical write might be delayed. Using a buffer cache enhances performance because data that changes often in the cache does not require a physical write for each change.
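The effect of delayed writes can be shown with a deliberately tiny user-space model: a single cached block that counts how many physical writes it would issue. The cache_buf structure, the 512-byte block size, and the helper names are hypothetical, and the model ignores everything a real buffer cache does beyond the dirty flag.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE 512          /* hypothetical block size for this model */

struct cache_buf {              /* hypothetical one-block cache entry */
    long blkno;
    int  dirty;                 /* set when data differs from the device */
    char data[BLOCK_SIZE];
};

int physical_writes;            /* counts writes that would reach the device */

/* Modify cached data only; the physical write is delayed. */
void cache_write(struct cache_buf *cb, int offset, const char *src, int len)
{
    assert(offset >= 0 && len >= 0 && offset + len <= BLOCK_SIZE);
    memcpy(cb->data + offset, src, (size_t)len);
    cb->dirty = 1;
}

/* Write the block out once if it changed since the last flush. A real
 * cache would pass the buffer to the driver's strategy interface here. */
void cache_flush(struct cache_buf *cb)
{
    if (cb->dirty) {
        physical_writes++;
        cb->dirty = 0;
    }
}
```

Three writes to the same block followed by a flush cost one physical write in this model rather than three.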

The nature of the buffer cache system's delayed physical I/O requires that each buffer request, or each block read or write, be a self-contained I/O request to the device driver's strategy interface. The buffer cache system and the block device driver strategy interface cannot assume any particular process context; therefore, the context of the process must be severed from the I/O request. The buffer passed to the buffer interface has all of the context necessary to perform the I/O.

9.3.5 Buffer Cache Interface

I/O requests come from a buffer cache interface as follows:

(*dsent[major(dev)].d_strategy)(bp);

The dsent table is referenced and the appropriate driver interface is called through the block device's major number. The driver's strategy interface is passed a pointer to a buf structure.
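The table lookup can be modeled in user space as shown below. The dsent_entry structure is a one-member stand-in for the real dsent structure, the four-entry mock_dsent table and the dispatch_strategy helper are hypothetical, and the strategy routine is declared void for simplicity. The major and makedev macros are assumed to come from sys/sysmacros.h on a Linux-style host.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/sysmacros.h>   /* major(), makedev() on this host */

struct buf {                    /* stand-in for the kernel's buf */
    dev_t b_dev;
    int   b_bcount;
};

struct dsent_entry {            /* stand-in; the real dsent has more members */
    void (*d_strategy)(struct buf *);
};

#define NMAJOR 4                /* hypothetical table size */

int xx_requests;                /* counts calls that reach xxstrategy */

void xxstrategy(struct buf *bp)
{
    (void)bp;
    xx_requests++;
}

/* The /dev/xx driver is assumed to own major number 2 in this model. */
struct dsent_entry mock_dsent[NMAJOR] = {
    { NULL }, { NULL }, { xxstrategy }, { NULL }
};

/* Call the strategy interface selected by the buffer's major number.
 * Returns -1 if no driver is registered for that major number. */
int dispatch_strategy(struct buf *bp)
{
    unsigned int maj;

    assert(bp != NULL);
    maj = major(bp->b_dev);
    if (maj >= NMAJOR || mock_dsent[maj].d_strategy == NULL)
        return -1;
    (*mock_dsent[maj].d_strategy)(bp);
    return 0;
}
```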

9.3.6 Setting Up the strategy Interface

The following code shows you how to set up a strategy interface, using the /dev/xx driver as an example:

xxstrategy(bp)
struct buf *bp; [1]
{

.
.
.

[1] Declares an argument that specifies a pointer to a buf structure. This structure describes the I/O operation to be performed. See Writing Device Drivers: Advanced Topics for an example implementation of a strategy interface.