Chapter 1 describes the basic issues that concern a realtime application, and what services a realtime operating system can provide to users to help meet their realtime needs. It mainly describes issues within the scope of the user's application code itself, such as how to set process priorities and scheduling policies, how to lock down process memory, and how to use asynchronous I/O. Chapter 1 also discusses the value of a preemptive kernel in reducing the process preemption latency of a realtime application.
This chapter explores more deeply the latency issues of a system and how they affect the realtime performance of an application. This involves a greater understanding of the interaction of the application with the underlying UNIX system, and with devices involved directly or indirectly with the application. Section 11.2 outlines some ways that a user can improve application performance.
Realtime applications require a predictable response time to external events, such as device interrupts. A typical realtime application involves:
An interrupt-generating device
An interrupt service routine that collects data from the device
User-level code that processes the collected data
Realtime responsiveness is a characterization of how quickly an operating system and an application, working together, can respond to external events. One way of measuring responsiveness is through a system's latency. Latency is the time it takes for the hardware and the operating system to respond to an external event, and is expressed as a delay time. Understanding the causes of latency and minimizing their effects are key to successful realtime program design, and are the focus of this chapter.
Two types of latency are described in the following sections:
Interrupt service routine (ISR) latency
Process dispatch latency (PDL)
A system's interrupt service routine (ISR) latency is the elapsed time from when an interrupt occurs until execution of the first instruction in the interrupt service routine. The system must first recognize that an interrupt has occurred, and then dispatch to the ISR code. If critical postprocessing is done in the ISR, then the user must be concerned with completion time of the ISR code, not just the time it takes to begin execution of its first instruction. Thus there are two concerns: ISR latency and ISR execution. There are factors that cause ISR latency and ISR execution to vary in duration, and these factors make it more difficult to assign latency a deterministic value.
The most important factor is the relative interrupt priority level (IPL) at which the ISR executes. When there are other ISRs of equal or greater interrupt priority level running at the time that the realtime device interrupts, the realtime device ISR is blocked from running until the current ISR is finished.
There could be multiple ISRs waiting to execute that have an equal or higher IPL at the time of the realtime interrupt, and all will hold off the realtime ISR until they complete. In addition, once the realtime ISR is running, it can be preempted or held off by one or more devices of higher IPL, and the realtime ISR will be delayed by the collective duration of these ISRs. Thus, it is important to know the relative IPLs of all the devices that could potentially interrupt during critical realtime processing, including system-provided devices such as a network driver or disk driver.
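As a rough illustration of the two effects described above, the following C sketch adds up the worst-case hold-off for a realtime ISR. The isr_info structure, the function names, and the simplification that each competing ISR runs exactly once are assumptions made for this example; they are not part of any system interface.

```c
#include <stddef.h>

/* Hypothetical description of one competing interrupt source. */
struct isr_info {
    int ipl;              /* interrupt priority level of the device */
    unsigned long run_us; /* assumed worst-case ISR run time, in microseconds */
};

/*
 * Delay before the realtime ISR can start: every competing ISR at an
 * equal or higher IPL must finish first.
 */
unsigned long isr_start_delay_us(const struct isr_info *isrs, size_t n,
                                 int rt_ipl)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (isrs[i].ipl >= rt_ipl)
            total += isrs[i].run_us;
    return total;
}

/*
 * Extra delay once the realtime ISR is already running: only devices
 * at a strictly higher IPL can preempt it.
 */
unsigned long isr_preempt_delay_us(const struct isr_info *isrs, size_t n,
                                   int rt_ipl)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (isrs[i].ipl > rt_ipl)
            total += isrs[i].run_us;
    return total;
}
```

Feeding this model the IPLs and measured ISR times of every device on the system, including the network and disk drivers, gives a first estimate of how far ISR latency and ISR execution time can stretch.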
Process Dispatch Latency (PDL) is the time it takes from when an interrupt occurs until a process that was blocked waiting on the interrupt executes. Process dispatch latency includes:
ISR latency
ISR execution time
Time required to return from the ISR
Time required for the context switch back to the process-level code which is waiting on the interrupt
There are many more factors that can potentially increase the process dispatch latency of a realtime application. Any process that is currently executing code that holds a simple lock, is funneled to the master process, or has its IPL raised, will not be preemptable by the realtime process and thus will hold off the realtime process from running. (Note that a user process cannot hold a simple lock, be funneled to the master process, or have its IPL raised, except through a system call.) Once a process is able to run, it must compete against other processes in order to actually run, and the process with the highest priority will run.
Note that process priority can affect PDL, but cannot affect ISR latency. In other words, no matter how high the priority of an application process, even if it is in the realtime priority range, all ISRs that need servicing at the time that the realtime device's ISR needs servicing will be serviced before process code can execute, no matter in what order or at what interrupt priority level the ISRs run.
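The components of process dispatch latency can be summed in a small C sketch such as the following. The structure, its field names, and the extra holdoff_us term (time spent waiting behind non-preemptable or higher-priority work) are assumptions introduced for illustration, and the values would have to come from measurements on the target system.

```c
/* Hypothetical measurements for one interrupt, all in microseconds. */
struct pdl_components {
    unsigned long isr_latency_us;    /* interrupt to first ISR instruction */
    unsigned long isr_exec_us;       /* ISR execution time */
    unsigned long isr_return_us;     /* time to return from the ISR */
    unsigned long holdoff_us;        /* wait behind non-preemptable or
                                        higher-priority work */
    unsigned long context_switch_us; /* switch back to the waiting process */
};

/* Process dispatch latency is the sum of all of its components. */
unsigned long pdl_total_us(const struct pdl_components *c)
{
    return c->isr_latency_us + c->isr_exec_us + c->isr_return_us +
           c->holdoff_us + c->context_switch_us;
}
```

Raising the priority of the process can shrink only the holdoff_us term in this model; the ISR terms are unaffected, which matches the note above.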
This section contains guidelines for improving realtime responsiveness.
Be sure that there is sufficient memory on your system, and always lock down memory in the user process to reduce paging. Paging will occur when there are many threads and processes running on the system that do not collectively fit into system memory, and must be paged in and out as necessary. Application code and data that are locked in memory will not be paged. Paging affects process dispatch latency because it executes code in the kernel that is protected by simple locks, and thus cannot be preempted. Note that certain system daemons are not locked in memory, so a secondary effect is paging caused by those daemons.
Turn on kernel preemption and set the scheduling policy of your application to SCHED_FIFO. This is described in Chapter 2.
Always consider the process priority level of your application in terms of relative importance in the overall system. You may need to use priorities in the realtime range. This affects process dispatch latency when there are other processes ready to run at the same time that the realtime application is ready to run. The process with the highest priority that has been waiting the longest among the waiting processes of that priority will run first.
Note, however, that always running in the realtime priority range is not necessarily what you should do. If you need to interact with system services that have threads or processes associated with them such as the network, you need to run at a priority at or below the priority of those threads or processes, as well as the priority of anything on which those threads or processes depend.
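The process-level guidelines above (locking memory and selecting the SCHED_FIFO policy at a chosen priority) might be sketched in C as follows. This is a minimal sketch built on the POSIX mlockall() and sched_setscheduler() calls; the helper names lock_process_memory and set_fifo_priority are hypothetical, and enabling kernel preemption is a system configuration step that is not shown.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Lock all current and future pages of the calling process in memory.
 * Returns 0 on success, -1 on failure with errno set by mlockall().
 */
int lock_process_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/*
 * Switch the calling process to the SCHED_FIFO policy at the given
 * priority. Returns 0 on success, -1 on failure. errno is EINVAL when
 * the priority is outside the range the system reports for SCHED_FIFO.
 */
int set_fifo_priority(int priority)
{
    struct sched_param param;
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo == -1 || hi == -1)
        return -1;                /* the policy is not supported */
    if (priority < lo || priority > hi) {
        errno = EINVAL;
        return -1;
    }
    memset(&param, 0, sizeof param);
    param.sched_priority = priority;
    return sched_setscheduler(0, SCHED_FIFO, &param); /* 0 = this process */
}
```

An application would typically call both helpers early in its initialization and check errno on failure, because either call can be refused when the process lacks the necessary privileges. The priority passed to set_fifo_priority should be chosen with the preceding note in mind, so that it does not exceed the priority of any system thread the application depends on.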
In the kernel, there are multiple threads. The purpose of these threads is to perform activities that have the potential of blocking, and thus serve as the delivery mechanism of information between ISRs and user processes. These kernel threads do not have much of the state information that processes have.
Kernel threads use the first-in first-out scheduling policy, and are scheduled along with POSIX processes. The kernel sets priorities as Mach priorities, which are the inverse of POSIX priorities: 0 is the highest priority Mach thread and 63 is the lowest. Under POSIX, 63 is the highest priority and 0 is the lowest.
You can use the ps command to display thread priorities. Because the ps program predates the use of threads, its ability to display information clearly about threads is limited. The following example uses the command ps axm -o L5FMT,psxpri to display the L5FMT format and append the POSIX priority field:
% ps axm -o L5FMT,psxpri
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD PPR
3 R < 0 0 0 0.0 32 -12 0 3.4M * ?? 05:02:40 kernel idle 31
R N 0.0 63 19 - 0:00.00 0
U < 0.0 38 -6 malloc_ 0:00.51 25
U < 0.0 32 -12 402cb0 0:49.47 31
U < 0.0 32 -12 402eac 0:00.00 31
S < 0.0 33 -11 netisr 05:01:23 30
U < 0.0 32 -12 3e3f18 0:00.00 31
U < 0.0 38 -6 4c3b80 0:00.00 25
U 0.0 42 0 ubc_dir 0:00.52 21
U < 0.0 37 -7 4c2678 0:00.01 26
U < 0.0 37 -7 4c2680 0:03.77 26
U < 0.0 38 -6 4c33b0 0:12.69 25
U < 0.0 32 -12 4e36d8 0:00.01 31
U < 0.0 37 -7 4e36d8 0:00.12 26
U < 0.0 37 -7 4ba2d8 0:00.00 26
U < 0.0 38 -6 4e3078 0:00.00 25
U < 0.0 42 -2 24ce30 0:00.03 21
I 0.0 42 0 nfsiod_ 0:01.49 21
I 0.0 42 0 nfsiod_ 0:01.65 21
I 0.0 42 0 nfsiod_ 0:01.82 21
I 0.0 42 0 nfsiod_ 0:00.61 21
I 0.0 42 0 nfsiod_ 0:01.71 21
I 0.0 44 0 nfsiod_ 0:01.26 19
I 0.0 42 0 nfsiod_ 0:01.78 21
80048001 I 0 1 0 0.0 44 0 0 40K pause ?? 0:03.12 init 19
8001 IW 0 3 1 0.0 44 0 0 0K sv_msg_ ?? 0:00.12 kloadsrv 19
8001 S 0 17 1 0.0 44 0 0 48K pause ?? 03:58:06 update 19
8001 I 0 81 1 0.0 44 0 0 120K event ?? 0:02.64 syslogd 19
8001 IW 0 83 1 0.0 42 0 0 0K event ?? 0:00.03 binlogd 21
8001 S 0 135 1 0.0 44 0 0 80K event ?? 8:13.21 routed 19
8001 S 0 226 1 0.0 44 0 0 104K event ?? 8:25.31 portmap 19
8001 IW 0 234 1 0.0 44 0 0 0K event ?? 0:00.21 ypbind 19
.
.
.
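In the listing above, each PRI value and its PPR value add up to 63 (for example, 32 and 31, or 63 and 0). A conversion between the two scales could therefore be sketched in C as follows. The function names and the PRIORITY_MAX constant are hypothetical, and the 0 to 63 range is an assumption taken from that listing rather than from a system header.

```c
#include <assert.h>

/* Assumed top of both priority scales, inferred from the ps listing. */
#define PRIORITY_MAX 63

/* Convert a Mach priority (0 is highest) to a POSIX priority. */
int mach_to_posix_priority(int mach_pri)
{
    assert(mach_pri >= 0 && mach_pri <= PRIORITY_MAX);
    return PRIORITY_MAX - mach_pri;
}

/* Convert a POSIX priority (0 is lowest) to a Mach priority. */
int posix_to_mach_priority(int posix_pri)
{
    assert(posix_pri >= 0 && posix_pri <= PRIORITY_MAX);
    return PRIORITY_MAX - posix_pri;
}
```

For instance, the init process in the listing shows a PRI of 44 and a PPR of 19, and mach_to_posix_priority(44) returns 19.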
You can use the dbx command from a root account to display more information about kernel threads, as follows:
# dbx -k /vmunix
(dbx) set $pid=0
(dbx) tlist [shows kernel threads]
(dbx) tset thread-name;t [shows which routine a thread is running]
(dbx) p thread->sched_pri [shows Mach priority for the current thread]
The following example shows use of the dbx command:
# dbx -k /vmunix
dbx version 3.11.8
Type 'help' for help.
stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
warning: Files compiled -g3: parameter values probably wrong
(dbx) set $pid=0
(dbx) tlist
thread 0xfffffc0003fd1be8 stopped at [thread_run:2388 ,0xfffffc00002a2560] Source not available
thread 0xfffffc0003fd6000 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd62c0 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd6580 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd6dc0 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd7080 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd7340 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd7600 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd78c0 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd7b80 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6a000 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6a2c0 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6a580 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6a840 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6ab00 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6adc0 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003fd1950 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6b080 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6b340 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6b600 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6b8c0 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0003f6bb80 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
thread 0xfffffc0000926000 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
(dbx) tset 0xfffffc0003f6bb80;t
thread 0xfffffc0003f6bb80 stopped at [thread_block:2020 ,0xfffffc00002a1da0] Source not available
> 0 thread_block() ["/usr/sde/osf1/build/ptos.bl8/src/kernel/kern/sched_prim.c":2017, 0xfffffc00002a1d9c]
  1 async_io_thread(0x0, 0x0, 0x0, 0x0, 0x0) ["../../../../src/kernel/nfs/nfs_vnodeops.c":2828, 0xfffffc00002f4898]
(dbx) p thread->sched_pri
44
By default, the parameter ubc_maxpercent in the file /sys/conf/param.c is set to 100. That means that up to 100% of physical memory can be consumed by the Unified Buffer Cache (UBC) for buffering file data. Some systems perform better when not all physical memory is allowed to be taken by the UBC. For improved realtime responsiveness, change the value of ubc_maxpercent in /sys/conf/param.c to between 50 and 80, depending on the amount of file system activity done on the system. This can improve system realtime latency, because when the UBC has consumed its maximum allocation of memory for buffering file data, the least recently used buffers must be flushed to disk if they are modified. Flushing these buffers is done with a simple lock held, and therefore can affect process dispatch latency. The more memory that the UBC is allowed to use before flushing, the longer it will take to perform the flushing. Lowering the value of the ubc_maxpercent parameter will cause the flushing to occur more frequently, but take less time.
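To get a feel for what a given ubc_maxpercent setting means in absolute terms, the limit can be worked out with a small C helper like the one below. The function name and the memory sizes used with it are assumptions for illustration only; the helper does not read or modify any kernel parameter.

```c
#include <assert.h>

/*
 * Largest amount of physical memory, in bytes, that the UBC may use
 * for a given ubc_maxpercent setting (0 to 100).
 */
unsigned long long ubc_limit_bytes(unsigned long long phys_bytes,
                                   unsigned int ubc_maxpercent)
{
    assert(ubc_maxpercent <= 100);
    return phys_bytes * ubc_maxpercent / 100;
}
```

On a hypothetical system with 256 MB of physical memory, a setting of 50 caps the UBC at 128 MB instead of the full 256 MB, so each flush has at most half as much modified data to write while the simple lock is held.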
When writing device drivers, follow these guidelines:
Avoid holding locks for long periods
Holding a lock prevents context switches from occurring.
Avoid funneling
Funneled device drivers take a lock upon entry.
Interrupt service routines should be brief
Consider use of a kernel thread to do ISR postprocessing. While an ISR is executing, other interrupts of equal or lower IPL are delayed, and no process can run until all ISR activity is completed. Consider use of the rt_post_callout function for ISR postprocessing that needs to execute before any process code, but after any ISRs. See Writing Device Drivers: Tutorial for information about the rt_post_callout function.
Use with care any devices that could interfere with realtime responsiveness, such as:
The network driver
Do not configure the network driver into your system if it is not a necessary part of your realtime application. If it is necessary, then be sure that it is used only in postprocessing, and not during critical phases of your application when you are attempting to minimize latency.
The disk driver
Be sure that postprocessing data is written to permanent storage during non-critical sections of your application, and that all data is properly flushed and synchronized to disk at appropriate times. See Chapter 8 for more information about synchronized I/O.
In general, keep all peripheral devices that can cause spurious interrupts out of the configuration of the most critical systems. Other devices can possibly cause interrupt latency as well as bus contention with the critical devices. If other devices are a necessary part of the system, analyze the interrupt rate and attempt to avoid interrupt overload on the system.
Consider a symmetrical multiprocessing (SMP) system as a possible means of improving realtime responsiveness, by dividing the application across multiple processors using the runon command.