To support the development of multithreaded applications, the Tru64 UNIX operating system provides DECthreads, the Compaq Multithreading Run-Time Library. The DECthreads interface implements IEEE Standard 1003.1c-1995 threads (also referred to as POSIX 1003.1c threads), with several extensions.
In addition to an actual threading interface, the operating system also provides Thread-Independent Services (TIS). The TIS routines are an aid to creating efficient thread-safe libraries that do not create their own threads. (See Section 12.4.1 for information about TIS routines.)
This chapter addresses the following topics:
Overview of multithread support in Tru64 UNIX (Section 12.1)
Run-time library changes for POSIX conformance (Section 12.2)
Characteristics of thread-safe and thread-reentrant routines (Section 12.3)
How to write thread-safe code (Section 12.4)
How to build multithreaded applications (Section 12.5)
A thread is a single, sequential flow of control within a program. Multiple threads execute concurrently and share most resources of the owning process, including the address space. By default, a process initially has one thread.
The purposes for which multiple threads are useful include:
Improving the performance of applications running on multiprocessor systems
Implementing certain programming models (for example, the client/server model)
Encapsulating and isolating the handling of slow devices
You can also use multiple threads as an alternative approach to managing certain events. For example, you can use one thread per file descriptor in a process that otherwise might use the select() or poll() system calls to efficiently manage concurrent I/O operations on multiple file descriptors.
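As a rough illustration of the one-thread-per-descriptor approach, the following sketch uses only the standard POSIX calls pthread_create() and pthread_join(). The struct fd_job type, the drain_fd() and drain_all() functions, and the MAX_JOBS limit are hypothetical names chosen for this example; they are not part of DECthreads or the C run-time library.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_JOBS 16   /* arbitrary limit that keeps the sketch simple */

/* Hypothetical work item: one descriptor and the bytes read from it. */
struct fd_job {
    int    fd;
    size_t nbytes;
};

/* Thread start routine: block in read() on one descriptor until EOF. */
static void *drain_fd(void *arg)
{
    struct fd_job *job = arg;
    char buf[512];
    ssize_t n;

    while ((n = read(job->fd, buf, sizeof buf)) > 0)
        job->nbytes += (size_t)n;
    return NULL;
}

/*
 * Start one thread per descriptor and wait for all of them.
 * Returns 0 on success, -1 if njobs is too large, or the error
 * number returned by pthread_create().
 */
int drain_all(struct fd_job *jobs, int njobs)
{
    pthread_t tid[MAX_JOBS];
    int i, started = 0, rc = 0;

    if (njobs > MAX_JOBS)
        return -1;
    for (i = 0; i < njobs; i++) {
        jobs[i].nbytes = 0;
        rc = pthread_create(&tid[i], NULL, drain_fd, &jobs[i]);
        if (rc != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return rc;
}
```

Each thread simply blocks in read() on its own descriptor, so no select() or poll() loop is needed to multiplex the descriptors. Each thread writes only to its own fd_job entry, so the sketch needs no locking.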
The components of the multithreaded development environment for the Tru64 UNIX system include the following:
Compiler support -- Compile using the -pthread option on the cc or c89 command.
Threads package -- The libpthread.so library provides interfaces for threads control.
Thread-safe support libraries -- These libraries include libaio, libcfg, liblmf, libm, libmsfs, libpruplist, libpthread, librt, and libsys5.
The ladebug debugger
The prof and gprof profilers -- Compile with the -p and -pthread options for prof and with -pg and -pthread for gprof to use the libprof1_r.a profiling library.
The atom utility (pixie, third, and hiprof tools)
For information on profiling multithreaded applications, see Section 8.14.
To analyze a multithreaded application for potential logic and performance problems, you can use Visual Threads, which is available on the Associated Products CD. Visual Threads can be used on DECthreads applications that use POSIX threads (Pthreads) and on Java applications.
For releases of the DEC OSF/1 system (that is, for releases prior to
DIGITAL UNIX Version 4.0), a large number of separate reentrant routines (*_r
routines) were provided to solve the problem of static data in
the C run-time library (the first two problems listed in
Section 12.3.1).
For releases of the Tru64 UNIX system, the problem of static data in
the nonreentrant versions of the routines is fixed by replacing the static
data with thread-specific data.
Except for a few routines specified by POSIX
1003.1c, the alternate routines are not needed on Tru64 UNIX systems
and are retained only for binary compatibility.
The following functions are the only alternate thread-safe routines that are specified by POSIX 1003.1c and need to be used when writing thread-safe code:
asctime_r*
ctime_r*
getgrgid_r*
getgrnam_r*
getpwnam_r*
getpwuid_r*
gmtime_r*
localtime_r*
rand_r*
readdir_r*
strtok_r
Starting with DIGITAL UNIX Version 4.0, the interfaces flagged with
an asterisk (*) in the preceding list have new definitions that conform to
POSIX 1003.1c.
The old versions of these routines can be obtained by defining the preprocessor symbol _POSIX_C_SOURCE with the value 199309L, which denotes POSIX 1003.1b conformance. However, doing this disables POSIX 1003.1c threads.
The new versions of the routines are the default when compiling code under DIGITAL UNIX Version 4.0 or higher, but you must be certain to include the header files specified on the reference pages for the various routines.
For more information on programming with threads, see the Guide to DECthreads and the cc(1), monitor(3), prof(1), and gprof(1) reference pages.
Routines within a library can be thread safe or not. A thread-safe routine is one that can be called concurrently from multiple threads without undesirable interactions between threads. A routine can be thread safe for either of the following reasons:
It is inherently reentrant.
It uses thread-specific data or mutex locks. (A mutex is a synchronization object that is used to allow multiple threads to serialize their access to shared data.)
Reentrant routines do not share any state across concurrent invocations from multiple threads. A reentrant routine is the ideal thread-safe routine, but not all routines can be made reentrant.
Prior to DIGITAL UNIX Version 4.0, many of the C run-time library (libc) routines were not thread safe, and alternate versions of these
routines were provided in
libc_r.
Starting with DIGITAL
UNIX Version 4.0, all of the alternate versions formerly found in
libc_r
were merged into
libc.
If a thread-safe
routine and its corresponding nonthread-safe routine had the same name, the
nonthread-safe version was replaced.
The thread-safe versions are modified
to use TIS routines (see
Section 12.4.1); this enables them
to work in both single-threaded and multithreaded environments -- without
extensive overhead in the single-threaded case.
Some common practices that can prevent code from being thread safe can
be found by examining why some of the
libc
functions were
not thread safe prior to DIGITAL UNIX Version 4.0:
Returning a pointer to a single, statically allocated buffer
The
ctime(3)
interface provides an example of this problem:
char *ctime(const time_t *timer);
This function takes no output arguments and returns a pointer to a statically allocated buffer containing a string that is the ASCII representation of the time specified in the single parameter to the function. Because a single, statically allocated buffer is used for this purpose, any other thread that calls this function will overwrite the string returned to the previously calling thread.
To make the
ctime()
function thread safe, the POSIX
1003.1c standard has defined an alternate version,
ctime_r(),
which accepts an additional output argument.
The argument is a user-supplied
buffer that is allocated by the caller.
The
ctime_r()
function
writes the ASCII time string into the buffer:
char *ctime_r(const time_t *timer, char *buf);
Users of this
function must ensure that the storage occupied by the
buf
argument is not used by another thread.
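The following sketch shows one way a caller might use ctime_r() with its own buffer. The format_time() wrapper and the TIME_STR_LEN constant are hypothetical names introduced for this example. The 26-byte minimum reflects the buffer size that POSIX specifies for ctime_r().

```c
#include <stddef.h>
#include <time.h>

#define TIME_STR_LEN 26   /* minimum buffer size POSIX specifies for ctime_r() */

/*
 * Write the ASCII form of "when" into a buffer owned by the caller.
 * Returns 0 on success, or -1 if the buffer is too small or the
 * conversion fails.
 */
int format_time(time_t when, char *buf, size_t buflen)
{
    if (buflen < TIME_STR_LEN)
        return -1;
    return ctime_r(&when, buf) != NULL ? 0 : -1;
}
```

A thread that declares the buffer as a local (automatic) variable gets private storage for the result, so a concurrent call in another thread cannot overwrite it.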
Maintaining internal state
The
rand()
function provides an example of this problem:
void srand(unsigned int seed);
int rand(void);
The rand() function is a simple pseudorandom number generator. For any given starting seed value that is set with the srand() function, it generates an identical sequence of pseudorandom numbers. To do this, it maintains a state value that is updated on each call. If another thread calls rand() at the same time, it advances the same state, so the sequence of numbers returned within any one thread for a given starting seed is nondeterministic. This may be undesirable.
To avoid this problem, a second interface,
rand_r(),
is specified in POSIX 1003.1c.
This function accepts an additional argument
that is a pointer to a user-supplied integer used by
rand_r()
to hold the state of the random number generator:
int rand_r(unsigned int *seed);
The users of this function
must ensure that the
seed
argument is not used by another
thread.
Using thread-specific data is one way of doing this (see
Section 12.4.2).
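As an illustration, the following sketch gives each thread its own generator state, so that two threads started with the same seed produce the same sequence no matter how they are scheduled. The struct rng_job type, the generate() and run_two_generators() functions, and the NVALUES constant are hypothetical names used only in this example. Here the state is kept in a structure owned by each thread rather than in thread-specific data.

```c
#include <pthread.h>
#include <stdlib.h>

#define NVALUES 8

/* Hypothetical per-thread job: private generator state plus results. */
struct rng_job {
    unsigned int seed;             /* state used only by one thread */
    int          values[NVALUES];
};

/* Thread start routine: draw NVALUES numbers from the private state. */
static void *generate(void *arg)
{
    struct rng_job *job = arg;
    int i;

    for (i = 0; i < NVALUES; i++)
        job->values[i] = rand_r(&job->seed);
    return NULL;
}

/* Run two generators concurrently. Returns 0 on success. */
int run_two_generators(struct rng_job *a, struct rng_job *b)
{
    pthread_t ta, tb;
    int rc;

    rc = pthread_create(&ta, NULL, generate, a);
    if (rc != 0)
        return rc;
    rc = pthread_create(&tb, NULL, generate, b);
    if (rc != 0) {
        pthread_join(ta, NULL);
        return rc;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}
```

Because neither thread touches the other's seed member, the interleaving of the two threads does not affect either sequence.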
Operating on read/write data items shared between threads
The problem of sharing read/write data can be solved by using mutexes. In this case, the routine is not considered reentrant, but it is still thread safe. Like thread-specific data, mutex locking is transparent to the user of the routine.
Mutexes are used in several
libc
routines, most notably
the
stdio
routines, for example,
printf().
Mutex locking in the stdio routines is done by stream to prevent concurrent operations on a stream from colliding, as in the case of two threads trying to fill a stream buffer at the same time.
Mutex locking
is also done on certain internal data tables in the C run-time library during
operations such as
fopen()
and
fclose().
Because the alternate versions of these routines do not require an application
program interface (API) change, they have the same name as the original versions.
See Section 12.4.3 for an example of how to use mutexes.
When writing code that can be used by both single-threaded and multithreaded applications, it is necessary to code in a thread-safe manner. The following coding practices must be observed:
Static read/write data should be either eliminated, converted
to thread-specific data, or protected by mutexes.
In the C language, to reduce
the potential for misuse of the data, it is good practice to declare static
read-only data with the
const
type modifier.
Global read/write data should be eliminated or protected by mutex locks.
Per-process system resources, such as file descriptors, should be used with care because they are accessible by all threads.
References to the global "errno" cell should be
replaced with calls to
geterrno()
and
seterrno().
This replacement is not necessary if the source file includes
errno.h
and one of the following conditions is true:
The file is compiled with the
-pthread
option (cc
or
c89
command).
The
pthread.h
file is included at the top
of the source file.
The
_REENTRANT
preprocessor symbol is explicitly
set before the include of
errno.h.
Dependencies on any other nonthread-safe libraries or object files must be avoided.
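The first of these practices can be sketched as follows, using the standard POSIX mutex calls. The names id_step, last_id, id_mutex, next_id(), take_ids(), and take_ids_concurrently() are hypothetical and exist only for this example. A library that must also run efficiently in single-threaded programs might use the TIS mutex routines instead.

```c
#include <pthread.h>

#define MAX_THREADS 8   /* arbitrary limit for the demonstration */

/* Read-only static data: declared const, so no locking is needed. */
static const unsigned long id_step = 1;

/* Read/write static data: every access is made while holding id_mutex. */
static unsigned long last_id = 0;
static pthread_mutex_t id_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Return a new identifier; may be called from any thread. */
unsigned long next_id(void)
{
    unsigned long id;

    pthread_mutex_lock(&id_mutex);
    last_id += id_step;
    id = last_id;
    pthread_mutex_unlock(&id_mutex);
    return id;
}

/* Thread start routine: take the requested number of identifiers. */
static void *take_ids(void *arg)
{
    int i, count = *(const int *)arg;

    for (i = 0; i < count; i++)
        (void)next_id();
    return NULL;
}

/*
 * Demonstration only: run nthreads threads that each take per_thread
 * identifiers. Returns 0 on success.
 */
int take_ids_concurrently(int nthreads, int per_thread)
{
    pthread_t tid[MAX_THREADS];
    int i, started = 0, rc = 0;

    if (nthreads > MAX_THREADS)
        return -1;
    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tid[i], NULL, take_ids, &per_thread);
        if (rc != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return rc;
}
```

Without the mutex, two threads could read the same value of last_id and return duplicate identifiers.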
TIS is a package of routines provided by the C run-time library that can be used to write efficient code for both single-threaded and multithreaded applications. TIS routines can be used for handling mutexes, handling thread-specific data, and a variety of other purposes.
When used by a single-threaded application, these routines use simplified semantics to perform thread-safe operations for the single-threaded case. When DECthreads is present, the bodies of the routines are replaced with more complicated algorithms to optimize their behavior for the multithreaded case.
TIS is used within
libc
itself to allow a single
version of the C run-time library to service both single-threaded and multithreaded
applications.
See the
Guide to DECthreads
and
tis(3)
for information on
how to use this facility.
Example 12-1 shows how to use thread-specific data in a function that can be used by both single-threaded and multithreaded applications. For clarity, most error checking has been left out of the example.
#include <limits.h>     /* PATH_MAX */
#include <stdlib.h>
#include <string.h>
#include <tis.h>

static pthread_key_t key;

/* Called once when the library is loaded: create the data key. */
void __init_dirname(void)
{
    tis_key_create(&key, free);
}

/* Called once when the library is unloaded: delete the data key. */
void __fini_dirname(void)
{
    tis_key_delete(key);
}

char *dirname(char *path)
{
    char *dir, *lastslash;

    /*
     * Assume key was set and get thread-specific variable.
     */
    dir = tis_getspecific(key);
    if(!dir) {              /* First time this thread got here. */
        dir = malloc(PATH_MAX);
        tis_setspecific(key, dir);
    }

    /*
     * Copy dirname component of path into buffer and return.
     */
    lastslash = strrchr(path, '/');
    if(lastslash) {
        memcpy(dir, path, lastslash-path);
        dir[lastslash-path] = '\0';   /* terminate after the copied bytes */
    } else
        strcpy(dir, path);

    return dir;
}
The following TIS routines are used in the preceding example:
tis_key_create -- Generates a unique data key.
tis_key_delete -- Deletes a data key.
tis_getspecific -- Obtains the data associated with the specified key.
tis_setspecific -- Sets the data value associated with the specified key.
The __init_ and __fini_ routines are used in the example to initialize and destroy the thread-specific data key. This operation is done only once, and these routines provide a convenient way to ensure that this is the case, even if the library is loaded with dlopen(). See ld(1) for an explanation of how to use the __init_ and __fini_ routines.
Thread-specific data keys are provided by DECthreads at run time and are a limited resource. If your library must use a large number of data keys, code the library to create just one data key and store all of the separate data items as a structure or an array of pointers pointed to by that key.
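The single-key approach can be sketched as follows. This example uses the standard POSIX calls pthread_key_create(), pthread_getspecific(), pthread_setspecific(), and pthread_once() rather than the TIS routines. The struct lib_tsd type, its members, and the lib_get_tsd() and touch_tsd() functions are hypothetical names chosen for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical bundle of every per-thread item the library needs. */
struct lib_tsd {
    char         name_buf[64];
    int          last_error;
    unsigned int rand_state;
};

static pthread_key_t  lib_key;              /* the library's only key */
static pthread_once_t lib_once = PTHREAD_ONCE_INIT;
static int            lib_key_rc;           /* result of key creation */

/* Runs exactly once per process; free() releases each thread's bundle. */
static void lib_key_init(void)
{
    lib_key_rc = pthread_key_create(&lib_key, free);
}

/* Return the calling thread's bundle, allocating it on first use. */
struct lib_tsd *lib_get_tsd(void)
{
    struct lib_tsd *tsd;

    pthread_once(&lib_once, lib_key_init);
    if (lib_key_rc != 0)
        return NULL;

    tsd = pthread_getspecific(lib_key);
    if (tsd == NULL) {
        tsd = calloc(1, sizeof *tsd);
        if (tsd != NULL && pthread_setspecific(lib_key, tsd) != 0) {
            free(tsd);
            tsd = NULL;
        }
    }
    return tsd;
}

/* Demonstration thread routine: change only this thread's copy. */
void *touch_tsd(void *arg)
{
    struct lib_tsd *tsd = lib_get_tsd();

    (void)arg;
    if (tsd != NULL)
        tsd->last_error = 99;
    return NULL;
}
```

All of the library's per-thread items live in one structure, so the library consumes one key regardless of how many items it adds later.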
In some cases, using thread-specific data is not the correct way to
convert static data into thread-safe code.
For example, thread-specific data
should not be used when a data object is meant to be shareable between threads
(as in
stdio
streams within
libc).
Manipulating
per-process resources is another case in which thread-specific data is inadequate.
The following example shows how to manipulate per-process resources in a thread-safe
fashion:
#include <pthread.h>
#include <string.h>     /* strlen(), strncmp() */
#include <tis.h>

/*
 * NOTE: The putenv() function would have to set and clear the
 * same mutex lock before it accessed the environment.
 */

extern char **environ;
static pthread_mutex_t environ_mutex = PTHREAD_MUTEX_INITIALIZER;

char *getenv(const char *name)
{
    char **s, *value;
    size_t len;

    tis_mutex_lock(&environ_mutex);
    len = strlen(name);
    for(s=environ; (value=*s) != NULL; s++)
        if(strncmp(name, value, len) == 0 &&
           value[len] == '=') {
            tis_mutex_unlock(&environ_mutex);
            return &(value[len+1]);
        }
    tis_mutex_unlock(&environ_mutex);
    return NULL;
}
In the preceding example, note how the lock is set once (tis_mutex_lock) before accessing the environment and is released exactly once (tis_mutex_unlock) on each path that returns from the function.
In the multithreaded case, any other thread attempting to access the environment
while the first thread holds the lock is blocked until the first thread performs
the unlock operation.
In the single-threaded case, no contention occurs unless
an error exists in the coding of the locking and unlocking sequences.
If it is necessary for the lock state to remain valid across a
fork()
system call in multithreaded applications, it may be useful
to create and register
pthread_atfork()
handler functions
to set the lock prior to any
fork()
call, and to unlock
it in both the child and parent after the
fork()
call.
This guarantees that a
fork()
operation is not done by
one thread while another thread holds the lock.
If the lock was held by another
thread, it would end up permanently locked in the child because the
fork()
operation produces a child with only one thread.
In the case of an independent library, the call to pthread_atfork() can be done in an __init_ routine in the library.
Unlike most
pthread
routines, the
pthread_atfork
routine is available in
libc
and may be used
by both single-threaded and multithreaded applications.
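The handler arrangement described above might look like the following sketch. The table_mutex lock and the lock_before_fork(), unlock_after_fork(), install_fork_handlers(), and child_can_lock_after_fork() functions are hypothetical names used for illustration. Only pthread_atfork(), the mutex calls, fork(), and waitpid() are standard interfaces.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library lock that protects a per-process resource. */
static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Prepare handler: runs in the parent before fork(). Taking the lock
 * here means no other thread can hold it at the moment of the fork. */
static void lock_before_fork(void)
{
    pthread_mutex_lock(&table_mutex);
}

/* Parent and child handler: runs after fork() in each process and
 * releases the lock taken by the prepare handler. */
static void unlock_after_fork(void)
{
    pthread_mutex_unlock(&table_mutex);
}

/* Register the handlers once, for example from a library
 * initialization routine. Returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(lock_before_fork,
                          unlock_after_fork,
                          unlock_after_fork);
}

/* Demonstration only: fork and report whether the child could acquire
 * the lock. Returns 1 if it could, 0 if it could not, -1 on error. */
int child_can_lock_after_fork(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)       /* child: exit status 0 means the lock was free */
        _exit(pthread_mutex_trylock(&table_mutex) == 0 ? 0 : 1);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

The same function serves as both the parent handler and the child handler because each process only needs to release the lock that the prepare handler acquired.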
The compilation and linking of multithreaded applications differs from that of single-threaded applications in a few minor but important ways.
Depending on whether an application is single threaded or multithreaded,
many system header files provide different sets of definitions when they are
included in the compilation of an application.
Whether these header files provide single-threaded or thread-safe definitions is determined by whether the _REENTRANT preprocessor symbol is defined.
When you specify the
-pthread
option on the
cc
or
c89
command, the
_REENTRANT
symbol is automatically
defined; it is also defined if the
pthread.h
header file
is included.
This header file must be the first file included in any application
that uses the pthread library,
libpthread.so.
The
-pthread
option has no other effect on
the compilation of C programs.
The reentrancy of the code generated by the
C compiler is determined only by the proper use of reentrant coding practices
by the programmer and by the use of thread-safe support routines or functions --
not by the use of any special options.
To link a multithreaded C application, use the
cc
or
c89
command with the
-pthread
option.
When linking, the
-pthread
option has the
effect of modifying the library search path in the following ways:
The pthread library is included into the link.
The exceptions library is included into the link.
For each library mentioned in a
-l
option, an attempt is made to locate and presearch a library of corresponding
thread-safe routines whose name includes the suffix
_r.
The
-pthread
option does not modify the behavior
of the linker in any other way.
The reentrancy of the linked code is determined by use of proper programming practices in the original code, and by compiling and linking with the proper header files and libraries, respectively.
Not all compilers necessarily generate reentrant code; the definition of the language itself can make this difficult. It is also necessary for any run-time libraries linked with the application to be thread safe. For details on such matters, consult the manual for the compiler you are using and the documentation for the run-time libraries.