12 Developing Thread-Safe Libraries

To support the development of multithreaded applications, the Tru64 UNIX operating system provides the POSIX Threads Library (the Compaq Multithreading Run-Time Library). The POSIX Threads Library interface implements IEEE Standard 1003.1c-1995 threads (also referred to as POSIX 1003.1c threads), with several extensions.

In addition to an actual threading interface, the operating system also provides Thread-Independent Services (TIS). The TIS routines are an aid to creating efficient thread-safe libraries that do not create their own threads. (See Section 12.4.1 for information about TIS routines.)

This chapter addresses the following topics:

Overview of multithread support in Tru64 UNIX (Section 12.1)

Run-time library changes for POSIX conformance (Section 12.2)

Characteristics of thread-safe and reentrant routines (Section 12.3)

How to write thread-safe code (Section 12.4)

How to build multithreaded applications (Section 12.5)

12.1 Overview of Thread Support

A thread is a single, sequential flow of control within a program. Multiple threads execute concurrently and share most resources of the owning process, including the address space. By default, a process initially has one thread.

The purposes for which multiple threads are useful include:

Improving the performance of applications running on multiprocessor systems

Implementing certain programming models (for example, the client/server model)

Encapsulating and isolating the handling of slow devices

You can also use multiple threads as an alternative approach to managing certain events. For example, you can use one thread per file descriptor in a process that otherwise might use the select() or poll() system calls to efficiently manage concurrent I/O operations on multiple file descriptors.
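The one-thread-per-descriptor approach can be sketched roughly as follows. This is a minimal illustration that uses only standard POSIX calls (pthread_create(), pthread_join(), and read()); the names fd_job, drain_fd, drain_all, and MAX_FDS are hypothetical and are not part of any system library.

```c
#include <pthread.h>
#include <unistd.h>

#define MAX_FDS 8   /* arbitrary limit chosen for this sketch */

/* Work item for one descriptor: the thread reads until end-of-file
 * and records how many bytes it saw. */
struct fd_job {
    int  fd;
    long nbytes;
};

static void *drain_fd(void *arg)
{
    struct fd_job *job = arg;
    char buf[256];
    ssize_t n;

    /* A blocking read() stalls only this thread, not the whole process. */
    while ((n = read(job->fd, buf, sizeof buf)) > 0)
        job->nbytes += n;
    return NULL;
}

/* Hypothetical helper: start one thread per descriptor, wait for all of
 * them, and return the total number of bytes read (-1 on error). */
long drain_all(const int fds[], int nfds)
{
    pthread_t tid[MAX_FDS];
    struct fd_job job[MAX_FDS];
    long total = 0;
    int i, started = 0;

    if (nfds < 0 || nfds > MAX_FDS)
        return -1;
    for (i = 0; i < nfds; i++) {
        job[i].fd = fds[i];
        job[i].nbytes = 0;
        if (pthread_create(&tid[i], NULL, drain_fd, &job[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += job[i].nbytes;
    }
    return started == nfds ? total : -1;
}
```

Each thread uses ordinary blocking I/O, so no select() or poll() bookkeeping is needed. The cost is one thread (and one stack) per descriptor.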

The components of the multithreaded development environment for the Tru64 UNIX system include the following:

Compiler support -- Compile using the -pthread option on the cc or c89 command.

Threads package -- The libpthread.so library provides interfaces for threads control.

Thread-safe support libraries -- These libraries include libaio, libcfg, liblmf, libm, libmsfs, libpruplist, libpthread, librt, and libsys5.

The ladebug debugger

The prof and gprof profilers -- Compile with the -p and -pthread options for prof and with -pg and -pthread for gprof to use the libprof1_r.a profiling library.

The atom utility (pixie, third, and hiprof tools)

For information on profiling multithreaded applications, see Section 8.10.

To analyze a multithreaded application for potential logic and performance problems, you can use Visual Threads, which is available on the Associated Products Volume 1 CD-ROM. Visual Threads can be used on POSIX Threads Library applications and on Java applications.

12.2 Run-Time Library Changes for POSIX Conformance

For releases of the DEC OSF/1 system (that is, for releases prior to DIGITAL UNIX Version 4.0), a large number of separate reentrant routines (*_r routines) were provided to solve the problem of static data in the C run-time library (the first two problems listed in Section 12.3.1). For releases of the Tru64 UNIX system, the problem of static data in the nonreentrant versions of the routines is fixed by replacing the static data with thread-specific data. Except for a few routines specified by POSIX 1003.1c, the alternate routines are not needed on Tru64 UNIX systems and are retained only for binary compatibility.

The following functions are the only alternate thread-safe routines that are specified by POSIX 1003.1c and need to be used when writing thread-safe code:

`asctime_r`*	`ctime_r`*	`getgrgid_r`*
`getgrnam_r`*	`getpwnam_r`*	`getpwuid_r`*
`gmtime_r`*	`localtime_r`*	`rand_r`*
`readdir_r`*	`strtok_r`

Starting with DIGITAL UNIX Version 4.0, the interfaces flagged with an asterisk (*) in the preceding list have new definitions that conform to POSIX 1003.1c. The old versions of these routines can be obtained by defining the preprocessor symbol _POSIX_C_SOURCE with the value 199309L (which denotes POSIX 1003.1b conformance -- however, doing this will disable POSIX 1003.1c threads). The new versions of the routines are the default when compiling code under DIGITAL UNIX Version 4.0 or higher, but you must be certain to include the header files specified on the reference pages for the various routines.

For more information on programming with threads, see the Guide to the POSIX Threads Library and cc(1), monitor(3), prof(1), and gprof(1).

12.3 Characteristics of Thread-Safe and Reentrant Routines

Routines within a library can be thread-safe or not. A thread-safe routine is one that can be called concurrently from multiple threads without undesirable interactions between threads. A routine can be thread-safe for either of the following reasons:

It is inherently reentrant.

It uses thread-specific data or mutex locks. (A mutex is a synchronization object that is used to allow multiple threads to serialize their access to shared data.)

Reentrant routines do not share any state across concurrent invocations from multiple threads. A reentrant routine is the ideal thread-safe routine, but not all routines can be made reentrant.

Prior to DIGITAL UNIX Version 4.0, many of the C run-time library (libc) routines were not thread-safe, and alternate versions of these routines were provided in libc_r. Starting with DIGITAL UNIX Version 4.0, all of the alternate versions formerly found in libc_r were merged into libc. If a thread-safe routine and its corresponding nonthread-safe routine had the same name, the nonthread-safe version was replaced. The thread-safe versions are modified to use TIS routines (see Section 12.4.1); this enables them to work in both single-threaded and multithreaded environments -- without extensive overhead in the single-threaded case.

12.3.1 Examples of Nonthread-Safe Coding Practices

Some common practices that can prevent code from being thread-safe can be found by examining why some of the libc functions were not thread-safe prior to DIGITAL UNIX Version 4.0:

Returning a pointer to a single, statically allocated buffer
The ctime(3) interface provides an example of this problem:
```
	char *ctime(const time_t *timer);
```
This function takes no output arguments and returns a pointer to a statically allocated buffer containing a string that is the ASCII representation of the time specified in the single parameter to the function. Because a single, statically allocated buffer is used for this purpose, any other thread that calls this function will overwrite the string returned to the previously calling thread.
To make the ctime() function thread-safe, the POSIX 1003.1c standard has defined an alternate version, ctime_r(), which accepts an additional output argument. The argument is a user-supplied buffer that is allocated by the caller. The ctime_r() function writes the ASCII time string into the buffer:
```
	char *ctime_r(const time_t *timer, char *buf);
```
Users of this function must ensure that the storage occupied by the buf argument is not used by another thread.
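As an illustration, a caller can keep the buffer in automatic storage so that it is private to each call. The following sketch assumes a POSIX system that still provides ctime_r(), which requires a buffer of at least 26 bytes; the helper name format_time and the macro CTIME_BUFLEN are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX declaration of ctime_r() */
#include <string.h>
#include <time.h>

#define CTIME_BUFLEN 26   /* minimum buffer size that ctime_r() requires */

/* Hypothetical helper: write the time string for "when", without its
 * trailing newline, into the caller's buffer. Returns 0 on success and
 * -1 on failure. */
int format_time(time_t when, char *out, size_t outlen)
{
    char buf[CTIME_BUFLEN];   /* automatic storage: private to this call */

    if (ctime_r(&when, buf) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */
    if (strlen(buf) >= outlen)
        return -1;
    strcpy(out, buf);
    return 0;
}
```

Because no static buffer is involved, two threads can call format_time() at the same time, provided that each passes its own output buffer.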

Maintaining internal state
The rand() function provides an example of this problem:
```
	void srand(unsigned int seed);
	int rand(void);
```
This function is a simple pseudorandom number generator. For any given starting seed value that is set with the srand() function, it generates an identical sequence of pseudorandom numbers. To do this, it maintains a state value that is updated on each call. If another thread is calling this function, the sequence of numbers returned within any one thread for a given starting seed is nondeterministic. This may be undesirable.
To avoid this problem, a second interface, rand_r(), is specified in POSIX 1003.1c. This function accepts an additional argument that is a pointer to a user-supplied integer used by rand_r() to hold the state of the random number generator:
```
	int rand_r(unsigned int *seed);
```
The users of this function must ensure that the seed argument is not used by another thread. Using thread-specific data is one way of doing this (see Section 12.4.1.2).
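The following sketch shows one simple way to keep the seed private: each thread owns a structure that holds its own seed. It assumes a POSIX system that provides rand_r(); the names rng_job, draw_values, and sequences_match are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX declaration of rand_r() */
#include <pthread.h>
#include <stdlib.h>

#define NVALUES 4

struct rng_job {
    unsigned int seed;            /* generator state, owned by one thread */
    int values[NVALUES];
};

static void *draw_values(void *arg)
{
    struct rng_job *job = arg;
    int i;

    for (i = 0; i < NVALUES; i++)
        job->values[i] = rand_r(&job->seed);   /* updates only this seed */
    return NULL;
}

/* Hypothetical helper: run two threads from the same starting seed.
 * Returns 1 if both drew the same sequence, 0 if not, and -1 if a
 * thread could not be created. */
int sequences_match(unsigned int seed)
{
    struct rng_job a, b;
    pthread_t ta, tb;
    int i;

    a.seed = b.seed = seed;
    if (pthread_create(&ta, NULL, draw_values, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, draw_values, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    for (i = 0; i < NVALUES; i++)
        if (a.values[i] != b.values[i])
            return 0;
    return 1;
}
```

Because neither thread touches the other's seed, each sequence depends only on its starting seed and not on how the threads are scheduled.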

Operating on read/write data items shared between threads
The problem of sharing read/write data can be solved by using mutexes. In this case, the routine is not considered reentrant, but it is still thread-safe. Like thread-specific data, mutex locking is transparent to the user of the routine.
Mutexes are used in several libc routines, most notably the stdio routines such as printf(). Mutex locking in the stdio routines is done by stream to prevent concurrent operations on a stream from colliding, as in the case of two threads trying to fill a stream buffer at the same time. Mutex locking is also done on certain internal data tables in the C run-time library during operations such as fopen() and fclose(). Because the alternate versions of these routines do not require an application program interface (API) change, they have the same name as the original versions.
See Section 12.4.3 for an example of how to use mutexes.

12.4 Writing Thread-Safe Code

When writing code that can be used by both single-threaded and multithreaded applications, it is necessary to code in a thread-safe manner. The following coding practices must be observed:

Static read/write data should be either eliminated, converted to thread-specific data, or protected by mutexes. In the C language, to reduce the potential for misuse of the data, it is good practice to declare static read-only data with the const type modifier.

Global read/write data should be eliminated or protected by mutex locks.

Per-process system resources, such as file descriptors, should be used with care because they are accessible by all threads.

References to the global errno cell should be replaced with calls to geterrno() and seterrno(). This replacement is not necessary if the source file includes errno.h and one of the following conditions is true:
- The file is compiled with the -pthread option (cc or c89 command).
- The pthread.h file is included at the top of the source file.
- The _REENTRANT preprocessor symbol is explicitly set before the include of errno.h.

Dependencies on any other nonthread-safe libraries or object files must be avoided.
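As a small illustration of the errno guideline, the following sketch checks that errno, as provided by errno.h in a threaded compilation, refers to a different object in each thread. It assumes a POSIX threads implementation and a build that uses the -pthread option; the names errno_probe, probe_errno, and errno_is_per_thread are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

struct errno_probe {
    int *creator_errno;   /* address of errno in the creating thread */
    int  distinct;        /* set to 1 if the new thread has its own errno */
};

static void *probe_errno(void *arg)
{
    struct errno_probe *p = arg;

    errno = ERANGE;   /* modifies only this thread's errno */
    p->distinct = (&errno != p->creator_errno);
    return NULL;
}

/* Hypothetical check: returns 1 if each thread has a private errno,
 * 0 if errno is shared, and -1 if the thread could not be created. */
int errno_is_per_thread(void)
{
    struct errno_probe p;
    pthread_t t;

    p.creator_errno = &errno;
    p.distinct = 0;
    if (pthread_create(&t, NULL, probe_errno, &p) != 0)
        return -1;
    pthread_join(t, NULL);
    return p.distinct;
}
```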

12.4.1 Using TIS for Thread-Specific Data

The following sections explain how to use Thread-Independent Services (TIS) for thread-specific data.

12.4.1.1 Overview of TIS

Thread-Independent Services (TIS) is a package of routines provided by the C run-time library that can be used to write efficient code for both single-threaded and multithreaded applications. TIS routines can be used for handling mutexes, handling thread-specific data, and a variety of other purposes.

When used by a single-threaded application, these routines use simplified semantics to perform thread-safe operations for the single-threaded case. When POSIX Threads Library is present, the bodies of the routines are replaced with more complicated algorithms to optimize their behavior for the multithreaded case.

TIS is used within libc itself to allow a single version of the C run-time library to service both single-threaded and multithreaded applications. See the Guide to the POSIX Threads Library and tis(3) for information on how to use this facility.

12.4.1.2 Using Thread-Specific Data

Example 12-1 shows how to use thread-specific data in a function that can be used by both single-threaded and multithreaded applications. For clarity, most error checking has been left out of the example.

Example 12-1: Threads Programming Example

#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <tis.h>
 
static pthread_key_t key;
 
void __init_dirname()
{
	tis_key_create(&key, free);
}
 
void __fini_dirname()
{
	tis_key_delete(key);
}
 
char *dirname(char *path)
{
	char *dir, *lastslash;
/*
 * Assume key was set and get thread-specific variable.
 */
	dir = tis_getspecific(key);
	if(!dir) {	/* First time this thread got here. */
		dir = malloc(PATH_MAX);
		tis_setspecific(key, dir);
	}
 
/*
 * Copy dirname component of path into buffer and return.
 */
	lastslash = strrchr(path, '/');
	if(lastslash) {
		memcpy(dir, path, lastslash-path);
		dir[lastslash-path] = '\0';	/* Terminate after the copied bytes. */
	} else
		strcpy(dir, path);
	return dir;
}

The following TIS routines are used in the preceding example:

tis_key_create: Generates a unique data key.
tis_key_delete: Deletes a data key.
tis_getspecific: Obtains the data associated with the specified key.
tis_setspecific: Sets the data value associated with the specified key.

The __init_ and __fini_ routines are used in the example to initialize and destroy the thread-specific data key. This operation is done only once, and these routines provide a convenient way to ensure that this is the case, even if the library is loaded with dlopen(). See ld(1) for an explanation of how to use the __init_ and __fini_ routines.

Thread-specific data keys are provided by POSIX Threads Library at run time and are a limited resource. If your library must use a large number of data keys, code the library to create just one data key and store all of the separate data items as a structure or an array of pointers pointed to by that key.
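The single-key approach can be sketched as follows. To keep the sketch portable, it uses the POSIX pthread_key_create(), pthread_getspecific(), and pthread_setspecific() routines in place of their TIS counterparts, and pthread_once() in place of an initialization routine in the library. The structure lib_state and the lib_* functions are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

/* All per-thread items for this hypothetical library live in one
 * structure, so the library consumes only a single key. */
struct lib_state {
    unsigned int rand_seed;
    int          call_count;
    char         scratch[64];
};

static pthread_key_t  lib_key;
static pthread_once_t lib_once = PTHREAD_ONCE_INIT;

static void lib_make_key(void)
{
    /* free() runs on a thread's structure when that thread exits. */
    pthread_key_create(&lib_key, free);
}

/* Return the calling thread's state, allocating it on first use. */
static struct lib_state *lib_get_state(void)
{
    struct lib_state *st;

    pthread_once(&lib_once, lib_make_key);
    st = pthread_getspecific(lib_key);
    if (st == NULL) {
        st = calloc(1, sizeof *st);
        if (st != NULL && pthread_setspecific(lib_key, st) != 0) {
            free(st);
            st = NULL;
        }
    }
    return st;
}

/* Example library routine: counts the calls made by each thread.
 * Returns the caller's count, or -1 if the state is unavailable. */
int lib_count_call(void)
{
    struct lib_state *st = lib_get_state();

    return st != NULL ? ++st->call_count : -1;
}

/* Thread start routine for demonstration: stores this thread's first
 * count in the int that arg points to. */
static void *lib_thread_demo(void *arg)
{
    *(int *)arg = lib_count_call();
    return NULL;
}
```

Adding another per-thread item later means adding a member to the structure and does not require another key.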

12.4.2 Using Thread Local Storage

Thread Local Storage (TLS) support is always enabled in the C compiler (the cc command's -ms option is not required). In C++, TLS is recognized only with the -ms option, and it is otherwise treated as an error.

TLS is a name for data that has static extent (that is, not on the stack) for the lifetime of a thread in a multithreaded process, and whose allocation is specific to each thread.

In standard multithreaded programs, static-extent data is shared among all threads of a given process, whereas thread local storage is allocated on a per-thread basis such that each thread has its own copy of the data that can be modified by that thread without affecting the value seen by the other threads in the process. For a complete discussion of threads, see the Guide to the POSIX Threads Library.

The essential functionality of thread local storage has long been available through explicit application program interfaces (APIs), such as the POSIX Threads Library routines pthread_key_create(), pthread_setspecific(), pthread_getspecific(), and pthread_key_delete().

Although these APIs are portable to POSIX-conforming platforms, using them can be cumbersome and error-prone. Also, significant engineering work is typically required to take existing single-threaded code and make it thread-safe by replacing all of the appropriate static and extern variable declarations and their uses by calls to these thread-local APIs. Furthermore, for Windows-32 platforms there is a somewhat different set of APIs (TlsAlloc(), TlsGetValue(), TlsSetValue(), and TlsFree()), which have the same kinds of usability problems as the POSIX APIs.

By contrast, the TLS language feature is much simpler to use than any of the APIs, and it is especially convenient to use when converting single-threaded code to be multithreaded. This is because the change to make a static or extern variable have a thread-specific value only involves adding a storage-class qualifier to its declaration. The compiler, linker, program loader, and debugger effectively implement the complexity of the API calls automatically for variables declared with this qualifier. Unlike coding to the APIs, it is not necessary to find and modify all uses of the variables, or to add explicit allocation and deallocation code. While the language feature is not generally portable under any formal programming standard, it is portable between Tru64 UNIX and Windows-32 platforms.

12.4.2.1 The __thread Attribute

The C and C++ compilers for Tru64 UNIX include the extended storage-class attribute, __thread.

The __thread attribute must be used with the __declspec keyword to declare a thread variable. For example, the following code declares an integer thread local variable and initializes it with a value:

__declspec( __thread ) int tls_i = 1;
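The per-thread behavior of such a variable can be illustrated with the following sketch. As an assumption made for portability, it uses the bare __thread spelling accepted by GCC and Clang; on Tru64 UNIX the declaration would use the __declspec form shown above. The names tls_counter, bump_counter, and tls_demo are hypothetical.

```c
#include <pthread.h>

/* Assumption: GCC and Clang spelling of the thread-local qualifier. */
static __thread int tls_counter = 1;

static void *bump_counter(void *arg)
{
    int *result = arg;

    tls_counter += 10;        /* changes only this thread's copy */
    *result = tls_counter;
    return NULL;
}

/* Hypothetical helper: let another thread modify its own copy, store
 * that thread's value in *other, and return the caller's copy
 * (-1 if the thread could not be created). */
int tls_demo(int *other)
{
    pthread_t t;

    if (pthread_create(&t, NULL, bump_counter, other) != 0)
        return -1;
    pthread_join(t, NULL);
    return tls_counter;
}
```

Each new thread starts with its own copy initialized to 1, so the second thread's update is not visible to the caller.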

12.4.2.2 Guidelines and Restrictions

You must observe these guidelines and restrictions when declaring thread local objects and variables:

You can apply the __thread storage-class attribute only to data declarations and definitions. It cannot be used on function declarations or definitions. For example, the following code generates a compiler error:
```
#define Thread __declspec( __thread )
Thread void func();            // Error
```

You can specify the __thread attribute only on data items with static storage duration. This includes global data objects (both static and extern), local static objects, and static data members of C++ classes. You cannot declare automatic or register data objects with the __thread attribute. For example, the following code generates compiler errors:
```
#define Thread  __declspec( __thread )
void func1()
{
   Thread int tls_i;            // Error
}
 
int func2( Thread int tls_i )    // Error
{
   return tls_i;
}
```

You must use the __thread attribute for the declaration and the definition of a thread-local object, whether the declaration and definition occur in the same file or separate files. For example, the following code generates an error:
```
#define Thread          __declspec( __thread )
extern int tls_i;      // This generates an error, because the
int Thread tls_i;      // declaration and the definition differ.
```

You cannot use the __thread attribute as a type modifier. For example, the following code generates a compile-time error:
```
char __declspec( __thread ) *ch;            // Error
```

The address of a thread-local object is not considered a link-time constant, and any expression involving such an address is not considered a constant expression. In standard C, the effect of this is to forbid the use of the address of a thread-local variable as an initializer for an object that has static or thread-local extent. For example, the following code is flagged as an error by the C compiler if it appears at file scope:
```
#define Thread  __declspec( __thread )
Thread int tls_i;
int *p = &tls_i;            // ERROR
```

Standard C permits initialization of an object or variable with an expression involving a reference to itself, but only for objects of nonstatic extent. Although C++ normally permits such dynamic initialization of an object with an expression involving a reference to itself, this type of initialization is not permitted with thread local objects. For example:
```
#define Thread  __declspec( __thread )
Thread int tls_i = tls_i;            // C and C++ error
int j = j;                           // Okay in C++; C error
Thread int tls_i = sizeof( tls_i );  // Okay in C and C++
```
Note that a sizeof expression that includes the object being initialized does not constitute a reference to itself and is allowed in C and C++.

12.4.3 Using Mutex Locks to Share Data Between Threads

In some cases, using thread-specific data is not the correct way to convert static data into thread-safe code. For example, thread-specific data should not be used when a data object is meant to be shareable between threads (as in stdio streams within libc). Manipulating per-process resources is another case in which thread-specific data is inadequate. The following example shows how to manipulate per-process resources in a thread-safe fashion:

#include <pthread.h>
#include <string.h>
#include <tis.h>
 
/*
 * NOTE: The putenv() function would have to set and clear the
 * same mutex lock before it accessed the environment.
 */
 
extern char **environ;
static pthread_mutex_t environ_mutex = PTHREAD_MUTEX_INITIALIZER;
 
char *getenv(const char *name)
{
        char **s, *value;
        int len;
        tis_mutex_lock(&environ_mutex);
        len = strlen(name);
        for(s=environ; value=*s; s++)
                if(strncmp(name, value, len) == 0 &&
                   value[len] == '=') {
                        tis_mutex_unlock(&environ_mutex);
                        return &(value[len+1]);
                }
        tis_mutex_unlock(&environ_mutex);
        return (char *) 0L;
}

In the preceding example, note how the lock is set once (tis_mutex_lock) before accessing the environment and is unlocked exactly once (tis_mutex_unlock) before returning. In the multithreaded case, any other thread attempting to access the environment while the first thread holds the lock is blocked until the first thread performs the unlock operation. In the single-threaded case, no contention occurs unless an error exists in the coding of the locking and unlocking sequences.

If it is necessary for the lock state to remain valid across a fork() system call in multithreaded applications, it may be useful to create and register pthread_atfork() handler functions to set the lock prior to any fork() call, and to unlock it in both the child and parent after the fork() call. This guarantees that a fork() operation is not done by one thread while another thread holds the lock. If the lock were held by another thread, it would end up permanently locked in the child because the fork() operation produces a child with only one thread. In the case of an independent library, the call to pthread_atfork() can be done in an __init_ routine in the library. Unlike most Pthread routines, the pthread_atfork() routine is available in libc and may be used by both single-threaded and multithreaded applications.
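The handler registration described above might look like the following sketch. It uses the standard pthread_atfork(), fork(), and waitpid() calls; the mutex table_mutex and the table_* function names are hypothetical, and a real library would typically register the handlers once from its initialization routine.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): take the lock so that no
 * other thread can hold it at the moment the process is duplicated. */
static void table_prepare(void) { pthread_mutex_lock(&table_mutex); }

/* Run after fork() in the parent and in the child: release the lock. */
static void table_parent(void)  { pthread_mutex_unlock(&table_mutex); }
static void table_child(void)   { pthread_mutex_unlock(&table_mutex); }

/* Register the handlers; returns 0 on success. */
int table_register_fork_handlers(void)
{
    return pthread_atfork(table_prepare, table_parent, table_child);
}

/* Hypothetical check: fork, and report whether the child could take
 * the lock. Returns 1 if it could, 0 if not, and -1 on error. */
int table_lock_usable_in_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int ok = (pthread_mutex_trylock(&table_mutex) == 0);
        _exit(ok ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

With the handlers registered, the child always starts with the lock released, whichever thread in the parent called fork().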

12.5 Building Multithreaded Applications

The compilation and linking of multithreaded applications differs from that of single-threaded applications in a few minor but important ways. The following sections describe these differences.

12.5.1 Compiling Multithreaded C Applications

Depending on whether an application is single-threaded or multithreaded, many system header files provide different sets of definitions when they are included in the compilation of an application. Whether the compiler generates single-threaded or thread-safe behavior is determined by whether the _REENTRANT preprocessor symbol is defined. When you specify the -pthread option on the cc or c89 command, the _REENTRANT symbol is automatically defined; it is also defined if the pthread.h header file is included. This header file must be the first file included in any application that uses the Pthread library, libpthread.so.

The -pthread option has no other effect on the compilation of C programs. The reentrancy of the code generated by the C compiler is determined only by the proper use of reentrant coding practices by the programmer and by the use of thread-safe support routines or functions -- not by the use of any special options.

12.5.2 Linking Multithreaded C Applications

To link a multithreaded C application, use the cc or c89 command with the -pthread option. When linking, the -pthread option has the effect of modifying the library search path in the following ways:

The Pthread library is included into the link.

The exceptions library is included into the link.

For each library mentioned in a -l option, an attempt is made to locate and presearch a library of corresponding thread-safe routines whose name includes the suffix _r.

The -pthread option does not modify the behavior of the linker in any other way. The reentrancy of the linked code is determined by using proper programming practices in the original code, and by compiling and linking with the proper header files and libraries, respectively.

12.5.3 Building Multithreaded Applications in Other Languages

Not all compilers necessarily generate reentrant code; the definition of the language itself can make this difficult. It is also necessary for any run-time libraries linked with the application to be thread-safe. For details on such matters, consult the manual for the compiler you are using and the documentation for the run-time libraries.