3.1 Design for Asynchronous Execution

When programming with threads, always keep in mind that the execution of a thread is inherently asynchronous with respect to other threads running in the system (or in the process). You cannot depend on any synchronization between two threads unless you explicitly code that synchronization into your program using one of the following:

Mutexes
A properly tested application predicate loop on a condition variable
A call to join with a thread you expect to terminate
An equivalent platform-dependent programming construct (such as VAX interlocked instructions or Alpha load-locked/store-conditional sequences)
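
The first three mechanisms listed above can be sketched together as follows. This is a minimal illustration using the POSIX threads interface, not a definitive implementation; the names ready_lock, ready_cond, ready, payload, producer, and wait_for_payload are hypothetical and chosen only for this example. The waiting thread tests its predicate in a loop while holding the mutex, so it behaves correctly whether the producer runs before, during, or after the wait, and it joins with the producer before returning.

```c
#include <pthread.h>

/* Hypothetical shared state: "ready" is the predicate, guarded by ready_lock. */
static pthread_mutex_t ready_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready_cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;
static int payload = 0;

/* Publishes a value, sets the predicate, and signals, all under the mutex. */
static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&ready_lock);
    payload = 42;
    ready = 1;
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&ready_lock);
    return NULL;
}

/* Starts the producer and waits for its value.
 * Returns the payload, or -1 if the thread could not be created. */
int wait_for_payload(void)
{
    pthread_t tid;
    int value;

    /* Reset the shared state before the producer exists. */
    pthread_mutex_lock(&ready_lock);
    ready = 0;
    payload = 0;
    pthread_mutex_unlock(&ready_lock);

    if (pthread_create(&tid, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&ready_lock);
    while (!ready)                          /* predicate loop: recheck after every wakeup */
        pthread_cond_wait(&ready_cond, &ready_lock);
    value = payload;
    pthread_mutex_unlock(&ready_lock);

    pthread_join(tid, NULL);                /* explicit synchronization with termination */
    return value;
}
```

The loop around pthread_cond_wait matters because a wait can return before the predicate is true, and because the producer may have already set the predicate before the waiter reaches the loop, in which case no wait occurs at all.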

Some existing implementations of threads operate by context switching threads in user mode, within a single operating system process. Context switches between such threads occur only at relatively determinate times, such as when you make a blocking call to the threads library or when a timeslice interrupt occurs. This type of threading library might be termed "slightly asynchronous" because, with such a library, you can get away with many errors.

Systems that support kernel threads are less forgiving because context switches between threads can occur more frequently, and for less deterministic reasons. Systems that allow threads within a single process to run simultaneously on multiple processors are even less forgiving.

The following are examples of common programming errors that often go unnoticed under some implementations but cause failures under others:

Creating a thread with an argument that points to stack local data, or to global or static data that is serially reused for a sequence of threads.
There is no guarantee of when a thread will start. It can start immediately, or it might not start for a significant period of time, depending on its priority in relation to other threads that are currently running. The start time can also depend on the behavior of other processes, as well as on other threaded subsystems within the current process.
Specifically, the thread started with a pointer to stack local data may not start until the creating thread's routine has returned, and the storage may have been changed by other calls. The thread started with a pointer to global or static data may not start until the storage has been reused to create another thread.
Creating a thread before initializing the DECthreads objects (such as mutexes) or global data that the thread will use.
On slightly asynchronous systems this is often safe because the thread will probably not run until the creator blocks. Thus, the error can go undetected initially. On another system (or in a later release of the operating system) that supports kernel threading, the created thread may run immediately, before the data has been initialized. This can lead to failures that are difficult to detect. Note that a thread may run to completion before the call that created it returns to the creator. The system load may affect the timing as well.
Using scheduling policy and priority as a synchronization mechanism.
In a uniprocessor system, only one thread can run at a time, and when a high-priority thread becomes runnable it immediately preempts a lower-priority running thread. Therefore, a thread running at high priority might erroneously be presumed not to need a mutex to access shared data.
On a multiprocessor system, high- and low-priority threads are likely to run at the same time. Situations can even arise where high-priority threads are waiting to run while the threads that are running have a lower priority.
Even if you know that your code is only going to run on a uniprocessor implementation, never try to use scheduling instead of synchronization. Your code will be safer, more portable, and upward compatible with a new release of the system with SMP support if you design the code correctly in the first place.
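
As a hedged sketch of how to avoid the first and third errors above, the following POSIX threads example gives each thread its own heap-allocated argument, so nothing is reused or left on a stack frame that may disappear, and it protects the shared total with a mutex regardless of thread priority. The names work_arg, worker, run_workers, total, and total_lock are hypothetical and used only for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared data: always accessed under total_lock. */
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;
static long total = 0;

/* Hypothetical per-thread argument; one instance is allocated per thread. */
struct work_arg {
    int id;
};

static void *worker(void *p)
{
    struct work_arg *arg = p;

    pthread_mutex_lock(&total_lock);    /* a mutex, not priority, guards the data */
    total += arg->id;
    pthread_mutex_unlock(&total_lock);

    free(arg);                          /* the thread owns its argument */
    return NULL;
}

/* Starts "count" threads with ids 1..count (count must be positive).
 * Returns the sum of the ids, or -1 on failure. */
long run_workers(int count)
{
    pthread_t *tids;
    int started = 0;
    long result;

    if (count <= 0)
        return -1;
    tids = malloc((size_t)count * sizeof *tids);
    if (tids == NULL)
        return -1;

    pthread_mutex_lock(&total_lock);
    total = 0;
    pthread_mutex_unlock(&total_lock);

    for (int i = 0; i < count; i++) {
        /* Fresh storage for every thread: safe no matter when it starts. */
        struct work_arg *arg = malloc(sizeof *arg);
        if (arg == NULL)
            break;
        arg->id = i + 1;
        if (pthread_create(&tids[started], NULL, worker, arg) != 0) {
            free(arg);
            break;
        }
        started++;
    }

    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_lock(&total_lock);
    result = (started == count) ? total : -1;
    pthread_mutex_unlock(&total_lock);

    free(tids);
    return result;
}
```

Passing the address of the loop variable instead of a separately allocated argument would be an instance of the first error: the creator may change or discard that storage before the new thread reads it.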

Before you create a thread, set up everything that the thread will need in order to execute. If you need to set the thread's scheduling parameters, for example, do so with an attributes object when you create the thread, rather than trying to use pthread_setschedparam or other routines afterwards. If you need to set global data for the thread or create synchronization objects, do so before you create the thread, or do so in a pthread_once initialization routine that is called from each thread.
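
The following is a minimal sketch of that advice using the POSIX threads interface. The names setup, reader, start_configured_thread, config_lock, and config_value are hypothetical. The choice of SCHED_OTHER at its minimum priority is an assumption made so that the example needs no special privileges; the fallback to inherited scheduling is likewise an assumption, included in case a system rejects the explicit attributes.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical global state, initialized exactly once by setup(). */
static pthread_once_t  setup_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t config_lock;
static int             config_value;

static void setup(void)
{
    pthread_mutex_init(&config_lock, NULL);
    config_value = 7;
}

static void *reader(void *arg)
{
    int *out = arg;

    /* Every thread calls pthread_once, so whichever runs first does the setup. */
    pthread_once(&setup_once, setup);

    pthread_mutex_lock(&config_lock);
    *out = config_value;
    pthread_mutex_unlock(&config_lock);
    return NULL;
}

/* Creates one thread whose scheduling is fixed at creation time.
 * Returns the value the thread observed, or -1 on failure. */
int start_configured_thread(void)
{
    pthread_attr_t attr;
    struct sched_param param = {0};
    pthread_t tid;
    int seen = -1;
    int rc;

    if (pthread_attr_init(&attr) != 0)
        return -1;

    /* Scheduling is set in the attributes object, before the thread exists. */
    param.sched_priority = sched_get_priority_min(SCHED_OTHER);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_OTHER);
    pthread_attr_setschedparam(&attr, &param);

    rc = pthread_create(&tid, &attr, reader, &seen);
    if (rc != 0) {
        /* Assumed fallback: inherit the creator's scheduling instead. */
        pthread_attr_setinheritsched(&attr, PTHREAD_INHERIT_SCHED);
        rc = pthread_create(&tid, &attr, reader, &seen);
    }
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;

    /* Passing &seen is safe only because this routine joins before it returns. */
    pthread_join(tid, NULL);
    return seen;
}
```

Because the attributes object is fully prepared before pthread_create is called, the new thread never runs with scheduling parameters other than the intended ones, even if it starts executing immediately.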