This chapter contains notes about issues and known problems with the development environment software and, whenever possible, provides solutions or workarounds to those problems.
The following note applies to general programming.
The C runtime library
malloc
function and its associated functions have been modified to allow
significantly better concurrency when used by multithreaded
applications.
Additionally, three
new memory allocator tuning variables have been added to allow more
control of allocator behavior:
__first_fit
__max_cache
__delayed_free
As always when developing applications making significant use of dynamically
allocated memory and requiring maximum speed of execution, you should
carefully read the Tuning Memory Allocation section of the
malloc(3)
reference page.
Applications that directly map and access I/O space with bytes or shorts may be impacted by the new DEC C compiler.
The default tuning for the DEC C compiler has advanced its focus from EV4-EV5 architectures to EV56-EV6 architectures. With this change in tuning, the compiler now generates amask-guarded byte and word instruction sequences for some loops. The amask guards ensure that the byte and word instructions do not execute on processors that do not support them; on those processors, less efficient instructions execute instead.
The net result of this change is that users who recompile their applications with the default tuning may see a slight increase in object code size, a very slight decrease in performance on EV4-EV5 processors, and a sizable increase in performance on EV56-EV6 machines.
This change may be disruptive for applications that use special device driver interfaces that directly map I/O space for devices that do not support 8-bit and 16-bit access granularity.
If those applications are compiled without
-Wf,
-static
and are run on EV56-EV6 machines, they may corrupt I/O memory. To avoid this
possibility, those applications should be compiled with
-tune
ev5
which disables byte/word instruction generation.
The following notes apply to realtime programming.
The symbol
SA_SIGINFO,
defined in
sys/signal.h,
is not visible under certain namespace conditions when
_POSIX_C_SOURCE
is explicitly defined in the application or on the compile line.
The
SA_SIGINFO
symbol is visible if you do not explicitly define
_POSIX_C_SOURCE.
For most applications,
unistd.h
provides the standards definitions needed,
including
_POSIX_C_SOURCE.
As a general rule, avoid explicitly defining
standards macros in your application or on the compile line.
If you do explicitly define
_POSIX_C_SOURCE,
then
SA_SIGINFO
is
visible if you also explicitly define
_OSF_SOURCE.
POSIX 1003.1b synchronized I/O using file status flags does not apply to file truncation. When file status flags are used to control I/O synchronization, no synchronization occurs for file truncation operations.
You can use the
fsync()
or
fdatasync()
function to explicitly synchronize truncation operations.
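The workaround can be sketched as follows: truncate the file, then call fsync() explicitly instead of relying on the file status flags. The function name truncate_file_synced is hypothetical, and the sketch assumes the caller passes the path of an existing, writable file.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Truncate the file at path to length bytes, then explicitly
 * synchronize the change, because synchronized I/O file status flags
 * do not cover truncation. Returns 0 on success or -1 on failure.
 */
int truncate_file_synced(const char *path, off_t length)
{
    int fd, status;

    assert(path != NULL && length >= 0);

    fd = open(path, O_WRONLY);
    if (fd == -1)
        return -1;

    status = ftruncate(fd, length);
    if (status == 0)
        status = fsync(fd);     /* fdatasync(fd) is the other option named above */

    if (close(fd) == -1 && status == 0)
        status = -1;

    return status;
}
```

The explicit call is needed even when the descriptor was opened with a synchronization flag, because the flag does not apply to the truncation itself.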
A problem occurs when
fcntl()
is called with the
F_GETFL
request, and the file operated on has the
O_DSYNC
file status flag set. The return mask incorrectly indicates
O_SYNC
instead of
O_DSYNC.
The following notes apply to DECthreads. See Section 8.10 and Section 8.11 for information about DECthreads interfaces that will be retired in a future release. See Section 1.11 for information about Visual Threads, a new product that lets you analyze your multithreaded applications for potential logic and performance problems.
Users who desire optimal performance from DECthreads, and who are
willing to relink on future versions of Tru64 UNIX, might want to
use the DECthreads static libraries that are located in the
CMPDEVENH440
subset. Once this subset is installed, you can find the libraries
in the
/usr/opt/alt/usr/lib/threads
directory.
Before using these static libraries, you should read the README file in the same location.
Signal handling in the POSIX 1003.1c
(pthread)
interface of
DECthreads is substantially different from signal handling
for the draft 4 POSIX and the CMA interfaces of DECthreads. When
migrating your application from the draft 4 POSIX or CMA
interfaces to the POSIX 1003.1c interface, please see
the IEEE POSIX 1003.1c standard or
the
Guide to DECthreads
for a discussion of signal handling in threaded applications.
In releases prior to Version 4.0, thread scheduling attributes were systemwide. In other words, threads had a system contention scope. Since Version 4.0, thread policies and priorities are, by default, local to the process. No artificial limit exists for the priorities of these process contention scope threads; the full priority range is accessible to every thread.
Previously, there was no way to control the contention scope of a thread. Starting with Version 4.0D, applications coded to the POSIX 1003.1c pthreads interface can set the desired contention scope upon thread creation. For more information on setting and determining thread contention scope, see the descriptions of the following routines in the Guide to DECthreads:
pthread_attr_setscope()
pthread_attr_getscope()
The guide also describes a problem with inheritance of the contention scope scheduling attribute in Versions 4.0D and higher.
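The following sketch shows one way the two routines might be used together when creating a thread. The helper create_thread_with_scope and the start routine scope_demo_worker are hypothetical names, and the sketch does not address the inheritance problem described in the guide.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Create a thread with an explicit contention scope.
 * scope is PTHREAD_SCOPE_PROCESS or PTHREAD_SCOPE_SYSTEM.
 * Returns 0 on success or the error number from the failing call.
 */
int create_thread_with_scope(pthread_t *thread, int scope,
                             void *(*start)(void *), void *arg)
{
    pthread_attr_t attr;
    int status;
    int actual = -1;

    assert(thread != NULL && start != NULL);

    status = pthread_attr_init(&attr);
    if (status != 0)
        return status;

    /* Request the scope; this fails if the system does not support it. */
    status = pthread_attr_setscope(&attr, scope);

    /* Read the attribute back to confirm what the new thread will use. */
    if (status == 0)
        status = pthread_attr_getscope(&attr, &actual);

    if (status == 0) {
        assert(actual == scope);
        status = pthread_create(thread, &attr, start, arg);
    }

    pthread_attr_destroy(&attr);
    return status;
}

/* Hypothetical start routine: sets the int that arg points to. */
void *scope_demo_worker(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}
```

Passing PTHREAD_SCOPE_PROCESS requests a process contention scope thread. Not every system supports both scopes, so check the return value rather than assuming the request succeeded.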
Process contention scope threads provide faster context switches between threads in the same process, and reduce the demand on system resources without reducing execution concurrency. The Tru64 UNIX "two level scheduling" implementation (the code that supports process contention scope scheduling) automatically replaces kernel execution entities when a process contention scope thread blocks in the kernel for any reason, and it provides time-slicing of compute-bound threads. Therefore, there is no need to worry that using process contention scope will reduce parallelism or allow the execution of some threads to prevent other threads from executing.
The only code that should require system contention scope is code that must run on a specific processor through binding and code that must be directly scheduled by the operating system kernel against threads in other processes, particularly threads running inside the kernel. While the scheduling policy and priority of process contention scope threads are virtual and affect scheduling only against other threads within the process, the scheduling policy and priority of system contention scope threads (when the process runs with root access) can allow the thread to preempt threads within the kernel. Although this can sometimes be valuable and even essential, such programs must take extreme care to avoid locking up the system, because it might be impossible to interrupt such a thread.
Compaq does not recommend using the
stackaddr
thread creation attribute, which allows you to allocate your own stack for a
thread. The semantics of this attribute are poorly defined by POSIX and the
Single UNIX Specification, Version 2. As a result, code using the attribute is
unlikely to be portable between implementations. The attribute is difficult to
use reliably, since the developer must, by intimate knowledge of the machine
architecture and implementation, know the correct address to specify relative
to the allocated stack. The implementation cannot diagnose an incorrect value
because the interface does not provide sufficient information. Using an
incorrect value might result in program failure, possibly in obscure ways.
DECthreads now supports read-write locks. A read-write lock is a synchronization object for protecting a data object that can be accessed concurrently by more than one thread in the same program. Unlike a mutex, a read-write lock distinguishes between shared read and exclusive write operations on the shared data object. A read-write lock is most useful in protecting a shared data object that is read frequently and modified less frequently. The following routines provide access to the read-write lock capability:
pthread_rwlockattr_destroy(3)
pthread_rwlockattr_init(3)
pthread_rwlock_destroy(3)
pthread_rwlock_init(3)
pthread_rwlock_rdlock(3)
pthread_rwlock_tryrdlock(3)
pthread_rwlock_trywrlock(3)
pthread_rwlock_unlock(3)
pthread_rwlock_wrlock(3)
For more information about read-write locks, see the reference pages for these routines.
DECthreads now allows you to assign names, as C language strings, to thread objects including threads, mutexes, condition variables, and read-write locks (see Section 5.3.5). During debugging, you can use these names to help identify individual objects by function rather than by the numeric identifiers the thread library assigns. The Ladebug debugger and the Visual Threads analysis tool (see Section 1.11) include these names when displaying information about thread objects. Other debuggers and analysis tools can also use the names you have assigned.
Use the following routines to assign and retrieve object names:
pthread_attr_getname_np(3)
pthread_attr_setname_np(3)
pthread_cond_getname_np(3)
pthread_cond_setname_np(3)
pthread_getname_np(3)
pthread_key_getname_np(3)
pthread_key_setname_np(3)
pthread_mutex_getname_np(3)
pthread_mutex_setname_np(3)
pthread_rwlock_getname_np(3)
pthread_rwlock_setname_np(3)
pthread_setname_np(3)
For more information about object naming, see the reference pages for these routines.
In this release, the metering capabilities of DECthreads may not be reliable in a process that forks.
Older Alpha processors (prior to the 21264 chip) can access memory only in units of at least a quadword (8 bytes), yet multiple variables, each smaller than eight bytes, can occupy the same quadword in memory. In such cases, multithreaded programs might experience a problem if two or more threads read the same quadword, update different parts of it, then independently write their respective copies back to memory. The last thread to write the quadword overwrites any data previously written to other parts of the quadword. This can happen even though each thread protects its part of the quadword with its own mutex.
The Tru64 UNIX C compiler protects scalar variables against this problem by aligning them in memory on quadword (8-byte) boundaries. However, in composite data objects such as structures or arrays, the compiler aligns members on their natural boundaries. For example, a 2-byte member is aligned on a 2-byte boundary. Because of this, any adjacent members of the composite object that total eight bytes or less could occupy the same quadword in memory.
Inspect your multithreaded application code to determine if you have a composite data object in which adjacent members could share the same quadword in memory. If you do and if your project allows, Compaq recommends that you force alignment of each such member variable to a quadword boundary by redefining the variable to be at least eight bytes, or by defining sufficient padding storage after the variable to total eight bytes.
Alternatively, you can create one mutex for each composite data object in which adjacent members can share the same quadword in memory. Then use this single mutex to protect all write accesses by all threads to the composite data object. This technique might be less desirable because of performance considerations.
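The padding recommendation can be sketched as follows. The structure and member names are hypothetical, and the sketch assumes that each object of these types starts on a quadword boundary (for example, because it is the first member of a dynamically allocated block).

```c
#include <assert.h>
#include <stddef.h>

#define QUADWORD 8  /* bytes in an Alpha quadword */

/* Problem layout: both 2-byte members fall in the same quadword. */
struct counters_shared {
    short reads;
    short writes;
};

/*
 * Padded layout: explicit padding after each member rounds it up to
 * eight bytes, so the two members land in different quadwords when
 * the object itself starts on a quadword boundary.
 */
struct counters_padded {
    short reads;
    char  reads_pad[QUADWORD - sizeof(short)];
    short writes;
    char  writes_pad[QUADWORD - sizeof(short)];
};

/* Index of the quadword containing a byte offset within an object. */
size_t quadword_of(size_t offset)
{
    return offset / QUADWORD;
}

/* Sanity check: the first layout shares a quadword, the second does not. */
void check_layouts(void)
{
    assert(quadword_of(offsetof(struct counters_shared, reads)) ==
           quadword_of(offsetof(struct counters_shared, writes)));
    assert(quadword_of(offsetof(struct counters_padded, reads)) !=
           quadword_of(offsetof(struct counters_padded, writes)));
}
```

Padding increases the size of the object, so apply it only to members that different threads update under different mutexes.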
In order to allow for the possibility of a more comprehensive and robust threads
debugging environment, it has become necessary to remove the
pthread_debug()
and
pthread_debug_cmd()
routines. To prevent existing binaries from failing, the routines will continue to be
recognized. However, a call to either routine now results in an immediate return to the
calling program. The
pthread_debug_cmd()
routine returns a zero (0) indicating success. Debuggers such
as Ladebug and TotalView provide functionality formerly provided by these routines.
The
SIGEV_THREAD
notification mechanism works correctly, starting in
Version 4.0D. Using this notification mechanism, a user-defined function
is called to perform notification of an asynchronous event. The function
is run as though it were the start routine of a thread and can make full
use of the DECthreads synchronization objects.
The
SIGEV_THREAD
notification mechanism and the function to be called
are specified in the sigevent structure. This mechanism is useful for
programming with the POSIX 1003.1b realtime signal interfaces such as timers
and asynchronous I/O. For information and cautions concerning the use of
signals in a multithreaded environment, see the
Guide to DECthreads.
For more information about using
SIGEV_THREAD,
see the IEEE POSIX 1003.1c-1996
standard and The Open Group Single UNIX Specification, Version 2.
The following notes apply to the profiler tools.
The
-cputime
option of the
hiprof(5)
profiler now provides an instruction-count profile for threaded programs,
the same as the
-calltime
option, because the CPU cycles reported for kernel-threads
by the RPCC instruction cannot be mapped to
pthread(3)
threads.
The only significant difference is that the profile is displayed as
the number of instructions executed instead of CPU seconds used. The
-cputime
option still profiles CPU seconds for nonthreaded programs.
The
cc
command's
-prof_gen
option (which causes the
pixie
profiler to be run after the executable is
linked) names files differently from the way it did in releases prior to
Version 4.0E.
The new naming scheme is necessary to support formal benchmarking, which is the
primary purpose of the
-prof_gen
option.
Before Version 4.0E, the uninstrumented executable produced by the
ld
linker and provided as input to
pixie
was named
a.out
(or as indicated with the
-o
option). The instrumented executable produced by
pixie
was given the usual
.pixie
filename extension.
Starting with Version 4.0E, the instrumented executable is named
a.out
(or as indicated with the
-o
option). The uninstrumented executable is given a
.non_pixie
file name extension.
The following note applies to debugging with
dbx.
When debugging a crash dump with
dbx,
you can examine the call stack of the user
program whose execution precipitated the kernel crash. To examine a crash dump
and also view the user program stack, you must invoke
dbx
using the following command syntax:
# dbx -k vmunix.n vm[z]core.n path/user-program
The version number
(n)
is determined by the value contained in the bounds file,
which is located in the same directory as the dump files. The
user-program
parameter specifies the user program executable.
The crash dump file must contain a full crash dump. For information on setting
system defaults for full or partial crash dumps, see the
System Administration
guide.
You can use the
assign
command in
dbx,
as shown in the following
example, to temporarily specify a full crash dump. This setting stays in effect
until the system is rebooted.
# dbx -k vmunix.3
dbx version 5.0
.
.
.
(dbx) assign partial_dump=0
To specify a full crash dump permanently so that this setting remains in effect
after a reboot, use the
patch
command in
dbx,
as shown in the following example:
(dbx) patch partial_dump=0
With either command, a
partial_dump
value of 1 specifies a partial dump.
The following example shows how to examine the state of a user program named test1 that purposely precipitated a kernel crash with a syscall after several recursive calls:
# dbx -k vmunix.1 vmzcore.1 /usr/proj7/test1
dbx version 5.0
Type 'help' for help.
stopped at [boot:1890 ,0xfffffc000041ebe8] Source not available
warning: Files compiled -g3: parameter values probably wrong
(dbx) where    [1]
>  0 boot() ["../../../../src/kernel/arch/alpha/machdep.c":1890, 0xfffffc000041ebe8]
   1 panic(0xfffffc000051e1e0, 0x8, 0x0, 0x0, 0xffffffff888c3a38) ["../../../../src/kernel/bsd/subr_prf.c":824, 0xfffffc0000281974]
   2 syscall(0x2d, 0x1, 0xffffffff888c3ce0, 0x9aa1e00000000, 0x0) ["../../../../src/kernel/arch/alpha/syscall_trap.c":593, 0xfffffc0000423be4]
   3 _Xsyscall(0x8, 0x3ff8010f9f8, 0x140008130, 0xaa, 0x3ffc0097b70) ["../../../../src/kernel/arch/alpha/locore.s":1409, 0xfffffc000041b0f4]
   4 __syscall(0x0, 0x0, 0x0, 0x0, 0x0) [0x3ff8010f9f4]
   5 justtryme(scall = 170, cpu = 0, levels = 25) ["test1.c":14, 0x120001310]
   6 recurse(inbox = (...)) ["test1.c":28, 0x1200013c4]
   7 recurse(inbox = (...)) ["test1.c":30, 0x120001400]
   8 recurse(inbox = (...)) ["test1.c":30, 0x120001400]
   9 recurse(inbox = (...)) ["test1.c":30, 0x120001400]
.
.
.
  30 recurse(inbox = (...)) ["test1.c":30, 0x120001400]
  31 main(argc = 3, argv = 0x11ffffd08) ["test1.c":52, 0x120001518]
(dbx) up 8    [2]
recurse: 30 if (r.a[2] > 0) recurse(r);
(dbx) print r    [3]
struct {
    a = {
        [0] 170
        [1] 0
        [2] 2
        [3] 0
.
.
.
(dbx) print r.a[511]    [4]
25
(dbx)
[1] The where command displays the kernel stack followed by the user program stack at the time of the crash. In this case, the kernel stack has 4 activation levels; the user program stack starts with the fifth level and includes several recursive calls.
[2] The up 8 command moves the debugging context 8 activation levels up the stack to one of the recursive calls within the user program code.
[3] The print r command displays the current value of the variable r, which is a structure of array elements. Full symbolization is available for the user program, assuming it was compiled with the -g option.
[4] The print r.a[511] command displays the current value of array element 511 of structure r.
The following note applies to Java programming.
A file system conflict exists between Java and the System V Environment (SVE) on Version 4.0 and later systems.
The problem arises because both Java and SVE use the file system path name string
/usr/bin/alpha
for different purposes. Java creates
/usr/bin/alpha
as a directory. SVE (specifically, the optional
SVEBCP4** Base Compatibility Package subset) creates
/usr/bin/alpha
as a symbolic link to the
/usr/opt/svr4/usr/bin/alpha
directory. The order in which these applications are installed
determines if the customer will experience a problem.
Here are three ways to avoid the problem:
If the
/usr/bin/alpha
link exists, it is safe to remove the link. The link is
not used after the SVE installation; in all other situations, SVE looks
for the directory location,
/usr/opt/svr4/usr/bin/alpha.
That directory will be found and does not cause a conflict.
/usr/bin/alpha
link will not be needed during SVE installation.
/usr/bin/alpha
directory that is used by Java
/usr/bin/alpha
link created by the SVE installation
/usr/bin/alpha
directory
There will be no patch or other resolution mechanism for this problem other than the workaround provided here.