The size of the default pointer type on Tru64 UNIX systems is 64 bits, and all system interfaces use 64-bit pointers. The Compaq C compiler, in addition to supporting 64-bit pointers, also supports the use of 32-bit pointers.
In most cases, 64-bit pointers do not cause any problems in a program and the issue of 32-bit pointers can be ignored altogether. However, in the following cases, the issue of 32-bit pointers does become a concern:
If you are porting a program with pointer-to-int assignments
If the 64-bit pointer size is unnecessary for your program and it causes your program to use too much memory -- for example, your program uses very large structures composed of pointer fields and has no need to exceed the 32-bit address range in those fields
If you are developing a program for a mixed system environment (32- and 64-bit systems) and the program's in-memory structures are accessed from both types of systems
The use of 32-bit pointers in applications requires that the applications be compiled and linked with special options and, depending on the specific nature of the code, may also require source-code changes.
The following types of pointers are referred to in this appendix:
Short pointer: A 32-bit pointer. When a short pointer is declared, 32 bits are allocated.
Long pointer: A 64-bit pointer. When a long pointer is declared, 64 bits are allocated. This is the default pointer type on Tru64 UNIX systems.
Simple pointer: A pointer to a nonpointer data type, for example, int *num_val. More specifically, the pointed-to type contains no pointers, so the size of the pointed-to type does not depend on the size of a pointer.
Compound pointer: A pointer to a data type whose size depends upon the size of a pointer, for example, char **FontList.
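As a rough illustration of the distinction between simple and compound pointers, the following sketch reports the size of the object each kind of pointer refers to. The function names are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Size of the object that a simple pointer such as "int *num_val"
   points to. The target contains no pointers, so this value does
   not change with the pointer size. */
size_t simple_target_size(void)
{
    int *num_val = NULL;
    return sizeof *num_val;     /* same as sizeof(int) */
}

/* Size of the object that a compound pointer such as "char **FontList"
   points to. The target is itself a pointer, so this value follows
   the pointer size in effect. */
size_t compound_target_size(void)
{
    char **FontList = NULL;
    return sizeof *FontList;    /* same as sizeof(char *) */
}
```

Only the second value would be expected to change when the pointer size changes, which is why compound pointers need more care than simple pointers.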
A.1 Compiler-System and Language Support for 32-Bit Pointers
The following mechanisms control the use of 32-bit pointers:
The cc options -xtaso and -xtaso_short are needed for compiling programs that use pointers with a 32-bit data type:
-xtaso
Enables recognition of the #pragma pointer_size directive and causes -taso (truncated address support option) to be passed to the linker (if linking).
-xtaso_short
Same as -xtaso, except -xtaso_short also sets the initial compiler default state to use short pointers. Because all system routines continue to use 64-bit pointers, most applications require source changes when compiled in this way. However, the use of protect_headers_setup (see Section A.3.3) can greatly reduce or eliminate the need to change the source code.
Because the use of short pointers, in general, requires a thorough knowledge of the application they are applied to, -xtaso_short is not recommended for use as a porting aid. In particular, do not attempt to use -xtaso_short to port a poorly written program (that is, a program that heavily mixes pointer and int values).
The ld option -taso ensures that executable files and associated shared libraries are located in the lower 31-bit addressable virtual address space. The -taso option can be helpful when porting programs that assume address values can be stored in 32-bit variables (that is, programs that assume that pointers are the same length as int variables). The -taso option does not affect the size of the pointer data type; it just ensures that the value of any pointer in a program would be the same in either a 32-bit or a 64-bit representation.
The -taso linker option does impose restrictions on the run-time environment and how libraries can be used. See Section A.2 for details on the -taso option.
The #pragma pointer_size directive controls the size of pointer types in a C program. These pragmas are recognized by the compiler only if the -xtaso or -xtaso_short option is specified on the cc command line; they are ignored if neither of the options is specified. See Section 3.9 for details on the #pragma pointer_size directive.
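The following sketch suggests how the pragma might be used to shrink a structure composed of pointer fields. It assumes the save, short, and restore arguments of #pragma pointer_size described in Section 3.9, and it assumes that Compaq C predefines the __DECC macro, which is used here so that other compilers skip the pragmas. The structure and function names are hypothetical.

```c
#include <stddef.h>

/* Assumption: __DECC is predefined by Compaq C. The pragmas have an
   effect only when -xtaso or -xtaso_short is specified. */
#ifdef __DECC
#pragma pointer_size save      /* remember the current pointer size */
#pragma pointer_size short     /* pointers declared below are 32 bits */
#endif

struct node {                  /* hypothetical structure of pointer fields */
    struct node *next;
    struct node *prev;
    char        *name;
};

#ifdef __DECC
#pragma pointer_size restore   /* return to the saved pointer size */
#endif

/* Reports how much memory one node occupies. */
size_t node_size(void)
{
    return sizeof(struct node);
}
```

With short pointers in effect, each field would be expected to occupy 4 bytes rather than 8. On a compiler that ignores the pragmas, the fields keep the default pointer size.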
The following example demonstrates the use of both short and long pointers:
#include <stdio.h>

int main(void)
{
    int *a_ptr;
    printf("A pointer is %ld bytes\n", (long) sizeof(a_ptr));
    return 0;
}
When compiled with either the default settings or the -xtaso option, the sample program prints the following message:
A pointer is 8 bytes
When compiled with the -xtaso_short option, this sample program prints the following message:
A pointer is 4 bytes
The -taso option establishes 32-bit addressing within all 64-bit pointers within a program. It thereby solves almost all 32-bit addressing problems, except those that require constraining the physical size of some pointers to 32 bits (which is handled by the -xtaso or -xtaso_short option and pointer_size pragmas).
The -taso option is most frequently used to handle addressing problems introduced by pointer-to-int assignments in a program. Many C programs, especially older C programs that do not conform to currently accepted programming practices, assign pointers to int variables. Such assignments are not recommended, but they do produce correct results on systems in which pointers and int variables are the same size. However, on a Tru64 UNIX system, this practice can produce incorrect results because the high-order 32 bits of an address are lost when a 64-bit pointer is assigned to a 32-bit int variable.
The following code fragment illustrates this problem:
{
    char *x;    /* 64-bit pointer */
    int z;      /* 32-bit int variable */
    .
    .
    .
    x = malloc(1024);   /* get memory and store address in 64 bits */
    z = x;              /* assign low-order 32 bits of 64-bit pointer to
                           32-bit int variable */
}
The most portable way to fix the problem presented by pointer-to-int assignments in existing source code is to modify the code to eliminate this type of assignment. However, in the case of large applications, this can be time-consuming. (To find pointer-to-int assignments in existing source code, use the lint -Q command.)
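One way to eliminate such an assignment is to store the address in an integer type that is wide enough to hold a pointer. The sketch below assumes that the compiler provides intptr_t in <stdint.h>; on a Tru64 UNIX system a long is also 64 bits and could play the same role. The function name is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Round-trips a pointer through an integer that is wide enough to
   hold it. Returns 1 if the pointer survives unchanged, 0 otherwise
   (including the case where the allocation fails). */
int roundtrip_preserves_pointer(void)
{
    char *x = malloc(1024);
    intptr_t z;                 /* assumption: intptr_t is available */
    char *y;
    int same;

    if (x == NULL)
        return 0;
    z = (intptr_t) x;           /* no address bits are discarded */
    y = (char *) z;
    same = (y == x);
    free(x);
    return same;
}
```

Because z is as wide as the pointer, the conversion back yields the original address regardless of where the memory was allocated.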
Another way to overcome this problem is to use the -taso option. The -taso option makes it unnecessary for the pointer-to-int assignments to be modified. It does this by causing a program's address space to be arranged so that all locations within the program -- when it starts execution -- can be expressed within the 31 low-order bits of a 64-bit address, including the addresses of routines and data coming from shared libraries.
The -taso option does not in any way affect the sizes used for any of the data types supported by the system. Its only effect on any of the data types is to limit addresses in pointers to 31 bits (that is, the size of pointers remains at 64 bits, but addresses use only the low-order 31 bits).
A.2.1 Use and Effects of the -taso Option
The -taso option can be specified on the cc or ld command lines used to create object modules. (If specified on the cc command line, the option is passed to the ld linker.) The -taso option directs the linker to set a flag in object modules, and this flag directs the loader to load the modules into 31-bit address space.
The 31-bit address limit is used to avoid the possibility of setting the sign bit (bit 31) in 32-bit int variables when pointer-to-int assignments are made. Allowing the sign bit to be set in an int variable by pointer-to-int assignment would create a potential problem with sign extension.
For example:
{
    char *x;    /* 64-bit pointer */
    int z;      /* 32-bit int variable */
    .
    .
    .
    /* address of named_obj = 0x0000 0000 8000 0000 */
    x = &named_obj;   /* 0x0000 0000 8000 0000 = original pointer value */
    z = x;            /* 0x8000 0000 = value created by pointer-to-int
                         assignment */
    x = z;            /* 0xffff ffff 8000 0000 = value created by
                         pointer-to-int-to-pointer or pointer-to-int-to-long
                         assignment (32 high-order bits set to ones by
                         sign extension) */
}
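The sign extension described in the comments above can be reproduced with fixed-width integer types standing in for the pointer and the int variable. This is only a model of the effect, not code from the system; the function name is hypothetical, and the narrowing conversion is implementation-defined in C, although common two's-complement compilers simply keep the low-order bits.

```c
#include <stdint.h>

/* Models a pointer-to-int-to-pointer round trip for a given address. */
uint64_t int_roundtrip(uint64_t address)
{
    /* Keep only the low-order 32 bits, as the pointer-to-int
       assignment does. */
    int32_t z = (int32_t) (uint32_t) address;

    /* Widening a negative 32-bit value to 64 bits copies the sign
       bit into the 32 high-order bits. */
    return (uint64_t) (int64_t) z;
}
```

An address with bit 31 clear comes back unchanged, whereas an address such as 0x80000000 comes back with its 32 high-order bits set. That is the reason for the 31-bit limit rather than a 32-bit limit.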
The -taso option ensures that the text and data segments of an application are loaded into memory that can be reached by a 31-bit address. Therefore, whenever a pointer is assigned to an int variable, the values of the 64-bit pointer and the 32-bit variable will always be identical (except in the special situations described in Section A.2.2).
Figure A-1 is an example of a memory diagram of programs that use the -taso and -call_shared options. (If you invoke the linker through the cc command, the default is -call_shared. If you invoke ld directly, the default is -non_shared.)
Figure A-1: Layout of Memory Under -taso Option
Note that stack and heap addresses will also fit into 31 bits. The stack grows downward from the bottom of the text segment, and the heap grows upward from the top of the data segment.
The -T and -D options (linker options that are used to set text and data segment addresses, respectively) can also be used to ensure that the text and data segments of an application are loaded into low memory. The -taso option, however, in addition to setting default addresses for text and data segments, also causes shared libraries linked outside the 31-bit address space to be appropriately relocated by the loader.
The default addresses used for the text and data segments are determined by the options that you specify on the cc command line:
Specifying the -non_shared or -call_shared option with the -taso option results in the following defaults:
Specifying the -shared option with the -taso option results in the following defaults:
Using these default values produces sufficient amounts of space for text and data segments for most applications (see the Object File/Symbol Table Format Specification for details on the contents of text and data segments). The default values also allow an application to allocate a large amount of mmap space.
If you specify the -taso option and also specify text and data segment address values with -T and -D, the values specified override the -taso default addresses.
You can use the odump utility to check whether a program was built successfully within 31-bit address space. To display the start addresses of the text, data, and bss segments, enter the following command:
% odump -ov obj_file_x.o
None of the addresses should have any bits set in bits 31 to 63; only bits 0 to 30 should ever be set.
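The same criterion can be written as a small predicate, for example when checking addresses in a test program. This is a minimal sketch; the function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the address uses only bits 0 to 30, that is, if
   bits 31 to 63 are all clear; returns 0 otherwise. */
int fits_in_31_bits(uint64_t address)
{
    return (address >> 31) == 0;
}
```

The largest value that passes the check is 0x7fffffff.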
Shared objects built with -taso cannot be linked with shared objects that were not built with -taso. If you attempt to link taso shared objects with nontaso shared objects, the following error message is displayed:
Cannot mix 32 and 64 bit shared objects without -taso option
A.2.2 Limits on the Effects of the -taso Option
The -taso option does not prevent a program from mapping addresses outside the 31-bit limit, and it does not issue warning messages if this is done. Such addresses could be established using any one of the following mechanisms:
-T and -D options
As noted previously, if the -T and -D options are used with the -taso option, the values that you specify for them will override the -taso option's default values. Therefore, to avoid defeating the purpose of the -taso option, you must select addresses for the -T and -D options that are within the 31-bit address range.
malloc() function
To avoid problems with addressing when you use malloc in a taso application, you must ensure that the sum of the default data-size resource limit and the starting address of the data segment does not exceed the maximum 31-bit address (0x7fff ffff).
The data-size resource limit is the maximum amount of data space that can be used by a process. This limit can be adjusted using the limit (C shell) or ulimit (Korn shell) commands. As noted previously, the starting address of the data segment can be adjusted using the -D option on the cc command.
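The condition on malloc can be expressed as a simple arithmetic check. The sketch below is illustrative only: the function name is hypothetical, and the caller is assumed to supply the data segment starting address and the data-size resource limit in bytes.

```c
#include <stdint.h>

#define TASO_MAX_ADDRESS 0x7fffffffULL   /* maximum 31-bit address */

/* Returns 1 if data_start + data_limit does not exceed the maximum
   31-bit address, 0 otherwise. The comparison is rearranged so that
   the addition cannot overflow. */
int heap_stays_in_taso_range(uint64_t data_start, uint64_t data_limit)
{
    if (data_start > TASO_MAX_ADDRESS)
        return 0;
    return data_limit <= TASO_MAX_ADDRESS - data_start;
}
```

For example, a hypothetical data segment starting at 0x10000000 with a data-size limit of 0x10000000 bytes passes the check, whereas the same limit with a starting address of 0x70000000 does not.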
mmap system call
Applications that use the mmap system call must use a jacket routine to mmap to ensure that mapping addresses do not exceed a 31-bit range. This entails taking the following steps:
To prevent mmap from allocating space outside the 31-bit address space, specify the following compilation option on the cc command line for all modules (or at least all modules that refer to mmap):
-Dmmap=_mmap_32_
This option equates the name of the mmap function with the name of a jacket routine (_mmap_32_). As a result, the jacket routine is invoked whenever references are made to the mmap function in the source program.
If the mmap function is invoked in only one of the source modules, either include the jacket routine in that module or create an mmap_32c.o object module and specify it on the cc command line. (The file specification for the jacket routine is /usr/opt/alt/usr/lib/support/mmap_32.c.)
If the mmap function is invoked from more than one source file, you must use the method of creating an mmap_32c.o object module and specifying it on a cc command line because including the jacket routine in more than one module would generate linker errors.
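The name substitution that the -D option performs can be modeled in a few lines of C. The sketch below does not use mmap or the system-supplied jacket routine; map_region and map_region_jacket are hypothetical stand-ins that only show how a macro redirects calls to a jacket.

```c
/* Counters that record which routine was actually called. */
static int real_calls = 0;
static int jacket_calls = 0;

/* Stand-in for the real routine (plays the role of mmap). */
long map_region(long length)
{
    real_calls++;
    return length;
}

/* Stand-in for the jacket (plays the role of _mmap_32_). It is
   defined before the macro below, so its own call still reaches
   the real routine. */
long map_region_jacket(long length)
{
    jacket_calls++;
    return map_region(length);
}

/* Has the same effect on the code that follows as compiling it with
   -Dmap_region=map_region_jacket on the command line. */
#define map_region map_region_jacket

/* Application code written against the original name. */
long application_code(void)
{
    return map_region(4096L);   /* actually calls map_region_jacket */
}
```

The sketch also suggests why the jacket must be defined only once in a program: a second definition of the same external function in another module would be reported as a duplicate symbol at link time.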
A.3 Using the -xtaso or -xtaso_short Option
The -xtaso and -xtaso_short options enable you to use both short (32-bit) and long (64-bit) pointer data types in a program. Note that the 64-bit data type is constrained to 31-bit addressing because -xtaso and -xtaso_short both engage the -taso option.
You should use the -xtaso or -xtaso_short option only when you need to use the short pointer data type in your program. If you want to use 32-bit addressing but do not need short pointers, use the -taso option.
Most programs that use short pointers will also need to use long pointers because Tru64 UNIX is a 64-bit system and all applications must use 64-bit pointers wherever pointer data is exchanged with the operating system or any system-supplied library. Because normal applications use the standard system data types, no conversion of pointers is needed. In an application that uses short pointers, it may be necessary to explicitly convert some short pointers to long pointers by using pointer_size pragmas (see Section 3.9).
Note
New applications for which the use of short pointers is being considered should be developed with long pointers first and then analyzed to determine whether short pointers would be beneficial.
A.3.1 Coding Considerations Associated with Changing Pointer Sizes
The following coding considerations may be pertinent when you are working with pointers in your source code:
The size of pointers used in a typedef that includes pointers as part of its definition is determined when the typedef is declared, not when it is used. Therefore, if a short pointer is declared as part of a typedef definition, all variables that are declared using that typedef will use a short pointer, even if those variables are compiled in a context where long pointers are being declared.
The size of pointers within a macro is governed by the context in which the macro is expanded. The only way to specify pointer size as part of a macro is by using a typedef declared with the desired pointer size.
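The difference between a typedef and a macro can be sketched as follows. The example assumes the save, short, and restore arguments of #pragma pointer_size (see Section 3.9), compilation with -xtaso so that long pointers are the initial default, and the __DECC predefined macro, which is used here so that other compilers skip the pragmas. The type, macro, and function names are hypothetical.

```c
#include <stddef.h>

#ifdef __DECC                       /* assumption: predefined by Compaq C */
#pragma pointer_size save
#pragma pointer_size short
#endif

typedef char *short_str;            /* pointer size is fixed here, at the
                                       point of declaration */
#define MACRO_STR char *            /* pointer size is decided wherever
                                       the macro is expanded */

#ifdef __DECC
#pragma pointer_size restore        /* back to the saved pointer size */
#endif

/* Declared through the typedef: keeps the size in effect when the
   typedef was declared. */
size_t typedef_ptr_size(void)
{
    short_str s = NULL;
    return sizeof s;
}

/* Declared through the macro: takes the size in effect here. */
size_t macro_ptr_size(void)
{
    MACRO_STR m = NULL;
    return sizeof m;
}
```

Under the stated assumptions, the first function would be expected to report a 4-byte pointer and the second an 8-byte pointer. On a compiler that ignores the pragmas, both report the default pointer size.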
In general, conversions between short and long simple pointers are safe and are performed implicitly without the need for casts on assignments or function calls. On the other hand, compound pointers generally require source code changes to accommodate conversions between short and long pointer representations.
For example, the argument vector, argv, is a long compound pointer and must be declared as such. Many X11 library functions return long compound pointers; the return values for these functions must be declared correctly or erroneous behavior will result. If a function were compiled to use short pointers exclusively and needed to access such a vector, it would be necessary to add code to copy the values from the long pointer vector into a short pointer vector before passing it to the function.
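The copying step might look like the following sketch. It assumes the save, short, and restore arguments of #pragma pointer_size, compilation with -xtaso, and the __DECC predefined macro (used so that other compilers skip the pragmas). The short_str type and the function name are hypothetical.

```c
#include <stdlib.h>

#ifdef __DECC                       /* assumption: predefined by Compaq C */
#pragma pointer_size save
#pragma pointer_size short
#endif
typedef char *short_str;            /* hypothetical short pointer type */
#ifdef __DECC
#pragma pointer_size restore
#endif

/* Copies the first count entries of a long pointer vector, such as
   argv, into a newly allocated, NULL-terminated vector of short
   pointers. Returns NULL on failure; the caller frees the result. */
short_str *copy_to_short_vector(char **long_vec, int count)
{
    short_str *short_vec;
    int i;

    if (long_vec == NULL || count < 0)
        return NULL;
    short_vec = malloc(((size_t) count + 1) * sizeof *short_vec);
    if (short_vec == NULL)
        return NULL;
    for (i = 0; i < count; i++)
        short_vec[i] = long_vec[i]; /* simple pointers convert implicitly */
    short_vec[count] = NULL;
    return short_vec;
}
```

Each element is a simple pointer, so the element-by-element assignment needs no cast. It is the vector as a whole, a compound pointer, that cannot be converted in one step.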
Only the C and C++ compilers support the use of short pointers. Short pointers should not be passed from C and C++ routines to routines written in other languages.
The pointer_size pragma and the -xtaso_short option have no effect on the size of the second argument to main(), traditionally called argv. This argument always has a size of 8 bytes even if the pointer_size pragma has been used to set other pointer sizes to 4 bytes.
A.3.2 Restrictions on the Use of 32-Bit Pointers
Most applications on Tru64 UNIX systems use addresses that are not representable in 32 bits. Therefore, no library that might be called by normal applications can contain short pointers. Vendors who create software libraries generally should not use short pointers.
A.3.3 Avoiding Problems with System Header Files
When the system libraries are built, the compiler assumes that pointers are 64 bits and that structure members are naturally aligned. These are the C and C++ compiler defaults. The interfaces to the system libraries (the header files in the /usr/include tree) do not explicitly encode these assumptions.
You can alter the compiler's assumptions about pointer size (with -xtaso_short) and structure member alignment (with -Zpn [where n != 8] or -nomember_alignment). If you use any of these options and your application includes a header file from the /usr/include tree and then calls a library function or uses types declared in that header file, problems may occur. In particular, the data layouts computed by the compiler when it processes the system header file declarations may differ from the layouts compiled into the system libraries. This situation can cause unpredictable results.
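The kind of layout mismatch described here can be modeled with fixed-width integer fields standing in for pointers of the two sizes. The structures and the function name are hypothetical and serve only to illustrate the effect.

```c
#include <stddef.h>
#include <stdint.h>

/* The same logical record, with a fixed-width field in place of the
   pointer member. */
struct rec_long  { int id; uint64_t next; };  /* as seen by a library
                                                 built with 64-bit pointers */
struct rec_short { int id; uint32_t next; };  /* as seen by code compiled
                                                 with 32-bit pointers */

/* Returns 1 if the two views disagree about the size of the record
   or the position of the member, 0 if they agree. */
int layouts_differ(void)
{
    return sizeof(struct rec_long) != sizeof(struct rec_short)
        || offsetof(struct rec_long, next) != offsetof(struct rec_short, next);
}
```

When the two views disagree, a library routine that reads or writes the record through one layout will touch different bytes than the application expects.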
Run the script protect_headers_setup.sh immediately after the compiler is installed on your system to eliminate the possibility of problems with pointer size and data alignment under the conditions described in this section. See protect_headers_setup(8) for details on how and why this is done.
You can enable or disable the protection established by the protect_headers_setup script by using variations of the -protect_headers option on your compilation command line. See cc(1) for information about the -protect_headers option.