A    Using 32-Bit Pointers on Tru64 UNIX Systems

The size of the default pointer type on Tru64 UNIX systems is 64 bits, and all system interfaces use 64-bit pointers. The Compaq C compiler, in addition to supporting 64-bit pointers, also supports the use of 32-bit pointers.

In most cases, 64-bit pointers do not cause any problems in a program and the issue of 32-bit pointers can be ignored altogether. However, in the following cases, the issue of 32-bit pointers does become a concern:

The use of 32-bit pointers in applications requires that the applications be compiled and linked with special options and, depending on the specific nature of the code, may also require source-code changes.

The following types of pointers are referred to in this appendix:

A.1    Compiler-System and Language Support for 32-Bit Pointers

The following mechanisms control the use of 32-bit pointers:

The following example demonstrates the use of both short and long pointers:

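#include <stdio.h>

int main (void)
{
    int *a_ptr;

    /* sizeof yields a size_t, so cast it to match the %ld conversion */
    printf ("A pointer is %ld bytes\n", (long) sizeof (a_ptr));
    return 0;
}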

When compiled with either the default settings or the -xtaso option, the sample program prints the following message:

A pointer is 8 bytes
 

When compiled with the -xtaso_short option, this sample program prints the following message:

A pointer is 4 bytes
 

A.2    Using the -taso Option

The -taso option establishes 32-bit addressing within all 64-bit pointers in a program. It thereby solves almost all 32-bit addressing problems, except those that require constraining the physical size of some pointers to 32 bits (which is handled by the -xtaso or -xtaso_short option and pointer_size pragmas).

The -taso option is most frequently used to handle addressing problems introduced by pointer-to-int assignments in a program. Many C programs, especially older C programs that do not conform to currently accepted programming practices, assign pointers to int variables. Such assignments are not recommended, but they do produce correct results on systems in which pointers and int variables are the same size. However, on a Tru64 UNIX system, this practice can produce incorrect results because the high-order 32 bits of an address are lost when a 64-bit pointer is assigned to a 32-bit int variable. The following code fragment illustrates this problem:

{
char *x;   /* 64-bit pointer */
int z;     /* 32-bit int variable */
.
.
.
    x = malloc(1024); /* get memory and store address in 64 bits */
    z = x;    /* assign low-order 32 bits of 64-bit pointer to
                 32-bit int variable */
}

The most portable way to fix the problem presented by pointer-to-int assignments in existing source code is to modify the code to eliminate this type of assignment. However, in large applications, this can be time-consuming. (To find pointer-to-int assignments in existing source code, use the lint -Q command.)
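One possible shape for such a source change is sketched below. It assumes a compiler that provides the C99 header <stdint.h> and its optional intptr_t type; this appendix does not mention either, and the helper names are hypothetical. On Tru64 UNIX, a long is also 64 bits wide and can serve the same purpose.

```c
#include <stdint.h>

/* Hypothetical helper: store a pointer in an integer wide enough to hold it.
   intptr_t is an optional C99 type; where it exists, a void pointer converted
   to it and back compares equal to the original pointer. */
intptr_t pointer_to_integer(void *p)
{
    return (intptr_t)p;
}

/* Hypothetical helper: recover the pointer from the stored integer. */
void *integer_to_pointer(intptr_t value)
{
    return (void *)value;
}
```

Replacing an int variable with an integer type of pointer width keeps all 64 bits of the address, so the code no longer depends on where the program is loaded.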

Another way to overcome this problem is to use the -taso option. The -taso option makes it unnecessary for the pointer-to-int assignments to be modified. It does this by causing a program's address space to be arranged so that all locations within the program -- when it starts execution -- can be expressed within the 31 low-order bits of a 64-bit address, including the addresses of routines and data coming from shared libraries.

The -taso option does not in any way affect the sizes used for any of the data types supported by the system. Its only effect on any of the data types is to limit addresses in pointers to 31 bits (that is, the size of pointers remains at 64 bits, but addresses use only the low-order 31 bits).

A.2.1    Use and Effects of the -taso Option

The -taso option can be specified on the cc or ld command lines used to create object modules. (If specified on the cc command line, the option is passed to the ld linker.) The -taso option directs the linker to set a flag in object modules, and this flag directs the loader to load the modules into 31-bit address space.

The 31-bit address limit is used to avoid the possibility of setting the sign bit (bit 31) in 32-bit int variables when pointer-to-int assignments are made. Allowing the sign bit to be set in an int variable by pointer-to-int assignment would create a potential problem with sign extension. For example:

{
char *x;   /* 64-bit pointer */
int z;     /* 32-bit int variable */
   .
   .
   .
             /* address of named_obj = 0x0000 0000 8000 0000 */
      x = &named_obj;  /* 0x0000 0000 8000 0000 = original pointer
                          value */
      z = x;  /* 0x8000 0000 = value created by pointer-to-int
                 assignment */
      x = z;  /* 0xffff ffff 8000 0000 = value created by pointer-
                 to-int-to-pointer or pointer-to-int-to-long
                 assignment (32 high-order bits set to ones by
                 sign extension) */
}

The -taso option ensures that the text and data segments of an application are loaded into memory that can be reached by a 31-bit address. Therefore, whenever a pointer is assigned to an int variable, the values of the 64-bit pointer and the 32-bit variable will always be identical (except in the special situations described in Section A.2.2).
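The effect of the sign bit can be modeled without real pointers. The following sketch uses fixed-width integers from <stdint.h> as stand-ins for a 64-bit pointer and a 32-bit int; the function name is hypothetical, and the code is an illustration of the arithmetic rather than of anything the compiler or loader does.

```c
#include <stdint.h>

/* Model of a pointer-to-int-to-pointer round trip, using fixed-width
   integers in place of real pointers (an illustrative stand-in). */
uint64_t round_trip_through_int32(uint64_t address)
{
    /* Pointer-to-int assignment: only the low-order 32 bits survive. */
    uint32_t low = (uint32_t)(address & 0xffffffffu);

    if (low & 0x80000000u) {
        /* Bit 31 is the sign bit of a 32-bit int; widening the value
           copies that bit into bits 32 to 63. */
        return UINT64_C(0xffffffff00000000) | low;
    }
    return low;   /* sign bit clear: bits 32 to 63 stay zero */
}
```

In this model, any address below 0x80000000 comes back unchanged, which is the property that the 31-bit limit is meant to guarantee.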

Figure A-1 is an example of a memory diagram of programs that use the -taso and -call_shared options. (If you invoke the linker through the cc command, the default is -call_shared. If you invoke ld directly, the default is -non_shared.)

Figure A-1:  Layout of Memory Under -taso Option

Note that stack and heap addresses will also fit into 31 bits. The stack grows downward from the bottom of the text segment, and the heap grows upward from the top of the data segment.

The -T and -D options (linker options that are used to set text and data segment addresses, respectively) can also be used to ensure that the text and data segments of an application are loaded into low memory. The -taso option, however, in addition to setting default addresses for text and data segments, also causes shared libraries linked outside the 31-bit address space to be appropriately relocated by the loader.

The default addresses used for the text and data segments are determined by the options that you specify on the cc command line:

Using these default values produces sufficient amounts of space for text and data segments for most applications (see the Object File/Symbol Table Format Specification for details on the contents of text and data segments). The default values also allow an application to allocate a large amount of mmap space.

If you specify the -taso option and also specify text and data segment address values with -T and -D, the values specified override the -taso default addresses.

You can use the odump utility to check whether a program was built successfully within 31-bit address space. To display the start addresses of the text, data, and bss segments, enter the following command:

% odump -ov obj_file_x.o

None of the addresses should have any bits set in bits 31 to 63; only bits 0 to 30 should ever be set.
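The same rule can be expressed as a small predicate, which may be convenient when checking a list of addresses in a script or test. This is a minimal sketch with a hypothetical function name; it does not parse odump output and assumes that the addresses have already been read into 64-bit integers.

```c
#include <stdint.h>

/* Returns 1 if no bit in positions 31 to 63 is set, 0 otherwise. */
int fits_in_31_bits(uint64_t address)
{
    return (address >> 31) == 0;
}
```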

Shared objects built with -taso cannot be linked with shared objects that were not built with -taso. If you attempt to link taso shared objects with nontaso shared objects, the following error message is displayed:

Cannot mix 32 and 64 bit shared objects without -taso option

A.2.2    Limits on the Effects of the -taso Option

The -taso option does not prevent a program from mapping addresses outside the 31-bit limit, and it does not issue warning messages if this is done. Such addresses could be established using any one of the following mechanisms:

A.3    Using the -xtaso or -xtaso_short Option

The -xtaso and -xtaso_short options enable you to use both short (32-bit) and long (64-bit) pointer data types in a program. Note that the 64-bit data type is constrained to 31-bit addressing because -xtaso and -xtaso_short both engage the -taso option.

Use the -xtaso or -xtaso_short option only when your program needs the short pointer data type. If you want to use 32-bit addressing but do not need short pointers, use the -taso option.

Most programs that use short pointers will also need to use long pointers because Tru64 UNIX is a 64-bit system and all applications must use 64-bit pointers wherever pointer data is exchanged with the operating system or any system-supplied library. Because normal applications use the standard system data types, no conversion of pointers is needed. In an application that uses short pointers, it may be necessary to explicitly convert some short pointers to long pointers by using pointer_size pragmas (see Section 3.9).

Note

New applications for which the use of short pointers is being considered should be developed with long pointers first and then analyzed to determine whether short pointers would be beneficial.

A.3.1    Coding Considerations Associated with Changing Pointer Sizes

The following coding considerations may be pertinent when you are working with pointers in your source code:

A.3.2    Restrictions on the Use of 32-Bit Pointers

Most applications on Tru64 UNIX systems use addresses that are not representable in 32 bits. Therefore, no library that might be called by normal applications can contain short pointers. Vendors who create software libraries generally should not use short pointers.

A.3.3    Avoiding Problems with System Header Files

When the system libraries are built, the compiler assumes that pointers are 64 bits and that structure members are naturally aligned. These are the C and C++ compiler defaults. The interfaces to the system libraries (the header files in the /usr/include tree) do not explicitly encode these assumptions.

You can alter the compiler's assumptions about pointer size (with -xtaso_short) and structure member alignment (with -Zpn [where n!=8] or -nomember_alignment). If you use any of these options and your application includes a header file from the /usr/include tree and then calls a library function or uses types declared in that header file, problems may occur. In particular, the data layouts computed by the compiler when it processes the system header file declarations may differ from the layouts compiled into the system libraries. This situation can cause unpredictable results.
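The kind of layout disagreement described here can be illustrated with a stand-in. The following sketch does not use real short pointers, which are specific to the compiler options in this appendix; it substitutes fixed-width integers for the pointer member, and the structure and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a header structure as a library built with
   64-bit pointers sees it. */
struct record_long {
    int32_t  tag;
    uint64_t next;   /* represents a 64-bit pointer member */
};

/* The same declaration as seen by code compiled with 32-bit pointers. */
struct record_short {
    int32_t  tag;
    uint32_t next;   /* represents a 32-bit pointer member */
};

/* Returns 1 if the two views agree on size and member offset, 0 otherwise. */
int layouts_match(void)
{
    return sizeof(struct record_long) == sizeof(struct record_short)
        && offsetof(struct record_long, next) == offsetof(struct record_short, next);
}
```

Because the two views differ in size, code that uses one layout would misread data written by code that uses the other.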

Run the script protect_headers_setup.sh immediately after the compiler is installed on your system to eliminate the possibility of problems with pointer size and data alignment under the conditions described in this section. See protect_headers_setup(8) for details on how and why this is done.

You can enable or disable the protection established by the protect_headers_setup script by using variations of the -protect_headers option on your compilation command line. See cc(1) for information about the -protect_headers option.