5    Image Relocation

Post-link modification tools often require detailed relocation information for a linked image. Some tools may require normal relocations that are preserved in images linked with the -r switch. Newer tools rely on compact relocations and linkerdef records, which retain the same level of detail as normal relocations while requiring much less space in the file.

5.1    New or Changed Image Relocations Features

Tru64 UNIX V5.1B introduces the following new or changed features:

Tru64 UNIX V5.1 introduces the following new or changed features:

5.2    Structures, Fields, and Values for Image Relocation

5.2.1    Compact Relocation Records

Compact relocation records are written into the free-form data area of the comment section. They are identified by a tag type of CM_COMPACT_RLC in the comment header. The public versions of compact relocation interfaces for producers and consumers are located in the header file cmplrs/cmrlc.h. See Section 5.3.1 and Chapter 15 for more information.

5.2.2    Linkerdef Relocation Records (scncomment.h)

Linkerdef relocation records are written into the free-form data area of the comment section. They are identified by a tag type of CM_LINKERDEF in the comment header. The linkerdef comment subsection is an array of linker_data structures that contain information similar to the reloc structure. See Section 5.3.2 and Chapter 15 for more information.


Version Note

The linker_data structure is supported on Tru64 UNIX V5.1 and greater.


struct linker_data {
        unsigned int    ld_scnptr;        
        unsigned int    ld_base   : 6;    
        unsigned int    ld_symbol : 6;   
        unsigned int    ld_type   : 8;    
        unsigned int    ld_size   : 6;     
        unsigned int    ld_offset : 6;  
};

SIZE - 8 bytes, ALIGNMENT - 4 bytes

Linkerdef Relocation Entry Fields

ld_scnptr

A byte offset relative to the starting file offset of the section identified by ld_base. Together, these fields identify the target address for the relocation.

ld_base

The number of the section containing the target address. See Table 4-1 for a list of valid section numbers.

ld_symbol

An enumeration value identifying a linker-defined symbol. See Section 5.2.2.1 for a list of valid values.

ld_type

A relocation type. See Table 4-2 for a list of relocation types.

ld_size

The size of a bitfield for the R_OP_STORE relocation.

ld_offset

The bit offset of a bitfield for the R_OP_STORE relocation.
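
The following sketch shows how a tool might apply the ld_size and ld_offset fields of one record when it rewrites a bitfield. The structure is copied from the definition above. The helper names ld_store_mask and ld_store are hypothetical, and the sketch assumes that ld_size counts bits literally within a 64-bit quantity, which the text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout copied from the structure definition above. */
struct linker_data {
        unsigned int    ld_scnptr;
        unsigned int    ld_base   : 6;
        unsigned int    ld_symbol : 6;
        unsigned int    ld_type   : 8;
        unsigned int    ld_size   : 6;
        unsigned int    ld_offset : 6;
};

/* Hypothetical helper: mask covering the bitfield named by ld_size/ld_offset. */
static uint64_t ld_store_mask(const struct linker_data *ld)
{
        uint64_t ones;

        assert(ld != NULL);
        ones = (UINT64_C(1) << ld->ld_size) - 1;  /* ld_size is at most 63 */
        return ones << ld->ld_offset;
}

/* Hypothetical helper: replace the bitfield in 'word' with the low bits of 'value'. */
static uint64_t ld_store(const struct linker_data *ld, uint64_t word, uint64_t value)
{
        uint64_t mask = ld_store_mask(ld);

        return (word & ~mask) | ((value << ld->ld_offset) & mask);
}
```

For a record with ld_size 16 and ld_offset 8, the mask is 0xFFFF00, so only bits 8 through 23 of the target are rewritten.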

5.2.2.1    Linkerdef Symbol Enumeration

Linker-defined symbols are identified by the following enumeration. Each enumeration value corresponds to the linker-defined symbol of the same name (excluding the "LDEF_" prefix).


Version Note

The LD_SYMBOL enumeration is supported on Tru64 UNIX V5.1 and greater.


enum LD_SYMBOL {
      LDEF__BASE_ADDRESS             = 0,
      LDEF__cobol_main               = 1,
      LDEF__DYNAMIC                  = 2,
      LDEF__DYNAMIC_LINK             = 3,
      LDEF__ebss                     = 4,
      LDEF__edata                    = 5,
      LDEF_edata                     = 6,
      LDEF__end                      = 7,
      LDEF_end                       = 8,
      LDEF__etext                    = 9,
      LDEF_etext                     = 10,
      LDEF__fbss                     = 11,
      LDEF__fdata                    = 12,
      LDEF__fpdata                   = 13,
      LDEF__fpdata_size              = 14,
      LDEF___fstart                  = 15,
      LDEF__ftext                    = 16,
      LDEF__ftlsinit                 = 17,
      LDEF_GOT_OFFSET                = 18,
      LDEF__gp                       = 19,
      LDEF__gpinfo                   = 20,
      LDEF___istart                  = 21,
      LDEF__procedure_string_table   = 22,
      LDEF__procedure_table          = 23,
      LDEF__procedure_table_size     = 24,
      LDEF___tlsbsize                = 25,
      LDEF___tlsdsize                = 26,
      LDEF___tlskey                  = 27,
      LDEF___tlsoffset               = 28,
      LDEF___tlsregions              = 29,
      LDEF___EXEC_FLAGS              = 30,       (V5.1B - )
      LDEF_MAX
};
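
The naming rule for the enumeration can be illustrated with a small sketch that derives the linker-defined symbol name from the spelling of an enumerator. The function ldef_strip_prefix and the macro LDEF_SYMBOL_NAME are hypothetical and are not part of scncomment.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drop the "LDEF_" prefix from an enumerator's spelling. */
static const char *ldef_strip_prefix(const char *enumerator)
{
        static const char prefix[] = "LDEF_";
        size_t n = sizeof prefix - 1;

        assert(enumerator != NULL);
        return strncmp(enumerator, prefix, n) == 0 ? enumerator + n : enumerator;
}

/* Stringify the enumerator, then strip the prefix. */
#define LDEF_SYMBOL_NAME(e) ldef_strip_prefix(#e)
```

Under this rule LDEF__gp names the symbol _gp, while LDEF_end names the symbol end.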

5.3    Image Relocation Usage

5.3.1    Compact Relocations

Compact relocations are a highly compressed form of relocation records designed for the use of profiling tools and object restructuring tools. By default, they are generated by the linker for all fully linked executable objects and recorded in the object's .comment section. The linker produces this information using libmld.a APIs, which implement the reading and writing of compact relocations. Compact relocations are not produced for images linked with the following linker options: -r, -s. The strip utility will remove the comment subsection that contains compact relocations. See Chapter 15 for the format of the .comment section.

Compact relocations must provide crucial relocation information in much less space than the space required for actual relocation entries. This goal is accomplished by employing a heuristic function to predict relocations. For some sections, this heuristic is highly accurate. Detailing many records in the object file becomes unnecessary because the algorithm can be used instead to recreate many of the actual relocation entries.


Version Note

In releases of Tru64 UNIX prior to V5.1, compact relocations contained only enough relocation information to drive tools that restructure an executable's .text, .init, and .fini sections. From Tru64 UNIX V5.1 onward, executables contain full compact relocation information including relocation records for text and data segment addresses in all mapped object sections.


The interfaces for compact relocations continue to evolve. These interfaces are defined and described in the header file cmplrs/cmrlc.h. This section describes the on-disk file format of compact relocations and the producer and consumer algorithms.

5.3.1.1    Overview

The procedure for creation of compact relocations is as follows:

  1. Generate a list of predicted relocations using heuristics.

  2. Compare the predicted relocations to the actual relocation entries (which are input data to the compact relocations producer).

  3. Wherever a "miss" occurs (that is, the predicted and actual entries do not match) output a compact relocation record.

The procedure for the use of compact relocation records follows:

  1. Generate the list of predicted relocations using the same heuristics as the compact relocations producer.

  2. Compare the expanded compact relocations data with predicted relocations to reconstruct the actual relocation entries.

See Section 5.3.1.3 for more details.

5.3.1.2    File Format

Compact relocations are stored in a subsection of the .comment section. The linker and other tools do not need to be aware of the details of the internal structure of the compact relocation subsection. This knowledge is encapsulated in the cmrlc_* routines found in libmld.a.

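The on-disk format of the compact relocations data consists of the following components, in order:

  1. Compact relocation version

  2. Compact relocations file header

  3. Compact relocations section headers

  4. Compact relocations tables

  5. Stack relocation tables

  6. GP value tables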

Code may only assume that the version and the file header are contiguous. To access other structures, it is necessary to rely on the location information in the file header.

5.3.1.2.1    Compact Relocation Version

The compact relocation section begins with a version identifier, which has the following structure:

struct {
        unsigned int   version_major;
        unsigned int   version_minor;
};

SIZE - 8 bytes, ALIGNMENT - 4 bytes

The version identifier allows the format of the compact relocations to change from one release to another while providing a mechanism for tools to work on binaries with either the old or new formats. The version identifiers are separate from the header because the format of the header itself may change from release to release.

The major version identifier is incremented for changes in the format of the compact relocation data that affect the most basic access to the data. For example, changes in structure sizes or structure layout are likely to cause failures in existing code that simply reads the raw compact relocation data.

The minor version identifier is incremented whenever the compact relocation data is modified without impacting the format of the data. For example, changing the heuristic to further compact the stored relocation information would require the minor version identifier to be incremented. If the consumer routines see that an object has an old minor version number, they can call a matching version of the heuristic to correctly reconstruct the relocation information.

The major and minor version identifiers that have been used for compact relocation data are described in Table 5-1. Enumeration values for supported versions can be found in the header file /usr/include/cmplrs/cmrlc.h.

Table 5-1:  Compact Relocation Version Identifiers

Major   Minor   OS Version   Description
0       0       V3.0         Initial version
1       0       V3.2         Fix for dynsym relocations
2       0       V4.0         Miscellaneous bug fixes
2       3       V5.1         Full compact relocations
2       4       V5.1B        Full compacts with TLS relocations
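
The following sketch shows one way a consumer might act on the version identifier. The structure tag, the enumeration, and the function ex_check_version are hypothetical. The policy shown is an assumption drawn from the description above: a different major version is rejected, an older minor version selects a matching heuristic, and a newer minor version is rejected because its heuristic is unknown to the tool.

```c
#include <assert.h>
#include <stddef.h>

/* The tag is hypothetical; the structure shown above is untagged. */
struct ex_cmrlc_version {
        unsigned int   version_major;
        unsigned int   version_minor;
};

enum ex_compat {
        EX_COMPAT_CURRENT,        /* same major and minor version */
        EX_COMPAT_OLD_HEURISTIC,  /* same major, older minor: use matching heuristic */
        EX_COMPAT_UNSUPPORTED     /* different major, or newer minor */
};

/* Classify an object's version against the version the tool was built for. */
static enum ex_compat ex_check_version(const struct ex_cmrlc_version *v,
                                       unsigned int tool_major,
                                       unsigned int tool_minor)
{
        assert(v != NULL);
        if (v->version_major != tool_major)
                return EX_COMPAT_UNSUPPORTED;
        if (v->version_minor > tool_minor)
                return EX_COMPAT_UNSUPPORTED;
        if (v->version_minor < tool_minor)
                return EX_COMPAT_OLD_HEURISTIC;
        return EX_COMPAT_CURRENT;
}
```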

5.3.1.2.2    Compact Relocations File Header

The version identifier is followed by a high-level header structure that stores the sizes and locations of the other tables with compact relocations information:

struct cmrlc_file_header {
        /*
         * Total number of elements in each sub-table.
         */
        unsigned long   scn_num;    /* section header table */
        unsigned long   rlc_num;    /* compact relocation table */
        unsigned long   expr_num;   /* expression relocation table */
        unsigned long   gpval_num;  /* GP value table */
 
        /*
         * Relative file offset from start of compact relocation data
         * to each sub-table.
         */
        unsigned long   scn_off;
        unsigned long   rlc_off;
        unsigned long   expr_off;
        unsigned long   gpval_off;
};

SIZE - 64 bytes, ALIGNMENT - 8 bytes

Each of the *_num fields indicates the number of entries in the corresponding tables. Each of the *_off fields contains a relative file offset from the start of the compact relocations .comment subsection to the start of the corresponding table. If any of the tables are not present for a particular program, the *_num and *_off fields should be set to zero.
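
The following sketch shows how a reader might turn a *_num and *_off pair into a pointer to a sub-table. The file header structure is copied from above and assumes an 8-byte unsigned long. The functions ex_subtable and ex_rlc_table are hypothetical, and the bounds check is a precaution that the format itself does not require.

```c
#include <assert.h>
#include <stddef.h>

/* Layout copied from the structure definition above. */
struct cmrlc_file_header {
        unsigned long   scn_num;
        unsigned long   rlc_num;
        unsigned long   expr_num;
        unsigned long   gpval_num;
        unsigned long   scn_off;
        unsigned long   rlc_off;
        unsigned long   expr_off;
        unsigned long   gpval_off;
};

/*
 * Hypothetical helper: return the start of a sub-table, or NULL if the
 * table is absent (num == 0) or does not fit inside the subsection.
 * 'data' points at the start of the compact relocations subsection.
 */
static const unsigned char *ex_subtable(const unsigned char *data, size_t data_size,
                                        unsigned long num, unsigned long off,
                                        size_t elem_size)
{
        assert(data != NULL && elem_size > 0);
        if (num == 0)
                return NULL;
        if (off > data_size || num > (data_size - off) / elem_size)
                return NULL;
        return data + off;
}

/* Locate the compact relocation table; its entries are 8 bytes each. */
static const unsigned char *ex_rlc_table(const unsigned char *data, size_t data_size,
                                         const struct cmrlc_file_header *hdr)
{
        assert(hdr != NULL);
        return ex_subtable(data, data_size, hdr->rlc_num, hdr->rlc_off, 8);
}
```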

5.3.1.2.3    Compact Relocations Section Header

One or more compact relocations section headers follow the compact relocations file header. Each section header has the following structure:

struct cmrlc_file_scnhdr {
        char           name[8];    /* section name */
 
        /*
         * Number of elements for this section in each sub-table.
         */
        unsigned long  rlc_snum;
        unsigned long  expr_snum;
        unsigned long  gpval_snum;
 
        /*
         * Index from start of table to this section's elements.
         * (This is an element index, not a byte offset.)
         */
        unsigned long  rlc_indx;
        unsigned long  expr_indx;
        unsigned long  gpval_indx;
 
        /*
         * Flag: True if compact relocation table is sorted by
         * increasing virtual address.
         */
        unsigned long  rlc_sorted:1;
        unsigned long  :63;
};

SIZE - 64 bytes, ALIGNMENT - 8 bytes

One compact relocation section header is created for each eCOFF object file section for which compact relocation data is stored. This section header is unrelated to the eCOFF section header structure except for the name field, which connects the two.

Each of the *_snum fields indicates the number of entries in the corresponding table for this object file section. If the *_snum field is non-zero, the corresponding *_indx field contains the index of the start of that section's entries within the table.

The rlc_sorted field indicates whether the compact relocation table entries for this section are sorted by virtual address.

If an object file section does not have entries in one of the tables for a particular program, the corresponding fields should be set to zero.
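
The following sketch shows how a consumer might match a section header by name and select that section's entries within a table. Both function names are hypothetical. The sketch assumes that the 8-byte name field is padded with null bytes and is not null-terminated when the name is exactly 8 characters long, as is usual for eCOFF section names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare the 8-byte name field with a C string. */
static int ex_scn_name_matches(const char field[8], const char *wanted)
{
        assert(field != NULL && wanted != NULL);
        return strlen(wanted) <= 8 && strncmp(field, wanted, 8) == 0;
}

/*
 * Hypothetical helper: compute the element range [*first, *limit) owned by
 * one section.  Returns 1 for a valid range, 0 if the section has no
 * entries, and -1 if the range does not fit in a table of table_num entries.
 */
static int ex_scn_slice(unsigned long snum, unsigned long indx,
                        unsigned long table_num,
                        unsigned long *first, unsigned long *limit)
{
        assert(first != NULL && limit != NULL);
        if (snum == 0)
                return 0;
        if (indx >= table_num || snum > table_num - indx)
                return -1;
        *first = indx;          /* element index, not a byte offset */
        *limit = indx + snum;
        return 1;
}
```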

5.3.1.2.4    Compact Relocations Table

Compact relocation tables follow the compact relocation section headers. Each compact relocation table consists of an array of structures:

struct cmrlc_file_rlc {
    unsigned    v_offset;
    union {
        unsigned        word;
        struct {
            unsigned    type:5;
            unsigned    :27;
        } common;
        struct {                /* GPDISP */
            unsigned    type:5;
            unsigned    lda_offset:27;
        } gpdisp;
        struct {                /* EXPRESSION */
            unsigned    type:5;
            unsigned    index:27;
        } expr;
        struct {                /* REF*, SREL*, GPREL32 */
            unsigned    type:5;
            unsigned    rel_scn:5;
            unsigned    count:12;
            unsigned    dist:4;                           (V5.0 - )
            unsigned    :6;
        } addrtype;
        struct {                /* External REF */        (V5.1 - )
            unsigned    type:5;                           (V5.1 - )
            unsigned    r_symndx:27;                      (V5.1 - )
        } eref;                                           (V5.1 - )
        struct {                /* LITERAL */             (V5.1 - )
            unsigned    type:5;                           (V5.1 - )
            unsigned    rel_scn:5;                        (V5.1 - )
            unsigned    count:12;                         (V5.1 - )
            unsigned    dist:4;                           (V5.1 - )
            unsigned    :6;                               (V5.1 - )
        } literal;                                        (V5.1 - )
        struct {                /* LITUSE */              (V5.1 - )
            unsigned    type:5;                           (V5.1 - )
            unsigned    rel_scn:5;                        (V5.1 - )
            unsigned    lit_type:5;                       (V5.1 - )
            unsigned    litOFFSET:17;                     (V5.1 - )
        } lituse;                                         (V5.1 - )
        struct {                /* NO_RELOC, NO_LITUSE */ (V5.0 - )
            unsigned    type:5;                           (V5.0 - )
            unsigned    count:12;                         (V5.0 - )
            unsigned    dist:4;                           (V5.0 - )
            unsigned    :11;                              (V5.0 - )
        } noreloc;                                        (V5.0 - )
        struct {                /* IMMED: GP_HI32, SCN_HI32, BR_HI32 */
            unsigned    type:5;
            unsigned    subop:6;
            unsigned    br_offset:21;
        } immedhi;
        struct {                /* IMMED: all other sub-opcodes */
            unsigned    type:5;
            unsigned    subop:6;
            unsigned    rel_scn:5;
            unsigned    hi_offset:16;                     (V5.1 - )
        } immedlo;
        struct {                /* VADJUST */
            unsigned    type:5;
            signed      adjust:27;
        } vadjust;
        struct {                /* BRADDR, HINT */
            unsigned    type:5;
            unsigned    rel_scn:5;
            unsigned    :22;
        } other;
        struct {                /* TLS_HIGH, TLS_LOW */   (V5.1B - )
            unsigned    type:5;                           (V5.1B - )
            unsigned    rel_scn:5;                        (V5.1B - )
            unsigned    :22;                              (V5.1B - )
        } tlshighlo;                                      (V5.1B - )
    } info;
};

SIZE - 8 bytes, ALIGNMENT - 4 bytes

/*
 * Values for 'type' field.
 */
enum cmrlc_rlctypes {
    CMRLC_REFLONG=1,
    CMRLC_REFQUAD=2,
    CMRLC_GPREL32=3,
    CMRLC_GPDISP=4,
    CMRLC_BRADDR=5,
    CMRLC_HINT=6,
    CMRLC_SREL16=7,
    CMRLC_SREL32=8,
    CMRLC_SREL64=9,
    CMRLC_EXPRESSION=10,   /* R_OP_* expression */
    CMRLC_IMMEDHI=11,      /* R_IMMED for high part */
    CMRLC_IMMEDLO=12,      /* R_IMMED for low part */
    CMRLC_NO_RELOC=13,     /* correct mispredicted relocation */
    CMRLC_VADJUST=14,      /* adjust base for succeeding 'v_offset's */
    CMRLC_LITERAL=15,                                     (V5.1 - )
    CMRLC_LITUSE=16,                                      (V5.1 - )
    CMRLC_NO_LITUSE=17,                                   (V5.1 - )
    CMRLC_REFQUAD_EXTERN=18, /* not used */               (V5.1 - )
    CMRLC_TLS_LITERAL=19,                                 (V5.1B - )
    CMRLC_TLS_HIGH=20,                                    (V5.1B - )
    CMRLC_TLS_LOW=21                                      (V5.1B - )
};
 
/*
 * Maximum value for 'count' field in 'addrtype' relocations.
 */
#define CMRLC_COUNT_MAX         ((1<<12) - 1)
 
/* 
 * Maximum value for 'dist' field in 'addrtype' and 'noreloc' relocations.
 */
#define CMRLC_DIST_MAX          ((1<<4) - 1)

The number of elements in the array is determined by the rlc_snum field in the section header.

The v_offset field specifies the virtual address of each relocation entry as a byte offset from a base address. Initially, the base is the starting virtual address of the current section. If relocations are required at addresses that cannot be expressed as a 32-bit offset from the section's start address, CMRLC_VADJUST relocation entries are used to extend the addressing range. However, this feature is not fully supported.

The value of the type field determines how to interpret the remainder of a compact relocation structure.

The lda_offset field specifies an instruction offset (byte offset divided by 4) from the relocation entry's virtual address to the lda instruction in an R_GPDISP entry's ldah/lda pair. This design does not support ldah/lda pairs that are separated by more than 2^29 bytes.
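
The arithmetic behind the 2^29-byte limit can be sketched as follows. The names are hypothetical, and the sketch treats lda_offset as an unsigned forward distance because the field is declared unsigned.

```c
#include <assert.h>
#include <stdint.h>

#define EX_LDA_OFFSET_BITS  27
/* Largest byte distance expressible: (2^27 - 1) instructions of 4 bytes. */
#define EX_LDA_MAX_BYTES    (((UINT64_C(1) << EX_LDA_OFFSET_BITS) - 1) * 4)

/* Hypothetical helper: virtual address of the lda instruction of the pair. */
static uint64_t ex_gpdisp_lda_vaddr(uint64_t reloc_vaddr, unsigned lda_offset)
{
        assert(lda_offset < (1u << EX_LDA_OFFSET_BITS));
        return reloc_vaddr + (uint64_t)lda_offset * 4;
}
```

An lda_offset of 3 places the lda instruction 12 bytes after the relocation entry's virtual address.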

The rel_scn field indicates the ID of the section to which this relocation is relative. It uses the R_SN_* values from the header file reloc.h.

The count and dist fields are used to specify consecutive relocation entries that are identical. The count field can be used in this manner for R_REFLONG, R_REFQUAD, R_SREL16, R_SREL32, R_SREL64, R_GPREL32, and R_LITERAL entries. Two relocation entries are identical if they have the same type and relative section. Relocation entries are consecutive if successive virtual addresses differ by the same multiple of the natural size for the relocation type (16 bits for R_SREL16; 32 bits for R_REFLONG, R_SREL32, R_GPREL32, and R_LITERAL; and 64 bits for R_REFQUAD and R_SREL64). The dist field multiplied by the natural size of the relocation type gives the byte distance between repetitions of the relocation. A count value of zero is not allowed. These fields reduce the impact of mispredicting the relocations for jump tables.
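
The following sketch expands one entry that uses count and dist into the virtual addresses it stands for. The enumeration values are taken from cmrlc_rlctypes above. The functions ex_natural_size and ex_expand_run are hypothetical, and rejecting a dist of zero when count is greater than one is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of enum cmrlc_rlctypes, with the values listed above. */
enum {
        CMRLC_REFLONG = 1, CMRLC_REFQUAD = 2, CMRLC_GPREL32 = 3,
        CMRLC_SREL16 = 7, CMRLC_SREL32 = 8, CMRLC_SREL64 = 9,
        CMRLC_LITERAL = 15
};

/* Natural size in bytes for the types that may use 'count'; 0 otherwise. */
static unsigned ex_natural_size(unsigned type)
{
        switch (type) {
        case CMRLC_SREL16:
                return 2;
        case CMRLC_REFLONG: case CMRLC_SREL32:
        case CMRLC_GPREL32: case CMRLC_LITERAL:
                return 4;
        case CMRLC_REFQUAD: case CMRLC_SREL64:
                return 8;
        default:
                return 0;
        }
}

/*
 * Write the 'count' virtual addresses described by one entry into 'out'.
 * Returns the number of addresses written, or 0 if the entry is invalid
 * or 'out' is too small.
 */
static size_t ex_expand_run(uint64_t first_vaddr, unsigned type,
                            unsigned count, unsigned dist,
                            uint64_t *out, size_t out_cap)
{
        unsigned nsize = ex_natural_size(type);
        size_t i;

        assert(out != NULL);
        if (nsize == 0 || count == 0 || count > out_cap)
                return 0;
        if (count > 1 && dist == 0)
                return 0;
        for (i = 0; i < count; i++)
                out[i] = first_vaddr + (uint64_t)i * dist * nsize;
        return count;
}
```

A CMRLC_REFQUAD entry with a count of 3 and a dist of 2 describes relocations 16 bytes apart.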

5.3.1.2.5    Stack Relocation Table

Expression stack relocation information is stored separately. Each stack relocation table entry has the following structure:

struct cmrlc_file_expr {
        unsigned long  vaddr;
        unsigned       type:5;
        unsigned       rel_scn:5;
        unsigned       offset:6;  /* CMRLC_EXPR_STORE only */
        unsigned       size:6;    /* CMRLC_EXPR_STORE only */
        unsigned       last:1;    /* true for last reloc in expr */
        unsigned       :9;
        unsigned       reserved;
};

SIZE - 16 bytes, ALIGNMENT - 8 bytes

/*
 * Values for 'type' field.
 */
enum cmrlc_exprtypes {
        CMRLC_EXPR_PUSH=1,     /* R_OP_PUSH */
        CMRLC_EXPR_PSUB=2,     /* R_OP_PSUB */
        CMRLC_EXPR_PRSHIFT=3,  /* R_OP_PRSHIFT */
        CMRLC_EXPR_STORE=4     /* R_OP_STORE */
};

Expression stack compact relocation records are stored in a separate table because each record requires more space than other types of compact relocation records. Entries in this table are grouped into sequences of relocation entries that form a single expression. The first entry in each table starts a sequence. The last entry in each sequence has its last field set to one. A new sequence starts immediately after the end of the previous sequence.

The start of each sequence is referenced by a CMRLC_EXPRESSION entry in the section's compact relocation table. The index field of that entry points to the first entry in a stack relocation sequence. All sequences in the stack relocation table should have a corresponding CMRLC_EXPRESSION entry in the compact relocation table.
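
The following sketch measures the sequence that starts at the index taken from a CMRLC_EXPRESSION entry. The structure and enumeration are copied from above. The function ex_expr_sequence_len is hypothetical, and it returns zero when no entry with its last field set is found before the end of the table.

```c
#include <assert.h>
#include <stddef.h>

/* Layout copied from the structure definition above. */
struct cmrlc_file_expr {
        unsigned long  vaddr;
        unsigned       type:5;
        unsigned       rel_scn:5;
        unsigned       offset:6;  /* CMRLC_EXPR_STORE only */
        unsigned       size:6;    /* CMRLC_EXPR_STORE only */
        unsigned       last:1;    /* true for last reloc in expr */
        unsigned       :9;
        unsigned       reserved;
};

enum cmrlc_exprtypes {
        CMRLC_EXPR_PUSH=1,
        CMRLC_EXPR_PSUB=2,
        CMRLC_EXPR_PRSHIFT=3,
        CMRLC_EXPR_STORE=4
};

/* Hypothetical helper: number of entries in the sequence starting at 'index'. */
static size_t ex_expr_sequence_len(const struct cmrlc_file_expr *table,
                                   size_t table_num, size_t index)
{
        size_t i;

        assert(table != NULL);
        for (i = index; i < table_num; i++) {
                if (table[i].last)
                        return i - index + 1;
        }
        return 0;  /* sequence runs off the end of the table */
}
```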

5.3.1.2.6    GP Value Tables

Additional tables called GP value tables are used to store GP range information. GP values are kept in tables separate from other compact relocations to reduce the processing required to map a virtual address to the corresponding active GP value.

Each GP value table consists of an array of these structures:

struct {
        unsigned long     vaddr;
        unsigned          gp_offset;
        unsigned          reserved;
};

SIZE - 16 bytes, ALIGNMENT - 8 bytes

Each additional GP range after the first range has an entry in the table. (The first range is described by the GP value in the file's a.out header.) Therefore, a single-GOT program will have no entries in its GP value tables.

If an executable's sections have different numbers of GP ranges, gpval_num should be set to describe the section with the largest number of ranges. eCOFF sections with fewer GP ranges must still have GP value tables with gpval_num entries. Sections with short GP value tables can duplicate their last GP value table entry until the table is the proper length.

The vaddr field contains the virtual address where the new range starts. vaddr must point within the section to which this GP value table corresponds. The new GP value is computed by adding gp_offset to the GP value in the file's a.out header.
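
The following sketch maps a virtual address to its active GP value for one section. The structure tag ex_gpval and the function ex_active_gp are hypothetical. The sketch selects the entry with the greatest vaddr that does not exceed the address, so it does not depend on table order and tolerates the duplicated padding entries described above.

```c
#include <assert.h>
#include <stddef.h>

/* The tag is hypothetical; the structure shown above is untagged. */
struct ex_gpval {
        unsigned long     vaddr;
        unsigned          gp_offset;
        unsigned          reserved;
};

/*
 * Return the GP value in effect at 'addr'.  'header_gp' is the GP value
 * from the file's a.out header, which covers the first range.
 */
static unsigned long ex_active_gp(const struct ex_gpval *table, size_t num,
                                  unsigned long header_gp, unsigned long addr)
{
        const struct ex_gpval *best = NULL;
        size_t i;

        assert(table != NULL || num == 0);
        for (i = 0; i < num; i++) {
                if (table[i].vaddr <= addr &&
                    (best == NULL || table[i].vaddr >= best->vaddr))
                        best = &table[i];
        }
        return best ? header_gp + best->gp_offset : header_gp;
}
```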

5.3.1.3    Basic Algorithm for Compact Relocations Production

In order to produce compact relocations, a tool must have a set of actual relocation entries and the raw data to which those relocation entries apply. It should then apply the following algorithm to create a set of matching compact relocations:

  1. Convert the external relocation entries to local relocation entries.

  2. Run the prediction heuristic function to construct a set of predicted relocation entries from the raw data.

  3. Compare the predicted relocation entries to the remaining actual relocation entries and create a compact relocation record for any mismatches.

  4. Compress any sequences of consecutive, identical R_REF*, R_SREL*, R_GPREL32, or R_LITERAL entries.

  5. Set the rlc_sorted field if the compact relocation entries are stored in a sorted order.

Any R_GPVALUE entries must be handled specially. These relocation entries must be added to their section's GP value table. They should then be removed from the list of actual relocation entries used to create compact relocations.

The first step in the algorithm is to convert actual relocation entries from external to local. The compact relocations only exist in fully linked executables with no undefined symbols. Thus, external relocation entries are not usually needed. (The compact relocation types include a type for retaining external R_REFQUAD relocations wherever symbol correspondence might be needed for post-link processing.) An external relocation entry is converted to a local relocation entry by setting its r_extern field to zero and changing its r_symndx field to the appropriate relocation section constant (see Table 4-1).

The second step is to run the prediction heuristic function over the raw data for which these actual relocation entries apply. This produces a set of predicted relocation entries.

Step three compares the predicted relocation entries to the actual relocation entries as follows:

  1. If a match exists between a predicted relocation entry and an actual relocation entry at the same virtual address, do nothing.

  2. If a predicted relocation entry and an actual relocation entry at the same virtual address do not match, write a compact form of the actual relocation entry to the compact relocation data file.

  3. If only a predicted relocation entry exists for a particular virtual address, write a compact CMRLC_NO_RELOC record to the data file at this virtual address.

  4. If only an actual relocation entry exists for a particular virtual address, write a compact form of the actual relocation entry to the compact relocation data file.
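
The four comparison rules above can be sketched as a merge of two lists. The record type ex_reloc, the marker EX_NO_RELOC, and the function ex_produce are hypothetical simplifications and are not the libmld.a interfaces. The sketch assumes that both lists are sorted by virtual address with at most one entry per address.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NO_RELOC (-1)   /* stands in for CMRLC_NO_RELOC in this sketch */

struct ex_reloc {
        unsigned long vaddr;
        int           type;
        int           rel_scn;
};

/*
 * Emit one record for every address where prediction and reality differ.
 * Returns the number of records written, or (size_t)-1 if 'out' is full.
 */
static size_t ex_produce(const struct ex_reloc *pred, size_t np,
                         const struct ex_reloc *act, size_t na,
                         struct ex_reloc *out, size_t cap)
{
        size_t i = 0, j = 0, n = 0;

        assert(out != NULL);
        while (i < np || j < na) {
                struct ex_reloc rec;

                if (j == na || (i < np && pred[i].vaddr < act[j].vaddr)) {
                        /* Rule 3: predicted only, so cancel the prediction. */
                        rec.vaddr = pred[i].vaddr;
                        rec.type = EX_NO_RELOC;
                        rec.rel_scn = 0;
                        i++;
                } else if (i == np || act[j].vaddr < pred[i].vaddr) {
                        /* Rule 4: actual only, so record it. */
                        rec = act[j++];
                } else {
                        int same = pred[i].type == act[j].type &&
                                   pred[i].rel_scn == act[j].rel_scn;

                        rec = act[j];
                        i++;
                        j++;
                        if (same)
                                continue;  /* Rule 1: match, nothing to write. */
                        /* Rule 2: mismatch, so record the actual entry. */
                }
                if (n == cap)
                        return (size_t)-1;
                out[n++] = rec;
        }
        return n;
}
```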

Creating a compact relocation entry from an actual relocation entry is fairly straightforward except in the case of an expression stack relocation sequence. First, create entries in the stack relocation table for each relocation entry in the sequence. Normally, this sequence starts with an R_OP_PUSH entry and ends with an R_OP_STORE entry. The last entry should have the last field set to one. Then create a CMRLC_EXPRESSION compact relocation entry whose index field points to the first entry in the stack relocation table for this expression. (This can only be done for a sequence that describes a complete expression.)

The fourth step is to compress any sequences of R_REF*, R_SREL*, R_GPREL32, or R_LITERAL entries that are consecutive and identical. Such a sequence exists if all relocation entries in the sequence have the same relocation type, are relative to the same rel_scn value (R_SN_* constant), and have v_offset fields that increase by a multiple of the natural size of the relocation type (for example, 8 bytes for R_REFQUAD, 2 bytes for R_SREL16). Such sequences can be replaced with a single compact relocation entry that has the sequence's type and rel_scn value. The v_offset field should be that of the first relocation entry in the sequence. The dist field should be set to the distance between repeated relocations in natural size increments, and the count field should be set to the number of relocation entries in the sequence.
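
The compression step can be sketched as a single greedy pass. The types ex_entry and ex_run and the function ex_compress are hypothetical. The sketch assumes that the input is sorted by v_offset and that every entry shares the natural size passed in nsize. Leaving dist at zero for an entry that is not repeated is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define CMRLC_COUNT_MAX         ((1<<12) - 1)
#define CMRLC_DIST_MAX          ((1<<4) - 1)

struct ex_entry { unsigned v_offset, type, rel_scn; };
struct ex_run   { unsigned v_offset, type, rel_scn, count, dist; };

/*
 * Collapse evenly spaced, identical entries into runs.  'out' must have
 * room for n elements.  Returns the number of runs written.
 */
static size_t ex_compress(const struct ex_entry *in, size_t n,
                          unsigned nsize, struct ex_run *out)
{
        size_t i = 0, m = 0;

        assert(nsize > 0);
        while (i < n) {
                struct ex_run r = { in[i].v_offset, in[i].type, in[i].rel_scn, 1, 0 };
                size_t j = i + 1;

                if (j < n && in[j].v_offset > in[i].v_offset) {
                        unsigned delta = in[j].v_offset - in[i].v_offset;

                        if (delta % nsize == 0 && delta / nsize <= CMRLC_DIST_MAX) {
                                /* Extend the run while spacing and identity hold. */
                                while (j < n && r.count < CMRLC_COUNT_MAX &&
                                       in[j].type == r.type &&
                                       in[j].rel_scn == r.rel_scn &&
                                       in[j].v_offset - in[j - 1].v_offset == delta) {
                                        r.count++;
                                        j++;
                                }
                                if (r.count > 1)
                                        r.dist = delta / nsize;
                        }
                }
                out[m++] = r;
                i = j;
        }
        return m;
}
```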

The final step is to set the rlc_sorted field in the compact relocation section header. If the compact relocations are stored in order of increasing v_offset values, this field should be set to one. Otherwise, it should be set to zero.

5.3.1.4    Basic Algorithm for Compact Relocations Consumption

A consumer tool can read back the compact relocation entries if it has the compact relocation information and the raw data that they describe. The consumer tool can use this information to regenerate the actual relocation entries by following this algorithm:

  1. Expand any R_REF*, R_SREL*, R_GPREL32, or R_LITERAL compact relocation entries whose count field is greater than one.

  2. Run the prediction heuristic function to construct a set of predicted relocation entries from the raw data.

  3. Compare the predicted relocation entries to the compact relocation entries and reconstruct the actual relocation entries.

The first step in this algorithm just undoes the compression step (step four) in the production algorithm.

The second step runs the same prediction heuristic that was used in the production algorithm. To guarantee that the generated predicted relocation entries are the same as when the compact relocation entries were produced, it is critical that the heuristic function is the same. It is also critical that the raw data is the same as when the compact relocation entries were produced.

The final step compares the predicted relocation entries with the stored compact relocation entries as follows:

  1. If only a predicted relocation entry exists for a particular virtual address, report the predicted relocation entry.

  2. If a CMRLC_NO_RELOC entry exists at the same virtual address as a predicted relocation entry, do not report a relocation entry at this virtual address.

  3. If a compact relocation entry other than CMRLC_NO_RELOC exists at the same virtual address as a predicted relocation entry, report the compact relocation entry.

  4. If only a compact relocation entry exists for a particular virtual address, report the compact relocation entry.
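
The four reconstruction rules above can be sketched as the mirror image of the producer's comparison. The record type ex_reloc, the marker EX_NO_RELOC, and the function ex_consume are hypothetical simplifications and are not the libmld.a interfaces. The sketch assumes that both lists are sorted by virtual address, with at most one entry per address, and that runs have already been expanded.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NO_RELOC (-1)   /* stands in for CMRLC_NO_RELOC in this sketch */

struct ex_reloc {
        unsigned long vaddr;
        int           type;
        int           rel_scn;
};

/*
 * Rebuild the actual relocation entries from predictions and compact
 * records.  Returns the number of entries written, or (size_t)-1 if
 * 'out' is full.
 */
static size_t ex_consume(const struct ex_reloc *pred, size_t np,
                         const struct ex_reloc *cmp, size_t nc,
                         struct ex_reloc *out, size_t cap)
{
        size_t i = 0, j = 0, n = 0;

        assert(out != NULL);
        while (i < np || j < nc) {
                struct ex_reloc rec;

                if (j == nc || (i < np && pred[i].vaddr < cmp[j].vaddr)) {
                        rec = pred[i++];        /* Rule 1: prediction stands. */
                } else if (i == np || cmp[j].vaddr < pred[i].vaddr) {
                        rec = cmp[j++];         /* Rule 4: compact entry only. */
                } else {
                        rec = cmp[j];           /* Rules 2 and 3: compact entry wins. */
                        i++;
                        j++;
                }
                if (rec.type == EX_NO_RELOC)
                        continue;               /* Rule 2: report nothing here. */
                if (n == cap)
                        return (size_t)-1;
                out[n++] = rec;
        }
        return n;
}
```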

5.3.2    Linkerdef Relocations


Version Note

Linkerdef relocations are supported in Tru64 UNIX V5.1 and greater for symbol table format V3.13 and greater.


Linkerdef relocations are generated by the linker for all fully linked executable objects and shared libraries. They are not produced for images linked with the following linker options: -r, -s. The strip utility will remove the comment subsection that contains linkerdef relocations. See Chapter 15 for the format of the .comment section.

The linkerdef relocations supplement compact relocation information. They provide relocation information for all uses of linker-defined symbol values within the section data of an object. This information is not currently accessible in compact relocation information. Compact relocations are generally stored as local relocations with no symbolic information. Linkerdef relocations are also unique because they contain relocations for absolute symbols with literal values such as _DYNAMIC_LINK and _procedure_table_size.

Tools that modify linked objects, such as om and spike, can use linkerdef relocations to update references to linker-defined symbol values that are necessarily changed as a result of other changes made to the linked object.

5.4    Language-Specific Image Relocations Features

Relocation entries may be generated for language-specific compiler-generated external symbols. For example, they are often generated in Fortran programs for the procedure for_set_reentrancy() and in C++ programs for exception-handling labels.