7    Comment Section

The Tru64 UNIX object file format supports a mechanism for storing information that is not part of a program's code or data and is not loaded into memory during execution. The comment section (.comment) is used for this purpose. Typically, this section contains information that describes an object but is not required for the correct operation of the object. Any kind of object file can have a comment section.


Version Note

In releases prior to Tru64 UNIX V5.0, the system linker ignores comment sections in input objects.


7.1    New and Changed Comment Section Features

Tru64 UNIX V5.1 introduces the comment section tag values marked (V5.1 - ) in Table 7-1.

Version 3.13 of the object file format introduces tag descriptors (see Section 7.3.4.1) and tool version information (see Section 7.3.4.2).

7.2    Structures, Fields, and Values of the Comment Section

All declarations described in this section are found in the header file scncomment.h.

7.2.1    Subsection Headers

The comment section begins with a set of header structures, each describing a separate subsection.

typedef struct {
        coff_uint        cm_tag;
        coff_uint        cm_len;
        coff_ulong       cm_val;
} CMHDR;

SIZE - 16 bytes, ALIGNMENT - 8 bytes

Subsection Header (CMHDR) Fields

cm_tag

Identifies the type of data in this subsection of the .comment section. This value may be recognized by system tools. If it is not recognized, generic processing occurs, as described in Section 7.3.3. Refer to Table 7-1 for a list of system-defined comment tags.

cm_len

Specifies the unpadded length (in bytes) of this subsection's data. If cm_len is zero, the data is stored in the cm_val field. The padded length is this value rounded up to the nearest 16-byte boundary.

cm_val

Provides either a pointer to this subsection's data or the data itself. If cm_len is nonzero, cm_val is a relative file offset to the start of the data from the beginning of the .comment section. If cm_len is zero, this field contains all data for that subsection. In the latter case, the size of the data is considered to be the size of the field (8 bytes).
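
The relationship between cm_len, cm_val, and the padded length can be sketched in C as follows. This is a minimal illustration, not the contents of scncomment.h. The fixed-width types stand in for coff_uint and coff_ulong (assumed here to be 4 and 8 bytes, which is consistent with the 16-byte structure size), and the names cmhdr_sketch, cm_padded_len, cm_is_immediate, and cm_data_size are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CMHDR; assumes coff_uint is 32 bits and coff_ulong is 64 bits. */
typedef struct {
    uint32_t cm_tag;
    uint32_t cm_len;
    uint64_t cm_val;
} cmhdr_sketch;

/* Hypothetical helper: round the unpadded length up to a multiple of 16. */
uint64_t cm_padded_len(uint32_t cm_len)
{
    return ((uint64_t)cm_len + 15u) & ~(uint64_t)15u;
}

/* Hypothetical helper: nonzero when the data is held in cm_val itself. */
int cm_is_immediate(const cmhdr_sketch *h)
{
    return h->cm_len == 0;
}

/* Hypothetical helper: size in bytes of the subsection's data. */
uint64_t cm_data_size(const cmhdr_sketch *h)
{
    return cm_is_immediate(h) ? (uint64_t)sizeof h->cm_val : h->cm_len;
}
```

For example, a subsection with cm_len set to 25 occupies 32 bytes of the raw data area, while a subsection with cm_len set to zero occupies none and carries 8 bytes of data in cm_val.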

Table 7-1:  Comment Section Tag Values

Tag              Value       Description
CM_END           0           Last subsection header. Must be present.
CM_CMSTAMP       3           First subsection header. The cm_val field contains a version stamp that identifies the version of the comment section format. The current definition of CM_VERSION is 0. Must be present.
CM_COMPACT_RLC   4           Compact relocation data. See Section 4.4 for details.
CM_STRSPACE      5           (V5.0 - ) Generic string space.
CM_TAGDESC       6           (V5.0 - ) Subsection containing flags that tell tools how to process unfamiliar subsections. See Section 7.2.2 and Section 7.3.4.1.
CM_IDENT         7           (V5.0 - ) Identification string. Reserved for system use.
CM_TOOLVER       8           (V5.0 - ) Tool-specific version information. See Section 7.3.4.2.
CM_II_CHECKSUMS  9           (V5.1 - ) Checksum data for Atom incremental instrumentation. Reserved for future use.
CM_II_ATOMARGS   10          (V5.1 - ) Atom argument data for incremental instrumentation. Reserved for future use.
CM_II_TOOLARGS   11          (V5.1 - ) Atom tool argument string for incremental instrumentation. Reserved for future use.
CM_II_ANALADDRS  12          (V5.1 - ) Analysis address information for Atom incremental instrumentation. Reserved for future use.
CM_FLOAT_TYPE    13          (not supported) Floating point type used in compilation. The value field will be set to one of: F_TANDEM_FLOATTYPE_UNUSED, F_TANDEM_FLOATTYPE_TANDEM, F_TANDEM_FLOATTYPE_NEUTRAL, F_TANDEM_FLOATTYPE_IEEE.
CM_II_OBJID      14          (V5.1 - ) Object identification number for Atom incremental instrumentation. Reserved for future use.
CM_LINKERDEF     15          (V5.1 - ) Relocation information for linker-defined symbols. See Section 4.5.
CM_LOUSER        0x80000000  Beginning of user tag value range (inclusive).
CM_HIUSER        0xffffffff  End of user tag value range (inclusive).


Version Note

The CM_FLOAT_TYPE tag is reserved for use on Tandem big-endian systems. It is not supported on Tru64 UNIX.


7.2.2    Tag Descriptor Entry

Tag descriptors specify behavior for tools that modify object files and potentially affect the accuracy of comment subsection data. They are especially useful as processing guidelines for tools that do not understand certain subsections. Tools that have specific knowledge of certain comment subsection types can ignore the tag descriptor settings for those subsection types. The tag descriptors are stored in the raw data of the CM_TAGDESC subsection. See Section 7.3.4.1 for more information.

typedef struct {
        coff_uint       tag;
        cm_flags_t      flags;
} cm_td_t;

SIZE - 8 bytes, ALIGNMENT - 4 bytes

Tag Descriptor Fields

tag

Tag value of subsection being described.

flags

Flag settings. See Section 7.2.2.1.

7.2.2.1    Comment Section Flags

typedef struct {
        coff_uint       cmf_strip   :3;
        coff_uint       cmf_combine :5;
        coff_uint       cmf_modify  :4;
        coff_uint       reserved    :20;
} cm_flags_t;

SIZE - 4 bytes, ALIGNMENT - 4 bytes

Comment Section Flags Fields

cmf_strip

Tells tools that perform stripping operations whether to strip comment section data.

cmf_combine

Tells tools how to combine multiple input subsections of the same type.

cmf_modify

Tells tools that modify single object files how to rewrite the input comment section in the output object.

Table 7-2:  Strip Flags

Name          Value   Description
CMFS_KEEP     0x0     Do not remove this subsection when performing stripping operations.
CMFS_STRIP    0x1     Remove this subsection if stripping the entire symbol table.
CMFS_LSTRIP   0x2     Remove this subsection if stripping local symbolic information or if fully stripping the symbol table.

Table 7-3:  Combine Flags

Name          Value   Description
CMFC_APPEND   0x0     Concatenate multiple instances of input subsection data.
CMFC_CHOOSE   0x1     Choose one instance of input subsection data (randomly).
CMFC_DELETE   0x2     Do not output this subsection.
CMFC_ERRMULT  0x3     Raise an error if multiple instances of this subsection are encountered as input.
CMFC_ERROR    0x4     Raise an error if a subsection of this type is encountered as input.

Table 7-4:  Modify Flags

Name          Value   Description
CMFM_COPY     0x0     Copy this subsection's data unchanged from the input object to the output object.
CMFM_DELETE   0x1     Do not output a subsection of this type.
CMFM_ERROR    0x2     Raise an error if a subsection of this type is encountered as input.
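
A tool that prints tag descriptors might map the three flag fields back to the names in Tables 7-2 through 7-4. The following is a minimal sketch of such a mapping. The enumerator values are copied from the tables, but the helper names (cmf_lookup, cmf_strip_name, cmf_combine_name, cmf_modify_name) are hypothetical and are not part of scncomment.h. The sketch takes already-extracted field values as plain integers because the layout of C bit-fields depends on the compiler.

```c
#include <stddef.h>

/* Flag values copied from Tables 7-2, 7-3, and 7-4. */
enum { CMFS_KEEP = 0x0, CMFS_STRIP = 0x1, CMFS_LSTRIP = 0x2 };
enum { CMFC_APPEND = 0x0, CMFC_CHOOSE = 0x1, CMFC_DELETE = 0x2,
       CMFC_ERRMULT = 0x3, CMFC_ERROR = 0x4 };
enum { CMFM_COPY = 0x0, CMFM_DELETE = 0x1, CMFM_ERROR = 0x2 };

/* Hypothetical helper: return names[value], or NULL for an unassigned value. */
static const char *cmf_lookup(const char *const *names, size_t count,
                              unsigned value)
{
    return value < count ? names[value] : NULL;
}

const char *cmf_strip_name(unsigned value)
{
    static const char *const names[] = {
        "CMFS_KEEP", "CMFS_STRIP", "CMFS_LSTRIP"
    };
    return cmf_lookup(names, sizeof names / sizeof names[0], value);
}

const char *cmf_combine_name(unsigned value)
{
    static const char *const names[] = {
        "CMFC_APPEND", "CMFC_CHOOSE", "CMFC_DELETE",
        "CMFC_ERRMULT", "CMFC_ERROR"
    };
    return cmf_lookup(names, sizeof names / sizeof names[0], value);
}

const char *cmf_modify_name(unsigned value)
{
    static const char *const names[] = {
        "CMFM_COPY", "CMFM_DELETE", "CMFM_ERROR"
    };
    return cmf_lookup(names, sizeof names / sizeof names[0], value);
}
```

Values that fit in a field but have no assigned meaning (for example, 5 in the 5-bit cmf_combine field) yield NULL, which a caller could report as an unknown setting.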

7.3    Comment Section Usage

7.3.1    Comment Section Formatting Requirements

The comment section is divided between subsection header structures and an unstructured raw data area. The subsection headers contain tags that identify the data stored in the subsequent raw data area. Each header describes a different subsection. The raw data for all subsections follows the last header, as shown in Figure 7-1.

Figure 7-1:  Comment Section Data Organization

Begin and end marker tags are used to denote the boundaries of the structured portion of the comment section. The begin marker is CM_CMSTAMP, which contains a comment section version stamp, and the end marker is CM_END. If either of these headers is missing or the version indicated by the value of CM_CMSTAMP is invalid, the comment section is considered invalid.

The ordering of the subsection headers does not need to match the ordering of their corresponding raw data, and the raw data area is not guaranteed to be densely packed. However, all subsection headers must be contiguous: no other data can be placed between them. Furthermore, a one-to-one relationship must exist between the subsection headers that point into the raw data and the data itself. Subsection raw data must not overlap.

The interpretation of the cm_val field depends on the cm_len field. When cm_len is zero, cm_val contains arbitrary data whose interpretation depends on the value in the cm_tag field. When cm_len is non-zero, cm_val contains a relative file offset from the start of the comment section into the raw data area.

The start of data allocated in the raw data area must be octaword (16-byte) aligned for each subsection. Zero-byte padding is inserted at the end of each data item as necessary to maintain this alignment. The value stored in cm_len represents the actual length of the data, not the padded length. Tools manipulating this data must calculate the padded length.
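
The formatting requirements above can be checked with a simple walk over the subsection headers. The following is a sketch under several assumptions: the section bytes are already in memory in host byte order, coff_uint and coff_ulong are 4 and 8 bytes, and the 16-byte alignment of raw data is measured from the start of the comment section. The names cmhdr_sketch and cm_count_headers are hypothetical. The sketch does not check for overlapping raw data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CM_END = 0, CM_CMSTAMP = 3, CM_VERSION = 0 };

/* Stand-in for CMHDR; assumes 32-bit coff_uint and 64-bit coff_ulong. */
typedef struct {
    uint32_t cm_tag;
    uint32_t cm_len;
    uint64_t cm_val;
} cmhdr_sketch;

/*
 * Hypothetical validator. Returns the number of subsection headers,
 * counting CM_CMSTAMP and CM_END, or -1 if the section is invalid.
 * `sec` points to the section contents and `size` is its length in bytes.
 */
long cm_count_headers(const unsigned char *sec, size_t size)
{
    size_t nhdr = size / sizeof(cmhdr_sketch);
    size_t i;
    cmhdr_sketch h;

    for (i = 0; i < nhdr; i++) {
        /* Copy out each header so the buffer itself need not be aligned. */
        memcpy(&h, sec + i * sizeof h, sizeof h);

        /* The first header must be CM_CMSTAMP with a known version. */
        if (i == 0 && (h.cm_tag != CM_CMSTAMP || h.cm_val != CM_VERSION))
            return -1;

        if (h.cm_tag == CM_END)
            return (long)(i + 1);

        if (h.cm_len != 0) {
            /* Raw data must be 16-byte aligned and lie inside the section. */
            if (h.cm_val % 16 != 0 || h.cm_val > size ||
                h.cm_len > size - h.cm_val)
                return -1;
        }
    }
    return -1;  /* ran out of headers without finding CM_END */
}
```

A zero-sized comment section contains no headers at all, so this sketch reports it as invalid. A caller that wants to accept empty sections would test for a size of zero before calling the function.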

7.3.2    Comment Section Contents

The comment section can contain various types of information. Each type of information is stored in its own subsection of the comment section. Each subsection must have a unique tag value within the section.

The comment section can include supplemental descriptive information about the object file. For instance, the tag CM_IDENT points to one or more ASCII strings in the raw data area that serve to identify the module. Use of this tag is reserved for compilation system object producers such as compilers and assemblers.

User-defined comment subsections are also possible. The CM_LOUSER and CM_HIUSER tags delimit the user-defined range of tag values. Potential uses include product version information and miscellaneous information targeted for specific consumers.

Although no restrictions are put on the type or amount of information that can be placed in the comment section, it is important to be aware that users have the capability to remove the section entirely (by using the command ostrip -c) and that object file consumers may ignore its presence.

The minimal valid comment section consists of a CM_CMSTAMP header and a CM_END header. Because no structure field in the object file format holds the number of subsections in the comment section, the presence of the CM_END header is crucial. Without it, a consumer cannot determine the number of subsections present.
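
As an illustration, the minimal section described above is 32 bytes long: two 16-byte headers and no raw data. The following sketch builds it in a caller-supplied buffer. It assumes 4-byte coff_uint and 8-byte coff_ulong fields written in host byte order, and the names cmhdr_sketch and cm_write_minimal are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CM_END = 0, CM_CMSTAMP = 3, CM_VERSION = 0 };

/* Stand-in for CMHDR; assumes 32-bit coff_uint and 64-bit coff_ulong. */
typedef struct {
    uint32_t cm_tag;
    uint32_t cm_len;
    uint64_t cm_val;
} cmhdr_sketch;

/*
 * Hypothetical helper: write a CM_CMSTAMP header followed by a CM_END
 * header into buf. Returns the number of bytes written, or 0 if the
 * buffer is too small.
 */
size_t cm_write_minimal(unsigned char *buf, size_t bufsize)
{
    cmhdr_sketch hdrs[2];

    if (bufsize < sizeof hdrs)
        return 0;

    memset(hdrs, 0, sizeof hdrs);
    hdrs[0].cm_tag = CM_CMSTAMP;   /* begin marker */
    hdrs[0].cm_val = CM_VERSION;   /* version stamp held as immediate data */
    hdrs[1].cm_tag = CM_END;       /* end marker */

    memcpy(buf, hdrs, sizeof hdrs);
    return sizeof hdrs;
}
```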

7.3.3    Comment Section Processing

Many tools that handle objects read or write the comment section. Some tools, such as the linker and mcs, perform special processing of comment section data. Others may be interested in extracting certain subsections. Most object-handling tools provided on the system access the comment section to check for tool-specific version information (see Section 7.3.4.2).

The linker is both a consumer and producer of the comment section. As with other object file sections, the linker must combine multiple input comment sections to form a single output section. When comment sections are encountered in input object files, the linker reads subsection headers and merges the raw data according to its own defaults and the flag settings of any tag descriptors that are present.

The mcs utility provides comment section manipulation facilities. This tool allows users to add, modify, delete, or print the comment section from the command line. The mcs tool can only process objects that already have a .comment section header, but actual .comment section data is not required. Compilers and assemblers frequently write object files which have zero-sized .comment sections.

The operations performed by mcs do not affect the object's suitability for linking or execution. See the mcs(1) man page for more details.

Stripping tools, such as strip and ostrip, also process the comment section. They read the tag descriptors to determine which subsections to remove. The cmf_strip field of the tag descriptor specifies the stripping behavior. If the cmf_strip field is set to CMFS_STRIP for a particular subsection type, that subsection is removed when an object is fully stripped. If the cmf_strip field is set to CMFS_LSTRIP, that subsection is removed when an object is fully stripped or locally stripped.
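
The stripping rule can be expressed as a small decision function. This is a sketch only: the strip_mode type and the cm_should_strip name are hypothetical, and treating unassigned cmf_strip values like CMFS_KEEP is an assumption rather than documented behavior.

```c
/* Strip flag values from Table 7-2. */
enum { CMFS_KEEP = 0x0, CMFS_STRIP = 0x1, CMFS_LSTRIP = 0x2 };

/* Hypothetical description of the operation a stripping tool performs. */
typedef enum { STRIP_LOCAL, STRIP_FULL } strip_mode;

/* Returns nonzero if a subsection with this cmf_strip value is removed. */
int cm_should_strip(unsigned cmf_strip, strip_mode mode)
{
    switch (cmf_strip) {
    case CMFS_STRIP:
        return mode == STRIP_FULL;
    case CMFS_LSTRIP:
        return mode == STRIP_LOCAL || mode == STRIP_FULL;
    default:
        /* CMFS_KEEP; unassigned values are also kept (assumption). */
        return 0;
    }
}
```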

7.3.4    Special Comment Subsections

Comment subsections can have particular structures or semantics that a consumer must know to be able to read and process them correctly. Two system-defined subsections with special formatting and processing rules are the tag descriptors (CM_TAGDESC) and the tool-specific version information (CM_TOOLVER).

Another special subsection contains compact relocation data (CM_COMPACT_RLC). This topic is covered in Section 4.4.

7.3.4.1    Tag Descriptors (CM_TAGDESC)


Version Note

Tag descriptors are supported in object format V3.13 and greater.


The tag descriptor subsection contains a table of tags and their corresponding flag settings. This information tells tools how to handle unfamiliar subsections. The CM_TAGDESC subsection might be absent, and if it is present, it might not contain an entry for every subsection in the object. Also, a tag descriptor may be present for a subsection that is not found in the object.

A list of possible tag descriptor flag settings can be found in Section 7.2.2.1. Flag settings are divided into three categories based on the categories of object tools that need to modify the comment section:

  1. Tools that strip object files

  2. Tools that combine multiple instances of comment section data

  3. Tools that modify and rewrite single object files

The default flag settings for user subsections that do not have tag descriptors are CMFS_KEEP, CMFC_APPEND, and CMFM_COPY. Tools that strip or rewrite objects should not modify subsection data for comment subsections marked with these default flag settings. A tool that combines multiple instances of subsection data should concatenate the subsection raw data for same-type input subsections marked with the default flag settings.

A tool can ignore the tag descriptor flags and default flag settings for a subsection if it recognizes the subsection type and understands how to process its data.

Some of the system tags have different defaults. These are shown in Table 7-5. However, tag descriptors in the CM_TAGDESC subsection can be used to override the default settings for system tag values as well as user tag values.

Table 7-5:  Default System Tag Flags

Tag              Default Flag Settings
CM_END           CMFS_KEEP, CMFC_CHOOSE, CMFM_COPY
CM_CMSTAMP       CMFS_KEEP, CMFC_CHOOSE, CMFM_COPY
CM_COMPACT_RLC   CMFS_STRIP, CMFC_DELETE, CMFM_DELETE
CM_STRSPACE      CMFS_KEEP, CMFC_APPEND, CMFM_COPY
CM_TAGDESC       CMFS_KEEP, CMFC_CHOOSE, CMFM_COPY
CM_IDENT         CMFS_KEEP, CMFC_APPEND, CMFM_COPY
CM_TOOLVER       CMFS_KEEP, CMFC_CHOOSE, CMFM_COPY
CM_II_CHECKSUMS  CMFS_STRIP, CMFC_ERROR, CMFM_COPY
CM_II_ATOMARGS   CMFS_STRIP, CMFC_ERROR, CMFM_COPY
CM_II_TOOLARGS   CMFS_STRIP, CMFC_ERROR, CMFM_COPY
CM_II_ANALADDRS  CMFS_STRIP, CMFC_ERROR, CMFM_COPY
CM_II_OBJID      CMFS_STRIP, CMFC_ERROR, CMFM_COPY
CM_LINKERDEF     CMFS_STRIP, CMFC_ERROR, CMFM_DELETE

Because the size of a tag descriptor entry is fixed, a consumer can determine the number of entries by dividing the size of the subsection by the size of a single tag descriptor (see Section 7.2.2). If cm_len is set to zero, a single tag descriptor is stored as immediate data.
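
The entry count calculation and a lookup by tag can be sketched as follows. The cm_td_sketch type is a stand-in for cm_td_t that keeps the flags as an opaque 32-bit word, and the function names cm_td_count and cm_td_find are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for cm_td_t (8 bytes). The flags word is left opaque here
 * because the layout of C bit-fields depends on the compiler.
 */
typedef struct {
    uint32_t tag;
    uint32_t flags;
} cm_td_sketch;

/* Number of tag descriptors in a CM_TAGDESC subsection with this cm_len. */
size_t cm_td_count(uint32_t cm_len)
{
    if (cm_len == 0)
        return 1;   /* a single descriptor held in cm_val */
    return cm_len / sizeof(cm_td_sketch);
}

/*
 * Find the descriptor for `tag`. Returns NULL when there is none, in
 * which case the caller falls back to the default flag settings.
 */
const cm_td_sketch *cm_td_find(const cm_td_sketch *table, size_t count,
                               uint32_t tag)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (table[i].tag == tag)
            return &table[i];
    }
    return NULL;
}
```

A CM_TAGDESC subsection with cm_len set to 24, for example, holds three descriptors.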

7.3.4.2    Tool Version Information (CM_TOOLVER)


Version Note

Tool versions are supported in object format V3.13 and greater.


The CM_TOOLVER subsection contains tool-specific version entries for system tools that process object files. If present, this subsection may have any number of entries. This subsection can also be used to record version information for non-system tools.

Each tool version entry consists of three parts:

  1. Tool name (null-terminated character string)

  2. Tool version number (unsigned 8-byte unaligned numeric value)

  3. Printable version string (null-terminated character string)

The number of tool version entries cannot be determined from the subsection header because the entries vary in length. The data must be read until the entry sought is found or until the end of the subsection's data is reached.

The encoding of the tool version number is generally tool dependent. The only requirement is that the value, viewed as an unsigned long, must be monotonically increasing with time.

Typically, an object file consumer uses the tool version information to verify its ability to handle an input object file. The consumer uses an API (see libst reference pages) to look for a tool version entry with a tool name matching its own (part one of the entry). If an entry is found, its version number (part two of the entry) must not exceed the version number of the tool. If it does, the tool prints a message instructing the user to obtain the newer version of the tool, using the printable version string (part three of the entry). This mechanism can be used as a warning to customers of a necessary upgrade to a newer release of a product, for instance.

As an example, a compiler might produce object files with new symbol table information that causes an old version of the ladebug debugger to produce a fatal error. To provide more user-friendly behavior for old versions of the debugger, the compiler outputs a tool version entry:

  1. "ladebug"

  2. 2

  3. "5.0A-BL5"

This entry occupies 25 bytes. The debugger recognizes its name in the entry and compares the version number "2" with the version number it was built with. (Note that the version number is most likely meaningless to an end user of the debugger.) In this case, assume that the installed debugger's version number is "1". The message "Please obtain version 5.0A-BL5" is output to the user.

Note that the numeric tool version number can be unaligned. This is an exception to the general rule requiring alignment of numeric data.
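
The scan described in this section can be sketched without the libst API. The following illustration assumes the CM_TOOLVER data is already in memory, that the version number is stored in host byte order, and that the length passed in is the unpadded cm_len. The function names cm_toolver_find and cm_toolver_ok are hypothetical. The version number is copied out with memcpy because it can be unaligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical lookup: scan `len` bytes of CM_TOOLVER data for an entry
 * whose tool name equals `name`. On success, store the version number
 * and a pointer to the printable version string and return 1. Return 0
 * if no entry matches or the data is malformed.
 */
int cm_toolver_find(const unsigned char *data, size_t len, const char *name,
                    uint64_t *version, const char **printable)
{
    size_t pos = 0;

    while (pos < len) {
        const char *entry_name = (const char *)(data + pos);
        const unsigned char *nul = memchr(data + pos, '\0', len - pos);
        const unsigned char *vstr;
        uint64_t v;

        if (nul == NULL)
            return 0;                       /* unterminated tool name */
        pos = (size_t)(nul - data) + 1;

        if (len - pos < sizeof v)
            return 0;                       /* truncated version number */
        memcpy(&v, data + pos, sizeof v);   /* value may be unaligned */
        pos += sizeof v;

        vstr = data + pos;
        nul = memchr(vstr, '\0', len - pos);
        if (nul == NULL)
            return 0;                       /* unterminated version string */
        pos = (size_t)(nul - data) + 1;

        if (strcmp(entry_name, name) == 0) {
            *version = v;
            *printable = (const char *)vstr;
            return 1;
        }
    }
    return 0;
}

/* Nonzero if a tool built with `tool_version` can handle the entry. */
int cm_toolver_ok(uint64_t entry_version, uint64_t tool_version)
{
    return entry_version <= tool_version;
}
```

Applied to the ladebug example, the lookup returns version 2 and the string "5.0A-BL5". A debugger built with version number 1 fails the check and would then print its upgrade message.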