One of the chief tasks of the compilation process is the production of a symbol table, which is a collection of data structures whose purpose is to store type, scope, and address information about program data. Compilers and assemblers create the symbol table. It is read and may be modified by linkers, profiling tools, and assorted object manipulation tools. It also contains information required for debugging.
For large applications, a single compilation can involve many program components, including source files, header files, and libraries. Data from all of these files must be described in the symbol table.
The Tru64 UNIX eCOFF symbol table, when present, comprises a large portion of the physical object file and is often considered a stand-alone entity. It is divided into numerous sections, including a header section that is used for navigation. The contents of the symbol table are shown in Figure 5-1.
Figure 5-1: Symbol Table Sections
The symbol table has a hierarchical design. The sections storing local symbols, local strings, relative file descriptors, procedure descriptors, line numbers, auxiliary symbols, and optimization symbols are divided into subtables and organized by file. Local symbols, local strings, and optimization symbols are further broken down by procedure. Figure 5-2 depicts this hierarchy.
Figure 5-2: Symbol Table Hierarchy
A particular symbol table may not contain all sections, for one of the following reasons:
Relative file descriptors are present in linked objects only.
The line number, auxiliary symbol and optimization symbol tables are produced only when debugging information is requested.
Symbol table information may be partially or entirely removed by post-link object tools.
Optimization symbols are not present in symbol table formats earlier than V3.13.
The function of each symbol table section is summarized below:
The symbolic header stores the sizes and locations of all other symbol table sections.
The line number table enables debuggers to map machine instructions to source code lines.
The procedure descriptor table contains call-frame information as well as pointers to a procedure's local symbols, line numbers and optimization entries.
The local symbol table describes procedures, static and local data, and user-defined types.
The external symbol table stores information about global symbols.
The relative file descriptor table contains a post-link file descriptor table index mapping for each file in the compilation.
The local and external string tables store local and external symbol names, respectively.
The file descriptor table stores the sizes and locations of each subtable produced for contributing source and include files. It also contains miscellaneous information about each file, such as the source language and the level of symbolic information.
The auxiliary symbol table contains data type information for local and external symbols.
The optimization symbols section stores procedure relative information, including extended source location information and optimized debugging information.
Several tools are available to view the contents of the symbol table. See the stdump(1), odump(1), and nm(1) man pages.
This chapter covers symbol table organization and usage, concentrating on debugging issues in particular. The current version of the symbol table is V3.13. The dynamic symbol table built by the linker is discussed separately in Section 6.3.3.
5.1 New or Changed Symbol Table Features
Tru64 UNIX V5.1 includes the following new or changed features:
Alignment for common storage class symbols (see Section 5.2.6 and Section 2.3.5)
Tail call flag used in procedure call optimization (see Section 5.2.3)
A new ESLI command to describe gaps in address ranges (see Section 5.3.2.2)
A new basic type for 32-byte complex (see Table 5-5).
A new representation for empty classes or structures (see Section 5.3.8.6.1) to distinguish them from opaque classes and structures (see Section 5.3.8.6.2).
Version 3.13 of the symbol table includes the following new or changed features:
64-bit auxiliary support (see Section 5.3.7.3)
Parameters with static storage and unallocated parameters (see Section 5.2.11)
New optimization symbols section (see Section 5.3.3)
Extended Source Location Information (see Section 5.3.2.2)
New representation for procedures with no text (see Section 5.3.6.1)
Modified variant record representation (see Section 5.3.8.11)
New function pointer representation (see Section 5.3.8.5)
Block symbol added for alternate entry prologue size (see Section 5.3.6.7)
Address of locally stripped FDRs set to addressNil (see Section 5.3.1.2)
Uplevel links for referencing local symbols in an outer scope (see Section 5.3.4.4)
New profile feedback information (see Section 5.3.5)
New representation for C++ namespaces (see Section 5.3.6.4)
Unnamed union or structure representation (see Section 5.3.8.3)
5.2 Structures, Fields and Values for Symbol Tables
Unless otherwise specified, all structures described in this section are declared in the header file sym.h, and all constants are defined in the header file symconst.h.
5.2.1 Symbolic Header (HDRR)
typedef struct {
    coff_ushort magic;
    coff_ushort vstamp;
    coff_int    ilineMax;
    coff_int    idnMax;
    coff_int    ipdMax;
    coff_int    isymMax;
    coff_int    ioptMax;
    coff_int    iauxMax;
    coff_int    issMax;
    coff_int    issExtMax;
    coff_int    ifdMax;
    coff_int    crfd;
    coff_int    iextMax;
    coff_long   cbLine;
    coff_off    cbLineOffset;
    coff_off    cbDnOffset;
    coff_off    cbPdOffset;
    coff_off    cbSymOffset;
    coff_off    cbOptOffset;
    coff_off    cbAuxOffset;
    coff_off    cbSsOffset;
    coff_off    cbSsExtOffset;
    coff_off    cbFdOffset;
    coff_off    cbRfdOffset;
    coff_off    cbExtOffset;
} HDRR, *pHDRR;
SIZE - 144 bytes, ALIGNMENT - 8 bytes
Symbolic Header Fields
magic
To verify validity of the symbol table, this field must contain the constant magicSym, defined as 0x1992.
vstamp
Symbol table version stamp. This value consists of a major version number and a minor version number, as defined in the stamp.h header file:
Symbol | Value | Description |
MAJ_OBJ_STAMP | 3 | Current major object format version |
MIN_OBJ_STAMP | 13 | Current minor object format version |
See Section 1.4.5 for a description of object and symbol table versioning.
ilineMax
idnMax
Obsolete.
ipdMax
isymMax
ioptMax
iauxMax
Number of auxiliary symbols.
issMax
issExtMax
ifdMax
crfd
iextMax
cbLine
Byte size of (packed) line number entries.
cbLineOffset
Byte offset to start of (packed) line numbers.
cbDnOffset
Obsolete.
cbPdOffset
Byte offset to start of procedure descriptors.
cbSymOffset
Byte offset to start of local symbols.
cbOptOffset
Byte offset to start of optimization entries.
cbAuxOffset
Byte offset to start of auxiliary symbols.
cbSsOffset
Byte offset to start of local strings.
cbSsExtOffset
Byte offset to start of external strings.
cbFdOffset
Byte offset to start of file descriptors.
cbRfdOffset
Byte offset to start of relative file descriptors.
cbExtOffset
Byte offset to start of external symbols.
General Notes:
The size and offset fields describing symbol table sections must be set to zero if the section described is not present.
The cb*Offset fields are byte offsets from the beginning of the object file.
The i*Max fields contain the number of entries for a symbol table section. Legal index values for a symbol table section will range from 0 to the value of the associated i*Max field minus one.
For an explanation of packed and expanded line number entries, see the discussion in Section 5.3.2.2.
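The header checks described above can be sketched in C. This is a minimal illustration under stated assumptions, not the system's implementation: sym_header_view is a hypothetical stand-in for the few HDRR fields used here (the real declaration is in sym.h), and the helper names are invented for this example. The packing of vstamp (major version in the high byte, minor version in the low byte) is also an assumption; this section states only that the value consists of a major and a minor number.

```c
#include <stdint.h>

#define magicSym 0x1992   /* value given in the magic field description */

/* Hypothetical stand-in for a few HDRR fields; not the sym.h declaration. */
typedef struct {
    uint16_t magic;
    uint16_t vstamp;
    int32_t  isymMax;     /* an i*Max field: number of entries in a section */
} sym_header_view;

/* Returns 1 if the header carries the symbol table magic number. */
int header_is_valid(const sym_header_view *h)
{
    return h->magic == magicSym;
}

/* ASSUMPTION: major version in the high byte, minor version in the low byte. */
unsigned vstamp_major(uint16_t vstamp) { return (unsigned)vstamp >> 8; }
unsigned vstamp_minor(uint16_t vstamp) { return (unsigned)vstamp & 0xffu; }

/* Legal indices run from 0 to the associated i*Max value minus one. */
int index_in_range(int32_t index, int32_t imax)
{
    return index >= 0 && index < imax;
}
```

A reader would typically call header_is_valid first and refuse to interpret any offset or count field if it fails.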
5.2.2 File Descriptor Entry (FDR)
typedef struct fdr {
    coff_addr   adr;
    coff_long   cbLineOffset;
    coff_long   cbLine;
    coff_long   cbSs;
    coff_int    rss;
    coff_int    issBase;
    coff_int    isymBase;
    coff_int    csym;
    coff_int    ilineBase;
    coff_int    cline;
    coff_int    ioptBase;
    coff_int    copt;
    coff_int    ipdFirst;
    coff_int    cpd;
    coff_int    iauxBase;
    coff_int    caux;
    coff_int    rfdBase;
    coff_int    crfd;
    coff_uint   lang : 5;
    coff_uint   fMerge : 1;
    coff_uint   fReadin : 1;
    coff_uint   fBigendian : 1;
    coff_uint   glevel : 2;
    coff_uint   fTrim : 1;
#ifndef TANDEMSYM
    coff_uint   reserved : 5;
#else
    coff_uint   platform : 3;    (not supported)
    coff_uint   reserved : 2;
#endif
    coff_ushort vstamp;          (SV3.13 - )
    coff_uint   reserved2;
} FDR, *pFDR;
SIZE - 96 bytes, ALIGNMENT - 8 bytes
See Section 5.3.2.1 for related information.
File Descriptor Table Entry Fields
adr
Address of first instruction generated from this source file, which should be the same value as found in the PDR.adr field of the first procedure descriptor for this file. If no instructions are associated with this source file, this field should be set to 0. File descriptors that have been merged by source language in locally-stripped objects will have this field set to addressNil (-1).
Version Note This use of addressNil is supported in symbol table format V3.13 and greater.
cbLineOffset
Byte offset from start of packed line numbers to start of entries for this file.
cbLine
Byte size of packed line numbers for this file.
cbSs
rss
Byte offset from start of file's local string table entries to source file name; set to issNil (-1) to indicate the source file name is unknown.
issBase
Start of local strings for this file.
isymBase
csym
Count of local symbol entries for this file.
ilineBase
Debuggers and other tools expand the packed line numbers, producing an array of line numbers with an entry for each machine instruction in the program. This field is an index for this file's first line number entry in the expanded line number array.
cline
See the preceding description of ilineBase. This field is a count of this file's entries in the expanded line number array.
ioptBase
Byte offset from start of optimization symbol table to optimization symbol entries for this file.
copt
Byte size of optimization symbol entries for this file.
ipdFirst
Starting index of procedure descriptors for this file.
cpd
Count of procedure descriptors for this file.
iauxBase
Starting index of auxiliary symbol entries for this file.
caux
Count of auxiliary symbol entries for this file.
rfdBase
crfd
Count of relative file descriptors for this file.
lang
Source language for this file (see Table 5-1).
fMerge
fReadin
True if file was read in (as opposed to just created).
fBigendian
Unused.
glevel
Symbolic information level with which this file was compiled. This value is not the same as the user's idea of debugging levels. The value mapping from the user level -g option to the symbol table value is:
Debug switch | glevel contents |
-g0 | 2 |
-g1 | 1 |
-g2 | 0 |
-g3 | 3 |
fTrim
Unused.
platform
Identifies the platform associated with the file descriptor. Set to platUndef, platGuard, platOss, or platPc.
Version Note The platform field is reserved for use on Tandem big-endian systems. It is not supported on Tru64 UNIX.
vstamp
Symbol table version stamp (HDRR.vstamp) value from the original object module (.o file) that is recorded by the linker. The linker may combine objects that were compiled at different times and potentially contain different versions of the symbol table. In post-link objects, this value may or may not match the version stamp in the symbolic header. For pre-link objects, the value in this field will either be zero or the same as the symbolic header stamp.
Version Note The vstamp field is supported on Tru64 UNIX V5.0 and greater for symbol table version V3.13 and greater.
reserved
Must be zero.
reserved2
Must be zero.
General Notes:
The i*Base fields provide the starting indices of this file's subtables within the symbol table sections. If the associated count fields are set to 0, the base fields will also be set to zero.
For an explanation of packed and expanded line number entries, see the discussion in Section 5.3.2.2.
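Because the glevel encoding is not monotonic in the user-level -g number, a small lookup is less error-prone than comparing values directly. The sketch below encodes only the mapping table given in the glevel field description; the function names are hypothetical and are not part of sym.h or symconst.h.

```c
/* Returns the user-level -g number (0..3) for a glevel field value,
 * or -1 if the value does not fit the 2-bit field. */
int debug_switch_from_glevel(unsigned glevel)
{
    switch (glevel) {
    case 2:  return 0;   /* -g0 */
    case 1:  return 1;   /* -g1 */
    case 0:  return 2;   /* -g2 */
    case 3:  return 3;   /* -g3 */
    default: return -1;
    }
}

/* Inverse mapping: glevel field value for a -g switch, or -1 if unknown. */
int glevel_from_debug_switch(int g)
{
    static const int map[4] = { 2, 1, 0, 3 };
    return (g >= 0 && g <= 3) ? map[g] : -1;
}
```

Note that a glevel of 0 corresponds to -g2, so a zeroed field does not mean "no debugging information".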
Table 5-1: Source Language (lang) Constants
Name | Value | Comment |
langC | 0 | |
langPascal | 1 | |
langFortran | 2 | |
langAssembler | 3 | |
langMachine | 4 | |
langNil | 5 | |
langAda | 6 | |
langPl1 | 7 | |
langCobol | 8 | |
langStdc | 9 | |
langMIPSCxx | 10 | Unused. |
langDECCxx | 11 | |
langCxx | 12 | |
langFortran90 | 13 | Not used by all compilers; langFortran might be used instead for both f77 and f90 |
langBliss | 14 | |
langPTAL | 15 | (not supported) |
langCplusplusV1 | 16 | (not supported) |
langCplusplusV2 | 17 | (not supported) |
langMax | 31 | Number of language codes available |
Version Note The language constants langPTAL, langCplusplusV1, and langCplusplusV2 are reserved for use on Tandem big-endian systems. They are not supported on Tru64 UNIX.
5.2.3 Procedure Descriptor Entry (PDR)
#ifndef TANDEMSYM
typedef struct pdr {
#else
typedef struct pdrv4 {
#endif
    coff_addr   adr;
    coff_long   cbLineOffset;
    coff_int    isym;
    coff_int    iline;
    coff_uint   regmask;
    coff_int    regoffset;
    coff_int    iopt;
    coff_uint   fregmask;
    coff_int    fregoffset;
    coff_int    frameoffset;
    coff_int    lnLow;
    coff_int    lnHigh;
    coff_uint   gp_prologue : 8;
    coff_uint   gp_used : 1;
    coff_uint   reg_frame : 1;
    coff_uint   prof : 1;
    coff_uint   gp_tailcall : 1;    (V5.1 - )
#ifndef TANDEMSYM
    coff_uint   reserved : 12;
#else
    coff_uint   optlevel : 4;       (not supported)
    coff_uint   reserved : 8;
#endif
    coff_uint   localoff : 8;
    coff_ushort framereg;
    coff_ushort pcreg;
#ifdef TANDEMSYM
    coff_uint   proctype : 16;      (not supported)
    coff_uint   reserved2 : 48;
} PDRV4, *pPDRV4;
#else
} PDR, *pPDR;
#endif
SIZE - 64 bytes (72 bytes for Tandem), ALIGNMENT - 8 bytes
See Section 5.3.4 for related information.
Procedure Descriptor Table Entry Fields
adr
The start address of this procedure. Set to addressNil (-1) for procedures with no text.
Version Note Prior to symbol table format V3.13 this field may not be updated by the linker. To determine the procedure start address for symbol table formats V3.10 - V3.12, use the algorithm described in Section 5.3.4.1.
cbLineOffset
Byte offset to the start of this procedure's packed line numbers from the start of the file descriptor entry (FDR.cbLineOffset).
isym
Start of local symbols for this procedure. This symbol is the symbol for the procedure (symbol type stProc). The name of the procedure can be obtained from the iss field of the symbol table entry.
If the object is stripped of local symbol information, this field contains an external symbol table index for the procedure symbol's entry.
If this procedure has no symbols associated with it, this field should be set to isymNil (-1). This situation occurs for a static procedure in an object stripped of local symbol information.
iline
Start of line number entries (if expanded) for this procedure. Set to ilineNil (-1) to indicate that this procedure does not have line numbers.
regmask
Saved general register mask.
regoffset
Offset from the virtual frame pointer to the general register save area in the stack frame.
iopt
Start of procedure's optimization symbol entries. Set to ioptNil (-1) to indicate that this procedure does not have optimization symbol entries.
fregmask
Saved floating-point register mask.
fregoffset
Offset from the virtual frame pointer to the floating-point register save area in the stack frame.
frameoffset
Size of the fixed part of the stack frame. The actual frame size can exceed this value. A routine can extend its own frame size for frame sizes larger than 2 GB or for dynamic stack allocation requests.
lnLow
Lowest source line number within this file for the procedure. This is typically the line number of the first instruction in the procedure, but not always. Code optimizations can rearrange or remove instructions making the first instruction map to a different line number.
lnHigh
Highest source line number within this file for the procedure. This field contains a value of -1 for alternate entry points, which is how an alternate entry point is identified.
gp_prologue
gp_used
Flag set if the procedure uses GP.
reg_frame
True if the procedure is a light-weight or null-weight procedure. See the General Notes section following these definitions for more details on procedure weights.
prof
True if the procedure has been compiled with -pg for gprof profiling.
gp_tailcall
Indicates that a call to this procedure may result in a tail call return from a different GP domain. This bit is used exclusively for tail call optimizations.
Version Note The
gp_tailcall
field is supported in Tru64 UNIX V5.1 and greater.
optlevel
Optimization level. Set to 0 for unknown or 1 through 6 for optimization levels 0 through 5 respectively.
Version Note The optlevel field is used on Tandem big-endian systems. It is not supported on Tru64 UNIX.
reserved
Must be zero.
localoff
Bias value for accessing local symbols on the stack at run time.
framereg
Frame pointer register number.
pcreg
PC (Program Counter) register number.
proctype
Procedure attribute flags. See Table 5-2 for flag descriptions.
Version Note The
proctype
field and the associated flag values in Table 5-2 are reserved for use on Tandem big-endian systems. They are not supported on Tru64 UNIX.
Table 5-2: Procedure Attribute Flags
Flag | Value | Description |
TNDM_MAIN | 0x0001 | Main entry point |
TNDM_RESIDENT | 0x0002 | Resident routine |
TNDM_PRIVILEGED | 0x0004 | Privileged routine |
TNDM_CALLABLE | 0x0008 | Callable routine |
TNDM_ENTRY | 0x0010 | Alternate entry, procedure, or subprocedure |
TNDM_SUBPROC | 0x0020 | Subprocedure |
TNDM_INTERRUPT | 0x0040 | Interrupt routine |
TNDM_SHELL | 0x0080 | Shell routine |
TNDM_COMPILER_GENERATED | 0x0200 | Procedure can have multiple copies |
TNDM_EXTENSIBLE | 0x0800 | Extensible procedure |
TNDM_EDITLINE | 0x8000 | Edit line numbers |
General Notes:
For more information on call frames, see Section 5.3.4.2.
If the value of gp_prologue is zero and gp_used is 1, a GP prologue is present but was scheduled into the procedure prologue. Otherwise, the gp_prologue field gives the number of bytes occupied by the GP prologue instructions at the procedure's start address.
If there is a chain of tail call procedures, some of which are in the same GP domain and some of which are in a different GP domain, then gp_tailcall must be set for all procedures in the chain. For example, suppose there is a tail call from A to B, and a tail call from B to C. A and B are in the same GP domain, but C is in a different GP domain. In this case gp_tailcall must be set in both A's and B's PDR, because callers cannot rely on the standard definition of GP after calling A. See the Alpha Architecture Reference Manual for additional details.
For an explanation of packed and expanded line number entries, see the discussion in Section 5.3.2.2.
A procedure may be heavy-, light-, or null-weight. The weight of a procedure can be determined from its descriptor by using the following guidelines:
Weight | Indications |
Heavy | reg_frame is 0 and bit 26 of the register mask (regmask) is on |
Light | reg_frame is 1 and regoffset is ra_save |
Null | reg_frame is 1 and regoffset is 26 |
See the Calling Standard for Alpha Systems for details on the calling conventions for different weight procedures. Note that a calling routine does not need to know the weight of the routine being called.
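The GP prologue and procedure weight rules in the General Notes can be sketched as two small classifiers. This is an illustration only: pdr_view is a hypothetical stand-in holding just the PDR fields involved, the enum and function names are invented here, and the light-weight test rests on an assumption, namely that for a register-frame procedure regoffset holds a register number and that any value other than 26 is the ra_save register.

```c
/* Hypothetical stand-in for the PDR fields used below; not the sym.h type. */
typedef struct {
    unsigned int regmask;
    int          regoffset;
    unsigned int gp_prologue : 8;
    unsigned int gp_used     : 1;
    unsigned int reg_frame   : 1;
} pdr_view;

typedef enum { GP_NONE, GP_AT_ENTRY, GP_SCHEDULED } gp_prologue_kind;
typedef enum { WEIGHT_HEAVY, WEIGHT_LIGHT, WEIGHT_NULL, WEIGHT_UNKNOWN } proc_weight;

/* Interprets the gp_prologue / gp_used pair. */
gp_prologue_kind classify_gp_prologue(const pdr_view *p)
{
    if (p->gp_prologue != 0)
        return GP_AT_ENTRY;           /* gp_prologue bytes at the start address */
    return p->gp_used ? GP_SCHEDULED  /* present, scheduled into the prologue */
                      : GP_NONE;      /* zero bytes and GP not used */
}

/* Applies the weight table. */
proc_weight classify_weight(const pdr_view *p)
{
    if (!p->reg_frame)
        return (p->regmask & (1u << 26)) ? WEIGHT_HEAVY : WEIGHT_UNKNOWN;
    /* ASSUMPTION: regoffset is a register number here; 26 means null-weight,
     * any other value is treated as the ra_save register (light-weight). */
    return p->regoffset == 26 ? WEIGHT_NULL : WEIGHT_LIGHT;
}
```

WEIGHT_UNKNOWN covers descriptors that match none of the three rows of the table; how a tool should treat such a descriptor is not specified in this section.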
5.2.4 Line Number Entry (LINER)
Line numbers are represented using two formats: packed and expanded. The packed format is a byte stream that can be interpreted as described in Section 5.3.2.2 to build an expanded table that maps instructions to source line numbers. The LINER type is used to refer to a single entry in the expanded table. It is declared as:
typedef int LINER, *pLINER;
A second, newer form of line number information is located in the optimization symbols section. See Section 5.2.10 and Section 5.3.2.2.
5.2.5 Local Symbol Entry (SYMR)
typedef struct {
    coff_long value;
    coff_int  iss;
    coff_uint st : 6;
    coff_uint sc : 5;
    coff_uint reserved : 1;
    coff_uint index : 20;
} SYMR, *pSYMR;
SIZE - 16 bytes, ALIGNMENT - 8 bytes
See Section 5.2.11, Section 5.3.4, and Section 5.3.8 for related information.
Local Symbol Table Entry Fields
value
A field that can contain an address, size, offset, or index. Its interpretation is determined by the symbol type and storage class combination, as explained in Section 5.2.11.
iss
Byte offset from the issBase field of a file descriptor table entry to the name of the symbol. If the symbol does not have a name, this field is set to issNil (-1). Generally, all user-defined symbols have names. A symbol without a name is one that has been created by the compilation system for its own use.
st
Symbol type (see Table 5-3).
sc
Storage class (see Table 5-4).
reserved
Must be zero.
index
An index into either the local symbol table or auxiliary symbol table, depending on the symbol type and class. The index is used as an offset from the isymBase field in the file descriptor entry for an entry in the local symbol table or an offset from the iauxBase field for an entry in the auxiliary symbol table.
The index field may have a value of indexNil, which is defined as (long)0xfffff. This value is used to indicate that the index is not a valid reference.
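The iss and index rules can be illustrated with a short sketch. It assumes the caller already has a pointer to the start of the local string table plus the owning file descriptor's issBase and iauxBase values; sym_view mirrors the SYMR layout with plain C types, and the function names are hypothetical rather than part of sym.h.

```c
#include <stddef.h>

#define issNil   (-1)
#define indexNil 0xfffff   /* all 20 bits of the index field set */

/* Stand-in for SYMR using plain C types; illustrative only. */
typedef struct {
    long         value;
    int          iss;
    unsigned int st       : 6;
    unsigned int sc       : 5;
    unsigned int reserved : 1;
    unsigned int index    : 20;
} sym_view;

/* Returns the symbol's name, or NULL when the symbol has no name. */
const char *local_symbol_name(const char *local_strings, int iss_base,
                              const sym_view *sym)
{
    if (sym->iss == issNil)
        return NULL;
    return local_strings + iss_base + sym->iss;
}

/* Returns 1 when the index field holds a usable reference. */
int symbol_index_is_valid(const sym_view *sym)
{
    return sym->index != indexNil;
}

/* Biases the file-relative index by iauxBase; returns -1 for indexNil. */
long absolute_aux_index(int iaux_base, const sym_view *sym)
{
    return symbol_index_is_valid(sym) ? (long)iaux_base + (long)sym->index : -1;
}
```

The same biasing applies with isymBase when the symbol type and class indicate that index refers to the local symbol table.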
The next two tables contain all defined values for the st and sc constants, along with short descriptions. However, these fields must be considered as pairs that have a limited number of possible pairings as explained in Section 5.2.11.
Table 5-3: Symbol Type (st) Constants
Constant | Value | Description |
stNil | 0 | Dummy entry |
stGlobal | 1 | Global variable |
stStatic | 2 | Static variable |
stParam | 3 | Procedure argument |
stLocal | 4 | Local variable |
stLabel | 5 | Label |
stProc | 6 | Global procedure |
stBlock | 7 | Start of block |
stEnd | 8 | End of block, file, or procedure |
stMember | 9 | Member of class, structure, union, or enumeration |
stTypedef | 10 | User-defined type definition |
stFile | 11 | Source file name |
stStaticProc | 14 | Static procedure |
stConstant | 15 | Constant data |
stBase | 17 | Base class (for example, C++) |
stVirtBase | 18 | Virtual base class (for example, C++) |
stTag | 19 | Data structure tag value (for example, C++ class or struct) |
stInter | 20 | Interlude (for example, C++) |
stModule | 22 | (not yet implemented) Fortran90 module definition. |
stNamespace | 22 | (V5.0 - ) Namespace definition (for example, C++) |
stModview | 23 | (not yet implemented) Modifiers for current view of given module. |
stUsing | 23 | (V5.0 - ) Namespace use (for example, C++ "using"). |
stAlias | 24 | (V5.0 - ) Defines an alias for another symbol. Currently used only for namespace aliases. |
stDefine | 25 | (not supported) Macro definition |
stObjinfo | 26 | (not supported) Name/data object info |
stToolinfo | 27 | (not supported) Compiler info |
stSrcinfo | 28 | (not supported) Source data info |
stEquivRel | 29 | (not supported) Equivalence variable |
stMax | 64 | Maximum number of symbol types |
General Notes:
Symbol type codes with more than one interpretation are identified by the lang field in the associated file descriptor. This applies to the stModule/stNamespace and stModview/stUsing symbol types.
Version Note The symbol types stDefine, stObjinfo, stToolinfo, stSrcinfo, and stEquivRel are reserved for use on Tandem big-endian systems. They are not supported on Tru64 UNIX.
Table 5-4: Storage Class (sc) Constants
Constant | Value | Description |
scNil | 0 | Dummy entry |
scText | 1 | Symbol allocated in the .text section |
scData | 2 | Symbol allocated in the .data section |
scBss | 3 | Symbol allocated in the .bss section |
scRegister | 4 | Symbol allocated in a register |
scAbs | 5 | Symbol value is absolute |
scUndefined | 6 | Symbol referenced but not defined in the current module |
scUnallocated | 7 | Storage not allocated for this symbol |
scResText | 8 | (not supported) Resident text |
scTlsUndefined | 9 | TLS symbol referenced but not defined in the current module |
scInfo | 11 | Symbol contains debugger information |
scSData | 13 | Symbol allocated in the .sdata section |
scSBss | 14 | Symbol allocated in the .sbss section |
scRData | 15 | Symbol allocated in the .rdata section |
scVar | 16 | Parameter passed by reference (for example, Fortran or Pascal) |
scCommon | 17 | Common symbol |
scSCommon | 18 | Small common symbol |
scVarRegister | 19 | Parameter passed by reference in a register |
scVariant | 20 | Variant record (for example, Pascal or Ada) |
scFileDesc | 20 | File descriptor (for example, COBOL) |
scSUndefined | 21 | Small undefined symbol |
scInit | 22 | Symbol allocated in the .init section |
scReportDesc | 23 | Report descriptor (for example, COBOL) |
scXData | 24 | Symbol allocated in the .xdata section |
scPData | 25 | Symbol allocated in the .pdata section |
scFini | 26 | Symbol allocated in the .fini section |
scRConst | 27 | Symbol allocated in the .rconst section |
scTlsCommon | 29 | TLS common symbol |
scTlsData | 30 | Symbol allocated in the .tlsdata section |
scTlsBss | 31 | Symbol allocated in the .tlsbss section |
scMax | 32 | Maximum number of storage classes |
Version Note The scResText storage class is reserved for use on Tandem big-endian systems. It is not supported on Tru64 UNIX.
5.2.6 External Symbol Entry (EXTR)
typedef struct {
    SYMR      asym;
    coff_uint jmptbl : 1;
    coff_uint cobol_main : 1;
    coff_uint weakext : 1;
    coff_uint alignment : 4;    (V5.1 - )
#ifdef TANDEMSYM
    coff_uint xport : 1;        (not supported)
    coff_uint multiext : 1;     (not supported)
    coff_uint reserved : 23;
#else
    coff_uint reserved : 25;
#endif
    coff_int  ifd;
} EXTR, *pEXTR;
SIZE - 24 bytes, ALIGNMENT - 8 bytes
External Symbol Table Entry Fields
asym
External symbol table entry. This structure has the same format as a local symbol entry. The field interpretations differ as described in the following entries.
asym.value
Contains the symbol address for most defined symbols. See Section 5.2.11 for details.
asym.iss
Byte offset in external string table to symbol name. Set to issNil (-1) if there is no name for this symbol.
asym.st
Symbol type. See Table 5-3 for possible values.
asym.sc
Storage class. See Table 5-4 for possible values.
asym.reserved
Must be zero.
asym.index
Contains either an index into the auxiliary symbol table for a type description or an index into the local symbol table pointing to a related symbol.
The index field may have a value of indexNil, which is defined as (long)0xfffff. This value is used to indicate that the index is not a valid reference.
jmptbl
Unused.
cobol_main
Flag set to indicate that the symbol is a COBOL main procedure.
weakext
Flag set to identify the symbol as a weak external. See Section 6.3.4.2 for more details on weak symbols.
alignment
Power-of-two byte alignment for common storage class symbols. The field holds an exponent biased by 3, so a field value of n denotes an alignment of 2^(n+3) bytes. Supported values range from 0 through 13, yielding a minimum alignment of 8 bytes and a maximum alignment of 64K bytes. For symbols with storage classes other than scCommon and scSCommon, this field should be ignored.
Version Note The alignment field is supported on Tru64 UNIX V5.1 and greater.
xport
Flag set to indicate the symbol is to be exported from a shared library.
Version Note The
xport
field is reserved for use on Tandem big-endian systems. It is not supported on Tru64 UNIX.
multiext
Flag set to indicate that multiple definitions of the symbol are allowed.
Version Note The
multiext
field is reserved for use on Tandem big-endian systems. It is not supported on Tru64 UNIX.
reserved
Must be zero.
ifd
Index of the file descriptor where the symbol is defined. Set to ifdNil (-1) for undefined symbols and for some compiler system symbols.
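The biased encoding of the alignment field can be decoded with a single shift. The sketch below follows the stated range (0 through 13, giving 8 bytes through 64K bytes) and uses the scCommon and scSCommon values from Table 5-4; the function names are hypothetical.

```c
#define scCommon  17   /* values from Table 5-4 */
#define scSCommon 18

/* Returns 1 if the alignment field is meaningful for this storage class. */
int alignment_applies(unsigned sc)
{
    return sc == scCommon || sc == scSCommon;
}

/* Decodes the alignment field into a byte count.
 * Returns 0 for values outside the supported range 0..13. */
unsigned long common_alignment_bytes(unsigned alignment_field)
{
    if (alignment_field > 13)
        return 0;
    return 1UL << (alignment_field + 3);   /* bias of 3: 0 -> 8 bytes */
}
```

A consumer would check alignment_applies first, because the field should be ignored for every other storage class.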
5.2.7 Relative File Descriptor Entry (RFDT)
The relative file descriptor table provides a post-link mapping of file descriptor indices. The purpose of this table is to minimize work for the linker, which does not update symbol table references to local symbols. This information is used to obtain the file offset used to bias local symbol indices. Because this table is also known as the File Indirect Table, two declarations are included in the sym.h header file, as shown here.
typedef int RFDT, *pRFDT;
typedef int FIT, *pFIT;
SIZE - 4 bytes, ALIGNMENT - 4 bytes
See Section 5.3.2.1 for related information.
5.2.8 Auxiliary Symbol Table Entry (AUXU)
The auxiliary symbol table entry is a 32-bit union. It is either interpreted as a TIR or RNDXR structure or as an integer value. See Section 5.3.7.3 for detailed instructions on reading the auxiliary symbols.
typedef union {
    TIR      ti;
    RNDXR    rndx;
    coff_int dnLow;
    coff_int dnHigh;
    coff_int isym;
    coff_int iss;
    coff_int width;
    coff_int count;
    coff_int slice;    (V5.0a)
} AUXU, *pAUXU;
SIZE - 4 bytes, ALIGNMENT - 4 bytes
See Section 5.3.7.3 for related information.
Auxiliary Symbol Table Entry Fields
ti
Type information record (TIR), as defined in Section 5.2.8.1.
rndx
Relative index into local or auxiliary symbols (RNDXR), as defined in Section 5.2.8.2.
dnLow
Lower bound of range or array dimension. For large structures, two of these fields can be used together to form one 64-bit number.
dnHigh
Upper bound of range or array dimension. For large structures, two of these fields can be used together to form one 64-bit number.
isym
For procedures (stProc or stStaticProc symbols), this field is an index into the local symbols. It is also used as an index into the relative file descriptors.
iss
Unused.
width
Width of a bit field or array stride in bits. Fortran compilers set the array stride to the array element size in bits. Two of these fields can be used together to form one 64-bit number.
count
Count of ranges for variant arm. This field name is only used within the type description of a variant block (stBlock, scVariant).
slice
Reserved.
General Notes:
The fields dnLow, dnHigh, and width must all use either the 32-bit or the 64-bit representation when used together. For example, an array dimension cannot be specified with a 32-bit dnLow and a 64-bit dnHigh.
5.2.8.1 Type Information Record (TIR)
typedef struct {
    coff_uint fBitfield : 1;
    coff_uint continued : 1;
    coff_uint bt : 6;
    coff_uint tq4 : 4;
    coff_uint tq5 : 4;
    coff_uint tq0 : 4;
    coff_uint tq1 : 4;
    coff_uint tq2 : 4;
    coff_uint tq3 : 4;
} TIR, *pTIR;
SIZE - 4 bytes, ALIGNMENT - 4 bytes
Type Information Record Entry Fields
fBitfield
Flag set if bit width is specified.
continued
Flag set to indicate that the type description is continued in another TIR record. This will happen if the type is represented with more than six type qualifiers.
bt
Basic type (see Table 5-5 and Section 5.3.7.1).
tq0, tq1, tq2, tq3, tq4, tq5
Type qualifiers (see Table 5-6 and Section 5.3.7.2). The lower-numbered tq fields must be used first, and all unneeded fields must be set to tqNil (0).
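Because the tq fields are filled from tq0 upward and padded with tqNil, the qualifiers of a single record can be collected by walking the fields in numeric order, which differs from their declaration order in the structure. The following sketch assumes a stand-in type, tir_view, that mirrors the TIR bit layout with plain C types; the function name is hypothetical, and following the continued flag into the next record is left out.

```c
#define tqNil 0

/* Stand-in for TIR using plain C types; illustrative only. */
typedef struct {
    unsigned int fBitfield : 1;
    unsigned int continued : 1;
    unsigned int bt  : 6;
    unsigned int tq4 : 4;
    unsigned int tq5 : 4;
    unsigned int tq0 : 4;
    unsigned int tq1 : 4;
    unsigned int tq2 : 4;
    unsigned int tq3 : 4;
} tir_view;

/* Copies the qualifiers in the order tq0..tq5 into out, stopping at the
 * first tqNil. Returns the number of qualifiers copied (0..6). */
int tir_qualifiers(const tir_view *t, unsigned out[6])
{
    const unsigned all[6] = { t->tq0, t->tq1, t->tq2, t->tq3, t->tq4, t->tq5 };
    int n = 0;

    while (n < 6 && all[n] != tqNil) {
        out[n] = all[n];
        n++;
    }
    return n;
}
```

When the function returns 6 and the continued flag is set, the remaining qualifiers are in a following TIR record, as described for the continued field.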
Table 5-5: Basic Type (bt) Constants
Constant | Value | Description |
btNil | 0 | Undefined or void |
btAdr32 | 1 | Address (32 bits) |
btChar | 2 | Character |
btUChar | 3 | Unsigned character |
btShort | 4 | Short (16 bits) |
btUShort | 5 | Unsigned short (16 bits) |
btInt | 6 | Integer (32 bits) |
btUInt | 7 | Unsigned integer (32 bits) |
btLong32 | 8 | Long (32 bits) |
btULong32 | 9 | Unsigned long (32 bits) |
btFloat | 10 | Floating point |
btDouble | 11 | Double-precision floating point |
btStruct | 12 | Structure or record |
btUnion | 13 | Union |
btEnum | 14 | Enumeration |
btTypedef | 15 | Defined by means of a user-defined type definition |
btRange | 16 | Range of values (for example, Pascal subrange) |
btSet | 17 | Sets (for example, Pascal) |
btComplex | 18 | Single complex (for example, Fortran COMPLEX*8) |
btDComplex | 19 | Double complex (for example, Fortran COMPLEX*16) |
btIndirect | 20 | Indirect definition; following rndx points to an entry in the auxiliary symbol table that contains a TIR (type information record) |
btFixedBin | 21 | Fixed binary (for example, COBOL) |
btDecimal | 22 | Packed or unpacked decimal (for example, COBOL) |
btPicture | 25 | Picture (for example, COBOL) |
btVoid | 26 | Void |
btPtrMem | 27 | Currently unused |
btScaledBin | 27 | Scaled binary (for example, COBOL) |
btVptr | 28 | Virtual function table (for example, C++) |
btArrayDesc | 28 | Array descriptor (for example, Fortran, Pascal) |
btClass | 29 | Class (for example, C++) |
btLong64 | 30 | Long (64 bits) |
btLong | 30 | Long (64 bits) |
btULong64 | 31 | Unsigned long (64 bits) |
btULong | 31 | Unsigned long (64 bits) |
btLongLong | 32 | Long long (64 bits) |
btULongLong | 33 | Unsigned long long (64 bits) |
btAdr64 | 34 | Address (64 bits) |
btAdr | 34 | Address (64 bits) |
btInt64 | 35 | Integer (64 bits) |
btUInt64 | 36 | Unsigned integer (64 bits) |
btLDouble | 37 | Long double floating point (128 bits) |
btInt8 | 38 | Integer (64 bits) |
btUInt8 | 39 | Unsigned integer (64 bits) |
btRange_64 | 41 | (V5.0 - ) 64-bit range |
btProc | 42 | (V5.0 - ) Procedure or function |
btCobolIndex | 43 | (not supported) COBOL index variables |
btReal32 | 44 | (not supported) Tandem float |
btReal64 | 45 | (not supported) Tandem double |
btQComplex | 46 | (V5.1 - ) Quad complex (for example, Fortran COMPLEX*32) |
btChecksum | 63 | Symbol table checksum value stored in auxiliary record |
btMax | 64 | Number of basic type codes |
Table Notes:
btInt and btLong32 are synonymous.
btUInt and btULong32 are synonymous.
btLong, btLong64, btLongLong, btInt64, and btInt8 are synonymous.
btULong64, btULongLong, btUInt64, and btUInt8 are synonymous.
Version Note The basic type constants btCobolIndex, btReal32, and btReal64 are reserved for use on Tandem big-endian systems. They are not supported on Tru64 UNIX.
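A consumer that compares basic types may want to fold the synonymous constants listed in the table notes into one representative per group. The sketch below shows one way to do that, using the values from Table 5-5. The function name bt_canonical and the choice of representative (btInt, btUInt, btLong64, btULong64) are arbitrary assumptions made for illustration.

```c
/* Values copied from Table 5-5; only the synonymous groups are listed. */
enum {
    btInt = 6,       btUInt = 7,
    btLong32 = 8,    btULong32 = 9,
    btLong64 = 30,   btULong64 = 31,
    btLongLong = 32, btULongLong = 33,
    btInt64 = 35,    btUInt64 = 36,
    btInt8 = 38,     btUInt8 = 39
};

/*
 * Maps each synonymous basic type to a single representative so that
 * two type descriptions can be compared with ==.
 * Codes outside the synonymous groups are returned unchanged.
 */
int bt_canonical(int bt)
{
    switch (bt) {
    case btLong32:
        return btInt;
    case btULong32:
        return btUInt;
    case btLongLong:
    case btInt64:
    case btInt8:
        return btLong64;
    case btULongLong:
    case btUInt64:
    case btUInt8:
        return btULong64;
    default:
        return bt;
    }
}
```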
Table 5-6: Type Qualifier (tq) Constants
Constant | Value | Description |
tqNil | 0 | No qualifier (placeholder) |
tqPtr | 1 | Pointer |
tqProc | 2 | (obsolete) Procedure or function |
tqArray | 3 | Array |
tqFar | 4 | 32-bit pointer; used with the -xtaso emulation |
tqVol | 5 | Volatile |
tqConst | 6 | Constant |
tqRef | 7 | Reference |
tqArray_64 | 8 | (V5.0 - ) Large array |
tqHasLen | 9 | (not supported) Length for buffer parameters |
tqShar | 10 | (V5.0a - ) Reserved |
tqSharArr_64 | 11 | (V5.0a - ) Reserved |
tqMax | 16 | Number of type qualifier codes |
Version Note The tqHasLen type qualifier is reserved for use on Tandem big-endian systems. It is not supported on Tru64 UNIX.
5.2.8.2 Relative Symbol Record (RNDXR)
typedef struct {
    coff_uint rfd : 12;
    coff_uint index : 20;
} RNDXR, *pRNDXR;
SIZE - 4 bytes, ALIGNMENT - 4 bytes
Relative Symbol Record Fields
rfd
Index into relative file descriptor table if it exists; otherwise, index into file descriptor table. This field may have a value of ST_RFDESCAPE, defined as 0xfff in the header file cmplrs/stsupport.h. This value is used to indicate that the next auxiliary entry, interpreted as an isym, contains the actual rfd index.
index
Symbol index. Used as an offset from either FDR.isymbase or FDR.iauxbase, depending on context.
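The escape mechanism for the rfd field can be sketched as follows. The AuxEntry union is a hypothetical stand-in for the auxiliary entry type and shows only the two views needed here; coff_uint is assumed to be a 32-bit unsigned type. Bit-field layout is implementation-defined in C, so this sketch is only meaningful when compiled with the same conventions as the producer of the object file.

```c
typedef unsigned int coff_uint;      /* assumption: 32-bit unsigned */

typedef struct {
    coff_uint rfd   : 12;
    coff_uint index : 20;
} RNDXR;

#define ST_RFDESCAPE 0xfff           /* value given in the text above */

/* Hypothetical 4-byte auxiliary entry with only the views used here. */
typedef union {
    RNDXR     rndx;
    coff_uint isym;
} AuxEntry;

/*
 * Decodes the relative index stored at aux[i].
 * Stores the file (rfd) index and symbol index through the out
 * parameters and returns the number of auxiliary entries consumed:
 * 1 normally, 2 when the rfd field holds ST_RFDESCAPE and the real
 * rfd index is taken from the following entry.
 */
int decode_rndx(const AuxEntry *aux, unsigned i,
                unsigned *rfd, unsigned *index)
{
    *index = aux[i].rndx.index;
    if (aux[i].rndx.rfd == ST_RFDESCAPE) {
        *rfd = aux[i + 1].isym;      /* next entry, read as an isym */
        return 2;
    }
    *rfd = aux[i].rndx.rfd;
    return 1;
}
```

Returning the number of entries consumed lets a caller that walks the auxiliary table advance past the extra isym entry.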
5.2.9 String Table
Objects can contain two string tables: the local string table (corresponding to local symbols) and the external string table (corresponding to external symbols). The local string table is present only for objects created with full debugging information; it is removed if an object is locally stripped.
The storage format for the string tables is a list of null-terminated character strings. A string table should be treated as one long character array, not as an array of strings: a string is located by its byte offset, not by its ordinal position. Fields in the symbolic header and file headers represent string table sizes and offsets in bytes.
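Because a string table is one character array addressed by byte offset, a lookup reduces to pointer arithmetic plus a bounds check. The following is a minimal sketch; the function name strtab_get and its parameters are illustrative, and the caller is assumed to have already obtained the table's address and size from the appropriate header fields.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns the null-terminated string that starts at byte offset off
 * in a string table of size bytes, or NULL if the offset is out of
 * range or the string is not terminated inside the table.
 */
const char *strtab_get(const char *table, size_t size, size_t off)
{
    if (off >= size)
        return NULL;
    if (memchr(table + off, '\0', size - off) == NULL)
        return NULL;
    return table + off;
}
```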
5.2.10 Optimization Symbol Entry (PPODHDR)
The optimization symbol table contains information for optimized debugging, basic block profiling, and other miscellaneous procedure-specific data. Each procedure's associated optimization symbol table data begins with an array of PPODHDR structures. See Section 5.3.3 for a description of the optimization symbol table.
Version Note The following structure definition is for Tru64 UNIX V5.0 and greater. It is used for symbol table format V3.13 and greater.
typedef struct {
    coff_uint ppode_tag;
    coff_uint ppode_len;
    coff_ulong ppode_val;
} PPODHDR, *pPPODHDR;
SIZE - 16 bytes, ALIGNMENT - 8 bytes
Optimization Symbol Entry Fields
ppode_tag
Identifies the kind of data described by this entry.
ppode_len
Indicates the size in bytes of the data that is found in the raw data area for this entry. When this field is zero, the entry's only data is stored in the ppode_val field.
ppode_val
This field is either a pointer to the entry's data or is itself the data. If ppode_len is nonzero, this field is a relative file offset from the beginning of the current PPOD (Per-Procedure Optimization Descriptor) to the applicable data area. If ppode_len is zero, this field contains the data for the entry.
A PPOD contains multiple PPODHDRs. A PPODHDR and its associated data are collectively referred to as a PPODE (Per-Procedure Optimization Descriptor Entry). Figure 5-10 in Section 5.3.3 shows several PPODs with multiple PPODHDRs and their data.
Table 5-7: Optimization Tag Values
Name | Value | Description |
PPODE_STAMP | 1 | Version number of the PPOD stored in ppode_val. The current PPOD_VERSION value is 1. |
PPODE_END | 2 | End of entries for this PPOD. |
PPODE_EXT_SRC | 3 | Extended source line information. |
PPODE_SEM_EVENT | 4 | Semantic event information. (Reserved for future use.) |
PPODE_SPLIT | 5 | Split lifetime information. (Reserved for future use.) |
PPODE_DISCONTIG_SCOPE | 6 | Discontiguous scope information. (Reserved for future use.) |
PPODE_INLINED_CALL | 7 | Inlined procedure call information. (Reserved for future use.) |
PPODE_PROFILE_INFO | 8 | Profile feedback information. |
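The relationship between ppode_len and ppode_val can be illustrated with a search over one PPOD. This is a sketch under stated assumptions: coff_uint and coff_ulong are taken to be 32-bit and 64-bit unsigned types (consistent with the 16-byte size given above), the PPOD is assumed to be in memory and terminated by a PPODE_END entry, and the function name ppod_find is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t coff_uint;    /* assumption: 32 bits */
typedef uint64_t coff_ulong;   /* assumption: 64 bits */

typedef struct {
    coff_uint  ppode_tag;
    coff_uint  ppode_len;
    coff_ulong ppode_val;
} PPODHDR;

/* Tag values from Table 5-7 (only those used in this sketch). */
enum { PPODE_STAMP = 1, PPODE_END = 2, PPODE_EXT_SRC = 3 };

/*
 * Searches the PPOD that starts at ppod for the first entry with the
 * given tag. On success, returns the entry header and sets *data to
 * the entry's raw data, or to NULL when ppode_len is zero and the
 * data is held in ppode_val itself. Returns NULL if PPODE_END is
 * reached first.
 */
const PPODHDR *ppod_find(const void *ppod, coff_uint tag,
                         const void **data)
{
    const PPODHDR *h = (const PPODHDR *)ppod;

    for (; h->ppode_tag != PPODE_END; h++) {
        if (h->ppode_tag == tag) {
            /* A nonzero length means ppode_val is an offset from
             * the beginning of this PPOD to the raw data. */
            *data = h->ppode_len
                        ? (const char *)ppod + h->ppode_val
                        : NULL;
            return h;
        }
    }
    return NULL;
}
```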
5.2.11 Symbol Type and Class (st/sc) Combinations
Entries in the symbol table are primarily identified by the combination of their symbol type (st) and storage class (sc) values. Not all combinations are valid. Figure 5-3 indicates which combinations are currently in use.
Figure 5-3: st/sc Combination Matrix
[Matrix not reproduced. Rows are the symbol types (Alias, Base, Block, Constant, End, Expr, File, Forward, Global, Inter, Label, Local, Member, Module, Modview, Namespace, Nil, Number, Param, Proc, RegReloc, Split, StaParam, Static, StaticProc, Str, Tag, Type, Typedef, Using, VirtBase); columns are the storage classes; an X marks each combination in use.]
A symbol's type and class, taken together, determine the interpretation of the other fields in the symbol table entry. The same combination can be used for different purposes in different contexts. As a result, understanding a symbol entry may also require access to type information in the auxiliary table or to the source language information in the file descriptor.
The contents of the value and index fields for each combination, with a brief explanation of the symbol's use, are described in the following list of combinations. For many combinations, greater detail can be found in Section 5.3.7 and Section 5.3.8.
stGlobal/scAbs
The value field contains an absolute value. The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is a global absolute value.
stGlobal/scSData, stGlobal/scData, stGlobal/scSBss, stGlobal/scBss, stGlobal/scRData, stGlobal/scRConst
The value field is the symbol's address. The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is a defined global variable.
stGlobal/scTlsData, stGlobal/scTlsBss
The value field is the offset from the base of the object's TLS region. The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is a defined global TLS variable.
stGlobal/scSCommon, stGlobal/scCommon, stGlobal/scTlsCommon
The value field is the symbol's size in bytes. The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is a common.
stGlobal/scSUndefined, stGlobal/scUndefined, stGlobal/scTlsUndefined
The value field is zero in linked objects. In relocatable objects, the value field is ignored. (Some compilers store the size in bytes of the global variable in the value field.) The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is an undefined global variable.
stStatic/scAbs
The value field is an absolute value. The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is an absolute value with static scope.
stStatic/scSData, stStatic/scData, stStatic/scSBss, stStatic/scBss, stStatic/scRData, stStatic/scRConst
The value field is the symbol's address. The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is a defined static variable.
stStatic/scTlsData, stStatic/scTlsBss
The value field is an offset from the base of the object's TLS region. The index field is an auxiliary table index or indexNil if there is no type information.
This symbol is a defined static TLS variable.
stStatic/scCommon
The value field is zero. The index field is an auxiliary table index or indexNil if there is no type information.
stStatic/scInfo
The value field is zero. The index field is an auxiliary table index.
This symbol is a C++ static data member.
stParam/scAbs
The value field is an offset from the virtual frame pointer. The index field is an auxiliary table index.
This symbol is a parameter stored on the stack.
stParam/scRegister
The value field is the number of the register containing the parameter. The index field is an auxiliary table index.
This symbol is a parameter stored in a register.
stParam/scVar
The value field is an offset from the virtual frame pointer to the parameter's address. The index field is an auxiliary table index.
This symbol is a parameter stored on the stack. One level of indirection is required to access the parameter's value.
stParam/scVarRegister
The value field is the number of the register containing the address of the parameter. The index field is an auxiliary table index.
This symbol is a parameter whose address is held in a register. One level of indirection is required to access the parameter's value.
stParam/scInfo
The value field is zero. The index field is an auxiliary table index.
This symbol is a parameter of a C++ member function, function pointer definition, or procedure with no code.
stParam/scSData, stParam/scData, stParam/scSBss, stParam/scBss, stParam/scRData, stParam/scRConst
The value field is the address of the parameter. The index field is an auxiliary table index.
Version Note Static parameters are supported in symbol table format V3.13 and greater.
stParam/scUnallocated
The value field is zero. The index field is an auxiliary table index.
This is an unallocated parameter.
stLocal/scAbs
The value field is an offset from the virtual frame pointer. The index field is an auxiliary table index.
This symbol is a local variable stored on the stack.
stLocal/scRegister
The value field is the number of the register containing the variable. The index field is an auxiliary table index.
This symbol is a local variable stored in a register.
stLocal/scVar
The value field is an offset from the virtual frame pointer to the symbol's address. The index field is an auxiliary table index.
This symbol is a local variable stored on the stack. One level of indirection is required to access its value.
stLocal/scVarRegister
The value field is the number of the register containing the address of this variable. The index field is an auxiliary table index.
This symbol is a local variable whose address is held in a register. One level of indirection is required to access its value.
stLocal/scUnallocated
The value field is zero. The index field is an auxiliary table index.
This is an unallocated local variable.
Version Note The use of scUnallocated is supported in symbol table format V3.13 and greater.
stLocal/scText, stLocal/scInit, stLocal/scFini, stLocal/scSData, stLocal/scData, stLocal/scSBss, stLocal/scBss, stLocal/scRData, stLocal/scRConst, stLocal/scTlsData, stLocal/scTlsBss
The value field is the address of the section indicated by the storage class. The index field is indexNil.
These are special symbols inserted by the linker for shared objects. They are found in the external symbol table and their names are the section names (for example, .text or .init).
stLabel/scAbs
The value field is the symbol's value. This may be either a numeric constant or absolute address. The index field is indexNil.
stLabel/scText, stLabel/scInit, stLabel/scFini, stLabel/scSData, stLabel/scData, stLabel/scXData, stLabel/scPData, stLabel/scSBss, stLabel/scBss, stLabel/scRData, stLabel/scRConst, stLabel/scTlsData, stLabel/scTlsBss
The value field is the label's value (an address). The index field is indexNil.
This symbol is an allocated label. It can be associated with any raw data section of the object file.
stLabel/scUnallocated
The value field is zero. The index field is indexNil.
This symbol is an unallocated label.
stProc/scNil
The value field is zero. The index field is indexNil.
This symbol can be ignored. Compilers may produce this type/class combination for procedures that have been optimized away and that don't require debug information. The linker removes these symbols from the external symbol table in linked objects.
stProc/scText
The value field is the procedure's address. This symbol can occur in the external or local symbol table:
In the local symbol table, the index field is an auxiliary table index.
In the external symbol table, the index field is the local symbol index of the corresponding procedure symbol in the local symbol table, unless the file is stripped of local symbol information. If the file is locally stripped, the index field is indexNil.
This symbol is a defined procedure.
stProc/scUndefined
The value field is zero. The index field is indexNil.
This symbol is an undefined procedure.
stProc/scInfo
The value field contains a value of:
-1 (a procedure with no code)
A non-negative index into the virtual function table for this function, for a C++ virtual member function.
Version Note The use of -1 and -2 in the value field is supported in symbol table format V3.13 and greater.
The index field is an auxiliary table index.
This symbol represents a procedure without code, a function prototype, or a function pointer. The value field is used to distinguish among these possibilities.
stBlock/scText
The value field depends on context:
If this is the first stBlock/scText symbol following an stProc/scText symbol, the value is the byte offset from the procedure's address to the address of the first instruction beyond the end of the procedure's prologue.
Otherwise, it is the byte offset from the procedure's address to the starting instruction address of the block.
The index field is the local symbol index of the symbol following the matching stEnd. If this is the first stBlock/scText following an stProc/scText for an alternate entry point, the index field will be set to indexNil because the symbol will not have a matching stEnd symbol.
Version Note The use of stBlock/scText for alternate entry points is supported in symbol table format V3.13 and greater.
stBlock/scInfo
The value field depends on context:
Size in bytes for a class, structure, or union.
Zero for the block scope of a procedure with no code.
The index field is the local symbol index of the symbol following the matching stEnd.
This symbol indicates the start of a structure, union, or enumeration definition (in C; the C++ representation differs). It describes a variant arm if it is inside an stBlock/scVariant scope. This symbol is also used to define the block scope of a procedure with no code.
stBlock/scCommon
The value field is the size of the common block in bytes. The index field is the local symbol index of the symbol following the matching stEnd.
This symbol is a scoping symbol for a Fortran common block. It occurs in the context of the synthesized file used to define a common block.
stBlock/scVariant
The value field is the local symbol index of the structure member whose value determines which variant range is used. The index field is the local symbol index of the symbol following the matching stEnd.
This symbol occurs in the context of Pascal and Ada variant records. It indicates the start of the symbols for one variant.
stBlock/scFileDesc, stBlock/scReportDesc
The value field is zero. The index field is the local symbol index of the symbol following the matching stEnd.
This symbol occurs in COBOL only. It indicates the start of the file or report descriptor scope.
stEnd/scText
The value field depends on the type of scope it is ending. It is:
The size in bytes of the procedure's text (for a procedure).
Byte offset from a procedure's address to the start of the epilogue (for the outermost text block in a procedure).
Byte offset from a procedure's address to the first instruction address beyond the end of the block (for a text block).
Zero (for a file).
The index field is the local symbol index of the matching stBlock, stProc, or stFile.
This symbol ends a file, procedure, or text block scope.
stEnd/scInfo
The value field is zero. The index field is the local symbol index of the matching stBlock or stNamespace.
If the matching symbol is an stBlock, this symbol ends a structure, union, enumeration, C++ member function definition, procedure with no code, or the block scope contained by a procedure with no code. If the matching symbol is an stNamespace, this symbol ends a namespace definition.
stEnd/scCommon
The value field is zero. The index field is the local symbol index of the matching stBlock.
This symbol ends a Fortran common definition.
stEnd/scVariant
The value field is the same as that of the matching stBlock. The index field is the local symbol index of the matching stBlock.
This symbol ends a variant record block.
stEnd/scFileDesc, stEnd/scReportDesc
The value field is zero. The index field is the local symbol index of the matching stBlock.
This symbol ends a file or report descriptor block.
stMember/scInfo
The value field depends on the symbol's data type:
The ordinal value (for an element of an enumerated type).
Zero (for a namespace or union member).
Bit offset from the beginning of the structure (for a C structure or C++ class member).
The index field is an auxiliary table index.
This symbol describes a data structure field or the member of a namespace. It is found inside a block defining a data structure (for example, class or struct) or a namespace definition block.
stMember/scFileDesc, stMember/scReportDesc
The value field is zero or one, depending on whether the symbol is local or external, respectively. The index field is an auxiliary table index.
This symbol occurs in COBOL only. It is found inside a file descriptor or report descriptor block.
stTypedef/scInfo
The value field depends on the purpose of this symbol:
Zero (for a user-defined type definition).
The auxiliary table index of the next auxiliary entry after the start of the class definition (for a compiler-inserted symbol). In effect, the value is the contents of the index field plus one.
The index field is an auxiliary table index.
This symbol is a user-chosen name for a data type. It also appears as a compiler-inserted symbol following the stTag/scInfo symbol for a C++ opaque class or structure.
stFile/scText
The value field is zero. The index field is the local symbol index of the symbol following the matching stEnd.
stStaticProc/scText
The value field is the procedure's address. The index field is an auxiliary table index.
This symbol is a defined static procedure.
stStaticProc/scInit, stStaticProc/scFini
The value field is the procedure address. The index field is an auxiliary table index.
These combinations are used for the special symbols __istart and __fstart, which are inserted by the linker.
stConstant/scAbs
The value field is the value of the constant. The index field is an auxiliary table index.
This symbol represents a named value (for example, Fortran PARAMETER).
stConstant/scSData, stConstant/scData, stConstant/scSBss, stConstant/scBss, stConstant/scRData, stConstant/scRConst
The value field is the symbol's address. The index field is an auxiliary table index.
This symbol represents allocated constant data.
stBase/scInfo
The value field is the offset of the base class relative to a derived class. The index field is an auxiliary table index.
This symbol is a C++ base class. It is found inside a block defining a data structure (for example, class or struct).
stVirtBase/scInfo
The value field is an index (starting at 1) of the base class run-time description in the virtual base class table. See Section 5.3.8.6.3. The index field is an auxiliary table index.
This symbol is a C++ virtual base class. It is found inside a block defining a data structure (for example, class or struct).
stTag/scInfo
The value field is zero. The index field is an auxiliary table index.
This symbol is a C++ class, structure, or union. See Section 5.3.8.6. Note that the representation for C structures and unions (Section 5.3.8.3) is different.
stInter/scInfo
The value field is zero. The index field is an auxiliary table index.
This symbol is used in C++ to connect the definition of a member function with its prototype in the class definition context.
stNamespace/scInfo
The value field is zero. The index field is the local symbol index of the symbol following the matching stEnd.
This symbol indicates the start of the symbols in a namespace definition.
Version Note Namespace symbols are supported in symbol table format V3.13 and greater.
stUsing/scInfo
The value field is zero. The index field is an auxiliary table index.
This symbol specifies a C++ namespace (or portion thereof) that is being imported into another scope.
Version Note Namespace USING directives are supported in symbol table format V3.13 and greater.
stAlias/scInfo
The value field is zero. The index field is an auxiliary table index.
Version Note Namespace aliases are supported in symbol table format V3.13 and greater.
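Because the index field of a scope-opening symbol (such as stBlock) identifies the symbol that follows the matching stEnd, a reader can step over a nested scope without scanning its contents. The sketch below counts the symbols directly inside one scope. The Sym structure, the ST_* codes, and the INDEX_NIL encoding are hypothetical placeholders rather than the real definitions, and the index values are assumed to be relative to the start of the array passed in.

```c
#include <stddef.h>

/* Placeholder symbol type codes; not the real st values. */
enum { ST_OTHER, ST_BLOCK, ST_END };

#define INDEX_NIL 0xfffffu   /* assumed encoding of indexNil */

/* Hypothetical reduced view of a local symbol entry. */
typedef struct {
    int      st;     /* symbol type */
    unsigned index;  /* index field, relative to the array start */
} Sym;

/*
 * Counts the symbols that lie directly inside the scope opened by
 * syms[open]. A nested block counts as one symbol; its contents are
 * skipped by jumping to the symbol after its matching stEnd.
 * Blocks whose index is INDEX_NIL have no matching stEnd and are
 * stepped over like ordinary symbols.
 */
size_t count_direct_members(const Sym *syms, size_t open)
{
    size_t n = 0;
    size_t end = syms[open].index - 1;   /* the matching stEnd */
    size_t i = open + 1;

    while (i < end) {
        n++;
        if (syms[i].st == ST_BLOCK && syms[i].index != INDEX_NIL)
            i = syms[i].index;           /* skip the nested scope */
        else
            i++;
    }
    return n;
}
```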
Combinations may be valid in the local symbol table, the external symbol table, or both. Table 5-8 shows which combinations are valid in which table, based on the symbol type value and also the storage class value where necessary. Where the storage class is shown as the wildcard character '*', only the combinations previously specified as valid apply.
Table 5-8: Valid Placement for st/sc Combinations
st/sc Combination | External Symbol Table | Local Symbol Table |
stNil, sc* | X | X |
stGlobal, sc* | X | |
stStatic, sc* | | X |
stParam, sc* | | X |
stLocal, scSCN (see Table Notes) | X | |
stLocal, not scSCN (see Table Notes) | | X |
stLabel, sc* | X | X |
stProc, scInfo | | X |
stProc, scText | X | X |
stProc, scUndefined | X | |
stBlock, sc* | | X |
stEnd, sc* | | X |
stMember, sc* | | X |
stTypedef, sc* | | X |
stFile, sc* | | X |
stStaticProc, scText | | X |
stStaticProc, scInit/scFini | X | |
stConstant, sc* | X | X |
stBase, sc* | | X |
stVirtBase, sc* | | X |
stTag, sc* | | X |
stInter, sc* | | X |
stNamespace, sc* | | X |
stUsing, sc* | | X |
stAlias, sc* | | X |
Table Notes:
scSCN is a section storage class: scData, scSData, scBss, scSBss, scRConst, scRData, scInit, scFini, scText, scXData, scPData, scTlsData, scTlsBss.
5.3 Symbol Table Usage
5.3.1 Levels of Symbolic Information
Different levels of symbolic information can be stored with an object file. Compilers often provide options that allow the user to choose the desired level of symbolic information for their program. This choice may be influenced by size considerations and debugging needs. A trade-off exists between the benefit of saving space in the object file and the amount of information available to tools that consume symbolic information.
It is also possible to change the amount of symbolic information present in a program that has already been compiled and linked. Information can be added or deleted. Two of the most common and useful operations are locally stripping and fully stripping the symbol tables in executable files. Tools that modify linked executables, such as instrumentation tools and code optimizers, may rewrite parts of the symbol table to reflect changes that they made.
5.3.1.1 Compilation Levels
The representation of symbolic information supported by compilers can be broken down into four levels:
Minimal: Only information required for linking
Limited: Source file and line number information for profiling and limited debugging (stack-tracing)
Full: Complete debugging information for non-optimized code
Optimized: Debugging information for optimized code
These levels correspond to the system compiler switches -g0 (minimal), -g1 (limited), -g2 (full), and -g3 (optimized). Table 5-9 shows the symbol table sections that are produced by system compilers at each compilation level.
Table 5-9: Symbol Table Sections Produced at Various Compilation Levels
Symbol Table Section | Minimal | Limited | Full | Optimized |
Symbolic header | Yes | Yes | Yes | Yes |
File Descriptors | Yes | Yes | Yes | Yes |
External Symbols | Yes | Yes | Yes | Yes |
External Strings | Yes | Yes | Yes | Yes |
Procedure Descriptors | Yes | Yes | Yes | Yes |
Line Numbers | No | Yes | Yes | Yes |
Relative File Descriptors | No | No | Yes | Yes |
Optimization Symbols | No | Partial | Yes | Yes |
Local Symbols | No | Partial | Yes | Yes |
Local Strings | No | Partial | Yes | Yes |
Auxiliary Symbols | No | Partial | Yes | Yes |
The minimal level of symbolic information that may be produced during compilation includes only the symbol information required for the linker to function properly. This includes external symbol information that is needed to perform symbol resolution and relocation.
If the limited level of symbolic information is requested, line number entries are generated, as well as external symbol information and procedure descriptors. In addition, local symbols for procedures (and the corresponding auxiliary symbols, optimization symbols, and local strings) are present. Limited symbolic information is sufficient to meet the needs of profiling tools. The information present at this level is a subset of that required for full debugger support.
If full symbolic information is included, all symbol table sections are produced in full. This level enables full debugging support with complete type descriptions for local and external symbols. Optimization is disabled.
Optimized symbolic information is designed to balance the aims of performance and debugging capabilities. This level supplies the same information as the full debugging option, but it also allows all compiler optimizations. As a result, some of the correlation is lost between the source code and the executable program.
On Tru64 UNIX systems, users can choose to compile their programs with any one of the four levels of symbolic information. The options -g0, -g1, and -g2 specify increasing levels of symbolic information. The system compiler's default is to produce the minimal level (-g0). Currently, debugging of optimized code (-g3) is not fully supported. See cc(1) for more details.
5.3.1.2 Locally Stripped Images
Objects can be produced with only global symbolic information stored in the symbol table. Selection of the -x option causes the linker to create a locally-stripped object. Reasons for stripping local symbolic information include reducing file size and limiting the amount of symbolic information available to end users of an application.
A locally-stripped object is very similar to an object produced with minimal symbolic information (see Section 5.3.1.1). The difference is the consolidation of file descriptors, which the linker does only for locally-stripped objects.
In a locally-stripped image, the file descriptors are included solely for the purpose of identifying source file languages. One file descriptor is present for each source language involved in the compilation. These file descriptors have their adr field set to addressNil, indicating that the file descriptors cannot be used to identify text addresses.
Version Note The preceding use of addressNil is supported in symbol table format V3.13 and greater. In symbol table formats less than V3.13, the file descriptor adr value should be ignored.
The procedure descriptor table is present in full but is rearranged to group procedures by source language. All procedure descriptors for procedures written in a particular source language are thus contiguous, and they reflect the file descriptor's information.
External symbols are also present in a locally-stripped image. The file indices (ifd field) of the external symbols are updated to identify the generic file descriptor for the appropriate source language. The index fields are set to indexNil to indicate that no type information is available. External symbols with the storage class scNil are removed. These are debugging symbols that are not normally produced for minimal symbol tables.
Limited debugging is possible with locally-stripped objects. Because the procedure descriptors are retained, stack traces are possible. External symbol information can also be viewed, and language-dependent handling of symbols (for example, C++ name demangling) is preserved.
A linked executable file can be locally stripped at any time after its creation using the command ostrip -x. The output is the same as described above. This operation may also alter the raw data of the .comment section. See Chapter 7 for details.
5.3.1.3 (Fully) Stripped Images
Executable files may be fully stripped at any time after creation using either the strip command or the command ostrip -s. Stripping an executable results in complete removal of the symbol table, including the symbolic header. The file header fields f_symptr and f_nsyms are set to zero to indicate that the file has been stripped. This operation may also alter the raw data of the .comment section. See Chapter 7 for details.
5.3.2 Source Information
The final executable image for a program bears little resemblance to the source code files from which it was created. One of the principal functions of the symbol table is to track the relationship between the two so that the debugger is able to describe the resulting program in a way that the programmer can recognize.
5.3.2.1 Source Files
Much of the complication of source information stems from the "include" system. When a compilation involves several source files, there may be duplication of the header files included in each source file, or of the source files themselves. To avoid repetition of header file information in the linked object, the linker merges the input objects' included files wherever possible. Compilers mark file descriptors as mergeable or unmergeable. The linker then examines the input file descriptors and performs the merge whenever possible.
The linker considers two file descriptors to be mergeable if all of the following criteria are met:
The file descriptor fMerge bit is set in both (marked as mergeable by compiler).
Files have the same name.
Files are written in the same language.
Files contain the same number of local and auxiliary symbols.
Checksums match.
The checksums match if either:
Neither file's first auxiliary record is a btChecksum.
Both files' first auxiliary record is a btChecksum and they are identical.
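The merge criteria above can be sketched as a single predicate. This is only an illustration: struct fd_view and its field names are hypothetical simplifications introduced here, not the actual file descriptor layout, and the checksum is reduced to a single integer for brevity.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified view of a file descriptor (not the real layout). */
struct fd_view {
    bool          f_merge;       /* fMerge bit set by the compiler */
    const char   *name;          /* file name */
    int           lang;          /* source language code */
    long          csym;          /* number of local symbols */
    long          caux;          /* number of auxiliary symbols */
    bool          has_checksum;  /* first auxiliary record is a btChecksum */
    unsigned long checksum;      /* checksum value, valid if has_checksum */
};

/* Checksums match if neither file has one, or both have identical ones. */
static bool checksums_match(const struct fd_view *a, const struct fd_view *b)
{
    if (!a->has_checksum && !b->has_checksum)
        return true;
    if (a->has_checksum && b->has_checksum)
        return a->checksum == b->checksum;
    return false;                /* only one file carries a checksum */
}

/* True only when every criterion in the list is met. */
bool fds_mergeable(const struct fd_view *a, const struct fd_view *b)
{
    return a->f_merge && b->f_merge
        && strcmp(a->name, b->name) == 0
        && a->lang == b->lang
        && a->csym == b->csym
        && a->caux == b->caux
        && checksums_match(a, b);
}
```

Note that a file with a checksum never merges with a file that lacks one, because neither of the two matching conditions holds in that case.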
The role of the relative file descriptor (RFD) tables is to track file-relative information after merging. A relative file descriptor table entry maps the index of each file at compile time to its index after linking. After linking, local or auxiliary symbols must be accessed through the RFD table to obtain the updated file descriptor index. This mechanism is necessary because the indices in the local symbol table are not updated when files are merged.
Figure 5-4 is an example of the use of the relative file descriptor table.
Figure 5-4: Relative File Descriptor Table Example
For a symbol reference composed of a file index and symbol index (offset within file), the relative file descriptor table is used as follows:
Look up the given file index in the RFD table to get the updated file index.
Look up the new file index in the (merged) file descriptor table to get the base of symbols for that file.
Add the symbol index to the file's base to access the symbol entry.
See Section 5.3.7.3 for the representation of relative indices in the auxiliary symbol table.
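The three lookup steps can be sketched as follows. The function and parameter names are hypothetical: the RFD entries for the referencing file are assumed to be available as a plain array of post-link file indices, and the merged file descriptor table is reduced to an array holding each file's base index in the local symbol table.

```c
#include <stddef.h>

/* Resolve a (file index, symbol index) pair to an index into the merged
 * local symbol table. Returns -1 if an index is out of range.
 *   rfd        RFD entries: compile-time file index -> post-link file index
 *   isym_base  per-file base of local symbols, indexed by post-link file */
long resolve_symbol(const long *rfd, size_t rfd_count,
                    const long *isym_base, size_t fd_count,
                    size_t file_index, long sym_index)
{
    if (file_index >= rfd_count)
        return -1;
    long new_file = rfd[file_index];          /* step 1: updated file index */
    if (new_file < 0 || (size_t)new_file >= fd_count)
        return -1;
    long base = isym_base[new_file];          /* step 2: file's symbol base */
    return base + sym_index;                  /* step 3: symbol entry index */
}
```

For example, if compile-time file 1 maps to merged file 3 and that file's symbols start at index 120, symbol 5 of file 1 resolves to entry 125.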
5.3.2.2 Line Number Information
For a debugger to be effective, a connection must be made between high-level-language statements in source files and the executable machine instructions in object files. Line number entries map executable instructions to source lines. This mapping allows a debugger to present to a programmer the line of source code that corresponds to the code being executed. The line number information is produced by the compiler and should be rewritten if an application such as an instrumentation tool or an optimizer modifies code.
Line number information is emitted in two forms, one found in the line number table and one in the optimization symbol table (see Section 5.3.3).
The line number information found in the optimization symbol table is referred to as ESLI (extended source location information). This is a new form of line number that augments the information in the line number table. ESLI will only be present for procedures that cannot be described accurately by entries in the line number table.
Version Note In symbol table formats less than V3.13 line number information is found exclusively in the line number table.
5.3.2.2.1 The Line Number Table
Line number information is generated for each source file that contributes executable code to a program. Within each source file, line numbers are organized by procedure, in the order of appearance in the file. The line number symbol table section is produced only when a program is compiled with limited or greater symbolic information (see Section 5.3.2.2).
Figure 5-5 illustrates the organization of the line number table.
The order outlined in Figure 5-5 is not guaranteed to match the ordering of file descriptors or procedure descriptors in those tables.
The starting offset for a procedure's line table entries can be computed by adding the procedure descriptor's cbLineOffset to the containing file descriptor's cbLineOffset. The count of line number entries for a specific procedure can only be determined by finding the starting offset of the next procedure's entries in the line number table. This calculation is illustrated by the proc_pline_count() function in the packed line number programming example in Section 10.1.
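As a rough sketch of this calculation (not the proc_pline_count() function from Section 10.1), the helpers below compute a procedure's starting offset and the number of packed bytes it owns. The names are hypothetical, and the sketch assumes the caller knows the total size in bytes of the file's line number entries, which bounds the last procedure in the file. Because table order need not match descriptor order, the "next" procedure is taken to be the one with the smallest larger offset.

```c
#include <stddef.h>

/* Offset, from the start of the line number table, of a procedure's
 * first packed entry. */
unsigned long proc_line_start(unsigned long fd_cbLineOffset,
                              unsigned long pd_cbLineOffset)
{
    return fd_cbLineOffset + pd_cbLineOffset;
}

/* Number of packed bytes owned by procedure `which` within one file.
 *   pd_offsets       cbLineOffset of each procedure in the file
 *   file_line_bytes  assumed: total bytes of line entries for the file */
unsigned long proc_line_bytes(const unsigned long *pd_offsets, size_t count,
                              size_t which, unsigned long file_line_bytes)
{
    unsigned long start = pd_offsets[which];
    unsigned long end = file_line_bytes;   /* default: runs to end of file */

    for (size_t i = 0; i < count; i++)
        if (pd_offsets[i] > start && pd_offsets[i] < end)
            end = pd_offsets[i];           /* closest following procedure */
    return end - start;
}
```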
Alternate entry points have a starting line number, but they have no specific ending line number. Procedure descriptors for a procedure and each of its associated alternate entry points share a common end offset in the line number table. See Section 5.3.6.7 for more information on alternate entry points.
The line number table has two forms. The "packed" form is used in the object file. The "expanded" form is a more useful representation to programmers and can be derived algorithmically (or by API) from the packed form.
The packed line numbers are stored as bytes. Each packed entry within the single byte value consists of two parts: count and delta. The count is the number of instructions generated from a source line. The delta is the number of source lines between the current source line and the previous one that generated executable instructions.
Figure 5-6 shows how these two values are represented.
Figure 5-6: Line Number Byte Format
The four-bit count is interpreted as an unsigned value between 1 and 16 (0 means 1, 1 means 2, and so forth). The encoding is biased by one because a zero count is never needed: when no instructions are generated for a source line, no line number entry exists for that line.
The four-bit delta is interpreted as a signed value in the range -7 to +7. Code generators may produce instructions that are not in the same order as the corresponding source lines. Therefore, the offset to the "next" source line may be a forward or backward jump.
Either of these quantities may fall outside the representable range. For a delta outside the range, an extended format exists (as shown in Figure 5-7). This extended format can represent delta values in the range -32768 to 32767. Delta values outside of this range are not representable. This is a permanent restriction of the packed line number format.
Figure 5-7: Line Number 3-Byte Extended Format
For a count outside the range, one or more additional entries follow, with the delta set to zero.
If both fields are out of range, the delta is handled first. An extended-format delta representation is followed by an entry with the delta bits set to zero and the remainder of the count contained in the count value.
The packed line number format can be expanded to produce the instruction-to-source-line mapping that is needed for debugging. A sample program is provided in Section 10.1 to illustrate interpretation of packed line numbers.
The following source listing of a file named lines.c provides an example that shows how the compiler assigns line numbers:
 1  #include <stdio.h>
 2  main()
 3  {
 4      char c;
 5
 6      printf("this program just prints input\n");
 7      for (;;) {
 8          if ((c =fgetc(stdin)) != EOF) break;
 9          /* this is a greater than 7-line comment
10           * 1
11           * 2
12           * 3
13           * 4
14           * 5
15           * 6
16           * 7
17           */
18          printf("%c", c);
19      } /* end for */
20  } /* end main */
The compiler generates line numbers only for lines 2, 6, 8, 18, 19, and 20; the other lines are blank, contain only comments or declarations, or generate no instructions.
Table 5-10 shows the packed entries' interpretation for each source line.
Table 5-10: Line Number Example
Source Line | LINER contents | Interpretation |
2 | 03 | Delta 0, count 4 |
6 | 44 | Delta 4, count 5 |
8 | 29 | Delta 2, count 10 |
18 [1] | 88 00 0a | Delta 10, count 9 |
19 | 10 | Delta 1, count 1 |
20 | 14 | Delta 1, count 5 |
Table Note:
[1] Extended format (delta is greater than 7 lines).
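A minimal decoder for the packed format might look like the sketch below. It is not the sample program from Section 10.1. The bit layout is an assumption inferred from Table 5-10, since Figures 5-6 and 5-7 are not reproduced here: the delta is taken from the high four bits, the count from the low four bits, a delta nibble of 8 (the one signed value outside -7 to +7) marks the extended format, and the 16-bit extended delta is stored high-order byte first. The function name and the caller-supplied starting line are also hypothetical.

```c
#include <stddef.h>

/* Expand packed line numbers into one source line per instruction.
 * Returns the number of instructions written, or -1 if the input is
 * truncated or the output buffer is too small. */
long expand_packed_lines(const unsigned char *packed, size_t len,
                         long start_line, long *lines, size_t max_lines)
{
    long line = start_line;
    size_t out = 0;
    size_t i = 0;

    while (i < len) {
        unsigned char b = packed[i++];
        int count = (b & 0x0f) + 1;      /* low nibble: 0 means 1 */
        int delta = b >> 4;              /* high nibble: signed delta */

        if (delta == 8) {                /* escape: extended 3-byte format */
            if (len - i < 2)
                return -1;
            delta = (packed[i] << 8) | packed[i + 1];
            if (delta >= 0x8000)
                delta -= 0x10000;        /* sign-extend the 16-bit value */
            i += 2;
        } else if (delta > 8) {
            delta -= 16;                 /* sign-extend the 4-bit value */
        }

        line += delta;
        for (int k = 0; k < count; k++) {
            if (out >= max_lines)
                return -1;
            lines[out++] = line;         /* entry k maps to this line */
        }
    }
    return (long)out;
}
```

A count larger than 16 needs no special handling here: the additional entries carry a delta of zero, so they simply append more instructions to the same source line.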
The compiler generates the following instructions for the example program:
[lines.c: 2] 0x0: ldah gp, 1(t12)
[lines.c: 2] 0x4: lda gp, -32592(gp)
[lines.c: 2] 0x8: lda sp, -16(sp)
[lines.c: 2] 0xc: stq ra, 0(sp)
[lines.c: 6] 0x10: ldq a0, -32720(gp)
[lines.c: 6] 0x14: ldq t12, -32728(gp)
[lines.c: 6] 0x18: jsr ra, (t12), printf
[lines.c: 6] 0x1c: ldah gp, 1(ra)
[lines.c: 6] 0x20: lda gp, -32620(gp)
[lines.c: 8] 0x24: ldq a0, -32736(gp)
[lines.c: 8] 0x28: ldq t12, -32744(gp)
[lines.c: 8] 0x2c: jsr ra, (t12), fgetc
[lines.c: 8] 0x30: ldah gp, 1(ra)
[lines.c: 8] 0x34: lda gp, -32640(gp)
[lines.c: 8] 0x38: and v0, 0xff, t0
[lines.c: 8] 0x3c: stq v0, 8(sp)
[lines.c: 8] 0x40: xor t0, 0xff, t0
[lines.c: 8] 0x44: bne t0, 0x6c
[lines.c: 18] 0x48: ldq t2, 8(sp)
[lines.c: 18] 0x4c: sll t2, 0x38, t2
[lines.c: 18] 0x50: sra t2, 0x38, a1
[lines.c: 18] 0x54: ldq a0, -32752(gp)
[lines.c: 18] 0x58: ldq t12, -32728(gp)
[lines.c: 18] 0x5c: jsr ra, (t12), printf
[lines.c: 18] 0x60: ldah gp, 1(ra)
[lines.c: 18] 0x64: lda gp, -32688(gp)
[lines.c: 19] 0x68: br zero, 0x24
[lines.c: 20] 0x6c: bis zero, zero, v0
[lines.c: 20] 0x70: ldq ra, 0(sp)
[lines.c: 20] 0x74: lda sp, 16(sp)
[lines.c: 20] 0x78: ret zero, (ra), 1
[lines.c: 20] 0x7c: call_pal halt
After expanding packed line numbers, the following instruction-to-source mapping (formatted as instruction number. source line number) is produced by odump with the -l option:
0. 2 1. 2 2. 2
3. 2 4. 6 5. 6
6. 6 7. 6 8. 6
9. 8 10. 8 11. 8
12. 8 13. 8 14. 8
15. 8 16. 8 17. 8
18. 18 19. 18 20. 18
21. 18 22. 18 23. 18
24. 18 25. 18 26. 19
27. 20 28. 20 29. 20
30. 20 31. 20
Header files included in an object have no associated line numbers recorded in the symbol table. Line number information for included files containing source code is not supported by the packed line number format. The following section describes a more comprehensive line number representation that includes line number information for header files.
5.3.2.2.2 Extended Source Location Information (ESLI)
Version Note ESLI is supported for symbol table format V3.13 and greater.
The line number table does not correctly describe optimized code or programs with untraditional source files, resulting in images that are difficult to debug. Extended Source Location Information (ESLI) is intended to provide more information to enable debugging of optimized programs, including PC and line number changes, file transitions, and line and column ranges. ESLI is essentially a superset of the older line number table.
ESLI is stored in the optimization symbols section. This information is accessible on a per-procedure basis from the procedure descriptors. See Section 5.3.3 for more detail on accessing information in the optimization symbols section.
ESLI is a byte stream that can be interpreted in two modes: data mode or command mode. Currently, two formats are defined for data mode. These are designated as "Data Mode 1" and "Data Mode 2". Additional data modes may be defined as needed.
Figure 5-8: ESLI Data Mode Bytes
Data Mode 1 is the initial mode for a procedure's ESLI. Data Mode 1 is identical to the packed line number format with the exception of the interpretation of the delta PC escape value 0x80 (which indicates a switch to command mode).
In Data Mode 2, each entry consists of two bytes. The first byte is identical to the encoding and interpretation of Data Mode 1. The second byte is an absolute column number (from 0 to 255), where column number 0 indicates that column information is missing or not meaningful for this entry. The escape from Data Mode 2 to command mode consists of a delta PC escape value set to 0x80 and column number set to 0.
In command mode, each byte is either a command or a command parameter. For a command byte, the low-order six bits are a command code, and the two high bits are used as flags, as shown in Figure 5-9. The "mark" flag, if set, announces that a new state has been established. Several commands may be required to fully describe a new state. The "resume" flag, if set, indicates the end of command mode. The next byte following a command with "resume" set will be a data mode byte. The effective data mode can be changed by SET_DATA_MODE commands in command mode; otherwise, the data mode that was in effect prior to the escape to command mode will be resumed. See Table 5-11 for a complete list of commands.
Command parameters are stored in LEB (Little Endian Byte) 128 format. See Section 1.4.6 for a description of this data representation. PC deltas are always expressed as machine instruction offsets and must be scaled by the size of a machine instruction before adding to the current PC. No other deltas need to be scaled.
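For orientation, the sketch below decodes unsigned (LEB) and signed (SLEB) parameters. It assumes that the representation described in Section 1.4.6 is the conventional LEB128 encoding: seven data bits per byte, least significant group first, with the high bit of each byte set when more bytes follow. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value from p[0..len-1].
 * Returns the number of bytes consumed, or 0 if the value is truncated
 * or too long for 64 bits. */
size_t leb128_decode(const unsigned char *p, size_t len, uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;

    for (size_t i = 0; i < len && shift < 64; i++) {
        result |= (uint64_t)(p[i] & 0x7f) << shift;
        shift += 7;
        if ((p[i] & 0x80) == 0) {        /* last byte of this value */
            *value = result;
            return i + 1;
        }
    }
    return 0;
}

/* Decode a signed LEB128 value; bit 6 of the final byte is the sign. */
size_t sleb128_decode(const unsigned char *p, size_t len, int64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;

    for (size_t i = 0; i < len && shift < 64; i++) {
        result |= (uint64_t)(p[i] & 0x7f) << shift;
        shift += 7;
        if ((p[i] & 0x80) == 0) {
            if (shift < 64 && (p[i] & 0x40))
                result |= ~(uint64_t)0 << shift;   /* sign-extend */
            *value = (int64_t)result;
            return i + 1;
        }
    }
    return 0;
}
```

A decoded PC delta is a count of instructions, so a consumer would multiply it by the instruction size before adding it to the current PC; line, column, and file values are used as decoded.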
Table 5-11 shows how to interpret the bytes in command mode. These definitions can be found in the system header file linenum.h.
Table 5-11: ESLI Commands
Name | Value | Parameters by Type |
ADD_PC | 1 | SLEB |
ADD_LINE | 2 | SLEB |
SET_COL | 3 | LEB |
SET_FILE | 4 | LEB |
SET_DATA_MODE | 5 | LEB |
ADD_LINE_PC | 6 | SLEB, SLEB |
ADD_LINE_PC_COL | 7 | SLEB, SLEB, LEB |
SET_LINE | 8 | LEB |
SET_LINE_COL | 9 | LEB, LEB |
SEQUENCE_BREAK | 10 | SLEB |
SET_EXP | 11 | LEB |
ADD_PC
Parameter is a signed value to add to the current PC value.
ADD_LINE
Parameter is a signed value to add to the current line number.
SET_COL
Parameter is an unsigned value that represents a new column number. The column number is used to associate the PC with a particular location within a source line. Column number parameters use a zero-based representation that must be adjusted by adding 1.
SET_FILE
Parameter is an unsigned value used to switch file context. This command is typically followed by a SET_LINE command.
SET_DATA_MODE
Parameter is an unsigned value used to set the data mode that will be in effect when data mode is resumed. The only parameter values that are currently accepted are 1 and 2. Additional data modes may be defined in future releases.
ADD_LINE_PC
Both parameters are signed values. The first is added to the line number and the second is added to the PC.
ADD_LINE_PC_COL
The first two parameters are signed values and the third is an unsigned value. The first two are added to the line number and PC respectively. The third is used to set the column number.
SET_LINE
Parameter is an unsigned value that sets the current line number.
SET_LINE_COL
Both parameters are unsigned values. The first represents the line number and the second represents the column number.
SEQUENCE_BREAK
Indicates the end of a contiguous sequence of address descriptions. The value of the parameter is added to the current address, and the resulting address becomes the starting address of the next sequence of address descriptions. The current file and line number continue to apply as the current values for the new sequence as well. (These can, however, be changed using the appropriate commands.)
Version Note The SEQUENCE_BREAK command is supported in Tru64 UNIX V5.1 and greater for symbol table format V3.13 and greater.
SET_EXP
Set exponent for Tandem edit line numbers. The value of the parameter is an unsigned integer from 0 through 7 representing a power of 10 from -3 through 4.
Version Note The SET_EXP command is reserved for use on Tandem big-endian systems. It is not supported on Tru64 UNIX.
A tool reading the ESLI must maintain the current PC value, file number, line number, and column. Taken together, these four values represent the current "state". Consumers must also keep track of the mode in effect to interpret the data properly. A sample program is provided in Section 10.2 to illustrate consumption of ESLI.
Data encoded in ESLI can be represented in tabular format. The PC value and file, line, and column numbers can be stored as a state table. The following example shows how to build this state table.
In this example ESLI will record line numbers for a routine that includes text from a header file.
Source listing for line1.c:
 1  /* ESLI example using included source lines */
 2
 3  main() {
 4      char *msg;
 5
 6      msg = (char *)0;
 7
 8  #include "line2.h"
 9
10      printf("%s", msg);
11  }
Source listing for line2.h:
 1  msg = (char *)malloc(20);
 2  /*
 3   *
 4   *
 5   *
 6   *
 7   *
 8   *
 9   *
10   */
11  strcpy(msg, "Hello\n");
The compiler generates the following instructions for the example program:
main:
[line1.c: 3] 0x1200011d0: ldah gp, 8192(t12)
[line1.c: 3] 0x1200011d4: lda gp, 28336(gp)
[line1.c: 3] 0x1200011d8: lda sp, -16(sp)
[line1.c: 3] 0x1200011dc: stq ra, 0(sp)
[line1.c: 3] 0x1200011e0: stq s0, 8(sp)
[line1.c: 6] 0x1200011e4: bis zero, zero, s0
[line2.h: 1] 0x1200011e8: bis zero, 0x14, a0
[line2.h: 1] 0x1200011ec: ldq t12, -32560(gp)
[line2.h: 1] 0x1200011f0: jsr ra, (t12)
[line2.h: 1] 0x1200011f4: ldah gp, 8192(ra)
[line2.h: 1] 0x1200011f8: lda gp, 28300(gp)
[line2.h: 1] 0x1200011fc: bis zero, v0, s0
[line2.h: 11] 0x120001200: bis zero, s0, a0
[line2.h: 11] 0x120001204: lda a1, -32768(gp)
[line2.h: 11] 0x120001208: ldq t12, -32600(gp)
[line2.h: 11] 0x12000120c: jsr ra, (t12)
[line2.h: 11] 0x120001210: ldah gp, 8192(ra)
[line2.h: 11] 0x120001214: lda gp, 28272(gp)
[line1.c: 10] 0x120001218: ldq_u zero, 0(sp)
[line1.c: 10] 0x12000121c: lda a0, -32760(gp)
[line1.c: 10] 0x120001220: bis zero, s0, a1
[line1.c: 10] 0x120001224: ldq t12, -32552(gp)
[line1.c: 10] 0x120001228: jsr ra, (t12)
[line1.c: 10] 0x12000122c: ldah gp, 8192(gp)
[line1.c: 10] 0x120001230: lda gp, 28244(gp)
[line1.c: 11] 0x120001234: bis zero, zero, v0
[line1.c: 11] 0x120001238: ldq ra, 0(sp)
[line1.c: 11] 0x12000123c: ldq s0, 8(sp)
[line1.c: 11] 0x120001240: lda sp, 16(sp)
[line1.c: 11] 0x120001244: ret zero, (ra)
The ESLI and its interpretation for the generated code are shown in the following table.
Table 5-12: ESLI Example
Command columns: Code, (M)ark, (R)esume. State columns: PC (hex), (F)ile, (L)ine, (C)olumn.
ESLI bytes (hex) | Mode | Code | M | R | PC (hex) | F | L | C |
Initial State (from PDR) | Data1 | | | | 1200011d0 | 0 | 3 | 0 |
04 | Data1 | | | | 1200011e4 | 0 | 3 | 0 |
30 | Data1 | | | | 1200011e8 | 0 | 6 | 0 |
80 | Data1 | Escape | | | | | | |
04 01 | Cmd | set_file(1) | | | | 1 | | |
48 01 | Cmd | set_line(1) | | R | | | 1 | |
05 | Data1 | | | | 120001200 | 1 | 1 | 0 |
80 | Data1 | Escape | | | | | | |
86 0a 06 | Cmd | add_line_pc(10,6) | M | | 120001218 | 1 | 11 | 0 |
04 00 | Cmd | set_file(0) | | | | 0 | | |
48 0a | Cmd | set_line(10) | | R | | | 10 | |
06 | Data1 | | | | 120001234 | 0 | 10 | 0 |
16 | Data1 | | | | 120001250 | 0 | 11 | 0 |
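The state table in Table 5-12 can be reproduced with a small interpreter such as the sketch below. It is not the sample program from Section 10.2, and several details are assumptions. The flag values (mark 0x80, resume 0x40) are inferred from the bytes 86 and 48 in Table 5-12. Every column parameter is adjusted by adding 1. A row is recorded after each data mode entry and after each command with the mark flag set. SEQUENCE_BREAK, SET_EXP, and data bytes 0x81 through 0x8f are rejected rather than interpreted.

```c
#include <stddef.h>
#include <stdint.h>

enum { ESLI_MARK = 0x80, ESLI_RESUME = 0x40, ESLI_ESCAPE = 0x80 };
enum { ADD_PC = 1, ADD_LINE, SET_COL, SET_FILE, SET_DATA_MODE,
       ADD_LINE_PC, ADD_LINE_PC_COL, SET_LINE, SET_LINE_COL };
enum { INSN_SIZE = 4 };               /* Alpha instructions are 4 bytes */

struct esli_state { uint64_t pc; int64_t file, line, col; };
struct esli_reader { const unsigned char *p, *end; int bad; };

/* Read one LEB128 parameter; sign-extend when is_signed is nonzero. */
static int64_t read_leb(struct esli_reader *r, int is_signed)
{
    uint64_t v = 0;
    unsigned shift = 0;
    while (r->p < r->end && shift < 64) {
        unsigned char b = *r->p++;
        v |= (uint64_t)(b & 0x7f) << shift;
        shift += 7;
        if (!(b & 0x80)) {
            if (is_signed && shift < 64 && (b & 0x40))
                v |= ~(uint64_t)0 << shift;
            return (int64_t)v;
        }
    }
    r->bad = 1;                       /* truncated or oversized value */
    return 0;
}

/* Advance the PC by a (possibly negative) number of instructions. */
static void add_pc(struct esli_state *st, int64_t insns)
{
    st->pc += (uint64_t)insns * INSN_SIZE;
}

/* Build the state table; st is the initial state from the PDR.
 * Returns the number of rows stored, or -1 on bad or unsupported input. */
long esli_build_table(const unsigned char *bytes, size_t len,
                      struct esli_state st,
                      struct esli_state *rows, size_t max_rows)
{
    struct esli_reader r = { bytes, bytes + len, 0 };
    int data_mode = 1, in_cmd = 0;
    size_t n = 0;

    while (r.p < r.end && !r.bad) {
        unsigned char b = *r.p++;
        int record;

        if (!in_cmd) {
            unsigned char col = 0;
            if (data_mode == 2) {     /* second byte: absolute column */
                if (r.p == r.end)
                    return -1;
                col = *r.p++;
            }
            if (b == ESLI_ESCAPE && col == 0) {
                in_cmd = 1;           /* switch to command mode */
                continue;
            }
            int delta = b >> 4;
            if (delta == 8)
                return -1;            /* other 0x8n forms: not handled */
            if (delta > 8)
                delta -= 16;
            st.line += delta;
            add_pc(&st, (b & 0x0f) + 1);
            if (data_mode == 2)
                st.col = col;
            record = 1;
        } else {
            switch (b & 0x3f) {
            case ADD_PC:   add_pc(&st, read_leb(&r, 1)); break;
            case ADD_LINE: st.line += read_leb(&r, 1); break;
            case SET_COL:  st.col = read_leb(&r, 0) + 1; break;
            case SET_FILE: st.file = read_leb(&r, 0); break;
            case SET_DATA_MODE:
                data_mode = (int)read_leb(&r, 0);
                if (data_mode != 1 && data_mode != 2)
                    return -1;
                break;
            case ADD_LINE_PC:
                st.line += read_leb(&r, 1);
                add_pc(&st, read_leb(&r, 1));
                break;
            case ADD_LINE_PC_COL:
                st.line += read_leb(&r, 1);
                add_pc(&st, read_leb(&r, 1));
                st.col = read_leb(&r, 0) + 1;
                break;
            case SET_LINE: st.line = read_leb(&r, 0); break;
            case SET_LINE_COL:
                st.line = read_leb(&r, 0);
                st.col = read_leb(&r, 0) + 1;
                break;
            default:
                return -1;            /* SEQUENCE_BREAK, SET_EXP, unknown */
            }
            record = (b & ESLI_MARK) != 0;
            if (b & ESLI_RESUME)
                in_cmd = 0;           /* next byte is a data mode byte */
        }
        if (record) {
            if (n >= max_rows)
                return -1;
            rows[n++] = st;
        }
    }
    return r.bad ? -1 : (long)n;
}
```

Fed the byte stream from Table 5-12 with the initial state taken from the first row, this sketch stores six rows whose PC, file, and line values correspond to the fully populated state rows of the table.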
The handling of alternate entry points differs from the handling of main entry points. Procedure descriptors for alternate entry points are identified by a PDR.lnHigh value of -1. If the PC for an instruction maps to an alternate entry point, the following steps should be taken:
Find the procedure descriptor for the corresponding main entry. This is accomplished by searching back in the procedure descriptors until a PDR is found that is not an alternate entry (PDR.lnHigh is not -1).
Access the ESLI for the procedure.
Read the ESLI until the PC value matches the PDR.adr field of the alternate entry's procedure descriptor.
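The first of these steps, the backward search for the main entry, might be sketched as follows. struct pd_view is a hypothetical, trimmed-down stand-in for the procedure descriptor that keeps only the two fields the search needs.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down procedure descriptor. */
struct pd_view {
    unsigned long adr;     /* entry address (PDR.adr) */
    long          lnHigh;  /* -1 marks an alternate entry point */
};

/* Search back from pds[i] for the nearest descriptor that is not an
 * alternate entry. Returns its index, or -1 if none is found. */
long main_entry_index(const struct pd_view *pds, size_t i)
{
    for (;;) {
        if (pds[i].lnHigh != -1)
            return (long)i;        /* main entry found */
        if (i == 0)
            return -1;             /* ran off the start of the table */
        i--;
    }
}
```

Once the main entry is known, a consumer would read that procedure's ESLI and stop when the tracked PC equals the adr field of the alternate entry's descriptor.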
5.3.3 Optimization Symbols
Version Note Optimization symbols are supported for symbol table format V3.13 and greater.
The optimization symbols section gives individual producers and consumers the ability to communicate information about any aspect of the object file, in any form they choose. New information can be generated at any time with minimal coordination between all producers and consumers.
The optimization section is organized on a per-procedure basis. Each procedure descriptor has a pointer to the optimization symbols in the field PDR.iopt. If no optimization symbols are associated with the procedure, the field contains ioptNil. Otherwise, it contains the index of the first optimization symbol entry for this procedure. Consumers should access the optimization symbols through the procedure descriptors. The optimization section is not present in a locally-stripped object.
This section consists of a sequence of zero or more Per-Procedure Optimization Descriptions (PPODs), as shown in Figure 5-10. Each PPOD's internal structure consists of two parts:
A leading sequence of structured entries using a Tag-Length-Value model to describe subsequent raw data. The structure of the PPOD entry can be found in Section 5.2.10.
The raw data area.
Figure 5-10: Optimization Symbols Section
This section has the following alignment requirements:
Octaword (16-byte) alignment of the beginning of the section.
Octaword (16-byte) alignment of the beginning of the raw data area.
Octaword (16-byte) alignment of each PPOD.
Object file producers must produce either an empty optimization symbols section or a valid one. An empty one has the symbolic header fields cbOptOffset and ioptMax set to zero. If an optimization section is present, but a particular file does not contribute to it, the file descriptor field copt is set to zero. In this case, all procedure descriptors belonging to the file must have their iopt fields set to ioptNil.
Tools that both read and write object files must consume a valid optimization symbols section (if present in the input file) and produce an equivalent and valid section in their output files. If a tool does not know how to process the section contents, the section must be omitted from the output file. If a tool knows how to process only portions of the optimization symbols, those portions may be modified and the rest should be removed. The linker concatenates input optimization symbols sections into one output section without reading or modifying any of the entries.
The format and flexible nature of this section are similar by design to the .comment section. The structures are the same size and contain the same fields (with different names), and the rules of navigation are the same. The primary difference is that the optimization section contains procedure-specific information, whereas the comment section contains object-specific information.
5.3.4 Run-Time Information
The symbol table contains information that debuggers must interpret to find symbols at run time. This section describes the information that the static symbol table structures provide. Algorithms for determining run-time symbol addresses are included.
5.3.4.1 Procedure Addresses
The following pseudocode describes an algorithm for determining the procedure start address:
if (HDRR.vstamp >= 0x30D || PDR.isym == isymNil)
    return (PDR.adr)
else
    foreach FDR in HDRR
        foreach PDR in FDR
            if PDR matches
                if (FDR.csym == 0)
                    /* Use external symbol */
                    return (EXTR[PDR.isym].asym.value)
                else
                    /* Use local symbol */
                    return (SYMR[FDR.isymbase + PDR.isym].value)
If local symbol information is present for the given PDR, the isym field identifies the local symbol table entry that contains the start address of the procedure. If no local symbol information is present, the isym field identifies the external symbol table entry containing the start address of the procedure. If no symbol information is present for the PDR, the isym field is set to isymNil and the adr field will contain a reliable start address.
Version Note The PDR.adr field is reliably updated by the linker for symbol table format V3.13. The preceding algorithm is recommended for determining procedure addresses in symbol table formats less than V3.13.
5.3.4.2 Stack Frames
A stack frame is a run-time memory structure that is created whenever a procedure is called. The Calling Standard for Alpha Systems specifies the stack frame format and related code requirements. This section explains how to interpret procedure descriptor fields related to the stack frame.
Two types of stack frames are supported: fixed-size frames and variable-size frames. The variable frame format is used for procedures that dynamically allocate memory and for those with very large frames. Figure 5-11 shows a fixed-size frame and Figure 5-12 shows a variable-sized frame.
From the procedure descriptor, you can determine which type of stack frame the procedure has. The field PDR.framereg stores the frame pointer register number. If this field has a value of 30 ($sp), the stack frame is a fixed-size frame. If it has a value of 15 ($fp), the stack frame is a variable-size frame.
Figure 5-11: Fixed-Size Stack Frame
Figure 5-12: Variable-Size Stack Frame
For both types of stack frames, the value of PDR.frameoffset is the size of the fixed part of the stack frame. In the case of a fixed-size frame, it is the entire frame size. For a variable-sized frame, the entire frame size cannot be determined from the symbol table. The code may dynamically increase and decrease the size of the frame multiple times during procedure execution.
The virtual frame pointer represents the contents of the frame pointer register at procedure entry, prior to prologue execution. The (real) frame pointer is the contents of the frame pointer register after prologue execution. The difference between the virtual and real frame pointer values is the fixed frame size, which is subtracted from the $sp contents during the procedure prologue. Note that stack offsets recorded in the symbol table are relative to the virtual frame pointer, not the real value used at run time.
The contents of the frame pointer register are used at run time as the base address for accessing data, such as parameters and local variables, on the stack. See Section 5.3.4.3 for details.
5.3.4.3 Local Symbol Addresses
Local variables and parameters may be stored in registers or on the stack. Those stored in registers (identified by a storage class of scRegister) do not have addresses. For local variables and parameters with addresses, this section explains how to calculate their run-time locations from the symbol table information.
To calculate the run-time address for a local variable (stLocal) based on its symbol table value:
Frame pointer - PDR.localoff + SYMR.value
To calculate the run-time address for a parameter (stParam) based on its symbol table value:
Frame pointer - argument_home_area_size + SYMR.value
The argument home area is a portion of the stack frame designated for parameter storage. See Figure 5-11 for an illustration. For historical reasons, the size of this area is always 48 bytes.
The calculations above must be performed at run time when the actual frame pointer value is known. Note that the value becomes valid only after the procedure prologue has executed.
To calculate the locations based on static information, convert the symbol's value to an offset from the real frame pointer:
Local:
PDR.frameoffset - PDR.localoff + SYMR.value
Parameter:
PDR.frameoffset - 48 + SYMR.value
The resulting offsets are always positive values because the frame pointer contains the address of the lowest memory in the fixed part of the stack frame at run time.
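The static conversions above, together with the frame classification by PDR.framereg, can be sketched as a few helper functions. The function names are hypothetical; each parameter stands for the descriptor or symbol field of the same name.

```c
enum { ARG_HOME_AREA_SIZE = 48 };     /* fixed size of the argument home area */
enum frame_kind { FRAME_FIXED, FRAME_VARIABLE, FRAME_UNKNOWN };

/* Classify the stack frame from the frame pointer register number. */
enum frame_kind classify_frame(int framereg)
{
    if (framereg == 30)               /* $sp */
        return FRAME_FIXED;
    if (framereg == 15)               /* $fp */
        return FRAME_VARIABLE;
    return FRAME_UNKNOWN;
}

/* Offset of a local variable (stLocal) from the real frame pointer. */
long local_fp_offset(long frameoffset, long localoff, long sym_value)
{
    return frameoffset - localoff + sym_value;
}

/* Offset of a parameter (stParam) from the real frame pointer. */
long param_fp_offset(long frameoffset, long sym_value)
{
    return frameoffset - ARG_HOME_AREA_SIZE + sym_value;
}
```

At run time, once the prologue has executed, adding one of these offsets to the contents of the frame pointer register gives the address of the variable or parameter.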
5.3.4.4 Uplevel Links
Version Note Uplevel links are supported in symbol table format V3.13 and greater.
An uplevel link is the real frame pointer of an ancestor of a nested routine. The routine nesting may be a feature of the language (such as Pascal), or the nesting may occur in optimized code which has been decomposed for parallel execution into smaller routines. Uplevel links provide debuggers a method of finding all local symbols associated with the ancestor routine.
When a procedure is passed a static link, that static link will be represented within the scope of the procedure definition as a local automatic symbol with a special name beginning with "__StaticLink.". The lifetime of this symbol begins after the procedure prologue has been executed. The static link symbol will occur between the procedure's parameter definitions and the first stBlock symbol.
The full name of the symbol will be "__StaticLink." followed by a positive decimal integer with no leading zeros. This integer value identifies the number of levels up the ancestor tree the static link points to. For example, if the name is "__StaticLink.3" it will contain the static link of the procedure in which it is defined, and that procedure's static link points to a stack frame that is three levels up in the procedure's ancestor tree, the great-grandfather of the procedure.
Figure 5-13: Representation of Uplevel Reference
Debuggers of Tru64 UNIX object files need to use the uplevel link information to determine which symbols are visible at a location in the program and to compute the addresses of local symbols in ancestor routines. When the debugger needs the current value or address of a name that might be defined as an uplevel reference, two separate actions may be required: finding the procedure that defines the currently visible instance of that name, and finding the address of the currently visible instance of that name. If only type information is required, finding the procedure that defines the name may be sufficient.
Finding the defining procedure is accomplished by repeatedly looking up the name in the local symbol table of a chain of procedures that extends from the current procedure through its chain of ancestors until either the name is found in a procedure or the end of the chain of ancestors is reached without finding the name. If this search terminates without finding the name, the debugger should conclude that the name is not visible by uplevel reference at the current location in the program.
When searching for the desired procedure, the debugger should count how many levels in the ancestor chain were traversed before finding the name. If zero levels were traversed, the name is defined within the current procedure and is not an uplevel reference. The number of levels traversed is assumed to be in the variable LevelsToGo in the algorithm below.
Finding the address for the name involves locating static link values and dereferencing them with appropriate offsets. Basically, while the number of levels to be traversed is greater than zero, find the static link symbol for the current level and obtain its value. Finally, add the desired symbol's offset from the real frame pointer to the final static link value.
The recommended algorithm for finding the address is as follows:
LevelsToGo = <from name lookup above>
NewProc = CurrentProcedure
NewFrame = FramePointerValue(CurrentProcedure)
Failed = false
while (LevelsToGo > 0 && !Failed)
    StaticLink = FindStaticLinkSym(NewProc)
    if (StaticLink == NULL)
        Failed = true
    else
        NewFrame = *(NewFrame + StaticLink->symbol.offset)
        Levels = StaticLinkLevels(StaticLink)
        LevelsToGo = LevelsToGo - Levels
        for (; Levels > 0; Levels--)
            NewProc = NewProc->proc.parent
If Failed is true after executing this algorithm, required information about static links is missing in the symbol table, and an error has occurred. If LevelsToGo ends up less than zero, the optimizer's static link optimization has eliminated a static link level that would be needed to compute the address of the name. It is recommended that debuggers inform the user that optimization prevents the debugger from computing the address of the name. If Failed is false and LevelsToGo is equal to zero, the address for the currently visible instance of the name is NewFrame plus the offset of the name with respect to the real frame pointer for NewProc.
The function StaticLinkLevels returns the integer at the end of the name for the indicated static link symbol.
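As one possible reading of StaticLinkLevels, the sketch below extracts the level count from a symbol name. It takes the name string directly instead of a symbol entry, and the function name, the -1 error convention, and the upper bound on the level count are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns N for a name of the form "__StaticLink.N", where N is a positive
 * decimal integer with no leading zeros, or -1 for any other name. */
long static_link_levels(const char *name)
{
    static const char prefix[] = "__StaticLink.";
    size_t plen = sizeof prefix - 1;

    if (strncmp(name, prefix, plen) != 0)
        return -1;

    const char *d = name + plen;
    if (*d < '1' || *d > '9')         /* empty suffix or leading zero */
        return -1;

    long n = 0;
    for (; *d >= '0' && *d <= '9'; d++) {
        if (n > 100000)               /* implausible depth; avoids overflow */
            return -1;
        n = n * 10 + (*d - '0');
    }
    return *d == '\0' ? n : -1;       /* reject trailing characters */
}
```

The same check can serve a FindStaticLinkSym-style search: a local symbol is a static link exactly when this function returns a positive value for its name.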
5.3.4.5 Finding Thread Local Storage (TLS) Symbols
This section explains how to interpret symbolic information for TLS symbols (identified by a storage class of scTlsData or scTlsBss). See Section 3.3.9 or the Programmer's Guide for general information on TLS.
A TLS symbol's value contains its offset from the start of the TLS region for that object. This offset can be used at process execution time to determine the address of the TLS symbol for a particular thread.
A debugger can calculate TLS symbol addresses by looking up the address of the TLS region using run-time structures and adding the offset of the TLS symbol to that address. The following formula can be used to calculate TLS symbol addresses.
TLS sym address = *(TEB.TSD + __tlskey) + SYMR.value
A detailed description of this formula follows:
Get the address of the Thread Environment Block (TEB).
Get the address of the Thread Specific Data (TSD) array from the TEB structure.
Get the offset of the TLS pointer in the TSD array. This offset is normally stored in a .lita or .got entry. This value should be accessed using the symbol __tlskey. In spite of the fact that __tlskey is a label symbol, no ampersand is used in this context because the value that the label points to is being retrieved. The address of __tlskey will need to be adjusted by the address mapping displacement in the same manner that the debugger adjusts addresses of text and data symbols. For static executables, the .lita entry contains the constant offset (2048). This offset identifies the first and only TSD slot (256) that will be allocated for the TLS pointer. For shared objects, the .got entry labeled by __tlskey is initially 0, indicating that the TSD slot has not been allocated yet. After the object's initialization routines have run, a TSD key will be allocated and the .got entry will contain its offset.
Get the TLS pointer value. The TLS pointer is a 64-bit address set to the start of the TLS Region.
Calculate the address of the TLS symbol by adding the offset of the TLS symbol to the TLS pointer value.
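The arithmetic in the formula above can be sketched in C as follows. This is a minimal illustration only: the teb_model structure and the tls_symbol_address function are hypothetical names, the TEB is modelled as ordinary in-process memory, and a real debugger would instead read these values from the debugged process and apply the address mapping displacement to __tlskey first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the part of the TEB used by the formula. */
typedef struct {
    unsigned char *tsd;   /* address of the Thread Specific Data array */
} teb_model;

/*
 * tlskey    : byte offset of the TLS pointer in the TSD array
 *             (the value retrieved through the __tlskey label)
 * sym_value : the TLS symbol's value field (offset in the TLS region)
 *
 * Returns the symbol's address for this thread, or 0 when tlskey is 0,
 * which for shared objects means the TSD slot is not yet allocated.
 */
uint64_t tls_symbol_address(const teb_model *teb, uint64_t tlskey,
                            uint64_t sym_value)
{
    uint64_t tls_pointer;

    assert(teb != NULL && teb->tsd != NULL);
    if (tlskey == 0)
        return 0;

    /* *(TEB.TSD + __tlskey): fetch the 64-bit TLS pointer. */
    memcpy(&tls_pointer, teb->tsd + tlskey, sizeof tls_pointer);

    /* Add the symbol's offset within the TLS region. */
    return tls_pointer + sym_value;
}
```

For a static executable, tlskey would be the constant 2048 described in step 3.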
TLS common symbols (scTlsCommon) should not occur in linked objects, so debuggers should not need to support them. Executables and shared libraries can only reference TLS symbols that they define, so successfully linked objects should have no TLS undefined or TLS common symbols.
5.3.5 Profile Feedback Data
Version Note Profile feedback data is supported in symbol table format V3.13 and greater.
Profile feedback data is stored in entries in the optimization symbols table with tag type PPODE_PROFILE_INFO. The data contained in this section is intended for Compaq internal use only. It contains execution profiling feedback used by compilers and the om utility.
Profile feedback data contains relative file descriptor and local symbol table indexes. If an object tool removes, adds, or rearranges relative file descriptors or local symbol table entries, it must also remove all optimization symbol table entries, including the profile feedback data.
5.3.6 Scopes
From a user program's point of view, an identifier's scope determines its visibility in different parts of the program. Programming languages provide facilities for declaring and defining names of procedures, variables, and other program components inside various scoping levels. This section briefly discusses the concept of scope and then explains how it is represented in the symbol table. References are made to structures in the auxiliary symbol table; see Section 5.3.7.3 for details.
Generally speaking, the four main scoping levels in a program are block scope, procedure scope, file scope, and program scope. Most programming languages have constructs to implement at least these scoping levels. Figure 5-14 shows the hierarchy of these scopes.
Names with block scope can only be referenced inside the declaring block. Blocks are delimited by begin and end markers, the syntax of which varies among languages.
Names with procedure scope are only recognized inside their enclosing subroutines. For instance, the names of formal parameters and local variables declared inside a procedure are accessible only to that procedure's executable statements.
Names with file scope can be referenced by any instruction within the file where they are declared. A file can be composed of procedures and data external to any procedure. Both external data names and procedure names can have file scope or program scope. Note that in a compilation involving only a single file or in a compilation for a programming language with no separate-compilation facilities, file scope and program scope are equivalent.
Names with program scope are visible everywhere in the program, even when the executable program is built from many source and header files. The linker must resolve these names or pass them to the dynamic loader to resolve. See Section 5.3.10 for more information about symbol resolution.
In the symbol table, procedure scope, file scope, and program scope correspond to local, static, and global symbols, respectively. Block scope names are also local symbols. Local and static symbols appear in the local symbol table, and global symbols are in the external symbol table.
5.3.6.1 Procedure Scope
Although procedure symbols can only be global or static (with symbol types stProc and stStaticProc, respectively), procedure entries appear in the local symbol table to identify the containing scope of their local data. The set of symbols appearing in the local symbol table to describe a procedure scope and their associated auxiliary entries is shown in Figure 5-15. Global procedures also have entries in the external symbol table. As illustrated, the indices of these external entries point to the scoping entries in the local symbol table.
Note
In this chapter, all diagrams of symbol table representations use arrows to show that one entry contains an index to another entry. For external and local symbol table entries, the index used is contained in the index field. For auxiliary symbols, the isym or RNDXR field is the index used. Any exceptions to this general rule are noted in the diagrams.
Figure 5-15: Procedure Representation
A special instance of a procedure definition occurs for a procedure with no text. This type of procedure occurs only in the local symbol table and is very similar to the representation of other procedures. It is generally used for procedures that have been optimized away that still need to be represented for debugging or profiling information.
Figure 5-16: Procedure with No Text
A procedure with no code can contain only nested procedures that also have no code associated with them. If a procedure with no code does not contain any nested procedures, the stBlock/stEnd symbol pair can be omitted from the representation. The stProc symbol included in this representation is distinguished from similar stProc symbols by its value field, which is set to addressNil (-1).
Version Note Procedures with no code are supported in symbol table format V3.13 and greater.
As in the case of procedures, file name entries appear in the local symbol table to define the file's scope. This representation is shown in Figure 5-17. Note that file symbols appear in the local symbol table only.
Figure 5-17: File Representation
In general, the local symbol table denotes scoping levels with stBlock and stEnd pairs, as shown in Figure 5-18. All symbols contained between these two entries belong to the scope they describe. Nested blocks are possible, and stEnd symbols match the most recent occurrences of stBlock (or other opening symbol entries such as stProc or stTag).
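The "most recent opener" rule can be sketched with a small stack. The sketch below is illustrative only: the sketch_st codes are hypothetical stand-ins for the real symbol type constants, only the three openers named above are recognized, and the nesting depth is capped at an arbitrary 64 levels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol type codes; real values come from the system headers. */
enum sketch_st { SK_OTHER, SK_BLOCK, SK_PROC, SK_TAG, SK_END };

/*
 * For every SK_END in syms[0..n-1], store in match[] the index of the
 * opening entry it closes; all other slots are set to -1.
 * Returns 0 on success, or -1 if the entries are unbalanced or nested
 * more deeply than this sketch supports.
 */
int match_scopes(const enum sketch_st *syms, size_t n, long *match)
{
    size_t stack[64];
    size_t depth = 0;

    assert(syms != NULL && match != NULL);
    for (size_t i = 0; i < n; i++) {
        match[i] = -1;
        switch (syms[i]) {
        case SK_BLOCK:
        case SK_PROC:
        case SK_TAG:
            if (depth == 64)
                return -1;
            stack[depth++] = i;          /* remember the opener */
            break;
        case SK_END:
            if (depth == 0)
                return -1;               /* end with no opener */
            match[i] = (long)stack[--depth];
            break;
        default:
            break;                       /* ordinary symbol in the scope */
        }
    }
    return depth == 0 ? 0 : -1;
}
```

Every symbol between an opener and its matched end entry belongs to that scope, including any nested scopes.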
Figure 5-18: Block Representation
Block scopes occur in many languages. In C, they take the form of lexical blocks. In C++, declarations can occur anywhere in the code. In Pascal and Ada, nested procedures are possible, with local variables at any or all levels.
5.3.6.4 Namespaces (C++)
Version Note Namespaces are supported in symbol table format V3.13 and greater.
A C++ namespace is a mechanism that allows the partitioning of the program global name space. This partitioning is intended to reduce name clashing and provide greater program manageability to C++ developers.
Figure 5-19: C++ Namespace Representation
A namespace definition may exist only at the global scope or within another namespace. The namespace representation in Figure 5-19 shows a single contribution to a namespace. This representation may be replicated many times in the symbol table for a single namespace. A namespace definition may be continued within the same file or over multiple source files.
A single namespace contribution that spans multiple source files is represented as if it were contained entirely within the source file in which it began.
Namespaces may be aliased, allowing a single namespace to be referred to by multiple names. Namespace components may also be referenced without their namespace qualification if they are included within a scope by a using directive or using declaration. The representations of namespace aliases, using directives, and using declarations are shown in Figure 5-19.
Namespace definitions, namespace component declarations, namespace aliases, using directives, and using declarations occur only in the local symbol table. Namespace component definitions may occur in the local or external symbol table.
5.3.6.4.1 Namespace Components
The components of a namespace are represented in two parts: declarations and definitions. Namespace components that do not require definition must be declared in the namespace definition. Namespace components that are referenced by a using declaration must be declared in the namespace definition. All other namespace component declarations may be omitted from the namespace definition.
Namespace component names are mangled only as needed. Function and data definitions have mangled name definitions in the local or external symbol table. These entries are mangled for type-safe linkage and as a method of matching components with the namespaces to which they belong. Names of component declarations within a namespace definition may or may not be mangled. They are not required to include the namespace name in their mangled form.
Empty namespace contributions can be omitted, but at least one instance of a namespace definition must occur somewhere in the local symbol table. This definition is required because name mangling rules do not distinguish namespace component definitions from class member definitions.
5.3.6.4.2 Namespace Aliases
Namespace aliases can occur in namespace, file, procedure, or block scope in the local symbol table. The index value for the stAlias entry is an auxiliary table index. The auxiliary entry is an RNDXR record containing the local symbol table index of the stNamespace symbol in the first instance of a namespace definition within a compilation unit. For an alias of an alias, the RNDXR record can also contain the index of another stAlias symbol in the local symbol table. Section 9.2.5 provides an example of a namespace alias.
The stAlias symbol type may be used in future versions of the symbol table format as a general purpose symbol alias representation. The semantic interpretation of the stAlias symbol depends on the type of the symbol it aliases.
5.3.6.4.3 Unnamed Namespace
An unnamed namespace can be declared at the global scope or within another namespace. An unnamed namespace is unique within a compilation unit. Multiple contributions to a unique unnamed namespace are not allowed. Unnamed namespace contributions are included in the non-mergeable portion of a C++ header file.
Unnamed namespace components are subject to the same rules as named namespaces for declarations and definitions.
The stNamespace symbol for an unnamed namespace has a compiler-generated name starting with __N1. This same name is used to identify the unnamed namespace in the mangled names of components of that namespace. (See the unnamed namespace example in Section 9.2.4.)
5.3.6.4.4 Usage of Namespaces
A C++ using directive or a using declaration is represented by a symbol of type stUsing. It may occur in any scope in the local symbol table. The index value for the stUsing entry is an auxiliary table index.
If the stUsing entry represents a using declaration for a single namespace component, the auxiliary entry is an RNDXR record containing the local symbol table index of a namespace component declaration.
If the stUsing entry represents a using directive, its RNDXR auxiliary contains the local symbol table index of the stNamespace symbol in the first definition of that namespace in the compilation unit. A using directive for a namespace alias is represented with an RNDXR auxiliary that directly references the aliased namespace. This representation contains no record of the alias referenced by the using directive.
Names are not required for stUsing entries, but they can be set to match the namespace or namespace component to which they refer.
Namespace components that are referenced by an stUsing symbol must be declared in the namespace definition.
Section 9.2.3 provides an example of namespace definitions and uses.
5.3.6.5 Exception Handling Blocks (C++)
In C++, a special scoping mechanism is introduced to expand user-defined exception-handling capabilities. Exception handlers are defined to "catch" exceptions that are "thrown" by other functions. The symbol table must contain sufficient information to recognize the scope of a handler. The compiler generates special symbols to identify where exception handlers are valid.
Figure 5-20: C++ Exception Handler Representation
Fortran common blocks constitute another scoping level. Fortran uses common blocks as a way of specifying data that is global or shared between program units. A common block is global storage that can be named, allocated, accessed, and used by various subroutines. The block can be named or unnamed; unnamed blocks are known as "blank commons". Internal to the symbol table, blank commons are named _BLNK__. Figure 5-21 shows the symbolic representation of Fortran common blocks.
Figure 5-21: Fortran Common Block Representation
Because a Fortran common is represented as a synthesized file, it also has an entry in the file descriptor table. Furthermore, a global symbol with the same name is also present in the external symbol table.
An example of a Fortran common block can be found in Section 9.3.1.
5.3.6.7 Alternate Entry Points
Fortran also has a facility for creating alternate entry points in procedures. An alternate entry point is represented using an stProc/scText symbol. In the procedure descriptor table, an alternate entry point is identified by a lnHigh field with a value of -1. Procedure descriptors for alternate entry points follow the procedure descriptor for the primary entry point. In the local symbol table, an alternate entry point has an entry inside the scope of the procedure's primary entry.
The representation of a procedure with an alternate entry point is shown in Figure 5-22.
Version Note The stBlock symbol that follows the alternate entry's stProc symbol in Figure 5-22 is supported in symbol table format V3.13 and greater. In symbol table formats less than V3.13, alternate entries do not have a start block symbol, and their prologue size is unknown.
Figure 5-22: Alternate Entry Point Representation
An example of Fortran alternate entries can be found in Section 9.3.2.
5.3.7 Data Types in the Symbol Table
A data element's type dictates its size and interpretation in a programming environment. One of the symbol table's most important tasks is to represent data types in a compact and complete manner.
Type information is stored in the local and auxiliary symbol tables. This section provides guidelines for understanding the type information plus specific examples for depicting a range of types.
5.3.7.1 Basic Types
All programming languages have a set of simple types that are built into the language and from which other data types can be derived. Examples of simple types are integer, character, and floating point. Languages also provide constructs for creating user-defined types based on the simple types. For example, a C++ class can be built using any simple type or previously defined user-defined type and the language facility for declaring classes.
Similarly, a basic type in the symbol table is a building block from which each language constructs its type information. Basic type (bt) values directly represent many of the simple types for supported languages; for instance, the value btChar indicates a character. Other bt values represent language constructs for building aggregate types; a value of btStruct may be used, for example, to represent a C structure or Pascal record.
The symbol table uses approximately forty basic type values. The interpretation of some of these values is language dependent. See Table 5-5 for a list of all values.
5.3.7.2 Type Qualifiers
Type qualifiers can be applied to basic types to create other data types. Examples are "pointer to", "array of", and "function returning". Generally the number and order of type qualifiers is unrestricted.
See Table 5-6 for a list of type qualifiers and their meanings.
5.3.7.3 Interpreting Type Descriptions in the Auxiliary Table
This section explains in detail the encoding of type descriptions in the symbol table. To fully describe the type of a symbol, the auxiliary symbol table must be created and referenced. Compilation with full symbolic information (-g option on system compilers) results in the creation of this table.
To correctly decode the type information, proceed sequentially, beginning with the symbol table entry. Several fields may be required from other symbol table structures:
value (SYMR.value)
The first step is to determine whether the symbol contains an index of an auxiliary table description.
Table 5-13: Symbols with Auxiliary Type Descriptions
Symbol Type | Storage Class | Conditions | AUXU Index Field |
stGlobal | Any | None | index |
stStatic | Any | None | index |
stParam | Any | None | index |
stLocal | Any | Local symbol table | index |
stProc | Any | Local symbol table | index |
stBlock | scInfo | Inside an scVariant block | value |
stMember | scInfo | None | index |
stTypedef | scInfo | None | index |
stStaticProc | Any | Local symbol table | index |
stConstant | Any | None | index |
stBase | scInfo | None | index |
stVirtBase | scInfo | None | index |
stTag | scInfo | None | index |
stInter | scInfo | None | index |
stNamespace | scInfo | None | index |
stUsing | scInfo | None | index |
stAlias | scInfo | None | index |
If the index does represent a record in the auxiliary symbol table, the interpretation of the first auxiliary entry (AUXU) depends on the type of the symbol:
If the symbol's type is stProc or stStaticProc and the symbol is a local symbol, the indexed AUXU is an isym (set to indexNil for alternate entry points) and the second AUXU is a TIR. External procedure symbols do not have descriptions in the auxiliary table.
If the symbol's type is stInter, stAlias, or stUsing, the indexed AUXU is an RNDXR and the type description does not contain a TIR.
If the symbol is an stBlock symbol inside an scVariant block, the symbol entry's value field is an index into the auxiliary table. This special case is the only one where the value is used as an auxiliary symbol pointer. In all other cases, it is the index field that potentially indexes the auxiliary table type description.
Otherwise, the indexed AUXU is a TIR.
The next task is to examine the contents of the TIR. The TIR contains constants representing the basic type of the symbol and up to six type qualifiers, labeled tq0-tq5. If a type has more than one qualifier, they are ordered from lowest to highest. Lower qualifiers are applied to the basic type before higher qualifiers. All unused tq fields are set to tqNil, and no tqNil fields are present before or between other type qualifiers.
In addition to the basic type and type qualifiers, the TIR contains two flags: an fBitfield flag to mark whether the size of the type is explicitly recorded, and a continued flag to indicate that the type description is continued in another TIR. If fBitfield is set, the TIR is immediately followed by a width entry. If more than six type qualifiers are required for the current definition, the description is continued, and the continued flag is set. If exactly six type qualifiers are needed, all six fields are used and the continued flag is cleared.
To illustrate, consider the type "array of pointers to integers". The basic type is "integer" and has two qualifiers, "array of" and "pointer to". Each element of the array is a "pointer to integer". Therefore, the qualifier "pointer to" must be applied first to the basic type "integer". In this example, the qualifier "pointer to" is lower than the qualifier "array of". The contents of the TIR are as follows:
    bt:        btInt
    tq0:       tqPtr
    tq1:       tqArray
    tq2:       tqNil
    tq3:       tqNil
    tq4:       tqNil
    tq5:       tqNil
    continued: 0
    fBitfield: 0
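The lowest-to-highest ordering can be illustrated by turning a qualifier list back into an English reading. The sketch below is not part of the symbol table interface: the sk_tq codes and the describe_type function are hypothetical, and only three qualifiers are modelled. Because tq0 binds closest to the basic type, the reading is produced from the highest used qualifier down to tq0.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical qualifier codes for this sketch; SK_TQ_NIL marks unused slots. */
enum sk_tq { SK_TQ_NIL, SK_TQ_PTR, SK_TQ_PROC, SK_TQ_ARRAY };

/*
 * Writes an English reading of a type into out (capacity outsz bytes).
 * basic is the name of the basic type; tq holds the six qualifier slots
 * in TIR order, with tq[0] the lowest qualifier.
 */
void describe_type(const char *basic, const enum sk_tq tq[6],
                   char *out, size_t outsz)
{
    static const char *const text[] = {
        "", "pointer to ", "function returning ", "array of "
    };
    int used = 0;

    assert(outsz > 0);

    /* Unused slots are always trailing, so stop at the first nil. */
    while (used < 6 && tq[used] != SK_TQ_NIL)
        used++;

    out[0] = '\0';
    /* Highest qualifier is outermost, so it is printed first. */
    for (int i = used - 1; i >= 0; i--)
        strncat(out, text[tq[i]], outsz - strlen(out) - 1);
    strncat(out, basic, outsz - strlen(out) - 1);
}
```

With tq0 set to the pointer qualifier and tq1 to the array qualifier, as in the TIR above, the function produces "array of pointer to integer".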
The contents of the TIR dictate how to interpret any subsequent records. The records appear in a prescribed order:
If the fBitfield flag is set, a width record follows the TIR.
If the basic type is btPicture, the next four records contain integer values: the string table index of the picture string, the length, precision, and scale.
If the basic type is btScaledBin, the next three records contain integer values: a basic type, the precision, and scale.
If the basic type field is btStruct, btUnion, btEnum, btClass, btIndirect, btSet, btTypedef, btRange, btRange_64, btDecimal, btFixedBin, or btProc, the next record is an RNDXR. If the rfd field of the RNDXR contains the value ST_RFDESCAPE, the next record is an isym.
If the basic type is btRange, the next two records are dnLow and dnHigh.
If the basic type is btRange_64, the next two records are dnLow records and the two after that are dnHigh records.
If the basic type is btDecimal or btFixedBin, the next two records contain integer values: the precision and scale.
For each array type qualifier in the TIR, the following symbols occur:
    An RNDXR, again possibly followed by an isym
    Either one or two dnLow records (depending on whether the array is tqArray or tqArray_64)
    Either one or two dnHigh records (depending on whether the array is tqArray or tqArray_64)
    Either one or two width records (depending on whether the array is tqArray or tqArray_64)
If the continued flag is set, the next record is another TIR.
For a type description containing more than one TIR, the fields of all TIR records are interpreted in the same way. When a TIR is reached with the continued flag cleared and any records associated with that TIR have been decoded, the type description is complete.
As an example, consider an array of structures with the fBitfield flag set. A total of seven auxiliary records can be used to describe the type:
The TIR with a basic type of btStruct and with tq0 set to tqArray.
A width record. The size of the basic type.
An RNDXR record. A pointer to the structure definition in the local symbol table.
An RNDXR record. A pointer to the array index type description elsewhere in the auxiliary table.
A dnLow record. The lower bound of the array's range.
A dnHigh record. The upper bound of the array's range.
A width record. The distance in bits between each element in the array.
If the continued flag of the TIR is cleared, the width record corresponding to the array qualifier is the final AUXU for this type description.
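The record count for this example can be checked by applying the prescribed order mechanically. The following sketch counts the auxiliary records that belong to a single TIR. It is a simplification under stated assumptions: the sk_bt and sk_q codes are hypothetical, only a handful of basic types are modelled, and no RNDXR is assumed to use ST_RFDESCAPE (which would add one isym record each).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codes for this sketch; real values come from the system headers. */
enum sk_bt { SK_BT_INT, SK_BT_STRUCT, SK_BT_RANGE, SK_BT_RANGE_64,
             SK_BT_DECIMAL, SK_BT_PICTURE, SK_BT_SCALEDBIN };
enum sk_q  { SK_Q_NIL, SK_Q_PTR, SK_Q_ARRAY, SK_Q_ARRAY_64 };

/* Number of auxiliary records occupied by one TIR and its trailing records. */
int aux_records_for_tir(enum sk_bt bt, const enum sk_q tq[6], int f_bitfield)
{
    int n = 1;                                /* the TIR itself */

    assert(tq != NULL);
    if (f_bitfield)
        n += 1;                               /* width */

    switch (bt) {
    case SK_BT_PICTURE:   n += 4;     break;  /* string index, length, precision, scale */
    case SK_BT_SCALEDBIN: n += 3;     break;  /* basic type, precision, scale */
    case SK_BT_STRUCT:    n += 1;     break;  /* RNDXR */
    case SK_BT_RANGE:     n += 1 + 2; break;  /* RNDXR, dnLow, dnHigh */
    case SK_BT_RANGE_64:  n += 1 + 4; break;  /* RNDXR, 2 dnLow, 2 dnHigh */
    case SK_BT_DECIMAL:   n += 1 + 2; break;  /* RNDXR, precision, scale */
    default:                          break;  /* simple type: nothing extra */
    }

    for (int i = 0; i < 6 && tq[i] != SK_Q_NIL; i++) {
        if (tq[i] == SK_Q_ARRAY)
            n += 1 + 1 + 1 + 1;               /* RNDXR, dnLow, dnHigh, width */
        else if (tq[i] == SK_Q_ARRAY_64)
            n += 1 + 2 + 2 + 2;               /* RNDXR and doubled records */
    }
    return n;
}
```

For a btStruct basic type with one 32-bit array qualifier and fBitfield set, the function returns 7, matching the example.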
For another view of this process, see Figure 5-23. Each box represents one auxiliary entry belonging to the symbol's type description. Using the flowchart, an ordered list of entries can be assembled.
Figure 5-23: Auxiliary Table Interpretation
Figure 5-24: Auxiliary Table "ti" Interpretation
Figure 5-25: Auxiliary Table "bt vals" Interpretation
Figure 5-26: Auxiliary Table "arrays" Interpretation
Figure 5-27: Auxiliary Table Range Interpretation
Figure 5-28: Auxiliary Table RNDXR Interpretation
The final step is to decode the RNDXR records. The basic types that are followed by RNDXR records require reference to another local or auxiliary symbol to complete the type description. Interpret the RNDXR records as follows:
If the basic type is btStruct, btUnion, btEnum, btClass, btProc, or btTypedef, the index field of the RNDXR points into the local symbol table. The specified local symbol is the start of the definition of the structure, union, enumeration, class, or user-defined type. For btProc, the referenced local symbol is the start of the set of symbols defining the procedure's signature.
If the basic type is btSet, the RNDXR points into the auxiliary symbol table. The specified record is the start of the description of the type of each element in the set.
If the basic type is btIndirect, the RNDXR points into the auxiliary symbol table. The specified auxiliary record is the start of the description of the referenced type.
If the basic type is btRange, the RNDXR points into the auxiliary symbol table. The specified auxiliary record is the start of the description of the type being subranged.
If the basic type is btFixedBin, the rfd field of the RNDXR contains a Boolean value. If rfd is true, the base is decimal; if rfd is false, the base is binary. The index field represents a type code.
If the basic type is btDecimal, the rfd field of the RNDXR contains the value 1 for 4-bit digits (packed decimal) or 2 for 8-bit digits (zoned decimal). The index field represents a type code.
Additionally, the index of every RNDXR used as a pointer must be mapped through the relative file descriptor table (see Section 5.3.2.1), if the table exists. The rfd field of the record controls this mapping.
The following algorithm can be used to locate the symbol referenced by the relative index record:
    if (RNDXR.rfd == ST_RFDESCAPE)
        RFD = (++AUXU).isym
    else
        RFD = RNDXR.rfd

    if (HDRR.crfd)    /* RFD table exists */
        IFD = (current FDR's RFD table)[RFD]
    else
        IFD = RFD

    if (SYMR needed)
        SYMBASE = FDR[IFD].isymBase
        SYMR = SYMBASE[RNDXR.index]
    else if (AUXU needed)
        AUXBASE = FDR[IFD].iauxBase
        AUXU = AUXBASE[RNDXR.index]
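The algorithm above can be sketched in compilable C as shown below. The sk_rndx and sk_fdr structures are simplified, hypothetical stand-ins for the real RNDXR and FDR records, the escape value 0xfff is an assumption made for this sketch (the real constant is ST_RFDESCAPE from the system headers), and the function returns an absolute table index rather than a pointer to a record.

```c
#include <assert.h>
#include <stddef.h>

#define SK_RFDESCAPE 0xfffu   /* assumed stand-in for ST_RFDESCAPE */

/* Simplified, hypothetical views of the records involved. */
typedef struct { unsigned rfd; unsigned index; } sk_rndx;
typedef struct {
    const unsigned *rfd_map;  /* this file's part of the RFD table, or NULL */
    size_t isym_base;         /* start of this file's local symbols */
    size_t iaux_base;         /* start of this file's auxiliary entries */
} sk_fdr;

/*
 * fdrs           : file descriptor table
 * cur_ifd        : index of the file descriptor containing the RNDXR
 * have_rfd_table : nonzero if the RFD table exists (HDRR.crfd != 0)
 * rn             : the relative index record
 * next_isym      : isym value of the following AUXU, used only on escape
 * want_aux       : nonzero to resolve into the auxiliary table,
 *                  zero to resolve into the local symbol table
 */
size_t resolve_rndx(const sk_fdr *fdrs, size_t cur_ifd, int have_rfd_table,
                    sk_rndx rn, unsigned next_isym, int want_aux)
{
    unsigned rfd;
    size_t ifd;

    assert(fdrs != NULL);

    /* An escaped rfd is stored in the next auxiliary entry. */
    rfd = (rn.rfd == SK_RFDESCAPE) ? next_isym : rn.rfd;

    /* Map through the current file's RFD table when one exists. */
    ifd = have_rfd_table ? fdrs[cur_ifd].rfd_map[rfd] : rfd;

    return (want_aux ? fdrs[ifd].iaux_base : fdrs[ifd].isym_base) + rn.index;
}
```

A caller that takes the escape path must also advance past the extra isym record before decoding the rest of the type description.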
5.3.8 Individual Type Representations
This section provides sketches of type representations in the local and auxiliary symbol tables. The connections between the two tables are depicted for each type. This form of representation is only possible when full symbolic information is present.
Note that external symbols as well as local symbols reference the auxiliary table, although the examples in this chapter use local symbols only.
5.3.8.1 Pointer Type
A pointer is a variable containing the address of another variable. A pointer is represented by a tqPtr type qualifier modifying another type. A pointer is represented by a single symbol with an entry in the auxiliary table, as shown in Figure 5-29. Note that if the pointer referenced a user-defined type, such as a class or structure, the TIR would be followed by an RNDXR (and possibly an isym).
Figure 5-29: Pointer Representation
The combination of type qualifiers tqFar and tqPtr is used to represent a short (32-bit) pointer. This pointer type is used with the XTASO emulation.
5.3.8.2 Array Type
An array is a list of elements that all have the same type. Arrays may be fixed size and allocated at compile time or dynamically sized and allocated at run time. This section describes the fixed-size array symbol table representation. For information on Fortran dynamic arrays, see Section 5.3.8.9. For conformant arrays in Pascal and Ada, see Section 5.3.8.10.
An array is represented by a tqArray or tqArray_64 type qualifier applied to another type. This second type describes the type of all elements in the array. In the local or external symbol table, a single entry represents an array. Figure 5-30 shows the symbol table description for an array.
Figure 5-30: Array Representation
Note that for an array of elements of a user-defined type, such as a class or structure, another RNDXR (and possibly an isym) would be inserted between the TIR and the RNDXR describing the subscript type.
If an array has multiple dimensions, the symbols describing the dimensions appear in order from innermost to outermost. For example, the following declaration produces a TIR with the tqArray qualifier followed by the RNDXR and range description for 0-1 followed by the entries for the dimension 0-99:
float floattable[100][2]
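The innermost-to-outermost ordering for this declaration can be sketched as follows. The sk_range type and aux_dimension_order function are hypothetical names used only for illustration, and the sketch assumes C-style arrays whose lower bound is always 0.

```c
#include <assert.h>
#include <stddef.h>

/* Bounds of one dimension as they would be recorded in dnLow and dnHigh. */
typedef struct { long low, high; } sk_range;

/*
 * dims holds the n dimension sizes in C declaration order (outermost first).
 * out receives the ranges in the order the text describes for the
 * auxiliary table: innermost dimension first.
 */
void aux_dimension_order(const long *dims, size_t n, sk_range *out)
{
    assert(dims != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i].low = 0;                      /* C arrays start at 0 */
        out[i].high = dims[n - 1 - i] - 1;   /* walk the sizes backwards */
    }
}
```

For floattable[100][2], the first range produced is 0-1 and the second is 0-99.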
Some arrays may have dimensions too large to represent in the 32-bit format shown in Figure 5-30. Such arrays are represented using a 64-bit format in which two auxiliary entries are used for the dimension bounds and size. Figure 5-31 illustrates the 64-bit representation.
Version Note The 64-bit representation of arrays is supported in symbol table format V3.13 and greater.
Figure 5-31: 64-Bit Array Representation
5.3.8.3 Structure, Union, and Enumerated Types
This section applies to data structures in languages other than C++. For the C++ structure, union, or enumerated type representation, see Section 5.3.8.6.
Structures, unions, and enumerated types have a common representation. All three are identified using "tags" and contain zero or more fields. In the symbol table, the tag is the name associated with the starting stBlock symbol for the structure's set of local symbols. Note that it may be empty because the tag is optional. Symbols for fields follow. The definition is completed by a block-end symbol matching the block-start symbol. Figure 5-32 contains a graphical depiction of this set of symbols.
Figure 5-32: Structure Representation
The structure members have auxiliary table indices pointing to their type descriptions.
Untagged structures and unions are represented with a NULL tag name. Unnamed structures can be embedded in other structures and are represented as a NULL-named member of the outer structure. See Section 9.1.1 for an example of an unnamed structure.
Version Note Unnamed member structures are supported in symbol table format V3.13 and greater. As of Tru64 UNIX V5.1, dbx will display structures with unnamed member structures, but neither dbx nor ladebug provides specific access to members of unnamed member structures.
A structure can contain a field that is a pointer to itself. This field is represented by an stMember symbol with an auxiliary table entry that references the beginning of the structure's block of local symbols, as shown in Figure 5-33.
Figure 5-33: Recursive Structure Representation
When a field within a structure is itself a structure, the compiler may choose to generate the structure definitions either sequentially or embedded, as shown in Figure 5-34.
Figure 5-34: Nested Structure Representation
The following declaration might result in the nested structure representation:
struct line { struct point { float x, y; } p1, p2; };
Most languages allow programmers to choose alternate names, or aliases, for data types. The alias created by such a facility (such as C's typedef) is represented as a single local symbol entry that has a pointer to its type description in the auxiliary table. The auxiliary entry contains a pointer to the definition of the type name, as shown in Figure 5-35.
Figure 5-35: Typedef Representation
Version Note The following function pointer representation is the preferred representation for symbol table format V3.13 and greater.
Languages such as C and C++, which allow pointers to functions, represent the type of the function pointer using a special stProc/scInfo block describing the parameters and return value for the function, as shown in Figure 5-36.
Figure 5-36: Function Pointer Representation
The stProc/scInfo entry has its value set to -2, which distinguishes it from similar entries used to represent procedures with no text and C++ member functions. The stProc/scInfo and stEnd/scInfo entries have null names in the function pointer representation. The parameters are optional and may or may not be named.
Version Note For symbol table formats less than V3.13 the preceding representation for function pointers is not supported, and the following alternate representation is used exclusively.
An alternate representation of function pointers is shown in Figure 5-37. This representation describes the return type of the function pointer but not its parameters, and it is valid for all symbol table format versions. The combination of type qualifiers tqPtr and tqProc is interpreted as "pointer to function returning". The function return type may be the base type (bt) in the TIR or it may be constructed from the base type augmented by additional type qualifiers.
Figure 5-37: Function Pointer Alternate Representation
A C++ class resembles an extended C structure. One major distinction is that class fields (referred to as "members") can be functions as well as variables. The set of symbols created for a class is organized as follows:
The name of the class
A block symbol for scoping
Data members
Symbols associated with member functions. Each member function is represented by the normal set of symbols present for a function.
Corresponding end symbols that denote the completion of the block and class.
Another characteristic of classes is that some symbols are defined implicitly by the language. For example, all classes have an operator= operator-overloading function included in the class definition and a this pointer to its own type as a parameter to all member functions. These symbols are always included explicitly in the symbol table description. Figure 5-38 is a graphical representation of the set of symbols for a class.
Figure 5-38 is a graphical representation of the set of symbols for a class.
Figure 5-38: Class Representation
Class members, including member functions, have auxiliary references that point to their type descriptions. Note that member functions are represented as prototypes. The set of symbols defining the member function is elsewhere in the symbol table. To locate the definition of a member function, a name lookup can be performed using the mangled name of the member function with its class name qualifier. See Section 5.3.10.3 for information on name mangling.
C++ structures, unions, and enumerated types are represented the same way as classes. The different data structures are distinguished by basic type value.
The symbol table does not represent class member access attributes.
Examples of base and derived classes can be found in Section 9.2.1.
5.3.8.6.1 Empty Class or Structure (C++)
The representation of an empty class in C++ is shown in Figure 5-39. Empty structures in C++ are represented in a similar manner with the TIR.bt set to btStruct.
Figure 5-39: Empty Class or Structure (C++)
Version Note This empty class or structure representation is supported in Tru64 UNIX V5.1. Prior to Tru64 UNIX V5.1, the default compilers did not distinguish empty classes and structures from opaque classes and structures. See Section 5.3.8.6.2 for more details.
5.3.8.6.2 Opaque Class or Structure (C++)
Opaque classes and structures are incomplete types. They have no member information, which distinguishes them from empty classes and structures, which are fully defined but have no members. The representation of an opaque class in C++ is shown in Figure 5-40. Opaque structures in C++ are represented in a similar manner with TIR.bt set to btStruct.
Figure 5-40: Opaque Class or Structure (C++)
Version Note Prior to Tru64 UNIX V5.1 the default compilers used the preceding representation for empty classes and structures as well as opaque classes and structures.
5.3.8.6.3 Base and Derived Classes (C++)
Hierarchical groups of classes can be designed in C++. A base class serves as a wider classification for its derived classes, and a derived class has all of the members and methods of the base class, plus additional members of its own. In the symbol table, the set of symbols denoting a derived class is nearly identical to that for a non-derived class. The derived class includes an additional stBase or stVirtBase symbol that identifies its corresponding base class, and it does not need to duplicate the definitions for the base class members. This representation is shown in Figure 5-41.
Figure 5-41: Base Class Representation
The representation of virtual base classes for C++ relies on the definition of a special symbol that identifies the virtual base table. The name for this symbol is derived from the name of the class to which it belongs. For example, the virtual base table symbol for class C5 would be named "__btbl_2C5". This table contains entries for base class run-time descriptions. A class can include the special member __bptr. This class member is a pointer to the virtual base table for that class. The value field for a virtual base class symbol (stVirtBase/scInfo) serves as an index (starting at 1) into the virtual base class table.
5.3.8.7 Template Type (C++)
Templates are a C++-specific language construct allowing the parameterization of types. C++ class templates are represented in the symbol table for each instantiation, but not for the template itself. The set of class symbols is unchanged from the set shown in Figure 5-38.
5.3.8.8 Interlude Type (C++)
Interludes are compiler-generated functions in C++. They are represented in the local symbol table with special names starting with the "__INTER__" prefix. Their representation in the symbol table makes use of two RNDXR aux entries to identify the related member function and the actual interlude function, both of which are local symbol table entries.
Figure 5-42: Interlude Representation
5.3.8.9 Array Descriptor Type (Fortran90)
A Fortran90 array descriptor is a structure that describes an array: its location, dimensions, bounds, sizes, and other attributes. Array descriptors are described in detail in the Fortran 90 User Manual for Tru64 UNIX. Fortran90 includes several types of arrays for which the dimensions or dimension bounds are determined at run time: allocatable arrays, assumed shape arrays, and array pointers.
Two symbol table representations have been used for array descriptors. The current representation describes the array descriptor itself. The retired representation described attributes of the array known at compile time.
For both representations, symbols of this type point to a data location at which the array descriptor is allocated. One of the array descriptor fields contains a pointer to the actual array. Other fields are used to describe the attributes of the array. Fields that describe the number of dimensions and upper and lower bounds are filled in at run time.
By default, array descriptors are described by a structure tag representation. Most of the array descriptor fields are represented as structure members. (Excluded fields are not needed by debuggers.) Special tag names are used to identify array descriptor structure definitions: $f90$f90_array_desc (assumed-shape array), $f90$f90_ptr_desc (pointer to array) and $f90$f90_alloc_desc (allocatable array). Figure 5-43 shows the format of this representation.
Some compilers may emit other fields in addition to those shown in Figure 5-43. A consumer's ability to interpret additional fields depends on its knowledge of the producing compiler.
Figure 5-43: Array Descriptor Representation
An example of the default Fortran array descriptor representation can be found in Section 9.3.3.
Version Note The following representation of Fortran array descriptors is supported in symbol table formats less than V3.13. It is not supported in symbol table format V3.13 and greater.
This retired representation of Fortran array descriptors is substantially more compact in the local symbol table, but it provides no way to distinguish between the different array descriptor types.
The overloaded basic type value 28 indicates an array descriptor in the TIR, and dimension bounds are set to [1:1], indicating that their true size is unknown. The retired representation does not provide any information describing the contents of the array descriptor itself, so debuggers must assume a static representation for the descriptor and look up the fields at their expected offsets. Figure 5-44 shows this representation of array descriptors.
Figure 5-44: Array Descriptor Representation (retired)
5.3.8.10 Conformant Array Type (Pascal)
Full details are not currently available for Pascal's conformant array representation. A Pascal conformant array is very similar to a Fortran assumed-shape array. It is an array parameter with upper and lower dimension bounds that are determined by the input argument. A conformant array is represented by an array descriptor. The special names used and the format of the array descriptor differ from those used for Fortran. The DEC Pascal release notes contain additional information on conformant arrays.
5.3.8.11 Variant Record Type (Pascal and Ada)
A variant record is an extension to the record data type, which is a Pascal or Ada data structure akin to a C structure and is represented in the same manner in the symbol table. The variant part of the record consists of sets of one or more fields associated with a range of values. Only one such set is part of the record, and it is selected based on the value of another record field. Any number of variant parts can be embedded in a single record.
Version Note The following variant record representation is for symbol table format V3.13 and greater.
The local symbol table entries for the variant part of a record are contained within a block with the storage class (sc value) scVariant. The value field of the stBlock entry contains the index of the local symbol entry for the member of the record whose value determines which variant arm is used. The variant block contains multiple inner blocks, each representing a variant arm. The value field of each of these block entries is an auxiliary table index. Each auxiliary table entry starts with a count, which indicates how many range entries follow. The range entries describe the values associated with the block.
Figure 5-45 is a graphical representation of a variant record.
Figure 5-45: Variant Record Representation
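The count-then-ranges layout described above can be sketched as a small lookup that finds which variant arm applies to a given tag value. This is only an illustration under stated assumptions: the auxiliary table is modeled as a flat list of integers, each range entry is assumed to be a (low, high) pair, and the names read_variant_ranges and select_arm are hypothetical. The actual entry layout is the one shown in Figure 5-45.

```python
def read_variant_ranges(aux, index):
    """Return the ranges stored at aux[index]: a count, then that many
    (low, high) pairs. The pair layout is an assumption for this sketch."""
    count = aux[index]
    ranges = []
    pos = index + 1
    for _ in range(count):
        ranges.append((aux[pos], aux[pos + 1]))
        pos += 2
    return ranges


def select_arm(aux, arm_aux_indexes, tag_value):
    """Return the position of the first arm whose ranges contain tag_value,
    or None when no arm matches."""
    for arm, index in enumerate(arm_aux_indexes):
        for low, high in read_variant_ranges(aux, index):
            if low <= tag_value <= high:
                return arm
    return None


# Hypothetical auxiliary data: arm 0 covers 1..3; arm 1 covers 5..5 and 8..9.
aux = [1, 1, 3, 2, 5, 5, 8, 9]
arm_aux_indexes = [0, 3]  # value fields of the two inner block entries

print(select_arm(aux, arm_aux_indexes, 2))  # 0
print(select_arm(aux, arm_aux_indexes, 8))  # 1
print(select_arm(aux, arm_aux_indexes, 4))  # None
```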
Version Note The following variant record representation is for symbol table formats less than V3.13. It is not supported in symbol table format V3.13 and greater.
The representation of variant records depicted in Figure 5-46 does not include TIR auxiliaries.
Figure 5-46: Variant Record Representation (retired)
An example of a Pascal variant record can be found in Section 9.4.3.
5.3.8.12 Subrange Type (Pascal and Ada)
A subrange data type defines a subset of the values associated with a particular ordinal type (the "base type" of the subrange). Ordinal types in Pascal include integers, characters, and enumerated types. The symbol table representation of a subrange uses the btRange or btRange_64 type followed by an auxiliary index identifying the base type and entries providing the bounds of the subrange. The 32-bit representation is shown in Figure 5-47 and the 64-bit representation is shown in Figure 5-48.
Figure 5-47: Subrange Representation
Figure 5-48: 64-bit Range Representation
Version Note The 64-bit range representation is supported in symbol table format V3.13 and greater.
An example of a Pascal subrange can be found in Section 9.4.2.
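As a rough sketch of how a consumer might use a subrange description, the following models the entries that follow a btRange type as three consecutive values: the base type index, the low bound, and the high bound. That flat layout, along with the names Subrange, decode_subrange, and in_subrange, is an assumption made for illustration. The real entry layout is the one shown in Figure 5-47 and Figure 5-48.

```python
from collections import namedtuple

# Hypothetical decoded form of a subrange description.
Subrange = namedtuple("Subrange", "base_type_index low high")


def decode_subrange(aux, index):
    """Decode three consecutive entries: base type index, low bound, high bound.
    The flat three-entry layout is assumed for this sketch."""
    return Subrange(aux[index], aux[index + 1], aux[index + 2])


def in_subrange(subrange, value):
    """Report whether value lies within the inclusive bounds of the subrange."""
    return subrange.low <= value <= subrange.high


# Hypothetical data: base type at auxiliary index 7, bounds 1..10.
months = decode_subrange([7, 1, 10], 0)
print(in_subrange(months, 10))  # True
print(in_subrange(months, 11))  # False
```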
5.3.8.13 Set Type (Pascal)
A set is a data type that groups ordinal elements in an unordered list. The arithmetic and logical operators are overloaded in Pascal; this enables them to be used with set variables to perform classic set operations such as union and intersection. A special auxiliary type definition btSet exists to identify this type. The symbol table representation is depicted in Figure 5-49.
Figure 5-49: Set Representation
The element type for a set is typically a range or an enumeration. An example of a Pascal set can be found in Section 9.4.1.
5.3.9 Special Debug Symbols
A variety of special symbols are used throughout the symbol table to convey call frame information, special type semantics, or other language-specific information. These names are reserved for use by compilers and other tools that produce Tru64 UNIX object files.
Table 5-14: Special Debug Symbols
Name | Purpose |
__StaticLink.* | (SV3.13 - ) Uplevel link. See Section 5.3.4.4. |
_BLNK__ | Fortran unnamed common block. See Section 5.3.6.6. |
MAIN__ | Fortran alias for main program unit. See Section 5.3.10.4. |
ARGNAME.len | Generated parameter for Fortran routines. It contains the length of ARGNAME, a parameter of character type. |
.lb_<ARRAY>.<dim>, .ub_<ARRAY>.<dim> | Lower and upper bounds of particular dimensions of arrays - when the array has an explicit shape, yet some bounds come from non-constant specification expressions (array arguments in Pascal and Fortran routines). |
$f90$f90_array_desc, $f90$f90_alloc_desc, $f90$f90_ptr_desc | Variants of Fortran-90 described arrays (assumed shape, ALLOCATABLE, and POINTER, respectively). See Section 5.3.8.9. |
cray pointee | Fortran-generated typedef describing the type of a variable pointed to by a CRAY pointer. |
pointer | Fortran-generated typedef describing the type of a scalar with the POINTER attribute. |
_DECCXX_generated_name_* | DECC++ compiler-inserted name for unnamed classes and enumerations. |
this | Hidden parameter in C++ member functions that is a pointer to the current instance of the class. See Section 5.3.8.6. |
__vptr | Hidden C++ class member containing the virtual function table. See example in Section 9.2.2. |
__bptr | Hidden C++ class member containing the virtual base class table. See example in Section 9.2.2. |
__vtbl_* | Global symbols for C++ virtual function tables. See example in Section 9.2.2. |
__btbl_* | Global symbols for C++ virtual base class tables. See example in Section 9.2.2. |
__control | Hidden argument to C++ constructors controlling descent (in the face of virtual base classes). |
__t*__evdf | Structure used to maintain a list of C++ global destructors. |
t*__iviw | C++ static procedure used for global constructors. |
t*__evdw | C++ static procedure used for global destructors. |
__t*_thunk | C++ static procedure used to provide a defaulted argument value. |
__INTER__* | C++ interlude. See example in Section 9.2.2. |
__N1* | C++ unnamed namespaces. See example in Section 9.2.4. |
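A symbol table consumer could recognize some of these reserved names with simple pattern matching. The sketch below is illustrative only: it copies a subset of the names from Table 5-14, treats the asterisk as a shell-style wildcard (an assumption about how the table's notation should be read), and uses a hypothetical helper named classify. The matching itself relies on fnmatchcase from the Python standard library.

```python
from fnmatch import fnmatchcase

# A subset of Table 5-14. The "*" is treated as a shell-style wildcard,
# which is an assumption about the table's notation.
SPECIAL_PATTERNS = [
    ("__StaticLink.*", "uplevel link"),
    ("_BLNK__", "Fortran unnamed common block"),
    ("MAIN__", "Fortran main program unit"),
    ("__vtbl_*", "C++ virtual function table"),
    ("__btbl_*", "C++ virtual base class table"),
    ("__INTER__*", "C++ interlude"),
    ("_DECCXX_generated_name_*", "unnamed C++ class or enumeration"),
]


def classify(name):
    """Return the purpose of a special debug symbol, or None for ordinary names."""
    for pattern, purpose in SPECIAL_PATTERNS:
        if fnmatchcase(name, pattern):
            return purpose
    return None


print(classify("__btbl_2C5"))  # C++ virtual base class table
print(classify("MAIN__"))      # Fortran main program unit
print(classify("counter"))     # None
```

Case-sensitive matching is used deliberately, because names such as MAIN__ are distinguished from user identifiers partly by case.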
Among the linker's chief tasks is symbol resolution. Because most compilations involve multiple source files and virtually all programs rely on system libraries, a process is necessary to resolve conflicting uses of global symbol names. The linker must decide which symbol is referenced by a given name. This section highlights the major issues involved in that decision. Related information is contained in Section 6.3.4 and the Programmer's Guide.
Symbol table entries provide information relevant to performing symbol resolution. External symbols with a storage class of sc(S)Undefined, sc(S)Common, or scTlsCommon must be resolved before they are referenced. By default, the linker will not mark an object file with unresolved symbols as executable. However, linker options give programmers a fair measure of control over its symbol resolution behavior. See ld(1) for more information.
5.3.10.1 Library Search
Symbols that are referenced but not defined in the main executable of an application must be matched with definitions in linked-in libraries. The linker combines objects, archives, and shared libraries while attempting to resolve all references to undefined symbols. The Programmer's Guide covers related topics in detail, such as how to specify libraries during compilation and the search order of libraries.
In general, main executable objects and shared libraries are searched before archive libraries. If no undefined external symbols remain, archive libraries in the library list do not have to be searched, because archive members are only loaded to resolve external references. Archives are not used to find "better" common definitions (see Section 5.3.10.2), and no archive definitions preempt symbol definitions from the main object or shared libraries.
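The general search behavior described above can be sketched as follows. This is a simplified model, not the linker's algorithm: objects, shared libraries, and archive members are represented as dictionaries with hypothetical "name", "defines", and "references" keys, and details such as library ordering options and common symbols are ignored.

```python
def load_archive_members(objects_and_shared, archives):
    """Return (loaded member names, still-undefined symbols).

    Objects and shared libraries are processed first. Archive members are
    loaded only when they define a symbol that is still undefined.
    """
    defined, undefined = set(), set()
    for unit in objects_and_shared:
        defined |= unit["defines"]
        undefined |= unit["references"]
    undefined -= defined

    loaded = []
    progress = True
    while undefined and progress:
        progress = False
        for archive in archives:
            for member in archive:
                if member["name"] in loaded:
                    continue
                if member["defines"] & undefined:
                    loaded.append(member["name"])
                    defined |= member["defines"]
                    undefined |= member["references"]
                    undefined -= defined
                    progress = True
    return loaded, undefined


# Hypothetical inputs: main.o needs f, f.o needs g, and h.o is never needed.
objects = [{"name": "main.o", "defines": {"main"}, "references": {"f"}}]
archive = [
    {"name": "f.o", "defines": {"f"}, "references": {"g"}},
    {"name": "g.o", "defines": {"g"}, "references": set()},
    {"name": "h.o", "defines": {"h"}, "references": set()},
]
loaded, undefined = load_archive_members(objects, [archive])
print(loaded)     # ['f.o', 'g.o']
print(undefined)  # set()
```

In this model the member h.o is never loaded, which mirrors the statement that archive members are loaded only to resolve external references.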
5.3.10.2 Resolution of Symbols with Common Storage Class
Symbols with common storage class are a special category of global symbols that have a size but no allocated storage. Symbols with common storage class should not be confused with Fortran common symbols, which are not represented by a single symbol table entry. (See Section 5.3.6.6 for a description of Fortran common symbols.) Common storage classes are scCommon, scSCommon, and scTlsCommon.
The symbol definition model used by Tru64 UNIX allows an unlimited number of common storage class symbols with the same name. Ultimately, the "best" of these must be selected (by the linker or the loader) during symbol resolution. The criteria used to select the best symbol definition include the symbol's allocation status and size.
The symbol table does not provide an "allocated common" storage class. Common storage class symbols adopt a new storage class when they are allocated. Typically, their new storage class is scBss, scSBss, or scTlsBss. On the other hand, the dynamic symbol table does explicitly distinguish common storage class symbols that have been allocated. See Section 6.3.4 for more information on dynamic symbol resolution.
A symbol reference is resolved according to the following precedence rules:
Find a symbol definition that does not have a common storage class and is not identified as an allocated common in the dynamic symbol table.
Find the largest allocated common identified in the dynamic symbol table.
Find the largest common storage class symbol and allocate it. This step will be skipped when the linker produces a relocatable object file.
Precedence is given to symbol definitions with storage allocation to minimize load time common allocation and redundant storage allocations in shared objects. The loader is capable of allocating space for common storage class symbols, but this should only be necessary when a program references an allocated common symbol in a shared library that is later removed from that shared library.
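The three precedence rules can be sketched as a small selection function. This is an illustrative model under stated assumptions: each candidate is a dictionary with hypothetical "kind", "size", and "origin" keys, the "kind" values stand in for the storage class and dynamic symbol table information, and the choice among several non-common definitions (the first one found) is not specified by the rules above.

```python
def resolve(candidates, relocatable=False):
    """Pick a definition for one symbol name using the three precedence rules.

    Each candidate has a "kind" of "definition", "allocated_common", or
    "common". These labels are hypothetical stand-ins for real symbol data.
    """
    # Rule 1: a definition that is neither common nor an allocated common.
    definitions = [c for c in candidates if c["kind"] == "definition"]
    if definitions:
        return definitions[0]  # choice among several is an assumption

    # Rule 2: the largest allocated common from the dynamic symbol table.
    allocated = [c for c in candidates if c["kind"] == "allocated_common"]
    if allocated:
        return max(allocated, key=lambda c: c["size"])

    # Rule 3: the largest common, allocated now; skipped for relocatable output.
    commons = [c for c in candidates if c["kind"] == "common"]
    if commons and not relocatable:
        chosen = dict(max(commons, key=lambda c: c["size"]))
        chosen["allocated"] = True
        return chosen
    return None


commons = [
    {"kind": "common", "size": 8, "origin": "a.o"},
    {"kind": "common", "size": 16, "origin": "b.o"},
]
print(resolve(commons)["origin"])          # b.o
print(resolve(commons, relocatable=True))  # None
```

Returning None for relocatable output models the symbol remaining unallocated until a later link or load step.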
Note that Fortran common block representations use common storage class symbols. Another very frequent occurrence of a common storage class symbol is a C-language global variable that does not have an initializer in its declaration.
5.3.10.3 Mangling and Demangling
Another issue related to symbol resolution is the need to "mangle" user-level identifiers. For example, C++ allows function overloading, prototyping, and the use of templates, all of which can result in the occurrence of the same names for different entities. The solution employed by the symbol table is to use mangled names that derive from the symbol's type signature.
Object file consumers, such as debuggers and object dumpers, need to "demangle" the identifiers so they can be output in a form that is recognizable to the user. For linking and loading, the mangled names are used for symbol resolution.
The encoding of C++ names is described in the manual Using DEC C++ for Tru64 UNIX Systems.
Other compilers may write symbol names that are modified by prepending
or appending special characters such as dollar sign ($) or underscore (_)
or by prepending qualifier strings such as file names or namespace names.
Uppercasing of names is also common for certain languages such as Fortran.
All of these transformations fall into the general category of mangled names.
Refer to the release notes for specific compilers for additional information.
5.3.10.4 Mixed Language Resolution
Compilation of a program involving multiple source languages introduces additional symbol resolution issues. One important task is resolving the main program entry point because conflicting "main" symbols may be present in the different files. For C and C++, the symbol "main" is the main program entry point, but for other languages, "main" will either be an alias for the main program or an interlude. DEC Fortran and DEC COBOL provide interludes that perform some language specific initializations and then call the real main program entry point. For DEC Fortran the main program is "MAIN__" and for DEC COBOL the main program is "__cobol_main". DEC Pascal provides a "main" symbol that aliases the actual main program symbol.
The symbols "MAIN__" and "__cobol_main" can both be present in a mixed
language program, and either, neither, or both can be used by the program.
Debuggers can set a breakpoint in the user's main program by applying some
precedence for selecting the most appropriate symbol.
For a mixed language
program, there is a slight chance that "MAIN__" or "__cobol_main" will be
present but never called.
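One way a debugger might apply such a precedence is sketched below. The symbol names come from the text, but the particular ordering (language-specific main programs before plain "main") and the function name pick_main_symbol are assumptions for illustration. The text does not prescribe an order.

```python
# Assumed precedence: language-specific main programs first, then "main".
MAIN_CANDIDATES = ["MAIN__", "__cobol_main", "main"]


def pick_main_symbol(defined_names):
    """Return the preferred main-program symbol present in defined_names,
    or None when no candidate is defined."""
    for name in MAIN_CANDIDATES:
        if name in defined_names:
            return name
    return None


print(pick_main_symbol({"main", "MAIN__", "helper"}))  # MAIN__
print(pick_main_symbol({"main", "helper"}))            # main
print(pick_main_symbol({"helper"}))                    # None
```

Because "MAIN__" or "__cobol_main" may be present but never called, a breakpoint chosen this way is a best guess rather than a guarantee.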
5.3.10.5 TLS Symbols
TLS (Thread Local Storage) symbols, like non-TLS symbols, can be undefined or common. Unresolved TLS symbols are identified by the storage class scTlsUndefined, and TLS commons have the storage class scTlsCommon. The symbol resolution process for TLS names is similar, but separate; TLS symbols cannot be resolved to non-TLS symbols or vice versa.
TLS common symbols are resolved in the same manner as other common storage class symbols (see Section 5.3.10.2), except that, again, only TLS symbols are candidates for resolution.
Another rule special to TLS is that symbol definitions for TLS common
and undefined symbols cannot
be imported from shared libraries.
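The two TLS rules can be sketched as a filter applied to candidate definitions before the usual resolution steps. The dictionary keys "tls", "from_shared_library", and "origin", and the function name tls_candidates, are hypothetical. The sketch illustrates the rules and is not a description of the linker's implementation.

```python
def tls_candidates(reference, definitions):
    """Keep the definitions that may legally resolve the given reference."""
    keep = []
    for d in definitions:
        if d["tls"] != reference["tls"]:
            continue  # TLS and non-TLS symbols never resolve to each other
        if reference["tls"] and d["from_shared_library"]:
            continue  # TLS definitions are not imported from shared libraries
        keep.append(d)
    return keep


# Hypothetical candidates for a TLS reference to the name "x".
reference = {"name": "x", "tls": True}
definitions = [
    {"name": "x", "tls": False, "from_shared_library": False, "origin": "a.o"},
    {"name": "x", "tls": True, "from_shared_library": True, "origin": "libx.so"},
    {"name": "x", "tls": True, "from_shared_library": False, "origin": "b.o"},
]
print([d["origin"] for d in tls_candidates(reference, definitions)])  # ['b.o']
```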
5.4 Language-Specific Symbol Table Features
Language-specific characteristics are pervasive in the symbol table, particularly in the local, external, and auxiliary symbol tables. See Section 5.2 and Section 5.3.7 for information on language-specific values.
The lang field of the file descriptor entry encodes the source language of the file. This field should be accessed prior to decoding symbolic information, especially type descriptions. This section highlights, by language, language-specific features represented in the symbol table. Additional information on certain features is available elsewhere in this chapter.
5.4.1 Fortran77 and Fortran90
In Fortran, it is possible to create multiple entry points in subroutines. A subroutine has one main entry point and zero or more alternate entry points, indicated by ENTRY statements. See Section 5.3.6.7 for their representation in the symbol table.
Fortran90 array descriptors include allocatable arrays, assumed-shape arrays, and pointers to arrays. Their representation in the symbol table is discussed in Section 5.3.8.9.
Modules provide another scoping level in Fortran90 programs.
The symbol
table representation for modules has not yet been implemented.
5.4.2 C++
C++ classes encapsulate functions and data inside a single structure. Classes are represented in the symbol table using a btClass basic type and the stBlock/stEnd scoping mechanism. See Section 5.3.8.6.
Templates provide for parameterized types. At present, no special symbol table values are related to templates. The template itself is not represented; rather, entries that correspond to each instantiation are generated. Template instantiations are distinguished by mangled names based on their type signatures.
C++ namespaces, like Fortran modules, offer an additional scope for program identifiers.
The C++ concepts of private, protected, and public data attributes are not currently represented in the symbol table. The C++ concept of "friend" classes and functions is also not represented.
5.4.3 Pascal and Ada
Pascal conformant arrays are function parameters with array dimensions that are determined by the arguments passed to the function at run time. See Section 5.3.8.10.
Variant records are an extension of the record data structure. Variant records allow different sets of fields depending on the value of a particular record member. See Section 5.3.8.11.
Nested procedures are supported in these languages. They are represented using standard scoping mechanisms discussed in Section 5.3.6 and uplevel references described in Section 5.3.4.4.
Sets and subranges are user-defined subsets of ordinal types. Sets are unordered groups of elements, which can be manipulated with the classic set operations. Subranges are ordered and are used with the usual operators. See Section 5.3.8.12 and Section 5.3.8.13.
Ada subtypes of ordinal types are represented in the same manner as Pascal subranges.