The Third Degree tool checks for leaking heap memory, referencing invalid
addresses and reading uninitialized memory in C and C++ programs.
Programs must first be compiled with either the -g or -gn option, where n is greater than 0.
Third Degree also helps you determine the allocation habits
of your program by listing heap objects and finding wasted memory.
It accomplishes
this by instrumenting executable objects with extra code that automatically
monitors memory management services and load/store instructions at run time.
The requested reports are written to one or more log files that can optionally
be displayed, or associated with source code by using the
xemacs
(1)
editor.
By default, Third Degree checks only for memory leaks, resulting in
fast instrumentation and run-time analysis.
The other more expensive and intrusive
checks are selected with options on the command line.
See
third
(1)
for more information.
You can use Third Degree for the following types of applications:
Applications that allocate memory by using the
malloc
,
calloc
,
realloc
,
valloc
,
alloca
, and
sbrk
functions
and the C++
new
function.
You can also use Third Degree
to instrument programs using other memory allocators, such as the
mmap
function, but it will not check accesses to the memory obtained
in this manner.
If
your application uses
mmap
, see the description of the
-mapbase
option in
third
(1).
Third Degree detects and forbids calls to the
brk
function.
Furthermore, if your program allocates memory by partitioning large
blocks that it obtained by using the
sbrk
function, Third
Degree may not be able to precisely identify memory blocks in which errors
occur.
Applications that call
fork
(2).
You must specify the
-fork
option with the
third
(1)
command.
Applications that use the Tru64 UNIX implementation of
POSIX threads
(
pthread
(3)).
You must specify the
-pthread
option with the
third
(1)
command.
In
pthread
programs,
Third Degree does not check system-library routines (for example,
libc
and
libpthread
) for access to invalid addresses
or uninitialized variables; therefore,
strcpy
and other
such routines will not be checked.
Applications that use 31-bit heap addresses.
7.1 Running Third Degree on an Application
To invoke Third Degree, use the
third
(1)
command as follows:
third
[option...]
app
[argument...]
In this command synopsis,
option
selects
one or more options beyond the default nonthreaded leak checking,
app
is the name of the application, and
argument
represents one or more optional arguments that are passed to
the application if you want to run the instrumented program immediately.
(Use
the
-run
option if
app
needs
no arguments.)
The instrumented program, named
app.third
(see
third
(1)), differs from the original as follows:
The code is larger and runs more slowly because of the additional instrumentation code that is inserted. The amount of overhead depends on the number and nature of the specified options.
To detect errant use of uninitialized data, Third Degree initializes
all otherwise uninitialized data to a special pattern (0xfff8a5a5, or as specified
in the
-uninit
option).
This can cause the instrumented
program to behave differently, behave incorrectly, or crash (particularly
if this special pattern is used as a pointer).
All of these behaviors indicate
a bug in the program.
You can take advantage of this by running regression
tests on the instrumented program, and you can investigate problems using
the
third
(1)
command's
-g
option and running a debugger.
Third Degree poisons memory in this way only if the
-uninit
option is specified (for example,
-uninit heap+stack
).
Otherwise, most instrumented programs run just like the original.
Each allocated heap memory object is larger because Third
Degree pads it to allow boundary checking.
You can adjust the amount of padding
by specifying the
-pad
option.
When memory is deallocated with
free
or
delete
, it is held back from the free pool to help detect invalid
access.
Adjust the holding queue size with the
-free
option.
Third Degree writes error messages in a format similar to that used
by the C compiler.
It writes them to a log file named
app.3log
, by default.
You can use
emacs
to automatically
point to each error in sequence.
In
emacs
, use
[Esc/X]
compile
, replace the default
make
command with a command such as
cat
app.3log
, and step through the errors as
if they were compilation errors, with
[Ctrl/X]
`
.
You can change the name used for the log file by specifying one of the following options:
-pids
Includes the process identification number (PID) in the log file name.
-dirname
directory-name
Specifies the directory path in which Third Degree creates its log file.
-fork
Includes the PID in the name of the log file for each forked process.
Depending on the option supplied, the log file's name will be as follows:
Option | Filename | Use |
None or -fork parent | app.3log | Default |
-pids or -fork child | app.12345.3log | Include PID |
-dirname /tmp | /tmp/app.3log | Set directory |
-dirname /tmp -pids | /tmp/app.12345.3log | Set directory and PID |
Errors in signal handlers may be reported in an additional
.sig.3log
file.
7.1.1 Using Third Degree with Shared Libraries
Errors in an application, such as passing too small a buffer to the
strcpy
function, are often caught in library routines.
Third Degree
supports the instrumentation of shared libraries; it instruments programs
linked with either the
-non_shared
or the
-call_shared
option.
The following options let you determine which shared libraries are instrumented by Third Degree:
-all
Instruments all shared libraries that were linked with the call-shared executable.
-excobj
objname
Excludes the named shared library from instrumentation.
You
can use the
-excobj
option more than once to specify
several shared libraries.
-incobj
objname
Instruments the named shared library.
You can use the
-incobj
option more than once to specify several shared libraries,
including those loaded using
dlopen()
.
-Ldirectory
Tells Third Degree where to find the program's shared libraries if they are not in the standard places known to the linker and loader.
When Third Degree finishes instrumenting the application, the current
directory contains an instrumented version of each specified shared library,
and at least minimally instrumented versions of
libc.so
,
libcxx.so
, and
libpthread.so
, as appropriate.
The instrumented application needs to use these versions of the libraries.
Define the
LD_LIBRARY_PATH
environment variable to tell
the instrumented application where the instrumented shared libraries reside.
The
third
(1)
command will do this automatically if you specify the
-run
option or you specify arguments to the application (which also
cause the instrumented program to be executed).
By default, Third Degree does not fully instrument any of the shared
libraries used by the application, though it does have to minimally instrument
libc.so
,
libcxx.so
, and
libpthread.so
when used.
This makes the instrumentation operation much faster
and causes the instrumented application to run faster as well.
Third Degree
detects and reports errors in the instrumented portion normally, but it does
not detect errors in the uninstrumented libraries.
If your partially instrumented
application crashes or malfunctions and you have fixed all of the errors reported
by Third Degree, reinstrument the application and all of its shared libraries
and run the new instrumented version, or use Third Degree's
-g
option to investigate the problem in a debugger.
Third Degree needs to instrument a shared library (but only minimally,
by default) to generate error reports that include stack traces through its
procedures.
Also, a debuggable procedure (compiled with the
-g
option, for example) must appear within the first few stack frames nearest
the error.
This avoids printing spurious errors that the highly optimized
and assembly code in system libraries can generate.
Use the
-hide
option to override this feature.
For pthread programs, Third Degree does not check some system shared
libraries (including
libc
) for errors, because doing so
would not be thread safe.
7.2 Debugging Example
Assume that you must debug the small application represented by the
following source code (ex.c
):
 1 #include <assert.h>
 2
 3 int GetValue() {
 4     int q;
 5     int *r=&q;
 6     return q;                     /* q is uninitialized */
 7 }
 8
 9 long* GetArray(int n) {
10     long* t = (long*) malloc(n * sizeof(long));
11     t[0] = GetValue();
12     t[0] = t[1]+1;                /* t[1] is uninitialized */
13     t[1] = -1;
14     t[n] = n;                     /* array bounds error*/
15     if (n<10) free(t);            /* may be a leak */
16     return t;
17 }
18
19 main() {
20     long* t = GetArray(20);
21     t = GetArray(4);
22     free(t);                      /* already freed */
23     exit(0);
24 }
The following sections explain how to use Third Degree to debug this
sample application.
7.2.1 Customizing Third Degree
Command-line options are used to turn on and off various capabilities of Third Degree.
If you do not specify any options, Third Degree instruments the program
as follows but does not run the instrumented program or display the resulting
.3log
file(s):
Detect leaks at program exit.
Do not check for memory errors (invalid addresses or uninitialized values).
Do not analyze the heap-usage history.
You can run the instrumented application with a command such as
./app.third
arg1 arg2
after setting the
LD_LIBRARY_PATH
environment variable.
Alternatively, you can append the application arguments to the
third
(1)
command line and/or specify the
-run
or
-display
options.
You can view the resulting
.3log
file
manually or by specifying the
-display
option.
To add checks for memory errors, specify the
-invalid
option and/or the
-uninit
option.
You can abbreviate the
-invalid
option, like all
third
(1)
options, to three letters (-inv
).
It tells Third Degree
to check that all significant load and store instructions access valid
memory addresses that application code is entitled to use.
This option carries a noticeable
performance overhead, but it has little effect on the run-time environment.
The
-uninit
option takes a "+"-separated
list of keyword arguments.
This is usually
heap+stack
(or
h+s
), which asks that both heap memory and stack memory be checked
for all significant load instructions.
Checking involves prefilling all stack
frames and heap objects allocated with
malloc
, and so on
(but not
calloc
), with the unusual pattern 0xfff8a5a5,
and reporting any load instruction that reads such a value out of memory.
That is, the selected memory is poisoned, much as by the
cc -trapuv
option, to highlight code that reads uninitialized data areas.
If the offending code was selected for instrumentation, Third Degree will
report each case (once only) in the
.3log
file.
However,
whether or not the code was instrumented, the code will load and process the
poison pattern instead of the value that the original program would have loaded.
This may cause the program to malfunction or crash, because the pattern is
not a valid pointer, character, or floating-point number, and it is a negative
integer.
Such behavior is a sign of a bug in the program.
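The properties of the poison pattern can be illustrated with a short C sketch. This is not Third Degree's implementation: the helper names poison_words, looks_poisoned, and poison_as_int are hypothetical, and only the pattern value 0xfff8a5a5 comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Default poison pattern described in the text. */
#define POISON_WORD 0xfff8a5a5u

/* Hypothetical helper: prefill a buffer of 32-bit words with the pattern,
   roughly what happens to new stack frames and heap blocks. */
static void poison_words(uint32_t *words, size_t count)
{
    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        words[i] = POISON_WORD;
}

/* Hypothetical helper: does a loaded 32-bit value still hold the pattern? */
static int looks_poisoned(uint32_t value)
{
    return value == POISON_WORD;
}

/* Reinterpret the pattern as a signed 32-bit integer. */
static int32_t poison_as_int(void)
{
    uint32_t bits = POISON_WORD;
    int32_t value;
    memcpy(&value, &bits, sizeof value);   /* well-defined reinterpretation */
    return value;
}
```

A word that the program overwrites stops matching the pattern, while poison_as_int() yields a negative number, which is why code that consumes a poisoned value as a count or an index tends to misbehave visibly.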
You can identify malfunctions by running regression tests on the instrumented
program, specifying
-quiet
and omitting
-display
if running within
third
(1).
You can debug malfunctions or crashes
by looking at the error messages in the
.3log
file and
by running the instrumented program in a debugger such as
dbx
(1), or
ladebug
(1)
for C++ and pthread applications.
To use a debugger, compile with a
-g
option and specify
-g
on the
third
(1)
command line as well.
The
-uninit
option can report false errors, particularly
for variables, array elements, and structure members of less
than 32 bits (for example,
short
,
char
,
bit-field).
See
Section 7.6.
However, using
the
-uninit heap+stack
option can improve the accuracy
of leak reports.
To add a heap-usage analysis, specify the
-history
option.
This enables the
-uninit heap
option.
7.2.2 Modifying the Makefile
Add the following entry to the application's makefile:
ex.third: ex
	third ex
Build
ex.third
as follows:
> make ex.third
third ex
Now run the instrumented application
ex.third
and
check the log
ex.3log
.
Alternatively, run it and display
the
.3log
file immediately by adding the
-display
option before the program name.
7.2.3 Examining the Third Degree Log File
The
ex.3log
file contains several parts that are
described in the following sections, assuming this command line as an example:
> third -invalid -uninit h+s -history -display ex
7.2.3.1 List of Run-Time Memory Access Errors
The types of errors that Third Degree can detect at run-time include such conditions as reading uninitialized memory, reading or writing unallocated memory, freeing invalid memory, and certain serious errors likely to cause an exception. For each error, an error entry is generated with the following items:
A banner line with the type of error and number -- The error banner line contains a three-letter abbreviation of each error (see Section 7.3 for a list of the abbreviations). If the process that caused the error is not the root process (for instance, because the application forks one or more child processes), the PID of the process that caused the error also appears in the banner line.
An error message line formatted to look like a compiler error message -- Third Degree lists the file name and line number nearest to the location where the error occurred. Usually this is the precise location where the error occurred, but if the error occurs in a library routine, it can also point to the place where the library call occurred.
One or more stack traces -- The last part of an error entry is a stack trace. The first procedure listed in the stack trace is the procedure in which the error occurred.
The following examples show entries from the log file:
The following log entry indicates that a local variable of
procedure
GetValue
was read before being initialized.
The line number confirms that
q
was never given a value.
---------------------------------------------------- rus -- 0 --
ex.c: 6: reading uninitialized local variable q of GetValue
    GetValue                      ex, ex.c, line 6
    GetArray                      ex, ex.c, line 11
    main                          ex, ex.c, line 20
    __start                       ex
In the following log entry, an error is reported at line 12:
t[0] = t[1]+1
Because the array was not initialized, the
program is using the uninitialized value of
t[1]
in the
addition.
The memory block containing array
t
is identified
by the call stack that allocated it.
Stack variables are identified by name
if the code was compiled with the
-g
option.
---------------------------------------------------- ruh -- 1 --
ex.c: 12: reading uninitialized heap at byte 8 of 160-byte block
    GetArray                      ex, ex.c, line 12
    main                          ex, ex.c, line 20
    __start                       ex

This block at address 0x14000ca20 was allocated at:
    malloc                        ex
    GetArray                      ex, ex.c, line 10
    main                          ex, ex.c, line 20
    __start                       ex
The following log entry indicates that the program has written to the memory location one position past the end of the array, potentially overwriting important data or even Third Degree internal data structures. Keep in mind that certain errors reported later could be a consequence of this error:
---------------------------------------------------- wih -- 2 --
ex.c: 14: writing invalid heap 1 byte beyond 160-byte block
    GetArray                      ex, ex.c, line 14
    main                          ex, ex.c, line 20
    __start                       ex

This block at address 0x14000ca20 was allocated at:
    malloc                        ex
    GetArray                      ex, ex.c, line 10
    main                          ex, ex.c, line 20
    __start                       ex
The following log entry indicates that an error occurred while
freeing memory that was previously freed.
For errors involving calls to the
free
function, Third Degree usually gives three call stacks:
The call stack where the error occurred
The call stack where the object was allocated
The call stack where the object was freed
Upon examining the program, it is clear that the second call
to
GetArray
(line 21) frees the object (line 15), and that
another attempt to free the same object occurs at line 22:
---------------------------------------------------- fof -- 3 --
ex.c: 22: freeing already freed heap at byte 0 of 32-byte block
    free                          ex
    main                          ex, ex.c, line 22
    __start                       ex

This block at address 0x14000d1a0 was allocated at:
    malloc                        ex
    GetArray                      ex, ex.c, line 10
    main                          ex, ex.c, line 21
    __start                       ex

This block was freed at:
    free                          ex
    GetArray                      ex, ex.c, line 15
    main                          ex, ex.c, line 21
    __start                       ex
See
Section 7.3
for more information.
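For comparison, the following is one possible corrected version of the two routines from ex.c. It is a sketch under assumptions that the original example does not state: that array elements should start at zero, that the last valid element was the intended target of the out-of-bounds store, and that the caller always owns and frees the returned block.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a defined value; the original returned an uninitialized local. */
int GetValue(void)
{
    int q = 0;              /* assumption: 0 is the intended default */
    return q;
}

/* Allocates n longs. The caller always owns and frees the result. */
long *GetArray(int n)
{
    assert(n >= 2);         /* the routine writes t[0], t[1], and t[n-1] */
    long *t = calloc((size_t)n, sizeof(long));   /* zero-filled block */
    if (t == NULL)
        return NULL;
    t[0] = GetValue();
    t[0] = t[1] + 1;        /* t[1] is now defined (zero) */
    t[1] = -1;
    t[n - 1] = n;           /* last valid element instead of t[n] */
    return t;               /* no conditional free inside the routine */
}
```

With this version, main would call free once for each array that GetArray returns, which removes both the leak of the 20-element array and the double free of the 4-element array.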
7.2.3.2 Memory Leaks
The following excerpt shows the report generated when leak detection on program exit, the default, is selected. The report shows a list of memory leaks sorted by importance and by call stack.
------------------------------------------------------------------------
------------------------------------------------------------------------
New blocks in heap after program exit

Leaks - blocks not yet deallocated but apparently not in use:
 * A leak is not referenced by static memory, active stack frames,
   or unleaked blocks, though it may be referenced by other leaks.
 * A leak "not referenced by other leaks" may be the root of a leaked tree.
 * A block referenced only by registers, unseen thread stacks, mapped memory,
   or uninstrumented library data is falsely reported as a leak. Instrumenting
   shared libraries, if any, may reduce the number of such cases.
 * Any new leak lost its last reference since the previous heap report, if any.

A total of 160 bytes in 1 leak were found:

160 bytes in 1 leak (including 1 not referenced by other leaks) created at:
    malloc                        ex
    GetArray                      ex, ex.c, line 10
    main                          ex, ex.c, line 20
    __start                       ex

Objects - blocks not yet deallocated and apparently still in use:
 * An object is referenced by static memory, active stack, or other objects.
 * A leaked block may be falsely reported as an object if a pointer to it
   remains when a new stack frame or heap block reuses the pointer's memory.
   Using the option to report uninitialized stack and heap may avoid such cases.
 * Any new object was allocated since the previous heap report, if any.

A total of 0 bytes in 0 objects were found:
Upon examining the source, it is clear that the first call to
GetArray
did not free the memory block, nor was it freed anywhere
else in the program.
Moreover, no pointer to this block exists in any other
heap block, so it qualifies as "not referenced by other leaks
".
The distinction is often useful to find the real culprit
for large memory leaks.
Consider a large tree structure and assume that the pointer to the root has been erased. Every block in the structure is a leak, but losing the pointer to the root is the real cause of the leak. Because all blocks but the root still have pointers to them, albeit only from other leaks, only the root will be specially qualified, and therefore the likely cause of the memory loss.
See
Section 7.4
for more information.
7.2.3.3 Heap History
When heap history is enabled, Third Degree collects information about dynamically allocated memory. It collects this information for every block that is freed by the application and for every block that still exists (including memory leaks) at the end of the program's execution. The following excerpt shows a heap allocation history report:
----------------------------------------------------------------
----------------------------------------------------------------
Heap Allocation History for parent process

Legend for object contents:
    There is one character for each 32-bit word of contents.
    There are 64 characters, representing 256 bytes of memory per line.
    '.' : word never written in any object.
    'z' : zero in every object.
    'i' : a non-zero non-pointer value in at least one object.
    'pp': a valid pointer or zero in every object.
    'ss': a valid pointer or zero in some but not all objects.

192 bytes in 2 objects were allocated during program execution:

------------------------------------------------------------------
160 bytes allocated (8% written) in 1 objects created at:
    malloc                        ex
    GetArray                      ex, ex.c, line 10
    main                          ex, ex.c, line 20
    __start                       ex
Contents:
    0: i.ii....................................
------------------------------------------------------------------
32 bytes allocated (38% written) in 1 objects created at:
    malloc                        ex
    GetArray                      ex, ex.c, line 10
    main                          ex, ex.c, line 21
    __start                       ex
Contents:
    0: i.ii....
The sample program allocated two objects for a total of 192 bytes (8*(20+4)). Because each object was allocated from a different call stack, there are two entries in the history. Only the first few bytes of each array were set to a valid value, resulting in the written ratios shown.
If the sample program was a real application, the fact that so little of the dynamic memory was ever initialized is a warning that it was probably using memory ineffectively.
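The written ratios in the report can be reproduced from the legend. The sketch below assumes that the percentage is the number of written 32-bit words divided by the total number of words, rounded to the nearest integer; the rounding rule and the helper names are assumptions, not documented behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Count the legend characters other than '.' (word never written). */
static size_t written_words(const char *contents)
{
    assert(contents != NULL);
    size_t written = 0;
    for (const char *p = contents; *p != '\0'; p++)
        if (*p != '.')
            written++;
    return written;
}

/* Percentage of words written, rounded to the nearest integer (assumed). */
static unsigned written_percent(size_t written, size_t total_words)
{
    assert(total_words > 0);
    return (unsigned)((written * 100 + total_words / 2) / total_words);
}
```

For the 160-byte block, 3 of 40 words were written, giving 8%; for the 32-byte block, 3 of 8 words were written, giving 38%. Both figures match the report.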
See
Section 7.4.4
for more information.
7.2.3.4 Memory Layout
The memory layout section of the report summarizes the memory used by the program by size and address range. The following excerpt shows a memory layout section:
-----------------------------------------------------------------
-----------------------------------------------------------------
memory layout at program exit
    heap        40960 bytes [0x14000c000-0x140016000]
    stack        2720 bytes [0x11ffff560-0x120000000]
    ex data     48528 bytes [0x140000000-0x14000bd90]
    ex text   1179648 bytes [0x120000000-0x120110000]
The heap size and address range indicated reflect the value returned
by
sbrk(0)
, (the heap break) at program exit.
Therefore,
the size is the total amount of heap space that has been allotted to the
process.
Third Degree does not support the use of the
malloc
variables that would alter this interpretation of
sbrk(0)
.
The stack size and address range reflect the lowest address reached
by the main thread's stack pointer during execution of the program.
That is,
Third Degree keeps track of it through each instrumented procedure call.
For
this value to reflect the maximum stack size, all shared libraries need to
have been instrumented (for example, using the
third
(1)
command's
-all
option for a nonthreaded program and
-incobj
options for libraries loaded with
dlopen(3)
).
The stacks
of threads (created using
pthread_create
) are not included.
The data and text sizes and address ranges show where the static portions
of the executable and each shared library were loaded.
7.3 Interpreting Third Degree Error Messages
Third Degree reports both fatal errors and memory access errors. Fatal errors include the following:
Bad parameter
For example, malloc(-10).
Failed allocator
For example, malloc returned a zero, indicating that no memory is available.
Call to the brk function with a nonzero argument
Third Degree does not allow you to call brk with a nonzero argument.
Memory allocation not permitted in signal handler.
A fatal error causes the instrumented application to crash after flushing
the log file.
If the application crashes, first check the log file and then
rerun it under a debugger, having specified
-g
on the
third
(1)
command line.
Memory errors include the following (as represented by a three-letter abbreviation):
Name | Error |
ror | Reading out of range: not heap, stack, or static (for example, NULL) |
ris | Reading invalid data in stack: probably an array bound error |
rus | Reading an uninitialized (but valid) location in stack |
rih | Reading invalid data in heap: probably an array bound error |
ruh | Reading an uninitialized (but valid) location in heap |
wor | Writing out of range: neither in heap, stack, or static area |
wis | Writing invalid data in stack: probably an array bound error |
wih | Writing invalid data in heap: probably an array bound error |
for | Freeing out of range: neither in heap or stack |
fis | Freeing an address in the stack |
fih | Freeing an invalid address in the heap: no valid object there |
fof | Freeing an already freed object |
fon | Freeing a null pointer (really just a warning) |
mrn | malloc returned null |
You can suppress the reporting of specific memory errors by specifying
one or more
-ignore
options.
This is often useful when
the errors occur within library functions for which you do not have the source.
Third Degree allows you to suppress specific memory errors in individual
procedures and files, and at particular line numbers.
See
third
(1)
for more details.
Alternatively, do not select the library for checking, by specifying
-excobj
or omitting the
-all
or
-incobj
option.
7.3.1 Fixing Errors and Retrying an Application
If Third Degree reports many write errors from your instrumented program,
fix the first few errors and then reinstrument the program.
Not only can
write errors compound, but they can also corrupt Third Degree's internal data
structures.
7.3.2 Detecting Uninitialized Values
Third Degree's technique for detecting the use of uninitialized values
can cause programs that have worked to fail when instrumented.
For example,
if a program depends on the fact that the first call to the
malloc
function returns a block initialized to zero, the instrumented
version of the program will fail because Third Degree poisons all blocks with
a nonzero value (0xfff8a5a5, by default).
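The failure mode can be reproduced without Third Degree by simulating the poisoning. In the sketch below, poisoned_malloc is a hypothetical stand-in for an instrumented malloc, and struct counter and both constructor functions are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct counter {
    long hits;
};

/* Hypothetical stand-in for an instrumented malloc: returns a block
   prefilled with the bytes of the 0xfff8a5a5 pattern instead of zeros. */
static void *poisoned_malloc(size_t size)
{
    static const uint32_t pattern = 0xfff8a5a5u;
    assert(size > 0);
    unsigned char *block = malloc(size);
    if (block == NULL)
        return NULL;
    for (size_t i = 0; i < size; i++)
        block[i] = ((const unsigned char *)&pattern)[i % sizeof pattern];
    return block;
}

/* Buggy: silently relies on the allocator handing back zeroed memory. */
static struct counter *counter_new_buggy(void)
{
    return poisoned_malloc(sizeof(struct counter));
}

/* Fixed: initializes every field before the object is used. */
static struct counter *counter_new(void)
{
    struct counter *c = poisoned_malloc(sizeof *c);
    if (c != NULL)
        c->hits = 0;
    return c;
}
```

The buggy constructor returns an object whose hits field holds the poison bytes rather than zero, so any code that increments or tests it misbehaves; the fixed constructor works the same whether or not the block was poisoned.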
When it detects a signal, perhaps caused by dereferencing or otherwise using this uninitialized value, Third Degree displays a message of the following form:
*** Fatal signal SIGSEGV detected.
*** This can be caused by the use of uninitialized data.
*** Please check all errors reported in app.3log.
Using uninitialized data is the most likely reason for an instrumented program to crash. To determine the cause of the problem, first examine the log file for reading-uninitialized-stack and reading-uninitialized-heap errors. Very often, one of the last errors in the log file reports the cause of the problem.
If you have trouble pinpointing the source of the error, you can confirm
that it is indeed due to reading uninitialized data by removing one of the
heap
and
stack
options from the
-uninit
option (or the whole option).
Removing
stack
disables the poisoning of newly allocated stack memory that Third Degree normally
performs on each procedure entry.
Similarly, removing
heap
disables the poisoning of heap memory performed on each dynamic memory allocation.
By removing one or both options, you can alter the behavior of the instrumented
program and will likely get it to complete successfully.
This will help you
determine which type of error is causing the instrumented program to crash
and, as a result, help you focus on specific messages in the log file.
Alternatively, run the instrumented program in a debugger (using the
-g
option of the
third
(1)
command) and remove the cause of the
failure.
You need not use the
-uninit
option if you just
want to check for memory leaks; however, using the
-uninit
option can make the leak reports more accurate.
If your program establishes signal handlers, there is a small chance
that Third Degree's changing of the default signal handler may interfere with
it.
Third Degree defines signal handlers only for those signals that normally
cause program crashes (including
SIGILL
,
SIGTRAP
,
SIGABRT
,
SIGEMT
,
SIGFPE
,
SIGBUS
,
SIGSEGV
,
SIGSYS
,
SIGXCPU
, and
SIGXFSZ
).
You can disable Third Degree's signal handling by specifying the
-signals
option.
7.3.3 Locating Source Files
Third Degree prefixes each error message with a file and line number in the style used by compilers. For example:
----------------------------------------------------- fof -- 3 --
ex.c: 21: freeing already freed heap at byte 0 of 32-byte block
    free                          malloc.c
    main                          ex.c, line 21
    __start                       crt0.s
Third Degree tries to point as closely as possible to the source of the error, and it usually gives the file and line number of a procedure near the top of the call stack when the error occurred, as in this example. However, Third Degree may not be able to find this source file, either because it is in a library or because it is not in the current directory. In this case, Third Degree moves down the call stack until it finds a source file to which it can point. Usually, this is the point of call of the library routine.
To tag these error messages, Third Degree must determine the location
of the program's source files.
If you are running Third Degree in the directory
containing the source files, Third Degree will locate the source files there.
If not, to add directories to Third Degree's search path, specify one or more
-use
options.
This allows Third Degree to find the source files
contained in other directories.
The location of each source file is the first
directory on the search path in which it is found.
7.4 Examining an Application's Heap Usage
In addition to run-time checks that ensure that only properly allocated memory is accessed and freed, Third Degree provides two ways to understand an application's heap usage:
It can find and report memory leaks.
It can list the contents of the heap.
By default, Third Degree checks for leaks when the program exits.
This section discusses how to use the information provided by Third
Degree to analyze an application's heap usage.
7.4.1 Detecting Memory Leaks
A memory leak is an object in the heap to which no in-use pointer exists. The object can no longer be accessed and can no longer be used or freed. It is useless, will never go away, and wastes memory.
Third Degree finds memory leaks by using a simple trace-and-sweep algorithm. Starting from a set of roots (the currently active stack and the static areas), Third Degree finds pointers to objects in the heap and marks these objects as visited. It then recursively finds all potential pointers inside these objects and, finally, sweeps the heap and reports all unmarked objects. These unmarked objects are leaks.
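The trace-and-sweep idea can be sketched on a toy heap in C. This is an illustration only, not Third Degree's code: the toy_heap layout, the fixed block size, and all function names are hypothetical, and the alignment test that a real conservative scan would apply is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_WORDS 4
#define MAX_BLOCKS  8

struct block {
    uintptr_t words[BLOCK_WORDS];   /* contents, scanned conservatively */
    int marked;                     /* set when reached from a root */
};

struct toy_heap {
    struct block blocks[MAX_BLOCKS];
    size_t count;                   /* number of live blocks */
};

/* Index of the block whose contents include addr, or -1.
   Pointers to the interior of a block count, as in the text. */
static int find_block(const struct toy_heap *h, uintptr_t addr)
{
    for (size_t i = 0; i < h->count; i++) {
        uintptr_t start = (uintptr_t)h->blocks[i].words;
        uintptr_t end = start + sizeof h->blocks[i].words;
        if (addr >= start && addr < end)
            return (int)i;
    }
    return -1;
}

/* Trace: mark the block a candidate value points into, then recurse
   through every word stored in that block. */
static void mark_from(struct toy_heap *h, uintptr_t candidate)
{
    int i = find_block(h, candidate);
    if (i < 0 || h->blocks[i].marked)
        return;
    h->blocks[i].marked = 1;
    for (size_t w = 0; w < BLOCK_WORDS; w++)
        mark_from(h, h->blocks[i].words[w]);
}

/* Sweep: every block left unmarked after tracing the roots is a leak. */
static size_t count_leaks(struct toy_heap *h,
                          const uintptr_t *roots, size_t nroots)
{
    assert(h->count <= MAX_BLOCKS);
    for (size_t i = 0; i < h->count; i++)
        h->blocks[i].marked = 0;
    for (size_t r = 0; r < nroots; r++)
        mark_from(h, roots[r]);

    size_t leaks = 0;
    for (size_t i = 0; i < h->count; i++)
        if (!h->blocks[i].marked)
            leaks++;
    return leaks;
}
```

Because reachability is decided only from the roots, two blocks that point at each other but are not reachable from any root are both counted as leaks, which is how circular structures are caught.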
The trace-and-sweep algorithm finds all leaks, including circular structures. This algorithm is conservative: in the absence of type information, any 64-bit pattern that is properly aligned and pointing inside a valid object in the heap is treated as a pointer. This assumption can infrequently lead to the following problems:
Third Degree considers pointers either to the beginning or interior of an object as true pointers. Only objects with no pointers to any address they contain are considered leaks.
If an instrumented application hides true pointers by storing
them in the address space of some other process or by encoding them, Third
Degree will report spurious leaks.
When instrumenting such an application
with Third Degree, specify the
-mask
option.
This option
lets you specify a mask that is applied as an AND operator against every potential
pointer.
For example, if you use the top three bits of pointers as flags,
specify a mask of 0x1fffffffffffffff.
See
third
(1).
Third Degree can confuse any bit pattern (such as string, integer, floating-point number, and packed struct) that looks like a heap pointer with a true pointer, thereby missing a true leak.
Third Degree does not notice pointers that optimized code stores only in registers, not in memory. As a result, it may produce false leak reports.
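The effect of the mask described for the -mask option can be shown with a small C sketch. The tagging scheme (flags kept in the top three bits of a 64-bit pointer) follows the example in the text; the function names are hypothetical, and the sketch models only the AND operation, not how Third Degree applies it internally.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_SHIFT 61
/* All ones shifted right by three: equals 0x1fffffffffffffff. */
#define POINTER_MASK (~UINT64_C(0) >> 3)

/* Store up to three flag bits in the top bits of an address. */
static uint64_t tag_pointer(uint64_t address, unsigned flags)
{
    assert(flags < 8);
    assert((address & ~POINTER_MASK) == 0);   /* top bits must be free */
    return address | ((uint64_t)flags << FLAG_SHIFT);
}

/* AND the mask against a candidate value to recover the address. */
static uint64_t apply_mask(uint64_t candidate)
{
    return candidate & POINTER_MASK;
}
```

A tagged value no longer equals the heap address it encodes, so a scan that compared it directly against heap ranges would miss the reference; after masking, the original address reappears and untagged pointers pass through unchanged.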
To maximize the accuracy of the leak reports, use the
-uninit
h+s
and
-all
options.
However, the
-uninit
option can cause the program to fail, and the
-all
option increases the instrumentation and run time.
So, just
check both the Leaks and Objects listings, and evaluate for possible program
errors.
7.4.2 Reading Heap and Leak Reports
You can supply command options that tell Third Degree to generate heap
and leak reports incrementally, listing only new heap objects
or leaks since the last report or listing all heap objects or leaks.
You can
request these reports when the program terminates, or before or after every
nth call to a user-specified function.
See
third
(1)
for details of
the
-blocks
,
-every
,
-before
, and
-after
options.
The
-blocks
option (the default) reports both the leaks and the objects in the heap, so
you will never miss one in the event that it is classified as the wrong type.
The
.3log
file describes the situations where incorrect
classification can occur, along with ways to improve its accuracy.
You should pay closest attention to the leaks report, because Third Degree has found evidence suggesting that the reported blocks really are leaked, whereas the evidence suggests that the blocks reported as objects were not. However, if your debugging and examination of the program suggests otherwise, you can reasonably deduce that the evidence was misleading to the tool.
Third Degree lists memory objects and leaks in the report by decreasing importance, based on the number of bytes involved. It groups together objects allocated with identical call stacks. For example, if the same call sequence allocates a million one-byte objects, Third Degree reports them as a 1-MB group containing a million allocations.
To tell Third Degree when objects or leaks are the same and should be
grouped in the report (or when objects or leaks are different and should not
be thus grouped), specify the
-depth
option.
It sets the
depth of the call stack that Third Degree uses to differentiate leaks or objects.
For example, if you specify a depth of 1 for objects, Third Degree groups
valid objects in the heap by the function and line number that allocated them,
no matter what function was the caller.
Conversely, if you specify a very
large depth for leaks, Third Degree groups only leaks allocated at points
with identical call stacks from
main
upwards.
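The grouping rule can be pictured with a small C model. It is a sketch under stated assumptions, not the tool's code: the call_stack structure and the same_group function are hypothetical, and each frame is reduced to a string that stands for a function and line number, with the allocation site first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a call stack: allocation site first, main last. */
struct call_stack {
    const char *frames[8];
    size_t count;
};

/* Return 1 if both stacks agree on their first `depth` frames. */
static int same_group(const struct call_stack *a,
                      const struct call_stack *b, size_t depth)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < depth; i++) {
        int a_done = i >= a->count;
        int b_done = i >= b->count;
        if (a_done || b_done)
            return a_done && b_done;   /* equal only if both ended together */
        if (strcmp(a->frames[i], b->frames[i]) != 0)
            return 0;
    }
    return 1;
}
```

With a depth of 1, two allocations made by the same function and line fall into one group even if their callers differ; a larger depth separates them.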
In most heap reports, the first few entries account for most of the
storage, but there is a very long list of small entries.
To limit the length
of the report, you can use the
-min
option.
It defines
a percentage of the total memory leaked or in use by an object as a threshold.
When all smaller remaining leaks or objects amount to less than this threshold,
Third Degree groups them together under a single final entry.
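One plausible reading of the threshold rule is sketched below in C. The function name first_folded_entry is hypothetical, and the exact comparison Third Degree performs is an assumption; the sketch only shows how a tail of small entries can be folded into a single final entry once their sum stays under a percentage of the total.

```c
#include <assert.h>
#include <stddef.h>

/*
 * sizes[] must be sorted in decreasing order. Returns the index of the
 * first entry that is folded into the final combined entry, or n if no
 * entry is folded. min_percent plays the role of the -min value.
 */
static size_t first_folded_entry(const unsigned long *sizes, size_t n,
                                 double min_percent)
{
    unsigned long total = 0, tail = 0;
    size_t cut = n;

    assert(min_percent >= 0.0);
    for (size_t i = 0; i < n; i++)
        total += sizes[i];
    double threshold = total * (min_percent / 100.0);

    /* Grow the tail from the smallest entry while it stays under the threshold. */
    for (size_t i = n; i > 0; i--) {
        if (tail + sizes[i - 1] >= threshold)
            break;
        tail += sizes[i - 1];
        cut = i - 1;
    }
    return cut;
}
```

For entries of 700, 200, 60, 25, 10, and 5 bytes and a threshold of 5 percent, the last three entries (40 bytes in total) are folded, and the first three are listed individually.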
Notes
Because the
realloc
function always allocates a new object (by involving calls to
malloc
, copy, and
free
), its use can make interpretation of a Third Degree report counterintuitive. An object can be listed twice under different identities.
Leaks and objects are mutually exclusive: an object must be reachable from the roots.
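The first note can be made concrete with a model of a realloc that always moves the object. This is a sketch of the allocate, copy, and free sequence the note describes, not the instrumented library routine; the name moving_realloc and its explicit old_size parameter are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Model of a realloc that always returns a new object (hypothetical helper). */
static void *moving_realloc(void *old, size_t old_size, size_t new_size)
{
    assert(new_size > 0);
    void *fresh = malloc(new_size);        /* new identity for the object */
    if (fresh == NULL)
        return NULL;                       /* like realloc, keep the old block */
    if (old != NULL) {
        memcpy(fresh, old, old_size < new_size ? old_size : new_size);
        free(old);                         /* old identity ends here */
    }
    return fresh;
}
```

Because the old and new blocks are distinct allocations, a report can show the same logical data under two allocation points: the original malloc call and the realloc call.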
It may not always be obvious when to search for memory leaks. By default, Third Degree checks for leaks after program exit, but this may not always be what you want.
Leak detection is best done as near as possible to the end of the program
while all used data structures are still in scope.
Remember, though, that
the roots for leak detection are the contents of the stack and static areas.
If your program terminates by returning from
main
and the
only pointer to one of its data structures was kept on the stack, this pointer
will not be seen as a root during the leak search, leading to false reporting
of leaked memory.
For example:
    #include <stdlib.h>

    int main (int argc, char* argv[]) {
        char* bytes = (char*) malloc(100);
        exit(0);
    }
When you instrument this program, specifying
-blocks all -before
exit
causes Third Degree to find no leaks.
When the program
calls the
exit
function, all of
main
's
variables are still in scope.
However, consider the following example:
    #include <stdlib.h>

    int main (int argc, char* argv[]) {
        char* bytes = (char*) malloc(100);
    }
When you instrument this program, providing the same (or no) options,
Third Degree's leak check may report a storage leak because
main
has returned by the time the check happens.
Either of these two
behaviors may be correct, depending on whether
bytes
was
a true leak or simply a data structure still in use when
main
returned.
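If the block is meant to stay in use until the program ends, one way to keep it reachable after main returns is to hold the pointer in the static area, which the text names as a root for the leak search. The following C fragment is a sketch of that idea only; the names bytes_root and allocate_with_static_root are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* A file-scope pointer lives in the static area, not in main's stack frame. */
static char *bytes_root;

/* Allocate the block and record it where a leak search can still find it. */
static char *allocate_with_static_root(size_t size)
{
    assert(size > 0);
    bytes_root = (char *) malloc(size);
    return bytes_root;
}
```

With the pointer stored this way, the block should be listed as an object rather than as a leak, even when the check runs after main has returned.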
Rather than reading the program carefully to understand when leak detection
should be performed, you can check for new leaks after a specified number
of calls to the specified procedure.
Use the following options to disable
the default leak checking and to request a leak check before every 10,000th call
to the procedure
proc_name
:
-blocks cancel -blocks new -every 10000 -before proc_name
7.4.4 Interpreting the Heap History
When you instrument a program using the
-history
option, Third Degree generates a heap history for the program.
A heap history
allows you to see how the program used dynamic memory during its execution.
For example, you can use this feature to eliminate unused fields in data
structures or to pack active fields to use memory more efficiently.
The heap
history also shows memory blocks that are allocated but never used by the
application.
When heap history is enabled, Third Degree collects information about each dynamically allocated object at the time it is freed by the application. When program execution completes, Third Degree assembles this information for every object that is still alive (including memory leaks). For each object, Third Degree looks at the contents of the object and categorizes each word as never written by the application, zero, a valid pointer, or some other value.
Third Degree next merges the information for each object with what it has gathered for all other objects allocated at the same call stack in the program. The result provides you with a cumulative picture of the use of all objects of a given type.
Third Degree provides a summary of all objects allocated during the
life of the program and the purposes for which their contents were used.
The
report shows one entry per allocation point (for example, a call stack where
an allocator function such as
malloc
or
new
was called).
Entries are sorted by decreasing volume of allocation.
Each entry provides the following:
Information about all objects that have been allocated
Total number of bytes allocated
Total number of objects that have been allocated
Percentage of bytes of the allocated objects that have been written
The call stack and a cumulative map of the contents of all objects allocated by that call stack
The contents part of each entry describes how the objects were used.
If the allocated objects are not all the same size, Third Degree considers only
the minimum size common to all objects.
For very large allocations, it summarizes
the contents of only the beginning of the objects (by default, the first kilobyte).
You can adjust the maximum size value by specifying the
-size
option.
In the contents portion of an entry, Third Degree uses one of the following characters to represent each 32-bit longword that it examines:
| Character | Description |
|---|---|
| Dot (.) | Indicates a longword that was never written in any of the objects, a definite sign of wasted memory. Further analysis is generally required to see if it is simply a deficiency of a test that never used this field, if it is a padding problem solved by swapping fields or choosing better types, or if this field is obsolete. |
| z | Indicates a field whose value was always 0 (zero) in every object. |
| pp | Indicates a pointer: that is, a 64-bit quantity (two longwords) that was a valid pointer into the stack, the static data area, or the heap (or was zero). |
| ss | Indicates a sometime pointer. This longword looked like a pointer in at least one of the objects, but not in all objects. It could be a pointer that is not initialized in some instances, or a union. However, it could also be the sign of a serious programming error. |
| i | Indicates a longword that was written with some nonzero value in at least one object and that never contained a pointer value in any object. |
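The way per-object observations combine into one map character can be sketched in C. This is a simplified model under stated assumptions: the word_state enumeration and summarize_longword function are hypothetical, the tie-breaking rules for mixed cases are a reading of the table rather than documented behavior, and the sketch returns one character per longword, whereas the report prints pp or ss because a pointer spans two longwords.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state of one longword as seen in one object. */
enum word_state { NEVER_WRITTEN, ZERO, POINTER, OTHER };

/* Merge the states seen across n objects into one map character. */
static char summarize_longword(const enum word_state *states, size_t n)
{
    size_t unwritten = 0, pointers = 0, others = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        if (states[i] == NEVER_WRITTEN)
            unwritten++;
        else if (states[i] == POINTER)
            pointers++;
        else if (states[i] == OTHER)
            others++;
    }
    if (unwritten == n)
        return '.';                      /* never written in any object */
    if (pointers > 0)                    /* pointer or zero everywhere: 'p' */
        return (unwritten == 0 && others == 0) ? 'p' : 's';
    return others > 0 ? 'i' : 'z';       /* nonzero data, or always zero */
}
```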
Even if an entry is listed as allocating 100 MB, it does not mean that at any point in time 100 MB of heap storage were used by the allocated objects. It is a cumulative figure; it indicates that this point has allocated 100 MB over the lifetime of the program. This 100 MB may have been freed, may have leaked, or may still be in the heap. The figure simply indicates that this allocator has been quite active.
Ideally, the fraction of the bytes actually written should always be close to 100 percent. If it is much lower, some of what is allocated is never used. The common reasons why a low percentage is given include the following:
A large buffer was allocated but only a small fraction was ever used.
Parts of every object of a given type are never used. They may be forgotten fields or padding between real fields resulting from alignment rules in C structures.
Some objects have been allocated but never used at all. Sometimes leak detection will find these objects if their pointers are discarded. However, if they are kept on a free list, they will be found only in the heap history.
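The padding case in the list above can be illustrated with two C structures that hold the same fields in a different order. The structure names are hypothetical, and the exact sizes depend on the ABI; on a typical LP64 system the first layout occupies 24 bytes and the second 16.

```c
#include <assert.h>
#include <stddef.h>

/* Alignment rules place padding after each char so that `value` is aligned. */
struct padded {
    char tag;
    long value;
    char flag;
};

/* Same fields, largest first: the two chars share one padded slot. */
struct reordered {
    long value;
    char tag;
    char flag;
};

/* Bytes recovered per object by swapping the fields. */
static size_t padding_saved(void)
{
    assert(sizeof(struct reordered) <= sizeof(struct padded));
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

In a heap history, the padding in the first layout would show up as longwords that are never written in any object.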
7.5 Using Third Degree on Programs with Insufficient Symbolic Information
If the executable you instrumented contains too little symbolic information for Third Degree to pinpoint some program locations, Third Degree prints messages in which procedure names or file names or line numbers are unknown. For example:
------------------------------------------------------ rus -- 0 --
reading uninitialized stack at byte 40 of 176 in frame of main
    proc_at_0x1200286f0           libc.so
    pc = 0x12004a268              libc.so
    main                          app, app.c, line 16
    __start                       app
Third Degree tries to print the procedure name in the stack trace, but if the procedure name is missing (because this is a static procedure), Third Degree prints the program counter in the instrumented program. This information enables you to find the location with a debugger. If the program counter is unavailable, Third Degree prints the number of the unknown procedure.
Most frequently, the file name or line number is unavailable because
the file was compiled with the default
-g0
option.
In this
case, Third Degree prints the name of the object in which the procedure was
found.
This object may be either the main application or a shared library.
By default, error reports are printed only if a stack frame with a source
file name and a line number appears within two frames of the top of the stack.
This hides spurious reports that can be caused by the highly optimized and
assembly language code in the system libraries.
Use the
-hide
option to hide fewer (or more) reports involving nondebuggable procedures.
If the lack of symbolic information is hampering your debugging, consider
recompiling the program with more symbolic information.
Recompile with the
-g
or
-g1
option and link without the
-x
option.
Using
-g
will make variable names
appear in reports instead of the byte offset shown in the previous example.
7.6 Validating Third Degree Error Reports
The following spurious errors can occur:
Modifications to variables, array elements, or structure members
that are less than 32 bits in size (such as
short
,
char
, bit field), as in this example:
    void Packed() {
        char c[4];
        struct { int a:6; int b:9; int c:4; } x;
        c[0] = c[1] = 1;    /* rus errors here ... */
        x.a = x.c = 3;      /* ... maybe here */
    }
Ignore any implausible error messages, such as those reported for
strcpy
,
memcpy
,
printf
, and
so on.
Third Degree poisons newly allocated memory with a special value to detect references to uninitialized variables (see Section 7.3.2). Programs that explicitly store this special value into memory and subsequently read it may cause spurious "reading uninitialized memory" errors.
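Why a program that stores the special value can trigger a spurious report is easy to see in a toy model. The sketch below is not Third Degree's mechanism: the constant HYPOTHETICAL_POISON is an invented value (the real pattern is not given here), and poison_fill and looks_uninitialized are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented stand-in for the tool's special value. */
#define HYPOTHETICAL_POISON UINT32_C(0xA5A5A5A5)

/* Model of poisoning a newly allocated block. */
static void poison_fill(uint32_t *words, size_t n)
{
    assert(words != NULL);
    for (size_t i = 0; i < n; i++)
        words[i] = HYPOTHETICAL_POISON;
}

/* A value-based check cannot tell who wrote the pattern. */
static int looks_uninitialized(uint32_t word)
{
    return word == HYPOTHETICAL_POISON;
}
```

A word that the program deliberately sets to the same pattern is indistinguishable from one that was never written, so reading it back looks like a read of uninitialized memory.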
If you think that you have found a false positive, you can verify it
by using a debugger on the procedure in which the error was reported.
All
errors reported by Third Degree are detected at loads and stores in the application,
and the line numbers shown in the error report match those shown in the disassembly
output.
Compile and instrument the program with the
-g
option before debugging.
7.7 Undetected Errors
Third Degree can fail to detect real errors, such as the following:
Errors in operations on quantities smaller than 32 bits can
go undetected (for example,
char
,
short
,
and bit-field).
The
-uninit repeat
option can expose such
errors by checking more load and store operations, which Third Degree usually
considers too low a risk to check.
Third Degree cannot detect a chance access of the wrong object
in the heap.
It can detect only accesses that fall outside every valid object.
For example,
Third Degree cannot determine that
a[last+100]
is the same
address as
b[0]
.
You can reduce the chances of this happening
by altering the amount of padding added to objects.
To do this, specify the
-pad
option.
Third Degree may not be able to detect if the application
walks past the end of an array unless it also walks past the end of the array's
stack frame or its heap object.
Because Third Degree brackets objects in
the heap by guard words, it will miss small array bounds errors.
(Guard words
are spare memory added at the ends of valid memory blocks to detect overshoots.)
In the stack, adjacent memory is likely to contain local variables, and Third
Degree may fail to detect larger bounds errors.
For example, issuing a
sprintf
operation to a local buffer that is much too small may be
detected, but if the array bounds are only exceeded by a few words and enough
local variables surround the array, the error can go undetected.
Use the
cc
command's
-check_bounds
option to detect array
bounds violations more precisely.
Hiding pointers by encoding them or by keeping pointers only to the inside of a heap object will degrade the effectiveness of Third Degree's leak detection.
Third Degree may detect more uninitialized variables if compiler optimization is disabled (that is, with the -O0 and -inline none options).
At times, some leaks may not be reported, because old pointers were found in memory. Selecting checks for uninitialized heap memory (-uninit heap) may reduce this problem.
Any degree of optimization will skew leak-reporting results, because instructions that the compiler considers nonessential may be optimized away.
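The wrong-object case described earlier in this list, where a[last+100] lands on b[0], and the effect of the -pad option can be pictured with a toy layout model in C. The layout, the lengths, and the function owner_of are all hypothetical; the sketch only shows why more padding between objects makes an overshoot less likely to land inside a neighbor.

```c
#include <assert.h>
#include <stddef.h>

enum { A_LEN = 4, B_LEN = 4 };   /* element counts of two toy heap arrays */

/*
 * Toy layout: [ a | padding | b ]. Given an index counted from a[0] and
 * the padding between the arrays, report which region the access hits:
 * 'a', 'b', or '-' for padding or anything past b.
 */
static char owner_of(size_t index_from_a, size_t pad)
{
    size_t b_start = A_LEN + pad;

    assert(A_LEN > 0 && B_LEN > 0);
    if (index_from_a < A_LEN)
        return 'a';
    if (index_from_a >= b_start && index_from_a < b_start + B_LEN)
        return 'b';                /* silently hits a valid object */
    return '-';                    /* lands outside every object */
}
```

With two elements of padding, an overshoot of two past the end of a lands in b and looks like a valid access; with eight elements of padding, the same overshoot lands in the padding, where it can be detected.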