hiprof(1)
NAME
hiprof - CPU-time and page-fault call-graph profiler for performance
analysis
SYNOPSIS
hiprof [-flat] [-pthread | -threads] [hiprof-option...] [gprof-option...]
program [argument...]
hiprof { -cycles | -faults } [hiprof-option...] [gprof-option...] program
[argument...]
See the start of the OPTIONS section for details of hiprof options that may
be essential for the correct execution of the program.
The atom -tool hiprof interface is still available, for compatibility with
earlier releases. However, it is now undocumented, and it will be retired
in a future release.
DESCRIPTION
See prof_intro(1) for an introduction to the application performance tuning
tools provided with Tru64 UNIX.
The hiprof command creates an instrumented version of a program
(program.hiprof) that produces call-graph and flat profiles of one of a
range of performance statistics:
· The CPU time spent in each procedure (or optionally, each source line
or instruction), measured by sampling the program counter about every
millisecond (the default)
· The CPU time spent in each procedure and procedure call, measured as
machine cycles, including the effects of any memory-access delays
(with the -cycles option)
· The number of page faults that occur during each procedure and
procedure call (with the -faults option)
See the limitations of each performance statistic in the RESTRICTIONS
section below.
If you specify program arguments (argument...) or -run, the instrumented
program is also executed.
If you specify -display or any of the gprof-options, the hiprof command
runs the instrumented program and then displays the profile by running the
gprof tool (with any specified gprof-options).
If you omit the program name, a usage message is printed.
The following example shows how to instrument, run, and display the profile
for a multithreaded program:
cc *.c -pthread -L. -g1 -O2 -o program -lapp1 -lapp2
hiprof -pthread -L. -all program data/*
The -all option requests that all shared libraries be profiled. However,
threads-related system libraries cannot be safely instrumented to count the
procedure calls that are needed to print a call graph, so by default these
libraries are only sampled, to provide flat CPU-time profiles.
The -cycles and -faults options cannot be used with threaded programs. With
those options, the displayed time or page-fault count for a procedure
includes the time or count for any procedures that it calls but that were
not selected for instrumentation (for example, any procedures in libraries
not selected by the -all or -incobj options). This means that time is not
lost from these profiles by excluding shared libraries.
OPERANDS
program
File name of a fully linked call-shared or nonshared executable to be
profiled. This program should be compiled with the -g or -gn option
(n>=1) to obtain more complete profiling information. If the default
symbol table level (-g0) is used, line number information, static
procedure names, and file names are unavailable. Inlined procedure
calls are also unavailable. Programs that are stripped or are optimized
by spike or cc -om are not supported.
argument
All arguments following the program name are considered to be arguments
needed by the instrumented program to execute the procedures, lines,
and instructions of interest. Multiple arguments can be specified. They
imply -run if any are specified, and they can be replaced by -run if
none are needed.
OPTIONS
Options can be abbreviated to three characters. The gprof-options, which
are provided as alternatives to the -display option, can be abbreviated to
one character.
For options that specify a procedure name (proc), C++ procedures can omit
the argument type list, though this will match all overloaded procedures
with that name. To select a specific procedure, specify the full symbol
name (as printed by the nm command). Symbol names containing spaces,
asterisks, and so on must be quoted.
Essential Options
Some or all of these options may be needed to prevent the instrumented
program from malfunctioning:
-pthread
Specify -pthread if the program or any of its libraries calls
pthread_create(3) (for example, if it was compiled with either the
-pthread option or the -threads compatibility option). This will make
the collection of profile data thread-safe.
-fork
The -fork option is maintained for compatibility with earlier releases.
By default, hiprof now profiles subprocesses that do not call exec(2),
and produces separate profiling data files for the forked subprocesses,
including the process ID in their file names as if -pids were specified.
-heapbase addr
By default, the hiprof code running in the program's process allocates
memory for its own use at address 38000000000. If the program needs to
use memory between 38000000000 and 3ff00000000, specify the address
that the hiprof code should use.
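The addresses above are hexadecimal. As a rough aid, the following shell
sketch reports whether a given address falls in the default range, which
suggests that -heapbase is needed. The function name
in_default_hiprof_range is hypothetical (it is not part of hiprof), and
treating both bounds as inclusive is an assumption.

```shell
# Hypothetical helper, not part of hiprof.
# Succeeds if the hexadecimal address $1 (written without a 0x prefix, as in
# the text above) lies between 38000000000 and 3ff00000000.
# Assumption: both bounds are treated as inclusive.
in_default_hiprof_range() {
    addr=$((0x$1))
    [ "$addr" -ge $((0x38000000000)) ] && [ "$addr" -le $((0x3ff00000000)) ]
}

# Example: check an address that the program is known to map.
if in_default_hiprof_range 3a000000000; then
    echo "conflict: choose another address with -heapbase"
fi
```

If the check succeeds for an address that the program maps, pass an unused
address to -heapbase when instrumenting.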
-sigdump signal
Specify -sigdump to force the instrumented program to write the current
profile data to its file(s) on receipt of the named signal. By
default, the program writes the profiling data file(s) only when the
process terminates, but some processes never terminate normally, so
this option lets you generate the file(s) on demand. After a file is
written, the instruction counts of the profile are all set to zero; so
by sending two signals, any interval of a test run can be profiled,
with the second signal's file(s) overwriting the first. For example, to
use the default kill pid command to signal the program, specify
-sigdump TERM. Choose a signal that the program does not use for
another purpose.
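The two-signal technique described above can be sketched as a small shell
function. The name profile_interval is hypothetical (it is not part of
hiprof), and the choice of USR1 assumes that the program does not otherwise
use that signal and that -sigdump accepts the name in the same form as TERM.

```shell
# Hypothetical helper, not part of hiprof.
# Assumes the program was instrumented with a matching -sigdump option and
# started in the background, for example:
#   hiprof -sigdump USR1 program
#   ./program.hiprof &
profile_interval() {
    pid=$1              # process ID of the running instrumented program
    seconds=$2          # length of the interval to profile
    signal=${3:-USR1}   # must match the signal named with -sigdump

    # First signal: the current profile is written and the counts are reset.
    kill -s "$signal" "$pid" || return 1
    sleep "$seconds"
    # Second signal: the file(s) written now cover only the interval and
    # overwrite the file(s) written by the first signal.
    kill -s "$signal" "$pid"
}
```

For example, profile_interval "$!" 60 would profile one minute of the
program most recently started in the background.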
Profiling Statistics Options
-flat
Generates a flat profile; that is, it avoids the intrusiveness of
collecting the default call-graph information. If the -display option
is specified, it defaults to gprof -procedures. Do not use the -flat
option with the -cycles or -faults options.
-cycles
Profiles CPU time by counting the machine cycles used in each procedure
call. Use this option only for nonthreaded programs.
-faults
Profiles page faults that occur during each procedure instead of the
default time spent in each procedure. Use this option only for
nonthreaded programs.
File Generating Options
-quiet
Does not print informational and progress messages on the standard
error stream.
-v
Prints the command lines used to instrument the program and to execute
the instrumented program. Prints the names of any procedures that were
not instrumented.
-output file
Names the instrumented program file instead of the default
program.hiprof.
-dirname path
Specifies the directory to which the instrumented program writes the
profiling data file(s) for each test run. The default is the current
directory.
-pids
Adds the process ID of the instrumented program's test run to the name
of the profiling data file produced (that is, program.pid.hiout). By
default, the file is named program.hiout.
-threads
When profiling a threaded program, specify -threads to produce a
separate profile for each pthread in the program. The files are named
program[.pid].sequence.hiout, where sequence is the thread sequence
number assigned by pthread_create(3). The -threads option implies the
-pthread option. If -sigdump is needed, -pthread is recommended instead
of -threads, to avoid possible synchronization problems.
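As an illustration of the naming scheme described for -pids and -threads,
the following shell sketch assembles the expected data file name. The
function hiout_name is hypothetical (it is not part of hiprof); it only
mirrors the pattern program[.pid].sequence.hiout given above.

```shell
# Hypothetical helper, not part of hiprof.
# Usage: hiout_name program [pid] [sequence]
# Pass an empty string for pid to omit it while still giving a sequence.
hiout_name() {
    name=$1
    if [ -n "${2:-}" ]; then
        name="$name.$2"     # process ID, present when -pids is in effect
    fi
    if [ -n "${3:-}" ]; then
        name="$name.$3"     # thread sequence number, present with -threads
    fi
    printf '%s.hiout\n' "$name"
}

hiout_name program 1234 2   # prints program.1234.2.hiout
```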
Shared-Library Profiling Options
-all
Profiles all of the shared libraries in addition to the program's
executable.
-excobj lib
If -all was specified, does not profile the shared library lib. Can be
repeated to exclude multiple libraries.
-incobj lib
Profiles the shared library lib. Can be repeated to include multiple
libraries.
-Lpath
Searches for shared libraries in the specified directory before
searching the default directories. Can be repeated to make a search
path. Use the same options that were used when linking the program with
ld.
-E proc
Does not instrument the procedure proc. This option can be used to
exclude procedures that are uninteresting or that interfere with the
instrumentation (such as nonstandard assembly code).
Execution Control Options
-version
Prints the tool's version number.
-run
Executes the instrumented program, even if no arguments are specified.
By default, the program is only instrumented (for later execution).
-display
Executes the instrumented program, and runs gprof with default options
on the resulting .hiout file(s).
gprof-option
Executes the instrumented program, and runs gprof on the resulting
.hiout file(s). The following gprof options are supported:
-asm
Profiles each instruction within selected procedures.
-bounded
Does not report on called procedures.
-e proc
Excludes procedure proc and its descendants from the profile, but
totals all procedures.
-f proc
Includes only procedure proc and its descendants in the profile,
but totals all procedures.
-graph
Profiles procedures as an indexed call graph (default).
-heavy
Profiles source lines, listing the most heavily used first.
-lines
Profiles source lines, in order within selected procedures.
-merge file
Merges all .hiout input files into file.
-numbers
Prints each procedure's starting line number.
-procedures
Profiles procedures, listing the most heavily used first (default).
-totals
Profiles the whole executable and any shared libraries.
-zero
Reports procedures that were never called.
NOTES
If hiprof finds any previously instrumented shared libraries in the working
directory, it will reuse them if they meet current requirements, to reduce
re-instrumentation costs.
Temporary instrumentation files are created in /tmp. Set the TMPDIR
environment variable to a different directory to create the files
elsewhere, for example, in a disk partition with more space.
RESTRICTIONS
The default sampled profile only estimates the CPU time spent in each
procedure call; profiles made with the -cycles and -faults options measure
their statistics directly.
When timing a program's procedures by measuring machine cycles (with the
-cycles option), the 32-bit cycle-counting hardware will wrap if no
procedure call or return is executed by the program every few seconds --
for example, because of a long-running loop. If the counter wraps, the
profile will be incorrect. Using the -all or -incobj options to profile all
nonsystem libraries and procedures can help avoid this restriction.
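To see why the limit is a few seconds, divide the range of a 32-bit counter
(2^32 cycles) by the clock rate. The following shell sketch does this
arithmetic; the function name wrap_seconds is hypothetical, and the 500-MHz
rate is used only as an illustrative value.

```shell
# Hypothetical helper: seconds until a 32-bit cycle counter wraps at a
# clock rate given in Hz (2^32 = 4294967296 cycles).
wrap_seconds() {
    LC_ALL=C awk -v hz="$1" 'BEGIN { printf "%.2f\n", 4294967296 / hz }'
}

wrap_seconds 500000000   # 500 MHz; prints 8.59
```

Faster processors wrap sooner, so a loop that runs for several seconds
without a call or return is more likely to corrupt the profile on them.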
The -cycles option generates an inaccurate profile if the instrumented
program is run on a system whose processors have different cycle speeds.
This inaccuracy can be avoided by using hiprof's default sampling profiler
or the cc -p/-pg profilers instead, or by running the application on a
subset of the processors:
· Select a single processor using the runon command.
· Check the processor speeds using the psrinfo -v command and run the
application in a processor set comprising only processors that run at
the same speed (see processor_sets(4)).
Approximate performance estimates are as follows but will vary according to
the application and the machine's CPU count, type, and clock rate. The
hiprof instrumentation takes ~2s per Mb of program file on a 500-MHz EV6
(21264) Alpha system, using ~10 Mb of memory plus another ~10 Mb per Mb of
the largest file. The instrumented files are ~20% larger than the
originals, plus ~1 Mb of hiprof code. They run ~4 times slower. By default,
each profile data file is at least the size of the instrumented code (and
uses this much memory), but these files are very small for the -cycles and
-faults options.
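As a worked example of the figures above, the following shell sketch
applies them to a program file of a given size. The function name
hiprof_estimate is hypothetical, and the sketch assumes that the program
file is the largest file instrumented and that the 500-MHz EV6 figures
apply; real costs will vary.

```shell
# Hypothetical helper: apply the approximate figures quoted above to a
# program file of $1 Mb. Assumption: the program is the largest file
# instrumented, and no shared libraries are counted.
hiprof_estimate() {
    LC_ALL=C awk -v mb="$1" 'BEGIN {
        printf "instrumentation time: ~%.0f s\n", 2 * mb
        printf "instrumentation memory: ~%.0f Mb\n", 10 + 10 * mb
        printf "instrumented size: ~%.1f Mb\n", 1.2 * mb + 1
    }'
}

hiprof_estimate 8
# prints:
#   instrumentation time: ~16 s
#   instrumentation memory: ~90 Mb
#   instrumented size: ~10.6 Mb
```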
If a procedure contains interprocedural branches or interprocedural jumps,
that procedure will not be instrumented with the -cycles or -faults option,
and no information will be reported about that procedure. Use the -v option
to see which procedures were not instrumented. Compilers can optimize
return statements or non-returning function calls to interprocedural
branches. To avoid this, recompile with the -O0 or -no_inline option.
FILES
program.hiprof
Instrumented version of program produced by hiprof
program[.pid][.sequence].hiout
Profile data file produced by program.hiprof
lib*.so.instrumented-executable
Instrumented shared libraries produced by hiprof
.hiprof.pid
Temporary file created and deleted in the current and -dirname path
directories.
SEE ALSO
Introduction: prof_intro(1)
atom(1), cc(1), dxprof(1), fork(2), gprof(1), kill(1), ld(1), pixie(1),
processor_sets(4), psrinfo(1), pthread(3), runon(1), uprofile(1). (dxprof
is available as an option.)
Programmer's Guide