
hiprof(1)

NAME

hiprof - CPU-time and page-fault call-graph profiler for performance analysis

SYNOPSIS

hiprof [-flat] [-pthread | -threads] [hiprof-option...] [gprof-option...] program [argument...]

hiprof { -cycles | -faults } [hiprof-option...] [gprof-option...] program [argument...]

See the start of the OPTIONS section for details of hiprof options that may be essential for the correct execution of the program.

The atom -tool hiprof interface is still available, for compatibility with earlier releases. However, it is now undocumented, and it will be retired in a future release.

DESCRIPTION

See prof_intro(1) for an introduction to the application performance tuning tools provided with Tru64 UNIX.

The hiprof command creates an instrumented version of a program (program.hiprof) that produces call-graph and flat profiles of one of a range of performance statistics:

· The CPU time spent in each procedure (or optionally, each source line or instruction), measured by sampling the program counter about every millisecond (the default)

· The CPU time spent in each procedure and procedure call, measured as machine cycles, including the effects of any memory-access delays (with the -cycles option)

· The number of page faults that occur during each procedure and procedure call (with the -faults option)

See the limitations of each performance statistic in the RESTRICTIONS section below.

If you specify program arguments (argument...) or -run, the instrumented program is also executed. If you specify -display or any of the gprof-options, the hiprof command runs the instrumented program and then displays the profile by running the gprof tool (with any specified gprof-options). If you omit the program name, a usage message is printed.

The following example shows how to instrument, run, and display the profile for a multithreaded program:

    cc *.c -pthread -L. -g1 -O2 -o program -lapp1 -lapp2
    hiprof -pthread -L. -all program data/*

The -all option requests that all shared libraries be profiled, but threads-related system libraries cannot be safely instrumented to count the procedure calls that are needed to print a call graph. By default, these libraries are still sampled to provide flat CPU-time profiles.

The -cycles and -faults options cannot be used with threaded programs, but the displayed time or page-fault count for a procedure includes the time or count for any procedures that it calls but that were not selected for instrumentation -- for example, any procedures in libraries not selected by the -all or -incobj options. This means that time is not lost from these profiles by excluding shared libraries.
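
The three levels of operation described above (instrument only; instrument and run; instrument, run, and display) map onto the options roughly as sketched below. This is an illustrative wrapper, not part of hiprof: the function name hiprof_stage, the HIPROF override variable, and the program name myprog are assumptions made for the example.

```shell
# hiprof_stage STAGE PROGRAM [ARGUMENT...]
# Hypothetical wrapper that selects the hiprof options for one stage.
# Set HIPROF to substitute another command (for example, a stub for testing).
hiprof_stage() {
    usage="usage: hiprof_stage {instrument|run|display} program [argument...]"
    if [ $# -lt 2 ]; then
        echo "$usage" >&2
        return 2
    fi
    stage=$1
    shift
    case $stage in
        instrument)
            # No arguments and no -run: only program.hiprof is created,
            # so any program arguments are deliberately dropped here.
            "${HIPROF:-hiprof}" "$1" ;;
        run)
            # -run executes the instrumented program even with no arguments.
            "${HIPROF:-hiprof}" -run "$@" ;;
        display)
            # -display runs the program, then gprof with default options.
            "${HIPROF:-hiprof}" -display "$@" ;;
        *)
            echo "$usage" >&2
            return 2 ;;
    esac
}
```

For example, hiprof_stage display myprog data/* would instrument myprog, run it on the data files, and display the default gprof report.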

OPERANDS

program
    File name of a fully linked call-shared or nonshared executable to be profiled. This program should be compiled with the -g or -gn option (n>=1) to obtain more complete profiling information. If the default symbol table level (-g0) is used, line number information, static procedure names, and file names are unavailable. Inlined procedure calls are also unavailable. Programs that are stripped or are optimized by spike or cc -om are not supported.

argument
    All arguments following the program name are considered to be arguments needed by the instrumented program to execute the procedures, lines, and instructions of interest. Multiple arguments can be specified. They imply -run if any are specified, and they can be replaced by -run if none are needed.

OPTIONS

Options can be abbreviated to three characters. The gprof-options, which are provided as alternatives to the -display option, can be abbreviated to one character.

For options that specify a procedure name (proc), C++ procedures can omit the argument type list, though this will match all overloaded procedures with that name. To select a specific procedure, specify the full symbol name (as printed by the nm command). Symbol names containing spaces, asterisks, and so on must be quoted.

Essential Options

Some or all of these options may be needed to prevent the instrumented program from malfunctioning:

-pthread
    Specify -pthread if the program or any of its libraries calls pthread_create(3) (for example, if it was compiled with either the -pthread option or the -threads compatibility option). This will make the collection of profile data thread-safe.

-fork
    The -fork option is maintained for compatibility with earlier releases. By default, hiprof now profiles subprocesses that do not call exec(2), and produces separate profiling data files for the forked subprocesses, including the process ID in their file names as if -pids was specified.

-heapbase addr
    By default, the hiprof code running in the program's process allocates memory for its own use at address 38000000000. If the program needs to use memory between 38000000000 and 3ff00000000, specify the address that the hiprof code should use.

-sigdump signal
    Specify -sigdump to force the instrumented program to write the current profile data to its file(s) on receipt of the named signal. By default, the program writes the profiling data file(s) only when the process terminates, but some processes never terminate normally, so this option lets you generate the file(s) on demand. After a file is written, the instruction counts of the profile are all set to zero; so by sending two signals, any interval of a test run can be profiled, with the second signal's file(s) overwriting the first.

    For example, to use the default kill pid command to signal the program, specify -sigdump TERM. Choose a signal that the program does not use for another purpose.

Profiling Statistics Options

-flat
    Generates a flat profile; that is, it avoids the intrusiveness of collecting the default call-graph information. If the -display option is specified, it defaults to gprof -procedures. Do not use the -flat option with the -cycles or -faults options.

-cycles
    Profiles CPU time by counting the machine cycles used in each procedure call. Use this option only for nonthreaded programs.

-faults
    Profiles page faults that occur during each procedure instead of the default time spent in each procedure. Use this option only for nonthreaded programs.

File Generating Options

-quiet
    Does not print informational and progress messages on the standard error stream.

-v
    Prints the command lines used to instrument the program and to execute the instrumented program. Prints the names of any procedures that were not instrumented.

-output file
    Names the instrumented program file instead of the default program.hiprof.

-dirname path
    Specifies the directory to which the instrumented program writes the profiling data file(s) for each test run. The default is the current directory.

-pids
    Adds the process ID of the instrumented program's test run to the name of the profiling data file produced (that is, program.pid.hiout). By default, the file is named program.hiout.

-threads
    When profiling a threaded program, specify -threads to produce a separate profile for each pthread in the program. The files are named program[.pid].sequence.hiout, where sequence is the thread sequence number assigned by pthread_create(3). The -threads option implies the -pthread option. If -sigdump is needed, -pthread is recommended instead of -threads, to avoid possible synchronization problems.

Shared-Library Profiling Options

-all
    Profiles all of the shared libraries in addition to the program's executable.

-excobj lib
    If -all was specified, does not profile the shared library lib. Can be repeated to exclude multiple libraries.

-incobj lib
    Profiles the shared library lib. Can be repeated to include multiple libraries.

-Lpath
    Searches for shared libraries in the specified directory before searching the default directories. Can be repeated to make a search path. Use the same options that were used when linking the program with ld.

-E proc
    Does not instrument the procedure proc. This option can be used to exclude procedures that are uninteresting or that interfere with the instrumentation (such as nonstandard assembly code).

Execution Control Options

-version
    Prints the tool's version number.

-run
    Executes the instrumented program, even if no arguments are specified. By default, the program is only instrumented (for later execution).

-display
    Executes the instrumented program, and runs gprof with default options on the resulting .hiout file(s).

gprof-option
    Executes the instrumented program, and runs gprof on the resulting .hiout file(s). The following gprof options are supported:

    -asm
        Profiles each instruction within selected procedures.
    -bounded
        Does not report on called procedures.
    -e proc
        Excludes procedure proc and its descendants from the profile, but totals all procedures.
    -f proc
        Includes only procedure proc and its descendants in the profile, but totals all procedures.
    -graph
        Profiles procedures as an indexed call graph (default).
    -heavy
        Profiles source lines, listing the most heavily used first.
    -lines
        Profiles source lines, in order within selected procedures.
    -merge file
        Merges all .hiout input files into file.
    -numbers
        Prints each procedure's starting line number.
    -procedures
        Profiles procedures, listing the most heavily used first (default).
    -totals
        Profiles the whole executable and any shared libraries.
    -zero
        Reports procedures that were never called.
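
The two-signal technique described for -sigdump can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name profile_interval and the HIPROF override variable are hypothetical, the program is assumed to be a plain file name in the current directory, and the program is assumed not to use SIGTERM for another purpose.

```shell
# profile_interval PROGRAM WARMUP_SECONDS INTERVAL_SECONDS
# Hypothetical helper: profiles only the interval between two signals.
# PROGRAM must be a plain file name in the current directory.
profile_interval() {
    prog=$1
    warmup=$2
    interval=$3

    # Instrument only (no -run, no arguments); creates ./PROGRAM.hiprof.
    # If instrumentation fails, return its exit status and do nothing else.
    "${HIPROF:-hiprof}" -sigdump TERM "$prog" || return

    "./$prog.hiprof" &      # start the instrumented program in the background
    pid=$!

    sleep "$warmup"
    kill "$pid"             # first SIGTERM: profile is dumped, counts reset
    sleep "$interval"
    kill "$pid"             # second SIGTERM: new dump overwrites the first

    echo "interval profile written for $prog; process $pid is still running"
}
```

After the second signal, the profiling data file covers only the interval of interest. The process is left running, so it must be stopped separately.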

NOTES

If hiprof finds any previously instrumented shared libraries in the working directory, it will reuse them if they meet current requirements, to reduce re-instrumentation costs. Temporary instrumentation files are created in /tmp. Set the TMPDIR environment variable to a different directory to create the files elsewhere, for example, in a disk partition with more space.
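
Redirecting the temporary instrumentation files might look like the sketch below. The function name hiprof_in_tmpdir, the HIPROF override variable, and the directory /scratch/tmp in the usage note are assumptions for illustration only.

```shell
# hiprof_in_tmpdir DIR HIPROF-ARGUMENT...
# Hypothetical helper: runs hiprof with TMPDIR pointing at DIR.
hiprof_in_tmpdir() {
    if [ $# -lt 2 ]; then
        echo "usage: hiprof_in_tmpdir dir hiprof-argument..." >&2
        return 2
    fi
    dir=$1
    shift
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "hiprof_in_tmpdir: $dir is not a writable directory" >&2
        return 1
    fi
    # The TMPDIR setting is passed to this one hiprof invocation.
    TMPDIR=$dir "${HIPROF:-hiprof}" "$@"
}
```

For example, hiprof_in_tmpdir /scratch/tmp -all myprog would instrument myprog and its shared libraries using /scratch/tmp for temporary files.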

RESTRICTIONS

The default sampled profile only estimates the CPU time spent in each procedure call; profiles made with the -cycles and -faults options measure it.

When timing a program's procedures by measuring machine cycles (with the -cycles option), the 32-bit cycle-counting hardware will wrap if no procedure call or return is executed by the program every few seconds -- for example, because of a long-running loop. If the counter wraps, the profile will be incorrect. Using the -all or -incobj options to profile all nonsystem libraries and procedures can help avoid this restriction.

The -cycles option generates an inaccurate profile if the instrumented program is run on a system whose processors have different cycle speeds. This inaccuracy can be avoided by using hiprof's default sampling profiler or the cc -p/-pg profilers instead, or by running the application on a subset of the processors:

· Select a single processor using the runon command.

· Check the processor speeds using the psrinfo -v command and run the application in a processor set comprising only processors that run at the same speed (see processor_sets(4)).

Approximate performance estimates are as follows but will vary according to the application and the machine's CPU count, type, and clock rate.

The hiprof instrumentation takes ~2s per Mb of program file on a 500-MHz EV6 (21264) Alpha system, using ~10 Mb of memory plus another ~10 Mb per Mb of the largest file. The instrumented files are ~20% larger than the originals, plus ~1 Mb of hiprof code. They run ~4 times slower. By default, each profile data file is at least the size of the instrumented code (and uses this much memory), but these files are very small for the -cycles and -faults options.

If a procedure contains interprocedural branches or interprocedural jumps, that procedure will not be instrumented with the -cycles or -faults option, and no information will be reported about that procedure. Use the -v option to see which procedures were not instrumented. Compilers can optimize return statements or non-returning function calls to interprocedural branches. To avoid this, recompile with the -O0 or -no_inline option.
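
The "every few seconds" limit on the 32-bit cycle counter can be estimated with simple arithmetic: 2^32 cycles divided by the clock rate. The sketch below assumes the counter advances once per clock cycle, which is an assumption to check against the hardware documentation; the function name cycle_wrap_seconds is hypothetical.

```shell
# cycle_wrap_seconds CLOCK_MHZ
# Prints the approximate time, in seconds, for a 32-bit cycle counter to
# wrap, assuming one count per clock cycle (an assumption).
cycle_wrap_seconds() {
    LC_ALL=C awk -v mhz="$1" \
        'BEGIN { printf "%.1f\n", 4294967296 / (mhz * 1000000) }'
}

cycle_wrap_seconds 500    # 500-MHz processor: prints 8.6
```

Under that assumption, a loop that runs for more than about 8.6 seconds on a 500-MHz processor without any procedure call or return would be long enough to wrap the counter.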

FILES

program.hiprof
    Instrumented version of program produced by hiprof

program[.pid][.sequence].hiout
    Profile data file produced by program.hiprof

lib*.so.instrumented-executable
    Instrumented shared libraries produced by hiprof

.hiprof.pid
    Temporary file created and deleted in the current and -dirname path directories.
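
The naming pattern program[.pid][.sequence].hiout can be expressed as a small helper when scripts need to locate a profiling data file. The function name hiout_name is hypothetical, and the sketch assumes the program is given as a plain file name without a directory part.

```shell
# hiout_name PROGRAM [PID] [SEQUENCE]
# Prints the expected profiling data file name. Pass an empty string for
# PID to get a name with a thread sequence number but no process ID.
hiout_name() {
    name=$1
    if [ -n "${2:-}" ]; then name=$name.$2; fi    # process ID (with -pids)
    if [ -n "${3:-}" ]; then name=$name.$3; fi    # thread sequence (with -threads)
    printf '%s\n' "$name.hiout"
}
```

For example, hiout_name myprog 4711 prints myprog.4711.hiout, and hiout_name myprog "" 2 prints myprog.2.hiout.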

SEE ALSO

Introduction: prof_intro(1)

atom(1), cc(1), dxprof(1), fork(2), gprof(1), kill(1), ld(1), pixie(1), processor_sets(4), psrinfo(1), pthread(3), runon(1), uprofile(1). (dxprof is available as an option.)

Programmer's Guide
