If you have not already done so, you must collect performance data for synprog and open the experiment test.1.er. See Collecting Data for the synprog Example for instructions.
In this example you use the Callers-Callees tab to explore how the time used in a function can be attributed to its callers.
The Callers-Callees tab is divided into three panes. In the center pane is the selected function. In the pane above are the callers of the selected function, and in the pane below are the functions that are called by the selected function, which are termed callees.
The Callers pane shows two functions that call the selected function, gpf_b() and gpf_a(). The Callees pane is empty because gpf_work() does not call any other functions. Such functions are called "leaf functions."
Examine the attributed user CPU time column in the Callers pane. Attributed time for a caller is the time spent in the selected function that is due to calls from that caller. Most of the time in gpf_work() is due to calls from gpf_b(). Much less time is due to calls from gpf_a().
To see why gpf_b() calls account for over ten times as much time in gpf_work() as calls from gpf_a(), you must examine the source code for the two callers.
gpf_a() becomes the selected function, and moves to the center pane; its callers appear in the Callers pane, and gpf_work(), its callee, appears in the Callees pane.
gpf_a() calls gpf_work() ten times with an argument of 1, whereas gpf_b() calls gpf_work() only once, but with an argument of 10. The arguments from gpf_a() and gpf_b() are passed to the formal argument amt in gpf_work().
Now examine the code for gpf_work(), to see why the way the callers call gpf_work() makes a difference.
Examine the line in which the variable imax is computed: imax is the upper limit for the following for loop.The time spent in gpf_work() depends on the square of the argument amt. So ten times as much time is spent on one call from a function with an argument of 10 (400 iterations) as is spent on ten calls from a function with an argument of 1 (10 instances of 4 iterations).
In gprof, however, the amount of time spent in a function is estimated from the number of times the function is called, regardless of how the time depends on the function's arguments or any other data that it has access to. So for an analysis of synprog, gprof incorrectly attributes ten times as much time to calls from gpf_a() as it does to calls from gpf_b(). This is the gprof fallacy.
See also | |
---|---|
The Callers-Callees Tab The Source Tab Exclusive, Inclusive, and Attributed Metrics |