Simple Metric Analysis

If you have not already done so, you must collect performance data for synprog and open the first experiment, test.1.er. See Collecting Data for the synprog Example for instructions.

In this part of the example, you examine User CPU times for two functions, cputime() and icputime(). Both functions contain a for loop that increments a variable x by one. In cputime(), x is a single-precision floating-point variable, but in icputime(), x is an integer variable.
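The two loops can be pictured with the following minimal C sketch. It is not the synprog source: the function names float_loop() and int_loop() and the ITERATIONS count are hypothetical stand-ins chosen for illustration, and the real cputime() and icputime() may differ in detail.

```c
#include <stdio.h>

/* Hypothetical loop count, chosen only for illustration. */
#define ITERATIONS 1000000L

/* Stand-in for cputime(): x is a single-precision float, and the
   constant 1.0 has type double, so the addition is done in double. */
float float_loop(void)
{
    float x = 0.0f;
    long j;

    for (j = 0; j < ITERATIONS; j++) {
        x = x + 1.0;
    }
    return x;
}

/* Stand-in for icputime(): x is an integer, so no conversion is needed. */
int int_loop(void)
{
    int x = 0;
    long j;

    for (j = 0; j < ITERATIONS; j++) {
        x = x + 1;
    }
    return x;
}
```

Both functions perform the same number of increments; only the type of x differs, which is the difference the rest of this exercise examines.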

  1. Locate cputime() and icputime() in the Functions tab.

    You can use the Find tool to find the functions instead of scrolling the display.

    Compare the exclusive User CPU time for the two functions. Much more time is spent in cputime() than in icputime().

  2. From the File menu, choose Create New Window (Alt-F, N).

    A new Analyzer window is displayed with the same data. Position the windows so that you can see both of them.

  3. In the Functions tab of the first window, click cputime() to select it, and then click the Source tab.
  4. In the Functions tab of the second window, click icputime() to select it, and then click the Source tab.

    The annotated source listing tells you which lines of code are responsible for the CPU time. Most of the time in both functions is used by the loop line and the line in which x is incremented.

    The time spent on the loop line in icputime() is approximately the same as the time spent on the loop line in cputime(), but the line in which x is incremented takes much less time to execute in icputime() than the corresponding line in cputime().

  5. In both windows, click the Disassembly tab and locate the instructions for the line of source code in which x is incremented.

    You can also find these instructions by choosing High Metric Value in the Find tool combo box and searching.

    In cputime(), a significant amount of time is spent executing the fstod and fdtos instructions. These instructions convert the value of x from a single-precision floating-point value to a double-precision floating-point value and back again. The conversion is needed so that x can be incremented by 1.0, which is a double-precision floating-point constant.

    In icputime(), all that is involved is a load, add, and store operation that takes approximately a third of the time of the corresponding set of instructions in cputime(), because no conversions are necessary. The value 1 does not need to be loaded into a register; it can be added directly to x by a single instruction.

  6. When you have finished the exercise, close the new Analyzer window.
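The conversions seen in step 5 follow from the C rule that adding a float to a double constant is carried out in double precision. The sketch below shows this by comparing the sizes of the two expression types. The function names are hypothetical and are not part of synprog, and whether a compiler actually emits conversion instructions such as fstod and fdtos depends on the compiler, the target, and the optimization options.

```c
#include <stddef.h>

/* Size of the result type when a float is added to the double constant 1.0.
   The float operand is promoted, so the expression has type double. */
size_t size_with_double_constant(void)
{
    float x = 0.0f;
    return sizeof(x + 1.0);
}

/* Size of the result type when a float is added to the float constant 1.0f.
   Both operands are float, so the expression has type float. */
size_t size_with_float_constant(void)
{
    float x = 0.0f;
    return sizeof(x + 1.0f);
}
```

The first function returns sizeof(double) and the second returns sizeof(float), which is why storing the result of x + 1.0 back into a float variable implies a conversion in each direction.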

Extension Exercise

Edit the source code for synprog, and change the type of x to double in cputime(). Recompile the program and record a new experiment by typing the following in the terminal window that you used earlier to collect data for synprog:

% make
% collect synprog

Open the new experiment, test.3.er, in the Performance Analyzer.

What effect does the change to x have on the time? What differences do you see in the annotated disassembly listing?

See also
The Functions Tab
The Source Tab
The Disassembly Tab
