Pocket Computing @ WRL

The Itsy Instruction Profiler


     
 

Overview

Profiling on the Itsy is done in two steps:

  1. Data is gathered about applications as they run on an Itsy. This step is called profiling.
  2. The acquired data is processed. This step is done on a host computer.

Installation

First, install the prof package on your Itsy (see the package installation instructions). This installs both the itsy_prof kernel module and the profd daemon. Once the package is installed, run the sync command on the Itsy and then reboot it. You can now profile.

Second, install the unstripped versions of the Itsy libraries and kernel on your host system; these are contained in the profile.tar.gz file that is shipped with Itsy Release 4.1 or later. The directory in which you install these unstripped versions is the "profile" directory; we suggest using /usr/local/arm-unknown-linuxelf/src/profile. Note that the kernel and modules in the profile directory do not necessarily match the ones shipped with your Itsy. However, they do match the binaries for the kernel and modules (on the ramdisk) included in this distribution. If you want kernel debugging information from the profiler, you will probably need to install the shipped kernel and ramdisks on your Itsy.

Notes:

  1. Make sure the bin directory is in your host-machine path before attempting to process profile data.

  2. The location of the "profile" directory must be set in the bin/itsyscan program, which is a Perl script. To do so, change the following line in the program:

    $profhome = "/wrl/proj/itsy/profile";

    so that it contains the full pathname of the directory where you installed the unstripped files. In our example, that would be:

    $profhome = "/usr/local/arm-unknown-linuxelf/src/profile";

You can now process the profile data.
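
If you prefer to script the $profhome change in note 2 rather than edit bin/itsyscan by hand, the following is a minimal sketch. It assumes the assignment appears on one line in the form quoted above; the function name set_profhome is hypothetical and is not part of the Itsy distribution.

```python
import re

# Matches a single-line assignment of the form:  $profhome = "...";
PROFHOME_LINE = re.compile(r'^([ \t]*\$profhome[ \t]*=[ \t]*")[^"]*(";.*)$', re.MULTILINE)


def set_profhome(script_text, new_path):
    """Return script_text with the first $profhome assignment pointing at new_path."""
    # A function is used as the replacement so that backslashes in new_path
    # are never interpreted as regex escapes.
    new_text, replaced = PROFHOME_LINE.subn(
        lambda m: m.group(1) + new_path + m.group(2), script_text, count=1
    )
    if replaced == 0:
        raise ValueError("no $profhome assignment found")
    return new_text
```

To use it, read bin/itsyscan into a string, pass it through set_profhome with the full pathname of your profile directory, and write the result back. Keep a copy of the original script in case the line in your release differs from the one shown here.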

Profiling (i.e., Gathering Data)

The Itsy profiler is an interrupt-driven statistical profiler. The profiling functionality is provided by the profd daemon.

To start profiling, on the Itsy create an empty "database" directory (i.e., db) for storing profiles, and then invoke profd. For example:

% mkdir db
% profd db & 

profd accepts a number of command-line options. These are:

  • a "-f" option to specify the sampling frequency in samples per second (defaults to 256);
  • a "-r" option to specify whether or not to randomize the sampling period (on by default);
  • a "-p" option to separate out data by process id (pid); and
  • a "-v" option that will result in verbose logging.
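
To see why randomizing the sampling period can matter, consider a workload whose behavior repeats at exactly the sampling interval. The toy simulation below is only an illustration of that general effect: it does not reproduce how profd randomizes its period (this page does not say), and the workload, tick units, and function names are all invented for the example.

```python
import random

PERIOD = 1000  # length of one cycle of the toy workload, in arbitrary ticks


def phase_at(t):
    """Toy workload: code 'A' runs in the first half of each period, 'B' in the second."""
    return "A" if t % PERIOD < PERIOD // 2 else "B"


def take_samples(n, interval, jitter, seed=0):
    """Take n samples; each gap is interval plus a random offset in [-jitter, jitter]."""
    rng = random.Random(seed)
    t = 0
    counts = {"A": 0, "B": 0}
    for _ in range(n):
        t += interval + rng.randint(-jitter, jitter)
        counts[phase_at(t)] += 1
    return counts


# Fixed period in lockstep with the workload: every sample lands in 'A'.
fixed = take_samples(1000, interval=1000, jitter=0)
# Randomized period: samples spread over both halves of the workload.
jittered = take_samples(1000, interval=1000, jitter=500)
```

With the fixed period, 'B' is never observed even though it accounts for half of the run time; with the randomized period, both phases are sampled.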

To stop profiling, kill the profd process. The daemon will catch the signal, save its in-memory sample data to files in "db", and terminate.

Here is an example directory listing after running profd on an Itsy for a couple of minutes:

/bin/ls -l db
total 30
-rw-r--r--   1 root     root          134 Jun  4 12:05 bash-0e3383fb.prof
-rw-r--r--   1 root     root          130 Jun  4 12:05 gmanager-d98893c8.prof
-rw-r--r--   1 root     root          191 Jun  4 12:05 kaffe-dce80653.prof
-rw-r--r--   1 root     root         2079 Jun  4 12:05 kernel-00000000.prof
-rw-r--r--   1 root     root        13022 Jun  4 12:05 kernel.syms
-rw-r--r--   1 root     root          305 Jun  4 12:05 ldso1-4631481a.prof
-rw-r--r--   1 root     root          408 Jun  4 12:05 libawtso-73723a7c.prof
-rw-r--r--   1 root     root          155 Jun  4 12:05 libcso4627-5f673247.prof
-rw-r--r--   1 root     root          411 Jun  4 12:05 libcso6-0fa854e0.prof
-rw-r--r--   1 root     root          211 Jun  4 12:05 libggiso-bb74823f.prof
-rw-r--r--   1 root     root          717 Jun  4 12:05 libkaffevmso-e86f76dc.prof
-rw-r--r--   1 root     root          234 Jun  4 12:05 libnativeso-50d434a5.prof
-rw-r--r--   1 root     root          220 Jun  4 12:05 nomap-00000000.prof
-rw-r--r--   1 root     root          122 Jun  4 12:05 open-b03fac0d.prof
-rw-r--r--   1 root     root          138 Jun  4 12:05 tcsh-80dc1857.prof

The filename format is name-imageid[-pid].prof, where name is the executable image whose unique identifier (i.e., signature) is imageid, and pid is the process id of the executable. Note that the imageid is generated by itsyscan, and pid is included only if the -p option is given to profd. Also note that profd scans the entire system, starting at "/", upon startup.

Each .prof profile is a plain ASCII file (future versions could use compression via zlib). The profile starts with a short header containing key/value pairs that indicate the image pathnames, profiler version, etc. This header is followed by sample data, arranged in lines. Each line has the format "offset count", where "offset" is a hex file offset (i.e., a byte offset into the executable image, suitable for lseek-ing) and "count" is the number of samples that landed on the instruction at that offset.

Finally, note that the SA1100 drains its pipeline before handling an interrupt (we believe), so instructions may not be uniformly sampled.
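
The filename and sample-line formats described above can be read with a few lines of code. The sketch below rests on several assumptions that this page does not confirm: that imageid is always eight hex digits (as in the listing), that "count" is written in decimal, and that any non-blank line that does not look like "offset count" belongs to the header. The header syntax is not specified here, so header lines are returned unparsed. The function names are hypothetical.

```python
import re

# name-imageid[-pid].prof; imageid assumed to be 8 hex digits, as in the listing.
PROF_NAME = re.compile(
    r"^(?P<name>.+?)-(?P<imageid>[0-9a-f]{8})(?:-(?P<pid>\d+))?\.prof$"
)
# A sample line: hex offset, whitespace, count (count assumed to be decimal).
SAMPLE_LINE = re.compile(r"^\s*(?:0x)?(?P<offset>[0-9a-fA-F]+)\s+(?P<count>\d+)\s*$")


def parse_prof_name(filename):
    """Split a profile filename into (name, imageid, pid); pid is None if absent."""
    m = PROF_NAME.match(filename)
    if m is None:
        raise ValueError(f"not a profile filename: {filename!r}")
    pid = m.group("pid")
    return m.group("name"), m.group("imageid"), int(pid) if pid is not None else None


def parse_prof_text(text):
    """Return (header_lines, samples), where samples maps file offset -> count."""
    header_lines = []
    samples = {}
    for line in text.splitlines():
        m = SAMPLE_LINE.match(line)
        if m:
            offset = int(m.group("offset"), 16)
            samples[offset] = samples.get(offset, 0) + int(m.group("count"))
        elif line.strip():
            # Header syntax is not specified on this page, so keep the raw line.
            header_lines.append(line)
    return header_lines, samples
```

For example, parse_prof_name("kaffe-dce80653.prof") yields the name "kaffe", the imageid "dce80653", and no pid. A header line that happens to consist of a hex-looking word followed by a number would be misread as a sample, so check the header of a real profile before relying on this.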

Processing the Data

Data processing is done on a host machine (e.g., an x86). Copy the entire database directory created by profd (db in the example above) to this machine for processing. The most basic level of information available is a summary of where the cycles (i.e., samples) were spent on a per-image basis. Here is typical output:

itsyprof *.prof
--------------------------------------------------------------
  cycles         %     cum%  image
   67448    58.37%   58.37%  crafty
   28278    24.47%   82.84%  kernel
    8477     7.34%   90.18%  libkaffevm.so
    4951     4.28%   94.47%  libc.so.6
    4008     3.47%   97.93%  libawt.so
    1321     1.14%   99.08%  kaffe
     876     0.76%   99.84%  nomap
      49     0.04%   99.88%  profd
      49     0.04%   99.92%  libggi.so
      32     0.03%   99.95%  libnative.so
      25     0.02%   99.97%  ld.so.1
      17     0.01%   99.98%  pppd
      10     0.01%   99.99%  libui.so
       2     0.00%   99.99%  libm.so.6
       1     0.00%  100.00%  libc.so.4.6.27
       1     0.00%  100.00%  open
       1     0.00%  100.00%  liboss.so
       1     0.00%  100.00%  bash
       1     0.00%  100.00%  gmanager
       1     0.00%  100.00%  libio.so
  115549                     Total
--------------------------------------------------------------


All you need to generate this summary is a set of profile data files, which profd (see above) generates for you. Note that:

  1. "kernel" includes time spent in loaded modules.

  2. "nomap" includes all instructions not associated with image files (e.g., JIT-compiled Java code).
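
The columns in the per-image summary follow directly from the per-image sample totals: "%" is an image's share of all samples, and "cum%" is the running sum of those shares down the sorted list. The following sketch shows that arithmetic only; it is not itsyprof's code, and the function names are made up for the example.

```python
def summarize(totals):
    """totals maps image name -> sample count; returns (cycles, pct, cum_pct, image) rows."""
    grand_total = sum(totals.values())
    rows = []
    running = 0
    # Largest count first; ties are ordered by image name.
    for image, cycles in sorted(totals.items(), key=lambda kv: (-kv[1], kv[0])):
        running += cycles
        pct = 100.0 * cycles / grand_total if grand_total else 0.0
        cum = 100.0 * running / grand_total if grand_total else 0.0
        rows.append((cycles, pct, cum, image))
    return rows


def format_summary(rows):
    """Lay the rows out in columns similar to the listing above."""
    lines = ["  cycles         %     cum%  image"]
    for cycles, pct, cum, image in rows:
        lines.append(f"{cycles:8d}{pct:9.2f}%{cum:8.2f}%  {image}")
    lines.append(f"{sum(r[0] for r in rows):8d}                     Total")
    return "\n".join(lines)
```

The per-image totals themselves are just the sum of the "count" column of each .prof file.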

Also available is a per-function breakdown (for all images for which unstripped versions are available). Because we keep mostly stripped images on the Itsy, the profiler generates a signature of the image at run time, and compares this signature to the versions available when itsyprof digests the data. A per-user map from signatures to file locations needs to be generated before running the profiler in the per-function mode:

 itsyscan > MAPFILE 

Note that itsyscan searches a set of default directories for unstripped binaries. You can specify other directories on the command line, separating each with a space. Also note that you will have to regenerate this map file every time the images that you are profiling change.

A simple function breakdown example is:

~/utils/itsyprof -f -mapfile map crafty-c642cf09.prof
--------------------------------------------------------------
/wrl/proj/itsy/profile/apps/crafty

  cycles         %     cum%        addr function
   13830    20.50%   20.50%  0x02004934 Evaluate
    5497     8.15%   28.65%  0x02016224 Attacked
    5024     7.45%   36.10%  0x0201173c MakeMove
    4576     6.78%   42.89%  0x0201a7d4 ClearHashTables
    4297     6.37%   49.26%  0x0200bdd4 GenerateCaptures
    4202     6.23%   55.49%  0x02001f64 Search
    3765     5.58%   61.07%  0x02038248 Iterate
...
       1     0.00%  100.00%  0x0202053c EPDTokenize
   67448                                Total
--------------------------------------------------------------
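
A per-function breakdown like the one above amounts to assigning each sampled address to the function that contains it. One common way to do this, sketched below, is a binary search over function start addresses taken from the symbol table of the unstripped image. This page does not describe how itsyprof does it, so treat the code as an illustration under assumptions: the samples are assumed to have already been translated from file offsets to addresses, and the function name per_function_counts is hypothetical.

```python
import bisect


def per_function_counts(symbols, samples):
    """Attribute samples to functions.

    symbols: list of (start_address, function_name) pairs.
    samples: dict mapping sampled address -> sample count.
    """
    symbols = sorted(symbols)
    starts = [addr for addr, _ in symbols]
    counts = {}
    for addr, count in samples.items():
        # Index of the last function whose start address is <= addr.
        i = bisect.bisect_right(starts, addr) - 1
        name = symbols[i][1] if i >= 0 else "<unknown>"
        counts[name] = counts.get(name, 0) + count
    return counts
```

Because only start addresses are used, a sample that falls past the end of the last function is still charged to that function; a fuller version would also check function sizes.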

If you want a function breakdown of kernel code, you will need to pass the kernel map file to itsyprof with the "-kernelmapfile KERNELMAPFILE" flag. profd generates this file automatically, names it kernel.syms, and stores it in the same directory as the *.prof files.

If you want a function breakdown of JIT-compiled Java code, you will need to pass a jitfile to itsyprof with the "-jitfile JITFILE" flag. To generate the jitfile, go to the "System" screen in the Launcher.I program, select "Profile Me", and wait a long time. You will know it is done when the /tmp/kaffe-### file has the following in it:

Done dumping VM profile

where ### is the pid (process id) of the currently running Java session. You may also see the following in the /tmp/kaffe-### file:

java.lang.ThreadDeath
        at java/lang/Thread.stop(Thread.java:355)
        at java/lang/Thread.stop(Thread.java:348)
        at java/lang/Runtime.exit(Runtime.java:94)
        at java/lang/System.exit(System.java:69)
        at DumpProfile.main(DumpProfile.java:18)
        at kaffe/lang/Application.run(Application.java:114)
        at java/lang/Thread.run(Thread.java:300)

Do not worry; this is perfectly normal (it indicates that the thread started to dump the profiling data died a natural death after it finished dumping). The jitfile will be located in /tmp and will be called VMProfile.dump.

Please note that the versions of the files that you are running on your Itsy may not match anything the profiler can find. Also, we don't have unstripped versions of some images (e.g., bash, pppd, the a.out versions of the libc/libm libraries, etc.).

Finally, the command line

itsyprof -f -mapfile map -kernelmapfile kernel.syms -jitfile VMProfile.dump *.prof

will generate the most complete listing possible: the number of cycles attributable to each profiled function, sorted from the most frequently sampled to the least. Here kernel.syms is the kernel map file and VMProfile.dump is the Java system map file.

 

 

The Itsy Project is a joint effort of the Western Research Lab and the Systems Research Center