This appendix describes the optimization phases of the -oldc version of the C compiler and their benefits.
The global optimizer (uopt) is a single program that improves the performance of object programs by transforming existing code into more efficient code sequences. Although the same optimizer handles all languages, it distinguishes between them to take advantage of their different semantics.
The primary benefits of optimization are faster-running programs and smaller object code. The optimizer can also shorten development time: you can leave it to the optimizer to relate programming details to execution-time efficiency and focus instead on the more crucial global structure of your program. Even well-written programs often yield code sequences that can be optimized.
Optimize your programs only after they are fully developed and debugged. Although the optimizer does not alter the flow of control within a program, it may move operations so that the object code does not correspond to the source code. These changed sequences of code may create confusion when using the debugger.
Optimizations are most useful in code that contains loops. The optimizer moves loop-invariant code sequences outside loops so that they are performed only once instead of multiple times. Apart from loop-invariant code, loops often contain loop-induction expressions that can be replaced with simple increments. In programs composed of many loops, global optimization can often reduce the run time by half.
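The two loop transformations described above can be illustrated by hand. The following sketch is not output from uopt; the function names and data are hypothetical, and the second function only approximates what an optimizer might produce from the first.

```c
#include <assert.h>
#include <stddef.h>

/* As written by the programmer: scale * offset is recomputed on every
   pass, and a[i] implies an index-to-address calculation each time. */
long sum_as_written(const long *a, size_t n, long scale, long offset)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i] + scale * offset;
    return total;
}

/* Hand-transformed equivalent (illustrative only). */
long sum_transformed(const long *a, size_t n, long scale, long offset)
{
    long total = 0;
    long k = scale * offset;      /* loop-invariant product, computed once */
    const long *p = a;            /* replaces the a[i] address calculation */
    const long *end = a + n;
    while (p < end) {
        total += *p + k;
        p++;                      /* loop induction becomes a simple increment */
    }
    return total;
}

/* Small self-check: both versions must return the same result. */
void check_loop_transform(void)
{
    long a[4] = {1, 2, 3, 4};
    assert(sum_as_written(a, 4, 3, 5) == sum_transformed(a, 4, 3, 5));
}
```

Writing the first form and letting the optimizer derive something like the second is usually preferable, because the first form states the intent more directly.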
Register usage has a significant impact on program performance. For example, fetching a value from a register is significantly faster than fetching a value from storage. Thus, to perform its intended function, the optimizer must make the best possible use of registers.
In allocating registers, the optimizer selects those data items that are most suited for placement in registers, taking into account their frequency of use and their location in the program structure. In addition, the optimizer assigns values to registers so that their contents move minimally within loops and during procedure invocations.
The optimizer processes one procedure at a time. Large procedures offer more opportunities for optimization because more interrelationships are exposed in terms of constructs and regions.
The uld and umerge phases of the compiler permit global optimization among separate units in the same compilation. Often, programs are divided into separate files that are compiled separately and referred to as modules or compilation units. Compiling them separately saves time during program development because a change requires recompilation of only one module, not the entire program.
Traditionally, program modularity restricted the optimization of code to a single compilation unit at a time. For example, calls to procedures that reside in other modules could not be fully optimized with the code that called them. The uld and umerge phases of the compiler system overcome this deficiency. The uld phase links multiple compilation units into a single compilation unit. Then, umerge orders the procedures for optimal processing by the global optimizer (uopt).
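As a rough illustration of why module boundaries matter, the following sketch shows two hypothetical modules in one listing. The file names in the comments, the function names, and the merged form are assumptions made for this example; the exact transformations that uopt applies after uld and umerge are not specified here.

```c
#include <assert.h>

/* ---- would live in b.c (hypothetical) ---- */
long scale(long v)
{
    return v * 4;
}

/* ---- would live in a.c (hypothetical) ---- */
/* Compiled alone, a.c sees only a declaration of scale(), so each
   iteration of the loop must make a real procedure call. */
long total_separate(const long *v, int n)
{
    long t = 0;
    int i;
    for (i = 0; i < n; i++)
        t += scale(v[i]);
    return t;
}

/* With both modules linked into one unit, the body of scale() is
   visible at the call site and can be merged into the loop. */
long total_merged(const long *v, int n)
{
    long t = 0;
    int i;
    for (i = 0; i < n; i++)
        t += v[i] * 4;
    return t;
}

/* Small self-check: both forms must return the same result. */
void check_merge(void)
{
    long v[3] = {1, 2, 3};
    assert(total_separate(v, 3) == total_merged(v, 3));
}
```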
Table D-1 summarizes the functions of each of the -O options to the cc -oldc command.
Option | Result |
-O3 | The uld and umerge phases process the output from the compilation phase of the compiler, which produces symbol table information and the program text in an internal format called ucode. The uld phase combines all the ucode files and symbol tables, and passes control to umerge. The umerge phase reorders the ucode for optimal processing by uopt. Upon completion, umerge passes control to uopt, which performs global optimizations on the program. |
-O2 | The uld and umerge phases are bypassed and only the global optimizer (uopt) phase executes. It performs optimization only within the bounds of individual compilation units. |
-O1 | The uld, umerge, and uopt phases are bypassed. However, the code generator and the assembler perform basic optimizations in a more limited scope. |
-O0 | The uld, umerge, and uopt phases are bypassed, and the assembler bypasses certain optimizations that it normally performs. |
The following examples assume that the program prog1 consists of three files: a.c, b.c, and c.c.
To perform procedure merging optimizations (-O3) on all three files, enter the following command:
% cc -oldc -O3 -o prog1 a.c b.c c.c
If you normally use the -c option to compile the object file (.o), follow these steps:
% cc -oldc -j a.c
% cc -oldc -j b.c
% cc -oldc -j c.c
The -j option causes the compiler driver to produce a .u file. None of the remaining compiler phases are executed.
The .u file contains the standard output of the first pass of the compiler (which is referred to as the front end of the compiler). The file is written in ucode, an internal language used by the compiler.
To perform the optimization and complete the compilation, enter the following command:
% cc -oldc -O3 -o prog1 a.u b.u c.u
To ensure that all procedures are optimized regardless of size, specify the -Olimit option at compilation time.
Because compilation time increases with the square of the procedure size, the compiler system enforces an upper limit on the size of a procedure that can be optimized. This limit was set for the convenience of users who place a higher priority on compilation turnaround time than on optimizing an entire procedure. The -Olimit option removes the upper limit and allows users who do not mind a long compilation to fully optimize their procedures.
You may want to optimize modules that are frequently called from other programs to reduce the compilation and optimization time required for programs calling these modules.
In the examples that follow, b.c and c.c represent two frequently used modules to be optimized, retaining all information necessary to link them with future programs; future.c represents one such program.
The following steps show how to optimize frequently called modules:
% cc -oldc -j b.c
% cc -oldc -j c.c
The -j option causes the front end, or first pass, of the compiler to produce two ucode files, b.u and c.u.
b.c                 c.c

proc1()             x()
{                   {
  .                   .
  .                   .
}                   }
proc2()             help()
{                   {
  .                   .
  .                   .
}                   }
proc3()             struct
{                   {
  .                   .
  .                   .
}                   } ddata;
struct              y()
{                   {
  .                   .
  .                   .
} work;             }
In this example, future.c calls or references only proc1, proc2, x, ddata, and y in the two modules (b.c and c.c). Thus, you must create a file (named extern for this example) containing the following symbolic names:
proc1 proc2 x ddata y
The structure work and the procedures help and proc3 are used internally only by b.c and c.c, and thus are not included in extern.
If you omit an external symbolic name, an error message is generated when future.c is later compiled and linked with the optimized modules, as described below.
Optimize the b.u and c.u modules, using the extern file, as follows:
% cc -oldc -O3 -kp extern b.u c.u -o keep.o
The -kp option designates that the -p linker option is to be passed to the ucode loader.
Create a ucode file for future.c, then optimize it and link it with keep.o as follows:
% cc -oldc -j future.c
% cc -oldc -O3 future.u keep.o -o test_opt
The following message may appear. It means that the code in future.c is using a symbol from the code in b.c or c.c that was not specified in the file extern.
proc3: multiply defined hidden external (should have been preserved)
If the preceding message appears, include proc3 in the file extern and recompile as follows:
% cc -oldc -O3 -kp extern b.u c.u -o keep.o
% cc -oldc -O3 future.u keep.o -o test_opt
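The symbol relationships in the preceding example can be sketched in C. The source shows only the names, so every body, member, and return type below is a hypothetical placeholder, and future_entry is an invented stand-in for the code in future.c. The three files are shown in one listing for brevity.

```c
/* ---- b.c (bodies are placeholders) ---- */
struct { int calls; } work;                 /* internal: not listed in extern */
int proc3(void) { return ++work.calls; }    /* internal: not listed in extern */
int proc1(void) { return proc3(); }         /* listed in extern */
int proc2(void) { return proc3() + 10; }    /* listed in extern */

/* ---- c.c (bodies are placeholders) ---- */
struct { int total; } ddata;                /* listed in extern */
int help(int v) { return v * 2; }           /* internal: not listed in extern */
int x(void) { ddata.total += help(3); return ddata.total; }   /* listed */
int y(void) { return help(ddata.total); }   /* listed in extern */

/* ---- future.c (hypothetical caller) ---- */
/* Refers only to proc1, proc2, x, y, and ddata, which is why exactly
   those five names appear in the file extern. */
int future_entry(void)
{
    int a = proc1();
    int b = proc2();
    int c = x();
    int d = y();
    return a + b + c + d + ddata.total;
}
```

If future_entry also called proc3 directly, proc3 would have to be added to the file extern, which is the situation reported by the message shown earlier.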
Building a ucode object library is similar to building a COFF object library. First, compile the source files into ucode object files using the -j option:
% cc -oldc -j a.c
% cc -oldc -j b.c
% cc -oldc -j c.c
Then, enter the following commands to build a ucode library (libtest_opt.b) containing object files for a.c, b.c, and c.c:
% ar -crs libtest_opt.b a.u b.u c.u
The names of ucode libraries should have the suffix .b.
Using ucode object libraries is similar to using COFF object libraries. To load from a ucode library, specify the -klx option to the compiler driver or the ucode loader, where x is the library name without the lib prefix and the .b suffix. To load from the ucode library file created in the previous example, enter the following command:
% cc -oldc -O3 file1.u file2.u -kltest_opt -o output
Libraries are searched in the order in which they appear on the command line, so that order is important. If a library is built from both assembly and high-level language routines, the ucode object library contains code only for the high-level language routines, whereas the COFF object library contains all of the routines. In this case, to ensure that all modules are loaded from the proper library, you must specify both the ucode object library and the COFF object library to the ucode loader.
If the compiler driver performs both a ucode load step and a final load step, the object file produced by the ucode load step takes the position, in the final load step, of the first ucode file specified or created on the command line.