Options That Control Optimization
The following options control
various sorts of optimizations:
-O
-O1
Optimize. Optimizing
compilation takes somewhat more time, and a lot more memory for a large
function.
Without ‘-O’,
the compiler’s goal is to reduce the cost of compilation and to make debugging
produce the expected results. Statements are independent: if you stop the
program with a breakpoint between statements, you can then assign a new
value to any variable or change the program counter to any other statement
in the function and get exactly the results you would expect from the source
code.
Without ‘-O’,
the compiler only allocates variables declared register
in registers. The resulting compiled code is a little worse than that produced
by PCC without ‘-O’.
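To illustrate the point about register, here is a minimal sketch in which the accumulator and loop counter are declared register. The function name sum_array is hypothetical; the comments describe the behavior stated above, not a guarantee for any particular machine.

```c
#include <stddef.h>

/* Hypothetical helper: sums n ints.
   Without -O, only variables declared `register` are allocated in
   registers; the other variables are typically kept in memory. */
long sum_array(const int *values, size_t n)
{
    register long total = 0;   /* request a register for the accumulator */
    register size_t i;         /* request a register for the loop counter */

    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

With -O, the compiler makes its own register allocation decisions, so the register keyword matters much less.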
With ‘-O’,
the compiler tries to reduce code size and execution time.
When you specify ‘-O’,
the compiler turns on -fthread-jumps
and -fdefer-pop
on all machines.
The compiler turns on -fdelayed-branch
on machines that have delay slots, and -fomit-frame-pointer
on machines that can support debugging even without a frame pointer. On
some machines the compiler also turns on other flags.
-O2
Optimize even
more. GNU CC performs nearly all supported optimizations that do not involve
a space-speed tradeoff. The compiler does not perform loop unrolling or
function inlining when you specify ‘-O2’.
As compared to ‘-O’,
this option increases both compilation time and the performance of the
generated code.
‘-O2’
turns on all optional optimizations except for loop unrolling, function
inlining, life shortening, and static variable optimizations. It also turns
on frame pointer elimination on machines where doing so does not interfere
with debugging.
-O3
Optimize yet more.
‘-O3’
turns on all optimizations specified by ‘-O2’
and also turns on the -finline-functions option.
-O0
Do not optimize.
If you use multiple ‘-O’
options, with or without level numbers, the last such option is the one
that is effective.
Options of the form -fflag
specify machine-independent flags. Most flags have both positive and negative
forms; the negative form of -ffoo
would be -fno-foo.
In the following options, only one of the forms is listed: the one which
is not the default.
You can figure out the other
form by either removing ‘no-’
or adding it.
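The naming rule can be sketched as a small C helper. The function opposite_flag is hypothetical and is not part of GNU CC; it only performs the textual transformation described above and does not check that the resulting flag actually exists.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the opposite form of a -f flag into out.
   "-ffoo" becomes "-fno-foo", and "-fno-foo" becomes "-ffoo".
   Returns 0 on success, -1 if flag does not start with "-f" or if the
   result does not fit in outlen bytes. */
int opposite_flag(const char *flag, char *out, size_t outlen)
{
    int n;

    if (strncmp(flag, "-f", 2) != 0)
        return -1;

    if (strncmp(flag + 2, "no-", 3) == 0)
        n = snprintf(out, outlen, "-f%s", flag + 5);     /* remove "no-" */
    else
        n = snprintf(out, outlen, "-fno-%s", flag + 2);  /* add "no-" */

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```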
-ffloat-store
Do not store floating
point variables in registers, and inhibit other options that might change
whether a floating point value is taken from a register or memory.
This option prevents undesirable
excess precision on machines such as the 68000 where the floating registers
(of the 68881) keep more precision than a double
is supposed to have. For most programs, the excess precision does only
good, but a few programs rely on the precise definition of IEEE floating
point. Use -ffloat-store
for such programs.
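The effect can be pictured with a hand-written store through a volatile variable, which forces a value out of a floating register and back. This is only a sketch of the idea for a single value; the helper names are hypothetical, and -ffloat-store applies the equivalent treatment to floating point variables without any source changes.

```c
/* Hypothetical helper: forces x through a memory store, so that any
   excess register precision is rounded away to double. */
double round_to_double(double x)
{
    volatile double stored = x;  /* volatile forces a real store and reload */
    return stored;
}

/* Returns 1 if a / b, once stored as a double, compares equal to q.
   On a machine with wider floating registers, comparing the unstored
   quotient directly could give a different answer. */
int quotient_matches(double a, double b, double q)
{
    return round_to_double(a / b) == q;
}
```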
-fno-default-inline
Do not make member
functions inline by default merely because they are defined inside the
class scope (C++ only). Otherwise, when you specify ‘-O’,
member functions defined inside class scope are compiled inline by default;
i.e., you don’t need to add inline
in front of the member function name.
-fno-defer-pop
Always pop the
arguments to each function call as soon as that function returns. For machines
which must pop arguments after a function call, the compiler normally lets
arguments accumulate on the stack for several function calls and pops them
all at once.
-fforce-mem
Force memory operands
to be copied into registers before doing arithmetic on them. This produces
better code by making all memory references potential common subexpressions.
When they are not common subexpressions, instruction combination should
eliminate the separate register-load. The ‘-O2’
option turns on this option.
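As a rough source-level picture of this transformation (the compiler works on its internal representation, not on C source), compare the two hypothetical functions below. In the second, the memory operand is loaded once into a temporary and the arithmetic uses only that copy.

```c
/* As written: *p appears as a memory operand in three places. */
int poly_direct(const int *p)
{
    return *p * *p + *p;
}

/* Hand-written picture of the effect: the memory operand is copied
   into a temporary first, so the three uses become one common value. */
int poly_forced(const int *p)
{
    int t = *p;          /* single load from memory */
    return t * t + t;
}
```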
-fforce-addr
Force memory address
constants to be copied into registers before doing arithmetic on them.
This may produce better
code just as -fforce-mem
may.
-fomit-frame-pointer
Don’t keep the
frame pointer in a register for functions that don’t need one. This avoids
the instructions to save, set up and restore frame pointers; it also makes
an extra register available in many functions.
- Warning:
-fomit-frame-pointer also makes
debugging impossible on some machines.
On some machines, such as
the VAX, this flag has no effect because the standard calling sequence
automatically handles the frame pointer and nothing is saved by pretending
it doesn’t exist. The machine-description macro, FRAME_POINTER_REQUIRED,
controls whether a target machine supports this flag. See Constraints
for Particular Machines to determine register usage with your
target machine.
-fno-inline
Don’t pay attention
to the inline
keyword. Normally this option is used to keep the compiler from expanding
any functions inline.
- Note:
If you are not optimizing, no functions can be expanded inline.
-finline-functions
Integrate all
simple functions into their callers. The compiler heuristically decides
which functions are simple enough to be worth integrating in this way.
If all calls to a given
function are integrated, and the function is declared static,
then the function is normally not output as assembler code in its own right.
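A minimal sketch of the kind of function this option targets follows. The names square and sum_of_squares are hypothetical, and whether the compiler actually integrates the calls depends on its own heuristics.

```c
/* A simple static function: a likely candidate for integration. */
static int square(int x)
{
    return x * x;
}

/* With -finline-functions the compiler may expand both calls in place,
   as if the body read `return a * a + b * b;`. If every call to square
   is integrated, no separate assembler code for it is normally output. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```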
-fkeep-inline-functions
Even if all calls
to a given function are integrated, and the function is declared static,
nevertheless output a separate run-time callable version of the function.
This switch does not affect extern
inline functions.
-fkeep-static-consts
Emit variables
declared static const
when optimization isn’t turned on, even if the variables weren’t referenced.
This option is enabled by default. -fno-keep-static-consts
will force the compiler to check if the variable was referenced, regardless
of whether or not optimization is turned on.
-fno-function-cse
Do not put function
addresses in registers; make each instruction that calls a constant function
contain the function’s address explicitly.
The -fno-function-cse
option results in less efficient code, but some strange hacks that alter
the assembler output may be confused by the optimizations performed when
this option is not used.
-ffast-math
This option allows
GCC to violate some ANSI or IEEE rules and/or specifications in the interest
of optimizing code for speed. For example, it allows the compiler to assume
arguments to the sqrt
function are non-negative numbers and that no floating-point values are
NaNs.
This option should never
be turned on by any ‘-O’
option since it can result in incorrect output for programs which depend
on an exact implementation of IEEE or ANSI rules/specifications for math
functions.
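One example of code that depends on an IEEE rule is a NaN test written as a self-comparison. The sketch below uses hypothetical helper names; it behaves as commented when compiled without -ffast-math, while with -ffast-math the compiler is allowed to assume that no value is a NaN.

```c
/* Hypothetical check that relies on an IEEE rule: a NaN never compares
   equal to itself. If the compiler assumes no value is a NaN, it is
   free to treat this test as always false. */
int is_nan_by_self_compare(double x)
{
    return x != x;
}

/* Produces a NaN at run time; volatile keeps the division from being
   evaluated at compile time. */
double make_nan(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}
```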
The following options control
specific optimizations.
The ‘-O2’
option turns on all of these optimizations except -funroll-loops
and -funroll-all-loops.
On most machines, the ‘-O’
option turns on the -fthread-jumps
and -fdelayed-branch
options, but specific machines may handle it differently.
Use the following flags
in the rare cases when you want to fine-tune optimizations.
-fstrength-reduce
Perform the optimizations
of loop strength reduction and elimination of iteration variables.
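Strength reduction can be pictured at the source level as follows. Both functions are hypothetical, and the second is a hand-written approximation of the transformed loop that shows only the replacement of a multiplication by an addition, not the elimination of the iteration variable.

```c
#include <stddef.h>

/* As written: each iteration multiplies the loop counter by stride. */
long sum_strided(const int *a, size_t n, size_t stride)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i * stride];
    return total;
}

/* Hand-written picture of the reduced loop: the product i * stride is
   replaced by a running offset that grows by stride each iteration. */
long sum_strided_reduced(const int *a, size_t n, size_t stride)
{
    long total = 0;
    size_t offset = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += a[offset];
        offset += stride;   /* addition instead of multiplication */
    }
    return total;
}
```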
-fthread-jumps
Perform optimizations
where we check to see if a jump branches to a location where another comparison
subsumed by the first is found. If so, the first branch is redirected to
either the destination of the second branch or a point immediately following
it, depending on whether the condition is known to be true or false.
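A rough source-level picture of jump threading is sketched below. The functions are hypothetical and the goto version is only an approximation of the redirected control flow; the compiler performs this on jumps in its internal representation.

```c
/* As written: when x > 10 is true, the later test x > 5 is subsumed
   by it, so its outcome is already known on that path. */
int classify(int x)
{
    int score = 0;

    if (x > 10)
        score += 2;
    if (x > 5)
        score += 1;
    return score;
}

/* Hand-written picture of the threaded control flow: the path on which
   x > 10 held goes straight to the code guarded by the second test. */
int classify_threaded(int x)
{
    int score = 0;

    if (x > 10) {
        score += 2;
        goto second_true;   /* x > 10 implies x > 5: skip the re-test */
    }
    if (x > 5) {
second_true:
        score += 1;
    }
    return score;
}
```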
-fcse-follow-jumps
In common subexpression
elimination, scan through jump instructions when the target of the jump
is not reached by any other path. For example, when CSE encounters an if
statement with an else
clause, CSE will follow the jump when the condition tested is false.
-fcse-skip-blocks
This is similar
to ‘-fcse-follow-jumps’,
but causes CSE to follow jumps which conditionally skip over blocks. When
CSE encounters a simple if
statement with no else
clause, ‘-fcse-skip-blocks’
causes CSE to follow the jump around the body of the if.
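The following sketch shows the shape of code where following the jump around an if body can help. The function names are hypothetical, and the second function is only a hand-written picture of the result; whether the compiler finds the common subexpression depends on the target and the other options in effect.

```c
/* a * b is computed before the if and again after it. The body of the
   if changes neither a nor b, so on the path that skips the body the
   second a * b is the same value as the first. */
int scaled(int a, int b, int bonus)
{
    int first = a * b;
    int extra = 0;

    if (bonus > 0)
        extra = bonus;      /* block that is conditionally skipped */
    return first + a * b + extra;
}

/* Hand-written picture of the result: the product is computed once. */
int scaled_cse(int a, int b, int bonus)
{
    int product = a * b;
    int extra = 0;

    if (bonus > 0)
        extra = bonus;
    return product + product + extra;
}
```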
-frerun-cse-after-loop
Re-run common
subexpression elimination after loop optimizations have been performed.
-fexpensive-optimizations
Perform a number
of minor optimizations that are relatively expensive.
-fdelayed-branch
If supported for
the target machine, attempt to reorder instructions to exploit instruction
slots available after delayed branch instructions.
-fschedule-insns
If supported for
the target machine, attempt to reorder instructions to eliminate execution
stalls due to required data being unavailable. This helps machines that
have slow floating point or memory load instructions by allowing other
instructions to be issued until the result of the load or floating point
instruction is required.
-fschedule-insns2
Similar to -fschedule-insns,
but requests an additional pass of instruction scheduling after register
allocation has been done. This is especially useful on machines with a
relatively small number of registers and where memory load instructions
take more than one cycle.
-fshorten-lifetimes
Shorten lifetimes
of pseudo registers which must be allocated into specific hard registers.
On some machines this avoids spilling those specific hard registers and
improves code.
-fcombine-statics
Combine static
variables into a single block to allow the compiler to eliminate redundant
address loads.
-ffunction-sections
Place each function
into its own section in the output file if the target supports arbitrary
sections. The function’s name determines the section’s name in the output
file.
Use this option on systems
where the linker can perform optimizations to improve locality of reference
in the instruction space. HPPA processors running HP-UX and SPARC processors
running Solaris 2 have linkers with such optimizations. Other systems using
the ELF object format as well as AIX may have these optimizations in the
future.
Only use this option when
there are significant benefits from doing so. When you specify this option,
the assembler and linker will create larger object and executable files
and will also be slower. You will not be able to use gprof
on all systems if you specify this option and you may have problems with
debugging if you specify both this option and ‘-g’.
-fcaller-saves
Enable values
to be allocated in registers that will be clobbered by function calls,
by emitting extra instructions to save and restore the registers around
such calls. Such allocation is done only when it seems to result in better
code than would otherwise be produced. This option is enabled by default
on certain machines, usually those which have no call-preserved registers
to use instead.
-funroll-loops
Perform the optimization
of loop unrolling. This is only done for loops whose number of iterations
can be determined at compile time or run time. -funroll-loops
implies both -fstrength-reduce
and
-frerun-cse-after-loop.
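Loop unrolling can be pictured at the source level as below. The functions are hypothetical, and the unroll factor of four is an assumption chosen for illustration; the compiler selects its own factor.

```c
#include <stddef.h>

/* As written: one element per iteration. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hand-unrolled by four: the loop test and increment run once per four
   elements, and a cleanup loop handles the leftover elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    for (; i < n; i++)      /* remaining 0 to 3 elements */
        total += a[i];
    return total;
}
```

The unrolled form trades larger code for fewer loop tests and branches, which is why the unrolling options are left out of the set that -O2 turns on.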
-funroll-all-loops
Perform the optimization
of loop unrolling. This is done for all loops and usually makes programs
run more slowly. -funroll-all-loops
implies -fstrength-reduce
as well as -frerun-cse-after-loop.
-fno-peephole
Disable any machine-specific
peephole optimizations.
-fbranch-probabilities
After running
a program compiled with -fprofile-arcs
(see Options for Debugging
Your Program on GNU CC), you can compile it a second time using -fbranch-probabilities,
to improve optimizations based on guessing the path a branch might take.
-fregmove
Some machines only support 2 operands
per instruction. On such machines, GNU CC might have to do extra copies.
The ‘-fregmove’
option overrides the default for the machine to do the copy before register
allocation.