BBS水木清华站∶精华区
OPTIMIZATION OPTIONS
These options control various sorts of optimizations:
-O
-O1 Optimize. Optimizing compilation takes somewhat
more time, and a lot more memory for a large func-
tion.
Without `-O', the compiler's goal is to reduce the
cost of compilation and to make debugging produce
the expected results. Statements are independent:
if you stop the program with a breakpoint between
statements, you can then assign a new value to any
variable or change the program counter to any other
statement in the function and get exactly the re-
sults you would expect from the source code.
Without `-O', only variables declared register are
allocated in registers. The resulting compiled
code is a little worse than that produced by PCC
without `-O'.
With `-O', the compiler tries to reduce code size
and execution time.
When you specify `-O', the two options
`-fthread-jumps' and `-fdefer-pop' are turned on.
On machines that have delay slots, the
`-fdelayed-branch' option is turned on. For those
machines that can support debugging even without a
frame pointer, the `-fomit-frame-pointer' option is
turned on. On some machines other flags may also
be turned on.
-O2 Optimize even more. Nearly all supported optimiza-
tions that do not involve a space-speed tradeoff
are performed. Loop unrolling and function inlin-
ing are not done, for example. As compared to -O,
this option increases both compilation time and the
performance of the generated code.
-O3 Optimize yet more. This turns on everything -O2
does, and also turns on -finline-functions.
-O0 Do not optimize.
If you use multiple -O options, with or without
level numbers, the last such option is the one that
is effective.
Options of the form `-fflag' specify machine-independent
flags. Most flags have both positive and negative forms;
the negative form of `-ffoo' would be `-fno-foo'. The
following list shows only one form--the one which is not
the default. You can figure out the other form by either
removing `no-' or adding it.
-ffloat-store
Do not store floating point variables in registers.
This prevents undesirable excess precision on ma-
chines such as the 68000 where the floating regis-
ters (of the 68881) keep more precision than a dou-
ble is supposed to have.
For most programs, the excess precision does only
good, but a few programs rely on the precise defi-
nition of IEEE floating point. Use `-ffloat-store'
for such programs.
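A minimal sketch of the effect, assuming a machine whose floating registers keep more precision than a double. The helper store_and_reload is hypothetical (it is not part of GCC); it forces a value through memory by hand, which is roughly what `-ffloat-store' arranges for floating point variables. Once both sides of the comparison have been stored, they have both been rounded to double and compare equal.

```cpp
// Hypothetical helper (not part of GCC): round a value to double precision
// by forcing it through a memory location.
double store_and_reload(double value) {
    volatile double slot = value;  // volatile forces a real store and reload
    return slot;
}

// Compares a freshly computed product against a stored copy of the same
// product. Both sides pass through memory, so excess register precision
// cannot make them differ.
bool product_equals_stored_copy(double a, double b) {
    double stored = store_and_reload(a * b);
    return store_and_reload(a * b) == stored;
}
```

Without the store on the left-hand side, a compiler that keeps a * b in an extended-precision register could, on such machines, compare an unrounded value against a rounded one.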
-fmemoize-lookups
-fsave-memoized
Use heuristics to compile faster (C++ only). These
heuristics are not enabled by default, since they
are only effective for certain input files. Other
input files compile more slowly.
The first time the compiler must build a call to a
member function (or reference to a data member), it
must (1) determine whether the class implements
member functions of that name; (2) resolve which
member function to call (which involves figuring
out what sorts of type conversions need to be
made); and (3) check the visibility of the member
function to the caller. All of this adds up to
slower compilation. Normally, the second time a
call is made to that member function (or reference
to that data member), it must go through the same
lengthy process again. This means that code like
this
cout << "This " << p << " has " << n << " legs.\n";
makes six passes through all three steps. By using
a software cache, a "hit" significantly reduces
this cost. Unfortunately, using the cache intro-
duces another layer of mechanisms which must be im-
plemented, and so incurs its own overhead.
`-fmemoize-lookups' enables the software cache.
Because access privileges (visibility) to members
and member functions may differ from one function
context to the next, g++ may need to flush the
cache. With the `-fmemoize-lookups' flag, the
cache is flushed after every function that is com-
piled. The `-fsave-memoized' flag enables the same
software cache, but when the compiler determines
that the context of the last function compiled
would yield the same access privileges of the next
function to compile, it preserves the cache. This
is most helpful when defining many member functions
for the same class: with the exception of member
functions which are friends of other classes, each
member function has exactly the same access privi-
leges as every other, and the cache need not be
flushed.
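The caching idea described above can be sketched as follows. This is an illustration only and does not show g++ internals: LookupCache, resolve, flush and count_slow_lookups are hypothetical names, and the "resolved" value is a placeholder for the result of the three-step lookup. Flushing after every call stands in for the case where the cache cannot be preserved.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the member lookup cache; not g++ internals.
struct LookupCache {
    std::map<std::string, int> cache;  // member name -> resolved id
    int slow_lookups = 0;              // times the full lookup had to run

    int resolve(const std::string& name) {
        std::map<std::string, int>::iterator it = cache.find(name);
        if (it != cache.end()) {
            return it->second;                    // cache hit: skip the slow path
        }
        ++slow_lookups;                           // cache miss: do the full lookup
        int id = static_cast<int>(name.size());   // placeholder result
        cache[name] = id;
        return id;
    }

    void flush() { cache.clear(); }
};

// Resolves the same member `calls' times and reports how often the slow
// path ran, optionally flushing the cache after every call.
int count_slow_lookups(int calls, bool flush_between) {
    LookupCache lookup;
    for (int i = 0; i < calls; ++i) {
        lookup.resolve("operator<<");
        if (flush_between) {
            lookup.flush();
        }
    }
    return lookup.slow_lookups;
}
```

With the cache preserved, repeated lookups of the same member pay for the slow path once; with a flush after every call, each lookup pays for it again.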
-fno-default-inline
Don't make member functions inline by default mere-
ly because they are defined inside the class scope
(C++ only).
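For illustration, the class below (a hypothetical Counter, not taken from the manual) has one member function defined inside the class scope and one defined outside it. Only the first kind is affected by `-fno-default-inline'.

```cpp
class Counter {
public:
    Counter() : count_(0) {}
    void bump() { ++count_; }  // defined in class scope: inline by default
    int value() const;         // only declared here; defined outside the class
private:
    int count_;
};

// Defined outside the class scope, so not inline unless declared so.
int Counter::value() const { return count_; }

// Small driver used to exercise both member functions.
int bump_twice() {
    Counter c;
    c.bump();
    c.bump();
    return c.value();
}
```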
-fno-defer-pop
Always pop the arguments to each function call as
soon as that function returns. For machines which
must pop arguments after a function call, the com-
piler normally lets arguments accumulate on the
stack for several function calls and pops them all
at once.
-fforce-mem
Force memory operands to be copied into registers
before doing arithmetic on them. This may produce
better code by making all memory references poten-
tial common subexpressions. When they are not com-
mon subexpressions, instruction combination should
eliminate the separate register-load. I am inter-
ested in hearing about the difference this makes.
-fforce-addr
Force memory address constants to be copied into
registers before doing arithmetic on them. This
may produce better code just as `-fforce-mem' may.
I am interested in hearing about the difference
this makes.
-fomit-frame-pointer
Don't keep the frame pointer in a register for
functions that don't need one. This avoids the in-
structions to save, set up and restore frame point-
ers; it also makes an extra register available in
many functions. It also makes debugging impossible
on most machines.
On some machines, such as the Vax, this flag has no
effect, because the standard calling sequence auto-
matically handles the frame pointer and nothing is
saved by pretending it doesn't exist. The machine-
description macro FRAME_POINTER_REQUIRED controls
whether a target machine supports this flag.
-finline-functions
Integrate all simple functions into their callers.
The compiler heuristically decides which functions
are simple enough to be worth integrating in this
way.
If all calls to a given function are integrated,
and the function is declared static, then GCC nor-
mally does not output the function as assembler
code in its own right.
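As a rough picture of what integration means, the sketch below shows a small static function and a hand-written version of its caller with the calls integrated. The function names are made up for this example, and whether the compiler actually integrates square depends on its heuristics.

```cpp
// A simple static function: a likely candidate for integration.
static int square(int x) { return x * x; }

// Caller as written in the source.
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

// What the caller amounts to once both calls are integrated (written by
// hand here). If every call is integrated like this, no separate copy of
// square needs to be output.
int sum_of_squares_integrated(int a, int b) {
    return a * a + b * b;
}
```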
-fcaller-saves
Enable values to be allocated in registers that
will be clobbered by function calls, by emitting
extra instructions to save and restore the regis-
ters around such calls. Such allocation is done
only when it seems to result in better code than
would otherwise be produced.
This option is enabled by default on certain ma-
chines, usually those which have no call-preserved
registers to use instead.
-fkeep-inline-functions
Even if all calls to a given function are integrat-
ed, and the function is declared static, neverthe-
less output a separate run-time callable version of
the function.
-fno-function-cse
Do not put function addresses in registers; make
each instruction that calls a constant function
contain the function's address explicitly.
This option results in less efficient code, but
some strange hacks that alter the assembler output
may be confused by the optimizations performed when
this option is not used.
-fno-peephole
Disable any machine-specific peephole optimiza-
tions.
-ffast-math
This option allows GCC to violate some ANSI or IEEE
rules/specifications in the interest of optimizing
code for speed. For example, it allows the compil-
er to assume arguments to the sqrt function are
non-negative numbers.
This option should never be turned on by any `-O'
option since it can result in incorrect output for
programs which depend on an exact implementation of
IEEE or ANSI rules/specifications for math func-
tions.
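A program that may pass negative values to sqrt can avoid depending on the library's error behavior by checking the argument itself. The sketch below is one way to do that; safe_sqrt and its policy of returning 0.0 for negative input are assumptions made for this example, not something the manual prescribes.

```cpp
#include <cmath>

// Hypothetical wrapper: handles negative input explicitly instead of
// relying on how sqrt reports a domain error.
double safe_sqrt(double x) {
    if (x < 0.0) {
        return 0.0;  // assumed policy for this sketch
    }
    return std::sqrt(x);
}
```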
The following options control specific optimizations. The
`-O2' option turns on all of these optimizations except
`-funroll-loops' and `-funroll-all-loops'.
The `-O' option usually turns on the `-fthread-jumps' and
`-fdelayed-branch' options, but specific machines may
change the default optimizations.
You can use the following flags in the rare cases when
you want to fine-tune which optimizations are performed.
-fstrength-reduce
Perform the optimizations of loop strength reduc-
tion and elimination of iteration variables.
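Strength reduction can be pictured by writing the transformed loop by hand. In the sketch below (function names are made up for this example), the first loop multiplies on every iteration and the second keeps a running term that is updated by addition. It illustrates the idea and is not the compiler's actual output.

```cpp
// Original form: one multiplication per iteration.
long sum_multiples(int n, int stride) {
    long total = 0;
    for (int i = 0; i < n; ++i) {
        total += static_cast<long>(i) * stride;
    }
    return total;
}

// Strength-reduced form: the product i * stride is replaced by a running
// term that grows by stride each time around the loop.
long sum_multiples_reduced(int n, int stride) {
    long total = 0;
    long term = 0;  // always equal to i * stride
    for (int i = 0; i < n; ++i) {
        total += term;
        term += stride;
    }
    return total;
}
```

In the reduced form the counter i is needed only for the loop test; eliminating it in favor of a test on term is the second half of the optimization.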
-fthread-jumps
Perform optimizations where we check to see if a
jump branches to a location where another compari-
son subsumed by the first is found. If so, the
first branch is redirected to either the destina-
tion of the second branch or a point immediately
following it, depending on whether the condition is
known to be true or false.
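The sketch below shows the kind of code where this applies; the function names are made up for this example. In classify, the test x > 5 is subsumed by x > 10: when the first test is true the second is known to be true, so the branch can go straight to the code it guards. classify_threaded is the same logic with the jumps threaded by hand.

```cpp
// As written: the second comparison is re-evaluated even when the first
// one already determines its outcome.
int classify(int x) {
    int result = 0;
    if (x > 10) {
        result += 1;
    }
    if (x > 5) {   // known to be true whenever x > 10 was true
        result += 2;
    }
    return result;
}

// Hand-threaded equivalent: each path performs only the comparisons whose
// outcome is not already known.
int classify_threaded(int x) {
    if (x > 10) {
        return 3;  // both conditions hold
    }
    if (x > 5) {
        return 2;  // only the second condition holds
    }
    return 0;
}
```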
-funroll-loops
Perform the optimization of loop unrolling. This
is only done for loops whose number of iterations
can be determined at compile time or run time.
-funroll-all-loops
Perform the optimization of loop unrolling. This
is done for all loops. This usually makes programs
run more slowly.
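Loop unrolling can be pictured by unrolling a loop by hand. The sketch below assumes an unroll factor of four, which is chosen only for illustration; the factor the compiler picks may differ. A second short loop handles the iterations left over when the count is not a multiple of four.

```cpp
#include <cstddef>
#include <vector>

// Original loop: one element, one loop test per iteration.
int sum_rolled(const std::vector<int>& data) {
    int total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        total += data[i];
    }
    return total;
}

// Unrolled by four (an assumed factor): four elements per loop test.
int sum_unrolled4(const std::vector<int>& data) {
    int total = 0;
    std::size_t i = 0;
    for (; i + 4 <= data.size(); i += 4) {
        total += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    }
    for (; i < data.size(); ++i) {  // leftover iterations
        total += data[i];
    }
    return total;
}
```

The unrolled version executes fewer loop tests and branches at the price of larger code, which is the space-speed tradeoff that keeps these options out of `-O2'.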
-fcse-follow-jumps
In common subexpression elimination, scan through
jump instructions when the target of the jump is
not reached by any other path. For example, when
CSE encounters an if statement with an else clause,
CSE will follow the jump when the condition tested
is false.
-fcse-skip-blocks
This is similar to `-fcse-follow-jumps', but causes
CSE to follow jumps which conditionally skip over
blocks. When CSE encounters a simple if statement
with no else clause, `-fcse-skip-blocks' causes CSE
to follow the jump around the body of the if.
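The sketch below shows the shape of code these two options are concerned with; the function names are made up for this example. The product a * b is computed before a simple if with no else clause and again after it. If CSE follows the jump around the body of the if, it can see that the second product repeats the first. with_shared_product is the same function with the common subexpression shared by hand, and is not actual compiler output.

```cpp
// As written: a * b appears on both sides of a block that may be skipped.
int with_repeated_product(int a, int b, bool bump) {
    int x = a * b;
    if (bump) {      // simple if, no else clause
        x += 1;
    }
    int y = a * b;   // same subexpression as before the if
    return x + y;
}

// Hand-optimized equivalent: the product is computed once and reused.
int with_shared_product(int a, int b, bool bump) {
    const int product = a * b;
    int x = product;
    if (bump) {
        x += 1;
    }
    return x + product;
}
```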
-frerun-cse-after-loop
Re-run common subexpression elimination after loop
optimizations have been performed.
-felide-constructors
Elide constructors when this seems plausible (C++
only). With this flag, GNU C++ initializes y di-
rectly from the call to foo without going through a
temporary in the following code:
A foo (); A y = foo ();
Without this option, GNU C++ first initializes y by
calling the appropriate constructor for type A;
then assigns the result of foo to a temporary; and,
finally, replaces the initial value of `y' with the
temporary.
The default behavior (`-fno-elide-constructors') is
specified by the draft ANSI C++ standard. If your
program's constructors have side effects, using
`-felide-constructors' can make your program act
differently, since some constructor calls may be
omitted.
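One way to see whether constructor calls are being omitted is to count them. The sketch below gives A a static counter (an addition for this example; the A in the text above has no such member) and reports how many constructors ran while y was initialized from foo. The count depends on the compiler, its options and the language standard in effect, so no particular value is claimed here.

```cpp
struct A {
    static int constructions;         // counts every constructor call
    A() { ++constructions; }
    A(const A&) { ++constructions; }  // copies are counted too
};
int A::constructions = 0;

A foo() { return A(); }

// Returns how many constructor calls were made while initializing y.
int constructions_for_one_init() {
    A::constructions = 0;
    A y = foo();
    (void)y;  // silence unused-variable warnings
    return A::constructions;
}
```

A result of 1 means y was initialized directly from the call to foo; larger values mean temporaries were constructed and copied along the way.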
-fexpensive-optimizations
Perform a number of minor optimizations that are
relatively expensive.
-fdelayed-branch
If supported for the target machine, attempt to re-
order instructions to exploit instruction slots
available after delayed branch instructions.
-fschedule-insns
If supported for the target machine, attempt to re-
order instructions to eliminate execution stalls
due to required data being unavailable. This helps
machines that have slow floating point or memory
load instructions by allowing other instructions to
be issued until the result of the load or floating
point instruction is required.
-fschedule-insns2
Similar to `-fschedule-insns', but requests an ad-
ditional pass of instruction scheduling after reg-
ister allocation has been done. This is especially
useful on machines with a relatively small number
of registers and where memory load instructions
take more than one cycle.