between <code>-fno-peephole</code> and <code>-fno-peephole2</code> is in how they

are implemented in the compiler; some targets use one, some use the

other, a few use both.



     <p><code>-fpeephole</code> is enabled by default. 

<code>-fpeephole2</code> is enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.



     <br><dt><code>-fbranch-probabilities</code>

     <dd><br><dt><code>-fno-guess-branch-probability</code>

     <dd>Do not guess branch probabilities using a randomized model.



     <p>Sometimes GCC will opt to use a randomized model to guess branch

probabilities, when none are available from either profiling feedback

(<code>-fprofile-arcs</code>) or <code>__builtin_expect</code>.  This means that

different runs of the compiler on the same program may produce different

object code.



     <p>In a hard real-time system, people don't want different runs of the

compiler to produce code that has different behavior; minimizing

non-determinism is of paramount importance.  This switch allows users to

reduce non-determinism, possibly at the expense of inferior

optimization.



     <p>The default is <code>-fguess-branch-probability</code> at levels

<code>-O</code>, <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.
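
     <p>As a minimal sketch of the <code>__builtin_expect</code> hint mentioned above, the following C fragment marks one branch as unlikely so that the compiler has no need to guess its probability.  The macro <code>unlikely</code> and the function <code>checked_div</code> are hypothetical names chosen for illustration.

```c
/* Hypothetical helper macro: tell GCC the condition is expected to be false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Returns a / b, or 0 when b is zero (the path we expect to be rare). */
int checked_div(int a, int b)
{
    if (unlikely(b == 0))
        return 0;       /* hinted as the rarely taken branch */
    return a / b;
}
```

     <p>The hint only affects how the code is laid out and optimized; the function returns the same values with or without it.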



     <br><dt><code>-freorder-blocks</code>

     <dd>Reorder basic blocks in the compiled function in order to reduce the number of

taken branches and improve code locality.



     <p>Enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.



     <br><dt><code>-freorder-functions</code>

     <dd>Reorder functions in the object file in order to improve code locality.  This is implemented by using special

subsections <code>.text.hot</code> for the most frequently executed functions and

<code>.text.unlikely</code> for unlikely executed functions.  Reordering is done by

the linker, so the object file format must support named sections and the linker must

place them in a reasonable way.



     <p>Also, profile feedback must be available to make this option effective.  See

<code>-fprofile-arcs</code> for details.



     <p>Enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.



     <br><dt><code>-fstrict-aliasing</code>

     <dd>Allows the compiler to assume the strictest aliasing rules applicable to

the language being compiled.  For C (and C++), this activates

optimizations based on the type of expressions.  In particular, an

object of one type is assumed never to reside at the same address as an

object of a different type, unless the types are almost the same.  For

example, an <code>unsigned int</code> can alias an <code>int</code>, but not a

<code>void*</code> or a <code>double</code>.  A character type may alias any other

type.



     <p>Pay special attention to code like this:

     <pre class="example">          union a_union {

            int i;

            double d;

          };

          

          int f() {

            union a_union t;

            t.d = 3.0;

            return t.i;

          }

          </pre>

     The practice of reading from a different union member than the one most

recently written to (called "type-punning") is common.  Even with

<code>-fstrict-aliasing</code>, type-punning is allowed, provided the memory

is accessed through the union type.  So, the code above will work as

expected.  However, this code might not:

     <pre class="example">          int f() {

            union a_union t;

            int* ip;

            t.d = 3.0;

            ip = &amp;t.i;

            return *ip;

          }

          </pre>
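
     <p>One commonly used alternative to the pointer-based access above, offered here as a sketch and not as something this manual prescribes, is to copy the bytes with <code>memcpy</code>, which reads the object as characters and so stays within the aliasing rules.  The function name <code>float_bits</code> is hypothetical, and the example assumes that <code>float</code> and <code>unsigned int</code> have the same size and that <code>float</code> uses the IEEE 754 single format.

```c
#include <string.h>

/* Assumption of this sketch: both types occupy the same number of bytes. */
_Static_assert(sizeof(float) == sizeof(unsigned int),
               "float and unsigned int must have the same size");

/* Copy the object representation of a float into an unsigned int. */
unsigned int float_bits(float f)
{
    unsigned int u;
    memcpy(&u, &f, sizeof u);   /* byte-wise copy, no typed pointer cast */
    return u;
}
```

     <p>Under the stated assumptions, <code>float_bits(1.0f)</code> yields the IEEE 754 encoding of 1.0.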



     <p>Every language that wishes to perform language-specific alias analysis

should define a function that computes, given a <code>tree</code>

node, an alias set for the node.  Nodes in different alias sets are not

allowed to alias.  For an example, see the C front-end function

<code>c_get_alias_set</code>.



     <p>Enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.



     <br><dt><code>-falign-functions</code>

     <dd><dt><code>-falign-functions=</code><var>n</var><code></code>

     <dd>Align the start of functions to the next power-of-two greater than or equal to

<var>n</var>, skipping up to <var>n</var>-1 bytes.  For instance,

<code>-falign-functions=32</code> aligns functions to the next 32-byte

boundary, but <code>-falign-functions=24</code> would align to the next

32-byte boundary only if this can be done by skipping 23 bytes or less.
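
     <p>The arithmetic behind these two cases can be sketched in C as follows.  This models only the rule as worded in this entry and is not GCC's implementation; the function names <code>boundary_for</code> and <code>place_function</code> are hypothetical, and <var>n</var> is assumed to be at least 1.

```c
/* Smallest power of two that is greater than or equal to n (n >= 1). */
unsigned long boundary_for(unsigned long n)
{
    unsigned long b = 1;
    while (b < n)
        b <<= 1;
    return b;
}

/* Start address chosen for a function that would otherwise begin at addr:
   pad up to the boundary only if that skips at most n - 1 bytes. */
unsigned long place_function(unsigned long addr, unsigned long n)
{
    unsigned long b = boundary_for(n);
    unsigned long pad = (b - addr % b) % b;   /* bytes to the next boundary */
    return pad <= n - 1 ? addr + pad : addr;
}
```

     <p>With <var>n</var> = 24 and a function that would start at address 33, reaching address 64 requires 31 bytes of padding, which exceeds 23, so the sketch leaves the function at 33.  At address 50 only 14 bytes are needed, so the function is moved to 64.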



     <p><code>-fno-align-functions</code> and <code>-falign-functions=1</code> are

equivalent and mean that functions will not be aligned.



     <p>Some assemblers only support this flag when <var>n</var> is a power of two;

in that case, it is rounded up.



     <p>If <var>n</var> is not specified, use a machine-dependent default.



     <p>Enabled at levels <code>-O2</code>, <code>-O3</code>.



     <br><dt><code>-falign-labels</code>

     <dd><dt><code>-falign-labels=</code><var>n</var><code></code>

     <dd>Align all branch targets to a power-of-two boundary, skipping up to

<var>n</var> bytes like <code>-falign-functions</code>.  This option can easily

make code slower, because it must insert dummy operations for when the

branch target is reached in the usual flow of the code.



     <p>If <code>-falign-loops</code> or <code>-falign-jumps</code> are applicable and

are greater than this value, then their values are used instead.



     <p>If <var>n</var> is not specified, use a machine-dependent default which is

very likely to be <code>1</code>, meaning no alignment.



     <p>Enabled at levels <code>-O2</code>, <code>-O3</code>.



     <br><dt><code>-falign-loops</code>

     <dd><dt><code>-falign-loops=</code><var>n</var><code></code>

     <dd>Align loops to a power-of-two boundary, skipping up to <var>n</var> bytes

like <code>-falign-functions</code>.  The hope is that the loop will be

executed many times, which will make up for any execution of the dummy

operations.



     <p>If <var>n</var> is not specified, use a machine-dependent default.



     <p>Enabled at levels <code>-O2</code>, <code>-O3</code>.



     <br><dt><code>-falign-jumps</code>

     <dd><dt><code>-falign-jumps=</code><var>n</var><code></code>

     <dd>Align branch targets to a power-of-two boundary, for branch targets

where the targets can only be reached by jumping, skipping up to <var>n</var>

bytes like <code>-falign-functions</code>.  In this case, no dummy operations

need be executed.



     <p>If <var>n</var> is not specified, use a machine-dependent default.



     <p>Enabled at levels <code>-O2</code>, <code>-O3</code>.



     <br><dt><code>-frename-registers</code>

     <dd>Attempt to avoid false dependencies in scheduled code by making use

of registers left over after register allocation.  This optimization

will most benefit processors with lots of registers.  It can, however,

make debugging impossible, since variables will no longer stay in

a "home register".



     <p>Enabled at level <code>-O3</code>.



     <br><dt><code>-fno-cprop-registers</code>

     <dd>After register allocation and post-register allocation instruction splitting,

we perform a copy-propagation pass to try to reduce scheduling dependencies

and occasionally eliminate the copy.



     <p>The copy-propagation pass is enabled at levels <code>-O</code>, <code>-O2</code>, <code>-O3</code>, <code>-Os</code>; <code>-fno-cprop-registers</code> disables it.



   </dl>



   <p>The following options control compiler behavior regarding floating

point arithmetic.  These options trade off between speed and

correctness.  All must be specifically enabled.



     <dl>

<dt><code>-ffloat-store</code>

     <dd>Do not store floating point variables in registers, and inhibit other

options that might change whether a floating point value is taken from a

register or memory.



     <p>This option prevents undesirable excess precision on machines such as

the 68000 where the floating registers (of the 68881) keep more

precision than a <code>double</code> is supposed to have.  Similarly for the

x86 architecture.  For most programs, the excess precision does only

good, but a few programs rely on the precise definition of IEEE floating

point.  Use <code>-ffloat-store</code> for such programs, after modifying

them to store all pertinent intermediate computations into variables.
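
     <p>As a rough illustration of "storing intermediate computations into variables", the sketch below names each intermediate result instead of writing one long expression.  The function name <code>scaled_sum</code> is hypothetical, and whether the stores actually discard excess precision depends on the target and on <code>-ffloat-store</code> being in effect.

```c
/* Each intermediate result gets its own double variable, so that with
   -ffloat-store it is written to memory instead of living in a register. */
double scaled_sum(double a, double b, double c)
{
    double product = a * b;     /* named intermediate */
    double sum = product + c;   /* named intermediate */
    return sum;
}
```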



     <br><dt><code>-ffast-math</code>

     <dd>Sets <code>-fno-math-errno</code>, <code>-funsafe-math-optimizations</code>, <br>

<code>-fno-trapping-math</code>, <code>-ffinite-math-only</code> and <br>

<code>-fno-signaling-nans</code>.



     <p>This option causes the preprocessor macro <code>__FAST_MATH__</code> to be defined.
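
     <p>A program can test this macro to detect the option at compile time.  The following is a minimal sketch; the function name <code>built_with_fast_math</code> is hypothetical.

```c
/* Returns 1 if this translation unit was compiled with -ffast-math
   (that is, __FAST_MATH__ is defined), and 0 otherwise. */
int built_with_fast_math(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```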



     <p>This option should never be turned on by any <code>-O</code> option since

it can result in incorrect output for programs which depend on

an exact implementation of IEEE or ISO rules/specifications for

math functions.



     <br><dt><code>-fno-math-errno</code>

     <dd>Do not set <code>errno</code> after calling math functions that are executed

with a single instruction, e.g., <code>sqrt</code>.  A program that relies on

IEEE exceptions for math error handling may want to use this flag

for speed while maintaining IEEE arithmetic compatibility.



     <p>This option should never be turned on by any <code>-O</code> option since

it can result in incorrect output for programs which depend on

an exact implementation of IEEE or ISO rules/specifications for

math functions.



     <p>The default is <code>-fmath-errno</code>.



     <br><dt><code>-funsafe-math-optimizations</code>

     <dd>Allow optimizations for floating-point arithmetic that (a) assume

that arguments and results are valid and (b) may violate IEEE or

ANSI standards.  When used at link-time, it may include libraries

or startup files that change the default FPU control word or other

similar optimizations.



     <p>This option should never be turned on by any <code>-O</code> option since

it can result in incorrect output for programs which depend on

an exact implementation of IEEE or ISO rules/specifications for

math functions.



     <p>The default is <code>-fno-unsafe-math-optimizations</code>.



     <br><dt><code>-ffinite-math-only</code>

     <dd>Allow optimizations for floating-point arithmetic that assume

that arguments and results are not NaNs or +-Infs.



     <p>This option should never be turned on by any <code>-O</code> option since

it can result in incorrect output for programs which depend on

an exact implementation of IEEE or ISO rules/specifications.



     <p>The default is <code>-fno-finite-math-only</code>.
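
     <p>To illustrate the kind of code this assumption can affect, consider the usual self-comparison test for NaN.  The sketch below behaves as commented under the default <code>-fno-finite-math-only</code>; when NaNs are assumed never to occur, the compiler may treat <code>x != x</code> as always false.  The function names <code>is_nan</code> and <code>make_nan</code> are hypothetical.

```c
/* NaN is the only floating-point value that compares unequal to itself. */
int is_nan(double x)
{
    return x != x;
}

/* Produce a NaN at run time; volatile keeps the division from being
   folded away at compile time. */
double make_nan(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}
```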



     <br><dt><code>-fno-trapping-math</code>

     <dd>Compile code assuming that floating-point operations cannot generate

user-visible traps.  These traps include division by zero, overflow,

underflow, inexact result and invalid operation.  This option implies

<code>-fno-signaling-nans</code>.  Setting this option may allow faster

code if one relies on "non-stop" IEEE arithmetic, for example.



     <p>This option should never be turned on by any <code>-O</code> option since

it can result in incorrect output for programs which depend on

an exact implementation of IEEE or ISO rules/specifications for

math functions.



     <p>The default is <code>-ftrapping-math</code>.



     <br><dt><code>-fsignaling-nans</code>

     <dd>Compile code assuming that IEEE signaling NaNs may generate user-visible

traps during floating-point operations.  Setting this option disables

optimizations that may change the number of exceptions visible with

signaling NaNs.  This option implies <code>-ftrapping-math</code>.



     <p>This option causes the preprocessor macro <code>__SUPPORT_SNAN__</code> to

be defined.



     <p>The default is <code>-fno-signaling-nans</code>.



     <p>This option is experimental and does not currently guarantee to

disable all GCC optimizations that affect signaling NaN behavior.



     <br><dt><code>-fsingle-precision-constant</code>

     <dd>Treat floating point constants as single precision constants instead of

implicitly converting them to double precision constants.
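
     <p>The difference can be observed by comparing an unsuffixed constant with its <code>f</code>-suffixed counterpart.  The following sketch assumes IEEE 754 <code>float</code> and <code>double</code>; the function name <code>constant_is_double</code> is hypothetical.

```c
/* Returns 1 when the unsuffixed constant 0.1 carries more precision than
   0.1f, which is the case unless constants are treated as single precision. */
int constant_is_double(void)
{
    double d = 0.1;          /* double precision by default */
    double widened = 0.1f;   /* single precision, then widened to double */
    return d != widened;
}
```

     <p>Without <code>-fsingle-precision-constant</code> the two values differ, because 0.1 is not exactly representable and the two formats round it differently.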



   </dl>



   <p>The following options control optimizations that may improve

performance, but are not enabled by any <code>-O</code> options.  This

section includes experimental options that may produce broken code.