<dd>When scheduling after register allocation, use the superblock scheduling
algorithm. Superblock scheduling allows motion across basic block boundaries,
resulting in faster schedules. This option is experimental, as not all machine
descriptions used by GCC model the CPU closely enough to avoid unreliable
results from the algorithm.
<p>This only makes sense when scheduling after register allocation, i.e. with
<code>-fschedule-insns2</code> or at <code>-O2</code> or higher.
<br><dt><code>-fsched2-use-traces</code>
<dd>Use the <code>-fsched2-use-superblocks</code> algorithm when scheduling after register
allocation and additionally perform code duplication in order to increase the
size of superblocks using the tracer pass. See <code>-ftracer</code> for details on
trace formation.
<p>This mode should produce faster but significantly longer programs. Also,
without <code>-fbranch-probabilities</code> the traces constructed may not match
reality and may hurt performance. This only makes
sense when scheduling after register allocation, i.e. with
<code>-fschedule-insns2</code> or at <code>-O2</code> or higher.
<br><dt><code>-fcaller-saves</code>
<dd>Enable values to be allocated in registers that will be clobbered by
function calls, by emitting extra instructions to save and restore the
registers around such calls. Such allocation is done only when it
seems to result in better code than would otherwise be produced.
<p>This option is always enabled by default on certain machines, usually
those which have no call-preserved registers to use instead.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.
<br><dt><code>-fmove-all-movables</code>
<dd>Forces all invariant computations in loops to be moved
outside the loop.
<br><dt><code>-freduce-all-givs</code>
<dd>Forces all general-induction variables in loops to be
strength-reduced.
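<p>To make the two loop transformations above concrete, the following is a minimal hand-written sketch of what moving an invariant computation out of a loop and strength-reducing an induction variable mean at the source level. The function names and parameters are hypothetical and chosen only for illustration; this is not the code GCC generates, and the optimizer works on its internal representation rather than on C source.

```c
#include <stddef.h>

/* Original form: scale * bias does not change inside the loop (it is
   loop-invariant), and i * stride is a linear function of the loop
   counter (a general induction variable). */
long sum_naive(const long *a, size_t n, long scale, long bias, size_t stride)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i * stride] + scale * bias;
    return total;
}

/* Hand-transformed form, written out only to illustrate the idea:
   the invariant product is computed once before the loop, and the
   multiplication i * stride is replaced by a running addition. */
long sum_transformed(const long *a, size_t n, long scale, long bias, size_t stride)
{
    long total = 0;
    long k = scale * bias;   /* invariant moved outside the loop */
    size_t idx = 0;          /* strength-reduced replacement for i * stride */
    for (size_t i = 0; i < n; i++) {
        total += a[idx] + k;
        idx += stride;
    }
    return total;
}
```

<p>Both functions compute the same result; whether the second form is actually faster depends on the target and on the structure of the loop.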
<p><em>Note:</em> When compiling programs written in Fortran,
<code>-fmove-all-movables</code> and <code>-freduce-all-givs</code> are enabled
by default when you use the optimizer.
<p>These options may generate better or worse code; results are highly
dependent on the structure of loops within the source code.
<p>These two options are intended to be removed someday, once
they have helped determine the efficacy of various
approaches to improving loop optimizations.
<p>Please let us (<a href="mailto:gcc@gcc.gnu.org">gcc@gcc.gnu.org</a> and <a href="mailto:fortran@gnu.org">fortran@gnu.org</a>)
know how use of these options affects
the performance of your production code. We're very interested in code that runs <em>slower</em>
when these options are <em>enabled</em>.
<br><dt><code>-fno-peephole</code>
<dd><dt><code>-fno-peephole2</code>
<dd>Disable any machine-specific peephole optimizations. The difference
between <code>-fno-peephole</code> and <code>-fno-peephole2</code> is in how they
are implemented in the compiler; some targets use one, some use the
other, a few use both.
<p><code>-fpeephole</code> is enabled by default. <code>-fpeephole2</code> is enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.
<br><dt><code>-fno-guess-branch-probability</code>
<dd>Do not guess branch probabilities using a randomized model.
<p>Sometimes GCC will opt to use a randomized model to guess branch
probabilities, when none are available from either profiling feedback
(<code>-fprofile-arcs</code>) or <code>__builtin_expect</code>. This means that
different runs of the compiler on the same program may produce different
object code.
<p>In a hard real-time system, people don't want different runs of the
compiler to produce code that has different behavior; minimizing
non-determinism is of paramount importance. This switch allows users to
reduce non-determinism, possibly at the expense of inferior
optimization.
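<p>The text above names <code>__builtin_expect</code> as one source of branch probabilities that makes guessing unnecessary. A minimal sketch of how such a hint might be written follows. The <code>likely</code> and <code>unlikely</code> macros and the <code>checked_sum</code> function are hypothetical names introduced here for illustration; only <code>__builtin_expect</code> itself is a GCC built-in.

```c
#include <stddef.h>

/* Hypothetical convenience macros wrapping the GCC built-in.
   __builtin_expect(exp, c) returns exp and tells the compiler that
   exp is expected to equal c. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum the entries of v, or return -1 as soon as a negative entry is
   seen. The negative case is marked as the unexpected branch, so the
   compiler has an explicit probability for it instead of a guess. */
long checked_sum(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(v[i] < 0))
            return -1;
        total += v[i];
    }
    return total;
}
```

<p>The hint only affects how the code is laid out and optimized; the function returns the same values with or without it.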
<p>The default is <code>-fguess-branch-probability</code> at levels
<code>-O</code>, <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.
<br><dt><code>-freorder-blocks</code>
<dd>Reorder basic blocks in the compiled function in order to reduce the number of
taken branches and improve code locality.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>.
<br><dt><code>-freorder-blocks-and-partition</code>
<dd>In addition to reordering basic blocks in the compiled function in order
to reduce the number of taken branches, partition hot and cold basic blocks
into separate sections of the assembly and .o files, to improve
paging and cache locality performance.
<br><dt><code>-freorder-functions</code>
<dd>Reorder functions in the object file in order to
improve code locality. This is implemented by using special
subsections <code>.text.hot</code> for the most frequently executed functions and
<code>.text.unlikely</code> for unlikely executed functions. Reordering is done by
the linker, so the object file format must support named sections and the linker must
place them in a reasonable way.
<p>Profile feedback must also be available to make this option effective. See
<code>-fprofile-arcs</code> for details.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.
<br><dt><code>-fstrict-aliasing</code>
<dd>Allows the compiler to assume the strictest aliasing rules applicable to
the language being compiled. For C (and C++), this activates
optimizations based on the type of expressions. In particular, an
object of one type is assumed never to reside at the same address as an
object of a different type, unless the types are almost the same. For
example, an <code>unsigned int</code> can alias an <code>int</code>, but not a
<code>void*</code> or a <code>double</code>. A character type may alias any other
type.
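<p>As a small illustration of the rule that a character type may alias any other type, the sketch below inspects an object through an <code>unsigned char</code> pointer. It also shows copying the bits of a <code>float</code> with <code>memcpy</code>, a technique not mentioned in the text above but commonly used to avoid dereferencing a pointer of an unrelated type. The function names are hypothetical, and <code>float_bits</code> assumes that <code>float</code> and <code>uint32_t</code> have the same size.

```c
#include <stdint.h>
#include <string.h>

/* Add up the bytes of an arbitrary object. Reading through
   unsigned char * is permitted for any object type, so this stays
   valid under strict aliasing. */
unsigned byte_sum(const void *obj, size_t len)
{
    const unsigned char *p = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];
    return sum;
}

/* Return the bit pattern of a float. The bytes are copied with memcpy
   instead of being read through a cast such as *(uint32_t *)&f, which
   would access a float object through an unrelated type.
   Assumption: sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```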
<p>Pay special attention to code like this:
<pre class="smallexample">
union a_union {
  int i;
  double d;
};

int f() {
  a_union t;
  t.d = 3.0;
  return t.i;
}
</pre>
The practice of reading from a different union member than the one most
recently written to (called "type-punning") is common. Even with
<code>-fstrict-aliasing</code>, type-punning is allowed, provided the memory
is accessed through the union type. So, the code above will work as
expected. However, this code might not:
<pre class="smallexample">
int f() {
  a_union t;
  int* ip;
  t.d = 3.0;
  ip = &amp;t.i;
  return *ip;
}
</pre>
<p>Every language that wishes to perform language-specific alias analysis
should define a function that computes, given a <code>tree</code>
node, an alias set for the node. Nodes in different alias sets are not
allowed to alias. For an example, see the C front-end function
<code>c_get_alias_set</code>.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.
<br><dt><code>-falign-functions</code>
<dd><dt><code>-falign-functions=</code><var>n</var>
<dd>Align the start of functions to the next power-of-two greater than or equal to
<var>n</var>, skipping up to <var>n</var>-1 bytes. For instance,
<code>-falign-functions=32</code> aligns functions to the next 32-byte
boundary, but <code>-falign-functions=24</code> would align to the next
32-byte boundary only if this can be done by skipping 23 bytes or less.
<p><code>-fno-align-functions</code> and <code>-falign-functions=1</code> are
equivalent and mean that functions will not be aligned.
<p>Some assemblers only support this flag when <var>n</var> is a power of two;
in that case, it is rounded up.
<p>If <var>n</var> is not specified or is zero, use a machine-dependent default.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>.
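<p>The arithmetic behind the <code>-falign-functions=32</code> and <code>-falign-functions=24</code> cases can be sketched as follows. This is a hypothetical model of the rule as the worked example describes it (round <var>n</var> up to a power of two, then align only if at most <var>n</var>-1 bytes of padding are needed); it is not how GCC or the assembler implements alignment, and the function names are invented for illustration. It assumes <var>n</var> is at least 1.

```c
#include <stddef.h>

/* Smallest power of two that is greater than or equal to n (n >= 1). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Model of the alignment rule: given the address where a function
   would otherwise start, return the address it starts at when the
   alignment value is n. The function is moved to the next boundary
   only if that costs at most n - 1 bytes of padding; otherwise it
   stays where it is. */
size_t aligned_start(size_t addr, size_t n)
{
    size_t boundary = round_up_pow2(n);
    size_t pad = (boundary - addr % boundary) % boundary;
    return pad <= n - 1 ? addr + pad : addr;
}
```

<p>For a function that would start at address 40, an alignment value of 32 moves it to 64 (24 bytes of padding), while a value of 24 leaves it at 40 because 24 bytes exceeds the 23-byte limit.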
<br><dt><code>-falign-labels</code>
<dd><dt><code>-falign-labels=</code><var>n</var>
<dd>Align all branch targets to a power-of-two boundary, skipping up to
<var>n</var> bytes like <code>-falign-functions</code>. This option can easily
make code slower, because it must insert dummy operations for when the
branch target is reached in the usual flow of the code.
<p><code>-fno-align-labels</code> and <code>-falign-labels=1</code> are
equivalent and mean that labels will not be aligned.
<p>If <code>-falign-loops</code> or <code>-falign-jumps</code> are applicable and
are greater than this value, then their values are used instead.
<p>If <var>n</var> is not specified or is zero, use a machine-dependent default
which is very likely to be <code>1</code>, meaning no alignment.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>.
<br><dt><code>-falign-loops</code>
<dd><dt><code>-falign-loops=</code><var>n</var>
<dd>Align loops to a power-of-two boundary, skipping up to <var>n</var> bytes
like <code>-falign-functions</code>. The hope is that the loop will be
executed many times, which will make up for any execution of the dummy
operations.
<p><code>-fno-align-loops</code> and <code>-falign-loops=1</code> are
equivalent and mean that loops will not be aligned.
<p>If <var>n</var> is not specified or is zero, use a machine-dependent default.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>.
<br><dt><code>-falign-jumps</code>
<dd><dt><code>-falign-jumps=</code><var>n</var>
<dd>Align branch targets to a power-of-two boundary, for branch targets
where the targets can only be reached by jumping, skipping up to <var>n</var>
bytes like <code>-falign-functions</code>. In this case, no dummy operations
need be executed.
<p><code>-fno-align-jumps</code> and <code>-falign-jumps=1</code> are
equivalent and mean that jump targets will not be aligned.
<p>If <var>n</var> is not specified or is zero, use a machine-dependent default.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>.
<br><dt><code>-funit-at-a-time</code>
<dd>Parse the whole compilation unit before starting to produce code.
This allows some extra optimizations to take place but generally consumes
more memory. There are some compatibility issues
with <em>unit-at-a-time</em> mode:
<ul><li>Enabling <em>unit-at-a-time</em> mode may change the order
in which functions, variables, and top-level <code>asm</code> statements
are emitted, and will likely break code relying on some particular
ordering. The majority of such top-level <code>asm</code> statements,
though, can be replaced by <code>section</code> attributes.
<li><em>unit-at-a-time</em> mode removes unreferenced static variables
and functions. This may result in undefined references
when an <code>asm</code> statement refers directly to variables or functions
that are otherwise unused. In that case either the variable or function
shall be listed as an operand of the <code>asm</code> statement or,
in the case of top-level <code>asm</code> statements, the attribute <code>used</code>
shall be used on the declaration.
<li>Static functions can now use non-standard passing conventions that
may break <code>asm</code> statements calling functions directly. Again,
attribute <code>used</code> will prevent this behavior.
</ul>
<p>As a temporary workaround, <code>-fno-unit-at-a-time</code> can be used,
but this scheme may not be supported by future releases of GCC.
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>.
<br><dt><code>-fweb</code>
<dd>Constructs webs as commonly used for register allocation purposes and assigns
each web an individual pseudo register. This allows the register allocation pass
to operate on pseudos directly, but also strengthens several other optimization
passes, such as CSE, the loop optimizer and the trivial dead code remover. It can,
however, make debugging impossible, since variables will no longer stay in a
"home register".
<p>Enabled at levels <code>-O2</code>, <code>-O3</code>, <code>-Os</code>,
on targets where the default format for debugging information supports
variable tracking.
<br><dt><code>-fno-cprop-registers</code>
<dd>After register allocation and post-register allocation instruction splitting,
we perform a copy-propagation pass to try to reduce scheduling dependencies
and occasionally eliminate the copy.
<p>Disabled at levels <code>-O</code>, <code>-O2</code>, <code>-O3</code>, <code>-Os</code>.