📄 loop.texi

📁 理解和实践操作系统的一本好书
💻 TEXI
📖 第 1 页 / 共 3 页
字号:
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.@item @code{add_bb_to_loop}: Adds a basic block to a loop.@item @code{remove_bb_from_loops}: Removes a basic block from loops.@end itemizeMost low-level CFG functions update loops automatically.  The followingfunctions handle some more complicated cases of CFG manipulations:@itemize@item @code{remove_path}: Removes an edge and all blocks it dominates.@item @code{split_loop_exit_edge}: Splits exit edge of the loop,ensuring that PHI node arguments remain in the loop (this ensures thatloop-closed SSA form is preserved).  Only useful on GIMPLE.@end itemizeFinally, there are some higher-level loop transformations implemented.While some of them are written so that they should work on non-innermostloops, they are mostly untested in that case, and at the moment, theyare only reliable for the innermost loops:@itemize@item @code{create_iv}: Creates a new induction variable.  Only works onGIMPLE@.  @code{standard_iv_increment_position} can be used to find asuitable place for the iv increment.@item @code{duplicate_loop_to_header_edge},@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL andon GIMPLE) duplicate the body of the loop prescribed number of times onone of the edges entering loop header, thus performing either loopunrolling or loop peeling.  @code{can_duplicate_loop_p}(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicatedloop.@item @code{loop_version}, @code{tree_ssa_loop_version}: These functioncreate a copy of a loop, and a branch before them that selects one ofthem depending on the prescribed condition.  This is useful foroptimizations that need to verify some assumptions in runtime (one ofthe copies of the loop is usually left unchanged, while the other one istransformed in some way).@item @code{tree_unroll_loop}: Unrolls the loop, including peeling theextra iterations to make the number of iterations divisible by unrollfactor, updating the exit condition, and removing the exits that nowcannot be taken.  Works only on GIMPLE.@end itemize@node LCSSA@section Loop-closed SSA form@cindex LCSSA@cindex Loop-closed SSA formThroughout the loop optimizations on tree level, one extra condition isenforced on the SSA form:  No SSA name is used outside of the loop inthat it is defined.  The SSA form satisfying this condition is called``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must becreated at the exits of the loops for the SSA names that are usedoutside of them.  Only the real operands (not virtual SSA names) areheld in LCSSA, in order to save memory.There are various benefits of LCSSA:@itemize@item Many optimizations (value range analysis, final valuereplacement) are interested in the values that are defined in the loopand used outside of it, i.e., exactly those for that we create new PHInodes.@item In induction variable analysis, it is not necessary to specify theloop in that the analysis should be performed -- the scalar evolutionanalysis always returns the results with respect to the loop in that theSSA name is defined.@item It makes updating of SSA form during loop transformations simpler.Without LCSSA, operations like loop unrolling may force creation of PHInodes arbitrarily far from the loop, while in LCSSA, the SSA form can beupdated locally.  However, since we only keep real operands in LCSSA, wecannot use this advantage (we could have local updating of realoperands, but it is not much more efficient than to use generic SSA formupdating for it as well; the amount of changes to SSA is the same).@end itemizeHowever, it also means LCSSA must be updated.  This is usuallystraightforward, unless you create a new value in loop and use itoutside, or unless you manipulate loop exit edges (functions areprovided to make these manipulations simple).@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form toLCSSA, and @code{verify_loop_closed_ssa} to check that the invariant ofLCSSA is preserved.@node Scalar evolutions@section Scalar evolutions@cindex Scalar evolutions@cindex IV analysis on GIMPLEScalar evolutions (SCEV) are used to represent results of inductionvariable analysis on GIMPLE@.  They enable us to represent variables withcomplicated behavior in a simple and consistent way (we only use it toexpress values of polynomial induction variables, but it is possible toextend it).  The interfaces to SCEV analysis are declared in@file{tree-scalar-evolution.h}.  To use scalar evolutions analysis,@code{scev_initialize} must be used.  To stop using SCEV,@code{scev_finalize} should be used.  SCEV analysis caches results inorder to save time and memory.  This cache however is made invalid bymost of the loop transformations, including removal of code.  If such atransformation is performed, @code{scev_reset} must be called to cleanthe caches.Given an SSA name, its behavior in loops can be analyzed using the@code{analyze_scalar_evolution} function.  The returned SCEV howeverdoes not have to be fully analyzed and it may contain references toother SSA names defined in the loop.  To resolve these (potentiallyrecursive) references, @code{instantiate_parameters} or@code{resolve_mixers} functions must be used.@code{instantiate_parameters} is useful when you use the results of SCEVonly for some analysis, and when you work with whole nest of loops atonce.  It will try replacing all SSA names by their SCEV in all loops,including the super-loops of the current loop, thus providing a completeinformation about the behavior of the variable in the loop nest.@code{resolve_mixers} is useful if you work with only one loop at atime, and if you possibly need to create code based on the value of theinduction variable.  It will only resolve the SSA names defined in thecurrent loop, leaving the SSA names defined outside unchanged, even iftheir evolution in the outer loops is known.The SCEV is a normal tree expression, except for the fact that it maycontain several special tree nodes.  One of them is@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot beexpressed.  The other one is @code{POLYNOMIAL_CHREC}.  Polynomial chrechas three arguments -- base, step and loop (both base and step maycontain further polynomial chrecs).  Type of the expression and of baseand step must be the same.  A variable has evolution@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specifiedloop) equivalent to @code{x_1} in the following example@smallexamplewhile (@dots{})  @{    x_1 = phi (base, x_2);    x_2 = x_1 + step;  @}@end smallexampleNote that this includes the language restrictions on the operations.For example, if we compile C code and @code{x} has signed type, then theoverflow in addition would cause undefined behavior, and we may assumethat this does not happen.  Hence, the value with this SCEV cannotoverflow (which restricts the number of iterations of such a loop).In many cases, one wants to restrict the attention just to affineinduction variables.  In this case, the extra expressive power of SCEVis not useful, and may complicate the optimizations.  In this case,@code{simple_iv} function may be used to analyze a value -- the resultis a loop-invariant base and step.@node loop-iv@section IV analysis on RTL@cindex IV analysis on RTLThe induction variable on RTL is simple and only allows analysis ofaffine induction variables, and only in one loop at once.  The interfaceis declared in @file{cfgloop.h}.  Before analyzing induction variablesin a loop L, @code{iv_analysis_loop_init} function must be called on L.After the analysis (possibly calling @code{iv_analysis_loop_init} forseveral loops) is finished, @code{iv_analysis_done} should be called.The following functions can be used to access the results of theanalysis:@itemize@item @code{iv_analyze}: Analyzes a single register used in the giveninsn.  If no use of the register in this insn is found, the followinginsns are scanned, so that this function can be called on the insnreturned by get_condition.@item @code{iv_analyze_result}: Analyzes result of the assignment in thegiven insn.@item @code{iv_analyze_expr}: Analyzes a more complicated expression.All its operands are analyzed by @code{iv_analyze}, and hence they mustbe used in the specified insn or one of the following insns.@end itemizeThe description of the induction variable is provided in @code{structrtx_iv}.  In order to handle subregs, the representation is a bitcomplicated; if the value of the @code{extend} field is not@code{UNKNOWN}, the value of the induction variable in the i-thiteration is@smallexampledelta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),@end smallexamplewith the following exception:  if @code{first_special} is true, then thevalue in the first iteration (when @code{i} is zero) is @code{delta +mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},then @code{first_special} must be false, @code{delta} 0, @code{mult} 1and the value in the i-th iteration is@smallexamplesubreg_@{mode@} (base + i * step)@end smallexampleThe function @code{get_iv_value} can be used to perform thesecalculations.@node Number of iterations@section Number of iterations analysis@cindex Number of iterations analysisBoth on GIMPLE and on RTL, there are functions available to determinethe number of iterations of a loop, with a similar interface.  Thenumber of iterations of a loop in GCC is defined as the number ofexecutions of the loop latch.  In many cases, it is not possible todetermine the number of iterations unconditionally -- the determinednumber is correct only if some assumptions are satisfied.  The analysistries to verify these conditions using the information contained in theprogram; if it fails, the conditions are returned together with theresult.  The following information and conditions are provided by theanalysis:@itemize@item @code{assumptions}: If this condition is false, the rest ofthe information is invalid.@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: Ifthis condition is true, the loop exits in the first iteration.@item @code{infinite}: If this condition is true, the loop is infinite.
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -