📄 cfg.texi
字号:
@c -*-texinfo-*-@c Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007, 2008 Free Software@c Foundation, Inc.@c This is part of the GCC manual.@c For copying conditions, see the file gcc.texi.@c ---------------------------------------------------------------------@c Control Flow Graph@c ---------------------------------------------------------------------@node Control Flow@chapter Control Flow Graph@cindex CFG, Control Flow Graph@findex basic-block.hA control flow graph (CFG) is a data structure built on top of theintermediate code representation (the RTL or @code{tree} instructionstream) abstracting the control flow behavior of a function that isbeing compiled. The CFG is a directed graph where the verticesrepresent basic blocks and edges represent possible transfer ofcontrol flow from one basic block to another. The data structuresused to represent the control flow graph are defined in@file{basic-block.h}.@menu* Basic Blocks:: The definition and representation of basic blocks.* Edges:: Types of edges and their representation.* Profile information:: Representation of frequencies and probabilities.* Maintaining the CFG:: Keeping the control flow graph and up to date.* Liveness information:: Using and maintaining liveness information.@end menu@node Basic Blocks@section Basic Blocks@cindex basic block@findex basic_blockA basic block is a straight-line sequence of code with only one entrypoint and only one exit. In GCC, basic blocks are represented usingthe @code{basic_block} data type.@findex next_bb, prev_bb, FOR_EACH_BBTwo pointer members of the @code{basic_block} structure are thepointers @code{next_bb} and @code{prev_bb}. These are used to keepdoubly linked chain of basic blocks in the same order as theunderlying instruction stream. The chain of basic blocks is updatedtransparently by the provided API for manipulating the CFG@. The macro@code{FOR_EACH_BB} can be used to visit all the basic blocks inlexicographical order. Dominator traversals are also possible using@code{walk_dominator_tree}. Given two basic blocks A and B, block Adominates block B if A is @emph{always} executed before B@.@findex BASIC_BLOCKThe @code{BASIC_BLOCK} array contains all basic blocks in anunspecified order. Each @code{basic_block} structure has a fieldthat holds a unique integer identifier @code{index} that is theindex of the block in the @code{BASIC_BLOCK} array.The total number of basic blocks in the function is@code{n_basic_blocks}. Both the basic block indices andthe total number of basic blocks may vary during the compilationprocess, as passes reorder, create, duplicate, and destroy basicblocks. The index for any block should never be greater than@code{last_basic_block}.@findex ENTRY_BLOCK_PTR, EXIT_BLOCK_PTRSpecial basic blocks represent possible entry and exit points of afunction. These blocks are called @code{ENTRY_BLOCK_PTR} and@code{EXIT_BLOCK_PTR}. These blocks do not contain any code, and arenot elements of the @code{BASIC_BLOCK} array. Therefore they havebeen assigned unique, negative index numbers.Each @code{basic_block} also contains pointers to the firstinstruction (the @dfn{head}) and the last instruction (the @dfn{tail})or @dfn{end} of the instruction stream contained in a basic block. Infact, since the @code{basic_block} data type is used to representblocks in both major intermediate representations of GCC (@code{tree}and RTL), there are pointers to the head and end of a basic block forboth representations.@findex NOTE_INSN_BASIC_BLOCK, CODE_LABEL, notesFor RTL, these pointers are @code{rtx head, end}. In the RTL functionrepresentation, the head pointer always points either to a@code{NOTE_INSN_BASIC_BLOCK} or to a @code{CODE_LABEL}, if present.In the RTL representation of a function, the instruction streamcontains not only the ``real'' instructions, but also @dfn{notes}.Any function that moves or duplicates the basic blocks needsto take care of updating of these notes. Many of these notes expectthat the instruction stream consists of linear regions, making suchupdates difficult. The @code{NOTE_INSN_BASIC_BLOCK} note is the onlykind of note that may appear in the instruction stream contained in abasic block. The instruction stream of a basic block always follows a@code{NOTE_INSN_BASIC_BLOCK}, but zero or more @code{CODE_LABEL}nodes can precede the block note. A basic block ends by control flowinstruction or last instruction before following @code{CODE_LABEL} or@code{NOTE_INSN_BASIC_BLOCK}. A @code{CODE_LABEL} cannot appear inthe instruction stream of a basic block.@findex can_fallthru@cindex table jumpIn addition to notes, the jump table vectors are also represented as``pseudo-instructions'' inside the insn stream. These vectors neverappear in the basic block and should always be placed just after thetable jump instructions referencing them. After removing thetable-jump it is often difficult to eliminate the code computing theaddress and referencing the vector, so cleaning up these vectors ispostponed until after liveness analysis. Thus the jump table vectorsmay appear in the insn stream unreferenced and without any purpose.Before any edge is made @dfn{fall-thru}, the existence of suchconstruct in the way needs to be checked by calling@code{can_fallthru} function.@cindex block statement iteratorsFor the @code{tree} representation, the head and end of the basic blockare being pointed to by the @code{stmt_list} field, but this special@code{tree} should never be referenced directly. Instead, at the treelevel abstract containers and iterators are used to access statementsand expressions in basic blocks. These iterators are called@dfn{block statement iterators} (BSIs). Grep for @code{^bsi}in the various @file{tree-*} files.The following snippet will pretty-print all the statements of theprogram in the GIMPLE representation.@smallexampleFOR_EACH_BB (bb) @{ block_stmt_iterator si; for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si)) @{ tree stmt = bsi_stmt (si); print_generic_stmt (stderr, stmt, 0); @} @}@end smallexample@node Edges@section Edges@cindex edge in the flow graph@findex edgeEdges represent possible control flow transfers from the end of somebasic block A to the head of another basic block B@. We say that A isa predecessor of B, and B is a successor of A@. Edges are representedin GCC with the @code{edge} data type. Each @code{edge} acts as alink between two basic blocks: the @code{src} member of an edgepoints to the predecessor basic block of the @code{dest} basic block.The members @code{preds} and @code{succs} of the @code{basic_block} datatype point to type-safe vectors of edges to the predecessors andsuccessors of the block.@cindex edge iteratorsWhen walking the edges in an edge vector, @dfn{edge iterators} shouldbe used. Edge iterators are constructed using the@code{edge_iterator} data structure and several methods are availableto operate on them:@ftable @code@item ei_startThis function initializes an @code{edge_iterator} that points to thefirst edge in a vector of edges.@item ei_lastThis function initializes an @code{edge_iterator} that points to thelast edge in a vector of edges.@item ei_end_pThis predicate is @code{true} if an @code{edge_iterator} representsthe last edge in an edge vector.@item ei_one_before_end_pThis predicate is @code{true} if an @code{edge_iterator} representsthe second last edge in an edge vector.@item ei_nextThis function takes a pointer to an @code{edge_iterator} and makes itpoint to the next edge in the sequence.@item ei_prevThis function takes a pointer to an @code{edge_iterator} and makes itpoint to the previous edge in the sequence.@item ei_edgeThis function returns the @code{edge} currently pointed to by an@code{edge_iterator}.@item ei_safe_safeThis function returns the @code{edge} currently pointed to by an@code{edge_iterator}, but returns @code{NULL} if the iterator ispointing at the end of the sequence. This function has been providedfor existing code makes the assumption that a @code{NULL} edgeindicates the end of the sequence.@end ftableThe convenience macro @code{FOR_EACH_EDGE} can be used to visit all ofthe edges in a sequence of predecessor or successor edges. It mustnot be used when an element might be removed during the traversal,otherwise elements will be missed. Here is an example of how to usethe macro:@smallexampleedge e;edge_iterator ei;FOR_EACH_EDGE (e, ei, bb->succs) @{ if (e->flags & EDGE_FALLTHRU) break; @}@end smallexample@findex fall-thruThere are various reasons why control flow may transfer from one blockto another. One possibility is that some instruction, for example a@code{CODE_LABEL}, in a linearized instruction stream just alwaysstarts a new basic block. In this case a @dfn{fall-thru} edge linksthe basic block to the first following basic block. But there areseveral other reasons why edges may be created. The @code{flags}field of the @code{edge} data type is used to store informationabout the type of edge we are dealing with. Each edge is of one ofthe following types:@table @emph@item jumpNo type flags are set for edges corresponding to jump instructions.These edges are used for unconditional or conditional jumps and inRTL also for table jumps. They are the easiest to manipulate as theymay be freely redirected when the flow graph is not in SSA form.@item fall-thru@findex EDGE_FALLTHRU, force_nonfallthruFall-thru edges are present in case where the basic block may continueexecution to the following one without branching. These edges havethe @code{EDGE_FALLTHRU} flag set. Unlike other types of edges, theseedges must come into the basic block immediately following in theinstruction stream. The function @code{force_nonfallthru} isavailable to insert an unconditional jump in the case that redirectionis needed. Note that this may require creation of a new basic block.@item exception handling@cindex exception handling@findex EDGE_ABNORMAL, EDGE_EHException handling edges represent possible control transfers from atrapping instruction to an exception handler. The definition of``trapping'' varies. In C++, only function calls can throw, but forJava, exceptions like division by zero or segmentation fault aredefined and thus each instruction possibly throwing this kind ofexception needs to be handled as control flow instruction. Exceptionedges have the @code{EDGE_ABNORMAL} and @code{EDGE_EH} flags set.@findex purge_dead_edgesWhen updating the instruction stream it is easy to change possiblytrapping instruction to non-trapping, by simply removing the exceptionedge. The opposite conversion is difficult, but should not happenanyway. The edges can be eliminated via @code{purge_dead_edges} call.@findex REG_EH_REGION, EDGE_ABNORMAL_CALLIn the RTL representation, the destination of an exception edge isspecified by @code{REG_EH_REGION} note attached to the insn.In case of a trapping call the @code{EDGE_ABNORMAL_CALL} flag is settoo. In the @code{tree} representation, this extra flag is not set.@findex may_trap_p, tree_could_trap_pIn the RTL representation, the predicate @code{may_trap_p} may be usedto check whether instruction still may trap or not. For the treerepresentation, the @code{tree_could_trap_p} predicate is available,but this predicate only checks for possible memory traps, as indereferencing an invalid pointer location.@item sibling calls@cindex sibling call@findex EDGE_ABNORMAL, EDGE_SIBCALLSibling calls or tail calls terminate the function in a non-standardway and thus an edge to the exit must be present.@code{EDGE_SIBCALL} and @code{EDGE_ABNORMAL} are set in such case.These edges only exist in the RTL representation.@item computed jumps@cindex computed jump@findex EDGE_ABNORMALComputed jumps contain edges to all labels in the function referencedfrom the code. All those edges have @code{EDGE_ABNORMAL} flag set.The edges used to represent computed jumps often cause compile timeperformance problems, since functions consisting of many taken labelsand many computed jumps may have @emph{very} dense flow graphs, sothese edges need to be handled with special care. During the earlierstages of the compilation process, GCC tries to avoid such dense flowgraphs by factoring computed jumps. For example, given the followingseries of jumps,@smallexample goto *x; [ @dots{} ] goto *x; [ @dots{} ] goto *x; [ @dots{} ]@end smallexample@noindentfactoring the computed jumps results in the following code sequencewhich has a much simpler flow graph:@smallexample goto y; [ @dots{} ] goto y; [ @dots{} ] goto y; [ @dots{} ]y: goto *x;@end smallexampleHowever, the classic problem with this transformation is that it has aruntime cost in there resulting code: An extra jump. Therefore, thecomputed jumps are un-factored in the later passes of the compiler.Be aware of that when you work on passes in that area. There havebeen numerous examples already where the compile time for code withunfactored computed jumps caused some serious headaches.@item nonlocal goto handlers@cindex nonlocal goto handler
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -