📄 sched.c
字号:
/* Instruction scheduling pass. Copyright (C) 1992 Free Software Foundation, Inc. Contributed by Michael Tiemann (tiemann@cygnus.com) Enhanced by, and currently maintained by, Jim Wilson (wilson@cygnus.com)This file is part of GNU CC.GNU CC is free software; you can redistribute it and/or modifyit under the terms of the GNU General Public License as published bythe Free Software Foundation; either version 2, or (at your option)any later version.GNU CC is distributed in the hope that it will be useful,but WITHOUT ANY WARRANTY; without even the implied warranty ofMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See theGNU General Public License for more details.You should have received a copy of the GNU General Public Licensealong with GNU CC; see the file COPYING. If not, write tothe Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. *//* Instruction scheduling pass. This pass implements list scheduling within basic blocks. It is run after flow analysis, but before register allocation. The scheduler works as follows: We compute insn priorities based on data dependencies. Flow analysis only creates a fraction of the data-dependencies we must observe: namely, only those dependencies which the combiner can be expected to use. For this pass, we must therefore create the remaining dependencies we need to observe: register dependencies, memory dependencies, dependencies to keep function calls in order, and the dependence between a conditional branch and the setting of condition codes are all dealt with here. The scheduler first traverses the data flow graph, starting with the last instruction, and proceeding to the first, assigning values to insn_priority as it goes. This sorts the instructions topologically by data dependence. Once priorities have been established, we order the insns using list scheduling. This works as follows: starting with a list of all the ready insns, and sorted according to priority number, we schedule the insn from the end of the list by placing its predecessors in the list according to their priority order. We consider this insn scheduled by setting the pointer to the "end" of the list to point to the previous insn. When an insn has no predecessors, we either queue it until sufficient time has elapsed or add it to the ready list. As the instructions are scheduled or when stalls are introduced, the queue advances and dumps insns into the ready list. When all insns down to the lowest priority have been scheduled, the critical path of the basic block has been made as short as possible. The remaining insns are then scheduled in remaining slots. Function unit conflicts are resolved during reverse list scheduling by tracking the time when each insn is committed to the schedule and from that, the time the function units it uses must be free. As insns on the ready list are considered for scheduling, those that would result in a blockage of the already committed insns are queued until no blockage will result. Among the remaining insns on the ready list to be considered, the first one with the largest potential for causing a subsequent blockage is chosen. The following list shows the order in which we want to break ties among insns in the ready list: 1. choose insn with lowest conflict cost, ties broken by 2. choose insn with the longest path to end of bb, ties broken by 3. choose insn that kills the most registers, ties broken by 4. choose insn that conflicts with the most ready insns, or finally 5. choose insn with lowest UID. Memory references complicate matters. Only if we can be certain that memory references are not part of the data dependency graph (via true, anti, or output dependence), can we move operations past memory references. To first approximation, reads can be done independently, while writes introduce dependencies. Better approximations will yield fewer dependencies. Dependencies set up by memory references are treated in exactly the same way as other dependencies, by using LOG_LINKS. Having optimized the critical path, we may have also unduly extended the lifetimes of some registers. If an operation requires that constants be loaded into registers, it is certainly desirable to load those constants as early as necessary, but no earlier. I.e., it will not do to load up a bunch of registers at the beginning of a basic block only to use them at the end, if they could be loaded later, since this may result in excessive register utilization. Note that since branches are never in basic blocks, but only end basic blocks, this pass will not do any branch scheduling. But that is ok, since we can use GNU's delayed branch scheduling pass to take care of this case. Also note that no further optimizations based on algebraic identities are performed, so this pass would be a good one to perform instruction splitting, such as breaking up a multiply instruction into shifts and adds where that is profitable. Given the memory aliasing analysis that this pass should perform, it should be possible to remove redundant stores to memory, and to load values from registers instead of hitting memory. This pass must update information that subsequent passes expect to be correct. Namely: reg_n_refs, reg_n_sets, reg_n_deaths, reg_n_calls_crossed, and reg_live_length. Also, basic_block_head, basic_block_end. The information in the line number notes is carefully retained by this pass. All other NOTE insns are grouped in their same relative order at the beginning of basic blocks that have been scheduled. */#include <stdio.h>#include "config.h"#include "rtl.h"#include "basic-block.h"#include "regs.h"#include "hard-reg-set.h"#include "flags.h"#include "insn-config.h"#include "insn-attr.h"#ifdef INSN_SCHEDULING/* Arrays set up by scheduling for the same respective purposes as similar-named arrays set up by flow analysis. We work with these arrays during the scheduling pass so we can compare values against unscheduled code. Values of these arrays are copied at the end of this pass into the arrays set up by flow analysis. */static short *sched_reg_n_deaths;static int *sched_reg_n_calls_crossed;static int *sched_reg_live_length;/* Element N is the next insn that sets (hard or pseudo) register N within the current basic block; or zero, if there is no such insn. Needed for new registers which may be introduced by splitting insns. */static rtx *reg_last_uses;static rtx *reg_last_sets;/* Vector indexed by INSN_UID giving the original ordering of the insns. */static int *insn_luid;#define INSN_LUID(INSN) (insn_luid[INSN_UID (INSN)])/* Vector indexed by INSN_UID giving each instruction a priority. */static int *insn_priority;#define INSN_PRIORITY(INSN) (insn_priority[INSN_UID (INSN)])static short *insn_costs;#define INSN_COST(INSN) insn_costs[INSN_UID (INSN)]/* Vector indexed by INSN_UID giving an encoding of the function units used. */static short *insn_units;#define INSN_UNIT(INSN) insn_units[INSN_UID (INSN)]/* Vector indexed by INSN_UID giving an encoding of the blockage range function. The unit and the range are encoded. */static unsigned int *insn_blockage;#define INSN_BLOCKAGE(INSN) insn_blockage[INSN_UID (INSN)]#define UNIT_BITS 5#define BLOCKAGE_MASK ((1 << BLOCKAGE_BITS) - 1)#define ENCODE_BLOCKAGE(U,R) \ ((((U) << UNIT_BITS) << BLOCKAGE_BITS \ | MIN_BLOCKAGE_COST (R)) << BLOCKAGE_BITS \ | MAX_BLOCKAGE_COST (R))#define UNIT_BLOCKED(B) ((B) >> (2 * BLOCKAGE_BITS))#define BLOCKAGE_RANGE(B) \ (((((B) >> BLOCKAGE_BITS) & BLOCKAGE_MASK) << (HOST_BITS_PER_INT / 2)) \ | (B) & BLOCKAGE_MASK)/* Encodings of the `<name>_unit_blockage_range' function. */#define MIN_BLOCKAGE_COST(R) ((R) >> (HOST_BITS_PER_INT / 2))#define MAX_BLOCKAGE_COST(R) ((R) & ((1 << (HOST_BITS_PER_INT / 2)) - 1))#define DONE_PRIORITY -1#define MAX_PRIORITY 0x7fffffff#define TAIL_PRIORITY 0x7ffffffe#define LAUNCH_PRIORITY 0x7f000001#define DONE_PRIORITY_P(INSN) (INSN_PRIORITY (INSN) < 0)#define LOW_PRIORITY_P(INSN) ((INSN_PRIORITY (INSN) & 0x7f000000) == 0)/* Vector indexed by INSN_UID giving number of insns referring to this insn. */static int *insn_ref_count;#define INSN_REF_COUNT(INSN) (insn_ref_count[INSN_UID (INSN)])/* Vector indexed by INSN_UID giving line-number note in effect for each insn. For line-number notes, this indicates whether the note may be reused. */static rtx *line_note;#define LINE_NOTE(INSN) (line_note[INSN_UID (INSN)])/* Vector indexed by basic block number giving the starting line-number for each basic block. */static rtx *line_note_head;/* List of important notes we must keep around. This is a pointer to the last element in the list. */static rtx note_list;/* Regsets telling whether a given register is live or dead before the last scheduled insn. Must scan the instructions once before scheduling to determine what registers are live or dead at the end of the block. */static regset bb_dead_regs;static regset bb_live_regs;/* Regset telling whether a given register is live after the insn currently being scheduled. Before processing an insn, this is equal to bb_live_regs above. This is used so that we can find registers that are newly born/dead after processing an insn. */static regset old_live_regs;/* The chain of REG_DEAD notes. REG_DEAD notes are removed from all insns during the initial scan and reused later. If there are not exactly as many REG_DEAD notes in the post scheduled code as there were in the prescheduled code then we trigger an abort because this indicates a bug. */static rtx dead_notes;/* Queues, etc. *//* An instruction is ready to be scheduled when all insns following it have already been scheduled. It is important to ensure that all insns which use its result will not be executed until its result has been computed. An insn is maintained in one of four structures: (P) the "Pending" set of insns which cannot be scheduled until their dependencies have been satisfied. (Q) the "Queued" set of insns that can be scheduled when sufficient time has passed. (R) the "Ready" list of unscheduled, uncommitted insns. (S) the "Scheduled" list of insns. Initially, all insns are either "Pending" or "Ready" depending on whether their dependencies are satisfied. Insns move from the "Ready" list to the "Scheduled" list as they are committed to the schedule. As this occurs, the insns in the "Pending" list have their dependencies satisfied and move to either the "Ready" list or the "Queued" set depending on whether sufficient time has passed to make them ready. As time passes, insns move from the "Queued" set to the "Ready" list. Insns may move from the "Ready" list to the "Queued" set if they are blocked due to a function unit conflict. The "Pending" list (P) are the insns in the LOG_LINKS of the unscheduled insns, i.e., those that are ready, queued, and pending. The "Queued" set (Q) is implemented by the variable `insn_queue'. The "Ready" list (R) is implemented by the variables `ready' and `n_ready'. The "Scheduled" list (S) is the new insn chain built by this pass. The transition (R->S) is implemented in the scheduling loop in `schedule_block' when the best insn to schedule is chosen. The transition (R->Q) is implemented in `schedule_select' when an insn is found to to have a function unit conflict with the already committed insns. The transitions (P->R and P->Q) are implemented in `schedule_insn' as insns move from the ready list to the scheduled list. The transition (Q->R) is implemented at the top of the scheduling loop in `schedule_block' as time passes or stalls are introduced. *//* Implement a circular buffer to delay instructions until sufficient time has passed. INSN_QUEUE_SIZE is a power of two larger than MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c. This is the longest time an isnsn may be queued. */static rtx insn_queue[INSN_QUEUE_SIZE];static int q_ptr = 0;static int q_size = 0;#define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))#define NEXT_Q_AFTER(X,C) (((X)+C) & (INSN_QUEUE_SIZE-1))/* Vector indexed by INSN_UID giving the minimum clock tick at which the insn becomes ready. This is used to note timing constraints for insns in the pending list. */static int *insn_tick;#define INSN_TICK(INSN) (insn_tick[INSN_UID (INSN)])/* Forward declarations. */static void sched_analyze_2 ();static void schedule_block ();/* Main entry point of this file. */void schedule_insns ();#endif /* INSN_SCHEDULING */#define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X)))/* Vector indexed by N giving the initial (unchanging) value known for pseudo-register N. */static rtx *reg_known_value;/* Indicates number of valid entries in reg_known_value. */static int reg_known_value_size;static rtxcanon_rtx (x) rtx x;{ if (GET_CODE (x) == REG && REGNO (x) >= FIRST_PSEUDO_REGISTER && REGNO (x) <= reg_known_value_size) return reg_known_value[REGNO (x)]; else if (GET_CODE (x) == PLUS) { rtx x0 = canon_rtx (XEXP (x, 0)); rtx x1 = canon_rtx (XEXP (x, 1)); if (x0 != XEXP (x, 0) || x1 != XEXP (x, 1)) { /* We can tolerate LO_SUMs being offset here; these rtl are used for nothing other than comparisons. */ if (GET_CODE (x0) == CONST_INT) return plus_constant_for_output (x1, INTVAL (x0)); else if (GET_CODE (x1) == CONST_INT) return plus_constant_for_output (x0, INTVAL (x1)); return gen_rtx (PLUS, GET_MODE (x), x0, x1); } } return x;}/* Set up all info needed to perform alias analysis on memory references. */voidinit_alias_analysis (){ int maxreg = max_reg_num (); rtx insn; rtx note; rtx set; reg_known_value_size = maxreg; reg_known_value = (rtx *) oballoc ((maxreg-FIRST_PSEUDO_REGISTER) * sizeof (rtx)) - FIRST_PSEUDO_REGISTER; bzero (reg_known_value+FIRST_PSEUDO_REGISTER, (maxreg-FIRST_PSEUDO_REGISTER) * sizeof (rtx)); /* Fill in the entries with known constant values. */ for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) if ((set = single_set (insn)) != 0 && GET_CODE (SET_DEST (set)) == REG && REGNO (SET_DEST (set)) >= FIRST_PSEUDO_REGISTER && (((note = find_reg_note (insn, REG_EQUAL, 0)) != 0 && reg_n_sets[REGNO (SET_DEST (set))] == 1) || (note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != 0) && GET_CODE (XEXP (note, 0)) != EXPR_LIST) reg_known_value[REGNO (SET_DEST (set))] = XEXP (note, 0); /* Fill in the remaining entries. */ while (--maxreg >= FIRST_PSEUDO_REGISTER) if (reg_known_value[maxreg] == 0) reg_known_value[maxreg] = regno_reg_rtx[maxreg];}/* Return 1 if X and Y are identical-looking rtx's. We use the data in reg_known_value above to see if two registers with different numbers are, in fact, equivalent. */static intrtx_equal_for_memref_p (x, y) rtx x, y;{ register int i; register int j; register enum rtx_code code; register char *fmt; if (x == 0 && y == 0) return 1; if (x == 0 || y == 0) return 0; x = canon_rtx (x); y = canon_rtx (y); if (x == y) return 1; code = GET_CODE (x); /* Rtx's of different codes cannot be equal. */ if (code != GET_CODE (y)) return 0; /* (MULT:SI x y) and (MULT:HI x y) are NOT equivalent. (REG:SI x) and (REG:HI x) are NOT equivalent. */ if (GET_MODE (x) != GET_MODE (y)) return 0; /* REG, LABEL_REF, and SYMBOL_REF can be compared nonrecursively. */ if (code == REG) return REGNO (x) == REGNO (y); if (code == LABEL_REF) return XEXP (x, 0) == XEXP (y, 0); if (code == SYMBOL_REF) return XSTR (x, 0) == XSTR (y, 0); /* Compare the elements. If any pair of corresponding elements fail to match, return 0 for the whole things. */ fmt = GET_RTX_FORMAT (code); for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) { switch (fmt[i]) { case 'w': if (XWINT (x, i) != XWINT (y, i)) return 0; break; case 'n': case 'i': if (XINT (x, i) != XINT (y, i)) return 0; break; case 'V': case 'E': /* Two vectors must have the same length. */ if (XVECLEN (x, i) != XVECLEN (y, i)) return 0; /* And the corresponding elements must match. */ for (j = 0; j < XVECLEN (x, i); j++) if (rtx_equal_for_memref_p (XVECEXP (x, i, j), XVECEXP (y, i, j)) == 0)
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -