📄 sched.c

📁 早期freebsd实现
💻 C
📖 第 1 页 / 共 5 页
字号:
12 3 4 5 下一页
/* Instruction scheduling pass.   Copyright (C) 1992 Free Software Foundation, Inc.   Contributed by Michael Tiemann (tiemann@cygnus.com)   Enhanced by, and currently maintained by, Jim Wilson (wilson@cygnus.com)This file is part of GNU CC.GNU CC is free software; you can redistribute it and/or modifyit under the terms of the GNU General Public License as published bythe Free Software Foundation; either version 2, or (at your option)any later version.GNU CC is distributed in the hope that it will be useful,but WITHOUT ANY WARRANTY; without even the implied warranty ofMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See theGNU General Public License for more details.You should have received a copy of the GNU General Public Licensealong with GNU CC; see the file COPYING.  If not, write tothe Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.  *//* Instruction scheduling pass.   This pass implements list scheduling within basic blocks.  It is   run after flow analysis, but before register allocation.  The   scheduler works as follows:   We compute insn priorities based on data dependencies.  Flow   analysis only creates a fraction of the data-dependencies we must   observe: namely, only those dependencies which the combiner can be   expected to use.  For this pass, we must therefore create the   remaining dependencies we need to observe: register dependencies,   memory dependencies, dependencies to keep function calls in order,   and the dependence between a conditional branch and the setting of   condition codes are all dealt with here.   The scheduler first traverses the data flow graph, starting with   the last instruction, and proceeding to the first, assigning   values to insn_priority as it goes.  This sorts the instructions   topologically by data dependence.   Once priorities have been established, we order the insns using   list scheduling.  This works as follows: starting with a list of   all the ready insns, and sorted according to priority number, we   schedule the insn from the end of the list by placing its   predecessors in the list according to their priority order.  We   consider this insn scheduled by setting the pointer to the "end" of   the list to point to the previous insn.  When an insn has no   predecessors, we either queue it until sufficient time has elapsed   or add it to the ready list.  As the instructions are scheduled or   when stalls are introduced, the queue advances and dumps insns into   the ready list.  When all insns down to the lowest priority have   been scheduled, the critical path of the basic block has been made   as short as possible.  The remaining insns are then scheduled in   remaining slots.   Function unit conflicts are resolved during reverse list scheduling   by tracking the time when each insn is committed to the schedule   and from that, the time the function units it uses must be free.   As insns on the ready list are considered for scheduling, those   that would result in a blockage of the already committed insns are   queued until no blockage will result.  Among the remaining insns on   the ready list to be considered, the first one with the largest   potential for causing a subsequent blockage is chosen.   The following list shows the order in which we want to break ties   among insns in the ready list:	1.  choose insn with lowest conflict cost, ties broken by	2.  choose insn with the longest path to end of bb, ties broken by	3.  choose insn that kills the most registers, ties broken by	4.  choose insn that conflicts with the most ready insns, or finally	5.  choose insn with lowest UID.   Memory references complicate matters.  Only if we can be certain   that memory references are not part of the data dependency graph   (via true, anti, or output dependence), can we move operations past   memory references.  To first approximation, reads can be done   independently, while writes introduce dependencies.  Better   approximations will yield fewer dependencies.   Dependencies set up by memory references are treated in exactly the   same way as other dependencies, by using LOG_LINKS.   Having optimized the critical path, we may have also unduly   extended the lifetimes of some registers.  If an operation requires   that constants be loaded into registers, it is certainly desirable   to load those constants as early as necessary, but no earlier.   I.e., it will not do to load up a bunch of registers at the   beginning of a basic block only to use them at the end, if they   could be loaded later, since this may result in excessive register   utilization.   Note that since branches are never in basic blocks, but only end   basic blocks, this pass will not do any branch scheduling.  But   that is ok, since we can use GNU's delayed branch scheduling   pass to take care of this case.   Also note that no further optimizations based on algebraic identities   are performed, so this pass would be a good one to perform instruction   splitting, such as breaking up a multiply instruction into shifts   and adds where that is profitable.   Given the memory aliasing analysis that this pass should perform,   it should be possible to remove redundant stores to memory, and to   load values from registers instead of hitting memory.   This pass must update information that subsequent passes expect to be   correct.  Namely: reg_n_refs, reg_n_sets, reg_n_deaths,   reg_n_calls_crossed, and reg_live_length.  Also, basic_block_head,   basic_block_end.   The information in the line number notes is carefully retained by this   pass.  All other NOTE insns are grouped in their same relative order at   the beginning of basic blocks that have been scheduled.  */#include <stdio.h>#include "config.h"#include "rtl.h"#include "basic-block.h"#include "regs.h"#include "hard-reg-set.h"#include "flags.h"#include "insn-config.h"#include "insn-attr.h"#ifdef INSN_SCHEDULING/* Arrays set up by scheduling for the same respective purposes as   similar-named arrays set up by flow analysis.  We work with these   arrays during the scheduling pass so we can compare values against   unscheduled code.   Values of these arrays are copied at the end of this pass into the   arrays set up by flow analysis.  */static short *sched_reg_n_deaths;static int *sched_reg_n_calls_crossed;static int *sched_reg_live_length;/* Element N is the next insn that sets (hard or pseudo) register   N within the current basic block; or zero, if there is no   such insn.  Needed for new registers which may be introduced   by splitting insns.  */static rtx *reg_last_uses;static rtx *reg_last_sets;/* Vector indexed by INSN_UID giving the original ordering of the insns.  */static int *insn_luid;#define INSN_LUID(INSN) (insn_luid[INSN_UID (INSN)])/* Vector indexed by INSN_UID giving each instruction a priority.  */static int *insn_priority;#define INSN_PRIORITY(INSN) (insn_priority[INSN_UID (INSN)])static short *insn_costs;#define INSN_COST(INSN)	insn_costs[INSN_UID (INSN)]/* Vector indexed by INSN_UID giving an encoding of the function units   used.  */static short *insn_units;#define INSN_UNIT(INSN)	insn_units[INSN_UID (INSN)]/* Vector indexed by INSN_UID giving an encoding of the blockage range   function.  The unit and the range are encoded.  */static unsigned int *insn_blockage;#define INSN_BLOCKAGE(INSN) insn_blockage[INSN_UID (INSN)]#define UNIT_BITS 5#define BLOCKAGE_MASK ((1 << BLOCKAGE_BITS) - 1)#define ENCODE_BLOCKAGE(U,R)				\  ((((U) << UNIT_BITS) << BLOCKAGE_BITS			\    | MIN_BLOCKAGE_COST (R)) << BLOCKAGE_BITS		\   | MAX_BLOCKAGE_COST (R))#define UNIT_BLOCKED(B) ((B) >> (2 * BLOCKAGE_BITS))#define BLOCKAGE_RANGE(B) \  (((((B) >> BLOCKAGE_BITS) & BLOCKAGE_MASK) << (HOST_BITS_PER_INT / 2)) \   | (B) & BLOCKAGE_MASK)/* Encodings of the `<name>_unit_blockage_range' function.  */#define MIN_BLOCKAGE_COST(R) ((R) >> (HOST_BITS_PER_INT / 2))#define MAX_BLOCKAGE_COST(R) ((R) & ((1 << (HOST_BITS_PER_INT / 2)) - 1))#define DONE_PRIORITY	-1#define MAX_PRIORITY	0x7fffffff#define TAIL_PRIORITY	0x7ffffffe#define LAUNCH_PRIORITY	0x7f000001#define DONE_PRIORITY_P(INSN) (INSN_PRIORITY (INSN) < 0)#define LOW_PRIORITY_P(INSN) ((INSN_PRIORITY (INSN) & 0x7f000000) == 0)/* Vector indexed by INSN_UID giving number of insns referring to this insn.  */static int *insn_ref_count;#define INSN_REF_COUNT(INSN) (insn_ref_count[INSN_UID (INSN)])/* Vector indexed by INSN_UID giving line-number note in effect for each   insn.  For line-number notes, this indicates whether the note may be   reused.  */static rtx *line_note;#define LINE_NOTE(INSN) (line_note[INSN_UID (INSN)])/* Vector indexed by basic block number giving the starting line-number   for each basic block.  */static rtx *line_note_head;/* List of important notes we must keep around.  This is a pointer to the   last element in the list.  */static rtx note_list;/* Regsets telling whether a given register is live or dead before the last   scheduled insn.  Must scan the instructions once before scheduling to   determine what registers are live or dead at the end of the block.  */static regset bb_dead_regs;static regset bb_live_regs;/* Regset telling whether a given register is live after the insn currently   being scheduled.  Before processing an insn, this is equal to bb_live_regs   above.  This is used so that we can find registers that are newly born/dead   after processing an insn.  */static regset old_live_regs;/* The chain of REG_DEAD notes.  REG_DEAD notes are removed from all insns   during the initial scan and reused later.  If there are not exactly as   many REG_DEAD notes in the post scheduled code as there were in the   prescheduled code then we trigger an abort because this indicates a bug.  */static rtx dead_notes;/* Queues, etc.  *//* An instruction is ready to be scheduled when all insns following it   have already been scheduled.  It is important to ensure that all   insns which use its result will not be executed until its result   has been computed.  An insn is maintained in one of four structures:   (P) the "Pending" set of insns which cannot be scheduled until   their dependencies have been satisfied.   (Q) the "Queued" set of insns that can be scheduled when sufficient   time has passed.   (R) the "Ready" list of unscheduled, uncommitted insns.   (S) the "Scheduled" list of insns.   Initially, all insns are either "Pending" or "Ready" depending on   whether their dependencies are satisfied.   Insns move from the "Ready" list to the "Scheduled" list as they   are committed to the schedule.  As this occurs, the insns in the   "Pending" list have their dependencies satisfied and move to either   the "Ready" list or the "Queued" set depending on whether   sufficient time has passed to make them ready.  As time passes,   insns move from the "Queued" set to the "Ready" list.  Insns may   move from the "Ready" list to the "Queued" set if they are blocked   due to a function unit conflict.   The "Pending" list (P) are the insns in the LOG_LINKS of the unscheduled   insns, i.e., those that are ready, queued, and pending.   The "Queued" set (Q) is implemented by the variable `insn_queue'.   The "Ready" list (R) is implemented by the variables `ready' and   `n_ready'.   The "Scheduled" list (S) is the new insn chain built by this pass.   The transition (R->S) is implemented in the scheduling loop in   `schedule_block' when the best insn to schedule is chosen.   The transition (R->Q) is implemented in `schedule_select' when an   insn is found to to have a function unit conflict with the already   committed insns.   The transitions (P->R and P->Q) are implemented in `schedule_insn' as   insns move from the ready list to the scheduled list.   The transition (Q->R) is implemented at the top of the scheduling   loop in `schedule_block' as time passes or stalls are introduced.  *//* Implement a circular buffer to delay instructions until sufficient   time has passed.  INSN_QUEUE_SIZE is a power of two larger than   MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c.  This is the   longest time an isnsn may be queued.  */static rtx insn_queue[INSN_QUEUE_SIZE];static int q_ptr = 0;static int q_size = 0;#define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))#define NEXT_Q_AFTER(X,C) (((X)+C) & (INSN_QUEUE_SIZE-1))/* Vector indexed by INSN_UID giving the minimum clock tick at which   the insn becomes ready.  This is used to note timing constraints for   insns in the pending list.  */static int *insn_tick;#define INSN_TICK(INSN) (insn_tick[INSN_UID (INSN)])/* Forward declarations.  */static void sched_analyze_2 ();static void schedule_block ();/* Main entry point of this file.  */void schedule_insns ();#endif /* INSN_SCHEDULING */#define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X)))/* Vector indexed by N giving the initial (unchanging) value known   for pseudo-register N.  */static rtx *reg_known_value;/* Indicates number of valid entries in reg_known_value.  */static int reg_known_value_size;static rtxcanon_rtx (x)     rtx x;{  if (GET_CODE (x) == REG && REGNO (x) >= FIRST_PSEUDO_REGISTER      && REGNO (x) <= reg_known_value_size)    return reg_known_value[REGNO (x)];  else if (GET_CODE (x) == PLUS)    {      rtx x0 = canon_rtx (XEXP (x, 0));      rtx x1 = canon_rtx (XEXP (x, 1));      if (x0 != XEXP (x, 0) || x1 != XEXP (x, 1))	{	  /* We can tolerate LO_SUMs being offset here; these	     rtl are used for nothing other than comparisons.  */	  if (GET_CODE (x0) == CONST_INT)	    return plus_constant_for_output (x1, INTVAL (x0));	  else if (GET_CODE (x1) == CONST_INT)	    return plus_constant_for_output (x0, INTVAL (x1));	  return gen_rtx (PLUS, GET_MODE (x), x0, x1);	}    }  return x;}/* Set up all info needed to perform alias analysis on memory references.  */voidinit_alias_analysis (){  int maxreg = max_reg_num ();  rtx insn;  rtx note;  rtx set;  reg_known_value_size = maxreg;  reg_known_value    = (rtx *) oballoc ((maxreg-FIRST_PSEUDO_REGISTER) * sizeof (rtx))      - FIRST_PSEUDO_REGISTER;  bzero (reg_known_value+FIRST_PSEUDO_REGISTER,	 (maxreg-FIRST_PSEUDO_REGISTER) * sizeof (rtx));  /* Fill in the entries with known constant values.  */  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))    if ((set = single_set (insn)) != 0	&& GET_CODE (SET_DEST (set)) == REG	&& REGNO (SET_DEST (set)) >= FIRST_PSEUDO_REGISTER	&& (((note = find_reg_note (insn, REG_EQUAL, 0)) != 0	     && reg_n_sets[REGNO (SET_DEST (set))] == 1)	    || (note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != 0)	&& GET_CODE (XEXP (note, 0)) != EXPR_LIST)      reg_known_value[REGNO (SET_DEST (set))] = XEXP (note, 0);  /* Fill in the remaining entries.  */  while (--maxreg >= FIRST_PSEUDO_REGISTER)    if (reg_known_value[maxreg] == 0)      reg_known_value[maxreg] = regno_reg_rtx[maxreg];}/* Return 1 if X and Y are identical-looking rtx's.   We use the data in reg_known_value above to see if two registers with   different numbers are, in fact, equivalent.  */static intrtx_equal_for_memref_p (x, y)     rtx x, y;{  register int i;  register int j;  register enum rtx_code code;  register char *fmt;  if (x == 0 && y == 0)    return 1;  if (x == 0 || y == 0)    return 0;  x = canon_rtx (x);  y = canon_rtx (y);  if (x == y)    return 1;  code = GET_CODE (x);  /* Rtx's of different codes cannot be equal.  */  if (code != GET_CODE (y))    return 0;  /* (MULT:SI x y) and (MULT:HI x y) are NOT equivalent.     (REG:SI x) and (REG:HI x) are NOT equivalent.  */  if (GET_MODE (x) != GET_MODE (y))    return 0;  /* REG, LABEL_REF, and SYMBOL_REF can be compared nonrecursively.  */  if (code == REG)    return REGNO (x) == REGNO (y);  if (code == LABEL_REF)    return XEXP (x, 0) == XEXP (y, 0);  if (code == SYMBOL_REF)    return XSTR (x, 0) == XSTR (y, 0);  /* Compare the elements.  If any pair of corresponding elements     fail to match, return 0 for the whole things.  */  fmt = GET_RTX_FORMAT (code);  for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)    {      switch (fmt[i])	{	case 'w':	  if (XWINT (x, i) != XWINT (y, i))	    return 0;	  break;	case 'n':	case 'i':	  if (XINT (x, i) != XINT (y, i))	    return 0;	  break;	case 'V':	case 'E':	  /* Two vectors must have the same length.  */	  if (XVECLEN (x, i) != XVECLEN (y, i))	    return 0;	  /* And the corresponding elements must match.  */	  for (j = 0; j < XVECLEN (x, i); j++)	    if (rtx_equal_for_memref_p (XVECEXP (x, i, j), XVECEXP (y, i, j)) == 0)
12 3 4 5 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -