📄 sched.c
字号:
/* Instruction scheduling pass. Copyright (C) 1992, 1993, 1994, 1995 Free Software Foundation, Inc. Contributed by Michael Tiemann (tiemann@cygnus.com) Enhanced by, and currently maintained by, Jim Wilson (wilson@cygnus.com)This file is part of GNU CC.GNU CC is free software; you can redistribute it and/or modifyit under the terms of the GNU General Public License as published bythe Free Software Foundation; either version 2, or (at your option)any later version.GNU CC is distributed in the hope that it will be useful,but WITHOUT ANY WARRANTY; without even the implied warranty ofMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See theGNU General Public License for more details.You should have received a copy of the GNU General Public Licensealong with GNU CC; see the file COPYING. If not, write tothe Free Software Foundation, 59 Temple Place - Suite 330,Boston, MA 02111-1307, USA. *//* Instruction scheduling pass. This pass implements list scheduling within basic blocks. It is run after flow analysis, but before register allocation. The scheduler works as follows: We compute insn priorities based on data dependencies. Flow analysis only creates a fraction of the data-dependencies we must observe: namely, only those dependencies which the combiner can be expected to use. For this pass, we must therefore create the remaining dependencies we need to observe: register dependencies, memory dependencies, dependencies to keep function calls in order, and the dependence between a conditional branch and the setting of condition codes are all dealt with here. The scheduler first traverses the data flow graph, starting with the last instruction, and proceeding to the first, assigning values to insn_priority as it goes. This sorts the instructions topologically by data dependence. Once priorities have been established, we order the insns using list scheduling. This works as follows: starting with a list of all the ready insns, and sorted according to priority number, we schedule the insn from the end of the list by placing its predecessors in the list according to their priority order. We consider this insn scheduled by setting the pointer to the "end" of the list to point to the previous insn. When an insn has no predecessors, we either queue it until sufficient time has elapsed or add it to the ready list. As the instructions are scheduled or when stalls are introduced, the queue advances and dumps insns into the ready list. When all insns down to the lowest priority have been scheduled, the critical path of the basic block has been made as short as possible. The remaining insns are then scheduled in remaining slots. Function unit conflicts are resolved during reverse list scheduling by tracking the time when each insn is committed to the schedule and from that, the time the function units it uses must be free. As insns on the ready list are considered for scheduling, those that would result in a blockage of the already committed insns are queued until no blockage will result. Among the remaining insns on the ready list to be considered, the first one with the largest potential for causing a subsequent blockage is chosen. The following list shows the order in which we want to break ties among insns in the ready list: 1. choose insn with lowest conflict cost, ties broken by 2. choose insn with the longest path to end of bb, ties broken by 3. choose insn that kills the most registers, ties broken by 4. choose insn that conflicts with the most ready insns, or finally 5. choose insn with lowest UID. Memory references complicate matters. Only if we can be certain that memory references are not part of the data dependency graph (via true, anti, or output dependence), can we move operations past memory references. To first approximation, reads can be done independently, while writes introduce dependencies. Better approximations will yield fewer dependencies. Dependencies set up by memory references are treated in exactly the same way as other dependencies, by using LOG_LINKS. Having optimized the critical path, we may have also unduly extended the lifetimes of some registers. If an operation requires that constants be loaded into registers, it is certainly desirable to load those constants as early as necessary, but no earlier. I.e., it will not do to load up a bunch of registers at the beginning of a basic block only to use them at the end, if they could be loaded later, since this may result in excessive register utilization. Note that since branches are never in basic blocks, but only end basic blocks, this pass will not do any branch scheduling. But that is ok, since we can use GNU's delayed branch scheduling pass to take care of this case. Also note that no further optimizations based on algebraic identities are performed, so this pass would be a good one to perform instruction splitting, such as breaking up a multiply instruction into shifts and adds where that is profitable. Given the memory aliasing analysis that this pass should perform, it should be possible to remove redundant stores to memory, and to load values from registers instead of hitting memory. This pass must update information that subsequent passes expect to be correct. Namely: reg_n_refs, reg_n_sets, reg_n_deaths, reg_n_calls_crossed, and reg_live_length. Also, basic_block_head, basic_block_end. The information in the line number notes is carefully retained by this pass. All other NOTE insns are grouped in their same relative order at the beginning of basic blocks that have been scheduled. */#include <stdio.h>#include "config.h"#include "rtl.h"#include "basic-block.h"#include "regs.h"#include "hard-reg-set.h"#include "flags.h"#include "insn-config.h"#include "insn-attr.h"#ifdef INSN_SCHEDULING/* Arrays set up by scheduling for the same respective purposes as similar-named arrays set up by flow analysis. We work with these arrays during the scheduling pass so we can compare values against unscheduled code. Values of these arrays are copied at the end of this pass into the arrays set up by flow analysis. */static short *sched_reg_n_deaths;static int *sched_reg_n_calls_crossed;static int *sched_reg_live_length;/* Element N is the next insn that sets (hard or pseudo) register N within the current basic block; or zero, if there is no such insn. Needed for new registers which may be introduced by splitting insns. */static rtx *reg_last_uses;static rtx *reg_last_sets;static regset reg_pending_sets;static int reg_pending_sets_all;/* Vector indexed by INSN_UID giving the original ordering of the insns. */static int *insn_luid;#define INSN_LUID(INSN) (insn_luid[INSN_UID (INSN)])/* Vector indexed by INSN_UID giving each instruction a priority. */static int *insn_priority;#define INSN_PRIORITY(INSN) (insn_priority[INSN_UID (INSN)])static short *insn_costs;#define INSN_COST(INSN) insn_costs[INSN_UID (INSN)]/* Vector indexed by INSN_UID giving an encoding of the function units used. */static short *insn_units;#define INSN_UNIT(INSN) insn_units[INSN_UID (INSN)]/* Vector indexed by INSN_UID giving an encoding of the blockage range function. The unit and the range are encoded. */static unsigned int *insn_blockage;#define INSN_BLOCKAGE(INSN) insn_blockage[INSN_UID (INSN)]#define UNIT_BITS 5#define BLOCKAGE_MASK ((1 << BLOCKAGE_BITS) - 1)#define ENCODE_BLOCKAGE(U,R) \ ((((U) << UNIT_BITS) << BLOCKAGE_BITS \ | MIN_BLOCKAGE_COST (R)) << BLOCKAGE_BITS \ | MAX_BLOCKAGE_COST (R))#define UNIT_BLOCKED(B) ((B) >> (2 * BLOCKAGE_BITS))#define BLOCKAGE_RANGE(B) \ (((((B) >> BLOCKAGE_BITS) & BLOCKAGE_MASK) << (HOST_BITS_PER_INT / 2)) \ | (B) & BLOCKAGE_MASK)/* Encodings of the `<name>_unit_blockage_range' function. */#define MIN_BLOCKAGE_COST(R) ((R) >> (HOST_BITS_PER_INT / 2))#define MAX_BLOCKAGE_COST(R) ((R) & ((1 << (HOST_BITS_PER_INT / 2)) - 1))#define DONE_PRIORITY -1#define MAX_PRIORITY 0x7fffffff#define TAIL_PRIORITY 0x7ffffffe#define LAUNCH_PRIORITY 0x7f000001#define DONE_PRIORITY_P(INSN) (INSN_PRIORITY (INSN) < 0)#define LOW_PRIORITY_P(INSN) ((INSN_PRIORITY (INSN) & 0x7f000000) == 0)/* Vector indexed by INSN_UID giving number of insns referring to this insn. */static int *insn_ref_count;#define INSN_REF_COUNT(INSN) (insn_ref_count[INSN_UID (INSN)])/* Vector indexed by INSN_UID giving line-number note in effect for each insn. For line-number notes, this indicates whether the note may be reused. */static rtx *line_note;#define LINE_NOTE(INSN) (line_note[INSN_UID (INSN)])/* Vector indexed by basic block number giving the starting line-number for each basic block. */static rtx *line_note_head;/* List of important notes we must keep around. This is a pointer to the last element in the list. */static rtx note_list;/* Regsets telling whether a given register is live or dead before the last scheduled insn. Must scan the instructions once before scheduling to determine what registers are live or dead at the end of the block. */static regset bb_dead_regs;static regset bb_live_regs;/* Regset telling whether a given register is live after the insn currently being scheduled. Before processing an insn, this is equal to bb_live_regs above. This is used so that we can find registers that are newly born/dead after processing an insn. */static regset old_live_regs;/* The chain of REG_DEAD notes. REG_DEAD notes are removed from all insns during the initial scan and reused later. If there are not exactly as many REG_DEAD notes in the post scheduled code as there were in the prescheduled code then we trigger an abort because this indicates a bug. */static rtx dead_notes;/* Queues, etc. *//* An instruction is ready to be scheduled when all insns following it have already been scheduled. It is important to ensure that all insns which use its result will not be executed until its result has been computed. An insn is maintained in one of four structures: (P) the "Pending" set of insns which cannot be scheduled until their dependencies have been satisfied. (Q) the "Queued" set of insns that can be scheduled when sufficient time has passed. (R) the "Ready" list of unscheduled, uncommitted insns. (S) the "Scheduled" list of insns. Initially, all insns are either "Pending" or "Ready" depending on whether their dependencies are satisfied. Insns move from the "Ready" list to the "Scheduled" list as they are committed to the schedule. As this occurs, the insns in the "Pending" list have their dependencies satisfied and move to either the "Ready" list or the "Queued" set depending on whether sufficient time has passed to make them ready. As time passes, insns move from the "Queued" set to the "Ready" list. Insns may move from the "Ready" list to the "Queued" set if they are blocked due to a function unit conflict. The "Pending" list (P) are the insns in the LOG_LINKS of the unscheduled insns, i.e., those that are ready, queued, and pending. The "Queued" set (Q) is implemented by the variable `insn_queue'. The "Ready" list (R) is implemented by the variables `ready' and `n_ready'. The "Scheduled" list (S) is the new insn chain built by this pass. The transition (R->S) is implemented in the scheduling loop in `schedule_block' when the best insn to schedule is chosen. The transition (R->Q) is implemented in `schedule_select' when an insn is found to to have a function unit conflict with the already committed insns. The transitions (P->R and P->Q) are implemented in `schedule_insn' as insns move from the ready list to the scheduled list. The transition (Q->R) is implemented at the top of the scheduling loop in `schedule_block' as time passes or stalls are introduced. *//* Implement a circular buffer to delay instructions until sufficient time has passed. INSN_QUEUE_SIZE is a power of two larger than MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c. This is the longest time an isnsn may be queued. */static rtx insn_queue[INSN_QUEUE_SIZE];static int q_ptr = 0;static int q_size = 0;#define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))#define NEXT_Q_AFTER(X,C) (((X)+C) & (INSN_QUEUE_SIZE-1))/* Vector indexed by INSN_UID giving the minimum clock tick at which the insn becomes ready. This is used to note timing constraints for insns in the pending list. */static int *insn_tick;#define INSN_TICK(INSN) (insn_tick[INSN_UID (INSN)])/* Data structure for keeping track of register information during that register's life. */struct sometimes{ short offset; short bit; short live_length; short calls_crossed;};/* Forward declarations. */static rtx canon_rtx PROTO((rtx));static int rtx_equal_for_memref_p PROTO((rtx, rtx));static rtx find_symbolic_term PROTO((rtx));static int memrefs_conflict_p PROTO((int, rtx, int, rtx, HOST_WIDE_INT));static void add_dependence PROTO((rtx, rtx, enum reg_note));static void remove_dependence PROTO((rtx, rtx));static rtx find_insn_list PROTO((rtx, rtx));static int insn_unit PROTO((rtx));static unsigned int blockage_range PROTO((int, rtx));static void clear_units PROTO((void));static void prepare_unit PROTO((int));static int actual_hazard_this_instance PROTO((int, int, rtx, int, int));static void schedule_unit PROTO((int, rtx, int));static int actual_hazard PROTO((int, rtx, int, int));static int potential_hazard PROTO((int, rtx, int));static int insn_cost PROTO((rtx, rtx, rtx));static int priority PROTO((rtx));static void free_pending_lists PROTO((void));static void add_insn_mem_dependence PROTO((rtx *, rtx *, rtx, rtx));static void flush_pending_lists PROTO((rtx));static void sched_analyze_1 PROTO((rtx, rtx));static void sched_analyze_2 PROTO((rtx, rtx));static void sched_analyze_insn PROTO((rtx, rtx, rtx));static int sched_analyze PROTO((rtx, rtx));static void sched_note_set PROTO((int, rtx, int));static int rank_for_schedule PROTO((rtx *, rtx *));static void swap_sort PROTO((rtx *, int));static void queue_insn PROTO((rtx, int));static int birthing_insn PROTO((rtx));static void adjust_priority PROTO((rtx));static int schedule_insn PROTO((rtx, rtx *, int, int));static int schedule_select PROTO((rtx *, int, int, FILE *));static void create_reg_dead_note PROTO((rtx, rtx));static void attach_deaths PROTO((rtx, rtx, int));static void attach_deaths_insn PROTO((rtx));static rtx unlink_notes PROTO((rtx, rtx));static int new_sometimes_live PROTO((struct sometimes *, int, int, int));static void finish_sometimes_live PROTO((struct sometimes *, int));static void schedule_block PROTO((int, FILE *));static rtx regno_use_in PROTO((int, rtx));static void split_hard_reg_notes PROTO((rtx, rtx, rtx, rtx));static void new_insn_dead_notes PROTO((rtx, rtx, rtx, rtx));static void update_n_sets PROTO((rtx, int));static void update_flow_info PROTO((rtx, rtx, rtx, rtx));/* Main entry point of this file. */void schedule_insns PROTO((FILE *));#endif /* INSN_SCHEDULING */#define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X)))/* Vector indexed by N giving the initial (unchanging) value known for pseudo-register N. */static rtx *reg_known_value;/* Vector recording for each reg_known_value whether it is due to a REG_EQUIV note. Future passes (viz., reload) may replace the pseudo with the equivalent expression and so we account for the dependences that would be introduced if that happens. *//* ??? This is a problem only on the Convex. The REG_EQUIV notes created in assign_parms mention the arg pointer, and there are explicit insns in the RTL that modify the arg pointer. Thus we must ensure that such insns don't get scheduled across each other because that would invalidate the REG_EQUIV notes. One could argue that the REG_EQUIV notes are wrong, but solving the problem in the scheduler will likely give better code, so we do it here. */static char *reg_known_equiv_p;/* Indicates number of valid entries in reg_known_value. */static int reg_known_value_size;static rtxcanon_rtx (x) rtx x;{ if (GET_CODE (x) == REG && REGNO (x) >= FIRST_PSEUDO_REGISTER && REGNO (x) <= reg_known_value_size) return reg_known_value[REGNO (x)]; else if (GET_CODE (x) == PLUS) { rtx x0 = canon_rtx (XEXP (x, 0)); rtx x1 = canon_rtx (XEXP (x, 1)); if (x0 != XEXP (x, 0) || x1 != XEXP (x, 1)) { /* We can tolerate LO_SUMs being offset here; these rtl are used for nothing other than comparisons. */ if (GET_CODE (x0) == CONST_INT) return plus_constant_for_output (x1, INTVAL (x0)); else if (GET_CODE (x1) == CONST_INT) return plus_constant_for_output (x0, INTVAL (x1)); return gen_rtx (PLUS, GET_MODE (x), x0, x1); } } return x;}/* Set up all info needed to perform alias analysis on memory references. */voidinit_alias_analysis (){ int maxreg = max_reg_num (); rtx insn; rtx note; rtx set; reg_known_value_size = maxreg; reg_known_value = (rtx *) oballoc ((maxreg-FIRST_PSEUDO_REGISTER) * sizeof (rtx)) - FIRST_PSEUDO_REGISTER; bzero ((char *) (reg_known_value + FIRST_PSEUDO_REGISTER), (maxreg-FIRST_PSEUDO_REGISTER) * sizeof (rtx)); reg_known_equiv_p
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -