📄 modsched.java
字号:
/* * LA-CC 05-135 Trident 0.7.1Copyright NoticeCopyright 2006 (c) the Regents of the University of California.This Software was produced under a U.S. Government contract(W-7405-ENG-36) by Los Alamos National Laboratory, which is operatedby the University of California for the U.S. Department of Energy. TheU.S. Government is licensed to use, reproduce, and distribute thisSoftware. Permission is granted to the public to copy and use thisSoftware without charge, provided that this Notice and any statementof authorship are reproduced on all copies. Neither the Government northe University makes any warranty, express or implied, or assumes anyliability or responsibility for the user of this Software. */package fp.flowgraph;import java.util.*;import fp.util.*;import java.io.*;import java.lang.Math;import fp.hardware.*;import fp.graph.*;import fp.passes.*;import fp.*;/** This class contains the top level stuff for the mod scheduler, and if that's * not enough explanation, I don't know what is. * * @author Kris Peterson */public class ModSched extends Schedule{ /** Please refer to Rau's paper on modulo scheduling for a more full discussion of modulo scheduling. However, the result is a block of instructions which is the equivalent of multiple iterations of a loop running in parallel in staggered starts. This block of instructions, however, only contains those instructions that make up a repeating pattern when all those parallel iterations line up and are all running. That is, when execution of the loop first starts for a short period only one iteration is running and then only two and so on. The modulo scheduled block of instructions expects that all iterations are already up and running at the moment it starts exection. To handle start up and execution of these first few iterations, a block called a prolog is created, which only contains the instructions for these first few iterations until enough iterations have started that full parallelism can begin. The same problem occurs at the end of executing the loop when different iterations come to a conclusion and, once again, fewer than all possible iterations are running in parallel. To handle execution of these last iterations, another block, called the epilog is created. At the end of this block, all calculated values are stored to registers. There is a problem however, due to these multiple iterations all trying to write to the same registers. Actually, only one iteration should be allowed to write to the registers. To control which iteration performs the write, the outputs from all the iterations are placed through a tree of muxes. This nested class, MuxAdder, has methods for identifying when to add muxes, to do some editing of the epilog in preparation for adding the muxes, and a method for creation and scheduling of the prolog and epilog hyperblocks (this method probably shouldn't be here--I'm not sure what I was thinking when I did that). However the main mux adding code is in the file MuxTableGenerator.java, and the comments there should be reffered to for more understanding. */ private class MuxAdder { /** each primal that is getting stored to needs a different tree of muxes. Also we need to be able to tell the difference between outputs from different iterations. To help with this, as each iteration calls this method, a count for each primal is done, which is used to create iteration specific block operands to store the data from that iteration which would have been stored to that primal. */ private HashMap _muxCnts = new HashMap(); /** I need block variables to hold the data for each iteration that will be stored to a primal. To do this, I create a block variable named after the primal and on the iteration count. So for example, given a store to a primal "a", the first iteration requesting a mux will create a block variable called a_0 (note, the iteration that first requests a mux is actually the last iteration of the loop and so actually a_5 is from an earlier iteration than a_0). After this block operand has been created, the store instruction can be deleted, and the instruction that defined the block operand that was being stored to the register can be changed to define a_0 instead. In addition, any other instructions using the input to the store instruction need to be changed, to use a_0 instead, since the old block is no longer being defined. To clarify more, if we have these instructions: (BLOCK) tmp_1 = aaa_iadd (BLOCK) tmp_2, (BLOCK) tmp_3 (BLOCK) tmp_5 = aaa_iadd (BLOCK) tmp_1, (BLOCK) tmp_4 (PRIMAL) a = aaa_store (BLOCK) tmp_1 we need to delete the store and change the others to this (for the 1st iteration): (BLOCK) a_0 = aaa_iadd (BLOCK) tmp_2, (BLOCK) tmp_3 (BLOCK) tmp_5 = aaa_iadd (BLOCK) a_0, (BLOCK) tmp_4 and (BLOCK) a_1 = aaa_iadd (BLOCK) tmp_2, (BLOCK) tmp_3 (BLOCK) tmp_5 = aaa_iadd (BLOCK) a_1, (BLOCK) tmp_4 for the second iteration and so on for all necessary iterations in the epilog. (remember, later a new store instruction will be inserted that stores the output from the mux tree to primal a). This HashMap saves the association between who needs to be replaced and who does the replacing. So key = to be replaced (in above example, tmp_1) value = operand to replace it with (in above example, a_0 or a_1) */ private HashMap primReplace = new HashMap(); /** Sometimes, a single block variable is stored to multiple primals. After performing, the above described replacement, the block operand will have been renamed for one of the primals, but not for the other. This will create a problem later, because MuxTableGenerator will not know the correct input for the second primal. That is given this code: (BLOCK) tmp_1 = aaa_iadd (BLOCK) tmp_2, (BLOCK) tmp_3 (BLOCK) tmp_5 = aaa_iadd (BLOCK) tmp_1, (BLOCK) tmp_4 (PRIMAL) a = aaa_store (BLOCK) tmp_1 (PRIMAL) b = aaa_store (BLOCK) tmp_1 after doing the above described changes and creating a_0 and b_0 block operands, which the MuxTableGenerator will expect, only a_0 will be defined: (BLOCK) a_0 = aaa_iadd (BLOCK) tmp_2, (BLOCK) tmp_3 (BLOCK) tmp_5 = aaa_iadd (BLOCK) a_0, (BLOCK) tmp_4 This hashMap was written to notice this and allow duplicate definitions to handle b_0 as well. It is used to copy the definition of the source block operand, where the copy defines b_0 instead of a_0: (BLOCK) a_0 = aaa_iadd (BLOCK) tmp_2, (BLOCK) tmp_3 (BLOCK) b_0 = aaa_iadd (BLOCK) tmp_2, (BLOCK) tmp_3 (BLOCK) tmp_5 = aaa_iadd (BLOCK) a_0, (BLOCK) tmp_4 this is not ideal as now we have two iadds, and it could be two floating point square roots, and this actually could be changed, since MuxTableGenerator is told what to expect as input. For all it cares, it could use a_0 (or even "bob_the_block_variable") instead of b_0. update: I changed it to get rid of the copying and it works! */ //private HashMap primReplace2 = new HashMap(); /** The select input to the muxes is controlled by looking at the predicates for each iteration. However, we need to rename the predicate operands so that each iteration uses different ones and the control logic for the mux tree can tell them apart. This hashmap is simple: key = old predicate operand name value = new predicate operand name */ private HashMap predReplace = new HashMap(); /** This is used with primReplace2, to find instructions to copy. When I find that one block operand is being stored to two primals, the block operand is saved here. Later, the instructions are searched over for one who defines any operand in this set. When one is found, it is copied, and the copy changed to define the appropriate block operand in primReplace2 */ private HashSet _copyDefinition = new HashSet(); public MuxAdder() {} public void addMuxes(InstructionList iList, MuxTableGenerator muxGen, BooleanEquation loopExitPredicate) { primReplace = new HashMap(); //primReplace2 = new HashMap(); predReplace = new HashMap(); //foreach time, foreach instruction in the set at that time for (Iterator vIt = ((HashMap)iList.clone()).keySet().iterator(); vIt.hasNext();) { Float execTimeFl = (Float) vIt.next(); float execTime = execTimeFl.floatValue(); HashSet iList2 = iList.getAllAtTime(execTime); for (Iterator vIt2 = ((HashSet)iList2.clone()).iterator(); vIt2.hasNext();) { InstructionList.InstructionObject instObj = (InstructionList.InstructionObject) vIt2.next(); Instruction inst = instObj.inst; //if the instruction is a store if(Store.conforms(inst)) { //BooleanEquation predTmp = inst.getPredicate(); //if(predTmp != null) { //LinkedList BoolsTmp = predTmp.listBooleanOperands(); //if the store does not have a predicate, we can't make any //control logic for out mux table, and also, this should weed out //any constant stores to block operands if(instObj.listOfPredsUsed.size() != 0) { //if(BoolsTmp.size() != 0) { //just to make sure, check that it is storing to a primal, but //also make sure it's not one of the modPrims since we don't //care about their values after the loop, and any leftover //stores to modprims need to be simply deleted. if((Store.getDestination(inst).isPrimal())&& (Store.getDestination(inst).getFullName().indexOf("modPrim")==-1)) { PrimalOperand out = (PrimalOperand)Store.getDestination(inst); //BooleanEquation neweq = new BooleanEquation(loopExitPredicate); BooleanEquation neweq = new BooleanEquation((Bool)instObj.getItPred()); //get the iteration count (and remember, the lower this //number, the later the iteration, since later iterations //request their muxes earlier if(!_muxCnts.containsKey(out)) _muxCnts.put(out, new Integer(0)); int cnt = ((Integer)_muxCnts.get(out)).intValue(); //create new booleans for all the operands in the predicate, //change the predicate accordingly and save the mapping to //predReplace, so that the definitions of these operands can //also be changed //for (Iterator itin = instObj.listOfPredsUsed.iterator(); /*for (Iterator itin = neweq.listBooleanOperands().iterator(); itin.hasNext(); ) { Operand op = (Operand)itin.next(); Operand opCopy = Operand.newBoolean(op.getFullName() + "_" + cnt); predReplace.put(op, opCopy); neweq = neweq.replaceBool((Bool)op, (Bool)opCopy); System.out.println("op " + op); System.out.println("opCopy " + opCopy); //loopExitPredicate = neweq; }*/ //create the new block operand to hold this iterations output Operand newOut = Operand.newBlock(out.getFullName() + "_" + cnt); Operand input = Store.getValue(inst); //however, if input is stored to multiple primals, use the //existing new block if(primReplace.containsKey(input)) { newOut = (Operand)primReplace.get(input); } //tell MuxTableGenerator about this input to the mux tree muxGen.add(out, neweq, newOut); //increment the iteration count for this primal: _muxCnts.put(out, new Integer(++cnt)); //sometimes, a constant is being stored to the primal. The //constant will, obviously, not be defined anywhere else, and //so unlike in all other cases, the store instruction must not //be deleted, but instead changed to define the new block //operand. (And note, just because in this iteration a //constant is being stored to the primal does not mean that //other iterations will all also store this constant to the //primal, meaning, we cannot assume that we only need one //store in this case and no mux tree.) if(input.isConstant()) instObj.replaceOperand(out, newOut); else { /*if(primReplace.containsKey(input)) { _copyDefinition.add(input); primReplace2.put(input, newOut); } else {*/ //save the mapping from the old block operand to the new primReplace.put(input, newOut); //} //delete the store instruction: iList.remove(execTime, instObj); } } } //} } } } //this method performs all the replacements: replaceOperands(iList); } /** This method replaces operands as requested by addMuxes */ public void replaceOperands(InstructionList iList) { /*for (Iterator it1 = _copyDefinition.iterator(); it1.hasNext();) { Operand prim = (Operand)it1.next(); Operand primReplacement = (Operand)primReplace2.get(prim); for (Iterator it2 = ((HashSet)iList.getInstSet().clone()).iterator(); it2.hasNext();) { InstructionList.InstructionObject instObj = (InstructionList.InstructionObject)it2.next(); if(instObj.getInstOuts().contains(prim)) { InstructionList.InstructionObject instObjCopy = instObj.copySaveOps(); System.out.println("prim replace " + instObjCopy.inst); System.out.println("prim " + prim); System.out.println("primReplacement " + primReplacement); instObjCopy.replaceOperand(prim, primReplacement); iList.add(-1, instObjCopy); } } }*/ for (Iterator it1 = ((HashSet)iList.getInstSet().clone()).iterator(); it1.hasNext();) { InstructionList.InstructionObject instObj = (InstructionList.InstructionObject)it1.next(); //Instruction inst = instObj.inst; //replace the block operands with the iteration and primal specific //operands for (Iterator itin = ((Set)primReplace.keySet()).iterator(); itin.hasNext(); ) { Operand prim = (Operand)itin.next(); Operand primReplacement = (Operand)primReplace.get(prim); instObj.replaceOperand(prim, primReplacement); } //replace the predicate definitions with the new predicate operand names /*for (Iterator itin = ((Set)predReplace.keySet()).iterator(); itin.hasNext(); ) { Operand pred = (Operand)itin.next(); Operand predReplacement = (Operand)predReplace.get(pred); instObj.replaceOperand(pred, predReplacement); BooleanEquation eq = (BooleanEquation)instObj.inst.getPredicate(); BooleanEquation neweq = eq.replaceBool((Bool)pred, (Bool)predReplacement); instObj.inst.setPredicate(neweq); }*/ } } /** This method creates hyperblock nodes in the BlockGraph for the prolog and epilog, and it schedules those blocks using the user requested, nonmodulo scheduling algorithm */ public void createProNEpiNodes(BlockNode bNode, BlockGraph graph, InstructionList pList, InstructionList eList, ChipDef chipDef) { //just to make sure that there actually are instructions in the prolog if(pList.size()>0) { BlockNode prologNode = (BlockNode)graph.addNode(); LabelOperand blockName = bNode.getLabel(); String blockNameStr = blockName.getFullName(); //name the prolog after the loop body, with prolog_ in front prologNode.setName("prolog_"+ blockNameStr); //change the loops in edges (except the loop edge) to point at the //prolog instead Set in_edges = new HashSet(); in_edges.addAll(bNode.getInEdges()); for (Iterator it = in_edges.iterator(); it.hasNext();) { BlockEdge inEdge = (BlockEdge)it.next(); if(!inEdge.isBackwardEdge()) { inEdge.setSink(prologNode); } } //create a new edge from the prolog to the loop body (the kernal) and //set its predicate to true BlockEdge newEdge = (BlockEdge)graph.addEdge(prologNode, bNode); BooleanEquation trueBoolEq = new BooleanEquation(true); newEdge.setPredicate(trueBoolEq); //place all the instructions in the block, do a new hardware analysis, //and schedule the block prologNode.addInstructions(new ArrayList(pList.getInstructions())); if(!chipDef.isDummyBoard()) AnalyzeHrdWrConstraintsPass.iICalc.calcLogicSpaceReq(graph, chipDef, GlobalOptions.conserveArea);
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -