muxtablegenerator.java
来自「一种将c高级语言转化给VHDL的编译器」· Java 代码 · 共 823 行 · 第 1/3 页
JAVA
823 行
predicate were some kind of complicated logic equation it would have to translated into the appropriate instructions and added to the schedule.) It is %tmp_3 that I store, for use in the epilog. So I also add a store instruction to the loop. However, these extra instructions cannot simply be inserted anywhere into the kernal, nor can they be added after scheduling. They too can cause hardware conflicts and care must taken when scheduling them, and therefore, they must be added to the kernal, before modulo scheduling is performed. However, these instructions shouldn't be in the prolog or epilog, so we need to know what was added so that after the kernal is scheduled andan unrolled copy created (which, in turn, is used to create the epilog and prolog), they need to be deleted from the unrolled copy so they won't end up in the prolog or epilog. Additionally, an extra load statement must be added to the epilog to load the kernal's predicate and another instruction must be added to take the inverse of this and use it as the epilog's final store's predicate. (And we can't ignore this not and assume that the operand used in the kernal's predicate is itself the inverse of the kernal's store, because the compiler doesn't know what the kernal predicate will be and if it will actually be ~tmp_2 like in this case and not simply %tmp_2 or some complicated logic equation.) And there is one final issue regarding the kernal predicates. A %tmp_2 will be calculated during every iteration of the kernal, but which iteration of the original loop this corresponds to depends on where it falls in the kernal. In the above kernal, if %tmp_2 was calculated somewhere between cycles 8 and 11, but the store was between 12 and 15, then there will be a load and store between 11 and 12 transfering %tmp_2's value, and a new %tmp_2 copy will be created to contain its value between 12 and 15. If the original %tmp_2 were used, it would correspond to a %tmp_2 from a later iteration of the main loop. When we store the kernal store predicate, we need to make sure to store the correct predicate from the correct iteration. A few methods in here help with that, but also refer to MSchedHash and ModSched for more on this. The final mux table issue is the possibility of collapsing the tree. Often many iterations have already calculated the value that will be stored in x before the epilog begins, and have transfered it to the epilog using a modprim operand. All of these iterations will probably be reading from the same modPrim. Additionally, these iterations should be next to each other. This means that not only do we not need several loads for each iteration to read the modPrim value, but we can also eliminate several muxes, since all their inputs should be the same. This is performed in three steps. The first step is to search the epilog for cases when a value is loaded from a modPrim and used as output to store in the register. MuxTableGenerator takes note of which iteration's outputs are from which modPrims when they come from a modPrim and then it deletes those load instructions so that there won't be one for each iteration where this is the case. When it knows it needs to read a value from a modPrim, it creates its own block operand which it will later use to hold the value in the modPrim when it creates its own load statement. Step two is to replace any uses of all the previous loads of that modPrim from each iteration with a use of this MuxTableGenerator created block operand. Step 3 occurs during creation of the mux tree. Whenever two inputs to a mux are from the same modPrim, it doesn't create a mux and instead saves that this mux is instead a usage of the modPrim. When it finds a mux, where only one input is from a modPrim, it creates that mux, but uses as one input the block operand it created to hold the value in the modPrim. Finally it creates a load instruction to load the modPrim into its block operand. * * @author Kris Peterson */public class MuxTableGenerator extends HashMap{ /** The select input to the muxes needs an operand, but I'm using each iteration's store's predicate for the select. Even if a predicate is only one operand, it is of type BooleanEquation. To get a single operand to use on the select, I generate all the logic instructions necessary to calculate the predicate (usually just a NOT) and use the output from these logic instructions as the select input to a mux. Creating these equations happens during the method add but they are added at the end of the method genMuxTable. This ArrayList stores those instructions until they can be added. */ private ArrayList _logicHandleInsts = new ArrayList(); /** This ArrayList stores all the kernal predicate store instructions so that they can be added separately after scheduling to put them at the end of the kernal. Actually I'm not sure why I did this. Perhaps because I wanted to add them after unrolling. I add the kernal predicate evaluation instructions before scheduling, but I wrote that in after writing this part, so maybe I just forgot to just put these with the logic instructions and add them all at the top. */ private ArrayList _predStores = new ArrayList(); /** This guy saves the primal operand that the kernal predicate was saved to so that a load can be added to the epilog to retrieve it. key = primal being stored value = primal where the primal store's predicate was saved */ private HashMap _primToPred = new HashMap(); private HashMap _primToOldPred = new HashMap(); /** This saves the kernal store predicate logic evaluation instructions so that they can be deleted from the unrolled copy of the scheduled loop. key = primal being stored to value = set of instructions used to evaluate the store's predicate */ private HashMap _savedPredSaveStoreInsts = new HashMap(); public MuxTableGenerator() { super(); } /** This nested class is used to help with the creation of the mux tree. It contains the block operand that is the input to the mux, this operand's iteration's predicate, and a modPrim primal operand that this block operand was loaded from if one exists. As the mux tree is created new elements are created where the operand is the output from the mux in the layer below and the predicate is the OR of the predicates of that mux's input, and where modPrim is equal to a modPrim operand, if one was used by both inputs to the mux. */ private class Element { public BooleanOperand pred; public Operand op; public PrimalOperand modPrim = null; public Element(BooleanOperand p, Operand o) { pred = p; op = o; } /** I add this for debugging so I could print collections of elements and get something meaningful */ public String toString() { return op.toString() + " | " + pred.toString(); } } //end class Element /** returns all the created kernal store predicate store instructions so they can be added to the kernal */ public ArrayList getPredStores() { return _predStores; } /** returns all the created logic instructions used to evaluate the kernal store predicate store instructions so these instructions can be removed from the unrolled copy of the kernal */ public HashMap getPredLogic() { return _savedPredSaveStoreInsts; } /** This method looks over the kernal, before it has been scheduled and looks for stores to primals. When it finds one, it reads that store instruction's predicate, and creates logic instructions to evaluate the predicate and creates a store instruction to save the value of the predicate to a primal which will be used to transfer the value of the predicate to the epilog. It saves all these instructions so they can be added to the kernal and scheduled when the rest of the kernal is scheduled, and they are saved to private variables so that they can later be removed, from the unrolled copy of the kernal, which will be used to create the epilog and prolog, which cannot have these instructions in them. */ public HashMap savePredOps = new HashMap(); public HashMap saveSuccOps = new HashMap(); public HashSet getKernalPreds(HashSet kernalInstList, BooleanEquation loopExitPredicate) { HashSet instsToAdd = new HashSet(); for (Iterator itin = kernalInstList.iterator(); itin.hasNext(); ) { Instruction inst = (Instruction)itin.next(); if(Store.conforms(inst)) { Operand out = Store.getDestination(inst); if(out.isPrimal()) { System.err.println("out " + out); System.err.println("inst " + inst); HashSet predLogicInsts = new HashSet(); PrimalOperand output = Operand.newPrimal(out.getFullName() + "_predSave"); //create the logic instructions to evaluate the predicate: BEqToList eq_list = new BEqToList(loopExitPredicate, new BooleanEquation(true)); instsToAdd.addAll(eq_list); predLogicInsts.addAll(eq_list); System.err.println("predLogicInsts " + predLogicInsts); _savedPredSaveStoreInsts.put(out, predLogicInsts); BooleanOperand in = eq_list.getResult(); Instruction resultInst = null; boolean resNotFnd=true; for (Iterator eqit = eq_list.iterator(); eqit.hasNext() && resNotFnd; ) { Instruction instTmp = (Instruction)eqit.next(); System.err.println("instTmp " + instTmp); for(int i=0; i<instTmp.getNumberOfOperands() && resNotFnd; i++) { if(instTmp.getOperand(i)==in) { resNotFnd=false; resultInst=instTmp; } } } System.err.println("result " + in); savePredOps.put(inst, in); saveSuccOps.put(resultInst, in); System.out.println("result " + in); System.out.println("inst " + inst); System.out.println("resultInst " + resultInst); //create the store instruction to save the predicate's value: BooleanEquation trueBoolEq = new BooleanEquation(true); Instruction storeInst = Store.create(Operators.STORE, Type.Bool, output, in, trueBoolEq); predLogicInsts.add(storeInst); _predStores.add(storeInst); _primToPred.put(out, output); _primToOldPred.put(out, inst.getPredicate()); } } } return instsToAdd; } /** This method performs the second step described above for collapsing the mux tree when iterations read from the same modPrim operand. This method creates a new block operand to hold the value in the modPrim operand, and then all instructions which use the original block operands defined by loads from that modprim, are changed to use this new block operand. */ public void replaceOpsWithNewMPrimBs(HashSet instSet) { for (Iterator itin = instSet.iterator(); itin.hasNext(); ) { Instruction inst = (Instruction)itin.next(); for (Iterator it = _opToElementMap.keySet().iterator(); it.hasNext(); ) { Operand op = (Operand)it.next(); //PrimalOperand modPrim = ((Element)_opToElementMap.get(op)).modPrim; HashSet elements = (HashSet)_opToElementMap.get(op); for (Iterator it2 = elements.iterator(); it2.hasNext(); ) { Element element = (Element)it2.next(); PrimalOperand modPrim = element.modPrim; if(modPrim != null) { //get a block operand specific to this modPrim operand: Operand newOp = getModPrimBlock(modPrim); //replace the original operand used here: inst.replaceOperand(op, newOp); } } } } } /** This HashMap was created to help me find an Element associated with a primal that is being stored to in the loop (a non modPrim primal). key = primal value = element */ private HashMap _opToElementMap = new HashMap(); /** This hashMap saves the Type of a load instruction used to read from a modPrim, before saving this into an output primal. This is used, when the laod instructions are created to know what type the new load should have. key = modPrim operand value = Type */ private HashMap _modPrimToType = new HashMap(); /** This method performs step 1 of the mux tree collapsing. It looks for instances when an iteration wishes to store a value to an output primal that it simply loaded from a modPrim. It also creates a set of loads of the modPrims so that they can be deleted. */
⌨️ 快捷键说明
复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?