.SH
The Sethi-Ullman Computation
.PP
The heart of the heuristics is the computation of the Sethi-Ullman numbers.
This computation is closely linked with the rewriting rules and the
templates.
As mentioned before, the Sethi-Ullman numbers are expected to
estimate the number of scratch registers needed to compute
the subtrees without using any stores.
However, the original theory does not apply to real machines.
For one thing, the theory assumes that all registers
are interchangeable.
Real machines have general purpose, floating point, and index registers,
register pairs, etc.
The theory also does not account for side effects;
this rules out various forms of pathology that arise
from assignment and assignment ops.
Condition codes are also undreamed of.
Finally, the influence of types, conversions, and the
various addressability restrictions and extensions of real
machines are also ignored.
.PP
Nevertheless, for a ``useless'' theory,
the basic insight of Sethi and Ullman is amazingly
useful in a real compiler.
The notion that one should attempt to estimate the
resource needs of trees before starting the
code generation provides a natural means of splitting the
code generation problem, and provides a bit of redundancy
and self checking in the compiler.
Moreover, if writing the
Sethi-Ullman routines is hard, describing, writing, and debugging the
alternative (routines that attempt to free up registers by stores into
temporaries ``on the fly'') is even worse.
Nevertheless, it should be clearly understood that these routines exist in a realm
where there is no ``right'' way to write them;
it is an art, the realm of heuristics, and, consequently, a major
source of bugs in the compiler.
Often, the early, crude versions of these routines give little trouble;
only after the compiler is actually working
and the code quality is being improved
do serious problems have to be faced.
Having a simple, regular machine architecture is worth quite
a lot at this time.
.PP
The major problems arise from asymmetries in the registers: register pairs,
having different kinds of registers, and the related problem of
needing more than one register (frequently a pair) to store certain
datatypes (such as longs or doubles).
There appears to be no general way of treating this problem;
solutions have to be fudged for each machine where the problem arises.
On the Honeywell 66, for example, there are only two general purpose registers,
so a need for a pair is the same as the need for two registers.
On the IBM 370, the register pair (0,1) is used to do multiplications and divisions;
registers 0 and 1 are not generally considered part of the scratch registers, and so
do not require explicit allocation.
On the Interdata 8/32, after much consideration, the
decision was made not to try to deal with the register pair issue;
operations such as multiplication and division that required pairs
were simply assumed to take all of the scratch registers.
Several weeks of effort had failed to produce
an algorithm that seemed to have much chance of running successfully
without inordinate debugging effort.
The difficulty of this issue should not be minimized; it represents one of the
main intellectual efforts in porting the compiler.
Nevertheless, this problem has been fudged with a degree of
success on nearly a dozen machines, so the compiler writer should
not abandon hope.
.PP
The Sethi-Ullman computations interact with the
rest of the compiler in a number of rather subtle ways.
As already discussed, the
.I store
routine uses the Sethi-Ullman numbers to decide which subtrees are too difficult
to compute in registers, and must be stored.
There are also subtle interactions between the
rewriting routines and the Sethi-Ullman numbers.
Suppose we have a tree such as
.DS
.I "A \- B"
.DE
where
.I A
and
.I B
are expressions; suppose further that
.I B
takes two registers, and
.I A
one.
It is possible to compute the full expression in two registers by
first computing
.I B ,
and then, using the scratch register
used by
.I B ,
but not containing the answer, compute
.I A .
The subtraction can then be done, computing the expression.
(Note that this assumes a number of things, not the least of which
are register-to-register subtraction operators and symmetric
registers.)
If the machine dependent routine
.I setbin ,
however, is not prepared to recognize this case
and compute the more difficult side of the expression
first, the
Sethi-Ullman number must be set to three.
Thus, the
Sethi-Ullman number for a tree should represent the code that
the machine dependent routines are actually willing to generate.
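.PP
To make the numbering rule concrete, the following sketch shows the
classical computation for a leaf or a symmetric binary operator,
assuming interchangeable single registers; the node layout and the
function name are invented for illustration and are not the
compiler's own declarations.
.DS
/*
 * Illustrative sketch only: the classical Sethi-Ullman rule,
 * assuming interchangeable single registers and a tree made up
 * of leaves and binary nodes.  Not the portable compiler's code.
 */
struct enode {
	struct enode	*left, *right;	/* both 0 for a leaf */
	int		su;		/* Sethi-Ullman number */
};

int
sunumber(struct enode *p)
{
	int l, r;

	if (p->left == 0 && p->right == 0)
		return (p->su = 1);	/* a leaf: one register to load it */
	l = sunumber(p->left);
	r = sunumber(p->right);
	if (l == r)
		p->su = l + 1;		/* both operands equally hard */
	else
		p->su = (l > r) ? l : r;	/* evaluate the harder side first */
	return (p->su);
}
.DE
The final case embodies the assumption, discussed above, that the
machine dependent routines are willing to evaluate the more difficult
operand first; if they always evaluated the left operand first, the
estimate would instead have to be the larger of l and r+1, which for
the example above is three.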
.PP
The interaction can go the other way.
If we take an expression such as
.DS
.I "* ( p + i )"
.DE
where
.I p
is a pointer and
.I i
an integer,
this can probably be done in one register on most machines.
Thus, its Sethi-Ullman number would probably be set to one.
If double indexing is possible in the machine, a possible way
of computing the expression is to load both
.I p
and
.I i
into registers, and then use double indexing.
This would use two scratch registers; in such a case,
it is possible that the scratch registers might be unobtainable,
or might make some other part of the computation run out of
registers.
The usual solution is to cause
.I offstar
to ignore opportunities for double indexing that would tie up more scratch
registers than the Sethi-Ullman number had reserved.
.PP
In summary, the Sethi-Ullman computation represents much of the craftsmanship and artistry in any application
of the portable compiler.
It is also a frequent source of bugs.
Algorithms are available that will produce nearly optimal code
for specialized machines, but unfortunately most existing machines
are far removed from these ideals.
The best way of proceeding in practice is to start with a compiler
for a similar machine to the target, and proceed very
carefully.
.SH
Register Allocation
.PP
After the Sethi-Ullman numbers are computed,
.I order
calls a routine,
.I rallo ,
that does register allocation, if appropriate.
This routine does relatively little, in general;
this is especially true if the target machine
is fairly regular.
There are a few cases where it is assumed that
the result of a computation takes place in a particular register;
switch and function return are the two major places.
The expression tree has a field,
.I rall ,
that may be filled with a register number; this is taken
to be a preferred register, and the first temporary
register allocated by a template match will be this preferred one, if
it is free.
If not, no particular action is taken; this is just a heuristic.
If no register preference is present, the field contains NOPREF.
In some cases, the result must be placed in
a given register, no matter what.
The register number is placed in
.I rall ,
and the mask MUSTDO is logically or'ed in with it.
In this case, if the subtree is requested in a register, and comes
back in a register other than the demanded one, it is moved
by calling the routine
.I rmove .
If the target register for this move is busy, it is a compiler
error.
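.PP
The effect of the field can be sketched in a few lines of C; the
names parallel those in the text, but the declarations, the numeric
values of NOPREF and MUSTDO, and the surrounding interface are
invented here for illustration and are not the compiler's own.
.DS
/*
 * Illustrative sketch only: honoring the rall preference when a
 * temporary register is picked for a template match.
 */
#define NOPREF	(-1)	/* no preference recorded; value invented */
#define MUSTDO	0100	/* result demanded in this register; value invented */

int
pickreg(int rall, int free[], int nregs)
{
	int i;

	if (rall != NOPREF && free[rall & ~MUSTDO])
		return (rall & ~MUSTDO);	/* preferred or demanded register is free */
	for (i = 0; i < nregs; ++i)
		if (free[i])
			return (i);		/* otherwise any free scratch register */
	return (-1);				/* none free: a subtree must be stored first */
}
.DE
If MUSTDO was or'ed into the field and the result nevertheless comes
back in some other register, it is the caller's job to repair the
situation with
.I rmove ,
as described above.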
.PP
Note that this mechanism is the only one that will ever cause a register-to-register
move between scratch registers (unless such a move is buried in the depths of
some template).
This simplifies debugging.
In some cases, there is a rather strange interaction between
the register allocation and the Sethi-Ullman number;
if there is an operator or situation requiring a
particular register, the allocator and the Sethi-Ullman
computation must conspire to ensure that the target
register is not being used by some intermediate result of some far-removed computation.
This is most easily done by making the special operation take
all of the free registers, preventing any other partially-computed
results from cluttering up the works.
.SH
Compiler Bugs
.PP
The portable compiler has an excellent record of generating correct code.
The requirement for reasonable cooperation between the register allocation,
Sethi-Ullman computation, rewriting rules, and templates builds quite a bit
of redundancy into the compiling process.
The effect of this is that, in a surprisingly short time, the compiler will
start generating correct code for those
programs that it can compile.
The hard part of the job then becomes finding and
eliminating those situations where the compiler refuses to
compile a program because it knows it cannot do it right.
For example, a template may simply be missing; this may either
give a compiler error of the form ``no match for op ...'', or cause
the compiler to go into an infinite loop applying various rewriting rules.
The compiler has a variable,
.I nrecur ,
that is set to 0 at the beginning of each expression, and
incremented at key spots in the
compilation process; if this parameter gets too large, the
compiler decides that it is in a loop, and aborts.
Loops are also characteristic of botches in the machine-dependent rewriting rules.
Bad Sethi-Ullman computations usually cause the scratch registers
to run out; this often means
that the Sethi-Ullman number was underestimated, so
.I store
did not store something it should have; alternatively,
it can mean that the rewriting rules were not smart enough to
find the sequence that
.I sucomp
assumed would be used.
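.PP
As a minimal illustration of the loop guard just mentioned, the check
could look something like the following; the limit, the function name,
and the error handling are invented for the sketch and are not the
compiler's own.
.DS
/*
 * Illustrative sketch only: the kind of recursion guard described
 * above.  The real compiler has its own variable nrecur and its
 * own error reporting routine.
 */
#include <stdio.h>
#include <stdlib.h>

#define MAXRECUR 1000		/* invented limit for the sketch */

int nrecur;			/* reset to 0 for each expression */

void
bumprecur(void)
{
	if (++nrecur > MAXRECUR) {
		fprintf(stderr, "compiler loop: probable missing template or rewriting botch\n");
		exit(1);
	}
}
.DE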
.PP
The best approach when a compiler error is detected involves several stages.
First, try to get a small example program that steps on the bug.
Second, turn on various debugging flags in the code generator, and follow the
tree through the process of being matched and rewritten.
Some flags of interest are
\-e, which prints the expression tree,
\-r, which gives information about the allocation of registers,
\-a, which gives information about the performance of
.I rallo ,
and \-o, which gives information about the behavior of
.I order .
This technique should allow most bugs to be found relatively quickly.
.PP
Unfortunately, finding the bug is usually not enough; it must also
be fixed!
The difficulty arises because a fix to the particular bug of interest tends
to break other code that already works.
Regression tests, tests that compare the performance of
a new compiler against the performance of an older one, are very
valuable in preventing major catastrophes.
.SH
Summary and Conclusion
.PP
The portable compiler has been a useful tool for providing C
capability on a large number of diverse machines,
and for testing a number of theoretical
constructs in a practical setting.
It has many blemishes, both in style and functionality.
It has been applied to many more machines than first
anticipated, of a much wider range than originally dreamed of.
Its use has also spread much faster than expected, leaving parts of
the compiler still somewhat raw in shape.
.PP
On the theoretical side, there is some hope that the
skeleton of the
.I sucomp
routine could be generated for many machines directly from the
templates; this would give a considerable boost
to the portability and correctness of the compiler,
but might affect tunability and code quality.
There is also room for more optimization, both within
.I optim
and in the form of a portable ``peephole'' optimizer.
.PP
On the practical, development side,
the compiler could probably be sped up and made smaller
without doing too much violence to its basic structure.
Parts of the compiler deserve to be rewritten;
the initialization code, register allocation, and
parser are prime candidates.
It might be that doing some or all of the parsing
with a recursive descent parser would
save enough space and time to be worthwhile;
it would certainly ease the problem of moving the
compiler to an environment where
.I Yacc
is not already present.
.PP
Finally, I would like to thank the many people who have
sympathetically, and even enthusiastically, helped me grapple
with what has been a frustrating program to write, test, and install.
D. M. Ritchie and E. N. Pinson provided needed early
encouragement and philosophical guidance;
M. E. Lesk,
R. Muha, T. G. Peterson,
G. Riddle, L. Rosler,
R. W. Mitze,
B. R. Rowland,
S. I. Feldman,
and
T. B. London
have all contributed ideas and gripes, and all, at one time or another,
have climbed ``into the pits'' with me to help debug.
Without their help this effort would not have been possible;
with it, it was often kind of fun.
.sp 100
.LP
.[
$LIST$
.]
.LP
.sp 100
.LP