.SH
Pass Two
.PP
Code generation is far less well understood than
parsing or lexical analysis, and for this reason
the second pass is far harder to discuss in a file by file manner.
A great deal of the difficulty is in understanding the issues
and the strategies employed to meet them.
Any particular function is likely to be reasonably straightforward.
.PP
Thus, this part of the paper will concentrate a good deal on the broader
aspects of strategy in the code generator,
and will not get too intimate with the details.
.SH
Overview.
.PP
It is difficult to organize a code generator to be flexible enough to
generate code for a large number of machines,
and still be efficient for any one of them.
Flexibility is also important when it comes time to tune
the code generator to improve the output code quality.
On the other hand, too much flexibility can lead to semantically
incorrect code, and potentially a combinatorial explosion in the
number of cases to be considered in the compiler.
.PP
One goal of the code generator is to have a high degree of correctness.
It is very desirable to have the compiler detect its own inability to
generate correct code, rather than to produce incorrect code.
This goal is achieved by having a simple model of the job
to be done (e.g., an expression tree)
and a simple model of the machine state
(e.g., which registers are free).
The act of generating an instruction performs a transformation
on the tree and the machine state;
hopefully, the tree eventually gets
reduced to a single node.
If each of these instruction/transformation pairs is correct,
and if the machine state model really represents the actual machine,
and if the transformations reduce the input tree to the desired single node, then the
output code will be correct.
.PP
For most real machines, there is no definitive theory of code generation that
encompasses all the C operators.
Thus the selection of which instruction/transformations to generate, and in what
order, will have a heuristic flavor.
If, for some expression tree, no
transformation applies, or, more
seriously, if the heuristics select a sequence of instruction/transformations
that do not in fact reduce the tree, the compiler
will report its inability to generate code, and abort.
.PP
A major part of the code generator is concerned with the model and the transformations
\(em most of this is machine independent, or depends only on simple tables.
The flexibility comes from the heuristics that guide the transformations
of the trees, the selection of subgoals, and the ordering of the computation.
.SH
The Machine Model
.PP
The machine is assumed to have a number of registers, of at most two different
types:
.I A
and
.I B .
Within each register class, there may be scratch (temporary) registers and dedicated registers
(e.g., register variables, the stack pointer, etc.).
Requests to allocate and free registers involve only the temporary registers.
.PP
Each of the registers in the machine is given a name and a number
in the
.I mac2defs
file; the numbers are used as indices into various tables
that describe the registers, so they should be kept small.
One such table is the
.I rstatus
table on file
.I local2.c .
This table is indexed by register number, and
contains expressions made up from manifest constants describing the register types:
SAREG for dedicated AREG's, SAREG\(orSTAREG for scratch AREG's,
and SBREG and SBREG\(orSTBREG similarly for BREG's.
There are macros that access this information:
.I isbreg(r)
returns true if register number
.I r
is a BREG, and
.I istreg(r)
returns true if register number
.I r
is a temporary AREG or BREG.
Another table,
.I rnames ,
contains the register names; this is used when
putting out assembler code and diagnostics.
.PP
The usage of registers is kept track of by an array called
.I busy .
.I Busy[r]
is the number of uses of register
.I r
in the current tree being processed.
The allocation and freeing of registers will be discussed later as part of the code generation
algorithm.
.SH
General Organization
.PP
As mentioned above, the second pass reads lines
from the intermediate file, copying through to the output unchanged
any lines that begin with a `)', and making note of the
information about stack usage and register allocation contained on
lines beginning with `]' and `['.
The expression trees, whose beginning is indicated by a line beginning with `.',
are read and rebuilt into trees.
If the compiler is loaded as one pass, the expression trees are
immediately available to the code generator.
.PP
The actual code generation is done by a hierarchy of routines.
The routine
.I delay
is first given the tree; it attempts to delay some postfix
++ and \-\- computations that might reasonably be done after the
smoke clears.
It also attempts to handle comma (,) operators by computing the
left side expression first, and then rewriting the tree
to eliminate the operator.
.I Delay
calls
.I codgen
to control the actual code generation process.
.I Codgen
takes as arguments a pointer to the expression tree,
and a second argument that, for socio-historical reasons, is called a
.I cookie .
The cookie describes a set of goals that would be acceptable
for the code generation: these are assigned to individual bits,
so they may be logically or'ed together to form a large number of possible goals.
Among the possible goals are FOREFF (compute for side effects only;
don't worry about the value), INTEMP (compute and store value into a temporary location
in memory), INAREG (compute into an A register), INTAREG (compute into a scratch
A register), INBREG and INTBREG similarly, FORCC (compute for condition codes),
and FORARG (compute it as a function argument; e.g., stack it if appropriate).
.PP
.I Codgen
first canonicalizes the tree by calling
.I canon .
This routine looks for certain transformations that might now be
applicable to the tree.
One, which is very common and very powerful, is to
fold together an indirection operator (UNARY MUL)
and a register (REG); in most machines, this combination is
addressable directly, and so is similar to a NAME in its
behavior.
The UNARY MUL and REG
are folded together to make
another node type called OREG.
In fact, in many machines it is possible to directly address not just the
cell pointed to by a register, but also cells differing by a constant
offset from the cell pointed to by the register.
.I Canon
also looks for such cases, calling the
machine dependent routine
.I notoff
to decide if the offset is acceptable (for example, in the IBM 370 the offset
must be between 0 and 4095 bytes).
Another optimization is to replace bit field operations
by shifts and masks if the operation involves extracting the field.
Finally, a machine dependent routine,
.I sucomp ,
is called that computes the Sethi-Ullman numbers
for the tree (see below).
.PP
After the tree is canonicalized,
.I codgen
calls the routine
.I store
whose job is to select a subtree of the tree to be computed
and (usually) stored before beginning the
computation of the full tree.
.I Store
must return a tree that can be computed without need
for any temporary storage locations.
In effect, the only store operations generated while processing the subtree must be as a response
to explicit assignment operators in the tree.
This division of the job marks one of the more significant, and successful,
departures from most other compilers.
It means that the code generator can operate
under the assumption that there are enough
registers to do its job, without worrying about
temporary storage.
If a store into a temporary appears in the output, it is always
as a direct result of logic in the
.I store
routine; this makes debugging easier.
.PP
One consequence of this organization is that
code is not generated by a treewalk.
There are theoretical results that support this decision.
.[
Aho Johnson Optimal Code Trees jacm
.]
It may be desirable to compute several subtrees and store
them before tackling the whole tree;
if a subtree is to be stored, this is known before the code
generation for the subtree is begun, and the subtree is computed
when all scratch registers are available.
.PP
The
.I store
routine decides what
subtrees, if any, should be
stored by making use of numbers,
called
.I "Sethi-Ullman numbers" ,
that give, for each subtree of an expression tree,
the minimum number of scratch registers required to
compile the subtree, without any stores into temporaries.
.[
Sethi Ullman jacm 1970
.]
These numbers are computed by the machine-dependent
routine
.I sucomp ,
called by
.I canon .
The basic notion is that, knowing the Sethi-Ullman numbers for
the descendants of a node, and knowing the operator of the
node and some information about the machine, the
Sethi-Ullman number of the node itself can be computed.
If the Sethi-Ullman number for a tree exceeds the number of scratch
registers available, some subtree must be stored.
Unfortunately, the theory behind the Sethi-Ullman numbers
applies only to uselessly simple machines and operators.
For the rich set of C operators, and for machines with
asymmetric registers, register pairs, different kinds of registers,
and exceptional forms of addressing,
the theory cannot be applied directly.
The basic idea of estimation is a good one, however,
and well worth applying; the application, especially
when the compiler comes to be tuned for high code
quality, goes beyond the park of theory into the
swamp of heuristics.
This topic will be taken up again later, when more of the compiler
structure has been described.
.PP
After examining the Sethi-Ullman numbers,
.I store
selects a subtree, if any, to be stored, and returns the subtree and the associated cookie in
the external variables
.I stotree
and
.I stocook .
If a subtree has been selected, or if
the whole tree is ready to be processed, the
routine
.I order
is called, with a tree and cookie.
.I Order
generates code for trees that
do not require temporary locations.
.I Order
may make recursive calls on itself, and,
in some cases, on
.I codgen ;
for example, when processing the operators &&, \(or\(or, and comma (`,'), that have
a left to right evaluation, it is
incorrect for
.I store
to examine the right operand for subtrees to be stored.
In these
cases,
.I order
will call
.I codgen
recursively when it is permissible to work on the right operand.
A similar issue arises with the ? : operator.
.PP
The
.I order
routine works by matching the current tree with
a set of code templates.
If a template is discovered that will
match the current tree and cookie, the associated assembly language
statement or statements are generated.
The tree is then rewritten,
as specified by the template, to represent the effect of the output instruction(s).
If no template match is found, first an attempt is made to find a match with a
different cookie; for example, in order to compute
an expression with cookie INTEMP (store into a temporary storage location),
it is usually necessary to compute the expression into a scratch register
first.
If all attempts to match the tree fail, the heuristic part of
the algorithm becomes dominant.
Control is typically given to one of a number of machine-dependent routines
that may in turn recursively call
.I order
to achieve a subgoal of the computation (for example, one of the
arguments may be computed into a temporary register).
After this subgoal has been achieved, the process begins again with the
modified tree.
If the machine-dependent heuristics are unable to reduce the tree further,
a number of default rewriting rules may be considered appropriate.
For example, if the left operand of a + is a scratch
register, the + can be replaced by a += operator;
the tree may then match a template.
.PP
To close this introduction, we will discuss the steps in compiling
code for the expression
.DS
\fIa\fR += \fIb\fR
.DE
where
.I a
and
.I b
are static variables.
.PP
To begin with, the whole expression tree is examined with cookie FOREFF, and
no match is found.  Search with other cookies is equally fruitless, so an
attempt at rewriting is made.
