cdoc3

来自「unix v7是最后一个广泛发布的研究型UNIX版本」· 代码 · 共 813 行 · 第 1/2 页
TXT
813 行
.SHCode Generation.PPThe grand plan for code-generation isindependent of any particular machine;it depends largely on a set of tables.But this fact does not necessarily make it very easyto modify the compiler to produce code for other machines,both because there is a good deal of machine-dependent structurein the tables, and because in any event such tables are non-trivial toprepare..PPThe arguments to the basic code generation routine.II rcexprare a pointer to a tree representing an expression,the name of a code-generation table,and the number of a register in which the value of theexpression should be placed..II Rcexprreturns the number of the register in which the value actuallyended up;its callermay need to produce a.II movinstruction if the value really needs to be in the given register.There are four code generation tables..PP.II Regtabis the basic one, which actually does the job describedabove: namely,compile code which places the value represented by the expressiontree in a register..PP.II Cctabis used when the value of the expression is not actually needed,but instead the value of the condition codes resulting fromevaluation of the expression.This table is used, for example, to evaluate the expression after.II if.It is clearly silly tocalculate the value (0 or 1) of the expression`a==b' in the context `if (a==b) ... '.PPThe.II sptabtable is used when the value of an expression is to be pushed on the stack,for example when it is an actual argument.For example in the function call `f(a)' it is a bad idea toload.II ainto a register which is then pushed on the stack,when there is a single instruction which does the job..PPThe.II efftabtable is used when an expression is to be evaluated for its side effects,not its value.This occurs mostly for expressions which are statements, which have novalue.Thus the code for the statement`a = b'need produce only the approoriate.II movinstruction, and need not leave the value of.II bin a register,while in the expression `a + (b = c)'the value of `b = c' will appear in a register..PPAll of the tables besides.II regtabare rather small, and handle only a relatively few special cases.If one of these subsidiary tables does not containan entry applicable to the given expression tree,.II rcexpruses.II regtabto put the value of the expression into a registerand then fixes things up;nothing need be done when the tablewas.II efftab,but a.II tstinstruction is produced when the table called for was.II cctab,and a .II movinstruction,pushing the register on the stack,when the table was.II sptab..PPThe.II rcexprroutine itself picks off some specialcases, then calls.II cexprto do the real work..II Cexprtries to find an entry applicableto the given tree in the given table, and returns \-1 ifno such entry is found, letting.II rcexprtry again with a different table.A successful match yields a stringcontaining both literal characterswhich are written out and pseudo-operations, or macros, which are expanded.Before studying the contentsof these strings we will consider how table entries are matchedagainst trees..PPRecall that most non-leaf nodes in an expression treecontain the name of the operator,the type of the value represented, and pointers to the subtrees (operands).They also contain an estimate of the number of registers required to evaluatethe expression, placed there by the expression-optimizer routines.The register counts are used to guide the code generation process,which is based on the Sethi-Ullman algorithm..PPThe main code generationtables consist of entrieseach containing an operator number and a pointerto a subtable for the corresponding operator.A subtable consists of a sequenceof entries, each with a key describing certain properties of theoperands of the operator involved; associated with the key is a code string.Once the subtable corresponding to the operator is found, the subtableis searched linearly until a key is found such that the properties demandedby the key are compatible with the operands of the tree node.A successful match returns the code string;an unsuccessful search, either for the operator in the main tableor a compatble key in the subtable,returns a failure indication..PPThe tables are all contained in a filewhich must be processed to obtain an assembly language program.Thus they are written in a special-purpose language.To provided definiteness to the following discussion, here is anexample of a subtable entry..DS%n,aw	F	add	A2,R.DEThe `%' indicates the key;the information following (up to a blank line) specifies the code string.Very briefly, this entry is in the subtablefor `+' of.II regtab;the key specifies that the left operand is any integer, character, or pointerexpression,and the right operand is any word quantity which is directly addressible(e.g. a variable or constant).The code string calls for the generation of the codeto compile the left (first) operand into thecurrent register (`F')and then to produce an `add' instruction which adds thesecond operand (`A2') to the register (`R').All of the notation will be explained below..PPOnly three features of the operands are used in decidingwhether a match has occurred.They are:.IP 1.Is the type of the operand compatible with that demanded?.RT.IP 2.Is the `degree of difficulty' (in a sense described below) compatible?.RT.IP 3.The table may demand that the operand have a `*'(indirection operator) as its highest operator..PPAs suggested above, the key for a subtable entryis indicated by a `%,' and a comma-separated pairof specifications for the operands.(The second specification is ignored for unary operators).A specification indicatesa type requirement by including one of the following letters.If no type letter is present, any integer, character,or pointer operand will satisfy the requirement (not float, double, or long)..IP bA byte (character) operand is required..RT.IP wA word (integer or pointer) operand is required..RT.IP fA float or double operand is required..RT.IP dA double operand is required..RT.IP lA long (32-bit integer) operand is required..PPBefore discussing the `degree of difficulty' specification,the algorithm has to be explained more completely..II Rcexpr(and.II cexpr)are called with a register number in which to place their result.Registers 0, 1, ... are used during evaluation of expressions;the maximum register which can be used in this way depends on thenumber of register variables, but in any event only registers0 through 4 are available since r5 is used as a stack frameheader and r6 (sp) and r7 (pc) have specialhardware properties.The code generation routines assume that when called with register.II nas argument, they may use.II n+1,\&...(up to the first register variable)as temporaries.Consider the expression `X+Y', where bothX and Y are expressions.As a first approximation, there are three ways of compilingcode to put this expression in register.II n..IP 1.If Y is an addressible cell,(recursively) put X into register.II nand add Y to it..RT.IP 2.If Y is an expression that can be calculated in.II kregisters, where.II ksmaller than the number of registers available,compile X into register.II n,Y into register.II n+1,and add register.II n+1to.II n..RT.IP 3.Otherwise, compile Y into register.II n,save the result in a temporary (actually, on the stack)compile X into register.II n,then add in the temporary..PPThe distinction between cases 2 and 3 therefore dependson whether the right operand can be compiled in fewer than.II kregisters, where.II kis the number of free registers left after registers 0 through.II nare taken:0 through.II n\-1are presumed to contain already computed temporary results;.II nwill, in case 2,contain the value of the left operand while the rightis being evaluated..PPThese considerations should make clearthe specification codes for the degree of difficulty,bearing in mind that a number of special cases are also present:.IP zis satisfied when the operand is zero, so that special codecan be produced for expressions like `x = 0'..RT.IP 1is satisfied when the operand is the constant 1, to optimizecases like left and right shift by 1, which can be doneefficiently on the PDP-11..RT.IP cis satisfied when the operand is a positive (16-bit)constant; this takes care of some special cases in long arithmetic..RT.IP ais satisfied when the operand is addressible;this occurs not only for variables and constants, but also forsome more complicated constructions, such as indirection througha simple variable, `*p++' where.II pis a register variable (because of the PDP-11's auto-increment addressmode), and `*(p+c)' where.II pis a register and.II cis a constant.Precisely, the requirement is that the operand refers to a cellwhose address can be written as a source or destination of a PDP-11instruction..RT.IP eis satisfied by an operand whose value can be generated in a registerusing no more than.II kregisters, where.II kis the number of registers left (not counting the current register).The `e' stands for `easy.'.RT.IP nis satisfied by any operand.The `n' stands for `anything.'.PPThese degrees of difficulty are considered to lie in a linear orderingand any operand which satisfies an earlier-mentioned requirementwill satisfy a later one.Since the subtables are searched linearly,if a `1' specification is included, almost certainlya `z' must be written first to preventexpressions containing the constant 0 to be compiledas if the 0 were 1..PPFinally,a key specification may contain a `*' whichrequires the operand to have an indirection as its leading operator.Examples below should clarify the utility of this specification..PPNow let us consider the contents of the code stringassociated with each subtable entry.Conventionally, lower-case letters in this stringrepresent literal information which is copied directlyto the output.Upper-case letters generally introduce specificmacro-operations, some of which may be followedby modifying information.The code strings in the tables are written with tabs andnew-lines used freely to suggest instructions which will be generated;the table-compiling program compresses tabs (using the 0200 bit of thenext character) and throws away some of the new-lines.For example the macro `F' is ordinarily written on a line by itself;but since its expansion will end with a new-line, the new-lineafter `F' itself is dispensable.This is all to reduce the size of the stored tables..PPThe first set of macro-operations is concerned withcompiling subtrees.Recall that this is done by the.II cexprroutine.In the following discussion the `current register'is generally the argument register to.II cexpr;that is, the place where the result is desired.The `next register' is numbered onehigherthan the current register.(This explanation isn't fully truebecause of complications, described below, involvingoperations which require even-odd register pairs.).IP Fcauses a recursive call tothe.II rcexprroutine to compile code which places the value of the first (left)operand of the operator in the current register..RT.IP F1generates code which places the value of the first operand in thenext register.It is incorrectly used if there might be no next register;that is, if the degree of difficulty of the first operand is not `easy;'if not, another register might not be available..RT.IP FSgenerates code which pushes the value of the first operand on the stack,by calling.II rcexprspecifying.II sptabas the table..LPAnalogously,.IP "S, S1, SS"compile the second (right) operandinto the current register, the next register, or onto the stack..LPTo deal with registers, there are.IP Rwhich expands into the name of the current register..RT.IP R1which expands into the name of the next register..RT.IP R+which expands into the the name of the current register plus 1.It was suggested above that this is the same as the next register,except for complications; here is one of them.Long integer variables have32 bits and require 2 registers; in such cases the next registeris the current register plus 2.The code would like to talk about both halves of thelong quantity, so R refers to the register with the high-order partand R+ to the low-order part..RT.IP R\-This is another complication, involving division and mod.These operations involve a pair of registers of which the odd-numberedcontains the left operand..II Cexpr
cdoc3 - 源码说明

本页面展示了「unix v7是最后一个广泛发布的研究型UNIX版本」中的 cdoc3 源码文件，采用编程语言编写，共 813 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫下载站收录了大量与unix相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?