
📄 porttour1

📁 UNIX V7 was the last widely distributed release of Research UNIX
initializer occupies exactly the right size.
.PP
Character strings represent a bit of an exception.
If a character string is seen as the initializer for
a pointer, the characters making up the string must
be put out under a different location counter.
When the lexical analyzer sees the quote at the head
of a character string, it returns the token STRING,
but does not do anything with the contents.
The parser calls
.I getstr ,
which sets up the appropriate location counters
and flags, and calls
.I lxstr
to read and process the contents of the string.
.PP
If the string is being used to initialize a character array,
.I lxstr
calls
.I putbyte ,
which in effect simulates
.I doinit
for each character read.
If the string is used to initialize a character pointer,
.I lxstr
calls a machine dependent routine,
.I bycode ,
which stashes away each character.
The pointer to this string is then returned,
and processed normally by
.I doinit .
.PP
The null at the end of the string is treated as if it
were read explicitly by
.I lxstr .
.SH
Statements
.PP
The first pass addresses four main areas: declarations, expressions, initialization, and statements.
The statement processing is relatively simple; most of it is carried out in the
parser directly.
Most of the logic is concerned with allocating
label numbers, defining the labels, and branching appropriately.
An external symbol,
.I reached ,
is 1 if a statement can be reached, 0 otherwise; this is
used to do a bit of simple flow analysis as the program is being parsed,
and also to avoid generating the subroutine return sequence if the subroutine
cannot ``fall through'' the last statement.
.PP
Conditional branches are handled by generating an expression
node, CBRANCH,
whose left descendant is the conditional expression and the
right descendant is an ICON node containing the internal label
number to be branched to.
For efficiency, the semantics are that
the branch to the label is taken if the condition is
.I false .
.PP
The switch statement is compiled by collecting the case entries, and an indication as
to whether
there is a default case;
an internal label number is generated for each of these,
and remembered in a big array.
The expression comprising the value to be switched on is
compiled when the switch keyword is encountered,
but the expression tree is headed by
a special node, FORCE, which tells the code generator to
put the expression value into a special distinguished
register (this same mechanism is used for processing the
return statement).
When the end of the switch block is reached, the array
containing the case values is sorted, and checked for
duplicate entries (an error); if all is
correct, the machine dependent routine
.I genswitch
is called, with this array of labels and values in increasing order.
.I Genswitch
can assume that the value to be tested is already in the
register which is the usual integer return value register.
.SH
Optimization
.PP
There is a machine independent file,
.I optim.c ,
which contains a relatively short optimization routine,
.I optim .
Actually the word optimization is something of a misnomer;
the results are not optimum, only improved, and the
routine is in fact not optional; it must
be called for proper operation of the compiler.
.PP
.I Optim
is called after an expression tree is built, but
before the code generator is called.
The essential part of its job is to call
.I clocal
on the conversion operators.
On most machines, the treatment of
& is also essential:
by this time in the processing, the only node which
is a legal descendant of & is NAME.
(Possible descendants of * have been eliminated by
.I buildtree .)
The address of a static name is, almost by definition, a
constant, and can be represented by an ICON node on most machines
(provided that the loader has enough power).
Unfortunately, this is not universally true; on some machines, such as the IBM 370,
the issue of addressability rears its ugly head;
thus, before turning a NAME node into an ICON node,
the machine dependent function
.I andable
is called.
.PP
The optimization attempts of
.I optim
are currently quite
limited.
It is primarily concerned with improving the behavior of
the compiler with operations one of whose arguments is a constant.
In the simplest case, the constant is placed on the right if the
operation is commutative.
The compiler also makes a limited search for expressions
such as
.DS
.I "( x + a ) + b"
.DE
where
.I a
and
.I b
are constants, and attempts to combine
.I a
and
.I b
at compile time.
A number of special cases are also examined;
additions of 0 and multiplications by 1 are removed,
although the correct processing of these cases to get
the type of the resulting tree correct is
decidedly nontrivial.
In some cases, the addition or multiplication must be replaced by
a conversion op to keep the types from becoming
fouled up.
Also, in cases where a relational operation is being done,
and one operand is a constant, the operands are permuted, and the operator altered, if necessary,
to put the constant on the right.
Finally, multiplications by a power of 2 are changed to shifts.
.PP
There are dozens of similar optimizations that can be, and should be,
done.
It seems likely that this routine will be expanded in the relatively near future.
.SH
Machine Dependent Stuff
.PP
A number of the first pass machine dependent routines have been discussed above.
In general, the routines are short, and easy to adapt from
machine to machine.
The two exceptions to this general rule are
.I clocal
and
the function prolog and epilog generation routines,
.I bfcode
and
.I efcode .
.PP
.I Clocal
has the job of rewriting, if appropriate and desirable,
the nodes constructed by
.I buildtree .
There are two major areas where this
is important:
NAME nodes and conversion operations.
In the case of NAME nodes,
.I clocal
must rewrite the NAME node to reflect the
actual physical location of the name in the machine.
In effect, the NAME node must be examined, the symbol table
entry found (through the
.I rval
field of the node),
and, based on the storage class of the node,
the tree must be rewritten.
Automatic variables and parameters are
typically
rewritten by treating the reference to the variable as
a structure reference, off the register which
holds the stack or argument pointer;
the
.I stref
routine is set up to be called in this way, and to
build the appropriate tree.
In the most general case, the tree consists
of a unary * node, whose descendant is
a + node, with the stack or argument register as left operand,
and a constant offset as right operand.
In the case of LABEL and internal static nodes, the
.I rval
field is rewritten to be the negative of the internal
label number; a negative
.I rval
field is taken to be an internal label number.
Finally, a name of class REGISTER must be converted into a REG node,
and the
.I rval
field replaced by the register number.
In fact, this part of the
.I clocal
routine is nearly machine independent; only for machines
with addressability problems (IBM 370 again!) does it
have to be noticeably different.
.PP
The conversion operator treatment is rather tricky.
It is necessary to handle the application of conversion operators
to constants in
.I clocal ,
in order that all constant expressions can have their values known
at compile time.
In extreme cases, this may mean that some simulation of the
arithmetic of the target machine might have to be done in a
cross-compiler.
In the most common case,
conversions from pointer to pointer do nothing.
For some machines, however, conversion from byte pointer to short or long
pointer might require a shift or rotate operation, which would
have to be generated here.
.PP
The extension of the portable compiler to machines where the size of a pointer
depends on its type would be straightforward, but has not yet been done.
.PP
The other major machine dependent issue involves the subroutine prolog and epilog
generation.
The hard part here is the design of the stack frame
and calling sequence; this design issue is discussed elsewhere.
.[
Johnson Lesk Ritchie calling sequence
.]
The routine
.I bfcode
is called with the number of arguments
the function is defined with, and
an array
containing the symbol table indices of the
declared parameters.
.I Bfcode
must generate the code to establish the new stack frame,
save the return address and previous stack pointer
value on the stack, and save whatever
registers are to be used for register variables.
The stack size and the number of register variables are not
known when
.I bfcode
is called, so these numbers must be
referred to by assembler constants, which are
defined when they are known (usually in the second pass,
after all register variables, automatics, and temporaries have been seen).
The final job is to find those parameters which may have been declared
register, and generate the code to initialize
the register with the value passed on the stack.
Once again, for most machines, the general logic of
.I bfcode
remains the same, but the contents of the
.I printf
calls in it will change from machine to machine.
.I Efcode
is rather simpler, having just to generate the default
return at the end of a function.
This may be nontrivial in the case of a function returning a structure or union, however.
.PP
There seems to be no really good place to discuss structures and unions, but
this is as good a place as any.
The C language now supports structure assignment,
and the passing of structures as arguments to functions,
and the receiving of structures back from functions.
This was added rather late to C, and thus to the portable compiler.
Consequently, it fits in less well than the older features.
Moreover, most of the burden of making these features work is
placed on the machine dependent code.
.PP
There are both conceptual and practical problems.
Conceptually, the compiler is structured around
the idea that to compute something, you put it into
a register and work on it.
This notion causes a bit of trouble on some machines (e.g., machines with 3-address opcodes), but
matches many machines quite well.
Unfortunately, this notion breaks down with structures.
The closest that one can come is to keep the addresses of the
structures in registers.
The
actual code sequences used to move structures vary from the
trivial (a multiple byte move) to the horrible (a
function call), and are very machine dependent.
.PP
The practical problem is more painful.
When a function returning a structure is called, this function
has to have some place to put the structure value.
If it places it on the stack, it has difficulty popping its stack frame.
If it places the value in a static temporary, the routine fails to be
reentrant.
The most logically consistent way of implementing this is for the
caller to pass in a pointer to a spot where the called function
should put the value before returning.
This is relatively straightforward, although a bit tedious, to implement,
but means that the caller must have properly declared
the function type, even if the value is never used.
On some machines, such as the Interdata 8/32, the return value
simply overlays the argument region (which on the 8/32 is part
of the caller's stack frame).
The caller takes care of leaving enough room if the returned value is larger
than the arguments.
This also assumes that the caller knows and declares the
function properly.
.PP
The PDP-11 and the VAX have stack hardware which is used in function calls and returns;
this makes it very inconvenient to
use either of the above mechanisms.
In these machines, a static area within the called function
is allocated, and
the function return value is copied into it on return; the function
returns the address of that region.
This is simple to implement, but is non-reentrant.
However, the function can now be called as a subroutine
without being properly declared, without the disaster which would otherwise ensue.
No matter what choice is taken, the convention is that the function
actually returns the address of the return structure value.
.PP
In building expression trees, the portable compiler takes a bit for granted about
structures.
It assumes that functions returning structures
actually return a pointer to the structure, and it assumes that
a reference to a
structure is actually a reference to its address.
The structure assignment operator is rebuilt so that the left
operand is the structure being assigned to, but the
right operand is the address of the structure being assigned;
this makes it easier to deal with
.DS
.I "a = b = c"
.DE
and similar constructions.
.PP
There are four special tree nodes associated with these
operations:
STASG (structure assignment), STARG (structure argument
to a function call), and STCALL and UNARY STCALL
(calls of a function with nonzero and zero arguments, respectively).
These four nodes are unique in that the size and alignment information, which can be determined by
the type for all other objects in C, must be known to carry out these operations; special
fields are set aside in these nodes to contain
this information, and special
intermediate code is used to transmit this information.
.SH
First Pass Summary
.PP
There are many other issues which have been ignored here,
partly to justify the title ``tour'', and partly
because they have seemed to cause little trouble.
There are some debugging flags
which may be turned on, by giving the compiler's first pass
the argument
.DS
\-X[flags]
.DE
Some of the more interesting flags are
\-Xd for the defining and freeing of symbols,
\-Xi for initialization comments, and
\-Xb for various comments about the building of trees.
In many cases, repeating the flag more than once gives more information;
thus,
\-Xddd gives more information than \-Xd.
In the two pass version of the compiler, the
flags should not be set when the output is sent to the second
pass, since the debugging output and the intermediate code both go onto the standard
output.
.PP
We turn now to consideration of the second pass.
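.PP
As an aside on the switch processing described under Statements,
the step performed at the end of a switch block (sort the case values,
reject duplicates, hand the table on in increasing order)
might be sketched as follows.
This is only an illustration under stated assumptions, not the compiler's own code:
the names swent, cmp_case and sort_cases are hypothetical,
and the standard library qsort stands in for whatever sort the compiler actually uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical table entry: one case value and its internal label number. */
struct swent {
    long val;
    int label;
};

/* Order entries by case value, ascending. */
static int cmp_case(const void *a, const void *b)
{
    const struct swent *x = a, *y = b;
    return (x->val > y->val) - (x->val < y->val);
}

/*
 * Sort the case table in place.
 * Returns 0 if all case values are distinct,
 * -1 if a duplicate is found (an error in the source program).
 */
int sort_cases(struct swent *tab, size_t n)
{
    size_t i;

    assert(tab != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(tab, n, sizeof tab[0], cmp_case);
    for (i = 1; i < n; i++)
        if (tab[i].val == tab[i - 1].val)
            return -1;      /* duplicate case value */
    return 0;
}
```

After a successful return the table is in the form that a routine such as genswitch is described as expecting: labels and values in increasing order of value.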
