internal.doc
instructions, usually an instruction type followed by some information.
For instance, a VTC function call is stored as three instructions: a
type instruction I_FCALL, an integer containing the argument count, and
a pointer to the Func structure containing the function.

Although the interpreter is reentrant, each call to the interpreter is
treated as a separate running program.  This is important for the
primitives read(), getch(), sleep(), and abort(), which suspend or
abort the currently-running program.  For instance, parse() often
results in a call to interp(), and if a read() occurs at some point
while this program is executing, only the data and call stack entries
for this new program (i.e. down to the topmost detached call frame)
will be suspended, and the function that called parse() will continue.
If interp() is called without a new detached call frame being placed
on the call stack, VT will shortly crash.

According to this design, the callv() primitive does not call interp()
but rather modifies the top of the call and data stacks and then
returns with a message to the interpreter not to execute its normal
cleanup sequence.  If it called interp() instead, callv() would act
like detach(), which is intended to create an independent thread.

The call frames on the call stack contain pointers and indices into
the data stack, which can be visualized like so:

            .
            .
        |   .   |
        |-------| <-- dpos
        |       |
        |-------|      (Frames used by program algorithm)
        |       |
        |-------| <-- lvars
        |       |
        |-------|      (Local variable values)
        |       |
        |-------| <-- lvars->vals
        |       |
        |-------|      (Parameter variable values)
        |       |
        |-------| <-- avars->vals (indexed by dstart)
        |   .   |
            .
            .

The dstart element of the call frame is the index of the frame pointed
to by avars->vals, or the start of the data frames used by the call
frame.  The cpush() call initializes a new call frame.  It assumes
that the parameters have already been pushed on the stack.  It extends
the stack by prog->lvarc entries to make room for the local variable
values.  The "progress" argument to the cpush() call is non-zero only
for programs generated by the prog_pcall() call in prmt4.c.  These
pseudo-programs have no local or argument variables, and their only
instruction is a call to a primitive.  Thus, they act like a program
with no local or parameter variables that has already pushed several
values onto the stack and is ready to pop them off with a primitive
call.  All other functions begin with dpos indexing the data frame
immediately after the local variables.  If cpush() is called with a
non-zero progress argument, the program should have no local variables
or parameter variables.

When a call frame exits normally, the program has left one data frame
on the stack on top of its argument and local variables.
cpop_normal() removes the variables and moves the top frame down to
the first frame of the argument space.  The net effect of interpreting
a call frame is thus to replace its parameter variables with a return
value, which is the same effect as calling a primitive.
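The real structure and routine definitions live in the VT source; the
following is only a minimal sketch of the bookkeeping described above.
The names cpush_sketch(), cpop_normal_sketch(), dstack, Datum, and the
struct layouts are invented for illustration; dpos, dstart, and
prog->lvarc are the quantities named in the text.

    /* Sketch only: the real VT structures and names differ.  This just
     * illustrates the stack bookkeeping described above. */
    typedef int Datum;                 /* stand-in for a data stack entry  */

    struct prog  { int lvarc; };       /* lvarc: number of local variables */
    struct frame { int dstart; struct prog *prog; };

    static Datum dstack[1024];         /* the data stack                   */
    static int   dpos;                 /* index one past the top frame     */

    /* cpush(): the parameters are already on the data stack; remember
     * where they start and reserve room for the locals above them. */
    static void cpush_sketch(struct frame *f, struct prog *p, int argc)
    {
        f->prog   = p;
        f->dstart = dpos - argc;    /* first parameter value (avars->vals) */
        dpos     += p->lvarc;       /* space for the local variable values */
    }

    /* cpop_normal(): the program has left one result frame on top of its
     * locals and parameters; slide it down so that it replaces them, the
     * same effect a primitive call has. */
    static void cpop_normal_sketch(struct frame *f)
    {
        dstack[f->dstart] = dstack[dpos - 1];
        dpos = f->dstart + 1;
    }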
If the program aborts prematurely, the interpreter simply cleans up
pointers to the arrays in the call frames and makes a single call to
deref_frames() to remove all the data frames from the dstart element
of the most recent detached data frame up to the top of the stack.

Parser
------

The parser has to overcome several difficult aspects of the VT design:
first, it must produce linear code rather than parse trees; second, it
must be able to determine which types of expressions can be used as
lvalues (most C-ish extension language compilers do not have pointers,
which greatly simplifies this problem); third, at the tokenizing level
it must be able to accept its input in multiple chunks spaced out in
time, and determine before it starts compiling when a parser directive
is finished.

The primary difficulty in producing linear code is dealing with jumps.
The parser maintains a jump table and two stacks, one for conditional
flow control and one for loops, which are more complicated.  When the
parser comes to a point in the code where jumps are required, it
allocates space in the jump table and pushes the current position in
the jump table onto one of the stacks.  The macro interface handles
these operations fairly transparently.  For conditionals, we call
'Jmp(t);', where t is the type of jump (e.g. I_JMPF), and then we call
'Dest;' when we arrive at the destination of the jump, or 'Destnext;'
if it is more convenient to pop the conditional stack one jump
instruction ahead of time.  For loops, we call Incloop(n), where n is
the number of jump destinations required for the loop.  Then we call
Ldest(k) to set destination #k, and Ljmp(k, t) to jump to destination
#k (t is, again, something like I_JMPF).

Because we produce linear code, it is most expedient to handle lvalues
in the syntax.  This produces a second shift-reduce conflict (beyond
the usual if-then-else conflict), arising from the rule "postlval: '('
lval ')'", since according to the grammar the lval could instead be
reduced to an expr and then to an atom with the parentheses.  Another
unpleasant side effect of handling this in the grammar is that yacc's
explicit precedence rules lose much of their effectiveness, since they
do not operate once a group of tokens has been reduced to an lval.
This results in a more complicated expression syntax, but the linear
design of the code precludes handling lvalue-checking in the
semantics.

Since we do not receive our text input all at once, we keep a token
buffer.  The parsing routine, rather than passing tokens directly to
yyparse(), is called directly by the parse() primitive or from the
input window in VTC mode, and places the resulting tokens in the
buffer.  At the end of each line the parser makes an educated guess at
whether a parser directive is complete, based on the balance of the
'{' and '}' tokens and on whether the "func" keyword is at the
beginning of the token buffer.  The yylex() function simply reads out
of the token buffer until it is empty.

The reduction rules write the information for some instructions in a
temporary form that differs from the final form of the instructions.
They write jump instructions as integer offsets instead of the Instr *
pointers used in the final form, because the parser does not know
while it is compiling where the instructions will be stored, and they
store all references to identifiers as strings in order to avoid
adding to the function and variable dictionaries in case of error.
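As a rough illustration of the two representations, a sketch follows.
The real Instr type and the function dictionary are defined in the VT
source and may look quite different; the union layout, the field names,
and fix_jump() below are invented, while Instr, Func, I_FCALL, and
I_JMPF are the names used in the text.

    /* Sketch of the two forms described above; not the actual VT types.
     * Each slot in a program is one "instruction": an opcode, an
     * integer, a pointer, or, before scan() runs, a temporary offset
     * or identifier name. */
    typedef union instr {
        int          type;    /* e.g. I_FCALL, I_JMPF                        */
        int          count;   /* e.g. the argument count after an I_FCALL    */
        int          offset;  /* intermediate form: jump target as an offset */
        union instr *dest;    /* final form: jump target as an Instr *       */
        char        *name;    /* intermediate form: identifier as a string   */
        struct func *func;    /* final form: Func found in the dictionary    */
    } Instr;

    /* scan()-style fixup of one jump argument: once the code's final
     * location is known, the stored offset can become a real pointer. */
    static void fix_jump(Instr *code, Instr *jumparg)
    {
        jumparg->dest = code + jumparg->offset;
    }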
The function scan() converts these intermediate forms into the final
forms.

The compiler makes a simple optimization for a[0] so that it produces
code for *a instead of *(a+0).  This results in a third shift-reduce
conflict because it is done in the grammar.

Efficiency notes for VTC code
-----------------------------

String lengths are stored in memory along with strings, so strlen()
does not need to count string elements one by one.  Also, separate
copies of strings are stored in the same physical location in memory
as often as possible.  For instance, the strdup() primitive does not
immediately copy its string argument into another location in memory;
it waits until one of the copies of the string is modified.  Thus,
avoiding strlen() or strdup() calls will not save a significant amount
of time or space.

If we pass strdup() a pointer to the middle of a string, it will still
make a copy of the whole string if the string is later modified.  If
we are going to store a copy of a substring, this could result in a
considerable space overhead.  So it is better to use something like:

    substring = strcpy("", s, n);

where s is the pointer into the middle of the string and n is the
length of the substring.
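One plausible way to get the sharing behavior described above is a
reference-counted, copy-on-write buffer.  The sketch below (in C) is
purely illustrative: struct vstr, vstr_dup(), and vstr_set() are
invented names, error checking is omitted, and the real VT string
representation is not shown here.

    /* Illustrative copy-on-write string, not the actual VT
     * representation.  The stored length makes character-counting
     * unnecessary, and "copies" share one buffer until modified. */
    #include <stdlib.h>
    #include <string.h>

    struct vstr {
        char *buf;      /* the characters                        */
        int   len;      /* length stored alongside the data      */
        int   refs;     /* how many logical copies share buf     */
    };

    /* Duplicate: just add a reference; no characters are copied yet. */
    static struct vstr *vstr_dup(struct vstr *s)
    {
        s->refs++;
        return s;
    }

    /* Modify one character; copy the buffer first if it is shared. */
    static void vstr_set(struct vstr **sp, int i, char c)
    {
        struct vstr *s = *sp;
        if (s->refs > 1) {                 /* shared: make a private copy */
            struct vstr *t = malloc(sizeof *t);
            t->len  = s->len;
            t->refs = 1;
            t->buf  = malloc((size_t)s->len + 1);
            memcpy(t->buf, s->buf, (size_t)s->len + 1);
            s->refs--;
            *sp = s = t;
        }
        s->buf[i] = c;
    }

Under a scheme like this, a duplicate costs only a reference-count
bump, which is why the note above says avoiding strdup() buys little;
the real copy is paid for only when one of the sharers writes.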