📄 yacc-docs.txt
字号:
and then read the remaining input, ELSE S2and reduce IF ( C1 ) stat ELSE S2by the if-else rule. This leads to the first of the above group-ings of the input. On the other hand, the ELSE may be shifted, S2 read, andthen the right hand portion of IF ( C1 ) IF ( C2 ) S1 ELSE S2can be reduced by the if-else rule to getYacc: Yet Another Compiler-Compiler PS1:15-19 IF ( C1 ) statwhich can be reduced by the simple-if rule. This leads to thesecond of the above groupings of the input, which is usuallydesired. Once again the parser can do two valid things - there is ashift/reduce conflict. The application of disambiguating rule 1tells the parser to shift in this case, which leads to thedesired grouping. This shift/reduce conflict arises only when there is a par-ticular current input symbol, ELSE, and particular inputs alreadyseen, such as IF ( C1 ) IF ( C2 ) S1In general, there may be many conflicts, and each one will beassociated with an input symbol and a set of previously readinputs. The previously read inputs are characterized by thestate of the parser. The conflict messages of Yacc are best understood by examin-ing the verbose (-v) option output file. For example, the outputcorresponding to the above conflict state might be:23: shift/reduce conflict (shift 45, reduce 18) on ELSEstate 23 stat : IF ( cond ) stat_ (18) stat : IF ( cond ) stat_ELSE stat ELSE shift 45 . reduce 18The first line describes the conflict, giving the state and theinput symbol. The ordinary state description follows, giving thegrammar rules active in the state, and the parser actions.Recall that the underline marks the portion of the grammar ruleswhich has been seen. Thus in the example, in state 23 the parserhas seen input corresponding to IF ( cond ) statand the two grammar rules shown are active at this time. Theparser can do two possible things. If the input symbol is ELSE,it is possible to shift into state 45. State 45 will have, aspart of its description, the line stat : IF ( cond ) stat ELSE_statPS1:15-20 Yacc: Yet Another Compiler-Compilersince the ELSE will have been shifted in this state. Back instate 23, the alternative action, described by ``.'', is to bedone if the input symbol is not mentioned explicitly in the aboveactions; thus, in this case, if the input symbol is not ELSE, theparser reduces by grammar rule 18: stat : IF '(' cond ')' statOnce again, notice that the numbers following ``shift'' commandsrefer to other states, while the numbers following ``reduce''commands refer to grammar rule numbers. In the y.output file,the rule numbers are printed after those rules which can bereduced. In most one states, there will be at most reduce actionpossible in the state, and this will be the default command. Theuser who encounters unexpected shift/reduce conflicts will prob-ably want to look at the verbose output to decide whether thedefault actions are appropriate. In really tough cases, the usermight need to know more about the behavior and construction ofthe parser than can be covered here. In this case, one of thetheoretical references[2, 3, 4] might be consulted; the servicesof a local guru might also be appropriate.6: Precedence There is one common situation where the rules given abovefor resolving conflicts are not sufficient; this is in the pars-ing of arithmetic expressions. Most of the commonly used con-structions for arithmetic expressions can be naturally describedby the notion of precedence levels for operators, together withinformation about left or right associativity. It turns out thatambiguous grammars with appropriate disambiguating rules can beused to create parsers that are faster and easier to write thanparsers constructed from unambiguous grammars. The basic notionis to write grammar rules of the form expr : expr OP exprand expr : UNARY exprfor all binary and unary operators desired. This creates a veryambiguous grammar, with many parsing conflicts. As disambiguat-ing rules, the user specifies the precedence, or bindingstrength, of all the operators, and the associativity of thebinary operators. This information is sufficient to allow Yaccto resolve the parsing conflicts in accordance with these rules,and construct a parser that realizes the desired precedences andassociativities. The precedences and associativities are attached to tokensin the declarations section. This is done by a series of linesbeginning with a Yacc keyword: %left, %right, or %nonassoc, fol-lowed by a list of tokens. All of the tokens on the same lineYacc: Yet Another Compiler-Compiler PS1:15-21are assumed to have the same precedence level and associativity;the lines are listed in order of increasing precedence or bindingstrength. Thus, %left '+' '-' %left '*' '/'describes the precedence and associativity of the four arithmeticoperators. Plus and minus are left associative, and have lowerprecedence than star and slash, which are also left associative.The keyword %right is used to describe right associative opera-tors, and the keyword %nonassoc is used to describe operators,like the operator .LT. in Fortran, that may not associate withthemselves; thus, A .LT. B .LT. Cis illegal in Fortran, and such an operator would be describedwith the keyword %nonassoc in Yacc. As an example of thebehavior of these declarations, the description %right '=' %left '+' '-' %left '*' '/' %% expr : expr '=' expr | expr '+' expr | expr '-' expr | expr '*' expr | expr '/' expr | NAME ;might be used to structure the input a = b = c*d - e - f*gas follows: a = ( b = ( ((c*d)-e) - (f*g) ) )When this mechanism is used, unary operators must, in general, begiven a precedence. Sometimes a unary operator and a binaryoperator have the same symbolic representation, but differentprecedences. An example is unary and binary '-'; unary minus maybe given the same strength as multiplication, or even higher,while binary minus has a lower strength than multiplication. Thekeyword, %prec, changes the precedence level associated with aparticular grammar rule. %prec appears immediately after thebody of the grammar rule, before the action or closing semicolon,and is followed by a token name or literal. It causes the pre-cedence of the grammar rule to become that of the following tokenPS1:15-22 Yacc: Yet Another Compiler-Compilername or literal. For example, to make unary minus have the sameprecedence as multiplication the rules might resemble: %left '+' '-' %left '*' '/' %% expr : expr '+' expr | expr '-' expr | expr '*' expr | expr '/' expr | '-' expr %prec '*' | NAME ; A token declared by %left, %right, and %nonassoc need notbe, but may be, declared by %token as well. The precedences and associativities are used by Yacc toresolve parsing conflicts; they give rise to disambiguatingrules. Formally, the rules work as follows:1. The precedences and associativities are recorded for those tokens and literals that have them.2. A precedence and associativity is associated with each gram- mar rule; it is the precedence and associativity of the last token or literal in the body of the rule. If the %prec con- struction is used, it overrides this default. Some grammar rules may have no precedence and associativity associated with them.3. When there is a reduce/reduce conflict, or there is a shift/reduce conflict and either the input symbol or the grammar rule has no precedence and associativity, then the two disambiguating rules given at the beginning of the sec- tion are used, and the conflicts are reported.4. If there is a shift/reduce conflict, and both the grammar rule and the input character have precedence and associa- tivity associated with them, then the conflict is resolved in favor of the action (shift or reduce) associated with the higher precedence. If the precedences are the same, then the associativity is used; left associative implies reduce, right associative implies shift, and nonassociating implies error. Conflicts resolved by precedence are not counted in thenumber of shift/reduce and reduce/reduce conflicts reported byYacc. This means that mistakes in the specification of pre-cedences may disguise errors in the input grammar; it is a goodidea to be sparing with precedences, and use them in anYacc: Yet Another Compiler-Compiler PS1:15-23essentially ``cookbook'' fashion, until some experience has beengained. The y.output file is very useful in deciding whether theparser is actually doing what was intended.7: Error Handling Error handling is an extremely difficult area, and many ofthe problems are semantic ones. When an error is found, forexample, it may be necessary to reclaim parse tree storage,delete or alter symbol table entries, and, typically, setswitches to avoid generating any further output. It is seldom acceptable to stop all processing when an erroris found; it is more useful to continue scanning the input tofind further syntax errors. This leads to the problem of gettingthe parser ``restarted'' after an error. A general class ofalgorithms to do this involves discarding a number of tokens fromthe input string, and attempting to adjust the parser so thatinput can continue. To allow the user some control over this process, Yacc pro-vides a simple, but reasonably general, feature. The token name``error'' is reserved for error handling. This name can be usedin grammar rules; in effect, it suggests places where errors areexpected, and recovery might take place. The parser pops itsstack until it enters a state where the token ``error'' is legal.It then behaves as if the token ``error'' were the current looka-head token, and performs the action encountered. The lookaheadtoken is then reset to the token that caused the error. If nospecial error rules have been specified, the processing haltswhen an error is detected. In order to prevent a cascade of error messages, the parser,after detecting an error, remains in error state until threetokens have been successfully read and shifted. If an error isdetected when the parser is already in error state, no message isgiven, and the input token is quietly deleted. As an example, a rule of the form stat : errorwould, in effect, mean that on a syntax error the parser wouldattempt to skip over the statement in which the error was seen.More precisely, the parser will scan ahead, looking for threetokens that might legally follow a statement, and start process-ing at the first of these; if the beginnings of statements arenot sufficiently distinctive, it may make a false start in themiddle of a statement, and end up reporting a second error wherethere is in fact no error. Actions may be used with these special error rules. Theseactions might attempt to reinitialize tables, reclaim symboltable space, etc.PS1:15-24 Yacc: Yet Another Compiler-Compiler Error rules such as the above are very general, but diffi-cult to control. Somewhat easier are rules such as stat : error ';'Here, when there is an error, the parser attempts to skip overthe statement, but will do so by skipping to the next ';'. Alltokens after the error and before the next ';' cannot be shifted,and are discarded. When the ';' is seen, this rule will bereduced, and any ``cleanup'' action associated with it performed. Another form of error rule arises in interactive applica-tions, where it may be desirable to permit a line to be reenteredafter an error. A possible error rule might be input : error '\n' { printf( "Reenter last line: " ); } input { $$ = $4; }There is one potential difficulty with this approach; the parsermust correctly process three input tokens before it admits thatit has correctly resynchronized after the error. If the reen-tered line contains an error in the first two tokens, the parserdeletes the offending tokens, and gives no message; this isclearly unacceptable. For this reason, there is a mechanism thatcan be used to force the parser to believe that an error has beenfully recovered from. The statement yyerrok ;
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -