📄 yacc-docs.txt
字号:
yyval is copied onto the value stack. The pseudo-variables $1,$2, etc., refer to the value stack. The other two parser actions are conceptually much simpler.The accept action indicates that the entire input has been seenand that it matches the specification. This action appears onlywhen the lookahead token is the endmarker, and indicates that theparser has successfully done its job. The error action, on theother hand, represents a place where the parser can no longercontinue parsing according to the specification. The inputYacc: Yet Another Compiler-Compiler PS1:15-13tokens it has seen, together with the lookahead token, cannot befollowed by anything that would result in a legal input. Theparser reports an error, and attempts to recover the situationand resume parsing: the error recovery (as opposed to the detec-tion of error) will be covered in Section 7. It is time for an example! Consider the specification %token DING DONG DELL %% rhyme : sound place ; sound : DING DONG ; place : DELL ; When Yacc is invoked with the -v option, a file calledy.output is produced, with a human-readable description of theparser. The y.output file corresponding to the above grammar(with some statistics stripped off the end) is:PS1:15-14 Yacc: Yet Another Compiler-Compiler state 0 $accept : _rhyme $end DING shift 3 . error rhyme goto 1 sound goto 2 state 1 $accept : rhyme_$end $end accept . error state 2 rhyme : sound_place DELL shift 5 . error place goto 4 state 3 sound : DING_DONG DONG shift 6 . error state 4 rhyme : sound place_ (1) . reduce 1 state 5 place : DELL_ (3) . reduce 3 state 6 sound : DING DONG_ (2) . reduce 2Notice that, in addition to the actions for each state, there isa description of the parsing rules being processed in each state.The _ character is used to indicate what has been seen, and whatis yet to come, in each rule. Suppose the input is DING DONG DELLIt is instructive to follow the steps of the parser while pro-cessing this input.Yacc: Yet Another Compiler-Compiler PS1:15-15 Initially, the current state is state 0. The parser needsto refer to the input in order to decide between the actionsavailable in state 0, so the first token, DING, is read, becomingthe lookahead token. The action in state 0 on DING is is ``shift3'', so state 3 is pushed onto the stack, and the lookahead tokenis cleared. State 3 becomes the current state. The next token,DONG, is read, becoming the lookahead token. The action in state3 on the token DONG is ``shift 6'', so state 6 is pushed onto thestack, and the lookahead is cleared. The stack now contains 0,3, and 6. In state 6, without even consulting the lookahead, theparser reduces by rule 2. sound : DING DONGThis rule has two symbols on the right hand side, so two states,6 and 3, are popped off of the stack, uncovering state 0. Con-sulting the description of state 0, looking for a goto on sound, sound goto 2is obtained; thus state 2 is pushed onto the stack, becoming thecurrent state. In state 2, the next token, DELL, must be read. The actionis ``shift 5'', so state 5 is pushed onto the stack, which nowhas 0, 2, and 5 on it, and the lookahead token is cleared. Instate 5, the only action is to reduce by rule 3. This has onesymbol on the right hand side, so one state, 5, is popped off,and state 2 is uncovered. The goto in state 2 on place, the leftside of rule 3, is state 4. Now, the stack contains 0, 2, and 4.In state 4, the only action is to reduce by rule 1. There aretwo symbols on the right, so the top two states are popped off,uncovering state 0 again. In state 0, there is a goto on rhymecausing the parser to enter state 1. In state 1, the input isread; the endmarker is obtained, indicated by ``$end'' in they.output file. The action in state 1 when the endmarker is seenis to accept, successfully ending the parse. The reader is urged to consider how the parser works whenconfronted with such incorrect strings as DING DONG DONG, DINGDONG, DING DONG DELL DELL, etc. A few minutes spend with thisand other simple examples will probably be repaid when problemsarise in more complicated contexts.5: Ambiguity and Conflicts A set of grammar rules is ambiguous if there is some inputstring that can be structured in two or more different ways. Forexample, the grammar rule expr : expr '-' expris a natural way of expressing the fact that one way of formingan arithmetic expression is to put two other expressions togetherPS1:15-16 Yacc: Yet Another Compiler-Compilerwith a minus sign between them. Unfortunately, this grammar ruledoes not completely specify the way that all complex inputsshould be structured. For example, if the input is expr - expr - exprthe rule allows this input to be structured as either ( expr - expr ) - expror as expr - ( expr - expr )(The first is called left association, the second right associa-tion). Yacc detects such ambiguities when it is attempting to buildthe parser. It is instructive to consider the problem that con-fronts the parser when it is given an input such as expr - expr - exprWhen the parser has read the second expr, the input that it hasseen: expr - exprmatches the right side of the grammar rule above. The parsercould reduce the input by applying this rule; after applying therule; the input is reduced to expr(the left side of the rule).The parser would then read the final part of the input: - exprand again reduce. The effect of this is to take the left associ-ative interpretation. Alternatively, when the parser has seen expr - exprit could defer the immediate application of the rule, and con-tinue reading the input until it had seen expr - expr - exprIt could then apply the rule to the rightmost three symbols,reducing them to expr and leaving expr - exprNow the rule can be reduced once more; the effect is to take theright associative interpretation. Thus, having readYacc: Yet Another Compiler-Compiler PS1:15-17 expr - exprthe parser can do two legal things, a shift or a reduction, andhas no way of deciding between them. This is called a shift /reduce conflict. It may also happen that the parser has a choiceof two legal reductions; this is called a reduce / reduce con-flict. Note that there are never any ``Shift/shift'' conflicts. When there are shift/reduce or reduce/reduce conflicts, Yaccstill produces a parser. It does this by selecting one of thevalid steps wherever it has a choice. A rule describing whichchoice to make in a given situation is called a disambiguatingrule. Yacc invokes two disambiguating rules by default:1. In a shift/reduce conflict, the default is to do the shift.2. In a reduce/reduce conflict, the default is to reduce by the earlier grammar rule (in the input sequence). Rule 1 implies that reductions are deferred whenever thereis a choice, in favor of shifts. Rule 2 gives the user rathercrude control over the behavior of the parser in this situation,but reduce/reduce conflicts should be avoided whenever possible. Conflicts may arise because of mistakes in input or logic,or because the grammar rules, while consistent, require a morecomplex parser than Yacc can construct. The use of actionswithin rules can also cause conflicts, if the action must be donebefore the parser can be sure which rule is being recognized. Inthese cases, the application of disambiguating rules is inap-propriate, and leads to an incorrect parser. For this reason,Yacc always reports the number of shift/reduce and reduce/reduceconflicts resolved by Rule 1 and Rule 2. In general, whenever it is possible to apply disambiguatingrules to produce a correct parser, it is also possible to rewritethe grammar rules so that the same inputs are read but there areno conflicts. For this reason, most previous parser generatorshave considered conflicts to be fatal errors. Our experience hassuggested that this rewriting is somewhat unnatural, and producesslower parsers; thus, Yacc will produce parsers even in the pres-ence of conflicts. As an example of the power of disambiguating rules, considera fragment from a programming language involving an ``if-then-else'' construction: stat : IF '(' cond ')' stat | IF '(' cond ')' stat ELSE stat ;PS1:15-18 Yacc: Yet Another Compiler-CompilerIn these rules, IF and ELSE are tokens, cond is a nonterminalsymbol describing conditional (logical) expressions, and stat isa nonterminal symbol describing statements. The first rule willbe called the simple-if rule, and the second the if-else rule. These two rules form an ambiguous construction, since inputof the form IF ( C1 ) IF ( C2 ) S1 ELSE S2can be structured according to these rules in two ways: IF ( C1 ) { IF ( C2 ) S1 } ELSE S2or IF ( C1 ) { IF ( C2 ) S1 ELSE S2 }The second interpretation is the one given in most programminglanguages having this construct. Each ELSE is associated withthe last preceding ``un-ELSE'd'' IF. In this example, considerthe situation where the parser has seen IF ( C1 ) IF ( C2 ) S1and is looking at the ELSE. It can immediately reduce by thesimple-if rule to get IF ( C1 ) stat
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -