⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 ss4

📁 UNIX版本6的源代码
💻
字号:
.SHSection 4: Ambiguity, Conflicts, and Precedence.PPA set of grammar rules is.ulambiguousif there is some input string which can be structured in two or more different ways.For example, the grammar rule.DSexpr :  expr \'\-\' expr ;.DEis a natural way of expressing the fact that one way of forming an arithmetic expressionis to put two other expressions together with a minus sign between them.Unfortunately, this grammar rule does notcompletely specify the way that all complex inputsshould be structured.For example, if we have input of the form.DSexpr \- expr \- expr.DEthe rule would permit us to treat this input either as.DS( expr \- expr ) \- expr.DEor as.DSexpr \- ( expr \- expr ).DE(We speak of the first as.ulleft associationof operators, and the second as.ulright association)..PPYacc detects such ambiguities when it is attempting to build the parser.It is instructive to consider the problem that confronts the parser when it isgiven an input such as.DSexpr \- expr \- expr.DEWhen the parser has read the second expr, the input which it has seen:.DSexpr \- expr.DEmatches the right side of the grammar rule above.One valid thing for the parser to do is to.ulreducethe input it has seen by applying this rule;after applying the rule, it would have reduced the input it had already seen to expr (the left side of the rule).It could then read the final part of the input:.DS\- expr.DEand again reduce by the rule.We see that the effect of this is to take the left associative interpretation..PPAlternatively, when the parser has seen.DSexpr \- expr.DEit could defer the immediate application of the rule, and continue reading(the technical term is.ulshifting)the input until it had seen.DSexpr \- expr \- expr.DEIt could then apply the grammar rule to the rightmost three symbols, reducing them to expr and leaving.DSexpr \- expr.DENow it can reduce by the rule again; the effect is totake the right associative interpretation.Thus, having read.DSexpr \- expr.DEthe parser can do two legal things, a shift or a reduction, and has no way ofdeciding between them.We refer to this as a.ulshift/reduce conflict.It may also happen that the parser has a choice of two legal reductions;this is called a.ulreduce/reduce conflict..PPWhen there are shift/reduce or reduce/reduce conflicts, Yacc still produces a parser.It does this by selecting one of the valid steps wherever it has a choice.A rule which describes which choice to make in a given situation is calleda.uldisambiguating rule..PPYacc has two disambiguating rules which are invoked by default,in the absence of any user directives to the contrary:.IP 1.In a shift/reduce conflict, the default is to do the shift..IP 2.In a reduce/reduce conflict, the default is to reduce by the.ulearliergrammar rule (in the input sequence)..PPRule 1 implies that reductions are deferred whenever there is a choice,in favor of shifts.Rule 2 gives the user rather crude control over the behavior of the parserin this situation, but the proper use of reduce/reduce conflicts is still a black art, and isproperly considered an advanced topic..PPConflicts may arise because of mistakes in input or logic, or because the grammar rules, while consistent,require a more complex parser than Yacc can construct.In these cases, the application of disambiguating rules is inappropriate,and leads to a parser which is in error.For this reason, Yaccalways reports the number of shift/reduce and reduce/reduce conflicts which were resolved by Rule 1 and Rule 2..PPIn general, whenever it is possible to apply disambiguating rules to produce a correct parser, it is alsopossible to rewrite the grammar rules so that the same inputs are read, but there are noconflicts.For this reason, most previous systems like Yacchave considered conflicts to be fatal errors.Our experience has suggested that this rewriting is somewhat unnatural to do,and produces slower parsers; thus, Yacc will produce parsers even in the presence of conflicts..PPAs an example of the power of disambiguating rules, consider a fragment from a programminglanguage involving an``if-then-else'' construction:.DSstat :	IF \'(\' cond \')\' stat |	IF \'(\' cond \')\' stat ELSE stat ;.DEHere, we consider IF and ELSE to be tokens, cond to be a nonterminal symbol describingconditional (logical) expressions, andstat to be a nonterminal symbol describing statements.In the following, we shall refer to these two rules as the.ulsimple-ifrule and the.ulif-elserule, respectively..PPThese two rules form an ambiguous construction, since input of the form.DSIF ( C1 ) IF ( C2 ) S1 ELSE S2.DEcan be structured according to these rules in two ways:.DSIF ( C1 ) {	IF ( C2 ) S1}ELSE S2.DEor.DSIF ( C1 ) {	IF ( C2 ) S1	ELSE S2}.DEThe second interpretation is the one given in most programming languageswhich have this construct.Each ELSE is associated with the last preceding ``un-ELSE'd'' IF.In this example, consider the situation where the parser has seen.DSIF ( C1 ) IF ( C2 ) S1.DEand is looking at the ELSE.It can immediately.ulreduceby the simple-if rule to get.DSIF ( C1 ) stat.DEand then read the remaining input,.DSELSE S2.DEand reduce.DSIF ( C1 ) stat ELSE S2.DEby the if-else rule.This leads to the first of the above groupings of the input..PPOn the other hand, we may.ulshiftthe ELSE and read S2, and then reduce the right hand portion of.a.DSIF ( C1 ) IF ( C2 ) S1 ELSE S2.DEby the if-else rule to get.DSIF ( C1 ) stat.DEwhich can be reduced by the simple-if rule.This leads to the second of the above groupings of the input, whichis usually desired..PPOnce again the parser can do two valid things \- we have a shift/reduce conflict.The application of disambiguating rule 1 tells the parser to shift in this case,which leads to the desired grouping..PPNotice that this shift/reduce conflict arises only when there is a particular current input symbol,ELSE, and particular inputs already seen, such as.DSIF ( C1 ) IF ( C2 ) S1.DEIn general, there may be many conflicts, and each onewill be associated with an input symbol anda set of previously read inputs.The previously read inputs are characterized by the.ulstateof the parser, which is assigned a nonnegative integer.The number of states in the parser is typically two to five times the number of grammarrules..PPWhen Yacc is invoked with the verbose(\-v) option (see Appendix B), it produces a file of user outputwhich includes a description of the states in the parser.For example, the output corresponding to the aboveexample might be:.DS L23: shift/reduce Conflict (Shift 45, Reduce 18) on ELSEState 23	stat : IF ( cond ) stat\_	stat : IF ( cond ) stat\_ELSE stat	ELSE	shift 45	.		reduce 18.DEThe first line describes the conflict, giving the state and the input symbol.The state title follows, and a brief description of the grammarrules which are active in this state.The underline ``\_'' describes theportions of the grammar rules which have been seen.Thus in the example, in state 23 we have seen input which correspondsto.DSIF ( cond ) stat.DEand the two grammar rules shown are active at this time.The actions possible are, if the input symbol is ELSE, we may shift into state45.In this state, we should find as part of the description a line of the form.DSstat : IF ( cond ) stat ELSE\_stat.DEbecause in this state we will have read and shifted the ELSE.Back in state 23, the alternative action, described by ``.'',is to be done if the input symbol is not mentioned explicitly in the above actions; thus,in this case, if the input symbol is not ELSE, we should reduce by grammar rule 18,which is presumably.DSstat : IF \'(\' cond \')\' stat.DENotice that the numbers following ``shift'' commands refer to other states,while the numbers following ``reduce'' commands refer to grammarrule numbers.In most states, there will be only one reduce action possible in thestate, and this will always be the default command.The user who encounters unexpected shift/reduce conflicts will probably want tolook at the verbose output to decide whether the default actions are appropriate.In really tough cases, the user might need to know more aboutthe behavior and construction of the parser than can be covered here;in this case, a reference such as [1]might be consulted; the services of a local guru might also be appropriate..PPThere is one common situationwhere the rules given above for resolving conflicts are not sufficient;this is in the area of arithmetic expressions.Most of the commonly used constructions for arithmetic expressions can be naturallydescribed by the notion of.ulprecedencelevels for operators, together with information about leftor right associativity.It turns out that ambiguous grammars with appropriate disambiguating rulescan be used to create parsers which are faster and easier towrite than parsers constructed from unambiguous grammars.The basic notion is to write grammar rulesof the form.DSexpr : expr OP expr.DEand.DSexpr : UNARY expr.DEfor all binary and unary operators desired.This creates a very ambiguous grammar, with many parsing conflicts.As disambiguating rules, the user specifies the precedence, or bindingstrength, of all the operators, and the associativityof the binary operators.This information is sufficient to allow Yacc to resolve the parsing conflictsin accordance with these rules, and construct a parser which realizes the desiredprecedences and associativities..PPThe precedences and associativities are attached to tokens in the declarations section.This is done by a series of lines beginning with a Yacc keyword: %left, %right,or %nonassoc, followed by a list of tokens.All of the tokens on the same line are assumed to have the same precedence leveland associativity; the lines are listed inorder of increasing precedence or binding strength.Thus,.DS%left \'+\' \'\-\'%left \'*\' \'/\'.DEdescribes the precedence and associativity of the four arithmetic operators.Plus and minus are left associative, and have lower precedence thanstar and slash, which are also left associative.The keyword %right is used to describe right associative operators,and the keyword %nonassoc is used to describe operators, likethe operator .LT. in Fortran, which may not associate with themselves; thus,.DSA .LT. B .LT. C.DEis illegal in Fortran, and such an operator would be described with the keyword%nonassoc in Yacc.As an example of the behavior of these declarations, the description.DS%right \'=\'%left \'+\' \'\-\'%left \'*\' \'/\'%%expr :	expr \'=\' expr |	expr \'+\' expr |	expr \'\-\' expr |	expr \'*\' expr |	expr \'/\' expr |	NAME ;.DEmight be used to structure the input.DSa = b = c*d \- e \- f*g.DEas follows:.DSa = ( b = ( ((c*d)\-e) \- (f*g) ) ).DEWhen this mechanism is used,unary operators must, in general, be given a precedence.An interesting situation arises when we have a unary operator and a binary operatorwhich have the same symbolic representation, but different precedences.An example is unary and binary \'\-\'; frequently, unary minus is given the samestrength as multiplication, or even higher, while binary minus has a lower strength thanmultiplication.We can indicate this situation by use ofanother keyword, %prec, to change the precedence level associated with a particular grammar rule.%prec appears immediately after the body of the grammar rule, before the action or closing semicolon,and is followed by a token name or literal; itcauses the precedence of the grammar rule to become that of the token name or literal.Thus, to make unary minus have the same precedence as multiplication, we might write:.DS%left \'+\' \'\-\'%left \'*\' \'/\'%%expr :	expr \'+\' expr |	expr \'\-\' expr |	expr \'*\' expr |	expr \'/\' expr |	\'\-\' expr %prec \'*\' |	NAME ;.DE.PPNotice that the precedences which are describedby %left, %right, and %nonassoc are independent of the declarations of token namesby %token.A symbol can be declared by %token, and, later in the declarations section,be given a precedence and associativity by one of the above methods.It is true, however, that names which are given a precedence or associativity arealso declared to be token names, and so in general do not need to bedeclared by %token, although it does not hurt to do so..PPAs we mentioned above, the precedences and associativities are used by Yacc toresolve parsing conflicts; they give rise to disambiguating rules.Formally, the rules work as follows:.IP 1.The precedences and associativities are recorded for those tokens and literalswhich have them..IP 2.A precedence and associativity is associated with each grammar rule; it is the precedenceand associativity of the last token or literal in the body of the rule.If the %prec construction is used, it overrides this default.Notice that some grammar rules may have no precedence and associativity associated with them..IP 3.When there is a reduce/reduce conflict, or there is a shift/reduce conflictand either the input symbol or the grammar rule, or both, has no precedence and associativityassociated with it, then the two disambiguating rules given at the beginning of the section are used,and the conflicts are reported..IP 4.If there is a shift/reduce conflict, and both the grammar rule and the input characterhave precedence and associativity associated with them, then the conflict is resolvedin favor of the action (shift or reduce) associated with the higher precedence.If the precedences are the same, then the associativity is used; leftassociative implies reduce, right associative implies shift, and nonassociatingimplies error..PPThere are a number of points worth makingabout this use of disambiguation.There is no reporting of conflicts which are resolved by this mechanism,and these conflicts are not counted in the number of shift/reduce and reduce/reduceconflicts found in the grammar.This means that occasionally mistakes in the specification of precedencesdisguise errors in the input grammar; it is a good idea to be sparingwith precedences, and use them in an essentially ``cookbook'' fashion,until some experience has been gained.Frequently, not enough operators or precedences have been specified;this leads to a number of messages about shift/reduce or reduce/reduce conflicts.The cure is usually to specify more precedences, or use the %prec mechanism, or both.It is generally good to examine the verbose output file to ensure thatthe conflicts which are being reported can be validly resolved by precedence.

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -