⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 yacc-docs.txt

📁 句法分析器。一般用文法(grammar)来刻画.常见的是短语结构文法(chomsky hierarchy),其中最常用的是上下文无关文法(CFG)。
💻 TXT
📖 第 1 页 / 共 5 页
字号:

     In effect, the reduce action ``turns back the clock'' in the
parse,  popping  the states off the stack to go back to the state
where the right hand side of the rule was first seen.  The parser
then  behaves  as  if it had seen the left side at that time.  If
the right hand side of the rule is empty, no  states  are  popped
off  of  the  stack:  the  uncovered state is in fact the current
state.

     The reduce action is also  important  in  the  treatment  of
user-supplied  actions  and  values.  When a rule is reduced, the
code supplied with the rule  is  executed  before  the  stack  is
adjusted.   In  addition to the stack holding the states, another
stack, running in parallel with it,  holds  the  values  returned
from  the  lexical  analyzer and the actions.  When a shift takes
place, the external variable yylval  is  copied  onto  the  value
stack.   After  the  return  from the user code, the reduction is
carried out.  When the goto action is done, the external variable
yyval  is  copied onto the value stack.  The pseudo-variables $1,
$2, etc., refer to the value stack.

     The other two parser actions are conceptually much  simpler.
The  accept  action indicates that the entire input has been seen
and that it matches the specification.  This action appears  only
when the lookahead token is the endmarker, and indicates that the
parser has successfully done its job.  The error action,  on  the
other  hand,  represents  a  place where the parser can no longer
continue parsing  according  to  the  specification.   The  input









Yacc: Yet Another Compiler-Compiler                     PS1:15-13


tokens  it has seen, together with the lookahead token, cannot be
followed by anything that would result in  a  legal  input.   The
parser  reports  an  error, and attempts to recover the situation
and resume parsing: the error recovery (as opposed to the  detec-
tion of error) will be covered in Section 7.

     It is time for an example!  Consider the specification

        %token  DING  DONG  DELL
        %%
        rhyme   :       sound  place
                ;
        sound   :       DING  DONG
                ;
        place   :       DELL
                ;


     When Yacc is invoked with  the  -v  option,  a  file  called
y.output  is  produced,  with a human-readable description of the
parser.  The y.output file corresponding  to  the  above  grammar
(with some statistics stripped off the end) is:









































PS1:15-14                     Yacc: Yet Another Compiler-Compiler



        state 0
                $accept  :  _rhyme  $end

                DING  shift 3
                .  error

                rhyme  goto 1
                sound  goto 2

        state 1
                $accept  :   rhyme_$end

                $end  accept
                .  error

        state 2
                rhyme  :   sound_place

                DELL  shift 5
                .  error

                place   goto 4

        state 3
                sound   :   DING_DONG

                DONG  shift 6
                .  error

        state 4
                rhyme  :   sound  place_    (1)

                .   reduce  1

        state 5
                place  :   DELL_    (3)

                .   reduce  3

        state 6
                sound   :   DING  DONG_    (2)

                .   reduce  2

Notice that, in addition to the actions for each state, there  is
a description of the parsing rules being processed in each state.
The _ character is used to indicate what has been seen, and  what
is yet to come, in each rule.  Suppose the input is

        DING  DONG  DELL

It is instructive to follow the steps of the  parser  while  pro-
cessing this input.









Yacc: Yet Another Compiler-Compiler                     PS1:15-15


     Initially, the current state is state 0.  The  parser  needs
to  refer  to  the  input  in order to decide between the actions
available in state 0, so the first token, DING, is read, becoming
the lookahead token.  The action in state 0 on DING is is ``shift
3'', so state 3 is pushed onto the stack, and the lookahead token
is  cleared.  State 3 becomes the current state.  The next token,
DONG, is read, becoming the lookahead token.  The action in state
3 on the token DONG is ``shift 6'', so state 6 is pushed onto the
stack, and the lookahead is cleared.  The stack now  contains  0,
3, and 6.  In state 6, without even consulting the lookahead, the
parser reduces by rule 2.

                sound  :   DING  DONG

This rule has two symbols on the right hand side, so two  states,
6  and  3, are popped off of the stack, uncovering state 0.  Con-
sulting the description of state 0, looking for a goto on sound,

                sound   goto 2

is obtained; thus state 2 is pushed onto the stack, becoming  the
current state.

     In state 2, the next token, DELL, must be read.  The  action
is  ``shift  5'',  so state 5 is pushed onto the stack, which now
has 0, 2, and 5 on it, and the lookahead token  is  cleared.   In
state  5,  the  only action is to reduce by rule 3.  This has one
symbol on the right hand side, so one state, 5,  is  popped  off,
and state 2 is uncovered.  The goto in state 2 on place, the left
side of rule 3, is state 4.  Now, the stack contains 0, 2, and 4.
In  state  4,  the only action is to reduce by rule 1.  There are
two symbols on the right, so the top two states are  popped  off,
uncovering  state  0 again.  In state 0, there is a goto on rhyme
causing the parser to enter state 1.  In state 1,  the  input  is
read;  the  endmarker  is  obtained, indicated by ``$end'' in the
y.output file.  The action in state 1 when the endmarker is  seen
is to accept, successfully ending the parse.

     The reader is urged to consider how the  parser  works  when
confronted  with  such  incorrect strings as DING DONG DONG, DING
DONG, DING DONG DELL DELL, etc.  A few minutes  spend  with  this
and  other  simple examples will probably be repaid when problems
arise in more complicated contexts.

5: Ambiguity and Conflicts

     A set of grammar rules is ambiguous if there is  some  input
string that can be structured in two or more different ways.  For
example, the grammar rule

        expr    :       expr  '-'  expr

is a natural way of expressing the fact that one way  of  forming
an arithmetic expression is to put two other expressions together









PS1:15-16                     Yacc: Yet Another Compiler-Compiler


with a minus sign between them.  Unfortunately, this grammar rule
does  not  completely  specify  the  way  that all complex inputs
should be structured.  For example, if the input is

        expr  -  expr  -  expr

the rule allows this input to be structured as either

        (  expr  -  expr  )  -  expr

or as

        expr  -  (  expr  -  expr  )

(The first is called left association, the second right  associa-
tion).

     Yacc detects such ambiguities when it is attempting to build
the  parser.  It is instructive to consider the problem that con-
fronts the parser when it is given an input such as

        expr  -  expr  -  expr

When the parser has read the second expr, the input that  it  has
seen:

        expr  -  expr

matches the right side of the grammar  rule  above.   The  parser
could  reduce the input by applying this rule; after applying the
rule; the input is reduced to expr(the left side  of  the  rule).
The parser would then read the final part of the input:

        -  expr

and again reduce.  The effect of this is to take the left associ-
ative interpretation.

     Alternatively, when the parser has seen

        expr  -  expr

it could defer the immediate application of the  rule,  and  con-
tinue reading the input until it had seen

        expr  -  expr  -  expr

It could then apply the rule  to  the  rightmost  three  symbols,
reducing them to expr and leaving

        expr  -  expr

Now the rule can be reduced once more; the effect is to take  the
right associative interpretation.  Thus, having read









Yacc: Yet Another Compiler-Compiler                     PS1:15-17



        expr  -  expr

the parser can do two legal things, a shift or a  reduction,  and
has  no  way  of deciding between them.  This is called a shift /
reduce conflict.  It may also happen that the parser has a choice
of  two  legal  reductions; this is called a reduce / reduce con-
flict.  Note that there are never any ``Shift/shift'' conflicts.

     When there are shift/reduce or reduce/reduce conflicts, Yacc
still  produces  a  parser.  It does this by selecting one of the
valid steps wherever it has a choice.  A  rule  describing  which
choice  to  make  in a given situation is called a disambiguating
rule.

     Yacc invokes two disambiguating rules by default:

1.   In a shift/reduce conflict, the default is to do the shift.

2.   In a reduce/reduce conflict, the default is to reduce by the
     earlier grammar rule (in the input sequence).

     Rule 1 implies that reductions are deferred  whenever  there
is  a  choice,  in favor of shifts.  Rule 2 gives the user rather
crude control over the behavior of the parser in this  situation,
but reduce/reduce conflicts should be avoided whenever possible.

     Conflicts may arise because of mistakes in input  or  logic,
or  because  the  grammar rules, while consistent, require a more
complex parser than Yacc  can  construct.   The  use  of  actions
within rules can also cause conflicts, if the action must be done
before the parser can be sure which rule is being recognized.  In
these  cases,  the  application  of disambiguating rules is inap-
propriate, and leads to an incorrect parser.   For  this  reason,
Yacc  always reports the number of shift/reduce and reduce/reduce
conflicts resolved by Rule 1 and Rule 2.

     In general, whenever it is possible to apply  disambiguating
rules to produce a correct parser, it is also possible to rewrite
the grammar rules so that the same inputs are read but there  are
no  conflicts.   For this reason, most previous parser generators
have considered conflicts to be fatal errors.  Our experience has
suggested that this rewriting is somewhat unnatural, and produces
slower parsers; thus, Yacc will produce parsers even in the pres-
ence of conflicts.

     As an example of the power of disambiguating rules, consider
a  fragment  from  a programming language involving an ``if-then-
else'' construction:

        stat    :       IF  '('  cond  ')'  stat
                |       IF  '('  cond  ')'  stat  ELSE  stat
                ;










PS1:15-18                     Yacc: Yet Another Compiler-Compiler


In these rules, IF and ELSE are tokens,  cond  is  a  nonterminal
symbol  describing conditional (logical) expressions, and stat is
a nonterminal symbol describing statements.  The first rule  will
be called the simple-if rule, and the second the if-else rule.

     These two rules form an ambiguous construction, since  input
of the form

        IF  (  C1  )  IF  (  C2  )  S1  ELSE  S2

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -