;

exp:      NUM             @{ $$ = $1;         @}
        | exp exp '+'     @{ $$ = $1 + $2;    @}
        | exp exp '-'     @{ $$ = $1 - $2;    @}
        | exp exp '*'     @{ $$ = $1 * $2;    @}
        | exp exp '/'     @{ $$ = $1 / $2;    @}
      /* Exponentiation */
        | exp exp '^'     @{ $$ = pow ($1, $2); @}
      /* Unary minus    */
        | exp 'n'         @{ $$ = -$1;        @}
;
%%
@end example

The groupings of the rpcalc ``language'' defined here are the expression
(given the name @code{exp}), the line of input (@code{line}), and the
complete input transcript (@code{input}).  Each of these nonterminal
symbols has several alternate rules, joined by the @samp{|} punctuator
which is read as ``or''.  The following sections explain what these rules
mean.

The semantics of the language is determined by the actions taken when a
grouping is recognized.  The actions are the C code that appears inside
braces.  @xref{Actions}.

You must specify these actions in C, but Bison provides the means for
passing semantic values between the rules.  In each action, the
pseudo-variable @code{$$} stands for the semantic value for the grouping
that the rule is going to construct.  Assigning a value to @code{$$} is the
main job of most actions.  The semantic values of the components of the
rule are referred to as @code{$1}, @code{$2}, and so on.
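
To make the roles of @code{$$}, @code{$1} and @code{$2} concrete, here is a
minimal sketch in plain C of the bookkeeping that the binary-operator
actions amount to.  It is not the code Bison generates: the names
@code{push_value}, @code{reduce_binary} and @code{top_value} are
hypothetical, the stack is simplified to hold only the values of
@code{exp} groupings (the real parser also keeps a slot for the operator
token), and exponentiation and unary minus are left out.

```c
#include <assert.h>

#define STACK_MAX 16

static double values[STACK_MAX];  /* simplified semantic-value stack */
static int depth = 0;             /* number of values currently held */

/* What happens when a NUM is recognized as an exp: its value is kept.  */
void
push_value (double v)
{
  assert (depth < STACK_MAX);
  values[depth++] = v;
}

/* What happens when the rule `exp: exp exp OP' is recognized: the values
   of the two components are replaced by the value of the new grouping.  */
void
reduce_binary (char op)
{
  double v1, v2, result;

  assert (depth >= 2);
  v2 = values[--depth];           /* plays the role of $2 */
  v1 = values[--depth];           /* plays the role of $1 */

  switch (op)
    {
    case '+': result = v1 + v2; break;
    case '-': result = v1 - v2; break;
    case '*': result = v1 * v2; break;
    case '/': result = v1 / v2; break;
    default:  assert (0); result = 0.0; break;  /* operator not sketched */
    }

  values[depth++] = result;       /* plays the role of $$ */
}

/* Value of the most recently completed grouping.  */
double
top_value (void)
{
  assert (depth >= 1);
  return values[depth - 1];
}
```

Pushing 3 and 4 and then calling @code{reduce_binary ('+')} leaves 7 on top
of the stack, which corresponds to the addition rule setting @code{$$} to 7
for the input @samp{3 4 +}.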

@menu
* Rpcalc Input::      
* Rpcalc Line::       
* Rpcalc Expr::       
@end menu

@node Rpcalc Input, Rpcalc Line,  , Rpcalc Rules
@subsubsection Explanation of @code{input}

Consider the definition of @code{input}:

@example
input:    /* empty */
        | input line
;
@end example

This definition reads as follows: ``A complete input is either an empty
string, or a complete input followed by an input line''.  Notice that
``complete input'' is defined in terms of itself.  This definition is said
to be @dfn{left recursive} since @code{input} always appears as the
leftmost symbol in the sequence.  @xref{Recursion, ,Recursive Rules}.

The first alternative is empty because there are no symbols between the
colon and the first @samp{|}; this means that @code{input} can match an
empty string of input (no tokens).  We write the rules this way because it
is legitimate to type @kbd{Ctrl-d} right after you start the calculator.
It's conventional to put an empty alternative first and write the comment
@samp{/* empty */} in it.

The second alternate rule (@code{input line}) handles all nontrivial input.
It means, ``After reading any number of lines, read one more line if
possible.''  The left recursion makes this rule into a loop.  Since the
first alternative matches empty input, the loop can be executed zero or
more times.

The parser function @code{yyparse} continues to process input until a
grammatical error is seen or the lexical analyzer says there are no more
input tokens; we will arrange for the latter to happen at end of file.

@node Rpcalc Line, Rpcalc Expr, Rpcalc Input, Rpcalc Rules
@subsubsection Explanation of @code{line}

Now consider the definition of @code{line}:

@example
line:     '\n'
        | exp '\n'  @{ printf ("\t%.10g\n", $1); @}
;
@end example

The first alternative is a token which is a newline character; this means
that rpcalc accepts a blank line (and ignores it, since there is no
action).  The second alternative is an expression followed by a newline.
This is the alternative that makes rpcalc useful.  The semantic value of
the @code{exp} grouping is the value of @code{$1} because the @code{exp} in
question is the first symbol in the alternative.  The action prints this
value, which is the result of the computation the user asked for.

This action is unusual because it does not assign a value to @code{$$}.  As
a consequence, the semantic value associated with the @code{line} is
uninitialized (its value will be unpredictable).  This would be a bug if
that value were ever used, but we don't use it: once rpcalc has printed the
value of the user's input line, that value is no longer needed.
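
The format string in this action, @code{"\t%.10g\n"}, prints the result
after a tab, with at most ten significant digits and without trailing
zeros.  The following sketch shows the same format applied through
@code{snprintf} so that the text can be inspected; the helper name
@code{format_result} is hypothetical and is not part of rpcalc.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats VALUE the way the `line' action prints it,
   but into BUF (of SIZE bytes) instead of standard output.  Returns the
   number of characters the full result needs, as snprintf reports it.  */
int
format_result (char *buf, size_t size, double value)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "\t%.10g\n", value);
}
```

With this format the value 7.0 is written as @samp{7}, and the value of
@code{1.0 / 3.0} is written as @samp{0.3333333333}.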

@node Rpcalc Expr,  , Rpcalc Line, Rpcalc Rules
@subsubsection Explanation of @code{exp}

The @code{exp} grouping has several rules, one for each kind of expression.
The first rule handles the simplest expressions: those that are just numbers.
The second handles an addition-expression, which looks like two expressions
followed by a plus-sign.  The third handles subtraction, and so on.

@example
exp:      NUM
        | exp exp '+'     @{ $$ = $1 + $2;    @}
        | exp exp '-'     @{ $$ = $1 - $2;    @}
        @dots{}
        ;
@end example

We have used @samp{|} to join all the rules for @code{exp}, but we could
equally well have written them separately:

@example
exp:      NUM ;
exp:      exp exp '+'     @{ $$ = $1 + $2;    @} ;
exp:      exp exp '-'     @{ $$ = $1 - $2;    @} ;
        @dots{}
@end example

Most of the rules have actions that compute the value of the expression in
terms of the value of its parts.  For example, in the rule for addition,
@code{$1} refers to the first component @code{exp} and @code{$2} refers to
the second one.  The third component, @code{'+'}, has no meaningful
associated semantic value, but if it had one you could refer to it as
@code{$3}.  When @code{yyparse} recognizes a sum expression using this
rule, the sum of the two subexpressions' values is produced as the value of
the entire expression.  @xref{Actions}.

You don't have to give an action for every rule.  When a rule has no
action, Bison by default copies the value of @code{$1} into @code{$$}.
This is what happens in the first rule (the one that uses @code{NUM}).

The formatting shown here is the recommended convention, but Bison does
not require it.  You can add or change whitespace as much as you wish.
For example, this:

@example
exp   : NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{}
@end example

@noindent
means the same thing as this:

@example
exp:      NUM
        | exp exp '+'    @{ $$ = $1 + $2; @}
        | @dots{}
@end example

@noindent
The latter, however, is much more readable.

@node Rpcalc Lexer, Rpcalc Main, Rpcalc Rules, RPN Calc
@subsection The @code{rpcalc} Lexical Analyzer
@cindex writing a lexical analyzer
@cindex lexical analyzer, writing

The lexical analyzer's job is low-level parsing: converting characters or
sequences of characters into tokens.  The Bison parser gets its tokens by
calling the lexical analyzer.  @xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.

Only a simple lexical analyzer is needed for the RPN calculator.  This
lexical analyzer skips blanks and tabs, then reads in numbers as
@code{double} and returns them as @code{NUM} tokens.  Any other character
that isn't part of a number is a separate token.  Note that the token-code
for such a single-character token is the character itself.

The return value of the lexical analyzer function is a numeric code which
represents a token type.  The same text used in Bison rules to stand for
this token type is also a C expression for the numeric code for the type.
This works in two ways.  If the token type is a character literal, then its
numeric code is the ASCII code for that character; you can use the same
character literal in the lexical analyzer to express the number.  If the
token type is an identifier, that identifier is defined by Bison as a C
macro whose definition is the appropriate number.  In this example,
therefore, @code{NUM} becomes a macro for @code{yylex} to use.
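
The two kinds of token code can be told apart by their numeric range.  The
sketch below illustrates this under stated assumptions: the value 258 given
to @code{NUM} is made up for the illustration (Bison chooses the real
number, and only guarantees that it does not collide with a character
code), and @code{classify_token} and the @code{token_kind} enumeration are
hypothetical names that do not exist in Bison.

```c
#include <assert.h>

#define NUM 258   /* assumed value, for illustration only */

enum token_kind { TOKEN_END, TOKEN_CHAR, TOKEN_NAMED };

/* Hypothetical helper: says which kind of token code CODE is.  */
enum token_kind
classify_token (int code)
{
  /* Named token codes must not collide with character codes.  */
  assert (NUM > 255);

  if (code <= 0)
    return TOKEN_END;     /* nonpositive: end of the input */
  if (code < 256)
    return TOKEN_CHAR;    /* a character literal such as '+' or '\n' */
  return TOKEN_NAMED;     /* a named token type such as NUM */
}
```

Here @code{classify_token ('+')} yields @code{TOKEN_CHAR}, because the C
expression @code{'+'} is itself the token code, while
@code{classify_token (NUM)} yields @code{TOKEN_NAMED}.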

The semantic value of the token (if it has one) is stored into the global
variable @code{yylval}, which is where the Bison parser will look for it.
(The C data type of @code{yylval} is @code{YYSTYPE}, which was defined
at the beginning of the grammar; @pxref{Rpcalc Decls, ,Declarations for @code{rpcalc}}.)

A token type code of zero is returned if the end-of-file is encountered.
(Bison recognizes any nonpositive value as indicating the end of the
input.)

Here is the code for the lexical analyzer:

@example
@group
/* Lexical analyzer returns a double floating point 
   number on the stack and the token NUM, or the ASCII
   character read if not a number.  Skips all blanks
   and tabs, returns 0 for EOF. */

#include <ctype.h>
#include <stdio.h>
@end group

@group
int
yylex (void)
@{
  int c;

  /* skip white space  */
  while ((c = getchar ()) == ' ' || c == '\t')  
    ;
@end group
@group
  /* process numbers   */
  if (c == '.' || isdigit (c))                
    @{
      ungetc (c, stdin);
      scanf ("%lf", &yylval);
      return NUM;
    @}
@end group
@group
  /* return end-of-file  */
  if (c == EOF)                            
    return 0;
  /* return single chars */
  return c;                                
@}
@end group
@end example
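
For experimenting with the same logic away from standard input, a variant
that reads from a string can be handy.  The following is only a sketch and
not part of rpcalc: the name @code{lex_from_string} and its cursor
parameter are hypothetical, the value 258 for @code{NUM} is an assumption
(Bison chooses the real one), @code{yylval} is declared locally to stand in
for the parser's variable, and @code{strtod} replaces @code{scanf}.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define NUM 258    /* assumed token code; Bison picks the real one */

double yylval;     /* stands in for the parser's global yylval */

/* Hypothetical variant of yylex: reads the next token from *CURSOR and
   advances *CURSOR past it.  Returns NUM, a character code, or 0 at the
   end of the string.  */
int
lex_from_string (const char **cursor)
{
  const char *p;

  assert (cursor != NULL && *cursor != NULL);
  p = *cursor;

  /* skip white space */
  while (*p == ' ' || *p == '\t')
    p++;

  /* the end of the string plays the role of end-of-file */
  if (*p == '\0')
    {
      *cursor = p;
      return 0;
    }

  /* process numbers */
  if (*p == '.' || isdigit ((unsigned char) *p))
    {
      char *end;
      yylval = strtod (p, &end);
      if (end != p)
        {
          *cursor = end;
          return NUM;
        }
      /* a lone '.' is not a number; treat it as a single character */
    }

  /* return single chars */
  *cursor = p + 1;
  return (unsigned char) *p;
}
```

Called repeatedly on the string @code{"3 4.5 +\n"}, this function returns
@code{NUM} twice (leaving 3 and then 4.5 in @code{yylval}), then
@code{'+'}, then @code{'\n'}, and finally 0.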

@node Rpcalc Main, Rpcalc Error, Rpcalc Lexer, RPN Calc
@subsection The Controlling Function
@cindex controlling function
@cindex main function in simple example

In keeping with the spirit of this example, the controlling function is
kept to the bare minimum.  The only requirement is that it call
@code{yyparse} to start the process of parsing.

@example
@group
int
main (void)
@{
  return yyparse ();
@}
@end group
@end example

@node Rpcalc Error, Rpcalc Gen, Rpcalc Main, RPN Calc
@subsection The Error Reporting Routine
@cindex error reporting routine

When @code{yyparse} detects a syntax error, it calls the error reporting
function @code{yyerror} to print an error message (usually but not always
@code{"parse error"}).  It is up to the programmer to supply @code{yyerror}
(@pxref{Interface, ,Parser C-Language Interface}), so here is the definition we will use:

@example
@group
#include <stdio.h>

void
yyerror (const char *s)  /* Called by yyparse on error */
@{
  printf ("%s\n", s);
@}
@end group
@end example

After @code{yyerror} returns, the Bison parser may recover from the error
and continue parsing if the grammar contains a suitable error rule
(@pxref{Error Recovery}).  Otherwise, @code{yyparse} returns nonzero.  We
have not written any error rules in this example, so any invalid input will
cause the calculator program to exit.  This is not clean behavior for a
real calculator, but it is adequate in the first example.

@node Rpcalc Gen, Rpcalc Compile, Rpcalc Error, RPN Calc
@subsection Running Bison to Make the Parser
@cindex running Bison (introduction)

Before running Bison to produce a parser, we need to decide how to arrange
all the source code in one or more source files.  For such a simple example,
the easiest thing is to put everything in one file.  The definitions of
@code{yylex}, @code{yyerror} and @code{main} go at the end, in the
``additional C code'' section of the file (@pxref{Grammar Layout, ,The Overall Layout of a Bison Grammar}).

For a large project, you would probably have several source files, and use
@code{make} to arrange to recompile them.

With all the source in a single file, you use the following command to
convert it into a parser file:

@example
bison @var{file_name}.y
@end example

@noindent
In this example the file was called @file{rpcalc.y} (for ``Reverse Polish
CALCulator'').  Bison produces a file named @file{@var{file_name}.tab.c},
removing the @samp{.y} from the original file name.  The file output by
Bison contains the source code for @code{yyparse}.  The additional
functions in the input file (@co