📄 bison.texinfo
字号:
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and ``any
later version'', you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
@item
If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
@iftex
@heading NO WARRANTY
@end iftex
@ifinfo
@center NO WARRANTY
@end ifinfo
@item
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
@item
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
@end enumerate
@iftex
@heading END OF TERMS AND CONDITIONS
@end iftex
@ifinfo
@center END OF TERMS AND CONDITIONS
@end ifinfo
@page
@unnumberedsec How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the ``copyright'' line and a pointer to where the full notice is found.
@smallexample
@var{one line to give the program's name and a brief idea of what it does.}
Copyright (C) 19@var{yy} @var{name of author}
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
@end smallexample
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
@smallexample
Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
@end smallexample
The hypothetical commands @samp{show w} and @samp{show c} should show
the appropriate parts of the General Public License. Of course, the
commands you use may be called something other than @samp{show w} and
@samp{show c}; they could even be mouse-clicks or menu items---whatever
suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a ``copyright disclaimer'' for the program, if
necessary. Here is a sample; alter the names:
@smallexample
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
@var{signature of Ty Coon}, 1 April 1989
Ty Coon, President of Vice
@end smallexample
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.
@node Concepts, Examples, Copying, Top
@chapter The Concepts of Bison
This chapter introduces many of the basic concepts without which the
details of Bison will not make sense. If you do not already know how to
use Bison or Yacc, we suggest you start by reading this chapter carefully.
@menu
* Language and Grammar:: Languages and context-free grammars,
as mathematical ideas.
* Grammar in Bison:: How we represent grammars for Bison's sake.
* Semantic Values:: Each token or syntactic grouping can have
a semantic value (the value of an integer,
the name of an identifier, etc.).
* Semantic Actions:: Each rule can have an action containing C code.
* Bison Parser:: What are Bison's input and output,
how is the output used?
* Stages:: Stages in writing and running Bison grammars.
* Grammar Layout:: Overall structure of a Bison grammar file.
@end menu
@node Language and Grammar, Grammar in Bison, , Concepts
@section Languages and Context-Free Grammars
@cindex context-free grammar
@cindex grammar, context-free
In order for Bison to parse a language, it must be described by a
@dfn{context-free grammar}. This means that you specify one or more
@dfn{syntactic groupings} and give rules for constructing them from their
parts. For example, in the C language, one kind of grouping is called an
`expression'. One rule for making an expression might be, ``An expression
can be made of a minus sign and another expression''. Another would be,
``An expression can be an integer''. As you can see, rules are often
recursive, but there must be at least one rule which leads out of the
recursion.
@cindex BNF
@cindex Backus-Naur form
The most common formal system for presenting such rules for humans to read
is @dfn{Backus-Naur Form} or ``BNF'', which was developed in order to
specify the language Algol 60. Any grammar expressed in BNF is a
context-free grammar. The input to Bison is essentially machine-readable
BNF.
Not all context-free languages can be handled by Bison, only those
that are LALR(1). In brief, this means that it must be possible to
tell how to parse any portion of an input string with just a single
token of look-ahead. Strictly speaking, that is a description of an
LR(1) grammar, and LALR(1) involves additional restrictions that are
hard to explain simply; but it is rare in actual practice to find an
LR(1) grammar that fails to be LALR(1). @xref{Mystery Conflicts, ,
Mysterious Reduce/Reduce Conflicts}, for more information on this.
@cindex symbols (abstract)
@cindex token
@cindex syntactic grouping
@cindex grouping, syntactic
In the formal grammatical rules for a language, each kind of syntactic unit
or grouping is named by a @dfn{symbol}. Those which are built by grouping
smaller constructs according to grammatical rules are called
@dfn{nonterminal symbols}; those which can't be subdivided are called
@dfn{terminal symbols} or @dfn{token types}. We call a piece of input
corresponding to a single terminal symbol a @dfn{token}, and a piece
corresponding to a single nonterminal symbol a @dfn{grouping}.@refill
We can use the C language as an example of what symbols, terminal and
nonterminal, mean. The tokens of C are identifiers, constants (numeric and
string), and the various keywords, arithmetic operators and punctuation
marks. So the terminal symbols of a grammar for C include `identifier',
`number', `string', plus one symbol for each keyword, operator or
punctuation mark: `if', `return', `const', `static', `int', `char',
`plus-sign', `open-brace', `close-brace', `comma' and many more. (These
tokens can be subdivided into characters, but that is a matter of
lexicography, not grammar.)
Here is a simple C function subdivided into tokens:
@example
int /* @r{keyword `int'} */
square (x) /* @r{identifier, open-paren,} */
/* @r{identifier, close-paren} */
int x; /* @r{keyword `int', identifier, semicolon} */
@{ /* @r{open-brace} */
return x * x; /* @r{keyword `return', identifier,} */
/* @r{asterisk, identifier, semicolon} */
@} /* @r{close-brace} */
@end example
The syntactic groupings of C include the expression, the statement, the
declaration, and the function definition. These are represented in the
grammar of C by nonterminal symbols `expression', `statement',
`declaration' and `function definition'. The full grammar uses dozens of
additional language constructs, each with its own nonterminal symbol, in
order to express the meanings of these four. The example above is a
function definition; it contains one declaration, and one statement. In
the statement, each @samp{x} is an expression and so is @samp{x * x}.
Each nonterminal symbol must have grammatical rules showing how it is made
out of simpler constructs. For example, one kind of C statement is the
@code{return} statement; this would be described with a grammar rule which
reads informally as follows:
@quotation
A `statement' can be made of a `return' keyword, an `expression' and a
`semicolon'.
@end quotation
@noindent
There would be many other rules for `statement', one for each kind of
statement in C.
@cindex start symbol
One nonterminal symbol must be distinguished as the special one which
defines a complete utterance in the language. It is called the @dfn{start
symbol}. In a compiler, this means a complete input program. In the C
language, the nonterminal symbol `sequence of definitions and declarations'
plays this role.
For example, @samp{1 + 2} is a valid C expression---a valid part of a C
program---but it is not valid as an @emph{entire} C program. In the
context-free grammar of C, this follows from the fact that `expression' is
not the start symbol.
The Bison parser reads a sequence of tokens as its input, and groups the
tokens using the grammar rules. If the input is valid, the end result is
that the entire token sequence reduces to a single grouping whose symbol is
the grammar's start symbol. If we use a grammar for C, the entire input
must be a `sequence of definitions and declarations'. If not, the parser
reports a syntax error.
@node Grammar in Bison, Semantic Values, Language and Grammar, Concepts
@section From Formal Rules to Bison Input
@cindex Bison grammar
@cindex grammar, Bison
@cindex formal grammar
A formal grammar is a mathematical construct. To define the language
for Bison, you must write a file expressing the grammar in Bison syntax:
a @dfn{Bison grammar} file. @xref{Grammar File, ,Bison Grammar Files}.
A nonterminal symbol in the formal grammar is represented in Bison input
as an identifier, like an identifier in C. By convention, it should be
in lower case, such as @code{expr}, @code{stmt} or @code{declaration}.
The Bison representation for a terminal symbol is also called a @dfn{token
type}. Token types as well can be represented as C-like identifiers. By
convention, these identifiers should be upper case to distinguish them from
nonterminals: for example, @code{INTEGER}, @code{IDENTIFIER}, @code{IF} or
@code{RETURN}. A terminal symbol that stands for a particular keyword in
the language should be named after that keyword converted to upper case.
The terminal symbol @code{error} is reserved for error recovery.
@xref{Symbols}.@refill
A terminal symbol can also be represented as a character literal, just like
a C character constant. You should do this whenever a token is just a
single character (parenthesis, plus-sign, etc.): use that same character in
a literal as the terminal symbol for that token.
The grammar rules also have an expression in Bison syntax. For example,
here is the Bison rule for a C @code{return} statement. The semicolon in
quotes is a literal character token, representing part of the C syntax for
the statement; the naked semicolon, and the colon, are Bison punctuation
used in every rule.
@example
stmt: RETURN expr ';'
;
@end example
@noindent
@xref{Rules, ,Syntax of Grammar Rules}.
@node Semantic Values, Semantic Actions, Grammar in Bison, Concepts
@section Semantic Values
@cindex semantic value
@cindex value, semantic
A formal grammar selects tokens only by their classifications: for example,
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -