📄 bison.texinfo
字号:
@ifinfo@center NO WARRANTY@end ifinfo@itemBECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTYFOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHENOTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIESPROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSEDOR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OFMERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK ASTO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THEPROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,REPAIR OR CORRECTION.@itemIN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITINGWILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/ORREDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISINGOUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITEDTO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BYYOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHERPROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THEPOSSIBILITY OF SUCH DAMAGES.@end enumerate@iftex@heading END OF TERMS AND CONDITIONS@end iftex@ifinfo@center END OF TERMS AND CONDITIONS@end ifinfo@page@unnumberedsec How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatestpossible use to the public, the best way to achieve this is to make itfree software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safestto attach them to the start of each source file to most effectivelyconvey the exclusion of warranty; and each file should have at leastthe ``copyright'' line and a pointer to where the full notice is found.@smallexample@var{one line to give the program's name and a brief idea of what it does.}Copyright (C) 19@var{yy} @var{name of author}This program is free software; you can redistribute it and/or modifyit under the terms of the GNU General Public License as published bythe Free Software Foundation; either version 2 of the License, or(at your option) any later version.This program is distributed in the hope that it will be useful,but WITHOUT ANY WARRANTY; without even the implied warranty ofMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See theGNU General Public License for more details.You should have received a copy of the GNU General Public Licensealong with this program; if not, write to the Free SoftwareFoundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.@end smallexampleAlso add information on how to contact you by electronic and paper mail.If the program is interactive, make it output a short notice like thiswhen it starts in an interactive mode:@smallexampleGnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.This is free software, and you are welcome to redistribute itunder certain conditions; type `show c' for details.@end smallexampleThe hypothetical commands @samp{show w} and @samp{show c} should showthe appropriate parts of the General Public License. Of course, thecommands you use may be called something other than @samp{show w} and@samp{show c}; they could even be mouse-clicks or menu items---whateversuits your program.You should also get your employer (if you work as a programmer) or yourschool, if any, to sign a ``copyright disclaimer'' for the program, ifnecessary. Here is a sample; alter the names:@smallexampleYoyodyne, Inc., hereby disclaims all copyright interest in the program`Gnomovision' (which makes passes at compilers) written by James Hacker.@var{signature of Ty Coon}, 1 April 1989Ty Coon, President of Vice@end smallexampleThis General Public License does not permit incorporating your program intoproprietary programs. If your program is a subroutine library, you mayconsider it more useful to permit linking proprietary applications with thelibrary. If this is what you want to do, use the GNU Library GeneralPublic License instead of this License.@node Concepts, Examples, Copying, Top@chapter The Concepts of BisonThis chapter introduces many of the basic concepts without which thedetails of Bison will not make sense. If you do not already know how touse Bison or Yacc, we suggest you start by reading this chapter carefully.@menu* Language and Grammar:: Languages and context-free grammars, as mathematical ideas.* Grammar in Bison:: How we represent grammars for Bison's sake.* Semantic Values:: Each token or syntactic grouping can have a semantic value (the value of an integer, the name of an identifier, etc.).* Semantic Actions:: Each rule can have an action containing C code.* Bison Parser:: What are Bison's input and output, how is the output used?* Stages:: Stages in writing and running Bison grammars.* Grammar Layout:: Overall structure of a Bison grammar file.@end menu@node Language and Grammar, Grammar in Bison, , Concepts@section Languages and Context-Free Grammars@cindex context-free grammar@cindex grammar, context-freeIn order for Bison to parse a language, it must be described by a@dfn{context-free grammar}. This means that you specify one or more@dfn{syntactic groupings} and give rules for constructing them from theirparts. For example, in the C language, one kind of grouping is called an`expression'. One rule for making an expression might be, ``An expressioncan be made of a minus sign and another expression''. Another would be,``An expression can be an integer''. As you can see, rules are oftenrecursive, but there must be at least one rule which leads out of therecursion.@cindex BNF@cindex Backus-Naur formThe most common formal system for presenting such rules for humans to readis @dfn{Backus-Naur Form} or ``BNF'', which was developed in order tospecify the language Algol 60. Any grammar expressed in BNF is acontext-free grammar. The input to Bison is essentially machine-readableBNF.Not all context-free languages can be handled by Bison, only thosethat are LALR(1). In brief, this means that it must be possible totell how to parse any portion of an input string with just a singletoken of look-ahead. Strictly speaking, that is a description of anLR(1) grammar, and LALR(1) involves additional restrictions that arehard to explain simply; but it is rare in actual practice to find anLR(1) grammar that fails to be LALR(1). @xref{Mystery Conflicts, ,Mysterious Reduce/Reduce Conflicts}, for more information on this.@cindex symbols (abstract)@cindex token@cindex syntactic grouping@cindex grouping, syntacticIn the formal grammatical rules for a language, each kind of syntactic unitor grouping is named by a @dfn{symbol}. Those which are built by groupingsmaller constructs according to grammatical rules are called@dfn{nonterminal symbols}; those which can't be subdivided are called@dfn{terminal symbols} or @dfn{token types}. We call a piece of inputcorresponding to a single terminal symbol a @dfn{token}, and a piececorresponding to a single nonterminal symbol a @dfn{grouping}.@refillWe can use the C language as an example of what symbols, terminal andnonterminal, mean. The tokens of C are identifiers, constants (numeric andstring), and the various keywords, arithmetic operators and punctuationmarks. So the terminal symbols of a grammar for C include `identifier',`number', `string', plus one symbol for each keyword, operator orpunctuation mark: `if', `return', `const', `static', `int', `char',`plus-sign', `open-brace', `close-brace', `comma' and many more. (Thesetokens can be subdivided into characters, but that is a matter oflexicography, not grammar.)Here is a simple C function subdivided into tokens:@exampleint /* @r{keyword `int'} */square (x) /* @r{identifier, open-paren,} */ /* @r{identifier, close-paren} */ int x; /* @r{keyword `int', identifier, semicolon} */@{ /* @r{open-brace} */ return x * x; /* @r{keyword `return', identifier,} */ /* @r{asterisk, identifier, semicolon} */@} /* @r{close-brace} */@end exampleThe syntactic groupings of C include the expression, the statement, thedeclaration, and the function definition. These are represented in thegrammar of C by nonterminal symbols `expression', `statement',`declaration' and `function definition'. The full grammar uses dozens ofadditional language constructs, each with its own nonterminal symbol, inorder to express the meanings of these four. The example above is afunction definition; it contains one declaration, and one statement. Inthe statement, each @samp{x} is an expression and so is @samp{x * x}.Each nonterminal symbol must have grammatical rules showing how it is madeout of simpler constructs. For example, one kind of C statement is the@code{return} statement; this would be described with a grammar rule whichreads informally as follows:@quotationA `statement' can be made of a `return' keyword, an `expression' and a`semicolon'.@end quotation@noindentThere would be many other rules for `statement', one for each kind ofstatement in C.@cindex start symbolOne nonterminal symbol must be distinguished as the special one whichdefines a complete utterance in the language. It is called the @dfn{startsymbol}. In a compiler, this means a complete input program. In the Clanguage, the nonterminal symbol `sequence of definitions and declarations'plays this role.For example, @samp{1 + 2} is a valid C expression---a valid part of a Cprogram---but it is not valid as an @emph{entire} C program. In thecontext-free grammar of C, this follows from the fact that `expression' isnot the start symbol.The Bison parser reads a sequence of tokens as its input, and groups thetokens using the grammar rules. If the input is valid, the end result isthat the entire token sequence reduces to a single grouping whose symbol isthe grammar's start symbol. If we use a grammar for C, the entire inputmust be a `sequence of definitions and declarations'. If not, the parserreports a syntax error.@node Grammar in Bison, Semantic Values, Language and Grammar, Concepts@section From Formal Rules to Bison Input@cindex Bison grammar@cindex grammar, Bison@cindex formal grammarA formal grammar is a mathematical construct. To define the languagefor Bison, you must write a file expressing the grammar in Bison syntax:a @dfn{Bison grammar} file. @xref{Grammar File, ,Bison Grammar Files}.A nonterminal symbol in the formal grammar is represented in Bison inputas an identifier, like an identifier in C. By convention, it should bein lower case, such as @code{expr}, @code{stmt} or @code{declaration}.The Bison representation for a terminal symbol is also called a @dfn{tokentype}. Token types as well can be represented as C-like identifiers. Byconvention, these identifiers should be upper case to distinguish them fromnonterminals: for example, @code{INTEGER}, @code{IDENTIFIER}, @code{IF} or@code{RETURN}. A terminal symbol that stands for a particular keyword inthe language should be named after that keyword converted to upper case.The terminal symbol @code{error} is reserved for error recovery.@xref{Symbols}.A terminal symbol can also be represented as a character literal, just likea C character constant. You should do this whenever a token is just asingle character (parenthesis, plus-sign, etc.): use that same character ina literal as the terminal symbol for that token.A third way to represent a terminal symbol is with a C string constantcontaining several characters. @xref{Symbols}, for more information.The grammar rules also have an expression in Bison syntax. For example,here is the Bison rule for a C @code{return} statement. The semicolon inquotes is a literal character token, representing part of the C syntax forthe statement; the naked semicolon, and the colon, are Bison punctuationused in every rule.@examplestmt: RETURN expr ';' ;@end example@noindent@xref{Rules, ,Syntax of Grammar Rules}.@node Semantic Values, Semantic Actions, Grammar in Bison, Concepts@section Semantic Values@cindex semantic value@cindex value, semanticA formal grammar selects tokens only by their classifications: for example,if a rule mentions the terminal symbol `integer constant', it means that@emph{any} integer constant is grammatically valid in that position. Theprecise value of the constant is irrelevant to how to parse the input: if@samp{x+4} is grammatical then @samp{x+1} or @samp{x+3989} is equallygrammatical.@refillBut the precise value is very important for what the input means once it isparsed. A compiler is useless if it fails to distinguish between 4, 1 and3989 as constants in the program! Therefore, each token in a Bison grammarhas both a token type and a @dfn{semantic value}. @xref{Semantics, ,Defining Language Semantics},for details.The token type is a terminal symbol defined in the grammar, such as@code{INTEGER}, @code{IDENTIFIER} or @code{','}. It tells everythingyou need to know to decide where the token may validly appear and how togroup it with other tokens. The grammar rules know nothing about tokensexcept their types.@refillThe semantic value has all the rest of the information about themeaning of the token, such as the value of an integer, or the name of anidentifier. (A token such as @code{','} which is just punctuation doesn'tneed to have any semantic value.)For example, an input token might be classified as token type@code{INTEGER} and have the semantic value 4. Another input token mighthave the same token type @code{INTEGER} but value 3989. When a grammarrule says that @code{INTEGER} is allowed, either of these tokens isacceptable because each is an @code{INTEGER}. When the parser accepts thetoken, it keeps track of the token's semantic value.Each grouping can also have a semantic value as well as its nonterminalsymbol. For example, in a calculator, an expression typically has asemantic value that is a number. In a compiler for a programming
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -