<HTML><HEAD><!-- This HTML file has been created by texi2html 1.44 from /opt/src/gnu/bison-1.25/bison.texinfo on 30 June 1997 --><TITLE>Bison 1.25 - Parser C-Language Interface</TITLE></HEAD>
<BODY>Go to the <A HREF="bison_1.html">first</A>, <A HREF="bison_6.html">previous</A>, <A HREF="bison_8.html">next</A>, <A HREF="bison_15.html">last</A> section, <A HREF="index.html">table of contents</A>.<HR>
<H1><A NAME="SEC59" HREF="index.html#SEC59">Parser C-Language Interface</A></H1>
<P><A NAME="IDX120"></A><A NAME="IDX121"></A></P>
<P>The Bison parser is actually a C function named <CODE>yyparse</CODE>. Here we describe the interface conventions of <CODE>yyparse</CODE> and the other functions that it needs to use.</P>
<P>Keep in mind that the parser uses many C identifiers starting with <SAMP>`yy'</SAMP> and <SAMP>`YY'</SAMP> for internal purposes. If you use such an identifier (aside from those in this manual) in an action or in additional C code in the grammar file, you are likely to run into trouble.</P>
<H2><A NAME="SEC60" HREF="index.html#SEC60">The Parser Function <CODE>yyparse</CODE></A></H2>
<P><A NAME="IDX122"></A></P>
<P>You call the function <CODE>yyparse</CODE> to cause parsing to occur. This function reads tokens, executes actions, and ultimately returns when it encounters end-of-input or an unrecoverable syntax error.
You can also write an action which directs <CODE>yyparse</CODE> to return immediately without reading further.</P>
<P>The value returned by <CODE>yyparse</CODE> is 0 if parsing was successful (return is due to end-of-input).</P>
<P>The value is 1 if parsing failed (return is due to a syntax error).</P>
<P>In an action, you can cause immediate return from <CODE>yyparse</CODE> by using these macros:</P>
<DL COMPACT><DT><CODE>YYACCEPT</CODE><DD><A NAME="IDX123"></A>Return immediately with value 0 (to report success).<DT><CODE>YYABORT</CODE><DD><A NAME="IDX124"></A>Return immediately with value 1 (to report failure).</DL>
<H2><A NAME="SEC61" HREF="index.html#SEC61">The Lexical Analyzer Function <CODE>yylex</CODE></A></H2>
<P><A NAME="IDX125"></A><A NAME="IDX126"></A></P>
<P>The <STRONG>lexical analyzer</STRONG> function, <CODE>yylex</CODE>, recognizes tokens from the input stream and returns them to the parser. Bison does not create this function automatically; you must write it so that <CODE>yyparse</CODE> can call it. The function is sometimes referred to as a lexical scanner.</P>
<P>In simple programs, <CODE>yylex</CODE> is often defined at the end of the Bison grammar file. If <CODE>yylex</CODE> is defined in a separate source file, you need to arrange for the token-type macro definitions to be available there. To do this, use the <SAMP>`-d'</SAMP> option when you run Bison, so that it will write these macro definitions into a separate header file <TT>`<VAR>name</VAR>.tab.h'</TT> which you can include in the other source files that need it.
See section <A HREF="bison_12.html#SEC87">Invoking Bison</A>.</P>
<H3><A NAME="SEC62" HREF="index.html#SEC62">Calling Convention for <CODE>yylex</CODE></A></H3>
<P>The value that <CODE>yylex</CODE> returns must be the numeric code for the type of token it has just found, or 0 for end-of-input.</P>
<P>When a token is referred to in the grammar rules by a name, that name in the parser file becomes a C macro whose definition is the proper numeric code for that token type. So <CODE>yylex</CODE> can use the name to indicate that type. See section <A HREF="bison_6.html#SEC40">Symbols, Terminal and Nonterminal</A>.</P>
<P>When a token is referred to in the grammar rules by a character literal, the numeric code for that character is also the code for the token type. So <CODE>yylex</CODE> can simply return that character code. The null character must not be used this way, because its code is zero and that is what signifies end-of-input.</P>
<P>Here is an example showing these things:</P>
<PRE>
yylex ()
{
  ...
  if (c == EOF)     /* Detect end of file. */
    return 0;
  ...
  if (c == '+' || c == '-')
    return c;       /* Assume token type for `+' is '+'. */
  ...
  return INT;       /* Return the type of the token. */
  ...
}
</PRE>
<P>This interface has been designed so that the output from the <CODE>lex</CODE> utility can be used without change as the definition of <CODE>yylex</CODE>.</P>
<P>If the grammar uses literal string tokens, there are two ways that <CODE>yylex</CODE> can determine the token type codes for them:</P>
<UL><LI>If the grammar defines symbolic token names as aliases for the literal string tokens, <CODE>yylex</CODE> can use these symbolic names like all others. In this case, the use of the literal string tokens in the grammar file has no effect on <CODE>yylex</CODE>.
<LI><CODE>yylex</CODE> can find the multi-character token in the <CODE>yytname</CODE> table.
The index of the token in the table is the token type's code. The name of a multi-character token is recorded in <CODE>yytname</CODE> with a double-quote, the token's characters, and another double-quote. The token's characters are not escaped in any way; they appear verbatim in the contents of the string in the table. Here's code for looking up a token in <CODE>yytname</CODE>, assuming that the characters of the token are stored in <CODE>token_buffer</CODE>.
<PRE>
for (i = 0; i < YYNTOKENS; i++)
  {
    if (yytname[i] != 0
        && yytname[i][0] == '"'
        /* strncmp returns 0 on a match, so the result must be negated. */
        && !strncmp (yytname[i] + 1, token_buffer, strlen (token_buffer))
        && yytname[i][strlen (token_buffer) + 1] == '"'
        && yytname[i][strlen (token_buffer) + 2] == 0)
      break;
  }
</PRE>
The <CODE>yytname</CODE> table is generated only if you use the <CODE>%token_table</CODE> declaration. See section <A HREF="bison_6.html#SEC57">Bison Declaration Summary</A>.</UL>
<H3><A NAME="SEC63" HREF="index.html#SEC63">Semantic Values of Tokens</A></H3>
<P><A NAME="IDX127"></A>In an ordinary (nonreentrant) parser, the semantic value of the token must be stored into the global variable <CODE>yylval</CODE>. When you are using just one data type for semantic values, <CODE>yylval</CODE> has that type. Thus, if the type is <CODE>int</CODE> (the default), you might write this in <CODE>yylex</CODE>:</P>
<PRE>
  ...
  yylval = value;  /* Put value onto Bison stack. */
  return INT;      /* Return the type of the token. */
  ...
</PRE>
<P>When you are using multiple data types, <CODE>yylval</CODE>'s type is a union made from the <CODE>%union</CODE> declaration (see section <A HREF="bison_6.html#SEC52">The Collection of Value Types</A>). So when you store a token's value, you must use the proper member of the union. If the <CODE>%union</CODE> declaration looks like this:</P>
<PRE>
%union {
  int intval;
  double val;
  symrec *tptr;
}
</PRE>
<P>then the code in <CODE>yylex</CODE> might look like this:</P>
<PRE>
  ...
  yylval.intval = value; /* Put value onto Bison stack. */
  return INT;            /* Return the type of the token. */
  ...
</PRE>
<H3><A NAME="SEC64" HREF="index.html#SEC64">Textual Positions of Tokens</A></H3>
<P><A NAME="IDX128"></A>If you are using the <SAMP>`@<VAR>n</VAR>'</SAMP>-feature (see section <A HREF="bison_7.html#SEC67">Special Features for Use in Actions</A>) in actions to keep track of the textual locations of tokens and groupings, then you must provide this information in <CODE>yylex</CODE>. The function <CODE>yyparse</CODE> expects to find the textual location of a token just parsed in the global variable <CODE>yylloc</CODE>. So <CODE>yylex</CODE> must store the proper data in that variable. The value of <CODE>yylloc</CODE> is a structure and you need only initialize the members that are going to be used by the actions. The four members are called <CODE>first_line</CODE>, <CODE>first_column</CODE>, <CODE>last_line</CODE> and <CODE>last_column</CODE>. Note that the use of this feature makes the parser noticeably slower.</P>
<P><A NAME="IDX129"></A>The data type of <CODE>yylloc</CODE> has the name <CODE>YYLTYPE</CODE>.</P>
<H3><A NAME="SEC65" HREF="index.html#SEC65">Calling Conventions for Pure Parsers</A></H3>
<P>When you use the Bison declaration <CODE>%pure_parser</CODE> to request a pure, reentrant parser, the global communication variables <CODE>yylval</CODE> and <CODE>yylloc</CODE> cannot be used. (See section <A HREF="bison_6.html#SEC56">A Pure (Reentrant) Parser</A>.) In such parsers the two global variables are replaced by pointers passed as arguments to <CODE>yylex</CODE>. You must declare them as shown here, and pass the information back by storing it through those pointers.</P>
<PRE>
yylex (lvalp, llocp)
     YYSTYPE *lvalp;
     YYLTYPE *llocp;
{
  ...
  *lvalp = value;  /* Put value onto Bison stack. */
  return INT;      /* Return the type of the token. */
  ...
}
</PRE>
<P>If the grammar file does not use the <SAMP>`@'</SAMP> constructs to refer to textual positions, then the type <CODE>YYLTYPE</CODE> will not be defined.
In this case, omit the second argument; <CODE>yylex</CODE> will be called with only one argument.</P><P>
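The pure-parser calling convention described above can be sketched in prototype-style C as follows. This is only an illustration under stated assumptions, not code from Bison itself: the <CODE>YYSTYPE</CODE> and <CODE>YYLTYPE</CODE> definitions and the value 258 for <CODE>INT</CODE> are stand-ins for what the generated parser file would normally provide, and <CODE>set_input</CODE>, <CODE>cursor</CODE> and <CODE>column</CODE> are hypothetical helpers that exist only to keep the sketch self-contained.</P>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins for definitions Bison would normally generate. */
typedef int YYSTYPE;
typedef struct
{
  int first_line, first_column, last_line, last_column;
} YYLTYPE;
#define INT 258 /* assumed token code; real codes come from name.tab.h */

/* Hypothetical input state, used only to make the sketch testable.
   A truly reentrant scanner would not keep this in file-scope variables. */
static const char *cursor = "";
static int column = 1;

static void
set_input (const char *text)
{
  assert (text != NULL);
  cursor = text;
  column = 1;
}

int
yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
{
  /* Skip blanks between tokens. */
  while (*cursor == ' ' || *cursor == '\t')
    {
      cursor++;
      column++;
    }

  if (*cursor == '\0')
    return 0;                   /* 0 signifies end-of-input. */

  /* This sketch handles a single line of input only. */
  llocp->first_line = llocp->last_line = 1;
  llocp->first_column = column;

  if (isdigit ((unsigned char) *cursor))
    {
      int value = 0;
      while (isdigit ((unsigned char) *cursor))
        {
          value = value * 10 + (*cursor - '0');
          cursor++;
          column++;
        }
      llocp->last_column = column - 1;
      *lvalp = value;           /* Store the semantic value through the pointer. */
      return INT;               /* Return the type of the token. */
    }

  /* Any other character is its own token type. */
  llocp->last_column = column;
  column++;
  return (unsigned char) *cursor++;
}
```

<P>With the input <CODE>"12 +7"</CODE>, successive calls would return <CODE>INT</CODE> (storing 12 through <CODE>lvalp</CODE>), then the character code for <SAMP>`+'</SAMP>, then <CODE>INT</CODE> (storing 7), and finally 0 for end-of-input.</P>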