📄 yacc.html

📁 IEEE 1003.1-2003, Single Unix Specification v3
<tr valign="top"><td align="left"><p class="tent">{MEMSIZE}</p></td><td align="left"><p class="tent">5200</p></td><td align="left"><p class="tent">Length of rules. The total length, in names (tokens and non-terminals), of all the rules of the grammar. The left-hand side is counted for each rule, even if it is not explicitly repeated, as specified in <a href="#tag_04_174_13_04">Grammar Rules in yacc</a>.</p></td></tr>
<tr valign="top"><td align="left"><p class="tent">{ACTSIZE}</p></td><td align="left"><p class="tent">4000</p></td><td align="left"><p class="tent">Number of actions. &quot;Actions&quot; here (and in the description file) refer to parser actions (shift, reduce, and so on), not to semantic actions defined in <a href="#tag_04_174_13_04">Grammar Rules in yacc</a>.</p></td></tr>
</table></center></blockquote>
<h4><a name="tag_04_174_14"></a>EXIT STATUS</h4>
<blockquote><p>The following exit values shall be returned:</p>
<dl compact><dt>&nbsp;0</dt><dd>Successful completion.</dd><dt>&gt;0</dt><dd>An error occurred.</dd></dl></blockquote>
<h4><a name="tag_04_174_15"></a>CONSEQUENCES OF ERRORS</h4>
<blockquote><p>If any errors are encountered, the run is aborted and <i>yacc</i> exits with a non-zero status. Partial code files and header files may be produced. The summary information in the description file shall always be produced if the <b>-v</b> flag is present.</p></blockquote>
<hr><div class="box"><em>The following sections are informative.</em></div>
<h4><a name="tag_04_174_16"></a>APPLICATION USAGE</h4>
<blockquote><p>Historical implementations experience name conflicts on the names <b>yacc.tmp</b>, <b>yacc.acts</b>, <b>yacc.debug</b>, <b>y.tab.c</b>, <b>y.tab.h</b>, and <b>y.output</b> if more than one copy of <i>yacc</i> is running in a single directory at one time. The <b>-b</b> option was added to overcome this problem.
The related problem of allowing multiple <i>yacc</i> parsers to be placed in the same file was addressed by adding a <b>-p</b> option to override the previously hard-coded <b>yy</b> variable prefix.</p>
<p>The description of the <b>-p</b> option specifies the minimal set of function and variable names that cause conflict when multiple parsers are linked together. YYSTYPE does not need to be changed. Instead, the programmer can use <b>-b</b> to give the header files for different parsers different names, and then the file with the <i>yylex</i>() for a given parser can include the header for that parser. Names such as <i>yyclearerr</i> do not need to be changed because they are used only in the actions; they do not have linkage. It is possible that an implementation has other names, either internal ones for implementing things such as <i>yyclearerr</i>, or providing non-standard features that it wants to change with <b>-p</b>.</p>
<p>Unary operators that are the same token as a binary operator in general need their precedence adjusted. This is handled by the <b>%prec</b> advisory symbol associated with the particular grammar rule defining that unary operator. (See <a href="#tag_04_174_13_04">Grammar Rules in yacc</a>.) Applications are not required to use this operator for unary operators, but the grammars that do not require it are rare.</p></blockquote>
<h4><a name="tag_04_174_17"></a>EXAMPLES</h4>
<blockquote><p>Access to the <i>yacc</i> library is obtained with library search operands to <a href="../utilities/c99.html"><i>c99</i></a>. To use the <i>yacc</i> library <i>main</i>():</p>
<pre><tt>c99 y.tab.c -l y</tt></pre>
<p>Both the <a href="../utilities/lex.html"><i>lex</i></a> library and the <i>yacc</i> library contain <i>main</i>().
To access the <i>yacc</i> <i>main</i>():</p>
<pre><tt>c99 y.tab.c lex.yy.c -l y -l l</tt></pre>
<p>This ensures that the <i>yacc</i> library is searched first, so that its <i>main</i>() is used.</p>
<p>The historical <i>yacc</i> libraries have contained two simple functions that are normally coded by the application programmer. These functions are similar to the following code:</p>
<pre><tt>#include &lt;locale.h&gt;
int main(void)
{
    extern int yyparse();

    setlocale(LC_ALL, "");

    /* If the following parser is one created by lex, the
       application must be careful to ensure that LC_CTYPE
       and LC_COLLATE are set to the POSIX locale. */
    (void) yyparse();
    return (0);
}

#include &lt;stdio.h&gt;

int yyerror(const char *msg)
{
    (void) fprintf(stderr, "%s\n", msg);
    return (0);
}</tt></pre></blockquote>
<h4><a name="tag_04_174_18"></a>RATIONALE</h4>
<blockquote><p>The references may be helpful in constructing the parser generator. The referenced DeRemer and Pennello article (along with the works it references) describes a technique to generate parsers that conform to this volume of IEEE&nbsp;Std&nbsp;1003.1-2001. Work in this area continues to be done, so implementors should consult current literature before doing any new implementations. The original Knuth article is the theoretical basis for this kind of parser, but the tables it generates are impractically large for reasonable grammars and should not be used. The &quot;equivalent to&quot; wording is intentional to assure that the best tables that are LALR(1) can be generated.</p>
<p>There has been confusion between the class of grammars, the algorithms needed to generate parsers, and the algorithms needed to parse the languages. They are all reasonably orthogonal. In particular, a parser generator that accepts the full range of LR(1) grammars need not generate a table any more complex than one that accepts SLR(1) (a relatively weak class of LR grammars) for a grammar that happens to be SLR(1).
Such an implementation need not recognize the case, either; table compression can yield the SLR(1) table (or one even smaller than that) without recognizing that the grammar is SLR(1). The speed of an LR(1) parser for any class is dependent more upon the table representation and compression (or the code generation if a direct parser is generated) than upon the class of grammar that the table generator handles.</p>
<p>The speed of the parser generator is somewhat dependent upon the class of grammar it handles. However, the original Knuth article algorithms for constructing LR parsers were judged by its author to be impractically slow at that time. Although full LR is more complex than LALR(1), as computer speeds and algorithms improve, the difference (in terms of acceptable wall-clock execution time) is becoming less significant.</p>
<p>Potential authors are cautioned that the referenced DeRemer and Pennello article previously cited identifies a bug (an over-simplification of the computation of LALR(1) lookahead sets) in some of the LALR(1) algorithm statements that preceded it to publication. They should take the time to seek out that paper, as well as current relevant work, particularly Aho's.</p>
<p>The <b>-b</b> option was added to provide a portable method for permitting <i>yacc</i> to work on multiple separate parsers in the same directory. If a directory contains more than one <i>yacc</i> grammar, and both grammars are constructed at the same time (by, for example, a parallel <a href="../utilities/make.html"><i>make</i></a> program), conflict results. While the solution is not historical practice, it corrects a known deficiency in historical implementations. Corresponding changes were made to all sections that referenced the filenames <b>y.tab.c</b> (now &quot;the code file&quot;), <b>y.tab.h</b> (now &quot;the header file&quot;), and <b>y.output</b> (now &quot;the description file&quot;).</p>
<p>The grammar for <i>yacc</i> input is based on System V documentation.
The textual description there shows that the <tt>';'</tt> is required at the end of the rule. The grammar and the implementation do not require this. (The use of <b>C_IDENTIFIER</b> causes a reduce to occur in the right place.)</p>
<p>Also, in that implementation, the constructs such as <b>%token</b> can be terminated by a semicolon, but this is not permitted by the grammar. The keywords such as <b>%token</b> can also appear in uppercase, which is again not discussed. In most places where <tt>'%'</tt> is used, <tt>'\'</tt> can be substituted, and there are alternate spellings for some of the symbols (for example, <b>%LEFT</b> can be <tt>"%&lt;"</tt> or even <tt>"\&lt;"</tt>).</p>
<p>Historically, &lt;<i>tag</i>&gt; can contain any characters except <tt>'&gt;'</tt>, including white space, in the implementation. However, since the <i>tag</i> must reference an ISO&nbsp;C standard union member, in practice conforming implementations need to support only the set of characters for ISO&nbsp;C standard identifiers in this context.</p>
<p>Some historical implementations are known to accept actions that are terminated by a period. Historical implementations often allow <tt>'$'</tt> in names. A conforming implementation does not need to support either of these behaviors.</p>
<p>Deciding when to use <b>%prec</b> illustrates the difficulty in specifying the behavior of <i>yacc</i>. There may be situations in which the <i>grammar</i> is not, strictly speaking, in error, and yet <i>yacc</i> cannot interpret it unambiguously. Ambiguities in the grammar can in many instances be resolved by providing additional information, such as using <b>%type</b> or <b>%union</b> declarations.
It is often easier, and it usually yields a smaller parser, to take this alternative when it is appropriate.</p>
<p>A program produced without the runtime debugging code is usually smaller and slightly faster in historical implementations.</p>
<p>Statistics messages from several historical implementations include the following types of information:</p>
<pre><i>n</i><tt>/512 terminals,</tt> <i>n</i><tt>/300 non-terminals</tt>
<i>n</i><tt>/600 grammar rules,</tt> <i>n</i><tt>/1500 states</tt>
<i>n</i> <tt>shift/reduce,</tt> <i>n</i> <tt>reduce/reduce conflicts reported</tt>
<i>n</i><tt>/350 working sets used
Memory: states, etc.</tt> <i>n</i><tt>/15000, parser</tt> <i>n</i><tt>/15000</tt>
<i>n</i><tt>/600 distinct lookahead sets</tt>
<i>n</i> <tt>extra closures</tt>
<i>n</i> <tt>shift entries,</tt> <i>n</i> <tt>exceptions</tt>
<i>n</i> <tt>goto entries</tt>
<i>n</i> <tt>entries saved by goto default
Optimizer space used: input</tt> <i>n</i><tt>/15000, output</tt> <i>n</i><tt>/15000</tt>
<i>n</i> <tt>table entries,</tt> <i>n</i> <tt>zero
Maximum spread:</tt> <i>n</i><tt>, Maximum offset:</tt> <i>n</i></pre>
<p>The report of internal tables in the description file is left implementation-defined because all aspects of these limits are also implementation-defined. Some implementations may use dynamic allocation techniques and have no specific limit values to report.</p>
<p>The format of the <b>y.output</b> file is not given because specification of the format was not seen to enhance applications portability. The listing is primarily intended to help human users understand and debug the parser; use of <b>y.output</b> by a conforming application script would be unusual. Furthermore, implementations have not produced consistent output and no popular format was apparent.
The format selected by the implementation should be human-readable, in addition to the requirement that it be a text file.</p>
<p>Standard error reports are not specifically described because they are seldom of use to conforming applications and there was no reason to restrict implementations.</p>
<p>Some implementations recognize <tt>"={"</tt> as equivalent to <tt>'{'</tt> because it appears in historical documentation. This construction was recognized and documented as obsolete as long ago as 1978, in the referenced <i>Yacc: Yet Another Compiler-Compiler</i>. This volume of IEEE&nbsp;Std&nbsp;1003.1-2001 chose to leave it as obsolete and omit it.</p>
<p>Multi-byte characters should be recognized by the lexical analyzer and returned as tokens. They should not be returned as multi-byte character literals. The token <b>error</b> that is used for error recovery is normally assigned the value 256 in the historical implementation. Thus, the token value 256, which is used in many multi-byte character sets, is not available for use as the value of a user-defined token.</p></blockquote>
<h4><a name="tag_04_174_19"></a>FUTURE DIRECTIONS</h4>
<blockquote><p>None.</p></blockquote>
<h4><a name="tag_04_174_20"></a>SEE ALSO</h4>
<blockquote><p><a href="c99.html"><i>c99</i></a>, <a href="lex.html"><i>lex</i></a></p></blockquote>
<h4><a name="tag_04_174_21"></a>CHANGE HISTORY</h4>
<blockquote><p>First released in Issue 2.</p></blockquote>
<h4><a name="tag_04_174_22"></a>Issue 5</h4>
<blockquote><p>The FUTURE DIRECTIONS section is added.</p></blockquote>
<h4><a name="tag_04_174_23"></a>Issue 6</h4>
<blockquote><p>This utility is marked as part of the C-Language Development Utilities option.</p>
<p>Minor changes have been added to align with the IEEE&nbsp;P1003.2b draft standard.</p>
<p>The normative text is reworded to avoid use of the term &quot;must&quot; for application requirements.</p>
<p>IEEE PASC Interpretation 1003.2 #177 is applied, changing the comment on <b>RCURL</b> from the <b>}%</b> token
to the <b>%}</b>.</p></blockquote>
<div class="box"><em>End of informative text.</em></div>
<hr><hr size="2" noshade><center><font size="2"><!--footer start-->UNIX &reg; is a registered Trademark of The Open Group.<br>POSIX &reg; is a registered Trademark of The IEEE.<br>[ <a href="../mindex.html">Main Index</a> | <a href="../basedefs/contents.html">XBD</a> | <a href="../utilities/contents.html">XCU</a> | <a href="../functions/contents.html">XSH</a> | <a href="../xrat/contents.html">XRAT</a> ]</font></center><!--footer end--><hr size="2" noshade></body></html>