    31                                           }
    32  expr(A) ::= expr(B) DIVIDE expr(C).  {
    
    
    33           if(C.value != 0){
    34             A.value = B.value / C.value;
    35             A.n = B.n+1 + C.n+1;
    36            }else{
    37             std::cout &lt;&lt; "divide by zero" &lt;&lt; std::endl;
    38             }
    39  }  /* end of DIVIDE */
    40  expr(A) ::= NUM(B). { A.value = B.value; A.n = B.n+1; }
</CODE></PRE>
      <P>Take a close look at lines 23 through 25 below. The Token structure A 
      now has the members "A.value" and "A.n": ".value" holds the value of the 
      expression, and ".n" counts the number of times an assignment is made: 
      <P><PRE><CODE>
    23  expr(A) ::= expr(B) MINUS  expr(C).   { A.value = B.value - C.value;
    24                                         A.n = B.n+1  + C.n+1;
    25                                        }
</CODE></PRE>
      <P>This is a quick way to see the "shift" and "reduce" actions 
      dynamically. A "shift" pushes a token onto the stack. A "reduce" happens 
      once the right-hand side of a rule has been matched: the matched symbols 
      on the stack are replaced by the rule's left-hand side. As you will 
      recall, when <CODE>lemon</CODE> is run, three files are normally created: 
      *.c, *.h, and *.out. The ".out" file lists every state of the parser, 
      along with the shift and reduce actions available in it. If you want a 
      simple summary, run lemon with the "-s" option: 
      <P><PRE><CODE>
    $ ./lemon -s example2.y
    Parser statistics: 6 terminals, 3 nonterminals, 6 rules
    11 states, 0 parser table entries, 0 conflicts
</CODE></PRE>
      <P>Again, as in the previous example, "main_part2", the driver, is 
      appended to "example2.c": 
      <P><PRE><CODE>
    $ cat main_part2 &gt;&gt; example2.c
</CODE></PRE>
      <P>Now "example2.c" can be compiled and executed: 
      <P><PRE><CODE>
    $ g++ -o ex2  example2.c
    
    $ ./ex2
    Result.value=17
    Result.n=4
    Result.value=-9
    Result.n=4
    Result.value=78
    Result.n=10
</CODE></PRE>
      <H2>Example 3: Working with the token destructor</H2>
      <P>One convenient feature of lemon is the ability to free memory used by 
      a grammar symbol once the parser is done with it. You can call the 
      function of your choice. The <CODE>%token_destructor</CODE> directive 
      sets that code for tokens (terminals) such as "<CODE>NUM</CODE>"; 
      non-terminals such as "<CODE>expr</CODE>" use the separate 
      <CODE>%destructor</CODE> directive. 
      <P><A 
      href="http://souptonuts.sourceforge.net/code/example3.y.html">example3.y</A> 

      <P><PRE><CODE>
    1  %include {
    2  #include &lt;iostream&gt;
    3  #include "ex3def.h"
    4  #include "example3.h"
    
    
    5    void token_destructor(Token t)
    6      {
    7        std::cout &lt;&lt; "In token_destructor t.value= " &lt;&lt; t.value &lt;&lt; std::endl;
    8        std::cout &lt;&lt; "In token_destructor t.n= " &lt;&lt; t.n &lt;&lt; std::endl;
    9      }
    
    
    10  }
    
    
    11  %token_type {Token}
    12  %default_type {Token}
    13  %token_destructor { token_destructor($$); }
    ...
</CODE></PRE>
      <P>In line 13, the <CODE>%token_destructor</CODE> directive supplies the 
      code "<CODE>token_destructor($$);</CODE>", where "<CODE>$$</CODE>" 
      stands for the token being destroyed. The function 
      "<CODE>token_destructor</CODE>" is defined in lines 5 through 9. For this 
      simple example, no memory is allocated, so there is no need to call 
      <CODE>free</CODE>. Instead, to see what is happening, output is 
      written to std::cout. 
      <P>After the program is compiled, it can be executed as follows. Note that 
      I have added line numbers to the output of "ex3" for easy reference. 
      <P><PRE><CODE>
    $ ./ex3
    1  t0.value=4  PLUS t1.value=13
    2  In token_destructor t.value= 4
    3  In token_destructor t.n= 0
    4  Result.value=17
    5  Result.n=4
    6  parsing complete!
    ...
</CODE></PRE>
      <P>After the expression has been reduced, the destructor is called, but 
      only for the token whose value is 4. Why? For an answer we will have to 
      take a look at "main_part3". 
      <P><A 
      href="http://souptonuts.sourceforge.net/code/main_part3.html">main_part3</A> 

      <P><PRE><CODE>
    1  int main()
    2  {
    3    void* pParser = ParseAlloc (malloc);
    
    4    struct Token t0,t1;
    5    struct Token mToken;
    
    6    t0.value=4;
    7    t0.n=0;
    
    8    t1.value=13;
    9    t1.n=0;
    
    10    std::cout &lt;&lt; " t0.value=4  PLUS t1.value=13 " &lt;&lt; std::endl;
    
    11    Parse (pParser, NUM, t0);
    12    Parse (pParser, PLUS, t0);
    13    Parse (pParser, NUM, t1);
    14    Parse (pParser, 0, t0);
    
    15    std::cout &lt;&lt; " t0.value=4  DIVIDE t1.value=13 " &lt;&lt; std::endl;
    
    16    Parse (pParser, NUM, t0);
    17    Parse (pParser, DIVIDE, t0);
    18    Parse (pParser, NUM, t1);
    19    Parse (pParser, 0, t1);
    ...
</CODE></PRE>
      <P>Line 14 terminates the grammar with <CODE>t0</CODE> as the third 
      parameter. That third parameter is passed as "<CODE>$$</CODE>" to the 
      defined destructor function, "<CODE>token_destructor(...</CODE>". 
      Calling "<CODE>Parse</CODE>" with the token <CODE>0</CODE> a second time 
      immediately afterwards is undefined, so you should signal the end of 
      input only once, after you're done passing the tokens that complete an 
      expression. In other words, you would never call 
      "<CODE>Parse (pParser, 0, t0);</CODE>", immediately followed by 
      another "<CODE>Parse (pParser, 0, t0);</CODE>". 
      <P>In line 19, <CODE>token_destructor</CODE> is called for <CODE>t1.value= 
      13</CODE>. If you look at "main_part3", line 19, you'll see that 
      <CODE>Parse</CODE> is called with <CODE>t1</CODE> as the third parameter 
      and <CODE>0</CODE> as the second parameter. 
      <P>Continuation of the output from the program: 
      <P><PRE><CODE>
    7
    8
    9   t1.value=13  PLUS  t0.value=4
    10   In token_destructor t.value= 13
    11   In token_destructor t.n= 0
    12   Result.value=17
    13   Result.n=4
    14   parsing complete!
</CODE></PRE>
      <P>So <CODE>t0</CODE> is passed as the third parameter in line 14 
      and <CODE>t1</CODE> is passed in line 19. This shouldn't be a problem. One 
      variable could hold the value of each token in turn. For instance, 
      main_part3 could have used <CODE>Token t0</CODE> for both the values 4 
      and 13 as follows: 
      <P><PRE><CODE>
    ...
    struct Token t0;
    
    t0.value=4;
    t0.n=0;
    
    Parse (pParser, NUM, t0);
    Parse (pParser, PLUS, t0);
    
    t0.value=13;
    t0.n=0;
    
    Parse (pParser, NUM, t0);
    Parse (pParser, 0, t0);
    ...
    
</CODE></PRE>
      <H2>Example 4: Ending the grammar with a NEWLINE</H2>
      <P>Notice that in the last three examples, <CODE>Parse(pParser,0..</CODE> 
      had to be called to signal the end of the input for an expression. This is 
      awkward. Instead, the grammar should dictate when the expression can no 
      longer be reduced. 
      <P>"example4.y" contains the following lines: 
      <P><A 
      href="http://souptonuts.sourceforge.net/code/example4.y.html">example4.y</A> 

      <P><PRE><CODE>
    1  %include {
    2  #include &lt;iostream&gt;
    3  #include "ex4def.h"
    4  #include "example4.h"
    
    ...
    
    23
    24  %syntax_error {
    25    std::cout &lt;&lt; "Syntax error!" &lt;&lt; std::endl;
    26  }
    27
    28  /*  This is to terminate with a new line */
    29  main ::= in.
    30  in ::= .
    31  in ::= in state NEWLINE.
    
    
    32  state ::= expr(A).   {
    33                          std::cout &lt;&lt; "Result.value=" &lt;&lt; A.value &lt;&lt; std::endl;
    34                          std::cout &lt;&lt; "Result.n=" &lt;&lt; A.n &lt;&lt; std::endl;
    
    
    35                           }
    
    
    
    36  expr(A) ::= expr(B) MINUS  expr(C).   { A.value = B.value - C.value;
    37                                         A.n = B.n+1  + C.n+1;
    38                                        }
    
    ...
</CODE></PRE>
      <P>Note lines 29 through 35. "<CODE>main</CODE>" and "<CODE>in</CODE>" 
      must be defined (lines 29-31). If you're a Bison user, you could get away 
      without having to define the non-terminal main, but lemon currently 
      requires it. 
      <P>With this change made to the grammar in "example4.y", "main_part4" can 
      now terminate each expression by passing the token NEWLINE. 
      <P>Here is a section of main_part4: 
      <P><A 
      href="http://souptonuts.sourceforge.net/code/main_part4.html">main_part4</A> 

      <P><PRE><CODE>
    1  int main()
    2  {
    3    void* pParser = ParseAlloc (malloc);
    
    4    struct Token t0,t1;
    5    struct Token mToken;
    
    6    t0.value=4;
    7    t0.n=0;
    
    8    t1.value=13;
    9    t1.n=0;
    
    10    std::cout &lt;&lt; std::endl &lt;&lt;" t0.value=4  PLUS t1.value=13 " &lt;&lt; std::endl &lt;&lt; std::endl;
    
    11    Parse (pParser, NUM, t0);
    12    Parse (pParser, PLUS, t0);
    13    Parse (pParser, NUM, t1);
    14    Parse (pParser, NEWLINE, t1);
    
    
    15    std::cout &lt;&lt; std::endl &lt;&lt;" t0.value=4  TIMES t1.value=13 " &lt;&lt; std::endl &lt;&lt; std::endl;
</CODE></PRE>
      <P>Note that line 14 passes the token NEWLINE. Checking "example4.h" 
      shows that NEWLINE in this case is defined as the integer 6. 
      <P>So, looking at the output of "ex4", with line numbers added for 
      clarification, we get the following: 
      <P><PRE><CODE>
    $ ./ex4
    
    1  t0.value=4  PLUS t1.value=13
    2
    3  In token_destructor t.value= 4
    4  In token_destructor t.n= 0
    5  Result.value=17
    6  Result.n=4
    7
    8   t0.value=4  TIMES t1.value=13
    9
    10  In token_destructor t.value= 4
    11  In token_destructor t.n= 0
    12  Result.value=52
    13  Result.n=4
    14  parsing complete!
</CODE></PRE>
      <P>We get the result on line 5, and there was no need to call <CODE>Parse 
      (pParser, 0, t0);</CODE>. Instead, <CODE>Parse (pParser, NEWLINE, 
      t1)</CODE> worked. 
      <H2>Example 5: Using flex for the tokenizer</H2>
      <P>The next example takes input directly from the terminal, and flex will 
      create a scanner for finding the appropriate tokens. 
      <P>First, a quick look at the flex program "lexer.l", again with line 
      numbers added for clarification: 
      <P><A 
      href="http://souptonuts.sourceforge.net/code/lexer.l.html">lexer.l</A> 
      <P><PRE><CODE>
    1  %{
    2  #include "lexglobal.h"
    3  #include "example5.h"
    4  #include &lt;string.h&gt;
    5  #include &lt;math.h&gt;
    
    6  int line = 1, col = 1;
    
    7  %}
    8  %%
    
    9  [0-9]+|[0-9]*\.[0-9]+    {                      col += (int) strlen(yytext);