📄 10parser.html

📁 Visual C++ has been one of most effective tool for the large industrial applications. This book is t
💻 HTML
字号:
<html>
<head>
	<title>Parser</title>
    <meta  name="description" content="Parser implementation in C++">
    <meta name="keywords" content="private method, method">
	<link rel="stylesheet" href="rs.css" tppabs="http://www.relisoft.com/book/rs.css">
</head>

<body background="margin.gif" tppabs="http://www.relisoft.com/book/images/margin.gif" bgcolor="#FFFFDC">

<!-- Main Table -->
<table cellpadding="6">
    <tr>
    <td width="78">
	&nbsp;
	<td>


<h3>Parser</h3>

<p class=topics>Private methods.</p>
<p>The Parser requires a major makeover. Let抯 start with the header file, <b><i>parse.h</i></b>. It contains a lot of forward declarations
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>class Node;
class Scanner;
class Store;
class FunctionTable;
class SymbolTable;
</pre></table><!-- End Code --><p>All these classes are mentioned in <b><i>parse.h</i></b> either through pointers or references梩here is no need to include the header files that define them.

<p>The class <var>Parser</var> has a constructor that takes references to all the major objects it will need in order to parse the stream of tokens from the <var>Scanner</var>. It needs the <var>Store</var> to create <var>VarNode</var>s, <var>FunctionTable</var> to create <var>FunNode</var>s and the <var>SymbolTable</var> to recognize variables and function names. Once the <var>Parser</var> is created, we can only ask it to evaluate the expression passed to it in the <var>Scanner</var>. The <var>Eval</var> method has the side effect of printing the result on the screen (if you're cringing now, I promise you'll be relieved when you read the next chapter of this book).
<var><p>Parser</var> has a number of private methods that it uses to parse expressions, terms and factors; and a method to execute the evaluation of  the expression tree. Private methods are a useful device for code structuring. They cannot be called from outside of the class梩hey are just helper functions used in the implementations of other, public or private, methods.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>class Parser
{
public:
    Parser (Scanner &amp; scanner,
        Store &amp; store,
        FunctionTable &amp; funTab,
        SymbolTable &amp; symTab);
    ~Parser ();
    Status Eval ();
private:
    void   Parse();
    Node * Expr();
    Node * Term();
    Node * Factor();
    void   Execute ();

    Scanner        &amp; _scanner;
    Node           * _pTree;
    Status           _status;
    Store          &amp; _store;
    FunctionTable  &amp; _funTab;
    SymbolTable    &amp; _symTab;
};
</pre></table><!-- End Code --><p>The constructor (we are already looking at the implementation file, <b><i>parse.cpp</i></b>) initializes all the private variables.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>Parser::Parser (Scanner &amp; scanner, 
        Store &amp; store,
        FunctionTable &amp; funTab,
        SymbolTable &amp; symTab )
: _scanner (scanner), 
  _pTree (0), 
  _status (stOk), 
  _funTab (funTab),
  _store (store),
  _symTab (symTab)
{
}
</pre></table><!-- End Code --><p>The destructor deletes the parse tree:
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>Parser::~Parser ()
{
    delete _pTree;
}
</pre></table><!-- End Code --><p>The <var>Eval</var> method calls private methods to parse the input and execute the calculation:
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>Status Parser::Eval ()
{
    Parse ();
    if (_status == stOk)
        Execute ();
    else
        _status = stQuit;
    return _status;
}
</pre></table><!-- End Code -->

<var><p>Execute</var> calls the <var>Calc</var> method of the top node of the parse tree and prints the result of the (recursive) calculation
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>void Parser::Execute ()
{
    if (_pTree)
    {
        double result = _pTree-&gt;Calc ();
        cout &lt;&lt; "  " &lt;&lt; result &lt;&lt; endl;
    }
}
</pre></table><!-- End Code -->
<p>The parsing starts with the assumption that whatever it is, it抯 an expression.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>void Parser::Parse ()
{
    _pTree = Expr ();
}
</pre></table><!-- End Code --><p>The expression starts with a term. The term can be followed by one of the operators that bind terms: the plus sign, the minus sign, the assignment sign; or nothing at all. These correspond to the four productions: 
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre><b>Expression</b> is <b>Term + Expression</b>
or <b>Term - Expression</b>
or <b>Term = Expression</b>  // the Term must be an lvalue
or just <b>Term</b>
</pre></table><!-- End Code -->

<p>That抯 why, after parsing a term we ask the <var>Scanner</var> for the next token and test it against <var>tPlus</var>, <var>tMinus</var> and <var>tAssign</var>. Immediately after a token is recognized we tell the <var>Scanner</var> to accept it.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>Node * Parser::Expr ()
{
    Node * pNode = Term ();
    EToken token = _scanner.Token ();
    if (token == tPlus)
    {
        _scanner.Accept ();
        Node * pRight = Expr ();
        pNode = new AddNode (pNode, pRight);
    }
    else if (token == tMinus)
    {
        _scanner.Accept ();
        Node * pRight = Expr ();
        pNode = new SubNode (pNode, pRight);
    }
    else if (token == tAssign)
    {
        _scanner.Accept ();
        Node * pRight = Expr ();
        if (pNode-&gt;IsLvalue ())
        {
            pNode = new AssignNode (pNode, pRight);
        }
        else
        {
            _status = stError;
            delete pNode;
            pNode = Expr ();
        }
    }
    return pNode;
}
</pre></table><!-- End Code --><p>We proceed in a similar fashion with the <var>Term</var>, following the productions
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre><b>Term</b> is <b>Factor * Term</b>
or <b>Factor / Term</b>
or just <b>Factor</b>
</pre></table><!-- End Code -->
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>Node * Parser::Term ()
{
    Node * pNode = Factor ();
    if (_scanner.Token () == tMult)
    {
        _scanner.Accept ();
        Node * pRight = Term ();
        pNode = new MultNode (pNode, pRight);
    }
    else if (_scanner.Token () == tDivide)
    {
        _scanner.Accept ();
        Node * pRight = Term ();
        pNode = new DivideNode (pNode, pRight);
    }
    return pNode;
}
</pre></table><!-- End Code -->

<p>A <var>Factor</var>, in turn, can be one of these
<!--Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre><b>Factor</b> is <b>( Expression )</b> // parenthesized expression
or <b>Number</b> // literal floating point number
or <b>Identifier ( Expression )</b> // function call
or <b>Identifier</b> // symbolic variable
or <b>- Factor</b>  // unary minus</pre>
</table><!-- End Code -->


<!--Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Node * Parser::Factor ()
{
    Node * pNode;
    EToken token = _scanner.Token ();

    if (token == tLParen)
    {
        _scanner.Accept (); // accept '('
        pNode = Expr ();
        if (_scanner.Token() != tRParen)
            _status = stError;
        _scanner.Accept (); // accept ')'
    }
    else if (token == tNumber)
    {
        pNode = new NumNode (_scanner.Number ());
        _scanner.Accept ();
    }
    else if (token == tIdent)
    {
        char strSymbol [maxSymLen + 1];
        int lenSym = maxSymLen;
        // copy the symbol into strSymbol
        _scanner.SymbolName (strSymbol, lenSym);
        int id = _symTab.Find (strSymbol, lenSym);
        _scanner.Accept ();
        if (_scanner.Token() == tLParen) // function call
        {
            _scanner.Accept (); // accept '('
            pNode = Expr ();
            if (_scanner.Token () == tRParen)
                _scanner.Accept (); // accept ')'
            else
                _status = stError;
            if (id != idNotFound &amp;&amp; id &lt; _funTab.Size ())
            {
                pNode = new FunNode (
                    _funTab.GetFun (id), pNode);
            }
            else
            {
                cout &lt;&lt; "Unknown function \"";
                cout &lt;&lt; strSymbol &lt;&lt; "\"\n";
            }
        }
        else
        {
            if (id == idNotFound)
                id = _symTab.ForceAdd (strSymbol, lenSym);
            pNode = new VarNode (id, _store);
        }
    }
    else if (token == tMinus) // unary minus
    {
        _scanner.Accept (); // accept minus
        pNode = new UMinusNode (Factor ());
    }
    else
    {
        _scanner.Accept ();
        _status = stError;
        pNode = 0;
    }
    return pNode;
}</pre></table><!-- End Code -->

<p>To see how the parser works, let抯 single step through a simple example. Suppose that the user typed
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>x = 1
</pre></table><!-- End Code --><p>First, a scanner containing the input string is created. It scans the first token and decides that it's an identifier, <var>tIdent</var>. Next, the parser is called to <var>Eval</var> the expression. It starts parsing it by assuming that it is, indeed, an expression

<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>_pTree = Expr ();
</pre></table><!-- End Code -->

<p>Expression must start with a term, so <var>Expr</var> calls <var>Term</var>
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>Node * pNode = Term ();
</pre></table><!-- End Code -->
<var><p>Term</var>, on the other hand, expects to see a <var>Factor</var>
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>Node * pNode = Factor ();
</pre></table><!-- End Code --><p>Finally <var>Factor</var> looks at the first token
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>EToken token = _scanner.Token ();
</pre></table><!-- End Code --><p>and compares it to <var>tLParen</var>, <var>tNumber</var>, <var>tIdent</var> and <var>tMinus</var>--in our case the first token is an identifier. It then ask the scanner for the name of the symbol (it is &quot;x&quot;) and searches for it in the symbol table. It's done with the first token so it tells the scanner to accept it.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>// copy the symbol into strSymbol
_scanner.SymbolName (strSymbol, lenSym);
int id = _symTab.Find (strSymbol, lenSym);
_scanner.Accept ();
</pre></table><!-- End Code --><p>The scanner scans for the next token, which is the assignment operator <var>tAssign</var>. The parser looks at this token to see if it a left parenthesis--that would signify a function call. Since it isn抰, it creates a <var>VarNode</var> using the id of the symbol found in the symbol table. If this is a new symbol for which there wasn抰 any id, it just adds it to the symbol table and creates a new id.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>if (id == idNotFound)
    id = _symTab.ForceAdd (strSymbol, lenSym);
pNode = new VarNode (id, _store);
</pre></table><!-- End Code -->
<!--I'm here-->
<var><p>Factor</var> is now done and it returns a pointer to the node it has just created. Back to <var>Term</var>. It has a node, so now it looks at the next token to see if it is one of  <var>tMult</var> or <var>tDivide</var>. Since it抯 not, it returns the same node. Back to <var>Expr</var>. Again, a look at the token: is it <var>tPlus</var>, <var>tMinus</var> or <var>tAssign</var>? It is <var>tAssign</var>! It accepts the token (the scanner positions itself at <var>tNumber</var> whose value is 1). So far, the parser has seen a node followed by an equal sign. An equal sign can be followed by an arbitrary expression, so <var>Expr</var> now calls itself to parse the rest of the input which is &quot;1&quot;. 
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>_scanner.Accept ();
Node* pRight = Expr ();
</pre></table><!-- End Code -->
<var><p>Expr</var> again goes through <var>Term</var> and <var>Factor</var>, which creates a <var>NumNode</var>
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>pNode = new NumNode (_scanner.Number ());
_scanner.Accept ();
</pre></table><!-- End Code --><p>Back to <var>Term</var> and <var>Expr</var> with the new node <var>pRight</var>. The parser has now seen a node, <var>pNode</var>, followed by the equal sign and another node, <var>pRight</var>. It looks very much like an assignment statement, except that we don抰 know if <var>pNode</var> represents an lvalue. In our case, the node being a <var>VarNode</var>, it does. An assignment node is created:
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>
<pre>if (pNode-&gt;IsLvalue ())
{
    pNode = new AssignNode (pNode, pRight);
}
</pre></table><!-- End code -->
<var><p>Expr</var> returns this <var>AssignNode</var> to <var>Parse</var>. The parse tree now looks like this (Figure 2-7)
<p align="CENTER"><img src="Image42.gif" tppabs="http://www.relisoft.com/book/lang/project/images/Image42.gif" width=222 height=174>
<b><p class=caption align="CENTER">Figure 2-7
</b><p>Since the parsing was successful, we call <var>Execute</var>, which calls <var>_pTree-&gt;Calc ()</var> . The virtual <var>Calc</var> method of <var>AssignNode</var> calculates the value of the right node
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -