📄 10parser.html
字号:
<html>
<head>
<title>Parser</title>
<meta name="description" content="Parser implementation in C++">
<meta name="keywords" content="private method, method">
<link rel="stylesheet" href="rs.css" tppabs="http://www.relisoft.com/book/rs.css">
</head>
<body background="margin.gif" tppabs="http://www.relisoft.com/book/images/margin.gif" bgcolor="#FFFFDC">
<!-- Main Table -->
<table cellpadding="6">
<tr>
<td width="78">
<td>
<h3>Parser</h3>
<p class=topics>Private methods.</p>
<p>The Parser requires a major makeover. Let抯 start with the header file, <b><i>parse.h</i></b>. It contains a lot of forward declarations
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>class Node;
class Scanner;
class Store;
class FunctionTable;
class SymbolTable;
</pre></table><!-- End Code --><p>All these classes are mentioned in <b><i>parse.h</i></b> either through pointers or references梩here is no need to include the header files that define them.
<p>The class <var>Parser</var> has a constructor that takes references to all the major objects it will need in order to parse the stream of tokens from the <var>Scanner</var>. It needs the <var>Store</var> to create <var>VarNode</var>s, <var>FunctionTable</var> to create <var>FunNode</var>s and the <var>SymbolTable</var> to recognize variables and function names. Once the <var>Parser</var> is created, we can only ask it to evaluate the expression passed to it in the <var>Scanner</var>. The <var>Eval</var> method has the side effect of printing the result on the screen (if you're cringing now, I promise you'll be relieved when you read the next chapter of this book).
<var><p>Parser</var> has a number of private methods that it uses to parse expressions, terms and factors; and a method to execute the evaluation of the expression tree. Private methods are a useful device for code structuring. They cannot be called from outside of the class梩hey are just helper functions used in the implementations of other, public or private, methods.
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>class Parser
{
public:
Parser (Scanner & scanner,
Store & store,
FunctionTable & funTab,
SymbolTable & symTab);
~Parser ();
Status Eval ();
private:
void Parse();
Node * Expr();
Node * Term();
Node * Factor();
void Execute ();
Scanner & _scanner;
Node * _pTree;
Status _status;
Store & _store;
FunctionTable & _funTab;
SymbolTable & _symTab;
};
</pre></table><!-- End Code --><p>The constructor (we are already looking at the implementation file, <b><i>parse.cpp</i></b>) initializes all the private variables.
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Parser::Parser (Scanner & scanner,
Store & store,
FunctionTable & funTab,
SymbolTable & symTab )
: _scanner (scanner),
_pTree (0),
_status (stOk),
_funTab (funTab),
_store (store),
_symTab (symTab)
{
}
</pre></table><!-- End Code --><p>The destructor deletes the parse tree:
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Parser::~Parser ()
{
delete _pTree;
}
</pre></table><!-- End Code --><p>The <var>Eval</var> method calls private methods to parse the input and execute the calculation:
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Status Parser::Eval ()
{
Parse ();
if (_status == stOk)
Execute ();
else
_status = stQuit;
return _status;
}
</pre></table><!-- End Code -->
<var><p>Execute</var> calls the <var>Calc</var> method of the top node of the parse tree and prints the result of the (recursive) calculation
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>void Parser::Execute ()
{
if (_pTree)
{
double result = _pTree->Calc ();
cout << " " << result << endl;
}
}
</pre></table><!-- End Code -->
<p>The parsing starts with the assumption that whatever it is, it抯 an expression.
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>void Parser::Parse ()
{
_pTree = Expr ();
}
</pre></table><!-- End Code --><p>The expression starts with a term. The term can be followed by one of the operators that bind terms: the plus sign, the minus sign, the assignment sign; or nothing at all. These correspond to the four productions:
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre><b>Expression</b> is <b>Term + Expression</b>
or <b>Term - Expression</b>
or <b>Term = Expression</b> // the Term must be an lvalue
or just <b>Term</b>
</pre></table><!-- End Code -->
<p>That抯 why, after parsing a term we ask the <var>Scanner</var> for the next token and test it against <var>tPlus</var>, <var>tMinus</var> and <var>tAssign</var>. Immediately after a token is recognized we tell the <var>Scanner</var> to accept it.
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Node * Parser::Expr ()
{
Node * pNode = Term ();
EToken token = _scanner.Token ();
if (token == tPlus)
{
_scanner.Accept ();
Node * pRight = Expr ();
pNode = new AddNode (pNode, pRight);
}
else if (token == tMinus)
{
_scanner.Accept ();
Node * pRight = Expr ();
pNode = new SubNode (pNode, pRight);
}
else if (token == tAssign)
{
_scanner.Accept ();
Node * pRight = Expr ();
if (pNode->IsLvalue ())
{
pNode = new AssignNode (pNode, pRight);
}
else
{
_status = stError;
delete pNode;
pNode = Expr ();
}
}
return pNode;
}
</pre></table><!-- End Code --><p>We proceed in a similar fashion with the <var>Term</var>, following the productions
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre><b>Term</b> is <b>Factor * Term</b>
or <b>Factor / Term</b>
or just <b>Factor</b>
</pre></table><!-- End Code -->
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Node * Parser::Term ()
{
Node * pNode = Factor ();
if (_scanner.Token () == tMult)
{
_scanner.Accept ();
Node * pRight = Term ();
pNode = new MultNode (pNode, pRight);
}
else if (_scanner.Token () == tDivide)
{
_scanner.Accept ();
Node * pRight = Term ();
pNode = new DivideNode (pNode, pRight);
}
return pNode;
}
</pre></table><!-- End Code -->
<p>A <var>Factor</var>, in turn, can be one of these
<!--Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre><b>Factor</b> is <b>( Expression )</b> // parenthesized expression
or <b>Number</b> // literal floating point number
or <b>Identifier ( Expression )</b> // function call
or <b>Identifier</b> // symbolic variable
or <b>- Factor</b> // unary minus</pre>
</table><!-- End Code -->
<!--Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Node * Parser::Factor ()
{
Node * pNode;
EToken token = _scanner.Token ();
if (token == tLParen)
{
_scanner.Accept (); // accept '('
pNode = Expr ();
if (_scanner.Token() != tRParen)
_status = stError;
_scanner.Accept (); // accept ')'
}
else if (token == tNumber)
{
pNode = new NumNode (_scanner.Number ());
_scanner.Accept ();
}
else if (token == tIdent)
{
char strSymbol [maxSymLen + 1];
int lenSym = maxSymLen;
// copy the symbol into strSymbol
_scanner.SymbolName (strSymbol, lenSym);
int id = _symTab.Find (strSymbol, lenSym);
_scanner.Accept ();
if (_scanner.Token() == tLParen) // function call
{
_scanner.Accept (); // accept '('
pNode = Expr ();
if (_scanner.Token () == tRParen)
_scanner.Accept (); // accept ')'
else
_status = stError;
if (id != idNotFound && id < _funTab.Size ())
{
pNode = new FunNode (
_funTab.GetFun (id), pNode);
}
else
{
cout << "Unknown function \"";
cout << strSymbol << "\"\n";
}
}
else
{
if (id == idNotFound)
id = _symTab.ForceAdd (strSymbol, lenSym);
pNode = new VarNode (id, _store);
}
}
else if (token == tMinus) // unary minus
{
_scanner.Accept (); // accept minus
pNode = new UMinusNode (Factor ());
}
else
{
_scanner.Accept ();
_status = stError;
pNode = 0;
}
return pNode;
}</pre></table><!-- End Code -->
<p>To see how the parser works, let抯 single step through a simple example. Suppose that the user typed
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>x = 1
</pre></table><!-- End Code --><p>First, a scanner containing the input string is created. It scans the first token and decides that it's an identifier, <var>tIdent</var>. Next, the parser is called to <var>Eval</var> the expression. It starts parsing it by assuming that it is, indeed, an expression
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>_pTree = Expr ();
</pre></table><!-- End Code -->
<p>Expression must start with a term, so <var>Expr</var> calls <var>Term</var>
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Node * pNode = Term ();
</pre></table><!-- End Code -->
<var><p>Term</var>, on the other hand, expects to see a <var>Factor</var>
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>Node * pNode = Factor ();
</pre></table><!-- End Code --><p>Finally <var>Factor</var> looks at the first token
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>EToken token = _scanner.Token ();
</pre></table><!-- End Code --><p>and compares it to <var>tLParen</var>, <var>tNumber</var>, <var>tIdent</var> and <var>tMinus</var>--in our case the first token is an identifier. It then ask the scanner for the name of the symbol (it is "x") and searches for it in the symbol table. It's done with the first token so it tells the scanner to accept it.
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>// copy the symbol into strSymbol
_scanner.SymbolName (strSymbol, lenSym);
int id = _symTab.Find (strSymbol, lenSym);
_scanner.Accept ();
</pre></table><!-- End Code --><p>The scanner scans for the next token, which is the assignment operator <var>tAssign</var>. The parser looks at this token to see if it a left parenthesis--that would signify a function call. Since it isn抰, it creates a <var>VarNode</var> using the id of the symbol found in the symbol table. If this is a new symbol for which there wasn抰 any id, it just adds it to the symbol table and creates a new id.
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>if (id == idNotFound)
id = _symTab.ForceAdd (strSymbol, lenSym);
pNode = new VarNode (id, _store);
</pre></table><!-- End Code -->
<!--I'm here-->
<var><p>Factor</var> is now done and it returns a pointer to the node it has just created. Back to <var>Term</var>. It has a node, so now it looks at the next token to see if it is one of <var>tMult</var> or <var>tDivide</var>. Since it抯 not, it returns the same node. Back to <var>Expr</var>. Again, a look at the token: is it <var>tPlus</var>, <var>tMinus</var> or <var>tAssign</var>? It is <var>tAssign</var>! It accepts the token (the scanner positions itself at <var>tNumber</var> whose value is 1). So far, the parser has seen a node followed by an equal sign. An equal sign can be followed by an arbitrary expression, so <var>Expr</var> now calls itself to parse the rest of the input which is "1".
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>_scanner.Accept ();
Node* pRight = Expr ();
</pre></table><!-- End Code -->
<var><p>Expr</var> again goes through <var>Term</var> and <var>Factor</var>, which creates a <var>NumNode</var>
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>pNode = new NumNode (_scanner.Number ());
_scanner.Accept ();
</pre></table><!-- End Code --><p>Back to <var>Term</var> and <var>Expr</var> with the new node <var>pRight</var>. The parser has now seen a node, <var>pNode</var>, followed by the equal sign and another node, <var>pRight</var>. It looks very much like an assignment statement, except that we don抰 know if <var>pNode</var> represents an lvalue. In our case, the node being a <var>VarNode</var>, it does. An assignment node is created:
<!-- Code --><table width="100%" cellspacing=10><tr> <td class=codetable>
<pre>if (pNode->IsLvalue ())
{
pNode = new AssignNode (pNode, pRight);
}
</pre></table><!-- End code -->
<var><p>Expr</var> returns this <var>AssignNode</var> to <var>Parse</var>. The parse tree now looks like this (Figure 2-7)
<p align="CENTER"><img src="Image42.gif" tppabs="http://www.relisoft.com/book/lang/project/images/Image42.gif" width=222 height=174>
<b><p class=caption align="CENTER">Figure 2-7
</b><p>Since the parsing was successful, we call <var>Execute</var>, which calls <var>_pTree->Calc ()</var> . The virtual <var>Calc</var> method of <var>AssignNode</var> calculates the value of the right node
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -