<html>
<head>
	<title>Stubbed Implementation</title>
    <meta  name="description" content="Stubbed top-down implementation of a calculator">
    <meta name="keywords" content="stub, design, implementation, calculator, top-down, scanner, parser">
	<link rel="stylesheet" href="rs.css">
</head>

<body background="margin.gif" bgcolor="#FFFFDC">

<!-- Main Table -->
<table cellpadding="6">
    <tr>
    <td width="78">
	&nbsp;
	<td>



<h3>Stubbed Implementation</h3>

<p>I will follow the top-down implementation strategy: I will run the program first and implement it later. (More precisely, I will run it as soon as I have created stubs for all the top-level classes.)
<p>Let's start with the <var>Scanner</var>. It is constructed from a buffer of text (a line of text, to be precise). It keeps a pointer to that buffer and, later, will scan it left to right and convert it to tokens. For now, the constructor of the <var>Scanner</var> stub announces its existence to the world and prints the contents of the buffer.


<tr>
<td class=margin valign=top>

<br>
<a href="http://www.relisoft.com/book/lang/project/source/calc1.zip">
<img src="brace-2.gif" width=16 height=16 border=1 alt="Download!"><br>source</a>
<td>

<!--Code--><table width="100%" cellspacing=10><tr>  <td class=codetable>
<pre>class Scanner
{
public:
    Scanner (char const * buf);
private:
    char const * const _buf;
};

Scanner::Scanner (char const * buf)
    : _buf (buf)
{
    cout &lt;&lt; "Scanner with \"" &lt;&lt; buf &lt;&lt; "\"" &lt;&lt; endl;
}</pre>
</table><!-- End Code -->

<p>The <var>SymbolTable</var> stub will be really trivial for now. We only assume that it has a constructor.

<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>class SymbolTable
{
public:
    SymbolTable () {}
};

</pre>
</table><!-- End Code --><p>The <var>Parser</var> will need access to the scanner and to the symbol table. It will parse the tokens retrieved from the <var>Scanner</var> and evaluate the resulting tree. The method <var>Eval</var> is supposed to do that. It should return a status code that depends on the result of parsing. We combine the three possible statuses into an <i>enumeration</i>. An <var>enum</var> is an integral type that can only take a few predefined values. These values are given symbolic names and are initialized to concrete values either by the programmer or by the compiler. In our case we don't really care what values correspond to the various statuses, so we leave it to the compiler. Using an <var>enum</var> rather than an <var>int</var> for the return type of <var>Eval</var> has the advantage of stricter type checking. It also prevents us from accidentally returning anything other than one of the three values defined by the <var>enum</var>, because an integer cannot be converted to an <var>enum</var> without an explicit cast.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>enum Status
{
    stOk,
    stQuit,
    stError
};

class Parser
{
public:
    Parser (Scanner &amp; scanner, SymbolTable &amp; symTab);
    ~Parser ();
    Status Eval ();
private:
    Scanner &amp;        _scanner;
    SymbolTable &amp;    _symTab;
};

Parser::Parser (Scanner &amp; scanner, SymbolTable &amp; symTab)
    : _scanner (scanner), _symTab (symTab)
{
    cout &lt;&lt; "Parser created\n";
}

Parser::~Parser ()
{
    cout &lt;&lt; "Destroying parser\n";
}

Status Parser::Eval ()
{
    cout &lt;&lt; "Parser eval\n";
    return stQuit;
}

</pre></table><!-- End Code --><p>Finally, the main procedure. Here you can see the top-level design of the program in action. The lifetime of the symbol table has to be equal to that of the whole program, since it has to remember the names of all the variables introduced by the user during one session. The scanner and the parser, though, can be created every time a line of text is entered. The parser doesn't have any state that has to be preserved from one line of text to another. If it encounters a new variable name, it will store it in the symbol table, which has a longer lifespan.

<p>In the main loop of our program a line of text is retrieved from the standard input using the <var>getline</var> method of <var>cin</var>, a scanner is constructed from this line, and a parser is created from this scanner. The <var>Eval</var> method of the parser is then called to parse and evaluate the expression. As long as the status returned by <var>Eval</var> is different from <var>stQuit</var>, the whole process is repeated.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>const int maxBuf = 100;

int main ()
{
    char buf [maxBuf];
    Status status;
    SymbolTable symTab;
    do
    {
        cout &lt;&lt; "&gt; ";  // prompt
        cin.getline (buf, maxBuf);
        Scanner scanner (buf);
        Parser  parser (scanner, symTab);
        status = parser.Eval ();
    } while (status != stQuit);
}

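</pre></table><!-- End Code --><p>Once the standard header <var>&lt;iostream&gt;</var> is included and the names <var>cin</var>, <var>cout</var> and <var>endl</var> from namespace <var>std</var> are made visible, this program compiles and runs, thus proving the validity of the concept.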

<h3>Expanding Stubs</h3>

<p>The first stub to be expanded into a full implementation will be that of the <var>Scanner</var>. The scanner converts the input string into a series of tokens. It works like an iterator. Whenever the <var>Parser</var> needs a token, it asks the <var>Scanner</var> for the current token. When this token has been parsed, the <var>Parser</var> accepts it by calling <var>Scanner::Accept()</var>. The <var>Accept</var> method scans the string further, trying to recognize the next token.
<p>There is a finite, well-defined set of tokens. It is convenient to put them into the enumeration <var>EToken</var>.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>enum EToken
{
    tEnd,
    tError,
    tNumber,
    tPlus,
    tMult
};

</pre></table><!-- End Code -->

<p>We would also like the <var>Scanner</var> to be able to convert the part of the input string recognized as a number to a floating-point value. This is done by the <var>Number()</var> method, which may be called only when the current token is <var>tNumber</var> (see the assertion in its body). Notice the use of the type <var>char const * const</var>: a <var>const</var> pointer to a <var>const</var> string. The pointer is initialized in the constructor and never changes again. The characters it points to are read-only, too.
 
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>class Scanner
{
public:
    Scanner (char const * buf);
    EToken  Token () const { return _token; }
    EToken  Accept ();
    double Number ()
    {
        assert (_token == tNumber);
        return _number;
    }
private:
    void EatWhite ();

    char const * const   _buf;
    int                  _iLook;
    EToken               _token;
    double               _number;
};

</pre></table><!-- End Code -->

<p>The constructor of the <var>Scanner</var> initializes <var>_buf</var> and <var>_iLook</var> and then calls the method <var>Accept</var>. <var>Accept</var> recognizes the first available token, stores it in <var>_token</var>, and positions the index <var>_iLook</var> past the recognized part of the buffer.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>Scanner::Scanner (char const * buf)
    : _buf (buf), _iLook(0)
{
    cout &lt;&lt; "Scanner with \"" &lt;&lt; buf &lt;&lt; "\"" &lt;&lt; endl;
    Accept ();
}</pre>
<!--End Table--></table>

<p><var>EatWhite</var> is the helper function that skips whitespace characters in the input.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>void Scanner::EatWhite ()
{
    while (isspace (_buf [_iLook]))
        ++_iLook;
}</pre>
<!--End Table--></table>

<p><var>Accept</var> is just one big <b><i>switch</i></b> statement. Depending on the value of the current character in the buffer, <var>_buf[_iLook]</var>, control passes to the appropriate <b><i>case</i></b> label within the switch. For now I have implemented the addition and the multiplication operators. When one of them is recognized, the <var>_token</var> variable is set to the appropriate enumerated value and the variable <var>_iLook</var> is incremented by one.
<p>The recognition of digits is done using the fall-through property of the case statements. In a switch statement, unless there is an explicit <b><i>break</i></b> statement, control passes to the next case. All digit cases fall through to reach the code after case <var>'9'</var>. The string starting with the digit is converted to an integer (later we will convert this part of the program to recognize floating-point numbers) and <var>_iLook</var> is positioned after the last digit.
<p>The <b><i>default</i></b> case is executed when no other case matches the character in the switch.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>EToken Scanner::Accept ()
{
    EatWhite ();
    switch (_buf[_iLook])
    {
    case '+':
        _token = tPlus;
        ++_iLook;
        break;
    case '*':
        _token = tMult;
        ++_iLook;
        break;
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        _token = tNumber;
        _number = atoi (&amp;_buf [_iLook]);
        while (isdigit (_buf [_iLook]))
            ++_iLook;
        break;
    case '\0': // end of input
        _token = tEnd;
        break;
    default:
        _token = tError;
        break;
    }
    return Token ();
}

</pre></table><!-- End Code --><p>You might ask: Can this switch statement be replaced by some clever use of polymorphism? I really don't know how one could do it. The trouble is that the incoming data (characters from the buffer) is <b><i>amorphic</i></b>. At the stage where we are trying to determine the meaning of the input, to give it <i>form</i>, the best we can do is to go through a multi-way conditional, in this case a switch statement. Dealing with amorphic input is virtually the only case when the use of a switch statement is fully legitimate in C++. Parsing user input, data stored in a file, or external events all require such treatment.
<p>The dummy parser's <var>Eval</var> method is implemented as a <var>for</var> loop that retrieves one token after another and prints the appropriate message.
<!-- Code --><table width="100%" cellspacing=10><tr>	<td class=codetable>

<pre>Status Parser::Eval ()
{
    for (EToken token = _scanner.Token ();
        token != tEnd;
        token = _scanner.Accept ())
    {
        switch (token)
        {
        case tMult:
            cout &lt;&lt; "Times\n";
            break;
        case tPlus:
            cout &lt;&lt; "Plus\n";
            break;
        case tNumber:
            cout &lt;&lt; "Number: " &lt;&lt; _scanner.Number () &lt;&lt; "\n";
            break;
        case tError:
            cout &lt;&lt; "Error\n";
            return stQuit;
        default:
            cout &lt;&lt; "Error: bad token\n";
            return stQuit;
        }
    }
    return stOk;
}

</pre></table><!-- End Code --><p>Once more, this partially implemented program is compiled and tested. We can see the flow of control and the correct recognition of a few tokens. 
<br><a href="4final.html">Next.</a>

</table>
<!-- End Main Table -->
</body>
</html>