⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 6lib.html

📁 Visual C++ has been one of most effective tool for the large industrial applications. This book is t
💻 HTML
📖 第 1 页 / 共 3 页
字号:
<HTML>
<HEAD>

<TITLE>Standard Template Library</TITLE>
    <meta  name="description" content="Taking advantage of STL in C++">
    <meta name="keywords" content="STL, template, library, string, auto_ptr">
<link rel="stylesheet" href="rs.css" tppabs="http://www.relisoft.com/book/rs.css">
</HEAD>
<BODY background="margin.gif" tppabs="http://www.relisoft.com/book/images/margin.gif" bgcolor="#ffffe0">

<!-- Main Table -->
<table cellpadding="6">
    <tr>
    <td width="78">&nbsp;</td>
    <td>

<h3>Making Use of the Standard Template Library</h3>
</td></tr>
<tr>
<td class=margin valign=top>
<a href="calc6.zip" tppabs="http://www.relisoft.com/book/tech/source/calc6.zip">
<img src="brace.gif" tppabs="http://www.relisoft.com/book/images/brace.gif" width=16 height=16 border=1 alt="Download!"><br>source</a>
</td>
<td>
<p>Programming in C++ is so much easier once you start using the Standard Library. If I were to start the whole calculator project from scratch, with the full use of the Standard Library, it would look a lot different. I didn't, though, for several reasons. First of all, the standard library is based on templates--especially the part called the Standard Template Library (STL) that contains all of the containers (most notably, vectors) and algorithms. So I had to explain first what templates were and what kind of problems they solved. Second of all, you wouldn't know how much work they could save you, unless you tried programming without them.

<p>So what would the calculator look like if I had a free run of STL from the very beginning First of all, I wouldn't bother implementing the hash table. No, there is no ready-made implementation of a hash table in STL--although many compiler vendors will undoubtedly add one to their version of the Standard Library. There is, however, another useful data structure that, although not as efficient as the hash table, fulfills the same role. I'm talking about an associative array, known in STL as <var>std::map</var>. 

<p>An associative array is like an array, except that it can be indexed using objects of a type that is not necessarily integral.

<p>You know how to define an array of strings. You index it with an integer and it gives you a string. An associative array can do the inverse--you may index it, for instance, with a string and it could give you an integer. Like this:
<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>int id = assoc ["foo"];</pre>
    </td></tr>
</table>
<!-- End Code -->
<p>This looks very much like what we need in our symbol table. Given the name of an identifier, give me its integer id.

<p>You can easily implement an associative array using the STL's <var>map</var> template. A map is parameterized by two types--the type used to index it, called the <b>key</b>; and the <b>value</b> type. There is only one requirement for the key type--there must be a way to compare keys. Actually, it's not necessary to provide a method to compare keys for (in-) equality. What is important for the map is to be able to tell if one key is less than the other. By default, a map expects the key type to either be directly comparable or have the less-than operator overload.

<p>The naive, and for most purposes incorrect, approach to implementing an associative array of strings would be to use <var>char const *</var> as the key type, like this:

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>std::map&lt;char const *, int> dictionary;</pre>
    </td></tr>
</table>
<!-- End Code -->

<p>The first problem is that there indeed <i>is</i> an operator less-than defined for this key type, but it's totally inappropriate. Consider this:

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>char const * str1 = "sin";
char const * str2 = "sin";
if (str1 &lt; str2) // &lt;- not what you'd expect!
    cout &lt;&lt; str1 &lt;&lt; " is less than " &lt;&lt; str2 &lt;&lt; endl; </pre>
    </td></tr>
</table>
<!-- End Code -->
The result of this comparison is meaningless, because we are comparing two pointers--we are numerically comparing two memory addresses. This has nothing to do with what we normally consider "string comparison." There is, however, a special function <var>strcmp</var> defined in <span class="file">&lt;cstring&gt;</span> that compares strings represented as pointers to characters (there is also a case-insensitive version called <var>stricmp</var>). We can tell the map to use this function to do key comparison. Here's how it's done.

<p>There is a version of the map that takes a third template argument--the predicate functor. A predicate is a function that responds with <i>true</i> or <i>false</i> to a particular question. In our case the question would be, "Is one string less than the other?" A functor is a class that defines a function-call operator. Here's an example of a functor class that's suitable for our purposes:

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>class LessThan
    : public std::binary_function&lt;char const *, char const *, bool&gt;
{
public:
    bool operator () (char const * str1, 
                      char const * str2) const
    {
        return strcmp (str1, str2) &lt; 0;
    }
};

std::map&lt;char const *, int, LessThan&gt; dictionary;</pre>
    </td></tr>
</table>
<!-- End Code -->
<p>Notice that the function <var>strcmp</var> returns a number less than zero when the first string precedes the second one lexicographically (i.e., it would precede it in a dictionary--let's forget for a moment about human-language-specific issues). Similarly, it returns a number greater than zero when the first string follows the other; and zero when they're equal.

<p>The way you could use a functor object, other than passing it as a template argument to a map, would be to treat it just like a function (or function pointer). 

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>char const * str1 = "cos";
char const * str2 = "sin";
LessThan lessThan;
if (lessThan (str1, str2))
    cout &lt;&lt; str1 &lt;&lt; " is less than " &lt;&lt; str2 &lt;&lt; endl; </pre>
    </td></tr>
</table>
<!-- End Code -->

<p>In fact, a functor is even more efficient than a function pointer, because it may be inlined. In the example above, the implicit call to <var>lessThan.operator ()</var> will be inlined, according to its declaration in <var>LessThan</var>.

<p>If you think this is too complicated--having to define a predicate functor to do such a simple thing as string comparison in a map--you're right! This is because we are trying to fit a square peg in a round hole. The equivalent of a square peg is the low level data structure that we are insisting on using to represent strings. The Standard Library has a perfectly round peg for this purpose--it's called a <var>string</var>.

<p>And that brings us to the second problem with the naive implementation of the symbol table: the keys could disappear from under the map. When you enter a new association--a pair (key, value)--in the map, the map has to remember both the key and the value. If you use a pointer to character as a key, the map will store the pointer, but you'll have no idea where the actual characters are stored and for how long. Somebody could call the map with a character string that's allocated on the stack and which will disappear as soon as the caller returns. We want the map to <i>own</i> the keys! And no, we can't use <var>auto_ptr</var>s as keys. Standard library containers use value semantics for all its data--they get confused by objects that exhibit transfer semantics. And that's the second reason to use <var>string</var>s.

<p>A standard <var>string</var> not only has value semantics, but it also comes with a lot of useful methods--one of them being operator less-than. As a matter of fact, string and map are a perfect match. Implementing an associative array of strings is a no-brainer. So let's do it first, and ask detailed questions later. Here's the declaration of the new symbol table:

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>#include &lt;map&gt;
#include &lt;string&gt;

class SymbolTable
{
public:
    enum { idNotFound = -1 };
    SymbolTable () : _id (0) {}
    int ForceAdd (char const * str);
    int Find (char const * str) const;
private:
    <font color="red">std::<b>map</b>&lt;std::string, int&gt;</font> _dictionary;
    int _id;
};
</pre>
    </td></tr>
</table>
<!-- End Code -->

<p>Notice that we are still clinging to the old <var>char const *</var> C-style strings in the symbol table interface. That's because I want to show you how you can start converting old legacy code without having to rewrite everything all at once. The two representations of strings can co-exist quite peacefully. Of course, we'll have to make some conversions at the boundaries of the two worlds. For instance, the method <var>ForceAdd</var> starts by creating a standard string from the C-string argument. It then stores the corresponding integer id under this string in the associative array.
<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>int SymbolTable::ForceAdd (char const * str)
{
    std::string name (str);
    _dictionary [name] = _id;
    return _id++;
}</pre>
    </td></tr>
</table>
<!-- End Code -->
<p>Here, the mere act of accessing a map using a key will create an entry, if one wasn't there.

<!-- Sidebar -->
<table width="100%" border=0 cellpadding=5><tr>
<td width=10>&nbsp;</td>
<td bgcolor="#cccccc" class=sidebar>
Don't get fooled by the syntactic simplicity of accessing a map. The time it takes to index an associative array that's implemented as a standard map is not constant, like it is with regular arrays or hash tables, but it depends on its size (number of elements). The good news is that it's not linear--the array is not searched element-by-element. The recommended implementation of an STL map is in terms of a balanced tree that can be searched in logarithmic time. Logarithmic access is pretty good for most applications.
</td></table>
<!-- End Sidebar -->

<p>There is also a more explicit way of adding entries to a map by calling its <var>insert</var> method. You insert a pair (key, value) using a standard <var>pair</var> template. For instance, <var>ForceAdd</var> may be also implemented this way:

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>int SymbolTable::ForceAdd (char const * str)
{
    std::string name (str);
    <font color="red">std::<b>pair</b>&lt;std::string, int&gt;</font> p (str, _id);
    _dictionary.insert (p);
    return _id++;
}</pre>
    </td></tr>
</table>
<!-- End Code -->

<p>The <var>Find</var> method of the symbol table could be simply implemented by returning <var>_dictionary [str]</var>, if it weren't for the fact that we sometimes want to know whether the name wasn't there. We have to use a more explicit approach.

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>int SymbolTable::Find (char const * str) const
{
    std::map&lt;std::string, int&gt;::const_iterator it;
    it = _dictionary.find (str);
    if (it != _dictionary.end ())
        return it->second;
    return idNotFound;
}</pre>
    </td></tr>
</table>
<!-- End Code -->

<p>The <var>find</var> method of <var>map</var> returns an iterator. This iterator could either point to the end of the dictionary, or to a pair (key, value). A pair has two public data members, <var>first</var> and <var>second</var>, with obvious meanings.

<p>Notice also that I called <var>find</var> with a C-style string. It was possible, because the standard string has a constructor that accepts a C-string. This constructor can be used for implicit conversions.

<p>We can use this property of strings to further simplify the implementation of the symbol table. We can change the signature of its methods to accept strings rather than C-strings.

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>int ForceAdd (std::string const &amp; str);
int Find (std::string const &amp; str) const;
</pre>
    </td></tr>
</table>
<!-- End Code -->
<p>This will work without any changes to the callers, because of the implicit conversions. The trick is that both methods take const references to strings. The compiler will have no trouble creating temporary strings at the point of call and passing out const references. This would not work, however, if symbol table methods required non-const references (we already talked about it, but this point is worth repeating--the compiler will refuse to create a non-const reference to a temporary object).

<p>There is an interesting twist on this topic--we may pass strings by value.
<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>int ForceAdd (std::string str);
int Find (std::string str) const;
</pre>
    </td></tr>
</table>
<!-- End Code -->

<p>This would work, because, as I mentioned, strings have value semantics. They behave as if every copy were a deep copy. However, since memory allocation and copying is rather expensive, a good implementation of a standard string uses all kinds of clever tricks to avoid, or at least postpone, these operations. Let's talk about this in some more detail.

<h3>Reference Counting and Copy-On-Write</h3>


</td></tr>
<tr>
<td class=margin valign=top>
<a href="string.zip" tppabs="http://www.relisoft.com/book/tech/source/string.zip">
<img src="brace.gif" tppabs="http://www.relisoft.com/book/images/brace.gif" width=16 height=16 border=1 alt="Download!"><br>source</a>
<br>Strings.
</td>
<td>
<p>First, let's make sure we understand what is meant by value semantics as applied to strings. Here's a simple implementation of a value string. Notice the copy constructor and the assignment operator.

<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>class StringVal
{
public:
    StringVal (char const * cstr = 0)
        :_buf (0)
    {
        if (cstr != 0)
            Init (cstr);
    }
    StringVal (StringVal const &amp; str)
        :_buf (0)
    {
        if (str.c_str () != 0)
            Init (str.c_str ());
    }
    ~StringVal ()
    {
        delete _buf;
    }
    StringVal &amp; operator= (StringVal const &amp; str);
    char const * c_str () const { return _buf; }
    void Upcase ()
private:
    void Init (char const * cstr)
    {
        assert (cstr != 0);
        _buf = new char [strlen (cstr) + 1];
        strcpy (_buf, cstr);
    }
private:
    char  * _buf;
};</pre>
    </td></tr>
</table>
<!-- End Code -->

<p>When overloading the assignment operator, we have to take care of a few special cases. For instance, we should do noting if the source is equal to the target, as in <var>str = str</var>. Then we have to be smart about null strings. Finally, we have to reallocate the buffer if there isn't enough space in it for the new string.
<!-- Code -->
<table width="100%" cellspacing=10><tr>
    <td class=codeTable>
<pre>StringVal &amp; StringVal::operator= (StringVal const &amp; str)
{
    if (this != &amp; str)
    {
        char const * cstr = str.c_str ();
        if (cstr == 0)
        {
            delete _buf;
            _buf = 0;
        }
        else
        {
            int len = strlen (cstr);
            if (_buf == 0 || strlen (_buf) &lt; len)
            {
                delete _buf;
                Init (cstr);
            }
            else
                strcpy (_buf, cstr);
        }
    }
    return *this;
}</pre>
    </td></tr>
</table>
<!-- End Code -->

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -