📄 mc4.htm

📁 高效c++编程
💻 HTM
📖 第 1 页 / 共 5 页
字号:
</PRE>
</UL>
<A NAME="41093"></A><P><A NAME="dingp62"></A>
the jig is up and we've got to compute a complete value for <CODE>m3</CODE>. Similarly, if one of the matrices on which <CODE>m3</CODE> is dependent is about to be modified, we have to take immediate <NOBR>action:<SCRIPT>create_link(62);</SCRIPT>
</NOBR></P>
<A NAME="41094"></A>
<UL><PRE>
m3 = m1 + m2;                                // remember that m3 is the
                                             // sum of m1 and m2
<A NAME="41095"></A>
m1 = m4;                                     // now m3 is the sum of m2
                                             // and the OLD value of m1!
</PRE>
</UL>
<A NAME="41096"></A>
<P><A NAME="dingp63"></A>
Here we've got to do something to ensure that the assignment to <CODE>m1</CODE> doesn't change <CODE>m3</CODE>. Inside the <CODE>Matrix&lt;int&gt;</CODE> assignment operator, we might compute <CODE>m3</CODE>'s value prior to changing <CODE>m1</CODE> or we might make a copy of the old value of <CODE>m1</CODE> and make <CODE>m3</CODE> dependent on that, but we have to do <I>something</I> to guarantee that <CODE>m3</CODE> has the value it's supposed to have after <CODE>m1</CODE> has been the target of an assignment. Other functions that might modify a matrix must be handled in a similar <NOBR>fashion.<SCRIPT>create_link(63);</SCRIPT>
</NOBR></P><A NAME="41097"></A>
<P><A NAME="dingp64"></A>
Because of the need to store dependencies between values; to maintain data structures that can store values, dependencies, or a combination of the two; and to overload operators like assignment, copying, and addition, lazy evaluation in a numerical domain is a lot of work. On the other hand, it often ends up saving significant amounts of time and space during program runs, and in many applications, that's a payoff that easily justifies the significant effort lazy evaluation <NOBR>requires.<SCRIPT>create_link(64);</SCRIPT>
</NOBR></P>
<A NAME="p93"></A>
<P><A NAME="dingp65"></A><font ID="mhtitle">Summary</font><SCRIPT>create_link(65);</SCRIPT>
</P>

<A NAME="63263"></A>
<P><A NAME="dingp66"></A>
These four examples show that lazy evaluation can be useful in a variety of domains: to avoid unnecessary copying of objects, to distinguish reads from writes using <CODE>operator[]</CODE>, to avoid unnecessary reads from databases, and to avoid unnecessary numerical computations. Nevertheless, it's not always a good idea. Just as procrastinating on your clean-up chores won't save you any work if your parents always check up on you, lazy evaluation won't save your program any work if all your computations are necessary. Indeed, if all your computations are essential, lazy evaluation may slow you down and increase your use of memory, because, in addition to having to do all the computations you were hoping to avoid, you'll also have to manipulate the fancy data structures needed to make lazy evaluation possible in the first place. Lazy evaluation is only useful when there's a reasonable chance your software will be asked to perform computations that can be <NOBR>avoided.<SCRIPT>create_link(66);</SCRIPT>
</NOBR></P><A NAME="41117"></A>
<P><A NAME="dingp67"></A>
There's nothing about lazy evaluation that's specific to C++. The technique can be applied in any programming language, and several languages &#151; notably APL, some dialects of Lisp, and virtually all dataflow languages &#151; embrace the idea as a fundamental part of the language. Mainstream programming languages employ eager evaluation, however, and C++ is mainstream. Yet C++ is particularly suitable as a vehicle for user-implemented lazy evaluation, because its support for encapsulation makes it possible to add lazy evaluation to a class without clients of that class knowing it's been <NOBR>done.<SCRIPT>create_link(67);</SCRIPT>
</NOBR></P><A NAME="41119"></A>
<P><A NAME="dingp68"></A>
Look again at the code fragments used in the above examples, and you can verify that the class interfaces offer no hints about whether eager or lazy evaluation is used by the classes. That means it's possible to implement a class using a straightforward eager evaluation strategy, but then, if your profiling investigations (see <A HREF="#40995" onMouseOver = "self.status = 'Link to MEC++ Item 16'; return true" onMouseOut = "self.status = self.defaultStatus">Item 16</A>) show that class's implementation is a performance bottleneck, you can replace its implementation with one based on lazy evaluation. (See also <A HREF="../EC/EC5_FR.HTM#6793" TARGET="_top" onMouseOver = "self.status = 'Link to EC++ Item 34'; return true" onMouseOut = "self.status = self.defaultStatus">Item E34</A>.) The only change your clients will see (after recompilation or relinking) is improved performance. That's the kind of software enhancement clients love, one that can make you downright proud to be <NOBR>lazy.<SCRIPT>create_link(68);</SCRIPT>
</NOBR></P>
<!-- SectionName="M18: Amortize the cost of expected computations" -->
<A NAME="41124"></A>
<DIV ALIGN="CENTER"><FONT SIZE="-1">Back to <A HREF="#41011">Item 17: Consider using lazy evaluation</A> &nbsp;&nbsp;<BR>&nbsp;&nbsp;Continue to <A HREF="#41177">Item 19: Understand the origin of temporary objects</A></FONT></DIV>
<P><A NAME="dingp69"></A><font ID="mititle">Item 18: &nbsp;Amortize the cost of expected computations.</font><SCRIPT>create_link(69);</SCRIPT>
</P>

<A NAME="72146"></A><A NAME="41128"></A>
<P><A NAME="dingp70"></A>
In <A HREF="#41011" onMouseOver = "self.status = 'Link to MEC++ Item 17'; return true" onMouseOut = "self.status = self.defaultStatus">Item 17</A>, I extolled the virtues of laziness, of putting things off as long as possible, and I explained how laziness can improve the efficiency of your programs. In this item, I adopt a different stance. Here, laziness has no place. I now encourage you to improve the performance of your software by having it do <I>more</I> than it's asked to do. The <A NAME="p94"></A>philosophy of this item might be called <I>over-eager evaluation</I>: doing things <I>before</I> you're asked to do <NOBR>them.<SCRIPT>create_link(70);</SCRIPT>
</NOBR></P><A NAME="41129"></A>
<P><A NAME="dingp71"></A>
Consider, for example, a template for classes representing large collections of numeric <NOBR>data:<SCRIPT>create_link(71);</SCRIPT>
</NOBR></P>
<A NAME="41130"></A>
<UL><PRE>template&lt;class NumericalType&gt;
class DataCollection {
public:
  NumericalType min() const;
  NumericalType max() const;
  NumericalType avg() const;
  ...
};
</PRE>
</UL>
<A NAME="83930"></A>
<P><A NAME="dingp72"></A>
Assuming the <CODE>min</CODE>, <CODE>max</CODE>, and <CODE>avg</CODE> functions return the current minimum, maximum, and average values of the collection, there are three ways in which these functions can be implemented. Using eager evaluation, we'd examine all the data in the collection when <CODE>min</CODE>, <CODE>max</CODE>, or <CODE>avg</CODE> was called, and we'd return the appropriate value. Using lazy evaluation, we'd have the functions return data structures that could be used to determine the appropriate value whenever the functions' return values were actually used. Using over-eager evaluation, we'd keep track of the running minimum, maximum, and average values of the collection, so when <CODE>min</CODE>, <CODE>max</CODE>, or <CODE>avg</CODE> was called, we'd be able to return the correct value immediately &#151; no computation would be required. If <CODE>min</CODE>, <CODE>max</CODE>, and <CODE>avg</CODE> were called frequently, we'd be able to amortize the cost of keeping track of the collection's minimum, maximum, and average values over all the calls to those functions, and the amortized cost per call would be lower than with eager or lazy <NOBR>evaluation.<SCRIPT>create_link(72);</SCRIPT>
</NOBR></P><A NAME="41134"></A>
<P><A NAME="dingp73"></A>
The idea behind over-eager evaluation is that if you expect a computation to be requested frequently, you can lower the average cost per request by designing your data structures to handle the requests especially <NOBR>efficiently.<SCRIPT>create_link(73);</SCRIPT>
</NOBR></P><A NAME="41135"></A>
<P><A NAME="dingp74"></A>
One of the simplest ways to do this is by caching values that have already been computed and are likely to be needed again. For example, suppose you're writing a program to provide information about employees, and one of the pieces of information you expect to be requested frequently is an employee's cubicle number. Further suppose that employee information is stored in a database, but, for most applications, an employee's cubicle number is irrelevant, so the database is not optimized to find it. To avoid having your specialized application unduly stress the database with repeated lookups of employee cubicle numbers, you could write a <CODE>findCubicleNumber</CODE> function that caches the cubicle numbers it looks up. Subsequent requests for cubicle <A NAME="p95"></A>numbers that have already been retrieved can then be satisfied by consulting the cache instead of querying the <NOBR>database.<SCRIPT>create_link(74);</SCRIPT>
</NOBR></P><A NAME="63502"></A>
<P><A NAME="dingp75"></A>
Here's one way to implement <CODE>findCubicleNumber</CODE>; it uses a <CODE>map</CODE> object from the Standard Template Library (the "STL" &#151; see <A HREF="./MC6_FR.HTM#5473" TARGET="_top" onMouseOver = "self.status = 'Link to MEC++ Item 35'; return true" onMouseOut = "self.status = self.defaultStatus">Item 35</A>) as a local <NOBR>cache:<SCRIPT>create_link(75);</SCRIPT>
</NOBR></P>
<A NAME="63447"></A>
<UL><PRE>int findCubicleNumber(const string&amp; employeeName)
{
  // define a static map to hold (employee name, cubicle number)
  // pairs. This map is the local cache.
  typedef map&lt;string, int&gt; CubicleMap;
  static CubicleMap cubes;
<A NAME="63454"></A>
  // try to find an entry for employeeName in the cache;
  // the STL iterator "it" will then point to the found
  // entry, if there is one (see <A HREF="./MC6_FR.HTM#5473" TARGET="_top" onMouseOver = "self.status = 'Link to MEC++ Item 35'; return true" onMouseOut = "self.status = self.defaultStatus">Item 35</A> for details)
  CubicleMap::iterator it = cubes.find(employeeName);
<A NAME="63455"></A>
  // "it"'s value will be cubes.end() if no entry was
  // found (this is standard STL behavior). If this is
  // the case, consult the database for the cubicle
  // number, then add it to the cache
  if (it == cubes.end()) {
    int cubicle =
      <I>the result of looking up employeeName's cubicle
      number in the database;</I>
<A NAME="63456"></A>
    cubes[employeeName] = cubicle;           // add the pair
                                             // (employeeName, cubicle)
                                             // to the cache
    return cubicle;
  }
  else {
    // "it" points to the correct cache entry, which is a
    // (employee name, cubicle number) pair. We want only
    // the second component of this pair, and the member
    // "second" will give it to us
    return (*it).second;
  }
}
</PRE>
</UL>
<A NAME="63542"></A>
<P><A NAME="dingp76"></A>
Try not to get bogged down in the details of the STL code (which will be clearer after you've read <A HREF="./MC6_FR.HTM#5473" TARGET="_top" onMouseOver = "self.status = 'Link to MEC++ Item 35'; return true" onMouseOut = "self.status = self.defaultStatus">Item 35</A>). Instead, focus on the general strategy embodied by this function. That strategy is to use a local cache to replace comparatively expensive database queries with comparatively inexpensive lookups in an in-memory data structure. Provided we're correct in assuming that cubicle numbers will frequently be requested more than once, the use of a cache in <CODE>findCubicleNumber</CODE> should reduce the average cost of returning an employee's cubicle <NOBR>number.<SCRIPT>create_link(76);</SCRIPT>
</NOBR></P><A NAME="81306"></A>
<A NAME="p96"></A>
<P><A NAME="dingp77"></A>
(One detail of the code requires explanation. The final statement returns <CODE>(*it).second</CODE> instead of the more conventional <CODE>it-&gt;second</CODE>. Why? The answer has to do with the conventions followed by the STL. In brief, the iterator <CODE>it</CODE> is an object, not a pointer, so there is no guarantee that "<CODE>-&gt;</CODE>" can be applied to <CODE>it</CODE>.<A HREF="#81381"><sup>6</sup></A> The STL does require that "<CODE>.</CODE>" and "<CODE>*</CODE>" be valid for iterators, however, so <CODE>(*it).second</CODE>, though syntactically clumsy, is guaranteed to <NOBR>work.)<SCRIPT>create_link(77);</SCRIPT>
</NOBR></P><A NAME="41146"></A>
<P><A NAME="dingp78"></A>
Caching is one way to amortize the cost of anticipated computations. Prefetching is another. You can think of prefetching as the computational equivalent of a discount for buying in bulk. Disk controllers, for example, read entire blocks or sectors of data when they read from disk, even if a program asks for only a small amount of data. That's because it's faster to read a big chunk once than to read two or three small chunks at different times. Furthermore, experience has shown that if data in one place is requested, it's quite common to want nearby data, too. This is the infamous <I>locality of reference</I> phenomenon, and systems designers rely on it to justify disk caches, memory caches for both instructions and data, and instruction <NOBR>prefetches.<SCRIPT>create_link(78);</SCRIPT>
</NOBR></P><A NAME="41147"></A>
<P><A NAME="dingp79"></A>
Excuse me? You say you don't worry about such low-level things as disk controllers or CPU caches? No problem. Prefetching can yield dividends for even one as high-level as you. Imagine, for example, you'd like to implement a template for dynamic arrays, i.e., arrays that start with a size of one and automatically extend themselves so that all nonnegative indices are <NOBR>valid:<SCRIPT>create_link(79);</SCRIPT>
</NOBR></P>
<A NAME="41148"></A>
<UL><PRE>
template&lt;class T&gt;                            // template for dynamic
class DynArray { ... };                      // array-of-T classes
<A NAME="41149"></A>
DynArray&lt;double&gt; a;                          // at this point, only a[0]
                                             // is a legitimate array
                                             // element
<A NAME="41150"></A>
a[22] = 3.5;                                 // a is automatically
                                             // extended: valid indices
                                             // are now 0-22
<A NAME="41151"></A>
a[32] = 0;                                   // a extends itself again;
                                             // now a[0]-a[32] are valid
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -