📄 mi18.htm
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN" "http://www.w3.org/TR/REC-html40/frameset.dtd">
<HTML LANG="EN">
<HEAD>
<TITLE>More Effective C++ | Item 18: Amortize the cost of expected computations</TITLE>
<LINK REL=STYLESHEET HREF=../INTRO/ECMEC.CSS>
<SCRIPT LANGUAGE="Javascript" SRC="../JAVA/COOKIE.JS"></SCRIPT>
<SCRIPT LANGUAGE="Javascript">var imagemax = 0; setCurrentMax(0);</SCRIPT>
<SCRIPT LANGUAGE="Javascript" SRC="../JAVA/DINGBATS.JS"></SCRIPT>
<SCRIPT>
var dingbase = "MI18_DIR.HTM";
var dingtext = "Item M18, P";
if (self == top) {
top.location.replace(dingbase + this.location.hash);
}
</SCRIPT>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" ONLOAD="setResize()">
<!-- SectionName="M18: Amortize the cost of expected computations" -->
<A NAME="41124"></A>
<DIV ALIGN="CENTER"><FONT SIZE="-1">Back to <A HREF="./MI17_FR.HTM" TARGET="_top">Item 17: Consider using lazy evaluation</A> <BR> Continue to <A HREF="./MI19_FR.HTM" TARGET="_top">Item 19: Understand the origin of temporary objects</A></FONT></DIV>
<P><A NAME="dingp1"></A><font ID="mititle">Item 18: Amortize the cost of expected computations.</font><SCRIPT>create_link(1);</SCRIPT>
</P>
<A NAME="72146"></A><A NAME="41128"></A>
<P><A NAME="dingp2"></A>
In <A HREF="./MI17_FR.HTM#41011" TARGET="_top">Item 17</A>, I extolled the virtues of laziness, of putting things off as long as possible, and I explained how laziness can improve the efficiency of your programs. In this item, I adopt a different stance. Here, laziness has no place. I now encourage you to improve the performance of your software by having it do <I>more</I> than it's asked to do. The <A NAME="p94"></A>philosophy of this item might be called <I>over-eager evaluation</I>: doing things <I>before</I> you're asked to do <NOBR>them.<SCRIPT>create_link(2);</SCRIPT>
</NOBR></P><A NAME="41129"></A>
<P><A NAME="dingp3"></A>
Consider, for example, a template for classes representing large collections of numeric <NOBR>data:<SCRIPT>create_link(3);</SCRIPT>
</NOBR></P>
<A NAME="41130"></A>
<UL><PRE>template<class NumericalType>
class DataCollection {
public:
NumericalType min() const;
NumericalType max() const;
NumericalType avg() const;
...
};
</PRE>
</UL>
<P><A NAME="dingp4"></A><A NAME="83930"></A>
Assuming the <CODE>min</CODE>, <CODE>max</CODE>, and <CODE>avg</CODE> functions return the current minimum, maximum, and average values of the collection, there are three ways in which these functions can be implemented. Using eager evaluation, we'd examine all the data in the collection when <CODE>min</CODE>, <CODE>max</CODE>, or <CODE>avg</CODE> was called, and we'd return the appropriate value. Using lazy evaluation, we'd have the functions return data structures that could be used to determine the appropriate value whenever the functions' return values were actually used. Using over-eager evaluation, we'd keep track of the running minimum, maximum, and average values of the collection, so when <CODE>min</CODE>, <CODE>max</CODE>, or <CODE>avg</CODE> was called, we'd be able to return the correct value immediately — no computation would be required. If <CODE>min</CODE>, <CODE>max</CODE>, and <CODE>avg</CODE> were called frequently, we'd be able to amortize the cost of keeping track of the collection's minimum, maximum, and average values over all the calls to those functions, and the amortized cost per call would be lower than with eager or lazy <NOBR>evaluation.<SCRIPT>create_link(4);</SCRIPT>
</NOBR></P><A NAME="41134"></A>
<P><A NAME="dingp5"></A>
The idea behind over-eager evaluation is that if you expect a computation to be requested frequently, you can lower the average cost per request by designing your data structures to handle the requests especially <NOBR>efficiently.<SCRIPT>create_link(5);</SCRIPT>
</NOBR></P><A NAME="41135"></A>
<P><A NAME="dingp6"></A>
One of the simplest ways to do this is by caching values that have already been computed and are likely to be needed again. For example, suppose you're writing a program to provide information about employees, and one of the pieces of information you expect to be requested frequently is an employee's cubicle number. Further suppose that employee information is stored in a database, but, for most applications, an employee's cubicle number is irrelevant, so the database is not optimized to find it. To avoid having your specialized application unduly stress the database with repeated lookups of employee cubicle numbers, you could write a <CODE>findCubicleNumber</CODE> function that caches the cubicle numbers it looks up. Subsequent requests for cubicle <A NAME="p95"></A>numbers that have already been retrieved can then be satisfied by consulting the cache instead of querying the <NOBR>database.<SCRIPT>create_link(6);</SCRIPT>
</NOBR></P><A NAME="63502"></A>
<P><A NAME="dingp7"></A>
Here's one way to implement <CODE>findCubicleNumber</CODE>; it uses a <CODE>map</CODE> object from the Standard Template Library (the "STL" — see <a href="./MI35_FR.HTM#5473" TARGET="_top">Item 35</A>) as a local <NOBR>cache:<SCRIPT>create_link(7);</SCRIPT>
</NOBR></P>
<A NAME="63447"></A>
<UL><PRE>int findCubicleNumber(const string& employeeName)
{
// define a static map to hold (employee name, cubicle number)
// pairs. This map is the local cache.
typedef map<string, int> CubicleMap;
static CubicleMap cubes;
<A NAME="63454"></A>
// try to find an entry for employeeName in the cache;
// the STL iterator "it" will then point to the found
// entry, if there is one (see <a href="./MI35_FR.HTM#5473" TARGET="_top">Item 35</A> for details)
CubicleMap::iterator it = cubes.find(employeeName);
<A NAME="63455"></A>
// "it"'s value will be cubes.end() if no entry was
// found (this is standard STL behavior). If this is
// the case, consult the database for the cubicle
// number, then add it to the cache
if (it == cubes.end()) {
int cubicle =
<I>the result of looking up employeeName's cubicle
number in the database;</I>
<A NAME="63456"></A>
cubes[employeeName] = cubicle; // add the pair
// (employeeName, cubicle)
// to the cache
return cubicle;
}
else {
// "it" points to the correct cache entry, which is a
// (employee name, cubicle number) pair. We want only
// the second component of this pair, and the member
// "second" will give it to us
return (*it).second;
}
}
</PRE>
</UL>
<P><A NAME="dingp8"></A><A NAME="63542"></A>
Try not to get bogged down in the details of the STL code (which will be clearer after you've read <a href="./MI35_FR.HTM#5473" TARGET="_top">Item 35</A>). Instead, focus on the general strategy embodied by this function. That strategy is to use a local cache to replace comparatively expensive database queries with comparatively inexpensive lookups in an in-memory data structure. Provided we're correct in assuming that cubicle numbers will frequently be requested more than once, the use of a cache in <CODE>findCubicleNumber</CODE> should reduce the average cost of returning an employee's cubicle <NOBR>number.<SCRIPT>create_link(8);</SCRIPT>
</NOBR></P><A NAME="81306"></A>
<A NAME="p96"></A><P><A NAME="dingp9"></A>
One detail of the code requires explanation. The final statement returns <CODE>(*it).second</CODE> instead of the more conventional <CODE>it->second</CODE>. Why? The answer has to do with the conventions followed by the STL. In brief, the iterator <CODE>it</CODE> is an object, not a pointer, so there is no guarantee that "<CODE>-></CODE>" can be applied to <CODE>it</CODE>.<a href="#81381"><sup>6</sup></A> The STL does require that "<CODE>.</CODE>" and "<CODE>*</CODE>" be valid for iterators, however, so <CODE>(*it).second</CODE>, though syntactically clumsy, is guaranteed to <NOBR>work.)<SCRIPT>create_link(9);</SCRIPT>
</NOBR></P><A NAME="41146"></A>
<P><A NAME="dingp10"></A>
Caching is one way to amortize the cost of anticipated computations. Prefetching is another. You can think of prefetching as the computational equivalent of a discount for buying in bulk. Disk controllers, for example, read entire blocks or sectors of data when they read from disk, even if a program asks for only a small amount of data. That's because it's faster to read a big chunk once than to read two or three small chunks at different times. Furthermore, experience has shown that if data in one place is requested, it's quite common to want nearby data, too. This is the infamous <I>locality of reference</I> phenomenon, and systems designers rely on it to justify disk caches, memory caches for both instructions and data, and instruction <NOBR>prefetches.<SCRIPT>create_link(10);</SCRIPT>
</NOBR></P><A NAME="41147"></A>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -