private:
  ObjectID oid;
<A NAME="57564"></A>
  mutable string *field1Value;          // see below for a
  mutable int *field2Value;             // discussion of "mutable"
  mutable double *field3Value;
  mutable string *field4Value;
  ...
<A NAME="75179"></A>
};
<A NAME="41056"></A>
<A NAME="p89"></A>LargeObject::LargeObject(ObjectID id)
: oid(id), field1Value(0), field2Value(0), field3Value(0), ...
{}
<A NAME="41057"></A>
const string& LargeObject::field1() const
{
  if (field1Value == 0) {
    <I>read the data for field 1 from the database and make
    field1Value point to it;</I>
  }
<A NAME="41058"></A>
  return *field1Value;
}
</PRE>
</UL>
<P><A NAME="dingp25"></A><A NAME="41059"></A>Each field in the object is represented as a pointer to the necessary data, and the <CODE>LargeObject</CODE> constructor initializes each pointer to null. Such null pointers signify fields that have not yet been read from the database. Each <CODE>LargeObject</CODE> member function must check the state of a field's pointer before accessing the data it points to. If the pointer is null, the corresponding data must be read from the database before performing any operations on that <NOBR>data.<SCRIPT>create_link(25);</SCRIPT>
</NOBR></P>
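<P>For concreteness, here is a minimal compilable sketch of this pattern for a single field. The <CODE>ObjectID</CODE> typedef and the <CODE>readField1FromDatabase</CODE> function are placeholders for whatever your persistence layer actually provides, and copying and assignment are omitted for brevity:</P>
<UL><PRE>#include <string>
using std::string;

typedef int ObjectID;                        // stand-in for the real ID type

string readField1FromDatabase(ObjectID id);  // hypothetical helper; its
                                             // implementation isn't shown

class LargeObject {
public:
  LargeObject(ObjectID id)
  : oid(id), field1Value(0) {}               // start with no cached value

  ~LargeObject() { delete field1Value; }     // copying/assignment omitted

  const string& field1() const
  {
    if (field1Value == 0) {                       // not yet read from the database?
      field1Value =                               // modifying field1Value here is legal
        new string(readField1FromDatabase(oid));  // only because it's declared mutable
    }
    return *field1Value;
  }

private:
  ObjectID oid;
  mutable string *field1Value;               // null until first accessed
};
</PRE>
</UL>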
<P><A NAME="dingp26"></A><A NAME="41060"></A>
When implementing lazy fetching, you must confront the problem that null pointers may need to be initialized to point to real data from inside any member function, including <CODE>const</CODE> member functions like <CODE>field1</CODE>. However, compilers get cranky when you try to modify data members inside <CODE>const</CODE> member functions, so you've got to find a way to say, "It's okay, I know what I'm doing." The best way to say that is to declare the pointer fields <CODE>mutable</CODE>, which means they can be modified inside any member function, even inside <CODE>const</CODE> member functions (see <A HREF="../EC/EI21_FR.HTM#6003" TARGET="_top">Item E21</A>). That's why the fields inside <CODE>LargeObject</CODE> above are declared <CODE>mutable</CODE>.<SCRIPT>create_link(26);</SCRIPT>
</P>
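<P>If you haven't seen <CODE>mutable</CODE> in action before, this tiny made-up class shows the effect. Without the <CODE>mutable</CODE> on <CODE>accessCount</CODE>, compilers would reject the increment inside the <CODE>const</CODE> member function:</P>
<UL><PRE>class Widget {
public:
  Widget(): val(0), accessCount(0) {}

  int value() const
  {
    ++accessCount;              // compiles only because accessCount is mutable
    return val;
  }

private:
  int val;
  mutable int accessCount;      // may be modified even inside
};                              // const member functions
</PRE>
</UL>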
<P><A NAME="dingp27"></A><A NAME="41061"></A>
The <CODE>mutable</CODE> keyword is a relatively recent addition to C++, so it's possible your vendors don't yet support it. If not, you'll need to find another way to convince your compilers to let you modify data members inside <CODE>const</CODE> member functions. One workable strategy is the "fake <CODE>this</CODE>" approach, whereby you create a pointer-to-non-<CODE>const</CODE> that points to the same object as <CODE>this</CODE> does. When you want to modify a data member, you access it through the "fake <CODE>this</CODE>" <NOBR>pointer:<SCRIPT>create_link(27);</SCRIPT>
</NOBR></P>
<A NAME="41062"></A>
<UL><PRE>class LargeObject {
public:
  const string& field1() const;         // unchanged
  ...
<A NAME="57573"></A>
private:
  string *field1Value;                  // not declared mutable
  ...                                   // so that older
};                                      // compilers will accept it
<A NAME="41065"></A>
<A NAME="p90"></A>const string& LargeObject::field1() const
{
  // declare a pointer, fakeThis, that points where this
  // does, but where the constness of the object has been
  // cast away
  LargeObject * const fakeThis =
    const_cast<LargeObject* const>(this);
<A NAME="63052"></A>
  if (field1Value == 0) {
    fakeThis->field1Value =             // this assignment is OK,
      <i>the appropriate data </i>            // because what fakeThis
      <i>from the database; </i>              // points to isn't const
  }
<A NAME="41066"></A>
  return *field1Value;
}
</PRE>
</UL>
<A NAME="63104"></A>
<P><A NAME="dingp28"></A>
This function employs a <CODE>const_cast</CODE> (see<a href="./MI2_FR.HTM#77216" TARGET="_top"> Item 2</A>) to cast away the <CODE>const</CODE>ness of <CODE>*this</CODE>. If your compilers don't support <CODE>const_cast</CODE>, you can use an old C-style <NOBR>cast:<SCRIPT>create_link(28);</SCRIPT>
</NOBR></P>
<A NAME="63108"></A>
<UL><PRE>// Use of old-style cast to help emulate mutable
const string& LargeObject::field1() const
{
  LargeObject * const fakeThis = (LargeObject* const)this;
<A NAME="63110"></A>
  ...                                   // as above
<A NAME="63126"></A>
}
</PRE>
</UL>
<A NAME="41071"></A>
<P><A NAME="dingp29"></A>
Look again at the pointers inside <CODE>LargeObject</CODE>. Let's face it, it's tedious and error-prone to have to initialize all those <i>pointers</i> to null, then test each one before use. Fortunately, such drudgery can be automated through the use of <I>smart</I> pointers, which you can read about in <A HREF="./MI28_FR.HTM#61766" TARGET="_top">Item 28</A>. If you use smart pointers inside <CODE>LargeObject</CODE>, you'll also find you no longer need to declare the pointers <CODE>mutable</CODE>. Alas, it's only a temporary respite, because you'll wind up needing <CODE>mutable</CODE> once you sit down to implement the smart pointer classes. Think of it as conservation of <NOBR>inconvenience.<SCRIPT>create_link(29);</SCRIPT>
</NOBR></P>
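<P>To make that concrete, here is a stripped-down sketch of such a smart pointer. It is not the Item 28 design, just a hypothetical <CODE>LazyPtr</CODE> template (with an invented <CODE>readFieldFromDatabase</CODE> helper, and with copying omitted) that performs the null initialization and the null test itself. Notice that <CODE>mutable</CODE> hasn't disappeared; it has merely moved inside the smart pointer class:</P>
<UL><PRE>#include <string>
using std::string;

typedef int ObjectID;                            // stand-in for the real type

string readFieldFromDatabase(ObjectID id,        // hypothetical helper;
                             int fieldNumber);   // implementation not shown

template<class T>
class LazyPtr {                                  // fetches its value on
public:                                          // first dereference
  LazyPtr(ObjectID id, int field)
  : oid(id), fieldNumber(field), ptr(0) {}

  ~LazyPtr() { delete ptr; }                     // copying omitted for brevity

  const T& operator*() const
  {
    if (ptr == 0) {                              // first access: fetch now
      ptr = new T(readFieldFromDatabase(oid, fieldNumber));
    }
    return *ptr;
  }

private:
  ObjectID oid;
  int fieldNumber;
  mutable T *ptr;                                // the mutable lives here now
};

class LargeObject {
public:
  LargeObject(ObjectID id): field1Value(id, 1) {}   // no manual null init

  const string& field1() const
  { return *field1Value; }                          // no manual null test

private:
  LazyPtr<string> field1Value;                      // not declared mutable
};
</PRE>
</UL>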
<P><A NAME="dingp30"></A><font ID="mhtitle">Lazy Expression Evaluation</font><SCRIPT>create_link(30);</SCRIPT>
</P>
<A NAME="41076"></A>
<P><A NAME="dingp31"></A>A final example of lazy evaluation comes from numerical applications. Consider this <NOBR>code:<SCRIPT>create_link(31);</SCRIPT>
</NOBR></P>
<A NAME="41077"></A>
<UL><PRE>template<class T>
class Matrix { ... }; // for homogeneous matrices
<A NAME="41078"></A>
Matrix<int> m1(1000, 1000); // a 1000 by 1000 matrix
Matrix<int> m2(1000, 1000); // ditto
<A NAME="41079"></A>
...
<A NAME="41080"></A>
Matrix<int> m3 = m1 + m2; // add m1 and m2
</PRE>
</UL>
<A NAME="41081"></A><A NAME="p91"></A>
<P><A NAME="dingp32"></A>
The usual implementation of <CODE>operator+</CODE> would use eager evaluation; in this case it would compute and return the sum of <CODE>m1</CODE> and <CODE>m2</CODE>. That's a fair amount of computation (1,000,000 additions), and of course there's the cost of allocating the memory to hold all those values, <NOBR>too.<SCRIPT>create_link(32);</SCRIPT>
</NOBR></P><A NAME="50695"></A>
<P><A NAME="dingp33"></A>
The lazy evaluation strategy says that's <I>way</I> too much work, so it doesn't do it. Instead, it sets up a data structure inside <CODE>m3</CODE> that indicates that <CODE>m3</CODE>'s value is the sum of <CODE>m1</CODE> and <CODE>m2</CODE>. Such a data structure might consist of nothing more than a pointer to each of <CODE>m1</CODE> and <CODE>m2</CODE>, plus an enum indicating that the operation on them is addition. Clearly, it's going to be faster to set up this data structure than to add <CODE>m1</CODE> and <CODE>m2</CODE>, and it's going to use a lot less memory, <NOBR>too.<SCRIPT>create_link(33);</SCRIPT>
</NOBR></P><A NAME="41082"></A>
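<P>Such bookkeeping might be sketched as follows. This is only an illustration of the idea under simplifying assumptions: it records which operands a pending addition refers to and performs no arithmetic, and it ignores element storage, the code that eventually evaluates the pending operation, and the lifetime issues discussed at the end of this Item:</P>
<UL><PRE>template<class T>
class Matrix {
public:
  Matrix(int rows, int cols)
  : numRows(rows), numCols(cols),
    op(None), lhsSource(0), rhsSource(0) {}

  friend Matrix operator+(const Matrix& lhs, const Matrix& rhs)
  {
    Matrix result(lhs.numRows, lhs.numCols);
    result.op = PendingAdd;                 // remember what to do...
    result.lhsSource = &lhs;                // ...and which operands
    result.rhsSource = &rhs;                // to do it to
    return result;                          // no additions performed
  }

private:
  enum PendingOp { None, PendingAdd, PendingMultiply };

  int numRows, numCols;
  PendingOp op;                             // operation still owed
  const Matrix *lhsSource;                  // operands of the pending
  const Matrix *rhsSource;                  // operation, if any

  // element storage and the code that finally performs the pending
  // operation when a result is really needed are omitted
};
</PRE>
</UL>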
<P><A NAME="dingp34"></A>
Suppose that later in the program, before <CODE>m3</CODE> has been used, this code is <NOBR>executed:<SCRIPT>create_link(34);</SCRIPT>
</NOBR></P>
<A NAME="41084"></A>
<UL><PRE>Matrix<int> m4(1000, 1000);
<A NAME="63216"></A>
... // give m4 some values
<A NAME="63215"></A>
m3 = m4 * m1;
</PRE>
</UL>
<A NAME="41085"></A>
<P><A NAME="dingp35"></A>
Now we can forget all about <CODE>m3</CODE> being the sum of <CODE>m1</CODE> and <CODE>m2</CODE> (and thereby save the cost of the computation), and in its place we can start remembering that <CODE>m3</CODE> is the product of <CODE>m4</CODE> and <CODE>m1</CODE>. Needless to say, we don't perform the multiplication. Why bother? We're lazy, <NOBR>remember?<SCRIPT>create_link(35);</SCRIPT>
</NOBR></P><A NAME="41086"></A>
<P><A NAME="dingp36"></A>
This example looks contrived, because no good programmer would write a program that computed the sum of two matrices and failed to use it, but it's not as contrived as it seems. No good programmer would deliberately compute a value that's not needed, but during maintenance, it's not uncommon for a programmer to modify the paths through a program in such a way that a formerly useful computation becomes unnecessary. The likelihood of that happening is reduced by defining objects immediately prior to use (see <A HREF="../EC/EI32_FR.HTM#25939" TARGET="_top">Item E32</A>), but it's still a problem that occurs from time to <NOBR>time.<SCRIPT>create_link(36);</SCRIPT>
</NOBR></P><A NAME="41087"></A>
<P><A NAME="dingp37"></A>
Nevertheless, if that were the only time lazy evaluation paid off, it would hardly be worth the trouble. A more common scenario is that we need only <I>part</I> of a computation. For example, suppose we use <CODE>m3</CODE> as follows after initializing it to the sum of <CODE>m1</CODE> and <CODE>m2</CODE>:<SCRIPT>create_link(37);</SCRIPT>
</P>
<A NAME="41088"></A>
<UL><PRE>
cout << m3[4]; // print the 4th row of m3
</PRE>
</UL>
<A NAME="41089"></A>
<P><A NAME="dingp38"></A>
Clearly we can be completely lazy no longer — we've got to compute the values in the fourth row of <CODE>m3</CODE>. But let's not be overly ambitious, either. There's no reason we have to compute any <I>more</I> than the fourth row of <CODE>m3</CODE>; the remainder of <CODE>m3</CODE> can remain uncomputed until it's actually needed. With luck, it never will <NOBR>be.<SCRIPT>create_link(38);</SCRIPT>
</NOBR></P><A NAME="41090"></A>
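<P>Here is a small, self-contained sketch of that partial evaluation, using invented names (<CODE>IntMatrix</CODE>, <CODE>LazySum</CODE>) rather than the <CODE>Matrix</CODE> template above. The lazy sum computes a row the first time it is requested, caches it, and leaves every other row untouched:</P>
<UL><PRE>#include <vector>

class IntMatrix {                                  // toy dense matrix of ints
public:
  IntMatrix(int rows, int cols)
  : data(rows, std::vector<int>(cols)) {}

  int rows() const { return static_cast<int>(data.size()); }

  std::vector<int>& operator[](int r)             { return data[r]; }
  const std::vector<int>& operator[](int r) const { return data[r]; }

private:
  std::vector<std::vector<int> > data;
};

class LazySum {                                    // stands for lhs + rhs, but
public:                                            // computes rows on demand
  LazySum(const IntMatrix& a, const IntMatrix& b)
  : lhs(a), rhs(b), cache(a.rows()), computed(a.rows(), false) {}

  const std::vector<int>& operator[](int r) const
  {
    if (!computed[r]) {                            // first request for row r:
      const std::vector<int>& x = lhs[r];          // compute just this row
      const std::vector<int>& y = rhs[r];
      std::vector<int> sum(x.size());
      for (std::vector<int>::size_type i = 0; i < x.size(); ++i) {
        sum[i] = x[i] + y[i];
      }
      cache[r] = sum;                              // remember it for next time
      computed[r] = true;
    }
    return cache[r];                               // other rows stay unevaluated
  }

private:
  const IntMatrix& lhs;
  const IntMatrix& rhs;
  mutable std::vector<std::vector<int> > cache;    // per-row results
  mutable std::vector<bool> computed;              // which rows are done
};
</PRE>
</UL>
<P>With a scheme like this, printing one row of <CODE>m3</CODE> costs only one row's worth of additions; the other 999 rows are never computed unless something later asks for them.</P>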
<A NAME="p92"></A>
<P><A NAME="dingp39"></A>
How likely are we to be lucky? Experience in the domain of matrix computations suggests the odds are in our favor. In fact, lazy evaluation lies behind the wonder that is APL. APL was developed in the 1960s for interactive use by people who needed to perform matrix-based calculations. Running on computers that had less computational horsepower than the chips now found in high-end microwave ovens, APL was seemingly able to add, multiply, and even divide large matrices instantly! Its trick was lazy evaluation. The trick was usually effective, because APL users typically added, multiplied, or divided matrices not because they needed the entire resulting matrix, but only because they needed a small part of it. APL employed lazy evaluation to defer its computations until it knew exactly what part of a result matrix was needed, then it computed only that part. In practice, this allowed users to perform computationally intensive tasks <I>interactively</I> in an environment where the underlying machine was hopelessly inadequate for an implementation employing eager evaluation. Machines are faster today, but data sets are bigger and users less patient, so many contemporary matrix libraries continue to take advantage of lazy <NOBR>evaluation.<SCRIPT>create_link(39);</SCRIPT>
</NOBR></P><A NAME="41091"></A>
<P><A NAME="dingp40"></A>
To be fair, laziness sometimes fails to pay off. If <CODE>m3</CODE> is used in this <NOBR>way,<SCRIPT>create_link(40);</SCRIPT>
</NOBR></P>
<A NAME="41092"></A>
<UL><PRE>
cout << m3; // print out all of m3
</PRE>
</UL>
<A NAME="41093"></A>
<P><A NAME="dingp41"></A>the jig is up and we've got to compute a complete value for <CODE>m3</CODE>. Similarly, if one of the matrices on which <CODE>m3</CODE> is dependent is about to be modified, we have to take immediate <NOBR>action:<SCRIPT>create_link(41);</SCRIPT>
</NOBR></P>
<A NAME="41094"></A>
<UL><PRE>
m3 = m1 + m2; // remember that m3 is the
// sum of m1 and m2
<A NAME="41095"></A>
m1 = m4; // now m3 is the sum of m2
// and the OLD value of m1!
</PRE>
</UL>
<A NAME="41096"></A>
<P><A NAME="dingp42"></A>
Here we've got to do something to ensure that the assignment to <CODE>m1</CODE> doesn't change <CODE>m3</CODE>. Inside the <CODE>Matrix<int></CODE> assignment operator, we might compute <CODE>m3</CODE>'s value prior to changing <CODE>m1</CODE> or we might make a copy of the old value of <CODE>m1</CODE> and make <CODE>m3</CODE> dependent on that, but we have to do <I>something</I> to guarantee that <CODE>m3</CODE> has the value it's supposed to have after <CODE>m1</CODE> has been the target of an assignment. Other functions that might modify a matrix must be handled in a similar <NOBR>fashion.<SCRIPT>create_link(42);</SCRIPT>
</NOBR></P>
<P><A NAME="dingp43"></A><A NAME="41097"></A>
Because of the need to store dependencies between values; to maintain data structures that can store values, dependencies, or a combination of the two; and to overload operators like assignment, copying, and addition, lazy evaluation in a numerical domain is a lot of work. On the other hand, it often ends up saving significant amounts of time and space during program runs, and in many applications, that's a payoff that easily justifies the significant effort lazy evaluation <NOBR>requires.<SCRIPT>create_link(43);</SCRIPT>
</NOBR></P>
<A NAME="p93"></A>
<P><A NAME="dingp44"></A><font ID="mhtitle">Summary</font><SCRIPT>create_link(44);</SCRIPT>
</P>
<A NAME="63263"></A>
<P><A NAME="dingp45"></A>
These four examples show that lazy evaluation can be useful in a variety of domains: to avoid unnecessary copying of objects, to distinguish reads from writes using <CODE>operator[]</CODE>, to avoid unnecessary reads from databases, and to avoid unnecessary numerical computations. Nevertheless, it's not always a good idea. Just as procrastinating on your clean-up chores won't save you any work if your parents always check up on you, lazy evaluation won't save your program any work if all your computations are necessary. Indeed, if all your computations are essential, lazy evaluation may slow you down and increase your use of memory, because, in addition to having to do all the computations you were hoping to avoid, you'll also have to manipulate the fancy data structures needed to make lazy evaluation possible in the first place. Lazy evaluation is only useful when there's a reasonable chance your software will be asked to perform computations that can be <NOBR>avoided.<SCRIPT>create_link(45);</SCRIPT>
</NOBR></P>
<P><A NAME="dingp46"></A><A NAME="41117"></A>
There's nothing about lazy evaluation that's specific to C++. The technique can be applied in any programming language, and several languages — notably APL, some dialects of Lisp, and virtually all dataflow languages — embrace the idea as a fundamental part of the language. Mainstream programming languages employ eager evaluation, however, and C++ is mainstream. Yet C++ is particularly suitable as a vehicle for user-implemented lazy evaluation, because its support for encapsulation makes it possible to add lazy evaluation to a class without clients of that class knowing it's been <NOBR>done.<SCRIPT>create_link(46);</SCRIPT>
</NOBR></P>
<P><A NAME="dingp47"></A><A NAME="41119"></A>
Look again at the code fragments used in the above examples, and you can verify that the class interfaces offer no hints about whether eager or lazy evaluation is used by the classes. That means it's possible to implement a class using a straightforward eager evaluation strategy, but then, if your profiling investigations (see <A HREF="./MI16_FR.HTM#40995" TARGET="_top">Item 16</A>) show that class's implementation is a performance bottleneck, you can replace its implementation with one based on lazy evaluation. (See also <A HREF="../EC/EI34_FR.HTM#6793" TARGET="_top">Item E34</A>.) The only change your clients will see (after recompilation or relinking) is improved performance. That's the kind of software enhancement clients love, one that can make you downright proud to be <NOBR>lazy.<SCRIPT>create_link(47);</SCRIPT>
</NOBR></P>
<DIV ALIGN="CENTER"><FONT SIZE="-1">Back to <A HREF="./MI16_FR.HTM" TARGET="_top">Item 16: Remember the 80-20 rule</A> <BR> Continue to <A HREF="./MI18_FR.HTM" TARGET="_top">Item 18: Amortize the cost of expected computations</A></FONT></DIV>
</BODY>
</HTML>