<UL><PRE>UPNumber *numberArray = new UPNumber[100];
</PRE>
</UL><A NAME="22067"></A>
<P><A NAME="dingp18"></A>The first problem is that the memory for the array is allocated by <CODE>operator</CODE> <CODE><NOBR>new[]</NOBR></CODE>, not <CODE>operator</CODE> <CODE>new</CODE>, but (provided your compilers support it) you can write the former function as easily as the latter. What is more troublesome is the fact that <CODE>numberArray</CODE> has 100 elements, so there will be 100 constructor calls. But there is only one call to allocate memory, so <CODE>onTheHeap</CODE> will be set to true for only the first of those 100 constructors. When the second constructor is called, an exception is thrown, and woe is <NOBR>you.<SCRIPT>create_link(18);</SCRIPT>
</NOBR></P><A NAME="59757"></A>
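<P>The mismatch between one allocation and many constructor calls can be observed directly. The following is a minimal sketch, not the <CODE>UPNumber</CODE> class itself: <CODE>Counted</CODE>, <CODE>Counts</CODE>, and <CODE>countArrayAllocation</CODE> are hypothetical names introduced only for illustration, and the class does nothing except count calls to its own <CODE>operator</CODE> <CODE><NOBR>new[]</NOBR></CODE> and to its constructor.</P>

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for UPNumber (an assumption for illustration):
// it only counts how often memory is requested and how often an
// object is constructed.
struct Counted {
  static int allocations;    // calls to Counted::operator new[]
  static int constructions;  // calls to the Counted constructor

  Counted() { ++constructions; }

  static void* operator new[](std::size_t size) {
    ++allocations;
    void* p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    return p;
  }

  static void operator delete[](void* p) noexcept { std::free(p); }
};

int Counted::allocations = 0;
int Counted::constructions = 0;

struct Counts {
  int allocations;
  int constructions;
};

// Allocates an array of n objects and reports both counters.
Counts countArrayAllocation(std::size_t n) {
  Counted::allocations = 0;
  Counted::constructions = 0;
  Counted* array = new Counted[n];  // one allocation, n constructors
  Counts result{Counted::allocations, Counted::constructions};
  delete[] array;
  return result;
}
```

<P>For an array of 100 elements the sketch reports a single allocation and 100 constructions, which is why a flag set once per allocation cannot serve every element's constructor.</P>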

<P><A NAME="dingp19"></A>Even without arrays, this bit-setting business may fail. Consider this <NOBR>statement:<SCRIPT>create_link(19);</SCRIPT>
</NOBR></P>
<A NAME="59758"></A>
<UL><PRE>UPNumber *pn = new UPNumber(*new UPNumber);
</PRE>
</UL><A NAME="63799"></A>

<P><A NAME="dingp20"></A>Here we create two <CODE>UPNumber</CODE>s on the heap and make <CODE>pn</CODE> point to one of them; it's initialized with the value of the second one. This code has a resource leak, but let us ignore that in favor of an examination of what happens during execution of this <NOBR>expression:<SCRIPT>create_link(20);</SCRIPT>
</NOBR></P>
<A NAME="59785"></A>
<UL><PRE>new UPNumber(*new UPNumber)
</PRE>
</UL><A NAME="59772"></A>

<P><A NAME="dingp21"></A>This contains two calls to the <CODE>new</CODE> operator, hence two calls to <CODE>operator</CODE> <CODE>new</CODE> and two calls to <CODE>UPNumber</CODE> constructors (see <A HREF="./MI8_FR.HTM#33985" TARGET="_top">Item 8</A>). Programmers typically expect these function calls to be executed in this <NOBR>order,<SCRIPT>create_link(21);</SCRIPT>
</NOBR></P>
<A NAME="59798"></A><OL TYPE="1"><A NAME="dingp22"></A><LI>Call <CODE>operator</CODE> <CODE>new</CODE> for first object<SCRIPT>create_link(22);</SCRIPT>

<A NAME="59799"></A><A NAME="dingp23"></A><LI>Call constructor for first object<SCRIPT>create_link(23);</SCRIPT>

<A NAME="59803"></A><A NAME="dingp24"></A><LI>Call <CODE>operator</CODE> <CODE>new</CODE> for second object<SCRIPT>create_link(24);</SCRIPT>

<A NAME="59804"></A><A NAME="dingp25"></A><LI>Call constructor for second object<SCRIPT>create_link(25);</SCRIPT>

</OL>
<A NAME="59805"></A>
<P><A NAME="dingp26"></A>but the language makes no guarantee that this is how it will be done. Some compilers generate the function calls in this order <NOBR>instead:<SCRIPT>create_link(26);</SCRIPT>
</NOBR></P>
<A NAME="59807"></A><OL TYPE="1"><A NAME="dingp27"></A><LI>Call <CODE>operator</CODE> <CODE>new</CODE> for first object<SCRIPT>create_link(27);</SCRIPT>

<A NAME="59818"></A><A NAME="dingp28"></A><LI>Call <CODE>operator</CODE> <CODE>new</CODE> for second object<SCRIPT>create_link(28);</SCRIPT>

<A NAME="59808"></A><A NAME="dingp29"></A><LI>Call constructor for first object<SCRIPT>create_link(29);</SCRIPT>

<A NAME="59810"></A><A NAME="dingp30"></A><LI>Call constructor for second object<SCRIPT>create_link(30);</SCRIPT>

</OL>
<A NAME="59821"></A>

<P><A NAME="dingp31"></A><A NAME="p150"></A>There is nothing wrong with compilers that generate this kind of code, but the set-a-bit-in-<CODE>operator</CODE>-<CODE>new</CODE> trick fails with such compilers. That's because the bit set in steps 1 and 2 is cleared in step 3, thus making the object constructed in step 4 think it's not on the heap, even though it <NOBR>is.<SCRIPT>create_link(31);</SCRIPT>
</NOBR></P><A NAME="22073"></A>
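<P>The failure can be replayed by hand. The sketch below is an assumption-laden reduction of the bit-setting scheme: the class <CODE>Flagged</CODE> and both helper functions are hypothetical, the constructor records the flag instead of throwing, and the second ordering is imitated explicitly with direct calls to <CODE>operator</CODE> <CODE>new</CODE> followed by placement construction, rather than relying on what any particular compiler generates.</P>

```cpp
#include <cstddef>
#include <new>

// Hypothetical reduction of the set-a-bit scheme: operator new sets a
// flag, and the constructor records the flag's value and then clears it.
class Flagged {
public:
  Flagged() : heapBased(onTheHeap) { onTheHeap = false; }

  bool thinksItIsOnHeap() const { return heapBased; }

  static void* operator new(std::size_t size) {
    onTheHeap = true;
    return ::operator new(size);
  }

  // Placement forms, needed because the member operator new above
  // hides the global placement version.
  static void* operator new(std::size_t, void* where) noexcept { return where; }
  static void operator delete(void*, void*) noexcept {}

  static void operator delete(void* p) noexcept { ::operator delete(p); }

private:
  static bool onTheHeap;
  bool heapBased;
};

bool Flagged::onTheHeap = false;

// The ordinary case: allocate, then construct immediately.
bool directNewDetectsHeap() {
  Flagged* p = new Flagged;
  bool result = p->thinksItIsOnHeap();
  delete p;
  return result;
}

// Replays the second ordering: both allocations, then both constructors.
bool secondObjectDetectsHeap() {
  void* raw1 = Flagged::operator new(sizeof(Flagged));  // step 1
  void* raw2 = Flagged::operator new(sizeof(Flagged));  // step 2
  Flagged* first = new (raw1) Flagged;                  // step 3 clears the flag
  Flagged* second = new (raw2) Flagged;                 // step 4 sees it cleared
  bool result = second->thinksItIsOnHeap();
  delete second;
  delete first;
  return result;
}
```

<P>In this sketch <CODE>directNewDetectsHeap</CODE> yields true, while <CODE>secondObjectDetectsHeap</CODE> yields false even though both objects live in memory obtained from <CODE>operator</CODE> <CODE>new</CODE>.</P>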

<P><A NAME="dingp32"></A>These difficulties don't invalidate the basic idea of having each constructor check to see if <CODE>*this</CODE> is on the heap. Rather, they indicate that checking a bit set inside <CODE>operator</CODE> <CODE>new</CODE> (or <CODE>operator</CODE> <CODE><NOBR>new[]</NOBR></CODE>) is not a reliable way to determine this information. What we need is a better way to figure it <NOBR>out.<SCRIPT>create_link(32);</SCRIPT>
</NOBR></P><A NAME="22207"></A>

<P><A NAME="dingp33"></A>If you're desperate enough, you might be tempted to descend into the realm of the unportable. For example, you might decide to take advantage of the fact that on many systems, a program's address space is organized as a linear sequence of addresses, with the program's stack growing down from the top of the address space and the heap rising up from the <NOBR>bottom:<SCRIPT>create_link(33);</SCRIPT>
</NOBR></P>

<SPAN ID="Image1of1" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_150A1.GIF" BORDER=0></SPAN>
<SPAN ID="Image1of2" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_150A2.GIF" BORDER=0></SPAN>
<SPAN ID="Image1of3" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_150A3.GIF" BORDER=0></SPAN>
<SPAN ID="Image1of4" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_150A4.GIF" BORDER=0></SPAN>
<SPAN ID="Image1of5" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_150A5.GIF" BORDER=0></SPAN>

<SPAN ID="Image1of6" STYLE="position: relative; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_150A5.GIF" BORDER=0></SPAN>

<A NAME="22297"></A>
<P><A NAME="dingp34"></A>On systems that organize a program's memory in this way (many do, but many do not), you might think you could use the following function to determine whether a particular address is on the <NOBR>heap:<SCRIPT>create_link(34);</SCRIPT>
</NOBR></P>
<A NAME="22298"></A>

<UL><PRE>// incorrect attempt to determine whether an address
// is on the heap
bool onHeap(const void *address)
{
  char onTheStack;                   // local stack variable
<A NAME="22299"></A>
  return address &lt; &amp;onTheStack;
}
</PRE>
</UL><A NAME="22300"></A>

<P><A NAME="dingp35"></A>The thinking behind this function is interesting. Inside <CODE>onHeap</CODE>, <CODE>onTheStack</CODE> is a local variable. As such, it is, well, it's on the stack. <A NAME="p151"></A>When <CODE>onHeap</CODE> is called, its stack frame (i.e., its activation record) will be placed at the top of the program's stack, and because the stack grows down (toward lower addresses) in this architecture, the address of <CODE>onTheStack</CODE> must be less than the address of any other stack-based variable or object. If the parameter <CODE>address</CODE> is less than the location of <CODE>onTheStack</CODE>, it can't be on the stack, so it must be on the <NOBR>heap.<SCRIPT>create_link(35);</SCRIPT>
</NOBR></P>
<A NAME="22317"></A>

<P><A NAME="dingp36"></A>Such logic is fine, as far as it goes, but it doesn't go far enough. The fundamental problem is that there are <I>three</I> places where objects may be allocated, not two. Yes, the stack and the heap hold objects, but let us not forget about <I>static</I> objects. Static objects are those that are initialized only once during a program run. Static objects comprise not only those objects explicitly declared <CODE>static</CODE>, but also objects at global and namespace scope (see <A HREF="../EC/EI47_FR.HTM#8299" TARGET="_top">Item E47</A>). Such objects have to go somewhere, and that somewhere is neither the stack nor the <NOBR>heap.<SCRIPT>create_link(36);</SCRIPT>
</NOBR></P>

<A NAME="22337"></A>

<P><A NAME="dingp37"></A>Where they go is system-dependent, but on many of the systems that have the stack and heap grow toward one another, they go below the heap. The earlier picture of memory organization, while telling the truth and nothing but the truth for many systems, failed to tell the whole truth for those systems. With static objects added to the picture, it looks like <NOBR>this:<SCRIPT>create_link(37);</SCRIPT>
</NOBR></P>

<SPAN ID="Image2of1" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_151A1.GIF" BORDER=0></SPAN>
<SPAN ID="Image2of2" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_151A2.GIF" BORDER=0></SPAN>
<SPAN ID="Image2of3" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_151A3.GIF" BORDER=0></SPAN>
<SPAN ID="Image2of4" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_151A4.GIF" BORDER=0></SPAN>
<SPAN ID="Image2of5" STYLE="position: absolute; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_151A5.GIF" BORDER=0></SPAN>

<SPAN ID="Image2of6" STYLE="position: relative; z-index:1; visibility: hidden"><IMG SRC="./IMAGES/GRAPHICS/DIAGRAMS/I_151A5.GIF" BORDER=0></SPAN>

<A NAME="22200"></A>
<P><A NAME="dingp38"></A>Suddenly it becomes clear why <CODE>onHeap</CODE> won't work, not even on systems where it's purported to: it fails to distinguish between heap objects and static <NOBR>objects:<SCRIPT>create_link(38);</SCRIPT>
</NOBR></P>
<A NAME="22345"></A>
<UL><PRE>
void allocateSomeObjects()
{
  char *pc = new char;               // heap object: onHeap(pc)
                                     // will return true
<A NAME="22347"></A>
<A NAME="p152"></A>  char c;                            // stack object: onHeap(&amp;c)
                                     // will return false
<A NAME="22348"></A>
  static char sc;                    // static object: onHeap(&amp;sc)
                                     // will return true
  ...
<A NAME="22352"></A>
}
</PRE>
</UL><A NAME="22341"></A>

<P><A NAME="dingp39"></A>Now, you may be desperate for a way to tell heap objects from stack objects, and in your desperation you may be willing to strike a deal with the portability Devil, but are you so desperate that you'll strike a deal that fails to guarantee you the right answers? Surely not, so I know you'll reject this seductive but unreliable compare-the-addresses <NOBR>trick.<SCRIPT>create_link(39);</SCRIPT>
</NOBR></P>
<A NAME="22377"></A>

<P><A NAME="dingp40"></A>The sad fact is there's not only no portable way to determine whether an object is on the heap, there isn't even a semi-portable way that works most of the time. If you absolutely, positively have to tell whether an address is on the heap, you're going to have to turn to unportable, implementation-dependent system calls, and that's that. As such, you're better off trying to redesign your software so you don't need to determine whether an object is on the heap in the first <NOBR>place.<SCRIPT>create_link(40);</SCRIPT>
</NOBR></P>
<A NAME="22684"></A>

<P><A NAME="dingp41"></A>If you find yourself obsessing over whether an object is on the heap, the likely cause is that you want to know if it's safe to invoke <CODE>delete</CODE> on it. Often such deletion will take the form of the infamous "<CODE>delete</CODE> <CODE>this</CODE>." Knowing whether it's safe to delete a pointer, however, is not the same as simply knowing whether that pointer points to something on the heap, because not all pointers to things on the heap can be safely <CODE>delete</CODE>d. Consider again an <CODE>Asset</CODE> object that contains a <CODE>UPNumber</CODE> <NOBR>object:<SCRIPT>create_link(41);</SCRIPT>
</NOBR></P>
<A NAME="22704"></A>
<UL><PRE>class Asset {
private:
  UPNumber value;
  ...
<A NAME="22707"></A>
};
<A NAME="22715"></A>
Asset *pa = new Asset;
</PRE>
</UL><A NAME="22716"></A>

<P><A NAME="dingp42"></A>Clearly <CODE>*pa</CODE> (including its member <CODE>value</CODE>) is on the heap. Equally clearly, it's not safe to invoke <CODE>delete</CODE> on a pointer to <CODE>pa-&gt;value</CODE>, because no such pointer was ever returned from <CODE>new</CODE>.<SCRIPT>create_link(42);</SCRIPT>
</P>
<A NAME="23350"></A>

<P><A NAME="dingp43"></A>As luck would have it, it's easier to determine whether it's safe to delete a pointer than to determine whether a pointer points to something on the heap, because all we need to answer the former question is a collection of addresses that have been returned by <CODE>operator</CODE> <CODE>new</CODE>. Since we can write <CODE>operator</CODE> <CODE>new</CODE> ourselves (see Items <A HREF="../EC/EI8_FR.HTM#120851" TARGET="_top">E8</A>-<a href="../EC/EI10_FR.HTM#1986" TARGET="_top">E10</A>), it's easy to construct such a collection. Here's how we might approach the <NOBR>problem:<SCRIPT>create_link(43);</SCRIPT>
</NOBR></P>
<A NAME="23351"></A>
<UL><PRE><A NAME="p153"></A>void *operator new(size_t size)
{
  void *p = getMemory(size);         // call some function to
                                     // allocate memory and
                                     // handle out-of-memory
                                     // conditions
<A NAME="22734"></A>
  <I>add p to the collection of allocated addresses;</I>
<A NAME="22735"></A>
  return p;
<A NAME="22736"></A>
}
<A NAME="22739"></A>
void operator delete(void *ptr)
{
  releaseMemory(ptr);                // return memory to
                                     // free store
<A NAME="22740"></A>
  <I>remove ptr from the collection of allocated addresses;</I>
}
<A NAME="22741"></A>
bool isSafeToDelete(const void *address)
{
  <I>return whether address is in collection of
  allocated addresses;</I>
}
</PRE>
</UL><A NAME="22910"></A>

<P><A NAME="dingp44"></A>This is about as simple as it gets. <CODE>operator</CODE> <CODE>new</CODE> adds entries to a collection of allocated addresses, <CODE>operator</CODE> <CODE>delete</CODE> removes entries, and <CODE>isSafeToDelete</CODE> does a lookup in the collection to see if a particular address is there. If the <CODE>operator</CODE> <CODE>new</CODE> and <CODE>operator</CODE> <CODE>delete</CODE> functions are at global scope, this should work for all types, even the <NOBR>built-ins.<SCRIPT>create_link(44);</SCRIPT>
</NOBR></P>
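<P>One way to make the bookkeeping concrete is sketched below. To keep the example from replacing the real global operators, it uses ordinary functions named <CODE>trackedAllocate</CODE> and <CODE>trackedRelease</CODE> in a namespace called <CODE>sketch</CODE>; those names, the use of <CODE>std::malloc</CODE> in place of <CODE>getMemory</CODE>, and the choice of <CODE>std::unordered_set</CODE> as the collection are all assumptions made for illustration.</P>

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <unordered_set>

namespace sketch {

// The collection of addresses handed out and not yet released.
std::unordered_set<const void*>& allocatedAddresses() {
  static std::unordered_set<const void*> addresses;
  return addresses;
}

// Plays the role of operator new in the outline above.
void* trackedAllocate(std::size_t size) {
  void* p = std::malloc(size == 0 ? 1 : size);
  if (!p) throw std::bad_alloc();
  try {
    allocatedAddresses().insert(p);  // add p to the collection
  } catch (...) {
    std::free(p);                    // do not leak if bookkeeping fails
    throw;
  }
  return p;
}

// Plays the role of operator delete in the outline above.
void trackedRelease(void* ptr) noexcept {
  if (!ptr) return;
  allocatedAddresses().erase(ptr);   // remove ptr from the collection
  std::free(ptr);                    // return memory to the free store
}

// True only for addresses currently recorded in the collection.
bool isSafeToDelete(const void* address) {
  return allocatedAddresses().count(address) != 0;
}

}  // namespace sketch
```

<P>An address returned by <CODE>trackedAllocate</CODE> is reported as safe until it is released; the address of a local variable, or an address in the middle of an allocated block, is not.</P>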
<A NAME="22911"></A>

<P><A NAME="dingp45"></A>In practice, three things are likely to dampen our enthusiasm for this design. The first is our extreme reluctance to define anything at global scope, especially functions with predefined meanings like <CODE>operator</CODE> <CODE>new</CODE> and <CODE>operator</CODE> <CODE>delete</CODE>. Knowing as we do that there is but one global scope and but a single version of <CODE>operator</CODE> <CODE>new</CODE> and <CODE>operator</CODE> <CODE>delete</CODE> with the "normal" signatures (i.e., sets of parameter types) within that scope (see <A HREF="../EC/EI9_FR.HTM#1961" TARGET="_top">Item E9</A>), the last thing we want to do is seize those function signatures for ourselves. Doing so would render our software incompatible with any other software that also implements global versions of <CODE>operator</CODE> <CODE>new</CODE> and <CODE>operator</CODE> <CODE>delete</CODE> (such as many object-oriented database <NOBR>systems).<SCRIPT>create_link(45);</SCRIPT>
</NOBR></P>
<A NAME="23387"></A>

<P><A NAME="dingp46"></A>Our second consideration is one of efficiency: why burden all heap allocations with the bookkeeping overhead necessary to keep track of returned addresses if we don't need <NOBR>to?<SCRIPT>create_link(46);</SCRIPT>
</NOBR></P><A NAME="59885"></A>

<P><A NAME="dingp47"></A>Our final concern is pedestrian, but important. It turns out to be essentially impossible to implement <CODE>isSafeToDelete</CODE> so that it always works. The difficulty has to do with the fact that objects with multiple <A NAME="p154"></A>or virtual base classes have multiple addresses, so there's no guarantee that the address passed to <CODE>isSafeToDelete</CODE> is the same as the one returned from <CODE>operator</CODE> <CODE>new</CODE>, even if the object in question was allocated on the heap. For details, see Items <a href="./MI24_FR.HTM#41284" TARGET="_top">24</A> and <a href="./MI31_FR.HTM#34883" TARGET="_top">31</A>.<SCRIPT>create_link(47);</SCRIPT>
</P>
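<P>The multiple-address problem is easy to see in a small hierarchy. In the sketch below, <CODE>Left</CODE>, <CODE>Right</CODE>, <CODE>Both</CODE>, and <CODE>inspectAddresses</CODE> are hypothetical names chosen for illustration. It compares the addresses seen through each base class pointer and also shows that, for polymorphic types, <CODE>dynamic_cast</CODE> to <CODE>const void*</CODE> yields the address of the most derived object.</P>

```cpp
// Two non-empty polymorphic bases, so their subobjects cannot share storage.
struct Left {
  int l = 0;
  virtual ~Left() = default;
};

struct Right {
  int r = 0;
  virtual ~Right() = default;
};

struct Both : Left, Right {};

struct AddressReport {
  bool baseAddressesDiffer;       // Left* and Right* hold different addresses
  bool dynamicCastRecoversStart;  // both map back to the start of the object
};

AddressReport inspectAddresses() {
  Both* whole = new Both;
  Left* asLeft = whole;    // implicit conversion to first base
  Right* asRight = whole;  // implicit conversion adjusts the pointer
  const void* start = whole;

  AddressReport report;
  report.baseAddressesDiffer =
      static_cast<const void*>(asLeft) != static_cast<const void*>(asRight);
  report.dynamicCastRecoversStart =
      dynamic_cast<const void*>(asLeft) == start &&
      dynamic_cast<const void*>(asRight) == start;

  delete whole;
  return report;
}
```

<P>Because at least one base pointer holds an address other than the one <CODE>operator</CODE> <CODE>new</CODE> returned, a lookup keyed on the raw pointer value can miss a legitimately heap-allocated object.</P>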
<A NAME="22912"></A>

<P><A NAME="dingp48"></A>What we'd like is the functionality provided by these functions without the concomitant pollution of the global namespace, the mandatory overhead, and the correctness problems. Fortunately, C++ gives us exactly what we need in the form of an abstract mixin base <NOBR>class.<SCRIPT>create_link(48);</SCRIPT>
</NOBR></P>
<A NAME="22387"></A>

<P><A NAME="dingp49"></A>An abstract base class is a base class that can't be instantiated, i.e., one with at least one pure virtual function. A mixin ("mix in") class is one that provides a single well-defined capability and is designed to be compatible with any other capabilities an inheriting class might provide (see <A HREF="../EC/EI7_FR.HTM#1894" TARGET="_top">Item E7</A>). Such classes are nearly always abstract. We can therefore come up with an abstract mixin base class that offers derived classes the ability to determine whether a pointer was allocated from <CODE>operator</CODE> <CODE>new</CODE>. Here's such a <NOBR>class:<SCRIPT>create_link(49);</SCRIPT>
</NOBR></P>
<A NAME="60014"></A>
<UL><PRE>
class HeapTracked {                  // mixin class; keeps track of
public:                              // ptrs returned from op. new
<A NAME="60016"></A>
  class MissingAddress{};            // exception class; see below
<A NAME="60017"></A>
  virtual ~HeapTracked() = 0;
