📄 mc5.htm

📁 一个非常适合初学者入门的有关c++的文档
💻 HTM
📖 第 1 页 / 共 5 页
字号:
上一页 1 2 3 45
UPNumber::UPNumber(){  if (!onTheHeap) {    throw HeapConstraintViolation();  }  proceed with normal construction here;  onTheHeap = false;                    // clear flag for next obj.}There's nothing deep going on here. The idea is to take advantage of the fact that when an object is allocated on the heap, operator new is called to allocate the raw memory, then a constructor is called to initialize an object in that memory. In particular, operator new sets onTheHeap to true, and each constructor checks onTheHeap to see if the raw memory of the object being constructed was allocated by operator new. If not, an exception of type HeapConstraintViolation is thrown. Otherwise, construction proceeds as usual, and when construction is finished, onTheHeap is set to false, thus resetting the default value for the next object to be constructed.This is a nice enough idea, but it won't work. Consider this potential client code: UPNumber *numberArray = new UPNumber[100];The first problem is that the memory for the array is allocated by operator new[], not operator new, but (provided your compilers support it) you can write the former function as easily as the latter. What is more troublesome is the fact that numberArray has 100 elements, so there will be 100 constructor calls. But there is only one call to allocate memory, so onTheHeap will be set to true for only the first of those 100 constructors. When the second constructor is called, an exception is thrown, and woe is you.Even without arrays, this bit-setting business may fail. Consider this statement: UPNumber *pn = new UPNumber(*new UPNumber);Here we create two UPNumbers on the heap and make pn point to one of them; it's initialized with the value of the second one. This code has a resource leak, but let us ignore that in favor of an examination of what happens during execution of this expression: new UPNumber(*new UPNumber)This contains two calls to the new operator, hence two calls to operator new and two calls to UPNumber constructors (see Item 8). Programmers typically expect these function calls to be executed in this order, Call operator new for first object Call constructor for first object Call operator new for second object Call constructor for second objectbut the language makes no guarantee that this is how it will be done. Some compilers generate the function calls in this order instead: Call operator new for first object Call operator new for second object Call constructor for first object Call constructor for second objectThere is nothing wrong with compilers that generate this kind of code, but the set-a-bit-in-operator-new trick fails with such compilers. That's because the bit set in steps 1 and 2 is cleared in step 3, thus making the object constructed in step 4 think it's not on the heap, even though it is.These difficulties don't invalidate the basic idea of having each constructor check to see if *this is on the heap. Rather, they indicate that checking a bit set inside operator new (or operator new[]) is not a reliable way to determine this information. What we need is a better way to figure it out.If you're desperate enough, you might be tempted to descend into the realm of the unportable. For example, you might decide to take advantage of the fact that on many systems, a program's address space is organized as a linear sequence of addresses, with the program's stack growing down from the top of the address space and the heap rising up from the bottom:On systems that organize a program's memory in this way (many do, but many do not), you might think you could use the following function to determine whether a particular address is on the heap: // incorrect attempt to determine whether an address// is on the heapbool onHeap(const void *address){  char onTheStack;                   // local stack variable  return address < &onTheStack;}The thinking behind this function is interesting. Inside onHeap, onTheStack is a local variable. As such, it is, well, it's on the stack. When onHeap is called, its stack frame (i.e., its activation record) will be placed at the top of the program's stack, and because the stack grows down (toward lower addresses) in this architecture, the address of onTheStack must be less than the address of any other stack-based variable or object. If the parameter address is less than the location of onTheStack, it can't be on the stack, so it must be on the heap.Such logic is fine, as far as it goes, but it doesn't go far enough. The fundamental problem is that there are three places where objects may be allocated, not two. Yes, the stack and the heap hold objects, but let us not forget about static objects. Static objects are those that are initialized only once during a program run. Static objects comprise not only those objects explicitly declared static, but also objects at global and namespace scope (see Item E47). Such objects have to go somewhere, and that somewhere is neither the stack nor the heap.Where they go is system-dependent, but on many of the systems that have the stack and heap grow toward one another, they go below the heap. The earlier picture of memory organization, while telling the truth and nothing but the truth for many systems, failed to tell the whole truth for those systems. With static objects added to the picture, it looks like this:Suddenly it becomes clear why onHeap won't work, not even on systems where it's purported to: it fails to distinguish between heap objects and static objects: void allocateSomeObjects(){  char *pc = new char;               // heap object: onHeap(pc)                                     // will return true  char c;                            // stack object: onHeap(&c)                                     // will return false  static char sc;                    // static object: onHeap(&sc)                                     // will return true  ...}Now, you may be desperate for a way to tell heap objects from stack objects, and in your desperation you may be willing to strike a deal with the portability Devil, but are you so desperate that you'll strike a deal that fails to guarantee you the right answers? Surely not, so I know you'll reject this seductive but unreliable compare-the-addresses trick.The sad fact is there's not only no portable way to determine whether an object is on the heap, there isn't even a semi-portable way that works most of the time. If you absolutely, positively have to tell whether an address is on the heap, you're going to have to turn to unportable, implementation-dependent system calls, and that's that. As such, you're better off trying to redesign your software so you don't need to determine whether an object is on the heap in the first place.If you find yourself obsessing over whether an object is on the heap, the likely cause is that you want to know if it's safe to invoke delete on it. Often such deletion will take the form of the infamous "delete this." Knowing whether it's safe to delete a pointer, however, is not the same as simply knowing whether that pointer points to something on the heap, because not all pointers to things on the heap can be safely deleted. Consider again an Asset object that contains a UPNumber object: class Asset {private:  UPNumber value;  ...};Asset *pa = new Asset;Clearly *pa (including its member value) is on the heap. Equally clearly, it's not safe to invoke delete on a pointer to pa->value, because no such pointer was ever returned from new.As luck would have it, it's easier to determine whether it's safe to delete a pointer than to determine whether a pointer points to something on the heap, because all we need to answer the former question is a collection of addresses that have been returned by operator new. Since we can write operator new ourselves (see Items E8-E10), it's easy to construct such a collection. Here's how we might approach the problem: void *operator new(size_t size){  void *p = getMemory(size);         // call some function to                                     // allocate memory and                                     // handle out-of-memory                                     // conditions  add p to the collection of allocated addresses;  return p;}void operator delete(void *ptr){  releaseMemory(ptr);                // return memory to                                     // free store  remove ptr from the collection of allocated addresses;}bool isSafeToDelete(const void *address){  return whether address is in collection of  allocated addresses;}This is about as simple as it gets. operator new adds entries to a collection of allocated addresses, operator delete removes entries, and isSafeToDelete does a lookup in the collection to see if a particular address is there. If the operator new and operator delete functions are at global scope, this should work for all types, even the built-ins.In practice, three things are likely to dampen our enthusiasm for this design. The first is our extreme reluctance to define anything at global scope, especially functions with predefined meanings like operator new and operator delete. Knowing as we do that there is but one global scope and but a single version of operator new and operator delete with the "normal" signatures (i.e., sets of parameter types) within that scope (see Item E9), the last thing we want to do is seize those function signatures for ourselves. Doing so would render our software incompatible with any other software that also implements global versions of operator new and operator delete (such as many object-oriented database systems).Our second consideration is one of efficiency: why burden all heap allocations with the bookkeeping overhead necessary to keep track of returned addresses if we don't need to?Our final concern is pedestrian, but important. It turns out to be essentially impossible to implement isSafeToDelete so that it always works. The difficulty has to do with the fact that objects with multiple or virtual base classes have multiple addresses, so there's no guarantee that the address passed to isSafeToDelete is the same as the one returned from operator new, even if the object in question was allocated on the heap. For details, see Items 24 and 31.What we'd like is the functionality provided by these functions without the concomitant pollution of the global namespace, the mandatory overhead, and the correctness problems. Fortunately, C++ gives us exactly what we need in the form of an abstract mixin base class.An abstract base class is a base class that can't be instantiated, i.e., one with at least one pure virtual function. A mixin ("mix in") class is one that provides a single well-defined capability and is designed to be compatible with any other capabilities an inheriting class might provide (see Item E7). Such classes are nearly always abstract. We can therefore come up with an abstract mixin base class that offers derived classes the ability to determine whether a pointer was allocated from operator new. Here's such a class: class HeapTracked {                  // mixin class; keeps track ofpublic:                              // ptrs returned from op. new  class MissingAddress{};            // exception class; see below  virtual ~HeapTracked() = 0;  static void *operator new(size_t size);  static void operator delete(void *ptr);  bool isOnHeap() const;private:  typedef const void* RawAddress;  static list<RawAddress> addresses;};This class uses the list data structure that's part of the standard C++ library (see Item E49 and Item 35) to keep track of all pointers returned from operator new. That function allocates memory and adds entries to the list; operator delete deallocates memory and removes entries from the list; and isOnHeap returns whether an object's address is in the list.Implementation of the HeapTracked class is simple, because the global operator new and operator delete functions are called to perform the real memory allocation and deallocation, and the list class has functions to make insertion, removal, and lookup single-statement operations. Here's the full implementation of HeapTracked: // mandatory definition of static class memberlist<RawAddress> HeapTracked::addresses;// HeapTracked's destructor is pure virtual to make the// class abstract (see Item E14). The destructor must still// be defined, however, so we provide this empty definition.HeapTracked::~HeapTracked() {}void * HeapTracked::operator new(size_t size){  void *memPtr = ::operator new(size);  // get the memory  addresses.push_front(memPtr);         // put its address at                                        // the front of the list  return memPtr;}void HeapTracked::operator delete(void *ptr){  // get an "iterator" that identifies the list  // entry containing ptr; see Item 35 for details  list<RawAddress>::iterator it =    find(addresses.begin(), addresses.end(), ptr);  if (it != addresses.end()) {       // if an entry was found    addresses.erase(it);             // remove the entry    ::operator delete(ptr);          // deallocate the memory  } else {                           // otherwise    throw MissingAddress();          // ptr wasn't allocated by  }                                  // op. new, so throw an}                                    // exceptionbool HeapTracked::isOnHeap() const{  // get a pointer to the beginning of the memory  // occupied by *this; see below for details  const void *rawAddress = dynamic_cast<const void*>(this);  // look up the pointer in the list of addresses  // returned by operator new  list<RawAddress>::iterator it =    find(addresses.begin(), addresses.end(), rawAddress);  return it != addresses.end();      // return whether it was}                                    // foundThis code is straightforward, though it may not look that way if you are unfamiliar with the list class and the other components of the Standard Template Library. Item 35 explains everything, but the comments in the code above should be sufficient to explain what's happening in this example.The only other thing that may confound you is this statement (in isOnHeap): const void *rawAddress = dynamic_cast<const void*>(this);I mentioned earlier that writing the global function isSafeToDelete is complicated by the fact that objects with multiple or virtual base classes have several addresses. That problem plagues us in isOnHeap, too, but because isOnHeap applies only to HeapTracked objects, we can exploit a special feature of the dynamic_cast operator (see Item 2) to eliminate the problem. Simply put, dynamic_casting a pointer to void* (or const void* or volatile void* or, for those who can't get enough modifiers in their usual diet, const volatile void*) yields a pointer to the beginning of the memory for the object pointed to by the pointer. But dynamic_cast is applicable only to pointers to objects that have at least one virtual function. Our ill-fated isSafeToDelete function had to work with any type of pointer, so dynamic_cast wouldn't help it. isOnHeap is more selective (it tests only pointers to HeapTracked objects), so dynamic_casting this to const void* gives us a pointer to the beginning of the memory for the current object. That's the pointer that HeapTracked::operator new must have returned if the memory for the current object was allocated by HeapTracked::operator new in the first place. Provided your compilers support the dynamic_cast operator, this technique is completely portable.Given this class, even BASIC programmers could add to a class the ability to track pointers to heap allocations. All they'd need to do is have the class inherit from HeapTracked. If, for example, we want to be able to determine whether a pointer to an Asset object points to a heap-based object, we'd modify Asset's class definition to specify HeapTracked as a base class: class Asset: public HeapTracked {private:  UPNumber value;  ...};We could then query Asset* pointers as follows: void inventoryAsset(const Asset *ap){  if (ap->isOnHeap()) {    ap is a heap-based asset  inventory it as such;  }  else {    ap is a non-heap-based asset  record it that way;  }}A disadvantage of a mixin class like HeapTracked is that it can't be used with the built-in types, because types like int and char can't inherit from anything. Still, the most common reason for wanting to use a class like HeapTracked is to determine whether it's okay to "delete this," and you'll never want to do that with a built-in type because such types have no this pointer.Prohibiting Heap-Based ObjectsThus ends our examination of determining whether an object is on the heap. At the opposite end of the spectrum is preventing objects from being allocated on the heap. Here the outlook is a bit brighter. There are, as usual, three cases: objects that are dire
上一页 1 2 3 45
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -