<html><head><title>Writing Bug-Free C Code: A New Heap Manager</title></head><body><center><font size="+3">Chapter 5: A New Heap Manager</font><br><a href="index.html">Writing Bug-Free C Code</a><br></center><br><center><table><tr><td valign=top><small><a href="#traditional">5.1 The Traditional Heap Manager</a><br><a href="#newmanager">5.2 A New Heap Manager Specification</a><br><a href="#stringinterface">5.3 An Interface for Strings</a><br><a href="#arrayinterface">5.4 An Interface for Arrays</a><br></small></td><td width=30>&nbsp;</td><td valign=top><small><a href="#detectleaks">5.5 Detecting Storage Leaks</a><br><a href="#windowsmemory">5.6 Windows Memory Model Issues</a><br><a href="#summary">5.7 Chapter Summary</a><br></small></td></tr></table></center><br><br>This chapter introduces a new heap manager interface that checks for common programmer mistakes in allocating and freeing memory and also fulfills the requirements of the class methodology introduced in <a href="chapter4.html">Chapter 4</a>.  All of the code introduced in this chapter can also be found in one convenient location in the <a href="appendix.html">Code Listings Appendix</a>.<br><a name="traditional"><br></a><big><b>5.1 The Traditional Heap Manager</b></big> <br><br>A good friend of mine was visiting one day when we happened to start talking about programming techniques.  He had landed a job with a well-known, large computer company.  I explained to him the replacement heap manager that I had come up with and the reasons why I felt that it was necessary.  I then questioned my friend on the programming guidelines concerning memory deallocations that were in place at his company. <br><br>As it turns out, there were none.  The standard C library routines were it.  Throughout the code there were direct calls to the memory manager to allocate and free memory.  I then asked what was done when an invalid memory pointer was inadvertently passed to free().  Nothing was done.  
It was assumed that memory pointers passed to free() were always valid.  Whenever a memory management bug occurred, it would be tracked down and fixed. <br><br>Tracking down memory management bugs as they occur violates the principle of solving the problem that caused the bug and not just the manifestation of the bug. <br><br><b>5.1.1 The Interface</b> <br><br>The standard C library routines for allocating and freeing memory are malloc() and free(). <br><br><table bgcolor="#CCCCEE" cellpadding=0 cellspacing=0><tr><td><pre><b>Standard C library memory function prototypes</b>
void *malloc ( size_t size );
void  free   ( void *memblock );
</pre></td></tr></table> <br>The malloc() function takes one argument, the size of the memory object to allocate, in bytes.  It returns a void pointer to the allocated block of memory or NULL if there is insufficient memory or if some error exists.  The memory is guaranteed to be properly aligned for any type of object. <br><br>The free() function takes one argument, a void memory pointer that was previously allocated through a malloc() call.  The free() function has no return value.  If a NULL pointer is passed to free(), the call is ignored.  If an invalid memory pointer is passed to free(), the behavior of free() is undefined. <br><br>While there are several more standard functions for manipulating memory, for the sake of discussion, we are interested only in malloc() and free(). <br><br><b>5.1.2 The Problem</b> <br><br>The single biggest problem with the standard C library interface to memory is that it assumes the programmer never makes a mistake in calling the memory management functions.  Consider the following. 
<br><br><table bgcolor="#CCCCEE" cellpadding=0 cellspacing=0><tr><td><pre><b>C program, with memory management problems</b>
#include &lt;stdlib.h&gt;

int main(void)
{
    void *pMem=malloc(100);
    free(pMem);
    free(pMem);    /* error: pMem has already been freed */
    return 0;
} /* main */
</pre></td></tr></table> <br>Using free() on a memory pointer that has already been passed to free() is a common error in many large C programs.  How free() behaves in such a case is undefined.  On some systems, the run-time library may tell you that you have made a mistake and terminate your program.  Other systems will continue, even though the validity of the application's heap is in question. <br><br><table bgcolor="#F0F0F0"><tr><td><img src="./windows.gif">This is what happens in Microsoft C8.  The sample code above runs just fine, producing no error messages.  In Microsoft C8, the documentation for free() states the following: <blockquote><i> Attempting to free an invalid pointer may affect subsequent allocation and cause errors.  An invalid pointer is one not allocated with the appropriate call. </i></blockquote> </td></tr></table> <br>C was designed to be portable and produce small, efficient code.  To accomplish this feat so successfully, many small but important decisions have been left to the particular implementor.  The designers of C wanted to give as much leeway as possible to each particular implementation.   <blockquote><table bgcolor="#E0E0E0" border=1 cellpadding=2 cellspacing=0><tr><td>   The C language provides a minimal, efficient framework.   </td></tr></table></blockquote>It is up to the programmer to build upon this framework. <br><br><table bgcolor="#F0F0F0"><tr><td><img src="./windows.gif"> <b>5.1.3 Windows Issues</b> <br><br>A problem for many Windows programmers is which memory interface to use.  The standard C library provides one interface (malloc() and free()) and the Windows API provides two interfaces based upon handles to memory objects.  
The first Windows API is a local memory interface (LocalAlloc(), LocalLock(), LocalUnlock() and LocalFree()) in which the total of all local objects must be less than 64K.  The second Windows API is a global memory interface (GlobalAlloc(), GlobalLock(), GlobalUnlock() and GlobalFree()), which allows for a large number of varied-sized objects to be allocated. <br><br>The reasons for the handle-based interface used by Windows are largely historical, and the interface can be almost totally ignored today in favor of the C library interface.  When Windows supported the real mode of the Intel processor, the handle-based alloc/lock/unlock/free model allowed the Windows kernel to move memory around to avoid fragmentation.  However, now that Windows supports only the protected mode of the Intel processor, the handle-based model is no longer needed because the processor is capable of memory management tasks, even on locked memory objects. <br><br>The only time that you need to deal with the handle-based memory interface is when you are dealing with those Windows API calls that use or return a memory object based upon handles.  SetClipboardData() and GetClipboardData() are two examples of Windows APIs that still use the handle-based interface.</td></tr></table><br><a name="newmanager"><br></a><big><b>5.2 A New Heap Manager Specification</b></big> <br><br>A clear set of goals is needed before a new heap manager specification can be designed.  The job is finished when the goals have been accomplished. <br><br><b>5.2.1 Flat and Segmented Architectures</b> <br><br><i>Flat memory model</i>.  For programmers using machine architectures that are based upon a flat (non-segmented) memory model, the choice for a heap manager interface is straightforward.  There are no choices about which pointer size to use.  Only one address size exists and it is usually 32 bits. <br><br><i>Segmented architectures</i>.  
For programmers using machine architectures that are segmented, the choice for a heap manager interface is not at all straightforward.  In fact, it is filled with a lot of choices.  The primary reason for this is usually a concern for speed. <br><br>The following discussion centers around the segmented architecture used by Intel. <br><br>An address in a segmented architecture is composed of two parts.  Part of the address is used to determine which segment is being used.  The other part of the address is used to determine the byte within the segment.  (See Figure 5-1.) <blockquote> <img src="images/fig5-1.gif"><br> Figure 5-1: A segmented architecture address. </blockquote>There are two primary ways an address can be specified.  The first is by specifying the full segment and offset (far address).  The second is by specifying only the offset (near address), where the segment is implied to be one of the segment registers contained in the CPU.  In terms of execution speed, specifying only the offset part of an address is fastest. <br><br>You may be thinking that the choice of which addressing method to use is easy.  Pick near addresses for speed.  The problem here is that a segment can address only 64 kilobytes of memory (i.e., 65,536 bytes).  If the total amount of memory that you need to allocate is less than 64K bytes, then great; use near addresses for your heap pointers.  However, if the total amount of memory that you need to allocate is more than 64K bytes, then you are forced to use multiple segments and hence far addresses. <br><br>Because of the different addressing possibilities, compilers for this architecture allow programs to be built for different memory models:  small, for code and data each less than 64K; compact, for code less than 64K and data greater than 64K; medium, for code greater than 64K and data less than 64K; large, for code and data both greater than 64K. 
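<br><br>The four memory models above amount to a two-way test on code size and data size.  The following is a minimal sketch of that decision in portable C, not something taken from any compiler: the names SEGMENT_BYTES, MemoryModel, ChooseMemoryModel() and MemoryModelName() are hypothetical, and the sketch assumes that the small model means code and data each fit within one 64K segment.

```c
#include <assert.h>

#define SEGMENT_BYTES 65536UL   /* one segment addresses 64K bytes */

typedef enum {
    MODEL_SMALL, MODEL_COMPACT, MODEL_MEDIUM, MODEL_LARGE
} MemoryModel;

/* Hypothetical helper: pick a memory model from total code and data sizes. */
MemoryModel ChooseMemoryModel(unsigned long codeBytes, unsigned long dataBytes)
{
    int codeFits = (codeBytes <= SEGMENT_BYTES);  /* near code addresses suffice */
    int dataFits = (dataBytes <= SEGMENT_BYTES);  /* near data addresses suffice */

    if (codeFits && dataFits) return MODEL_SMALL;
    if (codeFits)             return MODEL_COMPACT;  /* far data only */
    if (dataFits)             return MODEL_MEDIUM;   /* far code only */
    return MODEL_LARGE;                              /* far code and far data */
}

/* Hypothetical helper: printable name for a memory model. */
const char *MemoryModelName(MemoryModel model)
{
    switch (model) {
    case MODEL_SMALL:   return "small";
    case MODEL_COMPACT: return "compact";
    case MODEL_MEDIUM:  return "medium";
    case MODEL_LARGE:   return "large";
    }
    assert(!"unknown memory model");  /* not reached for valid enum values */
    return "unknown";
}
```

A program with 20,000 bytes of code and 200,000 bytes of data, for example, falls into the compact model under these assumptions: its code can use near addresses but its data cannot.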
<br><br>This situation is complicated by the fact that these compilers also allow pointers to be tagged on a case-by-case basis as either near or far.  This is called mixed-model programming. <br><br>All segmented and non-segmented issues can be isolated through the use of a set of macros that act as an interface to memory.  A heap manager interface can be tailored to the specific needs of your environment and program.  There is no need to come up with a super memory management routine that allows megabytes of memory to be allocated if you are writing small utility programs requiring only a few kilobytes of memory. <br><br>I feel that segmented architectures have gotten a bum rap.  It is true that a 64K segment is a bit limiting, but this is only one implementation.  It can be implemented in a better way.  With 64-bit architectures coming down the road, there are a lot of possibilities!  The biggest advantage of a segmented architecture is that every segment is protected from every other because all memory references are checked for validity in two very special ways.  First, the segment value must be a valid segment.  Second, the offset must be within the limits of the segment.  If either check fails, the memory access generates a protection violation.  Memory overwrites on a heap object are detected by the hardware, even an overwrite of 1 byte.  This feature is great for debugging. <br><br><b>5.2.2 Requirements</b> <br><br>The requirements of the new heap manager interface are as follows. <br><br><i>Invalid memory pointers are detected and reported</i>.  First and foremost on the list is to plug the gap left by the standard C malloc() and free() functions.  Invalid memory pointers passed to the memory deallocation routine are automatically detected and reported. <br><br><i>Memory allocation/deallocation performance must not be adversely affected</i>.  
The last thing we want to do is come up with a set of requirements that are expensive to implement in terms of execution time.  All objects in our class methodology are dynamically allocated and freed and we want to be careful not to adversely impact the performance of the object system. <br><br><i>Support for run-time type checking</i>.  Every object in the heap must be preceded by a header containing type information.  The <a href="chapter4.html">object system (Chapter 4)</a> requires a specific format for the two data items immediately preceding the object. <br><br><i>Memory is zero initialized</i>.  Memory is zero initialized both when it is allocated and when it is freed.  On allocation, this is done for convenience: knowing that all items of an object are initialized to all-zero bits after a NEWOBJ() is a nice feature to have.  On deallocation, it guards against any incorrect references to the object after the object has been deallocated.  Such references are highly unlikely, however, because the FREE() macro sets the object pointer to NULL. <br><br><i>Memory allocations do not fail</i>.  This requirement makes sense only in a virtual memory environment.  If a call is made to the heap management memory allocator, the call returns successfully.  In the case of an error, it reports the error and does not return at all.  This prevents having to place failure checks everywhere in the code.  In the majority of applications, the maximum amount of memory that can be allocated by the application is well within the virtual memory limits for an individual program.  In these cases, this requirement makes a lot of sense. <br><br>For the minority of applications whose memory allocations may not fit into the virtual address space limits, the memory allocator should return a failure status (NULL).  Error checks must then be made throughout the code. <br><br>In the spirit of the design of C, it is up to you to decide how to implement this requirement. 
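<br><br>As an illustration of the zero-initialization and do-not-fail requirements, here is a minimal sketch in standard C.  It is not the heap manager interface developed in this chapter: the names AllocOrDie(), ALLOC_OR_DIE() and IsAllZero() are hypothetical, and the sketch simply assumes that calloc() is an acceptable underlying allocator and that abort() is an acceptable way to stop after reporting a fatal error.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not the book's interface: allocate zero-initialized
   memory and never return NULL.  On failure, report where the request
   came from and terminate the program. */
void *AllocOrDie(size_t size, const char *file, int line)
{
    void *pMem;

    assert(size > 0);          /* a zero-byte request is treated as a caller bug */
    pMem = calloc(1, size);    /* calloc() returns zero-filled memory */
    if (pMem == NULL) {
        fprintf(stderr, "Out of memory: %lu bytes requested at %s:%d\n",
                (unsigned long)size, file, line);
        abort();               /* does not return to the caller */
    }
    return pMem;
}

/* Hypothetical macro that records the call site automatically. */
#define ALLOC_OR_DIE(size) AllocOrDie((size), __FILE__, __LINE__)

/* Returns 1 if every byte in the block is zero, 0 otherwise. */
int IsAllZero(const void *pMem, size_t size)
{
    const unsigned char *p = (const unsigned char *)pMem;
    size_t i;

    for (i = 0; i < size; ++i) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

With a wrapper along these lines, a caller can write char *pBuf = ALLOC_OR_DIE(64); and use pBuf immediately without a NULL check.  An application that could exhaust its address space would instead have the wrapper return NULL and check the result at every call site.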
<br><br><i>Memory overwrites past the end of the object are detected and reported</i>.  The heap manager provides memory space for any type of object.  This includes true objects in the object system as well as strings and whatever else is needed.  Writing past the end of a dynamically allocated string should be detected and reported.  Accidentally writing past the end of a class object is highly unlikely. <br><br><i>Storage leak detection</i>.  As a program executes, it allocates and frees memory.  At program termination, if there is any allocated memory remaining in the heap, that memory is considered to be a storage leak.  Any storage leaks should be reported. <br><br><i>Filename and line number tags</i>.  All heap objects should be tagged with the filename and line number at which the object was created.  This information is useful for producing heap dumps and is required for producing useful information on storage leaks. <br><br><i>Symbolic heap dumps</i>.  The heap manager must know how many objects there are in the heap and must be able to walk the heap, producing useful information about all objects in the heap.  The heap dumps are primarily for tracking down the cause of storage leaks. <br><br><i>Alignment preservation</i>.  Any special data alignment requirements of the CPU must be met.  Most RISC architectures require that data items be aligned on 2, 4, 8 or even 16-byte boundaries.  Even on architectures with no absolute alignment requirements, it is usually more efficient to have aligned data items because the hardware expends extra CPU cycles to align unaligned data. <br><br><i>Be as portable as possible</i>.  The ideal interface would be easy to port to any system.  While this is possible, the execution time on the varied platforms would probably be considerable.  For this reason, two layers should be used.  The first is a set of macros that present to the programmer a logical, high-level view of memory.  
An example of a macro like this is the NEWOBJ() macro.  The second layer is the heap manager function call specification, which is used by the macros. <br><br>The only way to allocate and deallocate memory is through this set of macros.  The macros, in turn, call the actual heap manager functions.  If you port to another platform, simply tailor the heap manager to the particular environment, change the macros that call the new heap manager and recompile. <br><br>In practice, I have found that this technique works well. <br><br><i>Freeing the NULL pointer is OK</i>.  Calling the memory deallocator with the NULL pointer is allowed and the call is simply ignored.  In practice, I have found this feature to be useful because it prevents having to bracket every memory deallocation in an if statement, checking to see if the pointer is non-NULL. <br><br><b>5.2.3 The Interface</b> <br><br>The heap manager interface that I recommend is based upon the 32-bit model.  Whether the architecture of the machine is segmented or not is not an issue (except for its performance impact).  Users of the heap interface do not know and do not need to know the true nature
