 an object. <br><br>At the core of the VERIFY() macro is its usage of <a href="chapter3.html#winassert">WinAssert() &sect;3.3</a>. This allows VERIFY() to be terminated with either a semicolon or a block of code. <br><br>Again, notice that the only piece of information needed is the object's variable name.  No other information needs to be provided.  The VERIFY() macro implements the syntax that is desired but leaves the implementation to another macro called _VERIFY(). <br><br>The VERIFYZ() macro is a slight variation on the VERIFY() macro.  If a NULL pointer is passed to VERIFYZ(), the optional body of code is not executed, nor is this treated as an error.  VERIFYZ() is useful in allowing NULL pointers to be passed to an object's destroy method. <br><br>Given a handle to an object, which is just a pointer to the object, you should be able to obtain information about the object maintained by the heap manager.  As we will see in <a href="chapter5.html">Chapter 5</a>, the heap manager just provides a wrapper around the object.  This means that the heap manager's information about the object can be accessed by using negative offsets from the object pointer.  For speed, these offsets are known by both the heap manager code and the run-time object verification code.  (See Figure 4-1). <blockquote> <img src="images/fig4-1.gif"><br> Figure 4-1: Memory layout of a heap object. </blockquote>The data item immediately before a valid heap object is a long pointer to the class descriptor of the object or NULL, which indicates that no class descriptor exists.  The data item before the class descriptor pointer is a pointer to the heap object, which is used for heap pointer validation. <br><br>Using hRand as an example, the steps needed to verify that the address contained in hRand does indeed point to a valid random number object are as follows. <br><br>1.	Is hRand a valid pointer into the heap?  
This step is the most difficult because it depends on the machine architecture on which the program runs.  More on this later, but for now we will use FmIsPtrOk(hRand). <br><br>2.	Does the address in hRand match the address at hRand minus two data items?  Namely, (((LPVOID)hRand)==*(LPVOID FAR*) ((LPSTR) hRand-sizeof(LPCLASSDESC)-sizeof(LPVOID))). <br><br>3.	Does the address at hRand minus one data item match &_CD(hRand)? Namely, ((&_CD(hRand))==*(LPCLASSDESC FAR*)((LPSTR)hRand-sizeof(LPCLASSDESC))). <br><br>One possible _VERIFY() macro implementation is as follows.<br><br><table bgcolor="#CCCCEE" cellpadding=0 cellspacing=0><tr><td><pre><b>_VERIFY() macro</b>
#define _S4 (sizeof(LPCLASSDESC))
#define _S8 (sizeof(LPCLASSDESC)+sizeof(LPVOID))
#define _VERIFY(hObj) \
    ( FmIsPtrOk(hObj) && \
    (((LPVOID)hObj)==*(LPVOID FAR*)((LPSTR)hObj-_S8)) \
    && ((&_CD(hObj))==*(LPCLASSDESC FAR*)((LPSTR)hObj-_S4)) )
</pre></td></tr></table> <br>To be efficient, the _VERIFY() implementation must be tailored to a specific development environment.  It also assumes that an effective FmIsPtrOk() can be written.  This will be discussed in <a href="chapter5.html">Chapter 5</a>.  I have found over the years that the source code has stayed the same, but the _VERIFY() macro implementation keeps changing to suit my development environment.   <blockquote><table bgcolor="#E0E0E0" border=1 cellpadding=2 cellspacing=0><tr><td>   To be efficient, the _VERIFY() implementation must be tailored to a   specific development environment.   </td></tr></table></blockquote>My development environment was once based upon the small memory model of the Microsoft compiler.  Then it moved to the medium memory model, then to a based heap allocation scheme, and finally to a model in which data is kept in far data segments.  Through each of these changes, the code has stayed the same, but the _VERIFY() implementation has changed quite a bit.   
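<br><br>For readers working with a flat memory model, the three verification steps listed above can be sketched in portable C.  This is only an illustration under stated assumptions, not the implementation used in this book: the names ClassDesc, Prefix, toy_new(), toy_free(), toy_ptr_ok() and toy_verify() are hypothetical, the allocator is a thin layer over malloc(), and the step 1 check is a placeholder that only rejects NULL, since a real FmIsPtrOk() is architecture specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the book's class descriptor type. */
typedef struct { const char *name; } ClassDesc;

/* Wrapper placed in front of every object, mirroring Figure 4-1. */
typedef struct {
    void            *self; /* two items before the object: its own address */
    const ClassDesc *cd;   /* one item before the object: descriptor or NULL */
} Prefix;

/* Toy allocator (an assumption, not the Chapter 5 heap manager). */
static void *toy_new(size_t size, const ClassDesc *cd)
{
    Prefix *p = malloc(sizeof(Prefix) + size);
    if (p == NULL)
        return NULL;
    p->self = p + 1;   /* the object starts right after the prefix */
    p->cd   = cd;
    return p + 1;
}

static void toy_free(void *obj)
{
    if (obj != NULL)
        free((Prefix *)obj - 1);
}

/* Step 1 placeholder: a real FmIsPtrOk() is architecture specific. */
static int toy_ptr_ok(const void *obj)
{
    return obj != NULL;
}

/* Steps 1 to 3: pointer check, back-pointer check, descriptor check. */
static int toy_verify(const void *obj, const ClassDesc *cd)
{
    const Prefix *p;
    if (!toy_ptr_ok(obj))
        return 0;
    p = (const Prefix *)obj - 1;   /* negative offset from the handle */
    return p->self == obj && p->cd == cd;
}

/* Usage: a handle verifies only against its own class descriptor. */
static void toy_verify_demo(void)
{
    static const ClassDesc randCd  = { "RAND" };
    static const ClassDesc otherCd = { "OTHER" };
    long *hRand = toy_new(sizeof *hRand, &randCd);

    assert(hRand != NULL);
    assert(toy_verify(hRand, &randCd));
    assert(!toy_verify(hRand, &otherCd));
    toy_free(hRand);
}
```

Grouping the two prefix items in a struct keeps the negative offsets in one place.  The sketch assumes the object needs no stricter alignment than a pointer.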
<blockquote><table bgcolor="#E0E0E0" border=1 cellpadding=2 cellspacing=0><tr><td>   You must code a _VERIFY() that works in your particular environment.   </td></tr></table></blockquote>I cannot provide you with a generalized _VERIFY() implementation.  You must code a _VERIFY() that works in your particular environment.  The _VERIFY() that I use in my environment follows.<br><br><table bgcolor="#F0F0F0"><tr><td><img src="./windows.gif"> <b>4.4.6 My _VERIFY() Macro</b><br><br>My _VERIFY() macro is tailored to the segmented architecture of the Intel CPU and is highly optimized.  It assumes that a program was developed using the medium memory model and that object handles are 32-bit segment/offset pointers. <br><br><table bgcolor="#CCCCEE" cellpadding=0 cellspacing=0><tr><td><pre><b>My _VERIFY() macro</b>
#define _VERIFY(hObj) Verify_##hObj((long)hObj, (WORD)&_CD(hObj))
</pre></td></tr></table> <br>My implementation of _VERIFY() ends up calling a local (near) function whose arguments are passed using the register calling convention.  I turned the code into a function call because I was dissatisfied with the speed (too slow) and size (too big) of the code generated by the compiler for the macro form of _VERIFY().  I discovered this by using the code generation option (/Fc) of the Microsoft C8 compiler.  The function call saves code size, and because it is a near call using the register calling convention, it is also quite fast.  The CLASS() macro was changed slightly to automatically prototype the Verify_##hObj function for me. <br><br>The 32-bit object pointer is type cast into a long because 32-bit pointers cannot be passed through the register calling convention, but a long can.  The 32-bit class descriptor address is type cast to a WORD for two reasons.  First, the register calling convention does not allow two long values to be passed through registers, but it does allow a long and a WORD.  
Second, because the medium memory model is being used, the segment for all class descriptors has the same value, so it is ignored and only the lower 16 bits (the offset) are used for type checking. <br><br><table bgcolor="#CCCCEE" cellpadding=0 cellspacing=0><tr><td><pre><b>The verification code used by my _VERIFY() macro</b>
; DX:AX = far pointer to verify
; BX    = offset to object class descriptor
;
; WARNING: This code assumes the register calling convention
; used by Microsoft C8.  It may change in future compiler
; versions.
    xchg ax, bx
    xor cx, cx                   ;; assume a bad selector
    lsl cx, dx                   ;; verify selector, length
    cmp bx, cx
    mov cx, 0                    ;; assume false return
    jae done                     ;; long pointer was bad
    mov es, dx
    cmp word ptr es:[bx-8], bx   ;; test offset
    jne done
    cmp word ptr es:[bx-6], dx   ;; test segment
    jne done
    cmp word ptr es:[bx-4], ax   ;; test class desc offset
    jne done
    inc cx                       ;; true return
done:
    mov ax, cx
    ret
</pre></td></tr></table> <br>This verification code is really part of a macro that is used by an assembly file that creates the properly named code segment and verification code so that it can be called as a near function.  This assembly file is part of my project makefile.  It uses an inlining file feature of the makefile to accomplish this. <br><br>The execution overhead of the verify code that I use is low because it has been handwritten in assembly.  A fair estimate is that one verify takes 66 clock cycles.  Assuming that you run the code on an Intel 66-MHz 80486, you can perform one million object verifications per second.  A 1 percent processor overhead therefore corresponds to 10,000 object verifications per second.  
The application that I wrote usually performs fewer than 10,000 object verifications per second (as measured by changing the _VERIFY() macro to increment a counter), so I know that the overhead is less than 1 percent of the processor. <br><br>This implementation of _VERIFY() takes full advantage of the features of my own environment to meet my demanding speed and space requirements.</td></tr></table><br><b>4.4.7 Summary</b> <br><br>The CLASS() and VERIFY() macros work together to provide what is needed to run-time type check object handles.  The stringizing operator and the token pasting operator are key features of C that make these macros so easy to use.<br><a name="managememory"><br></a><big><b>4.5 Managing Memory</b></big> <br><br>What should be the interface for allocating and freeing objects?  The interface should probably be implemented through a set of macros so that the implementation can change without any source code having to change. <br><br><b>4.5.1 NEWOBJ() and FREE() Interfaces</b> <br><br>A model for allocating an object is NEWOBJ(hObj).  The NEWOBJ() macro implementation should do all the dirty work of allocating the memory from the heap manager, passing the appropriate type information and assigning the memory pointer to hObj. <br><br>A model for freeing an object is FREE(hObj).  It should call the heap manager to free the memory associated with hObj and then set hObj to NULL.  This allows us to find any bugs that involve using the handle after calling FREE(), because dereferencing a far NULL pointer causes a CPU fault in protected-mode architectures.  See <a href="chapter7.html#pointers">&sect;7.12</a> for more information on using NULL. <br><br>However, before we can write these macros, the interface to the new heap manager must be specified. 
<br><br><b>4.5.2 Heap Manager Interface Specification</b> <br><br>Because the heap manager is at the core of the object management system, it should have as much error checking information available in it as possible.  One piece of information we already know it must have is the address of a class descriptor.  Since this allows us to write a heap manager that provides great symbolic dumps of the heap, why not add some more information that would be meaningful in the heap dump? <br><br>Why not include the filename and line number where the object was allocated?  This information is useful for non-object heap objects like strings.  It is less useful for objects because objects are created in only one method function. <br><br>Another concern is which memory model to use for heap objects.  For specialized applications, this is of major concern since the memory model affects the performance of the application.  However, for the object model, an interface that uses 32-bit pointers is assumed.  The 32-bit address may be a segment and offset for segmented architectures, or it may be a linear virtual address in flat-model architectures.  Which architecture is used does not matter. <br><br><table bgcolor="#CCCCEE" cellpadding=0 cellspacing=0><tr><td><pre><b>The heap manager interface</b>
EXTERNC LPVOID APIENTRY FmNew  ( SIZET, LPCLASSDESC, LPSTR, int );
EXTERNC LPVOID APIENTRY FmFree ( LPVOID );
</pre></td></tr></table> <br>FmNew() (far memory new) takes four arguments.  The first argument, of type SIZET, is the number of bytes to allocate for the object.  Under most C environments, SIZET will be defined to be size_t.  The second argument is a pointer to a class descriptor or NULL if no class descriptor exists.  The third and fourth arguments specify the filename and line number where the FmNew() call took place.  The return value is a long void pointer to the allocated memory. <br><br>FmFree() (far memory free) takes one argument.  
The argument is a memory object that was previously allocated through FmNew(), or NULL.  The return value is a long void pointer that is always NULL. <br><br>The heap manager is discussed in further detail in <a href="chapter5.html">Chapter 5</a>.  For now, this gives us enough information to implement the NEWOBJ() and FREE() macros. <br><br><b>4.5.3 NEWOBJ() and FREE() Implementations</b> <br><br>Now that the heap manager interface has been specified, the NEWOBJ() and FREE() macros can be designed. <br><br><table bgcolor="#CCCCEE" cellpadding=0 cellspacing=0><tr><td><pre><b>NEWOBJ() and FREE() implementation, first cut</b>
#define NEWOBJ(hObj) \
  (hObj = FmNew(sizeof(*hObj),&_CD(hObj),__FILE__,__LINE__))
#define FREE(hObj) (hObj = FmFree(hObj))
</pre></td></tr></table> <br>Notice that hObj is the only piece of information needed by NEWOBJ() and FREE().  The size, in bytes, of the object pointed to by hObj is sizeof(*hObj).  The address of the class descriptor for hObj is &_CD(hObj).  Finally, the filename and line number of the memory allocation are simply __FILE__ and __LINE__.  This implementation does in fact work quite well except for two minor problems. <br><br>The first problem is with __FILE__.  Every time it is used, it introduces a new string into the program.  However, a solution exists.  Use the filename variable that is used by the <a href="chapter3.html#winassert">WinAssert() &sect;3.3</a> code.  The variable is named szSRCFILE.  You just have to make sure that USEWINASSERT is placed at the top of the source file.   <blockquote><table bgcolor="#E0E0E0" border=1 cellpadding=2 cellspacing=0><tr><td>   Every source file should have a USEWINASSERT at the top of the   source file.   </td></tr></table></blockquote>The second problem is with the differences between C and C++.  The first-cut implementation works just fine in C but not in C++.  The problem involves void pointers.  In C, a void pointer may be legally assigned to a typed pointer.  
In C++, this is illegal without the appropriate type cast.  We do not want to have to pass in the data type of the object, since this would ruin the slick implementation of NEWOBJ() and FREE().
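<br><br>To make the first-cut macros concrete, here is a minimal sketch that compiles as C only, since it relies on the implicit void pointer conversion just described.  Several names are assumptions made for illustration: ToyFmNew() and ToyFmFree() are malloc()-based stand-ins for the real heap manager, ToyHeader is an invented bookkeeping record, and CD_OF() is a hypothetical substitute for _CD() that assumes the class descriptor variable is named by pasting _ClassDesc onto the handle name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical class descriptor type. */
typedef struct { const char *name; } ClassDesc;

/* Hypothetical stand-in for _CD(): token pasting maps a handle
   name to the name of its class descriptor variable. */
#define CD_OF(hObj) hObj##_ClassDesc

/* Invented bookkeeping record stored in front of each object. */
typedef struct {
    const ClassDesc *cd;    /* class descriptor, or NULL */
    const char      *file;  /* file that performed the allocation */
    int              line;  /* line that performed the allocation */
} ToyHeader;

/* Toy stand-in for FmNew(): same argument order as the interface. */
static void *ToyFmNew(size_t size, const ClassDesc *cd,
                      const char *file, int line)
{
    ToyHeader *hdr = malloc(sizeof(ToyHeader) + size);
    if (hdr == NULL)
        return NULL;
    hdr->cd   = cd;
    hdr->file = file;
    hdr->line = line;
    return hdr + 1;          /* the object follows the header */
}

/* Toy stand-in for FmFree(): accepts NULL and always returns NULL. */
static void *ToyFmFree(void *obj)
{
    if (obj != NULL)
        free((ToyHeader *)obj - 1);
    return NULL;
}

static const ToyHeader *ToyHeaderOf(const void *obj)
{
    return (const ToyHeader *)obj - 1;
}

/* First-cut macros: the handle name is the only argument needed. */
#define NEWOBJ(hObj) \
    ((hObj) = ToyFmNew(sizeof(*(hObj)), &CD_OF(hObj), __FILE__, __LINE__))
#define FREE(hObj) ((hObj) = ToyFmFree(hObj))

/* Usage: allocate, inspect the recorded information, then free. */
static void toy_newobj_demo(void)
{
    typedef struct { long seed; } Rand;
    ClassDesc hRand_ClassDesc = { "RAND" };
    Rand *hRand = NULL;

    NEWOBJ(hRand);
    assert(hRand != NULL);
    assert(ToyHeaderOf(hRand)->cd == &hRand_ClassDesc);
    FREE(hRand);
    assert(hRand == NULL);   /* the stale handle cannot be reused silently */
}
```

Because the free function returns NULL and FREE() assigns that result back to the handle, no caller can forget to clear the handle after freeing it.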