```cpp
  };
public:
  PersonalDetails(const char *nm);
  PersonalDetails(long id) : ID(id) {/**/}  // direct access to a member
  //...
};
```

By using a union, the size of class `PersonalDetails` is halved. Again, saving four bytes of memory is not worth the trouble unless this class serves as a mold for millions of database records or if the records are transmitted on slow communication lines. Note that unions do not incur any runtime overhead, so there is no speed tradeoff in this case. The advantage of an anonymous union over a named one is that its members can be accessed directly.

## Speed Optimizations

In time-critical applications, every CPU cycle counts. This section presents a few simple guidelines for speed optimization. Some of them have been around since the early days of C; others are C++ specific.

### Using a Class To Pack a Long Argument List

The overhead of a function call is increased when the function has a long list of arguments. The runtime system has to initialize the stack with the values of the arguments; naturally, this operation takes longer when there are more arguments. For example, executing the following function 100,000,000 times takes 8.5 seconds on average on my machine:

```cpp
void retrieve(const string& title, // 5 arguments
              const string& author,
              int ISBN,
              int year,
              bool& inStore)
{}
```

Packing the argument list into a single class and passing it by reference as the only argument reduces the result to five seconds, on average. Of course, for functions that take a long time to execute, the stack initialization overhead is negligible. However, for short and fast functions that are called very often, packing a long parameter list within a single object and passing it by reference can improve performance.
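A minimal sketch of the packed version is shown below; the class name `BookQuery` and its exact layout are assumptions made for illustration, mirroring the five parameters of `retrieve()` above:

```cpp
#include <string>
using namespace std;

// Hypothetical argument-packing class (name and members are illustrative only).
class BookQuery
{
public:
  string title;
  string author;
  int    ISBN;
  int    year;
  bool   inStore;   // filled in by retrieve() as a result
};

void retrieve(BookQuery& query)  // one reference is passed instead of five arguments
{}
```

The caller fills in a single `BookQuery` object and passes it by reference, so the stack is initialized with one argument instead of five on every call.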
### Register Variables

The storage specifier `register` can be used as a hint to the compiler that an object will be heavily used in the program. For example:

```cpp
void f()
{
  int *p = new int[3000000];
  register int *p2 = p; // store the address in a register
  for (register int j = 0; j < 3000000; j++)
  {
    *p2++ = 0;
  }
  //...use p
  delete [] p;
}
```

Loop counters are good candidates for being declared as register variables. When they are not stored in a register, a substantial amount of the loop's execution time is wasted in fetching the variable from memory, assigning a new value to it, and storing it back in memory repeatedly. Storing it in a machine register reduces this overhead. Note, however, that `register` is only a recommendation to the compiler. As with function inlining, the compiler can refuse to store the object in a machine register. Furthermore, modern compilers optimize loop counters and move them to the machine's registers anyway. The `register` storage specification is not confined to fundamental types. Rather, it can be used for any type of object. If the object is too large to fit into a register, the compiler can still store the object in a faster memory region, such as the cache memory (cache memory is about ten times faster than the main memory).

> **NOTE:** Some compilers ignore the `register` specification altogether and automatically store the program's variables according to a set of built-in optimization rules. Please consult your vendor's specifications for more details on the compiler's handling of register declarations.

Declaring function parameters with the `register` storage specifier is a recommendation to pass the arguments on the machine's registers rather than passing them on the stack. For example:

```cpp
void f(register int j, register Date d);
```

### Declaring Constant Objects as const

In addition to the other boons of declaring constant objects as `const`, an optimizing compiler can take advantage of this declaration, too, and store such an object in a machine register instead of in ordinary memory. Note that the same optimization can be applied to function parameters that are declared `const`. On the other hand, the `volatile` qualifier disables such an optimization (see Appendix A, "Manual of Programming Style"), so use it only when it is unavoidable.
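As a rough illustration of this point (the function and variable names are assumptions, not code from the book), a `const` object can be kept in a register for the duration of a loop, whereas a `volatile` object must be fetched from memory on every access:

```cpp
// Illustrative sketch: the compiler may treat `limit` as a register value or an
// immediate operand because it can never change, whereas every test of `stop`
// forces a genuine memory read because it is declared volatile.
void poll()
{
  const int limit = 1000000;
  volatile bool stop = false;
  for (int i = 0; i < limit; i++)
  {
    if (stop)          // cannot be cached in a register or optimized away
      break;
  }
}
```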
### Runtime Overhead of Virtual Functions

When a virtual function is called through a pointer or a reference of an object, the call doesn't necessarily impose additional runtime penalties. If the compiler can resolve the call statically, no extra overhead is incurred. Furthermore, a very short virtual function can be inlined in this case. In the following example, a clever compiler can resolve the calls of the virtual member functions statically:

```cpp
#include <iostream>
using namespace std;
class V
{
public:
  virtual void show() const { cout << "I'm V" << endl; }
};
class W : public V
{
public:
  void show() const { cout << "I'm W" << endl; }
};
void f(V& v, V* pV)
{
  v.show();
  pV->show();
}
void g()
{
  V v;
  f(v, &v);
}
int main()
{
  g();
  return 0;
}
```

If the entire program appears in a single translation unit, the compiler can perform an inline substitution of the call of the function `g()` in `main()`. The invocation of `f()` within `g()` can also be inlined, and because the dynamic type of the arguments that are passed to `f()` is known at compile time, the compiler can resolve the virtual function calls inside `f()` statically. There is no guarantee that every compiler actually inlines all the function calls; however, some compilers certainly take advantage of the fact that the dynamic type of the arguments of `f()` can be determined at compile time, and avoid the overhead of dynamic binding in this case.

### Function Objects Versus Function Pointers

The benefits of using function objects instead of function pointers (function objects are discussed in Chapter 10 and in Chapter 3, "Operator Overloading") are not limited to genericity and easier maintenance. Compilers can also inline the call of a function object, thereby enhancing performance even further (inlining a function pointer call is rarely possible).
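The contrast is easiest to see with a standard algorithm; a minimal sketch follows, in which the comparator `Descending` and the helper `sortBoth()` are hypothetical names introduced only for illustration:

```cpp
#include <algorithm>
#include <vector>

// Function object: its type is part of sort's instantiation, so the compiler
// can inline operator() directly into the generated sorting code.
struct Descending
{
  bool operator()(int a, int b) const { return a > b; }
};

// Equivalent free function, passed to sort as a function pointer.
bool descending(int a, int b) { return a > b; }

void sortBoth(std::vector<int>& v)
{
  std::sort(v.begin(), v.end(), Descending()); // call to operator() is inlinable
  std::sort(v.begin(), v.end(), descending);   // call through a pointer; usually not inlined
}
```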
## A Last Resort

The optimization techniques that have been presented thus far do not dictate design compromises or less readable code. In fact, some of them *improve* the software's robustness and the ease of maintenance. Packing a long argument list within a class object, `const` declarations, and using function objects rather than function pointers provide additional benefits on top of the performance boost. Under strict time and memory constraints, however, these techniques might not suffice; additional tweaks are sometimes required, which affect the portability and extensibility of the software. The techniques that are presented in this section are to be used only as a last resort, and only after all the other optimizations have been applied.

### Disabling RTTI and Exception Handling Support

When you port pure C code to a C++ compiler, you might discover a slight performance degradation. This is not a fault in the programming language or the compiler, but a matter of compiler tuning. All you have to do to gain the same (or better) performance that you might get from a C compiler is to switch off the compiler's RTTI and exception handling support. Why is this? In order to support RTTI or exception handling, a C++ compiler inserts additional "scaffolding" code into the original source file. This increases the executable size a little and imposes a slight runtime overhead (the overheads of exception handling and RTTI are discussed in Chapter 6, "Exception Handling," and Chapter 7, "Runtime Type Identification," respectively). When pure C is used, this additional code is unnecessary. Please note, however, that you should not attempt to apply this tweak to C++ code, or to C code that uses any C++ constructs such as operator `new` and virtual functions.

### Inline Assembly

Time-critical sections of C++ code can be rewritten in native assembly code. The result can be a significant increase in speed. Note, however, that this measure is not to be taken lightly because it makes future modifications much more difficult. Programmers who maintain the code might not be familiar with the particular assembly language that is used, or they might have no prior experience in assembly language at all. Furthermore, porting the software to other platforms requires rewriting of the assembly code parts (in some instances, upgrading the processor can also necessitate rewriting). In addition, developing and testing assembly code is an arduous task that can take much more time than developing and testing code that is written in a high-level language.

Generally, operations that are coded in assembly are low-level library functions. On most implementations, for example, the standard library functions `memset()` and `strcpy()` are written in native assembly code. C and C++ enable the programmer to embed inline assembly code within an `asm` block. For example:

```cpp
asm
{
  mov a, ecx
  //...
}
```

### Interacting with the Operating System Directly

API functions and classes enable you to interact with the operating system. Sometimes, however, executing a system command directly can be much faster. For this purpose, you can use the standard function `system()`, which takes a shell command as a `const char *`. For example, on a DOS/Windows system, you can display the files in the current directory as follows:

```cpp
#include <cstdlib>
using namespace std;
int main()
{
  system("dir");  // execute the "dir" command
}
```

Here again, the tradeoff is between speed on the one hand and portability and future extensibility on the other.

## Conclusions

In an ideal world, software designers and developers might focus their efforts on robust, extensible, and readable code. Fortunately, the current state of affairs in the software world is much closer to that ideal than it was 15, 30, or 50 years ago. Notwithstanding that, performance tuning and optimizations will probably remain a necessity for a long time. The faster hardware becomes, the more the software that runs on it is required to meet higher demands. Speech recognition, online translation of natural languages, neural networks, and complex mathematical computations are only a few examples of resource-hungry applications that will evolve in the future and require careful optimizations.

Textbooks often recommend that you put off optimization considerations to the final stages of testing. Indeed, the primary goal is to get the system to work correctly. Nonetheless, some of the techniques presented here -- such as declaring objects locally, preferring prefix to postfix operators, and using initialization instead of assignment -- need to become a natural habit. It is a well-known fact that programs usually spend 90% of their time executing only 10% of their code (the numbers might vary, but they range between 80%/20% and 95%/5%). The first step in optimization is, therefore, identifying that 10% of your programs and optimizing it. Many automated profiling and optimization tools can assist you in identifying these critical code parts. Some of these tools can also suggest solutions to enhance performance. Still, many of the optimization techniques are implementation-specific and always require human expertise. It is important to empirically verify your suspicions and to test the effect of suggested code modifications to ensure that they indeed improve the system's performance. Programmers' intuitions regarding the cost of certain operations are often misleading. For example, shorter code is not necessarily faster code. Similarly, writing convoluted code to avoid the cost of a simple `if` statement is not worth the trouble because it saves only one or two CPU cycles.
