<tt>  };</tt>
<tt>public:</tt>
<tt>  PersonalDetails(const char *nm); </tt>
<tt>  PersonalDetails(long id) : ID(id) {/**/}  // direct access to a member</tt>
<tt>  //...</tt>
<tt>};</tt>
</pre>
<p>By using a union, the size of class <tt>PersonalDetails</tt> is halved. Again, 
  saving four bytes of memory is not worth the trouble unless this class serves 
  as a mold for millions of database records or if the records are transmitted 
  on slow communication lines. Note that unions do not incur any runtime overhead, 
  so there is no speed tradeoff in this case. The advantage of an anonymous union 
  over a named one is that its members can be accessed directly. </p>
<h2> <a name="Heading18"> Speed Optimizations</a></h2>
<p>In time-critical applications, every CPU cycle counts. This section presents 
  a few simple guidelines for speed optimization. Some of them have been around 
  since the early days of C; others are C++ specific. </p>
<h3> <a name="Heading19"> Using a Class To Pack a Long Argument List</a></h3>
<p>The overhead of a function call is increased when the function has a long list 
  of arguments. The runtime system has to initialize the stack with the values 
  of the arguments; naturally, this operation takes longer when there are more 
  arguments. For example, executing the following function 100,000,000 times takes 
  8.5 seconds on average on my machine:</p>
<pre>
<tt>void retrieve(const string&amp; title, //5 arguments </tt>
<tt>              const string&amp; author, </tt>
<tt>              int ISBN,  </tt>
<tt>              int year, </tt>
<tt>              bool&amp;  inStore)</tt>
<tt>{} </tt>
</pre>
<p>Packing the argument list into a single class and passing it by reference as 
  the only argument reduces the result to five seconds, on average. Of course, 
  for functions that take a long time to execute, the stack initialization overhead 
  is negligible. However, for short and fast functions that are called very often, 
  packing a long parameter list within a single object and passing it by reference 
  can improve performance.</p>
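<p>The packing technique can be sketched as follows. This is a minimal illustration, not a benchmark: the struct name <tt>BookQuery</tt> and the placeholder body of <tt>retrieve()</tt> are assumptions made for the example and do not come from the original listing.</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical aggregate that bundles the five arguments of retrieve().
struct BookQuery
{
    std::string title;
    std::string author;
    int ISBN;
    int year;
    bool inStore;   // output field, filled in by retrieve()
};

// A single reference is passed instead of five separate arguments.
void retrieve(BookQuery& q)
{
    // Placeholder logic for illustration only; a real lookup would go here.
    q.inStore = !q.title.empty();
}
```

<p>The caller fills in one <tt>BookQuery</tt> object and can reuse it across calls, so only one address has to be passed on each call.</p>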
<h3> <a name="Heading20"> Register Variables</a></h3>
<p>The storage class specifier <tt>register</tt> can be used as a hint to the compiler 
  that an object will be heavily used in the program. For example:</p>
<pre>
<tt>void f()</tt>
<tt>{</tt>
<tt>  int *p = new int[3000000];</tt>
<tt>  register int *p2 = p; //store the address in a register</tt>
<tt>  for (register int j = 0; j&lt;3000000; j++)</tt>
<tt>  {</tt>
<tt>    *p2++ = 0;</tt>
<tt>  }</tt>
<tt>  //...use p</tt>
<tt>  delete [] p;</tt>
<tt>}</tt>
</pre>
<p>Loop counters are good candidates for being declared as register variables. 
  When they are not stored in a register, a substantial amount of the loop's execution 
  time is wasted in fetching the variable from memory, assigning a new value to 
  it, and storing it back in memory repeatedly. Storing it in a machine register 
  reduces this overhead. Note, however, that <tt>register</tt> is only a recommendation 
  to the compiler. As with function inlining, the compiler can refuse to store 
  the object in a machine register. Furthermore, modern compilers optimize loop 
  counters and move them to the machine's registers anyway. The <tt>register</tt> 
  storage specification is not confined to fundamental types. Rather, it can be 
  used for any type of object. If the object is too large to fit into a register, 
  the compiler is free to ignore the hint and keep the object in ordinary memory; 
  whether that memory is served from the cache is decided by the hardware, not 
  by the <tt>register</tt> keyword.</p>
<blockquote>
  <hr>
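  <strong>NOTE: </strong> Some compilers ignore the <tt>register</tt> specification 
  altogether and automatically store the program's variables according to a set 
  of built-in optimization rules. Please consult your vendor's specifications 
  for more details on the compiler's handling of register declarations. Note also 
  that <tt>register</tt> was deprecated in C++11 and removed as a storage class 
  specifier in C++17, so code that must compile as C++17 or later cannot use it. 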
  <hr>
</blockquote>
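<p>Because optimizing compilers allocate registers on their own, the loop shown earlier can be written without any <tt>register</tt> hint. The following is a minimal sketch of that loop; the helper name <tt>make_zeroed()</tt> and the use of <tt>std::vector</tt> in place of <tt>new[]</tt> are assumptions made for the example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fills a buffer with zeros using the same
// pointer-walking loop as before, but without any register hints.
std::vector<int> make_zeroed(std::size_t n)
{
    std::vector<int> buf(n);
    int *p2 = buf.data();
    for (std::size_t j = 0; j < n; ++j)
    {
        *p2++ = 0;   // the compiler decides where j and p2 are kept
    }
    return buf;
}
```

<p>This version also compiles as C++17, in which the <tt>register</tt> specifier is no longer available.</p>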
<p>Declaring function parameters with the <tt>register</tt> storage specifier 
  is a recommendation to pass the arguments in the machine's registers rather 
  than on the stack. For example:</p>
<pre>
<tt>void f(register int j, register Date d);</tt>
</pre>
<h3> <a name="Heading21"> Declaring Constant Objects as const</a></h3>
<p>In addition to the other boons of declaring constant objects as <tt>const</tt>, 
  an optimizing compiler can take advantage of this declaration, too, and store 
  such an object in a machine register instead of in ordinary memory. Note that 
  the same optimization can be applied to function parameters that are declared 
  <tt>const</tt>. On the other hand, the <tt>volatile</tt> qualifier disables 
  such an optimization (see Appendix A, "Manual of Programming Style"), so use 
  it only when it is unavoidable.</p>
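<p>The contrast between <tt>const</tt> and <tt>volatile</tt> can be sketched as follows. Both functions are hypothetical and compute the same result; what differs is the freedom that the compiler has. Whether a particular compiler actually keeps the <tt>const</tt> multiplier in a register or folds it into the generated code is an assumption that depends on the implementation and its optimization settings.</p>

```cpp
#include <cassert>
#include <cstddef>

// The multiplier is const, so an optimizer is free to fold it into the
// loop or keep it in a register for the duration of the call.
long scaled_sum(const int* data, std::size_t n)
{
    const int scale = 4;
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += static_cast<long>(data[i]) * scale;
    return sum;
}

// Same computation, but the multiplier is volatile: every iteration
// has to perform an actual read of 'scale'.
long scaled_sum_volatile(const int* data, std::size_t n)
{
    volatile int scale = 4;
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += static_cast<long>(data[i]) * scale;
    return sum;
}
```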
<h3> <a name="Heading22"> Runtime Overhead of Virtual Functions </a></h3>
<p>When a virtual function is called through a pointer or a reference to an object, 
  the call doesn't necessarily impose additional runtime penalties. If the compiler 
  can resolve the call statically, no extra overhead is incurred. Furthermore, 
  a very short virtual function can be inlined in this case. In the following 
  example, a clever compiler can resolve the calls of the virtual member functions 
  statically: </p>
<pre>
<tt>#include &lt;iostream&gt;</tt>
<tt>using namespace std;</tt>
<tt>class V </tt>
<tt>{</tt>
<tt>public:  </tt>
<tt>  virtual void show() const { cout&lt;&lt;"I'm V"&lt;&lt;endl; }</tt>
<tt>};</tt>
<tt>class W : public V </tt>
<tt>{</tt>
<tt>public:</tt>
<tt>  void show() const { cout&lt;&lt;"I'm W"&lt;&lt;endl; }</tt>
<tt>};</tt>
<tt>void f(V &amp; v, V *pV) </tt>
<tt>{</tt>
<tt>  v.show();   </tt>
<tt>  pV-&gt;show();  </tt>
<tt>}</tt>
<tt>void g()</tt>
<tt>{</tt>
<tt>  V v;</tt>
<tt>  f(v, &amp;v);</tt>
<tt>}</tt>
<tt>int main()</tt>
<tt>{</tt>
<tt>  g();</tt>
<tt>  return 0;</tt>
<tt>}</tt>
</pre>
<p>If the entire program appears in a single translation unit, the compiler can 
  perform an inline substitution of the call of the function <tt>g()</tt> in <tt>main()</tt>. 
  The invocation of <tt>f()</tt> within <tt>g()</tt> can also be inlined, and 
  because the dynamic type of the arguments that are passed to <tt>f()</tt> is 
  known at compile time, the compiler can resolve the virtual function calls inside 
  <tt>f()</tt> statically. There is no guarantee that every compiler actually 
  inlines all the function calls; however, some compilers certainly take advantage 
  of the fact that the dynamic type of the arguments of <tt>f()</tt> can be determined 
  at compile time, and avoid the overhead of dynamic binding in this case. </p>
<h3> <a name="Heading23"> Function Objects Versus Function Pointers</a></h3>
<p>The benefits of using function objects instead of function pointers (function 
  objects are discussed in Chapter 10 and in Chapter 3, "Operator Overloading") 
  are not limited to genericity and easier maintenance. Compilers can also 
  inline the call of a function object, thereby enhancing performance 
  (inlining a call through a function pointer is rarely possible). </p>
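<p>The difference can be sketched with <tt>std::sort()</tt>. The names <tt>less_fn</tt>, <tt>Less</tt>, <tt>sort_with_pointer()</tt>, and <tt>sort_with_functor()</tt> are hypothetical and serve only this example. Both versions produce the same ordering; whether the function object version is actually faster depends on the compiler and should be measured.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Plain function: passed to std::sort as a pointer.
bool less_fn(int a, int b) { return a < b; }

// Function object: the comparison is part of the object's type, so the
// call inside std::sort is a direct call that the compiler can inline.
struct Less
{
    bool operator()(int a, int b) const { return a < b; }
};

std::vector<int> sort_with_pointer(std::vector<int> v)
{
    std::sort(v.begin(), v.end(), less_fn);
    return v;
}

std::vector<int> sort_with_functor(std::vector<int> v)
{
    std::sort(v.begin(), v.end(), Less());
    return v;
}
```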
<h2> <a name="Heading24"> A Last Resort</a></h2>
<p>The optimization techniques that have been presented thus far do not dictate 
  design compromises or less readable code. In fact, some of them <i>improve</i> 
  the software's robustness and the ease of maintenance. Packing a long argument 
  list within a class object, <tt>const</tt> declarations, and using function 
  objects rather than function pointers provide additional benefits on top of 
  the performance boost. Under strict time and memory constraints, however, these 
  techniques might not suffice; additional tweaks are sometimes required, which 
  affect the portability and extensibility of the software. The techniques that 
  are presented in this section are to be used only as a last resort, and only 
  after all the other optimizations have been applied.</p>
<h3> <a name="Heading25"> Disabling RTTI and Exception Handling Support</a></h3>
<p>When you port pure C code to a C++ compiler, you might discover a slight performance 
  degradation. This is not a fault in the programming language or the compiler, 
  but a matter of compiler tuning. All you have to do to gain the same (or better) 
  performance that you might get from a C compiler is switch off the compiler's 
  RTTI and exception handling support. Why is this? In order to support RTTI or 
  exception handling, a C++ compiler adds "scaffolding" code and data to 
  the generated program. This increases the executable size a little, and imposes 
  slight runtime overhead (the overhead of exception handling and of RTTI is discussed 
  in Chapter 6, "Exception Handling," and Chapter 7, "Runtime Type 
  Identification", respectively). When pure C is used, this additional code is 
  unnecessary. Please note, however, that you should not attempt to apply this 
  tweak with C++ code or C code that uses any C++ constructs such as operator 
  <tt>new</tt> and virtual functions.</p>
<h3> <a name="Heading26"> Inline Assembly</a></h3>
<p>Time-critical sections of C++ code can be rewritten in native assembly code. 
  The result can be a significant increase in speed. Note, however, that this 
  measure is not to be taken lightly because it makes future modifications much 
  more difficult. Programmers who maintain the code might not be familiar with 
  the particular assembly language that is used, or they might have no prior experience 
  in assembly language at all. Furthermore, porting the software to other platforms 
  requires rewriting of the assembly code parts (in some instances, upgrading 
  the processor can also necessitate rewriting). In addition, developing and testing 
  assembly code is an arduous task that can take much more time than developing 
  and testing code that is written in a high-level language. </p>
<p>Generally, operations that are coded in assembly are low-level library functions. 
  On most implementations, for example, the standard library functions <tt>memset()</tt> 
  and <tt>strcpy()</tt> are written in native assembly code. C++ enables 
  the programmer to embed assembly code by means of the <tt>asm</tt> keyword; the 
  exact syntax is implementation-specific, and some compilers accept a braced 
  <tt>asm</tt> block. For example:</p>
<pre>
<tt>asm   </tt>
<tt>{</tt>
<tt>  mov a, ecx</tt>
<tt>  //...</tt>
<tt>}</tt>
</pre>
<h3> <a name="Heading27"> Interacting with the Operating System Directly</a></h3>
<p>API functions and classes enable you to interact with the operating system. 
  Sometimes, however, executing a system command directly can be much faster. 
  For this purpose, you can use the standard function <tt>system()</tt> that takes 
  a shell command as a <tt>const char *</tt>. For example, on a DOS/Windows system, 
  you can display the files in the current directory as follows:</p>
<pre>
<tt>#include &lt;cstdlib&gt;</tt>
<tt>using namespace std;</tt>
<tt>int main()</tt>
<tt>{</tt>
<tt>  system("dir");  //execute the "dir" command</tt>
<tt>} </tt>
</pre>
<p>Here again, the tradeoff is between speed on the one hand and portability and 
  future extensibility on the other hand.</p>
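<p>Because <tt>system()</tt> depends on the host environment, it is prudent to check that a command processor exists and to examine the returned status. The wrapper below is a minimal sketch; the name <tt>run_command()</tt> is hypothetical, and the meaning of the value that <tt>system()</tt> returns for a command is implementation-defined (zero conventionally indicates success).</p>

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical wrapper: returns true only if a command processor is
// available and the command reported a zero status.
bool run_command(const char* cmd)
{
    if (std::system(nullptr) == 0)   // no command processor available
        return false;
    return std::system(cmd) == 0;    // status value is implementation-defined
}
```

<p>For instance, <tt>run_command("dir")</tt> lists the current directory on a DOS/Windows system and reports whether the command succeeded.</p>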
<h2> <a name="Heading28"> Conclusions</a></h2>
<p>In an ideal world, software designers and developers might focus their efforts 
  on robust, extensible, and readable code. Fortunately, the current state of 
  affairs in the software world is much closer to that ideal than it was 15, 30, 
  or 50 years ago. Notwithstanding that, performance tuning and optimizations 
  will probably remain a necessity for a long time. The faster hardware becomes, 
  the more the software that runs on it is required to meet higher demands. Speech 
  recognition, online translation of natural languages, neural networks, and complex 
  mathematical computations are only a few examples of resource-hungry applications 
  that will evolve in the future and require careful optimizations. </p>
<p>Textbooks often recommend that you put off optimization consideration to the 
  final stages of testing. Indeed, the primary goal is to get the system to work 
  correctly. Nonetheless, some of the techniques presented here -- such as declaring 
  objects locally, preferring prefix to postfix operators, and using initialization 
  instead of assignment -- need to become a natural habit. A common rule of 
  thumb is that programs spend about 90% of their time executing only 10% of their 
  code (the exact split varies, typically from 80/20 to 95/5). The first step in 
  optimization is, therefore, identifying that critical 10% of your program and 
  optimizing it. Many automated profiling and optimization tools 
  can assist you in identifying these critical code parts. Some of these tools 
  can also suggest solutions to enhance performance. Still, many of the optimization 
  techniques are implementation-specific and always require human expertise. It 
  is important to empirically verify your suspicions and to test the effect of 
  suggested code modifications to ensure that they indeed improve the system's 
  performance. Programmers' intuitions regarding the cost of certain operations 
  are often misleading. For example, shorter code is not necessarily faster code. 
  Similarly, writing convoluted code to avoid the cost of a simple <tt>if</tt> 
  statement is not worth the trouble because it saves only one or two CPU cycles. 
</p>
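<p>Empirical verification can start with a simple timing harness. The following is a minimal sketch that relies on the standard <tt>&lt;chrono&gt;</tt> library; the helper name <tt>time_ms()</tt> is hypothetical. It is not a substitute for a profiler, and the numbers it reports vary with the machine, the compiler, and the optimization settings.</p>

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls f() 'reps' times and returns the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
template <typename F>
double time_ms(F f, long reps)
{
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (long i = 0; i < reps; ++i)
        f();
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count();
}
```

<p>Timing two candidate implementations with the same number of repetitions, and repeating the measurement several times, gives a first indication of whether a suggested modification really pays off.</p>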
<CENTER>
<P>
<HR>
  <A HREF="/publishers/que/series/professional/0789720221/index.htm"><img src="/publishers/que/series/professional/0789720221/button/contents.gif" WIDTH="128"
HEIGHT="28" ALIGN="BOTTOM" ALT="Contents" BORDER="0"></A> <BR>
<BR>
<BR>
<p></P>

<P>&#169; <A HREF="/publishers/que/series/professional/0789720221/copy.htm">Copyright 1999</A>, Macmillan Computer Publishing. All
rights reserved.</p>
</CENTER>


</BODY>

</HTML>