<tt> };</tt>
<tt>public:</tt>
<tt> PersonalDetails(const char *nm); </tt>
<tt> PersonalDetails(long id) : ID(id) {/**/} // direct access to a member</tt>
<tt> //...</tt>
<tt>};</tt>
</pre>
<p>By using a union, the size of class <tt>PersonalDetails</tt> is halved. Again,
saving four bytes of memory is not worth the trouble unless this class serves
as a mold for millions of database records or if the records are transmitted
on slow communication lines. Note that unions do not incur any runtime overhead,
so there is no speed tradeoff in this case. The advantage of an anonymous union
over a named one is that its members can be accessed directly. </p>
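<p>The following is a minimal, self-contained sketch of the idea rather than the complete class. The comparison class <tt>PersonalDetailsNoUnion</tt> and the accessor <tt>id()</tt> are hypothetical names that are introduced here for illustration only:</p>

```cpp
// Hypothetical comparison class: name and ID occupy separate storage.
class PersonalDetailsNoUnion
{
private:
  const char *name;
  long ID;
public:
  PersonalDetailsNoUnion(const char *nm, long id) : name(nm), ID(id) {}
};

// Union-based layout: name and ID share the same storage.
class PersonalDetails
{
private:
  union // anonymous union: its members are accessed without a qualifier
  {
    const char *name;
    long ID;
  };
public:
  PersonalDetails(const char *nm) : name(nm) {}
  PersonalDetails(long id) : ID(id) {} // direct access to a member
  // Assumed accessor; valid only for objects that were constructed with an ID.
  long id() const { return ID; }
};
```

<p>Because only one member of a union holds a value at any time, the caller has to remember which constructor was used; the sketch does not record this.</p>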
<h2> <a name="Heading18"> Speed Optimizations</a></h2>
<p>In time-critical applications, every CPU cycle counts. This section presents
a few simple guidelines for speed optimization. Some of them have been around
since the early days of C; others are C++ specific. </p>
<h3> <a name="Heading19"> Using a Class To Pack a Long Argument List</a></h3>
<p>The overhead of a function call is increased when the function has a long list
of arguments. The runtime system has to initialize the stack with the values
of the arguments; naturally, this operation takes longer when there are more
arguments. For example, executing the following function 100,000,000 times takes
8.5 seconds on average on my machine:</p>
<pre>
<tt>void retrieve(const string& title, //5 arguments </tt>
<tt> const string& author, </tt>
<tt> int ISBN, </tt>
<tt> int year, </tt>
<tt> bool& inStore)</tt>
<tt>{} </tt>
</pre>
<p>Packing the argument list into a single class and passing it by reference as
the only argument reduces the result to five seconds, on average. Of course,
for functions that take a long time to execute, the stack initialization overhead
is negligible. However, for short and fast functions that are called very often,
packing a long parameter list within a single object and passing it by reference
can improve performance.</p>
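<p>A packed version of <tt>retrieve()</tt> might look like the following sketch. The struct name <tt>BookQuery</tt> and the placeholder body are assumptions made for illustration; the original function has an empty body:</p>

```cpp
#include <string>

// Hypothetical aggregate that bundles the five arguments of retrieve().
struct BookQuery
{
  std::string title;
  std::string author;
  int ISBN;
  int year;
  bool inStore; // output field, filled in by retrieve()
};

// A single reference is passed, no matter how many fields the struct has.
void retrieve(BookQuery& query)
{
  // Placeholder logic, assumed for illustration only.
  query.inStore = !query.title.empty();
}
```

<p>The caller fills in a <tt>BookQuery</tt> object once and passes it to <tt>retrieve()</tt>; the result is read back from the <tt>inStore</tt> field.</p>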
<h3> <a name="Heading20"> Register Variables</a></h3>
<p>The storage specifier <tt>register</tt> can be used as a hint to the compiler
that an object will be heavily used in the program. For example:</p>
<pre>
<tt>void f()</tt>
<tt>{</tt>
<tt> int *p = new int[3000000];</tt>
<tt> register int *p2 = p; //store the address in a register</tt>
<tt> for (register int j = 0; j<3000000; j++)</tt>
<tt> {</tt>
<tt> *p2++ = 0;</tt>
<tt> }</tt>
<tt> //...use p</tt>
<tt> delete [] p;</tt>
<tt>}</tt>
</pre>
<p>Loop counters are good candidates for being declared as register variables.
When they are not stored in a register, a substantial amount of the loop's execution
time is wasted in fetching the variable from memory, assigning a new value to
it, and storing it back in memory repeatedly. Storing it in a machine register
reduces this overhead. Note, however, that <tt>register</tt> is only a recommendation
to the compiler. As with function inlining, the compiler can refuse to store
the object in a machine register. Furthermore, modern compilers optimize loop
counters and move them to the machine's registers anyway. The <tt>register</tt>
storage specifier is not confined to fundamental types; it can be
used for any type of object. If the object is too large to fit into a register,
the hint simply has no effect: the compiler cannot place an object in cache
memory, because the cache is managed by the hardware rather than by the program.</p>
<blockquote>
<hr>
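<strong>NOTE: </strong> Some compilers ignore the <tt>register</tt> specification
altogether and automatically store the program's variables according to a set
of built-in optimization rules. Please consult your vendor's specifications
for more details on the compiler's handling of register declarations. In addition,
<tt>register</tt> was deprecated in C++11 and removed as a storage specifier in
C++17, so code that uses it is ill-formed under C++17 and later.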
<hr>
</blockquote>
<p>Declaring function parameters with the <tt>register</tt> storage specifier
is a recommendation to pass the arguments in the machine's registers rather
than on the stack. For example:</p>
<pre>
<tt>void f(register int j, register Date d);</tt>
</pre>
<h3> <a name="Heading21"> Declaring Constant Objects as const</a></h3>
<p>In addition to the other boons of declaring constant objects as <tt>const</tt>,
an optimizing compiler can take advantage of this declaration, too, and store
such an object in a machine register instead of in ordinary memory. Note that
the same optimization can be applied to function parameters that are declared
<tt>const</tt>. On the other hand, the <tt>volatile</tt> qualifier disables
such an optimization (see Appendix A, "Manual of Programming Style"), so use
it only when it is unavoidable.</p>
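<p>The difference can be sketched with two otherwise identical functions. The function names are hypothetical, and whether a given compiler actually performs the optimization depends on the compiler and its settings:</p>

```cpp
// scale is const and initialized with a constant, so the compiler is free
// to keep it in a register or to fold its value into the multiplication.
int scaledSum(const int *values, int count)
{
  const int scale = 4;
  int sum = 0;
  for (int i = 0; i < count; ++i)
    sum += values[i] * scale;
  return sum;
}

// scale is volatile, so every access must read it from memory again.
int scaledSumVolatile(const int *values, int count)
{
  volatile int scale = 4;
  int sum = 0;
  for (int i = 0; i < count; ++i)
    sum += values[i] * scale;
  return sum;
}
```

<p>Both functions return the same result; only the freedom that the compiler has in generating the loop differs.</p>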
<h3> <a name="Heading22"> Runtime Overhead of Virtual Functions </a></h3>
<p>When a virtual function is called through a pointer or a reference to an object,
the call doesn't necessarily impose additional runtime penalties. If the compiler
can resolve the call statically, no extra overhead is incurred. Furthermore,
a very short virtual function can be inlined in this case. In the following
example, a clever compiler can resolve the calls of the virtual member functions
statically: </p>
<pre>
<tt>#include <iostream></tt>
<tt>using namespace std;</tt>
<tt>class V </tt>
<tt>{</tt>
<tt>public: </tt>
<tt> virtual void show() const { cout<<"I'm V"<<endl; }</tt>
<tt>};</tt>
<tt>class W : public V </tt>
<tt>{</tt>
<tt>public:</tt>
<tt> void show() const { cout<<"I'm W"<<endl; }</tt>
<tt>};</tt>
<tt>void f(V & v, V *pV) </tt>
<tt>{</tt>
<tt> v.show(); </tt>
<tt> pV->show(); </tt>
<tt>}</tt>
<tt>void g()</tt>
<tt>{</tt>
<tt> V v;</tt>
<tt> f(v, &v);</tt>
<tt>}</tt>
<tt>int main()</tt>
<tt>{</tt>
<tt> g();</tt>
<tt> return 0;</tt>
<tt>}</tt>
</pre>
<p>If the entire program appears in a single translation unit, the compiler can
perform an inline substitution of the call of the function <tt>g()</tt> in <tt>main()</tt>.
The invocation of <tt>f()</tt> within <tt>g()</tt> can also be inlined, and
because the dynamic type of the arguments that are passed to <tt>f()</tt> is
known at compile time, the compiler can resolve the virtual function calls inside
<tt>f()</tt> statically. There is no guarantee that every compiler actually
inlines all the function calls; however, some compilers certainly take advantage
of the fact that the dynamic type of the arguments of <tt>f()</tt> can be determined
at compile time, and avoid the overhead of dynamic binding in this case. </p>
<h3> <a name="Heading23"> Function Objects Versus Function Pointers</a></h3>
<p>The benefits of using function objects instead of function pointers (function
objects are discussed in Chapter 10 and in Chapter 3, "Operator Overloading")
are not limited to genericity and easier maintenance. Compilers can also inline
the call of a function object, thereby enhancing performance (inlining a call
through a function pointer is rarely possible). </p>
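<p>As an illustration, the sketch below sorts a vector in descending order twice, once with a function pointer and once with a function object. The names <tt>greaterFn</tt>, <tt>Greater</tt>, <tt>sortWithPointer()</tt>, and <tt>sortWithObject()</tt> are hypothetical; <tt>std::sort</tt> is the standard algorithm:</p>

```cpp
#include <algorithm>
#include <vector>

// Ordinary function: std::sort receives a pointer to it, so the comparison
// is normally an indirect call.
bool greaterFn(int a, int b) { return a > b; }

// Function object: the comparison is part of the type that std::sort is
// instantiated with, which makes operator() an easy candidate for inlining.
struct Greater
{
  bool operator()(int a, int b) const { return a > b; }
};

void sortWithPointer(std::vector<int>& v)
{
  std::sort(v.begin(), v.end(), greaterFn);
}

void sortWithObject(std::vector<int>& v)
{
  std::sort(v.begin(), v.end(), Greater());
}
```

<p>Both calls produce the same ordering. Whether the second one is actually faster depends on the compiler, so it is worth measuring.</p>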
<h2> <a name="Heading24"> A Last Resort</a></h2>
<p>The optimization techniques that have been presented thus far do not dictate
design compromises or less readable code. In fact, some of them <i>improve</i>
the software's robustness and the ease of maintenance. Packing a long argument
list within a class object, <tt>const</tt> declarations, and using function
objects rather than function pointers provide additional benefits on top of
the performance boost. Under strict time and memory constraints, however, these
techniques might not suffice; additional tweaks are sometimes required, which
affect the portability and extensibility of the software. The techniques that
are presented in this section are to be used only as a last resort, and only
after all the other optimizations have been applied.</p>
<h3> <a name="Heading25"> Disabling RTTI and Exception Handling Support</a></h3>
<p>When you port pure C code to a C++ compiler, you might discover a slight performance
degradation. This is not a fault in the programming language or the compiler,
but a matter of compiler tuning. All you have to do to gain the same (or better)
performance that you might get from a C compiler is switch off the compiler's
RTTI and exception handling support. Why is this? In order to support RTTI or
exception handling, a C++ compiler adds "scaffolding" code and data to
the generated program. This increases the executable size a little, and imposes
slight runtime overhead (the overhead of exception handling and of RTTI is discussed
in Chapter 6, "Exception Handling," and Chapter 7, "Runtime Type
Identification," respectively). When pure C is used, this additional code is
unnecessary. Please note, however, that you should not attempt to apply this
tweak to C++ code, or to C code that uses any C++ constructs such as operator
<tt>new</tt> and virtual functions.</p>
<h3> <a name="Heading26"> Inline Assembly</a></h3>
<p>Time-critical sections of C++ code can be rewritten in native assembly code.
The result can be a significant increase in speed. Note, however, that this
measure is not to be taken lightly because it makes future modifications much
more difficult. Programmers who maintain the code might not be familiar with
the particular assembly language that is used, or they might have no prior experience
in assembly language at all. Furthermore, porting the software to other platforms
requires rewriting of the assembly code parts (in some instances, upgrading
the processor can also necessitate rewriting). In addition, developing and testing
assembly code is an arduous task that can take much more time than developing
and testing code that is written in a high-level language. </p>
<p>Generally, operations that are coded in assembly are low-level library functions.
On most implementations, for example, the standard library functions <tt>memset()</tt>
and <tt>strcpy()</tt> are written in native assembly code. C++ provides the <tt>asm</tt>
declaration for embedding assembly code; its syntax and meaning are implementation-defined,
and many C compilers offer a similar extension. The block form that is shown here
is accepted by some compilers only. For example:</p>
<pre>
<tt>asm </tt>
<tt>{</tt>
<tt> mov a, ecx</tt>
<tt> //...</tt>
<tt>}</tt>
</pre>
<h3> <a name="Heading27"> Interacting with the Operating System Directly</a></h3>
<p>API functions and classes enable you to interact with the operating system.
Sometimes, however, executing a system command directly can be much faster.
For this purpose, you can use the standard function <tt>system()</tt> that takes
a shell command as a <tt>const char *</tt>. For example, on a DOS/Windows system,
you can display the files in the current directory as follows:</p>
<pre>
<tt>#include <cstdlib></tt>
<tt>using namespace std;</tt>
<tt>int main()</tt>
<tt>{</tt>
<tt> system("dir"); //execute the "dir" command</tt>
<tt>} </tt>
</pre>
<p>Here again, the tradeoff is between speed on the one hand and portability and
future extensibility on the other hand.</p>
<h2> <a name="Heading28"> Conclusions</a></h2>
<p>In an ideal world, software designers and developers might focus their efforts
on robust, extensible, and readable code. Fortunately, the current state of
affairs in the software world is much closer to that ideal than it was 15, 30,
or 50 years ago. Notwithstanding that, performance tuning and optimizations
will probably remain a necessity for a long time. The faster hardware becomes,
the more the software that runs on it is required to meet higher demands. Speech
recognition, online translation of natural languages, neural networks, and complex
mathematical computations are only a few examples of resource-hungry applications
that will evolve in the future and require careful optimizations. </p>
<p>Textbooks often recommend that you put off optimization considerations until the
final stages of testing. Indeed, the primary goal is to get the system to work
correctly. Nonetheless, some of the techniques presented here -- such as declaring
objects locally, preferring prefix to postfix operators, and using initialization
instead of assignment -- need to become a natural habit. It is a well-known
fact that programs usually spend 90% of their time executing only 10% of their
code (the numbers might vary, but they range from 80% and 20% to 95% and
5%). The first step in optimization is, therefore, identifying that 10% of your
program and optimizing it. Many automated profiling and optimization tools
can assist you in identifying these critical code parts. Some of these tools
can also suggest solutions to enhance performance. Still, many of the optimization
techniques are implementation-specific and always require human expertise. It
is important to empirically verify your suspicions and to test the effect of
suggested code modifications to ensure that they indeed improve the system's
performance. Programmers' intuitions regarding the cost of certain operations
are often misleading. For example, shorter code is not necessarily faster code.
Similarly, writing convoluted code to avoid the cost of a simple <tt>if</tt>
statement is not worth the trouble because it saves only one or two CPU cycles.
</p>
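<p>One simple way to verify a suspicion empirically is to time the candidate code before and after a change. The helper below is a minimal sketch that assumes a C++11 compiler; the name <tt>timeInMilliseconds()</tt> is hypothetical, whereas <tt>std::chrono::steady_clock</tt> is a standard library clock:</p>

```cpp
#include <chrono>

// Hypothetical helper: calls op() the given number of times and returns the
// elapsed wall-clock time in milliseconds.
// Note that an optimizer may remove an operation that has no visible effect,
// so the timed code should produce a result that is used afterwards.
template <typename Op>
double timeInMilliseconds(Op op, long iterations)
{
  using Clock = std::chrono::steady_clock;
  const Clock::time_point start = Clock::now();
  for (long i = 0; i < iterations; ++i)
    op();
  const Clock::time_point stop = Clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

<p>Timing both versions of a function with the same iteration count, several times over, gives a rough but direct indication of whether a modification really pays off. A profiler remains the better tool for finding the critical code in the first place.</p>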
<CENTER>
<P>
<HR>
<p></P>
<P>© <A HREF="/publishers/que/series/professional/0789720221/copy.htm">Copyright 1999</A>, Macmillan Computer Publishing. All
rights reserved.</p>
</CENTER>
</BODY>
</HTML>