📄 lib0035.html
字号:
<html>
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Measuring Performance</title>
<link rel="STYLESHEET" type="text/css" href="images/xpolecat.css">
<link rel="STYLESHEET" type="text/css" href="images/ie.content.books24x7.css">
</head>
<body >
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<td><div STYLE="MARGIN-LEFT: 0.15in;">
<a href="toc.html"><img src="images/teamlib.gif" width="62" height="15" border="0" align="absmiddle" alt="Team LiB"></a></div></td>
<td valign="top" class="v2" align="right"><div STYLE="MARGIN-RIGHT: 0.15in"><a href="LiB0034.html"><img src="images/previous.gif" width="62" height="15" border="0" align="absmiddle" alt="Previous Section"></a>
<a href="LiB0036.html"><img src="images/next.gif" width="41" height="15" border="0" align="absmiddle" alt="Next Section"></a>
</div></td></tr>
</table>
<div class="chapter">
<a name="ch04"></a>
<div class="section">
<h2 class="first-section-title"><a name="431"></a><a name="ch04lev1sec4"></a>Measuring Performance</h2><p class="first-para">Given that I have decided to error on the side of simplicity, it would be interesting to see what it costs me in terms of performance relative to a commercial implementation of <span class="fixed">malloc()</span> and <span class="fixed">free()</span>. Although marketing people will barrage you with all sorts of obscure metrics when they are trying to sell you something, measuring performance is really not that complicated, as you will see.</p>
<div class="section">
<h3 class="sect3-title">
<a name="432"></a><a name="ch04lev2sec1"></a>The Ultimate Measure: Time</h3>
<p class="first-para">According to John Hennessy and David Patterson in their book <I>Computer Architecture: A Quantitative Approach</I>, there is one true measure of performance: <i class="emphasis">execution time</i>. In other words, how long did it take for an event to transpire from the moment it started to the moment it completed? Execution time is also known as <i class="emphasis">wall clock time,</i> or <i class="emphasis">response time</i>.</p>
<table border="0" cellspacing="0" cellpadding="0" class="note">
<tr>
<td valign="top" class="admon-check"></td><td valign="top" class="admon-title">Note </td><td valign="top" class="admon-body">
<p class="first-para">There are other measures of performance that you may see in literature, like MIPS (millions of instructions per second) and MFLOPS (million floating-point operations per second), but these are poor metrics that can lead to confusing comparisons. For an explanation of why, see Patterson and Hennessey's book.</p>
</td>
</tr>
</table>
<p class="para">Naturally, there are ways to slice and dice execution time. You can measure how much time the processor itself spends executing a program. This may be a pertinent value if you are working on a multitasking operating system. If you are executing a program on a machine that is carrying a heavy load of tasks, the program under observation may appear to run more slowly (via wall clock time) even if it is actually running faster.</p>
<p class="para">You can subdivide processor time into how much time the processor spends executing a task in user mode and how much time the processor spends executing a task in kernel mode. Encryption programs spend most of their time in user space crunching numbers, while I/O-intensive programs spend most of their time in kernel space interacting with the hardware.</p>
<a name="433"></a><a name="IDX-213"></a>
<table border="0" cellspacing="0" cellpadding="0" class="note">
<tr>
<td valign="top" class="admon-check"></td><td valign="top" class="admon-title">Note </td><td valign="top" class="admon-body">
<p class="first-para">Given that time is such a useful performance metric, what exactly is it? According to Einstein, "Time is that which is measured by a clock." While this is correct, it is not really satisfying. Another possible definition lies in the nature of light; you could define one second as the amount of time it takes a photon in a vacuum to travel 3×10<sup>8</sup> meters. Again, this still may not give you a concise enough definition. There is a story of a foreign student who once asked a professor at Princeton, "Please, sir, what is time?" The professor responded by saying, "I am afraid I can't tell you; I am a physics professor. Maybe you should ask someone in the philosophy department."</p>
</td>
</tr>
</table>
<p class="para">Unfortunately, a time measurement in and of itself really doesn't tell you everything. This is because time measurements are context sensitive. Telling someone that a program took 300 milliseconds to execute doesn't give them the entire picture. The execution time of a program is dependent upon many things, including:</p>
<ul class="itemizedlist">
<li class="first-listitem">
<p class="first-para">The hardware that the program ran on</p>
</li>
<li class="listitem">
<p class="first-para">The algorithms that the program used</p>
</li>
<li class="listitem">
<p class="first-para">The development tools used to build the program</p>
</li>
<li class="listitem">
<p class="first-para">The distribution of the data that the program operated on</p>
</li>
</ul>
<p class="para">The time measurements that I collected in this chapter were generated by programs running on a 700 MHz Pentium III. All of the programs were built using Visual Studio 6.0. Each implementation uses a different algorithm in conjunction with a data distribution that, I will admit, is slightly artificial. The system that I worked on was a single-user Windows 2000 machine with a bare minimum number of running tasks.</p>
<p class="last-para">My hope, however, is not that the individual measurements will mean anything by themselves. Rather, I am more interested in seeing how the different algorithms compare to each other. You should notice a time differential between algorithms, as long as the other three independent variables (hardware, tools, data) are held constant. This time differential is what is important.</p>
<a></a>
</div>
<div class="section">
<h3 class="sect3-title">
<a name="434"></a><a name="ch04lev2sec2"></a>ANSI and Native Time Routines</h3>
<p class="first-para">In order to determine the execution time of a program, you will need to take advantage of related library calls. There are two standard ANSI C functions declared in the <span class="fixed">time.h</span> header file that can be applied to this end:</p>
<div class="informalexample">
<pre class="literallayout">
clock_t clock();
time_t time(time_t *tptr);
</pre>
</div>
<p class="para">The <span class="fixed">time()</span> function returns the number of seconds that have occurred since the epoch. The <i class="emphasis">epoch</i> is an arbitrary reference point; <a name="435"></a><a name="IDX-214"></a>in most cases, it is January 1, 1970 (00:00:00). The problem with the <span class="fixed">time()</span> function is that it works with time units that do not possess a fine enough granularity. Most important application events occur on the scale of milliseconds, microseconds, or even nanoseconds.</p>
<p class="para">The <span class="fixed">clock()</span> function returns the number of system clock ticks that have occurred since a process was launched. The number of ticks per second is defined by the <span class="fixed">CLOCKS_PER_SEC</span> macro. This value will vary from one hardware platform to the next, seeing as how it is a processor-specific value.</p>
<p class="para">The Win32 API provides a routine called <span class="fixed">GetTickCount()</span> that returns the number of milliseconds that have passed since the operating system was booted. I decided to use this function to time my code. If you are more interested in writing portable code, you might want to use <span class="fixed">clock()</span>.</p>
<p class="para">Here is a short example that demonstrates how all three of these functions are used in practice:</p>
<div class="informalexample">
<pre class="literallayout">
<span style="background-color:d9d9d9">/* --ansiTime.c-- */</span>
<span style="background-color:d9d9d9">#include<stdio.h></span>
<span style="background-color:d9d9d9">#include<time.h></span>
<span style="background-color:d9d9d9">#include<windows.h></span>
<span style="background-color:d9d9d9">void main()</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">unsigned long i;</span>
<span style="background-color:d9d9d9">time_t t1,t2;</span>
<span style="background-color:d9d9d9">clock_t ticks1,ticks2, dt;</span>
<span style="background-color:d9d9d9">unsigned long msec1,msec2;</span>
<span style="background-color:d9d9d9">time(&t1);</span>
<span style="background-color:d9d9d9">ticks1 = clock();</span>
<span style="background-color:d9d9d9">msec1 = GetTickCount();</span>
<span style="background-color:d9d9d9">/*do some work*/</span>
<span style="background-color:d9d9d9">for(i=0;i<0xFFFFFFFF;i++){}</span>
<span style="background-color:d9d9d9">time(&t2);</span>
<span style="background-color:d9d9d9">ticks2 = clock();</span>
<span style="background-color:d9d9d9">msec2 = GetTickCount();</span>
<span style="background-color:d9d9d9">printf("number of elapsed seconds = %lu\n",t2-t1);</span>
<span style="background-color:d9d9d9">dt = ticks2-ticks1;</span>
<span style="background-color:d9d9d9">printf("number of clock ticks = %lu\n",dt);</span>
<span style="background-color:d9d9d9">printf("ticks/second = %lu\n",CLOCKS_PER_SEC);</span><a name="436"></a><a name="IDX-215"></a>
<span style="background-color:d9d9d9">printf("number of elapsed seconds = %lu\n",</span>
<span style="background-color:d9d9d9">dt/CLOCKS_PER_SEC);</span>
<span style="background-color:d9d9d9">printf("msecs=%lu\n",msec2-msec1);</span>
<span style="background-color:d9d9d9">return;</span>
<span style="background-color:d9d9d9">}</span>
</pre>
</div>
<p class="para">If this program is built and run, the following type of output will be generated:</p>
<div class="informalexample">
<pre class="literallayout">
number of elapsed seconds = 31
number of clock ticks = 30980
ticks/second = 1000
number of elapsed seconds = 30
msecs=30960
</pre>
</div>
<div class="section">
<h4 class="sect4-title">The Data Distribution: Creating Random Variates</h4>
<p class="first-para">To test the performance of my manual memory managers, I will need to run them through a lengthy series of allocations and deallocations. Naturally, I cannot simply allocate memory blocks that all have the same size.</p>
<div class="informalexample">
<pre class="literallayout">
<span style="background-color:d9d9d9">#include<stdlib.h></span>
<span style="background-color:d9d9d9">void main()</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">unsigned int i,j;</span>
<span style="background-color:d9d9d9">unsigned int nblocks;</span>
<span style="background-color:d9d9d9">unsigned int nbytes;</span>
<span style="background-color:d9d9d9">unsigned char* ptrs[1024];</span>
<span style="background-color:d9d9d9">nbytes=4096;</span>
<span style="background-color:d9d9d9">nblocks=1024;</span>
<span style="background-color:d9d9d9">for(i=0;i<nblocks;i++)</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">ptrs[i]=malloc(nbytes);</span>
<span style="background-color:d9d9d9">for(j=0;j<nbytes;j++)</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">char *cptr;</span>
<span style="background-color:d9d9d9">cptr = ptrs[i];</span>
<span style="background-color:d9d9d9">cptr[j] = 'a';</span>
<span style="background-color:d9d9d9">}</span>
<span style="background-color:d9d9d9">}</span>
<span style="background-color:d9d9d9">for(i=0;i<nblocks;i++)</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">free (ptrs[i]);</span>
<span style="background-color:d9d9d9">}</span>
<span style="background-color:d9d9d9">return;</span>
<span style="background-color:d9d9d9">}</span><a name="437"></a><a name="IDX-216"></a>
</pre>
</div>
<p class="para">The previous program does not force a manager to deal with the arbitrary requests that a heap manager normally encounters. This kind of test is unrealistic.</p>
<p class="para">On the other hand, a completely random series of allocations is also not very realistic.</p>
<div class="informalexample">
<pre class="literallayout">
<span style="background-color:d9d9d9">#include<stdlib.h></span>
<span style="background-color:d9d9d9">void main()</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">unsigned int i,j;</span>
<span style="background-color:d9d9d9">unsigned int nblocks;</span>
<span style="background-color:d9d9d9">unsigned int nbytes;</span>
<span style="background-color:d9d9d9">unsigned char* ptrs[1024];</span>
<span style="background-color:d9d9d9">nblocks=1024;</span>
<span style="background-color:d9d9d9">for(i=0;i<nblocks;i++)</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">nbytes=rand();</span>
<span style="background-color:d9d9d9">ptrs[i]=malloc(nbytes);</span>
<span style="background-color:d9d9d9">for(j=0;j<nbytes;j++)</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">char *cptr;</span>
<span style="background-color:d9d9d9">cptr = ptrs[i];</span>
<span style="background-color:d9d9d9">cptr[j] = 'a';</span>
<span style="background-color:d9d9d9">}</span>
<span style="background-color:d9d9d9">}</span>
<span style="background-color:d9d9d9">for(i=0;i<nblocks;i++)</span>
<span style="background-color:d9d9d9">{</span>
<span style="background-color:d9d9d9">free(ptrs[i]);</span>
<span style="background-color:d9d9d9">}</span>
<span style="background-color:d9d9d9">return;</span>
<span style="background-color:d9d9d9">}</span>
</pre>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -