<p>As discussed in <a href="http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt">
draft-ietf-http-connection-00.txt</a> section 8, in order for
an HTTP server to <strong>reliably</strong> implement the
protocol it needs to shut down each direction of the
communication independently (recall that a TCP connection is
bi-directional, each half is independent of the other). This
fact is often overlooked by other servers, but is correctly
implemented in Apache as of 1.2.</p>
<p>When this feature was added to Apache it caused a flurry of
problems on various versions of Unix because of an oversight.
The TCP specification does not state that the
<code>FIN_WAIT_2</code> state has a timeout, but it doesn't prohibit one.
On systems without the timeout, Apache 1.2 leaves many sockets
stuck forever in the <code>FIN_WAIT_2</code> state. In many cases this
can be avoided by simply upgrading to the latest TCP/IP patches
supplied by the vendor. In cases where the vendor has never
released patches (<em>e.g.</em>, SunOS4 -- although folks with
a source license can patch it themselves) we have decided to
disable this feature.</p>
<p>There are two ways of accomplishing a lingering close. One is the socket
option <code>SO_LINGER</code>. But as fate would have it, this
has never been implemented properly in most TCP/IP stacks. Even
on those stacks with a proper implementation (<em>e.g.</em>,
Linux 2.0.31) this method proves to be more expensive (in CPU time)
than the next solution.</p>
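<p>For comparison, here is a minimal sketch of what the
<code>SO_LINGER</code> approach looks like from the application side.
The function name <code>enable_so_linger</code> is hypothetical; only
<code>setsockopt</code> and <code>struct linger</code> come from the
standard sockets API. How <code>close</code> then behaves depends on the
TCP/IP stack, which is exactly the problem described above.</p>

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel to let close() linger for up to
 * `seconds` while unsent data drains.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_so_linger(int sock, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);
    lg.l_onoff = 1;          /* turn lingering on */
    lg.l_linger = seconds;   /* linger timeout */
    return setsockopt(sock, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```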
<p>For the most part, Apache implements this in a function
called <code>lingering_close</code> (in
<code>http_main.c</code>). The function looks roughly like
this:</p>
<div class="example"><p><code>
void lingering_close (int s)<br />
{<br />
<span class="indent">
char junk_buffer[2048];<br />
<br />
/* shutdown the sending side */<br />
shutdown (s, 1);<br />
<br />
signal (SIGALRM, lingering_death);<br />
alarm (30);<br />
<br />
for (;;) {<br />
<span class="indent">
select (s for reading, 2 second timeout);<br />
if (error) break;<br />
if (s is ready for reading) {<br />
<span class="indent">
if (read (s, junk_buffer, sizeof (junk_buffer)) &lt;= 0) {<br />
<span class="indent">
break;<br />
</span>
}<br />
/* just toss away whatever is here */<br />
</span>
}<br />
</span>
}<br />
<br />
close (s);<br />
</span>
}
</code></p></div>
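<p>The pseudocode above can be turned into compilable C along the
following lines. This is only a sketch and not the code in
<code>http_main.c</code>: the name <code>lingering_close_sketch</code>
and the two constants are hypothetical, and a wall-clock deadline stands
in for the <code>SIGALRM</code> handler so that the example needs no
signal handling.</p>

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

#define LINGER_TOTAL_SECS 30   /* overall cap, stands in for alarm(30) */
#define LINGER_POLL_SECS   2   /* timeout for each select() call */

/* Half-close the socket, discard whatever the peer still sends, then
 * close it. Returns the result of close(). */
int lingering_close_sketch(int s)
{
    char junk_buffer[2048];
    time_t deadline = time(NULL) + LINGER_TOTAL_SECS;

    assert(s >= 0 && s < FD_SETSIZE);   /* FD_SET requires this range */

    /* Shut down the sending side only; the peer sees our FIN. */
    if (shutdown(s, SHUT_WR) != 0)
        return close(s);

    while (time(NULL) < deadline) {
        fd_set readable;
        struct timeval tv;
        int ready;

        FD_ZERO(&readable);
        FD_SET(s, &readable);
        tv.tv_sec = LINGER_POLL_SECS;
        tv.tv_usec = 0;

        ready = select(s + 1, &readable, NULL, NULL, &tv);
        if (ready < 0)
            break;                      /* select failed */
        if (ready == 0)
            continue;                   /* nothing yet, re-check deadline */

        /* Toss away whatever is here; stop on EOF (0) or error (-1). */
        if (read(s, junk_buffer, sizeof(junk_buffer)) <= 0)
            break;
    }
    return close(s);
}
```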
<p>This naturally adds some expense at the end of a connection,
but it is required for a reliable implementation. As HTTP/1.1
becomes more prevalent, and all connections are persistent,
this expense will be amortized over more requests. If you want
to play with fire and disable this feature you can define
<code>NO_LINGCLOSE</code>, but this is not recommended at all.
In particular, as HTTP/1.1 pipelined persistent connections
come into use <code>lingering_close</code> is an absolute
necessity (and <a href="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">
pipelined connections are faster</a>, so you want to support
them).</p>
<h3>Scoreboard File</h3>
<p>Apache's parent and children communicate with each other
through something called the scoreboard. Ideally this should be
implemented in shared memory. For those operating systems that
we either have access to, or have been given detailed ports
for, it typically is implemented using shared memory. The rest
default to using an on-disk file. The on-disk file is not only
slow, but it is also unreliable (and has fewer features). Peruse the
<code>src/main/conf.h</code> file for your architecture and
look for either <code>USE_MMAP_SCOREBOARD</code> or
<code>USE_SHMGET_SCOREBOARD</code>. Defining one of those two
(as well as their companions <code>HAVE_MMAP</code> and
<code>HAVE_SHMGET</code> respectively) enables the supplied
shared memory code. If your system has another type of shared
memory, edit the file <code>src/main/http_main.c</code> and add
the hooks necessary to use it in Apache. (Send us back a patch
too please.)</p>
<div class="note">Historical note: The Linux port of Apache didn't start to
use shared memory until version 1.2 of Apache. This oversight
resulted in really poor and unreliable behaviour of earlier
versions of Apache on Linux.</div>
<h3>DYNAMIC_MODULE_LIMIT</h3>
<p>If you have no intention of using dynamically loaded modules
(you probably don't if you're reading this and tuning your
server for every last ounce of performance) then you should add
<code>-DDYNAMIC_MODULE_LIMIT=0</code> when building your
server. This will save RAM that's allocated only for supporting
dynamically loaded modules.</p>
</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<h2><a name="trace" id="trace">Appendix: Detailed Analysis of a Trace</a></h2>
<p>Here is a system call trace of Apache 2.0.38 with the worker MPM
on Solaris 8. This trace was collected using:</p>
<div class="example"><p><code>
truss -l -p <var>httpd_child_pid</var>
</code></p></div>
<p>The <code>-l</code> option tells truss to log the ID of the
LWP (lightweight process--Solaris's form of kernel-level thread)
that invokes each system call.</p>
<p>Other systems may have different system call tracing utilities
such as <code>strace</code>, <code>ktrace</code>, or <code>par</code>.
They all produce similar output.</p>
<p>In this trace, a client has requested a 10KB static file
from the httpd. Traces of non-static requests or requests
with content negotiation look wildly different (and quite ugly
in some cases).</p>
<div class="example"><pre>/67: accept(3, 0x00200BEC, 0x00200C0C, 1) (sleeping...)
/67: accept(3, 0x00200BEC, 0x00200C0C, 1) = 9</pre></div>
<p>In this trace, the listener thread is running within LWP #67.</p>
<div class="note">Note the lack of <code>accept(2)</code> serialization. On this
particular platform, the worker MPM uses an unserialized accept by
default unless it is listening on multiple ports.</div>
<div class="example"><pre>/65: lwp_park(0x00000000, 0) = 0
/67: lwp_unpark(65, 1) = 0</pre></div>
<p>Upon accepting the connection, the listener thread wakes up
a worker thread to do the request processing. In this trace,
the worker thread that handles the request is mapped to LWP #65.</p>
<div class="example"><pre>/65: getsockname(9, 0x00200BA4, 0x00200BC4, 1) = 0</pre></div>
<p>In order to implement virtual hosts, Apache needs to know
the local socket address used to accept the connection. It
is possible to eliminate this call in many situations (such
as when there are no virtual hosts, or when
<code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code> directives
are used which do not have wildcard addresses). But
no effort has yet been made to do these optimizations. </p>
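<p>As a rough illustration of what the <code>getsockname(2)</code> call
provides, the sketch below retrieves the local IPv4 address and port of
a socket. The helper name <code>local_address</code> is hypothetical, and
the sketch handles only <code>AF_INET</code>; it is not how the httpd
itself wraps the call.</p>

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill `out` with the local IPv4 address that the
 * socket is bound to. Returns 0 on success, -1 on error or if the
 * socket is not an AF_INET socket. */
int local_address(int sock, struct sockaddr_in *out)
{
    socklen_t len = sizeof(*out);

    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    if (getsockname(sock, (struct sockaddr *)out, &len) != 0)
        return -1;
    return out->sin_family == AF_INET ? 0 : -1;
}
```

<p>A virtual host lookup could then compare <code>out-&gt;sin_addr</code>
and <code>out-&gt;sin_port</code> against the configured addresses.</p>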
<div class="example"><pre>/65: brk(0x002170E8) = 0
/65: brk(0x002190E8) = 0</pre></div>
<p>The <code>brk(2)</code> calls allocate memory from the heap.
It is rare to see these in a system call trace, because the httpd
uses custom memory allocators (<code>apr_pool</code> and
<code>apr_bucket_alloc</code>) for most request processing.
In this trace, the httpd has just been started, so it must
call <code>malloc(3)</code> to get the blocks of raw memory
with which to create the custom memory allocators.</p>
<div class="example"><pre>/65: fcntl(9, F_GETFL, 0x00000000) = 2
/65: fstat64(9, 0xFAF7B818) = 0
/65: getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B910, 2190656) = 0
/65: fstat64(9, 0xFAF7B818) = 0
/65: getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B914, 2190656) = 0
/65: setsockopt(9, 65535, 8192, 0xFAF7B918, 4, 2190656) = 0
/65: fcntl(9, F_SETFL, 0x00000082) = 0</pre></div>
<p>Next, the worker thread puts the connection to the client (file
descriptor 9) in non-blocking mode. The <code>setsockopt(2)</code>
and <code>getsockopt(2)</code> calls are a side-effect of how
Solaris's libc handles <code>fcntl(2)</code> on sockets.</p>
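<p>The <code>F_GETFL</code> / <code>F_SETFL</code> pair in the trace
corresponds to the usual way of switching a descriptor to non-blocking
mode. A minimal sketch, with the hypothetical helper name
<code>set_nonblocking</code>:</p>

```c
#include <fcntl.h>

/* Hypothetical helper: set O_NONBLOCK while preserving the other file
 * status flags. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);   /* read the current flags */

    if (flags == -1)
        return -1;
    /* Write them back with the non-blocking bit added. */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```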
<div class="example"><pre>/65: read(9, " G E T / 1 0 k . h t m".., 8000) = 97</pre></div>
<p>The worker thread reads the request from the client.</p>
<div class="example"><pre>/65: stat("/var/httpd/apache/httpd-8999/htdocs/10k.html", 0xFAF7B978) = 0
/65: open("/var/httpd/apache/httpd-8999/htdocs/10k.html", O_RDONLY) = 10</pre></div>
<p>This httpd has been configured with <code>Options FollowSymLinks</code>
and <code>AllowOverride None</code>. Thus it doesn't need to
<code>lstat(2)</code> each directory in the path leading up to the
requested file, nor check for <code>.htaccess</code> files.
It simply calls <code>stat(2)</code> to verify that the file:
1) exists, and 2) is a regular file, not a directory.</p>
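<p>The two checks described above can be sketched with a single
<code>stat(2)</code> call. The function name
<code>is_regular_file</code> is hypothetical; because <code>stat</code>
follows symbolic links, this matches the
<code>Options FollowSymLinks</code> case and not the per-directory
<code>lstat(2)</code> checks.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `path` exists and is a regular
 * file (after following symbolic links), 0 otherwise. */
int is_regular_file(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return 0;                        /* missing or inaccessible */
    return S_ISREG(st.st_mode) ? 1 : 0;  /* reject directories etc. */
}
```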
<div class="example"><pre>/65: sendfilev(0, 9, 0x00200F90, 2, 0xFAF7B53C) = 10269</pre></div>
<p>In this example, the httpd is able to send the HTTP response
header and the requested file with a single <code>sendfilev(2)</code>
system call. Sendfile semantics vary among operating systems. On some other
systems, it is necessary to do a <code>write(2)</code> or
<code>writev(2)</code> call to send the headers before calling
<code>sendfile(2)</code>.</p>
<div class="example"><pre>/65: write(4, " 1 2 7 . 0 . 0 . 1 - ".., 78) = 78</pre></div>
<p>This <code>write(2)</code> call records the request in the
access log. Note that one thing missing from this trace is a
<code>time(2)</code> call. Unlike Apache 1.3, Apache 2.0 uses
<code>gettimeofday(3)</code> to look up the time. On some operating
systems, like Linux or Solaris, <code>gettimeofday</code> has an
optimized implementation that doesn't require as much overhead
as a typical system call.</p>
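<p>For reference, a minimal sketch of reading the clock with
<code>gettimeofday</code>. The helper name <code>now_usec</code> and the
choice of a microsecond count as the return value are assumptions made
for this example.</p>

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: wall-clock time in microseconds since the Unix
 * epoch, or -1 if gettimeofday() fails. */
long long now_usec(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    return (long long)tv.tv_sec * 1000000LL + (long long)tv.tv_usec;
}
```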
<div class="example"><pre>/65: shutdown(9, 1, 1) = 0
/65: poll(0xFAF7B980, 1, 2000) = 1
/65: read(9, 0xFAF7BC20, 512) = 0
/65: close(9) = 0</pre></div>
<p>The worker thread does a lingering close of the connection.</p>
<div class="example"><pre>/65: close(10) = 0
/65: lwp_park(0x00000000, 0) (sleeping...)</pre></div>
<p>Finally the worker thread closes the file that it has just delivered
and blocks until the listener assigns it another connection.</p>
<div class="example"><pre>/67: accept(3, 0x001FEB74, 0x001FEB94, 1) (sleeping...)</pre></div>
<p>Meanwhile, the listener thread is able to accept another connection
as soon as it has dispatched this connection to a worker thread (subject
to some flow-control logic in the worker MPM that throttles the listener
if all the available workers are busy). Though it isn't apparent from
this trace, the next <code>accept(2)</code> can (and usually does, under
high load conditions) occur in parallel with the worker thread's handling
of the just-accepted connection.</p>
</div></div>
<div class="bottomlang">
<p><span>Available Languages: </span><a href="../en/misc/perf-tuning.html" title="English"> en </a> |
<a href="../ko/misc/perf-tuning.html" hreflang="ko" rel="alternate" title="Korean"> ko </a></p>
</div><div id="footer">
<p class="apache">Copyright 2006 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p></div>
</body></html>