the status report contains timing indications. For highest
performance, set <code>ExtendedStatus off</code> (which is the
default).</p>
<h3>accept Serialization - multiple sockets</h3>
<div class="warning"><h3>Warning:</h3>
<p>This section has not been fully updated
to take into account changes made in the 2.0 version of the
Apache HTTP Server. Some of the information may still be
relevant, but please use it with care.</p>
</div>
<p>This discusses a shortcoming in the Unix socket API. Suppose
your web server uses multiple <code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code> statements to listen on either multiple
ports or multiple addresses. In order to test each socket
to see if a connection is ready Apache uses
<code>select(2)</code>. <code>select(2)</code> indicates that a
socket has <em>zero</em> or <em>at least one</em> connection
waiting on it. Apache's model includes multiple children, and
all the idle ones test for new connections at the same time. A
naive implementation looks something like this (these examples
do not match the code; they're contrived for pedagogical
purposes):</p>
<div class="example"><p><code>
for (;;) {<br />
<span class="indent">
for (;;) {<br />
<span class="indent">
fd_set accept_fds;<br />
<br />
FD_ZERO (&accept_fds);<br />
for (i = first_socket; i <= last_socket; ++i) {<br />
<span class="indent">
FD_SET (i, &accept_fds);<br />
</span>
}<br />
rc = select (last_socket+1, &accept_fds, NULL, NULL, NULL);<br />
if (rc < 1) continue;<br />
new_connection = -1;<br />
for (i = first_socket; i <= last_socket; ++i) {<br />
<span class="indent">
if (FD_ISSET (i, &accept_fds)) {<br />
<span class="indent">
new_connection = accept (i, NULL, NULL);<br />
if (new_connection != -1) break;<br />
</span>
}<br />
</span>
}<br />
if (new_connection != -1) break;<br />
</span>
}<br />
process the new_connection;<br />
</span>
}
</code></p></div>
<p>But this naive implementation has a serious starvation problem.
Recall that multiple children execute this loop at the same
time, and so multiple children will block at
<code>select</code> when they are in between requests. All
those blocked children will awaken and return from
<code>select</code> when a single request appears on any socket
(the number of children which awaken varies depending on the
operating system and timing issues). They will all then fall
down into the loop and try to <code>accept</code> the
connection. But only one will succeed (assuming there's still
only one connection ready), the rest will be <em>blocked</em>
in <code>accept</code>. This effectively locks those children
into serving requests from that one socket and no other
sockets, and they'll be stuck there until enough new requests
appear on that socket to wake them all up. This starvation
problem was first documented in <a href="http://bugs.apache.org/index/full/467">PR#467</a>. There
are at least two solutions.</p>
<p>One solution is to make the sockets non-blocking. In this
case the <code>accept</code> won't block the children, and they
will be allowed to continue immediately. But this wastes CPU
time. Suppose you have ten idle children in
<code>select</code>, and one connection arrives. Then nine of
those children will wake up, try to <code>accept</code> the
connection, fail, and loop back into <code>select</code>,
accomplishing nothing. Meanwhile none of those children are
servicing requests that occurred on other sockets until they
get back up to the <code>select</code> again. Overall this
solution does not seem very fruitful unless you have as many
idle CPUs (in a multiprocessor box) as you have idle children,
not a very likely situation.</p>
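<p>As a rough sketch only (this is not Apache's actual code; the
descriptor name <code>sd</code> and the error handling are simplified
for illustration), each listening socket would be put into
non-blocking mode, and a child that loses the race gets
<code>EWOULDBLOCK</code> from <code>accept</code> instead of
blocking:</p>
<div class="example"><p><code>
#include <fcntl.h><br />
#include <errno.h><br />
<br />
/* at startup: mark each listening socket sd non-blocking */<br />
int flags = fcntl (sd, F_GETFL, 0);<br />
fcntl (sd, F_SETFL, flags | O_NONBLOCK);<br />
<br />
/* in the accept loop: a losing child fails immediately rather than blocking */<br />
new_connection = accept (sd, NULL, NULL);<br />
if (new_connection == -1 && (errno == EWOULDBLOCK || errno == EAGAIN)) {<br />
<span class="indent">
continue;  /* another child took the connection; go back to select */<br />
</span>
}<br />
</code></p></div>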
<p>Another solution, the one used by Apache, is to serialize
entry into the inner loop. The loop looks like this
(differences highlighted):</p>
<div class="example"><p><code>
for (;;) {<br />
<span class="indent">
<strong>accept_mutex_on ();</strong><br />
for (;;) {<br />
<span class="indent">
fd_set accept_fds;<br />
<br />
FD_ZERO (&accept_fds);<br />
for (i = first_socket; i <= last_socket; ++i) {<br />
<span class="indent">
FD_SET (i, &accept_fds);<br />
</span>
}<br />
rc = select (last_socket+1, &accept_fds, NULL, NULL, NULL);<br />
if (rc < 1) continue;<br />
new_connection = -1;<br />
for (i = first_socket; i <= last_socket; ++i) {<br />
<span class="indent">
if (FD_ISSET (i, &accept_fds)) {<br />
<span class="indent">
new_connection = accept (i, NULL, NULL);<br />
if (new_connection != -1) break;<br />
</span>
}<br />
</span>
}<br />
if (new_connection != -1) break;<br />
</span>
}<br />
<strong>accept_mutex_off ();</strong><br />
process the new_connection;<br />
</span>
}
</code></p></div>
<p><a id="serialize" name="serialize">The functions</a>
<code>accept_mutex_on</code> and <code>accept_mutex_off</code>
implement a mutual exclusion semaphore. Only one child can have
the mutex at any time. There are several choices for
implementing these mutexes. The choice is defined in
<code>src/conf.h</code> (pre-1.3) or
<code>src/include/ap_config.h</code> (1.3 or later). Some
architectures do not have any locking choice made; on these
architectures it is unsafe to use multiple
<code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code>
directives.</p>
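<p>As an illustration only (the real implementation in the server
source handles errors, signals and child shutdown), a
<code>flock(2)</code>-based pair could be sketched roughly like this,
assuming <code>lock_fd</code> is a descriptor opened on the lock file
before the children are forked:</p>
<div class="example"><p><code>
#include <sys/file.h><br />
#include <errno.h><br />
<br />
static int lock_fd = -1;   /* opened on the lock file at startup */<br />
<br />
void accept_mutex_on (void)<br />
{<br />
<span class="indent">
/* block until this child holds the exclusive lock, retrying on signals */<br />
while (flock (lock_fd, LOCK_EX) < 0 && errno == EINTR)<br />
<span class="indent">
continue;<br />
</span>
</span>
}<br />
<br />
void accept_mutex_off (void)<br />
{<br />
<span class="indent">
flock (lock_fd, LOCK_UN);   /* release the lock so another child may enter */<br />
</span>
}<br />
</code></p></div>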
<p>The directive <code class="directive"><a href="../mod/mpm_common.html#acceptmutex">AcceptMutex</a></code> can be used to
change the selected mutex implementation at run-time.</p>
<dl>
<dt><code>AcceptMutex flock</code></dt>
<dd>
<p>This method uses the <code>flock(2)</code> system call to
lock a lock file (located by the <code class="directive"><a href="../mod/mpm_common.html#lockfile">LockFile</a></code> directive).</p>
</dd>
<dt><code>AcceptMutex fcntl</code></dt>
<dd>
<p>This method uses the <code>fcntl(2)</code> system call to
lock a lock file (located by the <code class="directive"><a href="../mod/mpm_common.html#lockfile">LockFile</a></code> directive).</p>
</dd>
<dt><code>AcceptMutex sysvsem</code></dt>
<dd>
<p>(1.3 or later) This method uses SysV-style semaphores to
implement the mutex. Unfortunately SysV-style semaphores have
some bad side-effects. One is that it's possible Apache will
die without cleaning up the semaphore (see the
<code>ipcs(8)</code> man page). The other is that the
semaphore API allows for a denial of service attack by any
CGIs running under the same uid as the webserver
(<em>i.e.</em>, all CGIs, unless you use something like
<code class="program"><a href="../programs/suexec.html">suexec</a></code> or <code>cgiwrapper</code>). For these
reasons this method is not used on any architecture except
IRIX (where the previous two are prohibitively expensive
on most IRIX boxes).</p>
</dd>
<dt><code>AcceptMutex pthread</code></dt>
<dd>
<p>(1.3 or later) This method uses POSIX mutexes and should
work on any architecture implementing the full POSIX threads
specification; however, it appears to work only on Solaris (2.5
or later), and even then only in certain configurations. If
you experiment with this, you should watch out for your server
hanging and not responding. Servers that serve only static
content may work just fine.</p>
</dd>
<dt><code>AcceptMutex posixsem</code></dt>
<dd>
<p>(2.0 or later) This method uses POSIX semaphores. The
semaphore ownership is not recovered if a thread in the process
holding the mutex segfaults, resulting in a hang of the web
server.</p>
</dd>
</dl>
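<p>For example, to select the <code>flock</code> method explicitly
and keep the lock file on a local disk, the configuration might look
like this (the path shown is only illustrative):</p>
<div class="example"><p><code>
AcceptMutex flock<br />
LockFile /var/run/httpd/accept.lock<br />
</code></p></div>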
<p>If your system has another method of serialization that
isn't in the above list, then it may be worthwhile adding code
for it to APR.</p>
<p>Another solution that has been considered but never
implemented is to partially serialize the loop -- that is, let
in a certain number of processes. This would only be of
interest on multiprocessor boxes where it's possible multiple
children could run simultaneously, and the serialization
actually doesn't take advantage of the full bandwidth. This is
a possible area of future investigation, but priority remains
low because highly parallel web servers are not the norm.</p>
<p>Ideally you should run servers without multiple
<code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code>
statements if you want the highest performance.
But read on.</p>
<h3>accept Serialization - single socket</h3>
<p>The above is fine and dandy for multiple socket servers, but
what about single socket servers? In theory they shouldn't
experience any of these same problems because all children can
just block in <code>accept(2)</code> until a connection
arrives, and no starvation results. In practice this hides
almost the same "spinning" behaviour discussed above in the
non-blocking solution. The way that most TCP stacks are
implemented, the kernel actually wakes up all processes blocked
in <code>accept</code> when a single connection arrives. One of
those processes gets the connection and returns to user-space,
the rest spin in the kernel and go back to sleep when they
discover there's no connection for them. This spinning is
hidden from the user-land code, but it's there nonetheless.
This can result in the same load-spiking wasteful behaviour
that a non-blocking solution to the multiple sockets case
can.</p>
<p>For this reason we have found that many architectures behave
more "nicely" if we serialize even the single socket case. So
this is actually the default in almost all cases. Crude
experiments under Linux (2.0.30 on a dual Pentium pro 166
w/128Mb RAM) have shown that the serialization of the single
socket case causes less than a 3% decrease in requests per
second over unserialized single-socket. But unserialized
single-socket showed an extra 100ms latency on each request.
This latency is probably a wash on long haul lines, and only an
issue on LANs. If you want to override the single socket
serialization you can define
<code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> and then
single-socket servers will not serialize at all.</p>
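<p>One way to define this symbol (shown only as an illustration;
adjust it to your own build procedure) is to pass it in through
<code>CFLAGS</code> when configuring the source tree:</p>
<div class="example"><p><code>
CFLAGS="-DSINGLE_LISTEN_UNSERIALIZED_ACCEPT" ./configure --prefix=/usr/local/apache2<br />
</code></p></div>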
<h3>Lingering Close</h3>