<div class="example"><pre>
RewriteEngine on
RewriteCond   /your/docroot/%{REQUEST_FILENAME} <strong>!-f</strong>
RewriteRule   ^(.+)                             http://<strong>webserverB</strong>.dom/$1
</pre></div>

          <p>The problem here is that this only works for pages
          inside the <code class="directive"><a href="../mod/core.html#documentroot">DocumentRoot</a></code>. While you can add more
          conditions (for instance, to also handle home directories),
          there is a better variant:</p>

<div class="example"><pre>
RewriteEngine on
RewriteCond   %{REQUEST_URI} <strong>!-U</strong>
RewriteRule   ^(.+)          http://<strong>webserverB</strong>.dom/$1
</pre></div>

          <p>This uses the URL look-ahead feature of <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>.
          As a result, it works for all types of URLs and is safe.
          But it has a performance impact on the webserver, because
          every request triggers one more internal subrequest. So, if
          your webserver runs on a powerful CPU, use this one. If it
          is a slow machine, use the first approach or, better, an
          <code class="directive"><a href="../mod/core.html#errordocument">ErrorDocument</a></code> CGI script.</p>
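          <p>As a rough sketch of the <code>ErrorDocument</code> alternative, the
          fragment below hands every "404 Not Found" response to a local CGI script.
          The script path <code>/cgi-bin/redirect-to-b.cgi</code> is a hypothetical name,
          and the script itself is not shown; it would have to read the originally
          requested URL (Apache passes it in the <code>REDIRECT_URL</code> environment
          variable) and emit a redirect to the same path on <code>webserverB.dom</code>:</p>

<div class="example"><pre>
# Assumption: /cgi-bin/redirect-to-b.cgi is a hypothetical script that
# redirects the client to the same path on webserverB.dom.
ErrorDocument 404 /cgi-bin/redirect-to-b.cgi
</pre></div>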
        </dd>
      </dl>

    </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<h2>Archive Access Multiplexer</h2>

      

      <dl>
        <dt>Description:</dt>

        <dd>
          <p>Do you know the great CPAN (Comprehensive Perl Archive
          Network) at <a href="http://www.perl.com/CPAN">http://www.perl.com/CPAN</a>?
          It redirects each request to one of several FTP servers
          around the world that carry a CPAN mirror, choosing one
          roughly near the location of the requesting client. In
          effect, this is an FTP access multiplexing service. While
          CPAN runs via CGI scripts, how can a similar approach be
          implemented via <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>?</p>
        </dd>

        <dt>Solution:</dt>

        <dd>
          <p>First, we notice that from version 3.0.0
          <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code> can
          also use the "<code>ftp:</code>" scheme on redirects.
          Second, the client's location can be approximated by a
          <code class="directive"><a href="../mod/mod_rewrite.html#rewritemap">RewriteMap</a></code>
          keyed on the top-level domain of the client.
          With a tricky chained ruleset we can use this top-level
          domain as a key to our multiplexing map.</p>

<div class="example"><pre>
RewriteEngine on
RewriteMap    multiplex                txt:/path/to/map.cxan
RewriteRule   ^/CxAN/(.*)              %{REMOTE_HOST}::$1                 [C]
RewriteRule   ^.+\.<strong>([a-zA-Z]+)</strong>::(.*)$  ${multiplex:<strong>$1</strong>|ftp.default.dom}$2  [R,L]
</pre></div>

<div class="example"><pre>
##
##  map.cxan -- Multiplexing Map for CxAN
##

de        ftp://ftp.cxan.de/CxAN/
uk        ftp://ftp.cxan.uk/CxAN/
com       ftp://ftp.cxan.com/CxAN/
 :
##EOF##
</pre></div>
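
          <p>To make the chained ruleset concrete, the sketch below traces one request in
          comments and then shows a single-rule variant that captures the top-level
          domain in a <code>RewriteCond</code> instead. The client host
          <code>dialup.example.de</code> and the requested file are made-up values, and the
          default target <code>ftp://ftp.default.dom/CxAN/</code> is an assumption. The
          sketch also presumes that the server resolves client hostnames (for instance
          with <code>HostnameLookups</code> enabled). Treat it as an illustration, not as
          the canonical setup:</p>

<div class="example"><pre>
# Trace of the chained ruleset (hypothetical client dialup.example.de):
#   request             /CxAN/src/foo.tar.gz
#   after first rule    dialup.example.de::src/foo.tar.gz
#   second rule         $1 = de, $2 = src/foo.tar.gz
#   map lookup "de"     ftp://ftp.cxan.de/CxAN/
#   redirect target     ftp://ftp.cxan.de/CxAN/src/foo.tar.gz

# Variant sketch: capture the top-level domain with a RewriteCond (%1)
# instead of chaining two rules. The default URL is an assumption.
RewriteEngine on
RewriteMap    multiplex       txt:/path/to/map.cxan
RewriteCond   %{REMOTE_HOST}  \.([a-zA-Z]+)$
RewriteRule   ^/CxAN/(.*)     ${multiplex:%1|ftp://ftp.default.dom/CxAN/}$1  [R,L]
</pre></div>

          <p>In this variant, a client whose hostname does not end in a letters-only
          label (for example, one known only by IP address) does not match the
          condition and is not redirected.</p>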
        </dd>
      </dl>

    </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<h2><a name="content" id="content">Content Handling</a></h2>

    

   <h3>Browser Dependent Content</h3>

      

      <dl>
        <dt>Description:</dt>

        <dd>
          <p>At least for important top-level pages it is sometimes
          necessary to provide browser-dependent content, i.e. a
          full-featured version for the latest Netscape variants, a
          minimal version for the Lynx browsers and an average-feature
          version for all others.</p>
        </dd>

        <dt>Solution:</dt>

        <dd>
          <p>We cannot use content negotiation because the browsers do
          not provide their type in that form. Instead we have to
          act on the HTTP header "User-Agent". The ruleset below
          works as follows: if the HTTP header "User-Agent"
          begins with "Mozilla/3", the page <code>foo.html</code>
          is rewritten to <code>foo.NS.html</code> and the
          rewriting stops. If the browser is "Lynx" or "Mozilla" of
          version 1 or 2, the URL becomes <code>foo.20.html</code>.
          All other browsers receive page <code>foo.32.html</code>:</p>

<div class="example"><pre>
RewriteCond %{HTTP_USER_AGENT}  ^<strong>Mozilla/3</strong>.*
RewriteRule ^foo\.html$         foo.<strong>NS</strong>.html          [<strong>L</strong>]

RewriteCond %{HTTP_USER_AGENT}  ^<strong>Lynx/</strong>.*         [OR]
RewriteCond %{HTTP_USER_AGENT}  ^<strong>Mozilla/[12]</strong>.*
RewriteRule ^foo\.html$         foo.<strong>20</strong>.html          [<strong>L</strong>]

RewriteRule ^foo\.html$         foo.<strong>32</strong>.html          [<strong>L</strong>]
</pre></div>
        </dd>
      </dl>

    

    <h3>Dynamic Mirror</h3>

      

      <dl>
        <dt>Description:</dt>

        <dd>
          <p>Assume there are nice webpages on remote hosts we want
          to bring into our namespace. For FTP servers we would use
          the <code>mirror</code> program, which maintains an
          explicit up-to-date copy of the remote data on the local
          machine. For a webserver we could use the program
          <code>webcopy</code>, which works similarly via HTTP. But both
          techniques have one major drawback: the local copy is
          only as up-to-date as the last run of the program. It
          would be much better if the mirror were not a static one we
          have to establish explicitly. Instead we want a dynamic
          mirror whose data gets updated automatically when
          needed (that is, when the data on the remote host changes).</p>
        </dd>

        <dt>Solution:</dt>

        <dd>
          <p>To provide this feature we map the remote webpage or even
          the complete remote webarea to our namespace by the use
          of the <dfn>Proxy Throughput</dfn> feature
          (flag <code>[P]</code>):</p>

<div class="example"><pre>
RewriteEngine  on
RewriteBase    /~quux/
RewriteRule    ^<strong>hotsheet/</strong>(.*)$  <strong>http://www.tstimpreso.com/hotsheet/</strong>$1  [<strong>P</strong>]
</pre></div>

<div class="example"><pre>
RewriteEngine  on
RewriteBase    /~quux/
RewriteRule    ^<strong>usa-news\.html</strong>$   <strong>http://www.quux-corp.com/news/index.html</strong>  [<strong>P</strong>]
</pre></div>
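
          <p>The <code>[P]</code> flag hands the request to <code>mod_proxy</code>, so
          the proxy modules have to be available. The fragment below is a hedged sketch
          for the main server configuration of an Apache 2.x installation (not for
          <code>.htaccess</code>). The module file paths depend on the installation and
          are assumptions. The <code>ProxyPassReverse</code> line is an optional addition
          that is not part of the rules above; it adjusts redirect headers sent by the
          remote host so that they stay within the local namespace:</p>

<div class="example"><pre>
# Assumed module paths; adjust to the local installation.
LoadModule proxy_module       modules/mod_proxy.so
LoadModule proxy_http_module  modules/mod_proxy_http.so

# Optional: map Location headers in redirects from the remote host
# back to the local /~quux/hotsheet/ namespace.
ProxyPassReverse /~quux/hotsheet/ http://www.tstimpreso.com/hotsheet/
</pre></div>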
        </dd>
      </dl>

    

    <h3>Reverse Dynamic Mirror</h3>

      

      <dl>
        <dt>Description:</dt>

        <dd>...</dd>

        <dt>Solution:</dt>

        <dd>
<div class="example"><pre>
RewriteEngine on
RewriteCond   /mirror/of/remotesite/$1           -U
RewriteRule   ^http://www\.remotesite\.com/(.*)$ /mirror/of/remotesite/$1
</pre></div>
        </dd>
      </dl>

    

    <h3>Retrieve Missing Data from Intranet</h3>

      

      <dl>
        <dt>Description:</dt>

        <dd>
          <p>This is a tricky way of virtually running a corporate
          (external) Internet webserver
          (<code>www.quux-corp.dom</code>), while actually keeping
          and maintaining its data on an (internal) Intranet webserver
          (<code>www2.quux-corp.dom</code>) which is protected by a
          firewall. The trick is that the external webserver
          retrieves the requested data on-the-fly from the internal
          one.</p>
        </dd>

        <dt>Solution:</dt>

        <dd>
          <p>First, we have to make sure that our firewall still
          protects the internal webserver and that only the
          external webserver is allowed to retrieve data from it.
          For a packet-filtering firewall we could for instance
          configure a firewall ruleset like the following:</p>

<div class="example"><pre>
<strong>ALLOW</strong> Host www.quux-corp.dom Port &gt;1024 --&gt; Host www2.quux-corp.dom Port <strong>80</strong>
<strong>DENY</strong>  Host *                 Port *     --&gt; Host www2.quux-corp.dom Port <strong>80</strong>
</pre></div>

          <p>Just adjust it to your actual configuration syntax.
          Now we can establish the <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>
          rules which request the missing data in the background
          through the proxy throughput feature:</p>

<div class="example"><pre>
RewriteRule ^/~([^/]+)/?(.*)          /home/$1/.www/$2
RewriteCond %{REQUEST_FILENAME}       <strong>!-f</strong>
RewriteCond %{REQUEST_FILENAME}       <strong>!-d</strong>
RewriteRule ^/home/([^/]+)/.www/?(.*) http://<strong>www2</strong>.quux-corp.dom/~$1/pub/$2 [<strong>P</strong>]
</pre></div>
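
          <p>As an illustration of how these rules interact, the comment-only trace below
          follows a single request through the ruleset. The user name
          <code>alice</code> and the file <code>report.html</code> are made-up values,
          and the trace assumes the requested file does not exist on the external
          server:</p>

<div class="example"><pre>
# Request:      /~alice/report.html      (hypothetical user and file)
# First rule:   rewritten to /home/alice/.www/report.html
# Conditions:   that path is neither a regular file (!-f) nor a directory (!-d)
# Second rule:  $1 = alice, $2 = report.html
# Proxied to:   http://www2.quux-corp.dom/~alice/pub/report.html
</pre></div>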
        </dd>
      </dl>

    

    <h3>Load Balancing</h3>

      

      <dl>
        <dt>Description:</dt>

        <dd>
          <p>Suppose we want to load balance the traffic to
          <code>www.foo.com</code> over <code>www[0-5].foo.com</code>
          (a total of 6 servers). How can this be done?</p>
        </dd>

        <dt>Solution:</dt>

        <dd>
          <p>There are many possible solutions to this problem.
          We will first discuss a commonly known DNS-based variant
          and then the special one with <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>:</p>

          <ol>
            <li>
              <strong>DNS Round-Robin</strong>

              <p>The simplest method for load-balancing is to use
              the DNS round-robin feature of <code>BIND</code>.
              Here you just configure <code>www[0-5].foo.com</code>
              as usual in your DNS with A (address) records, e.g.</p>

<div class="example"><pre>
www0   IN  A       1.2.3.1
www1   IN  A       1.2.3.2
www2   IN  A       1.2.3.3
www3   IN  A       1.2.3.4
www4   IN  A       1.2.3.5
www5   IN  A       1.2.3.6
</pre></div>

              <p>Then you additionally add the following entry:</p>

<div class="example"><pre>
www    IN  CNAME   www0.foo.com.
       IN  CNAME   www1.foo.com.
       IN  CNAME   www2.foo.com.
       IN  CNAME   www3.foo.com.
       IN  CNAME   www4.foo.com.
       IN  CNAME   www5.foo.com.
</pre></div>

              <p>Notice that this seems wrong, but it is actually an
              intended feature of <code>BIND</code> and can be used
              in this way. Now when <code>www.foo.com</code> gets
              resolved, <code>BIND</code> gives out <code>www0-www5</code>,
              but in a slightly permuted/rotated order every time.
              This way the clients are spread over the various
              servers. But notice that this is not a perfect load
              balancing scheme, because DNS resolution information
              gets cached by the other nameservers on the net, so
              once a client has resolved <code>www.foo.com</code>
              to a particular <code>wwwN.foo.com</code>, all
              its subsequent requests also go to this particular name
              <code>wwwN.foo.com</code>. But the final result is
              OK, because the requests in total are still
              spread over the various webservers.</p>
            </li>

            <li>
              <strong>DNS Load-Balancing</strong>

              <p>A sophisticated DNS-based method for