Network Working Group                                           J. Mogul
Request for Comments: 3229                                    Compaq WRL
Category: Standards Track                               B. Krishnamurthy
                                                              F. Douglis
                                                                    AT&T
                                                             A. Feldmann
                                                   Univ. of Saarbruecken
                                                               Y. Goland
                                                             A. van Hoff
                                                                 Marimba
                                                          D. Hellerstein
                                                                ERS/USDA
                                                            January 2002


                         Delta encoding in HTTP

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This document describes how delta encoding can be supported as a
   compatible extension to HTTP/1.1.

   Many HTTP (Hypertext Transfer Protocol) requests cause the retrieval
   of slightly modified instances of resources for which the client
   already has a cache entry.  Research has shown that such modifying
   updates are frequent, and that the modifications are typically much
   smaller than the actual entity.  In such cases, HTTP would make more
   efficient use of network bandwidth if it could transfer a minimal
   description of the changes, rather than the entire new instance of
   the resource.  This is called "delta encoding."









Mogul, et al.               Standards Track                     [Page 1]

RFC 3229                 Delta encoding in HTTP             January 2002


Table of Contents

   1 Introduction....................................................  3
        1.1 Related research and proposals...........................  4
   2 Goals...........................................................  5
   3 Terminology.....................................................  6
   4 The HTTP message-generation sequence............................  8
        4.1 Relationship between deltas and ranges................... 11
   5 Basic mechanisms................................................ 13
        5.1 Background: an overview of HTTP cache validation......... 13
        5.2 Requesting the transmission of deltas.................... 14
        5.3 Choice of delta algorithm and format..................... 16
        5.4 Identification of delta-encoded responses................ 16
        5.5 Guaranteeing cache safety................................ 17
        5.6 Transmission of delta-encoded responses.................. 18
        5.7 Examples of requests combining Range and delta encoding.. 19
   6 Encoding algorithms and formats................................. 22
   7 Management of base instances.................................... 23
        7.1 Multiple entity tags in the If-None-Match header......... 24
        7.2 Hints for managing the client cache...................... 25
   8 Deltas and intermediate caches.................................. 27
   9 Digests for data integrity...................................... 28
   10 Specification.................................................. 28
        10.1 Protocol parameter specifications....................... 28
        10.2 IANA Considerations..................................... 30
        10.3 Basic requirements for delta-encoded responses.......... 30
        10.4 Status code specifications.............................. 30
             10.4.1 226 IM Used...................................... 31
        10.5 Header specifications................................... 31
             10.5.1 Delta-Base....................................... 31
             10.5.2 IM............................................... 32
             10.5.3 A-IM............................................. 33
        10.6 Caching rules for 226 responses......................... 35
        10.7 Rules for deltas in the presence of content-codings..... 36
             10.7.1 Rules for generating deltas in the presence of
                    content-codings.................................. 37
             10.7.2 Rules for applying deltas in the presence of
                    content-codings.................................. 37
             10.7.3 Examples for using A-IM, IM, and content-codings. 38
        10.8 New Cache-Control directives............................ 40
             10.8.1 Retain directive................................. 40
             10.8.2 IM directive..................................... 40
        10.9 Use of compression with delta encoding.................. 41
        10.10 Delta encoding and multipart/byteranges................ 42
   11 Quantifying the protocol overhead.............................. 42
   12 Security Considerations........................................ 44
   13 Acknowledgements............................................... 44
   14 Intellectual Property Rights................................... 44
   15 References..................................................... 44
   16 Authors' addresses............................................. 47
   17 Full Copyright Statement....................................... 49

1 Introduction

   The World Wide Web is a distributed system, and so often benefits
   from caching to reduce retrieval delays.  Retrieval of a Web resource
   (such as a document, image, icon, or applet) over the Internet or
   other wide-area networks usually takes long enough that the delay
   exceeds the human threshold of perception.  Often, that delay is
   measured in seconds.  Caching can eliminate or significantly reduce
   such delays.

   Many Web resources change over time, so a practical caching approach
   must include a coherency mechanism, to avoid presenting stale
   information to the user.  Originally, the Hypertext Transfer Protocol
   (HTTP) provided little support for caching, but under operational
   pressures, it quickly evolved to support a simple mechanism for
   maintaining cache coherency.

   In HTTP/1.0 [2], the server may supply a "Last-Modified" timestamp
   with a response.  If a client stores this response in a cache entry,
   and then later wishes to re-use the response, it may transmit a
   request message with an "If-Modified-Since" field containing that
   timestamp; this is known as a conditional retrieval.  Upon receiving
   a conditional request, the server may either reply with a full
   response, or, if the resource has not changed, it may send an
   abbreviated reply, indicating that the client's cache entry is still
   valid.  HTTP/1.0 also includes a means for the server to indicate,
   via an "Expires" timestamp, that a response will be valid until that
   time; if so, a client may use a cached copy of the response until
   that time, without first validating it using a conditional retrieval.
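
   The conditional retrieval described above can be sketched as
   follows.  This is a minimal illustration, not part of the
   specification: the function name respond() and its parameters are
   hypothetical, and the resource is assumed to carry a modification
   time in whole seconds since the epoch.  The server compares the
   client's "If-Modified-Since" value with that time and sends either
   the full response or an abbreviated 304 (Not Modified) reply.

```python
from email.utils import formatdate, parsedate_to_datetime


def respond(resource_mtime, body, if_modified_since=None):
    """Return (status, headers, body) for a GET request.

    resource_mtime is the resource's last modification time in whole
    seconds since the epoch (an assumption of this sketch).
    if_modified_since is the raw header value sent by the client, or
    None if the request was unconditional.
    """
    headers = {"Last-Modified": formatdate(resource_mtime, usegmt=True)}

    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable date: treat request as unconditional
        if since is not None and resource_mtime <= since:
            # Cache entry is still valid: abbreviated reply, no body.
            return 304, headers, b""

    # Resource changed (or no usable condition): send the entire value.
    return 200, headers, body
```

   A client would store the "Last-Modified" value from the first
   response and echo it back in "If-Modified-Since" on the next
   request.  Note that the only two outcomes are "nothing" and
   "everything".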

   HTTP/1.1 [10] adds many new features to improve cache coherency and
   performance.  However, it preserves the all-or-none model for
   responses to conditional retrievals: either the server indicates that
   the resource value has not changed at all, or it must transmit the
   entire current value.

   Common sense suggests (and traces confirm), however, that even when a
   Web resource does change, the new instance is often substantially
   similar to the old one.  If the difference, or "delta", between the
   two instances could be sent to the client instead of the entire new
   instance, a client holding a cached copy of the old instance could
   apply the delta to construct the new version.  In a world of finite
   bandwidth, the reduction in response size and delay could be
   significant.
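
   To make the idea concrete, the sketch below computes a delta
   between two instances and applies it to the old instance to
   reconstruct the new one.  It is only an illustration under stated
   assumptions: it uses Python's difflib as a stand-in differencing
   engine and an ad hoc list of "copy" and "add" operations, which is
   not one of the delta formats defined by this document.  The names
   make_delta() and apply_delta() are hypothetical.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal additions."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged region: refer to bytes the client already holds.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New or changed region: these bytes must be transmitted.
            ops.append(("add", new[j1:j2]))
        # "delete" needs no operation: the old bytes are simply skipped.
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Reconstruct the new instance from the cached one and a delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

   Only the "add" operations carry payload bytes, so when two
   instances are largely similar, the delta is much smaller than the
   new instance itself.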



   One can think of deltas as a way to squeeze as much benefit as
   possible from client and proxy caches.  Rather than treating an
   entire response as the "cache line", with deltas we can treat
   arbitrary pieces of a cached response as the replaceable unit, and
   avoid transferring pieces that have not changed.

   This document proposes a set of compatible extensions to HTTP/1.1
   that allow clients and servers to use delta encoding with minimal
   overhead.

   We assume that the reader is familiar with the HTTP/1.1
   specification.

1.1 Related research and proposals

   The idea of delta encoding to reduce communication or storage costs
   is not new.  For example, the MPEG-1 video compression standard
   transmits occasional still-image frames, but most of the frames sent
   are encoded (to oversimplify) as changes from an adjacent frame.  The
   SCCS and RCS [27] systems for software version control represent
   intermediate versions as deltas; SCCS starts with an original version
   and encodes subsequent ones with forward deltas, whereas RCS encodes
   previous versions as reverse deltas from their successors.
   Jacobson's technique for compressing IP and TCP headers over slow
   links [17] uses a clever, highly specialized form of delta encoding.

   In spite of this history, it appears to have taken several years
   before anyone thought of applying delta encoding to HTTP, perhaps
   because the development of HTTP caching has been somewhat haphazard.
   The first published suggestion for delta encoding appears to have
   been by Williams et al. in a paper about HTTP cache removal policies
   [30], but these authors did not elaborate on their design until later
   [29].

   The WebExpress project [15] appears to be the first published
   description of an implementation of delta encoding for HTTP (which
   they call "differencing").  WebExpress is aimed specifically at
   wireless environments, and includes a number of orthogonal
   optimizations.  Also, the WebExpress design does not propose changing
   the HTTP protocol itself, but rather uses a pair of interposed
   proxies to convert the HTTP message stream into an optimized form.
   The results reported for WebExpress differencing are impressive, but
   are limited to a few selected benchmarks.

   Banga et al. [1] describe the use of optimistic deltas, in which a
   layer of interposed proxies on either end of a slow link collaborate
   to reduce latency.  If the client-side proxy has a cached copy of a
   resource, the server-side proxy can simply send a delta (or a 304
   [Not Modified] response).  If only the server-side proxy has a cached
   copy, it may optimistically send its (possibly stale) copy to the
   client-side proxy, followed (if necessary) by a delta once the
   server-side proxy has validated its own cache entry with the origin
   server.  The use of optimistic deltas, unlike delta encoding,
   actually increases the number of bytes sent over the network, in an
   attempt to improve latency by anticipating a "Not Modified" response
   from the origin server.  The optimistic delta paper, like the
   WebExpress paper, did not propose a change to the HTTP protocol
   itself, and reported results only for a small set of selected URLs.

   Mogul et al. [23] collected lengthy traces, at two different sites,
   of the full contents of HTTP messages, to quantify the potential
   benefits of delta-encoded responses.  They showed that delta encoding
   can provide remarkable improvements in response size and response
   delay for an important subset of HTTP content types.  They proposed a
   set of HTTP extensions, but without the level of detail required for
   a specification.  Douglis et al. [8] used the same sets of full-
   content traces to quantify the rate at which resources change in the
   Web.

   The HTTP Distribution and Replication Protocol (DRP), proposed to W3C
   by Marimba, Netscape, Sun, Novell, and At Home, aims to provide a
   collection of new features for HTTP, to support "the efficient
   replication of data over HTTP" [13].  One aspect of the DRP proposal
   is the use of "differential downloading," which is essentially a form
   of delta encoding.  The original DRP proposal uses a different
   approach from the one described here, but a forthcoming revision of
   DRP will conform to the proposal in this document.

   Tridgell and Mackerras [28] describe the "rsync" algorithm, which
   accomplishes something similar to delta encoding.  In rsync, the
   client breaks a cache entry into a series of fixed-sized blocks,
   computes a digest value for each block, and sends the series of
   digest values to the server as part of its request.  The origin
   server does the same block-based computation, and returns only those
   blocks whose digest values differ.  We believe that it might be
   possible to support rsync using the "instance manipulation" framework
   described later in this document, but this has not been worked out in
   any detail.
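
   The block-digest exchange, as summarized in the paragraph above,
   can be sketched as follows.  This is a simplified illustration and
   not the actual rsync implementation, which additionally uses a
   rolling checksum so that matching blocks can be found at shifted
   offsets.  The block size, the choice of SHA-256, and all function
   names here are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, to keep the example readable


def block_digests(data: bytes, block_size: int = BLOCK_SIZE):
    """Client side: one digest per fixed-size block of the cache entry."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(current: bytes, client_digests,
                   block_size: int = BLOCK_SIZE):
    """Server side: return the blocks whose digests differ.

    The result is ({block_index: block_bytes}, length_of_current).
    """
    changed = {}
    for index, digest in enumerate(block_digests(current, block_size)):
        if index >= len(client_digests) or client_digests[index] != digest:
            start = index * block_size
            changed[index] = current[start:start + block_size]
    return changed, len(current)


def patch(cached: bytes, changed, new_length: int,
          block_size: int = BLOCK_SIZE) -> bytes:
    """Client side: rebuild the current instance from cache plus blocks."""
    out = bytearray()
    block_count = (new_length + block_size - 1) // block_size
    for index in range(block_count):
        start = index * block_size
        out += changed.get(index, cached[start:start + block_size])
    return bytes(out)
```

   With fixed block boundaries, as in this sketch, an insertion near
   the start of the resource shifts every later block and causes all
   of them to be resent.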

2 Goals

   The goals of this proposal are:

      1. Reduce the mean size of HTTP responses, thereby improving
         latency and network utilization.

      2. Avoid any extra network round trips.

      3. Minimize per-request and per-response overhead.

      4. Support a variety of encoding algorithms and formats.
