Network Working Group                                           J. Mogul
Request for Comments: 3230                                    Compaq WRL
Category: Standards Track                                    A. Van Hoff
                                                                 Marimba
                                                            January 2002


                        Instance Digests in HTTP

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   HTTP/1.1 defines a Content-MD5 header that allows a server to include
   a digest of the response body.  However, this is specifically defined
   to cover the body of the actual message, not the contents of the full
   file (which might be quite different, if the response carries a
   Content-Range header, or uses a delta encoding).  Also, the
   Content-MD5 header is limited to one specific digest algorithm; other
   algorithms, such as SHA-1 (Secure Hash Standard), may be more
   appropriate in some circumstances.  Finally, HTTP/1.1 provides no
   explicit mechanism by which a client may request a digest.  This
   document proposes HTTP extensions that solve these problems.

Table of Contents

   1 Introduction....................................................  2
        1.1 Other limitations of HTTP/1.1............................  3
   2 Goals...........................................................  4
   3 Terminology.....................................................  5
   4 Specification...................................................  6
        4.1 Protocol parameter specifications........................  6
             4.1.1 Digest algorithms.................................  6
        4.2 Instance digests.........................................  7
        4.3 Header specifications....................................  8
             4.3.1 Want-Digest.......................................  8
             4.3.2 Digest............................................  9
   5 Negotiation of Content-MD5......................................  9



Mogul, et. al.              Standards Track                     [Page 1]

RFC 3230                Instance Digests in HTTP            January 2002


   6 IANA Considerations............................................. 10
   7 Security Considerations......................................... 10
   8 Acknowledgements................................................ 10
   9 References...................................................... 10
   10 Authors' Addresses............................................. 12
   11 Full Copyright Statement....................................... 13

1 Introduction

   Although HTTP is typically layered over a reliable transport
   protocol, such as TCP, this does not guarantee reliable transport of
   information from sender to receiver.  Various problems, including
   undetected transmission errors, programming errors, corruption of
   stored data, and malicious intervention can cause errors in the
   transmitted information.

   A common approach to the problem of data integrity in a network
   protocol or distributed system, such as HTTP, is the use of digests,
   checksums, or hash values.  The sender computes a digest and sends it
   with the data; the recipient computes a digest of the received data,
   and then verifies the integrity of this data by comparing the
   digests.
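
   The compute-and-compare pattern described above can be sketched as
   follows.  This is a minimal illustration, not part of the protocol:
   the function names compute_digest and verify are hypothetical, and
   SHA-1 is chosen here only because this document mentions it.

```python
import hashlib


def compute_digest(data: bytes) -> str:
    """Sender side: compute a digest of the data (SHA-1, hex-encoded)."""
    return hashlib.sha1(data).hexdigest()


def verify(received: bytes, sent_digest: str) -> bool:
    """Recipient side: recompute the digest and compare with the one sent."""
    return compute_digest(received) == sent_digest


# The sender transmits the payload together with its digest.
payload = b"hello, world"
digest = compute_digest(payload)

# An intact copy verifies; a copy with one altered byte does not.
intact_ok = verify(payload, digest)
damaged_ok = verify(b"hello, w0rld", digest)
```

   A matching digest shows only that the data is consistent with the
   digest that arrived; it says nothing about who produced either one.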

   Checksums are used at virtually all layers of the IP stack.  However,
   different digest algorithms might be used at each layer, for reasons
   of computational cost, because the size and nature of the data being
   protected vary, and because the possible threats to data integrity
   vary.  For example, Ethernet uses a Cyclic Redundancy Check (CRC).
   The IPv4 protocol uses a ones-complement checksum over the IP header
   (but not the rest of the packet).  TCP uses a ones-complement
   checksum over the TCP header and data, and includes a "pseudo-header"
   to detect certain kinds of programming errors.
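
   As an illustration of the ones-complement checksum used by IPv4 and
   TCP, the sketch below follows the general method described in
   RFC 1071.  The function names are hypothetical, and the TCP
   pseudo-header is not modeled; the sample bytes are the worked
   example from RFC 1071.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Sum the data as big-endian 16-bit words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(data: bytes) -> int:
    """The checksum is the ones complement of the ones-complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)

# A receiver sums the data together with the transmitted checksum;
# an undamaged message yields all ones (0xFFFF).
received_sum = ones_complement_sum16(sample + checksum.to_bytes(2, "big"))
```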

   HTTP/1.1 [4] includes a mechanism for ensuring message integrity, the
   Content-MD5 header.  This header is actually defined for MIME-
   conformant messages in a standalone specification [10].  According to
   the HTTP/1.1 specification,

      The Content-MD5 entity-header field [...]  is an MD5 digest of the
      entity-body for the purpose of providing an end-to-end message
      integrity check (MIC) of the entity-body.

   HTTP/1.1 borrowed Content-MD5 from the MIME world based on an analogy
   between MIME messages (e.g., electronic mail messages) and HTTP
   messages (requests to or responses from an HTTP server).

   As discussed in more detail in section 3, this analogy between MIME
   messages and HTTP messages has resulted in some confusion.  In
   particular, while a MIME message is self-contained, an HTTP message
   might not contain the entire representation of the current state of a
   resource.  (More precisely, an HTTP response might not contain an
   entire "instance"; see section 3 for a definition of this term.)

   There are at least two situations where this distinction is an issue:

      1. When an HTTP server sends a 206 (Partial Content) response, as
         defined in HTTP/1.1.  The client may form its view of an
         instance (e.g., an HTML document) by combining a cache entry
         with the partial content in the message.

      2. When an HTTP server uses a "delta encoding", as proposed in a
         separate document [9].  A delta encoding represents the changes
         between the current instance of a resource and a previous
         instance, and is an efficient way of reducing the bandwidth
         required for cache updates.  The client forms its view of an
         instance by applying the delta in the message to one of its
         cache entries.

   We include these two kinds of transformations in a potentially
   broader category we call "instance manipulations."

   In each of these cases, the server might use a Content-MD5 header to
   protect the integrity of the response message.  However, because the
   MIC in a Content-MD5 header field applies only to the entity in that
   message, and not to the entire instance being reassembled, it cannot
   protect against errors due to data corruption (e.g., of cache
   entries), programming errors (e.g., improper application of a partial
   content or delta), certain malicious attacks [9], or corruption of
   certain HTTP headers in transit.

   Thus, the Content-MD5 header, while useful and sufficient in many
   cases, is not sufficient for verifying instance integrity in all uses
   of HTTP.
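
   The gap can be made concrete with a small sketch of the Partial
   Content case.  All names below are hypothetical, and the scenario (a
   cache entry with one silently corrupted byte) is an assumption
   chosen for illustration.  The per-message digest still verifies,
   while a digest over the whole instance exposes the problem.

```python
import base64
import hashlib


def md5_b64(data: bytes) -> str:
    """MD5 digest of the data, base64-encoded."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


# Server side: the full instance and a digest covering all of it.
instance = b"The quick brown fox jumps over the lazy dog"
instance_digest = md5_b64(instance)

# The server answers a range request with bytes 20 onward; a
# Content-MD5 style digest covers only this partial body.
range_start = 20
partial_body = instance[range_start:]
content_md5 = md5_b64(partial_body)

# Client side: the cache holds the first 20 bytes, but one byte has
# been corrupted (assumed here for the sake of the example).
cached_prefix = bytearray(instance[:range_start])
cached_prefix[4] ^= 0x01
reassembled = bytes(cached_prefix) + partial_body

# The message-level check passes; the instance-level check fails.
message_ok = md5_b64(partial_body) == content_md5
instance_ok = md5_b64(reassembled) == instance_digest
```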

   The Digest Authentication mechanism [5] provides (in addition to its
   other goals) a message-digest function similar to Content-MD5, except
   that it includes certain header fields.  Like Content-MD5, it covers
   a specific message, not an entire instance.

1.1 Other limitations of HTTP/1.1

   Checksums are not free.  Computing a digest takes CPU resources, and
   might add latency to the generation of a message.  (Some of these
   costs can be avoided by careful caching at the sender's end, but in
   many cases such a cache would not have a useful hit ratio.)
   Transmitting a digest consumes HTTP header space (and therefore
   increases latency and network bandwidth requirements).  If the
   message recipient does not intend to use the digest, why should the
   message sender waste resources computing and sending it?

   The Content-MD5 header, of course, implies the use of the MD5
   algorithm [15].  Other algorithms, however, might be more appropriate
   for some purposes.  These include the SHA-1 algorithm [12] and
   various "fingerprinting" algorithms [7].  HTTP currently provides no
   standardized support for the use of these algorithms.

   HTTP/1.1 apparently assumes that the choice to generate a digest is
   up to the sender, and provides no mechanism for the recipient to
   indicate whether a checksum would be useful, or what checksum
   algorithms it would understand.

2 Goals

   The goals of this proposal are:

      1. Digest coverage for entire instances communicated via HTTP.

      2. Support for multiple digest algorithms.

      3. Negotiation of the use of digests.

   The goals do not include:

      -  header integrity
         The digest mechanisms described here cover only the bodies of
         instances, and do not protect the integrity of associated
         "entity headers" or other message headers.

      -  authentication
         The digest mechanisms described here are not meant to support
         authentication of the source of a digest or of a message or
         instance.  These mechanisms, therefore, are not sufficient
         defense against many kinds of malicious attacks.

      -  privacy
         Digest mechanisms do not provide message privacy.

      -  authorization
         The digest mechanisms described here are not meant to support
         authorization or other kinds of access controls.

   The Digest Access Authentication mechanism [5] can provide some
   integrity for certain HTTP headers, and does provide authentication.

3 Terminology

   HTTP/1.1 [4] defines the following terms:

   resource          A network data object or service that can be
                     identified by a URI, as defined in section 3.2.
                     Resources may be available in multiple
                     representations (e.g. multiple languages, data
                     formats, size, resolutions) or vary in other ways.

   entity            The information transferred as the payload of a
                     request or response.  An entity consists of
                     metainformation in the form of entity-header fields
                     and content in the form of an entity-body, as
                     described in section 7.

   variant           A resource may have one, or more than one,
                     representation(s) associated with it at any given
                     instant.  Each of these representations is termed a
                     `variant.' Use of the term `variant' does not
                     necessarily imply that the resource is subject to
                     content negotiation.

   The dictionary definition for "entity" is "something that has
   separate and distinct existence and objective or conceptual reality"
   [8].  Unfortunately, the definition for "entity" in HTTP/1.1 is
   similar to that used in MIME [6], based on an entirely false analogy
   between MIME and HTTP.

   In MIME, electronic mail messages do have distinct and separate
   existences. MIME defines "entity" as something that "refers
   specifically to the MIME-defined header fields and contents of either
   a message or one of the parts in the body of a multipart entity."

   In HTTP, however, a response message to a GET does not have a
   distinct and separate existence.  Rather, it is describing the
   current state of a resource (or a variant, subject to a set of
   constraints).  The HTTP/1.1 specification provides no term to
   describe "the value that would be returned in response to a GET
   request at the current time for the selected variant of the specified
   resource."  This leads to awkward wordings in the HTTP/1.1
   specification in places where this concept is necessary.

   It is too late to fix the terminological failure in the HTTP/1.1
   specification, so we instead define a new term, for use in this
   document:

   instance          The entity that would be returned in a status-200
                     response to a GET request, at the current time, for
                     the selected variant of the specified resource,
                     with the application of zero or more content-
                     codings, but without the application of any
                     instance manipulations or transfer-codings.
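
   One reading of this definition is sketched below: a content-coding
   (gzip here) is part of the instance, while a transfer-coding
   (chunked here) is not.  The helper names and sample data are
   hypothetical.  The sketch assumes that a digest of an instance is
   taken over the content-coded bytes, which is what the definition
   suggests.

```python
import base64
import gzip
import hashlib


def sha1_b64(data: bytes) -> str:
    """SHA-1 digest of the data, base64-encoded."""
    return base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")


def chunked(body: bytes, size: int = 64) -> bytes:
    """Apply a simple HTTP/1.1 chunked transfer-coding to the body."""
    out = bytearray()
    for i in range(0, len(body), size):
        chunk = body[i:i + size]
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk marker, no trailers
    return bytes(out)


# The selected variant of the resource, before any coding.
resource_bytes = b"<p>hello</p>\n" * 50

# Content-coding applied: these bytes form the instance body.
# mtime=0 keeps the gzip output reproducible.
instance_body = gzip.compress(resource_bytes, mtime=0)

# Transfer-coding applied for one particular message only.
on_the_wire = chunked(instance_body)

# Under the stated assumption, the digest covers instance_body, not
# the uncoded resource bytes and not the chunked wire form.
instance_digest = sha1_b64(instance_body)
```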

   It is convenient to think of an entity tag, in HTTP/1.1, as being
   associated with an instance, rather than an entity.  That is, for a
   given resource, two different response messages might include the
   same entity tag, but two different instances of the resource should
   never be associated with the same (strong) entity tag.

   We also define this term:

   instance manipulation
                     An operation on one or more instances which may
                     result in an instance being conveyed from server to
                     client in parts, or in more than one response
                     message.  Examples include range selection and delta
                     encoding.  Instance manipulations are end-to-end,
                     and often involve the use of a cache at the client.

4 Specification

   In this specification, the key words "MUST", "MUST NOT", "SHOULD",
   "SHOULD NOT", and "MAY" are to be interpreted as described in RFC
   2119 [2].

4.1 Protocol parameter specifications

4.1.1 Digest algorithms

   Digest algorithm values are used to indicate a specific digest
   computation.  For some algorithms, one or more parameters may be
   supplied.

      digest-algorithm = token

   The BNF for "parameter" is the one used in RFC 2616 [4].  All
   digest-algorithm values are case-insensitive.

   The Internet Assigned Numbers Authority (IANA) acts as a registry for
   digest-algorithm values.  Initially, the registry contains the
   following tokens:

   MD5               The MD5 algorithm, as specified in RFC 1321 [15].
                     The output of this algorithm is encoded using the
                     base64 encoding [1].

   SHA               The SHA-1 algorithm [12].  The output of this
                     algorithm is encoded using the base64 encoding [1].

   UNIXsum           The algorithm computed by the UNIX "sum" command,
                     as defined by the Single UNIX Specification,
                     Version 2 [13].  The output of this algorithm is an
                     ASCII decimal-digit string representing the 16-bit
                     checksum, which is the first word of the output of
                     the UNIX "sum" command.

   UNIXcksum         The algorithm computed by the UNIX "cksum" command,
                     as defined by the Single UNIX Specification,
                     Version 2 [13].  The output of this algorithm is an
                     ASCII digit string representing the 32-bit CRC,
                     which is the first word of the output of the UNIX
                     "cksum" command.
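
   The output encodings listed above might be produced as in the
   following sketch.  The function names digest_value and posix_cksum
   are hypothetical.  The CRC routine is an assumed reading of the
   POSIX "cksum" algorithm (polynomial 0x04C11DB7, with the input
   length folded in after the data).  UNIXsum is omitted because
   historical "sum" implementations differ.

```python
import base64
import hashlib

_POLY = 0x04C11DB7  # CRC-32 generator polynomial used by POSIX cksum


def _make_table() -> list:
    """Build the 256-entry table for a most-significant-bit-first CRC."""
    table = []
    for i in range(256):
        c = i << 24
        for _ in range(8):
            c = ((c << 1) ^ _POLY) if c & 0x80000000 else (c << 1)
            c &= 0xFFFFFFFF
        table.append(c)
    return table


_TABLE = _make_table()


def posix_cksum(data: bytes) -> int:
    """CRC over the data, then over its length, then complemented."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ byte]
    n = len(data)
    while n:  # length octets, least significant first
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ (n & 0xFF)]
        n >>= 8
    return ~crc & 0xFFFFFFFF


def digest_value(algorithm: str, data: bytes) -> str:
    """Encode a digest as text; algorithm tokens are case-insensitive."""
    name = algorithm.lower()
    if name == "md5":
        return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")
    if name == "sha":
        return base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")
    if name == "unixcksum":
        return str(posix_cksum(data))  # ASCII decimal digits
    raise ValueError("unsupported digest-algorithm: " + algorithm)
```

   For example, digest_value("MD5", body) and digest_value("md5", body)
   return the same base64 string, since the token comparison ignores
   case.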
