⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 rfc1945.txt

📁 RFC 的详细文档!
💻 TXT
📖 第 1 页 / 共 5 页
字号:
   HTTP uses the same definition of the term "character set" as that
   described for MIME:

      The term "character set" is used in this document to refer to a
      method used with one or more tables to convert a sequence of
      octets into a sequence of characters. Note that unconditional
      conversion in the other direction is not required, in that not all
      characters may be available in a given character set and a
      character set may provide more than one sequence of octets to
      represent a particular character. This definition is intended to
      allow various kinds of character encodings, from simple single-
      table mappings such as US-ASCII to complex table switching methods
      such as those that use ISO 2022's techniques. However, the
      definition associated with a MIME character set name must fully
      specify the mapping to be performed from octets to characters. In
      particular, use of external profiling information to determine the
      exact mapping is not permitted.

      Note: This use of the term "character set" is more commonly
      referred to as a "character encoding." However, since HTTP and
      MIME share the same registry, it is important that the terminology
      also be shared.

   HTTP character sets are identified by case-insensitive tokens. The
   complete set of tokens are defined by the IANA Character Set registry
   [15]. However, because that registry does not define a single,
   consistent token for each character set, we define here the preferred
   names for those character sets most likely to be used with HTTP
   entities. These character sets include those registered by RFC 1521
   [5] -- the US-ASCII [17] and ISO-8859 [18] character sets -- and
   other names specifically recommended for use within MIME charset
   parameters.

     charset = "US-ASCII"
             | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
             | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
             | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
             | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
             | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
             | token

   Although HTTP allows an arbitrary token to be used as a charset
   value, any token that has a predefined value within the IANA
   Character Set registry [15] must represent the character set defined



Berners-Lee, et al           Informational                     [Page 17]

RFC 1945                        HTTP/1.0                        May 1996


   by that registry. Applications should limit their use of character
   sets to those defined by the IANA registry.

   The character set of an entity body should be labelled as the lowest
   common denominator of the character codes used within that body, with
   the exception that no label is preferred over the labels US-ASCII or
   ISO-8859-1.

3.5  Content Codings

   Content coding values are used to indicate an encoding transformation
   that has been applied to a resource. Content codings are primarily
   used to allow a document to be compressed or encrypted without losing
   the identity of its underlying media type. Typically, the resource is
   stored in this encoding and only decoded before rendering or
   analogous usage.

       content-coding = "x-gzip" | "x-compress" | token

       Note: For future compatibility, HTTP/1.0 applications should
       consider "gzip" and "compress" to be equivalent to "x-gzip"
       and "x-compress", respectively.

   All content-coding values are case-insensitive. HTTP/1.0 uses
   content-coding values in the Content-Encoding (Section 10.3) header
   field. Although the value describes the content-coding, what is more
   important is that it indicates what decoding mechanism will be
   required to remove the encoding. Note that a single program may be
   capable of decoding multiple content-coding formats. Two values are
   defined by this specification:

   x-gzip
       An encoding format produced by the file compression program
       "gzip" (GNU zip) developed by Jean-loup Gailly. This format is
       typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC.

   x-compress
       The encoding format produced by the file compression program
       "compress". This format is an adaptive Lempel-Ziv-Welch coding
       (LZW).

       Note: Use of program names for the identification of
       encoding formats is not desirable and should be discouraged
       for future encodings. Their use here is representative of
       historical practice, not good design.






Berners-Lee, et al           Informational                     [Page 18]

RFC 1945                        HTTP/1.0                        May 1996


3.6  Media Types

   HTTP uses Internet Media Types [13] in the Content-Type header field
   (Section 10.5) in order to provide open and extensible data typing.

       media-type     = type "/" subtype *( ";" parameter )
       type           = token
       subtype        = token

   Parameters may follow the type/subtype in the form of attribute/value
   pairs.

       parameter      = attribute "=" value
       attribute      = token
       value          = token | quoted-string

   The type, subtype, and parameter attribute names are case-
   insensitive. Parameter values may or may not be case-sensitive,
   depending on the semantics of the parameter name. LWS must not be
   generated between the type and subtype, nor between an attribute and
   its value. Upon receipt of a media type with an unrecognized
   parameter, a user agent should treat the media type as if the
   unrecognized parameter and its value were not present.

   Some older HTTP applications do not recognize media type parameters.
   HTTP/1.0 applications should only use media type parameters when they
   are necessary to define the content of a message.

   Media-type values are registered with the Internet Assigned Number
   Authority (IANA [15]). The media type registration process is
   outlined in RFC 1590 [13]. Use of non-registered media types is
   discouraged.

3.6.1 Canonicalization and Text Defaults

   Internet media types are registered with a canonical form. In
   general, an Entity-Body transferred via HTTP must be represented in
   the appropriate canonical form prior to its transmission. If the body
   has been encoded with a Content-Encoding, the underlying data should
   be in canonical form prior to being encoded.

   Media subtypes of the "text" type use CRLF as the text line break
   when in canonical form. However, HTTP allows the transport of text
   media with plain CR or LF alone representing a line break when used
   consistently within the Entity-Body. HTTP applications must accept
   CRLF, bare CR, and bare LF as being representative of a line break in
   text media received via HTTP.




Berners-Lee, et al           Informational                     [Page 19]

RFC 1945                        HTTP/1.0                        May 1996


   In addition, if the text media is represented in a character set that
   does not use octets 13 and 10 for CR and LF respectively, as is the
   case for some multi-byte character sets, HTTP allows the use of
   whatever octet sequences are defined by that character set to
   represent the equivalent of CR and LF for line breaks. This
   flexibility regarding line breaks applies only to text media in the
   Entity-Body; a bare CR or LF should not be substituted for CRLF
   within any of the HTTP control structures (such as header fields and
   multipart boundaries).

   The "charset" parameter is used with some media types to define the
   character set (Section 3.4) of the data. When no explicit charset
   parameter is provided by the sender, media subtypes of the "text"
   type are defined to have a default charset value of "ISO-8859-1" when
   received via HTTP. Data in character sets other than "ISO-8859-1" or
   its subsets must be labelled with an appropriate charset value in
   order to be consistently interpreted by the recipient.

      Note: Many current HTTP servers provide data using charsets other
      than "ISO-8859-1" without proper labelling. This situation reduces
      interoperability and is not recommended. To compensate for this,
      some HTTP user agents provide a configuration option to allow the
      user to change the default interpretation of the media type
      character set when no charset parameter is given.

3.6.2 Multipart Types

   MIME provides for a number of "multipart" types -- encapsulations of
   several entities within a single message's Entity-Body. The multipart
   types registered by IANA [15] do not have any special meaning for
   HTTP/1.0, though user agents may need to understand each type in
   order to correctly interpret the purpose of each body-part. An HTTP
   user agent should follow the same or similar behavior as a MIME user
   agent does upon receipt of a multipart type. HTTP servers should not
   assume that all HTTP clients are prepared to handle multipart types.

   All multipart types share a common syntax and must include a boundary
   parameter as part of the media type value. The message body is itself
   a protocol element and must therefore use only CRLF to represent line
   breaks between body-parts. Multipart body-parts may contain HTTP
   header fields which are significant to the meaning of that part.

3.7  Product Tokens

   Product tokens are used to allow communicating applications to
   identify themselves via a simple product token, with an optional
   slash and version designator. Most fields using product tokens also
   allow subproducts which form a significant part of the application to



Berners-Lee, et al           Informational                     [Page 20]

RFC 1945                        HTTP/1.0                        May 1996


   be listed, separated by whitespace. By convention, the products are
   listed in order of their significance for identifying the
   application.

       product         = token ["/" product-version]
       product-version = token

   Examples:

       User-Agent: CERN-LineMode/2.15 libwww/2.17b3

       Server: Apache/0.8.4

   Product tokens should be short and to the point -- use of them for
   advertizing or other non-essential information is explicitly
   forbidden. Although any token character may appear in a product-
   version, this token should only be used for a version identifier
   (i.e., successive versions of the same product should only differ in
   the product-version portion of the product value).

4.  HTTP Message

4.1  Message Types

   HTTP messages consist of requests from client to server and responses
   from server to client.

       HTTP-message   = Simple-Request           ; HTTP/0.9 messages
                      | Simple-Response
                      | Full-Request             ; HTTP/1.0 messages
                      | Full-Response

   Full-Request and Full-Response use the generic message format of RFC
   822 [7] for transferring entities. Both messages may include optional
   header fields (also known as "headers") and an entity body. The
   entity body is separated from the headers by a null line (i.e., a
   line with nothing preceding the CRLF).

       Full-Request   = Request-Line             ; Section 5.1
                        *( General-Header        ; Section 4.3
                         | Request-Header        ; Section 5.2
                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

       Full-Response  = Status-Line              ; Section 6.1
                        *( General-Header        ; Section 4.3
                         | Response-Header       ; Section 6.2



Berners-Lee, et al           Informational                     [Page 21]

RFC 1945                        HTTP/1.0                        May 1996


                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

   Simple-Request and Simple-Response do not allow the use of any header
   information and are limited to a single request method (GET).

       Simple-Request  = "GET" SP Request-URI CRLF

       Simple-Response = [ Entity-Body ]

   Use of the Simple-Request format is discouraged because it prevents
   the server from identifying the media type of the returned entity.

4.2  Message Headers

   HTTP header fields, which include General-Header (Section 4.3),
   Request-Header (Section 5.2), Response-Header (Section 6.2), and
   Entity-Header (Section 7.1) fields, follow the same generic format as
   that given in Section 3.1 of RFC 822 [7]. Each header field consists
   of a name followed immediately by a colon (":"), a single space (SP)
   character, and the field value. Field names are case-insensitive.
   Header fields can be extended over multiple lines by preceding each
   extra line with at least one SP or HT, though this is not
   recommended.

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -