📄 rfc1945.txt
字号:
also be shared. HTTP character sets are identified by case-insensitive tokens. The complete set of tokens are defined by the IANA Character Set registry [15]. However, because that registry does not define a single, consistent token for each character set, we define here the preferred names for those character sets most likely to be used with HTTP entities. These character sets include those registered by RFC 1521 [5] -- the US-ASCII [17] and ISO-8859 [18] character sets -- and other names specifically recommended for use within MIME charset parameters. charset = "US-ASCII" | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3" | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6" | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9" | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR" | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8" | token Although HTTP allows an arbitrary token to be used as a charset value, any token that has a predefined value within the IANA Character Set registry [15] must represent the character set definedBerners-Lee, et al Informational [Page 17]RFC 1945 HTTP/1.0 May 1996 by that registry. Applications should limit their use of character sets to those defined by the IANA registry. The character set of an entity body should be labelled as the lowest common denominator of the character codes used within that body, with the exception that no label is preferred over the labels US-ASCII or ISO-8859-1.3.5 Content Codings Content coding values are used to indicate an encoding transformation that has been applied to a resource. Content codings are primarily used to allow a document to be compressed or encrypted without losing the identity of its underlying media type. Typically, the resource is stored in this encoding and only decoded before rendering or analogous usage. content-coding = "x-gzip" | "x-compress" | token Note: For future compatibility, HTTP/1.0 applications should consider "gzip" and "compress" to be equivalent to "x-gzip" and "x-compress", respectively. All content-coding values are case-insensitive. HTTP/1.0 uses content-coding values in the Content-Encoding (Section 10.3) header field. Although the value describes the content-coding, what is more important is that it indicates what decoding mechanism will be required to remove the encoding. Note that a single program may be capable of decoding multiple content-coding formats. Two values are defined by this specification: x-gzip An encoding format produced by the file compression program "gzip" (GNU zip) developed by Jean-loup Gailly. This format is typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC. x-compress The encoding format produced by the file compression program "compress". This format is an adaptive Lempel-Ziv-Welch coding (LZW). Note: Use of program names for the identification of encoding formats is not desirable and should be discouraged for future encodings. Their use here is representative of historical practice, not good design.Berners-Lee, et al Informational [Page 18]RFC 1945 HTTP/1.0 May 19963.6 Media Types HTTP uses Internet Media Types [13] in the Content-Type header field (Section 10.5) in order to provide open and extensible data typing. media-type = type "/" subtype *( ";" parameter ) type = token subtype = token Parameters may follow the type/subtype in the form of attribute/value pairs. parameter = attribute "=" value attribute = token value = token | quoted-string The type, subtype, and parameter attribute names are case- insensitive. Parameter values may or may not be case-sensitive, depending on the semantics of the parameter name. LWS must not be generated between the type and subtype, nor between an attribute and its value. Upon receipt of a media type with an unrecognized parameter, a user agent should treat the media type as if the unrecognized parameter and its value were not present. Some older HTTP applications do not recognize media type parameters. HTTP/1.0 applications should only use media type parameters when they are necessary to define the content of a message. Media-type values are registered with the Internet Assigned Number Authority (IANA [15]). The media type registration process is outlined in RFC 1590 [13]. Use of non-registered media types is discouraged.3.6.1 Canonicalization and Text Defaults Internet media types are registered with a canonical form. In general, an Entity-Body transferred via HTTP must be represented in the appropriate canonical form prior to its transmission. If the body has been encoded with a Content-Encoding, the underlying data should be in canonical form prior to being encoded. Media subtypes of the "text" type use CRLF as the text line break when in canonical form. However, HTTP allows the transport of text media with plain CR or LF alone representing a line break when used consistently within the Entity-Body. HTTP applications must accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP.Berners-Lee, et al Informational [Page 19]RFC 1945 HTTP/1.0 May 1996 In addition, if the text media is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks. This flexibility regarding line breaks applies only to text media in the Entity-Body; a bare CR or LF should not be substituted for CRLF within any of the HTTP control structures (such as header fields and multipart boundaries). The "charset" parameter is used with some media types to define the character set (Section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets must be labelled with an appropriate charset value in order to be consistently interpreted by the recipient. Note: Many current HTTP servers provide data using charsets other than "ISO-8859-1" without proper labelling. This situation reduces interoperability and is not recommended. To compensate for this, some HTTP user agents provide a configuration option to allow the user to change the default interpretation of the media type character set when no charset parameter is given.3.6.2 Multipart Types MIME provides for a number of "multipart" types -- encapsulations of several entities within a single message's Entity-Body. The multipart types registered by IANA [15] do not have any special meaning for HTTP/1.0, though user agents may need to understand each type in order to correctly interpret the purpose of each body-part. An HTTP user agent should follow the same or similar behavior as a MIME user agent does upon receipt of a multipart type. HTTP servers should not assume that all HTTP clients are prepared to handle multipart types. All multipart types share a common syntax and must include a boundary parameter as part of the media type value. The message body is itself a protocol element and must therefore use only CRLF to represent line breaks between body-parts. Multipart body-parts may contain HTTP header fields which are significant to the meaning of that part.3.7 Product Tokens Product tokens are used to allow communicating applications to identify themselves via a simple product token, with an optional slash and version designator. Most fields using product tokens also allow subproducts which form a significant part of the application toBerners-Lee, et al Informational [Page 20]RFC 1945 HTTP/1.0 May 1996 be listed, separated by whitespace. By convention, the products are listed in order of their significance for identifying the application. product = token ["/" product-version] product-version = token Examples: User-Agent: CERN-LineMode/2.15 libwww/2.17b3 Server: Apache/0.8.4 Product tokens should be short and to the point -- use of them for advertizing or other non-essential information is explicitly forbidden. Although any token character may appear in a product- version, this token should only be used for a version identifier (i.e., successive versions of the same product should only differ in the product-version portion of the product value).4. HTTP Message4.1 Message Types HTTP messages consist of requests from client to server and responses from server to client. HTTP-message = Simple-Request ; HTTP/0.9 messages | Simple-Response | Full-Request ; HTTP/1.0 messages | Full-Response Full-Request and Full-Response use the generic message format of RFC 822 [7] for transferring entities. Both messages may include optional header fields (also known as "headers") and an entity body. The entity body is separated from the headers by a null line (i.e., a line with nothing preceding the CRLF). Full-Request = Request-Line ; Section 5.1 *( General-Header ; Section 4.3 | Request-Header ; Section 5.2 | Entity-Header ) ; Section 7.1 CRLF [ Entity-Body ] ; Section 7.2 Full-Response = Status-Line ; Section 6.1 *( General-Header ; Section 4.3 | Response-Header ; Section 6.2Berners-Lee, et al Informational [Page 21]RFC 1945 HTTP/1.0 May 1996 | Entity-Header ) ; Section 7.1 CRLF [ Entity-Body ] ; Section 7.2 Simple-Request and Simple-Response do not allow the use of any header information and are limited to a single request method (GET). Simple-Request = "GET" SP Request-URI CRLF Simple-Response = [ Entity-Body ] Use of the Simple-Request format is discouraged because it prevents the server from identifying the media type of the returned entity.4.2 Message Headers HTTP header fields, which include General-Header (Section 4.3), Request-Header (Section 5.2), Response-Header (Section 6.2), and Entity-Header (Section 7.1) fields, follow the same generic format as that given in Section 3.1 of RFC 822 [7]. Each header field consists of a name followed immediately by a colon (":"), a single space (SP) character, and the field value. Field names are case-insensitive. Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT, though this is not recommended. HTTP-header = field-name ":" [ field-value ] CRLF field-name = token field-value = *( field-content | LWS ) field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, tspecials, and quoted-string> The order in which header fields are received is not significant. However, it is "good practice" to send General-Header fields first, followed by Request-Header or Response-Header fields prior to the Entity-Header fields. Multiple HTTP-header fields with the same field-name may be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It must be possible to combine the multiple header fields into one "field- name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma.Berners-Lee, et al Informational [Page 22]RFC 1945 HTTP/1.0 May 1996
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -