message was received.
delta-seconds = 1*DIGIT
Fielding, et al [Page 21]
INTERNET-DRAFT HTTP/1.1 Monday, August 12, 1996
3.4 Character Sets
HTTP uses the same definition of the term "character set" as that
described for MIME:
The term "character set" is used in this document to refer to a
method used with one or more tables to convert a sequence of octets
into a sequence of characters. Note that unconditional conversion
in the other direction is not required, in that not all characters
may be available in a given character set and a character set may
provide more than one sequence of octets to represent a particular
character. This definition is intended to allow various kinds of
character encodings, from simple single-table mappings such as US-
ASCII to complex table switching methods such as those that use ISO
2022's techniques. However, the definition associated with a MIME
character set name MUST fully specify the mapping to be performed
from octets to characters. In particular, use of external profiling
information to determine the exact mapping is not permitted.
Note: This use of the term "character set" is more commonly
referred to as a "character encoding." However, since HTTP and MIME
share the same registry, it is important that the terminology also
be shared.
HTTP character sets are identified by case-insensitive tokens. The
complete set of tokens is defined by the IANA Character Set registry
[19].
charset = token
Although HTTP allows an arbitrary token to be used as a charset value,
any token that has a predefined value within the IANA Character Set
registry MUST represent the character set defined by that registry.
Applications SHOULD limit their use of character sets to those defined
by the IANA registry.
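As a minimal sketch of the case-insensitivity rule above, two charset tokens can be compared after folding case. The helper names same_charset and is_supported are hypothetical, and using Python's codec table as a stand-in for "charsets this application supports" is an assumption; it is not the IANA registry.

```python
import codecs


def same_charset(a: str, b: str) -> bool:
    """Compare two charset tokens case-insensitively (hypothetical helper)."""
    return a.strip().lower() == b.strip().lower()


def is_supported(charset: str) -> bool:
    """Report whether the local Python installation has a codec for the name.

    Assumption: the Python codec table approximates the set of charsets
    this application can handle. It is not the IANA registry itself.
    """
    try:
        codecs.lookup(charset)
    except LookupError:
        return False
    return True
```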
3.5 Content Codings
Content coding values indicate an encoding transformation that has been
or can be applied to an entity. Content codings are primarily used to
allow a document to be compressed or otherwise usefully transformed
without losing the identity of its underlying media type and without
loss of information. Frequently, the entity is stored in coded form,
transmitted directly, and only decoded by the recipient.
content-coding = token
All content-coding values are case-insensitive. HTTP/1.1 uses content-
coding values in the Accept-Encoding (section 14.3) and Content-Encoding
(section 14.12) header fields. Although the value describes the content-
coding, what is more important is that it indicates what decoding
mechanism will be required to remove the encoding.
The Internet Assigned Numbers Authority (IANA) acts as a registry for
content-coding value tokens. Initially, the registry contains the
following tokens:
gzip
An encoding format produced by the file compression program "gzip"
(GNU zip) as described in RFC 1952 [25]. This format is a
Lempel-Ziv coding (LZ77) with a 32 bit CRC.
compress
The encoding format produced by the common UNIX file compression
program "compress". This format is an adaptive Lempel-Ziv-Welch
coding (LZW).
Note: Use of program names for the identification of encoding
formats is not desirable and should be discouraged for future
encodings. Their use here is representative of historical practice,
not good design. For compatibility with previous implementations of
HTTP, applications should consider "x-gzip" and "x-compress" to be
equivalent to "gzip" and "compress" respectively.
deflate
The "zlib" format defined in RFC 1950 [31] in combination with
the "deflate" compression mechanism described in RFC 1951 [29].
New content-coding value tokens should be registered; to allow
interoperability between clients and servers, specifications of the
content coding algorithms needed to implement a new value should be
publicly available and adequate for independent implementation, and
conform to the purpose of content coding defined in this section.
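The registered tokens above could be mapped to decoders roughly as follows. This is a sketch, not a reference implementation: the function name decode_content is hypothetical, it removes a single coding only, and "compress" is left unimplemented because the Python standard library has no LZW decoder.

```python
import gzip
import zlib

# Legacy aliases that the text says should be treated as equivalent.
_ALIASES = {"x-gzip": "gzip", "x-compress": "compress"}


def decode_content(body: bytes, coding: str) -> bytes:
    """Remove one content-coding from an entity-body (hypothetical helper)."""
    token = coding.strip().lower()      # content-coding values are case-insensitive
    token = _ALIASES.get(token, token)
    if token == "gzip":
        return gzip.decompress(body)    # gzip file format, RFC 1952
    if token == "deflate":
        return zlib.decompress(body)    # zlib wrapper (RFC 1950) around deflate data
    if token == "compress":
        # Assumption: no LZW decoder is available in this sketch.
        raise NotImplementedError("compress (LZW) is not supported here")
    raise ValueError(f"unknown content-coding: {coding!r}")
```

The token selects the decoding mechanism, which is the point made earlier in this section: the recipient cares less about how the data was coded than about what it must do to undo the coding.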
3.6 Transfer Codings
Transfer coding values are used to indicate an encoding transformation
that has been, can be, or may need to be applied to an entity-body in
order to ensure "safe transport" through the network. This differs from
a content coding in that the transfer coding is a property of the
message, not of the original entity.
transfer-coding = "chunked" | transfer-extension
transfer-extension = token
All transfer-coding values are case-insensitive. HTTP/1.1 uses transfer
coding values in the Transfer-Encoding header field (section 14.40).
Transfer codings are analogous to the Content-Transfer-Encoding values
of MIME, which were designed to enable safe transport of binary data
over a 7-bit transport service. However, safe transport has a different
focus for an 8bit-clean transfer protocol. In HTTP, the only unsafe
characteristic of message-bodies is the difficulty in determining the
exact body length (section 7.2.2), or the desire to encrypt data over a
shared transport.
The chunked encoding modifies the body of a message in order to transfer
it as a series of chunks, each with its own size indicator, followed by
an optional footer containing entity-header fields. This allows
dynamically-produced content to be transferred along with the
information necessary for the recipient to verify that it has received
the full message.
Chunked-Body = *chunk
"0" CRLF
footer
CRLF
chunk = chunk-size [ chunk-ext ] CRLF
chunk-data CRLF
hex-no-zero = <HEX excluding "0">
chunk-size = hex-no-zero *HEX
chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
chunk-ext-name = token
chunk-ext-val = token | quoted-string
chunk-data = chunk-size(OCTET)
footer = *entity-header
The chunked encoding is ended by a zero-sized chunk followed by the
footer, which is terminated by an empty line. The purpose of the footer
is to provide an efficient way to supply information about an entity
that is generated dynamically; applications MUST NOT send header fields
in the footer which are not explicitly defined as being appropriate for
the footer, such as Content-MD5 or future extensions to HTTP for digital
signatures or other facilities.
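The grammar above can be exercised with a small encoder. This is a sketch under stated assumptions: the name encode_chunked is hypothetical, chunk extensions are not emitted, and the caller is trusted to pass only footer fields that are defined as appropriate for the footer.

```python
def encode_chunked(pieces, footer=()):
    """Encode an iterable of byte strings as a Chunked-Body (hypothetical helper).

    `footer` is an iterable of (name, value) string pairs. Choosing fields
    that are permitted in the footer is the caller's responsibility.
    """
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue                       # a zero-sized chunk would end the body early
        out += b"%X\r\n" % len(piece)      # chunk-size in hex, no leading zero
        out += piece + b"\r\n"             # chunk-data CRLF
    out += b"0\r\n"                        # zero-sized chunk ends the chunks
    for name, value in footer:
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"                         # empty line terminates the footer
    return bytes(out)
```

Because each chunk carries its own size, the sender does not need to know the total length before it starts transmitting, which is what makes the coding suitable for dynamically produced content.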
An example process for decoding a Chunked-Body is presented in appendix
19.4.6.
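Independently of the appendix, a decoder for the Chunked-Body grammar can be sketched as below. The name decode_chunked is hypothetical; the sketch ignores chunk extensions, returns footer lines unparsed, and assumes a blocking binary stream such as io.BytesIO or a socket file object.

```python
import io

_HEX = set(b"0123456789abcdefABCDEF")


def decode_chunked(stream):
    """Read a Chunked-Body from a binary stream; return (body, footer_lines)."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise ValueError("truncated chunk-size line")
        # Everything after ";" is a chunk-ext, which this sketch ignores.
        size_field = size_line[:-2].split(b";", 1)[0].strip()
        if not size_field or not set(size_field) <= _HEX:
            raise ValueError("malformed chunk-size")
        size = int(size_field, 16)
        if size == 0:
            break                              # zero-sized chunk: footer follows
        data = stream.read(size)
        if len(data) != size or stream.read(2) != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        body += data
    footer = []
    while True:
        line = stream.readline()
        if line == b"\r\n":
            break                              # empty line ends the footer
        if not line.endswith(b"\r\n"):
            raise ValueError("truncated footer")
        footer.append(line[:-2].decode("iso-8859-1"))
    return bytes(body), footer


# Example input; the Content-MD5 value is a placeholder, not a real digest.
example = io.BytesIO(b"7\r\nHello, \r\n5;ext=1\r\nworld\r\n0\r\nContent-MD5: x\r\n\r\n")
body, footer = decode_chunked(example)
```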
All HTTP/1.1 applications MUST be able to receive and decode the
"chunked" transfer coding, and MUST ignore transfer coding extensions
they do not understand. A server which receives an entity-body with a
transfer-coding it does not understand SHOULD return 501
(Unimplemented), and close the connection. A server MUST NOT send
transfer-codings to an HTTP/1.0 client.
3.7 Media Types
HTTP uses Internet Media Types in the Content-Type (section 14.18) and
Accept (section 14.1) header fields in order to provide open and
extensible data typing and type negotiation.
media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token
Parameters may follow the type/subtype in the form of attribute/value
pairs.
parameter = attribute "=" value
attribute = token
value = token | quoted-string
The type, subtype, and parameter attribute names are case-insensitive.
Parameter values may or may not be case-sensitive, depending on the
semantics of the parameter name. Linear white space (LWS) MUST NOT be
used between the type and subtype, nor between an attribute and its
value. User agents that recognize the media-type MUST process the
parameters for that MIME type as described by that type/subtype
definition (or arrange for them to be processed by any external
application the user agent uses to process that type/subtype) and
inform the user of any problems discovered.
Note: some older HTTP applications do not recognize media type
parameters. When sending data to older HTTP applications,
implementations should only use media type parameters when they are
required by that type/subtype definition.
Media-type values are registered with the Internet Assigned Numbers
Authority (IANA). The media type registration process is outlined in RFC
1590 [17]. Use of non-registered media types is discouraged.
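The media-type grammar in this section could be parsed along the following lines. This is a sketch: parse_media_type is a hypothetical name, the token pattern is derived from the HTTP definition of token (any CHAR except CTLs and tspecials), and tolerating white space around ";" is an assumption about implied LWS rather than something this section states.

```python
import re

# token: printable US-ASCII except the tspecials and space.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_TYPE = re.compile(rf"({_TOKEN})/({_TOKEN})")
# No white space is allowed around "=", so none is matched there.
_PARAM = re.compile(rf'\s*;\s*({_TOKEN})=(?:({_TOKEN})|"((?:[^"\\]|\\.)*)")')


def parse_media_type(value: str):
    """Split a media-type into (type, subtype, parameters)."""
    value = value.strip()
    m = _TYPE.match(value)
    if m is None:
        raise ValueError(f"malformed media type: {value!r}")
    # Type and subtype are case-insensitive, so fold them to lower case.
    mtype, subtype = m.group(1).lower(), m.group(2).lower()
    params = {}
    pos = m.end()
    while pos < len(value):
        p = _PARAM.match(value, pos)
        if p is None:
            raise ValueError(f"malformed parameter in: {value!r}")
        name = p.group(1).lower()              # attribute names are case-insensitive
        if p.group(2) is not None:
            params[name] = p.group(2)          # token value, case preserved
        else:
            # quoted-string value: undo backslash quoting
            params[name] = re.sub(r"\\(.)", r"\1", p.group(3))
        pos = p.end()
    return mtype, subtype, params
```

Parameter values keep their original case because, as noted above, whether they are case-sensitive depends on the parameter.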
3.7.1 Canonicalization and Text Defaults
Internet media types are registered with a canonical form. In general,
an entity-body transferred via HTTP messages MUST be represented in the
appropriate canonical form prior to its transmission; the exception is
"text" types, as defined in the next paragraph.
When in canonical form, media subtypes of the "text" type use CRLF as
the text line break. HTTP relaxes this requirement and allows the
transport of text media with plain CR or LF alone representing a line
break when it is done consistently for an entire entity-body. HTTP
applications MUST accept CRLF, bare CR, and bare LF as being
representative of a line break in text media received via HTTP. In
addition, if the text is represented in a character set that does not
use octets 13 and 10 for CR and LF respectively, as is the case for some
multi-byte character sets, HTTP allows the use of whatever octet
sequences are defined by that character set to represent the equivalent
of CR and LF for line breaks. This flexibility regarding line breaks
applies only to text media in the entity-body; a bare CR or LF MUST NOT
be substituted for CRLF within any of the HTTP control structures (such
as header fields and multipart boundaries).
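A recipient that wants a single line-break convention internally might normalize received text as sketched below. The name normalize_line_breaks is hypothetical, and the sketch assumes the entity-body has already been decoded to characters, which sidesteps the multi-byte character set case described above. It must not be applied to header fields or multipart boundaries.

```python
import re

# CRLF is listed first so that it is consumed as one break, not two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def normalize_line_breaks(text: str, newline: str = "\r\n") -> str:
    """Rewrite CRLF, bare CR, and bare LF in decoded text media to `newline`."""
    return _LINE_BREAK.sub(lambda _match: newline, text)
```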
If an entity-body is encoded with a Content-Encoding, the underlying
data MUST be in a form defined above prior to being encoded.
The "charset" parameter is used with some media types to define the
character set (section 3.4) of the data. When no explicit charset
parameter is provided by the sender, media subtypes of the "text" type
are defined to have a default charset value of "ISO-8859-1" when
received via HTTP. Data in character sets other than "ISO-8859-1" or its
subsets MUST be labeled with an appropriate charset value.
Some HTTP/1.0 software has interpreted a Content-Type header without
charset parameter incorrectly to mean "recipient should guess." Senders
wishing to defeat this behavior MAY include a charset parameter even
when the charset is ISO-8859-1 and SHOULD do so when it is known that it
will not confuse the recipient.
Unfortunately, some older HTTP/1.0 clients did not deal properly with an
explicit charset parameter. HTTP/1.1 recipients MUST respect the charset
label provided by the sender; and those user agents that have a
provision to "guess" a charset MUST use the charset from the content-
type field if they support that charset, rather than the recipient's
preference, when initially displaying a document.
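The default described above can be sketched as follows for a "text" entity-body. The name decode_text_body is hypothetical, the charset is extracted with a deliberately simple pattern rather than a full media-type parser, and the caller is assumed to have checked that the type is "text".

```python
import re

# Matches a charset parameter with either a token or a quoted-string value.
_CHARSET = re.compile(r';\s*charset=(?:"([^"]*)"|([^\s;"]+))', re.IGNORECASE)


def decode_text_body(body: bytes, content_type: str) -> str:
    """Decode a text entity-body, honouring the sender's charset label.

    Falls back to ISO-8859-1 when no charset parameter is present.
    An unsupported label raises LookupError instead of being guessed at.
    """
    m = _CHARSET.search(content_type)
    label = (m.group(1) or m.group(2)) if m else None
    return body.decode(label or "ISO-8859-1")
```

The sender's label always wins over any local preference here, which matches the requirement that recipients respect the charset provided by the sender.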
3.7.2 Multipart Types
MIME provides for a number of "multipart" types -- encapsulations of one
or more entities within a single message-body. All multipart types share
a common syntax, as defined in section 7.2.1 of RFC 1521 [7], and MUST
include a boundary parameter as part of the media type value. The
message body is itself a protocol element and MUST therefore use only
CRLF to represent line breaks between body-parts. Unlike in RFC 1521,
the epilogue of any multipart message MUST be empty; HTTP applications
MUST NOT transmit the epilogue (even if the original multipart contains
an epilogue).
In HTTP, multipart body-parts MAY contain header fields which are
significant to the meaning of that part. A Content-Location header field
(section 14.15) SHOULD be included in the body-part of each enclosed
entity that can be identified by a URL.
In general, an HTTP user agent SHOULD follow the same or similar
behavior as a MIME user agent would upon receipt of a multipart type. If
an application receives a