SHOULD return 414 (Request-URI Too Long) status if a URI is longer
than the server can handle (see section 10.4.15).
Note: Servers ought to be cautious about depending on URI lengths
above 255 bytes, because some older client or proxy
implementations might not properly support these lengths.
3.2.2 http URL
The "http" scheme is used to locate network resources via the HTTP
protocol. This section defines the scheme-specific syntax and
semantics for http URLs.
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
If the port is empty or not given, port 80 is assumed. The semantics
are that the identified resource is located at the server listening
for TCP connections on that port of that host, and the Request-URI
for the resource is abs_path (section 5.1.2). The use of IP addresses
in URLs SHOULD be avoided whenever possible (see RFC 1900 [24]). If
the abs_path is not present in the URL, it MUST be given as "/" when
used as a Request-URI for a resource (section 5.1.2). If a proxy
receives a host name which is not a fully qualified domain name, it
MAY add its domain to the host name it received. If a proxy receives
a fully qualified domain name, the proxy MUST NOT change the host
name.
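The defaulting rules above (port 80 when the port is empty or absent, "/" when abs_path is absent) can be sketched as follows. This is a minimal illustration, not part of the specification: the function name `http_target` is hypothetical, and it relies on Python's `urllib.parse.urlsplit`, which is more permissive than the http_URL grammar.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # assumed when the port is empty or not given


def http_target(url: str) -> tuple[str, int, str]:
    """Return (host, port, request_uri) for an http URL.

    Hypothetical helper: it applies only the two defaults described in
    the text and does not validate the URL against the full grammar.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "http":
        raise ValueError("not an http URL: %r" % url)
    if not parts.hostname:
        raise ValueError("http URL has no host: %r" % url)
    # urlsplit reports an empty port ("host:") as None, so both the
    # "empty" and the "not given" case fall through to the default.
    port = parts.port or DEFAULT_HTTP_PORT
    # An absent abs_path must be sent as "/" in the Request-URI.
    path = parts.path or "/"
    request_uri = path + ("?" + parts.query if parts.query else "")
    return parts.hostname, port, request_uri
```

For example, `http_target("http://example.com")` yields the host `example.com`, port 80 and Request-URI `/`.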
Fielding, et al. Standards Track [Page 19]
RFC 2616 HTTP/1.1 June 1999
3.2.3 URI Comparison
When comparing two URIs to decide if they match or not, a client
SHOULD use a case-sensitive octet-by-octet comparison of the entire
URIs, with these exceptions:
- A port that is empty or not given is equivalent to the default
port for that URI-reference;
- Comparisons of host names MUST be case-insensitive;
- Comparisons of scheme names MUST be case-insensitive;
- An empty abs_path is equivalent to an abs_path of "/".
Characters other than those in the "reserved" and "unsafe" sets (see
RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
For example, the following three URIs are equivalent:
http://abc.com:80/~smith/home.html
http://ABC.com/%7Esmith/home.html
http://ABC.com:/%7esmith/home.html
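One way to apply these comparison rules is to reduce each URI to a normalized tuple and compare the tuples. The sketch below is an assumption-laden illustration: the names `normalize` and `uris_match` are hypothetical, and it decodes only the RFC 3986 "unreserved" characters (letters, digits, "-", ".", "_", "~") as a conservative stand-in for "characters other than reserved and unsafe".

```python
import re
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80}
# Assumption: a conservative set of characters that are safe to decode.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _decode_unreserved(text: str) -> str:
    """Replace %HH escapes of unreserved characters with the character."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        # Other escapes are left untouched, including their hex case.
        return ch if ch in UNRESERVED else match.group(0)
    return _PCT.sub(repl, text)


def normalize(uri: str) -> tuple:
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()                  # scheme: case-insensitive
    host = (parts.hostname or "").lower()          # host: case-insensitive
    port = parts.port or DEFAULT_PORTS.get(scheme) # empty port = default
    path = _decode_unreserved(parts.path) or "/"   # empty abs_path = "/"
    query = _decode_unreserved(parts.query)
    return scheme, host, port, path, query


def uris_match(a: str, b: str) -> bool:
    # Everything not normalized above is compared case-sensitively.
    return normalize(a) == normalize(b)
```

With this sketch the three example URIs above compare as equal, while two URIs whose paths differ only in letter case do not.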
3.3 Date/Time Formats
3.3.1 Full Date
HTTP applications have historically allowed three different formats
for the representation of date/time stamps:
Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov  6 08:49:37 1994       ; ANSI C's asctime() format
The first format is preferred as an Internet standard and represents
a fixed-length subset of that defined by RFC 1123 [8] (an update to
RFC 822 [9]). The second format is in common use, but is based on the
obsolete RFC 850 [12] date format and lacks a four-digit year.
HTTP/1.1 clients and servers that parse the date value MUST accept
all three formats (for compatibility with HTTP/1.0), though they MUST
only generate the RFC 1123 format for representing HTTP-date values
in header fields. See section 19.3 for further information.
Note: Recipients of date values are encouraged to be robust in
accepting date values that may have been sent by non-HTTP
applications, as is sometimes the case when retrieving or posting
messages via proxies/gateways to SMTP or NNTP.
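A recipient that accepts all three formats might try them in order of preference. The following is a minimal sketch, not a reference parser: `parse_http_date` is a hypothetical name, and `datetime.strptime` is deliberately lenient (it ignores letter case, tolerates extra whitespace, does not check that the weekday matches the date, and depends on the C/English locale for day and month names).

```python
from datetime import datetime, timezone

# strptime patterns for the three historical formats, preferred one first.
_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850 (two-digit year)
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime()
)


def parse_http_date(value: str) -> datetime:
    """Parse any of the three formats into an aware UTC datetime."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        # All HTTP dates are GMT; for asctime this has to be assumed.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("unrecognized HTTP-date: %r" % value)
```

Note that the two-digit year of the RFC 850 format is resolved by `strptime`'s own pivot rule (values 69 to 99 map to 1969 to 1999), which is a property of this sketch rather than something stated in this section.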
All HTTP date/time stamps MUST be represented in Greenwich Mean Time
(GMT), without exception. For the purposes of HTTP, GMT is exactly
equal to UTC (Coordinated Universal Time). This is indicated in the
first two formats by the inclusion of "GMT" as the three-letter
abbreviation for time zone, and MUST be assumed when reading the
asctime format. HTTP-date is case sensitive and MUST NOT include
additional LWS beyond that specifically included as SP in the
grammar.
HTTP-date    = rfc1123-date | rfc850-date | asctime-date
rfc1123-date = wkday "," SP date1 SP time SP "GMT"
rfc850-date  = weekday "," SP date2 SP time SP "GMT"
asctime-date = wkday SP date3 SP time SP 4DIGIT
date1        = 2DIGIT SP month SP 4DIGIT
               ; day month year (e.g., 02 Jun 1982)
date2        = 2DIGIT "-" month "-" 2DIGIT
               ; day-month-year (e.g., 02-Jun-82)
date3        = month SP ( 2DIGIT | ( SP 1DIGIT ))
               ; month day (e.g., Jun  2)
time         = 2DIGIT ":" 2DIGIT ":" 2DIGIT
               ; 00:00:00 - 23:59:59
wkday        = "Mon" | "Tue" | "Wed"
             | "Thu" | "Fri" | "Sat" | "Sun"
weekday      = "Monday" | "Tuesday" | "Wednesday"
             | "Thursday" | "Friday" | "Saturday" | "Sunday"
month        = "Jan" | "Feb" | "Mar" | "Apr"
             | "May" | "Jun" | "Jul" | "Aug"
             | "Sep" | "Oct" | "Nov" | "Dec"
Note: HTTP requirements for the date/time stamp format apply only
to their usage within the protocol stream. Clients and servers are
not required to use these formats for user presentation, request
logging, etc.
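For the generating side, where only the rfc1123-date form is permitted, a sender can build the string from fixed name tables so that the result does not depend on the process locale. The sketch below is illustrative: `format_http_date` and `is_rfc1123_date` are hypothetical names, and the regular expression checks only the shape of the rfc1123-date production (case-sensitive, single SP separators), not field ranges such as hours below 24.

```python
import re
from datetime import datetime, timezone

WKDAY = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTH = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# rfc1123-date = wkday "," SP date1 SP time SP "GMT"
RFC1123_DATE = re.compile(
    r"(?:%s), [0-9]{2} (?:%s) [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2} GMT"
    % ("|".join(WKDAY), "|".join(MONTH))
)


def format_http_date(moment: datetime) -> str:
    """Render an aware datetime as an rfc1123-date in GMT."""
    if moment.tzinfo is None:
        raise ValueError("an aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    # datetime.weekday() numbers Monday as 0, matching the WKDAY table.
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
        WKDAY[utc.weekday()], utc.day, MONTH[utc.month - 1], utc.year,
        utc.hour, utc.minute, utc.second,
    )


def is_rfc1123_date(value: str) -> bool:
    """Strict syntactic check: case-sensitive, no additional whitespace."""
    return RFC1123_DATE.fullmatch(value) is not None
```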
3.3.2 Delta Seconds
Some HTTP header fields allow a time value to be specified as an
integer number of seconds, represented in decimal, after the time
that the message was received.
delta-seconds = 1*DIGIT
3.4 Character Sets
HTTP uses the same definition of the term "character set" as that
described for MIME:
The term "character set" is used in this document to refer to a
method used with one or more tables to convert a sequence of octets
into a sequence of characters. Note that unconditional conversion in
the other direction is not required, in that not all characters may
be available in a given character set and a character set may provide
more than one sequence of octets to represent a particular character.
This definition is intended to allow various kinds of character
encoding, from simple single-table mappings such as US-ASCII to
complex table switching methods such as those that use ISO-2022's
techniques. However, the definition associated with a MIME character
set name MUST fully specify the mapping to be performed from octets
to characters. In particular, use of external profiling information
to determine the exact mapping is not permitted.
Note: This use of the term "character set" is more commonly
referred to as a "character encoding." However, since HTTP and
MIME share the same registry, it is important that the terminology
also be shared.
HTTP character sets are identified by case-insensitive tokens. The
complete set of tokens is defined by the IANA Character Set registry
[19].
charset = token
Although HTTP allows an arbitrary token to be used as a charset
value, any token that has a predefined value within the IANA
Character Set registry [19] MUST represent the character set defined
by that registry. Applications SHOULD limit their use of character
sets to those defined by the IANA registry.
Implementors should be aware of IETF character set requirements [38]
[41].
3.4.1 Missing Charset
Some HTTP/1.0 software has interpreted a Content-Type header without
charset parameter incorrectly to mean "recipient should guess."
Senders wishing to defeat this behavior MAY include a charset
parameter even when the charset is ISO-8859-1 and SHOULD do so when
it is known that it will not confuse the recipient.
Unfortunately, some older HTTP/1.0 clients did not deal properly with
an explicit charset parameter. HTTP/1.1 recipients MUST respect the
charset label provided by the sender; and those user agents that have
a provision to "guess" a charset MUST use the charset from the
content-type field if they support that charset, rather than the
recipient's preference, when initially displaying a document. See
section 3.7.1.
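A recipient that respects the sender's label, instead of guessing, might resolve the charset as in the sketch below. Two points are assumptions rather than statements from this section: the helper name `effective_charset` is hypothetical, and the ISO-8859-1 fallback for "text" media types is the default given in section 3.7.1. Parsing is delegated to Python's `email.message.Message`, which returns the charset token in lower case, in keeping with charset tokens being case-insensitive.

```python
from email.message import Message

# Assumption: default for "text" subtypes, taken from section 3.7.1.
DEFAULT_TEXT_CHARSET = "iso-8859-1"


def effective_charset(content_type: str):
    """Return the charset to use for a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    declared = msg.get_content_charset()  # lower-cased token, or None
    if declared is not None:
        return declared  # the sender's label takes precedence
    if msg.get_content_maintype() == "text":
        return DEFAULT_TEXT_CHARSET
    return None  # no charset applies to this media type
```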
3.5 Content Codings
Content coding values indicate an encoding transformation that has
been or can be applied to an entity. Content codings are primarily
used to allow a document to be compressed or otherwise usefully
transformed without losing the identity of its underlying media type
and without loss of information. Frequently, the entity is stored in
coded form, transmitted directly, and only decoded by the recipient.
content-coding = token
All content-coding values are case-insensitive. HTTP/1.1 uses
content-coding values in the Accept-Encoding (section 14.3) and
Content-Encoding (section 14.11) header fields. Although the value
describes the content-coding, what is more important is that it
indicates what decoding mechanism will be required to remove the
encoding.
The Internet Assigned Numbers Authority (IANA) acts as a registry for
content-coding value tokens. Initially, the registry contains the
following tokens:
gzip An encoding format produced by the file compression program
"gzip" (GNU zip) as described in RFC 1952 [25]. This format is a
Lempel-Ziv coding (LZ77) with a 32 bit CRC.
compress
The encoding format produced by the common UNIX file compression
program "compress". This format is an adaptive Lempel-Ziv-Welch
coding (LZW).
Use of program names for the identification of encoding formats
is not desirable and is discouraged for future encodings. Their
use here is representative of historical practice, not good
design. For compatibility with previous implementations of HTTP,
applications SHOULD consider "x-gzip" and "x-compress" to be
equivalent to "gzip" and "compress" respectively.
deflate
The "zlib" format defined in RFC 1950 [31] in combination with
the "deflate" compression mechanism described in RFC 1951 [29].
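The tokens above map onto decoders roughly as in the following sketch. It is illustrative only: `decode_content` is a hypothetical name, and the "compress" (LZW) coding is rejected because the Python standard library has no decoder for it. Token matching is case-insensitive, and the "x-gzip" and "x-compress" aliases are folded onto their canonical names.

```python
import gzip
import zlib

_ALIASES = {"x-gzip": "gzip", "x-compress": "compress"}


def decode_content(coding: str, body: bytes) -> bytes:
    """Remove a single content-coding from an entity body."""
    token = coding.strip().lower()      # content-coding values are case-insensitive
    token = _ALIASES.get(token, token)  # historical "x-" aliases
    if token == "gzip":
        return gzip.decompress(body)    # RFC 1952 format
    if token == "deflate":
        # "deflate" means the zlib format (RFC 1950) wrapping a deflate
        # stream (RFC 1951), which is what zlib.decompress expects.
        return zlib.decompress(body)
    raise ValueError("unsupported content-coding: %r" % coding)
```

Some senders are known to emit a raw deflate stream without the zlib wrapper under the "deflate" token; handling that case is outside this sketch.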