📄 rfc1738.txt
wais                    Wide Area Information Servers
file                    Host-specific file names
prospero                Prospero Directory Service
Other schemes may be specified by future specifications. Section 4 of
this document describes how new schemes may be registered, and lists
some scheme names that are under development.
3.1. Common Internet Scheme Syntax
While the syntax for the rest of the URL may vary depending on the
particular scheme selected, URL schemes that involve the direct use
of an IP-based protocol to a specified host on the Internet use a
common syntax for the scheme-specific data:
//<user>:<password>@<host>:<port>/<url-path>
Some or all of the parts "<user>:<password>@", ":<password>",
":<port>", and "/<url-path>" may be excluded. The scheme specific
data start with a double slash "//" to indicate that it complies with
the common Internet scheme syntax. The different components obey the
following rules:
user
    An optional user name. Some schemes (e.g., ftp) allow the
    specification of a user name.

password
    An optional password. If present, it follows the user
    name separated from it by a colon.
The user name (and password), if present, are followed by a
commercial at-sign "@". Within the user and password field, any ":",
"@", or "/" must be encoded.
Berners-Lee, Masinter & McCahill [Page 5]
RFC 1738 Uniform Resource Locators (URL) December 1994
Note that an empty user name or password is different from no user
name or password; there is no way to specify a password without
specifying a user name. E.g., <URL:ftp://@host.com/> has an empty
user name and no password, <URL:ftp://host.com/> has no user name,
while <URL:ftp://foo:@host.com/> has a user name of "foo" and an
empty password.
host
    The fully qualified domain name of a network host, or its IP
    address as a set of four decimal digit groups separated by
    ".". Fully qualified domain names take the form as described
    in Section 3.5 of RFC 1034 [13] and Section 2.1 of RFC 1123
    [5]: a sequence of domain labels separated by ".", each domain
    label starting and ending with an alphanumerical character and
    possibly also containing "-" characters. The rightmost domain
    label will never start with a digit, though, which
    syntactically distinguishes all domain names from the IP
    addresses.

port
    The port number to connect to. Most schemes designate
    protocols that have a default port number. Another port number
    may optionally be supplied, in decimal, separated from the
    host by a colon. If the port is omitted, the colon is as well.

url-path
    The rest of the locator consists of data specific to the
    scheme, and is known as the "url-path". It supplies the
    details of how the specified resource can be accessed. Note
    that the "/" between the host (or port) and the url-path is
    NOT part of the url-path.
The url-path syntax depends on the scheme being used, as does the
manner in which it is interpreted.
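The rules above can be illustrated with a minimal Python sketch. The
names parse_common, classify_host, and CommonParts are hypothetical and
do not come from this specification. The sketch assumes its input is
the scheme-specific part of a URL (everything after "<scheme>:"), it
leaves the url-path undecoded because its syntax depends on the scheme,
and it does not range-check the digit groups of an IP address.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import unquote

# Domain label: starts and ends with an alphanumeric, may contain "-".
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# Rightmost label: same shape, but never starts with a digit.
_TOPLABEL = r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
_HOSTNAME = re.compile(rf"(?:{_LABEL}\.)*{_TOPLABEL}")
# IP address: four decimal digit groups separated by ".".
_HOSTNUMBER = re.compile(r"[0-9]+(?:\.[0-9]+){3}")


class CommonParts(NamedTuple):
    user: Optional[str]      # None = absent, "" = present but empty
    password: Optional[str]  # None = absent, "" = present but empty
    host: str
    host_kind: str           # "name", "ip", or "invalid"
    port: Optional[int]      # None = use the scheme's default port
    url_path: Optional[str]  # None = "/<url-path>" omitted entirely


def classify_host(host: str) -> str:
    """Tell domain names and IP addresses apart by syntax alone."""
    if _HOSTNUMBER.fullmatch(host):
        return "ip"
    if _HOSTNAME.fullmatch(host):
        return "name"
    return "invalid"


def parse_common(scheme_specific: str) -> CommonParts:
    """Split '//<user>:<password>@<host>:<port>/<url-path>' into parts."""
    if not scheme_specific.startswith("//"):
        raise ValueError("not the common Internet scheme syntax")
    rest = scheme_specific[2:]
    # The "/" after the host (or port) is a delimiter, not part of the url-path.
    authority, slash, url_path = rest.partition("/")
    user = password = None
    if "@" in authority:
        # ":", "@" and "/" must be encoded inside user and password,
        # so the first "@" and the first ":" are the real separators.
        userinfo, authority = authority.split("@", 1)
        raw_user, colon, raw_password = userinfo.partition(":")
        user = unquote(raw_user)
        password = unquote(raw_password) if colon else None
    host, colon, port = authority.partition(":")
    # int("") raises ValueError: a colon without a port is not allowed.
    return CommonParts(user, password, host, classify_host(host),
                       int(port) if colon else None,
                       url_path if slash else None)
```

With this sketch, parse_common("//@host.com/") yields an empty user
name and no password, while parse_common("//foo:@host.com/") yields the
user name "foo" and an empty password, matching the distinction drawn
in the text.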
3.2. FTP
The FTP URL scheme is used to designate files and directories on
Internet hosts accessible using the FTP protocol (RFC 959).
An FTP URL follows the syntax described in Section 3.1. If :<port> is
omitted, the port defaults to 21.
3.2.1. FTP Name and Password
A user name and password may be supplied; they are used in the ftp
"USER" and "PASS" commands after first making the connection to the
FTP server. If no user name or password is supplied and one is
requested by the FTP server, the conventions for "anonymous" FTP are
to be used, as follows:
    The user name "anonymous" is supplied.

    The password is supplied as the Internet e-mail address
    of the end user accessing the resource.
If the URL supplies a user name but no password, and the remote
server requests a password, the program interpreting the FTP URL
should request one from the user.
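The login conventions of this section could be sketched as follows.
The function name ftp_login_plan and its parameters are hypothetical.
The sketch assumes the caller already knows the end user's e-mail
address, and it returns None for the password to signal "prompt the
user if the server asks" rather than performing any network I/O.

```python
from typing import Optional, Tuple


def ftp_login_plan(user: Optional[str],
                   password: Optional[str],
                   email: str) -> Tuple[str, Optional[str]]:
    """Return the arguments for the FTP USER and PASS commands.

    `user` and `password` are the values taken from the URL, with None
    meaning "not supplied". A returned password of None means the
    client should ask the user if the server requests a password.
    """
    if user is None:
        # Nothing supplied: fall back to the "anonymous" FTP conventions.
        return "anonymous", email
    # User name supplied; the password may be supplied, empty, or absent.
    return user, password
```

For example, ftp_login_plan(None, None, "me@example.org") selects the
anonymous conventions, while ftp_login_plan("myname", None,
"me@example.org") leaves the password to be requested interactively.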
3.2.2. FTP url-path
The url-path of an FTP URL has the following syntax:
<cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>
Where <cwd1> through <cwdN> and <name> are (possibly encoded) strings
and <typecode> is one of the characters "a", "i", or "d". The part
";type=<typecode>" may be omitted. The <cwdx> and <name> parts may be
empty. The whole url-path may be omitted, including the "/"
delimiting it from the prefix containing user, password, host, and
port.
The url-path is interpreted as a series of FTP commands as follows:
Each of the <cwd> elements is to be supplied, sequentially, as the
argument to a CWD (change working directory) command.
If the typecode is "d", perform a NLST (name list) command with
<name> as the argument, and interpret the results as a file
directory listing.
Otherwise, perform a TYPE command with <typecode> as the argument,
and then access the file whose name is <name> (for example, using
the RETR command).
Within a name or CWD component, the characters "/" and ";" are
reserved and must be encoded. The components are decoded prior to
their use in the FTP protocol. In particular, if the appropriate FTP
sequence to access a particular file requires supplying a string
containing a "/" as an argument to a CWD or RETR command, it is
necessary to encode each "/".
For example, the URL <URL:ftp://myname@host.dom/%2Fetc/motd> is
interpreted by FTP-ing to "host.dom", logging in as "myname"
(prompting for a password if it is asked for), and then executing
"CWD /etc" and then "RETR motd". This has a different meaning from
<URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
"RETR motd"; the initial "CWD" might be executed relative to the
default directory for "myname". On the other hand,
<URL:ftp://myname@host.dom//etc/motd> would "CWD " with a null
argument, then "CWD etc", and then "RETR motd".
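The interpretation of the three example URLs can be reproduced with a
short Python sketch. The function name ftp_commands is hypothetical,
and its input is assumed to be the url-path only, that is, the text
after the "/" that delimits it from the host and port. When no
typecode is present the sketch simply omits the TYPE command, leaving
the choice of transfer mode to the caller. An empty <name> is not
treated specially.

```python
from typing import List
from urllib.parse import unquote


def ftp_commands(url_path: str) -> List[str]:
    """Translate an FTP url-path into the FTP commands it stands for."""
    # ";" is reserved inside names, so ";type=" can only be the typecode part.
    path, sep, typecode = url_path.partition(";type=")
    if sep and typecode not in ("a", "i", "d"):
        raise ValueError("typecode must be one of 'a', 'i', 'd'")
    # Split on the literal "/" first, then decode each component, so that
    # an encoded "%2F" ends up inside a single CWD or RETR argument.
    *cwds, name = path.split("/")
    commands = [f"CWD {unquote(cwd)}" for cwd in cwds]
    name = unquote(name)
    if sep and typecode == "d":
        commands.append(f"NLST {name}")
    else:
        if sep:
            commands.append(f"TYPE {typecode}")
        commands.append(f"RETR {name}")
    return commands
```

Under these assumptions, ftp_commands("%2Fetc/motd") gives "CWD /etc"
then "RETR motd", ftp_commands("etc/motd") gives "CWD etc" then "RETR
motd", and ftp_commands("/etc/motd") gives "CWD " with a null argument,
"CWD etc", and "RETR motd".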
FTP URLs may also be used for other operations; for example, it is
possible to update a file on a remote file server, or infer
information about it from the directory listings. The mechanism for
doing so is not spelled out here.
3.2.3. FTP Typecode is Optional
The entire ;type=<typecode> part of an FTP URL is optional. If it is
omitted, the client program interpreting the URL must guess the
appropriate mode to use. In general, the data content type of a file
can only be guessed from the name, e.g., from the suffix of the name;
the appropriate type code to be used for transfer of the file can
then be deduced from the data content of the file.
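One possible guessing strategy, sketched in Python, is shown here. It
is an assumption for illustration and is not prescribed by this
specification: the function name guess_typecode is hypothetical, the
content type is guessed from the name suffix using Python's standard
mimetypes table, "text/*" content is mapped to typecode "a", and
everything else, including names with no recognizable suffix, falls
back to "i".

```python
import mimetypes


def guess_typecode(name: str) -> str:
    """Guess an FTP typecode ("a" or "i") from a file name suffix."""
    # Step 1: guess the data content type from the name.
    content_type, _encoding = mimetypes.guess_type(name)
    # Step 2: deduce the transfer type code from the content type.
    if content_type is not None and content_type.startswith("text/"):
        return "a"
    # Assumption: binary ("i") is the fallback for anything else.
    return "i"
```

A client could call guess_typecode("readme.txt") before issuing the
TYPE command when the URL carries no ";type=" part.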
3.2.4. Hierarchy
For some file systems, the "/" used to denote the hierarchical
structure of the URL corresponds to the delimiter used to construct a
file name hierarchy, and thus, the filename will look similar to the
URL path. This does NOT mean that the URL is a Unix filename.
3.2.5. Optimization
Clients accessing resources via FTP may employ additional heuristics
to optimize the interaction. For some FTP servers, for example, it
may be reasonable to keep the control connection open while accessing
multiple URLs from the same server. However, there is no common
hierarchical model to the FTP protocol, so if a directory change
command has been given, it is impossible in general to deduce what
sequence should be given to navigate to another directory for a
second retrieval, if the paths are different. The only reliable
algorithm is to disconnect and reestablish the control connection.
3.3. HTTP
The HTTP URL scheme is used to designate Internet resources
accessible using HTTP (HyperText Transfer Protocol).
The HTTP protocol is specified elsewhere. This specification only
describes the syntax of HTTP URLs.
An HTTP URL takes the form:
http://<host>:<port>/<path>?<searchpart>
where <host> and <port> are as described in Section 3.1. If :<port>
is omitted, the port defaults to 80. No user name or password is
allowed. <path> is an HTTP selector, and <searchpart> is a query
string. The <path> is optional, as is the <searchpart> and its
preceding "?". If neither <path> nor <searchpart> is present, the "/"
may also be omitted.
Within the <path> and <searchpart> components, "/", ";", and "?" are
reserved. The "/" character may be used within HTTP to designate a
hierarchical structure.
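The HTTP URL form described above can be split into its parts with a
small Python sketch. The function name split_http_url is hypothetical.
The sketch compares the scheme prefix case-insensitively, performs no
decoding of <path> or <searchpart>, and returns <path> without the
leading "/" because that slash is a delimiter under Section 3.1.

```python
from typing import Optional, Tuple


def split_http_url(url: str) -> Tuple[str, int, str, Optional[str]]:
    """Return (host, port, path, searchpart) for an HTTP URL."""
    prefix = "http://"
    if not url.lower().startswith(prefix):
        raise ValueError("not an HTTP URL")
    rest = url[len(prefix):]
    hostport, _slash, tail = rest.partition("/")
    if "@" in hostport:
        raise ValueError("no user name or password is allowed in HTTP URLs")
    host, colon, port = hostport.partition(":")
    # If :<port> is omitted, the port defaults to 80.
    port_number = int(port) if colon else 80
    # <searchpart> and its preceding "?" are optional; None means absent.
    path, question, searchpart = tail.partition("?")
    return host, port_number, path, (searchpart if question else None)
```

For example, split_http_url("http://www.example.com") returns the host
with port 80, an empty path, and no searchpart, while
split_http_url("http://www.example.com:8080/a/b?q=1") returns port
8080, the path "a/b", and the searchpart "q=1".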
3.4. GOPHER
The Gopher URL scheme is used to designate Internet resources
accessible using the Gopher protocol.
The base Gopher protocol is described in RFC 1436 and supports items
and collections of items (directories). The Gopher+ protocol is a set
o