Berners-Lee                                                     [Page 7]

RFC 1630                      URIs in WWW                      June 1994


Encoding reserved characters

   When a system uses a local addressing scheme, it is useful to provide
   a mapping from local addresses into URIs so that objects within the
   addressing scheme may be referred to globally, and possibly accessed
   through gateway servers.

   For a new naming scheme, any mapping scheme may be defined provided
   it is unambiguous, reversible, and provides valid URIs.  It is
   recommended that where hierarchical aspects to the local naming
   scheme exist, they be mapped onto the hierarchical URL path syntax in
   order to allow the partial form to be used.

   It is also recommended that the conventional scheme below be used in
   all cases except for any scheme which encodes binary data as opposed
   to text, in which case a more compact encoding such as pure
   hexadecimal or base 64 might be more appropriate.  For example, the
   conventional URI encoding method is used for mapping WAIS, FTP,
   Prospero and Gopher addresses in the URI specification.

   CONVENTIONAL URI ENCODING SCHEME

      Where the local naming scheme uses ASCII characters which are not
      allowed in the URI, these may be represented in the URL by a
      percent sign "%" immediately followed by two hexadecimal digits
      (0-9, A-F) giving the ISO Latin 1 code for that character.
      Character codes other than those allowed by the syntax shall not
      be used unencoded in a URI.
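
      As a rough illustration (not part of the specification), the
      conventional encoding can be sketched in Python.  The SAFE set and
      the name encode_conventional are assumptions made for this
      example; the characters actually allowed in a URI are fixed by the
      URI syntax, not by this sketch.

```python
import string

# Assumed "allowed" set, for illustration only; the real set comes from
# the URI syntax definition.
SAFE = set(string.ascii_letters + string.digits + "-_.")


def encode_conventional(text, safe=SAFE):
    """Replace each character outside `safe` by "%" plus two hex digits."""
    out = []
    for ch in text:
        if ch in safe:
            out.append(ch)
        else:
            # ISO Latin 1 code of the character; raises UnicodeEncodeError
            # for characters that have no Latin 1 code.
            code = ch.encode("latin-1")[0]
            out.append("%{:02X}".format(code))
    return "".join(out)
```

      With this assumed safe set, encode_conventional("marie claude")
      gives "marie%20claude", and a literal percent sign becomes "%25".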

   REDUCED OR INCREASED SAFE CHARACTER SETS

      The same encoding method may be used for encoding characters whose
      use, although technically allowed in a URI, would be unwise due to
      problems of corruption by imperfect gateways or misrepresentation
      due to the use of variant character sets, or which would simply be
      awkward in a given environment.  Because a % sign always indicates
      an encoded character, a URI may be made "safer" simply by encoding
      any characters considered unsafe, while leaving already encoded
      characters still encoded.  Similarly, in cases where a larger set
      of characters is acceptable, % signs can be selectively and
      reversibly expanded.

      Before two URIs can be compared, it is therefore necessary to
      bring them to the same encoding level.

      However, the reserved characters mentioned above have a quite
      different significance when encoded, and so may NEVER be encoded
      and unencoded in this way.



      The percent sign intended as such must always be encoded, as its
      presence otherwise always indicates an encoding.  Sequences which
      start with a percent sign but are not followed by two hexadecimal
      characters are reserved for future extension.  (See Example 3.)

   Example 1

   The URIs

                http://info.cern.ch/albert/bertram/marie-claude

   and

                http://info.cern.ch/albert/bertram/marie%2Dclaude

   are identical, as the %2D encodes a hyphen character.

   Example 2

   The URIs

                http://info.cern.ch/albert/bertram/marie-claude

   and

                http://info.cern.ch/albert/bertram%2Fmarie-claude

   are NOT identical, as in the second case the encoded slash does not
   have hierarchical significance.

   Example 3

   The URIs

                fxqn:/us/va/reston/cnri/ietf/24/asdf%*.fred

   and

                news:12345667123%asdghfh@info.cern.ch

   are illegal, as all % characters imply encodings, and there is no
   decoding defined for "%*"  or "%as" in this recommendation.
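
   The three examples above can be checked with a small Python sketch.
   It brings two URIs to the same encoding level by expanding escapes
   of printable characters that are not reserved, and it rejects any
   "%" that is not followed by two hexadecimal digits.  The RESERVED
   set and all function names are assumptions made for illustration;
   the authoritative list of reserved characters is the one given by
   the URI syntax.

```python
import re

# Assumption for illustration: treat these as reserved characters whose
# encoded and unencoded forms differ in meaning.
RESERVED = set("%/#?")

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")
_BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def is_legal(uri):
    """True if every "%" is followed by two hexadecimal digits."""
    return _BAD_PERCENT.search(uri) is None


def same_level(uri, reserved=RESERVED):
    """Expand escapes of printable, non-reserved ASCII; keep the rest."""
    def expand(match):
        ch = chr(int(match.group(1), 16))
        if ch in reserved or not ("!" <= ch <= "~"):
            # Reserved or non-printable: leave it encoded.
            return "%" + match.group(1).upper()
        return ch
    return _ESCAPE.sub(expand, uri)


def equivalent(a, b):
    """Compare two URIs after bringing both to the same encoding level."""
    return same_level(a) == same_level(b)
```

   Under these assumptions, equivalent() reports the two URIs of
   Example 1 as identical and those of Example 2 as different, and
   is_legal() rejects both URIs of Example 3.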

Partial (relative) form

   Within an object whose URI is well defined, the URI of another object
   may be given in abbreviated form, where parts of the two URIs are the
   same. This allows objects within a group to refer to each other
   without requiring the space for a complete reference, and it
   incidentally allows the group of objects to be moved without changing
   any references.  It must be emphasized that when a reference is
   passed in anything other than a well controlled context, the full
   form must always be used.

   In World-Wide Web applications, the context URI is that of the
   document or object containing a reference. In this case partial URIs
   can be generated in virtual objects or stored in real objects,
   without the need for dramatic change if the higher-order parts of a
   hierarchical naming system are modified.  Apart from terseness, this
   gives greater robustness to practical systems, by enabling
   information hiding between system components.

   The partial form relies on a property of the URI syntax that certain
   characters ("/") and certain path elements ("..", ".") have a
   significance reserved for representing a hierarchical space, and must
   be recognized as such by both clients and servers.

   A partial form can be distinguished from an absolute form in that the
   latter must have a colon and that colon must occur before any slash
   characters. Systems not requiring partial forms should not use any
   unencoded slashes in their naming schemes.  If they do, absolute URIs
   will still work, but confusion may result. (See note on Gopher
   below.)
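
   The test for an absolute form described above is simple enough to
   sketch in Python.  The function name is_absolute is an assumption
   made for this illustration.

```python
def is_absolute(uri):
    """True if the URI has a colon that occurs before any slash."""
    colon = uri.find(":")
    if colon == -1:
        return False          # no colon at all: partial form
    slash = uri.find("/")
    return slash == -1 or colon < slash
```

   For instance, "g:h" and "magic://a/b" are classed as absolute, while
   "g", "/g" and "../g" are classed as partial.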

   The rules for the use of a partial name relative to the URI of the
   context are:

      If the scheme parts are different, the whole absolute URI must
      be given.  Otherwise, the scheme is omitted, and:

      If the partial URI starts with a non-zero number of consecutive
      slashes, then everything from the context URI up to (but not
      including) the first occurrence of exactly the same number of
      consecutive slashes which has no greater number of consecutive
      slashes anywhere to the right of it is taken to be the same and
      so prepended to the partial URL to form the full URL. Otherwise:

      The last part of the path of the context URI (anything following
      the rightmost slash) is removed, and the given partial URI
      appended in its place, and then:

      Within the result, all occurrences of "xxx/../" or "/." are
      recursively removed, where xxx, ".." and "." are complete path
      elements.





      Note: Trailing slashes

   If a path of the context locator ends in slash, partial URIs are
   treated differently to the URI with the same path but without a
   trailing slash. The trailing slash indicates a void segment of the
   path.

      Note: Gopher

   The gopher system does not have the concept of relative URIs, and the
   gopher community currently allows / as data characters in gopher URIs
   without escaping them to %2F.  Relative forms may not in general be
   used for documents served by gopher servers.  If they are used, then
   WWW software assumes, normally correctly, that in fact they do have
   hierarchical significance despite the specifications. The use of HTTP
   rather than gopher protocol is however recommended.

   Examples

   In the context of URI

                        magic://a/b/c//d/e/f

   the partial URIs would expand as follows:

   g                       magic://a/b/c//d/e/g

   /g                      magic://a/g

   //g                     magic://g

   ../g                    magic://a/b/c//d/g

   g:h                     g:h

   and in the context of the URI

                           magic://a/b/c//d/e/

   the results would be exactly the same.
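
   The last two rules (replacing the final path element, then removing
   "xxx/../" and "/.") can be sketched in Python as follows.  This is
   a partial illustration only: it deliberately does not handle partial
   URIs that carry a scheme or that start with one or more slashes, and
   the name resolve_simple is an assumption made for the example.

```python
def resolve_simple(context, partial):
    """Resolve a scheme-less partial URI that has no leading slash."""
    if partial.startswith("/"):
        raise ValueError("leading-slash partial URIs are not handled here")
    # Remove anything following the rightmost slash of the context and
    # append the partial URI in its place.
    joined = context[: context.rfind("/") + 1] + partial
    # Drop "." elements, and let each ".." cancel the named element
    # before it.  Only complete path elements are considered.
    out = []
    for element in joined.split("/"):
        if element == ".":
            continue
        if element == ".." and out and out[-1] not in ("", ".."):
            out.pop()
            continue
        out.append(element)
    return "/".join(out)
```

   With the context magic://a/b/c//d/e/f, this sketch expands "g" to
   magic://a/b/c//d/e/g and "../g" to magic://a/b/c//d/g, and it gives
   the same results for the context magic://a/b/c//d/e/.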

Fragment-id

   This represents a part of, fragment of, or a sub-function within, an
   object.  Its syntax and semantics are defined by the application
   responsible for the object, or the specification of the content type
   of the object.  The only definition here is of the allowed characters
   by which it may be represented in a URL.



   Specific syntaxes for representing fragments in text documents by
   line and character range, or in graphics by coordinates, or in
   structured documents using ladders, are suitable for standardization
   but not defined here.

   The fragment-id follows the URL of the whole object from which it is
   separated by a hash sign (#).  If the fragment-id is void, the hash
   sign may be omitted: A void fragment-id with or without the hash sign
   means that the URL refers to the whole object.
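
   Separating the fragment-id from the URL of the whole object might
   look like the following Python sketch.  The name split_fragment is
   an assumption made for illustration.

```python
def split_fragment(reference):
    """Return (url, fragment_id); the fragment-id is "" when void."""
    url, _, fragment = reference.partition("#")
    return url, fragment
```

   A reference with a trailing hash sign and one with no hash sign at
   all both yield a void fragment-id, so both refer to the whole
   object.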

   While this hook is allowed for identification of fragments, the
   question of addressing of parts of objects, or of the grouping of
   objects and relationship between contained and containing objects, is
   not addressed by this document.

   Fragment identifiers do NOT address the question of objects which are
   different versions of a "living" object, nor of expressing the
   relationships between different versions and the living object.

   There is no implication that a fragment identifier refers to anything
   which can be extracted as an object in its own right.  It may, for
   example, refer to an indivisible point within an object.

Specific Schemes

   The mapping for URIs onto some existing standard and experimental
   protocols is outlined in the BNF syntax definition.  Notes on
   particular protocols follow.  These URIs are frequently referred to
   as URLs, though the exact definition of the term URL is still under
   discussion (March 1993).  The schemes covered are:

   http                    Hypertext Transfer Protocol (examples)

   ftp                     File Transfer Protocol

   gopher                  Gopher protocol

   mailto                  Electronic mail address

   news                    Usenet news

   telnet, rlogin and tn3270
                           Reference to interactive sessions

   wais                    Wide Area Information Servers

   file                    Local file access




   The following schemes are proposed as essential to the unification of
   the web with electronic mail, but not currently (to the author's
   knowledge) implemented:

   mid                     Message identifiers for electronic mail

   cid                     Content identifiers for MIME body parts

   The schemes for X.500, network management database, and Whois++ have
   not been specified and may be the subject of further study.  Schemes
   for Prospero and restricted NNTP use are not currently implemented
   as far as the author is aware.

   The "urn" prefix is reserved for use in encoding a Uniform Resource
   Name when that has been developed by the IETF working group.

   New schemes may be registered at a later time.

HTTP

   The HTTP protocol specifies that the path is handled transparently by
   those who handle URLs, except for the servers which de-reference
   them.  The path is passed by the client to the server with any
   request, but is not otherwise understood by the client.

   The host details are not passed on to the server when the URL is an
   HTTP URL which refers to the server in question.  In this case the
   string sent starts with the slash which follows the host details.
   However, when an HTTP server is being used as a gateway (or "proxy")
   then the entire URI, whether HTTP or some other scheme, is passed on
   the HTTP command line.  The search part, if present, is sent as part
   of the HTTP command, and may in this respect be treated as part of
   the path.  No fragment-id part of a WWW URI (the hash sign and
   following) is sent with the request.  Spaces and control characters
   in URLs must be escaped for transmission in HTTP, as must other
   disallowed characters.
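
   As a rough sketch of the rules in this paragraph, the string a
   client sends for a given URL could be derived as below.  The name
   request_target and the via_proxy flag are assumptions made for
   illustration, and the escaping of spaces and other disallowed
   characters is left out.

```python
def request_target(url, via_proxy=False):
    """Return the string sent to the server for `url`."""
    # The hash sign and anything following it are never sent.
    without_fragment = url.partition("#")[0]
    if via_proxy:
        # A gateway ("proxy") receives the entire URI.
        return without_fragment
    # Otherwise skip "scheme://host[:port]" and start at the slash that
    # follows the host details; any search part stays attached.
    after_scheme = without_fragment.partition("://")[2]
    rest = after_scheme.partition("/")[2]
    return "/" + rest
```

   For http://www.myu.edu/org/admin/people#andy this yields
   "/org/admin/people", while with via_proxy=True the full URI is
   returned, again without the fragment-id.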

   EXAMPLES

      These examples are not part of the specification: they are
      provided as illustrations only.  The URI of the "welcome" page to a
      server is conventionally

         http://www.my.work.com/

      As the rest of the URL (after the hostname and port) is opaque
      to the client, it shows great variety but the following are all
      fairly typical.



http://www.my.uni.edu/info/matriculation/enroling.html

http://info.my.org/AboutUs/Phonebook

http://www.library.my.town.va.us/Catalogue/76523471236%2Fwen44--4.98

http://www.my.org/462F4F2D4241522A314159265358979323846

   A URL for a server on a different port to 80 looks like

        http://info.cern.ch:8000/imaginary/test

   A reference to a particular part of a document may, including the
   fragment identifier, look like

        http://www.myu.edu/org/admin/people#andy

   in which case the string "#andy" is not sent to the server, but is
   retained by the client and used when the whole object has been
   retrieved.

   A search on a text database might look like

        http://info.my.org/AboutUs/Index/Phonebook?dobbins

   and on another database

        http://info.cern.ch/RDB/EMP?*%20where%20name%3Ddobbins

   In all cases the client passes the path string to the server
   uninterpreted, and for the client to deduce anything from

FTP

   The ftp: prefix indicates that the FTP protocol is used, as defined
   in STD 9, RFC 959 or any successor.  The port number, if present,
   gives the port of the FTP server if not the FTP default.

   User name and password

      The syntax allows for the inclusion of a user name and even a
      password for those systems which do not use the anonymous FTP
      convention. The default, however, if no user or password is
      supplied, will be to use that convention, viz. that the user name
      is "anonymous" and the password the user's Internet-style mail
      address.
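
      The default described above can be sketched in Python with the
      standard urllib.parse module.  The function name ftp_login, the
      mail_address parameter, and the example host names are
      assumptions made for illustration; port 21 is used as the FTP
      default.

```python
from urllib.parse import urlsplit

FTP_DEFAULT_PORT = 21


def ftp_login(url, mail_address):
    """Return (host, port, user, password) for an ftp: URL."""
    parts = urlsplit(url)
    port = parts.port or FTP_DEFAULT_PORT
    if parts.username is None:
        # Anonymous FTP convention: user "anonymous", password the
        # user's Internet-style mail address.
        return parts.hostname, port, "anonymous", mail_address
    return parts.hostname, port, parts.username, parts.password or ""
```

      For example, ftp_login("ftp://ftp.example.org/pub/file.txt",
      "me@example.org") falls back to the anonymous convention, whereas
      a URL that names a user keeps the credentials it carries.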




