   is typically constructed by specifying a port number other than that
   reserved for the network protocol in question.  The client
   unwittingly contacts a site that is in fact running a different
   protocol.  The content of the URL contains instructions that, when
   interpreted according to this other protocol, cause an unexpected
   operation.  An example has been the use of a gopher URL to cause an
   unintended or impersonating message to be sent via an SMTP server.

   Caution should be used when using any URL that specifies a port
   number other than the default for the protocol, especially when it is
   a number within the reserved space.
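
   The caution above can be sketched as a small check.  This is a
   minimal illustration, not a complete defense: the DEFAULT_PORTS
   table and the port_warning function are hypothetical names, and
   reading "the reserved space" as ports below 1024 is an assumption.

```python
from urllib.parse import urlsplit

# Assumed default ports for a few schemes (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "gopher": 70, "telnet": 23}


def port_warning(url):
    """Return a warning string if the URL names a non-default port, else None."""
    parts = urlsplit(url)
    port = parts.port  # None when the URL gives no explicit port
    if port is None:
        return None
    if port == DEFAULT_PORTS.get(parts.scheme):
        return None
    if port < 1024:
        # Assumption: "reserved space" means the range 0-1023.
        return f"port {port} is non-default and in the reserved range (0-1023)"
    return f"port {port} is non-default for scheme {parts.scheme!r}"


print(port_warning("gopher://example.com:25/x"))
```

   A caller might refuse such a URL outright or ask the user to confirm
   before dereferencing it.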

   Care should be taken when a URL contains escaped delimiters for a
   given protocol (for example, CR and LF characters for telnet
   protocols) that these are not unescaped before transmission.  This
   might violate the protocol, but avoids the potential for such





Berners-Lee, et. al.        Standards Track                    [Page 23]

RFC 2396                   URI Generic Syntax                August 1998


   characters to be used to simulate an extra operation or parameter in
   that protocol, which might cause an unexpected and possibly harmful
   remote operation to be performed.
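
   One way to act on this advice is to look for escaped CR and LF
   before any unescaping takes place.  The sketch below assumes that
   only %0D and %0A are of interest; has_escaped_line_break is a
   hypothetical helper name, and other protocols may have further
   delimiters that need the same treatment.

```python
import re
from urllib.parse import unquote

# Matches a percent-escaped CR (%0D) or LF (%0A), in either hex case.
ESCAPED_CRLF = re.compile(r"%0[dDaA]")


def has_escaped_line_break(url):
    """True if the URL carries an escaped CR or LF character."""
    return ESCAPED_CRLF.search(url) is not None


url = "gopher://host.example:25/1HELO%0D%0AMAIL"
if has_escaped_line_break(url):
    # Unescaping here would put a raw CR LF pair on the wire.
    print("refusing to unescape:", url)

# For illustration only: this is what unescaping would produce.
assert unquote("a%0D%0Ab") == "a\r\nb"
```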

   It is clearly unwise to use a URL that contains a password which is
   intended to be secret. In particular, the use of a password within
   the 'userinfo' component of a URL is strongly disrecommended except
   in those rare cases where the 'password' parameter is intended to be
   public.
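
   A minimal sketch of removing a password from the 'userinfo'
   component before a URL is logged or displayed.  The function name
   strip_password is hypothetical, and the sketch assumes a URL with a
   "//" authority part that Python's urllib.parse can split.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_password(url):
    """Return the URL with any password removed from its userinfo."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to remove
    # netloc is "userinfo@hostport"; split on the last "@".
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user = userinfo.split(":", 1)[0]
    netloc = f"{user}@{hostport}" if user else hostport
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(strip_password("ftp://alice:secret@ftp.example.com/pub"))
```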

8. Acknowledgements

   This document was derived from RFC 1738 [RFC1738] and RFC 1808
   [RFC1808]; the acknowledgements in those specifications still apply.
   In addition, contributions by Gisle Aas, Martin Beet, Martin Duerst,
   Jim Gettys, Martijn Koster, Dave Kristol, Daniel LaLiberte, Foteos
   Macrides, James Marshall, Ryan Moats, Keith Moore, and Lauren Wood
   are gratefully acknowledged.

9. References

   [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and
             Languages", BCP 18, RFC 2277, January 1998.

   [RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
             Unifying Syntax for the Expression of Names and Addresses
             of Objects on the Network as used in the World-Wide Web",
             RFC 1630, June 1994.

   [RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, Editors,
             "Uniform Resource Locators (URL)", RFC 1738, December 1994.

   [RFC1866] Berners-Lee, T., and D. Connolly, "HyperText Markup Language
             Specification -- 2.0", RFC 1866, November 1995.

   [RFC1123] Braden, R., Editor, "Requirements for Internet Hosts --
             Application and Support", STD 3, RFC 1123, October 1989.

   [RFC822]  Crocker, D., "Standard for the Format of ARPA Internet Text
             Messages", STD 11, RFC 822, August 1982.

   [RFC1808] Fielding, R., "Relative Uniform Resource Locators", RFC
             1808, June 1995.

   [RFC2046] Freed, N., and N. Borenstein, "Multipurpose Internet Mail
             Extensions (MIME) Part Two: Media Types", RFC 2046,
             November 1996.

   [RFC1736] Kunze, J., "Functional Recommendations for Internet
             Resource Locators", RFC 1736, February 1995.

   [RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997.

   [RFC1034] Mockapetris, P., "Domain Names - Concepts and Facilities",
             STD 13, RFC 1034, November 1987.

   [RFC2110] Palme, J., and A. Hopmann, "MIME E-mail Encapsulation of
             Aggregate Documents, such as HTML (MHTML)", RFC 2110, March
             1997.

   [RFC1737] Sollins, K., and L. Masinter, "Functional Requirements for
             Uniform Resource Names", RFC 1737, December 1994.

   [ASCII]   US-ASCII. "Coded Character Set -- 7-bit American Standard
             Code for Information Interchange", ANSI X3.4-1986.

   [UTF-8]   Yergeau, F., "UTF-8, a transformation format of ISO 10646",
             RFC 2279, January 1998.

10. Authors' Addresses

   Tim Berners-Lee
   World Wide Web Consortium
   MIT Laboratory for Computer Science, NE43-356
   545 Technology Square
   Cambridge, MA 02139

   Fax: +1(617)258-8682
   EMail: timbl@w3.org


   Roy T. Fielding
   Department of Information and Computer Science
   University of California, Irvine
   Irvine, CA  92697-3425

   Fax: +1(949)824-1715
   EMail: fielding@ics.uci.edu


   Larry Masinter
   Xerox PARC
   3333 Coyote Hill Road
   Palo Alto, CA 94034

   Fax: +1(415)812-4333
   EMail: masinter@parc.xerox.com

A. Collected BNF for URI

      URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
      absoluteURI   = scheme ":" ( hier_part | opaque_part )
      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]

      hier_part     = ( net_path | abs_path ) [ "?" query ]
      opaque_part   = uric_no_slash *uric

      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                      "&" | "=" | "+" | "$" | ","

      net_path      = "//" authority [ abs_path ]
      abs_path      = "/"  path_segments
      rel_path      = rel_segment [ abs_path ]

      rel_segment   = 1*( unreserved | escaped |
                          ";" | "@" | "&" | "=" | "+" | "$" | "," )

      scheme        = alpha *( alpha | digit | "+" | "-" | "." )

      authority     = server | reg_name

      reg_name      = 1*( unreserved | escaped | "$" | "," |
                          ";" | ":" | "@" | "&" | "=" | "+" )

      server        = [ [ userinfo "@" ] hostport ]
      userinfo      = *( unreserved | escaped |
                         ";" | ":" | "&" | "=" | "+" | "$" | "," )

      hostport      = host [ ":" port ]
      host          = hostname | IPv4address
      hostname      = *( domainlabel "." ) toplabel [ "." ]
      domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
      toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
      IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
      port          = *digit

      path          = [ abs_path | opaque_part ]
      path_segments = segment *( "/" segment )
      segment       = *pchar *( ";" param )
      param         = *pchar
      pchar         = unreserved | escaped |
                      ":" | "@" | "&" | "=" | "+" | "$" | ","

      query         = *uric

      fragment      = *uric

      uric          = reserved | unreserved | escaped
      reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                      "$" | ","
      unreserved    = alphanum | mark
      mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
                      "(" | ")"

      escaped       = "%" hex hex
      hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                              "a" | "b" | "c" | "d" | "e" | "f"

      alphanum      = alpha | digit
      alpha         = lowalpha | upalpha

      lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
                 "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
                 "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
      upalpha  = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
                 "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
                 "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
      digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                 "8" | "9"
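
   A few of the rules above translate directly into regular
   expressions.  The sketch below covers only 'scheme' and '*uric' (the
   form shared by 'query' and 'fragment'); the names SCHEME, URIC,
   is_scheme, and is_uric_string are hypothetical, and the patterns
   are a hand translation of the BNF, not part of the specification.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+\-.]*")

# uric = reserved | unreserved | escaped
URIC = re.compile(
    r"(?:[;/?:@&=+$,]"            # reserved
    r"|[A-Za-z0-9\-_.!~*'()]"     # unreserved = alphanum | mark
    r"|%[0-9A-Fa-f]{2})*"         # escaped = "%" hex hex
)


def is_scheme(s):
    """True if s matches the 'scheme' rule in full."""
    return SCHEME.fullmatch(s) is not None


def is_uric_string(s):
    """True if s matches '*uric' in full (valid as a query or fragment)."""
    return URIC.fullmatch(s) is not None


print(is_scheme("svn+ssh"), is_uric_string("a=b&c=%20d"))
```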

B. Parsing a URI Reference with a Regular Expression

   As described in Section 4.3, the generic URI syntax is not sufficient
   to disambiguate the components of some forms of URI.  Since the
   "greedy algorithm" described in that section is identical to the
   disambiguation method used by POSIX regular expressions, it is
   natural and commonplace to use a regular expression for parsing the
   potential four components and fragment identifier of a URI reference.
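
   As a sketch of this approach, the Python fragment below splits a
   URI reference into scheme, authority, path, query, and fragment
   with a single greedy pattern.  The pattern and the names URI_REF
   and split_uri_reference are chosen here for illustration and are an
   assumption, not text taken from this section.  Python's re module is
   a backtracking engine rather than a POSIX one.

```python
import re

# Illustrative pattern: each optional group greedily takes as much as it can.
URI_REF = re.compile(
    r"^(([^:/?#]+):)?"   # group 2: scheme
    r"(//([^/?#]*))?"    # group 4: authority
    r"([^?#]*)"          # group 5: path
    r"(\?([^#]*))?"      # group 7: query
    r"(#(.*))?"          # group 9: fragment
)


def split_uri_reference(ref):
    """Split a URI reference into its components; absent parts are None."""
    m = URI_REF.match(ref)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


print(split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related"))
```

   A component that is absent comes back as None, which keeps it
   distinct from one that is present but empty.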

   The following line is the regular expression 
