   The base URI of a document can be established in one of four ways,
   listed below in order of precedence.  The order of precedence can be
   thought of in terms of layers, where the innermost defined base URI
   has the highest precedence.  This can be visualized graphically as:

      .----------------------------------------------------------.
      |  .----------------------------------------------------.  |
      |  |  .----------------------------------------------.  |  |
      |  |  |  .----------------------------------------.  |  |  |
      |  |  |  |  .----------------------------------.  |  |  |  |
      |  |  |  |  |       <relative_reference>       |  |  |  |  |
      |  |  |  |  `----------------------------------'  |  |  |  |
      |  |  |  | (5.1.1) Base URI embedded in the       |  |  |  |
      |  |  |  |         document's content             |  |  |  |
      |  |  |  `----------------------------------------'  |  |  |
      |  |  | (5.1.2) Base URI of the encapsulating entity |  |  |
      |  |  |         (message, document, or none).        |  |  |
      |  |  `----------------------------------------------'  |  |
      |  | (5.1.3) URI used to retrieve the entity            |  |
      |  `----------------------------------------------------'  |
      | (5.1.4) Default Base URI is application-dependent        |
      `----------------------------------------------------------'

Berners-Lee, et. al.        Standards Track                    [Page 18]

RFC 2396                   URI Generic Syntax                August 1998

5.1.1. Base URI within Document Content

   Within certain document media types, the base URI of the document can
   be embedded within the content itself such that it can be readily
   obtained by a parser.  This can be useful for descriptive documents,
   such as tables of content, which may be transmitted to others through
   protocols other than their usual retrieval context (e.g., E-Mail or
   USENET news).

   It is beyond the scope of this document to specify how, for each
   media type, the base URI can be embedded.  It is assumed that user
   agents manipulating such media types will be able to obtain the
   appropriate syntax from that media type's specification.  An example
   of how the base URI can be embedded in the Hypertext Markup Language
   (HTML) [RFC1866] is provided in Appendix D.

   A mechanism for embedding the base URI within MIME container types
   (e.g., the message and multipart types) is defined by MHTML
   [RFC2110].  Protocols that do not use the MIME message header syntax,
   but which do allow some form of tagged metainformation to be included
   within messages, may define their own syntax for defining the base
   URI as part of a message.

5.1.2. Base URI from the Encapsulating Entity

   If no base URI is embedded, the base URI of a document is defined by
   the document's retrieval context.  For a document that is enclosed
   within another entity (such as a message or another document), the
   retrieval context is that entity; thus, the default base URI of the
   document is the base URI of the entity in which the document is
   encapsulated.

5.1.3. Base URI from the Retrieval URI

   If no base URI is embedded and the document is not encapsulated
   within some other entity (e.g., the top level of a composite entity),
   then, if a URI was used to retrieve the base document, that URI shall
   be considered the base URI.  Note that if the retrieval was the
   result of a redirected request, the last URI used (i.e., that which
   resulted in the actual retrieval of the document) is the base URI.

5.1.4. Default Base URI

   If none of the conditions described in Sections 5.1.1--5.1.3 apply,
   then the base URI is defined by the context of the application.
   Since this definition is necessarily application-dependent, failing
   to define the base URI using one of the other methods may result in
   the same content being interpreted differently by different types of
   application.

   It is the responsibility of the distributor(s) of a document
   containing relative URI to ensure that the base URI for that document
   can be established.  It must be emphasized that relative URI cannot
   be used reliably in situations where the document's base URI is not
   well-defined.

5.2. Resolving Relative References to Absolute Form

   This section describes an example algorithm for resolving URI
   references that might be relative to a given base URI.

   The base URI is established according to the rules of Section 5.1 and
   parsed into the four main components as described in Section 3.  Note
   that only the scheme component is required to be present in the base
   URI; the other components may be empty or undefined.  A component is
   undefined if its preceding separator does not appear in the URI
   reference; the path component is never undefined, though it may be
   empty.  The base URI's query component is not used by the resolution
   algorithm and may be discarded.

   For each URI reference, the following steps are performed in order:

   1) The URI reference is parsed into the potential four components and
      fragment identifier, as described in Section 4.3.

   2) If the path component is empty and the scheme, authority, and
      query components are undefined, then it is a reference to the
      current document and we are done.  Otherwise, the reference URI's
      query and fragment components are defined as found (or not found)
      within the URI reference and not inherited from the base URI.

   3) If the scheme component is defined, indicating that the reference
      starts with a scheme name, then the reference is interpreted as an
      absolute URI and we are done.
      Otherwise, the reference URI's scheme is inherited from the base
      URI's scheme component.

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.

   4) If the authority component is defined, then the reference is a
      network-path and we skip to step 7.  Otherwise, the reference
      URI's authority is inherited from the base URI's authority
      component, which will also be undefined if the URI scheme does not
      use an authority component.

   5) If the path component begins with a slash character ("/"), then
      the reference is an absolute-path and we skip to step 7.

   6) If this step is reached, then we are resolving a relative-path
      reference.  The relative path needs to be merged with the base
      URI's path.  Although there are many ways to do this, we will
      describe a simple method using a separate string buffer.

      a) All but the last segment of the base URI's path component is
         copied to the buffer.  In other words, any characters after the
         last (right-most) slash character, if any, are excluded.

      b) The reference's path component is appended to the buffer
         string.

      c) All occurrences of "./", where "."
         is a complete path segment, are removed from the buffer string.

      d) If the buffer string ends with "." as a complete path segment,
         that "." is removed.

      e) All occurrences of "<segment>/../", where <segment> is a
         complete path segment not equal to "..", are removed from the
         buffer string.  Removal of these path segments is performed
         iteratively, removing the leftmost matching pattern on each
         iteration, until no matching pattern remains.

      f) If the buffer string ends with "<segment>/..", where <segment>
         is a complete path segment not equal to "..", that
         "<segment>/.." is removed.

      g) If the resulting buffer string still begins with one or more
         complete path segments of "..", then the reference is
         considered to be in error.  Implementations may handle this
         error by retaining these components in the resolved path (i.e.,
         treating them as part of the final URI), by removing them from
         the resolved path (i.e., discarding relative levels above the
         root), or by avoiding traversal of the reference.

      h) The remaining buffer string is the reference URI's new path
         component.

   7) The resulting URI components, including any inherited from the
      base URI, are recombined to give the absolute form of the URI
      reference.  Using pseudocode, this would be

         result = ""

         if scheme is defined then
             append scheme to result
             append ":" to result

         if authority is defined then
             append "//" to result
             append authority to result

         append path to result

         if query is defined then
             append "?" to result
             append query to result

         if fragment is defined then
             append "#" to result
             append fragment to result

         return result

      Note that we must be careful to preserve the distinction between a
      component that is undefined, meaning that its separator was not
      present in the reference, and a component that is empty, meaning
      that the separator was present and was immediately followed by the
      next component separator or the end of the reference.

   The above algorithm is intended to provide an example by which the
   output of implementations can be tested -- implementation of the
   algorithm itself is not required.  For example, some systems may find
   it more efficient to implement step 6 as a pair of segment stacks
   being merged, rather than as a series of string pattern replacements.

      Note: Some WWW client applications will fail to separate the
      reference's query component from its path component before merging
      the base and reference paths in step 6 above.  This may result in
      a loss of information if the query component contains the strings
      "/../" or "/./".

   Resolution examples are provided in Appendix C.

6. URI Normalization and Equivalence

   In many cases, different URI strings may actually identify the
   identical resource. For example, the host names used in URL are
   actually case insensitive, and the URL <http://www.XEROX.com> is
   equivalent to <http://www.xerox.com>. In general, the rules for
   equivalence and definition of a normal form, if any, are scheme
   dependent.
   When a scheme uses elements of the common syntax, it will
   also use the common syntax equivalence rules, namely that the scheme
   and hostname are case insensitive and a URL with an explicit ":port",
   where the port is the default for the scheme, is equivalent to one
   where the port is elided.

7. Security Considerations

   A URI does not in itself pose a security threat.  Users should beware
   that there is no general guarantee that a URL, which at one time
   located a given resource, will continue to do so.  Nor is there any
   guarantee that a URL will not locate a different resource at some
   later point in time, due to the lack of any constraint on how a given
   authority apportions its namespace.  Such a guarantee can only be
   obtained from the person(s) controlling that namespace and the
   resource in question.  A specific URI scheme may include additional
   semantics, such as name persistence, if those semantics are required
   of all naming authorities for that scheme.

   It is sometimes possible to construct a URL such that an attempt to
   perform a seemingly harmless, idempotent operation, such as the
   retrieval of an entity associated with the resource, will in fact
   cause a possibly damaging remote operation to occur.  The unsafe URL
   is typically constructed by specifying a port number other than that
   reserved for the network protocol in question.  The client
   unwittingly contacts a site that is in fact running a different
   protocol.  The content of the URL contains instructions that, when
   interpreted according to this other protocol, cause an unexpected
   operation.  An example has been the use of a gopher URL to cause an
   unintended or impersonating message to be sent via a SMTP server.

   Caution should be used when using any URL that specifies a port
   number other than the default for the protocol, especially when it is
   a number within the reserved space.
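
   As a rough illustration of the caution above, a client could flag
   URLs whose explicit port differs from the default for the scheme.
   The sketch below is not part of this specification.  The function
   name port_warning and the DEFAULT_PORTS table are hypothetical, and
   treating "the reserved space" as ports below 1024 is an assumption
   made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical table of scheme defaults; extend it for the schemes in use.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "gopher": 70, "telnet": 23}


def port_warning(url):
    """Return a warning string for an explicit non-default port, else None."""
    parts = urlsplit(url)
    port = parts.port  # None when the URL carries no explicit ":port"
    if port is None or port == DEFAULT_PORTS.get(parts.scheme):
        return None
    # Assumption: "reserved space" is taken to mean ports below 1024.
    if port < 1024:
        return "non-default port in reserved range"
    return "non-default port"
```

   For instance, port_warning("gopher://example.com:25/x") reports a
   port in the reserved range, while "http://example.com:80/" and
   "http://example.com/" produce no warning.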

   Care should be taken when a URL contains escaped delimiters for a
   given protocol (for example, CR and LF characters for telnet
   protocols) that these are not unescaped before transmission.  This
   might violate the protocol, but avoids the potential for such
   characters to be used to simulate an extra operation or parameter in
   that protocol, which might lead to an unexpected and possibly harmful
   remote operation to be performed.

   It is clearly unwise to use a URL that contains a password which is
   intended to be secret. In particular, the use of a password within
   the 'userinfo' component of a URL is strongly disrecommended except
   in those rare cases where the 'password' parameter is intended to be
   public.

8. Acknowledgements

   This document was derived from RFC 1738 [RFC1738] and RFC 1808
   [RFC1808]; the acknowledgements in those specifications still apply.
   In addition, contributions by Gisle Aas, Martin Beet, Martin Duerst,
   Jim Gettys, Martijn Koster, Dave Kristol, Daniel LaLiberte, Foteos
   Macrides, James Marshall, Ryan Moats, Keith Moore, and Lauren Wood
   are gratefully acknowledged.
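
   The resolution steps of Section 5.2 can be sketched in Python as
   follows.  This is an illustrative sketch only, not a reference
   implementation.  The names parse, merge_paths and resolve are
   hypothetical.  The regular expression is the one given in Appendix B
   of this document; an unmatched group yields None, which is used here
   to model an undefined (as opposed to empty) component.  For step 2
   the sketch simply returns the reference unchanged, and for step 6g it
   retains leading ".." segments, which is one of the permitted choices.

```python
import re

# Groups 2, 4, 5, 7, 9: scheme, authority, path, query, fragment.
_URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def parse(ref):
    """Step 1: split a reference; undefined components come back as None."""
    m = _URI_RE.match(ref)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def merge_paths(base_path, ref_path):
    """Step 6: merge a relative path with the base path."""
    # a) + b) keep the base path up to its last "/", then append the reference.
    buf = base_path[: base_path.rfind("/") + 1] + ref_path
    segs = buf.split("/")
    first = 1 if buf.startswith("/") else 0  # index of the first real segment
    # c) drop every "." segment that is followed by a slash.
    segs = segs[:first] + [s for s in segs[first:-1] if s != "."] + segs[-1:]
    # d) a trailing "." segment is removed; the slash before it stays.
    if segs[-1] == ".":
        segs[-1] = ""
    # e) remove "<segment>/../" repeatedly, leftmost match first.
    i = first
    while i < len(segs) - 2:
        if segs[i] != ".." and segs[i + 1] == "..":
            del segs[i : i + 2]
            i = first
        else:
            i += 1
    # f) remove a trailing "<segment>/..", keeping the slash before it.
    if len(segs) - 2 >= first and segs[-1] == ".." and segs[-2] != "..":
        segs[-2:] = [""]
    # g) any leading ".." segments are retained in this sketch.
    return "/".join(segs)  # h)


def resolve(base, ref):
    """Resolve ref against base following steps 1 to 7."""
    b_scheme, b_auth, b_path, _, _ = parse(base)
    scheme, auth, path, query, frag = parse(ref)
    # Step 2: reference to the current document; left to the caller.
    if path == "" and scheme is None and auth is None and query is None:
        return ref
    # Step 3: a defined scheme means the reference is already absolute.
    if scheme is not None:
        return ref
    scheme = b_scheme
    # Steps 4 to 6: inherit the authority, then merge relative paths.
    if auth is None:
        auth = b_auth
        if not path.startswith("/"):
            path = merge_paths(b_path, path)
    # Step 7: recombine, emitting separators only for defined components.
    out = ""
    if scheme is not None:
        out += scheme + ":"
    if auth is not None:
        out += "//" + auth
    out += path
    if query is not None:
        out += "?" + query
    if frag is not None:
        out += "#" + frag
    return out
```

   With the base URI "http://a/b/c/d;p?q", resolve() maps "../g" to
   "http://a/b/g", "?y" to "http://a/b/c/?y", and "../../../g" to
   "http://a/../g".  Because the query is split off before the paths
   are merged, "g?y/./x" resolves to "http://a/b/c/g?y/./x" with the
   query left intact.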