Network Working Group                                     T. Berners-Lee
Request for Comments: 2396                                       MIT/LCS
Updates: 1808, 1738                                          R. Fielding
Category: Standards Track                                    U.C. Irvine
                                                             L. Masinter
                                                       Xerox Corporation
                                                             August 1998


           Uniform Resource Identifiers (URI): Generic Syntax

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1998).  All Rights Reserved.

IESG Note

   This paper describes a "superset" of operations that can be applied
   to URI.  It consists of both a grammar and a description of basic
   functionality for URI.  To understand what is a valid URI, both the
   grammar and the associated description have to be studied.  Some of
   the functionality described is not applicable to all URI schemes, and
   some operations are only possible when certain media types are
   retrieved using the URI, regardless of the scheme used.

Abstract

   A Uniform Resource Identifier (URI) is a compact string of characters
   for identifying an abstract or physical resource.  This document
   defines the generic syntax of URI, including both absolute and
   relative forms, and guidelines for their use; it revises and replaces
   the generic definitions in RFC 1738 and RFC 1808.

   This document defines a grammar that is a superset of all valid URI,
   such that an implementation can parse the common components of a URI
   reference without knowing the scheme-specific requirements of every
   possible identifier type.  This document does not define a generative
   grammar for URI; that task will be performed by the individual
   specifications of each URI scheme.

Berners-Lee, et. al.        Standards Track                     [Page 1]

RFC 2396                   URI Generic Syntax                August 1998

1. Introduction

   Uniform Resource Identifiers (URI) provide a simple and extensible
   means for identifying a resource.  This specification of URI syntax
   and semantics is derived from concepts introduced by the World Wide
   Web global information initiative, whose use of such objects dates
   from 1990 and is described in "Universal Resource Identifiers in WWW"
   [RFC1630].  The specification of URI is designed to meet the
   recommendations laid out in "Functional Recommendations for Internet
   Resource Locators" [RFC1736] and "Functional Requirements for Uniform
   Resource Names" [RFC1737].

   This document updates and merges "Uniform Resource Locators"
   [RFC1738] and "Relative Uniform Resource Locators" [RFC1808] in order
   to define a single, generic syntax for all URI.  It excludes those
   portions of RFC 1738 that defined the specific syntax of individual
   URL schemes; those portions will be updated as separate documents, as
   will the process for registration of new URI schemes.  This document
   does not discuss the issues and recommendations for dealing with
   characters outside of the US-ASCII character set [ASCII]; those
   recommendations are discussed in a separate document.

   All significant changes from the prior RFCs are noted in Appendix G.

1.1. Overview of URI

   URI are characterized by the following definitions:

      Uniform
         Uniformity provides several benefits: it allows different types
         of resource identifiers to be used in the same context, even
         when the mechanisms used to access those resources may differ;
         it allows uniform semantic interpretation of common syntactic
         conventions across different types of resource identifiers; it
         allows introduction of new types of resource identifiers
         without interfering with the way that existing identifiers are
         used; and, it allows the identifiers to be reused in many
         different contexts, thus permitting new applications or
         protocols to leverage a pre-existing, large, and widely-used
         set of resource identifiers.

      Resource
         A resource can be anything that has identity.  Familiar
         examples include an electronic document, an image, a service
         (e.g., "today's weather report for Los Angeles"), and a
         collection of other resources.  Not all resources are network
         "retrievable"; e.g., human beings, corporations, and bound
         books in a library can also be considered resources.

         The resource is the conceptual mapping to an entity or set of
         entities, not necessarily the entity which corresponds to that
         mapping at any particular instance in time.  Thus, a resource
         can remain constant even when its content---the entities to
         which it currently corresponds---changes over time, provided
         that the conceptual mapping is not changed in the process.

      Identifier
         An identifier is an object that can act as a reference to
         something that has identity.  In the case of URI, the object is
         a sequence of characters with a restricted syntax.

   Having identified a resource, a system may perform a variety of
   operations on the resource, as might be characterized by such words
   as `access', `update', `replace', or `find attributes'.

1.2. URI, URL, and URN

   A URI can be further classified as a locator, a name, or both.  The
   term "Uniform Resource Locator" (URL) refers to the subset of URI
   that identify resources via a representation of their primary access
   mechanism (e.g., their network "location"), rather than identifying
   the resource by name or by some other attribute(s) of that resource.
   The term "Uniform Resource Name" (URN) refers to the subset of URI
   that are required to remain globally unique and persistent even when
   the resource ceases to exist or becomes unavailable.

   The URI scheme (Section 3.1) defines the namespace of the URI, and
   thus may further restrict the syntax and semantics of identifiers
   using that scheme.  This specification defines those elements of the
   URI syntax that are either required of all URI schemes or are common
   to many URI schemes.  It thus defines the syntax and semantics that
   are needed to implement a scheme-independent parsing mechanism for
   URI references, such that the scheme-dependent handling of a URI can
   be postponed until the scheme-dependent semantics are needed.  We use
   the term URL below when describing syntax or semantics that only
   apply to locators.

   Although many URL schemes are named after protocols, this does not
   imply that the only way to access the URL's resource is via the named
   protocol.  Gateways, proxies, caches, and name resolution services
   might be used to access some resources, independent of the protocol
   of their origin, and the resolution of some URL may require the use
   of more than one protocol (e.g., both DNS and HTTP are typically used
   to access an "http" URL's resource when it can't be found in a local
   cache).

   A URN differs from a URL in that its primary purpose is persistent
   labeling of a resource with an identifier.  That identifier is drawn
   from one of a set of defined namespaces, each of which has its own
   set name structure and assignment procedures.  The "urn" scheme has
   been reserved to establish the requirements for a standardized URN
   namespace, as defined in "URN Syntax" [RFC2141] and its related
   specifications.

   Most of the examples in this specification demonstrate URL, since
   they allow the most varied use of the syntax and often have a
   hierarchical namespace.  A parser of the URI syntax is capable of
   parsing both URL and URN references as a generic URI; once the scheme
   is determined, the scheme-specific parsing can be performed on the
   generic URI components.  In other words, the URI syntax is a superset
   of the syntax of all URI schemes.

1.3. Example URI

   The following examples illustrate URI that are in common use.

   ftp://ftp.is.co.za/rfc/rfc1808.txt
      -- ftp scheme for File Transfer Protocol services

   gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles
      -- gopher scheme for Gopher and Gopher+ Protocol services

   http://www.math.uio.no/faq/compression-faq/part1.html
      -- http scheme for Hypertext Transfer Protocol services

   mailto:mduerst@ifi.unizh.ch
      -- mailto scheme for electronic mail addresses

   news:comp.infosystems.www.servers.unix
      -- news scheme for USENET news groups and articles

   telnet://melvyl.ucop.edu/
      -- telnet scheme for interactive services via the TELNET Protocol

1.4. Hierarchical URI and Relative Forms

   An absolute identifier refers to a resource independent of the
   context in which the identifier is used.  In contrast, a relative
   identifier refers to a resource by describing the difference within a
   hierarchical namespace between the current context and an absolute
   identifier of the resource.

   Some URI schemes support a hierarchical naming system, where the
   hierarchy of the name is denoted by a "/" delimiter separating the
   components in the scheme. This document defines a scheme-independent
   `relative' form of URI reference that can be used in conjunction with
   a `base' URI (of a hierarchical scheme) to produce another URI. The
   syntax of hierarchical URI is described in Section 3; the relative
   URI calculation is described in Section 5.

1.5. URI Transcribability

   The URI syntax was designed with global transcribability as one of
   its main concerns. A URI is a sequence of characters from a very
   limited set, i.e. the letters of the basic Latin alphabet, digits,
   and a few special characters.  A URI may be represented in a variety
   of ways: e.g., ink on paper, pixels on a screen, or a sequence of
   octets in a coded character set.  The interpretation of a URI depends
   only on the characters used and not how those characters are
   represented in a network protocol.

   The goal of transcribability can be described by a simple scenario.
   Imagine two colleagues, Sam and Kim, sitting in a pub at an
   international conference and exchanging research ideas.  Sam asks Kim
   for a location to get more information, so Kim writes the URI for the
   research site on a napkin.  Upon returning home, Sam takes out the
   napkin and types the URI into a computer, which then retrieves the
   information to which Kim referred.

   There are several design concerns revealed by the scenario:

      o  A URI is a sequence of characters, which is not always
         represented as a sequence of octets.

      o  A URI may be transcribed from a non-network source, and thus
         should consist of characters that are most likely to be able to
         be typed into a computer, within the constraints imposed by
         keyboards (and related input devices) across languages and
         locales.

      o  A URI often needs to be remembered by people, and it is easier
         for people to remember a URI when it consists of meaningful
         components.

   These design concerns are not always in alignment.  For example, it
   is often the case that the most meaningful name for a URI component
   would require characters that cannot be typed into some systems.  The
   ability to transcribe the resource identifier from one medium to
   another was considered more important than having its URI consist of
   the most meaningful of components.  In local and regional contexts
   and with improving technology, users might benefit from being able to
   use a wider range of characters; such use is not defined in this
   document.

1.6. Syntax Notation and Common Elements

   This document uses two conventions to describe and define the syntax
   for URI.  The first, called the layout form, is a general description
   of the order of components and component separators, as in

      <first>/<second>;<third>?<fourth>

   The component names are enclosed in angle-brackets and any characters
   outside angle-brackets are literal separators.  Whitespace should be
   ignored.  These descriptions are used informally and do not define
   the syntax requirements.

   The second convention is a BNF-like grammar, used to define the
   formal URI syntax.  The grammar is that of [RFC822], except that "|"
   is used to designate alternatives.  Briefly, rules are separated from
   definitions by an equal "=", indentation is used to continue a rule
   definition over more than one line, literals are quoted with "",
   parentheses "(" and ")" are used to group elements, optional elements
   are enclosed in "[" and "]" brackets, and elements may be preceded
   with <n>* to designate n or more repetitions of the following
   element; n defaults to 0.

   Unlike many specifications that use a BNF-like grammar to define the
   bytes (octets) allowed by a protocol, the URI grammar is defined in
   terms of characters.  Each literal in the grammar corresponds to the
   character it represents, rather than to the octet encoding of that
   character in any particular coded character set.  How a URI is
   represented in terms of bits and bytes on the wire is dependent upon
   the character encoding of the protocol used to transport it, or the
   charset of the document which contains it.

   The following definitions are common to many elements:

      alpha    = lowalpha | upalpha

      lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
                 "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |