Network Working Group                                     T. Berners-Lee
Request for Comments: 1630                                          CERN
Category: Informational                                        June 1994


                 Universal Resource Identifiers in WWW

                A Unifying Syntax for the Expression of
             Names and Addresses of Objects on the Network
                     as used in the World-Wide Web

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

IESG Note:

   Note that the work contained in this memo does not describe an
   Internet standard.  An Internet standard for general Resource
   Identifiers is under development within the IETF.

Introduction

   This document defines the syntax used by the World-Wide Web
   initiative to encode the names and addresses of objects on the
   Internet.  The web is considered to include objects accessed using an
   extendable number of protocols, existing, invented for the web
   itself, or to be invented in the future.  Access instructions for an
   individual object under a given protocol are encoded into forms of
   address string.  Other protocols allow the use of object names of
   various forms.  In order to abstract the idea of a generic object,
   the web needs the concepts of the universal set of objects, and of
   the universal set of names or addresses of objects.

   A Universal Resource Identifier (URI) is a member of this universal
   set of names in registered name spaces and addresses referring to
   registered protocols or name spaces.  A Uniform Resource Locator
   (URL), defined elsewhere, is a form of URI which expresses an address
   which maps onto an access algorithm using network protocols. Existing
   URI schemes which correspond to the (still mutating) concept of IETF
   URLs are listed here.
   The Uniform Resource Name (URN) debate attempts
   to define a name space (and presumably resolution protocols) for
   persistent object names. This area is not addressed by this document,
   which is written in order to document existing practice and provide a
   reference point for URL and URN discussions.

Berners-Lee                                                     [Page 1]

RFC 1630                      URIs in WWW                      June 1994

   The world-wide web protocols are discussed on the mailing list
   www-talk-request@info.cern.ch and the newsgroup comp.infosystems.www
   is preferable for beginner's questions. The mailing list
   uri-request@bunyip.com has discussion related particularly to the URI
   issue.  The author may be contacted as timbl@info.cern.ch.

   This document is available in hypertext form at:

   http://info.cern.ch/hypertext/WWW/Addressing/URL/URI_Overview.html

The Need For a Universal Syntax

   This section describes the concept of the URI and does not form part
   of the specification.

   Many protocols and systems for document search and retrieval are
   currently in use, and many more protocols or refinements of existing
   protocols are to be expected in a field whose expansion is explosive.

   These systems are aiming to achieve global search and readership of
   documents across differing computing platforms, and despite a
   plethora of protocols and data formats.  As protocols evolve,
   gateways can allow global access to remain possible. As data formats
   evolve, format conversion programs can preserve global access.  There
   is one area, however, in which it is impractical to make conversions,
   and that is in the names and addresses used to identify objects.
   This is because names and addresses of objects are passed on in so
   many ways, from the backs of envelopes to hypertext objects, and may
   have a long life.

   A common feature of almost all the data models of past and proposed
   systems is something which can be mapped onto a concept of "object"
   and some kind of name, address, or identifier for that object.  One
   can therefore define a set of name spaces in which these objects can
   be said to exist.

   Practical systems need to access and mix objects which are part of
   different existing and proposed systems.  Therefore, the concept of
   the universal set of all objects, and hence the universal set of
   names and addresses, in all name spaces, becomes important.  This
   allows names in different spaces to be treated in a common way, even
   though names in different spaces have differing characteristics, as
   do the objects to which they refer.

   URIs

      This document defines a way to encapsulate a name in any
      registered name space, and label it with the name space,
      producing a member of the universal set.  Such an encoded and
      labelled member of this set is known as a Universal Resource
      Identifier, or URI.

      The universal syntax allows access of objects available using
      existing protocols, and may be extended with technology.

      The specification of the URI syntax does not imply anything about
      the properties of names and addresses in the various name spaces
      which are mapped onto the set of URI strings.  The properties
      follow from the specifications of the protocols and the associated
      usage conventions for each scheme.

   URLs

      For existing Internet access protocols, it is necessary in most
      cases to define the encoding of the access algorithm into
      something concise enough to be termed address.
      URIs which refer
      to objects accessed with existing protocols are known as "Uniform
      Resource Locators" (URLs) and are listed here as used in WWW, but
      to be formally defined in a separate document.

   URNs

      There is currently a drive to define a space of more persistent
      names than any URLs.  These "Uniform Resource Names" are the
      subject of an IETF working group's discussions.  (See Sollins and
      Masinter, Functional Specifications for URNs, circulated
      informally.)

      The URI syntax and URL forms have been in widespread use by
      World-Wide Web software since 1990.

Design Criteria and Choices

   This section is not part of the specification: it is simply an
   explanation of the way in which the specification was derived.

   Design criteria

      The syntax was designed to be:

      Extensible              New naming schemes may be added later.

      Complete                It is possible to encode any naming
                              scheme.

      Printable               It is possible to express any URI using
                              7-bit ASCII characters so that URIs may,
                              if necessary, be passed using pen and ink.

   Choices for a universal syntax

      For the syntax itself there is little choice except for the order
      and punctuation of the elements, and the acceptable characters and
      escaping rules.

      The extensibility requirement is met by allowing an arbitrary (but
      registered) string to be used as a prefix.  A prefix is chosen as
      left to right parsing is more common than right to left.  The
      choice of a colon as separator of the prefix from the rest of the
      URI was arbitrary.

      The decoding of the rest of the string is defined as a function of
      the prefix.
      New prefixes are introduced for new schemes as
      necessary, in agreement with the registration authority.  The
      registration of a new scheme clearly requires the definition of
      the decoding of the URI into a given name space, and a definition
      of the properties and, where applicable, resolution protocols, for
      the name space.

      The completeness requirement is easily met by allowing
      particularly strange or plain binary names to be encoded in base
      16 or 64 using the acceptable characters.

      The printability requirement could have been met by requiring all
      schemes to encode characters not part of a basic set.  This led to
      many discussions of what the basic set should be.  A difficult
      case, for example, is when an ISO Latin-1 string appears in a URL,
      and within an application with ISO Latin-1 capability, it can be
      handled intact.  However, for transport in general, the non-ASCII
      characters need to be escaped.

      The solution to this was to specify a safe set of characters, and
      a general escaping scheme which may be used for encoding "unsafe"
      characters.  This "safe" set is suitable, for example, for use in
      electronic mail.  This is the canonical form of a URI.

      The choice of escape character for introducing representations of
      non-allowed characters also tends to be a matter of taste.  An
      ANSI standard exists in the C language, using the back-slash
      character "\".  The use of this character on unix command lines,
      however, can be a problem as it is interpreted by many shell
      programs, and would itself have to be escaped.  It is also a
      character which is not available on certain keyboards.  The equals
      sign is commonly used in the encoding of names having
      attribute=value pairs.
      The percent sign was eventually chosen as
      a suitable escape character.

      There is a conflict between the need to be able to represent many
      characters including spaces within a URI directly, and the need to
      be able to use a URI in environments which have limited character
      sets or in which certain characters are prone to corruption.  This
      conflict has been resolved by use of a hexadecimal escaping
      method which may be applied to any characters forbidden in a given
      context.  When URLs are moved between contexts, the set of
      characters escaped may be enlarged or reduced unambiguously.

      The use of white space characters is risky in URIs to be printed
      or sent by electronic mail, and the use of multiple white space
      characters is very risky.  This is because of the frequent
      introduction of extraneous white space when lines are wrapped by
      systems such as mail, or sheer necessity of narrow column width,
      and because of the inter-conversion of various forms of white
      space which occurs during character code conversion and the
      transfer of text between applications.  This is why the canonical
      form for URIs has all white spaces encoded.

Recommendations

   This section describes the syntax for URIs as used in the World-Wide
   Web initiative.  The generic syntax provides a framework for new
   schemes for names to be resolved using as yet undefined protocols.

URI syntax

   A complete URI consists of a naming scheme specifier followed by a
   string whose format is a function of the naming scheme.  For locators
   of information on the Internet, a common syntax is used for the IP
   address part. A BNF description of the URL syntax is given in a
   later section. The components are as follows.
   Fragment identifiers
   and relative URIs are not involved in the basic URL definition.

   SCHEME

      Within the URI of an object, the first element is the name of the
      scheme, separated from the rest of the object by a colon.

   PATH

      The rest of the URI follows the colon in a format depending on the
      scheme. The path is interpreted in a manner dependent on the
      protocol being used.  However, when it contains slashes, these
      must imply a hierarchical structure.

Reserved characters

   The path in the URI has a significance defined by the particular
   scheme.  Typically, it is used to encode a name in a given name
   space, or an algorithm for accessing an object.  In either case, the
   encoding may use those characters allowed by the BNF syntax, or
   hexadecimal encoding of other characters.

   Some of the reserved characters have special uses as defined here.

   THE PERCENT SIGN

      The percent sign ("%", ASCII 25 hex) is used as the escape
      character in the encoding scheme and is never allowed for anything
      else.

   HIERARCHICAL FORMS

      The slash ("/", ASCII 2F hex) character is reserved for the
      delimiting of substrings whose relationship is hierarchical.  This
      enables partial forms of the URI.  Substrings consisting of single
      or double dots ("." or "..") are similarly reserved.

      The significance of the slash between two segments is that the
      segment of the path to the left is more significant than the
      segment of the path to the right.
      ("Significance" in this case
      refers solely to closeness to the root of the hierarchical
      structure and makes no value judgement!)

      Note

         The similarity to unix and other disk operating system filename
         conventions should be taken as purely coincidental, and should
         not be taken to indicate that URIs should be interpreted as
         file names.

   HASH FOR FRAGMENT IDENTIFIERS

      The hash ("#", ASCII 23 hex) character is reserved as a delimiter
      to separate the URI of an object from a fragment identifier.

   QUERY STRINGS

      The question mark ("?", ASCII 3F hex) is used to delimit the
      boundary between the URI of a queryable object, and a set of words
      used to express a query on that object.  When this form is used,
      the combined URI stands for the object which results from the
      query being applied to the original object.

      Within the query string, the plus sign is reserved as shorthand
      notation for a space.  Therefore, real plus signs must be encoded.
      This method was used to make query URIs easier to pass in systems
      which did not allow spaces.

      The query string represents some operation applied to the object,
      but this specification gives no common syntax or semantics for it.
      In practice the syntax and semantics may depend on the scheme and
      may even depend on the base URI.

   OTHER RESERVED CHARACTERS

      The asterisk ("*", ASCII 2A hex) and exclamation mark ("!", ASCII
      21 hex) are reserved for use as having special significance within
      specific schemes.

Unsafe characters

   In canonical form, certain characters such as spaces, control
   characters, some characters whose ASCII code is used differently in
   different national character variant 7-bit sets, and all 8-bit
   characters beyond DEL (7F hex) of the ISO Latin-1 set, shall not be
   used unencoded. This is a recommendation for trouble-free
   interchange, and as indicated below, the encoded set may be extended
   or reduced.
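
   The reserved characters and the hexadecimal escaping described above
   can be illustrated with a short Python sketch.  It is not part of the
   memo.  The function names (escape, unescape, split_uri), the SAFE set,
   and the example.org URI are assumptions made for illustration only;
   the SAFE set is a deliberately small subset and does not reproduce
   the memo's BNF.  Input is assumed to be ISO Latin-1 text.

```python
import string

# Illustrative "safe" set (an assumption, not the memo's BNF): letters,
# digits, and a few punctuation marks.  "%" and "+" are left out on
# purpose, so real percent and plus signs are always escaped.
SAFE = set(string.ascii_letters + string.digits + "$-_@.&")


def escape(text, safe=SAFE):
    """Percent-escape every ISO Latin-1 byte of text that is not in safe."""
    out = []
    for byte in text.encode("latin-1"):
        ch = chr(byte)
        out.append(ch if ch in safe else "%{:02X}".format(byte))
    return "".join(out)


def unescape(text):
    """Reverse escape(): turn each %XX triplet back into one character."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return out.decode("latin-1")


def split_uri(uri):
    """Split a URI into (scheme, path, query_words, fragment)."""
    rest, _, fragment = uri.partition("#")   # hash delimits the fragment
    rest, _, query = rest.partition("?")     # question mark delimits the query
    scheme, _, path = rest.partition(":")    # colon ends the scheme name
    # Split on "+" (shorthand for a space) before unescaping, so that an
    # encoded plus sign (%2B) inside a word is not treated as a separator.
    words = [unescape(w) for w in query.split("+")] if query else []
    return scheme, path, words, fragment or None


print(escape("a b+c"))
print(split_uri("http://example.org/index?network+names#top"))
```

   Splitting the query on the plus sign before unescaping is the design
   choice that lets a word contain a literal plus sign, provided the
   sender encoded it as %2B.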