Network Working Group                                          T. Hardie
Request for Comments: 2655                                       Equinix
Category: Experimental                                         M. Bowman
                                                                Transarc
                                                                D. Hardy
                                                                Netscape
                                                             M. Schwartz
                                                           Affinia, Inc.
                                                              D. Wessels
                                                                   NLANR
                                                             August 1999


                CIP Index Object Format for SOIF Objects

Status of this Memo

   This memo defines an Experimental Protocol for the Internet
   community.  It does not specify an Internet standard of any kind.
   Discussion and suggestions for improvement are requested.
   Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

1.  Abstract

   The Common Indexing Protocol (CIP) allows servers to form a referral
   mesh for query handling by defining a mechanism by which cooperating
   servers exchange hints about the searchable indices they maintain.
   The structure and transport of CIP are described in (Ref. 1), as are
   general rules for the definition of index object types.  This
   document describes SOIF, the Summary Object Interchange Format, as an
   index object type in the context of the CIP framework.  SOIF is a
   machine-readable syntax for transmitting structured summary objects,
   currently used primarily in the context of the World Wide Web.

   Query referral has often been dismissed as an ineffective strategy
   for handling searches of Web resources, and Web resources certainly
   present challenges not present in structured directory services like
   Rwhois.  In situations where a keyword-based free text search is
   desired, query referral is not likely to be effective because the
   query will probably be routed to every server participating in the
   referral mesh.  Where a search can be limited by reference to a
   specific resource attribute, however, query referral is an effective
   tool.  SOIF can be used to create such a known-attribute query mesh
   because it provides a method for associating attributes with
   net-addressable resources.

Hardie, et al.                Experimental                      [Page 1]

RFC 2655        CIP Index Object Format for SOIF Objects     August 1999

1.1 History

   SOIF was first defined by the Harvest project [Ref 2.] in January
   1994.  SOIF was derived from a combination of the Internet Anonymous
   FTP Archives IETF Working Group (IAFA) templates [Ref 3.] and the
   BibTeX bibliography format [Ref 4.].  The combination was originally
   noted for its advantages of providing a convenient and intuitive way
   for delimiting objects within a stream, and setting apart the URL for
   easy object access or invocation, while still preserving
   compatibility with IAFA templates.

   Mic Bowman, Darren Hardy, Mike Schwartz, and Duane Wessels each
   contributed to the creation of the SOIF format as part of the Harvest
   Project; later work took place as part of the FIND working group.

2.  Name

   The index object described below will have the MIME type of
   application/index.obj.HARVEST-SOIF-1.

3.  Payload Format

   Each summary object has 3 fundamental components: a template type, a
   URL, and zero or more ATTRIBUTE-VALUE pairs.  Because the VALUEs in
   the ATTRIBUTE-VALUE pairs may contain arbitrary data (cf.
   Section 3.5), SOIF objects should be encoded in Base64 unless the
   template type unambiguously establishes that the VALUEs do not
   contain binary data.

3.1  Template Type

   The Template type is used to identify the set of ATTRIBUTEs contained
   within a particular SOIF object.  SOIF does not define the template
   types themselves; it only provides a way to associate the summary
   object with a predefined template type name.  Template types may be
   registered or unregistered.  Unregistered template types provide an
   indication of available ATTRIBUTE-VALUE pairs, but these may vary
   both according to the original resource and the method by which the
   summary object was generated.  Registered template types must refer
   to a formally specified description of all mandatory and optional
   ATTRIBUTE-VALUE pairs available for that type.  See [10] for a
   description of the process of registering template types with the
   IANA.

   Historically, the template types used by SOIF were derived from IAFA
   template types (Ref. 3).  SOIF objects generated by the Harvest
   system have a "FILE" template type; in current practice this is the
   most common template type.  The "FILE" template type is a generic
   template type meant to handle a large variety of web-based
   resources.  No formal specification of it is available, though a list
   of ATTRIBUTE-VALUE pairs common to the "FILE" template type is found
   in Appendix A.  "DOCUMENT" and "OBJECT" are other generic
   template-types.

   The use of unregistered template types obviously presents some
   problems to the correct operation of query referral.  Two efforts
   have been mounted to allow peer-to-peer agreement on the association
   of template types with specific attribute sets: Netscape's RDM (Ref.
   6) and the STARTS project (Ref. 7).
   Initially, CIP meshes based on systems which use unregistered
   template types may need to use these or similar methods to associate
   template types with specific attribute sets.

   Mesh operators are strongly encouraged, however, to migrate to
   registered template types as soon as is practical.  Registered
   template types allow CIP meshes to derive the definitions of
   attributes, which enables multiple-language interfaces to the base
   attributes.  In addition, registered template types allow CIP meshes
   and other users of SOIF to establish the permitted data types and
   encodings of the VALUEs associated with each ATTRIBUTE.  This makes
   deriving the appropriate matching semantics for a particular VALUE
   much more straightforward and eliminates the limitations of the
   default octet-by-octet matching (cf. Section 4).

3.2  URL

   Uniform Resource Locators (URLs) (Ref 5.) are used by SOIF as object
   IDENTIFIERs.  SOIF associates its summary objects with
   net-addressable resources by using the URL by which the resource was
   addressed as the initial field of the object body.  See section 3.4
   for the formal grammar associated with SOIF objects.

   This association allows the same resource to have multiple summary
   objects, differentiated only by the URL by which the resource was
   accessed.  This possibility does not, however, impact the usability
   of the URL as an object IDENTIFIER.  Furthermore, since it can be
   argued that the net address is a salient part of the metadata, there
   may be compensating benefits to using the URL as an object
   IDENTIFIER.

   As noted in Appendix A, the Harvest project used several additional
   identity attributes ("Gatherer-Name", "Gatherer-Host",
   "Gatherer-Port" and "Gatherer-Version") to further identify the
   provenance of a particular object.
   Within the context of CIP, it may be useful to identify the base
   sources of particular index objects; see Appendix B for one example
   of how a SOIF-based CIP hint could use the base source URL.

3.3  ATTRIBUTE-VALUE pairs

   Each summary object has zero or more ATTRIBUTE-VALUE pairs, which
   contain metadata about the net-addressable resource referenced by the
   URL.  Pairs are composed of an ATTRIBUTE IDENTIFIER, the length of
   the VALUE, a delimiter, and the VALUE.  It should be stressed that
   ATTRIBUTE-VALUE pairs are not CR/LF terminated, but parsed according
   to the grammar set out in section 3.4.  In the examples in Section
   3.6 and in many other representations of SOIF objects,
   ATTRIBUTE-VALUE pairs are represented on individual lines to enhance
   readability.  VALUEs may contain CR/LF, however, and implementors
   must be careful to parse the full VALUE.  Implementors of SOIF
   parsers MUST ignore <CR>, <LF>, <TAB>, <SPACE>, or other whitespace
   found between the VALUE of an ATTRIBUTE-VALUE pair and the
   ATTRIBUTE-IDENTIFIER of the subsequent pair.

   The SOIF syntax does not explicitly allow for a single ATTRIBUTE to
   have multiple VALUEs.  To handle multiple VALUEs for the same
   ATTRIBUTE, SOIF uses an ATTRIBUTE naming convention; a hyphen and
   positive integer are appended to the ATTRIBUTE name to create an
   ATTRIBUTE IDENTIFIER VALUE associated with a specific ATTRIBUTE.  For
   example, the ATTRIBUTE IDENTIFIERs "Author-1", "Author-2", and
   "Author-3" can be used to represent three VALUEs associated with the
   ATTRIBUTE "Author" where a specific resource has three authors.
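
   As an illustration only, the length-prefixed pair format and the
   hyphen-integer naming convention described in this section might be
   handled along the following lines.  This is a minimal sketch, not
   part of the specification: the names parse_pairs and group_values
   are hypothetical, the sketch assumes that VALUE-SIZE contains at
   least one digit, and it reads only a run of pairs, not the enclosing
   "@TEMPLATE-TYPE { URL ... }" wrapper.

```python
import re

# Assumed pair header: IDENTIFIER{VALUE-SIZE} followed by colon and tab.
# The sketch requires at least one digit in VALUE-SIZE.
PAIR_HEADER = re.compile(rb"([A-Za-z0-9_-]+)\{(\d+)\}:\t")

# Hyphen plus positive integer at the end of an ATTRIBUTE IDENTIFIER.
MULTI_SUFFIX = re.compile(r"^(.*?)-([1-9][0-9]*)$")


def parse_pairs(data: bytes) -> list[tuple[str, bytes]]:
    """Parse a run of ATTRIBUTE-VALUE pairs from raw octets (hypothetical helper)."""
    pairs = []
    pos = 0
    while True:
        # Ignore whitespace between a VALUE and the next IDENTIFIER.
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        if pos >= len(data):
            break
        header = PAIR_HEADER.match(data, pos)
        if header is None:
            raise ValueError(f"malformed pair at offset {pos}")
        size = int(header.group(2))
        start = header.end()
        # Read exactly VALUE-SIZE octets; the VALUE may itself contain CR/LF.
        value = data[start:start + size]
        if len(value) != size:
            raise ValueError("VALUE is shorter than VALUE-SIZE")
        pairs.append((header.group(1).decode("ascii"), value))
        pos = start + size
    return pairs


def group_values(pairs: list[tuple[str, bytes]]) -> dict[str, list[bytes]]:
    """Collect VALUEs under the ATTRIBUTE name, dropping any -N suffix."""
    grouped: dict[str, list[bytes]] = {}
    for identifier, value in pairs:
        suffix = MULTI_SUFFIX.match(identifier)
        base = suffix.group(1) if suffix else identifier
        grouped.setdefault(base, []).append(value)
    return grouped


if __name__ == "__main__":
    raw = b"Author-1{5}:\tAlice\nAuthor-2{3}:\tBob\nAbstract{11}:\tline1\nline2\n"
    print(group_values(parse_pairs(raw)))
```

   The sketch advances by VALUE-SIZE octets rather than by line, so a
   VALUE containing CR/LF is read whole, and whitespace is skipped only
   between pairs.
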
   See section 4 for the implications of this strategy on matching
   semantics.

3.4  SOIF Grammar

   The SOIF syntax is defined by the following grammar:

      SOIF            ::=  OBJECT SOIF |
                           OBJECT
      OBJECT          ::=  @ TEMPLATE-TYPE { URL ATTRIBUTE-LIST }
      TEMPLATE-TYPE   ::=  IDENTIFIER
      ATTRIBUTE-LIST  ::=  ATTRIBUTE ATTRIBUTE-LIST |
                           ATTRIBUTE |
                           NULL
      ATTRIBUTE       ::=  IDENTIFIER {VALUE-SIZE} DELIMITER VALUE
      URL             ::=  RFC1738-URL-Syntax | "-"
      IDENTIFIER      ::=  ALPHA-NUMERIC-STRING
      VALUE           ::=  ARBITRARY-DATA
      VALUE-SIZE      ::=  NUMERIC-STRING
      DELIMITER       ::=  ":<TAB>"

3.5  Grammar Description

   URL
      a Uniform Resource Locator encoded in the syntax defined by RFC
      1738 [3].  If the summary object has no URL associated with it,
      then a Latin-1 hyphen (octal \055) is used instead.

   IDENTIFIER
      an ASCII character string that only contains alphanumeric
      characters and hyphens or underscores.  IDENTIFIERs should avoid
      including hyphens followed by positive integers except when
      constructing multiple-VALUE ATTRIBUTE IDENTIFIERs.

   VALUE
      a buffer of VALUE-SIZE octets containing the VALUE.  The VALUE may
      contain data in arbitrary formats or encodings, which recipients
      recognize based on Template-Type.

   VALUE-SIZE
      a non-negative integer encoded as an ASCII character string.  The
      integer indicates how many octets the VALUE occupies after the
      DELIMITER.

   DELIMITER
      a two octet delimiter which is a Latin-1 colon (:) and a tab (\t),
      (octal \072\011).

   { }  the Latin-1 curly braces (octal \173 and \175) are used to wrap
      the VALUE-SIZE (no spaces) as well as the URL and ATTRIBUTE-LIST
      combination.

   @TEMPLATE-TYPE
      the Latin-1 @ (octal \100) and TEMPLATE-TYPE (no space between
      them) is used to mark the beginning of the SOIF object.

   NUMERIC-STRING
      Zero or more ASCII numerals.

   ALPHA-NUMERIC-STRING
      Zero or more ASCII letters or numerals, plus hyphens or
      underscore.  [a-z,A-Z,0-9,- and _].

   ARBITRARY-DATA
      Octets of data in arbitrary formats or encodings.

4.  Matching Semantics

   As was discussed in Section 1, query referral of SOIF objects will be
   most effective when a query identifies a particular ATTRIBUTE or set
   of ATTRIBUTEs as the target of the query match.  A query-identified
   ATTRIBUTE should be considered to match a SOIF ATTRIBUTE when a
   case-insensitive character-by-character comparison matches that
   portion of the ATTRIBUTE IDENTIFIER prior to any hyphen-integer
   suffix.  For example, a query which asks for a match on the ATTRIBUTE
   "author" should match the IDENTIFIERs "author", "Author", "AUTHOR",
   and "Author-1".  [10] discourages the registration of template types
   containing ATTRIBUTEs which have previously been registered with
   substantially different definitions.  This will help eliminate
   mis-referral, but a CIP mesh may nonetheless need to maintain a
   thesaurus matching ATTRIBUTEs from particular template-types to those
   of other, especially unregistered, template-types.

   The matching semantics appropriate for a particular VALUE are derived
   from its data type and encoding.  For VALUEs associated with
   ATTRIBUTEs which are part of a registered template type, the data
   type and encoding are readily available.  For VALUEs associated with
   ATTRIBUTEs associated with unregistered template-types, an
   octet-by-octet comparison is the default.
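
   The ATTRIBUTE matching rule and the default VALUE comparison
   described in this section might be sketched as follows.  This is an
   illustration under stated assumptions, not a normative algorithm:
   the function names are hypothetical, and the sketch assumes that
   the query names a bare ATTRIBUTE with no hyphen-integer suffix of
   its own.

```python
import re

# Hyphen followed by a positive integer at the end of an IDENTIFIER.
_HYPHEN_INTEGER = re.compile(r"-[1-9][0-9]*$")


def base_name(identifier: str) -> str:
    """Return the part of an ATTRIBUTE IDENTIFIER before any -N suffix."""
    return _HYPHEN_INTEGER.sub("", identifier)


def attribute_matches(query_attribute: str, identifier: str) -> bool:
    """Case-insensitive comparison against the IDENTIFIER's base name."""
    # IDENTIFIERs are ASCII, so lower() is sufficient for case folding.
    return base_name(identifier).lower() == query_attribute.lower()


def value_matches_default(query_value: bytes, soif_value: bytes) -> bool:
    """Default for unregistered template types: octet-by-octet equality."""
    return query_value == soif_value


if __name__ == "__main__":
    for candidate in ("author", "Author", "AUTHOR", "Author-1", "Authors"):
        print(candidate, attribute_matches("author", candidate))
```

   Only a trailing hyphen-integer is stripped, so an IDENTIFIER such as
   "Gatherer-Name" is compared in full.
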
   In cases where previous experience has demonstrated that a
   particular ATTRIBUTE contains string data, a case-insensitive
   substring match may be used.  For example, in a query against the
   "AUTHOR" ATTRIBUTE of the generic "DOCUMENT" template type, the query
   VALUE "Garcia" should match the SOIF VALUEs "Garcia", "GARCIA", and
   "Jose Garcia y Montes".

   Over time, there may well emerge an understanding of which attributes
   tend to produce correct query referrals within a mesh.  As such
   understandings emerge, mesh maintainers may wish to define a
   particular SOIF TEMPLATE-TYPE which restricts included ATTRIBUTES to
