Network Working Group                                           J. Allen
Request for Comments: 2651                                WebTV Networks
Category: Standards Track                                    M. Mealling
                                                 Network Solutions, Inc.
                                                             August 1999


         The Architecture of the Common Indexing Protocol (CIP)

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

Abstract

   The Common Indexing Protocol (CIP) is used to pass indexing
   information from server to server in order to facilitate query
   routing. Query routing is the process of redirecting and replicating
   queries through a distributed database system towards servers holding
   the desired results. This document describes the CIP framework,
   including its architecture and the protocol specifics of exchanging
   indices.

1. Introduction

1.1. History and Motivation

   The Common Indexing Protocol (CIP) is an evolution and refinement of
   distributed indexing concepts first introduced in the Whois++
   Directory Service [RFC1913, RFC1914]. While indexing proved useful in
   that system to promote query routing, the centroid index object which
   is passed among Whois++ servers is specifically designed for
   template-based databases searchable by token-based matching.
   With alternative index objects, the index-passing technology will
   prove useful to many more application domains, not simply Directory
   Services and those applications which can be cast into the form of
   template collections.

Allen & Mealling            Standards Track                     [Page 1]

RFC 2651                  The CIP Architecture               August 1999

   The indexing part of Whois++ is integrated with the data access
   protocol. The goal in designing CIP is to extract the indexing
   portion of Whois++, while abstracting the index objects to apply more
   broadly to information retrieval. In addition, another kind of
   technology reuse has been undertaken by converting the ad-hoc data
   representations used by Whois++ into structures based on the MIME
   specification for structured Internet mail.

   Whois++ used a version number field in centroid objects to facilitate
   future growth. The initial version was "1". Version 1 of CIP (then
   embedded in Whois++, and not referred to separately as CIP) had
   support for only ISO-8859-1 characters, and for only the centroid
   index object type.

   Version 2 of the Whois++ centroid was used in the Digger software by
   Bunyip Information Systems to notify recipients that the centroid
   carried extra character set information. Digger's centroids can carry
   UTF-8 encoded 16-bit Unicode characters, or ISO-8859-1 characters,
   determined by a field in the headers.

   This specification is for CIP version 3.  Version 3 is a major
   overhaul to the protocol.  However, by using a short negotiation
   sequence, CIP version 3 servers can interoperate with earlier servers
   in an index-passing mesh.

   For unclear terms the reader is referred to the glossary in Appendix
   A.

1.2 CIP's place in the Information Retrieval world

   CIP facilitates query routing. CIP is a protocol used between servers
   in a network to pass hints which make data access by clients at a
   later date more efficient.
   Query routing is the act of redirecting and replicating queries
   through a distributed database system towards the servers holding
   the actual results via reference to indexing information.

   CIP is a "backend" protocol -- it is implemented in and "spoken" only
   among network servers. These same servers must also speak some kind
   of data access protocol to communicate with clients. During query
   resolution in the native protocol implementation, the server will
   refer to the indexing information collected by the CIP implementation
   for guidance on how to route the query.

   Data access protocols used with CIP must have some provision for
   control information in the form of a referral. The syntax and
   semantics of these referrals are outside the scope of this
   specification.

2. Related Documents

   This document is one of three documents. This document describes the
   fundamental concepts and framework of CIP.

   The document "MIME Object Definitions for the Common Indexing
   Protocol" [CIP-MIME] describes the MIME objects that make up the
   items that are passed by the transport system.

   Requirements and examples of several transport systems are specified
   in the "CIP Transport Protocols" [CIP-TRANSPORT] document.

   A second set of documents describes the various specifications for
   specific index types.

3. Architecture

3.1 CIP in the Information Retrieval World

3.1.1 Information Retrieval in the Abstract

   In order to better understand how CIP fits into the information
   retrieval world, we need to first understand the unifying abstract
   features of existing information retrieval technology. Next, we
   discuss why adding indexing technology to this model results in a
   system capable of query routing, and why query routing is useful.
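
   As an illustration of the query routing introduced in Section 1.2,
   the following Python sketch shows a server that answers a query from
   its local dataset and, by consulting indices received from peers,
   also returns referrals. It is a simplified model under stated
   assumptions, not part of the CIP specification: the PeerIndex and
   Server classes, the token-set index, and the "peer-a.example" style
   referral targets are all hypothetical, and the actual syntax of
   referrals is left to the data access protocol in use.

```python
from dataclasses import dataclass, field


@dataclass
class PeerIndex:
    """Hypothetical stand-in for an index received from a peer server."""
    referral_target: str                    # illustrative referral string
    tokens: set = field(default_factory=set)


@dataclass
class Server:
    """Toy server holding a local dataset plus indices from its peers."""
    local_records: dict                     # record id -> record text
    peer_indices: list = field(default_factory=list)

    def resolve(self, token):
        """Return (local hits, referrals) for a single-token query."""
        token = token.lower()
        # Complete answers come from the local dataset only.
        hits = sorted(
            rid for rid, text in self.local_records.items()
            if token in text.lower().split()
        )
        # Index information yields hints (referrals), not answers.
        referrals = [
            idx.referral_target
            for idx in self.peer_indices
            if token in idx.tokens
        ]
        return hits, referrals


server = Server(
    local_records={"r1": "Alice Smith", "r2": "Bob Jones"},
    peer_indices=[
        PeerIndex("whois://peer-a.example", {"smith", "carol"}),
        PeerIndex("whois://peer-b.example", {"dave"}),
    ],
)
hits, referrals = server.resolve("Smith")
print(hits, referrals)
```

   In this sketch the query for "Smith" produces one local hit and one
   referral, to the peer whose index contains that token. The peer whose
   index lacks the token is not named in the reply at all.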

   An abstract view of the client/server data retrieval process includes
   data sets and data access protocols. An individual server is
   responsible for handling queries over a fixed domain of data. For the
   purposes of CIP, we call this domain of data the dataset. Clients
   make searches in the dataset and retrieve parts of it via a data
   access protocol. There are many data access protocols, each optimized
   for the data in question. For instance, LDAP and Whois++ are access
   protocols that reflect the needs of the directory services
   application domain. Other data access protocols include HTTP and
   Z39.50.

3.1.2 Indexing Information Facilitates Query Routing

   The above description reflects a world without indexing, where no
   server knows about any other server. In some cases (as with X.500
   referrals, and HTTP redirects) a server will, as part of its reply,
   implicate another server in the process of resolving the query.
   However, those servers generate replies based solely on their local
   knowledge. When indexing information is introduced into a server's
   local database, the server now knows not only answers based on the
   local dataset, but also answers based on external indices. These
   indices come from peer servers, via an indexing protocol. CIP is one
   such indexing protocol.

   Replies based on index information may not be the complete answer.
   After all, an index is not a replicated version of the remote
   dataset, but a possibly reduced version of it. Thus, in addition to
   giving complete replies from the local dataset, the server may give
   referrals to other datasets. These referrals are the core feature
   necessary for effective query routing. When servers use CIP to pass
   indices from server to server, they make a kind of investment.
   At the cost of some resources to create, transmit and store the
   indices, query routing becomes possible.

   Query Routing is the process of replicating and moving a query closer
   to datasets which can satisfy the query. In some distributed systems,
   widely distributed searches must be accomplished by replicating the
   query to all sub-datasets. This approach can be wasteful of resources
   both in the network, and on the servers, and is thus sometimes
   explicitly disabled. Using indexing in such a system opens the door
   to more efficient distributed searching.

   While CIP-equipped servers provide the referrals necessary to make
   query routing work, it is always the client's responsibility to
   collate, filter, and chase the referrals it receives. This gives the
   end-user (or agent, in the case that there's no human user involved
   in the search) greatest control over the query resolution process.
   The cost of the added client complexity is weighed against the
   benefits of total control over query resolution. In some cases, it
   may also be possible to decouple the referral chasing from the client
   by introducing a proxy, allowing existing simple clients to make use
   of query routing. Such a proxy would transparently resolve referrals
   into concrete results before returning them to the simple-minded
   client.

3.1.3 Abstracting the CIP index object

   As useful as indices seem, the fact remains that not all queries can
   benefit from the same type of index. For example, say the index
   consists of a simple list of keywords. With such an index, it is
   impossible to answer queries about whether two keywords were near one
   another, or if a keyword was present in a certain context (for
   instance, in the title).

   Because of the need for application domain specific indices, CIP
   index objects are abstract; they must be defined by a separate
   specification.
   The basic protocols for moving index objects are widely applicable,
   but the specific design of the index, and the structure of the mesh
   of servers which pass a particular type of index is dependent on the
   application domain. This document describes only the protocols for
   moving indices among servers. Companion documents describe initial
   index objects.

   The requirements that index type specifications must address are
   specified in the [CIP-MIME] document.

3.2 Architectural Details

   CIP implements index passing, providing the forward knowledge
   necessary to generate the referrals used for query routing. The core
   of the protocol is the index object. In the following sections, the
   structure of the index objects themselves is presented. Next, how and
   why indices are passed from server to server is discussed. Finally,
   the circumstances under which a server may synthesize an index object
   based on incoming ones are discussed.

3.2.1 The CIP Index Object

   A CIP index object is composed of two parts, the header and the
   payload. The header contains metadata necessary to process and make
   use of the index object being transmitted. The actual index resides
   in the payload.

   Three particular headers warrant specific mention at this point.  The
   "type" of the index object selects one of many distinct CIP index
   object specifications which define exactly how the index blocks are
   to be created, parsed and used to facilitate query routing.  Another
   header of note is the "DSI", or Dataset Identifier, which uniquely
   identifies the dataset from which the index was created.  Another
   header that is crucial for generating referrals is the "Base-URI".
   The URI (or URIs) contained in this header form the basis of any
   referrals generated based on this index block.
   The URI is also used as input during the index aggregation process to
   constrain the kinds of aggregation possible, due to multiprotocol
   constraints.  How that URI is used is defined by the aggregation
   algorithm.  The exact syntax of these headers is specified in the CIP
   MIME specification document [CIP-MIME].

   The payload is opaque to CIP itself. It is defined exclusively by the
   index object specification associated with the object's MIME type.
   Specifications on how to parse and use the payload are published
   separately as "CIP index object specifications". This abstract
   definition of the index object forms the basis of CIP's applicability
   to indexing needs across multiple application domains.

   A precise definition of the content and form of a CIP index block can
   be found in the Protocol document [CIP-MIME].

3.2.2 Moving Index Objects: How to Build a Mesh

   Indices are transmitted among servers participating in a CIP mesh. By
   distributing this information in anticipation of a query, efficient,
   accurate query routing is possible at the time a query arrives.

   A CIP mesh is a set of CIP servers which pass indices of the same
   type among themselves. Typically, a mesh is arranged in a
   hierarchical tree fashion, with servers nearer the root of the tree
   having larger and more comprehensive indices. See Figure 1. However,
   a CIP mesh is explicitly allowed to have lateral links in it, and
   there may be more than one part of the mesh that has the properties
   of a "root". Mesh administrators are encouraged to avoid loops in the
   system, but they are not obliged to maintain a strict tree structure.

   Clients wishing to completely resolve all referrals they receive
   should protect against referral loops while attempting to traverse
   the mesh to avoid wasting time and network resources.
   See the section on "Navigating the Mesh" for a discussion of this.

     base level             index                    index
     directory             servers                  servers
      servers                for                      for
                          base level               lower-level
                           servers                index servers
     _______
    |       |
    |   A   |__
    |_______|  \            _______
                \---CIP----|       |
     _______               |   D   |__
    |       |   /---CIP----|_______|  \             ------
    |   B   |__/                       \--CIP------|      |
    |_______|                                      |  F   |