Network Working Group R. Hedberg
Request for Comments: 2654 Catalogix
Category: Experimental B. Greenblatt
Directory Tools and Application Services, Inc.
R. Moats
AT&T
M. Wahl
Innosoft International, Inc.
August 1999
A Tagged Index Object for use in the Common Indexing Protocol
Status of this Memo
This memo defines an Experimental Protocol for the Internet
community. It does not specify an Internet standard of any kind.
Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved.
Abstract
This document defines a mechanism by which information servers can
exchange indices of information from their databases by making use of
the Common Indexing Protocol (CIP). This document defines the
structure of the index information being exchanged, as well as the
appropriate meanings for the headers that are defined in the Common
Indexing Protocol. It is assumed that the structures defined here
can be used by X.500 DSAs, LDAP servers, Whois++ servers, CSO Ph
servers and many others.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
4. The Tagged Index Object . . . . . . . . . . . . . . . . . . . . 5
4.1. The Agreement . . . . . . . . . . . . . . . . . . . . . . . . 5
4.2. Content Type . . . . . . . . . . . . . . . . . . . . . . . . 8
4.3 Tagged Index BNF . . . . . . . . . . . . . . . . . . . . . . . 9
4.3.1. Header Descriptions . . . . . . . . . . . . . . . . . . . .10
4.3.2. Tokenization types . . . . . . . . . . . . . . . . . . . .11
4.3.3. Tag Conventions . . . . . . . . . . . . . . . . . . . . . .11
4.4. Incremental Indexing . . . . . . . . . . . . . . . . . . . .12
Hedberg, et al. Experimental [Page 1]
RFC 2654 Tagged Index Object for use in CIP August 1999
5. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . .13
5.1 The original database . . . . . . . . . . . . . . . . . . . .13
5.1.1 "complete" consistency based full update . . . . . . . . . .14
5.1.2 "tag" consistency based full update . . . . . . . . . . . .14
5.1.3 "unique" consistency based full update . . . . . . . . . . .15
5.2 First update . . . . . . . . . . . . . . . . . . . . . . . . .16
5.2.1 "complete" consistency based incremental update . . . . . .16
5.2.2 "tag" consistency based incremental update . . . . . . . .17
5.2.3 "unique" consistency based incremental update . . . . . . .17
5.3 Second update . . . . . . . . . . . . . . . . . . . . . . . .18
5.3.1 "complete" consistency based incremental update . . . . . .18
5.3.2 "tag" consistency based incremental update . . . . . . . . .19
5.3.3 "unique" consistency based incremental update . . . . . . .20
6. Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . .21
6.1 Aggregation of Tagged Index Objects . . . . . . . . . . . . .21
7. Security Considerations . . . . . . . . . . . . . . . . . . . .21
8. References . . . . . . . . . . . . . . . . . . . . . . . . . .22
9. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . .23
Full Copyright Statement . . . . . . . . . . . . . . . . . . . . .24
1. Introduction
The Common Indexing Protocol (CIP) as defined in [1] proposes a
mechanism for distributing searches across several instances of a
single type of search engine to create a global directory. CIP
provides a scalable, flexible scheme to tie individual databases into
distributed data warehouses that can scale gracefully with the growth
of the Internet. CIP provides a mechanism for meeting these goals
that is independent of the access method that is used to access the
data that underlies the indices. Separate from CIP is the definition
of the Index Object that is used to contain the information that is
exchanged among Index Servers. One such Index Object that has
already been defined is the Centroid that is derived from the Whois++
protocol [2].
The Centroid does not meet all the requirements for the exchange of
index information amongst information servers. For example, it does
not support the notion of incremental updates natively. For
information servers that contain millions of records in their
database, constant exchange of complete dredges of the database is
bandwidth intensive. The Tagged Index Object is specifically
designed to support the exchange of index update information. This
design comes at the cost of an increase in the size of the index
object being exchanged. The Centroid is also not tailored to always
be able to give boolean answers to queries. In the Centroid Model,
"an index server will take a query in standard Whois++ format, search
its collections of centroids and other forward information, determine
which servers hold records which may fill that query, and then
notifies the user's client of the next servers to contact to submit
the query." [2] Thus, the exchange of Centroids amongst index servers
allows hints to be given about which information server actually
contains the information. The Tagged Index Object labels the various
pieces of information with identifiers that tie the individual object
attributes back to an object as a whole. This "tagging" of
information allows an index server to be more capable of directing a
specific query to the appropriate information server. Again, this
feature is added to the Tagged Index Object at the expense of an
increase in the size of the index object.
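As an illustration of the difference, the following Python sketch
contrasts a Centroid-style lookup with a tagged lookup. It is not
part of this specification: the record data, the function names, the
integer tags, and the choice to treat each whole attribute value as a
single token are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records; the integer keys play the role of tags.
RECORDS = {
    1: {"givenName": "Tim", "sn": "Howes"},
    2: {"givenName": "Fred", "sn": "Smith"},
}


def build_centroid(records):
    """Per attribute, the set of tokens seen anywhere in the database."""
    centroid = defaultdict(set)
    for attrs in records.values():
        for name, value in attrs.items():
            centroid[name].add(value.lower())
    return centroid


def build_tagged_index(records):
    """Per (attribute, token), the set of tags of the objects holding it."""
    index = defaultdict(set)
    for tag, attrs in records.items():
        for name, value in attrs.items():
            index[(name, value.lower())].add(tag)
    return index


def centroid_may_match(centroid, query):
    """True if every queried token occurs somewhere; this is only a hint."""
    return all(
        value.lower() in centroid.get(name, set())
        for name, value in query.items()
    )


def tagged_match(index, query):
    """Tags of the objects that hold every queried token."""
    tag_sets = [
        index.get((name, value.lower()), set())
        for name, value in query.items()
    ]
    return set.intersection(*tag_sets) if tag_sets else set()


centroid = build_centroid(RECORDS)
index = build_tagged_index(RECORDS)
query = {"givenName": "Tim", "sn": "Smith"}

print(centroid_may_match(centroid, query))  # True: a false positive hint
print(tagged_match(index, query))           # set(): no object has both
```

With these two records, the Centroid-style check passes because both
tokens occur somewhere in the database, while the tagged lookup
returns no tags because no single object holds both tokens.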
2. Background
The Lightweight Directory Access Protocol (LDAP) is defined in [3],
and it defines a mechanism for accessing a collection of information
arranged hierarchically in such a way as to provide a globally
distributed database which is normally called the Directory
Information Tree (DIT). Some distinguishing characteristics of LDAP
servers are that normally, several servers cooperate to manage a
common subtree of the DIT. LDAP servers are expected to respond to
requests that pertain to portions of the DIT for which they have
data, as well as for those portions for which they have no
information in their database. For example, the LDAP server for a
portion of the DIT in the United States (c=US) must be able to
provide a response to a Search operation that pertains to a portion
of the DIT in Sweden (c=se). Normally, the response given will be a
referral to another LDAP server that is expected to be more
knowledgeable about the appropriate subtree. However, there is no
mechanism that currently enables these LDAP servers to refer the LDAP
client to the supposedly more knowledgeable server. Typically, an
LDAP (v3) server is configured with the name of exactly one other
LDAP server to which all LDAP clients are referred when their
requests fall outside the subtree of the DIT for which that LDAP
server has knowledge. This specification defines a mechanism whereby
LDAP servers can exchange index information that will allow referrals
to point towards a clearly accurate destination.
The X.500 series of recommendations defines the Directory Information
Shadowing Protocol (DISP) [4] which allows X.500 DSAs to exchange
information in the DIT. Shadowing allows various information from
various portions of the DIT to be replicated amongst participating
DSAs. DISP is designed for the exchange of entire portions of the
DIT, whereas CIP and the Tagged Index Object are optimized for the
exchange of structural index information about the DIT and for
improving the performance of tree navigation amongst various
information servers. The Tagged Index
Object is more appropriate for the exchange of index information than
is DISP. DISP is more targeted at DIT distribution and fault
tolerance. DISP is thus more appropriate for the exchange of the
data in order to spread the load amongst several information servers.
DISP is tailored specifically to X.500 (and other hierarchical
directory systems), while the Tagged Index Object and CIP can be used
in a wide variety of information server environments.
While DISP allows an individual directory server to collect
information about large parts of the DIT, it would require a huge
database to collect all the replicas for a significant portion of the
DIT. Furthermore, as X.525 states: "Before shadowing can occur, an
agreement, covering the conditions under which shadowing may occur is
required. Although such agreements may be established in a variety
of ways, such as policy statements covering all DSAs within a given
DMD ...", where a DMD is a Directory Management Domain. Such an
agreement is needed because the data in the DIT itself is being
exchanged amongst DSAs, rather than only the information required to
maintain an Index.
In many environments such an agreement is not appropriate, and to
collect information for a meaningful portion of the DIT, many
agreements may need to be arranged.
3. Object
What is desired is to have an information server (or network of
information servers) that can quickly respond to real world requests,
like:
- What is Tim Howes's email address? This is much harder than:
What email address does Tim Howes at Netscape have?
- What is the X.509 certificate for Fred Smith at compuserve.com?
One certainly doesn't want to search CompuServe's entire
directory tree to find out this one piece of information. I
also don't want to have to shadow the entire CompuServe
directory subtree onto my server. If this request is being made
because Fred is trying to log into my server, I'd certainly want
to be able to respond to the BIND in real time.
- Who are all the people at Novell that have a title of
programmer?
All these requests can reasonably be translated into LDAP, Whois++,
or other directory access protocol queries. They can also be
serviced in a straightforward way by the user's home information
server if it has the appropriate reference information into the
database that contains the source data. Here, the first server would
be able to "chain" the request for the user. Alternatively, a
precise referral could be returned. If the home information server
wants to service (i.e., chain) the request based on the index
information that it has on hand, this servicing could be done by
several different means:
- issuing LDAP operations to the remote directory server
- issuing DSP operations to the remote directory server
- issuing DAP operations to the remote directory server
- issuing Whois++ operations to the remote Whois++ server
- ...
4. The Tagged Index Object
This section defines a Tagged Index Object that can be exchanged by
Information Servers using CIP. While often it is acceptable for
Information Servers to make use of the Centroid definition (from [2])
to exchange index information, the goals in defining a new construct
are multi-pronged:
- When the Information Server receives a search request that
warrants that a referral be returned, allow the server to return
a referral that will point the client to a server that is most
likely able to answer the request correctly. False positive
referrals (the search turns up hits in the index object that
generate referrals to servers that don't hold the desired
information) can be reduced, depending on the choice of
attribute tokenization types that are used.
- Potentially allow incremental updates that will then consume
substantially less bandwidth than if full updates always had to
be used.
4.1. The Agreement
Before a Tagged Index Object can be exchanged, the organization that
administers the object supplier and the organization that administers
the object consumer must reach an agreement on how the servers will
communicate. This agreement contains the following:
- "index-type": This specification describes the index type
"x-tagged-index-1".
- "dsi": An OID that uniquely identifies the subtree and scope.
This field is not explicitly necessary, as it may not provide
information beyond what is contained in the "base-uri" below.
- "base-uri": One or more URI's that will form the base of any
referrals created based on the index object that is governed by
this agreement. For example, in the LDAP URL format [8] the
base-uri would specify (among other items): the LDAP host, the
base object to which this index object refers (e.g. c=SE), and
the scope of the index object (e.g. single container).
- "supplier": The hostname and listening portnumber of the
supplier server, as well as any alternative servers holding the
same naming contexts, for use if the supplier is unavailable.
- "consumeraddr": This is a URI of the "mailto:" form, with the
RFC 822 email address of the consumer server. Future versions
of this draft may allow other forms of URI, so that the consumer
may retrieve the update via the WWW, FTP or CIP.
- "updateinterval": The maximum duration in seconds between
occurrences of the supplier server generating an update. If the
consumer server has not received an update from the supplier
server after waiting this long since the previous update, it is
likely that the index information is now out of date. A typical
value for a server with frequent updates would be 604800
seconds, or every week. Servers whose DITs are only modified
annually could have a much longer update interval.
- "attributeNamespace": Every set of index servers that together
wants to support a specific usage of indices has to agree on
which attribute names to use in the index objects. The
participating directory servers also have to agree on the mapping
from local attribute names to the attribute names used in the
index. Since one specific index server might be involved in
several such sets, it has to have some way to connect an update
to the proper set of indexes. One possible solution to this
would be to use different DSIs.
- "consistencybase": How consistency of the index is maintained
over incremental updates:
"complete" - every change or delete concerning one object
has to contain all tokens connected to that object. This
method must be supported by any server that wants to comply
with this standard.
"tag" - starting at a full update every incremental update
referring back to this full update has to maintain state
information regarding tags, such that an object within the
original database is assigned the same tag number every time.
This method is optional.
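The agreement fields described so far can be pictured as a simple
record. The following Python sketch is illustrative only and is not
part of this specification: the class name, the is_stale helper, the
host names, the addresses, and the choice to pass timestamps as plain
seconds are all assumptions made for the example. Only the field
names and the 604800-second weekly interval come from the text above.

```python
from dataclasses import dataclass
from typing import List

# One week in seconds, the "typical value" quoted for updateinterval.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


@dataclass
class Agreement:
    """Hypothetical in-memory form of the agreement fields listed above."""
    index_type: str
    base_uri: List[str]
    supplier: str
    consumeraddr: str
    updateinterval: int               # maximum seconds between updates
    consistencybase: str = "complete"
    dsi: str = ""                     # may add nothing beyond base_uri

    def is_stale(self, last_update: float, now: float) -> bool:
        """True if more than updateinterval seconds have passed."""
        return now - last_update > self.updateinterval


# Example values are invented for illustration.
agreement = Agreement(
    index_type="x-tagged-index-1",
    base_uri=["ldap://ldap.example.com/c=SE??one"],
    supplier="ldap.example.com:389",
    consumeraddr="mailto:index@example.org",
    updateinterval=SECONDS_PER_WEEK,
)

print(SECONDS_PER_WEEK)                 # 604800
print(agreement.is_stale(0, 604801))    # True: one second past the limit
```

A consumer could run such a check periodically and treat the index
from that supplier as probably out of date once it returns true.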