rfc2258.txt

RFC 2258              Internet Nomenclator Project          January 1998

   The distributed catalog service is logically one network service, but
   it can be divided into pieces that are distributed and/or replicated.
   Query resolvers access this distributed, replicated service using the
   same techniques that work for multiple data repositories.

   A Nomenclator system naturally includes many query resolvers.
   Resolvers are independent, but renewable, query agents that can be as
   powerful as the resources available at the user site.  Caching
   decreases the dependence of the resolver on the distributed catalog
   service for frequently used meta-data, and on data repositories for
   frequently used data.  Caching thus improves the number of users that
   can be supported and the local availability of the query service.

2.2 Meta-Data Techniques

   The active catalog structures the information space into a collection
   of relations about people, hosts, organizations, services and other
   objects.  It collects meta-data for each relation and structures it
   into "access functions" for locating and retrieving data.  Access
   functions respond to the question: "Where is data to answer this
   query?"  There are two types of responses corresponding to the two
   types of access functions.  The first type of response is: "Look over
   there."  "Catalog functions" return this response; they constrain the
   query search by limiting the data repositories contacted to those
   having data relevant to the query.  Catalog functions return a
   referral to data access functions that will answer the query or to
   additional catalog functions to contact for more detailed
   information.  The second response to "Where?" is: "Here it is!"  "Data
   access functions" return this response; they understand how to obtain
   query answers from specific data repositories.  They return tuples
   that answer the query.
   Nomenclator supplies access functions for common name services, such
   as the CCSO service, and organizations can write and supply access
   functions for data in their repositories.

   Access functions are implemented as remote or local services.  Remote
   access functions are services that are available through a standard
   remote procedure call interface.  Local access functions are
   functions that are supplied with the query resolver.  Local access
   functions can be applied to a variety of indexing and data retrieval
   tasks by loading them with meta-data stored in the distributed
   catalog service.  Remote access functions are preferred over local
   ones when the resources of the query resolver are inadequate to
   support the access function.  The owners of data may also choose to
   supply remote access functions for privacy reasons if their access
   functions use proprietary information or algorithms.  Local
   functions are preferred whenever possible, because they are highly
   replicated in resolver caches.  They can reduce system and network
   load by bringing the resources of the active catalog directly to the
   users.

Ordille                      Informational                      [Page 6]

   Remote access functions are simple to add to Nomenclator and local
   access functions are simple to apply to new data repositories,
   because the active catalog provides "referrals" that describe the
   conditions for using access functions.  For simplicity, this document
   describes referral techniques for exact matching of query strings.
   Extensions to these techniques in Nomenclator support matching query
   strings that contain wildcards or word-based matching of query
   strings in the style of the CCSO services.

   Each referral contains a template and a list of references to access
   functions.  The template is a conjunctive selection predicate that
   describes the scope of the access functions.
   Conjunctive queries that are within the scope of the template can be
   answered with the referral.  When a template contains a wildcard
   value ("*") for an attribute, the attribute must be present in any
   queries that are processed by the referral.  The system applies the
   following rule:

     Query Coverage Rule:
     If the set of tuples satisfying the selection predicate in a query
     is covered by (is a subset of) the set of tuples satisfying the
     template, then the query can be answered by the access functions in
     the reference list of the referral.

   For example, the query below:

     select * from People where country = "US" and surname = "Ordille";

   is covered by the following templates in Lines (1) through (3), but
   not by the templates in Lines (4) and (5):

      (1) country = "US" and surname = "*"
      (2) country = "US" and surname = "Ordille"
      (3) country = "US"
      (4) organization = "*"
      (5) country = "US" and surname = "Elliott"

   Referrals form a generalization/specialization graph for a relation
   called a "referral graph."  Referral graphs are a conceptual tool
   that guides the integration of different catalog functions into our
   system and that supplies a basis for catalog function construction
   and query processing.  A "referral graph" is a partial ordering of
   the referrals for a relation.  It is constructed using the
   subset/superset relationship: "S is a subset of G."  A referral S is
   a subset of referral G if the set of queries covered by the template
   of S is a subset of the set of queries covered by the template of G.
   S is considered a more specific referral than G; G is considered a
   more general referral than S.
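
   The Query Coverage Rule and the subset relationship between referral
   templates can be illustrated with a minimal sketch.  It assumes that
   queries and templates are plain dictionaries of attribute/value
   pairs read as conjunctions of equality tests.  The names covers and
   is_subset, and the dictionary representation, are illustrative
   assumptions and are not part of Nomenclator.

```python
WILDCARD = "*"


def covers(template, query):
    """Return True if `query` falls within the scope of `template`.

    Both arguments map attribute names to values and are read as
    conjunctions of equality tests. A template value of "*" matches any
    value, but the attribute must still appear in the query.
    """
    for attribute, wanted in template.items():
        if attribute not in query:
            return False
        if wanted != WILDCARD and query[attribute] != wanted:
            return False
    return True


def is_subset(specific, general):
    """Return True if template `specific` is a subset of template `general`."""
    # Every query covered by `specific` must also be covered by `general`;
    # treating `specific` as if it were a query gives exactly that test.
    return covers(general, specific)


# The example query and the five templates from the text.
query = {"country": "US", "surname": "Ordille"}
templates = [
    {"country": "US", "surname": "*"},        # (1)
    {"country": "US", "surname": "Ordille"},  # (2)
    {"country": "US"},                        # (3)
    {"organization": "*"},                    # (4)
    {"country": "US", "surname": "Elliott"},  # (5)
]
results = [covers(t, query) for t in templates]
print(results)  # [True, True, True, False, False]
```

   Under these assumptions the empty template is an empty dictionary,
   which covers every query and is a superset of every other template.
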
   For example, the subset relationship exists between the pairs of
   referrals with the templates listed below:

      (1) country = "US" and surname = "Ordille"
          is a subset of
          country = "US"

      (2) country = "US" and surname = "Ordille"
          is a subset of
          country = "US" and surname = "*"

      (3) country = "US" and surname = "*"
          is a subset of
          country = "US"

      (4) country = "US"
          is a subset of
          "empty template"

   but it does not exist between the pairs of referrals with the
   following templates:

      (5) country = "US"
          is not a subset of
          department = "CS"

      (6) country = "US" and name = "Ordille"
          is not a subset of
          country = "US" and name = "Elliott"

   In Lines (1) and (2), the more general referral covers more queries,
   because it covers queries that list different values for surname.  In
   Line (3), the more general referral covers more queries, because it
   covers queries that do not constrain surname to a value.  In Line
   (4), the specific referral covers only those queries that constrain
   the country to "US" while the empty template covers all queries.

   During query processing, wildcards in a template are replaced with
   the value of the corresponding attribute in the query.  For any query
   covered by two referrals S and G such that S is a subset of G, the
   set of tuples satisfying the template in S is covered by the set of
   tuples satisfying the template in G.  S is used to process the query,
   because it provides the more constrained (and faster) search space.
   The referral S has a more constrained logical search space than G,
   because the set of tuples in the scope of S is no larger, and often
   smaller, than the set in the scope of G.
   Moreover, S has a more constrained physical search space than G,
   because the data repositories that must be contacted for answers to
   S must also be contacted for answers to G, but additional data
   repositories may need to be contacted to answer G.

   In constraining a query, a catalog function always produces a
   referral that is more specific than the referral containing the
   catalog function.  Wildcards ("*") in a template indicate which
   attribute values are used by the associated catalog function to
   generate a more specific referral.  In other words, catalog functions
   always follow the rule:

      Catalog Function Constrained Search Rule:
      Given a referral R with a template t and a catalog function cf,
      and a query q covered by t, the result of using cf to process q,
      cf(q), is a referral R' with template t' such that q is covered
      by t' and R' is more specific than R.

   Catalog functions make it possible to import a portion of the indices
   for the information space into the query resolver.  Since they
   generate referrals, the resolver can cache the most useful referrals
   for a relation and call the catalog function as needed to generate
   new referrals.

   The resolver query processing algorithm obtains an initial set of
   referrals from the distributed catalog service.  It then navigates
   the referral graph, calling catalog functions as necessary to obtain
   additional referrals that narrow the search space.  Sometimes, two
   referrals that cover the query have the relationship of general to
   specific to each other.  The resolver eliminates unnecessary access
   function processing by using only the most specific referral along
   each path of the referral graph.

   The search space for the query is initially set to all the data
   repositories in the relation.
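
   The Catalog Function Constrained Search Rule, and the resolver's
   preference for the most specific referral, can be sketched as
   follows.  This is only an illustration under stated assumptions:
   templates are dictionaries, a referral is a (template, repositories)
   pair, and the catalog function consults a hypothetical in-memory
   index keyed by the values of the wildcard attributes.  None of these
   names or structures come from Nomenclator itself.

```python
WILDCARD = "*"


def covers(template, query):
    """True if every attribute test in `template` is met by `query`."""
    return all(
        attr in query and (value == WILDCARD or query[attr] == value)
        for attr, value in template.items()
    )


def more_specific(s, g):
    """True if template `s` is strictly more specific than template `g`."""
    return covers(g, s) and not covers(s, g)


def make_catalog_function(template, index):
    """Build a catalog function cf for a referral with `template`.

    `index` is a hypothetical mapping from a tuple of wildcard attribute
    values to the set of repositories that hold matching data.
    """
    wild = sorted(a for a, v in template.items() if v == WILDCARD)

    def cf(query):
        if not covers(template, query):
            raise ValueError("query is outside the scope of this referral")
        # Replace each wildcard with the query's value for that attribute.
        narrowed = dict(template)
        for attr in wild:
            narrowed[attr] = query[attr]
        key = tuple(query[attr] for attr in wild)
        return narrowed, index.get(key, frozenset())

    return cf


def prune(referrals):
    """Keep only referrals that no other referral is more specific than."""
    return [
        (t, repos)
        for t, repos in referrals
        if not any(more_specific(other, t) for other, _ in referrals)
    ]


general = {"country": "US", "surname": "*"}
index = {("Ordille",): frozenset({"repo-a"})}  # assumed sample meta-data
cf = make_catalog_function(general, index)

query = {"country": "US", "surname": "Ordille"}
narrowed, repos = cf(query)
all_repos = frozenset({"repo-a", "repo-b", "repo-c"})
kept = prune([(general, all_repos), (narrowed, repos)])
```

   In this sketch cf(query) returns a template that still covers the
   query and is strictly more specific than the original one, and prune
   discards the general referral once the narrowed one is available.
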
   As the resolver obtains referrals to sets of relevant data
   repositories (and their associated data access functions) it forms
   the intersection of the referrals to constrain the search space
   further.  The intersection of the referrals includes only those data
   repositories listed in all the referrals.  Intersection combines
   independent paths through the referral graph to derive benefit from
   indices on different attributes.

2.3 Meta-Data and Data Caching

   A Nomenclator query resolver caches the meta-data that result from
   calling catalog functions.  It also caches the responses for queries.
   If the predicate of a new query is covered by the predicate of a
   previous query, Nomenclator calculates the response for the new query
   from the cached response of the old query.  Nomenclator timestamps
   its cache entries to provide measures of the currentness of query
   responses and selective cache refresh.  The timestamps are used to
   calculate a t-bound on query responses [5][1].  A t-bound is the time
   after which changes may have occurred to the data that are not
   reflected in the query response.  It is the time of the oldest cache
   entry used to calculate the response.  Nomenclator returns a t-bound
   with each query response.  Users can request more current data by
   asking for responses that are more recent than this t-bound.  Making
   such a request flushes older items from the cache if more recent
   items are available.  Query resolvers calculate a minimum t-bound
   that is some refresh interval earlier than the current time.
   Resolvers keep themselves current by replacing items in the cache
   that are earlier than the minimum t-bound.

2.4 Scale and Performance

   Three performance studies of active catalog and meta-data caching
   techniques are available [5].
   The first study showed that the active catalog and meta-data caching
   can constrain the search effectively in a real environment, the
   X.500 name space.  The second study examined the performance of an
   active catalog and meta-data caching for single users on a local
   area network.  The experiments showed that the techniques to
   eliminate data repositories from the search space can dramatically
   improve response time.  Response times improve, because latency is
   reduced.  The reduction of latency in communications and processing
   is critical to large-scale descriptive query optimization.  The
   experiments also showed that an active catalog is the most
   significant contributor to better response time in a system with low
   load, and that meta-data caching functions to reduce the load on the
   system.  The third study used an analytical model to evaluate the
   performance and scaling of these techniques for a large Internet
   environment.  It showed that meta-data caching plays an essential
   role in scaling the distributed catalog service to millions of users.
   It also showed that constraining the search space with an active
   catalog contributes significantly to scaling data repositories to
   millions of users.  Replication and data caching also contribute to
   the scale of the system in a large Internet environment.
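
   The t-bound bookkeeping described in Section 2.3 can be sketched in
   a few lines.  The sketch assumes that each cache entry carries the
   time at which it was fetched, expressed as a number of seconds.  The
   names CacheEntry, t_bound, minimum_t_bound and stale_keys are
   hypothetical and are not taken from Nomenclator.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    value: object
    timestamp: float  # time at which the value was fetched, in seconds


def t_bound(entries):
    """Time of the oldest cache entry used to calculate a response."""
    return min(entry.timestamp for entry in entries)


def minimum_t_bound(now, refresh_interval):
    """A time that is one refresh interval earlier than the current time."""
    return now - refresh_interval


def stale_keys(cache, now, refresh_interval, newer_than=None):
    """Keys of entries that should be replaced before they are used.

    An entry is stale if it is earlier than the minimum t-bound, or if
    the user asked for data more recent than `newer_than` and the entry
    is not more recent than that time.
    """
    floor = minimum_t_bound(now, refresh_interval)
    stale = []
    for key, entry in cache.items():
        too_old = entry.timestamp < floor
        rejected = newer_than is not None and entry.timestamp <= newer_than
        if too_old or rejected:
            stale.append(key)
    return stale


cache = {
    "referral:US": CacheEntry("repo list", 100.0),
    "tuples:Ordille": CacheEntry("answer tuples", 160.0),
}
bound = t_bound(cache.values())
expired = stale_keys(cache, now=200.0, refresh_interval=60.0)
refetch = stale_keys(cache, now=200.0, refresh_interval=150.0,
                     newer_than=bound)
```

   With these sample timestamps the t-bound of a response built from
   both entries is 100.0.  A request for data more recent than that
   t-bound marks only the older entry for refresh.
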
