Network Working Group                                        M. McCahill
Request For Comments: 1862                       University of Minnesota
Category: Informational                                J. Romkey, Editor
                                                             M. Schwartz
                                                  University of Colorado
                                                              K. Sollins
                                                                     MIT
                                                           T. Verschuren
                                                                 SURFnet
                                                               C. Weider
                                        Bunyip Information Systems, Inc.
                                                           November 1995


   Report of the IAB Workshop on Internet Information Infrastructure,
                          October 12-14, 1994

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Abstract

   This document is a report on an Internet architecture workshop,
   initiated by the IAB and held at MCI on October 12-14, 1994.  This
   workshop generally focused on aspects of the information
   infrastructure on the Internet.

1. Introduction

   The Internet Architecture Board (IAB) holds occasional workshops
   designed to consider long-term issues and strategies for the
   Internet, and to suggest future directions for the Internet
   architecture.  This long-term planning function of the IAB is
   complementary to the ongoing engineering efforts performed by working
   groups of the Internet Engineering Task Force (IETF), under the
   leadership of the Internet Engineering Steering Group (IESG) and area
   directorates.

   An IAB-initiated workshop on the architecture of the "information
   infrastructure" of the Internet was held on October 12-14, 1994 at
   MCI in Tysons Corner, Virginia.

   In addition to the IAB members, attendees at this meeting included
   the IESG Area Directors for the relevant areas (Applications, User
   Services) and a group of other experts in the following areas:



McCahill, et al              Informational                      [Page 1]

RFC 1862                  IAB Workshop Report              November 1995


   gopher, the World Wide Web, naming, WAIS, searching, indexing, and
   library services.  The IAB explicitly tried to balance the number of
   attendees from each area of expertise.  Logistics limited the
   attendance to about 35, which unfortunately meant that many highly
   qualified experts were omitted from the invitation list.

   The objectives of the workshop were to explore the architecture of
   "information" applications on the Internet, to provide the IESG with
   a solid set of recommendations for further work, and to provide a
   place for communication between the communities of people associated
   with the lower and upper layers of the Internet protocol suite, as
   well as to allow experience to be exchanged between them.

   The 34 attendees divided into three "breakout groups" which met for
   the second half of the first day and the entire second day. Each
   group wrote a report of its activities. The reports are contained in
   this document, in addition to a set of specific recommendations to
   the IESG and IETF community.

2. Summary

   Although there were some disagreements between the groups on specific
   functionalities for architectural components, there was broad
   agreement on the general shape of an information architecture and on
   general principles for constructing the architecture. The discussions
   of the architecture generalized a number of concepts that are
   currently used in deployed systems such as the World Wide Web, but
   the main thrust was to define general architectural components rather
   than focus on current technologies.

   Research recommendations include:

  -  increased focus on a general caching and replication architecture

  -  a rapid deployment of name resolution services, and

  -  the articulation of a common security architecture for information
     applications.

   Procedural recommendations for forwarding this work in the IETF
   include:

  -  making common identifiers such as the IANA assigned numbers
     available in an on-line database

  -  tightening the requirements on Proposed Standards to ensure that
     they adequately address security






  -  articulating the procedures necessary to facilitate joining IETF
     working group meetings, and

  -  reviewing the key distribution infrastructure for use in
     information applications

3. Group 1 report: The Distributed Database Problem

   Elise Gerich, Tim Berners-Lee, Mark McCahill, Dave Sincoskie, Mike
   Schwartz, Mitra, Yakov Rekhter, John Klensin, Steve Crocker, Ton
   Verschuren

   Editors: Mark McCahill, Mike Schwartz, Ton Verschuren

3.1 Problem and Needs

   Because of the increasing popularity of accessing networked
   information, current Internet information services are experiencing
   performance, reliability, and scaling problems.  These are general
   problems, given the distributed nature of the Internet.  Current and
   future applications would benefit from much more widespread use of
   caching and replication.

   For instance, popular WWW and Gopher servers experience serious
   overloading, as many thousands of users per day attempt to access
   them simultaneously.  Neither of these systems was designed with
   explicit caching or replication support in the core protocol.
   Moreover, because the DNS is currently the only widely deployed
   distributed and replicated data storage system in the Internet, it is
   often used to help support more scalable operation in this
   environment -- for example, storing service-specific pointer
   information, or providing a means of rotating service accesses among
   replicated copies of NCSA's extremely popular WWW server.  In most
   cases, such uses of the DNS semantically overload the system.  The
   DNS may not be able to stand such "semantic extensions" and continue
   to perform well.  It was not designed to be a general-purpose
   replicated distributed database system.

   There are many examples of systems that need or would benefit from
   caching or replication.  Examples include key distribution for
   authentication services, DHCP, multicast SD, and Internet white
   pages.

   To date there have been a number of independent attempts to provide
   caching and replication facilities.  The question we address here is
   whether it might be possible to define a general service interface or
   protocol, so that caches and replica servers (implemented in a
   variety of ways to support a range of different situations) might





   interoperate, and so that we might reduce the amount of wasted re-
   implementation effort currently being expended.  Replication and
   caching schemes could form a sort of network "middleware" to fulfill
   a common need of distributed services.
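   The general service interface mentioned above might be sketched as
   an abstract front end that both caches and replica servers present,
   so that clients need not know which kind of server they are talking
   to.  This is purely illustrative; the names and methods below are
   assumptions of this sketch, not anything the workshop defined.

```python
from abc import ABC, abstractmethod
from typing import Optional


class StoreFrontEnd(ABC):
    """Hypothetical uniform interface for caches and replica servers."""

    @abstractmethod
    def lookup(self, key: str) -> Optional[bytes]:
        """Return the stored item, or None on a miss."""

    @abstractmethod
    def authoritative_misses(self) -> bool:
        """Whether a miss proves the item does not exist anywhere."""

    @abstractmethod
    def invalidate(self, key: str) -> None:
        """Discard any local copy of the item."""
```

   An application written against such an interface could be pointed at
   a demand-driven cache or a pushed replica without change; only the
   meaning of a miss differs.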

   It should be noted that it is an open question whether it would be
   feasible to define a unified interface to all caching and replication
   problems.  For example, very different considerations must go into
   providing a system to support a nationwide video service for
   1,000,000 concurrent users than would be needed for supporting
   worldwide accesses to popular WWW pages.  We recommend research and
   experimentation to address this more general issue.

3.2 Characteristics of Solutions

   While on the surface caching and replication may appear to be two
   points on a single spectrum, closer analysis shows that they are
   distinct approaches with different characteristics. There are cases
   where a combination of the two techniques is the optimal solution,
   which further complicates the situation.

   We can roughly characterize the two approaches as follows:

   Caching:

        - a cache contains a partial set of data

        - a cache is built on demand

        - a cache is audience-specific, since the cache is built in
          response to demands of a community

   Replication:

        - replicated databases contain the entire data set or a
          server-defined subset of a given database

        - a replicated database can return an authoritative answer about
          existence of an item

        - data is pushed onto the replicating server rather than pulled on
          demand
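   The contrast drawn in the two lists above can be made concrete in a
   short sketch.  The class and method names here are illustrative
   assumptions: the key points are that a replica receives its entire
   data set by push and can therefore give an authoritative negative
   answer, while a cache is filled by demand-driven pulls and a miss
   tells the client nothing.

```python
class Replica:
    """Holds the entire data set, pushed by the master."""

    def __init__(self):
        self.store = {}

    def push(self, full_data_set):
        # Data is pushed onto the replica; it never pulls on demand.
        self.store = dict(full_data_set)

    def lookup(self, key):
        if key in self.store:
            return ("found", self.store[key])
        return ("absent", None)   # authoritative: the item does not exist


class Cache:
    """Holds only what its community has asked for."""

    def __init__(self, fetch_from_origin):
        self.fetch = fetch_from_origin   # callable: key -> value or None
        self.store = {}

    def lookup(self, key):
        if key not in self.store:        # built on demand
            value = self.fetch(key)
            if value is None:
                return ("unknown", None)  # a miss is NOT authoritative
            self.store[key] = value
        return ("found", self.store[key])
```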

   While there are important differences between caches and replicated
   databases, there are some issues common to both, especially when
   considering how updates and data consistency can be handled.







   A variety of methods can be used to update caches and replicas:

        - master-slave

        - peer-to-peer

        - flooding techniques (such as that used by NNTP).
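   As a minimal sketch of the flooding style of update propagation (in
   the spirit of NNTP's article flooding), each server forwards an
   update to all of its peers, and a set of already-seen update
   identifiers keeps the flood from looping.  The structure below is an
   assumption for illustration, not a description of NNTP itself.

```python
class FloodingServer:
    def __init__(self, name):
        self.name = name
        self.peers = []      # other FloodingServer instances
        self.store = {}
        self.seen = set()    # update ids already flooded through here

    def receive(self, update_id, key, value):
        if update_id in self.seen:
            return           # duplicate arriving over another path
        self.seen.add(update_id)
        self.store[key] = value
        for peer in self.peers:
            peer.receive(update_id, key, value)   # forward the flood


# Usage: wire three servers into a triangle and inject one update;
# it reaches every server exactly once despite the cycle.
a, b, c = FloodingServer("a"), FloodingServer("b"), FloodingServer("c")
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]
a.receive("msg-1", "greeting", "hello")
```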

   Which strategy one chooses influences important characteristics of
   the cache or replicated database, such as:

        - consistency of data

        - whether locking is used to achieve consistency, which in turn
          affects performance

        - whether there are a priori guarantees of the existence of an
          item in the database (is the answer authoritative, are
          conflicts detected after the fact, or is there no guarantee of
          authoritativeness at all?)

   Consistency guarantees depend on the granularity of synchronization
   (ms, sec, hr, day), and there are cases where it is acceptable to
   trade consistency for better performance or availability. Since there
   is a range of qualities of service with respect to consistency and
   performance, we would like to be able to tune these parameters for a
   given application. However, we recognize that this may not be
   possible in all cases since it is unlikely one can implement a high
   performance solution to all of these problems in a single system.
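   The trade between consistency and performance described above is
   often exposed as a single tunable time-to-live: a long TTL means
   fewer fetches from the authoritative source but staler answers,
   while a TTL of zero consults the source on every lookup.  A minimal
   sketch, with names chosen for illustration:

```python
import time


class TTLCache:
    """Cache whose consistency granularity is a tunable time-to-live."""

    def __init__(self, origin, ttl_seconds):
        self.origin = origin          # callable: key -> value
        self.ttl = ttl_seconds
        self.store = {}               # key -> (value, fetch_time)

    def lookup(self, key):
        entry = self.store.get(key)
        if entry is not None:
            value, fetched_at = entry
            if time.monotonic() - fetched_at < self.ttl:
                return value          # possibly stale, but no fetch
        value = self.origin(key)      # refresh from the authority
        self.store[key] = (value, time.monotonic())
        return value
```

   Setting ttl_seconds per application is exactly the kind of
   per-application tuning the paragraph above asks for.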

   Beyond simply performing replication or caching, there is a need for
   managing cache and replication servers. There are several models for
   organizing groups of caches/replication servers that range from
   totally adaptive to a rigidly administered, centrally controlled
   model:

    - a club model. Minimal administrative overhead to join the club.
      Participation is a function of disk space, CPU, available
      network bandwidth.

    - centrally coordinated service. Here administrators can take
      advantage of their knowledge of the system's topology and the
      community they intend to serve. There may be scaling problems
      with this model.

    - hybrid combinations of the club and centrally coordinated models







   There are a couple of models for how to organize the management of a
   group of cooperating servers, but this does not address the question
   of what sorts of commands the manager (be it a person or a program)
   issues to a cache or replicated server. A manager needs to be able to
   address issues on a server such as:

    - control of caching algorithms, defining how information is aged
      out of the cache based on disk space, usage demands, etc. This is
      where you would control time-to-live and expiry settings.

    - flushing the cache. There are circumstances where the
      information source has become inaccessible and the normal cache
      aging strategy is inappropriate since you will not be able to
      get the information again for an indeterminate amount of time.

    - management control might also be a way for information providers
      to control how information is pushed on servers for maintaining
      data consistency, but this raises tricky problems with trust and
