Network Working Group                                        M. McCahill
Request For Comments: 1862                       University of Minnesota
Category: Informational                                J. Romkey, Editor
                                                             M. Schwartz
                                                  University of Colorado
                                                              K. Sollins
                                                                     MIT
                                                           T. Verschuren
                                                                 SURFnet
                                                               C. Weider
                                        Bunyip Information Systems, Inc.
                                                           November 1995


    Report of the IAB Workshop on Internet Information Infrastructure,
                           October 12-14, 1994

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Abstract

   This document is a report on an Internet architecture workshop,
   initiated by the IAB and held at MCI on October 12-14, 1994.  This
   workshop generally focused on aspects of the information
   infrastructure on the Internet.

1. Introduction

   The Internet Architecture Board (IAB) holds occasional workshops
   designed to consider long-term issues and strategies for the
   Internet, and to suggest future directions for the Internet
   architecture.  This long-term planning function of the IAB is
   complementary to the ongoing engineering efforts performed by working
   groups of the Internet Engineering Task Force (IETF), under the
   leadership of the Internet Engineering Steering Group (IESG) and area
   directorates.

   An IAB-initiated workshop on the architecture of the "information
   infrastructure" of the Internet was held on October 12-14, 1994 at
   MCI in Tysons Corner, Virginia.

   In addition to the IAB members, attendees at this meeting included
   the IESG Area Directors for the relevant areas (Applications, User
   Services) and a group of other experts in the following areas:
   gopher, the World Wide Web, naming, WAIS, searching, indexing, and
   library services.  The IAB explicitly tried to balance the number of
   attendees from each area of expertise.  Logistics limited the
   attendance to about 35, which unfortunately meant that many highly
   qualified experts were omitted from the invitation list.

   The objectives of the workshop were to explore the architecture of
   "information" applications on the Internet, to provide the IESG with
   a solid set of recommendations for further work, and to provide a
   place for communication between the communities of people associated
   with the lower and upper layers of the Internet protocol suite, as
   well as to allow experience to be exchanged between the communities.

   The 34 attendees divided into three "breakout groups" which met for
   the second half of the first day and the entire second day.  Each
   group wrote a report of its activities.  The reports are contained in
   this document, in addition to a set of specific recommendations to
   the IESG and IETF community.

2. Summary

   Although there were some disagreements between the groups on specific
   functionalities for architectural components, there was broad
   agreement on the general shape of an information architecture and on
   general principles for constructing the architecture.  The
   discussions of the architecture generalized a number of concepts that
   are currently used in deployed systems such as the World Wide Web,
   but the main thrust was to define general architectural components
   rather than focus on current technologies.

   Research recommendations include:

   - increased focus on a general caching and replication architecture

   - a rapid deployment of name resolution services, and

   - the articulation of a common security architecture for information
     applications.

   Procedural recommendations for forwarding this work in the IETF
   include:

   - making common identifiers such as the IANA assigned numbers
     available in an on-line database

   - tightening the requirements on Proposed Standards to ensure that
     they adequately address security

   - articulating the procedures necessary to facilitate joining IETF
     working group meetings, and

   - reviewing the key distribution infrastructure for use in
     information applications

3. Group 1 report: The Distributed Database Problem

   Elise Gerich, Tim Berners-Lee, Mark McCahill, Dave Sincoskie, Mike
   Schwartz, Mitra, Yakov Rekhter, John Klensin, Steve Crocker, Ton
   Verschuren

   Editors: Mark McCahill, Mike Schwartz, Ton Verschuren

3.1 Problem and Needs

   Because of the increasing popularity of accessing networked
   information, current Internet information services are experiencing
   performance, reliability, and scaling problems.  These are general
   problems, given the distributed nature of the Internet.  Current and
   future applications would benefit from much more widespread use of
   caching and replication.

   For instance, popular WWW and Gopher servers experience serious
   overloading, as many thousands of users per day attempt to access
   them simultaneously.  Neither of these systems was designed with
   explicit caching or replication support in the core protocol.
   Moreover, because the DNS is currently the only widely deployed
   distributed and replicated data storage system in the Internet, it is
   often used to help support more scalable operation in this
   environment -- for example, storing service-specific pointer
   information, or providing a means of rotating service accesses among
   replicated copies of NCSA's extremely popular WWW server.  In most
   cases, such uses of the DNS semantically overload the system.  The
   DNS may not be able to stand such "semantic extensions" and continue
   to perform well.  It was not designed to be a general-purpose
   replicated distributed database system.

   There are many examples of systems that need or would benefit from
   caching or replication.  Examples include key distribution for
   authentication services, DHCP, multicast SD, and Internet white
   pages.

   To date there have been a number of independent attempts to provide
   caching and replication facilities.
   The question we address here is whether it might be possible to
   define a general service interface or protocol, so that caches and
   replica servers (implemented in a variety of ways to support a range
   of different situations) might interoperate, and so that we might
   reduce the amount of wasted re-implementation effort currently being
   expended.  Replication and caching schemes could form a sort of
   network "middleware" to fulfill a common need of distributed
   services.

   It should be noted that it is an open question whether it would be
   feasible to define a unified interface to all caching and replication
   problems.  For example, very different considerations must go into
   providing a system to support a nationwide video service for
   1,000,000 concurrent users than would be needed for supporting
   worldwide accesses to popular WWW pages.  We recommend research and
   experimentation to address this more general issue.

3.2 Characteristics of Solutions

   While on the surface caching and replication may appear to occupy two
   ends of a spectrum, further analysis shows that these are two
   different approaches with different characteristics.  There are cases
   where a combination of the two techniques is the optimal solution,
   which further complicates the situation.

   We can roughly characterize the two approaches as follows:

   Caching:

   - a cache contains a partial set of data

   - a cache is built on demand

   - a cache is audience-specific, since the cache is built in response
     to demands of a community

   Replication:

   - replicated databases contain the entire data set or a
     server-defined subset of a given database

   - a replicated database can return an authoritative answer about
     existence of an item

   - data is pushed onto the replicating server rather than pulled on
     demand

   While there are important differences between caches and replicated
   databases, there are some issues common to both, especially when
   considering how updates and data consistency can be handled.

   A variety of methods can be used to update caches and replicas:

   - master-slave

   - peer-to-peer

   - flooding techniques (such as that used by NNTP).

   Which strategy one chooses influences important characteristics of
   the cache or replicated database, such as:

   - consistency of data

   - is locking used to achieve consistency? this influences
     performance...

   - are there a priori guarantees of existence of an item in the
     database (is the answer authoritative, do you detect conflicts
     after the fact, or is there no guarantee on authoritativeness of
     the answer?)

   Consistency guarantees depend on the granularity of synchronization
   (ms, sec, hr, day), and there are cases where it is acceptable to
   trade consistency for better performance or availability.

   Since there is a range of qualities of service with respect to
   consistency and performance, we would like to be able to tune these
   parameters for a given application.  However, we recognize that this
   may not be possible in all cases since it is unlikely one can
   implement a high performance solution to all of these problems in a
   single system.

   Beyond simply performing replication or caching, there is a need for
   managing cache and replication servers.
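
   As an illustration only, the pull-versus-push distinction drawn
   above can be sketched in a few lines of Python.  This is not part of
   the workshop's output: the class names (OnDemandCache, Replica), the
   fetch callback, the injected clock, and the fixed time-to-live
   policy are all assumptions chosen for brevity.  The sketch shows a
   cache that holds only what its users have asked for and may serve
   data that is stale for up to one TTL, and a replica that receives
   pushed updates and can therefore answer authoritatively about
   whether an item exists.

```python
import time
from typing import Callable, Dict, Optional, Set, Tuple


class OnDemandCache:
    """Partial copy of the data, filled only when a client asks (pull)."""

    def __init__(self, fetch: Callable[[str], Optional[str]], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # hypothetical callback that queries the origin
        self._ttl = ttl        # seconds an entry may be served before refetch
        self._clock = clock    # injectable so the behaviour can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        hit = self._entries.get(key)
        if hit is not None and self._clock() - hit[1] < self._ttl:
            # Still within the TTL: answer locally, possibly with stale data.
            return hit[0]
        value = self._fetch(key)          # miss or expired: pull on demand
        if value is None:
            self._entries.pop(key, None)  # origin no longer has the item
            return None
        self._entries[key] = (value, self._clock())
        return value

    def cached_keys(self) -> Set[str]:
        """Keys currently held; reflects only what this audience requested."""
        return set(self._entries)


class Replica:
    """Full (or server-defined) copy that a master pushes updates to."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def push(self, key: str, value: Optional[str]) -> None:
        # Called by the master; a value of None means the item was deleted.
        if value is None:
            self._data.pop(key, None)
        else:
            self._data[key] = value

    def lookup(self, key: str) -> Tuple[bool, Optional[str]]:
        # The replica holds the whole set, so absence here is authoritative.
        return (key in self._data, self._data.get(key))
```

   The TTL is the tuning knob for the consistency trade-off: a longer
   value reduces load on the origin at the cost of serving older data,
   while the replica stays consistent only as quickly as the master
   pushes changes to it.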
   There are several models for organizing groups of caches/replication
   servers that range from totally adaptive to a rigidly administered,
   centrally controlled model:

   - a club model.  Minimal administrative overhead to join the club.
     Participation is a function of disk space, CPU, available network
     bandwidth.

   - centrally coordinated service.  Here administrators can take
     advantage of their knowledge of the system's topology and the
     community they intend to serve.  There may be scaling problems with
     this model.

   - hybrid combinations of the club and centrally coordinated models

   There are a couple of models for how to organize the management of a
   group of cooperating servers, but this does not address the question
   of what sorts of commands the manager (be it a person or a program)
   issues to a cache or replicated server.  A manager needs to be able
   to address issues on a server such as:

   - control of caching algorithms, defining how information is aged out
     of the cache based on disk space, usage demands, etc.  This is
     where you would control time-to-live and expiry settings.

   - flushing the cache.  There are circumstances where the information
     source has become inaccessible and the normal cache aging strategy
     is inappropriate since you will not be able to get the information
     again for an indeterminate amount of time.

   - management control might also be a way for information providers to
     control how information is pushed on servers for maintaining data
     consistency, but this raises tricky problems with trust and