Network Working Group                                        M. McCahill
Request For Comments: 1862                       University of Minnesota
Category: Informational                                J. Romkey, Editor
                                                             M. Schwartz
                                                  University of Colorado
                                                              K. Sollins
                                                                     MIT
                                                           T. Verschuren
                                                                 SURFnet
                                                               C. Weider
                                        Bunyip Information Systems, Inc.
                                                           November 1995
Report of the IAB Workshop on Internet Information Infrastructure,
October 12-14, 1994
Status of this Memo
This memo provides information for the Internet community. This memo
does not specify an Internet standard of any kind. Distribution of
this memo is unlimited.
Abstract
This document is a report on an Internet architecture workshop,
initiated by the IAB and held at MCI on October 12-14, 1994. This
workshop generally focused on aspects of the information
infrastructure on the Internet.
1. Introduction
The Internet Architecture Board (IAB) holds occasional workshops
designed to consider long-term issues and strategies for the
Internet, and to suggest future directions for the Internet
architecture. This long-term planning function of the IAB is
complementary to the ongoing engineering efforts performed by working
groups of the Internet Engineering Task Force (IETF), under the
leadership of the Internet Engineering Steering Group (IESG) and area
directorates.
An IAB-initiated workshop on the architecture of the "information
infrastructure" of the Internet was held on October 12-14, 1994 at
MCI in Tysons Corner, Virginia.
In addition to the IAB members, attendees at this meeting included
the IESG Area Directors for the relevant areas (Applications, User
Services) and a group of other experts in the following areas:
McCahill, et al Informational [Page 1]
RFC 1862 IAB Workshop Report November 1995
gopher, the World Wide Web, naming, WAIS, searching, indexing, and
library services. The IAB explicitly tried to balance the number of
attendees from each area of expertise. Logistics limited the
attendance to about 35, which unfortunately meant that many highly
qualified experts were omitted from the invitation list.
The objectives of the workshop were to explore the architecture of
"information" applications on the Internet, to provide the IESG with
a solid set of recommendations for further work, and to provide a
place for communication between the communities of people associated
with the lower and upper layers of the Internet protocol suite, and
to allow experience to be exchanged between the communities.
The 34 attendees divided into three "breakout groups" which met for
the second half of the first day and the entire second day. Each
group wrote a report of its activities. The reports are contained in
this document, in addition to a set of specific recommendations to
the IESG and IETF community.
2. Summary
Although there were some disagreements between the groups on specific
functionalities for architectural components, there was broad
agreement on the general shape of an information architecture and on
general principles for constructing the architecture. The discussions
of the architecture generalized a number of concepts that are
currently used in deployed systems such as the World Wide Web, but
the main thrust was to define general architectural components rather
than focus on current technologies.
Research recommendations include:
- increased focus on a general caching and replication architecture
- a rapid deployment of name resolution services, and
- the articulation of a common security architecture for information
applications.
Procedural recommendations for forwarding this work in the IETF
include:
- making common identifiers such as the IANA assigned numbers
available in an on-line database
- tightening the requirements on Proposed Standards to ensure that
they adequately address security
- articulating the procedures necessary to facilitate joining IETF
working group meetings, and
- reviewing the key distribution infrastructure for use in
information applications
3. Group 1 report: The Distributed Database Problem
Elise Gerich, Tim Berners-Lee, Mark McCahill, Dave Sincoskie, Mike
Schwartz, Mitra, Yakov Rekhter, John Klensin, Steve Crocker, Ton
Verschuren
Editors: Mark McCahill, Mike Schwartz, Ton Verschuren
3.1 Problem and Needs
Because of the increasing popularity of accessing networked
information, current Internet information services are experiencing
performance, reliability, and scaling problems. These are general
problems, given the distributed nature of the Internet. Current and
future applications would benefit from much more widespread use of
caching and replication.
For instance, popular WWW and Gopher servers experience serious
overloading, as many thousands of users per day attempt to access
them simultaneously. Neither of these systems was designed with
explicit caching or replication support in the core protocol.
Moreover, because the DNS is currently the only widely deployed
distributed and replicated data storage system in the Internet, it is
often used to help support more scalable operation in this
environment -- for example, storing service-specific pointer
information, or providing a means of rotating service accesses among
replicated copies of NCSA's extremely popular WWW server. In most
cases, such uses of the DNS semantically overload the system. The
DNS may not be able to stand such "semantic extensions" and continue
to perform well. It was not designed to be a general-purpose
replicated distributed database system.
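As a rough illustration of the rotation mentioned above, the following
Python sketch returns the same set of replica addresses on every query
but rotates their order, so that successive clients try different
copies first. The class name, the host name, and the addresses (taken
from the 192.0.2.0/24 documentation range) are hypothetical and do not
describe any deployed name server.

```python
from collections import deque


class RotatingResolver:
    """Toy resolver that rotates the order of replica addresses per query."""

    def __init__(self):
        self._records = {}

    def add(self, name, addresses):
        self._records[name] = deque(addresses)

    def resolve(self, name):
        addrs = self._records[name]
        answer = list(addrs)   # the current ordering is the answer
        addrs.rotate(-1)       # the next query sees the next replica first
        return answer


resolver = RotatingResolver()
resolver.add("www.example.org", ["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first_choices = [resolver.resolve("www.example.org")[0] for _ in range(4)]
```

Note that the rotation spreads load but carries no knowledge of
replica health or content, which is one sense in which such uses
stretch a naming system beyond its design.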
There are many examples of systems that need or would benefit from
caching or replication. Examples include key distribution for
authentication services, DHCP, multicast SD, and Internet white
pages.
To date there have been a number of independent attempts to provide
caching and replication facilities. The question we address here is
whether it might be possible to define a general service interface or
protocol, so that caches and replica servers (implemented in a
variety of ways to support a range of different situations) might
interoperate, and so that we might reduce the amount of wasted
re-implementation effort currently being expended. Replication and
caching schemes could form a sort of network "middleware" to fulfill
a common need of distributed services.
Whether it would be feasible to define a unified interface to all
caching and replication problems remains an open question. For
example, very different considerations must go into
providing a system to support a nationwide video service for
1,000,000 concurrent users than would be needed for supporting
worldwide accesses to popular WWW pages. We recommend research and
experimentation to address this more general issue.
3.2 Characteristics of Solutions
While on the surface caching and replication may appear to occupy two
ends of a spectrum, further analysis shows that these are two
different approaches with different characteristics. There are cases
where a combination of the two techniques is the optimal solution,
which further complicates the situation.
We can roughly characterize the two approaches as follows:
Caching:
- a cache contains a partial set of data
- a cache is built on demand
- a cache is audience-specific, since the cache is built in
response to demands of a community
Replication:
- replicated databases contain the entire data set or a
server-defined subset of a given database
- a replicated database can return an authoritative answer about
existence of an item
- data is pushed onto the replicating server rather than pulled on
demand
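The contrast drawn in the two lists above can be sketched in a few
lines of Python. This is only an illustration under simplifying
assumptions: the class names (Origin, DemandCache, Replica) and their
methods are hypothetical, the data set is a small in-memory dictionary,
and the replica holds the entire data set rather than a server-defined
subset.

```python
class Origin:
    """Authoritative data source (hypothetical)."""

    def __init__(self, data):
        self.data = dict(data)

    def fetch(self, key):
        return self.data.get(key)


class DemandCache:
    """Holds a partial data set, filled only when clients ask."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, key):
        if key not in self.store:
            value = self.origin.fetch(key)   # pulled on demand
            if value is None:
                return None
            self.store[key] = value
        return self.store[key]


class Replica:
    """Holds the full data set, pushed to it by the origin."""

    def __init__(self):
        self.store = {}

    def receive_push(self, snapshot):
        self.store = dict(snapshot)

    def exists(self, key):
        # Authoritative as of the last push: absence here means absence.
        return key in self.store


origin = Origin({"a": 1, "b": 2, "c": 3})
cache = DemandCache(origin)
replica = Replica()
replica.receive_push(origin.data)
cache.get("a")   # the cache now holds only what its audience requested
```

After this run the cache contains a single item, reflecting the
demands of its community, while the replica can answer existence
questions for any key without consulting the origin.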
While there are important differences between caches and replicated
databases, there are some issues common to both, especially when
considering how updates and data consistency can be handled.
A variety of methods can be used to update caches and replicas:
- master-slave
- peer-to-peer
- flooding techniques (such as that used by NNTP).
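Of the three methods listed above, flooding is perhaps the easiest to
sketch. In the hedged Python example below, each node offers a new item
to all of its peers except the one it came from, and a node that
already holds the item identifier stops the propagation. The Node
class, its method names, and the item identifier are hypothetical; the
sketch is loosely modeled on the general idea of flooding and is not an
implementation of NNTP.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.items = {}      # item id -> payload
        self.offers = 0      # number of offers received, for illustration

    def link(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def offer(self, item_id, payload, sender=None):
        self.offers += 1
        if item_id in self.items:
            return           # already held: the flood stops here
        self.items[item_id] = payload
        for peer in self.peers:
            if peer is not sender:
                peer.offer(item_id, payload, sender=self)


a, b, c, d = (Node(n) for n in "abcd")
a.link(b)
b.link(c)
c.link(a)    # a, b, c form a triangle, so duplicate offers will occur
c.link(d)    # d is reachable only through c
a.offer("<1@example.org>", "update")
```

The duplicate check on the item identifier is what keeps the flood
from circulating forever around the a-b-c loop.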
Which strategy one chooses influences important characteristics of
the cache or replicated database, such as:
- consistency of data
- is locking used to achieve consistency? (This influences
performance.)
- are there a priori guarantees of existence of an item in the
database (is the answer authoritative, do you detect conflicts
after the fact, or is there no guarantee on authoritativeness of
the answer?)
Consistency guarantees depend on the granularity of synchronization
(ms, sec, hr, day), and there are cases where it is acceptable to
trade consistency for better performance or availability. Since there
is a range of qualities of service with respect to consistency and
performance, we would like to be able to tune these parameters for a
given application. However, we recognize that this may not be
possible in all cases since it is unlikely one can implement a high
performance solution to all of these problems in a single system.
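One way to picture such a tunable parameter is a cache whose tolerated
staleness is set per application. The Python sketch below is a minimal
illustration, assuming a single origin that can be read with one call;
the class name, the max_age parameter, and the injected clock are
hypothetical choices made so that the example runs deterministically.

```python
class StalenessBoundedCache:
    """Cache that serves an entry only while it is younger than max_age."""

    def __init__(self, fetch, max_age, clock):
        self.fetch = fetch        # callable that reads the origin
        self.max_age = max_age    # seconds of staleness tolerated
        self.clock = clock        # callable returning the current time
        self.entries = {}         # key -> (value, time fetched)
        self.origin_reads = 0

    def get(self, key):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[1] < self.max_age:
            return hit[0]                 # possibly stale, but cheap
        value = self.fetch(key)           # resynchronize with the origin
        self.origin_reads += 1
        self.entries[key] = (value, now)
        return value


source = {"page": "v1"}
now = [0.0]                               # fake clock for a repeatable run
cache = StalenessBoundedCache(source.get, max_age=60.0, clock=lambda: now[0])

first = cache.get("page")
source["page"] = "v2"                     # the origin changes
now[0] = 30.0
second = cache.get("page")                # within max_age: old value served
now[0] = 61.0
third = cache.get("page")                 # past max_age: value refetched
```

A smaller max_age moves the cache toward consistency at the cost of
more origin reads; a larger one trades consistency for performance and
availability.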
Beyond simply performing replication or caching, there is a need for
managing cache and replication servers. There are several models for
organizing groups of caches/replication servers that range from
totally adaptive to a rigidly administered, centrally controlled
model:
- a club model. Minimal administrative overhead to join the club.
Participation is a function of disk space, CPU, available
network bandwidth.
- centrally coordinated service. Here administrators can take
advantage of their knowledge of the system's topology and the
community they intend to serve. There may be scaling problems
with this model.
- hybrid combinations of the club and centrally coordinated models
These models describe how to organize the management of a group of
cooperating servers, but they do not address the question of what
sorts of commands the manager (be it a person or a program) issues
to a cache or replicated server. A manager needs to be able to
address issues on a server such as:
- control of caching algorithms, defining how information is aged
out of the cache based on disk space, usage demands, etc. This is
where you would control time-to-live and expiry settings.
- flushing the cache. There are circumstances where the
information source has become inaccessible and the normal cache
aging strategy is inappropriate since you will not be able to
get the information again for an indeterminate amount of time.
- management control might also be a way for information providers
to control how information is pushed on servers for maintaining
data consistency, but this raises tricky problems with trust and