RFC: 817
MODULARITY AND EFFICIENCY IN PROTOCOL IMPLEMENTATION
David D. Clark
MIT Laboratory for Computer Science
Computer Systems and Communications Group
July, 1982
1. Introduction
Many protocol implementers have made the unpleasant discovery that
their packages do not run quite as fast as they had hoped. The blame
for this widely observed problem has been attributed to a variety of
causes, ranging from details in the design of the protocol to the
underlying structure of the host operating system. This RFC will
discuss some of the commonly encountered reasons why protocol
implementations seem to run slowly.
Experience suggests that one of the most important factors in
determining the performance of an implementation is the manner in which
that implementation is modularized and integrated into the host
operating system. For this reason, it is useful to discuss the question
of how an implementation is structured at the same time that we consider
how it will perform. In fact, this RFC will argue that modularity is
one of the chief villains in attempting to obtain good performance, so
that the designer is faced with a delicate and inevitable tradeoff
between good structure and good performance. Further, the single factor
which most strongly determines how well this conflict can be resolved is
not the protocol but the operating system.
2. Efficiency Considerations
There are many aspects to efficiency. One aspect is sending data
at minimum transmission cost, which is a critical aspect of common
carrier communications, if not in local area network communications.
Another aspect is sending data at a high rate, which may not be possible
at all if the net is very slow, but which may be the one central design
constraint when taking advantage of a local net with high raw bandwidth.
The final consideration is doing the above with minimum expenditure of
computer resources. This last may be necessary to achieve high speed,
but in the case of the slow net may be important only in that the
resources used up, for example cpu cycles, are costly or otherwise
needed. It is worth pointing out that these different goals often
conflict; for example it is often possible to trade off efficient use of
the computer against efficient use of the network. Thus, there may be
no such thing as a successful general purpose protocol implementation.
The simplest measure of performance is throughput, measured in bits
per second. It is worth doing a few simple computations in order to get
a feeling for the magnitude of the problems involved. Assume that data
is being sent from one machine to another in packets of 576 bytes, the
maximum generally acceptable internet packet size. Allowing for header
overhead, this packet size permits 4288 bits in each packet. If a
useful throughput of 10,000 bits per second is desired, then a data
bearing packet must leave the sending host about every 430 milliseconds,
a little over two per second. This is clearly not difficult to achieve.
However, if one wishes to achieve 100 kilobits per second throughput,
the packet must leave the host every 43 milliseconds, and to achieve one
megabit per second, which is not at all unreasonable on a high-speed
local net, the packets must be spaced no more than 4.3 milliseconds.
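The arithmetic above can be reproduced with a short sketch. The 40-byte header figure is an assumption (a 20-byte IP header plus a 20-byte TCP header, no options) chosen because it yields the 4288 useful bits quoted in the text; the function names are illustrative only.

```python
PACKET_BYTES = 576  # maximum generally acceptable internet packet size (from the text)
HEADER_BYTES = 40   # assumption: 20-byte IP header + 20-byte TCP header, no options


def payload_bits(packet_bytes=PACKET_BYTES, header_bytes=HEADER_BYTES):
    """Useful data bits carried by one packet after header overhead."""
    return (packet_bytes - header_bytes) * 8


def packet_interval_ms(throughput_bps, bits_per_packet=None):
    """Maximum spacing between data packets, in milliseconds, that still
    sustains the requested useful throughput."""
    if bits_per_packet is None:
        bits_per_packet = payload_bits()
    return 1000.0 * bits_per_packet / throughput_bps


if __name__ == "__main__":
    for bps in (10_000, 100_000, 1_000_000):
        print(f"{bps:>9} bit/s -> one packet every {packet_interval_ms(bps):.1f} ms")
```

Under these assumptions the intervals come out at 428.8, 42.9 and 4.3 milliseconds, which round to the 430, 43 and 4.3 millisecond figures used in the text.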
These latter numbers are a slightly more alarming goal for which to
set one's sights. Many operating systems take a substantial fraction of
a millisecond just to service an interrupt. If the protocol has been
structured as a process, it is necessary to go through a process
scheduling before the protocol code can even begin to run. If any piece
of a protocol package or its data must be fetched from disk, real time
delays of between 30 and 100 milliseconds can be expected. If the
protocol must compete for cpu resources with other processes of the
system, it may be necessary to wait a scheduling quantum before the
protocol can run. Many systems have a scheduling quantum of 100
milliseconds or more. Considering these sorts of numbers, it becomes
immediately clear that the protocol must be fitted into the operating
system in a thorough and effective manner if anything like reasonable
throughput is to be achieved.
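To see how these delays compare with the packet spacing required, the following sketch computes the throughput ceiling that results if each packet incurs one such delay and nothing overlaps. That serial, one-delay-per-packet model is a simplifying assumption, as is the 0.5 millisecond figure standing in for "a substantial fraction of a millisecond".

```python
PAYLOAD_BITS = 4288  # useful bits in one 576-byte packet (from the text)


def throughput_ceiling_bps(per_packet_delay_ms, payload_bits=PAYLOAD_BITS):
    """Upper bound on useful throughput when every packet costs the given
    delay and packets are handled strictly one after another (assumption)."""
    return payload_bits * 1000.0 / per_packet_delay_ms


# Delay figures taken from the text, except the interrupt cost, which is an
# assumed value for "a substantial fraction of a millisecond".
DELAYS_MS = [
    ("interrupt service (assumed 0.5 ms)", 0.5),
    ("disk fetch, low end", 30.0),
    ("disk fetch, high end", 100.0),
    ("scheduling quantum", 100.0),
]

if __name__ == "__main__":
    for label, delay_ms in DELAYS_MS:
        ceiling = throughput_ceiling_bps(delay_ms)
        print(f"{label:<36} {ceiling:>12,.0f} bit/s")
```

In this model a single 100 millisecond scheduling quantum per packet caps throughput below 43 kilobits per second, well short of the 100 kilobit goal, while an interrupt cost alone leaves ample headroom.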
There is one obvious conclusion immediately suggested by even this
simple analysis. Except in very special circumstances, when many
packets are being processed at once, the cost of processing a packet is
dominated by factors, such as cpu scheduling, which are independent of
the packet size. This suggests two general rules which any
implementation ought to obey. First, send data in large packets.
Obviously, if processing time per packet is a constant, then throughput
will be directly proportional to the packet size. Second, never send an
unneeded packet. Unneeded packets use up just as many resources as a
packet full of data, but perform no useful function. RFC 813, "Window
and Acknowledgement Strategy in TCP", discusses one aspect of reducing
the number of packets sent per useful data byte. This document will
mention other attacks on the same problem.
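Both rules follow from a simple cost model, sketched here. It assumes, as the text does, that processing cost per packet is a constant independent of size; the 2 millisecond cost and the function and parameter names are hypothetical values chosen for illustration.

```python
def useful_throughput_bps(payload_bytes, per_packet_cost_ms,
                          extra_packets_per_data_packet=0.0):
    """Useful throughput when each packet, data-bearing or not, costs the
    same fixed processing time (the constant-cost assumption from the text).

    extra_packets_per_data_packet counts packets that carry no useful data,
    such as acknowledgements sent on their own.
    """
    packets_handled = 1.0 + extra_packets_per_data_packet
    return payload_bytes * 8 * 1000.0 / (packets_handled * per_packet_cost_ms)


if __name__ == "__main__":
    COST_MS = 2.0  # hypothetical fixed per-packet processing cost

    full = useful_throughput_bps(536, COST_MS)        # full-size payload
    half = useful_throughput_bps(268, COST_MS)        # half-size payload
    acked = useful_throughput_bps(536, COST_MS, 1.0)  # one extra packet each

    print(f"full packets:          {full:,.0f} bit/s")
    print(f"half-size packets:     {half:,.0f} bit/s")
    print(f"full + 1 extra packet: {acked:,.0f} bit/s")
```

Halving the payload halves the throughput, and so does handling one unneeded packet for every data packet; under this model the two mistakes cost exactly the same.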
The above analysis suggests that there are two main parts to the
problem of achieving good protocol performance. The first has to do
with how the protocol implementation is integrated into the host
operating system. The second has to do with how the protocol package
itself is organized internally. This document will consider each of
these topics in turn.
3. The Protocol vs. the Operating System
There are normally three reasonable ways in which to add a protocol
to an operating system. The protocol can be in a process that is
provided by the operating system, or it can be part of the kernel of the
operating system itself, or it can be put in a separate communications
processor or front end machine. This decision is strongly influenced by
details of hardware architecture and operating system design; each of
these three approaches has its own advantages and disadvantages.
The "process" is the abstraction which most operating systems use
to provide the execution environment for user programs. A very simple
path for implementing a protocol is to obtain a process from the
operating system and implement the protocol to run in it.
Superficially, this approach has a number of advantages. Since
modifications to the kernel are not required, the job can be done by
someone who is not an expert in the kernel structure. Since it is often
impossible to find somebody who is experienced both in the structure of
the operating system and the structure of the protocol, this path, from
a management point of view, is often extremely appealing. Unfortunately,
putting a protocol in a process has a number of disadvantages, related
to both structure and performance. First, as was discussed above,
process scheduling can be a significant source of real-time delay.
There is not only the actual cost of going through the scheduler, but
the problem that the operating system may not have the right sort of
priority tools to bring the process into execution quickly whenever
there is work to be done.
Structurally, the difficulty with putting a protocol in a process
is that the protocol may be providing services, for example support of
data streams, which are normally obtained by going to special kernel
entry points. Depending on the generality of the operating system, it
may be impossible to take a program which is accustomed to reading
through a kernel entry point, and redirect it so it is reading the data
from a process. The most extreme example of this problem occurs when
implementing server telnet. In almost all systems, the device handler
for the locally attached teletypes is located inside the kernel, and
programs read from and write to their teletype by making kernel calls. If
server telnet is implemented in a process, it is then necessary to take
the data streams provided by server telnet and somehow get them back
down inside the kernel so that they mimic the interface provided by
local teletypes. It is usually the case that special kernel
modification is necessary to achieve this structure, which somewhat
defeats the benefit of having removed the protocol from the kernel in
the first place.
Clearly, then, there are advantages to putting the protocol package
in the kernel. Structurally, it is reasonable to view the network as a
device, and device drivers are traditionally contained in the kernel.
Presumably, the problems associated with process scheduling can be
sidestepped, at least to a certain extent, by placing the code inside the
kernel. And it is obviously easier to make the server telnet channels
mimic the local teletype channels if they are both realized in the same
level in the kernel.
However, implementation of protocols in the kernel has its own set
of pitfalls. First, network protocols have a characteristic which is
shared by almost no other device: they require rather complex actions
to be performed as a result of a timeout. The problem with this
requirement is that the kernel often has no facility by which a program
can be brought into execution as a result of the timer event. What is
really needed, of course, is a special sort of process inside the
kernel. Most systems lack this mechanism. Failing that, the only
execution mechanism available is to run at interrupt time.
There are substantial drawbacks to implementing a protocol to run
at interrupt time. First, the actions performed may be somewhat complex
and time consuming, compared to the maximum amount of time that the
operating system is prepared to spend servicing an interrupt. Problems
can arise if interrupts are masked for too long. This is particularly
bad when running as a result of a clock interrupt, which can imply that
the clock interrupt is masked. Second, the environment provided by an
interrupt handler is usually extremely primitive compared to the
environment of a process. There are usually a variety of system
facilities which are unavailable while running in an interrupt handler.
The most important of these is the ability to suspend execution pending
the arrival of some event or message. It is a cardinal rule of almost
every known operating system that one must not invoke the scheduler