RFC:  817



          MODULARITY AND EFFICIENCY IN PROTOCOL IMPLEMENTATION

                             David D. Clark
                  MIT Laboratory for Computer Science
               Computer Systems and Communications Group
                               July, 1982


     1.  Introduction


     Many  protocol implementers have made the unpleasant discovery that

their packages do not run quite as fast as they had hoped.    The  blame

for  this  widely  observed  problem has been attributed to a variety of

causes, ranging from details in  the  design  of  the  protocol  to  the

underlying  structure  of  the  host  operating  system.   This RFC will

discuss  some  of  the  commonly  encountered   reasons   why   protocol

implementations seem to run slowly.


     Experience  suggests  that  one  of  the  most important factors in

determining the performance of an implementation is the manner in  which

that   implementation  is  modularized  and  integrated  into  the  host

operating system.  For this reason, it is useful to discuss the question

of how an implementation is structured at the same time that we consider

how it will perform.  In fact, this RFC will argue  that  modularity  is

one  of  the chief villains in attempting to obtain good performance, so

that the designer is faced  with  a  delicate  and  inevitable  tradeoff

between good structure and good performance.  Further, the single factor

which most strongly determines how well this conflict can be resolved is

not the protocol but the operating system.


     2.  Efficiency Considerations


     There  are  many aspects to efficiency.  One aspect is sending data

at  minimum  transmission  cost, which is a critical concern  in  common

carrier communications, if not in  local  area  network  communications.

Another aspect is sending data at a high rate, which may not be possible

at all if the net is very slow, but which may be the one central  design

constraint when taking advantage of a local net with high raw bandwidth.

The  final  consideration is doing the above with minimum expenditure of

computer resources.  This last may be necessary to achieve  high  speed,

but  in  the  case  of  the  slow  net may be important only in that the

resources used up, for example  cpu  cycles,  are  costly  or  otherwise

needed.    It  is  worth  pointing  out that these different goals often

conflict; for example it is often possible to trade off efficient use of

the computer against efficient use of the network.  Thus, there  may  be

no such thing as a successful general purpose protocol implementation.


     The simplest measure of performance is throughput, measured in bits

per second.  It is worth doing a few simple computations in order to get

a  feeling for the magnitude of the problems involved.  Assume that data

is being sent from one machine to another in packets of 576  bytes,  the

maximum  generally acceptable internet packet size.  Allowing for header

overhead, this packet size permits 4288 bits  in  each  packet.    If  a

useful  throughput  of  10,000  bits  per second is desired, then a data

bearing packet must leave the sending host about every 430 milliseconds,

a little over two per second.  This is clearly not difficult to achieve.

However, if one wishes to achieve 100 kilobits  per  second  throughput,

the packet must leave the host every 43 milliseconds, and to achieve one

megabit  per  second,  which  is not at all unreasonable on a high-speed

local net, the packets must be spaced no more than 4.3 milliseconds.


     These latter numbers are a slightly more alarming goal for which to

set one's sights.  Many operating systems take a substantial fraction of

a millisecond just to service an interrupt.  If the  protocol  has  been

structured  as  a  process,  it  is  necessary  to  go through a process

scheduling before the protocol code can even begin to run.  If any piece

of a protocol package or its data must be fetched from disk,  real  time

delays  of  between  30  and  100 milliseconds  can be expected.  If the

protocol must compete for cpu resources  with  other  processes  of  the

system,  it  may  be  necessary  to wait a scheduling quantum before the

protocol can run.   Many  systems  have  a  scheduling  quantum  of  100

milliseconds  or  more.   Considering these sorts of numbers, it becomes

immediately clear that the protocol must be fitted  into  the  operating

system  in  a thorough and effective manner if anything like  reasonable

throughput is to be achieved.
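
     As a rough illustration, the delays quoted above can be compared
with the per-packet time budget at each target rate.  This is a sketch
under stated assumptions: the 0.5 millisecond interrupt cost is a
hypothetical stand-in for "a substantial fraction of a millisecond",
and each delay is treated as if it were incurred on every packet.

```python
PAYLOAD_BITS = 4288  # data bits per 576-byte packet, from the text

# Delays in milliseconds.  Only the interrupt figure is assumed; the
# others are the ranges quoted in the text.
DELAYS_MS = {
    "interrupt service (assumed 0.5 ms)": 0.5,
    "disk fetch, low end": 30.0,
    "disk fetch, high end": 100.0,
    "scheduling quantum": 100.0,
}


def budget_ms(throughput_bps):
    """Time available per packet at the given useful throughput."""
    return PAYLOAD_BITS / throughput_bps * 1000.0


def delays_exceeding_budget(throughput_bps):
    """Names of delays that alone would use up the per-packet budget."""
    limit = budget_ms(throughput_bps)
    return [name for name, ms in DELAYS_MS.items() if ms > limit]


for rate in (10_000, 100_000, 1_000_000):
    over = delays_exceeding_budget(rate)
    print(f"{rate:>9} bit/s, budget {budget_ms(rate):6.1f} ms: "
          f"{', '.join(over) if over else 'all delays fit'}")
```

Under these assumptions nothing exceeds the budget at 10,000 bits per
second, while at one megabit per second even the fastest disk fetch
costs several packet times.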


     There is one obvious conclusion immediately suggested by even  this

simple  analysis.    Except  in  very  special  circumstances, when many

packets are being processed at once, the cost of processing a packet  is

dominated  by  factors, such as cpu scheduling, which are independent of

the  packet  size.    This  suggests  two  general   rules   which   any

implementation  ought  to  obey.    First,  send  data in large packets.

Obviously, if processing time per packet is a constant, then  throughput

will be directly proportional to the packet size.  Second, never send an

unneeded  packet.    Unneeded packets use up just as many resources as a

packet full of data, but perform no useful function.  RFC  813,  "Window

and  Acknowledgement  Strategy in TCP", discusses one aspect of reducing

the number of packets sent per useful data byte.    This  document  will

mention other attacks on the same problem.
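
     Both rules follow from a simple model in which every packet,
useful or not, costs the same fixed processing time.  The sketch below
is illustrative only: the 5 millisecond per-packet cost and the
function name are assumptions, not measurements from any system.

```python
def useful_throughput_bps(payload_bits, cost_ms, extra_per_data_packet=0.0):
    """Useful bits per second when each packet costs cost_ms to handle.

    extra_per_data_packet is the number of unneeded packets handled for
    each data-bearing packet; each is charged the same fixed cost.
    """
    packets_handled = 1.0 + extra_per_data_packet
    return payload_bits * 1000.0 / (cost_ms * packets_handled)


COST_MS = 5.0  # hypothetical fixed per-packet cost

base = useful_throughput_bps(4288, COST_MS)
doubled = useful_throughput_bps(2 * 4288, COST_MS)
with_extra = useful_throughput_bps(4288, COST_MS, extra_per_data_packet=1.0)

print(f"4288-bit packets:             {base:.0f} bit/s")
print(f"packets twice as large:       {doubled:.0f} bit/s")
print(f"one unneeded packet per data: {with_extra:.0f} bit/s")
```

In this model doubling the packet size doubles the throughput, and one
unneeded packet for every data packet halves it.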


     The  above  analysis  suggests that there are two main parts to the

problem of achieving good protocol performance.  The  first  has  to  do

with  how  the  protocol  implementation  is  integrated  into  the host

operating system.  The second has to do with how  the  protocol  package

itself  is  organized  internally.   This document will consider each of

these topics in turn.


     3.  The Protocol vs. the Operating System


     There are normally three reasonable ways in which to add a protocol

to an operating system.  The protocol  can  be  in  a  process  that  is

provided by the operating system, or it can be part of the kernel of the

operating  system  itself, or it can be put in a separate communications

processor or front end machine.  This decision is strongly influenced by

details of hardware architecture and operating system  design;  each  of

these three approaches has its own advantages and disadvantages.


     The  "process"  is the abstraction which most operating systems use

to provide the execution environment for user programs.  A  very  simple

path  for  implementing  a  protocol  is  to  obtain  a process from the

operating  system  and  implement   the   protocol   to   run   in   it.

Superficially,  this  approach  has  a  number  of  advantages.    Since

modifications  to  the  kernel  are not required, the job can be done by

someone who is not an expert in the kernel structure.  Since it is often

impossible to find somebody who is experienced both in the structure  of

the  operating system and the structure of the protocol, this path, from

a management point of view, is often extremely appealing. Unfortunately,

putting a protocol in a process has a number of  disadvantages,  related

to  both  structure  and  performance.    First, as was discussed above,

process scheduling can be  a  significant  source  of  real-time  delay.

There  is  not  only the actual cost of going through the scheduler, but

the problem that the operating system may not have  the  right  sort  of

priority  tools  to  bring  the  process into execution quickly whenever

there is work to be done.


     Structurally, the difficulty with putting a protocol in  a  process

is  that  the protocol may be providing services, for example support of

data streams, which are normally obtained by  going  to  special  kernel

entry  points.   Depending on the generality of the operating system, it

may be impossible to take a  program  which  is  accustomed  to  reading

through  a kernel entry point, and redirect it so it is reading the data

from a process.  The most extreme example of this  problem  occurs  when

implementing  server  telnet.  In almost all systems, the device handler

for the locally attached teletypes is located  inside  the  kernel,  and

programs read from and write to their teletype by making  kernel  calls.  If

server telnet is implemented in a process, it is then necessary to  take

the  data  streams  provided  by server telnet and somehow get them back

down inside the kernel so that they  mimic  the  interface  provided  by

local   teletypes.     It  is  usually  the  case  that  special  kernel

modification  is  necessary  to  achieve  this structure, which somewhat

defeats the benefit of having removed the protocol from  the  kernel  in

the first place.


     Clearly, then, there are advantages to putting the protocol package

in  the kernel.  Structurally, it is reasonable to view the network as a

device, and device drivers are traditionally contained  in  the  kernel.

Presumably,  the  problems  associated  with  process  scheduling can be

sidestepped, at least to a certain extent, by placing the code inside the

kernel.  And it is obviously easier to make the server  telnet  channels

mimic  the local teletype channels if they are both realized in the same

level in the kernel.


     However, implementation of protocols in the kernel has its own  set

of  pitfalls.    First, network protocols have a characteristic which is

shared by almost no other device:  they require rather  complex  actions

to  be  performed  as  a  result  of  a  timeout.  The problem with this

requirement is that the kernel often has no facility by which a  program

can  be  brought into execution as a result of the timer event.  What is

really needed, of course, is  a  special  sort  of  process  inside  the

kernel.    Most  systems  lack  this  mechanism.  Failing that, the only

execution mechanism available is to run at interrupt time.


     There are substantial drawbacks to implementing a protocol  to  run

at interrupt time.  First, the actions performed may be somewhat complex

and  time  consuming,  compared  to  the maximum amount of time that the

operating system is prepared to spend servicing an interrupt.   Problems

can  arise  if interrupts are masked for too long.  This is particularly

bad  when running as a result of a clock interrupt, which can imply that

the clock interrupt is masked.  Second, the environment provided  by  an

interrupt  handler  is  usually  extremely  primitive  compared  to  the

environment of a process.    There  are  usually  a  variety  of  system

facilities  which are unavailable while running in an interrupt handler.

The most important of these is the ability to suspend execution  pending

the  arrival  of some event or message.  It is a cardinal rule of almost

every known operating system that one  must  not  invoke  the  scheduler