RFC: 817

MODULARITY AND EFFICIENCY IN PROTOCOL IMPLEMENTATION

David D. Clark
MIT Laboratory for Computer Science
Computer Systems and Communications Group
July, 1982

1. Introduction

Many protocol implementers have made the unpleasant discovery that their packages do not run quite as fast as they had hoped. The blame for this widely observed problem has been attributed to a variety of causes, ranging from details in the design of the protocol to the underlying structure of the host operating system. This RFC will discuss some of the commonly encountered reasons why protocol implementations seem to run slowly.

Experience suggests that one of the most important factors in determining the performance of an implementation is the manner in which that implementation is modularized and integrated into the host operating system. For this reason, it is useful to discuss the question of how an implementation is structured at the same time that we consider how it will perform. In fact, this RFC will argue that modularity is one of the chief villains in attempting to obtain good performance, so that the designer is faced with a delicate and inevitable tradeoff between good structure and good performance. Further, the single factor which most strongly determines how well this conflict can be resolved is not the protocol but the operating system.

2. Efficiency Considerations

There are many aspects to efficiency. One aspect is sending data at minimum transmission cost, which is a critical aspect of common carrier communications, if not in local area network communications. Another aspect is sending data at a high rate, which may not be possible at all if the net is very slow, but which may be the one central design constraint when taking advantage of a local net with high raw bandwidth. The final consideration is doing the above with minimum expenditure of computer resources.
This last may be necessary to achieve high speed, but in the case of the slow net may be important only in that the resources used up, for example cpu cycles, are costly or otherwise needed. It is worth pointing out that these different goals often conflict; for example it is often possible to trade off efficient use of the computer against efficient use of the network. Thus, there may be no such thing as a successful general purpose protocol implementation.

The simplest measure of performance is throughput, measured in bits per second. It is worth doing a few simple computations in order to get a feeling for the magnitude of the problems involved. Assume that data is being sent from one machine to another in packets of 576 bytes, the maximum generally acceptable internet packet size. Allowing for header overhead, this packet size permits 4288 bits in each packet. If a useful throughput of 10,000 bits per second is desired, then a data bearing packet must leave the sending host about every 430 milliseconds, a little over two per second. This is clearly not difficult to achieve. However, if one wishes to achieve 100 kilobits per second throughput, the packet must leave the host every 43 milliseconds, and to achieve one megabit per second, which is not at all unreasonable on a high-speed local net, the packets must be spaced no more than 4.3 milliseconds.

These latter numbers are a slightly more alarming goal for which to set one's sights. Many operating systems take a substantial fraction of a millisecond just to service an interrupt. If the protocol has been structured as a process, it is necessary to go through a process scheduling before the protocol code can even begin to run. If any piece of a protocol package or its data must be fetched from disk, real time delays of between 30 and 100 milliseconds can be expected. If the protocol must compete for cpu resources with other processes of the system, it may be necessary to wait a scheduling quantum before the protocol can run.
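
The packet-spacing figures above follow from simple arithmetic, and a short sketch can reproduce them. The Python below is illustrative only and is not part of the original analysis. The function names are hypothetical, and the 40-byte header allowance is an assumption inferred from the 4288-bit figure (576 bytes less 40 bytes leaves 536 bytes, or 4288 bits), which would correspond to IP and TCP headers without options.

```python
# Sketch: reproduce the packet-spacing figures quoted in the text.

PACKET_BYTES = 576  # packet size used in the text
HEADER_BYTES = 40   # ASSUMPTION: inferred from the 4288-bit payload figure


def payload_bits(packet_bytes=PACKET_BYTES, header_bytes=HEADER_BYTES):
    """Useful data bits carried by one packet after header overhead."""
    return (packet_bytes - header_bytes) * 8


def packet_interval_ms(throughput_bps, bits_per_packet=None):
    """Longest allowed spacing between packets, in milliseconds,
    that still sustains the requested useful throughput."""
    if bits_per_packet is None:
        bits_per_packet = payload_bits()
    return 1000.0 * bits_per_packet / throughput_bps


if __name__ == "__main__":
    for rate in (10_000, 100_000, 1_000_000):
        interval = packet_interval_ms(rate)
        print(f"{rate:>9} bit/s -> one packet every {interval:.1f} ms")
```

Run as a script, this prints intervals of 428.8, 42.9 and 4.3 milliseconds, which match the rounded figures of 430, 43 and 4.3 milliseconds in the text. Any per-packet cost, such as an interrupt, a process switch or a disk fetch, has to fit inside that interval.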
Many systems have a scheduling quantum of 100 milliseconds or more. Considering these sorts of numbers, it becomes immediately clear that the protocol must be fitted into the operating system in a thorough and effective manner if anything like reasonable throughput is to be achieved.

There is one obvious conclusion immediately suggested by even this simple analysis. Except in very special circumstances, when many packets are being processed at once, the cost of processing a packet is dominated by factors, such as cpu scheduling, which are independent of the packet size. This suggests two general rules which any implementation ought to obey. First, send data in large packets. Obviously, if processing time per packet is a constant, then throughput will be directly proportional to the packet size. Second, never send an unneeded packet. Unneeded packets use up just as many resources as a packet full of data, but perform no useful function. RFC 813, "Window and Acknowledgement Strategy in TCP", discusses one aspect of reducing the number of packets sent per useful data byte. This document will mention other attacks on the same problem.

The above analysis suggests that there are two main parts to the problem of achieving good protocol performance. The first has to do with how the protocol implementation is integrated into the host operating system. The second has to do with how the protocol package itself is organized internally. This document will consider each of these topics in turn.

3. The Protocol vs. the Operating System

There are normally three reasonable ways in which to add a protocol to an operating system. The protocol can be in a process that is provided by the operating system, or it can be part of the kernel of the operating system itself, or it can be put in a separate communications processor or front end machine.
This decision is strongly influenced by details of hardware architecture and operating system design; each of these three approaches has its own advantages and disadvantages.

The "process" is the abstraction which most operating systems use to provide the execution environment for user programs. A very simple path for implementing a protocol is to obtain a process from the operating system and implement the protocol to run in it. Superficially, this approach has a number of advantages. Since modifications to the kernel are not required, the job can be done by someone who is not an expert in the kernel structure. Since it is often impossible to find somebody who is experienced both in the structure of the operating system and the structure of the protocol, this path, from a management point of view, is often extremely appealing. Unfortunately, putting a protocol in a process has a number of disadvantages, related to both structure and performance. First, as was discussed above, process scheduling can be a significant source of real-time delay. There is not only the actual cost of going through the scheduler, but the problem that the operating system may not have the right sort of priority tools to bring the process into execution quickly whenever there is work to be done.

Structurally, the difficulty with putting a protocol in a process is that the protocol may be providing services, for example support of data streams, which are normally obtained by going to special kernel entry points. Depending on the generality of the operating system, it may be impossible to take a program which is accustomed to reading through a kernel entry point, and redirect it so it is reading the data from a process. The most extreme example of this problem occurs when implementing server telnet. In almost all systems, the device handler for the locally attached teletypes is located inside the kernel, and programs read and write from their teletype by making kernel calls.
If server telnet is implemented in a process, it is then necessary to take the data streams provided by server telnet and somehow get them back down inside the kernel so that they mimic the interface provided by local teletypes. It is usually the case that special kernel modification is necessary to achieve this structure, which somewhat defeats the benefit of having removed the protocol from the kernel in the first place.

Clearly, then, there are advantages to putting the protocol package in the kernel. Structurally, it is reasonable to view the network as a device, and device drivers are traditionally contained in the kernel. Presumably, the problems associated with process scheduling can be sidestepped, at least to a certain extent, by placing the code inside the kernel. And it is obviously easier to make the server telnet channels mimic the local teletype channels if they are both realized at the same level in the kernel.

However, implementation of protocols in the kernel has its own set of pitfalls. First, network protocols have a characteristic which is shared by almost no other device: they require rather complex actions to be performed as a result of a timeout. The problem with this requirement is that the kernel often has no facility by which a program can be brought into execution as a result of the timer event. What is really needed, of course, is a special sort of process inside the kernel. Most systems lack this mechanism. Failing that, the only execution mechanism available is to run at interrupt time.

There are substantial drawbacks to implementing a protocol to run at interrupt time. First, the actions performed may be somewhat complex and time consuming, compared to the maximum amount of time that the operating system is prepared to spend servicing an interrupt. Problems can arise if interrupts are masked for too long. This is particularly bad when running as a result of a clock interrupt, which can imply that the clock interrupt is masked.
Second, the environment provided by an interrupt handler is usually extremely primitive compared to the environment of a process. There are usually a variety of system facilities which are unavailable while running in an interrupt handler. The most important of these is the ability to suspend execution pending the arrival of some event or message. It is a cardinal rule of almost every known operating system that one must not invoke the scheduler while running in an interrupt handler. Thus, the programmer who is forced to implement all or part of his protocol package as an interrupt handler must be the best sort of expert in the operating system involved, and must be prepared for development sessions filled with obscure bugs which crash not just the protocol package but the entire operating system.

A final problem with processing at interrupt time is that the system scheduler has no control over the percentage of system time used by the protocol handler. If a large number of packets arrive, from a foreign host that is either malfunctioning or fast, all of the time may be spent in the interrupt handler, effectively killing the system.

There are other problems associated with putting protocols into an operating system kernel. The simplest problem often encountered is that the kernel address space is simply too small to hold the piece of code in question. This is a rather artificial sort of problem, but it is a severe problem none the less in many machines. It is an appallingly unpleasant experience to do an implementation with the knowledge that for every byte of new feature put in one must find some other byte of old feature to throw out. It is hopeless to expect an effective and general implementation under this kind of constraint. Another problem is that the protocol package, once it is thoroughly entwined in the operating system, may need to be redone every time the operating system changes.
If the protocol and the operating system are not maintained by the same group, this makes maintenance of the protocol package a perpetual headache.

The third option for protocol implementation is to take the protocol package and move it outside the machine entirely, on to a separate processor dedicated to this kind of task. Such a machine is often described as a communications processor or a front-end processor. There are several advantages to this approach. First, the operating system on the communications processor can be tailored for precisely this kind of task. This makes the job of implementation much easier. Second, one does not need to redo the task for every machine to which the protocol is to be added. It may be possible to reuse the same front-end machine on different host computers. Since the task need not be done as many times, one might hope that more attention could be paid to doing it right. Given a careful implementation in an environment which is optimized for this kind of task, the resulting package should turn out to be very efficient. Unfortunately, there are also problems with this approach. There is, of course, a financial problem associated with buying an additional computer. In many cases, this is not a problem at all since the cost is negligible compared to what the programmer would cost to do the job in the mainframe itself. More fundamentally, the communications processor approach does not completely sidestep any of the problems raised above. The reason is that the communications processor, since it is a separate machine, must be attached to the mainframe by some mechanism. Whatever that mechanism, code is required in the mainframe to deal with it. It can be argued that the program to deal with the communications processor is simpler than the program to implement the entire protocol package. Even if that is so, the communications processor interface package is still a protocol in nature, with all of the same structural problems.
Thus, all of the issues raised above must still be faced. In addition to those