Provided that one can get the proper advice from one's higher level
protocols, it is possible to implement such a strategy. For example,
one could program the TCP level so that whenever it retransmitted a
segment more than once, it sent a hint down to the IP layer which
triggered polling. This strategy does not have excessive overhead, but
does have the problem that the host may be somewhat slow to respond to
an error, since only after polling has started will the host be able to
confirm that something has gone wrong, and by then the TCP above may
have already timed out.

Both forms of polling suffer from a minor flaw. Hosts as well as
gateways respond to ICMP echo messages. Thus, polling cannot be used to
detect the error that a foreign address thought to be a gateway is
actually a host. Such a confusion can arise if the physical addresses
of machines are rearranged.

4. TRIGGERED RESELECTION

There is a strategy which makes use of a hint from a higher level,
as did the previous strategy, but which avoids polling altogether.
Whenever a higher level complains that the service seems to be
defective, the Internet layer can pick the next gateway from the list of
available gateways, and switch to it. Assuming that this gateway is up,
no real harm can come of this decision, even if it was wrong, for the
worst that will happen is a redirect message which instructs the host to
return to the gateway originally being used. If, on the other hand, the
original gateway was indeed down, then this immediately provides a new
route, so the period of time until recovery is shortened. This last
strategy seems particularly clever, and is probably the most generally
suitable for those cases where the network itself does not provide fault
isolation. (Regrettably, I have forgotten who suggested this idea to me.
It is not my invention.)

5. Higher Level Fault Detection

The previous discussion has concentrated on fault detection and
recovery at the IP layer.
This section considers what the higher layers
such as TCP should do.

TCP has a single fault recovery action; it repeatedly retransmits a
segment until either it gets an acknowledgement or its connection timer
expires. As discussed above, it may use retransmission as an event to
trigger a request for fault recovery to the IP layer. In the other
direction, information may flow up from IP, reporting such things as
ICMP Destination Unreachable or error messages from the attached
network. The only subtle question about TCP and faults is what TCP
should do when such an error message arrives or its connection timer
expires.

The TCP specification discusses the timer. In the description of
the open call, the timeout is described as an optional value that the
client of TCP may specify; if any segment remains unacknowledged for
this period, TCP should abort the connection. The default for the
timeout is 30 seconds. Early TCPs were often implemented with a fixed
timeout interval, but this did not work well in practice, as the
following discussion may suggest.

Clients of TCP can be divided into two classes: those running on
immediate behalf of a human, such as Telnet, and those supporting a
program, such as a mail sender. Humans require a sophisticated response
to errors. Depending on exactly what went wrong, they may want to
abandon the connection at once, or wait for a long time to see if things
get better. Programs do not have this human impatience, but also lack
the power to make complex decisions based on details of the exact error
condition. For them, a simple timeout is reasonable.

Based on these considerations, at least two modes of operation are
needed in TCP. One, for programs, abandons the connection without
exception if the TCP timer expires. The other mode, suitable for
people, never abandons the connection on its own initiative, but reports
to the layer above when the timer expires.
Thus, the human user can see
error messages coming from all the relevant layers, TCP and ICMP, and
can request TCP to abort as appropriate. This second mode requires that
TCP be able to send an asynchronous message up to its client to report
the timeout, and it requires that error messages arriving at lower
layers similarly flow up through TCP.

At levels above TCP, fault detection is also required. Either of
the following can happen. First, the foreign client of TCP can fail,
even though TCP is still running, so data is still acknowledged and the
timer never expires. Alternatively, the communication path can fail,
without the TCP timer going off, because the local client has no data to
send. Both of these have caused trouble.

Sending mail provides an example of the first case. When sending
mail using SMTP, there is an SMTP level acknowledgement that is returned
when a piece of mail is successfully delivered. Several early mail
receiving programs would crash just at the point where they had received
all of the mail text (so TCP did not detect a timeout due to outstanding
unacknowledged data) but before the mail was acknowledged at the SMTP
level. This failure would cause early mail senders to wait forever for
the SMTP level acknowledgement. The obvious cure was to set a timer at
the SMTP level, but the first attempt to do this did not work, for there
was no simple way to select the timer interval. If the interval
selected was short, it expired in normal operation when sending a
large file to a slow host. An interval of many minutes was needed to
prevent false timeouts, but that meant that failures were detected only
very slowly. The current solution in several mailers is to pick a
timeout interval proportional to the size of the message.

Server telnet provides an example of the other kind of failure. The
communications link can easily fail while there is no traffic flowing,
perhaps because the user is thinking.
Eventually, the user will attempt to type something, at which time he
will discover that the connection is dead and abort it. But the host
end of the connection, having nothing to send, will not discover
anything wrong, and will remain waiting forever. In some systems there
is no way for a user in a different process to destroy or take over such
a hanging process, so there is no way to recover.

One solution to this would be to have the host server telnet query
the user end now and then, to see if it is still up. (Telnet does not
have an explicit query feature, but the host could negotiate some
unimportant option, which should produce either agreement or
disagreement in return.) The only problem with this is that a
reasonable sample interval, if applied to every user on a large system,
can generate an unacceptable amount of traffic and system overhead. A
smart server telnet would use this query only when something seems
wrong, perhaps when there had been no user activity for some time.

In both these cases, the general conclusion is that client level
error detection is needed, and that the details of the mechanism are
very dependent on the application. Application programmers must be made
aware of the problem of failures, and must understand that error
detection at the TCP or lower level cannot solve the whole problem for
them.

6. Knowing When to Give Up

It is not obvious, when error messages such as ICMP Destination
Unreachable arrive, whether TCP should abandon the connection. The
reason that error messages are difficult to interpret is that, as
discussed above, after a failure of a gateway or network, there is a
transient period during which the gateways may have incorrect
information, so that irrelevant or incorrect error messages may
sometimes return. An isolated ICMP Destination Unreachable may arrive
at a host, for example, if a packet is sent during the period when the
gateways are trying to find a new route.
To abandon a TCP connection
based on such a message arriving would be to ignore the valuable feature
of the Internet that for many internal failures it reconstructs its
function without any disruption of the end points.

But if failure messages do not imply a failure, what are they for?
In fact, error messages serve several important purposes. First, if
they arrive in response to opening a new connection, they probably are
caused by opening the connection improperly (e.g., to a non-existent
address) rather than by a transient network failure. Second, they
provide valuable information, after the TCP timeout has occurred, as to
the probable cause of the failure. Finally, certain messages, such as
ICMP Parameter Problem, imply a possible implementation problem. In
general, error messages give valuable information about what went wrong,
but are not to be taken as absolutely reliable. A general alerting
mechanism, such as the TCP timeout discussed above, provides a good
indication that whatever is wrong is a serious condition, but without
the advisory messages to augment the timer, there is no way for the
client to know how to respond to the error. The combination of the
timer and the advice from the error messages provides a reasonable set
of facts for the client layer to have. It is important that error
messages from all layers be passed up to the client module in a useful
and consistent way.
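
The policy described in sections 5 and 6 can be illustrated with a
small sketch. This is not taken from any real TCP implementation: the
class FaultAdvisor, its method names, and the State and Action values
are all hypothetical, chosen only to show how advisory error messages
and the connection timer might be combined. In particular, reporting an
error to the client immediately when it arrives during an open is an
assumption; the text says only that such errors are probably caused by
an improper open.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    OPENING = "opening"          # open requested, connection not yet established
    ESTABLISHED = "established"  # connection is up


class Action(Enum):
    CONTINUE = "continue"  # keep retransmitting; the message is advice only
    REPORT = "report"      # pass the facts up and let the client decide
    ABORT = "abort"        # abandon the connection


@dataclass
class FaultAdvisor:
    """Hypothetical helper combining the timer with advisory error messages."""

    abort_on_timeout: bool  # True for program clients, False for human users
    state: State = State.OPENING
    advice: list = field(default_factory=list)  # error messages seen so far

    def on_error_message(self, message: str) -> Action:
        """Record an error message (e.g. ICMP Destination Unreachable)."""
        self.advice.append(message)
        if self.state is State.OPENING:
            # Assumption: an error in response to an open probably means
            # the open was improper, so surface it to the client at once.
            return Action.REPORT
        # On an established connection the message may be a transient
        # artifact of rerouting, so it never aborts the connection itself.
        return Action.CONTINUE

    def on_timeout(self) -> tuple:
        """Connection timer expired: the condition is now considered serious."""
        action = Action.ABORT if self.abort_on_timeout else Action.REPORT
        # Either way, hand the collected advice to the client so it can
        # see the probable cause of the failure.
        return action, list(self.advice)
```

With abort_on_timeout=False (the mode suited to people), an isolated
"Destination Unreachable" on an established connection yields
Action.CONTINUE, and a later timeout yields Action.REPORT together with
the recorded messages. With abort_on_timeout=True (the mode suited to
programs), the same timeout yields Action.ABORT.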