📄 rfc816.txt


Provided that one can get the proper  advice  from  one's  higher  level

protocols,  it  is  possible to implement such a strategy.  For example,

one could program the TCP level so  that  whenever  it  retransmitted  a

segment  more  than  once,  it  sent  a  hint down to the IP layer which

triggered polling.  This strategy does not have excessive overhead,  but

does  have  the problem that the host may be somewhat slow to respond to

an error, since only after polling has started will the host be able  to

confirm  that  something  has  gone wrong, and by then the TCP above may

have already timed out.
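
     A minimal sketch of this hint-triggered polling follows.  The class
names and the send_echo callable are hypothetical, not taken from any
real implementation; send_echo stands in for whatever sends an ICMP echo
and reports whether a reply came back.

```python
class GatewayMonitor:
    """IP-layer state for one gateway; polls only after a hint from above."""

    def __init__(self, gateway, send_echo):
        self.gateway = gateway
        self.send_echo = send_echo  # assumed callable: gateway -> bool (reply seen)
        self.polling = False

    def hint_from_transport(self):
        """Called by the transport layer when it suspects trouble."""
        self.polling = True

    def poll_once(self):
        """Send one echo if polling is active; return True if the gateway looks down."""
        if not self.polling:
            return False
        if self.send_echo(self.gateway):
            self.polling = False  # gateway answered, so stop polling
            return False
        return True


class TcpSegmentState:
    """Tracks retransmissions of one segment and hints the IP layer."""

    def __init__(self, monitor):
        self.monitor = monitor
        self.retransmissions = 0

    def on_retransmit(self):
        self.retransmissions += 1
        if self.retransmissions > 1:  # "more than once", as in the text
            self.monitor.hint_from_transport()
```

Note that the monitor learns nothing until the second retransmission,
which is exactly the slowness described above.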


     Both forms of polling suffer from a minor flaw.  Hosts as  well  as

gateways respond to ICMP echo messages.  Thus, polling cannot be used to

detect  the  error  that  a  foreign  address thought to be a gateway is

actually a host.  Such a confusion can arise if the  physical  addresses

of machines are rearranged.


     4.  Triggered Reselection


     There  is a strategy which makes use of a hint from a higher level,

as did the previous  strategy,  but  which  avoids  polling  altogether.

Whenever  a  higher  level  complains  that  the  service  seems  to  be

defective, the Internet layer can pick the next gateway from the list of

available gateways, and switch to it.  Assuming that this gateway is up,

no real harm can come of this decision, even if it was  wrong,  for  the

worst that will happen is a redirect message which instructs the host to

return to the gateway originally being used.  If, on the other hand, the

original  gateway  was indeed down, then this immediately provides a new

route, so the period of time until recovery is  shortened.    This  last

strategy  seems  particularly clever, and is probably the most generally

suitable for those cases where the network itself does not provide fault

isolation.  (Regrettably, I have forgotten who suggested this idea to me.

It is not my invention.)
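
     Triggered reselection might be sketched as below.  GatewaySelector
and its method names are hypothetical, and wrapping around to the start
of the list is my assumption; the text says only to pick the next
gateway.  A real host would also keep redirects per destination, which
this sketch ignores.

```python
class GatewaySelector:
    """Keeps a list of candidate gateways and the one currently in use."""

    def __init__(self, gateways):
        if not gateways:
            raise ValueError("at least one gateway is required")
        self.gateways = list(gateways)
        self.index = 0

    @property
    def current(self):
        return self.gateways[self.index]

    def complain(self):
        """A higher level reports defective service: switch to the next gateway."""
        self.index = (self.index + 1) % len(self.gateways)
        return self.current

    def redirect(self, gateway):
        """A redirect names the gateway to use; follow it."""
        if gateway not in self.gateways:
            self.gateways.append(gateway)
        self.index = self.gateways.index(gateway)
```

If the complaint was unfounded, the cost is one redirect, which puts the
host back on the original gateway.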


     5.  Higher Level Fault Detection


     The  previous  discussion  has  concentrated on fault detection and

recovery at the IP layer.  This section considers what the higher layers

such as TCP should do.


     TCP has a single fault recovery action; it repeatedly retransmits a

segment until either it gets an acknowledgement or its connection  timer

expires.    As discussed above, it may use retransmission as an event to

trigger a request for fault recovery to the IP  layer.    In  the  other

direction,  information  may  flow  up from IP, reporting such things as

ICMP  Destination  Unreachable  or  error  messages  from  the  attached

network.    The  only  subtle  question about TCP and faults is what TCP

should do when such an error message arrives  or  its  connection  timer

expires.


     The  TCP  specification discusses the timer.  In the description of

the open call, the timeout is described as an optional  value  that  the

client  of  TCP  may  specify; if any segment remains unacknowledged for

this period, TCP should abort the  connection.    The  default  for  the

timeout  is  30 seconds.  Early TCPs were often implemented with a fixed

timeout interval, but this  did  not  work  well  in  practice,  as  the

following discussion may suggest.


     Clients  of  TCP can be divided into two classes:  those running on

immediate behalf of a human, such as  Telnet,  and  those  supporting  a

program, such as a mail sender.  Humans require a sophisticated response

to  errors.    Depending  on  exactly  what went wrong, they may want to

abandon the connection at once, or wait for a long time to see if things

get  better.   Programs do not have this human impatience, but also lack

the power to make complex decisions based on details of the exact  error

condition.  For them, a simple timeout is reasonable.


     Based  on these considerations, at least two modes of operation are

needed in TCP.  One,  for  programs,  abandons  the  connection  without

exception  if  the  TCP  timer  expires.    The other mode, suitable for

people, never abandons the connection on its own initiative, but reports

to the layer above when the timer expires.  Thus, the human user can see

error messages coming from all the relevant layers, TCP  and  ICMP,  and

can request TCP to abort as appropriate.  This second mode requires that

TCP  be  able to send an asynchronous message up to its client to report

the timeout, and it requires  that  error  messages  arriving  at  lower

layers similarly flow up through TCP.
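
     The two modes can be sketched as follows.  The names TimeoutMode,
Connection, and the notify callback are hypothetical; notify stands in
for whatever asynchronous channel a real TCP offers its client.

```python
from enum import Enum


class TimeoutMode(Enum):
    ABORT = "abort"    # for programs: give up when the timer expires
    REPORT = "report"  # for people: report, and let the user decide


class Connection:
    """Models only the fault-handling decisions of one TCP connection."""

    def __init__(self, mode, notify):
        self.mode = mode
        self.notify = notify  # assumed callback: notify(event, detail)
        self.is_open = True

    def on_timer_expired(self):
        if self.mode is TimeoutMode.ABORT:
            self.is_open = False
            self.notify("aborted", "connection timer expired")
        else:
            # Never abandon the connection on our own initiative.
            self.notify("timeout", "connection timer expired")

    def on_lower_layer_error(self, message):
        """Pass errors from lower layers (e.g. ICMP) up to the client."""
        self.notify("error", message)

    def abort(self):
        """Explicit request from the client to abandon the connection."""
        self.is_open = False
```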


     At  levels  above TCP, fault detection is also required.  Either of

the following can happen.  First, the foreign client of  TCP  can  fail,

even  though TCP is still running, so data is still acknowledged and the

timer never expires.  Alternatively, the communication  path  can  fail,

without the TCP timer going off, because the local client has no data to

send.  Both of these have caused trouble.


     Sending  mail  provides an example of the first case.  When sending

mail using SMTP, there is an SMTP level acknowledgement that is returned

when a piece of mail is successfully  delivered.    Several  early  mail

receiving programs would crash just at the point where they had received

all of the mail text (so TCP did not detect a timeout due to outstanding

unacknowledged  data)  but  before the mail was acknowledged at the SMTP

level.  This failure would cause early mail senders to wait forever  for

the  SMTP level acknowledgement.  The obvious cure was to set a timer at

the SMTP level, but the first attempt to do this did not work, for there

was no simple way to  select  the  timer  interval.    If  the  interval

selected  was  short,  it  expired  in normal operation  when sending a

large file to a slow host.  An interval of many minutes  was  needed  to

prevent  false timeouts, but that meant that failures were detected only

very slowly.  The current solution in  several  mailers  is  to  pick  a

timeout interval proportional to the size of the message.
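
     As an illustration, a size-scaled timeout might be computed as
below.  The function name and both constants are assumptions chosen for
the example, not values used by any particular mailer, and the fixed
floor is my addition so that very small messages still get a usable
interval.

```python
def smtp_ack_timeout(message_bytes, min_rate_bytes_per_s=100.0, floor_s=60.0):
    """Seconds to wait for the SMTP-level acknowledgement.

    The wait grows in proportion to the message size, assuming the
    slowest acceptable receiver handles min_rate_bytes_per_s.
    """
    if message_bytes < 0:
        raise ValueError("message size cannot be negative")
    if min_rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return floor_s + message_bytes / min_rate_bytes_per_s
```

With these assumed constants a 100,000 byte message is given 1060
seconds, while an empty one is given only the 60 second floor.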


     Server telnet provides an example of the other kind of failure.  It

can  easily  happen  that  the communications link fails while there is

no traffic flowing, perhaps because the user is thinking.    Eventually,

the  user will attempt to type something, at which time he will discover

that the connection is dead and abort it.   But  the  host  end  of  the

connection,  having  nothing  to send, will not discover anything wrong,

and will remain waiting forever.  In some systems there is no way for  a

user  in  a  different  process  to  destroy or take over such a hanging

process, so there is no way to recover.


     One solution to this would be to have the host server telnet  query

the  user  end now and then, to see if it is still up.  (Telnet does not

have an explicit query  feature,  but  the  host  could  negotiate  some

unimportant   option,   which   should   produce   either  agreement  or

disagreement in  return.)    The  only  problem  with  this  is  that  a

reasonable  sample interval, if applied to every user on a large system,

can  generate  an unacceptable amount of traffic and system overhead.  A

smart server telnet would use  this  query  only  when  something  seems

wrong, perhaps when there had been no user activity for some time.
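
     One way to sketch such a "smart" server is shown here.  IdleProbe
and its parameters are hypothetical; times are plain numbers of seconds
supplied by the caller, and actually sending the query (for instance,
negotiating an unimportant option) is left out.

```python
class IdleProbe:
    """Decides when a server should query an apparently idle user end."""

    def __init__(self, idle_threshold_s, probe_interval_s, now=0.0):
        self.idle_threshold_s = idle_threshold_s  # quiet time before probing starts
        self.probe_interval_s = probe_interval_s  # spacing between probes
        self.last_activity = now
        self.last_probe = None

    def on_user_activity(self, now):
        """Any traffic from the user shows the connection is alive."""
        self.last_activity = now
        self.last_probe = None

    def should_probe(self, now):
        """Return True if a query should be sent at time now."""
        if now - self.last_activity < self.idle_threshold_s:
            return False  # user was active recently; nothing seems wrong
        if self.last_probe is not None and now - self.last_probe < self.probe_interval_s:
            return False  # probed recently; do not add traffic
        self.last_probe = now
        return True
```

Active users cost nothing under this scheme; only connections that have
gone quiet generate query traffic.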


     In  both  these  cases, the general conclusion is that client level

error detection is needed, and that the details  of  the  mechanism  are

very dependent on the application.  Application programmers must be made

aware  of  the  problem  of  failures,  and  must  understand that error

detection at the TCP or lower level cannot solve the whole  problem  for

them.


     6.  Knowing When to Give Up


     It  is  not  obvious,  when error messages such as ICMP Destination

Unreachable arrive, whether TCP should  abandon  the  connection.    The

reason  that  error  messages  are  difficult  to  interpret is that, as

discussed above, after a failure of a gateway or  network,  there  is  a

transient   period   during   which  the  gateways  may  have  incorrect

information,  so  that  irrelevant  or  incorrect  error  messages   may

sometimes  return.   An isolated ICMP Destination Unreachable may arrive

at a host, for example, if a packet is sent during the period  when  the

gateways  are  trying  to find a new route.  To abandon a TCP connection

based on such a message arriving would be to ignore the valuable feature

of the Internet that for many  internal  failures  it  reconstructs  its

function without any disruption of the end points.


     But  if failure messages do not imply a failure, what are they for?

In fact, error messages serve several important  purposes.    First,  if

they  arrive  in response to opening a new connection, they probably are

caused by opening the connection improperly  (e.g.,  to  a  non-existent

address)  rather  than  by  a  transient  network failure.  Second, they

provide valuable information, after the TCP timeout has occurred, as  to

the  probable  cause of the failure.  Finally, certain messages, such as

ICMP Parameter Problem, imply a possible  implementation  problem.    In

general, error messages give valuable information about what went wrong,

but  are  not  to  be  taken as absolutely reliable.  A general alerting

mechanism, such as the TCP timeout  discussed  above,  provides  a  good

indication  that  whatever  is wrong is a serious condition, but without

the advisory messages to augment the timer, there  is  no  way  for  the

client  to  know  how  to  respond to the error.  The combination of the

timer and the advice from the error messages provides a reasonable set of

facts for the client layer to have.  It is important that error messages

from all layers be passed up to  the  client  module  in  a  useful  and

consistent way.
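
     The division of labor between the timer and the advisory messages
might look like the sketch below.  FaultReport and its methods are
hypothetical.  Treating an error during the open as grounds to fail at
once is a policy I have assumed for illustration; the text says only
that such errors are probably caused by an improper open.

```python
class FaultReport:
    """Collects advisory error messages and combines them with the timer."""

    def __init__(self):
        self.advisories = []
        self.established = False

    def mark_established(self):
        self.established = True

    def on_error_message(self, text):
        """Record the message; return True only if it should fail the open now."""
        self.advisories.append(text)
        # Once established, a lone error may be transient, so it is
        # kept as advice rather than acted upon.
        return not self.established

    def on_timeout(self):
        """The timer says the fault is serious; the advisories say why."""
        return {"timed_out": True, "probable_causes": list(self.advisories)}
```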

