all the instances of the particular socket in all the processes that
share the socket, and close them all at once, if you wish to use
close(); or use shutdown() in one process to break the connection.

2.7. Please explain the TIME_WAIT state.
Remember that TCP guarantees all data transmitted will be delivered,
if at all possible. When you close a socket, the end that closes
first goes into the TIME_WAIT state, just to be really sure that all
the data has gone through. When a socket is closed, both sides agree,
by sending messages to each other, that they will send no more data.
This, it seemed to me, was good enough, and after the handshaking is
done, the socket should be closed. The problem is two-fold. First,
there is no way to be sure that the last ACK was communicated
successfully. Second, there may be "wandering duplicates" left on the
net that must be dealt with if they are delivered.
Andrew Gierth (andrew@erlenstar.demon.co.uk) helped to explain the
closing sequence in the following Usenet posting:
Assume that a connection is in ESTABLISHED state, and the client is
about to do an orderly release. The client's sequence no. is Sc, and
the server's is Ss. The pipe is empty in both directions.

   Client                                              Server
   ======                                              ======
   ESTABLISHED                                         ESTABLISHED
   (client closes)
   ESTABLISHED                                         ESTABLISHED

   <CTL=FIN+ACK><SEQ=Sc><ACK=Ss> ------->>
   FIN_WAIT_1
              <<-------- <CTL=ACK><SEQ=Ss><ACK=Sc+1>
   FIN_WAIT_2                                          CLOSE_WAIT
              <<-------- <CTL=FIN+ACK><SEQ=Ss><ACK=Sc+1>   (server closes)
                                                       LAST_ACK
   <CTL=ACK><SEQ=Sc+1><ACK=Ss+1> ------->>
   TIME_WAIT                                           CLOSED
   (2*msl elapses...)
   CLOSED

Note: the +1 on the sequence numbers is because the FIN counts as one
byte of data. (The above diagram is equivalent to fig. 13 from RFC
793.)
Now consider what happens if the last of those packets is dropped in
the network. The client has done with the connection; it has no more
data or control info to send, and never will have. But the server does
not know whether the client received all the data correctly; that's
what the last ACK segment is for. Now the server may or may not care
whether the client got the data, but that is not an issue for TCP; TCP
is a reliable protocol, and must distinguish between an orderly
connection close where all data is transferred, and a connection abort
where data may or may not have been lost.
So, if that last packet is dropped, the server will retransmit it (it
is, after all, an unacknowledged segment) and will expect to see a
suitable ACK segment in reply. If the client went straight to CLOSED,
the only possible response to that retransmit would be a RST, which
would indicate to the server that data had been lost, when in fact it
had not been.
(Bear in mind that the server's FIN segment may, additionally, contain
data.)
DISCLAIMER: This is my interpretation of the RFCs (I have read all the
TCP-related ones I could find), but I have not attempted to examine
implementation source code or trace actual connections in order to
verify it. I am satisfied that the logic is correct, though.
More commentary from Vic:
The second issue was addressed by Richard Stevens (rstevens@noao.edu,
author of "Unix Network Programming", see ``1.5 Where can I get source
code for the book [book title]?''). I have put together quotes from
some of his postings and email which explain this. I have brought
together paragraphs from different postings, and have made as few
changes as possible.
From Richard Stevens (rstevens@noao.edu):
If the duration of the TIME_WAIT state were just to handle TCP's
full-duplex close, then the time would be much smaller, and it would
be some function of the current RTO (retransmission timeout), not the
MSL (the packet lifetime).
A couple of points about the TIME_WAIT state:

o  The end that sends the first FIN goes into the TIME_WAIT state,
   because that is the end that sends the final ACK. If the other
   end's FIN is lost, or if the final ACK is lost, having the end
   that sends the first FIN maintain state about the connection
   guarantees that it has enough information to retransmit the final
   ACK.

o  Realize that TCP sequence numbers wrap around after 2**32 bytes
   have been transferred. Assume a connection between A.1500 (host A,
   port 1500) and B.2000. During the connection one segment is lost
   and retransmitted. But the segment is not really lost; it is held
   by some intermediate router and then re-injected into the network.
   (This is called a "wandering duplicate".) But in the time between
   the packet being lost & retransmitted, and then reappearing, the
   connection is closed (without any problems) and then another
   connection is established between the same hosts and ports (that
   is, A.1500 and B.2000; this is called another "incarnation" of the
   connection). But the sequence numbers chosen for the new
   incarnation just happen to overlap with the sequence number of the
   wandering duplicate that is about to reappear. (This is indeed
   possible, given the way sequence numbers are chosen for TCP
   connections.) Bingo, you are about to deliver the data from the
   wandering duplicate (the previous incarnation of the connection)
   to the new incarnation of the connection. To avoid this, you do
   not allow the same incarnation of the connection to be
   re-established until the TIME_WAIT state terminates.

   Even the TIME_WAIT state doesn't completely solve the second
   problem, given what is called TIME_WAIT assassination. RFC 1337
   has more details.
o  The reason that the duration of the TIME_WAIT state is 2*MSL is
   that the maximum amount of time a packet can wander around a
   network is assumed to be MSL seconds. The factor of 2 is for the
   round-trip. The recommended value for MSL is 120 seconds, but
   Berkeley-derived implementations normally use 30 seconds instead.
   This means a TIME_WAIT delay between 1 and 4 minutes. Solaris 2.x
   does indeed use the recommended MSL of 120 seconds.
A wandering duplicate is a packet that appeared to be lost and was
retransmitted. But it wasn't really lost: some router had problems,
held on to the packet for a while (on the order of seconds; it could
be a minute if the TTL is large enough) and then re-injected it into
the network. But by the time it reappears, the application that sent
it originally has already retransmitted the data contained in that
packet.
Because of these potential problems with TIME_WAIT assassinations, one
should not avoid the TIME_WAIT state by setting the SO_LINGER option
to send an RST instead of the normal TCP connection termination
(FIN/ACK/FIN/ACK). The TIME_WAIT state is there for a reason; it's
your friend and it's there to help you :-)
I have a long discussion of just this topic in my just-released
"TCP/IP Illustrated, Volume 3". The TIME_WAIT state is indeed one of
the most misunderstood features of TCP.
I'm currently rewriting "Unix Network Programming" (see ``1.5 Where
can I get source code for the book [book title]?'') and will include
lots more on this topic, as it is often confusing and misunderstood.
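A practical note from the editor, not part of the quoted postings:
rather than bypassing TIME_WAIT, a restarted server normally sets
SO_REUSEADDR, which lets bind() succeed on its well-known port while
connections from the previous instance are still in TIME_WAIT. A
minimal sketch for a BSD-sockets system; the function name is
invented for illustration:

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP listening socket with SO_REUSEADDR set, so a restarted
 * server can rebind its port even while old connections from a
 * previous instance linger in TIME_WAIT.  Returns the listening fd,
 * or -1 on error.  Pass port 0 to let the kernel pick a free port. */
int make_reusable_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note that SO_REUSEADDR only affects bind(); on most implementations
the kernel still protects the old connection's 4-tuple itself, so the
TIME_WAIT guarantees above are not defeated.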
An additional note from Andrew:
Closing a socket: if SO_LINGER has not been called on a socket, then
close() is not supposed to discard data. This is true on SVR4.2 (and,
apparently, on all non-SVR4 systems) but apparently not on SVR4; the
use of either shutdown() or SO_LINGER seems to be required to
guarantee delivery of all data.
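One portable way to sidestep that difference is the pattern Andrew
alludes to: announce EOF with shutdown(), then read until the peer
closes its side, and only then close(), so the descriptor is not
discarded before the peer has seen everything. A sketch only: the
name graceful_close is invented here, and a real application would
bound the draining loop with a timeout:

```c
#include <sys/socket.h>
#include <unistd.h>

/* Orderly close: send our FIN with shutdown(), drain the socket until
 * the peer closes (read() returns 0), then close().  Returns 0 on
 * success, -1 on error. */
int graceful_close(int fd)
{
    char buf[512];
    ssize_t n;

    if (shutdown(fd, SHUT_WR) < 0)    /* we will send no more data */
        return -1;

    /* Consume anything the peer still sends until it closes its end
     * or an error occurs. */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        ;
    int err = (n < 0) ? -1 : 0;
    return (close(fd) < 0) ? -1 : err;
}
```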

2.8. Why does it take so long to detect that the peer died?
From Andrew Gierth (andrew@erlenstar.demon.co.uk):
Because, by default, no packets are sent on the TCP connection unless
there is data to send or acknowledge.
So, if you are simply waiting for data from the peer, there is no way
to tell whether the peer has silently gone away, or just isn't ready
to send any more data yet. This can be a problem (especially if the
peer is a PC, and the user just hits the Big Switch...).
One solution is to use the SO_KEEPALIVE option. This option enables
periodic probing of the connection to ensure that the peer is still
present. BE WARNED: the default timeout for this option is AT LEAST 2
HOURS. This timeout can often be altered (in a system-dependent
fashion) but not normally on a per-connection basis (AFAIK).
RFC 1122 specifies that this timeout (if it exists) must be
configurable. On the majority of Unix variants, this configuration
may only be done globally, affecting all TCP connections which have
keepalive enabled. The method of changing the value, moreover, is
often difficult and/or poorly documented, and in any case is different
for just about every version in existence.
If you must change the value, look for something resembling
tcp_keepidle in your kernel configuration or network options
configuration.
If you're sending to the peer, though, you have some better
guarantees; since sending data implies receiving ACKs from the peer,
you will know after the retransmit timeout whether the peer is still
alive. But the retransmit timeout is designed to allow for various
contingencies, with the intention that TCP connections are not
dropped simply as a result of minor network upsets. So you should
still expect a delay of several minutes before getting notification
of the failure.
The approach taken by most application protocols currently in use on
the Internet (e.g. FTP, SMTP etc.) is to implement read timeouts on
the server end; the server simply gives up on the client if no
requests are received in a given time period (often of the order of
15 minutes). Protocols where the connection is maintained even if
idle for long periods have two choices:
1. use SO_KEEPALIVE
2. use a higher-level keepalive mechanism (such as sending a null
   request to the server every so often).
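For option 1, enabling keepalive is a single setsockopt() on the
connected socket; the (at least) two-hour default probing interval
described above then applies. A sketch; the helper name is invented
for illustration:

```c
#include <sys/socket.h>

/* Enable TCP keepalive probes on a connected socket.  Once set, the
 * kernel periodically probes the idle connection and eventually
 * reports a dead peer as an error on a subsequent read().
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}
```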

2.9. What are the pros/cons of select(), non-blocking I/O and SIGIO?
Using non-blocking I/O means that you have to poll sockets to see if
there is data to be read from them. Polling should usually be avoided
since it uses more CPU time than other techniques.
Using SIGIO allows your application to do what it does and have the
operating system tell it (with a signal) that there is data waiting
for it on a socket. The only drawback to this solution is that it can
be confusing, and if you are dealing with multiple sockets you will
have to do a select() anyway to find out which one(s) are ready to be
read.
Using select() is great if your application has to accept data from
more than one socket at a time, since it will block until any one of
a number of sockets is ready with data. One other advantage of
select() is that you can set a time-out value after which control
will be returned to you, whether any of the sockets have data for you
or not.
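That time-out behaviour looks like this in practice. A minimal sketch
of the standard select()-with-timeout pattern (the helper name is
ours): it returns 1 when the descriptor is readable, 0 on time-out,
-1 on error.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for fd to become readable.
 * Blocks in select() until data arrives or the timeout elapses,
 * whichever comes first. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    struct timeval tv;
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

The same call works for more than one descriptor: add each to the
fd_set and pass the largest fd plus one as the first argument.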

2.10. Why do I get EPROTO from read()?
From Steve Rago (sar@plc.com):
EPROTO means that the protocol encountered an unrecoverable error for
that endpoint. EPROTO is one of those catch-all error codes used by
STREAMS-based drivers when a better code isn't available.