RFC 2923 TCP Problems with Path MTU Discovery September 2000
Note that, under IPv6, there is no DF bit -- it is implicitly on
at all times. Fragmentation is not allowed in routers, only at
the originating host. Fortunately, the minimum supported MTU for
IPv6 is 1280 octets, which is significantly larger than the 68
octet minimum in IPv4. This should make it more reasonable for
IPv6 TCP implementations to fall back to 1280 octet packets, whereas
IPv4 implementations will probably have to turn off DF to respond
to black hole detection.
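The fallback choice described above can be sketched as follows. This is only an illustration: the function name and the return shape are hypothetical, and the header sizes assume no IPv6 extension headers and no TCP options.

```python
IPV6_MIN_MTU = 1280   # minimum link MTU that every IPv6 path must support
IPV6_HEADER = 40      # fixed IPv6 header; no extension headers assumed
TCP_HEADER = 20       # TCP header without options


def black_hole_fallback(ip_version):
    """Return (mss, df_set) to use once a black hole is suspected.

    Hypothetical helper. An mss of None means "leave the segment size
    unchanged". Real stacks may try several sizes before giving up.
    """
    if ip_version == 6:
        # Routers never fragment IPv6, so shrink to the guaranteed minimum.
        return IPV6_MIN_MTU - IPV6_HEADER - TCP_HEADER, True
    if ip_version == 4:
        # The 68-octet IPv4 minimum is impractically small, so keep the
        # segment size and clear DF so that routers may fragment.
        return None, False
    raise ValueError("unknown IP version")
```

Under these assumptions an IPv6 connection would fall back to 1220-octet segments, while an IPv4 connection would keep its segment size and stop setting DF.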
Ideally, the ICMP black holes should be fixed when they are found.
If hosts start to implement black hole detection, it may be that
these problems will go unnoticed and unfixed. This is especially
unfortunate, since detection can take several seconds each time,
and these delays could result in a significant, hidden degradation
of performance. Hosts that implement black hole detection should
probably log detected black holes, so that they can be fixed.
2.2.
Name of Problem
Stretch ACK due to PMTUD
Classification
Congestion Control / Performance
Description
When a naively implemented TCP stack communicates with a
PMTUD-equipped stack, it will try to generate an ACK for every second
full-sized segment. If it decides what counts as a full-sized segment
from the advertised MSS, ACK generation can degrade badly once PMTUD
reduces the segment size.
The PMTU can wind up being a small fraction of the advertised MSS;
in this case, an ACK would be generated only very infrequently.
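A minimal sketch of the effect, assuming a receiver that ACKs only once the unacknowledged data reaches two "full-sized" segments and ignoring the delayed ACK timer. The function name is hypothetical; the example values (advertised MSS of 4312 octets, 1448-octet segments) are the ones seen in the trace in this section.

```python
def segments_per_ack(full_size, segment_size):
    """Count arriving segments until the receiver emits an ACK.

    Models a receiver that ACKs once unacknowledged data reaches two
    "full-sized" segments, where full_size is the receiver's own notion
    of full-sized (here, the advertised MSS).
    """
    unacked = 0
    count = 0
    while unacked < 2 * full_size:
        unacked += segment_size
        count += 1
    return count


# Receiver sizes "full" from the advertised MSS, sender is PMTU-limited.
naive = segments_per_ack(full_size=4312, segment_size=1448)
# Receiver sizes "full" from the segments actually arriving.
correct = segments_per_ack(full_size=1448, segment_size=1448)
```

With these values the naive receiver waits for six segments per ACK instead of two; the smaller the PMTU relative to the advertised MSS, the longer the wait.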
Significance
Stretch ACKs have a variety of unfortunate effects, more fully
outlined in [RFC2525]. Most of these have to do with encouraging
a more bursty connection, due to the infrequent arrival of ACKs.
They can also impede congestion window growth.
Implications
The complete implications of stretch ACKs are outlined in
[RFC2525].
Lahey Informational [Page 6]
Relevant RFCs
RFC 1122 outlines the requirements for frequency of ACK
generation. [RFC2581] expands on this and clarifies that delayed
ACK is a SHOULD, not a MUST.
Trace file demonstrating it
Made using tcpdump recording at an intermediate host. The
timestamp options from all but the first two packets have been
removed for clarity.
18:16:52.976657 A > B: S 3183102292:3183102292(0) win 16384
<mss 4312,nop,wscale 0,nop,nop,timestamp 12128 0> (DF)
18:16:52.979580 B > A: S 2022212745:2022212745(0) ack 3183102293 win
49152 <mss 4312,nop,wscale 1,nop,nop,timestamp 1592957 12128> (DF)
18:16:52.979738 A > B: . ack 1 win 17248 (DF)
18:16:52.982473 A > B: . 1:4301(4300) ack 1 win 17248 (DF)
18:16:52.982557 C > A: icmp: B unreachable -
need to frag (mtu 1500)! (DF)
18:16:52.985839 B > A: . ack 1 win 32768 (DF)
18:16:54.129928 A > B: . 1:1449(1448) ack 1 win 17248 (DF)
.
.
.
18:16:58.507078 A > B: . 1463941:1465389(1448) ack 1 win 17248 (DF)
18:16:58.507200 A > B: . 1465389:1466837(1448) ack 1 win 17248 (DF)
18:16:58.507326 A > B: . 1466837:1468285(1448) ack 1 win 17248 (DF)
18:16:58.507439 A > B: . 1468285:1469733(1448) ack 1 win 17248 (DF)
18:16:58.524763 B > A: . ack 1452357 win 32768 (DF)
18:16:58.524986 B > A: . ack 1461045 win 32768 (DF)
18:16:58.525138 A > B: . 1469733:1471181(1448) ack 1 win 17248 (DF)
18:16:58.525268 A > B: . 1471181:1472629(1448) ack 1 win 17248 (DF)
18:16:58.525393 A > B: . 1472629:1474077(1448) ack 1 win 17248 (DF)
18:16:58.525516 A > B: . 1474077:1475525(1448) ack 1 win 17248 (DF)
18:16:58.525642 A > B: . 1475525:1476973(1448) ack 1 win 17248 (DF)
18:16:58.525766 A > B: . 1476973:1478421(1448) ack 1 win 17248 (DF)
18:16:58.526063 A > B: . 1478421:1479869(1448) ack 1 win 17248 (DF)
18:16:58.526187 A > B: . 1479869:1481317(1448) ack 1 win 17248 (DF)
18:16:58.526310 A > B: . 1481317:1482765(1448) ack 1 win 17248 (DF)
18:16:58.526432 A > B: . 1482765:1484213(1448) ack 1 win 17248 (DF)
18:16:58.526561 A > B: . 1484213:1485661(1448) ack 1 win 17248 (DF)
18:16:58.526671 A > B: . 1485661:1487109(1448) ack 1 win 17248 (DF)
18:16:58.537944 B > A: . ack 1478421 win 32768 (DF)
18:16:58.538328 A > B: . 1487109:1488557(1448) ack 1 win 17248 (DF)
Note that the interval between ACKs is significantly larger than two
times the segment size; it works out to be almost exactly two times
the advertised MSS. This transfer was long enough that it could be
verified that the stretch ACK was not the result of lost ACK packets.
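The spacing can be checked directly from the ACK numbers in the trace above. The helper name below is hypothetical; the ACK values, the 1448-octet segment size, and the advertised MSS of 4312 are copied from the trace.

```python
def ack_spacing(acks, segment_size):
    """Return (octets, segments) covered by each successive ACK."""
    deltas = [later - earlier for earlier, later in zip(acks, acks[1:])]
    return [(d, d // segment_size) for d in deltas]


trace_acks = [1452357, 1461045, 1478421]   # ACK numbers sent by B
spacing = ack_spacing(trace_acks, segment_size=1448)
twice_advertised_mss = 2 * 4312            # 8624 octets
```

The first gap is 8688 octets (six segments), the smallest whole number of 1448-octet segments that reaches twice the advertised MSS; the second gap is 17376 octets (twelve segments).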
Trace file demonstrating correct behavior
Made using tcpdump recording at an intermediate host. The timestamp
options from all but the first two packets have been removed for
clarity.
18:13:32.287965 A > B: S 2972697496:2972697496(0)
win 16384 <mss 4312,nop,wscale 0,nop,nop,timestamp 11326 0> (DF)
18:13:32.290785 B > A: S 245639054:245639054(0)
ack 2972697497 win 34496 <mss 4312> (DF)
18:13:32.290941 A > B: . ack 1 win 17248 (DF)
18:13:32.293774 A > B: . 1:4313(4312) ack 1 win 17248 (DF)
18:13:32.293856 C > A: icmp: B unreachable -
need to frag (mtu 1500)! (DF)
18:13:33.637338 A > B: . 1:1461(1460) ack 1 win 17248 (DF)
.
.
.
18:13:35.561691 A > B: . 1514021:1515481(1460) ack 1 win 17248 (DF)
18:13:35.561814 A > B: . 1515481:1516941(1460) ack 1 win 17248 (DF)
18:13:35.561938 A > B: . 1516941:1518401(1460) ack 1 win 17248 (DF)
18:13:35.562059 A > B: . 1518401:1519861(1460) ack 1 win 17248 (DF)
18:13:35.562174 A > B: . 1519861:1521321(1460) ack 1 win 17248 (DF)
18:13:35.564008 B > A: . ack 1481901 win 64680 (DF)
18:13:35.564383 A > B: . 1521321:1522781(1460) ack 1 win 17248 (DF)
18:13:35.564499 A > B: . 1522781:1524241(1460) ack 1 win 17248 (DF)
18:13:35.615576 B > A: . ack 1484821 win 64680 (DF)
18:13:35.615646 B > A: . ack 1487741 win 64680 (DF)
18:13:35.615716 B > A: . ack 1490661 win 64680 (DF)
18:13:35.615784 B > A: . ack 1493581 win 64680 (DF)
18:13:35.615856 B > A: . ack 1496501 win 64680 (DF)
18:13:35.615952 A > B: . 1524241:1525701(1460) ack 1 win 17248 (DF)
18:13:35.615966 B > A: . ack 1499421 win 64680 (DF)
18:13:35.616088 A > B: . 1525701:1527161(1460) ack 1 win 17248 (DF)
18:13:35.616105 B > A: . ack 1502341 win 64680 (DF)
18:13:35.616211 A > B: . 1527161:1528621(1460) ack 1 win 17248 (DF)
18:13:35.616228 B > A: . ack 1505261 win 64680 (DF)
18:13:35.616327 A > B: . 1528621:1530081(1460) ack 1 win 17248 (DF)
18:13:35.616349 B > A: . ack 1508181 win 64680 (DF)
18:13:35.616448 A > B: . 1530081:1531541(1460) ack 1 win 17248 (DF)
18:13:35.616565 A > B: . 1531541:1533001(1460) ack 1 win 17248 (DF)
18:13:35.616891 A > B: . 1533001:1534461(1460) ack 1 win 17248 (DF)
In this trace, an ACK is generated for every two segments that
arrive. (The segment size is slightly larger in this trace, even
though the source hosts are the same, because of the lack of
timestamp options in this trace.)
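The two segment sizes follow from header arithmetic on the 1500-octet MTU reported by router C. The sketch below assumes 20-octet IPv4 and TCP headers and a timestamp option padded to 12 octets (two NOPs plus the 10-octet option), as shown in the SYN options of the first trace; the function name is hypothetical.

```python
def payload_per_segment(mtu, tcp_option_octets=0, ip_header=20, tcp_header=20):
    """Octets of TCP data that fit in one packet of the given MTU."""
    return mtu - ip_header - tcp_header - tcp_option_octets


with_timestamps = payload_per_segment(1500, tcp_option_octets=12)
without_timestamps = payload_per_segment(1500)
```

This gives 1448 octets per segment when timestamps are carried and 1460 octets when they are not, matching the two traces.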
How to detect
This condition can be observed in a packet trace when the advertised
MSS is significantly larger than the actual PMTU of a connection.
How to fix
Several solutions for this problem have been proposed:
A simple solution is to ACK every other packet, regardless of size.
This has the drawback of generating large numbers of ACKs in the face
of lots of very small packets; this shows up with applications like
the X Window System.
A slightly more complex solution would monitor the size of incoming
segments and try to determine what segment size the sender is using.
This requires slightly more state in the receiver, but has the
advantage of making receiver silly window syndrome avoidance
computations more accurate [RFC813].
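The second approach might look like the sketch below. It is not taken from any particular stack: the class and method names are hypothetical, the estimate is simply the largest segment seen so far (capped at the advertised MSS), and the delayed ACK timer is omitted. A fuller version would also age the estimate so that it can shrink if the PMTU drops later in the connection.

```python
class AckPolicy:
    """Decide when to ACK using an estimate of the sender's segment size."""

    def __init__(self, advertised_mss):
        self.advertised_mss = advertised_mss
        self.estimate = 0   # largest segment seen so far
        self.unacked = 0    # octets received since the last ACK

    def on_segment(self, length):
        """Record an arriving segment; return True if an ACK is due."""
        # Track the sender's apparent segment size, never exceeding
        # what this host advertised.
        self.estimate = min(max(self.estimate, length), self.advertised_mss)
        self.unacked += length
        if self.unacked >= 2 * self.estimate:
            self.unacked = 0
            return True
        return False


policy = AckPolicy(advertised_mss=4312)
decisions = [policy.on_segment(1448) for _ in range(6)]
```

With 1448-octet segments this receiver ACKs every second segment even though it advertised an MSS of 4312.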
2.3.
Name of Problem
Determining MSS from PMTU
Classification
Performance
Description
The MSS advertised at the start of a connection should be based on
the MTU of the interfaces on the system. (For efficiency and other
reasons this may not be the largest MSS possible.) Some systems
instead derive the MSS they advertise from PMTUD-determined values.
This results in an advertised MSS that is smaller than the largest
MTU the system can receive.
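The distinction can be sketched as two separate calculations: the MSS to advertise comes from the local interface MTU, while the PMTU limits only what this host sends. The function names are hypothetical, and the 40-octet figure assumes IPv4 and TCP headers without options.

```python
IP_TCP_HEADERS = 40  # IPv4 + TCP headers, no options assumed


def mss_to_advertise(interface_mtu):
    """MSS placed in the SYN: based on what this host can receive."""
    return interface_mtu - IP_TCP_HEADERS


def mss_to_send(peer_advertised_mss, path_mtu):
    """Segment size used when sending: limited by the discovered PMTU."""
    return min(peer_advertised_mss, path_mtu - IP_TCP_HEADERS)


advertised = mss_to_advertise(1500)
sending = mss_to_send(peer_advertised_mss=65240, path_mtu=1024)
```

A host on a 1500-octet interface would advertise 1460 regardless of any cached PMTU, while a peer MSS of 65240 over a path MTU of 1024 yields 984-octet segments on the sending side only.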
Significance
The advertised MSS is an indication to the remote system about the
largest TCP segment that can be received [RFC879]. If this value is
too small, the remote system will be forced to use a smaller segment
size when sending, purely because the local system found a particular
PMTU earlier.
Given the asymmetric nature of many routes on the Internet
[Paxson97], it seems entirely possible that the return PMTU is
different from the sending PMTU. Limiting the segment size in this
way can reduce performance and frustrate the PMTUD algorithm.
Even if the route were symmetric, setting this artificially lowered
limit on segment size will make it impossible to probe later to
determine if the PMTU has changed.
Implications
The purpose of PMTUD is to send segments as large as the path allows.
If long-running connections cannot probe for a larger PMTU, the
potential performance gains cannot be realized, which defeats that
purpose.
Relevant RFCs
RFC 1191. [RFC879] provides a complete discussion of
MSS calculations and appropriate values. Note that this practice
does not violate any of the specifications in these RFCs.
Trace file demonstrating it
This trace was made using tcpdump running on an intermediate host.
Host A initiates two separate consecutive connections, A1 and A2, to
host B. Router C is the location of the MTU bottleneck. As usual,
TCP options are removed from all non-SYN packets.
22:33:32.305912 A1 > B: S 1523306220:1523306220(0)
win 8760 <mss 1460> (DF)
22:33:32.306518 B > A1: S 729966260:729966260(0)
ack 1523306221 win 16384 <mss 65240>
22:33:32.310307 A1 > B: . ack 1 win 8760 (DF)
22:33:32.323496 A1 > B: P 1:1461(1460) ack 1 win 8760 (DF)
22:33:32.323569 C > A1: icmp: 129.99.238.5 unreachable -
need to frag (mtu 1024) (DF) (ttl 255, id 20666)
22:33:32.783694 A1 > B: . 1:985(984) ack 1 win 8856 (DF)
22:33:32.840817 B > A1: . ack 985 win 16384
22:33:32.845651 A1 > B: . 1461:2445(984) ack 1 win 8856 (DF)
22:33:32.846094 B > A1: . ack 985 win 16384
22:33:33.724392 A1 > B: . 985:1969(984) ack 1 win 8856 (DF)
22:33:33.724893 B > A1: . ack 2445 win 14924
22:33:33.728591 A1 > B: . 2445:2921(476) ack 1 win 8856 (DF)
22:33:33.729161 A1 > B: . ack 1 win 8856 (DF)
22:33:33.840758 B > A1: . ack 2921 win 16384
[...]
22:33:34.238659 A1 > B: F 7301:8193(892) ack 1 win 8856 (DF)
22:33:34.239036 B > A1: . ack 8194 win 15492
22:33:34.239303 B > A1: F 1:1(0) ack 8194 win 16384