RFC 2923         TCP Problems with Path MTU Discovery     September 2000

      Note that, under IPv6, there is no DF bit -- it is implicitly on
      at all times. Fragmentation is not allowed in routers, only at
      the originating host. Fortunately, the minimum supported MTU for
      IPv6 is 1280 octets, which is significantly larger than the 68
      octet minimum in IPv4. This should make it more reasonable for
      IPv6 TCP implementations to fall back to 1280 octet packets, when
      IPv4 implementations will probably have to turn off DF to respond
      to black hole detection.

      Ideally, the ICMP black holes should be fixed when they are
      found. If hosts start to implement black hole detection, it may
      be that these problems will go unnoticed and unfixed. This is
      especially unfortunate, since detection can take several seconds
      each time, and these delays could result in a significant, hidden
      degradation of performance. Hosts that implement black hole
      detection should probably log detected black holes, so that they
      can be fixed.

2.2.

   Name of Problem
      Stretch ACK due to PMTUD

   Classification
      Congestion Control / Performance

   Description
      When a naively implemented TCP stack communicates with a PMTUD
      equipped stack, it will try to generate an ACK for every second
      full-sized segment. If it determines the full-sized segment based
      on the advertised MSS, this can degrade badly in the face of
      PMTUD. The PMTU can wind up being a small fraction of the
      advertised MSS; in this case, an ACK would be generated only very
      infrequently.

   Significance
      Stretch ACKs have a variety of unfortunate effects, more fully
      outlined in [RFC2525]. Most of these have to do with encouraging
      a more bursty connection, due to the infrequent arrival of ACKs.
      They can also impede congestion window growth.

   Implications
      The complete implications of stretch ACKs are outlined in
      [RFC2525].

Lahey                        Informational                      [Page 6]

   Relevant RFCs
      RFC 1122 outlines the requirements for frequency of ACK
      generation. [RFC2581] expands on this and clarifies that delayed
      ACK is a SHOULD, not a MUST.

   Trace file demonstrating it
      Made using tcpdump recording at an intermediate host. The
      timestamp options from all but the first two packets have been
      removed for clarity.

      18:16:52.976657 A > B: S 3183102292:3183102292(0) win 16384
              <mss 4312,nop,wscale 0,nop,nop,timestamp 12128 0> (DF)
      18:16:52.979580 B > A: S 2022212745:2022212745(0) ack 3183102293
              win 49152 <mss 4312,nop,wscale 1,nop,nop,timestamp
              1592957 12128> (DF)
      18:16:52.979738 A > B: . ack 1 win 17248 (DF)
      18:16:52.982473 A > B: . 1:4301(4300) ack 1 win 17248 (DF)
      18:16:52.982557 C > A: icmp: B unreachable - need to frag
              (mtu 1500)! (DF)
      18:16:52.985839 B > A: . ack 1 win 32768 (DF)
      18:16:54.129928 A > B: . 1:1449(1448) ack 1 win 17248 (DF)
              .
              .
              .
      18:16:58.507078 A > B: . 1463941:1465389(1448) ack 1 win 17248 (DF)
      18:16:58.507200 A > B: . 1465389:1466837(1448) ack 1 win 17248 (DF)
      18:16:58.507326 A > B: . 1466837:1468285(1448) ack 1 win 17248 (DF)
      18:16:58.507439 A > B: . 1468285:1469733(1448) ack 1 win 17248 (DF)
      18:16:58.524763 B > A: . ack 1452357 win 32768 (DF)
      18:16:58.524986 B > A: . ack 1461045 win 32768 (DF)
      18:16:58.525138 A > B: . 1469733:1471181(1448) ack 1 win 17248 (DF)
      18:16:58.525268 A > B: . 1471181:1472629(1448) ack 1 win 17248 (DF)
      18:16:58.525393 A > B: . 1472629:1474077(1448) ack 1 win 17248 (DF)
      18:16:58.525516 A > B: . 1474077:1475525(1448) ack 1 win 17248 (DF)
      18:16:58.525642 A > B: . 1475525:1476973(1448) ack 1 win 17248 (DF)
      18:16:58.525766 A > B: . 1476973:1478421(1448) ack 1 win 17248 (DF)
      18:16:58.526063 A > B: . 1478421:1479869(1448) ack 1 win 17248 (DF)
      18:16:58.526187 A > B: . 1479869:1481317(1448) ack 1 win 17248 (DF)
      18:16:58.526310 A > B: . 1481317:1482765(1448) ack 1 win 17248 (DF)
      18:16:58.526432 A > B: . 1482765:1484213(1448) ack 1 win 17248 (DF)
      18:16:58.526561 A > B: . 1484213:1485661(1448) ack 1 win 17248 (DF)
      18:16:58.526671 A > B: . 1485661:1487109(1448) ack 1 win 17248 (DF)
      18:16:58.537944 B > A: . ack 1478421 win 32768 (DF)
      18:16:58.538328 A > B: . 1487109:1488557(1448) ack 1 win 17248 (DF)

      Note that the interval between ACKs is significantly larger than
      two times the segment size; it works out to be almost exactly two
      times the advertised MSS. This transfer was long enough that it
      could be verified that the stretch ACK was not the result of lost
      ACK packets.

   Trace file demonstrating correct behavior
      Made using tcpdump recording at an intermediate host. The
      timestamp options from all but the first two packets have been
      removed for clarity.

      18:13:32.287965 A > B: S 2972697496:2972697496(0) win 16384
              <mss 4312,nop,wscale 0,nop,nop,timestamp 11326 0> (DF)
      18:13:32.290785 B > A: S 245639054:245639054(0) ack 2972697497
              win 34496 <mss 4312> (DF)
      18:13:32.290941 A > B: . ack 1 win 17248 (DF)
      18:13:32.293774 A > B: . 1:4313(4312) ack 1 win 17248 (DF)
      18:13:32.293856 C > A: icmp: B unreachable - need to frag
              (mtu 1500)! (DF)
      18:13:33.637338 A > B: . 1:1461(1460) ack 1 win 17248 (DF)
              .
              .
              .
      18:13:35.561691 A > B: . 1514021:1515481(1460) ack 1 win 17248 (DF)
      18:13:35.561814 A > B: . 1515481:1516941(1460) ack 1 win 17248 (DF)
      18:13:35.561938 A > B: . 1516941:1518401(1460) ack 1 win 17248 (DF)
      18:13:35.562059 A > B: . 1518401:1519861(1460) ack 1 win 17248 (DF)
      18:13:35.562174 A > B: . 1519861:1521321(1460) ack 1 win 17248 (DF)
      18:13:35.564008 B > A: . ack 1481901 win 64680 (DF)
      18:13:35.564383 A > B: . 1521321:1522781(1460) ack 1 win 17248 (DF)
      18:13:35.564499 A > B: . 1522781:1524241(1460) ack 1 win 17248 (DF)
      18:13:35.615576 B > A: . ack 1484821 win 64680 (DF)
      18:13:35.615646 B > A: . ack 1487741 win 64680 (DF)
      18:13:35.615716 B > A: . ack 1490661 win 64680 (DF)
      18:13:35.615784 B > A: . ack 1493581 win 64680 (DF)
      18:13:35.615856 B > A: . ack 1496501 win 64680 (DF)
      18:13:35.615952 A > B: . 1524241:1525701(1460) ack 1 win 17248 (DF)
      18:13:35.615966 B > A: . ack 1499421 win 64680 (DF)
      18:13:35.616088 A > B: . 1525701:1527161(1460) ack 1 win 17248 (DF)
      18:13:35.616105 B > A: . ack 1502341 win 64680 (DF)
      18:13:35.616211 A > B: . 1527161:1528621(1460) ack 1 win 17248 (DF)
      18:13:35.616228 B > A: . ack 1505261 win 64680 (DF)
      18:13:35.616327 A > B: . 1528621:1530081(1460) ack 1 win 17248 (DF)
      18:13:35.616349 B > A: . ack 1508181 win 64680 (DF)
      18:13:35.616448 A > B: . 1530081:1531541(1460) ack 1 win 17248 (DF)
      18:13:35.616565 A > B: . 1531541:1533001(1460) ack 1 win 17248 (DF)
      18:13:35.616891 A > B: . 1533001:1534461(1460) ack 1 win 17248 (DF)

      In this trace, an ACK is generated for every two segments that
      arrive. (The segment size is slightly larger in this trace, even
      though the source hosts are the same, because of the lack of
      timestamp options in this trace.)

   How to detect
      This condition can be observed in a packet trace when the
      advertised MSS is significantly larger than the actual PMTU of a
      connection.

   How to fix
      Several solutions for this problem have been proposed:

      A simple solution is to ACK every other packet, regardless of
      size. This has the drawback of generating large numbers of ACKs
      in the face of lots of very small packets; this shows up with
      applications like the X Window System.

      A slightly more complex solution would monitor the size of
      incoming segments and try to determine what segment size the
      sender is using. This requires slightly more state in the
      receiver, but has the advantage of making receiver silly window
      syndrome avoidance computations more accurate [RFC813].

2.3.

   Name of Problem
      Determining MSS from PMTU

   Classification
      Performance

   Description
      The MSS advertised at the start of a connection should be based
      on the MTU of the interfaces on the system. (For efficiency and
      other reasons this may not be the largest MSS possible.) Some
      systems use PMTUD determined values to determine the MSS to
      advertise. This results in an advertised MSS that is smaller than
      the largest MTU the system can receive.

   Significance
      The advertised MSS is an indication to the remote system about
      the largest TCP segment that can be received [RFC879]. If this
      value is too small, the remote system will be forced to use a
      smaller segment size when sending, purely because the local
      system found a particular PMTU earlier.

      Given the asymmetric nature of many routes on the Internet
      [Paxson97], it seems entirely possible that the return PMTU is
      different from the sending PMTU. Limiting the segment size in
      this way can reduce performance and frustrate the PMTUD
      algorithm.

      Even if the route was symmetric, setting this artificially
      lowered limit on segment size will make it impossible to probe
      later to determine if the PMTU has changed.

   Implications
      The whole point of PMTUD is to send as large a segment as
      possible. If long-running connections cannot successfully probe
      for larger PMTU, then potential performance gains will be
      impossible to realize. This destroys the whole point of PMTUD.

   Relevant RFCs
      RFC 1191. [RFC879] provides a complete discussion of MSS
      calculations and appropriate values. Note that this practice does
      not violate any of the specifications in these RFCs.

   Trace file demonstrating it
      This trace was made using tcpdump running on an intermediate
      host. Host A initiates two separate consecutive connections, A1
      and A2, to host B. Router C is the location of the MTU
      bottleneck. As usual, TCP options are removed from all non-SYN
      packets.

      22:33:32.305912 A1 > B: S 1523306220:1523306220(0) win 8760
              <mss 1460> (DF)
      22:33:32.306518 B > A1: S 729966260:729966260(0) ack 1523306221
              win 16384 <mss 65240>
      22:33:32.310307 A1 > B: . ack 1 win 8760 (DF)
      22:33:32.323496 A1 > B: P 1:1461(1460) ack 1 win 8760 (DF)
      22:33:32.323569 C > A1: icmp: 129.99.238.5 unreachable - need to
              frag (mtu 1024) (DF) (ttl 255, id 20666)
      22:33:32.783694 A1 > B: . 1:985(984) ack 1 win 8856 (DF)
      22:33:32.840817 B > A1: . ack 985 win 16384
      22:33:32.845651 A1 > B: . 1461:2445(984) ack 1 win 8856 (DF)
      22:33:32.846094 B > A1: . ack 985 win 16384
      22:33:33.724392 A1 > B: . 985:1969(984) ack 1 win 8856 (DF)
      22:33:33.724893 B > A1: . ack 2445 win 14924
      22:33:33.728591 A1 > B: . 2445:2921(476) ack 1 win 8856 (DF)
      22:33:33.729161 A1 > B: . ack 1 win 8856 (DF)
      22:33:33.840758 B > A1: . ack 2921 win 16384
      [...]
      22:33:34.238659 A1 > B: F 7301:8193(892) ack 1 win 8856 (DF)
      22:33:34.239036 B > A1: . ack 8194 win 15492
      22:33:34.239303 B > A1: F 1:1(0) ack 8194 win 16384
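
The segment sizes and ACK spacing seen in the traces for 2.2 and 2.3 can be reproduced with a little arithmetic. The following is a minimal sketch, not taken from the RFC. It assumes IPv4 and TCP headers of 20 octets each with no IP options, and a TCP timestamp option that occupies 12 octets with padding. The function names are hypothetical, and the receiver model (one ACK per two "full-sized" segments' worth of data) is a simplification of a delayed-ACK implementation.

```python
import math

IPV4_HEADER = 20  # octets; assumes no IP options
TCP_HEADER = 20   # octets; TCP options are accounted for separately


def advertised_mss(interface_mtu):
    """MSS to advertise, derived from the interface MTU (not from a cached PMTU)."""
    return interface_mtu - IPV4_HEADER - TCP_HEADER


def send_segment_size(peer_mss, path_mtu, option_octets=0):
    """Payload per segment: the smaller of the peer's MSS and the PMTU limit,
    reduced by any TCP options carried in each segment."""
    pmtu_limit = path_mtu - IPV4_HEADER - TCP_HEADER
    return min(peer_mss, pmtu_limit) - option_octets


def segments_per_ack(full_size, segment_size):
    """Segments that arrive before the receiver has seen two 'full-sized'
    segments' worth of data, for a given notion of full size."""
    return math.ceil(2 * full_size / segment_size)


# 2.2: both hosts advertise MSS 4312, the path MTU is 1500, and data
# segments carry a 12-octet timestamp option.
seg = send_segment_size(4312, 1500, option_octets=12)
naive = segments_per_ack(4312, seg)   # "full size" taken from the advertised MSS
tracking = segments_per_ack(seg, seg) # "full size" taken from observed segments

# 2.3: connection A1 sees a peer MSS of 65240 and a path MTU of 1024.
a1_seg = send_segment_size(65240, 1024)
```

Under these assumptions `seg` is 1448 and `naive` is 6, which agrees with the first trace: the ACK numbers 1452357 and 1461045 are 8688 octets apart, or six 1448-octet segments. `tracking` is 2, matching the trace of correct behavior, and `a1_seg` is 984, the segment size A1 falls back to after the ICMP message from router C.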