Paxson, et. al.              Informational                     [Page 22]

RFC 2525              TCP Implementation Problems             March 1999


   How to detect
      If source code is available, that is generally the easiest way to
      detect this problem.  Search for each modification to the cwnd
      variable; (at least) one of these will be for congestion
      avoidance, and inspection of the related code should immediately
      identify the problem if present.

      The problem can also be detected by closely examining packet
      traces taken near the sender.  During congestion avoidance, cwnd
      will increase by an additional segment upon the receipt of
      (typically) eight acknowledgements without a loss.  This increase
      is in addition to the one segment increase per round trip time (or
      two round trip times if the receiver is using delayed ACKs).

      Furthermore, graphs of the sequence number vs. time, taken from
      packet traces, are normally linear during congestion avoidance.
      When viewing packet traces of transfers from senders exhibiting
      this problem, the graphs appear quadratic instead of linear.

      Finally, the traces will show that, with sufficiently large
      windows, nearly every loss event results in a timeout.

   How to fix
      This problem may be corrected by removing the "+ MSS/8" term from
      the congestion avoidance code that increases cwnd each time an ACK
      of new data is received.
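
      As an illustration of the fix, the following Python sketch
      contrasts the per-ACK congestion avoidance increase with and
      without the extra term.  It assumes the standard increase of
      MSS*MSS/cwnd bytes per ACK of new data, one ACK per segment (no
      delayed ACKs), and no loss.  The function names, the 512-byte MSS,
      and the starting window of 8 segments are hypothetical choices
      made for the example.

```python
def ca_increment(cwnd, mss, extra_constant=False):
    """Bytes added to cwnd for one ACK of new data in congestion avoidance."""
    increment = mss * mss // cwnd          # standard per-ACK increase
    if extra_constant:
        increment += mss // 8              # the problematic "+ MSS/8" term
    return max(increment, 1)


def cwnd_after_rtts(rtts, start_segments=8, mss=512, extra_constant=False):
    """Simulate cwnd (in bytes) after a number of loss-free round trips."""
    cwnd = start_segments * mss
    for _ in range(rtts):
        acks_this_rtt = cwnd // mss        # assumes one ACK per segment
        for _ in range(acks_this_rtt):
            cwnd += ca_increment(cwnd, mss, extra_constant)
    return cwnd


if __name__ == "__main__":
    # Print the window, in whole segments, for both variants.
    for rtts in (1, 5, 10):
        correct = cwnd_after_rtts(rtts)
        faulty = cwnd_after_rtts(rtts, extra_constant=True)
        print(rtts, correct // 512, faulty // 512)
```

      Under these assumptions the correct variant grows by at most one
      segment per round trip, while the faulty variant gains an extra
      cwnd/8 bytes per round trip, a surplus that itself grows with the
      window.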

2.7.

   Name of Problem
      Initial RTO too low

   Classification
      Performance

   Description
      When a TCP first begins transmitting data, it lacks the RTT
      measurements necessary to have computed an adaptive retransmission
      timeout (RTO).  RFC 1122, 4.2.3.1, states that a TCP SHOULD
      initialize RTO to 3 seconds.  A TCP that uses a lower value
      exhibits "Initial RTO too low".

   Significance
      In environments with large RTTs (where "large" means any value
      larger than the initial RTO), TCPs will experience very poor
      performance.

   Implications
      Whenever RTO < RTT, very poor performance can result as packets
      are unnecessarily retransmitted (because RTO will expire before an
      ACK for the packet can arrive) and the connection enters slow
      start and congestion avoidance.  Generally, the algorithms for
      computing RTO avoid this problem by adding a positive term to the
      estimated RTT.  However, when a connection first begins it must
      use some estimate for RTO, and if it picks a value less than RTT,
      the above problems will arise.

      Furthermore, when the initial RTO < RTT, it can take a long time
      for the TCP to correct the problem by adapting the RTT estimate,
      because the use of Karn's algorithm (mandated by RFC 1122,
      4.2.3.1) will discard many of the candidate RTT measurements made
      after the first timeout, since they will be measurements of
      retransmitted segments.
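
      The interaction between the initial RTO and Karn's algorithm can
      be sketched as follows.  This is a minimal Python illustration,
      not any particular stack's code: the class and method names are
      hypothetical, and the smoothing gains (1/8 and 1/4) and the
      variance multiplier of 4 are the commonly used Jacobson-style
      constants, which this document does not itself specify.

```python
INITIAL_RTO = 3.0  # seconds, the value RFC 1122 says a TCP SHOULD use


class RtoEstimator:
    """Adaptive RTO with Karn's rule for ambiguous samples (sketch)."""

    def __init__(self, initial_rto=INITIAL_RTO):
        self.srtt = None      # smoothed RTT, unknown until the first sample
        self.rttvar = None    # RTT variation estimate
        self.rto = initial_rto

    def on_timeout(self):
        # Exponential backoff: double RTO each time the timer expires.
        self.rto *= 2

    def on_ack(self, rtt_sample, was_retransmitted):
        # Karn's algorithm: a sample taken from a retransmitted segment is
        # ambiguous (which transmission was ACKed?), so it is discarded.
        if was_retransmitted:
            return
        if self.srtt is None:
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt_sample)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt_sample
        # The positive term added to the estimated RTT keeps RTO above it.
        self.rto = self.srtt + 4 * self.rttvar
```

      Because samples from retransmitted segments are skipped, an
      estimator that starts with RTO below the path RTT keeps timing out
      and obtains few usable samples, which is the slow correction
      described above.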

   Relevant RFCs
      RFC 1122 states that TCPs SHOULD initialize RTO to 3 seconds and
      MUST implement Karn's algorithm.

   Trace file demonstrating it
      The following trace file was taken using tcpdump at host A, the
      data sender.  The advertised window and SYN options have been
      omitted for clarity.

   07:52:39.870301 A > B: S 2786333696:2786333696(0)
   07:52:40.548170 B > A: S 130240000:130240000(0) ack 2786333697
   07:52:40.561287 A > B: P 1:513(512) ack 1
   07:52:40.753466 A > B: . 1:513(512) ack 1
   07:52:41.133687 A > B: . 1:513(512) ack 1
   07:52:41.458529 B > A: . ack 513
   07:52:41.458686 A > B: . 513:1025(512) ack 1
   07:52:41.458797 A > B: P 1025:1537(512) ack 1
   07:52:41.541633 B > A: . ack 513
   07:52:41.703732 A > B: . 513:1025(512) ack 1
   07:52:42.044875 B > A: . ack 513
   07:52:42.173728 A > B: . 513:1025(512) ack 1
   07:52:42.330861 B > A: . ack 1537
   07:52:42.331129 A > B: . 1537:2049(512) ack 1
   07:52:42.331262 A > B: P 2049:2561(512) ack 1
   07:52:42.623673 A > B: . 1537:2049(512) ack 1
   07:52:42.683203 B > A: . ack 1537
   07:52:43.044029 B > A: . ack 1537
   07:52:43.193812 A > B: . 1537:2049(512) ack 1

      Note that the SYN/SYN-ACK exchange shows an RTT of over 600 msec.
      However, from the elapsed time between the third and fourth lines
      (the first packet being sent and then retransmitted), it is
      apparent the RTO was initialized to under 200 msec.  The next line
      shows that this value has doubled to 400 msec (correct exponential
      backoff of RTO), but that still does not suffice to avoid an
      unnecessary retransmission.

      Finally, an ACK from B arrives for the first segment.  Later, two
      duplicate ACKs for 513 arrive, indicating that the original and
      both retransmissions arrived at B.  (Indeed, a concurrent trace at
      B showed that no packets were lost during the entire connection.)
      The first ACK opens the congestion window to two packets, which
      are sent back-to-back, but at 07:52:41.703732 RTO
      again expires after a little over 200 msec, leading to an
      unnecessary retransmission, and the pattern repeats.  By the end
      of the trace excerpt above, 1536 bytes have been successfully
      transmitted from A to B, over an interval of more than 2 seconds,
      reflecting terrible performance.
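
      The intervals quoted above can be recomputed directly from the
      trace timestamps.  The short Python sketch below does so; the
      helper name seconds_between is hypothetical, and the timestamps
      are copied from the trace taken at host A.

```python
from datetime import datetime


def seconds_between(earlier, later):
    """Elapsed seconds between two tcpdump-style HH:MM:SS.ffffff stamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()


# SYN sent by A, SYN-ACK received from B: an estimate of the path RTT.
handshake_rtt = seconds_between("07:52:39.870301", "07:52:40.548170")

# First data segment and its two retransmissions: the first two RTO values.
first_rto = seconds_between("07:52:40.561287", "07:52:40.753466")
second_rto = seconds_between("07:52:40.753466", "07:52:41.133687")

# First data segment to the last line of the excerpt.
elapsed = seconds_between("07:52:40.561287", "07:52:43.193812")
bytes_acked = 1537 - 1  # highest ACK seen minus the first data byte number

if __name__ == "__main__":
    print(f"handshake RTT  {handshake_rtt:.3f} s")
    print(f"first RTO      {first_rto:.3f} s")
    print(f"second RTO     {second_rto:.3f} s")
    print(f"goodput        {bytes_acked / elapsed:.0f} bytes/s")
```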

   Trace file demonstrating correct behavior
      The following trace file was taken using tcpdump at host C, the
      data sender.  The advertised window and SYN options have been
      omitted for clarity.

   17:30:32.090299 C > D: S 2031744000:2031744000(0)
   17:30:32.900325 D > C: S 262737964:262737964(0) ack 2031744001
   17:30:32.900326 C > D: . ack 1
   17:30:32.910326 C > D: . 1:513(512) ack 1
   17:30:34.150355 D > C: . ack 513
   17:30:34.150356 C > D: . 513:1025(512) ack 1
   17:30:34.150357 C > D: . 1025:1537(512) ack 1
   17:30:35.170384 D > C: . ack 1025
   17:30:35.170385 C > D: . 1537:2049(512) ack 1
   17:30:35.170386 C > D: . 2049:2561(512) ack 1
   17:30:35.320385 D > C: . ack 1537
   17:30:35.320386 C > D: . 2561:3073(512) ack 1
   17:30:35.320387 C > D: . 3073:3585(512) ack 1
   17:30:35.730384 D > C: . ack 2049

      The initial SYN/SYN-ACK exchange shows that RTT is more than 800
      msec, and for some subsequent packets it rises above 1 second, but
      C's retransmit timer does not ever expire.

   References
      This problem is documented in [Paxson97].

   How to detect
      This problem is readily detected by inspecting a packet trace of
      the startup of a TCP connection made over a long-delay path.  It
      can be diagnosed from either a sender-side or receiver-side trace.
      Long-delay paths can often be found by locating remote sites on
      other continents.
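
      Inspection of a sender-side trace can be partly automated.  The
      Python sketch below is one possible heuristic, not a definitive
      detector: it parses tcpdump lines in the format shown in the
      traces above, estimates the RTT from the SYN/SYN-ACK exchange, and
      reports data segments that were retransmitted sooner than that
      RTT.  The regular expression, function names, and trace variables
      are hypothetical; the trace lines are copied from the excerpts at
      hosts A and C.

```python
import re
from datetime import datetime

# Matches lines such as "07:52:40.561287 A > B: P 1:513(512) ack 1".
LINE = re.compile(
    r"(\d\d:\d\d:\d\d\.\d+) (\S+) > (\S+): (\S+)(?: (\d+):(\d+)\((\d+)\))?"
)


def parse_time(text):
    return datetime.strptime(text, "%H:%M:%S.%f")


def early_retransmissions(trace, sender):
    """Return (handshake_rtt, gaps) for retransmissions sent before one RTT."""
    handshake_rtt = None
    syn_sent = None
    last_sent = {}   # (start, end) sequence range -> time of last transmission
    early = []
    for raw in trace.strip().splitlines():
        match = LINE.match(raw.strip())
        if match is None:
            continue
        stamp, src, _dst, flags, start, end, length = match.groups()
        now = parse_time(stamp)
        if "S" in flags:
            if src == sender:
                syn_sent = now
            elif syn_sent is not None and handshake_rtt is None:
                handshake_rtt = (now - syn_sent).total_seconds()
            continue
        if src != sender or length is None or int(length) == 0:
            continue  # not a data segment from the sender
        key = (start, end)
        if key in last_sent and handshake_rtt is not None:
            gap = (now - last_sent[key]).total_seconds()
            if gap < handshake_rtt:
                early.append(gap)
        last_sent[key] = now
    return handshake_rtt, early


FAULTY_TRACE = """
07:52:39.870301 A > B: S 2786333696:2786333696(0)
07:52:40.548170 B > A: S 130240000:130240000(0) ack 2786333697
07:52:40.561287 A > B: P 1:513(512) ack 1
07:52:40.753466 A > B: . 1:513(512) ack 1
07:52:41.133687 A > B: . 1:513(512) ack 1
07:52:41.458529 B > A: . ack 513
"""

CORRECT_TRACE = """
17:30:32.090299 C > D: S 2031744000:2031744000(0)
17:30:32.900325 D > C: S 262737964:262737964(0) ack 2031744001
17:30:32.900326 C > D: . ack 1
17:30:32.910326 C > D: . 1:513(512) ack 1
17:30:34.150355 D > C: . ack 513
"""
```

      Calling early_retransmissions(FAULTY_TRACE, "A") flags both
      retransmissions of the first segment, while the excerpt from host
      C yields none.  Note that a fast retransmission could also arrive
      within one RTT, so a hit later in a connection needs manual
      confirmation; the heuristic is most meaningful at connection
      startup.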

   How to fix
      As this problem arises from a faulty initialization, fixing it
      should require only a small change to the TCP source code:
      initializing RTO to the 3 seconds recommended by RFC 1122.

2.8.

   Name of Problem
      Failure of window deflation after loss recovery

   Classification
      Congestion control / performance

   Description
      The fast recovery algorithm allows TCP senders to continue to
      transmit new segments during loss recovery.  First, fast
      retransmission is initiated after a TCP sender receives three
      duplicate ACKs.  At this point, a retransmission is sent and cwnd
      is halved.  The fast recovery algorithm then allows additional
      segments to be sent when sufficient additional duplicate ACKs
      arrive.  Some implementations of fast recovery compute when to
      send additional segments by artificially incrementing cwnd, first
      by three segments to account for the three duplicate ACKs that
      triggered fast retransmission, and subsequently by 1 MSS for each
      new duplicate ACK that arrives.  When cwnd allows, the sender
      transmits new data segments.

      When an ACK arrives that covers new data, cwnd is to be reduced by
      the amount by which it was artificially increased.  However, some
      TCP implementations fail to "deflate" the window, causing an
      inappropriate amount of data to be sent into the network after
      recovery.  One cause of this problem is the "header prediction"
      code, which is used to handle incoming segments that require
      little work.  In some implementations of TCP, the header
      prediction code does not check to make sure cwnd has not been
      artificially inflated, and therefore does not reduce the
      artificially increased cwnd when appropriate.
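
      The inflation and deflation steps described above can be sketched
      in Python as follows.  This is a minimal model that counts cwnd in
      whole segments; the class, its method names, and the
      deflate_on_new_ack switch are hypothetical and exist only to
      contrast correct behavior with the failure to deflate.

```python
class FastRecoverySender:
    """Toy model of fast retransmit/fast recovery window accounting."""

    def __init__(self, cwnd_segments, deflate_on_new_ack=True):
        self.cwnd = cwnd_segments
        self.ssthresh = None
        self.dup_acks = 0
        self.in_recovery = False
        self.deflate_on_new_ack = deflate_on_new_ack

    def on_duplicate_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3 and not self.in_recovery:
            # Fast retransmit: halve the window, then inflate it by three
            # segments to account for the three duplicate ACKs.
            self.ssthresh = max(self.cwnd // 2, 2)
            self.cwnd = self.ssthresh + 3
            self.in_recovery = True
        elif self.in_recovery:
            # Each further duplicate ACK inflates cwnd by one segment.
            self.cwnd += 1

    def on_new_ack(self):
        # An ACK covering new data ends recovery; the artificial inflation
        # must be removed.  Skipping this step models the problem.
        if self.in_recovery and self.deflate_on_new_ack:
            self.cwnd = self.ssthresh
        self.in_recovery = False
        self.dup_acks = 0

    def sendable(self, outstanding_segments):
        """Segments the sender may transmit right now."""
        return max(self.cwnd - outstanding_segments, 0)
```

      For example, starting from a 7-segment window, five duplicate ACKs
      followed by an ACK of new data leave cwnd at 3 segments when the
      window is deflated and at 8 segments when it is not.  With one
      segment still outstanding, that is the difference between sending
      2 segments and sending a burst of 7.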

   Significance
      TCP senders that exhibit this problem will transmit a burst of
      data immediately after recovery, which can degrade performance, as
      well as network stability.  Effectively, the sender does not
      reduce the size of cwnd as much as it should (to half its value
      when loss was detected), if at all.  This can harm the performance
      of the TCP connection itself, as well as competing TCP flows.

   Implications
      A TCP sender exhibiting this problem does not reduce cwnd
      appropriately in times of congestion, and therefore may contribute
      to congestive collapse.

   Relevant RFCs
      RFC 2001 outlines the fast retransmit/fast recovery algorithms.
      [Brakmo95] outlines this implementation problem and offers a fix.

   Trace file demonstrating it
      The following trace file was taken using tcpdump at host A, the
      data sender.  The advertised window (which never changed) has been
      omitted for clarity, except for the first packet sent by each
      host.

   08:22:56.825635 A.7505 > B.7505: . 29697:30209(512) ack 1 win 4608
   08:22:57.038794 B.7505 > A.7505: . ack 27649 win 4096
   08:22:57.039279 A.7505 > B.7505: . 30209:30721(512) ack 1
   08:22:57.321876 B.7505 > A.7505: . ack 28161
   08:22:57.322356 A.7505 > B.7505: . 30721:31233(512) ack 1
   08:22:57.347128 B.7505 > A.7505: . ack 28673
   08:22:57.347572 A.7505 > B.7505: . 31233:31745(512) ack 1
   08:22:57.347782 A.7505 > B.7505: . 31745:32257(512) ack 1
   08:22:57.936393 B.7505 > A.7505: . ack 29185
   08:22:57.936864 A.7505 > B.7505: . 32257:32769(512) ack 1
   08:22:57.950802 B.7505 > A.7505: . ack 29697 win 4096
   08:22:57.951246 A.7505 > B.7505: . 32769:33281(512) ack 1
   08:22:58.169422 B.7505 > A.7505: . ack 29697
   08:22:58.638222 B.7505 > A.7505: . ack 29697
   08:22:58.643312 B.7505 > A.7505: . ack 29697
   08:22:58.643669 A.7505 > B.7505: . 29697:30209(512) ack 1
   08:22:58.936436 B.7505 > A.7505: . ack 29697
   08:22:59.002614 B.7505 > A.7505: . ack 29697
   08:22:59.003026 A.7505 > B.7505: . 33281:33793(512) ack 1
   08:22:59.682902 B.7505 > A.7505: . ack 33281
   08:22:59.683391 A.7505 > B.7505: P 33793:34305(512) ack 1
   08:22:59.683748 A.7505 > B.7505: P 34305:34817(512) ack 1 ***
   08:22:59.684043 A.7505 > B.7505: P 34817:35329(512) ack 1
   08:22:59.684266 A.7505 > B.7505: P 35329:35841(512) ack 1
   08:22:59.684567 A.7505 > B.7505: P 35841:36353(512) ack 1
   08:22:59.684810 A.7505 > B.7505: P 36353:36865(512) ack 1
   08:22:59.685094 A.7505 > B.7505: P 36865:37377(512) ack 1

      The first 12 lines of the trace show incoming ACKs clocking out a
      window of data segments.  At this point in the transfer, cwnd is 7
      segments.  The next 4 lines of the trace show 3 duplicate ACKs
      arriving from the receiver, followed by a retransmission from the
      sender.  At this point, cwnd is halved (to 3 segments) and
      artificially incremented by the three duplicate ACKs that have
      arrived, making cwnd 6 segments.  The next two lines show 2 more
      duplicate ACKs arriving, each of which increases cwnd by 1
      segment.  So, after these two duplicate ACKs arrive the cwnd is 8
      segments and the sender has permission to send 1 new segment
      (since there are 7 segments outstanding).  The next line in the
      trace shows this new segment being transmitted.  The next packet
      shown in the trace is an ACK from host B that covers the first 7
      outstanding segments (all but the new segment sent during
      recovery).  This should cause cwnd to be reduced to 3 segments and
      2 segments to be transmitted (since there is already 1 outstanding
      segment in the network).  However, as shown