TCPs should immediately send an acknowledgement when data is received
out-of-order [RFC2581], providing the next expected sequence number
with no delay, so that the sender can retransmit the required data as
quickly as possible and the receiver can resume delivery of data to
the receiving application. When an acknowledgement carries the same
expected sequence number as an acknowledgement that has already been
sent for the last in-order segment received, these acknowledgements
are called "duplicate ACKs".
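
The receiver behaviour described above can be sketched as follows.
This is a minimal illustration, not an implementation of [RFC2581]:
it assumes segments are numbered in whole-segment units (real TCP
counts bytes), that every arrival is acknowledged immediately, and
the function names are hypothetical.

```python
def acks_for_arrivals(arrivals, first_seq=0):
    """Return the cumulative ACK sent after each arriving segment.

    Assumption: segments are numbered 0, 1, 2, ... and each ACK carries
    the next expected segment number.
    """
    received = set()
    next_expected = first_seq
    acks = []
    for seq in arrivals:
        received.add(seq)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def count_duplicate_acks(acks):
    """Count ACKs that repeat the previous ACK's expected number."""
    return sum(1 for prev, cur in zip(acks, acks[1:]) if cur == prev)
```

For example, if segment 2 is overtaken by segments 3, 4 and 5,
acks_for_arrivals([0, 1, 3, 4, 5, 2]) yields three duplicate ACKs
for segment 2 before the cumulative ACK jumps forward.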
Because IP networks are allowed to reorder packets, the receiver may
send duplicate acknowledgments for segments that arrive out of order
due to routing changes, link-level retransmission, etc. When a TCP
sender receives three duplicate ACKs, fast retransmit [RFC2581]
allows it to infer that a segment was lost. The sender retransmits
what it considers to be this lost segment without waiting for the
full retransmission timeout, thus saving time.
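
A sender-side sketch of the three-duplicate-ACK rule is given below.
It is illustrative only: the ACK stream is assumed to be a list of
next-expected segment numbers, and fast_retransmit_points is a
hypothetical helper rather than part of any TCP stack.

```python
DUPACK_THRESHOLD = 3  # three duplicate ACKs, as described in the text


def fast_retransmit_points(acks):
    """Return the segment numbers a sender would fast-retransmit.

    A retransmission is triggered once, at the moment the duplicate
    count for the same ACK value reaches the threshold.
    """
    retransmitted = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUPACK_THRESHOLD:
                retransmitted.append(ack)
        else:
            # New cumulative ACK: restart the duplicate count.
            last_ack = ack
            dup_count = 0
    return retransmitted
```

Mild reordering that produces only one or two duplicate ACKs does
not reach the threshold, so no retransmission is triggered.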
After a fast retransmit, a sender halves its congestion window and
invokes the fast recovery [RFC2581] algorithm, whereby it invokes
congestion avoidance from a halved congestion window, but does not
invoke slow start from a one-segment congestion window as it would do
after a retransmission timeout. Because the sender is still receiving
duplicate ACKs, it knows the receiver is still receiving the packets
it sends, so the full reduction applied after a timeout (when no
communication has been received) is not called for. This relatively
safe optimization also saves time.
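
The two reactions can be contrasted in a small sketch. Window sizes
are in whole segments; the floor of two segments is an assumption
that mirrors the lower bound [RFC2581] places on the slow start
threshold, and the temporary window inflation performed during fast
recovery is deliberately left out.

```python
def cwnd_after_fast_retransmit(cwnd):
    """Halve the window; congestion avoidance continues from there."""
    return max(cwnd // 2, 2)  # assumed floor of two segments


def cwnd_after_timeout(cwnd):
    """Fall back to a one-segment window; slow start begins again."""
    return 1
```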
It is important to be realistic about the maximum throughput that TCP
can have over a connection that traverses a high error-rate link. In
general, TCP will increase its congestion window beyond the delay-
bandwidth product. TCP's congestion avoidance strategy is additive-
increase, multiplicative-decrease, which means that if additional
errors are encountered before the congestion window recovers
completely from a 50-percent reduction, the effect can be a "downward
Dawkins, et al. Best Current Practice [Page 6]
RFC 3155 PILC - Links with Errors August 2001
spiral" of the congestion window due to additional 50-percent
reductions. Even using Fast Retransmit/Fast Recovery, the sender
will halve the congestion window each time a window contains one or
more segments that are lost, and will re-open the window by one
additional segment for each congestion window's worth of
acknowledgements received.
If a connection's path traverses a link that loses one or more
segments during this recovery period, the one-half reduction takes
place again, this time on a reduced congestion window - and this
downward spiral will continue to hold the congestion window below
path capacity until the connection is able to recover completely by
additive increase without experiencing loss.
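
The downward spiral can be illustrated with a toy model of additive
increase, multiplicative decrease. The model is a sketch under
simplifying assumptions: one window update per round trip, windows
in whole segments, and losses supplied as a hypothetical set of
round-trip indices rather than drawn from a real link.

```python
def simulate_aimd(initial_cwnd, rounds, loss_rounds):
    """Return the congestion window after each round trip.

    A round trip with loss halves the window (never below one
    segment); a loss-free round trip grows it by one segment.
    """
    cwnd = initial_cwnd
    history = []
    for rtt in range(rounds):
        if rtt in loss_rounds:
            cwnd = max(cwnd // 2, 1)
        else:
            cwnd += 1
        history.append(cwnd)
    return history
```

With a 32-segment window and a loss every other round trip,
simulate_aimd(32, 6, {0, 2, 4}) gives [16, 17, 8, 9, 4, 5]: each
halving lands on an already reduced window. After a single loss,
the same model needs 16 loss-free round trips to climb from 16
segments back to 32.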
Of course, no downward spiral occurs if the error rate is constantly
high and the congestion window always remains small; the
multiplicative-increase "slow start" will be exited early, and the
congestion window remains low for the duration of the TCP connection.
In links with high error rates, the TCP window may remain rather
small for long periods of time.
Not all causes of small windows are related to errors. For example,
HTTP/1.0 commonly closes TCP connections to indicate boundaries
between requested resources. This means that these applications are
constantly closing "trained" TCP connections and opening "untrained"
TCP connections which will execute slow start, beginning with one or
two segments. This can happen even with HTTP/1.1, if webmasters
configure their HTTP/1.1 servers to close connections instead of
waiting to see if the connection will be useful again.
A small window - especially a window of less than four segments -
effectively prevents the sender from taking advantage of Fast
Retransmits. Moreover, efficient recovery from multiple losses
within a single window requires adoption of new proposals (NewReno
[RFC2582]).
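
The four-segment figure follows from simple counting, sketched
below. The sketch assumes that no ACKs are lost, that the sender
transmits no new data while waiting, and that exactly one segment
in the window is lost; the function name and the lost_index
parameter are hypothetical.

```python
def can_trigger_fast_retransmit(window_segments, lost_index=0,
                                threshold=3):
    """True if enough segments follow the lost one to elicit
    `threshold` duplicate ACKs.

    Only segments sent after the lost segment generate duplicate
    ACKs, so at most window_segments - lost_index - 1 can arrive.
    """
    dupacks = window_segments - lost_index - 1
    return dupacks >= threshold
```

A three-segment window can never produce three duplicate ACKs, and
even a larger window fails when the loss falls near its end.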
Recommendation: Implement Fast Retransmit and Fast Recovery at this
time. This is a widely-implemented optimization and is currently at
Proposed Standard level. [RFC2488] recommends implementation of Fast
Retransmit/Fast Recovery in satellite environments.
2.3 Selective Acknowledgements [RFC2018, RFC2883]
Selective Acknowledgements [RFC2018] allow the repair of multiple
segment losses per window without requiring one (or more) round-trips
per loss.
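
The idea can be sketched in a few lines. This is an illustration,
not the [RFC2018] option format: segments are numbered in whole
units instead of bytes, blocks are listed in sequence order rather
than most-recent-first, and the limit on how many blocks fit in one
option is ignored. Both function names are hypothetical.

```python
def sack_blocks(received, next_expected):
    """Group out-of-order segments into contiguous [start, end) blocks."""
    blocks = []
    for seq in sorted(s for s in set(received) if s > next_expected):
        if blocks and blocks[-1][1] == seq:
            blocks[-1][1] = seq + 1  # extend the current block
        else:
            blocks.append([seq, seq + 1])  # start a new block
    return [tuple(b) for b in blocks]


def missing_segments(next_expected, blocks):
    """Segments the sender can infer are missing, in one round trip."""
    holes = []
    cursor = next_expected
    for start, end in blocks:
        holes.extend(range(cursor, start))
        cursor = end
    return holes
```

With segments 2, 5 and 7 lost, a single acknowledgement carrying
the blocks (3, 5), (6, 7) and (8, 10) lets the sender identify all
three holes at once, instead of discovering them one per round
trip.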
[RFC2883] proposes a minor extension to SACK that allows receiving
TCPs to provide more information about the order of delivery of
segments, allowing "more robust operation in an environment of
reordered packets, ACK loss, packet replication, and/or early
retransmit timeouts". Unless explicitly stated otherwise, in this
document, "Selective Acknowledgements" (or "SACK") refers to the
combination of [RFC2018] and [RFC2883].
Selective acknowledgments are most useful in LFNs ("Long Fat
Networks") because of the long round trip times that may be
encountered in these environments, according to Section 1.1 of
[RFC1323], and are especially useful if large windows are required,
because there is a higher probability of multiple segment losses per
window.
On the other hand, if error rates are generally low but occasionally
higher due to channel conditions, TCP will have the opportunity to
increase its window to larger values during periods of improved
channel conditions between bursts of errors. When bursts of errors
occur, multiple losses within a window are likely to occur. In this
case, SACK would provide benefits in speeding the recovery and
preventing unnecessary reduction of the window size.
Recommendation: Implement SACK as specified in [RFC2018] and updated
by [RFC2883], both Proposed Standards. In cases where SACK cannot be
enabled for both sides of a connection, TCP senders may use NewReno
[RFC2582] to better handle partial ACKs and multiple losses within a
single window.
3.0 Summary of Recommendations
The Internet does not provide a widely-available loss feedback
mechanism that allows TCP to distinguish between congestion loss and
transmission error. Because congestion affects all traffic on a path
while transmission loss affects only the specific traffic
encountering uncorrected errors, avoiding congestion has to take
precedence over quickly repairing transmission errors. This means
that the best that can be achieved without new feedback mechanisms is
minimizing the amount of time that is spent unnecessarily in
congestion avoidance.
The Fast Retransmit/Fast Recovery mechanism allows quick repair of
loss without giving up the safety of congestion avoidance. In order
for Fast Retransmit/Fast Recovery to work, the window size must be
large enough to force the receiver to send three duplicate
acknowledgments before the retransmission timeout interval expires;
if the timeout expires first, the sender is forced into full TCP
slow-start.
Selective Acknowledgements (SACK) extend the benefit of Fast
Retransmit/Fast Recovery to situations where multiple segment losses
in the window need to be repaired more quickly than can be
accomplished by executing Fast Retransmit for each segment loss, only
to discover the next segment loss.
These mechanisms are not limited to wireless environments. They are
usable in all environments.
4.0 Topics For Further Work
"Limited Transmit" [RFC3042] has been specified as an optimization
extending Fast Retransmit/Fast Recovery for TCP connections with
small congestion windows that will not trigger three duplicate
acknowledgments. This specification is deemed safe, and it also
provides benefits for TCP connections that experience a large amount
of packet (data or ACK) loss. Implementors should evaluate this
standards track specification for TCP in loss environments.
Delayed Duplicate Acknowledgements [MV97, VMPM99] attempt to prevent
TCP-level retransmission while link-level retransmission is still in
progress, because the redundant TCP-level retransmission would add
traffic to the network. This proposal is
worthy of additional study, but is not recommended at this time,
because we don't know how to calculate appropriate amounts of delay
for an arbitrary network topology.
It is not possible to use explicit congestion notification [RFC2481]
as a surrogate for explicit transmission error notification (no
matter how much we wish it were!). Some mechanism to provide explicit
notification of transmission error would be very helpful. This might
be more easily provided in a PEP environment, especially when the PEP
is the "first hop" in a connection path, because current checksum
mechanisms do not distinguish between transmission error to a payload
and transmission error to the header. Furthermore, if the header is
damaged, sending explicit transmission error notification to the
right endpoint is problematic.
Losses that take place on the ACK stream, especially while a TCP is
learning network characteristics, can make the data stream quite
bursty (resulting in losses on the data stream, as well). Several
ways of limiting this burstiness have been proposed, including TCP
transmit pacing at the sender and ACK rate control within the
network.
"Appropriate Byte Counting" (ABC) [ALL99], has been proposed as a way
of opening the congestion window based on the number of bytes that
have been successfully transferred to the receiver, giving more
appropriate behavior for application protocols that initiate
connections with relatively short packets. For SMTP [RFC2821], for
instance, the client might send a short HELO packet, a short MAIL
packet, one or more short RCPT packets, and a short DATA packet -
followed by the entire mail body sent as maximum-length packets. An
ABC TCP sender would not use ACKs for each of these short packets to
increase the congestion window to allow additional full-length
packets. ABC is worthy of additional study, but is not recommended
at this time, because ABC can lead to increased burstiness when
acknowledgments are lost.
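
The difference between counting ACKs and counting bytes can be
sketched as follows. The segment size of 1460 bytes, the per-ACK
cap of one full segment, and the packet sizes in the example are
all assumptions chosen for illustration; they are not taken from
[ALL99].

```python
MSS = 1460  # assumed maximum segment size, in bytes


def grow_per_ack(cwnd_bytes, acked_sizes):
    """Conventional growth: one full segment per ACK received."""
    for _ in acked_sizes:
        cwnd_bytes += MSS
    return cwnd_bytes


def grow_per_byte(cwnd_bytes, acked_sizes):
    """Byte counting: grow by the bytes acknowledged (assumed cap
    of one full segment per ACK)."""
    for size in acked_sizes:
        cwnd_bytes += min(size, MSS)
    return cwnd_bytes
```

Starting from a two-segment window, four ACKs for short command
packets of 20, 30, 30 and 10 bytes open the window to 8760 bytes
under ACK counting, but only to 3010 bytes under byte counting.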
4.1 Achieving, and maintaining, large windows
The recommendations described in this document will aid TCPs in
injecting packets into ERRORed connections as fast as possible
without destabilizing the Internet, and so optimizing the use of
available bandwidth.
In addition to these TCP-level recommendations, there is still
additional work to do at the application level, especially with the
dominant application protocol on the World Wide Web, HTTP.
HTTP/1.0 (and earlier versions) closes TCP connections to signal a
receiver that all of a requested resource has been transmitted.
Because WWW objects tend to be small in size [MOGUL], TCPs carrying
HTTP/1.0 traffic experience difficulty in "training" on available
path capacity (a substantial portion of the transfer has already
happened by the time TCP exits slow start).
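
The cost of slow start for short transfers can be estimated with a
rough sketch. It assumes an initial window of two segments, a
window that doubles every round trip, and no loss or delayed
acknowledgements; the function name is hypothetical.

```python
def slow_start_round_trips(total_segments, initial_window=2):
    """Round trips needed to send total_segments during slow start."""
    sent = 0
    window = initial_window
    rounds = 0
    while sent < total_segments:
        sent += window   # send one window's worth this round trip
        window *= 2      # window doubles after a loss-free round trip
        rounds += 1
    return rounds
```

Under these assumptions a 10-segment object completes in three
round trips, during which the window never exceeds eight segments,
so the connection closes before it has learned much about path
capacity.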
Several HTTP modifications have been introduced to improve this
interaction with TCP ("persistent connections" in HTTP/1.0, with
improvements in HTTP/1.1 [RFC2616]). For a variety of reasons, many
HTTP interactions are still HTTP/1.0-style - relatively short-lived.
Proposals which reuse TCP congestion information across connections,
like TCP Control Block Interdependence [RFC2140], or the more recent
Congestion Manager [BS00] proposal, will have the effect of making
multiple parallel connections impact the network as if they were a
single connection, "trained" after a single startup transient. These
proposals are critical to the long-term stability of the Internet,
because today's users always have the choice of clicking on the
"reload" button in their browsers and cutting off TCP's exponential
backoff - replacing connections which are building knowledge of the
available bandwidth with connections with no knowledge at all.
5.0 Security Considerations
A potential vulnerability introduced by Fast Retransmit/Fast Recovery
is (as pointed out in [RFC2581]) that an attacker may force TCP
connections to grind to a halt, or, more dangerously, behave more
aggressively. The latter possibility may lead to congestion
collapse, at least in some regions of the network.
Selective acknowledgments are believed to neither strengthen nor
weaken TCP's current security properties [RFC2018].
Given that the recommendations in this document are performed on an
end-to-end basis, they continue working even in the presence of end-
to-end IPsec. This is in direct contrast with mechanisms such as
PEPs, which are implemented in intermediate nodes (section 1.2).
6.0 IANA Considerations
This document is a pointer to other, existing IETF standards. There
are no new IANA considerations.
7.0 Acknowledgements
This recommendation has grown out of RFC 2757, "Long Thin Networks",
which was in turn based on work done in the IETF TCPSAT working
group. The authors are indebted to the active members of the PILC
working group. In particular, Mark Allman and Lloyd Wood gave us
copious and insightful feedback, and Dan Grossman and Jamshid Mahdavi
provided text replacements.
References
[ALL99] M. Allman, "TCP Byte Counting Refinements," ACM Computer
Communication Review, Volume 29, Number 3, July 1999.