Network Working Group                                     J. Hadi Salim
Request for Comments: 2884                              Nortel Networks
Category: Informational                                        U. Ahmed
                                                    Carleton University
                                                              July 2000


   Performance Evaluation of Explicit Congestion Notification (ECN)
                             in IP Networks

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

Abstract

   This memo presents a performance study of the Explicit Congestion
   Notification (ECN) mechanism in the TCP/IP protocol using our
   implementation on the Linux Operating System. ECN is an end-to-end
   congestion avoidance mechanism proposed by [6] and incorporated into
   RFC 2481[7]. We study the behavior of ECN for both bulk and
   transactional transfers. Our experiments show that there is
   improvement in throughput over NON ECN (TCP employing any of Reno,
   SACK/FACK or NewReno congestion control) in the case of bulk
   transfers and substantial improvement for transactional transfers.

   A more complete pdf version of this document is available at:
   http://www7.nortel.com:8080/CTL/ecnperf.pdf

   This memo in its current revision is missing a lot of the visual
   representations and experimental results found in the pdf version.

1. Introduction

   In current IP networks, congestion management is left to the
   protocols running on top of IP. An IP router when congested simply
   drops packets.  TCP is the dominant transport protocol today [26].
   TCP infers that there is congestion in the network by detecting
   packet drops (RFC 2581). Congestion control algorithms [11] [15] [21]
   are then invoked to alleviate congestion.  TCP initially increases
   its sending rate rapidly (slow start) until it detects a packet
   loss. A packet loss is inferred by the receipt of 3 duplicate ACKs
   or detected by a



Salim & Ahmed                Informational                      [Page 1]

RFC 2884                   ECN in IP Networks                  July 2000


   timeout. The sending TCP then moves into a congestion avoidance state
   where it carefully probes the network by sending at a slower rate
   (which goes up until another packet loss is detected).  Traditionally
   a router reacts to congestion by dropping a packet in the absence of
   buffer space. This is referred to as Tail Drop. This method has a
   number of drawbacks (outlined in Section 2). These drawbacks coupled
   with the limitations of end-to-end congestion control have led to
   interest in introducing smarter congestion control mechanisms in
   routers.  One such mechanism is Random Early Detection (RED) [9]
   which detects incipient congestion and implicitly signals the
   oversubscribing flow to slow down by dropping its packets. A RED-
   enabled router detects congestion before the buffer overflows, based
   on a running average queue size, and drops packets probabilistically
   before the queue actually fills up. The probability of dropping a
   newly arriving packet increases as the average queue size rises
   above a low water mark, minth, towards a high water mark, maxth.
   When the average queue size exceeds maxth, all arriving packets are
   dropped.

   An extension to RED is to mark the IP header instead of dropping
   packets (when the average queue size is between minth and maxth;
   above maxth arriving packets are dropped as before). Cooperating end
   systems would then use this as a signal that the network is congested
   and slow down. This is known as Explicit Congestion Notification
   (ECN).  In this paper we study an ECN implementation on Linux for
   both the router and the end systems in a live network.  The memo is
   organized as follows. In Section 2 we give an overview of queue
   management in routers. Section 3 gives an overview of ECN and the
   changes required at the router and the end hosts to support ECN.
   Section 4 defines the experimental testbed and the terminologies used
   throughout this memo. Section 5 introduces the experiments that are
   carried out, outlines the results and presents an analysis of the
   results obtained.  Section 6 concludes the paper.

2. Queue Management in routers

   TCP's congestion control and avoidance algorithms are necessary and
   powerful but are not enough to provide good service in all
   circumstances since they treat the network as a black box. Some sort
   of control is required from the routers to complement the end system
   congestion control mechanisms. More detailed analysis is contained in
   [19].  Queue management algorithms traditionally manage the length of
   packet queues in the router by dropping packets only when the buffer
   overflows.  A maximum length for each queue is configured. The router
   will accept packets till this maximum size is exceeded, at which
   point it will drop incoming packets. New packets are accepted when
   buffer space allows. This technique is known as Tail Drop. This
   method has served the Internet well for years, but has several
   drawbacks.  Since all arriving packets (from all flows) are dropped
   when the buffer overflows, this interacts badly with the congestion
   control mechanism of TCP. A cycle is formed with a burst of drops
   after the maximum queue size is exceeded, followed by a period of
   underutilization at the router as end systems back off. End systems
   then increase their windows simultaneously up to a point where a
   burst of drops happens again. This phenomenon is called Global
   Synchronization. It leads to poor link utilization and lower overall
   throughput [19]. Another problem with Tail Drop is that a single
   connection or a few flows could monopolize the queue space, in some
   circumstances. This results in a lock out phenomenon leading to
   synchronization or other timing effects [19].  Lastly, one of the
   major drawbacks of Tail Drop is that queues remain full for long
   periods of time. One of the major goals of queue management is to
   reduce the steady state queue size [19].  Other queue management
   techniques include random drop on full and drop front on full [13].
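
   The Tail Drop behavior described above can be sketched in a few
   lines. This is only an illustration and is not taken from the Linux
   implementation studied in this memo; the class name TailDropQueue and
   the parameter max_len are hypothetical.

```python
from collections import deque


class TailDropQueue:
    """FIFO queue that drops arrivals only when the buffer is full."""

    def __init__(self, max_len):
        self.max_len = max_len   # configured maximum queue length (packets)
        self.buf = deque()
        self.dropped = 0         # count of packets lost to buffer overflow

    def enqueue(self, pkt):
        # Tail Drop: an arriving packet is lost whenever the queue is full,
        # regardless of which flow it belongs to.
        if len(self.buf) >= self.max_len:
            self.dropped += 1
            return False
        self.buf.append(pkt)
        return True

    def dequeue(self):
        # New packets are accepted again once buffer space allows.
        return self.buf.popleft() if self.buf else None


q = TailDropQueue(max_len=3)
results = [q.enqueue(i) for i in range(5)]  # burst of 5 arrivals, none drained
```

   With a burst larger than the buffer, every packet at the tail of the
   burst is lost, which is the burst of drops that the text associates
   with Global Synchronization.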

2.1. Active Queue Management

   Active queue management mechanisms detect congestion before the queue
   overflows and provide an indication of this congestion to the end
   nodes [7]. With this approach TCP does not have to rely only on
   buffer overflow as the indication of congestion since notification
   happens before serious congestion occurs. One such active management
   technique is RED.

2.1.1. Random Early Detection

   Random Early Detection (RED) [9] is a congestion avoidance mechanism
   implemented in routers which works on the basis of active queue
   management. RED addresses the shortcomings of Tail Drop.  A RED
   router signals incipient congestion to TCP by dropping packets
   probabilistically before the queue runs out of buffer space. This
   drop probability is dependent on a running average queue size to
   avoid any bias against bursty traffic. A RED router randomly drops
   arriving packets, with the result that the probability of dropping a
   packet belonging to a particular flow is approximately proportional
   to the flow's share of bandwidth. Thus, if the sender is using
   relatively more bandwidth it gets penalized by having more of its
   packets dropped.  RED operates by maintaining two thresholds, a
   minimum (minth) and a maximum (maxth). It drops a packet
   probabilistically if and only if the average queue size lies between
   the minth and maxth thresholds. If the average queue size is above
   the maximum threshold, the arriving packet is always dropped. When
   the average queue size is between the minimum and the maximum
   threshold, each arriving packet is dropped with probability pa, where
   pa is a function of the average queue size. As the average queue
   length varies between minth and maxth, pa increases linearly towards
   a configured maximum drop probability, maxp. Beyond maxth, the drop
   probability is 100%.  Dropping packets in this way ensures that when
   some subset of the source TCP packets gets dropped, the affected
   sources invoke congestion avoidance algorithms that ease the
   congestion at the gateway. Since the dropping is distributed across
   flows, the problem of global synchronization is avoided.
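
   The drop decision described in this section can be written down as a
   short sketch. It follows the simplified description given here: the
   probability rises linearly from 0 at minth to maxp at maxth and is
   100% beyond maxth. The exponentially weighted moving average, its
   default weight of 0.002, the treatment of the exact threshold values,
   and all function names are assumptions made for illustration. The
   count-based adjustment described in the RED paper [9] is omitted.

```python
import random


def update_average(avg, qlen, weight=0.002):
    """Running average of the queue size (an EWMA; the weight is assumed)."""
    return (1.0 - weight) * avg + weight * qlen


def red_drop_probability(avg, minth, maxth, maxp):
    """Drop probability pa as a function of the average queue size."""
    if avg < minth:
        return 0.0                      # below minth: never drop
    if avg >= maxth:
        return 1.0                      # beyond maxth: always drop
    # Between the thresholds: linear increase towards maxp.
    return maxp * (avg - minth) / (maxth - minth)


def red_should_drop(avg, minth, maxth, maxp, rng=random.random):
    """Probabilistic decision for one arriving packet."""
    return rng() < red_drop_probability(avg, minth, maxth, maxp)


# Example: halfway between minth=5 and maxth=15 with maxp=0.1.
p_mid = red_drop_probability(10, 5, 15, 0.1)
```

   Because the decision uses the running average rather than the
   instantaneous queue length, a short burst moves the average only
   slightly, which is how the bias against bursty traffic is avoided.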

3. Explicit Congestion Notification

   Explicit Congestion Notification is an extension proposed to RED
   which marks a packet instead of dropping it when the average queue
   size is between minth and maxth [7]. Since ECN marks packets before
   congestion actually occurs, this is useful for protocols like TCP
   that are sensitive to even a single packet loss. Upon receipt of a
   congestion marked packet, the TCP receiver informs the sender (in the
   subsequent ACK) about incipient congestion which will in turn trigger
   the congestion avoidance algorithm at the sender.  ECN requires
   support from both the router and the end hosts, i.e., the end
   hosts' TCP stacks need to be modified. Packets from flows that are not
   ECN capable will continue to be dropped by RED (as was the case
   before ECN).

3.1. Changes at the router

   Router side support for ECN can be added by modifying current RED
   implementations. For packets from ECN capable hosts, the router marks
   the packets rather than dropping them (if the average queue size is
   between minth and maxth).  The router must be able to identify
   whether a packet is ECN capable, and should mark only packets that
   are from ECN capable hosts. This uses two bits in the IP header.  The
   ECN Capable Transport (ECT) bit is set by the sender end system only
   if the transport is ECN capable (for a unicast transport, only if
   both end systems are ECN-capable). In TCP this is confirmed in the
   pre-negotiation during the connection setup phase (explained in
   Section 3.2).  Packets encountering congestion are marked by the
   router using the Congestion Experienced (CE) bit (if the average
   queue size is between minth and maxth) on their way to the receiver
   end system (from the sender end system), with a probability
   proportional to the average queue size following the procedure used
   in RED (RFC 2309) routers.  Bits 10 and 11 in the IPv6 header are
   proposed for the ECT and CE bits respectively. Bits 6 and 7 of the
   IPv4 header TOS octet (the two bits that follow the DSCP field) are
   also specified for experimental purposes for the ECT and CE bits
   respectively.
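
   As an illustration of the router-side change, the sketch below shows
   the mark-or-drop choice for a packet that the RED procedure has
   already selected. It assumes the RFC 2481 layout, in which bits 6 and
   7 of the IPv4 TOS octet (numbering from the most significant bit)
   carry ECT and CE, giving the masks 0x02 and 0x01. The function name
   and its return convention are hypothetical and are not taken from the
   Linux router code used in this study.

```python
ECT = 0x02  # bit 6 of the IPv4 TOS octet (bit 0 is the most significant)
CE = 0x01   # bit 7 of the IPv4 TOS octet


def handle_selected_packet(tos, avg, maxth):
    """Decide what to do with a packet that RED has selected.

    Assumes the caller has already established that avg >= minth and that
    the probabilistic test picked this packet.
    Returns (action, new_tos), where action is "mark" or "drop".
    """
    if avg >= maxth:
        return "drop", tos          # above maxth packets are dropped as before
    if tos & ECT:
        return "mark", tos | CE     # ECN capable flow: set CE instead of dropping
    return "drop", tos              # not ECN capable: plain RED drop


action, new_tos = handle_selected_packet(ECT, avg=10, maxth=15)
```

   Only the CE bit changes on a marked packet; the remaining bits of the
   octet are passed through untouched.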

3.2. Changes at the TCP Host side

   The proposal to add ECN to TCP specifies two new flags in the
   reserved field of the TCP header. Bit 9 in the reserved field of the
   TCP header is designated as the ECN-Echo (ECE) flag and Bit 8 is
   designated as the Congestion Window Reduced (CWR) flag.  These two
   bits are used both for the initializing phase in which the sender and
   the receiver negotiate the capability and the desire to use ECN, as
   well as for the subsequent actions to be taken in case there is
   congestion experienced in the network during the established state.

   Two main changes need to be made to add ECN to TCP on an end system,
   along with one extension to a router running RED.

   1. In the connection setup phase, the source and destination TCPs
   have to exchange information about their desire and/or capability to
   use ECN. This is done by setting both the ECN-Echo flag and the CWR
   flag in the SYN packet of the initial connection phase by the sender;
   on receipt of this SYN packet, the receiver will set the ECN-Echo
   flag in the SYN-ACK response. Once this agreement has been reached,
   the sender will thereon set the ECT bit in the IP header of data
   packets for that flow, to indicate to the network that it is capable
   and willing to participate in ECN. The ECT bit is set on all packets
   other than pure ACKs.

   2. When a router has decided, based on its active queue management
   mechanism, to drop or mark a packet, it checks the IP-ECT bit in the
   packet header. It sets the CE bit in the IP header if the IP-ECT bit
   is set. When such a packet reaches the receiver, the receiver
   responds by setting the ECN-Echo flag (in the TCP header) in the next
   outgoing ACK for the flow. The receiver will continue to do this in
   subsequent ACKs until it receives from the sender an indication that
   it (the sender) has responded to the congestion notification.

   3. Upon receipt of this ACK, the sender triggers its congestion
   avoidance algorithm by halving its congestion window, cwnd, and
   updating its congestion window threshold value ssthresh. Once it has
   taken these appropriate steps, the sender sets the CWR bit on the
   next outgoing data packet to tell the receiver that it has reacted to
   the (receiver's) notification of congestion.  The receiver reacts to
   the CWR by halting the sending of the congestion notifications (ECE)
   to the sender if there is no new congestion in the network.
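
   The negotiation in step 1 and the receiver behavior in steps 2 and 3
   can be sketched as follows. The flag masks are the usual values in
   the TCP flags octet, with CWR and ECE occupying the two former
   reserved bits next to URG. The function and class names are
   hypothetical, and the sketch models only the flag handling, not a
   real TCP stack.

```python
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80


def syn_flags(want_ecn):
    """Flags on the initial SYN: ECE and CWR are both set to request ECN."""
    return SYN | ((ECE | CWR) if want_ecn else 0)


def syn_ack_flags(syn, local_ecn):
    """Flags on the SYN-ACK: ECE is set only if ECN was requested and is supported."""
    requested = (syn & (ECE | CWR)) == (ECE | CWR)
    return SYN | ACK | (ECE if requested and local_ecn else 0)


class EcnReceiver:
    """Echoes congestion in every ACK until the sender signals CWR."""

    def __init__(self):
        self.echo = False

    def on_data(self, ce_marked, cwr_flag):
        if cwr_flag:
            self.echo = False   # sender has reacted: stop echoing
        if ce_marked:
            self.echo = True    # new congestion: start (or keep) echoing
        return ACK | (ECE if self.echo else 0)


r = EcnReceiver()
acks = [r.on_data(False, False), r.on_data(True, False),
        r.on_data(False, False), r.on_data(False, True)]
```

   Checking the CWR flag before the CE bit means that a packet carrying
   both leaves the echo switched on, which matches the condition "if
   there is no new congestion in the network" in step 3.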

   Note that the sender reaction to the indication of congestion in the
   network (when it receives an ACK packet that has the ECN-Echo flag
   set) is equivalent to the Fast Retransmit/Recovery algorithm (when
   there is a congestion loss) in NON-ECN-capable TCP, i.e., the sender
   halves the congestion window cwnd and reduces the slow start
   threshold ssthresh. Fast Retransmit/Recovery is still available for
   ECN capable stacks for responding to three duplicate acknowledgments.
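
   A minimal sketch of the sender reaction follows. It halves cwnd on an
   ACK carrying ECN-Echo and sets CWR on the next outgoing data packet.
   The floor of two segments on ssthresh is borrowed from RFC 2581, cwnd
   is used in place of the amount of outstanding data, and the guard
   that ignores further ECN-Echo ACKs until CWR has been sent is a
   simplifying assumption. The class name and byte-based units are
   hypothetical.

```python
ACK, CWR = 0x10, 0x80


class EcnSender:
    """Tracks cwnd/ssthresh (in bytes) and the pending CWR indication."""

    def __init__(self, cwnd, ssthresh, mss):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.mss = mss
        self.cwr_pending = False

    def on_ack(self, ece):
        # React to ECN-Echo as to a congestion loss, but with nothing
        # to retransmit: halve the window and update ssthresh.
        if ece and not self.cwr_pending:
            self.ssthresh = max(self.cwnd // 2, 2 * self.mss)
            self.cwnd = self.ssthresh
            self.cwr_pending = True

    def next_data_flags(self):
        # CWR goes out once, on the next data packet after the reduction.
        flags = ACK | (CWR if self.cwr_pending else 0)
        self.cwr_pending = False
        return flags


s = EcnSender(cwnd=16000, ssthresh=64000, mss=1000)
s.on_ack(ece=True)
s.on_ack(ece=True)      # ignored: CWR has not been sent yet
first = s.next_data_flags()
second = s.next_data_flags()
```

   Unlike Fast Retransmit, no segment is resent here, since the marked
   packet was delivered rather than dropped.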






4. Experimental setup

   For testing purposes we have added ECN to the Linux TCP/IP stack,
   kernel versions 2.0.32, 2.2.5, and 2.3.43 (there were also earlier
   revisions of 2.3 which were tested).  The 2.0.32 implementation
   conforms to RFC 2481 [7] for the end systems only. We have also
   modified the code in the 2.1, 2.2 and 2.3 cases for the router
   portion as well as the end system to conform to the RFC. An outdated
   version of the 2.0 code is available at [18].  Note that Linux
   version 2.0.32 implements TCP Reno congestion control while kernels
   >= 2.2.0 default to New Reno but will opt for a SACK/FACK combo when
   the remote end understands SACK.  Our initial tests were carried out
   with the 2.0 kernel at the end system and 2.1 (pre 2.2) for the
   router part.  The majority of the test results here apply to the 2.0
   tests. We did repeat these tests on a different testbed with faster
   machines (moving from Pentium to Pentium-II class machines) for the
   2.2 and 2.3 kernels, so the results for 2.0 and 2.2/2.3 are not
   directly comparable.

   We have updated this memo release to reflect the tests against SACK
   and New Reno.

4.1. Testbed setup

                                             -----      ----
                                            | ECN |    | ECN |
                                            | ON  |    | OFF |
          data direction ---->>              -----      ----
                                              |          |
      server                                  |          |
       ----        ------        ------       |          |
      |    |      |  R1  |      |  R2  |      |          |
      |    | -----|      | ---- |      | ----------------------
       ----        ------ ^      ------             |
                          ^                         |
                          |                        -----
      congestion point ___|                       |  C  |
                                                  |     |
                                                   -----

   The figure above shows our test setup.

   All the physical links are 10Mbps ethernet.  Using Class Based
   Queuing (CBQ) [22], packets from the data server are constricted to a
   1.5Mbps pipe at the router R1. Data is always retrieved from the
   server towards the clients labelled "ECN ON", "ECN OFF", and "C".
   Since the pipe from the server is 10Mbps, this creates congestion at
   the exit from the router towards the clients for competing flows. The
   machines labeled "ECN ON" and "ECN OFF"  are running the same version


