RFC 3124                 The Congestion Manager                June 2001


   Section 5.2).  A later guideline document is expected to describe a
   few simple schedulers (e.g., weighted round-robin, hierarchical
   scheduling) and the API they export to provide relative
   prioritization.

4. CM internals

   This section describes the internal components of the CM.  It
   includes a Congestion Controller and a Scheduler, with well-defined,
   abstract interfaces exported by them.

4.1 Congestion controller

   Associated with each macroflow is a congestion control algorithm; the
   collection of all these algorithms comprises the congestion
   controller of the CM.  The control algorithm decides when and how
   much data can be transmitted by a macroflow.  It uses application
   notifications (Section 3.3) from concurrent streams on the same
   macroflow to build up information about the congestion state of the
   network path used by the macroflow.

   The congestion controller MUST implement a "TCP-friendly" [Mahdavi98]
   congestion control algorithm.  Several macroflows MAY (and indeed,
   often will) use the same congestion control algorithm but each
   macroflow maintains state about the network used by its streams.

   The congestion control module MUST implement the following abstract
   interfaces.  We emphasize that these are not directly visible to
   applications; they are within the context of a macroflow, and are
   different from the CM API functions of Section 3.

   - void query(u64 *rate, u32 *srtt, u32 *rttdev): This function
     returns the estimated rate (in bits per second), the smoothed
     round trip time (in microseconds), and the round trip time
     deviation for the macroflow.

   - void notify(u32 nsent): This function MUST be used to notify the
     congestion control module whenever data is sent by an
     application.  The nsent parameter indicates the number of bytes
     just sent by the application.

   - void update(u32 nsent, u32 nrecd, u32 rtt, u32 lossmode): This
     function is called whenever any of the CM streams associated with
     a macroflow identifies that data has reached the receiver or has
     been lost en route.  The nrecd parameter indicates the number of
     bytes that have just arrived at the receiver.  The nsent
     parameter is the sum of the number of bytes just received and the





Balakrishnan, et. al.       Standards Track                    [Page 12]

     number of bytes identified as lost en route.  The rtt parameter is
     the estimated round trip time in microseconds during the
     transfer.  The lossmode parameter provides an indicator of how a
     loss was detected (Section 3.3).

   Although these interfaces are not visible to applications, the
   congestion controller MUST implement these abstract interfaces to
   provide for modular inter-operability with different separately-
   developed schedulers.
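
   As an illustration only, the three controller interfaces could be
   sketched in C as follows.  The struct cc_state, the cc_ prefix, the
   explicit state pointer, the CC_MTU constant, and the simple
   window-based increase/decrease rule are all assumptions made for
   this sketch; this document does not define them, and a conformant
   controller may look quite different.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

#define CC_MTU 1500u   /* assumed MTU in bytes (hypothetical constant) */

/* Hypothetical per-macroflow state; field names are illustrative. */
struct cc_state {
    u32 cwnd;    /* congestion window, bytes */
    u32 ownd;    /* outstanding (unresolved) bytes */
    u32 srtt;    /* smoothed round trip time, microseconds */
    u32 rttdev;  /* round trip time deviation, microseconds */
};

void cc_init(struct cc_state *s) {
    s->cwnd = CC_MTU;
    s->ownd = 0;
    s->srtt = 0;
    s->rttdev = 0;
}

/* query(): rate in bits per second, estimated here as cwnd per srtt. */
void cc_query(const struct cc_state *s, u64 *rate, u32 *srtt, u32 *rttdev) {
    *rate = s->srtt ? ((u64)s->cwnd * 8u * 1000000u) / s->srtt : 0;
    *srtt = s->srtt;
    *rttdev = s->rttdev;
}

/* notify(): nsent bytes have just been sent by an application. */
void cc_notify(struct cc_state *s, u32 nsent) {
    s->ownd += nsent;
}

/* update(): nsent bytes are resolved; nrecd of them arrived, the rest
 * were lost.  lossmode is ignored in this simplified sketch. */
void cc_update(struct cc_state *s, u32 nsent, u32 nrecd, u32 rtt,
               u32 lossmode) {
    (void)lossmode;
    assert(nrecd <= nsent);
    s->ownd = (s->ownd > nsent) ? s->ownd - nsent : 0;

    if (rtt != 0) {                       /* rtt == 0 means "no sample" */
        if (s->srtt == 0) {
            s->srtt = rtt;
            s->rttdev = rtt / 2u;
        } else {
            u32 err = (rtt > s->srtt) ? rtt - s->srtt : s->srtt - rtt;
            s->rttdev = (3u * s->rttdev + err) / 4u;
            s->srtt = (7u * s->srtt + rtt) / 8u;
        }
    }

    if (nrecd < nsent) {
        /* Loss: halve the window, but never below one MTU. */
        s->cwnd = (s->cwnd / 2u > CC_MTU) ? s->cwnd / 2u : CC_MTU;
    } else {
        /* No loss: grow by roughly one MTU per window of data. */
        s->cwnd += (u32)(((u64)CC_MTU * nrecd) / s->cwnd);
    }
}
```

   In this sketch a caller would invoke cc_notify() for every
   transmission and cc_update() for every piece of receiver feedback,
   and would read the current estimates back with cc_query().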

   The congestion control module MUST also call the associated
   scheduler's schedule function (Section 4.2) when it believes that the
   current congestion state allows an MTU-sized packet to be sent.
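
   One possible reading of this requirement is sketched below: the
   controller offers the scheduler one MTU at a time for as long as the
   window has room.  The cwnd/ownd window model, the MTU_BYTES value,
   and the names cc_offer_window() and demo_schedule() are assumptions
   made for the sketch and are not defined by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

#define MTU_BYTES 1500u   /* assumed MTU; not specified by the text */

/* Shape of the scheduler's schedule() entry point. */
typedef void (*schedule_fn)(u32 num_bytes);

/* Demonstration scheduler: only totals the bytes it was offered. */
u32 demo_granted_bytes = 0;
void demo_schedule(u32 num_bytes) { demo_granted_bytes += num_bytes; }

/* Calls schedule() once for every MTU-sized opening between the
 * outstanding data (ownd) and the window (cwnd).  Returns the number
 * of calls made. */
u32 cc_offer_window(u32 cwnd, u32 ownd, schedule_fn schedule) {
    u32 calls = 0;
    assert(schedule != NULL);
    while (ownd <= cwnd && cwnd - ownd >= MTU_BYTES) {
        schedule(MTU_BYTES);
        ownd += MTU_BYTES;   /* treat the offered bytes as committed */
        calls++;
    }
    return calls;
}
```

   With a window of 4500 bytes and 1000 bytes outstanding, this sketch
   makes two schedule() calls and leaves the remaining 500 bytes of
   window unused until more data is acknowledged.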

4.2 Scheduler

   While it is the responsibility of the congestion control module to
   determine when and how much data can be transmitted, it is the
   responsibility of a macroflow's scheduler module to determine which
   of the streams should get the opportunity to transmit data.

   The Scheduler MUST implement the following interfaces:

   - void schedule(u32 num_bytes): When the congestion control module
     determines that data can be sent, the schedule() routine MUST be
     called with no more than the number of bytes that can be sent.
     In turn, the scheduler MAY call the cmapp_send() function that CM
     applications must provide.

   - float query_share(i32 cm_streamid): This call returns the
     described stream's share of the total bandwidth available to the
     macroflow.  This call combined with the query call of the
     congestion controller provides the information to satisfy an
     application's cm_query() request.

   - void notify(i32 cm_streamid, u32 nsent): This interface is used
     to notify the scheduler module whenever data is sent by a CM
     application.  The nsent parameter indicates the number of bytes
     just sent by the application.

     The Scheduler MAY implement many additional interfaces.  As
     experience with CM schedulers increases, future documents may
     make additions and/or changes to some parts of the scheduler
     API.
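
   A minimal sketch of a scheduler exporting these three interfaces is
   given below, assuming plain (unweighted) round-robin with equal
   shares.  The struct rr_sched, the rr_ prefix, the explicit state
   pointer, rr_request(), rr_stream_rate(), and the fixed capacity are
   hypothetical; they are chosen for illustration and are not part of
   the interfaces above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t  i32;
typedef uint32_t u32;
typedef uint64_t u64;

#define RR_MAX_STREAMS 8u   /* arbitrary capacity for this sketch */

/* Stand-in for the application's cmapp_send() callback. */
typedef void (*cmapp_send_fn)(i32 cm_streamid);

struct rr_sched {
    i32 ids[RR_MAX_STREAMS];
    u32 pending[RR_MAX_STREAMS];  /* unserved send requests per stream */
    u32 sent[RR_MAX_STREAMS];     /* bytes reported through notify() */
    u32 n;                        /* number of streams */
    u32 next;                     /* index where the next scan starts */
    cmapp_send_fn send;
};

/* Demonstration callback: remembers which stream was granted last. */
i32 rr_last_granted = -1;
void rr_record_send(i32 cm_streamid) { rr_last_granted = cm_streamid; }

void rr_init(struct rr_sched *s, cmapp_send_fn send) {
    assert(send != NULL);
    memset(s, 0, sizeof *s);
    s->send = send;
}

int rr_add(struct rr_sched *s, i32 id) {
    if (s->n == RR_MAX_STREAMS) return -1;
    s->ids[s->n++] = id;
    return 0;
}

static int rr_find(const struct rr_sched *s, i32 id) {
    for (u32 i = 0; i < s->n; i++)
        if (s->ids[i] == id) return (int)i;
    return -1;
}

/* Records that a stream has asked for permission to send. */
void rr_request(struct rr_sched *s, i32 id) {
    int i = rr_find(s, id);
    if (i >= 0) s->pending[i]++;
}

/* schedule(): grant one transmission to the next requesting stream. */
void rr_schedule(struct rr_sched *s, u32 num_bytes) {
    (void)num_bytes;              /* one grant per call in this sketch */
    for (u32 k = 0; k < s->n; k++) {
        u32 i = (s->next + k) % s->n;
        if (s->pending[i] > 0) {
            s->pending[i]--;
            s->next = (i + 1u) % s->n;
            s->send(s->ids[i]);
            return;
        }
    }
}

/* query_share(): equal shares; 0 for an unknown stream. */
float rr_query_share(const struct rr_sched *s, i32 id) {
    return (rr_find(s, id) >= 0) ? 1.0f / (float)s->n : 0.0f;
}

/* notify(): account for bytes actually sent by a stream. */
void rr_notify(struct rr_sched *s, i32 id, u32 nsent) {
    int i = rr_find(s, id);
    if (i >= 0) s->sent[i] += nsent;
}

/* Per-stream rate: macroflow rate scaled by the stream's share. */
u64 rr_stream_rate(u64 macroflow_rate_bps, float share) {
    return (u64)((double)macroflow_rate_bps * share);
}
```

   The last helper shows how the share could be combined with the rate
   returned by the congestion controller's query call to answer an
   application's cm_query() request.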

5. Examples

5.1 Example applications

   This section describes three possible uses of the CM API by
   applications.  We describe two asynchronous applications (an
   implementation of a TCP sender and an implementation of congestion-
   controlled UDP sockets) and one synchronous application (a streaming
   audio server).  More details of these applications and CM
   implementation optimizations for efficient operation are described in
   [Andersen00].

   All applications that use the CM MUST incorporate feedback from the
   receiver.  For example, an application must periodically (typically
   once or twice per round trip time) determine how many of its packets
   arrived at the receiver.  When the source gets this feedback, it MUST
   use cm_update() to inform the CM of this new information.  This
   results in the CM updating ownd and may result in the CM changing its
   estimates and calling cmapp_update() of the streams of the macroflow.

   The protocols in this section are examples and suggestions for
   implementation, rather than requirements for any conformant
   implementation.

5.1.1 TCP

   A TCP implementation that uses CM should use the cmapp_send()
   callback API.  TCP only identifies which data it should send upon the
   arrival of an acknowledgement or expiration of a timer.  As a result,
   it requires tight control over when and if new data or
   retransmissions are sent.

   When TCP either connects to or accepts a connection from another
   host, it performs a cm_open() call to associate the TCP connection
   with a cm_streamid.

   Once a connection is established, the CM is used to control the
   transmission of outgoing data.  The CM eliminates the need for
   tracking and reacting to congestion in TCP, because the CM and its
   transmission API ensure proper congestion behavior.  Loss recovery is
   still performed by TCP based on fast retransmissions and recovery as
   well as timeouts.  In addition, TCP is also modified to have its own
   outstanding window (tcp_ownd) estimate.  Whenever data segments are
   sent from its cmapp_send() callback, TCP updates its tcp_ownd value.
   The ownd variable is also updated after each cm_update() call.  TCP
   also maintains a count of the number of outstanding segments
   (pkt_cnt).  At any time, TCP can calculate the average packet size
   (avg_pkt_size) as tcp_ownd/pkt_cnt.  The avg_pkt_size is used by TCP
   to help estimate the amount of outstanding data.  Note that this is
   not needed if the SACK option is used on the connection, since this
   information is explicitly available.
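
   The bookkeeping described above might be kept as in the following
   sketch.  The struct tcp_acct and its helper functions are
   hypothetical names, and the way resolved bytes and segments are
   subtracted after a cm_update() call is an assumption; the text only
   states that the values are updated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* TCP's private view of its outstanding data. */
struct tcp_acct {
    u32 tcp_ownd;  /* outstanding bytes */
    u32 pkt_cnt;   /* outstanding segments */
};

/* Called from the cmapp_send() callback for each segment sent. */
void tcp_acct_sent(struct tcp_acct *a, u32 seg_bytes) {
    assert(a != NULL && seg_bytes > 0);
    a->tcp_ownd += seg_bytes;
    a->pkt_cnt += 1;
}

/* Called after a cm_update() call that resolved `bytes` bytes carried
 * in `pkts` segments (assumed bookkeeping; saturates at zero). */
void tcp_acct_resolved(struct tcp_acct *a, u32 bytes, u32 pkts) {
    assert(a != NULL);
    a->tcp_ownd = (bytes < a->tcp_ownd) ? a->tcp_ownd - bytes : 0;
    a->pkt_cnt  = (pkts  < a->pkt_cnt)  ? a->pkt_cnt  - pkts  : 0;
}

/* avg_pkt_size = tcp_ownd / pkt_cnt; 0 when nothing is outstanding. */
u32 tcp_acct_avg_pkt_size(const struct tcp_acct *a) {
    return a->pkt_cnt ? a->tcp_ownd / a->pkt_cnt : 0;
}
```

   For example, after sending segments of 1000 and 500 bytes the
   average packet size in this sketch is 750 bytes.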

   The TCP output routines are modified as follows:

      1. All congestion window (cwnd) checks are removed.

      2. When application data is available, the TCP output routines
      perform all non-congestion checks (Nagle algorithm, receiver-
      advertised window check, etc).  If these checks pass, the output
      routine queues the data and calls cm_request() for the stream.

      3. If incoming data or timers result in a loss being detected, the
      retransmission is also placed in a queue and cm_request() is
      called for the stream.

      4. The cmapp_send() callback for TCP is set to an output routine.
      If any retransmission is enqueued, the routine outputs the
      retransmission.  Otherwise, the routine outputs as much new data
      as the TCP connection state allows.  However, the cmapp_send()
      callback never sends more than a single segment per call.  This
      routine arranges for the other output computations to be done,
      such as header and options computations.
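
   The queueing discipline in steps 2 through 4 can be sketched as
   below.  The sketch counts queued segments instead of holding real
   data, and it replaces the cm_request() call with a counter so that
   it stays self-contained; struct tcp_out, enum tcp_seg, and the
   function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

enum tcp_seg { TCP_SEG_NONE, TCP_SEG_RETX, TCP_SEG_NEW };

struct tcp_out {
    u32 retx_queued;  /* retransmissions waiting for a CM grant */
    u32 new_queued;   /* new segments waiting for a CM grant */
    u32 cm_requests;  /* stands in for calls to cm_request() */
};

/* Step 2: new data passed all non-congestion checks. */
void tcp_out_queue_new(struct tcp_out *t) {
    assert(t != NULL);
    t->new_queued++;
    t->cm_requests++;
}

/* Step 3: a loss was detected; queue the retransmission. */
void tcp_out_queue_retx(struct tcp_out *t) {
    assert(t != NULL);
    t->retx_queued++;
    t->cm_requests++;
}

/* Step 4: the cmapp_send() callback.  Sends at most one segment per
 * call and prefers a queued retransmission over new data.  Returns
 * the kind of segment that was output. */
enum tcp_seg tcp_cmapp_send(struct tcp_out *t) {
    assert(t != NULL);
    if (t->retx_queued > 0) {
        t->retx_queued--;
        return TCP_SEG_RETX;
    }
    if (t->new_queued > 0) {
        t->new_queued--;
        return TCP_SEG_NEW;
    }
    return TCP_SEG_NONE;
}
```

   Because every queued segment produces its own request in this
   sketch, the CM is expected to invoke the callback once per segment.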

   The IP output routine on the host calls cm_notify() when the packets
   are actually sent out.  Because it does not know which cm_streamid is
   responsible for the packet, cm_notify() takes the stream_info as
   argument (see Section 3 for what the stream_info should contain).
   Because cm_notify() reports the IP payload size, TCP keeps track of
   the total header size and incorporates these updates.

   The TCP input routines are modified as follows:

      1. RTT estimation is done as normal using either timestamps or
      Karn's algorithm.  Any rtt estimate that is generated is passed to
      CM via the cm_update call.

      2. All cwnd and slow start threshold (ssthresh) updates are
      removed.

      3. Upon the arrival of an ack for new data, TCP computes the value
      of in_flight (the amount of data in flight) as snd_max-ack-1
      (i.e., MAX Sequence Sent - Current Ack - 1).  TCP then calls
      cm_update(streamid, tcp_ownd - in_flight, 0, CM_NO_CONGESTION,
      rtt).

      4. Upon the arrival of a duplicate acknowledgement, TCP must check
      its dupack count (dup_acks) to determine its action.  If dup_acks
      < 3, TCP does nothing.  If dup_acks == 3, TCP assumes that a
      packet was lost and that at least 3 packets arrived to generate
      these duplicate acks.  Therefore, it calls cm_update(streamid, 4 *
      avg_pkt_size, 3 * avg_pkt_size, CM_LOSS_FEEDBACK, rtt).  The
      average packet size is used since the acknowledgments do not
      indicate exactly how much data has reached the other end.  Most
      TCP implementations interpret a duplicate ACK as an indication
      that a full MSS has reached its destination.  Once a new ACK is
      received, these TCP sender implementations may resynchronize with
      the TCP receiver.  The CM API does not provide a mechanism for TCP
      to pass information from this resynchronization.  Therefore, TCP
      can only infer the arrival of an avg_pkt_size amount of data from
      each duplicate ack.  TCP also enqueues a retransmission of the lost
      segment and calls cm_request().  If dup_acks > 3, TCP assumes that
      a packet has reached the other end and caused this ack to be sent.
      As a result, it calls cm_update(streamid, avg_pkt_size,
      avg_pkt_size, CM_NO_CONGESTION, rtt).

      5. Upon the arrival of a partial acknowledgment (one that does not
      exceed the highest segment transmitted at the time the loss
      occurred, as defined in [Floyd99b]), TCP assumes that a packet was
      lost and that the retransmitted packet has reached the recipient.
      Therefore, it calls cm_update(streamid, 2 * avg_pkt_size,
      avg_pkt_size, CM_NO_CONGESTION, rtt).  CM_NO_CONGESTION is used
      since the loss period has already been reported.  TCP also
      enqueues a retransmission of the lost segment and calls
      cm_request().

   When the TCP retransmission timer expires, the sender identifies that
   a segment has been lost and calls cm_update(streamid, avg_pkt_size,
   0, CM_NO_FEEDBACK, 0) to signify that no feedback has been received
   from the receiver and that one segment is sure to have "left the
   pipe."  TCP also enqueues a retransmission of the lost segment and
   calls cm_request().
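
   The cm_update() calls made by the input routines and by the
   retransmission timer can be summarized in code.  The sketch below
   only computes the arguments; it reproduces the second and third
   arguments positionally, exactly as written in the text above.  The
   struct cm_update_call, the function names, and the numeric values
   of the lossmode constants are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* Names are from the text; the numeric values are assumed. */
enum cm_lossmode { CM_NO_FEEDBACK, CM_LOSS_FEEDBACK, CM_NO_CONGESTION };

struct cm_update_call {
    int call;                  /* 1 if cm_update() should be called */
    u32 arg2, arg3;            /* second and third arguments */
    enum cm_lossmode lossmode;
    u32 rtt;
    int retransmit;            /* 1: enqueue retx and cm_request() */
};

/* Input step 3: ack for new data. */
struct cm_update_call tcp_on_new_ack(u32 tcp_ownd, u32 snd_max, u32 ack,
                                     u32 rtt) {
    u32 in_flight = snd_max - ack - 1u;
    assert(in_flight <= tcp_ownd);
    return (struct cm_update_call){ 1, tcp_ownd - in_flight, 0,
                                    CM_NO_CONGESTION, rtt, 0 };
}

/* Input step 4: duplicate ack; dup_acks is the count including it. */
struct cm_update_call tcp_on_dup_ack(u32 dup_acks, u32 avg_pkt_size,
                                     u32 rtt) {
    if (dup_acks < 3u)
        return (struct cm_update_call){ 0, 0, 0, CM_NO_CONGESTION, 0, 0 };
    if (dup_acks == 3u)
        return (struct cm_update_call){ 1, 4u * avg_pkt_size,
                                        3u * avg_pkt_size,
                                        CM_LOSS_FEEDBACK, rtt, 1 };
    return (struct cm_update_call){ 1, avg_pkt_size, avg_pkt_size,
                                    CM_NO_CONGESTION, rtt, 0 };
}

/* Input step 5: partial acknowledgment. */
struct cm_update_call tcp_on_partial_ack(u32 avg_pkt_size, u32 rtt) {
    return (struct cm_update_call){ 1, 2u * avg_pkt_size, avg_pkt_size,
                                    CM_NO_CONGESTION, rtt, 1 };
}

/* Retransmission timer expiry: no feedback, rtt passed as 0. */
struct cm_update_call tcp_on_rto(u32 avg_pkt_size) {
    return (struct cm_update_call){ 1, avg_pkt_size, 0,
                                    CM_NO_FEEDBACK, 0, 1 };
}
```

   A caller would pass the returned values to cm_update() together
   with its cm_streamid and, when retransmit is set, enqueue the lost
   segment and call cm_request().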

5.1.2 Congestion-controlled UDP

   Congestion-controlled UDP is a useful CM application, which we
   describe in the context of Berkeley sockets [Stevens94].  CM UDP
   sockets provide the same functionality as standard Berkeley UDP
   sockets, but instead of immediately sending the data from the kernel
   packet queue to lower layers for transmission, the buffered socket
   implementation makes calls to the API exported by the CM inside the
   kernel and gets callbacks from the CM.  When a CM UDP socket is
   created, it is bound to a particular stream.  Later, when data is
   added to the packet queue, cm_request() is called on the stream
   associated with the
   socket.  When the CM schedules this stream for transmission, it calls
   udp_ccappsend() in the UDP module.  This function transmits one MTU
   from the packet queue, and schedules the transmission of any
   remaining packets.  The in-kernel implementation of the CM UDP API
   should not require any additional data copies and should support all
   standard UDP options.  Modifying existing applications to use
   congestion-controlled UDP requires the implementation of a new socket
   option on the socket.  To work correctly, the sender must obtain
   feedback about congestion.  This can be done in at least two ways:
   (i) the UDP receiver application can provide feedback to the sender
   application, which will inform the CM of network conditions using
   cm_update(); (ii) the UDP receiver implementation can provide
   feedback to the sending UDP.  Note that this latter alternative
   requires changes to the receiver's network stack and the sender UDP
   cannot assume that all receivers support this option without explicit
   negotiation.
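
   The send path of such a socket might look like the following
   sketch.  It models the packet queue as a ring of datagram lengths
   and replaces the cm_request() call with a counter.  The struct
   cc_udp_sock, cc_udp_enqueue(), the queue capacity, and the choice to
   keep at most one request outstanding (re-requesting from the
   callback while packets remain) are assumptions; only the name
   udp_ccappsend() comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

#define UDP_QUEUE_CAP 16u   /* arbitrary capacity for this sketch */

struct cc_udp_sock {
    u32 len[UDP_QUEUE_CAP];  /* lengths of queued datagrams */
    u32 head;                /* index of the oldest datagram */
    u32 count;               /* number of queued datagrams */
    int request_pending;     /* 1 while a CM request is outstanding */
    u32 cm_requests;         /* stands in for calls to cm_request() */
    u32 bytes_sent;
};

static void cc_udp_request(struct cc_udp_sock *s) {
    s->request_pending = 1;
    s->cm_requests++;
}

/* Adds a datagram to the packet queue and asks the CM for permission
 * to send if no request is outstanding.  Returns -1 if the queue is
 * full. */
int cc_udp_enqueue(struct cc_udp_sock *s, u32 dgram_len) {
    assert(s != NULL);
    if (s->count == UDP_QUEUE_CAP) return -1;
    s->len[(s->head + s->count) % UDP_QUEUE_CAP] = dgram_len;
    s->count++;
    if (!s->request_pending) cc_udp_request(s);
    return 0;
}

/* Callback invoked when the CM schedules this stream.  Transmits one
 * queued datagram and schedules the rest by requesting again.
 * Returns the number of bytes transmitted. */
u32 udp_ccappsend(struct cc_udp_sock *s) {
    assert(s != NULL);
    s->request_pending = 0;
    if (s->count == 0) return 0;
    u32 n = s->len[s->head];
    s->head = (s->head + 1u) % UDP_QUEUE_CAP;
    s->count--;
    s->bytes_sent += n;
    if (s->count > 0) cc_udp_request(s);
    return n;
}
```

   Keeping a single outstanding request is only one way to read
   "schedules the transmission of any remaining packets"; issuing one
   request per enqueued datagram would be an equally plausible design.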

5.1.3 Audio server

   A typical audio application often has access to the sample in a
   multitude of data rates and qualities.  The objective of the
   application is then to deliver the highest possible quality of audio
   (typically the highest data rate) to its clients.  The selection of
   which version of audio to transmit should be based on the current
   congestion state of the network.  In addition, the source will want