   entries should be initialized to be the MTU of the associated
   first-hop data link, and must never be changed by the PMTU Discovery
   process.  (PMTU Discovery only creates or changes entries for
   per-host routes).  Until a Datagram Too Big message is received, the
   PMTU associated with the initially-chosen route is presumed to be
   accurate.

   When a Datagram Too Big message is received, the ICMP layer
   determines a new estimate for the Path MTU (either from a non-zero
   Next-Hop MTU value in the packet, or using the method described in
   section 5).  If a per-host route for this path does not exist, then
   one is created (almost as if a per-host ICMP Redirect is being
   processed; the new route uses the same first-hop router as the
   current route).  If the PMTU estimate associated with the per-host
   route is higher than the new estimate, then the value in the routing
   entry is changed.
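
   The update rule above can be sketched as follows.  This is only an
   illustration: the names HostRoute and on_datagram_too_big are
   hypothetical, and the sketch assumes that a newly created per-host
   route starts with the PMTU of the current route.

```python
from dataclasses import dataclass


@dataclass
class HostRoute:
    first_hop: str  # first-hop router, copied from the current route
    pmtu: int       # current Path MTU estimate, in octets


def on_datagram_too_big(host_routes, dest, current_route, new_estimate):
    """Apply a new PMTU estimate for dest; return True if the estimate decreased."""
    route = host_routes.get(dest)
    if route is None:
        # No per-host route yet: create one that reuses the current first hop.
        # Starting from the current route's PMTU is an assumption of this sketch.
        route = HostRoute(current_route.first_hop, current_route.pmtu)
        host_routes[dest] = route
    if route.pmtu > new_estimate:
        # Only a lower estimate changes the routing entry.
        route.pmtu = new_estimate
        return True
    return False


routes = {}
default = HostRoute(first_hop="10.0.0.1", pmtu=1500)
decreased = on_datagram_too_big(routes, "192.0.2.7", default, 1006)
```

   A True return value is the point at which the packetization layers
   using the path would be notified of the decrease.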

   The packetization layers must be notified about decreases in the
   PMTU.  Any packetization layer instance (for example, a TCP
   connection) that is actively using the path must be notified if the
   PMTU estimate is decreased.

          Note: even if the Datagram Too Big message contains an
          Original Datagram Header that refers to a UDP packet, the TCP
          layer must be notified if any of its connections use the given


Mogul & Deering                                                [page 10]


RFC 1191                   Path MTU Discovery              November 1990




          path.

   Also, the instance that sent the datagram that elicited the Datagram
   Too Big message should be notified that its datagram has been
   dropped, even if the PMTU estimate has not changed, so that it may
   retransmit the dropped datagram.

          Note: The notification mechanism can be analogous to the
          mechanism used to provide notification of an ICMP Source
          Quench message.  In some implementations (such as
          4.2BSD-derived systems), the existing notification mechanism
          is not able to identify the specific connection involved, and
          so an additional mechanism is necessary.

          Alternatively, an implementation can avoid the use of an
          asynchronous notification mechanism for PMTU decreases by
          postponing notification until the next attempt to send a
          datagram larger than the PMTU estimate.  In this approach,
          when an attempt is made to SEND a datagram with the DF bit
          set, and the datagram is larger than the PMTU estimate, the
          SEND function should fail and return a suitable error
          indication.  This approach may be more suitable to a
          connectionless packetization layer (such as one using UDP),
          which (in some implementations) may be hard to "notify" from
          the ICMP layer.  In this case, the normal timeout-based
          retransmission mechanisms would be used to recover from the
          dropped datagrams.
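
   A minimal sketch of the deferred approach, assuming a hypothetical
   send function that is handed the complete IP datagram and the
   current PMTU estimate.  The exception name DatagramTooBig is
   invented for this example; an actual implementation would return
   whatever error indication its SEND interface already provides.

```python
class DatagramTooBig(Exception):
    """Hypothetical error indication returned by the SEND function."""

    def __init__(self, pmtu_estimate):
        super().__init__(f"datagram exceeds PMTU estimate of {pmtu_estimate} octets")
        self.pmtu_estimate = pmtu_estimate


def send(datagram: bytes, df: bool, pmtu_estimate: int) -> int:
    """Pretend to transmit a datagram; return the number of octets accepted."""
    if df and len(datagram) > pmtu_estimate:
        # Fail at send time instead of notifying the sender asynchronously.
        raise DatagramTooBig(pmtu_estimate)
    # Datagrams without DF may be fragmented, so they are not checked here.
    return len(datagram)
```

   The caller learns the current estimate from the error and can
   repacketize before trying again.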

   It is important to understand that the notification of the
   packetization layer instances using the path about the change in the
   PMTU is distinct from the notification of a specific instance that a
   packet has been dropped.  The latter should be done as soon as
   practical (i.e., asynchronously from the point of view of the
   packetization layer instance), while the former may be delayed until
   a packetization layer instance wants to create a packet.
   Retransmission should be done only for those packets that are
   known to be dropped, as indicated by a Datagram Too Big message.


6.3. Purging stale PMTU information

   Internetwork topology is dynamic; routes change over time.  The PMTU
   discovered for a given destination may be wrong if a new route comes
   into use.  Thus, PMTU information cached by a host can become stale.

   Because a host using PMTU Discovery always sets the DF bit, if the
   stale PMTU value is too large, this will be discovered almost
   immediately once a datagram is sent to the given destination.  No
   such mechanism exists for realizing that a stale PMTU value is too
   small, so an implementation should "age" cached values.  When a PMTU
   value has not been decreased for a while (on the order of 10
   minutes), the PMTU estimate should be set to the first-hop data-link
   MTU, and the packetization layers should be notified of the change.
   This will cause the complete PMTU Discovery process to take place
   again.

          Note: an implementation should provide a means for changing
          the timeout duration, including setting it to "infinity".  For
          example, hosts attached to an FDDI network which is then
          attached to the rest of the Internet via a slow serial line
          are never going to discover a new non-local PMTU, so they
          should not have to put up with dropped datagrams every 10
          minutes.

   An upper layer MUST NOT retransmit datagrams in response to an
   increase in the PMTU estimate, since this increase never comes in
   response to an indication of a dropped datagram.

   One approach to implementing PMTU aging is to add a timestamp field
   to the routing table entry.  This field is initialized to a
   "reserved" value, indicating that the PMTU has never been changed.
   Whenever the PMTU is decreased in response to a Datagram Too Big
   message, the timestamp is set to the current time.

   Once a minute, a timer-driven procedure runs through the routing
   table, and for each entry whose timestamp is not "reserved" and is
   older than the timeout interval:

      - The PMTU estimate is set to the MTU of the associated first
        hop.

      - Packetization layers using this route are notified of the
        increase.
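
   The timer-driven procedure above might look like the following
   sketch.  The names RouteEntry and age_routes are hypothetical, None
   stands in for the "reserved" timestamp value, and resetting the
   timestamp to "reserved" after an entry is aged is an assumption of
   this sketch rather than something stated above.  The current time is
   passed in explicitly so that the example is deterministic.

```python
import math
from dataclasses import dataclass
from typing import Optional

TIMEOUT = 600.0  # seconds; "on the order of 10 minutes"


@dataclass
class RouteEntry:
    first_hop_mtu: int                    # MTU of the associated first hop
    pmtu: int                             # current PMTU estimate
    decreased_at: Optional[float] = None  # None plays the role of "reserved"


def age_routes(table, now, timeout=TIMEOUT):
    """Reset aged PMTU estimates; return the destinations that were reset."""
    reset = []
    for dest, entry in table.items():
        if entry.decreased_at is None:
            continue  # PMTU was never decreased for this entry
        if now - entry.decreased_at > timeout:
            entry.pmtu = entry.first_hop_mtu
            entry.decreased_at = None  # assumption: return to "reserved"
            reset.append(dest)         # packetization layers would be notified here
    return reset


table = {
    "192.0.2.7": RouteEntry(first_hop_mtu=1500, pmtu=1006, decreased_at=0.0),
    "192.0.2.8": RouteEntry(first_hop_mtu=1500, pmtu=1500),
}
aged = age_routes(table, now=601.0)
```

   Passing math.inf as the timeout disables aging, which corresponds to
   setting the timeout duration to "infinity".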

   PMTU estimates may disappear from the routing table if the per-host
   routes are removed; this can happen in response to an ICMP Redirect
   message, or because certain routing-table daemons delete old routes
   after several minutes.  Also, on a multi-homed host a topology change
   may result in the use of a different source interface.  When this
   happens, if the packetization layer is not notified then it may
   continue to use a cached PMTU value that is now too small.  One
   solution is to notify the packetization layer of a possible PMTU
   change whenever a Redirect message causes a route change, and
   whenever a route is simply deleted from the routing table.


          Note: a more sophisticated method for detecting PMTU increases
          is described in section 7.1.


6.4. TCP layer actions

   The TCP layer must track the PMTU for the destination of a
   connection; it should not send datagrams that would be larger than
   this.  A simple implementation could ask the IP layer for this value
   (using the GET_MAXSIZES interface described in [1]) each time it
   created a new segment, but this could be inefficient.  Moreover, TCP
   implementations that follow the "slow-start" congestion-avoidance
   algorithm [4] typically calculate and cache several other values
   derived from the PMTU.  It may be simpler to receive asynchronous
   notification when the PMTU changes, so that these variables may be
   updated.

   A TCP implementation must also store the MSS value received from its
   peer (which defaults to 536), and not send any segment larger than
   this MSS, regardless of the PMTU.  In 4.xBSD-derived implementations,
   this requires adding an additional field to the TCP state record.
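
   As a small illustration, the largest segment a sender may use is
   bounded by both values.  The function name is hypothetical, and the
   40-octet allowance for IP and TCP headers without options is taken
   from the segment size (PMTU - 40) used elsewhere in this document.

```python
DEFAULT_MSS = 536  # used when the peer sent no MSS option


def segment_size_limit(pmtu, peer_mss=None):
    """Return the largest TCP segment size allowed by the peer's MSS and the PMTU."""
    mss = DEFAULT_MSS if peer_mss is None else peer_mss
    # 40 octets: IP header plus TCP header, assuming no options.
    return min(mss, pmtu - 40)


limit_no_option = segment_size_limit(1500)         # 536
limit_small_path = segment_size_limit(1006, 1460)  # 966
limit_large_path = segment_size_limit(1500, 1460)  # 1460
```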

   Finally, when a Datagram Too Big message is received, it implies that
   a datagram was dropped by the router that sent the ICMP message.  It
   is sufficient to treat this as any other dropped segment, and wait
   until the retransmission timer expires to cause retransmission of the
   segment.  If the PMTU Discovery process requires several steps to
   estimate the right PMTU, this could delay the connection by many
   round-trip times.

   Alternatively, the retransmission could be done in immediate response
   to a notification that the Path MTU has changed, but only for the
   specific connection specified by the Datagram Too Big message.  The
   datagram size used in the retransmission should, of course, be no
   larger than the new PMTU.

          Note: One MUST NOT retransmit in response to every Datagram
          Too Big message, since a burst of several oversized segments
          will give rise to several such messages and hence several
          retransmissions of the same data.  If the new estimated PMTU
          is still wrong, the process repeats, and there is an
          exponential growth in the number of superfluous segments sent!

          This means that the TCP layer must be able to recognize when a
          Datagram Too Big notification actually decreases the PMTU that
          it has already used to send a datagram on the given
          connection, and should ignore any other notifications.
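
   One possible way to implement this filter is sketched below.  The
   Connection record and its fields are hypothetical, and the sketch
   assumes that the connection remembers the PMTU it last used to size
   a datagram.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    pmtu_in_use: int      # PMTU last used to size a datagram on this connection
    retransmits: int = 0  # immediate retransmissions triggered so far


def on_too_big_notification(conn, new_pmtu):
    """Return True if the notification should trigger a retransmission."""
    if new_pmtu >= conn.pmtu_in_use:
        # Does not decrease the PMTU already in use: ignore it.
        return False
    conn.pmtu_in_use = new_pmtu
    conn.retransmits += 1
    return True


conn = Connection(pmtu_in_use=1500)
# A burst of three oversized segments elicits three identical messages.
results = [on_too_big_notification(conn, 1006) for _ in range(3)]
```

   Only the first message in the burst causes a retransmission; the
   other two are ignored because the connection has already adopted the
   lower estimate.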


   Modern TCP implementations incorporate "congestion avoidance" and
   "slow-start" algorithms to improve performance [4].  Unlike a
   retransmission caused by a TCP retransmission timeout, a
   retransmission caused by a Datagram Too Big message should not change
   the congestion window.  It should, however, trigger the slow-start
   mechanism (i.e., only one segment should be retransmitted until
   acknowledgements begin to arrive again).

   TCP performance can be reduced if the sender's maximum window size is
   not an exact multiple of the segment size in use (this is not the
   congestion window size, which is always a multiple of the segment
   size).  In many systems (such as those derived from 4.2BSD), the
   segment size is often set to 1024 octets, and the maximum window size
   (the "send space") is usually a multiple of 1024 octets, so the
   proper relationship holds by default.  If PMTU Discovery is used,
   however, the segment size may not be a submultiple of the send space,
   and it may change during a connection; this means that the TCP layer
   may need to change the transmission window size when PMTU Discovery
   changes the PMTU value.  The maximum window size should be set to the
   greatest multiple of the segment size (PMTU - 40) that is less than
   or equal to the sender's buffer space size.
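
   The window calculation can be worked through with a short sketch.
   The function name max_window and the 4096-octet buffer are chosen
   only for illustration.

```python
def max_window(pmtu, buffer_space, header_overhead=40):
    """Greatest multiple of the segment size that fits in the send buffer."""
    segment = pmtu - header_overhead
    if segment <= 0:
        raise ValueError("PMTU must exceed the header overhead")
    # Integer division gives the number of whole segments that fit.
    return (buffer_space // segment) * segment


window_1500 = max_window(1500, 4096)  # 2 segments of 1460 octets = 2920
window_1006 = max_window(1006, 4096)  # 4 segments of 966 octets = 3864
```

   With a 4096-octet buffer, a drop in the PMTU from 1500 to 1006
   changes the window from 2920 to 3864 octets, which is why the window
   may need to be recomputed whenever the PMTU changes.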

   PMTU Discovery does not affect the value sent in the TCP MSS option,
   because that value is used by the other end of the connection, which
   may be using an unrelated PMTU value.


6.5. Issues for other transport protocols

   Some transport protocols (such as ISO TP4 [3]) are not allowed to
   repacketize when doing a retransmission.  That is, once an attempt is
   made to transmit a datagram of a certain size, its contents cannot be
   split into smaller datagrams for retransmission.  In such a case, the
   original datagram should be retransmitted without the DF bit set,
   allowing it to be fragmented as necessary to reach its destination.
   Subsequent datagrams, when transmitted for the first time, should be
   no larger than allowed by the Path MTU, and should have the DF bit
   set.
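
   For a transport that cannot repacketize, the choice of DF bit might
   be sketched as follows.  The function prepare_datagram is
   hypothetical and returns the datagram size together with the DF
   setting to use.

```python
def prepare_datagram(size, pmtu, is_retransmission):
    """Return (size, df) for a transport that cannot repacketize."""
    if is_retransmission:
        # Same contents and size as the original; clear DF so that
        # routers may fragment the datagram as necessary.
        return size, False
    if size > pmtu:
        raise ValueError("first transmissions must fit within the Path MTU")
    # First transmission: keep DF set so PMTU Discovery continues to work.
    return size, True


retransmit = prepare_datagram(1500, 1006, is_retransmission=True)
first_send = prepare_datagram(1006, 1006, is_retransmission=False)
```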

   The Sun Network File System (NFS) uses a Remote Procedure Call (RPC)
   protocol [11] that, in many cases, sends datagrams that must be
   fragmented even for the first-hop link.  This might improve
   performance in certain cases, but it is known to cause reliability
   and performance problems, especially when the client and server are
   separated by routers.

   We recommend that NFS implementations use PMTU Discovery whenever
   routers are involved.  Most NFS implementations allow the RPC
   datagram size to be changed at mount-time (indirectly, by changing
   the effective file system block size), but might require some
   modification to support changes later on.

   Also, since a single NFS operation cannot be split across several UDP
   datagrams, certain operations (primarily, those operating on file
   names and directories) require a minimum datagram size that may be