Sjoberg, et. al.            Standards Track                    [Page 11]

RFC 3267        RTP Payload Format for AMR and AMR-WB          June 2002


3.9. AMR or AMR-WB Speech over IP scenarios

   The primary scenario for this payload format is IP end-to-end between
   two terminals, as shown in Figure 2.  This payload format is expected
   to be useful for both conversational and streaming services.

                +----------+                         +----------+
                |          |    IP/UDP/RTP/AMR or    |          |
                | TERMINAL |<----------------------->| TERMINAL |
                |          |    IP/UDP/RTP/AMR-WB    |          |
                +----------+                         +----------+

                   Figure 2: IP terminal to IP terminal scenario

   A conversational service puts requirements on the payload format.
   Low delay is one very important factor, which means few speech
   frame-blocks per payload packet.  Low overhead is also required when
   the payload format traverses low bandwidth links, especially as the
   frequency of packets will be high.  For low bandwidth links it is
   also an advantage to support UED, which allows a link provider to
   reduce delay and packet loss or to reduce the utilization of link
   resources.

   A streaming service has less strict real-time requirements and
   therefore can use a larger number of frame-blocks per packet than a
   conversational service.  This reduces the overhead from IP, UDP, and
   RTP headers.  However, including several frame-blocks per packet
   makes the transmission more vulnerable to packet loss, so
   interleaving may be used to reduce the effect packet loss will have
   on speech quality.  A streaming server handling a large number of
   clients also needs a payload format that requires as few resources as
   possible when doing packetization.  The octet-aligned and
   interleaving modes require the least amount of resources, while CRC,
   robust sorting, and bandwidth efficient modes have higher demands.
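
   The overhead trade-off described above can be illustrated with a
   small calculation.  The sketch below is not part of the payload
   format; it assumes an IPv4 header of 20 octets without options, a
   UDP header of 8 octets, and an RTP header of 12 octets without CSRC
   entries.  The function name and the default of 32 payload octets
   per frame-block are hypothetical values chosen for illustration.

```python
IPV4_HEADER = 20  # octets, assuming no IP options
UDP_HEADER = 8    # octets
RTP_HEADER = 12   # octets, assuming no CSRC entries or header extension


def header_overhead(frame_blocks_per_packet, octets_per_frame_block=32):
    """Return the fraction of a packet occupied by IP/UDP/RTP headers.

    octets_per_frame_block is a hypothetical figure that is taken to
    include any payload header and table of contents octets.
    """
    if frame_blocks_per_packet < 1:
        raise ValueError("a packet carries at least one frame-block")
    headers = IPV4_HEADER + UDP_HEADER + RTP_HEADER
    payload = frame_blocks_per_packet * octets_per_frame_block
    return headers / (headers + payload)
```

   Under these assumptions, one frame-block per packet spends more
   than half of each packet on headers, while five frame-blocks per
   packet bring the share down to one fifth, at the cost of 80 ms of
   additional packetization delay.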

   Another scenario occurs when AMR or AMR-WB encoded speech will be
   transmitted from a non-IP system (e.g., a GSM or 3GPP network) to an
   IP/UDP/RTP VoIP terminal, and/or vice versa, as depicted in Figure 3.

          AMR or AMR-WB
          over
          I.366.{2,3} or +------+                        +----------+
          3G Iu or       |      |   IP/UDP/RTP/AMR or    |          |
          <------------->|  GW  |<---------------------->| TERMINAL |
          GSM Abis       |      |   IP/UDP/RTP/AMR-WB    |          |
          etc.           +------+                        +----------+
                             |
           GSM/3GPP network  |           IP network
                             |

                     Figure 3: GW to VoIP terminal scenario

   In such a case, it is likely that the AMR or AMR-WB frame is
   packetized in a different way in the non-IP network and will need to
   be re-packetized into RTP at the gateway.  Also, speech frames from
   the non-IP network may come with some UEP/UED information (e.g., a
   frame quality indicator) that will need to be preserved and forwarded
   on to the decoder along with the speech bits.  This is specified in
   Section 4.3.2.

   AMR's capability to do fast mode switching is exploited in some non-
   IP networks to optimize speech quality.  To preserve this
   functionality in scenarios including a gateway to an IP network, a
   codec mode request (CMR) field is needed.  The gateway will be
   responsible for forwarding the CMR between the non-IP and IP parts in
   both directions.  The IP terminal should follow the CMR forwarded by
   the gateway to optimize speech quality going to the non-IP decoder.
   The mode control algorithm in the gateway must accommodate the delay
   imposed by the IP network on the response to CMR by the IP terminal.

   The IP terminal should not set the CMR (see Section 4.3.1), but the
   gateway can set the CMR value on frames going toward the encoder in
   the non-IP part to optimize speech quality from that encoder to the
   gateway.  The gateway can alternatively set a lower CMR value, if
   desired, as one means to control congestion on the IP network.

   A third likely scenario is that IP/UDP/RTP is used as transport
   between two non-IP systems, i.e., IP is originated and terminated in
   gateways on both sides of the IP transport, as illustrated in Figure
   4 below.

   AMR or AMR-WB                                        AMR or AMR-WB
   over                                                 over
   I.366.{2,3} or +------+                     +------+ I.366.{2,3} or
   3G Iu or       |      |  IP/UDP/RTP/AMR or  |      | 3G Iu or
   <------------->|  GW  |<------------------->|  GW  |<------------->
   GSM Abis       |      |  IP/UDP/RTP/AMR-WB  |      | GSM Abis
   etc.           +------+                     +------+ etc.
                      |                           |
    GSM/3GPP network  |          IP network       |  GSM/3GPP network
                      |                           |

                        Figure 4: GW to GW scenario

   This scenario requires the same mechanisms for preserving UED/UEP and
   CMR information as in the single gateway scenario.  In addition, the
   CMR value may be set in packets received by the gateways on the IP
   network side.  The gateway should forward to the non-IP side a CMR
   value that is the minimum of three values:

      -  the CMR value it receives on the IP side;

      -  the CMR value it calculates based on its reception quality on
         the non-IP side; and

      -  a CMR value it may choose for congestion control of
         transmission on the IP side.

   The details of the control algorithm are left to the implementation.
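
   Since the control algorithm is left to the implementation, the
   following is only a minimal sketch of the "minimum of three values"
   rule.  The function and parameter names are hypothetical.  It also
   assumes that a candidate equal to 15 (no mode request, see Section
   4.3.1) places no limit on the mode and is therefore skipped; this
   handling is an assumption of the sketch, not a rule stated here.

```python
NO_REQUEST = 15  # CMR value meaning "no mode request is present"


def forwarded_cmr(ip_side_cmr, radio_quality_cmr, congestion_cmr=NO_REQUEST):
    """Pick the CMR a gateway forwards to the non-IP side.

    Assumption: a candidate equal to NO_REQUEST imposes no limit and is
    skipped; if every candidate is NO_REQUEST, NO_REQUEST is forwarded.
    """
    candidates = [
        cmr
        for cmr in (ip_side_cmr, radio_quality_cmr, congestion_cmr)
        if cmr != NO_REQUEST
    ]
    return min(candidates) if candidates else NO_REQUEST
```

   For example, if the IP side requests mode 7, the gateway's own
   reception quality suggests mode 4, and no congestion limit applies,
   the sketch forwards mode 4.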

4. AMR and AMR-WB RTP Payload Formats

   The AMR and AMR-WB payload formats have identical structure, so they
   are specified together.  The only differences are in the types of
   codec frames contained in the payload.  The payload format consists
   of the RTP header, payload header and payload data.

4.1. RTP Header Usage

   The format of the RTP header is specified in [8].  This payload
   format uses the fields of the header in a manner consistent with that
   specification.

   The RTP timestamp corresponds to the sampling instant of the first
   sample encoded for the first frame-block in the packet.  The
   timestamp clock frequency is the same as the sampling frequency, so
   the timestamp unit is in samples.

   The duration of one speech frame-block is 20 ms for both AMR and
   AMR-WB.  For AMR, the sampling frequency is 8 kHz, corresponding to
   160 encoded speech samples per frame from each channel.  For AMR-WB,
   the sampling frequency is 16 kHz, corresponding to 320 samples per
   frame from each channel.  Thus, the timestamp is increased by 160 for
   AMR and 320 for AMR-WB for each consecutive frame-block.
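
   The timestamp arithmetic above can be sketched as follows.  The
   function name is hypothetical, and the sketch assumes contiguous
   frame-blocks without interleaving.  The modulo reflects the 32-bit
   width of the RTP timestamp field.

```python
SAMPLES_PER_FRAME_BLOCK = {"AMR": 160, "AMR-WB": 320}  # 20 ms at 8 / 16 kHz


def frame_block_timestamp(first_timestamp, index, codec="AMR"):
    """RTP timestamp of the frame-block `index` blocks after the first."""
    step = SAMPLES_PER_FRAME_BLOCK[codec]
    # The RTP timestamp is a 32-bit field, so it wraps around at 2**32.
    return (first_timestamp + index * step) % 2**32
```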

   A packet may contain multiple frame-blocks of encoded speech or
   comfort noise parameters.  If interleaving is employed, the frame-
   blocks encapsulated into a payload are picked according to the
   interleaving rules as defined in Section 4.4.1.  Otherwise, each
   packet covers a period of one or more contiguous 20 ms frame-block
   intervals.  In case the data from all the channels for a particular
   frame-block in the period is missing, for example at a gateway from
   some other transport format, it is possible to indicate that no data
   is present for that frame-block rather than breaking a multi-frame-
   block packet into two, as explained in Section 4.3.2.

   To allow for error resiliency through redundant transmission, the
   periods covered by multiple packets MAY overlap in time.  A receiver
   MUST be prepared to receive any speech frame multiple times, either
   in exact duplicates, or in different AMR rate modes, or with data
   present in one packet and not present in another.  If multiple
   versions of the same speech frame are received, it is RECOMMENDED
   that the mode with the highest rate be used by the speech decoder.  A
   given frame MUST NOT be encoded as speech in one packet and comfort
   noise parameters in another.
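
   One possible way for a receiver to apply the recommendation above
   is sketched below.  The data representation and the function name
   are hypothetical.  The sketch assumes that a higher speech mode
   index corresponds to a higher rate, which should be checked against
   the frame type tables in [2] and [4], and it does not deal with
   comfort noise frames.

```python
def pick_best_copy(copies):
    """Choose one version of a speech frame from redundant copies.

    `copies` is a list of (mode_index, speech_bits) tuples; mode_index
    is None when a packet signalled that no data is present for the
    frame.  Returns the copy with the highest mode index, or None if
    no copy carries data.
    """
    with_data = [copy for copy in copies if copy[0] is not None]
    if not with_data:
        return None
    # Assumption: a higher mode index means a higher rate.
    return max(with_data, key=lambda copy: copy[0])
```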

   The payload is always made an integral number of octets long by
   padding with zero bits if necessary.  If additional padding is
   required to bring the payload length to a larger multiple of octets
   or for some other purpose, then the P bit in the RTP header may be
   set and padding appended as specified in [8].

   The RTP header marker bit (M) SHALL be set to 1 if the first frame-
   block carried in the packet contains a speech frame which is the
   first in a talkspurt.  For all other packets the marker bit SHALL be
   set to zero (M=0).

   The assignment of an RTP payload type for this new packet format is
   outside the scope of this document, and will not be specified here.
   It is expected that the RTP profile under which this payload format
   is being used will assign a payload type for this encoding or specify
   that the payload type is to be bound dynamically.

4.2. Payload Structure

   The complete payload consists of a payload header, a payload table of
   contents, and speech data representing one or more speech frame-
   blocks.  The following diagram shows the general payload format
   layout:

   +----------------+-------------------+----------------
   | payload header | table of contents | speech data ...
   +----------------+-------------------+----------------

   Payloads containing more than one speech frame-block are called
   compound payloads.

   The following sections describe the variations taken by the payload
   format depending on whether the AMR session is set up to use the
   bandwidth-efficient mode or octet-aligned mode and any of the
   OPTIONAL functions for robust sorting, interleaving, and frame CRCs.
   Implementations SHOULD support both bandwidth-efficient and octet-
   aligned operation to increase interoperability.

4.3. Bandwidth-Efficient Mode

4.3.1. The Payload Header

   In bandwidth-efficient mode, the payload header simply consists of a
   4 bit codec mode request:

    0 1 2 3
   +-+-+-+-+
   |  CMR  |
   +-+-+-+-+

   CMR (4 bits): Indicates a codec mode request sent to the speech
      encoder at the site of the receiver of this payload.  The value of
      the CMR field is set to the frame type index of the corresponding
      speech mode being requested.  The frame type index may be 0-7 for
      AMR, as defined in Table 1a in [2], or 0-8 for AMR-WB, as defined
      in Table 1a in [4].  CMR value 15 indicates that no mode request
      is present, and other values are for future use.
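
   As an illustration of the header layout above, the sketch below
   reads and writes the CMR in the first payload octet.  The helper
   names are hypothetical.  It assumes the usual RFC diagram
   convention that bit 0 is the most significant bit, so the CMR
   occupies the upper four bits of the first octet and the lower four
   bits belong to whatever follows the payload header.

```python
NO_REQUEST = 15  # CMR value meaning "no mode request is present"


def read_cmr(payload):
    """Return the 4-bit CMR from a bandwidth-efficient mode payload."""
    if not payload:
        raise ValueError("empty payload")
    # Bit 0 of the diagram is the most significant bit of the octet.
    return payload[0] >> 4


def write_cmr(cmr, rest_of_first_octet=0):
    """Build the first payload octet from a CMR and the 4 bits after it."""
    if not 0 <= cmr <= 15 or not 0 <= rest_of_first_octet <= 15:
        raise ValueError("both fields are 4 bits wide")
    return bytes([(cmr << 4) | rest_of_first_octet])
```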

   The mode request received in the CMR field is valid until the next
   CMR is received, i.e., a newly received CMR value overrides the
   previous one.  Therefore, if a terminal continuously wishes to
   receive frames in the same mode X, it needs to set CMR=X for all its
   outbound payloads, and if a terminal has no preference in which mode
   to receive, it SHOULD set CMR=15 in all its outbound payloads.

   If receiving a payload with a CMR value which is not a speech mode or
   NO_DATA, the CMR MUST be ignored by the receiver.
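
   The two rules above (a new CMR overrides the previous one, and an
   invalid CMR is ignored) can be sketched as a small piece of
   receiver state.  The class and attribute names are hypothetical,
   and the sketch assumes that "NO_DATA" in the rule above refers to
   CMR value 15.

```python
NO_REQUEST = 15  # CMR value meaning "no mode request is present"
HIGHEST_SPEECH_MODE = {"AMR": 7, "AMR-WB": 8}  # frame type index ranges


class CmrTracker:
    """Remember the most recent valid mode request from the far end."""

    def __init__(self, codec="AMR"):
        self.highest = HIGHEST_SPEECH_MODE[codec]
        self.current = NO_REQUEST  # no preference until a CMR arrives

    def update(self, cmr):
        """Apply a received CMR and return the request now in force."""
        if 0 <= cmr <= self.highest or cmr == NO_REQUEST:
            self.current = cmr  # a newly received value overrides
        # Any other value is ignored, so the previous request stays.
        return self.current
```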

   In a multi-channel session, CMR SHOULD be interpreted by the receiver
   of the payload as the desired encoding mode for all the channels in
   the session.

   An IP end-point SHOULD NOT set the CMR based on packet losses or
   other congestion indications, for several reasons:

      -  The other end of the IP path may be a gateway to a non-IP
         network (such as a radio link) that needs to set the CMR field
         to optimize performance on that network.

      -  Congestion on the IP network is managed by the IP sender, in