📄 rfc3267.txt
字号:
to be useful for both conversational and streaming services. +----------+ +----------+ | | IP/UDP/RTP/AMR or | | | TERMINAL |<----------------------->| TERMINAL | | | IP/UDP/RTP/AMR-WB | | +----------+ +----------+ Figure 2: IP terminal to IP terminal scenario A conversational service puts requirements on the payload format. Low delay is one very important factor, i.e., few speech frame-blocks per payload packet. Low overhead is also required when the payload format traverses low bandwidth links, especially as the frequency of packets will be high. For low bandwidth links it also an advantage to support UED which allows a link provider to reduce delay and packet loss or to reduce the utilization of link resources. Streaming service has less strict real-time requirements and therefore can use a larger number of frame-blocks per packet than conversational service. This reduces the overhead from IP, UDP, and RTP headers. However, including several frame-blocks per packet makes the transmission more vulnerable to packet loss, so interleaving may be used to reduce the effect packet loss will have on speech quality. A streaming server handling a large number of clients also needs a payload format that requires as few resources as possible when doing packetization. The octet-aligned and interleaving modes require the least amount of resources, while CRC, robust sorting, and bandwidth efficient modes have higher demands. Another scenario occurs when AMR or AMR-WB encoded speech will be transmitted from a non-IP system (e.g., a GSM or 3GPP network) to an IP/UDP/RTP VoIP terminal, and/or vice versa, as depicted in Figure 3.Sjoberg, et. al. Standards Track [Page 12]RFC 3267 RTP Payload Format for AMR and AMR-WB June 2002 AMR or AMR-WB over I.366.{2,3} or +------+ +----------+ 3G Iu or | | IP/UDP/RTP/AMR or | | <------------->| GW |<---------------------->| TERMINAL | GSM Abis | | IP/UDP/RTP/AMR-WB | | etc. +------+ +----------+ | GSM/3GPP network | IP network | Figure 3: GW to VoIP terminal scenario In such a case, it is likely that the AMR or AMR-WB frame is packetized in a different way in the non-IP network and will need to be re-packetized into RTP at the gateway. Also, speech frames from the non-IP network may come with some UEP/UED information (e.g., a frame quality indicator) that will need to be preserved and forwarded on to the decoder along with the speech bits. This is specified in Section 4.3.2. AMR's capability to do fast mode switching is exploited in some non- IP networks to optimize speech quality. To preserve this functionality in scenarios including a gateway to an IP network, a codec mode request (CMR) field is needed. The gateway will be responsible for forwarding the CMR between the non-IP and IP parts in both directions. The IP terminal should follow the CMR forwarded by the gateway to optimize speech quality going to the non-IP decoder. The mode control algorithm in the gateway must accommodate the delay imposed by the IP network on the response to CMR by the IP terminal. The IP terminal should not set the CMR (see Section 4.3.1), but the gateway can set the CMR value on frames going toward the encoder in the non-IP part to optimize speech quality from that encoder to the gateway. The gateway can alternatively set a lower CMR value, if desired, as one means to control congestion on the IP network.Sjoberg, et. al. Standards Track [Page 13]RFC 3267 RTP Payload Format for AMR and AMR-WB June 2002 A third likely scenario is that IP/UDP/RTP is used as transport between two non-IP systems, i.e., IP is originated and terminated in gateways on both sides of the IP transport, as illustrated in Figure 4 below. AMR or AMR-WB AMR or AMR-WB over over I.366.{2,3} or +------+ +------+ I.366.{2,3} or 3G Iu or | | IP/UDP/RTP/AMR or | | 3G Iu or <------------->| GW |<------------------->| GW |<-------------> GSM Abis | | IP/UDP/RTP/AMR-WB | | GSM Abis etc. +------+ +------+ etc. | | GSM/3GPP network | IP network | GSM/3GPP network | | Figure 4: GW to GW scenario This scenario requires the same mechanisms for preserving UED/UEP and CMR information as in the single gateway scenario. In addition, the CMR value may be set in packets received by the gateways on the IP network side. The gateway should forward to the non-IP side a CMR value that is the minimum of three values: - the CMR value it receives on the IP side; - the CMR value it calculates based on its reception quality on the non-IP side; and - a CMR value it may choose for congestion control of transmission on the IP side. The details of the control algorithm are left to the implementation.4. AMR and AMR-WB RTP Payload Formats The AMR and AMR-WB payload formats have identical structure, so they are specified together. The only differences are in the types of codec frames contained in the payload. The payload format consists of the RTP header, payload header and payload data.4.1. RTP Header Usage The format of the RTP header is specified in [8]. This payload format uses the fields of the header in a manner consistent with that specification.Sjoberg, et. al. Standards Track [Page 14]RFC 3267 RTP Payload Format for AMR and AMR-WB June 2002 The RTP timestamp corresponds to the sampling instant of the first sample encoded for the first frame-block in the packet. The timestamp clock frequency is the same as the sampling frequency, so the timestamp unit is in samples. The duration of one speech frame-block is 20 ms for both AMR and AMR-WB. For AMR, the sampling frequency is 8 kHz, corresponding to 160 encoded speech samples per frame from each channel. For AMR-WB, the sampling frequency is 16 kHz, corresponding to 320 samples per frame from each channel. Thus, the timestamp is increased by 160 for AMR and 320 for AMR-WB for each consecutive frame-block. A packet may contain multiple frame-blocks of encoded speech or comfort noise parameters. If interleaving is employed, the frame- blocks encapsulated into a payload are picked according to the interleaving rules as defined in Section 4.4.1. Otherwise, each packet covers a period of one or more contiguous 20 ms frame-block intervals. In case the data from all the channels for a particular frame-block in the period is missing, for example at a gateway from some other transport format, it is possible to indicate that no data is present for that frame-block rather than breaking a multi-frame- block packet into two, as explained in Section 4.3.2. To allow for error resiliency through redundant transmission, the periods covered by multiple packets MAY overlap in time. A receiver MUST be prepared to receive any speech frame multiple times, either in exact duplicates, or in different AMR rate modes, or with data present in one packet and not present in another. If multiple versions of the same speech frame are received, it is RECOMMENDED that the mode with the highest rate be used by the speech decoder. A given frame MUST NOT be encoded as speech in one packet and comfort noise parameters in another. The payload is always made an integral number of octets long by padding with zero bits if necessary. If additional padding is required to bring the payload length to a larger multiple of octets or for some other purpose, then the P bit in the RTP in the header may be set and padding appended as specified in [8]. The RTP header marker bit (M) SHALL be set to 1 if the first frame- block carried in the packet contains a speech frame which is the first in a talkspurt. For all other packets the marker bit SHALL be set to zero (M=0).Sjoberg, et. al. Standards Track [Page 15]RFC 3267 RTP Payload Format for AMR and AMR-WB June 2002 The assignment of an RTP payload type for this new packet format is outside the scope of this document, and will not be specified here. It is expected that the RTP profile under which this payload format is being used will assign a payload type for this encoding or specify that the payload type is to be bound dynamically.4.2. Payload Structure The complete payload consists of a payload header, a payload table of contents, and speech data representing one or more speech frame- blocks. The following diagram shows the general payload format layout: +----------------+-------------------+---------------- | payload header | table of contents | speech data ... +----------------+-------------------+---------------- Payloads containing more than one speech frame-block are called compound payloads. The following sections describe the variations taken by the payload format depending on whether the AMR session is set up to use the bandwidth-efficient mode or octet-aligned mode and any of the OPTIONAL functions for robust sorting, interleaving, and frame CRCs. Implementations SHOULD support both bandwidth-efficient and octet- aligned operation to increase interoperability.4.3. Bandwidth-Efficient Mode4.3.1. The Payload Header In bandwidth-efficient mode, the payload header simply consists of a 4 bit codec mode request: 0 1 2 3 +-+-+-+-+ | CMR | +-+-+-+-+ CMR (4 bits): Indicates a codec mode request sent to the speech encoder at the site of the receiver of this payload. The value of the CMR field is set to the frame type index of the corresponding speech mode being requested. The frame type index may be 0-7 for AMR, as defined in Table 1a in [2], or 0-8 for AMR-WB, as defined in Table 1a in [4]. CMR value 15 indicates that no mode request is present, and other values are for future use.Sjoberg, et. al. Standards Track [Page 16]RFC 3267 RTP Payload Format for AMR and AMR-WB June 2002 The mode request received in the CMR field is valid until the next CMR is received, i.e., a newly received CMR value overrides the previous one. Therefore, if a terminal continuously wishes to receive frames in the same mode X, it needs to set CMR=X for all its outbound payloads, and if a terminal has no preference in which mode to receive, it SHOULD set CMR=15 in all its outbound payloads. If receiving a payload with a CMR value which is not a speech mode or NO_DATA, the CMR MUST be ignored by the receiver. In a multi-channel session, CMR SHOULD be interpreted by the receiver of the payload as the desired encoding mode for all the channels in the session. An IP end-point SHOULD NOT set the CMR based on packet losses or other congestion indications, for several reasons: - The other end of the IP path may be a gateway to a non-IP network (such as a radio link) that needs to set the CMR field to optimize performance on that network. - Congestion on the IP network is managed by the IP sender, in this case at the other end of the IP path. Feedback about congestion SHOULD be provided to that IP sender through RTCP or other means, and then the sender can choose to avoid congestion using the most appropriate mechanism. That may include adjusting the codec mode, but also includes adjusting the level of redundancy or number of frames per packet. The encoder SHOULD follow a received mode request, but MAY change to a lower-numbered mode if it so chooses, for example to control congestion. The CMR field MUST be set to 15 for packets sent to a multicast group. The encoder in the speech sender SHOULD ignore mode requests when sending speech to a multicast session but MAY use RTCP feedback information as a hint that a mode change is needed.
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -