    |     event     |E|R| volume    |          duration             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 1: Payload Format for Named Events

   events: The events are encoded as shown in Sections 3.10 through
        3.14.

   volume: For DTMF digits and other events representable as tones,
        this field describes the power level of the tone, expressed in
        dBm0 after dropping the sign. Power levels range from 0 to -63
        dBm0. The range of valid DTMF is from 0 to -36 dBm0 (must
        accept); lower than -55 dBm0 must be rejected (TR-TSY-000181,
        ITU-T Q.24A). Thus, larger values denote lower volume. This
        value is defined only for DTMF digits. For other events, it is
        set to zero by the sender and is ignored by the receiver.

Schulzrinne & Petrack       Standards Track                     [Page 6]

RFC 2833                         Tones                          May 2000

   duration: Duration of this digit, in timestamp units. Thus, the
        event began at the instant identified by the RTP timestamp and
        has so far lasted as long as indicated by this parameter. The
        event may or may not have ended.

        For a sampling rate of 8000 Hz, this field is sufficient to
        express event durations of up to approximately 8 seconds.

   E: If set to a value of one, the "end" bit indicates that this
        packet contains the end of the event. Thus, the duration
        parameter above measures the complete duration of the event.

        A sender MAY delay setting the end bit until retransmitting the
        last packet for a tone, rather than on its first transmission.
        This avoids having to wait to detect whether the tone has
        indeed ended.

        Receiver implementations MAY use different algorithms to create
        tones, including the two described here. In the first, the
        receiver simply places a tone of the given duration in the
        audio playout buffer at the location indicated by the
        timestamp. As additional packets are received that extend the
        same tone, the waveform in the playout buffer is extended
        accordingly. (Care has to be taken if audio is mixed, i.e.,
        summed, in the playout buffer rather than simply copied.)
        Thus, if a packet in a tone lasting longer than the packet
        interarrival time gets lost and the playout delay is short, a
        gap in the tone may occur. Alternatively, the receiver can
        start a tone and play it until it receives a packet with the
        "E" bit set, it receives the next tone (distinguished by a
        different timestamp value), or a given time period elapses.
        This is more robust against packet loss, but may extend the
        tone if all retransmissions of the last packet in an event are
        lost. Limiting the time period of extending the tone is
        necessary to keep a tone from getting "stuck". Regardless of
        the algorithm used, the tone SHOULD NOT be extended by more
        than three packet interarrival times. A slight extension of
        tone durations and shortening of pauses is generally harmless.

   R: This field is reserved for future use. The sender MUST set it to
        zero, the receiver MUST ignore it.

3.6 Sending Event Packets

   An audio source SHOULD start transmitting event packets as soon as it
   recognizes an event, and thereafter every 50 ms or at the packet
   interval of the audio codec used for this session, if known. (The
   sender does not need to maintain precise time intervals between event
   packets in order to maintain precise inter-event times, since the
   timing information is contained in the timestamp.)

   Q.24 [5], Table A-1, indicates that all administrations surveyed use
   a minimum signal duration of 40 ms, with signaling velocity (tone and
   pause) of no less than 93 ms.

   If an event continues for more than one period, the source generating
   the events should send a new event packet with the RTP timestamp
   value corresponding to the beginning of the event and the duration of
   the event increased correspondingly. (The RTP sequence number is
   incremented by one for each packet.) If there has been no new event
   in the last interval, the event SHOULD be retransmitted three times
   or until the next event is recognized.
   This ensures that the duration of the event can be recognized
   correctly even if the last packet for an event is lost.

   DTMF digits and events are sent incrementally to avoid having the
   receiver wait for the completion of the event. Since some tones are
   two seconds long, this would incur a substantial delay. The
   transmitter does not know if event length is important and thus needs
   to transmit immediately and incrementally. If the receiver
   application does not care about event length, the incremental
   transmission mechanism avoids delay. Some applications, such as
   gateways into the PSTN, care about both delays and event duration.

3.7 Reliability

   During an event, the RTP event payload format provides incremental
   updates on the event. The error resiliency depends on the playout
   delay at the receiver. For example, for a playout delay of 120 ms and
   a packet gap of 50 ms, two packets in a row can get lost without
   causing a gap in the tones generated at the receiver.

   The audio redundancy mechanism described in RFC 2198 [6] MAY be used
   to recover from packet loss across events. The effective data rate is
   r times 64 bits (32 bits for the redundancy header and 32 bits for
   the telephone-event payload) every 50 ms or r times 1280 bits/second,
   where r is the number of redundant events carried in each packet. The
   value of r is an implementation trade-off, with a value of 5
   suggested.

   The timestamp offset in this redundancy scheme has 14 bits, so that
   it allows a single packet to "cover" 2.048 seconds of telephone
   events at a sampling rate of 8000 Hz. Including the starting time of
   previous events allows precise reconstruction of the tone sequence at
   a gateway. The scheme is resilient to consecutive packet losses
   spanning this interval of 2.048 seconds or r digits, whichever is
   less. Note that for previous digits, only an average loudness can be
   represented.
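
   The arithmetic behind the redundancy figures quoted above can be
   checked with a short Python sketch. It is illustrative only: the
   function and constant names are hypothetical, and the 50 ms packet
   interval and 8000 Hz clock are the values assumed in the text rather
   than requirements.

```python
CLOCK_HZ = 8000            # sampling rate assumed in the text
PACKET_INTERVAL_MS = 50    # one event packet every 50 ms, as in the text
BITS_PER_EVENT = 32 + 32   # redundancy header + telephone-event payload
OFFSET_BITS = 14           # width of the timestamp offset field


def redundancy_rate_bps(r: int) -> int:
    """Bits per second spent on r redundant events in every packet."""
    return r * BITS_PER_EVENT * 1000 // PACKET_INTERVAL_MS


def offset_coverage_seconds() -> float:
    """Time span a 14-bit timestamp offset can express at CLOCK_HZ."""
    return 2 ** OFFSET_BITS / CLOCK_HZ


if __name__ == "__main__":
    for r in (1, 5):
        print(f"r={r}: {redundancy_rate_bps(r)} bits/second")
    print(f"offset covers {offset_coverage_seconds()} seconds")
```

   With r = 1 this gives the 1280 bits/second stated above, and with the
   suggested r = 5 it gives 6400 bits/second.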

   An encoder MAY treat the event payload as a highly-compressed version
   of the current audio frame. In that mode, each RTP packet during an
   event would contain the current audio codec rendition (say, G.723.1
   or G.729) of this digit as well as the representation described in
   Section 3.5, plus any previous events seen earlier. This approach
   allows dumb gateways that do not understand this format to function.
   See also the discussion in Section 1.

3.8 Example

   Figure 2 shows a typical RTP packet, sent while the user is just
   dialing the last digit of the DTMF sequence "911". The first digit
   was 200 ms long (1600 timestamp units) and started at time 0, the
   second digit lasted 250 ms (2000 timestamp units) and started at time
   800 ms (6400 timestamp units), the third digit was pressed at time
   1.4 s (11,200 timestamp units) and the packet shown was sent at
   1.45 s (11,600 timestamp units). The frame duration is 50 ms. To make
   the parts recognizable, the figure below ignores byte alignment.
   Timestamp and sequence number are assumed to have been zero at the
   beginning of the first digit.
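
   The timing arithmetic in the paragraph above, together with the
   four-byte named-event layout of Figure 1, can be sketched in Python
   as follows. This is a minimal illustration, not a reference
   implementation: the names ms_to_ts, pack_event and DIGITS are
   hypothetical, the 8000 Hz clock is the one assumed by the example,
   and the volume values 7, 10 and 20 are taken from Figure 2.

```python
import struct

CLOCK_HZ = 8000  # sampling rate assumed in the example


def ms_to_ts(ms: int) -> int:
    """Convert milliseconds to RTP timestamp units at CLOCK_HZ."""
    return ms * CLOCK_HZ // 1000


def pack_event(event: int, end: bool, volume: int, duration: int) -> bytes:
    """Pack event(8) | E(1) | R(1) | volume(6) | duration(16), network order."""
    if not 0 <= event <= 0xFF:
        raise ValueError("event must fit in 8 bits")
    if not 0 <= volume <= 0x3F:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags = (0x80 if end else 0x00) | volume  # R bit (0x40) stays zero
    return struct.pack("!BBH", event, flags, duration)


# (digit, start ms, duration so far ms, ended?, volume) for "9", "1", "1".
# Volumes 7, 10 and 20 are the values shown in Figure 2.
DIGITS = [
    (9, 0, 200, True, 7),
    (1, 800, 250, True, 10),
    (1, 1400, 50, False, 20),  # still being pressed when the packet is sent
]

# The RTP timestamp is the start of the event currently in progress.
packet_ts = ms_to_ts(DIGITS[-1][1])
# Timestamp offsets of the two earlier (redundant) events.
offsets = [packet_ts - ms_to_ts(start) for _, start, _, _, _ in DIGITS[:-1]]
# One 4-byte payload per event, oldest first.
payloads = [pack_event(d, end, vol, ms_to_ts(dur))
            for d, _, dur, end, vol in DIGITS]

if __name__ == "__main__":
    print(packet_ts, offsets, [p.hex() for p in payloads])
```

   The offsets computed here are the differences between the packet
   timestamp and each earlier event's start time, which is how the
   "11200 - 6400 = 4800" entry in Figure 2 is obtained.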

   In this example, the dynamic payload types 96 and 97 have been
   assigned for the redundancy mechanism and the telephone event
   payload, respectively.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |V=2|P|X|  CC   |M|     PT      |       sequence number         |
    | 2 |0|0|   0   |0|     96      |              28               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           timestamp                           |
    |                             11200                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           synchronization source (SSRC) identifier            |
    |                           0x5234a8                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|   block PT  |     timestamp offset      |   block length    |
    |1|     97      |            11200          |         4         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|   block PT  |     timestamp offset      |   block length    |
    |1|     97      |   11200 - 6400 = 4800     |         4         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|   Block PT  |
    |0|     97      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     digit     |E R| volume    |          duration             |
    |       9       |1 0|     7     |             1600              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     digit     |E R| volume    |          duration             |
    |       1       |1 0|    10     |             2000              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     digit     |E R| volume    |          duration             |
    |       1       |0 0|    20     |              400              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 2: Example RTP packet after dialing "911"

3.9 Indication of Receiver Capabilities using SDP

   Receivers MAY indicate which named events they can handle, for
   example, by using the Session Description Protocol (RFC 2327 [7]).
   The payload formats use the following fmtp format to list the event
   values that they can receive:

   a=fmtp:<format> <list of values>

   The list of values consists of comma-separated elements, which can be
   either a single decimal number or two decimal numbers separated by a
   hyphen (dash), where the second number is larger than the first. No
   whitespace is allowed between numbers or hyphens. The list does not
   have to be sorted.

   For example, if the payload format uses the payload type number 100,
   and the implementation can handle the DTMF tones (events 0 through
   15) and the dial and ringing tones, it would include the following
   description in its SDP message:

   a=fmtp:100 0-15,66,70

   Since all implementations MUST be able to receive events 0 through
   15, listing these events in the a=fmtp line is OPTIONAL.

   The corresponding MIME parameter is "events", so that the following
   sample media type definition corresponds to the SDP example above:

   audio/telephone-event;events="0-15,66,70";rate="8000"

3.10 DTMF Events

   Table 1 summarizes the DTMF-related named events within the
   telephone-event payload format.

   Event  encoding (decimal)
   _________________________
   0--9                0--9
   *                     10
   #                     11
   A--D              12--15
   Flash                 16

   Table 1: DTMF named events

3.11 Data Modem and Fax Events

   Table 3.11 summarizes the events and tones that can appear on a
   subscriber line serving a fax machine or modem. The tones are
   described below, with additional detail in Table 7.

   ANS: This 2100 +/- 15 Hz tone is used to disable echo suppression
        for data transmission [8,9]. For fax machines, Recommendation
        T.30 [9] refers to this tone as called terminal identification
        (CED) answer tone.

   /ANS: This is the same signal as ANS, except that it reverses phase
        at an interval of 450 +/- 25 ms. It disables both echo
        cancellers and echo suppressors.
        (In the ITU Recommendation V.25 [8], this signal is rendered as
        ANS with a bar on top.)

   ANSam: The modified answer tone (ANSam) [3] is a sinewave signal at
        2100 +/- 1 Hz without phase reversals, amplitude-modulated by a
        sinewave at 15 +/- 0.1 Hz. This tone is sent by modems if
        network echo canceller disabling is not required.