   The payload format is shown in Fig. 1.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     event     |E|R| volume    |          duration             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 1: Payload Format for Named Events

      event: The events are encoded as shown in Sections 3.10 through
           3.14.

      volume: For DTMF digits and other events representable as tones,
           this field describes the power level of the tone, expressed
           in dBm0 after dropping the sign. Power levels range from 0 to
           -63 dBm0. The range of valid DTMF is from 0 to -36 dBm0 (must
           accept); levels below -55 dBm0 must be rejected
           (TR-TSY-000181, ITU-T Q.24A). Because the sign is dropped,
           larger values denote lower volume. This value is defined only
           for DTMF digits. For other events, it is set to zero by the
           sender and is ignored by the receiver.








Schulzrinne & Petrack       Standards Track                     [Page 6]

RFC 2833                         Tones                          May 2000


      duration: Duration of this event, in timestamp units. Thus, the
           event began at the instant identified by the RTP timestamp
           and has so far lasted as long as indicated by this parameter.
           The event may or may not have ended.

           For a sampling rate of 8000 Hz, this field is sufficient to
           express event durations of up to approximately 8 seconds.

      E: If set to a value of one, the "end" bit indicates that this
           packet contains the end of the event. Thus, the duration
           parameter above measures the complete duration of the event.

           A sender MAY delay setting the end bit until retransmitting
           the last packet for a tone, rather than on its first
           transmission. This avoids having to wait to detect whether
           the tone has indeed ended.

           Receiver implementations MAY use different algorithms to
           create tones, including the two described here. In the first,
           the receiver simply places a tone of the given duration in
           the audio playout buffer at the location indicated by the
           timestamp. As additional packets are received that extend the
           same tone, the waveform in the playout buffer is extended
           accordingly. (Care has to be taken if audio is mixed, i.e.,
           summed, in the playout buffer rather than simply copied.)
           Thus, if a packet in a tone lasting longer than the packet
           interarrival time gets lost and the playout delay is short, a
           gap in the tone may occur.  Alternatively, the receiver can
           start a tone and play it until it receives a packet with the
           "E" bit set, receives the next tone (distinguished by a
           different timestamp value), or a given time period elapses.
           This is more robust against packet loss, but may extend the
           tone if all retransmissions of the last packet in an event
           are lost. The period over which a tone may be extended must
           be limited so that a tone does not get "stuck". Regardless of
           the algorithm used, the tone SHOULD NOT be extended by more
           than three packet interarrival times. A slight extension of
           tone durations and shortening of pauses is generally
           harmless.

      R: This field is reserved for future use. The sender MUST set it
           to zero; the receiver MUST ignore it.
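
   The field layout above can be illustrated with a short sketch. The
   function names pack_event and unpack_event are hypothetical and not
   part of this specification; the code only assumes the 8/1/1/6/16-bit
   layout of Figure 1 in network byte order.

```python
import struct

def pack_event(event, end, volume, duration):
    """Pack one named-event payload into 4 octets (network byte order)."""
    if not 0 <= event <= 0xFF:
        raise ValueError("event must fit in 8 bits")
    if not 0 <= volume <= 0x3F:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Second octet: E bit, R bit (always sent as zero), 6-bit volume.
    flags = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", event, flags, duration)

def unpack_event(payload):
    """Decode the first 4 octets of a named-event payload."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        "event": event,
        "end": bool(flags & 0x80),
        # The R bit (0x40) is deliberately ignored on receipt.
        "volume": flags & 0x3F,
        "duration": duration,
    }

# Largest duration the 16-bit field can carry at 8000 Hz (about 8.19 s).
MAX_DURATION_SECONDS = 0xFFFF / 8000
```

   For instance, pack_event(9, True, 7, 1600) describes a completed
   digit "9" at -7 dBm0 that lasted 200 ms at an 8000 Hz clock.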










3.6 Sending Event Packets

   An audio source SHOULD start transmitting event packets as soon as it
   recognizes an event, and SHOULD send a further packet every 50 ms
   thereafter or, if known, at the packet interval of the audio codec
   used for this session. (The sender does not need to maintain precise
   time intervals between event packets in order to maintain precise
   inter-event times, since the timing information is contained in the
   timestamp.)

      Q.24 [5], Table A-1, indicates that all administrations surveyed
      use a minimum signal duration of 40 ms, with signaling velocity
      (tone and pause) of no less than 93 ms.

   If an event continues for more than one period, the source generating
   the events should send a new event packet with the RTP timestamp
   value corresponding to the beginning of the event and the duration of
   the event increased correspondingly. (The RTP sequence number is
   incremented by one for each packet.) If there has been no new event
   in the last interval, the event SHOULD be retransmitted three times
   or until the next event is recognized. This ensures that the duration
   of the event can be recognized correctly even if the last packet for
   an event is lost.

      DTMF digits and events are sent incrementally to avoid having the
      receiver wait for the completion of the event.  Since some tones
      are two seconds long, waiting would incur a substantial delay. The
      transmitter does not know if event length is important and thus
      needs to transmit immediately and incrementally. If the receiver
      application does not care about event length, the incremental
      transmission mechanism avoids delay. Some applications, such as
      gateways into the PSTN, care about both delays and event duration.
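
   The sending procedure of this section can be sketched as follows.
   This is a simplified model, not a reference implementation: the
   function name event_packets is hypothetical, the total duration is
   assumed to be known in advance (a real sender learns it only when
   the event ends), the final report is assumed to be sent three times
   in total, and the E bit is assumed to be set on every copy of the
   final report.

```python
def event_packets(event, start_ts, first_seq, total_duration,
                  interval=400, final_copies=3):
    """Return (seq, timestamp, event, end, duration) tuples for one event.

    All times are in timestamp units; 400 units are 50 ms at 8000 Hz.
    """
    packets = []
    seq = first_seq
    elapsed = interval
    # Interim reports: same timestamp, growing duration, new sequence number.
    while elapsed < total_duration:
        packets.append((seq, start_ts, event, False, elapsed))
        seq = (seq + 1) & 0xFFFF
        elapsed += interval
    # Final report, repeated so the full duration survives packet loss.
    for _ in range(final_copies):
        packets.append((seq, start_ts, event, True, total_duration))
        seq = (seq + 1) & 0xFFFF
    return packets
```

   Every packet of the event carries the timestamp of the event's
   beginning; only the duration and the sequence number advance.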

3.7 Reliability

   During an event, the RTP event payload format provides incremental
   updates on the event. The error resiliency depends on the playout
   delay at the receiver. For example, for a playout delay of 120 ms and
   a packet gap of 50 ms, two packets in a row can get lost without
   causing a gap in the tones generated at the receiver.

   The audio redundancy mechanism described in RFC 2198 [6] MAY be used
   to recover from packet loss across events. The effective data rate is
   r times 64 bits (32 bits for the redundancy header and 32 bits for
   the telephone-event payload) every 50 ms or r times 1280 bits/second,
   where r is the number of redundant events carried in each packet. The
   value of r is an implementation trade-off, with a value of 5
   suggested.




      The timestamp offset in this redundancy scheme has 14 bits, so
      that it allows a single packet to "cover" 2.048 seconds of
      telephone events at a sampling rate of 8000 Hz.  Including the
      starting time of previous events allows precise reconstruction of
      the tone sequence at a gateway.  The scheme is resilient to
      consecutive packet losses spanning this interval of 2.048 seconds
      or r digits, whichever is less. Note that for previous digits,
      only an average loudness can be represented.
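
   The two figures quoted above (1280 bits/second per redundant event
   and 2.048 seconds of coverage) follow from simple arithmetic, shown
   here as a sketch. The function names are hypothetical; the defaults
   are the 50 ms packet interval, 14-bit timestamp offset and 8000 Hz
   sampling rate used in the text.

```python
def redundancy_bit_rate(r, packet_interval_ms=50):
    """Bits per second added by carrying r redundant events per packet."""
    bits_per_event = 32 + 32  # redundancy header + telephone-event payload
    return r * bits_per_event * 1000 / packet_interval_ms

def covered_seconds(offset_bits=14, sampling_rate_hz=8000):
    """Time span addressable by the timestamp offset field."""
    return (1 << offset_bits) / sampling_rate_hz

print(redundancy_bit_rate(1))  # 1280.0
print(redundancy_bit_rate(5))  # 6400.0, for the suggested r = 5
print(covered_seconds())       # 2.048
```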

   An encoder MAY treat the event payload as a highly-compressed version
   of the current audio frame. In that mode, each RTP packet during an
   event would contain the current audio codec rendition (say, G.723.1
   or G.729) of this digit as well as the representation described in
   Section 3.5, plus any previous events seen earlier.

      This approach allows dumb gateways that do not understand this
      format to function. See also the discussion in Section 1.

3.8 Example

   Figure 2 shows a typical RTP packet, sent while the user is dialing
   the last digit of the DTMF sequence "911". The first digit was 200 ms
   long (1600 timestamp units) and started at time 0. The second digit
   lasted 250 ms (2000 timestamp units) and started at time 800 ms (6400
   timestamp units). The third digit was pressed at time 1.4 s (11,200
   timestamp units), and the packet shown was sent at 1.45 s (11,600
   timestamp units).  The frame duration is 50 ms. To make the parts
   recognizable, the figure below ignores byte alignment. Timestamp and
   sequence
   the figure below ignores byte alignment. Timestamp and sequence
   number are assumed to have been zero at the beginning of the first
   digit. In this example, the dynamic payload types 96 and 97 have been
   assigned for the redundancy mechanism and the telephone event
   payload, respectively.
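
   As an illustration, the packet described above could be assembled
   octet by octet as sketched below. The helper names are hypothetical.
   The redundancy block header layout (F bit, 7-bit block PT, 14-bit
   timestamp offset, 10-bit block length, followed by a one-octet
   header for the primary block) is taken from RFC 2198 [6]; unlike a
   diagram that ignores byte alignment, the code emits the byte-aligned
   form.

```python
import struct

def event_payload(event, end, volume, duration):
    """4-octet telephone-event payload: event, E/R/volume, duration."""
    flags = (0x80 if end else 0x00) | (volume & 0x3F)
    return struct.pack("!BBH", event, flags, duration)

def redundant_block_header(block_pt, ts_offset, block_len):
    """RFC 2198 header for a redundant block (F bit set to 1)."""
    word = (1 << 31) | (block_pt << 24) | (ts_offset << 10) | block_len
    return struct.pack("!I", word)

def build_example_packet():
    # RTP header: V=2, P=0, X=0, CC=0, M=0, PT=96, seq=28, ts=11200.
    rtp = struct.pack("!BBHII", 0x80, 96, 28, 11200, 0x5234A8)
    headers = (
        redundant_block_header(97, 11200 - 0, 4)       # first digit "9"
        + redundant_block_header(97, 11200 - 6400, 4)  # second digit "1"
        + bytes([97])                                  # F=0, primary block
    )
    events = (
        event_payload(9, True, 7, 1600)
        + event_payload(1, True, 10, 2000)
        + event_payload(1, False, 20, 400)  # third digit, still in progress
    )
    return rtp + headers + events
```

   The resulting packet is 33 octets long: 12 for the RTP header, 9
   for the block headers and 12 for the three event payloads.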



















    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   | 2 |0|0|   0   |0|     96      |              28               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   |                             11200                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   |                            0x5234a8                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   block PT  |     timestamp offset      |   block length    |
   |1|     97      |            11200          |         4         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   block PT  |     timestamp offset      |   block length    |
   |1|     97      |   11200 - 6400 = 4800     |         4         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   block PT  |
   |0|     97      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     digit     |E R| volume    |          duration             |
   |       9       |1 0|     7     |             1600              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     digit     |E R| volume    |          duration             |
   |       1       |1 0|    10     |             2000              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     digit     |E R| volume    |          duration             |
   |       1       |0 0|    20     |              400              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 2: Example RTP packet after dialing "911"

3.9 Indication of Receiver Capabilities using SDP

   Receivers MAY indicate which named events they can handle, for
   example, by using the Session Description Protocol (RFC 2327 [7]).
   The payload formats use the following fmtp format to list the event
   values that they can receive:

   a=fmtp:<format> <list of values>

   The list of values consists of comma-separated elements, which can be
   either a single decimal number or two decimal numbers separated by a
   hyphen (dash), where the second number is larger than the first. No
   whitespace is allowed between numbers or hyphens. The list does not
   have to be sorted.




   For example, if the payload format uses the payload type number 100,
   and the implementation can handle the DTMF tones (events 0 through
   15) and the dial and ringing tones, it would include the following
   description in its SDP message:

   a=fmtp:100 0-15,66,70

   Since all implementations MUST be able to receive events 0 through
   15, listing these events in the a=fmtp line is OPTIONAL.
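
   A receiver of such an a=fmtp line might expand the list of values as
   in the following sketch. The function name parse_event_list is
   hypothetical; the code enforces only the rules stated above (comma-
   separated elements, optional ranges whose second number is larger
   than the first, and no whitespace).

```python
import re

# One element: a decimal number, optionally followed by "-" and a second one.
_ELEMENT = re.compile(r"(\d+)(?:-(\d+))?")

def parse_event_list(values):
    """Expand an fmtp value list such as "0-15,66,70" into a set of ints."""
    events = set()
    for element in values.split(","):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("malformed element: %r" % element)
        first = int(match.group(1))
        if match.group(2) is None:
            events.add(first)
            continue
        last = int(match.group(2))
        if last <= first:
            raise ValueError("range must ascend: %r" % element)
        events.update(range(first, last + 1))
    return events
```

   Because the list need not be sorted, the result is returned as a
   set; parse_event_list("0-15,66,70") yields events 0 through 15 plus
   66 and 70.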

   The corresponding MIME parameter is "events", so that the following
   sample media type definition corresponds to the SDP example above:

   audio/telephone-event;events="0-15,66,70";rate="8000"

3.10 DTMF Events

   Table 1 summarizes the DTMF-related named events within the
   telephone-event payload format.

                     Event  encoding (decimal)
                     _________________________
                     0-9                  0-9
                     *                     10
                     #                     11
                     A-D                12-15
                     Flash                 16

                     Table 1: DTMF named events
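
   Table 1 translates directly into a lookup table, sketched here. The
   names DTMF_EVENTS, FLASH_EVENT and digits_to_events are hypothetical
   and used only for illustration.

```python
# Keypad symbol -> event code, following Table 1.
DTMF_EVENTS = {str(d): d for d in range(10)}
DTMF_EVENTS.update({"*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15})

FLASH_EVENT = 16  # hook flash; has no keypad symbol

def digits_to_events(digits):
    """Map a dialed string such as "911" to its list of event codes."""
    return [DTMF_EVENTS[ch] for ch in digits.upper()]
```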

3.11 Data Modem and Fax Events

   Table 2 summarizes the events and tones that can appear on a
   subscriber line serving a fax machine or modem. The tones are
   described below, with additional detail in Table 7.

      ANS: This 2100 +/- 15 Hz tone is used to disable echo
           suppression for data transmission [8,9]. For fax machines,
           Recommendation T.30 [9] refers to this tone as called
           terminal identification (CED) answer tone.

      /ANS: This is the same signal as ANS, except that it reverses
           phase at an interval of 450 +/- 25 ms. It disables both
           echo cancellers and echo suppressors. (In the ITU
           Recommendation V.25 [8], this signal is rendered as ANS
           with a bar on top.)