Network Working Group                                      H. Schulzrinne
Request for Comments: 2833                            Columbia University
Category: Standards Track                                      S. Petrack
                                                                  MetaTel
                                                                 May 2000


   RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

Abstract

   This memo describes how to carry dual-tone multifrequency (DTMF)
   signaling, other tone signals and telephony events in RTP packets.

1 Introduction

   This memo defines two payload formats, one for carrying dual-tone
   multifrequency (DTMF) digits and other line and trunk signals
   (Section 3), and a second one for general multi-frequency tones in
   RTP [1] packets (Section 4). Separate RTP payload formats are
   desirable since low-rate voice codecs cannot be guaranteed to
   reproduce these tone signals accurately enough for automatic
   recognition. Defining separate payload formats also permits higher
   redundancy while maintaining a low bit rate.

   The payload formats described here may be useful in at least three
   applications: DTMF handling for gateways and end systems, as well as
   "RTP trunks". In the first application, the Internet telephony
   gateway detects DTMF on the incoming circuits and sends the RTP
   payload described here instead of regular audio packets. The gateway
   likely has the necessary digital signal processors and algorithms, as
   it often needs to detect DTMF, e.g., for two-stage dialing. Having
   the gateway detect tones relieves the receiving Internet end system
   of this work and also prevents low bit-rate codecs like G.723.1
   from rendering DTMF tones unintelligible. Secondly, an Internet




Schulzrinne & Petrack       Standards Track                     [Page 1]

RFC 2833                         Tones                          May 2000


   end system such as an "Internet phone" can emulate DTMF functionality
   without concerning itself with generating precise tone pairs and
   without imposing the burden of tone recognition on the receiver.

   In the "RTP trunk" application, RTP is used to replace a normal
   circuit-switched trunk between two nodes. This is particularly of
   interest in a telephone network that is still mostly circuit-
   switched.  In this case, each end of the RTP trunk encodes audio
   channels into the appropriate encoding, such as G.723.1 or G.729.
   However, this encoding process destroys in-band signaling information
   which is carried using the least-significant bit ("robbed bit
   signaling") and may also interfere with in-band signaling tones, such
   as the MF digit tones. In addition, tone properties such as the phase
   reversals in the ANSam tone will not survive speech coding. Thus,
   the gateway needs to remove the in-band signaling information from
   the bit stream. It can now either carry it out-of-band in a signaling
   transport mechanism yet to be defined, or it can use the mechanism
   described in this memorandum. (If the two trunk end points are within
   reach of the same media gateway controller, the media gateway
   controller can also handle the signaling.)  Carrying it in-band may
   simplify the time synchronization between audio packets and the tone
   or signal information. This is particularly relevant where duration
   and timing matter, as in the carriage of DTMF signals.

1.1 Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 [2] and
   indicate requirement levels for compliant implementations.

2 Events vs. Tones

   A gateway has two options for handling DTMF digits and events. First,
   it can simply measure the frequency components of the voice band
   signals and transmit this information to the RTP receiver (Section
   4). In this mode, the gateway makes no attempt to discern the meaning
   of the tones, but simply distinguishes tones from speech signals.

   All tone signals in use in the PSTN and meant for human consumption
   are sequences of simple combinations of sine waves, either added or
   modulated. (There is at least one tone, the ANSam tone [3] used for
   indicating data transmission over voice lines, that makes use of
   periodic phase reversals.)
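
   As an illustration of "added" versus "modulated" combinations, the
   following minimal Python sketch synthesizes one tone of each kind.
   The 8000 Hz sampling rate, the function names, and the 440 Hz / 20 Hz
   modulation example are assumptions made for illustration only; the
   770 Hz + 1336 Hz pair is the standard DTMF pair for the digit "5".

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed sampling rate for this sketch


def added_tone(f1, f2, n_samples, rate=SAMPLE_RATE):
    """Sum of two equal-amplitude sine waves, as in a DTMF digit."""
    return [
        0.5 * math.sin(2 * math.pi * f1 * n / rate)
        + 0.5 * math.sin(2 * math.pi * f2 * n / rate)
        for n in range(n_samples)
    ]


def modulated_tone(carrier, modulator, n_samples, depth=1.0, rate=SAMPLE_RATE):
    """Carrier sine wave amplitude-modulated by a lower-frequency sine."""
    return [
        (1.0 + depth * math.sin(2 * math.pi * modulator * n / rate))
        / (1.0 + depth)
        * math.sin(2 * math.pi * carrier * n / rate)
        for n in range(n_samples)
    ]


# DTMF digit "5": 770 Hz (row) added to 1336 Hz (column), 50 ms long.
digit_5 = added_tone(770, 1336, 400)

# Hypothetical modulated tone: 440 Hz carrier modulated at 20 Hz.
warble = modulated_tone(440, 20, 400)
```

   Both functions return samples in the range -1.0 to 1.0; a real
   gateway would additionally scale them to the required power level.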

   As a second option, a gateway can recognize the tones and translate
   them into a name, such as ringing or busy tone. The receiver then
   produces a tone signal or other indication appropriate to the signal.



   Generally, since the recognition of signals often depends on their
   on/off pattern or the sequence of several tones, this recognition can
   take several seconds. On the other hand, the gateway may have access
   to the actual signaling information that generates the tones and thus
   can generate the RTP packet immediately, without the detour through
   acoustic signals.

   In the phone network, tones are generated at different places,
   depending on the switching technology and the nature of the tone.
   This determines, for example, whether a person making a call to a
   foreign country hears the local tones she is familiar with or the
   tones used in the country called.

   For analog lines, dial tone is always generated by the local switch.
   ISDN terminals may generate dial tone locally and then send a Q.931
   SETUP message containing the dialed digits. If the terminal just
   sends a SETUP message without any Called Party digits, then the
   switch collects the digits, which the terminal provides as KEYPAD
   messages, and provides dial tone over the B-channel. The terminal can
   either use the audio signal on the B-channel or can use the Q.931
   messages to trigger locally generated dial tone.

   Ringing tone (also called ringback tone) is generated by the local
   switch at the callee, with a one-way voice path opened up as soon as
   the callee's phone rings. (This reduces the chance of clipping the
   called party's response just after answer. It also permits pre-answer
   announcements or in-band call-progress indications to reach the
   caller before or in lieu of a ringing tone.) Congestion tone and
   special information tones can be generated by any of the switches
   along the way, and may be generated by the caller's switch based on
   ISUP messages received. Busy tone is generated by the caller's
   switch for analog instruments, triggered by the appropriate ISUP
   message, or by the ISDN terminal.

   Gateways which send signaling events via RTP MAY send both named
   signals (Section 3) and the tone representation (Section 4) as a
   single RTP session, using the redundancy mechanism defined in Section
   3.7 to interleave the two representations. It is generally a good
   idea to send both, since it allows the receiver to choose the
   appropriate rendering.

   If a gateway cannot present a tone representation, it SHOULD send the
   audio tones as regular RTP audio packets (e.g., as payload format
   PCMU), in addition to the named signals.







3 RTP Payload Format for Named Telephone Events

3.1 Introduction

   The payload format for named telephone events described below is
   suitable for both gateway and end-to-end scenarios. In the gateway
   scenario, an Internet telephony gateway connecting a packet voice
   network to the PSTN recreates the DTMF tones or other telephony
   events and injects them into the PSTN. Since, for example, DTMF digit
   recognition takes several tens of milliseconds, the first few
   milliseconds of a digit will arrive as regular audio packets. Thus,
   careful time and power (volume) alignment between the audio samples
   and the events is needed to avoid generating spurious digits at the
   receiver.

   DTMF digits and named telephone events are carried as part of the
   audio stream, and MUST use the same sequence number and time-stamp
   base as the regular audio channel to simplify the generation of audio
   waveforms at a gateway. The default clock frequency is 8,000 Hz, but
   the clock frequency can be redefined when assigning the dynamic
   payload type.
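
   Because events share the time-stamp base of the audio channel, event
   times are expressed in the same clock units as audio samples. The
   following minimal Python sketch shows the arithmetic at the default
   8,000 Hz clock. The function names are hypothetical, and the 32-bit
   wrap-around follows the size of the RTP timestamp field defined in
   the RTP specification [1].

```python
CLOCK_RATE_HZ = 8000  # default clock frequency for this payload format


def ms_to_timestamp_units(duration_ms, clock_rate=CLOCK_RATE_HZ):
    """Convert a duration in milliseconds to RTP timestamp units."""
    return duration_ms * clock_rate // 1000


def advance_timestamp(timestamp, units):
    """Advance an RTP timestamp, wrapping at 32 bits."""
    return (timestamp + units) & 0xFFFFFFFF


# A 20 ms audio packet spans 160 timestamp units at 8,000 Hz.
audio_step = ms_to_timestamp_units(20)

# An event detected three audio packets after timestamp 48000 is stamped
# on the same time line as the audio it accompanies.
event_timestamp = advance_timestamp(48000, 3 * audio_step)
```

   If a different clock frequency is assigned with the dynamic payload
   type, only the clock_rate argument changes.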

   The payload format described here achieves a higher redundancy even
   in the case of sustained packet loss than the method proposed for the
   Voice over Frame Relay Implementation Agreement [4].

   If an end system is directly connected to the Internet and does not
   need to generate tone signals again, time alignment and power levels
   are not relevant. These systems rely on PSTN gateways or Internet end
   systems to generate DTMF events and do not perform their own audio
   waveform analysis. An example of such a system is an Internet
   interactive voice-response (IVR) system.

   In circumstances where exact timing alignment between the audio
   stream and the DTMF digits or other events is not important and data
   is sent unicast, such as the IVR example mentioned earlier, it may be
   preferable to use a reliable control protocol rather than RTP
   packets. In those circumstances, this payload format would not be
   used.

3.2 Simultaneous Generation of Audio and Events

   A source MAY send events and coded audio packets for the same time
   instants, using events as the redundant encoding for the audio
   stream, or it MAY block outgoing audio while event tones are active
   and only send named events as both the primary and redundant
   encodings.




   Note that a period covered by an encoded tone may overlap in time
   with a period of audio encoded by other means. This is likely to
   occur at the onset of a tone and is necessary to avoid possible
   errors in the interpretation of the reproduced tone at the remote
   end.  Implementations supporting this payload format must be prepared
   to handle the overlap. It is RECOMMENDED that gateways only render
   the encoded tone since the audio may contain spurious tones
   introduced by the audio compression algorithm. However, it is
   anticipated that these extra tones in general should not interfere
   with recognition at the far end.

3.3 Event Types

   This payload format is used for five different types of signals:

      o  DTMF tones (Section 3.10);

      o  fax-related tones (Section 3.11);

      o  standard subscriber line tones (Section 3.12);

      o  country-specific subscriber line tones (Section 3.13); and

      o  trunk events (Section 3.14).

   A compliant implementation MUST support the events listed in Table 1
   with the exception of "flash". If it uses some other, out-of-band
   mechanism for signaling line conditions, it does not have to
   implement the other events.

   In some cases, an implementation may simply ignore certain events,
   such as fax tones, that do not make sense in a particular
   environment.  Section 3.9 specifies how an implementation can use the
   SDP "fmtp" parameter within an SDP description to indicate its
   inability to understand a particular event or range of events.

   Depending on the available user interfaces, an implementation MAY
   render all tones in Table 5 the same or, preferably, use the tones
   conveyed by the concurrent "tone" payload or other RTP audio payload.
   Alternatively, it could provide a textual representation.

   Note that end systems that emulate telephones only need to support
   the events described in Sections 3.10 and 3.12, while systems that
   receive trunk signaling need to implement those in Sections 3.10,
   3.11, 3.12 and 3.14, since MF trunks also carry most of the "line"
   signals. Systems that do not support fax or modem functionality do
   not need to render fax-related events described in Section 3.11.




   The RTP payload format is designated as "telephone-event", the MIME
   type as "audio/telephone-event". The default timestamp rate is 8000
   Hz, but other rates may be defined. In accordance with current
   practice, this payload format does not have a static payload type
   number, but uses an RTP payload type number established dynamically
   and out-of-band.
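
   One common out-of-band mechanism is an SDP "rtpmap" attribute that
   binds a dynamic payload type number to the format name and timestamp
   rate. The following minimal Python sketch builds such a line. The
   helper name and the payload type number 101 are assumptions chosen
   for illustration; any unused number in the dynamic range 96-127
   could be negotiated instead.

```python
def rtpmap_line(payload_type, encoding="telephone-event", clock_rate=8000):
    """Build an SDP rtpmap attribute for a dynamic RTP payload type."""
    if not 96 <= payload_type <= 127:
        raise ValueError("dynamic RTP payload types are in the range 96-127")
    return f"a=rtpmap:{payload_type} {encoding}/{clock_rate}"


# Hypothetical choice of payload type 101 at the default 8000 Hz rate.
line = rtpmap_line(101)
```

   With these assumed values, the resulting attribute line reads
   "a=rtpmap:101 telephone-event/8000".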

3.4 Use of RTP Header Fields

      Timestamp: The RTP timestamp reflects the measurement point for
           the current packet. The event duration described in Section
           3.5 extends forwards from that time. The receiver calculates
           jitter for RTCP receiver reports based on all packets with a
           given timestamp. Note: The jitter value should primarily be
           used as a means for comparing the reception quality between
           two users or two time-periods, not as an absolute measure.

      Marker bit: The RTP marker bit indicates the beginning of a new
           event.
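
   The following minimal Python sketch shows where these two fields sit
   in the 12-byte fixed RTP header defined in the RTP specification [1].
   The function name, the payload type 101, and the SSRC value are
   assumptions for illustration. The reuse of one timestamp for a later
   packet of the same event is likewise an assumption, suggested by the
   reference above to "all packets with a given timestamp".

```python
import struct


def rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack a fixed RTP header: version 2, no padding, extension or CSRCs."""
    byte0 = 2 << 6  # V=2, P=0, X=0, CC=0
    byte1 = (0x80 if marker else 0x00) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",
        byte0,
        byte1,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


# First packet of a new event: marker bit set.
first = rtp_header(101, seq=1000, timestamp=48000, ssrc=0x1234ABCD, marker=True)

# Later packet for the same event: marker bit clear, next sequence number,
# same timestamp (an assumption of this sketch, see the text above).
later = rtp_header(101, seq=1001, timestamp=48000, ssrc=0x1234ABCD)
```

   A receiver can detect the start of a new event by testing the top bit
   of the second header byte, e.g., "first[1] & 0x80".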

3.5 Payload Format