RFC 1890                       AV Profile                   January 1996


   For sample-based encodings producing one or more octets per sample,
   samples from different channels sampled at the same sampling instant
   are packed in consecutive octets. For example, for a two-channel
   encoding, the octet sequence is (left channel, first sample), (right
   channel, first sample), (left channel, second sample), (right
   channel, second sample), .... For multi-octet encodings, octets are
   transmitted in network byte order (i.e., most significant octet
   first).
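   The interleaving and byte-order rules above can be sketched for
   two-channel, 16-bit samples as follows. This is a minimal
   illustration only; the function name pack_stereo_l16 and its
   signature are hypothetical and not part of this profile.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack n sample pairs of two-channel 16-bit audio into `out`, which
 * must hold at least 4*n octets.
 * Layout: left[0], right[0], left[1], right[1], ...
 * Each sample is written most significant octet first (network byte
 * order). Returns the number of octets written.
 * NOTE: hypothetical helper for illustration, not defined by the RFC. */
size_t pack_stereo_l16(const int16_t *left, const int16_t *right,
                       size_t n, uint8_t *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t l = (uint16_t)left[i];
        uint16_t r = (uint16_t)right[i];
        out[o++] = (uint8_t)(l >> 8);    /* left, high octet  */
        out[o++] = (uint8_t)(l & 0xff);  /* left, low octet   */
        out[o++] = (uint8_t)(r >> 8);    /* right, high octet */
        out[o++] = (uint8_t)(r & 0xff);  /* right, low octet  */
    }
    return o;
}
```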

   The packing of sample-based encodings producing less than one octet
   per sample is encoding-specific.

4.3 Guidelines for Frame-Based Audio Encodings

   Frame-based encodings encode a fixed-length block of audio into
   another block of compressed data, typically also of fixed length. For
   frame-based encodings, the sender may choose to combine several such
   frames into a single message. The receiver can tell the number of
   frames contained in a message since the frame duration is defined as
   part of the encoding.
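   As a rough sketch of how a receiver might derive the frame count,
   the helpers below assume an encoding whose frames all have the same
   size in octets. The function names are hypothetical, and the frame
   size must be supplied by the caller since this section does not
   state it.

```c
#include <stddef.h>

/* Number of whole frames in a payload, or -1 if the payload length is
 * zero or not a multiple of the (fixed) frame size.
 * NOTE: hypothetical helper; assumes fixed-size frames. */
long frames_in_payload(size_t payload_octets, size_t frame_octets)
{
    if (frame_octets == 0 || payload_octets == 0 ||
        payload_octets % frame_octets != 0)
        return -1;
    return (long)(payload_octets / frame_octets);
}

/* Audio duration in microseconds covered by `frames` frames, given
 * the per-frame duration in microseconds (e.g. 20000 for 20 ms).
 * Propagates the -1 error value from frames_in_payload(). */
long payload_duration_us(long frames, long frame_us)
{
    return frames < 0 ? -1 : frames * frame_us;
}
```

   For example, assuming a 33-octet frame for GSM (a figure not taken
   from this text) and the 20 ms frame duration listed in Table 1, a
   99-octet payload would hold 3 frames covering 60 ms.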

   For frame-based codecs, the channel order is defined for the whole
   block. That is, for two-channel audio, right and left samples are
   coded independently, with the encoded frame for the left channel
   preceding that for the right channel.

   All frame-oriented audio codecs should be able to encode and decode
   several consecutive frames within a single packet. Since the frame
   size for the frame-oriented codecs is given, there is no need to use
   a separate designation for the same encoding, but with different
   number of frames per packet.

Schulzrinne                 Standards Track                     [Page 7]

4.4 Audio Encodings

           encoding    sample/frame    bits/sample    ms/frame
           ____________________________________________________
           1016        frame           N/A            30
           DVI4        sample          4
           G721        sample          4
           G722        sample          8
           G728        frame           N/A            2.5
           GSM         frame           N/A            20
           L8          sample          8
           L16         sample          16
           LPC         frame           N/A            20
           MPA         frame           N/A
           PCMA        sample          8
           PCMU        sample          8
           VDVI        sample          var.

                 Table 1: Properties of Audio Encodings

   The characteristics of standard audio encodings are shown in Table 1
   and their payload types are listed in Table 2.
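   For the sample-based encodings in Table 1, the size of the sample
   data follows from the bits/sample column. The sketch below is
   illustrative only: the function name is hypothetical, the sampling
   rate is a caller-supplied assumption (it is not listed in Table 1),
   rounding up to a whole octet is an assumption, and any per-packet
   header such as the DVI4 header is not counted.

```c
#include <stddef.h>

/* Octets of sample data for `duration_ms` of audio.
 * rate_hz:         sampling rate (assumption supplied by the caller)
 * channels:        number of channels
 * bits_per_sample: value from the bits/sample column of Table 1
 * Rounds up to a whole octet (an assumption of this sketch). */
size_t sample_payload_octets(unsigned rate_hz, unsigned duration_ms,
                             unsigned channels, unsigned bits_per_sample)
{
    unsigned long long samples =
        (unsigned long long)rate_hz * duration_ms / 1000;
    unsigned long long bits = samples * channels * bits_per_sample;
    return (size_t)((bits + 7) / 8);
}
```

   Under an assumed 8000 Hz sampling rate, 20 ms of single-channel
   8-bit audio occupies 160 octets and the same duration of 4-bit
   audio occupies 80 octets.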

4.4.1 1016

   Encoding 1016 is a frame-based encoding using code-excited linear
   prediction (CELP) and is specified in Federal Standard FED-STD 1016
   [2,3,4,5].

   The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited
   linear prediction voice coder version 3.2 (CELP 3.2) Fortran and C
   simulation source codes are available for worldwide distribution at
   no charge (on DOS diskettes, but configured to compile on Sun SPARC
   stations) from: Bob Fenichel, National Communications System,
   Washington, D.C. 20305, phone +1-703-692-2124, fax +1-703-746-4960.

4.4.2 DVI4

   DVI4 is specified, with pseudo-code, in [6] as the IMA ADPCM wave
   type. A specification titled "DVI ADPCM Wave Type" can also be found
   in the Microsoft Developer Network Development Library CD ROM
   published quarterly by Microsoft. The relevant section is found under
   Product Documentation, SDKs, Multimedia Standards Update, New
   Multimedia Data Types and Data Techniques, Revision 3.0, April 15,
   1994. However, the encoding defined here as DVI4 differs in two
   respects from these recommendations:

        o The header contains the predicted value rather than the first
         sample value.

        o IMA ADPCM blocks contain an odd number of samples, since the
         first sample of a block is contained just in the header
         (uncompressed), followed by an even number of compressed
         samples. DVI4 has an even number of compressed samples only,
         using the 'predict' word from the header to decode the first
         sample.

   Each packet contains a single DVI block. The profile only defines the
   4-bit-per-sample version, while IMA also specifies a 3-bit-per-sample
   encoding.

   The "header" word for each channel has the following structure:

     int16  predict;  /* predicted value of first sample
                         from the previous block (L16 format) */
     u_int8 index;    /* current index into stepsize table */
     u_int8 reserved; /* set to zero by sender, ignored by receiver */

   Packing of samples for multiple channels is for further study.
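   A receiver might read the header word along the following lines.
   This is a sketch under stated assumptions: the fields are taken to
   be serialized in the order listed above, with predict in network
   byte order as for L16. The struct and function names are
   hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* In-memory form of the DVI4 per-channel header (hypothetical name). */
struct dvi4_header {
    int16_t predict;   /* predicted value of first sample (L16 format) */
    uint8_t index;     /* current index into stepsize table */
    uint8_t reserved;  /* set to zero by sender, ignored by receiver */
};

/* Parse the 4-octet header at the start of `buf`.
 * Assumes fields appear in the listed order and that predict is sent
 * most significant octet first. Returns 0 on success, -1 if fewer
 * than 4 octets are available. */
int dvi4_parse_header(const uint8_t *buf, size_t len,
                      struct dvi4_header *h)
{
    if (len < 4)
        return -1;
    int32_t v = ((int32_t)buf[0] << 8) | buf[1];
    if (v >= 32768)        /* undo two's complement without relying */
        v -= 65536;        /* on implementation-defined conversion  */
    h->predict = (int16_t)v;
    h->index = buf[2];
    h->reserved = buf[3];  /* value is ignored by the receiver */
    return 0;
}
```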

   The document, "IMA Recommended Practices for Enhancing Digital Audio
   Compatibility in Multimedia Systems (version 3.0)", contains the
   algorithm description.  It is available from:

   Interactive Multimedia Association
   48 Maryland Avenue, Suite 202
   Annapolis, MD 21401-8011
   USA
   phone: +1 410 626-1380

4.4.3 G721

   G721 is specified in ITU recommendation G.721. Reference
   implementations for G.721 are available as part of the CCITT/ITU-T
   Software Tool Library (STL) from the ITU General Secretariat, Sales
   Service, Place des Nations, CH-1211 Geneva 20, Switzerland. The
   library is covered by a license.

4.4.4 G722

   G722 is specified in ITU-T recommendation G.722, "7 kHz audio-coding
   within 64 kbit/s".

4.4.5 G728

   G728 is specified in ITU-T recommendation G.728, "Coding of speech at
   16 kbit/s using low-delay code excited linear prediction".

4.4.6 GSM

   GSM (Groupe Spécial Mobile) denotes the European GSM 06.10
   provisional standard for full-rate speech transcoding, prI-ETS 300
   036, which is based on RPE/LTP (residual pulse excitation/long term
   prediction) coding at a rate of 13 kb/s [7,8,9]. The standard can be
   obtained from

   ETSI (European Telecommunications Standards Institute)
   ETSI Secretariat: B.P.152
   F-06561 Valbonne Cedex
   France
   Phone: +33 92 94 42 00
   Fax: +33 93 65 47 16

4.4.7 L8

   L8 denotes linear audio data, using 8 bits of precision with an
   offset of 128, that is, the most negative signal is encoded as zero.

4.4.8 L16

   L16 denotes uncompressed audio data, using 16-bit signed
   representation with 65535 equally divided steps between minimum and
   maximum signal level, ranging from -32768 to 32767. The value is
   represented in two's complement notation and network byte order.
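   The L8 offset and the L16 representation can be illustrated with
   the small helpers below. The function names are hypothetical, and
   scaling an L8 sample to the L16 range by a factor of 256 is an
   assumption of this sketch rather than something this profile
   specifies.

```c
#include <stdint.h>

/* Remove the L8 offset of 128: octet 0 is the most negative signal
 * (-128), octet 255 the most positive (127). */
int l8_to_signed(uint8_t v)
{
    return (int)v - 128;
}

/* Scale an L8 sample to the L16 range. Multiplying by 256 is an
 * assumed convention; the result lies in -32768..32512. */
int16_t l8_to_l16(uint8_t v)
{
    return (int16_t)(((int)v - 128) * 256);
}

/* Rebuild an L16 sample from two octets received in network byte
 * order (most significant octet first), two's complement. */
int16_t l16_from_octets(uint8_t hi, uint8_t lo)
{
    int32_t v = ((int32_t)hi << 8) | lo;
    if (v >= 32768)
        v -= 65536;
    return (int16_t)v;
}
```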

4.4.9 LPC

   LPC designates an experimental linear predictive encoding contributed
   by Ron Frederick, Xerox PARC, which is based on an implementation
   written by Ron Zuckerman, Motorola, posted to the Usenet group
   comp.dsp on June 26, 1992.

4.4.10 MPA

   MPA denotes MPEG-I or MPEG-II audio encapsulated as elementary
   streams. The encoding is defined in ISO standards ISO/IEC 11172-3 and
   13818-3. The encapsulation is specified in work in progress [10],
   Section 3. The authors can be contacted at

   Don Hoffman
   Sun Microsystems, Inc.
   Mail-stop UMPK14-305
   2550 Garcia Avenue
   Mountain View, California 94043-1100
   USA
   electronic mail: don.hoffman@eng.sun.com

   Sampling rate and channel count are contained in the payload. MPEG-I
   audio supports sampling rates of 32000, 44100, and 48000 Hz (ISO/IEC
   11172-3, section 1.1; "Scope"). MPEG-II additionally supports
   sampling rates of 16000, 22050, and 24000 Hz (ISO/IEC 13818-3,
   "... ISO/IEC 11172-3 Audio...").

4.4.11 PCMA

   PCMA is specified in CCITT/ITU-T recommendation G.711. Audio data is
   encoded as eight bits per sample, after logarithmic scaling. Code to
   convert between linear and A-law companded data is available in [6].
   A detailed description is given by Jayant and Noll [11].

4.4.12 PCMU

   PCMU is specified in CCITT/ITU-T recommendation G.711. Audio data is
   encoded as eight bits per sample, after logarithmic scaling. Code to
   convert between linear and mu-law companded data is available in [6].
   PCMU is the encoding used for the Internet media type audio/basic.  A
   detailed description is given by Jayant and Noll [11].

4.4.13 VDVI

   VDVI is a variable-rate version of DVI4, yielding speech bit rates of
   between 10 and 25 kb/s. It is specified for single-channel operation
   only. It uses the following encoding:

                    DVI4 codeword    VDVI bit pattern
                    __________________________________
                                0    00
                                1    010
                                2    1100
                                3    11100
                                4    111100
                                5    1111100
                                6    11111100
                                7    11111110
                                8    10
                                9    011
                               10    1101
                               11    11101
                               12    111101
                               13    1111101
                               14    11111101
                               15    11111111
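
   The table above can be applied as in the following sketch, which
   maps a sequence of 4-bit DVI4 codewords to VDVI bit patterns. The
   names are hypothetical, and packing the bits into octets starting
   at the most significant bit is an assumption; this section does not
   state the bit order or any padding rule.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One VDVI code: the bit pattern (right-aligned) and its length. */
struct vdvi_code { uint8_t bits; uint8_t len; };

/* Indexed by DVI4 codeword 0..15, transcribed from the table above. */
static const struct vdvi_code vdvi_table[16] = {
    {0x00, 2}, {0x02, 3}, {0x0c, 4}, {0x1c, 5},
    {0x3c, 6}, {0x7c, 7}, {0xfc, 8}, {0xfe, 8},
    {0x02, 2}, {0x03, 3}, {0x0d, 4}, {0x1d, 5},
    {0x3d, 6}, {0x7d, 7}, {0xfd, 8}, {0xff, 8},
};

/* Encode n DVI4 codewords into `out` (out_len octets).
 * Bits are written most significant bit first (an assumption of this
 * sketch); unused trailing bits are left as zero.
 * Returns the number of bits written, or 0 if a codeword is out of
 * range or the output buffer is too small. */
size_t vdvi_encode(const uint8_t *codewords, size_t n,
                   uint8_t *out, size_t out_len)
{
    size_t bitpos = 0;
    memset(out, 0, out_len);
    for (size_t i = 0; i < n; i++) {
        if (codewords[i] > 15)
            return 0;
        struct vdvi_code c = vdvi_table[codewords[i]];
        if (bitpos + c.len > out_len * 8)
            return 0;
        for (int b = c.len - 1; b >= 0; b--) {
            if ((c.bits >> b) & 1)
                out[bitpos / 8] |= (uint8_t)(0x80 >> (bitpos % 8));
            bitpos++;
        }
    }
    return bitpos;
}
```

   Because no bit pattern in the table is a prefix of another, a
   decoder can recover the codewords by reading bits until a pattern
   matches.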

5.  Video

   The following video encodings are currently defined, with their
   abbreviated names used for identification:

5.1 CelB

   The CELL-B encoding is a proprietary encoding proposed by Sun
   Microsystems.  The byte stream format is described in work in
   progress [12].  The author can be contacted at

   Michael F. Speer
   Sun Microsystems Computer Corporation
   2550 Garcia Ave MailStop UMPK14-305
   Mountain View, CA 94043
   United States
   electronic mail: michael.speer@eng.sun.com

5.2 JPEG

   The encoding is specified in ISO Standards 10918-1 and 10918-2. The
   RTP payload format is as specified in work in progress [13].  Further
   information can be obtained from

   Steven McCanne
   Lawrence Berkeley National Laboratory
   M/S 46A-1123
   One Cyclotron Road
   Berkeley, CA 94720
   United States
   Phone: +1 510 486 7520
   electronic mail: mccanne@ee.lbl.gov

5.3 H261

   The encoding is specified in CCITT/ITU-T standard H.261. The
   packetization and RTP-specific properties are described in work in
   progress [14]. Further information can be obtained from

   Thierry Turletti
   Office NE 43-505
   Telemedia, Networks and Systems
   Laboratory for Computer Science
   Massachusetts Institute of Technology
   545 Technology Square
   Cambridge, MA 02139
   United States
   electronic mail: turletti@clove.lcs.mit.edu


