RFC 1890                       AV Profile                   January 1996

   For sample-based encodings producing one or more octets per sample,
   samples from different channels sampled at the same sampling instant
   are packed in consecutive octets. For example, for a two-channel
   encoding, the octet sequence is (left channel, first sample), (right
   channel, first sample), (left channel, second sample), (right
   channel, second sample), .... For multi-octet encodings, octets are
   transmitted in network byte order (i.e., most significant octet
   first).

   The packing of sample-based encodings producing less than one octet
   per sample is encoding-specific.

4.3 Guidelines for Frame-Based Audio Encodings

   Frame-based encodings encode a fixed-length block of audio into
   another block of compressed data, typically also of fixed length.
   For frame-based encodings, the sender may choose to combine several
   such frames into a single message. The receiver can tell the number
   of frames contained in a message since the frame duration is defined
   as part of the encoding.

   For frame-based codecs, the channel order is defined for the whole
   block. That is, for two-channel audio, right and left samples are
   coded independently, with the encoded frame for the left channel
   preceding that for the right channel.

   All frame-oriented audio codecs should be able to encode and decode
   several consecutive frames within a single packet. Since the frame
   size for the frame-oriented codecs is given, there is no need to use
   a separate designation for the same encoding, but with different
   number of frames per packet.

Schulzrinne                 Standards Track                     [Page 7]

4.4 Audio Encodings

      encoding    sample/frame    bits/sample    ms/frame
      ____________________________________________________
      1016        frame           N/A            30
      DVI4        sample          4
      G721        sample          4
      G722        sample          8
      G728        frame           N/A            2.5
      GSM         frame           N/A            20
      L8          sample          8
      L16         sample          16
      LPC         frame           N/A            20
      MPA         frame           N/A
      PCMA        sample          8
      PCMU        sample          8
      VDVI        sample          var.
              Table 1: Properties of Audio Encodings

   The characteristics of standard audio encodings are shown in Table 1
   and their payload types are listed in Table 2.

4.4.1 1016

   Encoding 1016 is a frame-based encoding using code-excited linear
   prediction (CELP) and is specified in Federal Standard FED-STD 1016
   [2,3,4,5].

   The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited
   linear prediction voice coder version 3.2 (CELP 3.2) Fortran and C
   simulation source codes are available for worldwide distribution at
   no charge (on DOS diskettes, but configured to compile on Sun SPARC
   stations) from: Bob Fenichel, National Communications System,
   Washington, D.C. 20305, phone +1-703-692-2124, fax +1-703-746-4960.

4.4.2 DVI4

   DVI4 is specified, with pseudo-code, in [6] as the IMA ADPCM wave
   type.

   A specification titled "DVI ADPCM Wave Type" can also be found in
   the Microsoft Developer Network Development Library CD ROM published
   quarterly by Microsoft. The relevant section is found under Product
   Documentation, SDKs, Multimedia Standards Update, New Multimedia
   Data Types and Data Techniques, Revision 3.0, April 15, 1994.
   However, the encoding defined here as DVI4 differs in two respects
   from these recommendations:

   o  The header contains the predicted value rather than the first
      sample value.

   o  IMA ADPCM blocks contain an odd number of samples, since the
      first sample of a block is contained just in the header
      (uncompressed), followed by an even number of compressed
      samples. DVI4 has an even number of compressed samples only,
      using the 'predict' word from the header to decode the first
      sample.

   Each packet contains a single DVI block. The profile only defines
   the 4-bit-per-sample version, while IMA also specifies a
   3-bit-per-sample encoding.
   The "header" word for each channel has the following structure:

     int16  predict;  /* predicted value of first sample
                         from the previous block (L16 format) */
     u_int8 index;    /* current index into stepsize table */
     u_int8 reserved; /* set to zero by sender, ignored by receiver */

   Packing of samples for multiple channels is for further study.

   The document, "IMA Recommended Practices for Enhancing Digital Audio
   Compatibility in Multimedia Systems (version 3.0)", contains the
   algorithm description. It is available from:

   Interactive Multimedia Association
   48 Maryland Avenue, Suite 202
   Annapolis, MD 21401-8011
   USA
   phone: +1 410 626-1380

4.4.3 G721

   G721 is specified in ITU recommendation G.721. Reference
   implementations for G.721 are available as part of the CCITT/ITU-T
   Software Tool Library (STL) from the ITU General Secretariat, Sales
   Service, Place des Nations, CH-1211 Geneve 20, Switzerland. The
   library is covered by a license.

4.4.4 G722

   G722 is specified in ITU-T recommendation G.722, "7 kHz audio-coding
   within 64 kbit/s".

4.4.5 G728

   G728 is specified in ITU-T recommendation G.728, "Coding of speech
   at 16 kbit/s using low-delay code excited linear prediction".

4.4.6 GSM

   GSM (Groupe Special Mobile) denotes the European GSM 06.10
   provisional standard for full-rate speech transcoding, prI-ETS 300
   036, which is based on RPE/LTP (residual pulse excitation/long term
   prediction) coding at a rate of 13 kb/s [7,8,9]. The standard can be
   obtained from

   ETSI (European Telecommunications Standards Institute)
   ETSI Secretariat: B.P.152
   F-06561 Valbonne Cedex
   France
   Phone: +33 92 94 42 00
   Fax: +33 93 65 47 16

4.4.7 L8

   L8 denotes linear audio data, using 8 bits of precision with an
   offset of 128, that is, the most negative signal is encoded as zero.

4.4.8 L16

   L16 denotes uncompressed audio data, using 16-bit signed
   representation with 65535 equally divided steps between minimum and
   maximum signal level, ranging from -32768 to 32767.
   The value is represented in two's complement notation and network
   byte order.

4.4.9 LPC

   LPC designates an experimental linear predictive encoding
   contributed by Ron Frederick, Xerox PARC, which is based on an
   implementation written by Ron Zuckerman, Motorola, posted to the
   Usenet group comp.dsp on June 26, 1992.

4.4.10 MPA

   MPA denotes MPEG-I or MPEG-II audio encapsulated as elementary
   streams. The encoding is defined in ISO standards ISO/IEC 11172-3
   and 13818-3. The encapsulation is specified in work in progress
   [10], Section 3. The authors can be contacted at

   Don Hoffman
   Sun Microsystems, Inc.
   Mail-stop UMPK14-305
   2550 Garcia Avenue
   Mountain View, California 94043-1100
   USA
   electronic mail: don.hoffman@eng.sun.com

   Sampling rate and channel count are contained in the payload. MPEG-I
   audio supports sampling rates of 32000, 44100, and 48000 Hz (ISO/IEC
   11172-3, section 1.1; "Scope"). MPEG-II additionally supports
   ISO/IEC 11172-3 Audio...").

4.4.11 PCMA

   PCMA is specified in CCITT/ITU-T recommendation G.711. Audio data is
   encoded as eight bits per sample, after logarithmic scaling. Code to
   convert between linear and A-law companded data is available in [6].
   A detailed description is given by Jayant and Noll [11].

4.4.12 PCMU

   PCMU is specified in CCITT/ITU-T recommendation G.711. Audio data is
   encoded as eight bits per sample, after logarithmic scaling. Code to
   convert between linear and mu-law companded data is available in
   [6]. PCMU is the encoding used for the Internet media type
   audio/basic. A detailed description is given by Jayant and Noll
   [11].

4.4.13 VDVI

   VDVI is a variable-rate version of DVI4, yielding speech bit rates
   of between 10 and 25 kb/s. It is specified for single-channel
   operation only.
   It uses the following encoding:

      DVI4 codeword    VDVI bit pattern
      __________________________________
      0                00
      1                010
      2                1100
      3                11100
      4                111100
      5                1111100
      6                11111100
      7                11111110
      8                10
      9                011
      10               1101
      11               11101
      12               111101
      13               1111101
      14               11111101
      15               11111111

5. Video

   The following video encodings are currently defined, with their
   abbreviated names used for identification:

5.1 CelB

   The CELL-B encoding is a proprietary encoding proposed by Sun
   Microsystems. The byte stream format is described in work in
   progress [12]. The author can be contacted at

   Michael F. Speer
   Sun Microsystems Computer Corporation
   2550 Garcia Ave MailStop UMPK14-305
   Mountain View, CA 94043
   United States
   electronic mail: michael.speer@eng.sun.com

5.2 JPEG

   The encoding is specified in ISO Standards 10918-1 and 10918-2. The
   RTP payload format is as specified in work in progress [13]. Further
   information can be obtained from

   Steven McCanne
   Lawrence Berkeley National Laboratory
   M/S 46A-1123
   One Cyclotron Road
   Berkeley, CA 94720
   United States
   Phone: +1 510 486 7520
   electronic mail: mccanne@ee.lbl.gov

5.3 H261

   The encoding is specified in CCITT/ITU-T standard H.261. The
   packetization and RTP-specific properties are described in work in
   progress [14]. Further information can be obtained from

   Thierry Turletti
   Office NE 43-505
   Telemedia, Networks and Systems
   Laboratory for Computer Science
   Massachusetts Institute of Technology
   545 Technology Square
   Cambridge, MA 02139
   United States
   electronic mail: turletti@clove.lcs.mit.edu
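
   As an illustration of the VDVI codeword table in Section 4.4.13, the
   following Python sketch maps 4-bit DVI4 codewords to VDVI bit
   patterns and back. It is not part of the profile. The names
   VDVI_PATTERNS, VDVI_CODEWORDS, vdvi_encode and vdvi_decode are
   hypothetical, and the sketch assumes the bit patterns are simply
   concatenated in codeword order. How the resulting bits are packed
   into octets or padded is not covered by the table, so the sketch
   stays at the level of bit strings.

```python
# VDVI bit patterns from the table in Section 4.4.13, indexed by DVI4 codeword.
VDVI_PATTERNS = [
    "00", "010", "1100", "11100", "111100", "1111100", "11111100", "11111110",
    "10", "011", "1101", "11101", "111101", "1111101", "11111101", "11111111",
]

# Reverse map for decoding. This works because no pattern in the table
# is a prefix of another pattern.
VDVI_CODEWORDS = {bits: cw for cw, bits in enumerate(VDVI_PATTERNS)}


def vdvi_encode(codewords):
    """Concatenate the VDVI bit patterns for a sequence of DVI4 codewords.

    Assumption: patterns are emitted back to back, in order.
    """
    return "".join(VDVI_PATTERNS[cw] for cw in codewords)


def vdvi_decode(bits):
    """Split a VDVI bit string back into a list of DVI4 codewords."""
    codewords = []
    current = ""
    for bit in bits:
        current += bit
        if current in VDVI_CODEWORDS:
            codewords.append(VDVI_CODEWORDS[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete VDVI pattern")
    return codewords
```

   For example, vdvi_encode([0, 8, 15]) yields the bit string
   "001011111111", and vdvi_decode applied to that string returns
   [0, 8, 15]. The shortest patterns belong to codewords 0 and 8, so a
   stream dominated by those codewords comes out shorter than the fixed
   4 bits per sample of DVI4.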