Network Working Group                                        S. Andersen
Request for Comments: 3951                            Aalborg University
Category: Experimental                                          A. Duric
                                                                   Telio
                                                               H. Astrom
                                                                R. Hagen
                                                               W. Kleijn
                                                               J. Linden
                                                         Global IP Sound
                                                           December 2004


                   Internet Low Bit Rate Codec (iLBC)

Status of this Memo

   This memo defines an Experimental Protocol for the Internet
   community.  It does not specify an Internet standard of any kind.
   Discussion and suggestions for improvement are requested.
   Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2004).

Abstract

   This document specifies a speech codec suitable for robust voice
   communication over IP.  The codec is developed by Global IP Sound
   (GIPS).  It is designed for narrow band speech and results in a
   payload bit rate of 13.33 kbit/s for 30 ms frames and 15.20 kbit/s
   for 20 ms frames.  The codec enables graceful speech quality
   degradation in the case of lost frames, which occurs in connection
   with lost or delayed IP packets.

Andersen, et al.              Experimental                      [Page 1]

RFC 3951              Internet Low Bit Rate Codec          December 2004

Table of Contents

   1. Introduction
   2. Outline of the Codec
      2.1. Encoder
      2.2. Decoder
   3. Encoder Principles
      3.1. Pre-processing
      3.2. LPC Analysis and Quantization
           3.2.1. Computation of Autocorrelation Coefficients
           3.2.2. Computation of LPC Coefficients
           3.2.3. Computation of LSF Coefficients from LPC
                  Coefficients
           3.2.4. Quantization of LSF Coefficients
           3.2.5. Stability Check of LSF Coefficients
           3.2.6. Interpolation of LSF Coefficients
           3.2.7. LPC Analysis and Quantization for 20 ms Frames
      3.3. Calculation of the Residual
      3.4. Perceptual Weighting Filter
      3.5. Start State Encoder
           3.5.1. Start State Estimation
           3.5.2. All-Pass Filtering and Scale Quantization
           3.5.3. Scalar Quantization
      3.6. Encoding the Remaining Samples
           3.6.1. Codebook Memory
           3.6.2. Perceptual Weighting of Codebook Memory and Target
           3.6.3. Codebook Creation
                  3.6.3.1. Creation of a Base Codebook
                  3.6.3.2. Codebook Expansion
                  3.6.3.3. Codebook Augmentation
           3.6.4. Codebook Search
                  3.6.4.1. Codebook Search at Each Stage
                  3.6.4.2. Gain Quantization at Each Stage
                  3.6.4.3. Preparation of Target for Next Stage
      3.7. Gain Correction Encoding
      3.8. Bitstream Definition
   4. Decoder Principles
      4.1. LPC Filter Reconstruction
      4.2. Start State Reconstruction
      4.3. Excitation Decoding Loop
      4.4. Multistage Adaptive Codebook Decoding
           4.4.1. Construction of the Decoded Excitation Signal
      4.5. Packet Loss Concealment
           4.5.1. Block Received Correctly and Previous Block Also
                  Received
           4.5.2. Block Not Received
           4.5.3. Block Received Correctly When Previous Block Not
                  Received
      4.6. Enhancement
           4.6.1. Estimating the Pitch
           4.6.2. Determination of the Pitch-Synchronous Sequences
           4.6.3. Calculation of the Smoothed Excitation
           4.6.4. Enhancer Criterion
           4.6.5. Enhancing the Excitation
      4.7. Synthesis Filtering
      4.8. Post Filtering
   5. Security Considerations
   6. Evaluation of the iLBC Implementations
   7. References
      7.1. Normative References
      7.2. Informative References
   8. ACKNOWLEDGEMENTS
   APPENDIX A: Reference Implementation
      A.1. iLBC_test.c
      A.2. iLBC_encode.h
      A.3. iLBC_encode.c
      A.4. iLBC_decode.h
      A.5. iLBC_decode.c
      A.6. iLBC_define.h
      A.7. constants.h
      A.8. constants.c
      A.9. anaFilter.h
      A.10. anaFilter.c
      A.11. createCB.h
      A.12. createCB.c
      A.13. doCPLC.h
      A.14. doCPLC.c
      A.15. enhancer.h
      A.16. enhancer.c
      A.17. filter.h
      A.18. filter.c
      A.19. FrameClassify.h
      A.20. FrameClassify.c
      A.21. gainquant.h
      A.22. gainquant.c
      A.23. getCBvec.h
      A.24. getCBvec.c
      A.25. helpfun.h
      A.26. helpfun.c
      A.27. hpInput.h
      A.28. hpInput.c
      A.29. hpOutput.h
      A.30. hpOutput.c
      A.31. iCBConstruct.h
      A.32. iCBConstruct.c
      A.33. iCBSearch.h
      A.34. iCBSearch.c
      A.35. LPCdecode.h
      A.36. LPCdecode.c
      A.37. LPCencode.h
      A.38. LPCencode.c
      A.39. lsf.h
      A.40. lsf.c
      A.41. packing.h
      A.42. packing.c
      A.43. StateConstructW.h
      A.44. StateConstructW.c
      A.45. StateSearchW.h
      A.46. StateSearchW.c
      A.47. syntFilter.h
      A.48. syntFilter.c
   Authors' Addresses
   Full Copyright Statement

1. Introduction

   This document contains the description of an algorithm for the coding
   of speech signals sampled at 8 kHz.  The algorithm, called iLBC, uses
   a block-independent linear-predictive coding (LPC) algorithm and has
   support for two basic frame lengths: 20 ms at 15.2 kbit/s and 30 ms
   at 13.33 kbit/s.  When the codec operates at block lengths of 20 ms,
   it produces 304 bits per block, which SHOULD be packetized as in [1].
   Similarly, for block lengths of 30 ms it produces 400 bits per block,
   which SHOULD be packetized as in [1].  The two modes for the
   different frame sizes operate in a very similar way.  When they
   differ it is explicitly stated in the text, usually with the notation
   x/y, where x refers to the 20 ms mode and y refers to the 30 ms mode.

   The described algorithm results in a speech coding system with a
   controlled response to packet losses similar to what is known from
   pulse code modulation (PCM) with packet loss concealment (PLC), such
   as the ITU-T G.711 standard [4], which operates at a fixed bit rate
   of 64 kbit/s.  At the same time, the described algorithm enables
   fixed bit rate coding with a quality-versus-bit rate tradeoff close
   to state-of-the-art.  A suitable RTP payload format for the iLBC
   codec is specified in [1].

   Some of the applications for which this coder is suitable are real
   time communications such as telephony and videoconferencing,
   streaming audio, archival, and messaging.

   Cable Television Laboratories (CableLabs(R)) has adopted iLBC as a
   mandatory PacketCable(TM) audio codec standard for VoIP over Cable
   applications [3].

   This document is organized as follows.  Section 2 gives a brief
   outline of the codec.  The specific encoder and decoder algorithms
   are explained in sections 3 and 4, respectively.  Appendix A provides
   a C-code reference implementation.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119 [2].

2. Outline of the Codec

   The codec consists of an encoder and a decoder as described in
   sections 2.1 and 2.2, respectively.

   The essence of the codec is LPC and block-based coding of the LPC
   residual signal.  For each 160/240 (20 ms/30 ms) sample block, the
   following major steps are performed: A set of LPC filters is
   computed, and the speech signal is filtered through them to produce
   the residual signal.  The codec uses scalar quantization of the
   dominant part, in terms of energy, of the residual signal for the
   block.  The dominant state is of length 57/58 (20 ms/30 ms) samples
   and forms a start state for dynamic codebooks constructed from the
   already coded parts of the residual signal.  These dynamic codebooks
   are used to code the remaining parts of the residual signal.  By this
   method, coding independence between blocks is achieved, resulting in
   elimination of propagation of perceptual degradations due to packet
   loss.  The method facilitates high-quality packet loss concealment
   (PLC).

2.1. Encoder

   The input to the encoder SHOULD be 16 bit uniform PCM sampled at 8
   kHz.  It SHOULD be partitioned into blocks of BLOCKL=160/240 samples
   for the 20/30 ms frame size.  Each block is divided into NSUB=4/6
   consecutive sub-blocks of SUBL=40 samples each.
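
   As an illustration of the mode-dependent constants named so far, the
   following C sketch derives BLOCKL and NSUB from the frame size and
   checks the quoted bit rates against the bits per block.  The type and
   function names (ilbc_mode_params, ilbc_mode_init, ilbc_bit_rate) are
   hypothetical and are not taken from the reference implementation in
   Appendix A; only the numeric values come from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define FS_HZ 8000   /* sampling rate stated in the text: 8 kHz */
#define SUBL  40     /* sub-block length in samples */

/* Hypothetical container for the per-mode constants (illustration only). */
typedef struct {
    int blockl;      /* BLOCKL: samples per block, 160 or 240 */
    int nsub;        /* NSUB: sub-blocks per block, 4 or 6 */
    int bits;        /* encoded bits per block, 304 or 400 */
    int state_len;   /* start state length in samples, 57 or 58 */
} ilbc_mode_params;

/* Fill in the constants for the 20 ms or 30 ms mode.
   Returns 0 on success, -1 for an unsupported frame size. */
int ilbc_mode_init(int frame_ms, ilbc_mode_params *p)
{
    if (p == NULL || (frame_ms != 20 && frame_ms != 30))
        return -1;

    p->blockl = FS_HZ / 1000 * frame_ms;   /* 8 samples per millisecond */
    assert(p->blockl % SUBL == 0);         /* block splits evenly into sub-blocks */
    p->nsub = p->blockl / SUBL;
    p->bits = (frame_ms == 20) ? 304 : 400;
    p->state_len = (frame_ms == 20) ? 57 : 58;
    return 0;
}

/* Payload bit rate in bit/s: bits per block divided by block duration. */
double ilbc_bit_rate(int frame_ms, const ilbc_mode_params *p)
{
    return p->bits * 1000.0 / frame_ms;
}
```

   For example, ilbc_mode_init(30, &p) gives BLOCKL=240 and NSUB=6, and
   ilbc_bit_rate(30, &p) evaluates to roughly 13333 bit/s, which matches
   the 13.33 kbit/s quoted for the 30 ms mode.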
   For 30 ms frame size, the encoder performs two LPC_FILTERORDER=10
   linear-predictive coding (LPC) analyses.  The first analysis applies
   a smooth window centered over the second sub-block and extending to
   the middle of the fifth sub-block.  The second LPC analysis applies a
   smooth asymmetric window centered over the fifth sub-block and
   extending to the end of the sixth sub-block.  For 20 ms frame size,
   one LPC_FILTERORDER=10 linear-predictive coding (LPC) analysis is
   performed with a smooth window centered over the third sub-block.

   For each of the LPC analyses, a set of line-spectral frequencies
   (LSFs) is obtained, quantized, and interpolated to obtain LSF
   coefficients for each sub-block.  Subsequently, the LPC residual is
   computed by using the quantized and interpolated LPC analysis
   filters.

   The two consecutive sub-blocks of the residual exhibiting the maximal
   weighted energy are identified.  Within these two sub-blocks, the
   start state (segment) is selected from two choices: the first 57/58
   samples or the last 57/58 samples of the two consecutive sub-blocks.
   The selected segment is the one of higher energy.  The start state is
   encoded with scalar quantization.
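
   The two-step start state selection described above can be sketched in
   C as follows.  This is a simplified illustration, not the reference
   implementation: it assumes plain (unweighted) sample energy, whereas
   the codec uses a weighted energy whose weights are not given in this
   part of the document, and it resolves ties in favor of the earlier
   candidate, which is also an assumption.  The function names are
   hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SUBL 40   /* sub-block length in samples, from the text */

/* Sum of squares of n samples. */
static double energy(const float *x, size_t n)
{
    double e = 0.0;
    for (size_t i = 0; i < n; i++)
        e += (double)x[i] * x[i];
    return e;
}

/*
 * Locate the start state in a residual block of nsub sub-blocks.
 * ASSUMPTION: unweighted energy is used in step 1; the codec itself
 * specifies a weighted energy.  Ties go to the earlier candidate.
 * Returns the sample offset of the start state within the block.
 */
size_t find_start_state(const float *residual, int nsub, int state_len)
{
    assert(residual != NULL && nsub >= 2);
    assert(state_len > 0 && state_len <= 2 * SUBL);

    /* Step 1: pair of consecutive sub-blocks with the highest energy. */
    int best = 0;
    double best_e = -1.0;
    for (int n = 0; n + 1 < nsub; n++) {
        double e = energy(residual + (size_t)n * SUBL, 2 * SUBL);
        if (e > best_e) {
            best_e = e;
            best = n;
        }
    }

    /* Step 2: first or last state_len samples of that 80-sample span. */
    size_t first = (size_t)best * SUBL;
    size_t last = first + 2 * SUBL - (size_t)state_len;
    double e_first = energy(residual + first, (size_t)state_len);
    double e_last = energy(residual + last, (size_t)state_len);

    return (e_last > e_first) ? last : first;
}
```

   With nsub=4 and state_len=57 (20 ms mode) the two candidates inside
   the chosen 80-sample span overlap by 34 samples, so the choice only
   decides which 23 samples at either end are left for the codebook
   stages.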