Network Working Group                                        S. Andersen
Request for Comments: 3951                            Aalborg University
Category: Experimental                                          A. Duric
                                                                   Telio
                                                               H. Astrom
                                                                R. Hagen
                                                               W. Kleijn
                                                               J. Linden
                                                         Global IP Sound
                                                           December 2004
Internet Low Bit Rate Codec (iLBC)
Status of this Memo
This memo defines an Experimental Protocol for the Internet
community. It does not specify an Internet standard of any kind.
Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2004).
Abstract
This document specifies a speech codec suitable for robust voice
communication over IP. The codec is developed by Global IP Sound
(GIPS). It is designed for narrow band speech and results in a
payload bit rate of 13.33 kbit/s for 30 ms frames and 15.20 kbit/s
for 20 ms frames. The codec enables graceful speech quality
degradation in the case of lost frames, which occur in connection
with lost or delayed IP packets.
Andersen, et al. Experimental [Page 1]
RFC 3951 Internet Low Bit Rate Codec December 2004
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Outline of the Codec . . . . . . . . . . . . . . . . . . . . . 5
2.1. Encoder. . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2. Decoder. . . . . . . . . . . . . . . . . . . . . . . . . 7
3. Encoder Principles . . . . . . . . . . . . . . . . . . . . . . 7
3.1. Pre-processing . . . . . . . . . . . . . . . . . . . . . 9
3.2. LPC Analysis and Quantization. . . . . . . . . . . . . . 9
3.2.1. Computation of Autocorrelation Coefficients. . . 10
3.2.2. Computation of LPC Coefficients. . . . . . . . . 11
3.2.3. Computation of LSF Coefficients from LPC
Coefficients . . . . . . . . . . . . . . . . . . 11
3.2.4. Quantization of LSF Coefficients . . . . . . . . 12
3.2.5. Stability Check of LSF Coefficients. . . . . . . 13
3.2.6. Interpolation of LSF Coefficients. . . . . . . . 13
3.2.7. LPC Analysis and Quantization for 20 ms Frames . 14
3.3. Calculation of the Residual. . . . . . . . . . . . . . . 15
3.4. Perceptual Weighting Filter. . . . . . . . . . . . . . . 15
3.5. Start State Encoder. . . . . . . . . . . . . . . . . . . 15
3.5.1. Start State Estimation . . . . . . . . . . . . . 16
3.5.2. All-Pass Filtering and Scale Quantization. . . . 17
3.5.3. Scalar Quantization. . . . . . . . . . . . . . . 18
3.6. Encoding the Remaining Samples . . . . . . . . . . . . . 19
3.6.1. Codebook Memory. . . . . . . . . . . . . . . . . 20
3.6.2. Perceptual Weighting of Codebook Memory
and Target . . . . . . . . . . . . . . . . . . . 22
3.6.3. Codebook Creation. . . . . . . . . . . . . . . . 23
3.6.3.1. Creation of a Base Codebook . . . . . . 23
3.6.3.2. Codebook Expansion. . . . . . . . . . . 24
3.6.3.3. Codebook Augmentation . . . . . . . . . 24
3.6.4. Codebook Search. . . . . . . . . . . . . . . . . 26
3.6.4.1. Codebook Search at Each Stage . . . . . 26
3.6.4.2. Gain Quantization at Each Stage . . . . 27
3.6.4.3. Preparation of Target for Next Stage. . 28
3.7. Gain Correction Encoding . . . . . . . . . . . . . . . . 28
3.8. Bitstream Definition . . . . . . . . . . . . . . . . . . 29
4. Decoder Principles . . . . . . . . . . . . . . . . . . . . . . 32
4.1. LPC Filter Reconstruction. . . . . . . . . . . . . . . . 33
4.2. Start State Reconstruction . . . . . . . . . . . . . . . 33
4.3. Excitation Decoding Loop . . . . . . . . . . . . . . . . 34
4.4. Multistage Adaptive Codebook Decoding. . . . . . . . . . 35
4.4.1. Construction of the Decoded Excitation Signal. . 35
4.5. Packet Loss Concealment. . . . . . . . . . . . . . . . . 35
4.5.1. Block Received Correctly and Previous Block
Also Received. . . . . . . . . . . . . . . . . . 35
4.5.2. Block Not Received . . . . . . . . . . . . . . . 36
4.5.3. Block Received Correctly When Previous Block
Not Received . . . . . . . . . . . . . . . . . . 36
4.6. Enhancement. . . . . . . . . . . . . . . . . . . . . . . 37
4.6.1. Estimating the Pitch . . . . . . . . . . . . . . 39
4.6.2. Determination of the Pitch-Synchronous
Sequences. . . . . . . . . . . . . . . . . . . . 39
4.6.3. Calculation of the Smoothed Excitation . . . . . 41
4.6.4. Enhancer Criterion . . . . . . . . . . . . . . . 41
4.6.5. Enhancing the Excitation . . . . . . . . . . . . 42
4.7. Synthesis Filtering. . . . . . . . . . . . . . . . . . . 43
4.8. Post Filtering . . . . . . . . . . . . . . . . . . . . . 43
5. Security Considerations. . . . . . . . . . . . . . . . . . . . 43
6. Evaluation of the iLBC Implementations . . . . . . . . . . . . 43
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 43
7.1. Normative References . . . . . . . . . . . . . . . . . . 43
7.2. Informative References . . . . . . . . . . . . . . . . . 44
8. ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . 44
APPENDIX A: Reference Implementation . . . . . . . . . . . . . . . 45
A.1. iLBC_test.c. . . . . . . . . . . . . . . . . . . . . . . 46
A.2. iLBC_encode.h. . . . . . . . . . . . . . . . . . . . . . 52
A.3. iLBC_encode.c. . . . . . . . . . . . . . . . . . . . . . 53
A.4. iLBC_decode.h. . . . . . . . . . . . . . . . . . . . . . 63
A.5. iLBC_decode.c. . . . . . . . . . . . . . . . . . . . . . 64
A.6. iLBC_define.h. . . . . . . . . . . . . . . . . . . . . . 76
A.7. constants.h. . . . . . . . . . . . . . . . . . . . . . . 80
A.8. constants.c. . . . . . . . . . . . . . . . . . . . . . . 82
A.9. anaFilter.h. . . . . . . . . . . . . . . . . . . . . . . 96
A.10. anaFilter.c. . . . . . . . . . . . . . . . . . . . . . . 97
A.11. createCB.h . . . . . . . . . . . . . . . . . . . . . . . 98
A.12. createCB.c . . . . . . . . . . . . . . . . . . . . . . . 99
A.13. doCPLC.h . . . . . . . . . . . . . . . . . . . . . . . .104
A.14. doCPLC.c . . . . . . . . . . . . . . . . . . . . . . . .104
A.15. enhancer.h . . . . . . . . . . . . . . . . . . . . . . .109
A.16. enhancer.c . . . . . . . . . . . . . . . . . . . . . . .110
A.17. filter.h . . . . . . . . . . . . . . . . . . . . . . . .123
A.18. filter.c . . . . . . . . . . . . . . . . . . . . . . . .125
A.19. FrameClassify.h. . . . . . . . . . . . . . . . . . . . .128
A.20. FrameClassify.c. . . . . . . . . . . . . . . . . . . . .129
A.21. gainquant.h. . . . . . . . . . . . . . . . . . . . . . .131
A.22. gainquant.c. . . . . . . . . . . . . . . . . . . . . . .131
A.23. getCBvec.h . . . . . . . . . . . . . . . . . . . . . . .134
A.24. getCBvec.c . . . . . . . . . . . . . . . . . . . . . . .134
A.25. helpfun.h. . . . . . . . . . . . . . . . . . . . . . . .138
A.26. helpfun.c. . . . . . . . . . . . . . . . . . . . . . . .140
A.27. hpInput.h. . . . . . . . . . . . . . . . . . . . . . . .146
A.28. hpInput.c. . . . . . . . . . . . . . . . . . . . . . . .146
A.29. hpOutput.h . . . . . . . . . . . . . . . . . . . . . . .148
A.30. hpOutput.c . . . . . . . . . . . . . . . . . . . . . . .148
A.31. iCBConstruct.h . . . . . . . . . . . . . . . . . . . . .149
A.32. iCBConstruct.c . . . . . . . . . . . . . . . . . . . . .150
A.33. iCBSearch.h. . . . . . . . . . . . . . . . . . . . . . .152
A.34. iCBSearch.c. . . . . . . . . . . . . . . . . . . . . . .153
A.35. LPCdecode.h. . . . . . . . . . . . . . . . . . . . . . .163
A.36. LPCdecode.c. . . . . . . . . . . . . . . . . . . . . . .164
A.37. LPCencode.h. . . . . . . . . . . . . . . . . . . . . . .167
A.38. LPCencode.c. . . . . . . . . . . . . . . . . . . . . . .167
A.39. lsf.h. . . . . . . . . . . . . . . . . . . . . . . . . .172
A.40. lsf.c. . . . . . . . . . . . . . . . . . . . . . . . . .172
A.41. packing.h. . . . . . . . . . . . . . . . . . . . . . . .178
A.42. packing.c. . . . . . . . . . . . . . . . . . . . . . . .179
A.43. StateConstructW.h. . . . . . . . . . . . . . . . . . . .182
A.44. StateConstructW.c. . . . . . . . . . . . . . . . . . . .183
A.45. StateSearchW.h . . . . . . . . . . . . . . . . . . . . .185
A.46. StateSearchW.c . . . . . . . . . . . . . . . . . . . . .186
A.47. syntFilter.h . . . . . . . . . . . . . . . . . . . . . .190
A.48. syntFilter.c . . . . . . . . . . . . . . . . . . . . . .190
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . .192
Full Copyright Statement . . . . . . . . . . . . . . . . . . . . .194
1. Introduction
This document contains the description of an algorithm for the coding
of speech signals sampled at 8 kHz. The algorithm, called iLBC, uses
a block-independent linear-predictive coding (LPC) algorithm and has
support for two basic frame lengths: 20 ms at 15.2 kbit/s and 30 ms
at 13.33 kbit/s. When the codec operates at block lengths of 20 ms,
it produces 304 bits per block, which SHOULD be packetized as in [1].
Similarly, for block lengths of 30 ms it produces 400 bits per block,
which SHOULD be packetized as in [1]. The two modes for the
different frame sizes operate in a very similar way. Where they
differ, this is stated explicitly in the text, usually with the
notation x/y, where x refers to the 20 ms mode and y refers to the
30 ms mode.
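
The bit rates quoted above follow directly from the block sizes. The
following is a minimal sketch of that arithmetic, not part of the
reference implementation; the function names are illustrative
assumptions, and only the figures 304/400 bits and 20/30 ms come from
the text.

```c
#include <assert.h>

/* Figures quoted in the text: 304 bits per 20 ms block and 400 bits
 * per 30 ms block. The function names are illustrative only. */

/* Encoded bits per block for a frame length in ms (20 or 30). */
static int ilbc_bits_per_block(int frame_ms)
{
    assert(frame_ms == 20 || frame_ms == 30);
    return (frame_ms == 20) ? 304 : 400;
}

/* Payload size in whole bytes; both block sizes are multiples of 8. */
static int ilbc_bytes_per_block(int frame_ms)
{
    return ilbc_bits_per_block(frame_ms) / 8;
}

/* Payload bit rate in bit/s: bits per block over block duration. */
static double ilbc_bit_rate(int frame_ms)
{
    return ilbc_bits_per_block(frame_ms) * 1000.0 / frame_ms;
}
```

Under these figures, a 20 ms block occupies 38 bytes at 15200 bit/s,
and a 30 ms block occupies 50 bytes at roughly 13333 bit/s.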
The described algorithm results in a speech coding system with a
controlled response to packet losses similar to what is known from
pulse code modulation (PCM) with packet loss concealment (PLC), such
as the ITU-T G.711 standard [4], which operates at a fixed bit rate
of 64 kbit/s. At the same time, the described algorithm enables
fixed bit rate coding with a quality-versus-bit-rate tradeoff close
to the state of the art. A suitable RTP payload format for the iLBC
codec is specified in [1].
Applications for which this codec is suitable include real-time
communications such as telephony and videoconferencing, streaming
audio, archival, and messaging.
Cable Television Laboratories (CableLabs(R)) has adopted iLBC as a
mandatory PacketCable(TM) audio codec standard for VoIP over Cable
applications [3].
This document is organized as follows. Section 2 gives a brief
outline of the codec. The specific encoder and decoder algorithms
are explained in sections 3 and 4, respectively. Appendix A provides
a reference implementation in C.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119 [2].
2. Outline of the Codec
The codec consists of an encoder and a decoder as described in
sections 2.1 and 2.2, respectively.
The essence of the codec is LPC and block-based coding of the LPC
residual signal. For each 160/240 (20 ms/30 ms) sample block, the
following major steps are performed: A set of LPC filters is
computed, and the speech signal is filtered through them to produce
the residual signal. The codec uses scalar quantization of the
dominant part, in terms of energy, of the residual signal for the
block. This dominant part is of length 57/58 (20 ms/30 ms) samples
and forms a start state for dynamic codebooks constructed from the
already coded parts of the residual signal. These dynamic codebooks
are used to code the remaining parts of the residual signal. By this
method, coding independence between blocks is achieved, which
eliminates the propagation of perceptual degradations due to packet
loss. The method facilitates high-quality packet loss concealment
(PLC).
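
The first step above, filtering the speech through an LPC analysis
filter to obtain the residual, can be sketched as a conventional FIR
filter. This is an illustration under stated assumptions rather than
the reference code in Appendix A: the function name, the sign
convention A(z) = 1 + a[1]z^-1 + ... + a[10]z^-10 with a[0] = 1, and
the handling of filter memory are all assumptions made here.

```c
#include <assert.h>
#include <stddef.h>

#define LPC_FILTERORDER 10  /* filter order quoted in section 2.1 */

/* Hypothetical helper: computes out[n] = sum_{k=0..10} a[k]*s[n-k],
 * where s is the input stream and samples before the current block
 * are taken from mem.
 *   in   input samples for this block, length len
 *   a    LPC_FILTERORDER+1 coefficients (a[0] assumed to be 1.0)
 *   mem  the LPC_FILTERORDER samples preceding the block, oldest
 *        first; updated on return for the next call
 *   out  residual, length len; must not alias in */
static void lpc_residual(const float *in, size_t len, const float *a,
                         float *mem, float *out)
{
    assert(in && a && mem && out);

    for (size_t n = 0; n < len; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k <= LPC_FILTERORDER; k++) {
            /* s[n-k]: from this block if available, else from mem */
            float s = (k <= n) ? in[n - k]
                               : mem[LPC_FILTERORDER + n - k];
            acc += a[k] * s;
        }
        out[n] = acc;
    }

    /* New memory is the last LPC_FILTERORDER samples of [mem | in]. */
    for (size_t j = 0; j < LPC_FILTERORDER; j++) {
        size_t t = len + j;
        mem[j] = (t < LPC_FILTERORDER) ? mem[t]
                                       : in[t - LPC_FILTERORDER];
    }
}
```

With the trivial predictor a = {1, -1, 0, ...}, the residual is simply
the difference between consecutive samples, which makes the routine
easy to check by hand.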
2.1. Encoder
The input to the encoder SHOULD be 16-bit uniform PCM sampled at 8
kHz. It SHOULD be partitioned into blocks of BLOCKL=160/240 samples
for the 20/30 ms frame size. Each block is divided into NSUB=4/6
consecutive sub-blocks of SUBL=40 samples each. For the 30 ms frame
size, the encoder performs two LPC_FILTERORDER=10 linear-predictive
coding (LPC) analyses. The first analysis applies a smooth window
centered over the second sub-block and extending to the middle of the
fifth sub-block. The second LPC analysis applies a smooth asymmetric
window centered over the fifth sub-block and extending to the end of
the sixth sub-block. For the 20 ms frame size, one
LPC_FILTERORDER=10 linear-predictive coding (LPC) analysis is
performed with a smooth window centered over the third sub-block.
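
The block layout described above can be summarized in a few lines of
C. This is a sketch for orientation only: the struct, its field
names, and the helper functions are assumptions introduced here, while
the values of BLOCKL, NSUB, and SUBL and the number of LPC analyses
come from the text.

```c
#include <assert.h>

#define SUBL 40  /* samples per sub-block, as stated in the text */

/* Illustrative per-mode layout; field names are not from the RFC. */
typedef struct {
    int frame_ms;      /* frame size in ms: 20 or 30               */
    int blockl;        /* BLOCKL: samples per block, 160 or 240    */
    int nsub;          /* NSUB: sub-blocks per block, 4 or 6       */
    int lpc_analyses;  /* LPC analyses per block, 1 or 2           */
} ilbc_layout;

/* Fill in the layout for the 20 ms or 30 ms mode. */
static ilbc_layout ilbc_layout_for(int frame_ms)
{
    ilbc_layout p;

    assert(frame_ms == 20 || frame_ms == 30);
    p.frame_ms = frame_ms;
    p.nsub = (frame_ms == 20) ? 4 : 6;
    p.blockl = p.nsub * SUBL;
    p.lpc_analyses = (frame_ms == 20) ? 1 : 2;
    return p;
}

/* Index of the first sample of sub-block i (0-based) in the block. */
static int ilbc_subblock_start(const ilbc_layout *p, int i)
{
    assert(i >= 0 && i < p->nsub);
    return i * SUBL;
}
```

Since the input is sampled at 8 kHz, BLOCKL also equals 8 samples per
millisecond times the frame length, which gives a quick consistency
check on the two modes.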
For each of the LPC analyses, a set of line-spectral frequencies
(LSFs) is obtained, quantized, and interpolated to obtain LSF
coefficients for each sub-block. Subsequently, the LPC residual is
computed by using the quantized and interpolated LPC analysis
filters.
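
As a rough illustration of the interpolation step, the sketch below
blends two LSF vectors element by element with a single weight. It
assumes a simple linear blend; the actual per-sub-block weights are
defined in section 3.2.6 and are not reproduced here, so the weight w
and the function name are hypothetical.

```c
#include <assert.h>

#define LPC_FILTERORDER 10  /* LSF vector length, per section 2.1 */

/* Hypothetical helper: out[i] = w * lsf_a[i] + (1 - w) * lsf_b[i].
 * w = 1 returns lsf_a unchanged and w = 0 returns lsf_b unchanged.
 * The choice of w for each sub-block is outside this sketch. */
static void lsf_interpolate(const float *lsf_a, const float *lsf_b,
                            float w, float *out)
{
    assert(w >= 0.0f && w <= 1.0f);
    for (int i = 0; i < LPC_FILTERORDER; i++) {
        out[i] = w * lsf_a[i] + (1.0f - w) * lsf_b[i];
    }
}
```

A caller would invoke this once per sub-block with that sub-block's
weight and then convert the interpolated LSFs to LPC coefficients
before filtering.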