📄 draft-ietf-avt-ilbc-codec-05.txt

   coefficients with the following window:
   
         lpc_lagwinTbl[0] = 1.0001; 
         lpc_lagwinTbl[i] = exp(-0.5 * ((2 * PI * 60.0 * i) /FS)^2); 
                  i=1,...,LPC_FILTERORDER
                  where FS=8000 is the sampling frequency
   
   Then, the windowed acf function acf1_win is obtained by:
   
         acf1_win[i] = acf1[i] * lpc_lagwinTbl[i];
                  i=0,...,LPC_FILTERORDER
   
   The second set of autocorrelation coefficients, acf2_win, is
   obtained in a similar manner. The window, lpc_asymwinTbl, is applied
   to samples 60 through 299, i.e., the entire current block. The
   window consists of two segments; the first (samples 0 to 219) being
   half a Hanning window with length 440 and the second being a quarter
   of a cycle of a cosine wave. By using this asymmetric window, an LPC
   analysis centered in the fifth sub-block is obtained without the
   need for any look-ahead, which would have added delay. The
   asymmetric window is defined as:
   
         lpc_asymwinTbl[i] = (sin(PI * (i + 1) / 441))^2; i=0,...,219
         lpc_asymwinTbl[i] = cos((i - 220) * PI / 40); i=220,...,239
   



   
   Andersen et. al.  Experimental - Expires November 29th, 2004     10
                     Internet Low Bit Rate Codec               May 04
   
   and the windowed speech is computed by:
   
         speech_hp_win2[i] = speech_hp[i + LPC_LOOKBACK] *
                  lpc_asymwinTbl[i];  i=0,...,BLOCKL-1
   
   The windowed autocorrelation coefficients are then obtained in
   exactly the same way as for the first analysis instance.
   
   The windows lpc_winTbl, lpc_asymwinTbl, and lpc_lagwinTbl are
   typically generated in advance and stored in ROM rather than
   recomputed for every block.


 3.2.2 Computation of LPC Coefficients
   
   From the 2 x 11 smoothed autocorrelation coefficients, acf1_win and
   acf2_win, the 2 x 11 LPC coefficients, lp1 and lp2, are calculated
   in the same way for both analysis locations using the well-known
   Levinson-Durbin recursion. The first LPC coefficient is always 1.0,
   resulting in 10 unique coefficients.
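
   For readers unfamiliar with the recursion, the following is a
   textbook sketch of Levinson-Durbin for the sign convention
   A(z) = 1 + sum a[i] * z^(-i). It is not the Appendix A code; the
   function name, the double type, and the error handling are
   assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define LPC_FILTERORDER 10

/*
 * r: autocorrelation r[0..order]; a: output a[0..order], a[0] = 1.0.
 * Returns the final prediction error energy, or -1.0 if r[0] <= 0.
 */
double levinson_durbin(const double *r, double *a, int order)
{
    double tmp[LPC_FILTERORDER + 1];
    double err;

    assert(order >= 1 && order <= LPC_FILTERORDER);
    if (r[0] <= 0.0)
        return -1.0;

    a[0] = 1.0;
    for (int i = 1; i <= order; i++)
        a[i] = 0.0;
    err = r[0];

    for (int m = 1; m <= order && err > 0.0; m++) {
        double acc = r[m];
        for (int i = 1; i < m; i++)
            acc += a[i] * r[m - i];
        double k = -acc / err;               /* reflection coefficient */

        /* Update a[1..m-1] from a copy of the previous-order solution. */
        memcpy(tmp, a, sizeof(double) * (size_t)(order + 1));
        for (int i = 1; i < m; i++)
            a[i] = tmp[i] + k * tmp[m - i];
        a[m] = k;
        err *= (1.0 - k * k);
    }
    return err;
}
```

   Calling it once with acf1_win and once with acf2_win, both with
   order = LPC_FILTERORDER, would yield lp1 and lp2.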
   
   After determining the LPC coefficients, a bandwidth expansion
   procedure is applied in order to smooth the spectral peaks in the
   short-term spectrum. The bandwidth addition is obtained by the
   following modification of the LPC coefficients:
   
         lp1_bw[i] = lp1[i] * chirp^i; i=0,...,LPC_FILTERORDER
         lp2_bw[i] = lp2[i] * chirp^i; i=0,...,LPC_FILTERORDER
   
   where "chirp" is a real number between 0 and 1. It is RECOMMENDED to
   use a value of 0.9.
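
   The bandwidth expansion above amounts to one multiply per
   coefficient. A minimal sketch (the function name bw_expand is
   hypothetical):

```c
#include <assert.h>

#define LPC_FILTERORDER 10

/* lp_bw[i] = lp[i] * chirp^i for i = 0..LPC_FILTERORDER */
void bw_expand(const double *lp, double chirp, double *lp_bw)
{
    double gain = 1.0;       /* running value of chirp^i */

    assert(chirp > 0.0 && chirp < 1.0);
    for (int i = 0; i <= LPC_FILTERORDER; i++) {
        lp_bw[i] = lp[i] * gain;
        gain *= chirp;
    }
}
```

   Because chirp^0 = 1, the leading coefficient stays at 1.0, while
   higher-order coefficients are attenuated progressively.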


 3.2.3 Computation of LSF Coefficients from LPC Coefficients
   
   Thus far, two sets of LPC coefficients that represent the short-term
   spectral characteristics of the speech signal for two different time
   locations within the current block have been determined. These
   coefficients SHOULD be quantized and interpolated. Before doing so,
   it is advantageous to convert the LPC parameters into another type
   of representation called Line Spectral Frequencies (LSF). The LSF
   parameters are used because they are better suited for quantization
   and interpolation than the regular LPC coefficients. Many
   computationally efficient methods for calculating the LSFs from the
   LPC coefficients have been proposed in the literature. The detailed
   implementation of one applicable method can be found in Appendix
   A.26. The two arrays of LSF coefficients obtained, lsf1 and lsf2,
   are of dimension 10 (LPC_FILTERORDER).


 3.2.4 Quantization of LSF Coefficients
   
   Since the LPC filters defined by the two sets of LSFs are needed
   also in the decoder, the LSF parameters need to be quantized and
   transmitted as side information. The total number of bits required
   to represent the quantization of the two LSF representations for one
   block of speech is 40 with 20 bits used for each of lsf1 and lsf2.
   
   For computational and storage reasons, the LSF vectors are quantized
   using 3-split vector quantization (VQ). That is, the LSF vectors are
   split into three sub-vectors which are each quantized with a regular
   VQ. The quantized versions of lsf1 and lsf2, qlsf1 and qlsf2, are
   obtained by using the same memoryless split VQ.  The length of each
   of these two LSF vectors is 10 and they are split into 3 sub-vectors
   containing 3, 3 and 4 values respectively.
   
   For each of the sub-vectors, a separate codebook of quantized values
   has been designed using a standard VQ training method for a large
   database containing speech from a large number of speakers recorded
   under various conditions. The size of each of the three codebooks
   associated with the split definitions above is:
   
        int size_lsfCbTbl[LSF_NSPLIT] = {64,128,128};
   
   The actual values of the vector quantization codebook that must be
   used can be found in the reference code of Appendix A. Both sets of
   LSF coefficients, lsf1 and lsf2, are quantized with a standard
   memoryless split vector quantization (VQ) structure using the
   squared error criterion in the LSF domain. The split VQ quantization
   consists of the following steps:
   
   1) Quantize the first 3 LSF coefficients (1 - 3) with a VQ codebook
   of size 64.
   2) Quantize the LSF coefficients 4, 5, and 6 with a VQ codebook of
   size 128.
   3) Quantize the last 4 LSF coefficients (7 - 10) with a VQ codebook
   of size 128.
   
   This procedure, repeated for lsf1 and lsf2, gives 6 quantization
   indices and the quantized sets of LSF coefficients qlsf1 and qlsf2.
   Each set of three indices is encoded with 6 + 7 + 7 = 20 bits. The
   total number of bits used for LSF quantization in a block is thus 40
   bits. 
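
   The three steps above can be sketched as a nearest-neighbor search
   per split. This is an illustration only: the function names are
   hypothetical, the codebooks are passed in by the caller, and the
   normative codebook values remain those of Appendix A.

```c
#include <assert.h>
#include <float.h>

#define LSF_NSPLIT 3

/*
 * Search one split. cb holds cb_size vectors of length dim, stored
 * back to back. Returns the index of the entry with the smallest
 * squared error and copies that entry to q.
 */
int vq_search(const double *x, const double *cb, int cb_size, int dim,
              double *q)
{
    int best = 0;
    double best_err = DBL_MAX;

    assert(cb_size > 0 && dim > 0);
    for (int j = 0; j < cb_size; j++) {
        double err = 0.0;
        for (int i = 0; i < dim; i++) {
            double d = x[i] - cb[j * dim + i];
            err += d * d;
        }
        if (err < best_err) {
            best_err = err;
            best = j;
        }
    }
    for (int i = 0; i < dim; i++)
        q[i] = cb[best * dim + i];
    return best;
}

/* Quantize one 10-dimensional LSF vector as splits of 3, 3 and 4. */
void split_vq(const double *lsf, const double *const *cbs,
              const int *cb_sizes, double *qlsf, int *idx)
{
    static const int dims[LSF_NSPLIT] = {3, 3, 4};
    int pos = 0;

    for (int s = 0; s < LSF_NSPLIT; s++) {
        idx[s] = vq_search(lsf + pos, cbs[s], cb_sizes[s], dims[s],
                           qlsf + pos);
        pos += dims[s];
    }
}
```

   With codebook sizes of 64, 128 and 128, the three indices fit in
   6, 7 and 7 bits, which matches the 20 bits per LSF vector stated
   above.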


 3.2.5 Stability Check of LSF Coefficients
   
   The LSF representation of the LPC filter has the nice property that
   the coefficients are ordered by increasing value, i.e., lsf(n-1) <
   lsf(n), 0 < n < 10, if the corresponding synthesis filter is stable.
   Since we are employing a split VQ scheme it is possible that at the
   split boundaries the LSF coefficients are not ordered correctly and
   hence the corresponding LP filter is unstable. To ensure that the
   filter used is stable, a stability check is performed for the
   quantized LSF vectors. If it turns out that the coefficients are not
   ordered appropriately (with a safety margin of 50 Hz to ensure that
   formant peaks are not too narrow) they will be moved apart. The
   detailed method for this can be found in Appendix A.40. The same
   procedure is performed in the decoder. This ensures that exactly the
   same LSF representations are used in both encoder and decoder.
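
   To illustrate the idea of moving badly ordered coefficients apart,
   here is a simplified sketch. It is NOT the method of Appendix A.40:
   the function name, the symmetric move about the midpoint, and the
   number of passes are all assumptions. If the LSFs are expressed in
   radians (as the magnitudes of lsfmeanTbl suggest), a 50 Hz margin at
   8000 Hz sampling corresponds to 2 * PI * 50 / 8000, about 0.039.

```c
#include <assert.h>

/*
 * Enforce lsf[n] - lsf[n-1] >= margin by moving each offending pair
 * symmetrically about its midpoint. Returns 1 if any value changed.
 */
int lsf_enforce_gap(double *lsf, int dim, double margin, int passes)
{
    int changed = 0;

    assert(dim >= 2 && margin > 0.0 && passes >= 1);
    for (int p = 0; p < passes; p++) {
        for (int n = 1; n < dim; n++) {
            if (lsf[n] - lsf[n - 1] < margin) {
                double mid = 0.5 * (lsf[n] + lsf[n - 1]);
                lsf[n - 1] = mid - 0.5 * margin;
                lsf[n]     = mid + 0.5 * margin;
                changed = 1;
            }
        }
    }
    return changed;
}
```

   Moving one pair can bring a coefficient too close to its other
   neighbor, which is why this sketch takes a pass count. Whatever
   procedure is used, encoder and decoder must run the identical one
   so that both end up with the same LSF values.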


 3.2.6 Interpolation of LSF Coefficients
   
   
   From the two sets of LSF coefficients that are computed for each
   block of speech, different LSFs are obtained for each sub-block by
   means of interpolation. This procedure is performed for the original
   LSFs (lsf1 and lsf2), as well as the quantized versions qlsf1 and
   qlsf2 since both versions are used in the encoder. Here follows a
   brief summary of the interpolation scheme; the details are found
   in the C code of Appendix A. In the first sub-block, the
   average of the second LSF vector from the previous block and the
   first LSF vector in the current block is used. For sub-blocks two
   through five the LSFs used are obtained by linear interpolation from
   lsf1 (and qlsf1) to lsf2 (and qlsf2) with lsf1 used in sub-block two
   and lsf2 in sub-block five. In the last sub-block, lsf2 is used. For
   the very first block it is assumed that the last LSF vector of the
   previous block is equal to a predefined vector, lsfmeanTbl, that was
   obtained by calculating the mean LSF vector of the LSF design
   database.
   
   lsfmeanTbl[LPC_FILTERORDER] = {0.281738, 0.445801, 0.663330, 
                  0.962524, 1.251831, 1.533081, 1.850586, 2.137817,
                  2.481445, 2.777344}
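
   The per-sub-block scheme summarized above can be sketched as
   follows. The evenly spaced weights 0, 1/3, 2/3, 1 for sub-blocks two
   through five are an assumption derived from "linear interpolation"
   with lsf1 at sub-block two and lsf2 at sub-block five; the function
   names and NSUB are hypothetical. The authoritative details are in
   Appendix A.

```c
#include <assert.h>

#define LPC_FILTERORDER 10
#define NSUB 6               /* sub-blocks per 30 ms block */

/* out = (1 - w) * from + w * to, element by element */
void lsf_interp(const double *from, const double *to, double w, double *out)
{
    assert(w >= 0.0 && w <= 1.0);
    for (int i = 0; i < LPC_FILTERORDER; i++)
        out[i] = (1.0 - w) * from[i] + w * to[i];
}

/* One LSF vector per sub-block from the previous lsf2 and current lsf1, lsf2. */
void lsf_interp_30ms(const double *lsf2_prev, const double *lsf1,
                     const double *lsf2, double out[NSUB][LPC_FILTERORDER])
{
    lsf_interp(lsf2_prev, lsf1, 0.5, out[0]);         /* sub-block 1: average */
    for (int k = 1; k <= 4; k++)                      /* sub-blocks 2..5 */
        lsf_interp(lsf1, lsf2, (k - 1) / 3.0, out[k]);
    lsf_interp(lsf1, lsf2, 1.0, out[5]);              /* sub-block 6: lsf2 */
}
```

   For the very first block, lsf2_prev would be lsfmeanTbl. The same
   routine applies unchanged to the quantized vectors qlsf1 and qlsf2.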
   
   The interpolation method is standard linear interpolation in the LSF
   domain. The interpolated LSF values are converted to LPC
   coefficients for each sub-block. The unquantized and quantized LPC
   coefficients form two sets of filters respectively. The unquantized
   analysis filter for sub-block k:
   
                ___
                \
      Ak(z)= 1 + > ak(i)*z^(-i)
                /__
             i=1...LPC_FILTERORDER
   
   And the quantized analysis filter for sub-block k:
                 ___
                 \
      A~k(z)= 1 + > a~k(i)*z^(-i)
                 /__
             i=1...LPC_FILTERORDER
   
   A reference implementation of the LSF encoding is given in Appendix
   A.38. A reference implementation of the corresponding decoding can
   be found in Appendix A.36.
  
 3.2.7 LPC Analysis and Quantization for 20 ms frames
 
   As stated before, the codec only calculates one set of LPC
   parameters for the 20 ms frame size as opposed to two sets for 30 ms
   frames. A single set of autocorrelation coefficients is calculated
   on the LPC_LOOKBACK + BLOCKL = 80 + 160 = 240 samples. These samples
   are windowed with the asymmetric window lpc_asymwinTbl, centered
   over the third sub-frame, to form speech_hp_win. Autocorrelation
   coefficients, acf, are calculated on the 240 samples in
   speech_hp_win and then windowed exactly as in 3.2.1 (resulting in
   acf_win).
   
   This single set of windowed autocorrelation coefficients is used to
   calculate LPC coefficients, LSF coefficients and quantized LSF
   coefficients in exactly the same manner as in 3.2.2 to 3.2.4. As for
   the 30 ms frame size, the 10 LSF coefficients are divided into three
   sub-vectors of size 3, 3, 4 and quantized using the same scheme and
   codebook as in 3.2.4 to finally get 3 quantization indices. The
   quantized LSF coefficients are stabilized with the algorithm
   described in 3.2.5.
   
   From the set of LSF coefficients that was computed for this block
   together with the LSF coefficients from the previous block,
   different LSFs are obtained for each sub-block by means of
   interpolation. The interpolation is done linearly in the LSF domain
   over the 4 sub-blocks, so that the n-th sub-frame uses the weight
   (4-n)/4 for the LSF from the old frame and the weight n/4 for the
   LSF from the current frame. For the very first block the mean LSF,
   lsfmeanTbl, is used as the LSF from the previous block. Similar to
   3.2.6, both unquantized, A(z), and quantized, A~(z), analysis
   filters are calculated for each of the four sub-blocks.
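
   The 20 ms weighting rule can be written directly from the formula
   above. In this sketch n is assumed to run from 1 to 4, so that the
   last sub-frame uses the current LSF unchanged; the function name is
   hypothetical.

```c
#include <assert.h>

#define LPC_FILTERORDER 10

/* LSF vector for sub-frame n (n = 1..4) of a 20 ms frame. */
void lsf_interp_20ms(const double *lsf_old, const double *lsf_cur, int n,
                     double *out)
{
    assert(n >= 1 && n <= 4);
    double w_old = (4 - n) / 4.0;    /* weight of the previous frame's LSF */
    double w_cur = n / 4.0;          /* weight of the current frame's LSF */

    for (int i = 0; i < LPC_FILTERORDER; i++)
        out[i] = w_old * lsf_old[i] + w_cur * lsf_cur[i];
}
```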


3.3 Calculation of the Residual
   
   The block of speech samples is filtered by the quantized and
   interpolated LPC analysis filters to yield the residual signal. In
   particular, the corresponding LPC analysis filter for each 40 sample
   sub-block is used to filter the speech samples for the same sub-
   block. The filter memory at the end of each sub-block is carried
   over to the LPC filter of the next sub-block.  The signal at the
   output of each LP analysis filter constitutes the residual signal
   for the corresponding sub-block.
   
   A reference implementation of the LPC analysis filters is given in
   Appendix A.10.
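
   As an illustration of filtering with memory carried across
   sub-blocks, here is a minimal sketch. It is not the Appendix A.10
   code; the name analysis_filter, the constant name SUBL, and the
   memory layout (most recent sample first) are assumptions.

```c
#include <assert.h>

#define LPC_FILTERORDER 10
#define SUBL 40              /* samples per sub-block */

/*
 * res[n] = x[n] + sum_{i=1..LPC_FILTERORDER} a[i] * x[n-i] for one
 * sub-block. mem holds the last LPC_FILTERORDER input samples (mem[0]
 * is the most recent) and is updated so that the next sub-block
 * continues from where this one ended.
 */
void analysis_filter(const double *x, const double *a, double *mem,
                     double *res)
{
    assert(x && a && mem && res);
    for (int n = 0; n < SUBL; n++) {
        double acc = x[n];
        for (int i = 1; i <= LPC_FILTERORDER; i++)
            acc += a[i] * mem[i - 1];
        for (int i = LPC_FILTERORDER - 1; i > 0; i--)
            mem[i] = mem[i - 1];
        mem[0] = x[n];
        res[n] = acc;
    }
}
```

   Calling the function once per sub-block with that sub-block's
   quantized coefficients, and the same mem array throughout, gives
   the residual for the whole block.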


3.4 Perceptual Weighting Filter
   
   In principle any good design of a perceptual weighting filter can be
   applied in the encoder without compromising this codec definition.
   It is however RECOMMENDED to use the perceptual weighting filter
   specified below:
   
      Weighting filter for sub-block k:
   
      Wk(z)=1/Ak(z/LPC_CHIRP_WEIGHTDENUM), where
                               LPC_CHIRP_WEIGHTDENUM = 0.4222
   
   This is a simple design with low complexity that is applied in the
   LPC residual domain. Here Ak(z) is the filter obtained from
   unquantized but interpolated LSF coefficients.
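
   Substituting z/g for z in Ak(z) = 1 + sum ak(i) * z^(-i) scales the
   i-th coefficient by g^i, so Wk(z) is an all-pole filter with
   coefficients ak(i) * g^i. The following sketch shows this under
   assumed names (weight_coeffs, weighting_filter) and an assumed
   memory layout; it is not taken from Appendix A.

```c
#include <assert.h>

#define LPC_FILTERORDER 10
#define LPC_CHIRP_WEIGHTDENUM 0.4222

/* aw[i] = a[i] * gamma^i, the coefficients of A(z/gamma) */
void weight_coeffs(const double *a, double gamma, double *aw)
{
    double g = 1.0;          /* running value of gamma^i */

    for (int i = 0; i <= LPC_FILTERORDER; i++) {
        aw[i] = a[i] * g;
        g *= gamma;
    }
}

/*
 * All-pole filter 1/A(z/gamma): y[n] = x[n] - sum aw[i] * y[n-i].
 * mem holds the last LPC_FILTERORDER outputs, most recent first.
 */
void weighting_filter(const double *x, int len, const double *aw,
                      double *mem, double *y)
{
    assert(len >= 0);
    for (int n = 0; n < len; n++) {
        double acc = x[n];
        for (int i = 1; i <= LPC_FILTERORDER; i++)
            acc -= aw[i] * mem[i - 1];
        for (int i = LPC_FILTERORDER - 1; i > 0; i--)
            mem[i] = mem[i - 1];
        mem[0] = acc;
        y[n] = acc;
    }
}
```

   In use, weight_coeffs would be called with gamma set to
   LPC_CHIRP_WEIGHTDENUM and the unquantized, interpolated
   coefficients of sub-block k.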



   
3.5 Start State Encoder 
   
   The start state is quantized using a common 6-bit scalar quantizer