Andersen, et al.              Experimental                     [Page 16]

RFC 3951              Internet Low Bit Rate Codec          December 2004


   Hence, in both cases, index 0 is not used.  In order to shorten the
   start state for bit rate efficiency, the start state is brought
   down to STATE_SHORT_LEN=57 samples for 20 ms frames and
   STATE_SHORT_LEN=58 samples for 30 ms frames.  The power of the
   first 23/22 and last 23/22 samples of the two sub-frame blocks
   identified above is computed as the sum of the squared signal
   sample values, and the 23/22-sample segment with the lowest power
   is excluded from the start state.  One bit is transmitted to
   indicate which of the two possible 57/58 sample segments is used.
   The start state position within the two sub-frames determined
   above, state_first, MUST be encoded as

      state_first=1: start state is first STATE_SHORT_LEN samples
      state_first=0: start state is last STATE_SHORT_LEN samples

3.5.2.  All-Pass Filtering and Scale Quantization

   The block of residual samples in the start state is first filtered
   by an all-pass filter with the quantized LPC coefficients as
   denominator and reversed quantized LPC coefficients as numerator.
   The purpose of this phase-dispersion filter is to get a more even
   distribution of the sample values in the residual signal.  The
   filtering is performed by circular convolution, where the initial
   filter memory is set to zero.

      res(0..(STATE_SHORT_LEN-1)) = uncoded start state residual
      res((STATE_SHORT_LEN)..(2*STATE_SHORT_LEN-1)) = 0

      Pk(z) = A~rk(z)/A~k(z), where

                                       ___
                                       \
      A~rk(z) = z^(-LPC_FILTERORDER) +  > a~k(i+1)*z^(i-(LPC_FILTERORDER-1))
                                       /__
                                     i=0...(LPC_FILTERORDER-1)

      and A~k(z) is taken from the block where the start state begins

      res -> Pk(z) -> filtered

      ccres(k) = filtered(k) + filtered(k+STATE_SHORT_LEN),
                                             k=0..(STATE_SHORT_LEN-1)

   The all-pass filtered block is searched for its largest magnitude
   sample.  The 10-logarithm of this magnitude is quantized with a
   6-bit quantizer, state_frgqTbl, by finding the nearest
   representation.
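The circular-convolution filtering and the 6-bit scale quantization described above can be sketched in code as follows.  This is an illustrative Python sketch, not the reference C implementation the RFC provides; the function names allpass_fold and quantize_max, and the use of plain floating point, are our own assumptions.

```python
import math

def allpass_fold(res, a):
    """Sketch of the all-pass filtering by circular convolution.
    res: the STATE_SHORT_LEN uncoded start-state residual samples.
    a:   quantized LPC coefficients [1, a~(1), ..., a~(order)].
    The residual is zero-padded to twice its length, filtered with
    zero initial memory by A~rk(z)/A~k(z) (coefficients reversed in
    the numerator), and the two halves are added:
    ccres(k) = filtered(k) + filtered(k + STATE_SHORT_LEN)."""
    n_len = len(res)
    order = len(a) - 1
    x = list(res) + [0.0] * n_len        # res(N..2N-1) = 0
    y = [0.0] * (2 * n_len)
    for n in range(2 * n_len):
        acc = 0.0
        for i in range(order + 1):       # numerator A~rk: reversed coeffs
            if n - i >= 0:
                acc += a[order - i] * x[n - i]
        for i in range(1, order + 1):    # denominator A~k(z)
            if n - i >= 0:
                acc -= a[i] * y[n - i]
        y[n] = acc
    return [y[k] + y[k + n_len] for k in range(n_len)]

def quantize_max(ccres, frgq_tbl):
    """Quantize the 10-logarithm of the largest magnitude by nearest
    match in a 6-bit table (state_frgqTbl) and derive the scaling
    factor scal = 4.5/(10^qmax)."""
    target = math.log10(max(abs(v) for v in ccres))
    idx_for_max = min(range(len(frgq_tbl)),
                      key=lambda i: abs(frgq_tbl[i] - target))
    qmax = frgq_tbl[idx_for_max]
    return idx_for_max, 4.5 / (10.0 ** qmax)
```

For readability the sketch uses a first-order example filter in tests; the codec itself uses LPC_FILTERORDER=10.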
   This results in an index, idxForMax, corresponding to a quantized
   value, qmax.  The all-pass filtered residual samples in the block
   are then multiplied with a scaling factor scal = 4.5/(10^qmax) to
   yield normalized samples.

   state_frgqTbl[64] = {1.000085, 1.071695, 1.140395, 1.206868,
      1.277188, 1.351503, 1.429380, 1.500727, 1.569049, 1.639599,
      1.707071, 1.781531, 1.840799, 1.901550, 1.956695, 2.006750,
      2.055474, 2.102787, 2.142819, 2.183592, 2.217962, 2.257177,
      2.295739, 2.332967, 2.369248, 2.402792, 2.435080, 2.468598,
      2.503394, 2.539284, 2.572944, 2.605036, 2.636331, 2.668939,
      2.698780, 2.729101, 2.759786, 2.789834, 2.818679, 2.848074,
      2.877470, 2.906899, 2.936655, 2.967804, 3.000115, 3.033367,
      3.066355, 3.104231, 3.141499, 3.183012, 3.222952, 3.265433,
      3.308441, 3.350823, 3.395275, 3.442793, 3.490801, 3.542514,
      3.604064, 3.666050, 3.740994, 3.830749, 3.938770, 4.101764}

3.5.3.  Scalar Quantization

   The normalized samples are quantized in the perceptually weighted
   speech domain by a sample-by-sample scalar DPCM quantization as
   depicted in Figure 3.3.  Each sample in the block is filtered by a
   weighting filter Wk(z), specified in section 3.4, to form a
   weighted speech sample x[n].  The target sample d[n] is formed by
   subtracting a predicted sample y[n], where the prediction filter is
   given by Pk(z) = 1 - 1/Wk(z).

               +-------+ x[n]   + d[n] +-----------+  u[n]
   residual -->| Wk(z) |------>(+)---->| Quantizer |------+---> quantized
               +-------+     - /|\     +-----------+      |     residual
                                |                         |
                           y[n] |                        \|/
                +---------------+----------------------->(+)
                |                                         |
                |             +-------+                   |
                +-------------| Pk(z) |<------------------+
                              +-------+

   Figure 3.3.  Quantization of start state samples by DPCM in
   weighted speech domain.

   The coded state sample u[n] is obtained by quantizing d[n] with a
   3-bit quantizer with quantization table state_sq3Tbl.

   state_sq3Tbl[8] = {-3.719849, -2.177490, -1.130005, -0.309692,
                       0.444214,  1.329712,  2.436279,  3.983887}
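The DPCM loop of Figure 3.3 can be sketched as below.  This is an illustrative Python sketch under simplifying assumptions: the prediction filter Pk(z) is stood in for by a caller-supplied callback (hypothetical; the real predictor is derived from the weighting filter Wk(z)), and only the encoder-side index stream is produced.

```python
# state_sq3Tbl from the RFC (3-bit scalar quantizer levels)
STATE_SQ3_TBL = [-3.719849, -2.177490, -1.130005, -0.309692,
                  0.444214,  1.329712,  2.436279,  3.983887]

def sq3_quantize(d):
    """Nearest-neighbour lookup in the 3-bit quantization table;
    returns (index, quantized value)."""
    idx = min(range(8), key=lambda i: abs(STATE_SQ3_TBL[i] - d))
    return idx, STATE_SQ3_TBL[idx]

def dpcm_encode(x, predict):
    """DPCM loop of Figure 3.3 (sketch).
    x:       weighted-speech samples x[n] (already Wk(z)-filtered).
    predict: hypothetical callback mapping the past reconstructed
             samples to the predicted sample y[n], standing in for
             the Pk(z) feedback filter.
    Returns the list of quantizer indices."""
    recon = []                       # u[n] + y[n], what the predictor sees
    indices = []
    for xn in x:
        yn = predict(recon)          # predicted sample y[n]
        dn = xn - yn                 # target sample d[n]
        idx, un = sq3_quantize(dn)   # coded state sample u[n]
        indices.append(idx)
        recon.append(un + yn)        # closed-loop reconstruction
    return indices
```

A one-tap predictor such as `lambda r: r[-1] if r else 0.0` is enough to exercise the loop.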
   The quantized samples are transformed back to the residual domain
   by 1) scaling with 1/scal; 2) time-reversing the scaled samples;
   3) filtering the time-reversed samples by the same all-pass filter,
   as in section 3.5.2, by using circular convolution; and 4) time-
   reversing the filtered samples.  (More detail is in section 4.2.)

   A reference implementation of the start-state encoding can be found
   in Appendix A.46.

3.6.  Encoding the Remaining Samples

   A dynamic codebook is used to encode 1) the 23/22 remaining samples
   in the two sub-blocks containing the start state; 2) the sub-blocks
   after the start state in time; and 3) the sub-blocks before the
   start state in time.  Thus, the encoding target can be either the
   23/22 samples remaining of the 2 sub-blocks containing the start
   state, or a 40-sample sub-block.  This target can consist of
   samples that are indexed forward in time or backward in time,
   depending on the location of the start state.  The length of the
   target is denoted by lTarget.

   The coding is based on an adaptive codebook that is built from a
   codebook memory that contains decoded LPC excitation samples from
   the already encoded part of the block.  These samples are indexed
   in the same time direction as is the target vector and end at the
   sample instant prior to the first sample instant represented in the
   target vector.  The codebook memory has length lMem, which is equal
   to CB_MEML=147 for the two/four 40-sample sub-blocks and 85 for the
   23/22-sample sub-block.

   The following figure shows an overview of the encoding procedure.

      +------------+    +---------------+    +-------------+
   -> | 1. Decode  | -> | 2. Mem setup  | -> | 3. Perc. W. | ->
      +------------+    +---------------+    +-------------+

      +------------+    +----------------+
   -> | 4. Search  | -> | 5. Upd. Target | ------------------>
   |  +------------+    +----------------+                   |
   +----<------------<------------<--------------------------+
                     stage=0..2

      +----------------+
   -> | 6. Recalc G[0] | ---------------> gains and CB indices
      +----------------+

   Figure 3.4.  Flow chart of the codebook search in the iLBC encoder.

   1. Decode the part of the residual that has been encoded so far,
      using the codebook without perceptual weighting.

   2. Set up the memory by taking data from the decoded residual.
      This memory is used to construct codebooks.  For blocks
      preceding the start state, both the decoded residual and the
      target are time reversed (section 3.6.1).

   3. Filter the memory + target with the perceptual weighting filter
      (section 3.6.2).

   4. Search for the best match between the target and the codebook
      vector.  Compute the optimal gain for this match and quantize
      that gain (section 3.6.4).

   5. Update the perceptually weighted target by subtracting the
      contribution from the selected codebook vector from the
      perceptually weighted memory (quantized gain times selected
      vector).  Repeat 4 and 5 for the two additional stages.

   6. Calculate the energy loss due to encoding of the residual.  If
      needed, compensate for this loss by an upscaling and
      requantization of the gain for the first stage (section 3.7).

   The following sections provide an in-depth description of the
   different blocks of Figure 3.4.

3.6.1.  Codebook Memory

   The codebook memory is based on the already encoded sub-blocks, so
   the available data for encoding increases for each new sub-block
   that has been encoded.  Until enough sub-blocks have been encoded
   to fill the codebook memory with data, it is padded with zeros.

   The following figure shows an example of the order in which the
   sub-blocks are encoded for the 30 ms frame size if the start state
   is located in the last 58 samples of sub-blocks 2 and 3.

   +-----------------------------------------------------+
   |  5  |  1  |///|////////|    2    |    3    |    4    |
   +-----------------------------------------------------+

   Figure 3.5.  The order from 1 to 5 in which the sub-blocks are
   encoded.
   The slashed area is the start state.

   The first target sub-block to be encoded is number 1, and the
   corresponding codebook memory is shown in the following figure.  As
   the target vector comes before the start state in time, the
   codebook memory and target vector are time reversed; thus, after
   the block has been time reversed, the search algorithm can be
   reused.  As only the start state has been encoded so far, the last
   samples of the codebook memory are padded with zeros.

   +------------------------+
   |zeros|\\\\\\\\|\\\\| 1 |
   +------------------------+

   Figure 3.6.  The codebook memory, length lMem=85 samples, and the
   target vector 1, length 22 samples.

   The next step is to encode sub-block 2 by using the memory, which
   has now increased since sub-block 1 has been encoded.  The
   following figure shows the codebook memory for encoding of
   sub-block 2.

   +----------------------------------+
   | zeros |  1  |///|////////|   2   |
   +----------------------------------+

   Figure 3.7.  The codebook memory, length lMem=147 samples, and the
   target vector 2, length 40 samples.

   The next step is to encode sub-block 3 by using the memory, which
   has been increased yet again since sub-blocks 1 and 2 have been
   encoded, but the memory still has to be padded with a few zeros.
   The following figure shows the codebook memory for encoding of
   sub-block 3.

   +----------------------------------------+
   |zeros|  1  |///|////////|   2   |   3   |
   +----------------------------------------+

   Figure 3.8.  The codebook memory, length lMem=147 samples, and the
   target vector 3, length 40 samples.

   The next step is to encode sub-block 4 by using the memory, which
   has increased yet again since sub-blocks 1, 2, and 3 have been
   encoded.  This time, the memory does not have to be padded with
   zeros.  The following figure shows the codebook memory for encoding
   of sub-block 4.
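The memory construction of Figures 3.6 through 3.9 can be sketched as below: take the most recent lMem decoded residual samples ending just before the target, pad the oldest end with zeros while data is scarce, and reverse time for targets preceding the start state.  A minimal Python sketch; the helper name setup_cb_memory is our own.

```python
def setup_cb_memory(decoded, l_mem, reverse=False):
    """Build the adaptive-codebook memory (sketch).
    decoded: decoded residual samples, oldest first, ending at the
             sample instant just before the target vector.
    l_mem:   codebook memory length (147, or 85 for the 22/23-sample
             targets).
    reverse: True for targets that precede the start state in time,
             so that memory and target are searched in reversed time.
    Zero-pads the oldest end until enough sub-blocks are encoded."""
    if reverse:
        decoded = decoded[::-1]
    if len(decoded) >= l_mem:
        return decoded[-l_mem:]          # memory is full: newest l_mem
    pad = [0.0] * (l_mem - len(decoded)) # not yet full: pad with zeros
    return pad + decoded
```

With only the 58-sample start state decoded and l_mem=85, the first 27 memory samples come out as zeros, matching Figure 3.6.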
   +----------------------------------------+
   | 1 |///|////////|   2   |   3   |   4   |
   +----------------------------------------+

   Figure 3.9.  The codebook memory, length lMem=147 samples, and the
   target vector 4, length 40 samples.

   The final target sub-block to be encoded is number 5, and the
   following figure shows the corresponding codebook memory.  As the
   target vector comes before the start state in time, the codebook
   memory and target vector are time reversed.

   +-------------------------------------------
   |   3   |   2   |\\\\\\\\|\\\\| 1 |   5   |