<td width="17%"><font size="2" face="arial">6</font></td>
<td width="9%"><font size="2" face="arial">392</font></td>
<td colspan="2" width="52%"><font size="2" face="arial">-232</font></td>
</tr>
<tr>
<td width="22%">&nbsp;</td>
<td colspan="4" width="78%"><font size="2" face="arial">note that even if only one
coefficient set was used to encode the file, all coefficient sets are still included.
the encoding software may add more coefficients, but the first 7 must always be
the same.</font></td>
</tr>
</table>

<blockquote>
<blockquote>
<p><font size="2" face="arial"><b>note</b>: 8.8 signed values can be divided by 256 to
obtain the integer portion of the value.</font></p>
</blockquote>
</blockquote>
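<p><font size="2" face="arial">as a small illustration of the note above, the following c
sketch splits an 8.8 signed coefficient into its integer portion and its real value. the
function names are hypothetical and are not part of the format.</font></p>

```c
#include <stdint.h>

/* Integer portion of a signed 8.8 fixed-point value.
   Division (rather than a right shift) truncates toward zero
   for negative values as well. */
static int coef_integer_part(int16_t coef)
{
    return coef / 256;
}

/* The same value as a real number: 256 represents 1.0. */
static double coef_as_real(int16_t coef)
{
    return coef / 256.0;
}
```

<p><font size="2" face="arial">for the coefficient set shown in the last table row,
coef_as_real(392) gives 1.53125 and coef_as_real(-232) gives -0.90625.</font></p>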

<p><font size="2" face="arial"><b>block</b></font></p>

<blockquote>
<p><font size="2" face="arial">the block has three parts: the header, data, and padding.
the three together are &lt;nblockalign&gt; bytes.</font></p>
<p><font size="2" face="arial">typedef struct adpcmblockheader_tag {</font></p>
<p><font size="2" face="arial">byte bpredictor[nchannels];</font></p>
<p><font size="2" face="arial">int idelta[nchannels];</font></p>
<p><font size="2" face="arial">int isamp1[nchannels];</font></p>
<p><font size="2" face="arial">int isamp2[nchannels];</font></p>
<p><font size="2" face="arial">} adpcmblockheader;</font></p>
</blockquote>

<table border="1" cellpadding="7" width="600">
<tr>
<td width="20%"><font size="2" face="arial"><b>field</b></font></td>
<td width="72%"><font size="2" face="arial"><b>description</b></font></td>
</tr>
<tr>
<td width="20%"><font size="2" face="arial">bpredictor</font></td>
<td width="72%"><font size="2" face="arial">index into the acoef array to define the
predictor used to encode this block.</font></td>
</tr>
<tr>
<td width="20%"><font size="2" face="arial">idelta</font></td>
<td width="72%"><font size="2" face="arial">initial delta value to use.</font></td>
</tr>
<tr>
<td width="20%"><font size="2" face="arial">isamp1</font></td>
<td width="72%"><font size="2" face="arial">the second sample value of the block. when
decoding this will be used as the previous sample to start decoding with.</font></td>
</tr>
<tr>
<td width="20%"><font size="2" face="arial">isamp2</font></td>
<td width="72%"><font size="2" face="arial">the first sample value of the block. when
decoding this will be used as the sample before the previous one to start decoding with.</font></td>
</tr>
</table>
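<p><font size="2" face="arial">the header layout described above can be read with a sketch
like the following. it assumes little-endian 16 bit fields (as used throughout riff files)
and at most two channels; the names adpcm_header, read_le16 and parse_block_header are
hypothetical.</font></p>

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CHANNELS 2   /* assumption: mono or stereo only */

typedef struct {
    uint8_t bpredictor[MAX_CHANNELS];
    int16_t idelta[MAX_CHANNELS];
    int16_t isamp1[MAX_CHANNELS];
    int16_t isamp2[MAX_CHANNELS];
} adpcm_header;

/* Read a signed little-endian 16 bit value without relying on
   implementation-defined integer conversions. */
static int16_t read_le16(const uint8_t *p)
{
    int v = p[0] | (p[1] << 8);
    return (int16_t)(v >= 0x8000 ? v - 0x10000 : v);
}

/* Each field is stored as an array over the channels, so all
   predictors come first, then all deltas, and so on.
   Returns the number of header bytes (7 * nchannels), or 0 on error. */
static size_t parse_block_header(const uint8_t *block, size_t len,
                                 int nchannels, adpcm_header *h)
{
    const uint8_t *p = block;
    int c;

    if (nchannels < 1 || nchannels > MAX_CHANNELS ||
        len < (size_t)(7 * nchannels))
        return 0;

    for (c = 0; c < nchannels; c++) h->bpredictor[c] = *p++;
    for (c = 0; c < nchannels; c++) { h->idelta[c] = read_le16(p); p += 2; }
    for (c = 0; c < nchannels; c++) { h->isamp1[c] = read_le16(p); p += 2; }
    for (c = 0; c < nchannels; c++) { h->isamp2[c] = read_le16(p); p += 2; }

    return (size_t)(p - block);
}
```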

<p><font size="2" face="arial"><b>data</b></font></p>

<blockquote>
<p><font size="2" face="arial">the data is a bit string parsed in groups of
(wbitspersample * nchannels). </font></p>
<p><font size="2" face="arial">for the case of mono voice adpcm (wbitspersample = 4,
nchannels = 1) we have:</font></p>
<p><font size="2" face="arial">&lt;byte1&gt; &lt;byte2&gt;...&lt;byten&gt; ...&lt;byte((nsamplesperblock-2)/2)&gt;</font></p>
<p><font size="2" face="arial">where &lt;byten&gt; has &lt;high order bit ... low order bit&gt; or &lt;(sample 2n + 2) (sample 2n + 3)&gt;</font></p>
<p><font size="2" face="arial">&lt;byten&gt; = ((4 bit error delta for sample (2 * n) + 2) &lt;&lt; 4) | (4 bit error delta for sample (2 * n) + 3)</font></p>
<p><font size="2" face="arial">for the case of stereo voice adpcm (wbitspersample = 4,
nchannels = 2) we have:</font></p>
<p><font size="2" face="arial">&lt;byte1&gt; &lt;byte2&gt;...&lt;byten&gt; ...&lt;byte(nsamplesperblock-2)&gt;</font></p>
<p><font size="2" face="arial">where &lt;byten&gt; has &lt;high order bit ... low order bit&gt; or</font></p>
<p><font size="2" face="arial">&lt;(left channel of sample n + 2) (right channel of sample n + 2)&gt;</font></p>
<p><font size="2" face="arial">&lt;byten&gt; = ((4 bit error delta for left channel of sample n + 2) &lt;&lt; 4) | (4 bit error delta for right channel of sample n + 2)</font></p>
</blockquote>
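<p><font size="2" face="arial">the nibble packing above can be unpacked as in the sketch
below. it assumes the 4 bit signed error delta is stored in two's complement form (range
-8 to 7); the function names are hypothetical.</font></p>

```c
#include <stdint.h>

/* Sign-extend a 4 bit two's complement nibble (0..15) to -8..7. */
static int nibble_to_signed(unsigned nib)
{
    nib &= 0x0Fu;
    return (nib & 0x08u) ? (int)nib - 16 : (int)nib;
}

/* Split one data byte into its two error deltas.
   Mono:   *first is the earlier sample, *second the later one.
   Stereo: *first is the left channel, *second the right channel. */
static void split_byte(uint8_t b, int *first, int *second)
{
    *first  = nibble_to_signed((unsigned)b >> 4);  /* high order nibble */
    *second = nibble_to_signed((unsigned)b);       /* low order nibble  */
}
```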

<p><font size="2" face="arial"><b>padding</b></font></p>

<blockquote>
<p><font size="2" face="arial">bit padding is used to round off the block to an exact byte
length.</font></p>
<p><font size="2" face="arial">the size of the padding (in bits):</font></p>
<p><font size="2" face="arial">((nblockalign - (7 * nchannels)) * 8) - </font></p>
<p><font size="2" face="arial">(((nsamplesperblock - 2) * nchannels) * wbitspersample)</font></p>
<p><font size="2" face="arial">the padding does not store any data and should be made
zero.</font></p>
</blockquote>
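<p><font size="2" face="arial">the padding formula above translates directly into c. this
is only a sketch; the function name padding_bits is hypothetical and the parameters mirror
the format fields of the same names.</font></p>

```c
/* Number of padding bits at the end of a block.
   The header occupies 7 bytes per channel and the first two samples
   of each channel live in the header, not in the data. */
static long padding_bits(long nblockalign, long nchannels,
                         long nsamplesperblock, long wbitspersample)
{
    long bits_after_header = (nblockalign - 7 * nchannels) * 8;
    long data_bits = (nsamplesperblock - 2) * nchannels * wbitspersample;
    return bits_after_header - data_bits;
}
```

<p><font size="2" face="arial">for example, a mono block with nblockalign = 256,
nsamplesperblock = 500 and wbitspersample = 4 has (249 * 8) - (498 * 4) = 0 padding bits.</font></p>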

<p><font size="2" face="arial"><b>adpcm algorithm</b></font></p>

<blockquote>
<p><font size="2" face="arial">each channel of the adpcm file can be encoded/decoded
independently. however this should not destroy phase and amplitude information since each
channel will track the original. since the channels are encoded/decoded independently,
this document is written as if only one channel is being decoded. since the channels are
interleaved, multiple channels may be encoded/decoded in parallel using independent local
storage and temporaries.</font></p>
<p><font size="2" face="arial">note that the process for encoding/decoding one block is
independent from the process for the next block. therefore the process is described for
one block only, and may be repeated for other blocks. while some optimizations may relate
the process for one block to another, in theory they are still independent.</font></p>
<p><font size="2" face="arial">note that in the description below the number designation
appended to isamp (i.e. isamp1 and isamp2) refers to the placement of the sample in
relation to the current one being decoded. thus when you are decoding sample n, isamp1
would be sample n - 1 and isamp2 would be sample n - 2. coef1 is the coefficient for
isamp1 and coef2 is the coefficient for isamp2. this numbering is identical to that used
in the block and format descriptions above.</font></p>
<p><font size="2" face="arial">a sample application will be provided to convert a riff
waveform file to and from adpcm and pcm formats.</font></p>
</blockquote>

<p><font size="2" face="arial"><b>decoding</b></font></p>

<blockquote>
<p><font size="2" face="arial">first the predictor coefficients are determined by using
the bpredictor field of the block header. this value is an index into the acoef array in the
file header.</font></p>
<p><font size="2" face="arial">bpredictor = getbyte</font></p>
<p><font size="2" face="arial">the initial idelta is also taken from the block header. </font></p>
<p><font size="2" face="arial">idelta = getword</font></p>
<p><font size="2" face="arial">then the first two samples are taken from the block header.
(they are stored as 16 bit pcm data as isamp1 and isamp2. isamp2 is the first sample of
the block, isamp1 is the second sample.)</font></p>
<p><font size="2" face="arial">isamp1 = getint</font></p>
<p><font size="2" face="arial">isamp2 = getint</font></p>
<p><font size="2" face="arial">after taking this initial data from the block header, the
process of decoding the rest of the block may begin. it can be done in the following
manner:</font></p>
<p><font size="2" face="arial">while there are more samples in the block to decode:</font></p>
<p><font size="2" face="arial">predict the next sample from the previous two samples. </font></p>
<p><font size="2" face="arial">lpredsamp = ((isamp1 * icoef1) + (isamp2 *icoef2)) /
fixed_point_coef_base</font></p>
<p><font size="2" face="arial">get the 4 bit signed error delta.</font></p>
<p><font size="2" face="arial">(ierrordelta = getnibble)</font></p>
<p><font size="2" face="arial">add the 'error in prediction' to the predicted next sample
and prevent over/underflow errors.</font></p>
<p><font size="2" face="arial">lnewsamp = lpredsamp + (idelta * ierrordelta)</font></p>
<p><font size="2" face="arial">if lnewsamp is too large, make it the maximum allowable
size.</font></p>
<p><font size="2" face="arial">if lnewsamp is too small, make it the minimum allowable
size.</font></p>
<p><font size="2" face="arial">output the new sample.</font></p>
<p><font size="2" face="arial">output( lnewsamp )</font></p>
<p><font size="2" face="arial">adjust the quantization step size used to calculate the
'error in prediction'.</font></p>
<p><font size="2" face="arial">idelta = idelta * adaptiontable[ierrordelta] /
fixed_point_adaption_base</font></p>
<p><font size="2" face="arial">if idelta is too small, make it the minimum allowable size.</font></p>
<p><font size="2" face="arial">update the record of previous samples.</font></p>
<p><font size="2" face="arial">isamp2 = isamp1;</font></p>
<p><font size="2" face="arial">isamp1 = lnewsamp;</font></p>
</blockquote>
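<p><font size="2" face="arial">the decoding steps above can be sketched for a single mono
block as follows. the adaption table, the seven standard coefficient sets, both fixed point
bases (256) and the minimum idelta (16) are the values commonly published for this format;
they are defined outside this section, so treat them as assumptions here. the upper bound
on idelta is a defensive addition that is not in the description, and all function names
are hypothetical.</font></p>

```c
#include <stddef.h>
#include <stdint.h>

#define COEF_BASE     256               /* fixed_point_coef_base (assumed) */
#define ADAPTION_BASE 256               /* fixed_point_adaption_base (assumed) */
#define MIN_DELTA     16                /* minimum idelta (assumed) */
#define MAX_DELTA     (INT32_MAX / 768) /* defensive cap, not in the text */

/* Indexed by the raw (unsigned) 4 bit nibble value. */
static const int32_t adaption_table[16] = {
    230, 230, 230, 230, 307, 409, 512, 614,
    768, 614, 512, 409, 307, 230, 230, 230
};
/* The seven standard coefficient sets (8.8 fixed point). */
static const int32_t coef1[7] = { 256,  512, 0, 192, 240,  460,  392 };
static const int32_t coef2[7] = {   0, -256, 0,  64,   0, -208, -232 };

static int32_t read_le16(const uint8_t *p)
{
    int32_t v = p[0] | (p[1] << 8);
    return v >= 0x8000 ? v - 0x10000 : v;
}

static int32_t clamp16(int32_t v)
{
    return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

/* Decode one mono block into out[]. Pass cap = nsamplesperblock so that
   padding is not decoded. Returns the number of samples written. */
static size_t decode_mono_block(const uint8_t *block, size_t len,
                                int16_t *out, size_t cap)
{
    int32_t c1, c2, idelta, samp1, samp2;
    size_t n = 0, i;
    int shift;

    if (len < 7 || cap < 2 || block[0] > 6)
        return 0;                       /* only the standard sets are handled */

    c1 = coef1[block[0]];
    c2 = coef2[block[0]];
    idelta = read_le16(block + 1);
    samp1  = read_le16(block + 3);      /* second sample of the block */
    samp2  = read_le16(block + 5);      /* first sample of the block  */
    out[n++] = (int16_t)samp2;
    out[n++] = (int16_t)samp1;

    for (i = 7; i < len; i++) {
        for (shift = 4; shift >= 0 && n < cap; shift -= 4) {
            unsigned nib = ((unsigned)block[i] >> shift) & 0x0Fu;
            int32_t err  = (nib & 0x08u) ? (int32_t)nib - 16 : (int32_t)nib;
            int32_t pred = (samp1 * c1 + samp2 * c2) / COEF_BASE;
            int32_t samp = clamp16(pred + idelta * err);

            out[n++] = (int16_t)samp;

            idelta = idelta * adaption_table[nib] / ADAPTION_BASE;
            if (idelta < MIN_DELTA) idelta = MIN_DELTA;
            if (idelta > MAX_DELTA) idelta = MAX_DELTA;

            samp2 = samp1;
            samp1 = samp;
        }
    }
    return n;
}
```

<p><font size="2" face="arial">for a block with predictor 0, idelta 16, isamp1 = 100,
isamp2 = 90 and a single data byte 0x1f, this sketch produces the samples 90, 100, 116,
100.</font></p>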

<p><font size="2" face="arial"><b>encoding</b></font></p>

<blockquote>
<p><font size="2" face="arial">for each block, the encoding process can be done through
the following steps. (for each channel)</font></p>
<blockquote>
<p><font size="2" face="arial">determine the predictor to use for the block.</font></p>
<p><font size="2" face="arial">determine the initial idelta for the block.</font></p>
<p><font size="2" face="arial">write out the block header.</font></p>
<p><font size="2" face="arial">encode and write out the data.</font></p>
</blockquote>
<p><font size="2" face="arial">the predictor to use for each block can be determined in
many ways. </font></p>
<blockquote>
<p><font size="2" face="arial">1. a static predictor for all files. </font></p>
<p><font size="2" face="arial">2. the block can be encoded with each possible predictor.
then the predictor that gave the least error can be chosen. the least error can be
determined from:</font></p>
<blockquote>
<p><font size="2" face="arial">2.1. the sum of squares of differences (between the
compressed/decompressed data and the original data).</font></p>
<p><font size="2" face="arial">2.2. the least average absolute difference.</font></p>
<p><font size="2" face="arial">2.3. the least average idelta.</font></p>
</blockquote>
<p><font size="2" face="arial">3. the predictor that has the smallest initial idelta can
be chosen. (this is an approximation of method 2.3.)</font></p>
<p><font size="2" face="arial">4. statistics from either the previous or current block.
(e.g. a linear combination of the first 5 samples of a block that corresponds to the
average predicted error.) </font></p>
</blockquote>
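<p><font size="2" face="arial">one way to realize the second method (try every predictor
and keep the one with the least sum of squared differences) is sketched below for one
mono block. the constants are the commonly published values for this format and are
assumptions in this section, the cap on idelta is a defensive addition, and the names
block_error and choose_predictor are hypothetical.</font></p>

```c
#include <stddef.h>
#include <stdint.h>

#define MIN_DELTA 16                    /* assumed minimum idelta */
#define MAX_DELTA (INT32_MAX / 768)     /* defensive cap, not in the text */

static const int32_t adaption_table[16] = {
    230, 230, 230, 230, 307, 409, 512, 614,
    768, 614, 512, 409, 307, 230, 230, 230
};
static const int32_t coef1[7] = { 256,  512, 0, 192, 240,  460,  392 };
static const int32_t coef2[7] = {   0, -256, 0,  64,   0, -208, -232 };

/* Run the encode/reconstruct loop for one predictor without writing
   any output, and return the sum of squared reconstruction errors. */
static int64_t block_error(const int16_t *pcm, size_t n, int pred,
                           int32_t idelta)
{
    int64_t sum = 0;
    int32_t samp1, samp2;
    size_t i;

    if (n < 2) return 0;
    if (idelta < MIN_DELTA) idelta = MIN_DELTA;
    if (idelta > MAX_DELTA) idelta = MAX_DELTA;
    samp2 = pcm[0];
    samp1 = pcm[1];

    for (i = 2; i < n; i++) {
        int32_t p   = (samp1 * coef1[pred] + samp2 * coef2[pred]) / 256;
        int32_t err = (pcm[i] - p) / idelta;
        int32_t s;
        int64_t d;

        if (err > 7)  err = 7;          /* 4 bit signed range */
        if (err < -8) err = -8;
        s = p + idelta * err;
        if (s > 32767)  s = 32767;
        if (s < -32768) s = -32768;

        d = (int64_t)pcm[i] - s;
        sum += d * d;

        idelta = idelta * adaption_table[(unsigned)err & 0x0Fu] / 256;
        if (idelta < MIN_DELTA) idelta = MIN_DELTA;
        if (idelta > MAX_DELTA) idelta = MAX_DELTA;
        samp2 = samp1;
        samp1 = s;
    }
    return sum;
}

/* Return the index (0..6) of the predictor with the least error.
   Ties go to the lowest index. */
static int choose_predictor(const int16_t *pcm, size_t n, int32_t idelta)
{
    int best = 0, p;
    int64_t best_err = block_error(pcm, n, 0, idelta);

    for (p = 1; p < 7; p++) {
        int64_t e = block_error(pcm, n, p, idelta);
        if (e < best_err) { best_err = e; best = p; }
    }
    return best;
}
```

<p><font size="2" face="arial">a linear ramp such as 0, 100, 200, 300 is predicted exactly
by set 1 (512, -256), so this sketch selects predictor 1 for it. trying every predictor
costs seven passes over the block, which is the trade-off against the cheaper methods in
the list.</font></p>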
<p><font size="2" face="arial">the starting idelta for each block can also be determined
in a couple of ways. </font></p>
<blockquote>
<p><font size="2" face="arial">1. one way is to always start off with the same initial
idelta. </font></p>
<p><font size="2" face="arial">2. another way is to use the idelta from the end of the
previous block. (note that for the first block an initial value must then be chosen.)</font></p>
<p><font size="2" face="arial">3. the initial idelta may also be determined from the first
few samples of the block. (idelta generally fluctuates around the value that makes the
absolute value of the encoded output about half the maximum absolute value of the encoded
output. for 4 bit error deltas the maximum absolute value is 8, so the initial
idelta should be set so that the first output is around 4.)</font></p>
<p><font size="2" face="arial">4. finally the initial idelta for this block may be
determined from the last few samples of the last block. (note that for the first block an
initial value must then be chosen.)</font></p>
<p><font size="2" face="arial"><b>note</b> that different choices for predictor and
initial idelta will result in different audio quality.</font></p>
</blockquote>
<p><font size="2" face="arial">once the predictor and starting quantization values are
chosen, the block header may be written out.</font></p>
<p><font size="2" face="arial">first the choice of predictor is written out. (for each
channel.)</font></p>
<p><font size="2" face="arial">then the initial idelta (quantization scale) is written
out. (for each channel.)</font></p>
<p><font size="2" face="arial">then the 16 bit pcm value of the second sample is written
out. (isamp1) (for each channel.)</font></p>
<p><font size="2" face="arial">finally the 16 bit pcm value of the first sample is written
out. (isamp2) (for each channel.)</font></p>
<p><font size="2" face="arial">then the rest of the block may be encoded. (note that the
first encoded value will be for the 3rd sample in the block since the first two are
contained in the header.)</font></p>
<p><font size="2" face="arial">while there are more samples in the block to encode:</font></p>
<p><font size="2" face="arial">predict the next sample from the previous two samples.</font></p>
<p><font size="2" face="arial">lpredsamp = ((isamp1 * icoef1) + (isamp2 * icoef2)) /
fixed_point_coef_base</font></p>
<p><font size="2" face="arial">the 4 bit signed error delta is produced and
overflow/underflow is prevented.</font></p>
<p><font size="2" face="arial">ierrordelta = (sample(n) - lpredsamp) / idelta</font></p>
<p><font size="2" face="arial">if ierrordelta is too large, make it the maximum allowable
size.</font></p>
<p><font size="2" face="arial">if ierrordelta is too small, make it the minimum allowable
size.</font></p>
<p><font size="2" face="arial">then the nibble ierrordelta is written out.</font></p>
<p><font size="2" face="arial">putnibble( ierrordelta )</font></p>
<p><font size="2" face="arial">add the 'error in prediction' to the predicted next sample
and prevent over/underflow errors.</font></p>
<p><font size="2" face="arial">lnewsamp = lpredsamp + (idelta * ierrordelta)</font></p>
<p><font size="2" face="arial">if lnewsamp is too large, make it the maximum allowable
size.</font></p>
<p><font size="2" face="arial">if lnewsamp is too small, make it the minimum allowable
size.</font></p>
<p><font size="2" face="arial">adjust the quantization step size used to calculate the
'error in prediction'.</font></p>
<p><font size="2" face="arial">idelta = idelta * adaptiontable[ierrordelta] /
fixed_point_adaption_base</font></p>
<p><font size="2" face="arial">if idelta is too small, make it the minimum allowable size.</font></p>
<p><font size="2" face="arial">update the record of previous samples.</font></p>
<p><font size="2" face="arial">isamp2 = isamp1;</font></p>
<p><font size="2" face="arial">isamp1 = lnewsamp;</font></p>
</blockquote>
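<p><font size="2" face="arial">the encoding loop above can be sketched for one mono block
as follows, with the predictor and initial idelta already chosen by the caller. the
constants are the commonly published values for this format and are assumptions in this
section; the cap on idelta is a defensive addition, the error delta uses plain truncating
division as written in the description, and the function names are hypothetical.</font></p>

```c
#include <stddef.h>
#include <stdint.h>

#define MIN_DELTA 16                    /* assumed minimum idelta */
#define MAX_DELTA (INT32_MAX / 768)     /* defensive cap, not in the text */

static const int32_t adaption_table[16] = {
    230, 230, 230, 230, 307, 409, 512, 614,
    768, 614, 512, 409, 307, 230, 230, 230
};
static const int32_t coef1[7] = { 256,  512, 0, 192, 240,  460,  392 };
static const int32_t coef2[7] = {   0, -256, 0,  64,   0, -208, -232 };

static void write_le16(uint8_t *p, int32_t v)
{
    uint16_t u = (uint16_t)v;           /* wraps modulo 65536 */
    p[0] = (uint8_t)(u & 0xFFu);
    p[1] = (uint8_t)(u >> 8);
}

/* Encode nsamples of mono pcm into out[]. Returns the number of bytes
   written (7 header bytes plus the packed nibbles), or 0 on error. */
static size_t encode_mono_block(const int16_t *pcm, size_t nsamples,
                                int pred, int32_t idelta,
                                uint8_t *out, size_t cap)
{
    int32_t samp1, samp2;
    size_t pos = 7, i;
    int high = 1;                       /* next nibble goes in the high half */

    if (nsamples < 2 || pred < 0 || pred > 6)
        return 0;
    if (cap < 7 + (nsamples - 2 + 1) / 2)
        return 0;
    if (idelta < MIN_DELTA) idelta = MIN_DELTA;
    if (idelta > 32767)     idelta = 32767;   /* header field is 16 bit */

    /* Header: predictor, idelta, second sample, first sample. */
    out[0] = (uint8_t)pred;
    write_le16(out + 1, idelta);
    write_le16(out + 3, pcm[1]);
    write_le16(out + 5, pcm[0]);
    samp1 = pcm[1];
    samp2 = pcm[0];

    for (i = 2; i < nsamples; i++) {
        int32_t p   = (samp1 * coef1[pred] + samp2 * coef2[pred]) / 256;
        int32_t err = (pcm[i] - p) / idelta;
        int32_t s;
        unsigned nib;

        if (err > 7)  err = 7;
        if (err < -8) err = -8;
        nib = (unsigned)err & 0x0Fu;

        if (high) out[pos] = (uint8_t)(nib << 4);
        else      out[pos++] |= (uint8_t)nib;
        high = !high;

        /* Track what the decoder will reconstruct, not the original. */
        s = p + idelta * err;
        if (s > 32767)  s = 32767;
        if (s < -32768) s = -32768;

        idelta = idelta * adaption_table[nib] / 256;
        if (idelta < MIN_DELTA) idelta = MIN_DELTA;
        if (idelta > MAX_DELTA) idelta = MAX_DELTA;
        samp2 = samp1;
        samp1 = s;
    }
    if (!high) pos++;                   /* odd count: low nibble stays zero */
    return pos;
}
```

<p><font size="2" face="arial">encoding the samples 90, 100, 116, 100 with predictor 0 and
an initial idelta of 16 yields the 7 byte header followed by the single data byte 0x1f.
note that isamp1 is updated with the reconstructed sample rather than the original, so
the encoder stays in step with the decoder.</font></p>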

<p><font size="2" face="arial"><b>sample c code</b></font></p>

<p><font size="2" face="arial">sample c code is contained in the file msadpcm.c, which is
available with this document in electronic form and separately. see the overview section
for how to obtain this sample code.</font></p>

<p><font size="2" face="arial"><b>cvsd wave type</b></font></p>

<blockquote>
<p><font size="2" face="arial">added 07/21/92<br>
author: dsp solutions, formerly digispeech</font></p>
</blockquote>

<p><font size="2" face="arial"><b>fact chunk</b></font></p>

<p><font size="2" face="arial">this chunk is required for all wave formats other than
wave_format_pcm. it stores file dependent information about the contents of the wave data.
it currently specifies the time length of the data in samples.</font></p>

<p><font size="2" face="arial"><b>wave format header</b></font></p>

<p><font size="2" face="arial"><b>#define wave_format_ibm_cvsd (0x0005)</b></font></p>

<table border="1" cellpadding="7" width="576">