structure, the <B>Format.cbSize</B> field must be set to <B>22</B> and the
<B>SubFormat</B> field must be set to <B>KSDATAFORMAT_SUBTYPE_PCM</B>.</P>
<P>The definition (in KSMEDIA.H) of <B>KSDATAFORMAT_SUBTYPE_PCM</B> is
included below:</P><PRE class=codeSample>#define STATIC_KSDATAFORMAT_SUBTYPE_PCM\
DEFINE_WAVEFORMATEX_GUID(WAVE_FORMAT_PCM)
DEFINE_GUIDSTRUCT("00000001-0000-0010-8000-00aa00389b71", KSDATAFORMAT_SUBTYPE_PCM);
#define KSDATAFORMAT_SUBTYPE_PCM DEFINE_GUIDNAMED(KSDATAFORMAT_SUBTYPE_PCM)
</PRE>
<P><B>PWAVEFORMATPCMEX</B> can be safely cast to
<B>PWAVEFORMATEXTENSIBLE</B> or <B>PWAVEFORMATEX</B>.</P><PRE class=codeSample>typedef WAVEFORMATEXTENSIBLE WAVEFORMATPCMEX;
typedef WAVEFORMATPCMEX *PWAVEFORMATPCMEX;
typedef WAVEFORMATPCMEX NEAR *NPWAVEFORMATPCMEX;
typedef WAVEFORMATPCMEX FAR *LPWAVEFORMATPCMEX;
</PRE>
<P><BR><B>Definition of
WAVEFORMATIEEEFLOATEX</B><BR><B>WAVE_FORMAT_IEEE_FLOAT</B> based on
<B>WAVEFORMATEX</B> fails to be a good format for high bit-depth samples
or multiple channel streams for the same reasons as the wave format
<B>WAVE_FORMAT_PCM</B>. For cases in which the spatial locations of the
channels are linked to the standard speaker locations,
<B>WAVEFORMATIEEEFLOATEX</B> is appropriate. For the
<B>WAVEFORMATEXTENSIBLE</B> structure, the <B>Format.cbSize</B> field must
be set to <B>22</B> and the <B>SubFormat</B> field must be set to
<B>KSDATAFORMAT_SUBTYPE_IEEE_FLOAT</B>.</P>
<P>The definition (in KSMEDIA.H) of <B>KSDATAFORMAT_SUBTYPE_IEEE_FLOAT</B>
is included below:</P><PRE class=codeSample>#define STATIC_KSDATAFORMAT_SUBTYPE_IEEE_FLOAT\
DEFINE_WAVEFORMATEX_GUID(WAVE_FORMAT_IEEE_FLOAT)
DEFINE_GUIDSTRUCT("00000003-0000-0010-8000-00aa00389b71", KSDATAFORMAT_SUBTYPE_IEEE_FLOAT);
#define KSDATAFORMAT_SUBTYPE_IEEE_FLOAT DEFINE_GUIDNAMED(KSDATAFORMAT_SUBTYPE_IEEE_FLOAT)
</PRE>
<P><B>PWAVEFORMATIEEEFLOATEX</B> can be safely cast to
<B>PWAVEFORMATEXTENSIBLE</B> or <B>PWAVEFORMATEX</B>.</P><PRE class=codeSample>typedef WAVEFORMATEXTENSIBLE WAVEFORMATIEEEFLOATEX;
typedef WAVEFORMATIEEEFLOATEX *PWAVEFORMATIEEEFLOATEX;
typedef WAVEFORMATIEEEFLOATEX NEAR *NPWAVEFORMATIEEEFLOATEX;
typedef WAVEFORMATIEEEFLOATEX FAR *LPWAVEFORMATIEEEFLOATEX;
</PRE>
<A name=#XSLTsection128121120120></A>
<H2>Details about WAVEFORMATEX Fields </H2>
<P>What follows are interpretations of the fields of the
<B>WAVEFORMATEX</B> structure, descriptions of the new fields, and
numerous examples that seek to clarify the use of this format.</P>
<P><BR><B>wFormatTag is WAVE_FORMAT_EXTENSIBLE</B><BR>In the new structure
<B>WAVEFORMATEXTENSIBLE</B>, the <B>wFormatTag</B> field must be set to
<B>WAVE_FORMAT_EXTENSIBLE</B> (defined in MMREG.H).
<B>KSDATAFORMAT_SUBTYPE_PCM</B> and <B>KSDATAFORMAT_SUBTYPE_IEEE_FLOAT</B>
are sub-format GUIDs stored in the <B>SubFormat</B> field; they are not
format tags.</P>
<P><BR><B>nChannels, nSamplesPerSec, nAvgBytesPerSec Unchanged</B><BR>The
meanings of <B>nChannels</B>, <B>nSamplesPerSec</B>, and
<B>nAvgBytesPerSec</B> have not been altered from <B>WAVE_FORMAT_PCM</B>.
<B>nChannels</B> is the number of interleaved samples per block--the
number of individual channels in the stream. <B>nSamplesPerSec</B> is the
intended sample rate for the stream--the number of blocks that should be
processed in exactly one second. <B>nAvgBytesPerSec</B> is used for buffer
size estimation, so this number is calculated on gross block size. In
fact, <B>nAvgBytesPerSec</B> will always be the product of
<B>nBlockAlign</B> and <B>nSamplesPerSec</B>, as it was in
<B>WAVE_FORMAT_PCM</B>.</P>
<P><BR><B>wBitsPerSample Rigidly Defined as Container Size</B><BR>In
<B>WAVEFORMATEXTENSIBLE</B>, <B>wBitsPerSample</B> is defined
unambiguously as the size of the container for each sample. Individual
samples must be byte-aligned, so this value must be an integer multiple of
8. A case such as (<B>nChannels</B> = 2; <B>wBitsPerSample</B> = 20;
<B>nBlockAlign</B> = 5) is explicitly disallowed in the new format.</P>
<P>A special case is necessary for variable bit rate formats, where a
static <B>wBitsPerSample</B> cannot be provided. Such formats should
specify <B>0</B> in the <B>wBitsPerSample</B> field.</P>
<P><BR><B>nBlockAlign and Channel Alignment</B><BR>For PCM and IEEE float
formats based on <B>WAVEFORMATEXTENSIBLE</B>, <B>nBlockAlign</B> must equal
<B>nChannels</B> multiplied by <B>wBitsPerSample</B>, divided by eight
(that is, the size of one block in bytes). This maintains compatibility
with the definition of <B>nBlockAlign</B> in <B>WAVE_FORMAT_PCM</B> (for
those cases in which <B>wBitsPerSample</B> was assumed to be the container
size).</P>
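<P>The relationships among these fields can be sketched in a few lines of
C. This is an illustration only: the helper <B>derive_fields</B> and the
<B>DerivedFields</B> type are hypothetical names that do not appear in any
Windows header, and the sketch covers only PCM and IEEE float formats.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two dependent fields. */
typedef struct {
    uint16_t nBlockAlign;      /* bytes per block: one sample for every channel */
    uint32_t nAvgBytesPerSec;  /* nBlockAlign * nSamplesPerSec */
} DerivedFields;

/* Hypothetical helper: derives nBlockAlign and nAvgBytesPerSec for a PCM or
   IEEE float stream. Returns 0 on success, or -1 if the container size is
   not a whole number of bytes (for example wBitsPerSample = 20). */
static int derive_fields(uint16_t nChannels, uint32_t nSamplesPerSec,
                         uint16_t wBitsPerSample, DerivedFields *out)
{
    assert(out != NULL);
    if (nChannels == 0 || wBitsPerSample == 0 || wBitsPerSample % 8 != 0)
        return -1;
    out->nBlockAlign = (uint16_t)(nChannels * (wBitsPerSample / 8));
    out->nAvgBytesPerSec = (uint32_t)out->nBlockAlign * nSamplesPerSec;
    return 0;
}
```

<P>For a stereo, 48000 Hz stream in 24-bit containers, this sketch yields
<B>nBlockAlign</B> = 6 and <B>nAvgBytesPerSec</B> = 288000, and it rejects
the disallowed 20-bit container described earlier.</P>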
<P><BR><B>cbSize is at Least 22</B><BR>For <B>WAVEFORMATEXTENSIBLE</B>,
<B>cbSize</B> must always be set to at least 22. This is the sum of the
sizes of the <B>Samples</B> union (2), <B>DWORD dwChannelMask</B> (4), and
<B>GUID SubFormat</B> (16). These 22 bytes are appended to the initial
<B>WAVEFORMATEX Format</B> (size 18) for a total of 40 bytes, so the
<B>WAVEFORMATPCMEX</B> and <B>WAVEFORMATIEEEFLOATEX</B> structures are
64-bit aligned.</P>
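<P>The size arithmetic can be checked at compile time with a portable
stand-in for the real structures. The type names below
(<B>WaveFormatEx</B>, <B>Guid</B>, <B>WaveFormatExtensible</B>) are
hypothetical mirrors written with fixed-width integers; production code
would use the definitions in MMREG.H instead. The sketch assumes a compiler
that accepts <B>#pragma pack</B> and C11 <B>static_assert</B>.</P>

```c
#include <assert.h>
#include <stdint.h>

#pragma pack(push, 1)   /* byte packing, so the base structure is 18 bytes */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} WaveFormatEx;

typedef struct {
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
} Guid;

typedef struct {
    WaveFormatEx Format;
    union {
        uint16_t wValidBitsPerSample;
        uint16_t wSamplesPerBlock;
        uint16_t wReserved;
    } Samples;
    uint32_t dwChannelMask;
    Guid     SubFormat;
} WaveFormatExtensible;
#pragma pack(pop)

/* Bytes that follow the base structure; this is the minimum cbSize. */
enum { EXTENSIBLE_CBSIZE = sizeof(WaveFormatExtensible) - sizeof(WaveFormatEx) };

static_assert(sizeof(WaveFormatEx) == 18, "base structure is 18 bytes");
static_assert(EXTENSIBLE_CBSIZE == 22, "extension is 2 + 4 + 16 bytes");
static_assert(sizeof(WaveFormatExtensible) % 8 == 0, "total is 64-bit aligned");
```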
<A name=#XSLTsection129121120120></A>
<H2>Details about the Samples Union </H2>
<P>To keep the <B>WAVEFORMATEXTENSIBLE</B> structure 64-bit aligned,
<B>wValidBitsPerSample</B> and <B>wSamplesPerBlock</B> were joined
together in a union called <B>Samples</B>. A third union member named
<B>wReserved</B> was also added for future use.</P>
<P><BR><B>Details about wValidBitsPerSample</B><BR>The field
<B>wValidBitsPerSample</B> is used to explicitly indicate how many bits of
precision are present in the signal. Most of the time this value will be
equal to <B>wBitsPerSample</B>. If, however, wave data originated from a
20-bit A/D, then <B>wValidBitsPerSample</B> could be set to 20, even
though <B>wBitsPerSample</B> might be 24 or 32. <A
href="http://www.microsoft.com/whdc/hwdev/tech/audio/multichaud.mspx#XSLTsection132121120120">Examples</A>
are included later in this article.</P>
<P>If <B>wValidBitsPerSample</B> is less than <B>wBitsPerSample</B>, then
the actual PCM data is "left-aligned" within the container: the sample
occupies the most significant bits, and all extra bits are at the
least-significant end of the container. All non-valid data bits must
be set to 0.</P>
<P>The value of <B>wValidBitsPerSample</B> should never exceed that of
<B>wBitsPerSample</B>. If this is encountered, the proper action is to
reject the data format.</P>
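<P>The left-alignment rule and the rejection rule can be illustrated with
the following sketch. The function names <B>valid_bits_ok</B> and
<B>left_align</B> are hypothetical, and the sketch assumes integer PCM
samples in containers of at most 32 bits.</P>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical check: returns 1 if the pair is acceptable, or 0 if the
   format should be rejected (valid bits exceed the container, or the
   container is not byte-aligned). */
static int valid_bits_ok(uint16_t wValidBitsPerSample, uint16_t wBitsPerSample)
{
    return wBitsPerSample % 8 == 0 && wValidBitsPerSample <= wBitsPerSample;
}

/* Hypothetical helper: places a signed sample with valid_bits of precision
   into the most significant bits of a container_bits-wide container. The
   low-order padding bits are zero. The container is returned in the low
   container_bits bits of the result. */
static uint32_t left_align(int32_t sample, unsigned valid_bits,
                           unsigned container_bits)
{
    assert(valid_bits >= 1 && valid_bits <= container_bits);
    assert(container_bits <= 32);
    uint32_t mask = (container_bits == 32) ? 0xFFFFFFFFu
                                           : ((1u << container_bits) - 1u);
    /* Shift as unsigned to avoid undefined behavior on negative samples. */
    return ((uint32_t)sample << (container_bits - valid_bits)) & mask;
}
```

<P>With a 20-bit sample in a 24-bit container, the sample is shifted left by
four bits, so the largest positive 20-bit value 0x7FFFF is stored as
0x7FFFF0.</P>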
<P>An entity can change <B>wValidBitsPerSample</B> as it processes the
data. For example, an application would know that a stream with
<B>wValidBitsPerSample</B> = 24 must be dithered to 16 bits if the output
driver indicated that it supported <B>wValidBitsPerSample</B> = 16
only.</P>
<P>Although this can be very expensive from the standpoint of memory
bandwidth, <B>wBitsPerSample</B> can be changed as well.
<B>wValidBitsPerSample</B> indicates whether the container size
(<B>wBitsPerSample</B>) can be reduced without data loss. A stream with
<B>wValidBitsPerSample</B> = 20; <B>wBitsPerSample</B> = 32 (for
processing by a 32-bit CPU) could safely be compressed to
<B>wBitsPerSample</B> = 24 (for archiving to disk). Without
<B>wValidBitsPerSample</B>, one would not know whether this was
lossless.</P>
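<P>The lossless-reduction test described above amounts to a single
comparison. In the sketch below, <B>can_shrink_losslessly</B> and
<B>smallest_container</B> are hypothetical helper names, not part of any
Windows API.</P>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical check: a new container size loses no data if it is
   byte-aligned and still holds every valid bit. */
static int can_shrink_losslessly(uint16_t wValidBitsPerSample,
                                 uint16_t newBitsPerSample)
{
    return newBitsPerSample % 8 == 0 &&
           newBitsPerSample >= wValidBitsPerSample;
}

/* Hypothetical helper: smallest byte-aligned container for the given
   number of valid bits (rounds up to a multiple of 8). */
static uint16_t smallest_container(uint16_t wValidBitsPerSample)
{
    assert(wValidBitsPerSample >= 1 && wValidBitsPerSample <= 64);
    return (uint16_t)(((wValidBitsPerSample + 7) / 8) * 8);
}
```

<P>For the stream in the example, with 20 valid bits, a reduction from 32 to
24 bits passes the check and a reduction to 16 bits does not.</P>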
<P><BR><B>Details about wSamplesPerBlock</B><BR>It is often useful to know
how many samples are contained in one compressed block of audio data. The
<B>wSamplesPerBlock</B> is used in compressed formats that have a fixed
number of samples within each block. This value aids in buffer estimation
and position information. If <B>wSamplesPerBlock</B> is <B>0</B>, a
variable number of samples is contained in each block of compressed audio
data. In this case, buffer estimation and position information need to be
obtained in other ways.</P>
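<P>As a rough illustration of how a fixed <B>wSamplesPerBlock</B> aids
position information, the sketch below converts a block count into a sample
count. The function name <B>samples_in_blocks</B> is hypothetical, and the
value 1152 in the note that follows is only an example block size.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the number of samples contained in
   blockCount compressed blocks. Returns 0 on success, or -1 when
   wSamplesPerBlock is 0 (variable block content), in which case the
   position must be obtained some other way. */
static int samples_in_blocks(uint16_t wSamplesPerBlock, uint32_t blockCount,
                             uint64_t *samples)
{
    assert(samples != NULL);
    if (wSamplesPerBlock == 0)
        return -1;
    *samples = (uint64_t)wSamplesPerBlock * blockCount;
    return 0;
}
```

<P>With 1152 samples per block, ten blocks correspond to 11520 samples.</P>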
<P><BR><B>Details about wReserved</B><BR>If neither
<B>wValidBitsPerSample</B> nor <B>wSamplesPerBlock</B> applies to the audio
data being described by the <B>WAVEFORMATEXTENSIBLE</B> structure, set the
<B>wReserved</B> field to 0.</P>
<A name=#XSLTsection130121120120></A>
<H2>Specifying Channel Locations Using dwChannelMask </H2>
<P>To account for the possibility that only a subset of the possible
speakers is present, a new field <B>dwChannelMask</B> was created to
specify the mapping of channels to spatial locations. In support of this,
the following bit flags are defined in sdk\inc\ksmedia.h and
sdk\inc\mmreg.h:</P><PRE class=codeSample>#define SPEAKER_FRONT_LEFT 0x1
#define SPEAKER_FRONT_RIGHT 0x2
#define SPEAKER_FRONT_CENTER 0x4
#define SPEAKER_LOW_FREQUENCY 0x8
#define SPEAKER_BACK_LEFT 0x10
#define SPEAKER_BACK_RIGHT 0x20
#define SPEAKER_FRONT_LEFT_OF_CENTER 0x40
#define SPEAKER_FRONT_RIGHT_OF_CENTER 0x80
#define SPEAKER_BACK_CENTER 0x100
#define SPEAKER_SIDE_LEFT 0x200
#define SPEAKER_SIDE_RIGHT 0x400
#define SPEAKER_TOP_CENTER 0x800
#define SPEAKER_TOP_FRONT_LEFT 0x1000
#define SPEAKER_TOP_FRONT_CENTER 0x2000
#define SPEAKER_TOP_FRONT_RIGHT 0x4000
#define SPEAKER_TOP_BACK_LEFT 0x8000
#define SPEAKER_TOP_BACK_CENTER 0x10000
#define SPEAKER_TOP_BACK_RIGHT 0x20000
#define SPEAKER_RESERVED 0x80000000
</PRE>
<P>These values correspond exactly to the master channel layout defined in
various external standards.</P>
<P>As an example, assume <B>nChannels</B> = 4; <B>dwChannelMask</B> =
0x00000033. This indicates that the audio channels are intended for
playback to the Front Left, Front Right, Back Left and Back Right
speakers. The channel data should be interleaved in that order within each
block.</P>
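<P>The mask in this example can be built from the speaker flags rather than
written as a literal. In the sketch below, the four flag values are copied
from the list above; <B>QUAD_MASK</B> and <B>count_speakers</B> are
hypothetical names introduced for illustration, and real code would take the
flags from ksmedia.h or mmreg.h.</P>

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the list above. */
#define SPEAKER_FRONT_LEFT  0x1
#define SPEAKER_FRONT_RIGHT 0x2
#define SPEAKER_BACK_LEFT   0x10
#define SPEAKER_BACK_RIGHT  0x20

/* Hypothetical name for the four-channel layout in the example. */
#define QUAD_MASK (SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | \
                   SPEAKER_BACK_LEFT | SPEAKER_BACK_RIGHT)

static_assert(QUAD_MASK == 0x00000033, "matches the mask in the example");

/* Hypothetical helper: number of speaker positions named by a mask.
   Each iteration clears the lowest set bit. */
static unsigned count_speakers(uint32_t dwChannelMask)
{
    unsigned n = 0;
    while (dwChannelMask != 0) {
        dwChannelMask &= dwChannelMask - 1;
        n++;
    }
    return n;
}
```

<P>Comparing <B>count_speakers(dwChannelMask)</B> with <B>nChannels</B> is
one way to confirm that every channel has a speaker location.</P>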
<P>When using <B>WAVEFORMATEXTENSIBLE</B>, channel locations beyond this
predefined set of 18 are considered reserved. One should make no
assumptions regarding ordering of channels beyond these, other than to
assume that Microsoft will conform to additional standards.</P>
<A name=#XSLTsection131121120120></A>
<H2>Details about dwChannelMask </H2>
<P>The field <B>dwChannelMask</B> indicates which channels are present in
the multi-channel stream. The least significant bit corresponds with the
Front Left speaker, the next least significant bit corresponds to the
Front Right speaker, and so on, continuing in the order defined in the
preceding section. The channels specified in <B>dwChannelMask</B> must be present in the
prescribed order (from least significant bit up). In other words, if only
Front Left and Front Center are specified, then Front Left should come
first in the interleaved stream.</P>
<P>Should <B>nChannels</B> be less than the number of bits set in
<B>dwChannelMask</B>, then the extra (most significant) bits in
<B>dwChannelMask</B> are ignored. Should <B>nChannels</B> exceed the
number of bits set in <B>dwChannelMask</B>, then the remaining channels
are not assigned to any particular speaker location. An audio device would
render the remaining channel data to output ports not in use.</P>
<P>If an audio sink, such as WDM Audio's built-in kernel mixer, does not
know how to process the extra channels without speaker locations, the data
is not rendered. Having <B>nChannels</B> exceed the number of bits set in
<B>dwChannelMask</B> can produce inconsistent results and should be
avoided if possible.</P>
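<P>The ordering rule can be expressed as a pair of lookups between an
interleaved channel index and a speaker bit. This is a minimal sketch, not
the method used by any Windows component; <B>speaker_for_channel</B> and
<B>channel_for_speaker</B> are hypothetical names, and a caller is assumed
to pass only channel indexes below <B>nChannels</B>.</P>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the speaker bit assigned to the given
   interleaved channel index, scanning the mask from the least significant
   bit up. Returns 0 when the index is beyond the bits set in the mask,
   meaning the channel has no assigned speaker location. */
static uint32_t speaker_for_channel(uint32_t dwChannelMask, unsigned channel)
{
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        if (dwChannelMask & bit) {
            if (channel == 0)
                return bit;
            channel--;
        }
    }
    return 0;
}

/* Hypothetical inverse: returns the interleaved channel index that carries
   the given speaker, or -1 if that speaker is not in the mask. The index is
   the number of mask bits set below the speaker bit. */
static int channel_for_speaker(uint32_t dwChannelMask, uint32_t speaker_bit)
{
    /* speaker_bit must name exactly one position */
    assert(speaker_bit != 0 && (speaker_bit & (speaker_bit - 1)) == 0);
    if ((dwChannelMask & speaker_bit) == 0)
        return -1;
    uint32_t lower = dwChannelMask & (speaker_bit - 1);
    int index = 0;
    while (lower != 0) {
        lower &= lower - 1;
        index++;
    }
    return index;
}
```

<P>For a mask of 0x5 (Front Left and Front Center), channel 0 maps to Front
Left and channel 1 to Front Center. For a mask of 0x33 and a fifth channel
(index 4), <B>speaker_for_channel</B> returns 0, which corresponds to the
unassigned case described above.</P>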