<hr>
<blockquote>
<b><p>Listing 18.8. Testing the <tt><font FACE="Courier">Vol</font></tt> control tag.<br>
</b></p>
</blockquote>
<blockquote>
<tt><font FACE="Courier"><p>\Spd=150\ \Vol=30000\<br>
I \Emp\told you never to go running in the street.<br>
<br>
\pau=1000\ \Spd=75\ \Vol=65000\<br>
Didn't you \Emp\hear me?<br>
<br>
\Vol=15000\ \pau=2000\ \Spd=200\<br>
You must listen to me when I tell you something \Emp\important. <br>
\Spd=150\ \Vol=65000\</font></tt> </p>
</blockquote>
<hr>
<h3><a NAME="TheLowLevelTTSControlTags">The Low-Level TTS Control Tags</h3>
<p>There are seven low-level TTS control tags, which are used to handle TTS adjustments
not normally seen by TTS users. Most of these control tags are meant to be used by people
who are designing and training complex TTS engines and grammars. </p>
<p>Of the seven low-level TTS control tags, only one is used frequently: the <tt><font
FACE="Courier">\Rst\</font></tt> tag. This tag resets the control values to those that
existed at the start of the current session. </p>
<p>The remaining control tags are summarized in Table 18.3.<br>
</p>
<p align="center"><b>Table 18.3. The low-level TTS control tags.</b> </p>
<div align="center"><center>
<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
<tr>
<td><i>Control Tag</i></td>
<td WIDTH="180"><i>Syntax</i> </td>
<td WIDTH="313"><i>Description</i></td>
</tr>
<tr>
<td WIDTH="97">Com</td>
<td WIDTH="180">\Com=<tt><i><font FACE="Courier">string</font></i></tt>\ </td>
<td WIDTH="313">Use this tag to add comments to the text passed to the TTS engine. These
comments will be ignored by the TTS engine. </td>
</tr>
<tr>
<td WIDTH="97">Eng</td>
<td WIDTH="180">\Eng;<i><tt><font FACE="Courier">[GUID]:command</font></tt>\</i> </td>
<td WIDTH="313">Use this tag to call an engine-specific command. This can be used to call
special hardware-specific commands supported by third-party TTS engines. </td>
</tr>
<tr>
<td WIDTH="97">Mrk</td>
<td WIDTH="180">\Mrk=<tt><i><font FACE="Courier">number</font></i></tt>\ </td>
<td WIDTH="313">Use this tag to fire the BookMark event of the ITTSBufNotifySink. You can
use this to signal such things as page turns or slide changes once the place in the text
is reached. </td>
</tr>
<tr>
<td WIDTH="97">Prn</td>
<td WIDTH="180">\Prn=<tt><i><font FACE="Courier">text=IPA</font></i></tt>\ </td>
<td WIDTH="313">Use this tag to embed custom pronunciations of words using the
International Phonetic Alphabet. This may not be supported by your engine. </td>
</tr>
<tr>
<td WIDTH="97">Pro</td>
<td WIDTH="180">\Pro=<tt><i><font FACE="Courier">number</font></i></tt>\ </td>
<td WIDTH="313">Use this tag to turn on and off the TTS prosody rules. Setting the Pro
value to 1 turns the rules on; setting it to 0 turns the rules off. </td>
</tr>
<tr>
<td WIDTH="97"><tt><font FACE="Courier">Prt</font></tt></td>
<td WIDTH="180"><tt><font FACE="Courier">\Prt=<i>string</i>\</font></tt> </td>
<td WIDTH="313">Use this tag to tell the engine what part of speech the current word is.
Microsoft has defined these general categories: <br>
<tt><font FACE="Courier">Abbr</font></tt> (abbreviation)<br>
<tt><font FACE="Courier">N</font></tt> (noun)<br>
<tt><font FACE="Courier">Adj</font></tt> (adjective)<br>
<tt><font FACE="Courier">Ord</font></tt> (ordinal number)<br>
<tt><font FACE="Courier">Adv</font></tt> (adverb)<br>
<tt><font FACE="Courier">Prep</font></tt> (preposition)<br>
<tt><font FACE="Courier">Card</font></tt> (cardinal number)<br>
<tt><font FACE="Courier">Pron</font></tt> (pronoun)<br>
<tt><font FACE="Courier">Conj</font></tt> (conjunction)<br>
<tt><font FACE="Courier">Prop</font></tt> (proper noun)<br>
<tt><font FACE="Courier">Cont</font></tt> (contraction)<br>
<tt><font FACE="Courier">Punct</font></tt> (punctuation)<br>
<tt><font FACE="Courier">Det</font></tt> (determiner)<br>
<tt><font FACE="Courier">Quant</font></tt> (quantifier)<br>
<tt><font FACE="Courier">Interj</font></tt> (interjection)<br>
<tt><font FACE="Courier">V</font></tt> (verb) </td>
</tr>
</table>
</center></div><div align="center"><center>
<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
<tr>
<td><b>Note</b></td>
</tr>
<tr>
<td WIDTH="582"><blockquote>
<p>With the exception of the <tt><font FACE="Courier">\Rst\</font></tt> tag, none of
these tags produced noticeable results with the TTS engine that ships with Microsoft
Voice. For this reason, there are no examples for these control tags. </p>
</blockquote>
</td>
</tr>
</table>
</center></div>
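<p>Composing text that carries these tags is mechanical, even though the tags had no
audible effect with the engine tested here. The following Python sketch is illustrative
only. The helper names <tt><font FACE="Courier">control_tag</font></tt> and <tt><font
FACE="Courier">mark_sections</font></tt> are hypothetical and are not part of SAPI; the
tag syntax is taken from Table 18.3. The sketch prefixes each section of text with a
numbered <tt><font FACE="Courier">\Mrk\</font></tt> tag and ends with <tt><font
FACE="Courier">\Rst\</font></tt>. </p>

```python
def control_tag(name, value=None):
    # Format one control tag: backslash, name, optional "=value", backslash.
    if value is None:
        return "\\" + name + "\\"
    return "\\" + name + "=" + str(value) + "\\"


def mark_sections(sections):
    # Prefix each section with a numbered Mrk tag, then reset the controls.
    parts = []
    for number, text in enumerate(sections, start=1):
        parts.append(control_tag("Mrk", number) + " " + text)
    parts.append(control_tag("Rst"))
    return " ".join(parts)


if __name__ == "__main__":
    print(mark_sections(["Welcome to the show.", "Here is the first slide."]))
```

<p>Whether the bookmark events actually fire depends on the engine, so treat the
generated text as input to test rather than as a guaranteed result. </p>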
<p>Now that you know how to modify the way the TTS engine processes input, you are ready
to learn how to use grammar rules to control the way SR engines behave. </p>
<h2></a><a NAME="GrammarRules"><font SIZE="5" COLOR="#FF0000">Grammar Rules</font></a></h2>
<p>The grammar of the SR engine controls how the SR engine interprets audio input. The
grammar defines the objects for which the engine will listen and the rules used to analyze
the objects. SR engines require that one or more grammars be loaded and activated before
an engine can successfully interpret the audio stream. </p>
<p>As mentioned in earlier chapters, the SAPI model defines three types of SR grammars:
<ul>
<li><i>Context-free</i> grammars: Words are analyzed based on syntax rules instead of content
rules. Interpretation is based on the placement and order of the objects, not their
meaning or context. </li>
<li><i>Dictation</i> grammars: Words are compared against a large vocabulary, a predefined
topic (or context), and an expected speaking style. </li>
<li><i>Limited-domain</i> grammars: This type is a cross between rule-based context-free
grammars and word-based dictation grammars. </li>
</ul>
<p>The context-free grammar format is the most commonly used format. It is especially good
at interpreting command and control statements from the user. Context-free grammars also
allow a great deal of flexibility since the creation of a set of rules is much easier than
building and analyzing large vocabularies, as is done in dictation grammars. By defining a
small set of general rules, the SR engine can successfully respond to hundreds (or even
thousands) of valid commands without having to build each command into the SR
lexicon. The rest of this section deals with the design, compilation, and testing of
context-free grammars for the SAPI SR engine model. </p>
<h3><a NAME="GeneralRulesfortheSAPIContextFree">General Rules for the SAPI Context-Free
Grammar</a></h3>
<p>The SAPI Context-Free Grammar (CFG) operates on a limited set of rules. These rules are
used to analyze all audio input. In addition to rules, CFGs also allow for the definition
of individual words. These words become part of the grammar and can be recognized by
themselves or as part of a defined rule. </p>
<div align="center"><center>
<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
<tr>
<td><b>Note</b></td>
</tr>
<tr>
<td><blockquote>
<p>Throughout the rest of this section, you will be using <tt><font FACE="Courier">NOTEPAD.EXE</font></tt>
(or some other ASCII editor) to create grammar files that will be compiled using the <tt><font
FACE="Courier">GRAMCOMP.EXE</font></tt> grammar compiler that ships with the Microsoft
Speech SDK. You will also need the <tt><font FACE="Courier">SRTEST.EXE</font></tt>
application that ships with the Speech SDK to test your compiled grammars. Even if you do
not have the Microsoft Speech SDK, however, you can still learn a lot from this material. </p>
</blockquote>
</td>
</tr>
</table>
</center></div>
<h4>Defining Words in a CFG</h4>
<p>In SAPI CFGs, each defined word is assigned a unique ID number. This is done by listing
each word, followed by a number. Listing 18.9 shows an example. </p>
<hr>
<blockquote>
<b><p>Listing 18.9. Defining words for a CFG file.<br>
</b></p>
</blockquote>
<blockquote>
<tt><font FACE="Courier"><p>//<br>
// defining names<br>
//<br>
Lee = 101 ;<br>
Shannon = 102 ;<br>
Jesse = 103 ;<br>
Scott = 104 ;<br>
Michelle = 105 ;<br>
Sue = 106 ;</font></tt> </p>
</blockquote>
<hr>
<p>Notice that there are spaces between each item on the line. The Microsoft <tt><font
FACE="Courier">GRAMCOMP.EXE</font></tt> program requires that each item be separated by
white space. Also note that a semicolon (;) must appear at the end of each definition. </p>
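<p>If the word list is long, the definition lines can be generated rather than typed. The
following Python sketch is one way to do that, under stated assumptions: the function
name <tt><font FACE="Courier">word_definitions</font></tt> and the starting ID of 101
are illustrative choices, not requirements of <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt>.
It emits one line per word in the format of Listing 18.9 and rejects duplicate words so
that every ID stays unique. </p>

```python
def word_definitions(words, first_id=101):
    # Build lines of the form "Word = ID ;" with items separated by spaces.
    seen = set()
    lines = []
    next_id = first_id
    for word in words:
        if word in seen:
            raise ValueError("duplicate word: " + word)
        seen.add(word)
        lines.append(f"{word} = {next_id} ;")
        next_id += 1
    return lines


if __name__ == "__main__":
    for line in word_definitions(["Lee", "Shannon", "Jesse"]):
        print(line)
```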
<div align="center"><center>
<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
<tr>
<td><b>Tip</b></td>
</tr>
<tr>
<td><blockquote>
<p>If you are using the <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt> compiler, you
are not required to define each word and give it a number. The <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt>
program automatically assigns a number to each new word for you. However, it is a good
idea to predefine words to prevent any potential conflicts at compile time. </p>
</blockquote>
</td>
</tr>
</table>
</center></div>
<p>The list of words can be as short or as long as you require. Keep in mind that the SR
engine can only recognize words that appear in the vocabulary. If you fail to define the
word "Stop," you can holler <i>Stop!</i> to the engine as long as you like, but
it will have no idea what you are saying! Also, the longer the list, the more likely it is
that the engine will confuse one word for another. As the list increases in size, the
accuracy of the engine decreases. Try to keep your lists as short as possible. </p>
<h4>Defining Rules in a CFG</h4>
<p>Along with words, CFGs require rules to interpret the audio stream. Each rule consists
of two parts, the rule name and the series of operations that define the rule: </p>
<blockquote>
<tt><font FACE="Courier"><p>&lt;RuleName&gt; = [series of operations]</font></tt> </p>
</blockquote>
<p>There are several possible operations within a rule. You can call another rule, list a
set of recognizable words, or refer to an external list of words. There are also several
special functions defined for CFGs. These functions define interpretation options for the
input stream. There are four CFG functions recognized by the <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt>
compiler:
<ul>
<li><tt><font FACE="Courier">alt()</font></tt>: Use this function to list a set of
alternative inputs. </li>
<li><tt><font FACE="Courier">seq()</font></tt>: Use this function to indicate to the SR
engine the sequence in which input will occur. </li>
<li><tt><font FACE="Courier">opt()</font></tt>: Use this function to inform the SR engine
that the word or rule is optional and may not appear as part of the input stream. </li>
<li><tt><font FACE="Courier">rep()</font></tt>: Use this function to tell the SR engine that
this word or rule may be repeated several times. </li>
</ul>
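<p>The meaning of the four functions can be modeled in a few lines of Python. This is a
sketch of the semantics only, not a description of how an SR engine or <tt><font
FACE="Courier">GRAMCOMP.EXE</font></tt> works internally. It operates on a list of
already recognized words, and it assumes that <tt><font FACE="Courier">rep()</font></tt>
means one or more repetitions. The words "call" and "please" are hypothetical vocabulary
chosen for the example. </p>

```python
# Each grammar element is a function taking (tokens, pos) and returning the
# set of positions where a match starting at pos could end.

def word(w):
    def match(tokens, pos):
        if pos < len(tokens) and tokens[pos] == w:
            return {pos + 1}
        return set()
    return match


def seq(*parts):
    # Every part must match, one after the other, in the given order.
    def match(tokens, pos):
        positions = {pos}
        for part in parts:
            positions = {end for p in positions for end in part(tokens, p)}
        return positions
    return match


def alt(*choices):
    # Any one of the choices may match.
    def match(tokens, pos):
        ends = set()
        for choice in choices:
            ends |= choice(tokens, pos)
        return ends
    return match


def opt(part):
    # The part may be present or absent.
    def match(tokens, pos):
        return {pos} | part(tokens, pos)
    return match


def rep(part):
    # Assumption: one or more repetitions of the part.
    def match(tokens, pos):
        ends = set()
        frontier = part(tokens, pos)
        while frontier:
            ends |= frontier
            frontier = {e for p in frontier for e in part(tokens, p)} - ends
        return ends
    return match


def accepts(rule, phrase):
    # A phrase is accepted when the rule can consume every word in it.
    tokens = phrase.split()
    return len(tokens) in rule(tokens, 0)


command = seq(word("call"), opt(word("please")), alt(word("Scott"), word("Wayne")))
digits = rep(alt(word("one"), word("two")))

if __name__ == "__main__":
    print(accepts(command, "call Scott"))
    print(accepts(command, "Scott call"))
    print(accepts(digits, "one two one"))
```

<p>Because each function returns another matcher, the functions nest in the same way the
grammar functions nest inside a rule definition. </p>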
<h4>Using the <tt><font FACE="Courier">alt()</font></tt> Rule Function</h4>
<p>When building a rule definition, you can tell the SR engine that only one of the items
in the list is expected. Listing 18.10 shows how this is done. </p>
<hr>
<blockquote>
<b><p>Listing 18.10. An example of the <tt><font FACE="Courier">alt()</font></tt> rule
function.<br>
</b></p>
</blockquote>
<blockquote>
<tt><font FACE="Courier"><p>&lt;Names&gt; = alt(<br>
Scott<br>
Wayne<br>
Curt<br>
)alt ;</font></tt> </p>
</blockquote>
<hr>
<p>The <tt><font FACE="Courier">&lt;Names&gt;</font></tt> rule in Listing 18.10 defines
three alternative names. This tells the SR engine that only one of the names
will be spoken at each occurrence of the rule. </p>
<h4>Using the <tt><font FACE="Courier">seq()</font></tt> Rule Function</h4>
<p>You can also define a rule that indicates the sequence in which words will be spoken.
Listing 18.11 shows how you can modify the <tt><font FACE="Courier">&lt;Names&gt;</font></tt>
rule to also include last names as part of the rule. </p>
<hr>
<blockquote>
<b><p>Listing 18.11. An example of the <tt><font FACE="Courier">seq()</font></tt> rule
function.<br>
</b></p>
</blockquote>
<blockquote>
<tt><font FACE="Courier"><p>&lt;Names&gt; = alt(<br>
Scott<br>
seq( Scott Ivey )seq<br>
Wayne<br>
seq( Wayne Ivey )seq<br>
Curt<br>
seq( Curt Smith )seq<br>
)alt ;</font></tt> </p>
</blockquote>
<hr>
<p>The <tt><font FACE="Courier">&lt;Names&gt;</font></tt> rule now lists six alternatives.
Three of them include two-word phrases that must be spoken in the proper order to be
recognized. For example, users could say <i>Scott</i> or <i>Scott Ivey</i>, and the SR
engine would recognize the input. However, if the user said <i>Ivey Scott</i>, the system
would not understand the input. </p>
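<p>The effect of word order can be checked with a small Python model of the rule in
Listing 18.11. This is only an illustration of which phrases the rule allows, not of how
the SR engine matches audio. The names <tt><font FACE="Courier">NAMES</font></tt> and
<tt><font FACE="Courier">is_recognized</font></tt> are hypothetical. </p>

```python
# The six alternatives of Listing 18.11, each stored as an ordered tuple of words.
NAMES = [
    ("Scott",),
    ("Scott", "Ivey"),
    ("Wayne",),
    ("Wayne", "Ivey"),
    ("Curt",),
    ("Curt", "Smith"),
]


def is_recognized(phrase):
    # Tuples compare element by element, so the word order must match exactly.
    return tuple(phrase.split()) in NAMES


if __name__ == "__main__":
    for phrase in ["Scott", "Scott Ivey", "Ivey Scott"]:
        print(phrase, is_recognized(phrase))
```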
<h4>Using the <tt><font FACE="Courier">opt()</font></tt> Rule Function</h4>