<hr>

<blockquote>
  <b><p>Listing 18.8. Testing the <tt><font FACE="Courier">Vol</font></tt> control tag.<br>
  </b></p>
</blockquote>

<blockquote>
  <tt><font FACE="Courier"><p>\Spd=150\ \Vol=30000\<br>
  I \Emp\told you never to go running in the street.<br>
  <br>
  \pau=1000\ \Spd=75\ \Vol=65000\<br>
  Didn't you \Emp\hear me?<br>
  <br>
  \Vol=15000\ \pau=2000\ \Spd=200\<br>
  You must listen to me when I tell you something \Emp\important. <br>
  \Spd=150\ \Vol=65000\</font></tt> </p>
</blockquote>

<hr>
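<p>The tagged text in Listing 18.8 follows a regular pattern: a backslash, a tag name, an optional <tt><font FACE="Courier">=value</font></tt>, and a closing backslash. The following Python sketch is not part of SAPI. It only illustrates that pattern by splitting a string into tags and plain text. The function name <tt><font FACE="Courier">split_control_tags</font></tt> and the regular expression are assumptions made for this illustration, and tags that use other separators are not handled.</p>

```python
import re

# Assumed tag shape: backslash, name, optional "=value", closing backslash.
# This is an illustration of the syntax only, not how a TTS engine parses text.
TAG_PATTERN = re.compile(r"\\([A-Za-z]+)(?:=([^\\]*))?\\")

def split_control_tags(text):
    """Return ("tag", name, value) and ("text", chunk) tuples in input order."""
    tokens = []
    position = 0
    for match in TAG_PATTERN.finditer(text):
        chunk = text[position:match.start()].strip()
        if chunk:
            tokens.append(("text", chunk))
        # value is None for tags written without "=value", such as \Emp\
        tokens.append(("tag", match.group(1), match.group(2)))
        position = match.end()
    tail = text[position:].strip()
    if tail:
        tokens.append(("text", tail))
    return tokens

sample = r"\Spd=150\ \Vol=30000\ I \Emp\told you never to go running in the street."
for token in split_control_tags(sample):
    print(token)
```

<p>A check like this can help catch a misplaced backslash before the text is passed to the engine.</p>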

<h3><a NAME="TheLowLevelTTSControlTags">The Low-Level TTS Control Tags</h3>

<p>There are seven low-level TTS control tags, which are used to handle TTS adjustments 
not normally seen by TTS users. Most of these control tags are meant to be used by people 
who are designing and training complex TTS engines and grammars. </p>

<p>Of the seven low-level TTS control tags, only one is used frequently: the <tt><font
FACE="Courier">\Rst\</font></tt> tag. This tag resets the control values to those that 
existed at the start of the current session. </p>

<p>The remaining control tags are summarized in Table 18.3.<br>
</p>

<p align="center"><b>Table 18.3. The low-level TTS control tags.</b> </p>
<div align="center"><center>

<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
  <tr>
    <td><i>Control Tag</i></td>
    <td WIDTH="180"><i>Syntax</i> </td>
    <td WIDTH="313"><i>Description</i></td>
  </tr>
  <tr>
    <td WIDTH="97">Com</td>
    <td WIDTH="180">\Com=<tt><i><font FACE="Courier">string</font></i></tt>\ </td>
    <td WIDTH="313">Use this tag to add comments to the text passed to the TTS engine. These 
    comments will be ignored by the TTS engine. </td>
  </tr>
  <tr>
    <td WIDTH="97">Eng</td>
    <td WIDTH="180">\Eng;<tt><i><font FACE="Courier">[GUID]:command</font></i></tt>\ </td>
    <td WIDTH="313">Use this tag to call an engine-specific command. This can be used to call 
    special hardware-specific commands supported by third-party TTS engines. </td>
  </tr>
  <tr>
    <td WIDTH="97">Mrk</td>
    <td WIDTH="180">\Mrk=<tt><i><font FACE="Courier">number</font></i></tt>\ </td>
    <td WIDTH="313">Use this tag to fire the BookMark event of the ITTSBufNotifySink. You can 
    use this to signal such things as page turns or slide changes once the place in the text 
    is reached. </td>
  </tr>
  <tr>
    <td WIDTH="97">Prn</td>
    <td WIDTH="180">\Prn=<tt><i><font FACE="Courier">text=IPA</font></i></tt>\ </td>
    <td WIDTH="313">Use this tag to embed custom pronunciations of words using the 
    International Phonetic Alphabet. This may not be supported by your engine. </td>
  </tr>
  <tr>
    <td WIDTH="97">Pro</td>
    <td WIDTH="180">\Pro=<tt><i><font FACE="Courier">number</font></i></tt>\ </td>
    <td WIDTH="313">Use this tag to turn on and off the TTS prosody rules. Setting the Pro 
    value to 1 turns the rules on; setting it to 0 turns the rules off. </td>
  </tr>
  <tr>
    <td WIDTH="97"><tt><font FACE="Courier">Prt</font></tt></td>
    <td WIDTH="180"><tt><font FACE="Courier">\Prt=<i>string</i>\</font></tt> </td>
    <td WIDTH="313">Use this tag to tell the engine what part of speech the current word is. 
    Microsoft has defined these general categories: <br>
    <tt><font FACE="Courier">Abbr</font></tt> (abbreviation)<br>
    <tt><font FACE="Courier">N</font></tt> (noun)<br>
    <tt><font FACE="Courier">Adj</font></tt> (adjective)<br>
    <tt><font FACE="Courier">Ord</font></tt> (ordinal number)<br>
    <tt><font FACE="Courier">Adv</font></tt> (adverb)<br>
    <tt><font FACE="Courier">Prep</font></tt> (preposition)<br>
    <tt><font FACE="Courier">Card</font></tt> (cardinal number)<br>
    <tt><font FACE="Courier">Pron</font></tt> (pronoun)<br>
    <tt><font FACE="Courier">Conj</font></tt> (conjunction)<br>
    <tt><font FACE="Courier">Prop</font></tt> (proper noun)<br>
    <tt><font FACE="Courier">Cont</font></tt> (contraction)<br>
    <tt><font FACE="Courier">Punct</font></tt> (punctuation)<br>
    <tt><font FACE="Courier">Det</font></tt> (determiner)<br>
    <tt><font FACE="Courier">Quant</font></tt> (quantifier)<br>
    <tt><font FACE="Courier">Interj</font></tt> (interjection)<br>
    <tt><font FACE="Courier">V</font></tt> (verb) </td>
  </tr>
</table>
</center></div><div align="center"><center>

<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
  <tr>
    <td><b>Note</b></td>
  </tr>
  <tr>
    <td WIDTH="582"><blockquote>
      <p>With the exception of the <tt><font FACE="Courier">\Rst\</font></tt> tag, none of the 
      other tags produced noticeable results using the TTS engine that ships with Microsoft 
      Voice. For this reason, there are no examples for these control tags. </p>
    </blockquote>
    </td>
  </tr>
</table>
</center></div>

<p>Now that you know how to modify the way the TTS engine processes input, you are ready 
to learn how to use grammar rules to control the way SR engines behave. </p>

<h2></a><a NAME="GrammarRules"><font SIZE="5" COLOR="#FF0000">Grammar Rules</font></a></h2>

<p>The grammar of the SR engine controls how the SR engine interprets audio input. The 
grammar defines the objects for which the engine will listen and the rules used to analyze 
the objects. SR engines require that one or more grammars be loaded and activated before 
an engine can successfully interpret the audio stream. </p>

<p>As mentioned in earlier chapters, the SAPI model defines three types of SR grammars: 

<ul>
  <li><i>Context-free</i> grammars: Words are analyzed based on syntax rules instead of content 
    rules. Interpretation is based on the placement and order of the objects, not their 
    meaning or context. </li>
  <li><i>Dictation</i> grammars: Words are compared against a large vocabulary, a predefined 
    topic (or context), and an expected speaking style. </li>
  <li><i>Limited-domain</i> grammars: These grammars are a cross between rule-based context-free 
    grammars and word-based dictation grammars. </li>
</ul>

<p>The context-free grammar format is the most commonly used format. It is especially good 
at interpreting command and control statements from the user. Context-free grammars also 
allow a great deal of flexibility, since creating a set of rules is much easier than 
building and analyzing the large vocabularies that dictation grammars require. With a 
small set of general rules, the SR engine can successfully respond to hundreds (or even 
thousands) of valid commands, without each command having to be built into the SR 
lexicon. The rest of this section deals with the design, compilation, and testing of 
context-free grammars for the SAPI SR engine model. </p>

<h3><a NAME="GeneralRulesfortheSAPIContextFree">General Rules for the SAPI Context-Free 
Grammar</a></h3>

<p>The SAPI Context-Free Grammar (CFG) operates on a limited set of rules. These rules are 
used to analyze all audio input. In addition to rules, CFGs also allow for the definition 
of individual words. These words become part of the grammar and can be recognized by 
themselves or as part of a defined rule. </p>
<div align="center"><center>

<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
  <tr>
    <td><b>Note</b></td>
  </tr>
  <tr>
    <td><blockquote>
      <p>Throughout the rest of this section, you will be using <tt><font FACE="Courier">NOTEPAD.EXE</font></tt> 
      (or some other ASCII editor) to create grammar files that will be compiled using the <tt><font
      FACE="Courier">GRAMCOMP.EXE</font></tt> grammar compiler that ships with the Microsoft 
      Speech SDK. You will also need the <tt><font FACE="Courier">SRTEST.EXE</font></tt> 
      application that ships with the Speech SDK to test your compiled grammars. Even if you do 
      not have the Microsoft Speech SDK, however, you can still learn a lot from this material. </p>
    </blockquote>
    </td>
  </tr>
</table>
</center></div>

<h4>Defining Words in a CFG</h4>

<p>In SAPI CFGs, each defined word is assigned a unique ID number. This is done by listing 
each word, followed by a number. Listing 18.9 shows an example. </p>

<hr>

<blockquote>
  <b><p>Listing 18.9. Defining words for a CFG file.<br>
  </b></p>
</blockquote>

<blockquote>
  <tt><font FACE="Courier"><p>//<br>
  // defining names<br>
  //<br>
  Lee = 101 ;<br>
  Shannon = 102 ;<br>
  Jesse = 103 ;<br>
  Scott = 104 ;<br>
  Michelle = 105 ;<br>
  Sue = 106 ;</font></tt> </p>
</blockquote>

<hr>

<p>Notice the spaces between the items on each line. The Microsoft <tt><font
FACE="Courier">GRAMCOMP.EXE</font></tt> program requires that each item be separated by 
white space. Also note that a semicolon (;) must appear at the end of each definition. </p>
<div align="center"><center>

<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
  <tr>
    <td><b>Tip</b></td>
  </tr>
  <tr>
    <td><blockquote>
      <p>If you are using the <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt> compiler, you 
      are not required to define each word and give it a number. The <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt> 
      program automatically assigns a number to each new word for you. However, it is a good 
      idea to predefine words to prevent any potential conflicts at compile time. </p>
    </blockquote>
    </td>
  </tr>
</table>
</center></div>
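<p>The word-definition format in Listing 18.9 is simple enough to check before compiling. The following Python sketch is only an illustration and does not reproduce how <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt> works. It assumes one definition per line, white space between every item, and a closing semicolon, and it rejects a repeated ID number. The function name <tt><font FACE="Courier">parse_word_definitions</font></tt> is hypothetical.</p>

```python
def parse_word_definitions(source):
    """Map each defined word to its ID number (illustrative checker only)."""
    words = {}
    for raw_line in source.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comment lines
        tokens = line.split()
        # Expect exactly four white-space-separated items: word = number ;
        if len(tokens) != 4 or tokens[1] != "=" or tokens[3] != ";":
            raise ValueError(f"malformed definition: {raw_line!r}")
        word, number = tokens[0], int(tokens[2])
        if number in words.values():
            raise ValueError(f"duplicate ID {number} for {word!r}")
        words[word] = number
    return words

listing = """
//
// defining names
//
Lee = 101 ;
Shannon = 102 ;
Jesse = 103 ;
"""
print(parse_word_definitions(listing))
```

<p>A line such as <tt><font FACE="Courier">Lee=101;</font></tt>, with no white space between the items, raises an error in this sketch, which mirrors the spacing requirement described above.</p>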

<p>The list of words can be as short or as long as you require. Keep in mind that the SR 
engine can only recognize words that appear in the vocabulary. If you fail to define the 
word &quot;Stop,&quot; you can holler <i>Stop!</i> to the engine as long as you like, but 
it will have no idea what you are saying! Also, the longer the list, the more likely it is 
that the engine will confuse one word for another. As the list increases in size, the 
accuracy of the engine decreases. Try to keep your lists as short as possible. </p>

<h4>Defining Rules in a CFG</h4>

<p>Along with words, CFGs require rules to interpret the audio stream. Each rule consists 
of two parts, the rule name and the series of operations that define the rule: </p>

<blockquote>
  <tt><font FACE="Courier"><p>&lt;RuleName&gt; = [series of operations]</font></tt> </p>
</blockquote>

<p>There are several possible operations within a rule. You can call another rule, list a 
set of recognizable words, or refer to an external list of words. There are also several 
special functions defined for CFGs. These functions define interpretation options for the 
input stream. There are four CFG functions recognized by the <tt><font FACE="Courier">GRAMCOMP.EXE</font></tt> 
compiler: 

<ul>
  <li><tt><font FACE="Courier">alt()</font></tt>: Use this function to list a set of 
    alternative inputs. </li>
  <li><tt><font FACE="Courier">seq()</font></tt>: Use this function to indicate to the SR 
    engine the sequence in which input will occur. </li>
  <li><tt><font FACE="Courier">opt()</font></tt>: Use this function to inform the SR engine 
    that the word or rule is optional and may not appear as part of the input stream. </li>
  <li><tt><font FACE="Courier">rep()</font></tt>: Use this function to tell the SR engine that 
    this word or rule could be repeated several times. </li>
</ul>
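<p>To get a feel for how the four functions combine, the following Python sketch models each one as a small matching function over a list of words. It is a toy model, not the SR engine's algorithm. The sketch assumes that <tt><font FACE="Courier">rep()</font></tt> means one or more repetitions, and the words in the example rule are hypothetical.</p>

```python
def word(text):
    # Match one literal word; return the set of positions where the match can end.
    def match(tokens, i):
        return {i + 1} if i < len(tokens) and tokens[i] == text else set()
    return match

def alt(*items):
    # Any one of the listed items may match.
    def match(tokens, i):
        ends = set()
        for item in items:
            ends |= item(tokens, i)
        return ends
    return match

def seq(*items):
    # Every item must match, one after another, in the given order.
    def match(tokens, i):
        positions = {i}
        for item in items:
            positions = {end for p in positions for end in item(tokens, p)}
        return positions
    return match

def opt(item):
    # The item may be present or absent.
    def match(tokens, i):
        return {i} | item(tokens, i)
    return match

def rep(item):
    # Assumption: the item appears one or more times.
    def match(tokens, i):
        ends, frontier = set(), {i}
        while frontier:
            frontier = {e for p in frontier for e in item(tokens, p)} - ends
            ends |= frontier
        return ends
    return match

def accepts(rule, phrase):
    tokens = phrase.split()
    return len(tokens) in rule(tokens, 0)

# Hypothetical rule: an optional "Please", then "call", then one of two names.
command = seq(opt(word("Please")), word("call"), alt(word("Scott"), word("Wayne")))
print(accepts(command, "Please call Scott"))
print(accepts(command, "call Wayne"))
print(accepts(command, "call"))
```

<p>Each function returns the set of positions where a match could end, so nested rules compose without any special cases.</p>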

<h4>Using the <tt><font FACE="Courier">alt()</font></tt> Rule Function</h4>

<p>When building a rule definition, you can tell the SR engine that only one of the items 
in the list is expected. Listing 18.10 shows how this is done. </p>

<hr>

<blockquote>
  <b><p>Listing 18.10. An example of the <tt><font FACE="Courier">alt()</font></tt> rule 
  function.<br>
  </b></p>
</blockquote>

<blockquote>
  <tt><font FACE="Courier"><p>&lt;Names&gt; = alt(<br>
  Scott<br>
  Wayne<br>
  Curt<br>
  )alt ;</font></tt> </p>
</blockquote>

<hr>

<p>The <tt><font FACE="Courier">&lt;Names&gt;</font></tt> rule in Listing 18.10 defines 
three alternative names for the rule. This tells the SR engine that only one of the names 
will be spoken at a single occurrence. </p>

<h4>Using the <tt><font FACE="Courier">seq()</font></tt> Rule Function</h4>

<p>You can also define a rule that indicates the sequence in which words will be spoken. 
Listing 18.11 shows how you can modify the <tt><font FACE="Courier">&lt;Names&gt;</font></tt> 
rule to also include last names as part of the rule. </p>

<hr>

<blockquote>
  <b><p>Listing 18.11. An example of the <tt><font FACE="Courier">seq()</font></tt> rule 
  function.<br>
  </b></p>
</blockquote>

<blockquote>
  <tt><font FACE="Courier"><p>&lt;Names&gt; = alt(<br>
  &nbsp;&nbsp;&nbsp;&nbsp;Scott<br>
  &nbsp;&nbsp;&nbsp;&nbsp;seq( Scott Ivey )seq<br>
  &nbsp;&nbsp;&nbsp;&nbsp;Wayne<br>
  &nbsp;&nbsp;&nbsp;&nbsp;seq( Wayne Ivey )seq<br>
  &nbsp;&nbsp;&nbsp;&nbsp;Curt<br>
  &nbsp;&nbsp;&nbsp;&nbsp;seq( Curt Smith )seq<br>
  &nbsp;&nbsp;&nbsp;&nbsp;)alt ;</font></tt> </p>
</blockquote>

<hr>

<p>The <tt><font FACE="Courier">&lt;Names&gt;</font></tt> rule now lists six alternatives. 
Three of them include two-word phrases that must be spoken in the proper order to be 
recognized. For example, users could say <i>Scott</i> or <i>Scott Ivey</i>, and the SR 
engine would recognize the input. However, if the user said <i>Ivey Scott</i>, the system 
would not understand the input. </p>
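<p>One way to confirm what the rule in Listing 18.11 accepts is to expand it into every phrase it allows. The following Python sketch does that for rules built only from <tt><font FACE="Courier">alt()</font></tt> and <tt><font FACE="Courier">seq()</font></tt>. The nested tuple representation and the <tt><font FACE="Courier">expand</font></tt> function are assumptions made for this illustration and are not part of the grammar compiler.</p>

```python
from itertools import product

def expand(node):
    """Return every phrase, as a tuple of words, that a node accepts."""
    if isinstance(node, str):
        return [(node,)]
    kind, children = node
    if kind == "alt":
        # Any single child may be spoken.
        return [phrase for child in children for phrase in expand(child)]
    if kind == "seq":
        # Every child must be spoken, in the order listed.
        combos = product(*(expand(child) for child in children))
        return [sum(parts, ()) for parts in combos]
    raise ValueError(f"unsupported operation: {kind}")

# The <Names> rule from Listing 18.11, written as nested tuples.
names = ("alt", [
    "Scott", ("seq", ["Scott", "Ivey"]),
    "Wayne", ("seq", ["Wayne", "Ivey"]),
    "Curt", ("seq", ["Curt", "Smith"]),
])

phrases = {" ".join(words) for words in expand(names)}
print(sorted(phrases))
print("Scott Ivey" in phrases)
print("Ivey Scott" in phrases)
```

<p>The expansion yields the six alternatives, and the reversed phrase is not among them because <tt><font FACE="Courier">seq()</font></tt> fixes the word order.</p>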

<h4>Using the <tt><font FACE="Courier">opt()</font></tt> Rule Function</h4>
