<tr>
<td WIDTH="176"><tt><font FACE="Courier">ISRResEval</font></tt> </td>
<td WIDTH="414">Used to re-evaluate the results of the previous recognition. This could be
used by the engine to request the speaker to repeat training phrases and use the new
information to re-evaluate previous interpretations. </td>
</tr>
<tr>
<td WIDTH="176"><tt><font FACE="Courier">ISRResSpeaker</font></tt> </td>
<td WIDTH="414">Used to identify the speaker performing the dictation. Could be used to
improve engine performance by comparing stored information from previous sessions with the
same speaker. </td>
</tr>
<tr>
<td WIDTH="176"><tt><font FACE="Courier">ISRResModifyGUI</font></tt> </td>
<td WIDTH="414">Used to provide a pop-up window asking the user to confirm the engine's
interpretation. Could also provide a list of alternate results to choose from. </td>
</tr>
<tr>
<td WIDTH="176"><tt><font FACE="Courier">ISRResMerge</font></tt> </td>
<td WIDTH="414">Used to merge data from two different recognition events into a single
unit for evaluation purposes. This can be done to improve the system's knowledge about a
speaker or phrase. </td>
</tr>
<tr>
<td WIDTH="176"><tt><font FACE="Courier">ISRResMemory</font></tt> </td>
<td WIDTH="414">Used to allocate and release memory used by results objects. This is
strictly a housekeeping function. </td>
</tr>
</table>
</center></div>
<h3><a NAME="TexttoSpeech">Text-to-Speech</a></h3>
<p>The low-level text-to-speech services are provided by one primary object: the TTS <tt><font
FACE="Courier">Engine</font></tt> object. Like the SR object set, the TTS object set has
an <tt><font FACE="Courier">Enumerator</font></tt> object and an <tt><font FACE="Courier">Engine
Enumerator</font></tt> object. These objects are used to locate and select a valid TTS <tt><font
FACE="Courier">Engine</font></tt> object and are then discarded (see Figure 15.3). </p>
<p><a HREF="f15-3.gif"><b>Figure 15.3 : </b><i>Mapping the low-level SAPI objects.</i></a>
</p>
<p>The TTS services also use an audio output object. The default output object is the
PC speakers, but this can be set to the telephone device. Applications can also create
their own output devices, including the creation of a WAV format recording device as the
output for TTS engine activity. </p>
<p>The rest of this section discusses the details of the low-level SAPI TTS objects. </p>
<h4>The TTS <tt><font FACE="Courier">Enumerator</font></tt> and <tt><font FACE="Courier">Engine
Enumerator</font></tt> Objects </h4>
<p>The TTS <tt><font FACE="Courier">Enumerator</font></tt> and <tt><font FACE="Courier">Engine
Enumerator</font></tt> objects are used to obtain a list of the available TTS engines and
their speaking modes. They both support two interfaces:
<ul>
<li><tt><font FACE="Courier">ITTSEnum</font></tt>: Used to obtain a list of the available TTS
engines. </li>
<li><tt><font FACE="Courier">ITTSFind</font></tt>: Used to obtain a pointer to the requested
TTS engine. </li>
</ul>
<p>Once the objects have provided a valid address to a TTS engine object, the TTS <tt><font
FACE="Courier">Enumerator</font></tt> and <tt><font FACE="Courier">Engine Enumerator</font></tt>
objects can be discarded. </p>
<h4>The TTS <tt><font FACE="Courier">Engine</font></tt> Object </h4>
<p>The TTS <tt><font FACE="Courier">Engine</font></tt> object is the primary object of
low-level SAPI TTS services. The <tt><font FACE="Courier">Engine</font></tt> object
supports several interfaces. Table 15.5 lists the interfaces used for the translation of
text into audible speech. </p>
<p align="center"><b>Table 15.5. The TTS <tt><font FACE="Courier">Engine</font></tt>
object interfaces.</b> </p>
<div align="center"><center>
<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
<tr>
<td><i>Interface Name</i></td>
<td WIDTH="434"><i>Description</i> </td>
</tr>
<tr>
<td WIDTH="157"><tt><font FACE="Courier">ITTSCentral</font></tt> </td>
<td WIDTH="434">The main interface for the TTS engine object. It is used to register an
application with the TTS system, to start, pause, and stop TTS playback, and so
on. </td>
</tr>
<tr>
<td WIDTH="157"><tt><font FACE="Courier">ITTSDialogs</font></tt> </td>
<td WIDTH="434">Used to provide a connection to several dialog boxes. The exact contents
of each dialog box are determined by the engine provider, not by Microsoft. The dialog boxes
defined for the interface are: <br>
About Box<br>
General Dialog<br>
Lexicon Dialog<br>
Training Dialog </td>
</tr>
<tr>
<td WIDTH="157"><tt><font FACE="Courier">ITTSAttributes</font></tt> </td>
<td WIDTH="434">Used to set and retrieve control parameters of the TTS engine, including
playback speed and volume, playback device, and so on. </td>
</tr>
</table>
</center></div>
<p>In addition to the interfaces described in Table 15.5, the TTS <tt><font FACE="Courier">Engine</font></tt>
object supports two notification callbacks:
<ul>
<li><tt><font FACE="Courier">ITTSNotifySink</font></tt>: Used to send the application
messages regarding the playback of text as audio output, including the start and stop of
playback and other events. </li>
<li><tt><font FACE="Courier">ITTSBufNotifySink</font></tt>: Used to send messages regarding
the status of text in the playback buffer. If the content of the buffer changes, messages
are sent to the application using the TTS engine. </li>
</ul>
<h2><a NAME="SpeechObjectsandOLEAutomation"><font SIZE="5" COLOR="#FF0000">Speech Objects
and OLE Automation</font></a></h2>
<p>Microsoft supplies an OLE Automation type library with the Speech SDK. This type
library can be used with any VBA-compliant software, including Visual Basic, Access,
Excel, and others. The OLE Automation set provides high-level SAPI services only. The
objects, properties, and methods are quite similar to the objects and interfaces provided
by the high-level SAPI services described at the beginning of this chapter. </p>
<p>There are two type library files in the Microsoft Speech SDK:
<ul>
<li><tt><font FACE="Courier">VCAUTO.TLB</font></tt> supplies the speech recognition
services. </li>
<li><tt><font FACE="Courier">VTXTAUTO.TLB</font></tt> supplies the text-to-speech services. </li>
</ul>
<p>You can load these libraries into a Visual Basic project by way of the <tt><font
FACE="Courier">Tools | References</font></tt> menu item (see Figure 15.4). </p>
<p><a HREF="f15-4.gif"><b>Figure 15.4 : </b><i>Loading the Voice Command and Voice Text
type libraries.</i></a> </p>
<h3><a NAME="OLEAutomationSpeechRecognitionServic">OLE Automation Speech Recognition
Services</a></h3>
<p>The OLE Automation speech recognition services are implemented using two objects:
<ul>
<li><font COLOR="#000000">The OLE </font><tt><font FACE="Courier">Voice Command</font></tt>
object </li>
<li><font COLOR="#000000">The OLE </font><tt><font FACE="Courier">Voice Menu</font></tt>
object </li>
</ul>
<p>The OLE <tt><font FACE="Courier">Voice Command</font></tt> object has three properties
and two methods. Table 15.6 shows the <tt><font FACE="Courier">Voice Command</font></tt>
object's properties and methods, along with their parameters and short descriptions. </p>
<p align="center"><b>Table 15.6. The properties and methods of the OLE <tt><font
FACE="Courier">Voice Command</font></tt> object.</b> </p>
<div align="center"><center>
<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
<tr>
<td><i>Property/Method Name</i></td>
<td WIDTH="197"><i>Parameters</i> </td>
<td WIDTH="197"><i>Description</i></td>
</tr>
<tr>
<td WIDTH="197"><tt><font FACE="Courier">Register</font></tt> method </td>
<td WIDTH="197"> </td>
<td WIDTH="197">This method is used to register the application with the SR engine. It
must be called before any speech recognition will occur. </td>
</tr>
<tr>
<td WIDTH="197"><tt><font FACE="Courier">CallBack</font></tt> property </td>
<td WIDTH="197"><tt><font FACE="Courier">Project.Class</font></tt> as string </td>
<td WIDTH="197">Visual Basic 4.0 programs can use this property to identify an existing
class module that has two special methods defined. (See the following section, "Using
the Voice Command Callback.") </td>
</tr>
<tr>
<td WIDTH="197"><tt><font FACE="Courier">Awake</font></tt> property </td>
<td WIDTH="197"><tt><font FACE="Courier">TRUE</font></tt>/<tt><font FACE="Courier">FALSE</font></tt>
</td>
<td WIDTH="197">Use this property to turn speech recognition on or off for the
application. </td>
</tr>
<tr>
<td WIDTH="197"><tt><font FACE="Courier">CommandSpoken</font></tt> property </td>
<td WIDTH="197"><tt><font FACE="Courier">cmdNum</font></tt> as integer </td>
<td WIDTH="197">Use this property to determine which command was heard by the SR engine.
VB4 applications do not need to use this property if they have installed the callback
routines described earlier. All other programming environments must poll this value (using
a timer) to determine the command that has been spoken. </td>
</tr>
<tr>
<td WIDTH="197"><tt><font FACE="Courier">MenuCreate</font></tt> method </td>
<td WIDTH="197"><tt><font FACE="Courier">appName</font></tt> as String, <br>
<tt><font FACE="Courier">state</font></tt> as String,<br>
<tt><font FACE="Courier">langID</font></tt> as Integer,<br>
<tt><font FACE="Courier">dialect</font></tt> as String,<br>
<tt><font FACE="Courier">flags</font></tt> as Long </td>
<td WIDTH="197">Use this method to create a new menu object. Menu objects are used to add
new items to the list of valid commands to be recognized by the SR engine. </td>
</tr>
</table>
</center></div>
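<p>As a rough sketch of how the members in Table 15.6 fit together, the following Visual Basic
form code registers the application, turns recognition on, creates a menu, and then polls <tt><font
FACE="Courier">CommandSpoken</font></tt> from a timer. Only the member names and the <tt><font
FACE="Courier">MenuCreate</font></tt> parameter list come from the table. The <tt><font
FACE="Courier">Speech.VoiceCommand</font></tt> ProgID, the empty string passed to <tt><font
FACE="Courier">Register</font></tt>, the menu arguments, the control names, and the convention
that a value of zero means no command was heard are all assumptions made for illustration, so
check them against the Speech SDK documentation before relying on them. </p>
<blockquote>
<pre>
' Sketch only. Everything marked (assumed) is not taken from Table 15.6.
Const MENU_FLAGS = 0                 ' placeholder flags value (assumed)

Dim Vcmd As Object                   ' the OLE Voice Command object
Dim Vmenu As Object                  ' the OLE Voice Menu object

Private Sub Form_Load()
    ' Create the automation object; the ProgID is (assumed).
    Set Vcmd = CreateObject("Speech.VoiceCommand")

    ' Register must be called before any recognition occurs.
    Vcmd.Register ""                 ' argument (assumed)
    Vcmd.Awake = True                ' turn recognition on

    ' appName, state, langID, dialect, flags; all values here are (assumed).
    Set Vmenu = Vcmd.MenuCreate(App.EXEName, "Main", 1033, "US English", MENU_FLAGS)

    Timer1.Interval = 250            ' Timer1 is a hypothetical timer control
End Sub

Private Sub Timer1_Timer()
    Dim cmdNum As Long

    ' Poll for the last command heard; 0 meaning "none" is (assumed).
    cmdNum = Vcmd.CommandSpoken
    If cmdNum &gt; 0 Then
        StatusMsg.Text = "Command " &amp; cmdNum &amp; " was spoken"
    End If
End Sub
</pre>
</blockquote>
<p>The menu created in this sketch is still empty, so the timer routine has nothing to report
until commands are added through the <tt><font FACE="Courier">Voice Menu</font></tt> object.
Polling is needed only in environments that cannot use the <tt><font FACE="Courier">CallBack</font></tt>
property. </p>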
<h4>Using the Voice Command Callback</h4>
<p>The Voice Command type library provides a unique and very efficient method for
registering callbacks using a Visual Basic 4.0 class module. In order to establish an
automatic notification from the SR engine, all you need to do is add a VB4 class module to
your application. This class module must define two functions:
<ul>
<li><tt><font FACE="Courier">CommandRecognize</font></tt>: Called each time the
SR engine recognizes a command that belongs to your application's list. </li>
<li><tt><font FACE="Courier">CommandOther</font></tt>: Called each time the SR
engine hears a phrase that either belongs to another application's command set or cannot be recognized. </li>
</ul>
<p>Listing 15.1 shows how these two routines look in a class module. </p>
<hr>
<blockquote>
<b><p>Listing 15.1. Creating the notification routines for the <tt><font FACE="Courier">Voice
Command</font></tt> object.<br>
</b></p>
</blockquote>
<blockquote>
<tt><font FACE="Courier"><p>'Sent when a spoken phrase was either recognized as being from
another application's<br>
'command set or was not recognized.<br>
Function CommandOther(pszCommand As String, pszApp As String, pszState As String)<br>
&nbsp;&nbsp;&nbsp;&nbsp;If Len(pszCommand) = 0 Then<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;VcintrForm.StatusMsg.Text = "Command unrecognized" & Chr(13) & Chr(10) & VcintrForm.StatusMsg.Text<br>
&nbsp;&nbsp;&nbsp;&nbsp;Else<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;VcintrForm.StatusMsg.Text = pszCommand & " was recognized from " & pszApp & "'s " & pszState & " menu" & Chr(13) & Chr(10) & VcintrForm.StatusMsg.Text<br>
&nbsp;&nbsp;&nbsp;&nbsp;End If<br>
End Function<br>
<br>
'Sent when a spoken phrase is recognized as being from the application's command set.<br>
Function CommandRecognize(pszCommand As String, dwID As Long)<br>
&nbsp;&nbsp;&nbsp;&nbsp;VcintrForm.StatusMsg.Text = pszCommand & Chr(13) & Chr(10) & VcintrForm.StatusMsg.Text<br>
End Function</font></tt> </p>
</blockquote>
<hr>
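<p>To connect routines like those in Listing 15.1 to the <tt><font FACE="Courier">Voice
Command</font></tt> object, the <tt><font FACE="Courier">CallBack</font></tt> property is set
to a <tt><font FACE="Courier">Project.Class</font></tt> string, as described in Table 15.6.
The fragment below is a minimal sketch of that step. The project name <tt><font
FACE="Courier">VcIntr</font></tt> and the class name <tt><font FACE="Courier">VCmdNotify</font></tt>
are hypothetical, and the <tt><font FACE="Courier">Speech.VoiceCommand</font></tt> ProgID, the
argument to <tt><font FACE="Courier">Register</font></tt>, and the order of the calls are
assumptions rather than details confirmed by this chapter. </p>
<blockquote>
<pre>
' Sketch only. Assumes the two functions from Listing 15.1 live in a class
' module named VCmdNotify inside a project named VcIntr (both hypothetical).
Dim Vcmd As Object

Private Sub Form_Load()
    Set Vcmd = CreateObject("Speech.VoiceCommand")   ' ProgID (assumed)
    Vcmd.Register ""                                 ' argument (assumed)

    ' Project.Class string naming the class module that holds
    ' CommandRecognize and CommandOther.
    Vcmd.CallBack = "VcIntr.VCmdNotify"

    Vcmd.Awake = True                                ' turn recognition on
End Sub
</pre>
</blockquote>
<p>Because the class is located by name, it probably has to be declared as a public, creatable
class in the project's settings. With the callback in place, the application does not need to
poll the <tt><font FACE="Courier">CommandSpoken</font></tt> property. </p>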
<div align="center"><center>
<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
<tr>
<td><b>Note</b></td>
</tr>
<tr>
<td><blockquote>
<p>You'll learn more about how to use the <tt><font FACE="Courier">Voice Command</font></tt>
object in <a HREF="ch19.htm">Chapter 19</a>, "Creating SAPI Applications with
C++." </p>
</blockquote>
</td>
</tr>
</table>
</center></div>
<h4>The <tt><font FACE="Courier">Voice Menu</font></tt> Object </h4>
<p>The OLE <tt><font FACE="Courier">Voice Menu</font></tt> object is used to add new
commands to the list of valid items that can be recognized by the SR engine. The <tt><font
FACE="Courier">Voice Menu</font></tt> object has two properties and three methods. Table
15.7 shows the <tt><font FACE="Courier">Voice Menu</font></tt> object's methods and
properties, along with parameters and short descriptions.<br>
</p>