<html>
<head>
<title>Chapter 21 -- Part III Summary - The Speech API</title>
<meta NAME="GENERATOR" CONTENT="Microsoft FrontPage 3.0">
</head>
<body TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#CE2910">
<h1><font COLOR="#FF0000">Chapter 21</font></h1>
<h1><b><font SIZE="5" COLOR="#FF0000">Part III Summary - The Speech API</font></b> </h1>
<hr WIDTH="100%">
<h3 ALIGN="CENTER"><font SIZE="+2" COLOR="#000000">CONTENTS<a NAME="CONTENTS"></a> </font></h3>
<ul>
<li><a HREF="#Chapter14WhatIsSAPI">Chapter 14, "What Is SAPI?"</a> </li>
<li><a HREF="#Chapter15SAPIArchitecture">Chapter 15, "SAPI Architecture"</a> </li>
<li><a HREF="#Chapter16SAPIBasics">Chapter 16, "SAPI Basics"</a> </li>
<li><a HREF="#Chapter17SAPIToolsUsingSAPIObjec">Chapter 17, "SAPI Tools-Using SAPI
Objects with Visual Basic 4.0" </a></li>
<li><a HREF="#Chapter18SAPIBehindtheScenes">Chapter 18, "SAPI Behind the Scenes"</a>
</li>
<li><a HREF="#Chapter19CreatingSAPIApplications">Chapter 19, "Creating SAPI
Applications with C++"</a> </li>
<li><a HREF="#Chapter20BuildingtheVoiceActivate">Chapter 20, "Building the
Voice-Activated Text Reader"</a> </li>
<li><a HREF="#TheFutureofSAPI">The Future of SAPI</a> </li>
</ul>
<hr>
<h2><a NAME="Chapter14WhatIsSAPI"></a><a HREF="ch14.htm">Chapter 14</a><font SIZE="5"
COLOR="#FF0000">, "What Is SAPI?"</font></h2>
<p><a HREF="ch14.htm">Chapter 14</a> covered the key factors in creating and implementing
a complete speech system for PCs. You learned the three major parts of speech systems:
<ul>
<li><i>Speech recognition</i> converts audio input into printed text or directly into
computer commands. </li>
<li><i>Text-to-speech</i> converts printed text into audible speech. </li>
<li><i>Grammar rules</i> are used by speech recognition systems to analyze audio input and
convert it into commands or text. </li>
</ul>
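<p>The role of grammar rules can be pictured with a small sketch. The Python below is
illustrative only: it is not the SAPI grammar format, and the rule table, the
<tt><font FACE="Courier">{color}</font></tt> list, and the command names are hypothetical.
It assumes the engine has already turned the audio into words and shows how a rule set
narrows those words down to a command. </p>

```python
# Illustrative only: a toy command grammar, not the SAPI grammar format.
# A token in curly braces names a list of allowed words (hypothetical names).

LISTS = {
    "{color}": {"red", "green", "blue"},
}

RULES = {
    "open file": "FILE_OPEN",
    "close file": "FILE_CLOSE",
    "set color {color}": "SET_COLOR",
}


def match(phrase):
    """Return (command, captured list words) for a phrase, or None if no rule fits."""
    words = phrase.lower().split()
    for pattern, command in RULES.items():
        tokens = pattern.split()
        if len(tokens) != len(words):
            continue
        captured = []
        for token, word in zip(tokens, words):
            if token in LISTS:
                # List slot: the spoken word must be one of the allowed choices.
                if word not in LISTS[token]:
                    break
                captured.append(word)
            elif token != word:
                break
        else:
            # Every token matched, so this rule accepts the phrase.
            return command, captured
    return None
```

<p>A phrase that fits no rule is rejected outright, which is one reason short,
distinct command lists tend to be recognized more reliably. </p>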
<h2><a NAME="Chapter15SAPIArchitecture"></a><a HREF="ch15.htm">Chapter 15</a><font
SIZE="5" COLOR="#FF0000">, "SAPI Architecture"</font></h2>
<p>In <a HREF="ch15.htm">Chapter 15</a> you learned the details of the SR and TTS
interfaces defined by the Microsoft SAPI model. You also learned that the SAPI model is
based on the Component Object Model (COM) interface and that Microsoft has defined two
distinct levels of SAPI services:
<ul>
<li><i>High-level SAPI</i> provides a "command-and-control" level of service. This
is good for detecting menu and system-level commands and for speaking simple text. </li>
<li><i>Low-level SAPI</i> provides a much more flexible interface and allows programmers
access to extended SR and TTS services. </li>
</ul>
<p>You learned that the two levels of SAPI service each contain several COM interfaces
that allow C programmers access to speech services. These interfaces include the ability
to set and get engine attributes, turn the services on or off, display dialog boxes for
user interaction, and perform direct TTS and SR functions. </p>
<p>Since the SAPI model is based on the COM interface, high-level languages such as Visual
Basic cannot call functions directly using the standard API calls. Instead, Microsoft has
developed OLE automation type libraries for use with Visual Basic and other VBA-compliant
systems. The two type libraries are:
<ul>
<li><i>Voice Command Objects</i>: This library provides access to speech recognition
services. </li>
<li><i>Voice Text Objects</i>: This library provides access to text-to-speech services. </li>
</ul>
<h2><a NAME="Chapter16SAPIBasics"></a><a HREF="ch16.htm">Chapter 16</a><font SIZE="5"
COLOR="#FF0000">, "SAPI Basics"</font></h2>
<p><a HREF="ch16.htm">Chapter 16</a> focused on the hardware and software requirements of
SAPI systems, the general technology and limits of SAPI services, and some design tips for
creating successful SAPI implementations. </p>
<p>The Microsoft Speech SDK only works on 32-bit operating systems. This means you need
Windows 95 or Windows NT Version 3.5 or greater in order to run SAPI applications. </p>
<p>The minimum, recommended, and preferred processor and RAM requirements for SAPI
applications vary depending on the level of services your application provides. The
minimum SAPI-enabled system may need as little as 1MB of additional RAM and be able to run
on a 486/33 processor. However, it is a good idea to require at least a Pentium 60
processor and an additional 8MB RAM. This will give your applications the additional
computational power needed for the most typical SAPI implementations. </p>
<p>SAPI systems can use just about any of the current sound cards on the market today. Any
card that is compatible with the Windows Sound System or with Sound Blaster systems will
work fine. You should use a close-talk, unidirectional microphone, and you can use either
external speakers or headphones for monitoring audio output. </p>
<p>You learned that SR technology uses three basic processes for interpreting audio input:
<ul>
<li><font COLOR="#000000">Word selection</font> </li>
<li><font COLOR="#000000">Word analysis</font> </li>
<li><font COLOR="#000000">Speaker dependence </font></li>
</ul>
<p>You also learned that SR systems have their limits. SR engines cannot automatically
distinguish between multiple speakers, learn new words, guess at spelling, or
handle wide variations in word pronunciation (for example, "toe-may-toe"
versus "toe-mah-toe"). </p>
<p>TTS engine technology is based on two different types of implementations. <i>Synthesis
systems</i> create audio output by generating audio tones using algorithms. This results
in unmistakably computer-like speech. <i>Diphone concatenation</i> is an alternate method
for generating speech. <i>Diphones</i> are sets of phoneme pairs collected from actual
human speech samples. The TTS engine is able to convert text into phoneme pairs and match
them to diphones in the TTS engine database. TTS engines are not able to mimic human
speech patterns and rhythms (called <i>prosody</i>), and are not very good at
communicating emotions. Also, most TTS engines experience difficulty with unusual words.
This can result in odd-sounding phrases. </p>
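<p>The diphone lookup described above can be sketched in a few lines. This is a
simplified illustration in Python, not how any particular TTS engine is implemented: the
phoneme symbols, the silence marker <tt><font FACE="Courier">"_"</font></tt>, and the
dictionary standing in for the diphone database are all assumptions made for the example. </p>

```python
# Illustrative only: phoneme symbols and the "database" are made up.

def to_diphones(phonemes):
    """Pair each phoneme with its successor, padding both ends with silence ("_")."""
    padded = ["_"] + list(phonemes) + ["_"]
    return [(a, b) for a, b in zip(padded, padded[1:])]


def concatenate(phonemes, database):
    """Look up the stored sample for each diphone and return them in playback order.

    Raises KeyError when a pair is missing from the database, which is the
    toy equivalent of an engine stumbling over an unusual word.
    """
    return [database[pair] for pair in to_diphones(phonemes)]
```

<p>For a three-phoneme word the sketch produces four pairs, because the transitions
into and out of silence are counted as well. </p>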
<p>Finally, you learned some tips for designing and implementing speech services,
including:
<ul>
<li><font COLOR="#000000">Make SR and TTS services optional whenever possible.</font> </li>
<li><font COLOR="#000000">Design voice command menus to provide easy access to all major
operations.</font> </li>
<li><font COLOR="#000000">Avoid similar-sounding words and inconsistent word order, and keep
command lists short.</font> </li>
<li><font COLOR="#000000">Limit TTS use to short playback; use WAV recordings for long
playback sessions.</font> </li>
<li><font COLOR="#000000">Don't mix TTS and WAV playback in the same session.</font> </li>
</ul>
<h2><a NAME="Chapter17SAPIToolsUsingSAPIObjec"></a><a HREF="ch17.htm">Chapter 17</a><font
SIZE="5" COLOR="#FF0000">, "SAPI Tools-Using SAPI Objects with Visual Basic 4.0"
</font></h2>
<p>In <a HREF="ch17.htm">Chapter 17</a> you learned that the Microsoft Speech SDK contains
a set of OLE library files for implementing SAPI services using Visual Basic and other
VBA-compatible languages. There is an OLE Automation Library for TTS services (<tt><font
FACE="Courier">VTXTAUTO.TLB</font></tt>), and one for SR services (<tt><font
FACE="Courier">VCMDAUTO.TLB</font></tt>). <a HREF="ch17.htm">Chapter 17</a> showed you how
to use the objects, methods, and properties in the OLE library to add SR and TTS services
to your Windows applications. </p>
<p>You learned how to register and enable TTS services using the Voice Text object. You
also learned how to adjust the speed and how to control the playback, rewind, fast
forward, and pause methods of TTS output. Finally, you learned how to use a special <tt><font
FACE="Courier">Callback</font></tt> property to register a notification sink using a Visual
Basic Class module. </p>
<p>You also learned how to register and enable SR services using the Voice Command and
Voice Menu objects. You learned how to build temporary and permanent menu commands and how
to link them to program operations. You also learned how to build commands that accept a
list of possible choices and how to use that list in a program. Finally, you learned how
to use the <tt><font FACE="Courier">Callback</font></tt> property to register a
notification sink using the Visual Basic Class module. </p>
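<p>The notification sink idea is a callback pattern: you hand the engine an object, and
the engine calls a method on that object when something happens. The Python sketch below
shows only that pattern. <tt><font FACE="Courier">ToyEngine</font></tt>,
<tt><font FACE="Courier">LoggingSink</font></tt>, and the method names are hypothetical
stand-ins, not the Voice Command or Voice Text objects from the OLE libraries. </p>

```python
# Illustrative only: a stand-in engine and sink, not SAPI objects.

class ToyEngine:
    """Pretend speech engine that notifies a registered sink."""

    def __init__(self):
        self._sink = None

    def register(self, sink):
        # Remember the object that wants to hear about recognized commands.
        self._sink = sink

    def simulate_recognition(self, command_id):
        # A real engine would call the sink when it hears a command;
        # here the call is triggered by hand.
        if self._sink is not None:
            self._sink.command_recognize(command_id)


class LoggingSink:
    """Plays the role of the Visual Basic class module that receives notifications."""

    def __init__(self):
        self.seen = []

    def command_recognize(self, command_id):
        self.seen.append(command_id)
```

<p>Keeping the sink in its own class means the application code never polls the engine;
it simply reacts when the engine calls back. </p>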
<h2><a NAME="Chapter18SAPIBehindtheScenes"></a><a HREF="ch18.htm">Chapter 18</a><font
SIZE="5" COLOR="#FF0000">, "SAPI Behind the Scenes"</font></h2>
<p>In <a HREF="ch18.htm">Chapter 18</a> you learned how the speech system uses grammar
rules, control tags, and the International Phonetic Alphabet to perform its key
operations. </p>
<p>You built simple grammars and tested them using the tools that ship with the Speech
SDK. You also learned how to load and enable those grammars for use in your SAPI
applications. </p>
<p>You added control tag information to your TTS input to improve the prosody and overall
performance of TTS interfaces. You used Speech SDK tools to create and play back text with
control tags, and you learned how to edit the stored lexicon to maintain improved TTS
performance over time. </p>
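<p>To picture how control tags travel inside the text sent to a TTS engine, here is a
minimal Python sketch that separates tags from the words to be spoken. It assumes, for
illustration, that a tag is a name with an optional value enclosed in backslashes; the
tag names in the usage note are examples only, so check Chapter 18 and the Speech SDK
documentation for the actual syntax and tag set. </p>

```python
import re

# Assumed tag shape for this sketch: \Name\ or \Name=value\
TAG = re.compile(r"\\([A-Za-z]+)(?:=([^\\]*))?\\")


def split_tags(marked_up):
    """Split marked-up text into ("text", chunk) and ("tag", name, value) items."""
    items = []
    pos = 0
    for m in TAG.finditer(marked_up):
        chunk = marked_up[pos:m.start()]
        if chunk:
            items.append(("text", chunk))
        # value is None when the tag carries no "=value" part.
        items.append(("tag", m.group(1), m.group(2)))
        pos = m.end()
    rest = marked_up[pos:]
    if rest:
        items.append(("text", rest))
    return items
```

<p>For example, passing the raw string
<tt><font FACE="Courier">r"Hello \Pau=500\ world"</font></tt> yields a text item, a tag
item named <tt><font FACE="Courier">Pau</font></tt> with the value
<tt><font FACE="Courier">"500"</font></tt>, and a second text item. </p>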
<p>Finally, you learned how the International Phonetic Alphabet is used to store and
reproduce common speech patterns. The IPA can be used by SR and TTS engines as a source
for analysis and playback. </p>
<h2><a NAME="Chapter19CreatingSAPIApplications"></a><a HREF="ch19.htm">Chapter 19</a><font
SIZE="5" COLOR="#FF0000">, "Creating SAPI Applications with C++"</font></h2>
<p>In <a HREF="ch19.htm">Chapter 19</a> you learned how to write simple TTS and SR
applications using C++. Since many of the SAPI features are available only through C++
coding, this chapter gave you a quick review of how to use C++ to implement SAPI services.
</p>
<p>You built a simple TTS program that you can use to cut and paste any text for playback.
You also built and tested a simple SR interface to illustrate the techniques required to
add SR services to existing applications. </p>
<h2><a NAME="Chapter20BuildingtheVoiceActivate"></a><a HREF="ch20.htm">Chapter 20</a><font
SIZE="5" COLOR="#FF0000">, "Building the Voice-Activated Text Reader"</font></h2>
<p>In <a HREF="ch20.htm">Chapter 20</a> you used all the information gathered from
previous chapters to build a complete application that implements both TTS and SR
services. The Voice-Activated Text Reader allows users to select text files to load, loads
them into the editor page, and then reads them back to the user on command. All major
operations can be performed using speech commands. </p>
<p>You also learned how to add SR services to other existing applications using a set of
library modules that you can add to any Visual Basic project. </p>
<h2><a NAME="TheFutureofSAPI"><font SIZE="5" COLOR="#FF0000">The Future of SAPI</font></a></h2>
<p>The future of SAPI is wide open. This section of the book gave you only a first glimpse
of the possibilities ahead. At present, SAPI systems are most successful as
command-and-control interfaces. Such interfaces allow users to use voice commands to start
and stop basic operations that usually require keyboard or mouse intervention. Current
technology offers limited voice playback services. Users can get quick replies or short
readings of text without much trouble. However, long stretches of text playback are still
difficult to understand. </p>
<p>With the creation of the generalized interfaces defined by Microsoft in the SAPI model,
it will not be long before new versions of TTS and SR engines appear on the market,
ready to take advantage of the large base of Windows operating systems already installed.
With each new release of Windows, and new versions of the SAPI interface, speech services
are bound to become more powerful and more user-friendly. </p>
<p>Although we have not yet arrived at the level of voice interaction depicted in <i>Star
Trek</i> and other futuristic tales, the release of SAPI for Windows puts us more than one
step closer to that reality! </p>
<hr WIDTH="100%">
<p align="center"><a HREF="ch20.htm"><img SRC="pc.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a><a
HREF="#CONTENTS"><img SRC="cc.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a><a
HREF="index.htm"><img SRC="hb.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a> <a
HREF="ch22.htm"><img SRC="nc.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a></p>
<hr WIDTH="100%">
</body>
</html>