<html>

<head>
<title>Chapter 21 -- Part III Summary - The Speech API</title>
<meta NAME="GENERATOR" CONTENT="Microsoft FrontPage 3.0">
</head>

<body TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#CE2910">

<h1><font COLOR="#FF0000">Chapter 21</font></h1>

<h1><b><font SIZE="5" COLOR="#FF0000">Part III Summary - The Speech API</font></b> </h1>

<hr WIDTH="100%">

<h3 ALIGN="CENTER"><font SIZE="+2" COLOR="#000000">CONTENTS<a NAME="CONTENTS"></a> </font></h3>

<ul>
  <li><a HREF="#Chapter14WhatIsSAPI">Chapter 14, &quot;What Is SAPI?&quot;</a> </li>
  <li><a HREF="#Chapter15SAPIArchitecture">Chapter 15, &quot;SAPI Architecture&quot;</a> </li>
  <li><a HREF="#Chapter16SAPIBasics">Chapter 16, &quot;SAPI Basics&quot;</a> </li>
  <li><a HREF="#Chapter17SAPIToolsUsingSAPIObjec">Chapter 17, &quot;SAPI Tools-Using SAPI 
    Objects with Visual Basic 4.0&quot; </a></li>
  <li><a HREF="#Chapter18SAPIBehindtheScenes">Chapter 18, &quot;SAPI Behind the Scenes&quot;</a> 
  </li>
  <li><a HREF="#Chapter19CreatingSAPIApplications">Chapter 19, &quot;Creating SAPI 
    Applications with C++&quot;</a> </li>
  <li><a HREF="#Chapter20BuildingtheVoiceActivate">Chapter 20, &quot;Building the 
    Voice-Activated Text Reader&quot;</a> </li>
  <li><a HREF="#TheFutureofSAPI">The Future of SAPI</a> </li>
</ul>

<hr>

<h2><a NAME="Chapter14WhatIsSAPI"></a><a HREF="ch14.htm">Chapter 14</a><font SIZE="5"
COLOR="#FF0000">, &quot;What Is SAPI?&quot;</font></h2>

<p><a HREF="ch14.htm">Chapter 14</a> covered the key factors in creating and implementing 
a complete speech system for PCs. You learned the three major parts of speech systems: 

<ul>
  <li><i>Speech recognition</i> converts audio input into printed text or directly into 
    computer commands. </li>
  <li><i>Text-to-speech</i> converts printed text into audible speech. </li>
  <li><i>Grammar rules</i> are used by speech recognition systems to analyze audio input and 
    convert it into commands or text. </li>
</ul>

<h2><a NAME="Chapter15SAPIArchitecture"></a><a HREF="ch15.htm">Chapter 15</a><font
SIZE="5" COLOR="#FF0000">, &quot;SAPI Architecture&quot;</font></h2>

<p>In <a HREF="ch15.htm">Chapter 15</a> you learned the details of the SR and TTS 
interfaces defined by the Microsoft SAPI model. You also learned that the SAPI model is 
based on the Component Object Model (COM) interface and that Microsoft has defined two 
distinct levels of SAPI services: 

<ul>
  <li><i>High-level SAPI</i> provides a &quot;command-and-control&quot; level of service. This 
    is good for detecting menu and system-level commands and for speaking simple text. </li>
  <li><i>Low-level SAPI</i> provides a much more flexible interface and allows programmers 
    access to extended SR and TTS services. </li>
</ul>

<p>You learned that the two levels of SAPI service each contain several COM interfaces 
that allow C programmers access to speech services. These interfaces include the ability 
to set and get engine attributes, turn the services on or off, display dialog boxes for 
user interaction, and perform direct TTS and SR functions. </p>

<p>Since the SAPI model is based on the COM interface, high-level languages such as Visual 
Basic cannot call functions directly using the standard API calls. Instead, Microsoft has 
developed OLE automation type libraries for use with Visual Basic and other VBA-compliant 
systems. The two type libraries are: 

<ul>
  <li><i>Voice Command Objects</i>: This library provides access to speech recognition 
    services. </li>
  <li><i>Voice Text Objects</i>: This library provides access to text-to-speech services. </li>
</ul>

<h2><a NAME="Chapter16SAPIBasics"></a><a HREF="ch16.htm">Chapter 16</a><font SIZE="5"
COLOR="#FF0000">, &quot;SAPI Basics&quot;</font></h2>

<p><a HREF="ch16.htm">Chapter 16</a> focused on the hardware and software requirements of 
SAPI systems, the general technology and limits of SAPI services, and some design tips for 
creating successful SAPI implementations. </p>

<p>The Microsoft Speech SDK only works on 32-bit operating systems. This means you need 
Windows 95 or Windows NT Version 3.5 or greater in order to run SAPI applications. </p>

<p>The minimum, recommended, and preferred processor and RAM requirements for SAPI 
applications vary depending on the level of services your application provides. The 
minimum SAPI-enabled system may need as little as 1MB of additional RAM and be able to run 
on a 486/33 processor. However, it is a good idea to require at least a Pentium 60 
processor and an additional 8MB RAM. This will give your applications the additional 
computational power needed for the most typical SAPI implementations. </p>

<p>SAPI systems can use just about any of the current sound cards on the market today. Any 
card that is compatible with the Windows Sound System or with Sound Blaster systems will 
work fine. You should use a close-talk, unidirectional microphone, and you can use either 
external speakers or headphones for monitoring audio output. </p>

<p>You learned that SR technology uses three basic processes for interpreting audio input: 

<ul>
  <li><font COLOR="#000000">Word selection</font> </li>
  <li><font COLOR="#000000">Word analysis</font> </li>
  <li><font COLOR="#000000">Speaker dependence </font></li>
</ul>

<p>You also learned that SR systems have their limits. SR engines cannot automatically 
distinguish between multiple speakers, learn new words, guess at spelling, or 
handle wide variations in word pronunciation (for example, &quot;toe-may-toe&quot; 
versus &quot;toe-mah-toe&quot;). </p>

<p>TTS engine technology is based on two different types of implementations. <i>Synthesis 
systems</i> create audio output by generating audio tones algorithmically. This results 
in unmistakably computer-like speech. <i>Diphone concatenation</i> is an alternate method 
for generating speech. <i>Diphones</i> are sets of phoneme pairs collected from actual 
human speech samples. The TTS engine is able to convert text into phoneme pairs and match 
them to diphones in the TTS engine database. TTS engines are not able to mimic human 
speech patterns and rhythms (called <i>prosody</i>), and are not very good at 
communicating emotions. Also, most TTS engines experience difficulty with unusual words. 
This can result in odd-sounding phrases. </p>
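<p>The diphone lookup described above can be sketched in a few lines of Python. This is 
only a toy illustration: the phoneme symbols, the silence marker, the sample file names, 
and the function names are all invented for the example and do not come from any TTS 
engine or from the Speech SDK. </p>

```python
# Toy illustration of diphone concatenation. The phoneme symbols and the
# "recordings" are invented placeholders, not data from a real TTS engine.

def to_diphones(phonemes):
    """Pair each phoneme with its successor, padding both ends with silence ('_')."""
    padded = ["_"] + list(phonemes) + ["_"]
    return [(padded[i], padded[i + 1]) for i in range(len(padded) - 1)]


def synthesize(phonemes, diphone_db):
    """Collect the stored unit for each diphone; report pairs the database lacks."""
    units, missing = [], []
    for pair in to_diphones(phonemes):
        if pair in diphone_db:
            units.append(diphone_db[pair])
        else:
            missing.append(pair)
    return units, missing


# Hypothetical database: each diphone maps to the name of a recorded sample.
DIPHONE_DB = {
    ("_", "h"): "sil-h.wav",
    ("h", "ay"): "h-ay.wav",
    ("ay", "_"): "ay-sil.wav",
}

units, missing = synthesize(["h", "ay"], DIPHONE_DB)
print(units)    # the samples that would be played back to back
print(missing)  # pairs with no recording, one reason unusual words sound odd
```

<p>A word whose phoneme pairs are not all in the database leaves entries in the 
<tt><font FACE="Courier">missing</font></tt> list, which is one way to picture the trouble 
TTS engines have with unusual words. </p>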

<p>Finally, you learned some tips for designing and implementing speech services, 
including: 

<ul>
  <li><font COLOR="#000000">Make SR and TTS services optional whenever possible.</font> </li>
  <li><font COLOR="#000000">Design voice command menus to provide easy access to all major 
    operations.</font> </li>
  <li><font COLOR="#000000">Avoid similar-sounding words and inconsistent word order, and keep 
    command lists short.</font> </li>
  <li><font COLOR="#000000">Limit TTS use to short playback; use WAV recordings for long 
    playback sessions.</font> </li>
  <li><font COLOR="#000000">Don't mix TTS and WAV playback in the same session.</font> </li>
</ul>

<h2><a NAME="Chapter17SAPIToolsUsingSAPIObjec"></a><a HREF="ch17.htm">Chapter 17</a><font
SIZE="5" COLOR="#FF0000">, &quot;SAPI Tools-Using SAPI Objects with Visual Basic 4.0&quot; 
</font></h2>

<p>In <a HREF="ch17.htm">Chapter 17</a> you learned that the Microsoft Speech SDK contains 
a set of OLE library files for implementing SAPI services using Visual Basic and other 
VBA-compatible languages. There is an OLE Automation Library for TTS services (<tt><font
FACE="Courier">VTXTAUTO.TLB</font></tt>), and one for SR services (<tt><font
FACE="Courier">VMCDAUTO.TLB</font></tt>). <a HREF="ch17.htm">Chapter 17</a> showed you how 
to use the objects, methods, and properties in the OLE library to add SR and TTS services 
to your Windows applications. </p>

<p>You learned how to register and enable TTS services using the Voice Text object. You 
also learned how to adjust the speed of TTS output and how to control it with the playback, 
rewind, fast forward, and pause methods. Finally, you learned how to use the special <tt><font
FACE="Courier">Callback</font></tt> property to register a notification sink using a Visual 
Basic Class module. </p>

<p>You also learned how to register and enable SR services using the Voice Command and 
Voice Menu objects. You learned how to build temporary and permanent menu commands and how 
to link them to program operations. You also learned how to build commands that accept a 
list of possible choices and how to use that list in a program. Finally, you learned how 
to use the <tt><font FACE="Courier">Callback</font></tt> property to register a 
notification sink using the Visual Basic Class module. </p>
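<p>The idea of linking menu commands and list commands to program operations can be 
sketched as follows. This is a language-neutral illustration written in Python, not the 
Voice Command or Voice Menu API: the <tt><font FACE="Courier">VoiceMenu</font></tt> class 
and its methods are hypothetical, and recognition is simulated by passing in plain 
strings. </p>

```python
# Sketch of a command-and-control menu. Recognition is simulated with plain
# strings; none of these names belong to the Speech SDK.

class VoiceMenu:
    def __init__(self):
        self.commands = {}  # fixed phrase mapped to a handler taking no arguments
        self.lists = {}     # phrase prefix mapped to (allowed choices, handler)

    def add_command(self, phrase, handler):
        self.commands[phrase.lower()] = handler

    def add_list_command(self, prefix, choices, handler):
        allowed = {c.lower() for c in choices}
        self.lists[prefix.lower()] = (allowed, handler)

    def dispatch(self, heard):
        """Route a recognized phrase to its handler; return None if unknown."""
        text = " ".join(heard.lower().split())
        if text in self.commands:
            return self.commands[text]()
        for prefix, (allowed, handler) in self.lists.items():
            if text.startswith(prefix + " "):
                choice = text[len(prefix) + 1:]
                if choice in allowed:
                    return handler(choice)
        return None


menu = VoiceMenu()
menu.add_command("close file", lambda: "closing")
menu.add_list_command("open", ["readme", "notes"], lambda name: "opening " + name)

print(menu.dispatch("Close File"))   # fixed command
print(menu.dispatch("open notes"))   # list command with an allowed choice
print(menu.dispatch("open budget"))  # choice not in the list, so no handler runs
```

<p>Keeping the list of allowed choices explicit mirrors the design tip from Chapter 16: 
short, distinct command lists give the recognizer fewer ways to go wrong. </p>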

<h2><a NAME="Chapter18SAPIBehindtheScenes"></a><a HREF="ch18.htm">Chapter 18</a><font
SIZE="5" COLOR="#FF0000">, &quot;SAPI Behind the Scenes&quot;</font></h2>

<p>In <a HREF="ch18.htm">Chapter 18</a> you learned how the speech system uses grammar 
rules, control tags, and the International Phonetic Alphabet to perform its key 
operations. </p>

<p>You built simple grammars and tested them using the tools that ship with the Speech 
SDK. You also learned how to load and enable those grammars for use in your SAPI 
applications. </p>

<p>You added control tag information to your TTS input to improve the prosody and overall 
performance of TTS interfaces. You used Speech SDK tools to create and play back text with 
control tags, and you learned how to edit the stored lexicon to maintain improved TTS 
performance over time. </p>

<p>Finally, you learned how the International Phonetic Alphabet is used to store and 
reproduce common speech patterns. The IPA can be used by SR and TTS engines as a source 
for analysis and playback. </p>

<h2><a NAME="Chapter19CreatingSAPIApplications"></a><a HREF="ch19.htm">Chapter 19</a><font
SIZE="5" COLOR="#FF0000">, &quot;Creating SAPI Applications with C++&quot;</font></h2>

<p>In <a HREF="ch19.htm">Chapter 19</a> you learned how to write simple TTS and SR 
applications using C++. Since many of the SAPI features are available only through C++ 
coding, this chapter gave you a quick review of how to use C++ to implement SAPI services. 
</p>

<p>You built a simple TTS program that you can use to cut and paste any text for playback. 
You also built and tested a simple SR interface to illustrate the techniques required to 
add SR services to existing applications. </p>

<h2><a NAME="Chapter20BuildingtheVoiceActivate"></a><a HREF="ch20.htm">Chapter 20</a><font
SIZE="5" COLOR="#FF0000">, &quot;Building the Voice-Activated Text Reader&quot;</font></h2>

<p>In <a HREF="ch20.htm">Chapter 20</a> you used all the information gathered from 
previous chapters to build a complete application that implements both TTS and SR 
services. The Voice-Activated Text Reader allows users to select text files to load, loads 
them into the editor page, and then reads them back to the user on command. All major 
operations can be performed using speech commands. </p>

<p>You also learned how to add SR services to other existing applications using a set of 
library modules that you can add to any Visual Basic project. </p>

<h2><a NAME="TheFutureofSAPI"><font SIZE="5" COLOR="#FF0000">The Future of SAPI</font></a></h2>

<p>The future of SAPI is wide open. This section of the book gave you only a first glimpse 
of the possibilities ahead. At present, SAPI systems are most successful as 
command-and-control interfaces. Such interfaces allow users to use voice commands to start 
and stop basic operations that usually require keyboard or mouse intervention. Current 
technology offers limited voice playback services. Users can get quick replies or short 
readings of text without much trouble. However, long stretches of text playback are still 
difficult to understand. </p>

<p>With the generalized interfaces that Microsoft has defined in the SAPI model, 
it will not be long before new TTS and SR engines appear on the market, 
ready to take advantage of the large installed base of Windows operating systems. 
With each new release of Windows, and new versions of the SAPI interface, speech services 
are bound to become more powerful and more user-friendly. </p>

<p>Although we have not yet arrived at the level of voice interaction depicted in <i>Star 
Trek</i> and other futuristic tales, the release of SAPI for Windows puts us more than one 
step closer to that reality! </p>

<hr WIDTH="100%">

<p align="center"><a HREF="ch20.htm"><img SRC="pc.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a><a
HREF="#CONTENTS"><img SRC="cc.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a><a
HREF="index.htm"><img SRC="hb.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a> <a
HREF="ch22.htm"><img SRC="nc.gif" BORDER="0" HEIGHT="88" WIDTH="140"></a></p>

<hr WIDTH="100%">
</body>
</html>