<html>

<head>
<title>Chapter 16 -- SAPI Basics</title>
<meta NAME="GENERATOR" CONTENT="Microsoft FrontPage 3.0">
</head>

<body TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#CE2910">

<h1><font COLOR="#FF0000">Chapter 16</font></h1>

<h1><b><font SIZE="5" COLOR="#FF0000">SAPI Basics</font></b> </h1>

<hr WIDTH="100%">

<h3 ALIGN="CENTER"><font SIZE="+2" COLOR="#000000">CONTENTS<a NAME="CONTENTS"></a> </font></h3>

<ul>
  <li><a HREF="#SAPIHardware">SAPI Hardware</a> <ul>
      <li><a HREF="#GeneralHardwareRequirements">General Hardware Requirements</a> </li>
      <li><a HREF="#SoftwareRequirementsOperatingSystems">Software Requirements: Operating Systems 
        and Speech Engines</a> </li>
      <li><a HREF="#SpecialHardwareRequirementsSoundCard">Special Hardware Requirements: Sound 
        Cards, Microphones, and Speakers</a> </li>
    </ul>
  </li>
  <li><a HREF="#TechnologyIssues">Technology Issues</a> <ul>
      <li><a HREF="#SRTechniques">SR Techniques</a> </li>
      <li><a HREF="#SRLimits">SR Limits</a> </li>
      <li><a HREF="#TTSTechniques">TTS Techniques</a> </li>
      <li><a HREF="#TTSLimits">TTS Limits</a> </li>
    </ul>
  </li>
  <li><a HREF="#GeneralSRDesignIssues">General SR Design Issues</a> </li>
  <li><a HREF="#VoiceCommandMenuDesign">Voice Command Menu Design</a> </li>
  <li><a HREF="#TTSDesignIssues">TTS Design Issues</a> </li>
  <li><a HREF="#Summary">Summary</a> </li>
</ul>

<hr>

<p><font COLOR="#000000">This chapter covers a handful of </font>issues that must be 
addressed when designing and installing SR/TTS applications, including hardware 
requirements and the current state and limits of SR/TTS technology. The chapter also 
includes some tips for designing your SR/TTS applications. </p>

<p>SR/TTS applications can be resource hogs. The section on hardware shows you the 
minimal, recommended, and preferred processor and RAM requirements for the most common 
SR/TTS applications. Of course, speech applications also need special hardware, including 
audio cards, microphones, and speakers. In this chapter, you'll find a general list of 
compatible devices, along with tips on what other options you have and how to use them. </p>

<p>You'll also learn about the general state of SR/TTS technology and its limits. This 
will help you design applications that do not place unrealistic demands on the software or 
raise users' expectations beyond the capabilities of your application. </p>

<p>Finally, this chapter contains a set of tips and suggestions for designing and 
implementing SR/TTS services. You'll learn how to design SR and TTS interfaces that reduce 
the chance of engine errors, and increase the usability of your programs. </p>

<p>When you complete this chapter, you'll know just what hardware is needed for speech 
systems and how to design programs that implement SR/TTS services successfully. </p>

<h2><a NAME="SAPIHardware"><font SIZE="5" COLOR="#FF0000">SAPI Hardware</font></a> </h2>

<p>Speech systems can be resource intensive. It is especially important that SR engines 
have enough RAM and disk space to respond quickly to user requests. When the engine fails 
to respond quickly, users repeat their commands, which adds to the load and slows the 
system further. The result is a spiraling degradation in performance. It doesn't take too 
much of this before users decide your software is more trouble than it's worth! </p>

<p>Text-to-speech engines can also tax the system. While TTS engines do not always require 
a great deal of memory to operate, insufficient processor speed can result in halting or 
unintelligible playback of text. </p>

<p>For these reasons, it is important to establish clear hardware and software 
requirements when designing and implementing your speech-aware and speech-enabled 
applications. Not all PCs will have the memory, disk space, and hardware needed to 
properly implement SR and TTS services. There are three general categories of workstation 
resources that should be reviewed: </p>

<ul>
  <li><i>General hardware</i>, including processor speed and RAM </li>
  <li><i>Software</i>, including operating system and SR/TTS engines </li>
  <li><i>Special hardware</i>, including sound cards, microphones, speakers, and headphones </li>
</ul>

<p>The following three sections provide some general guidelines to follow when 
establishing minimal resource requirements for your applications. </p>

<h3><a NAME="GeneralHardwareRequirements">General Hardware Requirements</a> </h3>

<p>Speech systems can tax processor and RAM resources. SR services require varying levels 
of resources depending on the type of SR engine installed and the level of services 
implemented. TTS engine requirements are rather stable, but also depend on the TTS engine 
installed. </p>

<p>The SR and TTS engines currently available for SAPI systems can usually be run on as 
little as a 486/33 processor and an additional 1MB of RAM. However, overall PC performance 
with this configuration is poor, and it is not recommended. A better baseline is a Pentium 
processor (P60 or better) with at least 16MB of total RAM. Systems that will support 
dictation SR services require the most computational power. It is not unreasonable to 
expect such a workstation to need 32MB of RAM and a P100 or faster processor. Obviously, 
the more resources, the better the performance. </p>
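<p>The tiers in the paragraph above can be sketched as a simple check. This is only an 
illustration: the function name, the tier labels, and the processor ordering list are 
assumptions made for the sketch, and the minimal tier applies no total-RAM floor because 
the text states that figure only as &quot;an additional 1MB.&quot; </p>

```python
# Processor classes mentioned in this chapter, slowest first.
# The ordering list and the tier names are assumptions made for this sketch.
CPU_ORDER = ["386/16", "386/33", "486/33", "P60", "P100"]

# (tier name, minimum processor, minimum total RAM in MB), best tier first.
TIERS = [
    ("dictation", "P100", 32),
    ("recommended", "P60", 16),
    # The text gives the minimal RAM only as "an additional 1MB",
    # so no total-RAM floor is applied to this tier.
    ("minimal", "486/33", 0),
]


def classify_workstation(cpu: str, total_ram_mb: int) -> str:
    """Return the best tier a workstation satisfies, or 'unsupported'."""
    cpu_rank = CPU_ORDER.index(cpu)  # raises ValueError for unknown CPUs
    for name, min_cpu, min_ram in TIERS:
        if cpu_rank >= CPU_ORDER.index(min_cpu) and total_ram_mb >= min_ram:
            return name
    return "unsupported"


if __name__ == "__main__":
    for cpu, ram in [("486/33", 8), ("P60", 16), ("P100", 32), ("386/33", 8)]:
        print(cpu, ram, classify_workstation(cpu, ram))
```

<p>A setup program could run a check like this before installing speech services and 
warn the user when the workstation falls into the minimal tier. </p>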

<h4>SR Processor and Memory Requirements</h4>

<p>In general, SR systems that implement command and control services will only need an 
additional 1MB of RAM (not counting the application's RAM requirement). Dictation services 
should get at least another 8MB of RAM, preferably more. The type of speech sampling, the 
type of analysis, and the size of the recognition vocabulary all affect the minimal 
resource requirements. Table 16.1 shows published minimal processor and RAM requirements 
of speech recognition services. </p>

<p align="center"><b>Table 16.1. Published minimal processor and RAM requirements of SR 
services.</b> </p>
<div align="center"><center>

<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
  <tr>
    <td><i>Levels of Speech-Recognition Services</i> </td>
    <td WIDTH="142"><p align="center"><i>Minimal Processor</i></p></td>
    <td WIDTH="189"><p align="center"><i>Minimal Additional RAM</i></p></td>
  </tr>
  <tr>
    <td WIDTH="259">Discrete, speaker-dependent, whole word, small vocabulary </td>
    <td WIDTH="142"><p align="center">386/16</p></td>
    <td WIDTH="189"><p align="center">64KB</p></td>
  </tr>
  <tr>
    <td WIDTH="259">Discrete, speaker-independent, whole word, small vocabulary </td>
    <td WIDTH="142"><p align="center">386/33</p></td>
    <td WIDTH="189"><p align="center">256KB</p></td>
  </tr>
  <tr>
    <td WIDTH="259">Continuous, speaker-independent, sub-word, small vocabulary </td>
    <td WIDTH="142"><p align="center">486/33</p></td>
    <td WIDTH="189"><p align="center">1MB</p></td>
  </tr>
  <tr>
    <td WIDTH="259">Discrete, speaker-dependent, whole word, large vocabulary </td>
    <td WIDTH="142"><p align="center">Pentium</p></td>
    <td WIDTH="189"><p align="center">8MB</p></td>
  </tr>
  <tr>
    <td WIDTH="259">Continuous, speaker-independent, sub-word, large vocabulary </td>
    <td WIDTH="142"><p align="center">RISC processor</p></td>
    <td WIDTH="189"><p align="center">8MB</p></td>
  </tr>
</table>
</center></div>
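<p>For use in a program, Table 16.1 can be held as a small lookup structure. The sketch 
below transcribes the table's five rows; the key layout, the type name, and the function 
name are assumptions chosen for illustration, and combinations the table does not list 
simply return nothing. </p>

```python
from typing import NamedTuple, Optional


class SRRequirement(NamedTuple):
    processor: str
    additional_ram_kb: int


# Keys are (speech mode, speaker dependence, word unit, vocabulary size),
# transcribed from Table 16.1. The key layout is an assumption of this sketch.
TABLE_16_1 = {
    ("discrete", "dependent", "whole word", "small"): SRRequirement("386/16", 64),
    ("discrete", "independent", "whole word", "small"): SRRequirement("386/33", 256),
    ("continuous", "independent", "sub-word", "small"): SRRequirement("486/33", 1024),
    ("discrete", "dependent", "whole word", "large"): SRRequirement("Pentium", 8 * 1024),
    ("continuous", "independent", "sub-word", "large"): SRRequirement("RISC processor", 8 * 1024),
}


def minimal_requirement(mode: str, speaker: str, unit: str,
                        vocabulary: str) -> Optional[SRRequirement]:
    """Look up the published minimum for one service level, or None."""
    return TABLE_16_1.get((mode, speaker, unit, vocabulary))


if __name__ == "__main__":
    print(minimal_requirement("discrete", "dependent", "whole word", "large"))
```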

<p>These memory requirements are in addition to the requirements of the operating system 
and any loaded applications. For Windows 95, plan on a minimum of 12MB of RAM, with 16MB 
recommended and 24MB preferred. For Windows NT, the minimum is 16MB, with 24MB recommended 
and 32MB preferred. </p>
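<p>Because the speech figures are stated as additions to the operating system baseline, 
a rough total is a simple sum. The sketch below assumes that reading; the dictionary and 
function names are hypothetical, and the result still excludes whatever RAM your own 
application and other loaded programs need. </p>

```python
# Operating system baselines in MB, taken from the paragraph above.
OS_RAM_MB = {
    "Windows 95": {"minimal": 12, "recommended": 16, "preferred": 24},
    "Windows NT": {"minimal": 16, "recommended": 24, "preferred": 32},
}


def total_ram_mb(os_name: str, level: str, additional_mb: float) -> float:
    """OS baseline at the chosen level plus the speech engine's extra RAM."""
    return OS_RAM_MB[os_name][level] + additional_mb


if __name__ == "__main__":
    # Dictation (8MB extra) on a recommended Windows 95 configuration.
    print(total_ram_mb("Windows 95", "recommended", 8))
    # Command and control (1MB extra) on a minimal Windows NT configuration.
    print(total_ram_mb("Windows NT", "minimal", 1))
```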

<h4>TTS Processor and Memory Requirements</h4>

<p>TTS engines do not place as much of a demand on workstation resources as SR engines. 
Usually TTS services require only a 486/33 processor and 1MB of additional RAM. TTS 
programs themselves are rather small, about 150KB. However, the grammar and prosody rules 
can demand as much as another 1MB depending on the complexity of the language being 
spoken. It is interesting to note that probably the most complex and demanding language 
for TTS processing is English. This is primarily due to the irregular spelling patterns of 
the language. </p>

<p>Most TTS engines use speech synthesis to produce the audio output. However, advanced 
systems can use diphone concatenation. Since diphone-based systems rely on a set of actual 
voice samples for reproducing written text, these systems can require an additional 1MB of 
RAM. To be safe, it is a good idea to suggest a requirement of 2MB of additional RAM, with 
a recommendation of 4MB for advanced TTS systems. </p>
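<p>The TTS figures quoted in this section can be added up as a worst-case estimate. The 
sketch below assumes the three components are simply additive and that 1MB is 1024KB; 
the constant and function names are hypothetical. </p>

```python
ENGINE_KB = 150        # approximate size of the TTS program itself
RULES_MAX_KB = 1024    # grammar and prosody rules, worst case ("another 1MB")
DIPHONE_KB = 1024      # voice samples for diphone concatenation


def tts_worst_case_kb(uses_diphones: bool) -> int:
    """Upper-bound estimate of additional RAM for a TTS engine, in KB."""
    total = ENGINE_KB + RULES_MAX_KB
    if uses_diphones:
        total += DIPHONE_KB
    return total


if __name__ == "__main__":
    print(tts_worst_case_kb(False))  # synthesis-only engine
    print(tts_worst_case_kb(True))   # diphone-based engine
```

<p>Under these assumptions a synthesis-only engine stays below 2MB, while a diphone-based 
engine lands slightly above it, which is one way to read the suggestion of 4MB for 
advanced systems. </p>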

<h3><a NAME="SoftwareRequirementsOperatingSystems">Software Requirements: Operating Systems 
and Speech Engines</a></h3>

<p>The general software requirements are rather simple. The Microsoft Speech API can be 
implemented only on 32-bit Windows operating systems. This means you'll need Windows 95, 
or Windows NT 3.5 or later, on the workstation. </p>
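<p>The operating system rule above can be written as a small predicate. This is a sketch 
only: the function name is hypothetical, it works on values you pass in rather than 
querying the running system, and the version is assumed to be given as the two numbers 
printed in the product name, for example (3, 1), (3, 5), (3, 51), or (4, 0). </p>

```python
def supports_sapi(os_name: str, version: tuple = (0, 0)) -> bool:
    """True if the named OS meets the requirement stated in the text.

    version is (major, minor) as printed in the product name, so
    Windows NT 3.51 is (3, 51) and Windows NT 3.1 is (3, 1).
    """
    if os_name == "Windows 95":
        return True
    if os_name == "Windows NT":
        return version >= (3, 5)
    # 16-bit Windows and anything else is outside the stated requirement.
    return False


if __name__ == "__main__":
    print(supports_sapi("Windows 95"))
    print(supports_sapi("Windows NT", (3, 51)))
    print(supports_sapi("Windows NT", (3, 1)))
```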
<div align="center"><center>

<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
  <tr>
    <td><b>Note</b></td>
  </tr>
  <tr>
    <td><blockquote>
      <p>All the testing and programming examples covered in this book have been performed using 
      Windows 95. It is assumed that Windows NT systems will not require any additional 
      modifications.</p>
    </blockquote>
    </td>
  </tr>
</table>
</center></div>

<p>The most important software requirements for implementing speech services are the SR 
and TTS engines. An SR/TTS engine is the back-end processing module in the SAPI model. 
Your application is the front end, and the <tt><font FACE="Courier">SPEECH.DLL</font></tt> 
acts as the broker between the two processes. </p>
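<p>The front end, broker, and back end arrangement described above can be pictured with 
a toy model. Every class and method name below is invented for illustration; none of 
them corresponds to a real SAPI interface, and the stand-in engine does no speech 
processing at all. </p>

```python
class EchoEngine:
    """Stand-in back end; a real SR/TTS engine would do the actual work."""

    def process(self, request: str) -> str:
        return "engine handled: " + request


class Broker:
    """Plays the role the text assigns to SPEECH.DLL: it sits between the
    application and whichever engine is installed."""

    def __init__(self, engine):
        self._engine = engine

    def submit(self, request: str) -> str:
        # The application never calls the engine directly.
        return self._engine.process(request)


def application(broker: Broker) -> str:
    """Front end: talks only to the broker."""
    return broker.submit("speak 'hello'")


if __name__ == "__main__":
    print(application(Broker(EchoEngine())))
```

<p>The point of the arrangement is that the application depends only on the broker, so a 
different engine can be installed without changing the application. </p>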

<p>New multimedia PCs usually include SR/TTS engines as part of their initial software 
package. For existing PCs, most sound cards now ship with SR/TTS engines. </p>

<p>Microsoft's Speech SDK does not include a set of SR/TTS engines. However, Microsoft 
does have an engine on the market. Its Microsoft Phone software system (available as 
part of modem/sound card packages) includes the Microsoft Voice SR/TTS engine. You can 
also purchase engines directly from third-party vendors. </p>
<div align="center"><center>

<table BORDERCOLOR="#000000" BORDER="1" WIDTH="80%">
  <tr>
    <td><b>Note</b></td>
  </tr>
  <tr>
    <td><blockquote>
      <p>Refer to Appendix B, &quot;SAPI Resources,&quot; for a list of vendors that support the 
      Speech API. You can also check the CD-ROM that ships with this book for the most recent 
      list of SAPI vendors. Finally, the Microsoft Speech SDK contains a list of SAPI engine 
      providers in the <tt><font FACE="Courier">ENGINE.DOC</font></tt> file. </p>
    </blockquote>
    </td>
  </tr>
</table>
</center></div>

<h3><a NAME="SpecialHardwareRequirementsSoundCard">Special Hardware Requirements: Sound 
Cards, Microphones, and Speakers</a></h3>

<p>Complete speech-capable workstations need three additional pieces of hardware: </p>

<ul>
  <li><font COLOR="#000000">A </font><i>sound card</i> for audio reproduction </li>
  <li><i>Speakers</i> for audio playback </li>
  <li><font COLOR="#000000">A </font><i>microphone</i> for audio input </li>
</ul>

<p>Just about any sound card can support SR/TTS engines. Any of the major vendors' cards 
are acceptable, including Sound Blaster and its compatibles, Media Vision, ESS Technology, 
and others. Any card that is compatible with Microsoft's Windows Sound System is also 
acceptable. </p>
