<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0066)http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm -->
<HTML><HEAD><TITLE>Speech Synthesis &amp; Speech Recognition Using SAPI 5.1</TITLE>
<META content="text/html; charset=windows-1252" http-equiv=Content-Type>
<META content="MSHTML 5.00.2614.3500" name=GENERATOR></HEAD>
<BODY bgColor=lightblue><A name=Top></A><FONT 
face="Verdana, Arial, Helvetica, sans-serif" size=2><IMG align=right alt=Athena 
height=164 
src="Speech Synthesis &amp; Speech Recognition Using SAPI 5_1.files/Athena.gif" 
width=174> 
<H1>
<P align=center>Speech Synthesis &amp; Speech Recognition Using SAPI 
5.1</P></H1>
<P align=center><A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#AboutBrian"><I>Brian 
Long</I></A> (<A href="http://www.blong.com/" 
target=_blank>http://www.blong.com/</A>)</P>
<H2>Table of Contents</H2>
<UL>
  <LI><A 
  href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Introduction">Introduction</A> 

  <LI><A 
  href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#TTS">Speech 
  Synthesis</A> 
  <UL>
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#EnumVoices">Enumerating 
    Voices</A> 
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Speech">Making 
    Your Computer Talk</A> 
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Events">Voice 
    Events</A> 
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Animation">Animating 
    Speech</A> 
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#KeepingTrack">Keeping 
    Track Of Spoken Text</A> 
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#SpeakingDialogs">Speaking 
    Dialogs</A> </LI></UL>
  <LI><A 
  href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#SR">Speech 
  Recognition</A> 
  <UL>
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Grammars">Grammars</A> 

    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#DSR">Continuous 
    Dictation Recognition</A> 
    <UL>
      <LI><A 
      href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#GramNotify">Grammar 
      Notifications</A> 
      <LI><A 
      href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#EngineDialogs">Engine 
      Dialogs</A> </LI></UL>
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#CnC">Command 
    and Control Recognition</A> 
    <LI><A 
    href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Troubleshooting">Speech 
    Recognition Troubleshooting</A> </LI></UL>
  <LI><A 
  href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Deployment">SAPI 
  5.1 Deployment</A> 
  <LI><A 
  href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Summary">Summary</A> 

  <LI><A 
  href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#References">References/Further 
  Reading</A> 
  <LI><A 
  href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#AboutBrian">About 
  Brian Long</A> </LI></UL>
<P><A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.zip">Click 
here</A> to download the files associated with this article.</P>
<HR>

<H2><A name=Introduction>Introduction</A></H2>
<P>This article shows how to add speech capabilities to Microsoft Windows 
applications written in Delphi, using the Microsoft Speech API version 5.1 
(SAPI 5.1). For an overview of speech technology, please <A 
href="http://www.blong.com/Conferences/DCon2002/Speech/Speech.htm">click 
here</A>.</P>
<P>Building speech-enabled applications with SAPI 4 is covered separately. 
Information on using the SAPI 4 high level interfaces can be found by <A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI4HighLevel/SAPI4.htm">clicking 
here</A>, whilst discussion of the low level interfaces can be found by <A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI4LowLevel/SAPI4.htm">clicking 
here</A>.</P>
<P>SAPI 5.1 exposes most of the important interfaces, types and constants 
through a registered type library (SAPI 5.0 did not do this, making it difficult 
to use in Delphi without someone writing the equivalent of the JEDI import unit 
for SAPI 5). This means that you can access SAPI 5.1 functionality through late 
bound or early bound Automation. We will focus our attention on early bound 
Automation, which requires you to import the type library.</P>
<P>Choose <FONT face="Courier New, Courier, mono">Project | Import Type 
Library...</FONT> and locate the type library described as <I>Microsoft Speech 
Object Library (Version 5.1)</I> in the list. Now ensure the <FONT 
face="Courier New, Courier, mono">Generate Component Wrapper</FONT> checkbox is 
checked so the type library import unit will include component wrapper classes 
for each exposed Automation object. These components will go on the 
<I>ActiveX</I> page of the Component Palette by default, but you may wish to 
specify a more appropriate page, such as <I>SAPI 5.1</I>.</P>
<P>Now press <FONT face="Courier New, Courier, mono">Install...</FONT> so the 
type library will be imported and the generated components will be installed 
onto the Component Palette (pressing <FONT 
face="Courier New, Courier, mono">Create Unit</FONT> would also generate the 
type library import unit, but would require us to install it manually).</P>
<P>The generated import unit is called SpeechLib_TLB.pas and will be installed 
in a package. You can either select the default package offered (the <I>Borland 
User Components</I> package by default), choose to open a different package or 
even create a new one. When the package is compiled and installed you will get a 
whopping set of 19 new components on the <I>SAPI 5.1</I> page of the Component 
Palette.</P>
<P>Each component is named after the primary interface it implements. So for 
example, the <FONT face="Courier New, Courier, mono">TSpVoice</FONT> component 
implements the <FONT face="Courier New, Courier, mono">SpVoice</FONT> interface. 
You can find abundant documentation on all these interfaces in the SAPI 5.1 SDK 
documentation.</P>
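<P>To illustrate how the component names map onto code, here is a minimal 
sketch that creates a <FONT face="Courier New, Courier, mono">TSpVoice</FONT> 
at run time instead of dropping it on a form. It assumes SpeechLib_TLB.pas was 
generated with component wrappers as described above and is on the unit search 
path; the program name and the <FONT 
face="Courier New, Courier, mono">SayHello</FONT> routine are made up for this 
example, and the <FONT face="Courier New, Courier, mono">Speak</FONT> call 
itself is discussed in the Speech Synthesis section.</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>program</B> EarlyBoundSpeak; // hypothetical name
{$APPTYPE CONSOLE}
<B>uses</B>
  ActiveX,        // CoInitialize, CoUninitialize
  SpeechLib_TLB;  // generated import unit: TSpVoice, SVSFDefault

<B>procedure</B> SayHello;
<B>var</B>
  Voice: TSpVoice;
<B>begin</B>
  // The wrapper connects to the SpVoice Automation object on first use
  Voice := TSpVoice.Create(<B>nil</B>);
  <B>try</B>
    Voice.Speak(<I>'Hello from the imported type library'</I>, SVSFDefault);
  <B>finally</B>
    Voice.Free;
  <B>end</B>;
<B>end</B>;

<B>begin</B>
  // A console program must initialise COM itself
  CoInitialize(<B>nil</B>);
  <B>try</B>
    SayHello;
  <B>finally</B>
    CoUninitialize;
  <B>end</B>;
<B>end</B>.
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>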
<P>Ready made SAPI 5.1 packages containing Automation components for Delphi 5, 6 
and 7 can be found in appropriately named subdirectories under SAPI 5.1 in the 
accompanying files.</P>
<P><B><U>Note:</U></B> If you are using Delphi 6 you will encounter a problem 
that is still present even with Update Pack 2 installed. The type library 
importer has a bug where the parameters to Automation events are incorrectly 
dispatched (they are sent in reverse order) meaning that all the Automation 
events operate incorrectly (if at all). You can avoid this by importing the type 
library in Delphi 5 or 7 and using the generated type library import unit in 
Delphi 6. A Delphi 6 compatible package is supplied with <A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.zip">this 
article's files</A> (it uses a Delphi 5 generated type library import unit).</P>
<P><B><U>Note:</U></B> The Delphi 7 type library importer has been improved to 
produce more accurate Pascal representations of items in the type library than 
Delphi 5 did (and than Delphi 6 tried to). As a result, event handlers often 
have different parameter lists in a Delphi 7 generated import unit, so the 
sample programs won't compile in Delphi 7 against an import unit that Delphi 7 
itself generated.</P>
<P>If you wish, you can write late bound Automation that calls <FONT 
face="Courier New, Courier, mono">CreateOleObject</FONT> to instantiate the 
Automation objects. In the case of the <FONT 
face="Courier New, Courier, mono">SpVoice</FONT> interface, you would 
execute:</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>uses</B>
  ComObj; // declares CreateOleObject
...
<B>var</B>
  SpVoice: Variant;
...
SpVoice := CreateOleObject(<I>'SAPI.SpVoice'</I>);
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
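<P>Once the object exists, its methods are called by name through the Variant 
and resolved at run time. The following is a minimal sketch of a complete late 
bound program, not code from the accompanying files: the program name and the 
<FONT face="Courier New, Courier, mono">SayHello</FONT> routine are made up for 
illustration. Because no import unit is used, the named flag constants are not 
available, so the literal 0 (the value of <FONT 
face="Courier New, Courier, mono">SVSFDefault</FONT>) is passed instead.</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>program</B> LateBoundSpeak; // hypothetical name
{$APPTYPE CONSOLE}
<B>uses</B>
  ActiveX,  // CoInitialize, CoUninitialize
  ComObj;   // CreateOleObject

<B>procedure</B> SayHello;
<B>var</B>
  SpVoice: Variant;
<B>begin</B>
  SpVoice := CreateOleObject(<I>'SAPI.SpVoice'</I>);
  // Late bound call: the method name is looked up at run time
  SpVoice.Speak(<I>'Hello from late bound Automation'</I>, 0);
  // The Variant releases the object when it goes out of scope here,
  // before COM is shut down in the main block
<B>end</B>;

<B>begin</B>
  // A console program must initialise COM itself
  CoInitialize(<B>nil</B>);
  <B>try</B>
    SayHello;
  <B>finally</B>
    CoUninitialize;
  <B>end</B>;
<B>end</B>.
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<P>The trade-off is that a misspelt method name or a wrong parameter count is 
only reported when the call executes, whereas early bound code is checked by 
the compiler.</P>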
<H2><A name=TTS>Speech Synthesis</A></H2>
<P>At its simplest level, all you need to do to get your program to speak is to 
use a <FONT face="Courier New, Courier, mono">TSpVoice</FONT> Automation object 
and call the <FONT face="Courier New, Courier, mono">Speak</FONT> method. A 
trivial application that does this can be found in the TextToSpeechSimple.dpr 
project in <A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.zip">the 
files associated with this article</A>. The code looks like this:</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>procedure</B> TfrmTextToSpeech.Button1Click(Sender: TObject);
<B>begin</B>
  SpVoice1.Speak(memText.Text, SVSFDefault)
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<P>And there you have it: a speaking application. The call to <FONT 
face="Courier New, Courier, mono">Speak</FONT> takes two parameters that we 
should examine:</P>
<UL>
  <LI>The first is the text to speak, passed as a <FONT 
  face="Courier New, Courier, mono">WideString</FONT>. In the call above the 
  second parameter is <FONT 
  face="Courier New, Courier, mono">SVSFDefault</FONT>, so the call is 
  synchronous and does not return until the text has been spoken. 
  <LI>The second parameter represents some flags that indicate how to use the 
  first parameter (you can combine multiple flags with the <FONT 
  face="Courier New, Courier, mono">or</FONT> operator). For example:<BR>
  <UL>
    <LI><FONT face="Courier New, Courier, mono">SVSFDefault</FONT> means the 
    <FONT face="Courier New, Courier, mono">Speak</FONT> method will be 
    synchronous 
    <LI><FONT face="Courier New, Courier, mono">SVSFlagsAsync</FONT> makes the 
    <FONT face="Courier New, Courier, mono">Speak</FONT> method asynchronous and 
    so it returns immediately (you can use events to find out when speech 
    terminates, or call the <FONT 
    face="Courier New, Courier, mono">WaitUntilDone</FONT> method, or call <FONT 
    face="Courier New, Courier, mono">SpeakCompleteEvent</FONT> to receive a 
    Win32 event handle, which can be passed to <FONT 
    face="Courier New, Courier, mono">WaitForSingleObject</FONT>).<BR>Note that 
    the <FONT face="Courier New, Courier, mono">Speak</FONT> method returns a 
    stream number. When queuing several asynchronous voice streams, the stream 
    number allows you to identify them; each voice event passes the stream 
    number to which it relates as a parameter. 
    <LI><FONT face="Courier New, Courier, mono">SVSFPurgeBeforeSpeak</FONT> 
    means any text being spoken and any text queued to speak will be immediately 
    cancelled. 
    <LI><FONT face="Courier New, Courier, mono">SVSFNLPSpeakPunc</FONT> means 
    punctuation marks are read out by their names, rather than being used as 
    punctuation (so ? is read out as <I>question mark</I>) 
    <LI><FONT face="Courier New, Courier, mono">SVSFIsFilename</FONT> means the 
    first parameter is a file name containing text to speak. 
    <LI><FONT face="Courier New, Courier, mono">SVSFIsXML</FONT> means the 
    text includes XML tags to alter attributes of the spoken text. For example, 
    this text controls the pitch, rate, volume, emphasis and pronunciation of 
    the spoken text:<BR>
    <TABLE bgColor=white border=1>
      <TBODY>
      <TR>
        <TD><PRE><CODE><FONT color=black size=2>
&lt;EMPH&gt;Hello&lt;/EMPH&gt;
&lt;PRON SYM="d eh l f y"&gt;Delphi&lt;/PRON&gt; developers!
&lt;VOLUME LEVEL="70"&gt;
I can speak &lt;PITCH MIDDLE="+10"&gt;high&lt;/PITCH&gt; and &lt;PITCH MIDDLE="-10"&gt;low&lt;/PITCH&gt;.
I can speak &lt;RATE SPEED="+10"&gt;very quickly&lt;/RATE&gt; and &lt;RATE SPEED="-10"&gt;very slowly&lt;/RATE&gt;.
I can speak &lt;VOLUME LEVEL="40"&gt;quietly&lt;/VOLUME&gt; and &lt;VOLUME LEVEL="100"&gt;loudly&lt;/VOLUME&gt;.
&lt;/VOLUME&gt;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE></LI></UL></LI></UL>
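<P>The flags described above can be combined in one call. The following sketch 
is not from the accompanying files: it assumes a second button, hypothetically 
named <FONT face="Courier New, Courier, mono">btnSpeakAsync</FONT>, has been 
added to the form from the simple example, and that the form unit uses 
SysUtils and SpeechLib_TLB. It purges anything already queued, speaks some 
marked-up text asynchronously, keeps the returned stream number, and then 
waits with a timeout purely to show <FONT 
face="Courier New, Courier, mono">WaitUntilDone</FONT>.</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>procedure</B> TfrmTextToSpeech.btnSpeakAsyncClick(Sender: TObject);
<B>const</B>
  // Illustrative text using tags from the XML example above
  Markup = <I>'&lt;EMPH&gt;Hello&lt;/EMPH&gt; Delphi developers! '</I> +
    <I>'I can speak &lt;RATE SPEED="-10"&gt;very slowly&lt;/RATE&gt;.'</I>;
<B>var</B>
  StreamNumber: Integer;
<B>begin</B>
  // Cancel pending speech, treat the text as XML, return immediately
  StreamNumber := SpVoice1.Speak(Markup,
    SVSFlagsAsync <B>or</B> SVSFPurgeBeforeSpeak <B>or</B> SVSFIsXML);
  Caption := Format(<I>'Speaking stream %d'</I>, [StreamNumber]);
  // Wait for at most 10 seconds; True means the speech completed
  <B>if</B> SpVoice1.WaitUntilDone(10000) <B>then</B>
    Caption := <I>'Finished'</I>
  <B>else</B>
    Caption := <I>'Still speaking'</I>;
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<P>Waiting in the click handler blocks the user interface, so in a real 
application you would more likely return straight away and react to the voice 
events instead.</P>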
<P>When the program runs, you type some text into a memo and press a button to 
have it spoken.</P>
<P align=center><IMG 
src="Speech Synthesis &amp; Speech Recognition Using SAPI 5_1.files/TextToSpeechSimple.png"></P>
<P>That's the simple example out of the way, but what can we achieve if we dig a 
little deeper and get our hands a little dirtier? The next project, which 
answers that question, can be found as TextToSpeech.dpr in <A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.zip">this 
article's files</A>. You can see it running in the screenshot below. Notice that 
as the text is spoken the current sentence is italicised, the current word is 
selected, and the phonemes spoken are written to a memo.</P>
<P align=center><IMG 
src="Speech Synthesis &amp; Speech Recognition Using SAPI 5_1.files/TextToSpeech.png"></P>
<P>The following sections describe the important parts of the code from this 
project.</P>
<H3><A name=EnumVoices>Enumerating Voices</A></H3>
<P>The first thing the program does is to add a list of all the available voices 
to the combobox and set the rate and volume track bar positions. The latter part 
of this is trivial as the voice rate and volume are always within predetermined 
ranges (the volume is in the range 0 to 100 and the rate is in the range -10 to 
10).</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>procedure</B> TfrmTextToSpeech.FormCreate(Sender: TObject);
<B>var</B>
