<B>begin</B>
  InvokeUI(SPDUI_AudioVolume, <I>'Audio Volume'</I>)
<B>end</B>;

<B>procedure</B> TfrmContinuousDictation.InvokeUI(<B>const</B> TypeOfUI, Caption: WideString);
<B>var</B>
  U: OleVariant;
<B>begin</B>
  U := Unassigned;
  <B>if</B> SpSharedRecoContext.Recognizer.IsUISupported(TypeOfUI, U) <B>then</B>
    SpSharedRecoContext.Recognizer.DisplayUI(Handle, Caption, TypeOfUI, U)
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<H3><A name=CnC>Command and Control Recognition</A></H3>
<P>For command and control (C&amp;C) recognition we will need a grammar to give the SR engine rules by 
which to recognise the commands. This grammar is used by a sample project called 
CommandAndControl.dpr in <A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.zip">the 
files that accompany this article</A>.</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
&lt;GRAMMAR LANGID="809"&gt;

  &lt;!-- "Constant" definitions --&gt;

  &lt;DEFINE&gt;
    &lt;ID NAME="RID_start" VAL="1"/&gt;
    &lt;ID NAME="PID_chosencolour" VAL="2"/&gt;
    &lt;ID NAME="PID_colourvalue" VAL="3"/&gt;
  &lt;/DEFINE&gt;

  &lt;!-- Rule definitions --&gt;

  &lt;RULE NAME="start" ID="RID_start" TOPLEVEL="ACTIVE"&gt;
    &lt;O&gt;colour&lt;/O&gt;
    &lt;RULEREF NAME="colour" PROPNAME="chosencolour" PROPID="PID_chosencolour" /&gt;
    &lt;O&gt;please&lt;/O&gt;
  &lt;/RULE&gt;

  &lt;RULE NAME="colour"&gt;
    &lt;L PROPNAME="colourvalue" PROPID="PID_colourvalue"&gt;
      &lt;P VAL="1"&gt;red&lt;/P&gt;
      &lt;P VAL="2"&gt;blue&lt;/P&gt;
      &lt;P VAL="3"&gt;green&lt;/P&gt;
    &lt;/L&gt;
  &lt;/RULE&gt;
&lt;/GRAMMAR&gt;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<P>After some constant definitions, the rules are laid out. The top level rule 
(<I>start</I>, which is just an arbitrarily chosen name) is defined as the 
optional word <I>colour</I>, a value from another rule (also called 
<I>colour</I>) and the optional word <I>please</I>. The value from the colour 
rule can be identified programmatically (rather than by scanning the recognised 
text) thanks to it being defined as a property (<I>chosencolour</I>).</P>
<P>The colour rule defines one of three colours that can be spoken, each of 
which has a value defined for it. Again, this value will be accessible thanks to 
the list being defined as a property (<I>colourvalue</I>).</P>
<P>This grammar is stored in an XML file and loaded in the <FONT 
face="Courier New, Courier, mono">OnCreate</FONT> event handler.</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>procedure</B> TfrmCommandAndControl.FormCreate(Sender: TObject);
<B>begin</B>
  <FONT color=#003399><I>//OnAudioLevel event is not fired by default - this changes that</I></FONT>
  SpSharedRecoContext.EventInterests := SREAllEvents;
  SRGrammar := SpSharedRecoContext.CreateGrammar(0);
  SRGrammar.CmdLoadFromFile(<I>'C and C Grammar.xml'</I>, SLODynamic);
  SRGrammar.CmdSetRuleIdState(0, SGDSActive)
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<P>Notice that two different <FONT 
face="Courier New, Courier, mono">ISpeechRecoGrammar</FONT> methods are used to 
instigate command and control recognition. <FONT 
face="Courier New, Courier, mono">CmdLoadFromFile</FONT> loads a grammar from an 
XML file and <FONT face="Courier New, Courier, mono">CmdSetRuleIdState</FONT> 
activates all top level rules when the first parameter is zero (you can activate 
individual rules by passing their rule ID).</P>
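<P>As a minimal sketch of the per-rule form of this call, the routine below 
switches just the <I>start</I> rule on or off. It is not part of the sample 
project: <FONT face="Courier New, Courier, mono">SetStartRuleActive</FONT> is a 
hypothetical helper that would also need declaring in the form class, and the 
<FONT face="Courier New, Courier, mono">RID_start</FONT> constant is assumed to 
mirror the value given to <I>RID_start</I> in the grammar's constant 
definitions.</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>const</B>
  <FONT color=#003399><I>//Assumed to match VAL="1" for RID_start in the grammar file</I></FONT>
  RID_start = 1;

<FONT color=#003399><I>//Hypothetical helper, not in the sample project</I></FONT>
<B>procedure</B> TfrmCommandAndControl.SetStartRuleActive(Active: Boolean);
<B>begin</B>
  <B>if</B> Active <B>then</B>
    SRGrammar.CmdSetRuleIdState(RID_start, SGDSActive)
  <B>else</B>
    SRGrammar.CmdSetRuleIdState(RID_start, SGDSInactive)
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>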
<P>The <FONT face="Courier New, Courier, mono">OnRecognition</FONT> event 
handler does the work of locating the <I>chosencolour</I> property and then 
finding the nested <I>colourvalue</I> property. Its value is used to change the 
form colour at the user's request, for example with phrases such as:</P>
<UL>
  <LI>red please 
  <LI>colour green 
  <LI>colour blue please 
  <LI>red </LI></UL>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>procedure</B> TfrmCommandAndControl.SpSharedRecoContextRecognition(
  ASender: TObject; StreamNumber: Integer; StreamPosition: OleVariant;
  RecognitionType: TOleEnum; <B>const</B> Result: ISpeechRecoResult);
<B>begin</B>
  <B>with</B> Result.PhraseInfo <B>do</B>
  <B>begin</B>
    Log(<I>'OnRecognition: %s'</I>, [GetText(0, -1, True)]);
    <B>case</B> GetPropValue(Result, [<I>'chosencolour'</I>, <I>'colourvalue'</I>]) <B>of</B>
      1: Color := clRed;
      2: Color := clBlue;
      3: Color := clGreen;
    <B>end</B>
  <B>end</B>
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<P>This code uses a helper routine, <FONT 
face="Courier New, Courier, mono">GetPropValue</FONT>, whose task is to locate 
the appropriate property in the result object, by following the property path 
specified in the string array parameter. The code for <FONT 
face="Courier New, Courier, mono">GetPropValue</FONT> and its own helper 
routine, <FONT face="Courier New, Courier, mono">GetProp</FONT>, looks like 
this:</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<B>function</B> GetProp(Props: ISpeechPhraseProperties;
  <B>const</B> Name: <B>String</B>): ISpeechPhraseProperty; overload;
<B>var</B>
  I: Integer;
  Prop: ISpeechPhraseProperty;
<B>begin</B>
  Result := <B>nil</B>;
  <B>for</B> I := 0 <B>to</B> Props.Count - 1 <B>do</B>
  <B>begin</B>
    Prop := Props.Item(I);
    <B>if</B> CompareText(Prop.Name, Name) = 0 <B>then</B>
    <B>begin</B>
      Result := Prop;
      Break
    <B>end</B>
  <B>end</B>
<B>end</B>;

<B>function</B> GetPropValue(SRResult: ISpeechRecoResult;
  <B>const</B> Path: <B>array</B> <B>of</B> <B>String</B>): OleVariant;
<B>var</B>
  Prop: ISpeechPhraseProperty;
  PathLoop: Integer;
<B>begin</B>
  <B>for</B> PathLoop := Low(Path) <B>to</B> High(Path) <B>do</B>
  <B>begin</B>
    <B>if</B> PathLoop = Low(Path) <B>then</B> <FONT color=#003399><I>//top level property</I></FONT>
      Prop := GetProp(SRResult.PhraseInfo.Properties, Path[PathLoop])
    <B>else</B> <FONT color=#003399><I>//nested property</I></FONT>
      Prop := GetProp(Prop.Children, Path[PathLoop]);
    <B>if</B> <B>not</B> Assigned(Prop) <B>then</B>
    <B>begin</B>
      Result := Unassigned;
      Exit;
    <B>end</B>
  <B>end</B>;
  Result := Prop.Value
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
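<P>Since <FONT face="Courier New, Courier, mono">GetPropValue</FONT> returns 
<FONT face="Courier New, Courier, mono">Unassigned</FONT> when any step of the 
property path is missing, a caller may prefer to test for that explicitly before 
using the value. The following is only a sketch of one way to do so: <FONT 
face="Courier New, Courier, mono">ColourFromResult</FONT> and its <FONT 
face="Courier New, Courier, mono">Fallback</FONT> parameter are hypothetical 
names, not part of the sample project, and <FONT 
face="Courier New, Courier, mono">VarIsEmpty</FONT> is assumed to come from the 
Variants unit (Delphi 6 or later).</P>
<TABLE bgColor=white border=1>
  <TBODY>
  <TR>
    <TD><PRE><CODE><FONT color=black size=2>
<FONT color=#003399><I>//Hypothetical helper, not in the sample project</I></FONT>
<B>function</B> ColourFromResult(SRResult: ISpeechRecoResult;
  Fallback: TColor): TColor;
<B>var</B>
  Value: OleVariant;
<B>begin</B>
  Result := Fallback;
  Value := GetPropValue(SRResult, [<I>'chosencolour'</I>, <I>'colourvalue'</I>]);
  <FONT color=#003399><I>//Unassigned means some step of the property path was not found</I></FONT>
  <B>if</B> VarIsEmpty(Value) <B>then</B>
    Exit;
  <B>case</B> Integer(Value) <B>of</B>
    1: Result := clRed;
    2: Result := clBlue;
    3: Result := clGreen;
  <B>end</B>
<B>end</B>;
</FONT></CODE></PRE></TD></TR></TBODY></TABLE>
<P>With a helper like this, the recognition event handler could assign <FONT 
face="Courier New, Courier, mono">Color := ColourFromResult(Result, Color)</FONT> 
and leave the form colour unchanged when no colour value is present.</P>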
<P>This is what the application looks like when running.</P>
<P align=center><IMG 
src="Speech Synthesis &amp; Speech Recognition Using SAPI 5_1.files/CommandAndControl.png"></P>
<H3><A name=Troubleshooting>Speech Recognition Troubleshooting</A></H3>
<P>If SR stops (or fails to start) unexpectedly, or behaves oddly in other 
ways, check that your recording settings have the microphone enabled.</P>
<UL>
  <LI>Double-click the Volume icon in your Task Bar's System Tray. If no Volume 
  icon is present, choose <FONT face="Courier New, Courier, mono">Start | 
  Programs | Accessories | Entertainment | Volume Control</FONT>. 
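  <LI>If you see a <FONT face="Courier New, Courier, mono">Microphone</FONT> 
  column, ensure it has its <FONT face="Courier New, Courier, mono">Mute</FONT> 
  checkbox checked (this is the playback mixer, so muting here only stops the 
  microphone being echoed through the speakers). 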
  <LI>Choose <FONT face="Courier New, Courier, mono">Options | 
  Properties</FONT>, click <FONT 
  face="Courier New, Courier, mono">Recording</FONT>, ensure the <FONT 
  face="Courier New, Courier, mono">Microphone</FONT> option is checked and 
  press OK. 
  <LI>Now ensure the <FONT face="Courier New, Courier, mono">Microphone</FONT> 
  column has its <FONT face="Courier New, Courier, mono">Select</FONT> checkbox 
  enabled, if it has one, or that its <FONT 
  face="Courier New, Courier, mono">Mute</FONT> checkbox is unchecked, if it has 
  one. </LI></UL>
<H2><A name=Deployment>SAPI 5.1 Deployment</A></H2>
<P>When distributing SAPI 5.1 applications you will need to get hold of the 
redistributable components package, available as SpeechSDK51MSM.exe from <A 
href="http://www.microsoft.com/speech/download/SDK51" 
target=_blank>http://www.microsoft.com/speech/download/SDK51</A> (a colossal 
file, weighing in at 132 MB). It contains Windows Installer merge modules for all 
the SAPI 5.1 components (the main DLLs, the TTS and SR engines, the Control 
Panel applet). The SDK documentation includes a white paper on how to use all 
these components from within a Windows Installer compatible installation 
building tool.</P>
<P align=center><IMG 
src="Speech Synthesis &amp; Speech Recognition Using SAPI 5_1.files/SAPI5CPL.png"></P>
<H2><A name=Summary>Summary</A></H2>
<P>Adding various speech capabilities into a Delphi application does not take an 
awful lot of work, particularly if you do the background work to understand the 
SAPI concepts.</P>
<P>There is much in the Speech API that we have not looked at in this paper, but 
hopefully the areas covered will be enough to whet your appetite and get you 
exploring further on your own.</P>
<H2><A name=References></A>References/Further Reading</H2>
<P>The following is a list of useful articles and papers that I found on SAPI 
5.1 development during my research on this subject.</P>
<OL>
  <LI><A name=Ref1></A><I><A 
  href="http://www.delphi3000.com/articles/article_2581.asp" 
  target=_blank>Speech Part 1 - How to Add "Text to Speech" (Speech Synthesis) 
  to your Delphi Apps</A> </I>by Alec Bergamini, <A 
  href="http://www.delphi3000.com/" target=_blank>Delphi 3000</A>.<BR>This 
  discusses installing the SAPI 5.1 SDK and getting simple speech. 
  <LI><A name=Ref1></A><I><A 
  href="http://www.delphi3000.com/articles/article_2629.asp" target=_blank>
  Speech Part 2 - How to Add Simple Dictation speech recognition to your Delphi 
  Apps</A> </I>by Alec Bergamini, <A href="http://www.delphi3000.com/" 
  target=_blank>Delphi 3000</A>.<BR>This looks at simple dictation SR. </LI></OL>
<H2><A name=AboutBrian>About Brian Long</A> </H2>
<P><A href="mailto:brian@blong.com">Brian Long</A> used to work at <A 
href="http://www.borland.com/">Borland</A> UK, performing a number of duties 
including Technical Support on all the programming tools. Since leaving in 1995, 
Brian has been providing training and consultancy services to the Delphi and 
C++Builder communities, and the newly forming Kylix community. 
<P>If you need training in these products, or need solutions to problems you 
have with them, please <A href="mailto:brian@blong.com">get in touch</A>, or 
visit <A href="http://www.blong.com/">Brian's Web site</A>. 
<P>Besides authoring a <A 
href="http://www.amazon.com/exec/obidos/ASIN/0201593831/qid=905701291/sr=1-1/002-9464178-4139807">Borland 
Pascal problem-solving book</A> published in 1994, Brian is a regular columnist 
in <A href="http://www.thedelphimagazine.com/">The Delphi Magazine</A> and has 
had numerous articles published in Developer's Review, <A 
href="http://www.computingnet.co.uk/">Computing</A>, Delphi Developer's Journal 
and EXE Magazine. He was nominated for the <A 
href="http://www.borland.com/delphi/vote">Spirit of Delphi 2000</A> award.</P>
<P>In his spare time (and waiting for his C++ programs to compile) Brian has 
learnt the art of <A href="http://www.juggling.org/">juggling</A> and making 
inflatable <A href="http://www.paperfolding.com/">origami</A> paper frogs.</P>
<HR>

<P><A href="http://www.blong.com/Conferences/DCon2002/Speech/Speech.htm">Go to 
the speech capabilities overview </A><BR>
<P><A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI4HighLevel/SAPI4.htm">Go 
to the SAPI 4 High Level Interfaces article</A><BR>
<P><A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI4LowLevel/SAPI4.htm">Go 
to the SAPI 4 Low Level Interfaces article</A><BR>
<P><A 
href="http://www.blong.com/Conferences/DCon2002/Speech/SAPI51/SAPI51.htm#Top">Go 
back to the top of this SAPI 5.1 article</A><BR></FONT></P></BODY></HTML>
