 Internet Engineering Task Force                    Saravanan Shanmugham 
 Internet-Draft                                       Cisco Systems Inc. 
 draft-ietf-speechsc-mrcpv2-05                          October 18, 2004 
 Expires: April 18, 2005                                                 
                                                                         
                                                                         
                                                                         
  
  
  
              Media Resource Control Protocol Version 2 (MRCPv2) 
                                           
  
 Status of this Memo  
     
    By submitting this Internet-Draft, we certify that any applicable 
    patent or other IPR claims of which we are aware have been 
    disclosed, and any of which we become aware will be disclosed, in 
    accordance with RFC 3668.  
         
    Internet-Drafts are working documents of the Internet Engineering 
    Task Force (IETF), its areas, and its working groups.  Note that 
    other groups may also distribute working documents as Internet-
    Drafts.  
         
    Internet-Drafts are draft documents valid for a maximum of six 
    months and may be updated, replaced, or obsoleted by other documents 
    at any time.  It is inappropriate to use Internet-Drafts as 
    reference material or to cite them other than as "work in progress".  
         
    The list of current Internet-Drafts can be accessed at 
    http://www.ietf.org/ietf/1id-abstracts.txt .  
         
    The list of Internet-Draft Shadow Directories can be accessed at 
    http://www.ietf.org/shadow.html .  
         
    This Internet-Draft will expire on April 18, 2005.  
     
           
 Copyright Notice 
     
    Copyright (C) The Internet Society (2004).  All Rights Reserved. 
                 
        
 Abstract 
   
    This document describes a proposal for a Media Resource Control 
    Protocol Version 2 (MRCPv2) and aims to meet the requirements 
    specified in the SPEECHSC working group requirements document. It is 
    based on the Media Resource Control Protocol (MRCP), also called 

  
 S. Shanmugham, et. al.                                          Page 1 

                            MRCPv2 Protocol              October, 2004 

    MRCPv1 developed jointly by Cisco Systems, Inc., Nuance 
    Communications, and Speechworks Inc.  
     
    The MRCPv2 protocol will control media service resources like speech 
    synthesizers, recognizers, signal generators, signal detectors, fax 
    servers etc. over a network. This protocol depends on a session 
    management protocol such as the Session Initiation Protocol (SIP) to 
    establish a separate MRCPv2 control session between the client and 
    the server. It also depends on SIP to establish the media pipe and 
    associated parameters between the media source or sink and the media 
    server. Once this is done, the MRCPv2 protocol exchange can happen 
    over the control session established above allowing the client to 
    command and control the media processing resources that may exist on 
    the media server.  
     
     
 Table of Contents 
     
      Status of this Memo..............................................1 
      Copyright Notice.................................................1 
      Abstract.........................................................1 
      Table of Contents................................................2 
      1.   Introduction:...............................................4 
      2.   Notational Convention.......................................5 
      3.   Architecture:...............................................5 
      3.1.  MRCPv2 Media Resources:....................................7 
      3.2.  Server and Resource Addressing.............................8 
      4.   MRCPv2 Protocol Basics......................................8 
      4.1.  Connecting to the Server...................................8 
      4.2.  Managing Resource Control Channels.........................8 
      4.3.  Media Streams and RTP Ports...............................15 
      4.4.  MRCPv2 Message Transport..................................16 
      4.5.  Resource Types............................................17 
      5.   MRCPv2 Specification.......................................17 
      5.1.  Request...................................................18 
      5.2.  Response..................................................19 
      5.3.  Event.....................................................20 
      6.   MRCP Generic Features......................................21 
      6.1.  Generic Message Headers...................................21 
      6.2.  SET-PARAMS................................................30 
      6.3.  GET-PARAMS................................................30 
      7.   Resource Discovery.........................................31 
      8.   Speech Synthesizer Resource................................32 
      8.1.  Synthesizer State Machine.................................33 
      8.2.  Synthesizer Methods.......................................33 
      8.3.  Synthesizer Events........................................34 
      8.4.  Synthesizer Header Fields.................................34 
      8.5.  Synthesizer Message Body..................................40 
      8.6.  SPEAK.....................................................43 
      8.7.  STOP......................................................44 
      8.8.  BARGE-IN-OCCURRED.........................................45 
      8.9.  PAUSE.....................................................47 
      8.10. RESUME....................................................48 
      8.11. CONTROL...................................................49 
      8.12. SPEAK-COMPLETE............................................50 
      8.13. SPEECH-MARKER.............................................51 
      8.14. DEFINE-LEXICON............................................52 
      9.   Speech Recognizer Resource.................................53 
      9.1.  Recognizer State Machine..................................54 
      9.2.  Recognizer Methods........................................54 
      9.3.  Recognizer Events.........................................55 
      9.4.  Recognizer Header Fields..................................55 
      9.5.  Recognizer Message Body...................................69 
      9.6.  DEFINE-GRAMMAR............................................83 
      9.7.  RECOGNIZE.................................................87 
      9.8.  STOP......................................................89 
      9.9.  GET-RESULT................................................90 
      9.10. START-OF-SPEECH...........................................91 
      9.11. START-INPUT-TIMERS........................................92 
      9.12. RECOGNITION-COMPLETE......................................92 
      9.13. START-PHRASE-ENROLLMENT...................................94 
      9.14. ENROLLMENT-ROLLBACK.......................................95 
      9.15. END-PHRASE-ENROLLMENT.....................................96 
      9.16. MODIFY-PHRASE.............................................96 
      9.17. DELETE-PHRASE.............................................97 
      9.18. INTERPRET.................................................97 
      9.19. INTERPRETATION-COMPLETE...................................98 
      9.20. DTMF Detection...........................................100 
      10.  Recorder Resource.........................................100 
      10.1. Recorder State Machine...................................100 
      10.2. Recorder Methods.........................................100 
      10.3. Recorder Events..........................................100 
      10.4. Recorder Header Fields...................................101 
      10.5. Recorder Message Body....................................105 
      10.6. RECORD...................................................105 
      10.7. STOP.....................................................106 
      10.8. RECORD-COMPLETE..........................................107 
      10.9. START-INPUT-TIMERS.......................................107 
      11.  Speaker Verification and Identification...................109 
      11.1. Speaker Verification State Machine.......................110 
      11.2. Speaker Verification Methods.............................110 
      11.3. Verification Events......................................111 
      11.4. Verification Header Fields...............................111 
      11.5. Verification Result Elements.............................119 
      11.6. START-SESSION............................................123 
      11.7. END-SESSION..............................................124 
      11.8. QUERY-VOICEPRINT.........................................124 
      11.9. DELETE-VOICEPRINT........................................125 
      11.10. VERIFY..................................................126 
      11.11. VERIFY-FROM-BUFFER......................................126 
      11.12. VERIFY-ROLLBACK.........................................129 
      11.13. STOP....................................................130 
      11.14. START-INPUT-TIMERS......................................131 
      11.15. VERIFICATION-COMPLETE...................................131 
      11.16. START-OF-SPEECH.........................................132 
      11.17. CLEAR-BUFFER............................................132 
      11.18. GET-INTERMEDIATE-RESULT.................................132 
      12.  Security Considerations...................................133 
      13.  Examples:.................................................133 
      14.  Reference Documents.......................................145 
      15.  Appendix..................................................146 
      15.1. ABNF Message Definitions.................................146 
      15.2. XML Schema and DTD.......................................161 
      Full Copyright Statement.......................................168 
      Intellectual Property..........................................169 
      Contributors...................................................169 
      Acknowledgements...............................................170 
      Editors' Addresses.............................................170 
     
  
 1.   Introduction: 
     
    The MRCPv2 protocol is designed to let a client device control media 
    processing resources on the network that process an audio/video 
    stream. Examples of such resources are speech recognition engines, 
    speech synthesis engines, and speaker verification or speaker 
    identification engines. This allows a vendor to implement 
    distributed Interactive Voice Response platforms such as VoiceXML 
    [7] browsers. 
     
    The SPEECHSC protocol requirements call for a protocol that is 
    capable of reaching a media processing server and setting up 
    communication channels to the media resources, in order to 
    send/receive control messages and media streams to/from the server. 
    The Session Initiation Protocol (SIP) described in [4] meets these 
    requirements and is used to set up and tear down media and control 
    pipes to the server. In addition, the SIP re-INVITE can be used to 
    change the characteristics of these media and control pipes 
    mid-session. The MRCPv2 protocol is hence designed to leverage and 
    build upon session management protocols such as the Session 
    Initiation Protocol (SIP) and the Session Description Protocol 
    (SDP). SDP is used to describe the parameters of the media pipe 
    associated with that session. It is mandatory to support SIP as the 
    session level protocol to ensure interoperability. Other protocols 
    can be used at the session level by prior agreement. 
     
    The MRCPv2 protocol depends on SIP and SDP to create the session 
    and set up the media channels to the server. It also depends on SIP 
    and SDP to establish MRCPv2 control channels between the client and 
    the server for each media processing resource required for that 
    session. The MRCPv2 protocol exchange between the client and the 
    media resource can then happen on that control channel. The MRCPv2 
    protocol exchange on this control channel does not change the state 
    of the SIP session, the media, or other parameters of the session 
    that SIP initiated. It merely controls and affects the state of the 
    media processing resource associated with that MRCPv2 channel. 
     
       The MRCPv2 protocol defines the messages to control the different 
    media processing resources and the state machines required to guide 
    their operation. It also describes how these messages are carried 
    over a transport layer such as TCP, SCTP or TLS.  
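
    The layering described in this section can be pictured with a small 
    model. The sketch below is illustrative only and is not part of the 
    protocol definition: the class names, the function name, the state 
    labels ("established", "idle", "speaking") and the media parameter 
    are assumptions chosen for the example. It shows a session that is 
    created at the session layer, one control channel per media 
    processing resource, and an MRCPv2 exchange that changes only the 
    state of the resource behind that channel. 

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ControlChannel:
    """One MRCPv2 control channel, bound to one media processing resource."""
    resource_type: str            # hypothetical label, e.g. "synthesizer"
    resource_state: str = "idle"  # hypothetical state name


@dataclass
class Session:
    """Session created at the session layer (SIP and SDP in the draft)."""
    sip_state: str = "established"                      # hypothetical label
    media_params: Dict[str, str] = field(default_factory=dict)
    channels: Dict[str, ControlChannel] = field(default_factory=dict)

    def add_channel(self, resource_type: str) -> ControlChannel:
        # In the draft this step is carried out through SIP and SDP;
        # here it is reduced to creating a record for the channel.
        channel = ControlChannel(resource_type)
        self.channels[resource_type] = channel
        return channel


def send_mrcp_request(channel: ControlChannel, new_state: str) -> None:
    """Model an MRCPv2 exchange: only the resource behind the channel changes."""
    channel.resource_state = new_state


if __name__ == "__main__":
    session = Session(media_params={"codec": "PCMU"})  # assumed parameter
    synth = session.add_channel("synthesizer")
    send_mrcp_request(synth, "speaking")
    # The session-level state and media parameters are left untouched.
    print(session.sip_state, session.media_params, synth.resource_state)
```

    Keeping the session record and the channel record separate mirrors 
    the statement that the MRCPv2 exchange does not change the state of 
    the SIP session or its media parameters. 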
  
     
 2.   Notational Convention 
     
    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this 
    document are to be interpreted as described in RFC 2119[9].  
     
    Since many of the definitions and syntax are identical to HTTP/1.1, 
    this specification only points to the section where they are defined 
    rather than copying them. For brevity, [HX.Y] is to be taken to refer 
    to Section X.Y of the current HTTP/1.1 specification (RFC 2616 [1]). 
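
    As an illustration of the [HX.Y] shorthand, the following sketch 
    expands such markers into explicit section references. The function 
    name and the regular expression are assumptions made for this 
    example; the draft itself defines only the notation. 

```python
import re

# Matches the shorthand used in this document, e.g. "[H4.2]".
# Other bracketed references such as "[7]" are deliberately not matched.
_H_REF = re.compile(r"\[H(\d+(?:\.\d+)*)\]")


def expand_http_refs(text: str) -> str:
    """Replace each [HX.Y] marker with an explicit RFC 2616 section reference."""
    return _H_REF.sub(lambda m: f"Section {m.group(1)} of RFC 2616", text)


if __name__ == "__main__":
    print(expand_http_refs("The syntax is defined in [H4.2]."))
```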
     
    All the mechanisms specified in this document are described in both 
    prose and an augmented Backus-Naur form (ABNF). ABNF is described in 
    detail in RFC 2234 [3]. 
     
    The complete message format in ABNF form is provided in Appendix 
    Section 15.1 and is the normative format definition. 
     
    Media Resource 
         An entity on the MRCP Server that can be controlled through the 
         MRCP protocol. 
     
    MRCP Server  
         Aggregate of one or more "Media Resource" entities on a Server, 
         exposed through the MRCP protocol. ("Server" for short) 
     
    MRCP Client  
         An entity controlling one or more Media Resources through the 
         MRCP protocol. ("Client" for short) 
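
    The relationship between these three terms can be sketched as a 
    small data model. Everything named below (the class names, the 
    "kind" field, the control() helper) is hypothetical and serves only 
    to illustrate the definitions: a Server aggregates one or more Media 
    Resources, and a Client controls one or more of them. 

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class MediaResource:
    """An entity on the MRCP Server that can be controlled through MRCP."""
    kind: str  # hypothetical label, e.g. "recognizer"


@dataclass
class MrcpServer:
    """Aggregate of one or more Media Resource entities."""
    resources: List[MediaResource]

    def __post_init__(self) -> None:
        # The definition says "one or more", so an empty aggregate is rejected.
        if not self.resources:
            raise ValueError("an MRCP Server exposes one or more Media Resources")


@dataclass
class MrcpClient:
    """An entity controlling one or more Media Resources."""
    controlled: List[MediaResource] = field(default_factory=list)

    def control(self, server: MrcpServer, kind: str) -> MediaResource:
        # Hypothetical helper: pick the first resource of the requested kind.
        for resource in server.resources:
            if resource.kind == kind:
                self.controlled.append(resource)
                return resource
        raise LookupError(f"server exposes no {kind!r} resource")


if __name__ == "__main__":
    server = MrcpServer([MediaResource("synthesizer"), MediaResource("recognizer")])
    client = MrcpClient()
    print(client.control(server, "recognizer"))
```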
     
     
     
 3.   Architecture: 
     
    The system consists of a client that requires the generation of 
    media streams or requires the processing of media streams and a 
    media resource server that has the resources or engines to process 
    or generate these streams. The client establishes a session using 
    SIP and SDP with the server to use its media processing resources. A 
