
📄 dict2pid.c

📁 CMU's renowned SPHINX-3 large-vocabulary continuous speech recognition system
💻 C
📖 Page 1 of 2
/* ====================================================================
 * Copyright (c) 1999-2004 Carnegie Mellon University.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * This work was supported in part by funding from the Defense Advanced
 * Research Projects Agency and the National Science Foundation of the
 * United States of America, and the CMU Sphinx Speech Consortium.
 *
 * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
 * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
 * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
 * NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * ====================================================================
 *
 */
/*
 * dict2pid.c -- Triphones for dictionary
 *
 * **********************************************
 * CMU ARPA Speech Project
 *
 * Copyright (c) 1999 Carnegie Mellon University.
 * ALL RIGHTS RESERVED.
 * **********************************************
 *
 * HISTORY
 *
 * 14-Sep-1999	M K Ravishankar (rkm@cs.cmu.edu) at Carnegie Mellon University
 * 		Added dict2pid_comsseq2sen_active().
 *
 * 04-May-1999	M K Ravishankar (rkm@cs.cmu.edu) at Carnegie Mellon University
 * 		Started.
 */

#include <assert.h>	/* for assert() in ssidlist2comsseq() */

#include "dict2pid.h"
#include "logs3.h"

/** \file dict2pid.c
 * \brief Implementation of dict2pid
 */

/**
 * Build a glist of triphone senone-sequence IDs (ssids) derivable from [b][r] at the word
 * begin position.  If no triphone found in mdef, include the ssid for basephone b.
 * Return the generated glist.
 */
static glist_t ldiph_comsseq (mdef_t *mdef, int32 b, int32 r)
{
    int32 l, p, ssid;
    glist_t g;

    g = NULL;
    /* Try every possible left context l for the word-initial phone b followed by r */
    for (l = 0; l < mdef_n_ciphone(mdef); l++) {
        p = mdef_phone_id (mdef, (s3cipid_t)b, (s3cipid_t)l, (s3cipid_t)r, WORD_POSN_BEGIN);

        if (IS_S3PID(p)) {
            ssid = mdef_pid2ssid(mdef, p);
            if (! glist_chkdup_int32 (g, ssid))
                g = glist_add_int32 (g, ssid);
        }
    }
    if (! g)
        g = glist_add_int32 (g, mdef_pid2ssid(mdef, b));

    return g;
}

/**
 * Build a glist of triphone senone-sequence IDs (ssids) derivable from [l][b] at the word
 * end position.  If no triphone found in mdef, include the ssid for basephone b.
 * Return the generated glist.
 */
static glist_t rdiph_comsseq (mdef_t *mdef, int32 b, int32 l)
{
    int32 r, p, ssid;
    glist_t g;

    g = NULL;
    /* Try every possible right context r for the word-final phone b preceded by l */
    for (r = 0; r < mdef_n_ciphone(mdef); r++) {
        p = mdef_phone_id (mdef, (s3cipid_t)b, (s3cipid_t)l, (s3cipid_t)r, WORD_POSN_END);

        if (IS_S3PID(p)) {
            ssid = mdef_pid2ssid(mdef, p);
            if (! glist_chkdup_int32 (g, ssid))
                g = glist_add_int32 (g, ssid);
        }
    }
    if (! g)
        g = glist_add_int32 (g, mdef_pid2ssid(mdef, b));

    return g;
}

/**
 * Build a glist of triphone senone-sequence IDs (ssids) derivable from [b] as a single
 * phone word.  If no triphone found in mdef, include the ssid for basephone b.
 * Return the generated glist.
 */
static glist_t single_comsseq (mdef_t *mdef, int32 b)
{
    int32 l, r, p, ssid;
    glist_t g;

    g = NULL;
    for (l = 0; l < mdef_n_ciphone(mdef); l++) {
        for (r = 0; r < mdef_n_ciphone(mdef); r++) {
            p = mdef_phone_id (mdef, (s3cipid_t)b, (s3cipid_t)l, (s3cipid_t)r, WORD_POSN_SINGLE);

            if (IS_S3PID(p)) {
                ssid = mdef_pid2ssid(mdef, p);
                if (! glist_chkdup_int32 (g, ssid))
                    g = glist_add_int32 (g, ssid);
            }
        }
    }
    if (! g)
        g = glist_add_int32 (g, mdef_pid2ssid(mdef, b));

    return g;
}

/**
 * Build a glist of triphone senone-sequence IDs (ssids) derivable from [b] as a single
 * phone word, with a given left context l.  If no triphone found in mdef, include the ssid
 * for basephone b.  Return the generated glist.
 */
static glist_t single_lc_comsseq (mdef_t *mdef, int32 b, int32 l)
{
    int32 r, p, ssid;
    glist_t g;

    g = NULL;
    for (r = 0; r < mdef_n_ciphone(mdef); r++) {
        p = mdef_phone_id (mdef, (s3cipid_t)b, (s3cipid_t)l, (s3cipid_t)r, WORD_POSN_SINGLE);

        if (IS_S3PID(p)) {
            ssid = mdef_pid2ssid(mdef, p);
            if (! glist_chkdup_int32 (g, ssid))
                g = glist_add_int32 (g, ssid);
        }
    }
    if (! g)
        g = glist_add_int32 (g, mdef_pid2ssid(mdef, b));

    return g;
}

/**
 * Convert the glist of ssids to a composite sseq id.  Return the composite ID.
 */
static s3ssid_t ssidlist2comsseq (glist_t g, mdef_t *mdef, dict2pid_t *dict2pid,
                                  hash_table_t *hs,	/* For composite states */
                                  hash_table_t *hp)	/* For composite senone seq */
{
    int32 i, j, n, s, ssid;
    s3senid_t **sen;
    s3senid_t *comsenid;
    gnode_t *gn;

    n = glist_count (g);
    if (n <= 0)
        E_FATAL("Panic: length(ssidlist)= %d\n", n);

    /* Space for list of senones for each state, derived from the given glist */
    sen = (s3senid_t **) ckd_calloc (mdef_n_emit_state (mdef), sizeof(s3senid_t *));
    for (i = 0; i < mdef_n_emit_state (mdef); i++) {
        sen[i] = (s3senid_t *) ckd_calloc (n+1, sizeof(s3senid_t));
        sen[i][0] = BAD_S3SENID;	/* Sentinel */
    }
    /* Space for composite senone ID for each state position */
    comsenid = (s3senid_t *) ckd_calloc (mdef_n_emit_state (mdef), sizeof(s3senid_t));

    for (gn = g; gn; gn = gnode_next(gn)) {
        ssid = gnode_int32 (gn);

        /* Expand ssid into individual states (senones); insert in sen[][] if not present */
        for (i = 0; i < mdef_n_emit_state (mdef); i++) {
            s = mdef->sseq[ssid][i];

            for (j = 0; (IS_S3SENID(sen[i][j])) && (sen[i][j] != s); j++);
            if (NOT_S3SENID(sen[i][j])) {
                sen[i][j] = s;
                sen[i][j+1] = BAD_S3SENID;
            }
        }
    }

    /* Convert senones list for each state position into composite state */
    for (i = 0; i < mdef_n_emit_state (mdef); i++) {
        for (j = 0; IS_S3SENID(sen[i][j]); j++);
        assert (j > 0);

        j = hash_enter_bkey (hs, (char *)(sen[i]), j*sizeof(s3senid_t), dict2pid->n_comstate);
        if (j == dict2pid->n_comstate)
            dict2pid->n_comstate++;	/* New composite state */
        else
            ckd_free ((void *) sen[i]);

        comsenid[i] = j;
    }
    ckd_free (sen);

    /* Convert sequence of composite senids to composite sseq ID */
    j = hash_enter_bkey (hp, (char *)comsenid, mdef->n_emit_state * sizeof(s3senid_t),
                         dict2pid->n_comsseq);
    if (j == dict2pid->n_comsseq) {
        dict2pid->n_comsseq++;
        if (dict2pid->n_comsseq >= MAX_S3SENID)
            E_FATAL("#Composite sseq limit(%d) reached; increase MAX_S3SENID\n",
                    dict2pid->n_comsseq);
    } else
        ckd_free ((void *) comsenid);

    return ((s3ssid_t)j);
}

/* RAH 4.16.01 This code has several leaks that must be fixed */
dict2pid_t *dict2pid_build (mdef_t *mdef, dict_t *dict)
{
    dict2pid_t *dict2pid;
    s3ssid_t *internal, **ldiph, **rdiph, *single;
    int32 pronlen;
    hash_table_t *hs, *hp;
    glist_t g;
    gnode_t *gn;
    s3senid_t *sen;
    hash_entry_t *he;
    int32 *cslen;
    int32 i, j, b, l, r, w, n, p;

    E_INFO("Building PID tables for dictionary\n");

    dict2pid = (dict2pid_t *) ckd_calloc (1, sizeof(dict2pid_t));
    dict2pid->internal = (s3ssid_t **) ckd_calloc (dict_size(dict), sizeof(s3ssid_t *));
    dict2pid->ldiph_lc = (s3ssid_t ***) ckd_calloc_3d (mdef->n_ciphone,
                                                       mdef->n_ciphone,
                                                       mdef->n_ciphone,
                                                       sizeof(s3ssid_t));
    dict2pid->single_lc = (s3ssid_t **) ckd_calloc_2d (mdef->n_ciphone,
                                                       mdef->n_ciphone,
                                                       sizeof(s3ssid_t));
    dict2pid->n_comstate = 0;
    dict2pid->n_comsseq = 0;

    hs = hash_new (mdef->n_ciphone * mdef->n_ciphone * mdef->n_emit_state, HASH_CASE_YES);
    hp = hash_new (mdef->n_ciphone * mdef->n_ciphone, HASH_CASE_YES);

    for (w = 0, n = 0; w < dict_size(dict); w++) {
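
The listing breaks off here at the end of page 1. As an aid to reading ssidlist2comsseq() above, the following standalone sketch illustrates the two ideas it relies on: collecting unique values in a sentinel-terminated array, and assigning each distinct sequence a stable integer ID on first sight. It does not use the Sphinx-3 API. The names BAD_ID, MAX_SEQ, MAX_INTERNED, add_unique, list_len, intern_t and intern are all hypothetical, and a linear search stands in for the hash table (hash_enter_bkey) used by the original.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BAD_ID       (-1)  /* hypothetical sentinel; plays the role of BAD_S3SENID */
#define MAX_SEQ      8     /* hypothetical: longest sequence this sketch can store */
#define MAX_INTERNED 16    /* hypothetical: number of distinct sequences allowed */

/* Number of values stored before the sentinel. */
static int list_len(const int *list)
{
    int j;
    for (j = 0; list[j] != BAD_ID; j++)
        ;
    return j;
}

/* Append v to a sentinel-terminated list unless it is already present.
 * cap is the total number of slots in list, including the sentinel slot.
 * Returns 1 if v is in the list afterwards, 0 if there was no room. */
static int add_unique(int *list, int cap, int v)
{
    int j;

    assert(v != BAD_ID);        /* the sentinel itself cannot be stored */
    for (j = 0; list[j] != BAD_ID && list[j] != v; j++)
        ;
    if (list[j] == v)
        return 1;               /* already present */
    if (j + 1 >= cap)
        return 0;               /* no room for the value plus the sentinel */
    list[j] = v;
    list[j + 1] = BAD_ID;
    return 1;
}

/* Table of previously seen sequences; the index of a sequence is its ID. */
typedef struct {
    int keys[MAX_INTERNED][MAX_SEQ];
    int lens[MAX_INTERNED];
    int n;                      /* number of IDs handed out so far */
} intern_t;

/* Return the ID of key[0..len-1], assigning the next free ID if the
 * sequence is new.  Returns BAD_ID if len is out of range or the table
 * is full.  Sequences are compared byte-wise, so element order matters. */
static int intern(intern_t *t, const int *key, int len)
{
    int i;

    if (len <= 0 || len > MAX_SEQ)
        return BAD_ID;
    for (i = 0; i < t->n; i++) {
        if (t->lens[i] == len &&
            memcmp(t->keys[i], key, (size_t)len * sizeof(int)) == 0)
            return i;           /* seen before: reuse its ID */
    }
    if (t->n >= MAX_INTERNED)
        return BAD_ID;
    memcpy(t->keys[t->n], key, (size_t)len * sizeof(int));
    t->lens[t->n] = len;
    return t->n++;              /* new sequence: hand out the next ID */
}
```

Starting from a list holding only the sentinel and a zeroed intern_t, adding 5, 7 and 5 again leaves a list of length 2; interning {5, 7} yields ID 0, interning {7, 5} yields ID 1, and interning {5, 7} a second time yields 0 again. Because the comparison is byte-wise, the same set of values inserted in a different order is treated as a different sequence. The original's byte-keyed lookup over sen[i] appears to behave the same way.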
