factoring_sub.c

From "julius version 4.12.about sound recognit" · C source code · 1,355 lines in total · page 1 of 3
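/**
 * @file   factoring_sub.c
 *
 * <JA>
 * @brief  Language score factoring computation (1st pass)
 *
 * This file contains the functions that perform language score factoring
 * on the 1st pass.  It covers the construction of the per-sub-tree word
 * lists (successor lists) on the tree lexicon, and the language score
 * computation routines used during recognition.
 *
 * A successor list is assigned to each node of the tree lexicon and is
 * the list of words that share that node.  In the tree lexicon, the node
 * following a branch holds this list.  In practice a list is assigned
 * where the list changes, that is, at the branching points of the tree
 * lexicon.  For example, in the tree lexicon below, successor lists are
 * assigned to the numbered nodes.
 * <pre>
 *
 *        2-o-o - o-o-o - o-o-o          word "A"
 *       /
 *  1-o-o
 *       \       4-o-o                   word "B"
 *        \     /
 *         3-o-o - 5-o-o - 7-o-o         word "C"
 *              \        \
 *               \        8-o-o          word "D"
 *                6-o-o                  word "E"
 * </pre>
 *
 * Each successor list is the list of words contained in its sub-tree.
 * In this example the lists are as follows.
 *
 * <pre>
 *   node  | successor list (wchmm->state[node].sc)
 *   =======================
 *     1   | A B C D E
 *     2   | A
 *     3   |   B C D E
 *     4   |   B
 *     5   |     C D
 *     6   |         E
 *     7   |     C
 *     8   |       D
 * </pre>
 *
 * When a successor list contains only one word, the word is determined at
 * that point.  In the example above, word "A" has no following word other
 * than "A" at node 2, so it is determined there.  That is, the exact
 * language score of word A is fixed at node 2 without waiting for the
 * word end.
 *
 * The factoring computation on the 1st pass is actually performed in beam.c.
 * With 2-gram factoring, if the next node has a successor list, the
 * maximum 2-gram value over the words in that list is computed and the
 * propagated factoring value is updated.  At a node whose successor list
 * has one word, the correct 2-gram is assigned automatically.
 * With 1-gram factoring, if the next node has a successor list, the
 * maximum 1-gram value over the words in that list is computed and the
 * propagated factoring value is updated.  The 2-gram is computed for the
 * first time at a node whose successor list has one word.
 *
 * In practice, with 1-gram factoring the factoring value of each successor
 * list does not depend on the word history, so all of them are computed in
 * advance when the successor lists are built.  That is, at engine startup,
 * once the tree lexicon and then the successor lists have been built, the
 * maximum 1-gram value is computed for every successor list containing two
 * or more words and stored in the fscore member of the node, after which
 * the successor list can be freed.  For a successor list with only one
 * word, the word ID is kept, and the exact 2-gram is computed when a path
 * reaches it during search.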
 *
 * When a DFA grammar is used, by default the language constraint
 * (category-pair constraint) is expressed statically by building a tree
 * per category.  For this reason these factoring mechanisms are not used.
 * However, if CATEGORY_TREE is undefined, the language constraint can also
 * be applied through deterministic factoring.
 * That is, if the next node has a successor list, the word-pair constraint
 * between each word in that list and the preceding word is checked; the
 * transition is allowed if at least one word can connect, and blocked if
 * none can.  This function is left only for technical reference.
 * </JA>
 *
 * <EN>
 * @brief  LM factoring on 1st pass.
 * </EN>
 *
 * This file contains functions to do language score factoring on the 1st
 * pass.  They build successor lists, which hold the successive words in
 * each sub-tree of the tree lexicon, and also provide a factored LM
 * probability at each node of the tree lexicon.
 *
 * The "successor list" will be assigned for each lexicon tree node to
 * represent a list of words that exist in the sub-tree and share the node.
 * In practice they are assigned to the branch nodes.
 * Below is an example of successor lists on a tree lexicon, in which
 * the lists are assigned to the numbered nodes.
 *
 * <pre>
 *         2-o-o - o-o-o - o-o-o          word "A"
 *        /
 *   1-o-o
 *        \       4-o-o                   word "B"
 *         \     /
 *          3-o-o - 5-o-o - 7-o-o         word "C"
 *           \            \
 *            \            8-o-o          word "D"
 *             6-o-o                      word "E"
 * </pre>
 *
 * The contents of the successor lists are the following:
 *
 * <pre>
 *  node  | successor list (wchmm->state[node].sc)
 *  =======================
 *    1   | A B C D E
 *    2   | A
 *    3   |   B C D E
 *    4   |   B
 *    5   |     C D
 *    6   |         E
 *    7   |     C
 *    8   |       D
 * </pre>
 *
 * As the 1st pass proceeds, if the next node has a successor list,
 * the 2-gram scores of all the words in that successor list
 * are computed, and the LM value propagated in the token on
 * the current node is replaced by the maximum of those scores
 * when the token is copied to the next node.
 * Clearly, if the successor list has
 * only one word, the word is determined at that point,
 * and the precise 2-gram value is assigned as is.
 *
 * When using 1-gram factoring, the computation is slightly different.
 * Since the factoring value (the maximum 1-gram score over each successor
 * list) is independent of the word context, it can be computed statically
 * before the search.  Thus, for every successor list that has more than
 * one word, the maximum 1-gram value is computed and stored in the
 * "fscore" member in the tree lexicon, and the successor list is freed.
 * The successor lists with only one word remain in the
 * tree lexicon, to compute the precise 2-gram scores for those words.
 *
 *
 * When using a DFA grammar, Julian builds separate lexicon trees for each
 * word category, to statically express the category-pair constraint.
 * Thus these factoring schemes are not used by default.
 * However, you can still force Julian to use the grammar-based
 * deterministic factoring scheme by undefining CATEGORY_TREE.
 * If CATEGORY_TREE is undefined, the word connection constraint is
 * applied based on the successor lists in the middle of the tree lexicon.
 * This enables single-tree search on Julian.  This function is left
 * only for technical reference.
 *
 * @author Akinobu LEE
 * @date   Mon Mar  7 23:20:26 2005
 *
 * $Revision: 1.3 $
 *
 */
/*
 * Copyright (c) 1991-2007 Kawahara Lab., Kyoto University
 * Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
 * Copyright (c) 2005-2007 Julius project team, Nagoya Institute of Technology
 * All rights reserved
 */

#include <julius/julius.h>

/*----------------------------------------------------------------------*/

/**
 * <JA>
 * @brief  Add a word to the successor list of a node on the tree lexicon.
 *
 * If the same word is already registered, it is not registered again.
 * Words are stored in ascending order of ID.
 *
 * @param wchmm [i/o] tree lexicon
 * @param node [in] node id
 * @param w [in] word id
 * </JA>
 * <EN>
 * @brief  Add a word to the successor list on a node in tree lexicon.
 * Words in lists should be ordered by ID.
 *
 * @param wchmm [i/o] tree lexicon
 * @param node [in] node id
 * @param w [in] word id
 * </EN>
 */
static void
add_successor(WCHMM_INFO *wchmm, int node, WORD_ID w)
{
  S_CELL *sctmp, *sc;

  /* malloc a new successor list element */
  sctmp=(S_CELL *) mymalloc(sizeof(S_CELL));
  /* assign word ID to the new element */
  sctmp->word = w;
  /* add the new element to existing list (keeping order) */
  if (wchmm->state[node].scid == 0) {
    j_internal_error("add_successor: sclist id not assigned to branch node?\n");
  }
  sc = wchmm->sclist[wchmm->state[node].scid];
  if (sc == NULL || sctmp->word < sc->word) {
    sctmp->next = sc;
    wchmm->sclist[wchmm->state[node].scid] = sctmp;
  } else {
    for(;sc;sc=sc->next) {
      if (sc->next == NULL || sctmp->word < (sc->next)->word) {
        if (sctmp->word == sc->word) {
          /* avoid duplication: the word is already listed, so release
             the unused element instead of leaking it */
          free(sctmp);
          break;
        }
        sctmp->next = sc->next;
        sc->next = sctmp;
        break;
      }
    }
  }
}

/**
 * <JA>
 * Check whether the successor lists on two nodes match.
 *
 * @param wchmm [in] tree lexicon
 * @param node1 [in] 1st node id
 * @param node2 [in] 2nd node id
 *
 * @return TRUE if they match completely, or FALSE if they do not.
 * </JA>
 * <EN>
 * Check if successor lists on two nodes are the same.
 *
 * @param wchmm [in] tree lexicon
 * @param node1 [in] 1st node id
 * @param node2 [in] 2nd node id
 *
 * @return TRUE if they have the same successor list, or FALSE if they differ.
 * </EN>
 */
static boolean
match_successor(WCHMM_INFO *wchmm, int node1, int node2)
{
  S_CELL *sc1,*sc2;

  /* assume successor is sorted by ID */
  if (wchmm->state[node1].scid == 0 || wchmm->state[node2].scid == 0) {
    j_internal_error("match_successor: sclist id not assigned to branch node?\n");
  }
  sc1 = wchmm->sclist[wchmm->state[node1].scid];
  sc2 = wchmm->sclist[wchmm->state[node2].scid];
  for (;;) {
    if (sc1 == NULL || sc2 == NULL) {
      if (sc1 == NULL && sc2 == NULL) {
        return TRUE;
      } else {
        return FALSE;
      }
    } else if (sc1->word != sc2->word) {
      return FALSE;
    }
    sc1 = sc1->next;
    sc2 = sc2->next;
  }
}

/**
 * <JA>
 * Empty the successor list with the specified id.
 *
 * @param wchmm [i/o] tree lexicon
 * @param scid [in] successor list id
 * </JA>
 * <EN>
 * Free the successor list with the given id.
 *
 * @param wchmm [i/o] tree lexicon
 * @param scid [in] successor list id
 * </EN>
 */
static void
free_successor(WCHMM_INFO *wchmm, int scid)
{
  S_CELL *sc;
  S_CELL *sctmp;

  /* free sclist */
  sc = wchmm->sclist[scid];
  while (sc != NULL) {
    sctmp = sc;
    sc = sc->next;
    free(sctmp);
  }
}

/**
 * <JA>
 * Garbage-collect the successor lists whose links have been removed from
 * the tree lexicon, dropping their entries and packing the list array.
 *
 * @param wchmm [i/o] tree lexicon
 * </JA>
 * <EN>
 * Garbage collection of the successor list, by deleting successor lists
 * to which the link was deleted on the lexicon tree.
 *
 * @param wchmm [i/o] tree lexicon
 * </EN>
 */
static void
compaction_successor(WCHMM_INFO *wchmm)
{
  int src, dst;

  dst = 1;
  for(src=1;src<wchmm->scnum;src++) {
    if (wchmm->state[wchmm->sclist2node[src]].scid <= 0) {
      /* already freed, skip */
      continue;
    }
    if (dst != src) {
      wchmm->sclist[dst] = wchmm->sclist[src];
      wchmm->sclist2node[dst] = wchmm->sclist2node[src];
      wchmm->state[wchmm->sclist2node[dst]].scid = dst;
    }
    dst++;
  }
  if (debug2_flag) {
    jlog("DEBUG: successor list shrinked from %d to %d\n", wchmm->scnum, dst);
  }
  wchmm->scnum = dst;
}

/**
 * <JA>
 * Shrink the memory area allocated for successor lists to its effective
 * length.
 * This releases the memory of the successor lists that were deleted at
 * initial construction or for 1-gram factoring.
 *
 * @param wchmm [i/o] tree lexicon
 * </JA>
 * <EN>
 * Shrink the memory area that has been allocated for building successor list.
 *
 * @param wchmm [i/o] tree lexicon
 * </EN>
 */
static void
shrink_successor(WCHMM_INFO *wchmm)
{
  if (wchmm->sclist) {
    wchmm->sclist = (S_CELL **)myrealloc(wchmm->sclist, sizeof(S_CELL *) * wchmm->scnum);
  }
  if (wchmm->sclist2node) {
    wchmm->sclist2node = (int *)myrealloc(wchmm->sclist2node, sizeof(int) * wchmm->scnum);
  }
}

/**
 * <JA>
 * Main function that builds successor lists on all nodes of the tree lexicon.
 *
 * @param wchmm [i/o] tree lexicon
 * </JA>
 * <EN>
 * Main function to build whole successor list to lexicon tree.
 *
 * @param wchmm [i/o] tree lexicon
 * </EN>
 *
 * @callgraph
 * @callergraph
 *
 */
void
make_successor_list(WCHMM_INFO *wchmm)
{
  int node;
  WORD_ID w;
  int i;
  boolean *freemark;
  int s;

  jlog("STAT: make successor lists for factoring\n");

  /* 1. initialize */
  /* initialize node->sclist index on wchmm tree */
  for (node=0;node<wchmm->n;node++) wchmm->state[node].scid = 0;
  /* parse the tree to get the maximum size of successor list */
  s = 1;
  for (w=0;w<wchmm->winfo->num;w++) {
    for (i=0;i<wchmm->winfo->wlen[w];i++) {
      if (wchmm->state[wchmm->offset[w][i]].scid == 0) {
        wchmm->state[wchmm->offset[w][i]].scid = s;
        s++;
      }
    }
    if (wchmm->state[wchmm->wordend[w]].scid == 0) {
      wchmm->state[wchmm->wordend[w]].scid = s;
      s++;
    }
  }
  wchmm->scnum = s;
  if (debug2_flag) {
    jlog("DEBUG: initial successor list size = %d\n", wchmm->scnum);
  }
  /* allocate successor list for the maximum size */
  wchmm->sclist = (S_CELL **)mymalloc(sizeof(S_CELL *) * wchmm->scnum);
  for (i=1;i<wchmm->scnum;i++) wchmm->sclist[i] = NULL;
  wchmm->sclist2node = (int *)mymalloc(sizeof(int) * wchmm->scnum);
  /* allocate misc. work area */
  freemark = (boolean *)mymalloc(sizeof(boolean) * wchmm->scnum);
  for (i=1;i<wchmm->scnum;i++) freemark[i] = FALSE;

  /* 2.
     make initial successor list: assign at all possible nodes */
  for (w=0;w<wchmm->winfo->num;w++) {
    /* at each start node of phonemes */
    for (i=0;i<wchmm->winfo->wlen[w];i++) {
      wchmm->sclist2node[wchmm->state[wchmm->offset[w][i]].scid] = wchmm->offset[w][i];
      add_successor(wchmm, wchmm->offset[w][i], w);
    }
    /* at word end */
    wchmm->sclist2node[wchmm->state[wchmm->wordend[w]].scid] = wchmm->wordend[w];
    add_successor(wchmm, wchmm->wordend[w], w);
  }

  /* 3. erase unnecessary successor list */
  /* successor list same as the previous node is not needed, so */
  /* parse lexicon tree from every leaf to find the same successor list */
  for (w=0;w<wchmm->winfo->num;w++) {
    node = wchmm->wordend[w];   /* begin from the word end node */
    i = wchmm->winfo->wlen[w]-1;
    while (i >= 0) {            /* for each phoneme start node */
      if (node == wchmm->offset[w][i]) {
        /* word with only 1 state: skip */
        i--;
        continue;
      }
      if (match_successor(wchmm, node, wchmm->offset[w][i])) {
        freemark[wchmm->state[node].scid] = TRUE;       /* mark the node */
      }
/*
 *       if (freemark[wchmm->offset[w][i]] != FALSE) {
 *         break;
 *       }
 */
      node = wchmm->offset[w][i];
      i--;
    }
  }
  /* really free */
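
The listing breaks off here because this page shows only the first part of the file. As a standalone illustration of the ideas in the header comment, the following is a minimal sketch, not Julius code: it keeps a successor list sorted by word ID without duplicates, and computes the 1-gram factoring value as the maximum unigram score over the list. The names `cell_t`, `sc_insert`, `sc_length`, `sc_max_unigram` and `sc_free` are hypothetical, and the unigram table is assumed to be a plain array of log probabilities indexed by word ID.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a successor list cell: one word ID per cell. */
typedef struct cell {
    int word;
    struct cell *next;
} cell_t;

/* Insert word ID w into the list at *head, keeping ascending ID order and
 * skipping duplicates.  Returns 1 if inserted, 0 if w was already present.
 * The cell is allocated only after the duplicate check, so nothing leaks. */
static int sc_insert(cell_t **head, int w)
{
    cell_t **pp = head;
    cell_t *c;

    while (*pp != NULL && (*pp)->word < w) pp = &(*pp)->next;
    if (*pp != NULL && (*pp)->word == w) return 0;
    c = malloc(sizeof *c);
    assert(c != NULL);
    c->word = w;
    c->next = *pp;
    *pp = c;
    return 1;
}

/* Number of words in the list. */
static int sc_length(const cell_t *sc)
{
    int n = 0;
    for (; sc != NULL; sc = sc->next) n++;
    return n;
}

/* 1-gram factoring value of a non-empty list: the maximum unigram score
 * among its words.  unigram[] is an assumed table indexed by word ID. */
static float sc_max_unigram(const cell_t *sc, const float *unigram)
{
    float best = unigram[sc->word];
    for (sc = sc->next; sc != NULL; sc = sc->next) {
        if (unigram[sc->word] > best) best = unigram[sc->word];
    }
    return best;
}

/* Release every cell of the list. */
static void sc_free(cell_t *sc)
{
    while (sc != NULL) {
        cell_t *next = sc->next;
        free(sc);
        sc = next;
    }
}
```

Under these assumptions, a caller would build one list per branch node with `sc_insert`, store `sc_max_unigram` for every list where `sc_length` is greater than one and then release it with `sc_free`, and keep the single-word lists so that the exact 2-gram can be looked up during search.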