📄 realtime-1stpass.c
/**
 * @file   realtime-1stpass.c
 *
 * <EN>
 * @brief  The first pass: frame-synchronous beam search (on-the-fly version)
 *
 * These are functions to perform on-the-fly decoding of the 1st pass
 * (frame-synchronous beam search).  These functions can be used
 * instead of new_wav2mfcc() and get_back_trellis().  They allow
 * recognition to start as soon as an input triggers.  The 1st pass
 * processing is done concurrently with the input.
 *
 * The basic recognition procedure of Julius in main_recognition_loop()
 * is as follows:
 *
 *  -# speech input: (adin_go())  ... buffer `speech' holds the input
 *  -# feature extraction: (new_wav2mfcc()) ... compute feature vector
 *     from `speech' and store the vector sequence to `param'.
 *  -# recognition 1st pass: (get_back_trellis()) ... frame-wise beam decoding
 *     to generate word trellis index from `param' and models.
 *  -# recognition 2nd pass: (wchmm_fbs())
 *  -# Output result.
 *
 * In on-the-fly decoding, steps 1 to 3 above are performed
 * in parallel.  This is implemented by a simple scheme that processes the
 * captured small speech fragments one by one, progressively:
 *
 *  - Define a callback function that can do feature extraction and 1st pass
 *    processing progressively.
 *  - The callback will be given to A/D-in function adin_go().
 *
 * The actual procedure is as follows.  The function RealTimePipeLine()
 * is given to adin_go() as a callback.  adin_go() then watches
 * the input, and once speech input starts, it calls RealTimePipeLine()
 * for every captured input fragment.  RealTimePipeLine()
 * computes the feature vectors of the given fragment, advances the
 * 1st pass processing for them, and returns to the capture function.
 * The current status is held until the next call, to perform
 * inter-frame processing (computing delta coef. etc.).
 *
 * Note about CMN: With acoustic models trained with CMN, Julius applies
 * CMN to the input.  On file input, the mean of the whole sentence is
 * computed and subtracted.  In on-the-fly decoding, the whole-utterance
 * mean is not available, so up to version 3.5 CMN was performed using the
 * cepstral mean of the last 5 seconds of input (excluding
 * rejected ones).  Since 3.5.1, MAP-CMN is applied in on-the-fly decoding,
 * using the cepstral mean of the last 5 seconds as the initial mean.
 * The initial cepstral mean at startup can be given by the option
 * "-cmnload", and you can also prohibit updating the initial cepstral
 * mean at each input by "-cmnnoupdate".  Combining "-cmnnoupdate" with
 * "-cmnload" is useful to always use a static global cepstral mean as the
 * initial mean for each input.
 *
 * The primary functions in this file are:
 *  - RealTimeInit() - initialization at application startup
 *  - RealTimePipeLinePrepare() - initialization before each input
 *  - RealTimePipeLine() - callback for on-the-fly 1st pass decoding
 *  - RealTimeResume() - recognition resume procedure for short-pause segmentation.
 *  - RealTimeParam() - finalize the on-the-fly 1st pass when input ends.
 *  - RealTimeCMNUpdate() - update CMN data for next input
 *
 * </EN>
 *
 * @author Akinobu Lee
 * @date   Tue Aug 23 11:44:14 2005
 *
 * $Revision: 1.5 $
 *
 */
/*
 * Copyright (c) 1991-2007 Kawahara Lab., Kyoto University
 * Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
 * Copyright (c) 2005-2007 Julius project team, Nagoya Institute of Technology
 * All rights reserved
 */

#include <julius/julius.h>

#undef RDEBUG                   ///< Define if you want local debug message

/**
 * <EN>
 *
 * Prepare parameter holder in MFCC calculation instance to store MFCC
 * vectors.
 *
 * This function will store header information based on the parameters
 * in mfcc->para, and allocate the initial buffer for the incoming
 * vectors.  The vector buffer will be expanded as needed during
 * recognition, so at this time only the minimal amount is allocated.
 * If the instance already has a certain length of vector buffer, it
 * will be kept.
 *
 * This function will be called each time a new input begins.
 *
 * </EN>
 *
 * @param mfcc [i/o] MFCC calculation instance
 *
 */
static void
init_param(MFCCCalc *mfcc)
{
  Value *para;

  para = mfcc->para;

  /* set header types */
  mfcc->param->header.samptype = F_MFCC;
  if (para->delta) mfcc->param->header.samptype |= F_DELTA;
  if (para->acc) mfcc->param->header.samptype |= F_ACCL;
  if (para->energy) mfcc->param->header.samptype |= F_ENERGY;
  if (para->c0) mfcc->param->header.samptype |= F_ZEROTH;
  if (para->absesup) mfcc->param->header.samptype |= F_ENERGY_SUP;
  if (para->cmn) mfcc->param->header.samptype |= F_CEPNORM;

  mfcc->param->header.wshift = para->smp_period * para->frameshift;
  mfcc->param->header.sampsize = para->veclen * sizeof(VECT); /* not compressed */
  mfcc->param->veclen = para->veclen;

  /* variables that will be set while/after computation has been done:
     param->parvec (parameter vector sequence)
     param->header.samplenum, param->samplenum (total number of frames) */

  /* Prepare for MAP-CMN */
  if (mfcc->para->cmn || mfcc->para->cvn) CMN_realtime_prepare(mfcc->cmn.wrk);
}

/**
 * <EN>
 * @brief  Initializations for the on-the-fly 1st pass decoding.
 *
 * Work areas for all MFCC calculation instances are allocated.
 * Additionally,
 * some initialization will be done such as allocating work area
 * for spectral subtraction, loading noise spectrum from file,
 * loading initial cepstral mean data for CMN from file, etc.
 *
 * This will be called only once, on system startup.
 * </EN>
 *
 * @param recog [i/o] engine instance
 *
 * @callgraph
 * @callergraph
 */
boolean
RealTimeInit(Recog *recog)
{
  Value *para;
  Jconf *jconf;
  RealBeam *r;
  MFCCCalc *mfcc;

  jconf = recog->jconf;
  r = &(recog->real);

  /* set maximum allowed frame length */
  r->maxframelen = MAXSPEECHLEN / recog->jconf->input.frameshift;

  /* if "-ssload", load noise spectrum for spectral subtraction from file */
  for(mfcc = recog->mfcclist; mfcc; mfcc = mfcc->next) {
    if (mfcc->frontend.ssload_filename && mfcc->frontend.ssbuf == NULL) {
      if ((mfcc->frontend.ssbuf = new_SS_load_from_file(mfcc->frontend.ssload_filename, &(mfcc->frontend.sslen))) == NULL) {
        jlog("ERROR: failed to read \"%s\"\n", mfcc->frontend.ssload_filename);
        return FALSE;
      }
      /* check ssbuf length */
      if (mfcc->frontend.sslen != mfcc->wrk->bflen) {
        jlog("ERROR: noise spectrum length not match\n");
        return FALSE;
      }
      mfcc->wrk->ssbuf = mfcc->frontend.ssbuf;
      mfcc->wrk->ssbuflen = mfcc->frontend.sslen;
      mfcc->wrk->ss_alpha = mfcc->frontend.ss_alpha;
      mfcc->wrk->ss_floor = mfcc->frontend.ss_floor;
    }
  }

  for(mfcc = recog->mfcclist; mfcc; mfcc = mfcc->next) {

    para = mfcc->para;

    /* set initial value for log energy normalization */
    if (para->energy && para->enormal) energy_max_init(&(mfcc->ewrk));
    /* initialize cycle buffers for delta and accel coef. computation */
    if (para->delta) mfcc->db = WMP_deltabuf_new(para->baselen, para->delWin);
    if (para->acc) mfcc->ab = WMP_deltabuf_new(para->baselen * 2, para->accWin);
    /* allocate work area for the delta computation */
    mfcc->tmpmfcc = (VECT *)mymalloc(sizeof(VECT) * para->vecbuflen);
    /* Initialize the initial cepstral mean data from file for MAP-CMN */
    if (para->cmn || para->cvn) mfcc->cmn.wrk = CMN_realtime_new(para, mfcc->cmn.map_weight);
    /* if "-cmnload", load initial cepstral mean data from file for CMN */
    if (mfcc->cmn.load_filename) {
      if (para->cmn) {
        if ((mfcc->cmn.loaded = CMN_load_from_file(mfcc->cmn.wrk, mfcc->cmn.load_filename))== FALSE) {
          jlog("WARNING: failed to read initial cepstral mean from \"%s\", do flat start\n", mfcc->cmn.load_filename);
        }
      } else {
        jlog("WARNING: CMN not required on AM, file \"%s\" ignored\n", mfcc->cmn.load_filename);
      }
    }

  }
  /* set window length */
  r->windowlen = recog->jconf->input.framesize + 1;
  /* allocate window buffer */
  r->window = mymalloc(sizeof(SP16) * r->windowlen);

  return TRUE;
}

/**
 * <EN>
 * Prepare work area for MFCC calculation.
 * Reset values in work area for starting the next input.
 * Output probability cache for each acoustic model will also be
 * prepared in this function.
 *
 * This function will be called before starting each input (segment).
 * </EN>
 *
 * @param recog [i/o] engine instance
 *
 * @callgraph
 * @callergraph
 */
void
reset_mfcc(Recog *recog)
{
  Value *para;
  MFCCCalc *mfcc;
  RealBeam *r;

  r = &(recog->real);

  /* initialize parameter extraction module */
  for(mfcc = recog->mfcclist; mfcc; mfcc = mfcc->next) {

    para = mfcc->para;

    /* set initial value for log energy normalization */
    if (para->energy && para->enormal) energy_max_prepare(&(mfcc->ewrk), para);
    /* set the delta cycle buffer */
    if (para->delta) WMP_deltabuf_prepare(mfcc->db);
    if (para->acc) WMP_deltabuf_prepare(mfcc->ab);
  }

}

/**
 * <EN>
 * @brief  Preparation for the on-the-fly 1st pass decoding.
 *
 * Variables are reset and data are prepared for the next input recognition.
 *
 * This function will be called before starting each input (segment).
 *
 * </EN>
 *
 * @param recog [i/o] engine instance
 *
 * @return TRUE on success. FALSE on failure.
 *
 * @callgraph
 * @callergraph
 *
 */
boolean
RealTimePipeLinePrepare(Recog *recog)
{
  RealBeam *r;
  PROCESS_AM *am;
  MFCCCalc *mfcc;
#ifdef SPSEGMENT_NAIST
  RecogProcess *p;
#endif

  r = &(recog->real);

  /* initialize variables for computation */
  r->windownum = 0;
  /* parameter check */
  for(mfcc = recog->mfcclist; mfcc; mfcc = mfcc->next) {
    /* parameter initialization */
    if (recog->jconf->input.speech_input == SP_MFCMODULE) {
      if (mfc_module_set_header(mfcc, recog) == FALSE) return FALSE;