📄 todo
=====| soon |=============================================================

- rec. files should be named <basename>.<recognizer>.<feature extractor>
  filename entries can then also be without path.
- "There are new words" + ESC --> bailout
- bailing out of UI_RAW binaries should first send newline
- if no words are trained with UI_RAW: listen says bogus file
- implement flag for train_ears to process WAVs directly into patterns
- use util/configure to check for existence of OSS sound driver,
  compile both into the apps, choose at runtime depending on /dev/sndstat
  new config option: sound driver
- use util/configure to check for existence of UI libraries,
  compile all into the apps, choose at runtime depending on
  pr_config::check_libs()
  new config option: user interface, remove NO_CURSES
- why is libm linked in with elf version???
- implement test flag (Niels)
- check all comments for kludges
- listen-recw w/ UI_NCURSES has a hole in the border (pr_listen.cc empty; )
- use two different protocols for CL and NCL versions (yarec ui code)
- "Do you want to use the digits..." + NO with other than default as
  basename, should we really quit?
- further fine-tune 16-bit endpoint detection
- SunSound::save_sample() isn't implemented

**********
C++ / OO design issues:

- draw Booch diagram of application
- rename TrainEars-/ListenProtocol to TrainEars-/ListenApplication
- initialization in Oss/Voxwaresound could be handled by device classes
- modularize feature
- modularize endpointer
- implement abstract ui event handler (callback)
- make a ConfigP base class with different derived classes for
  train_ears, listen
- screen::close() is called twice on a normal exit
- remove class messages entirely from pr_config.h
- words should be a parameter to training
- singletons should be implemented otherwise, no delete...new in protocols
  by using AppResourcesList or giving them as ref parameters to others
- write private empty copy ctors/assignment operators for classes that
  aren't copied
- func try_to_open(ears_file) for words and other files
- there are still too many dependencies on ears/*.h, esp. in modules!
- take yarec/modules/sample.h and make ears/sample a has-a sample class

**********
standard C++ library usage:

- use vector<>/valarray<> for arrays (valarray needs gcc-2.8.0)
- use vector<Word*> instead of list<Word*>, derive words from it publicly
  all other members should be private
- use random_shuffle to find words
- use stream_iterators throughout the code:
  ostream_iterator<X> oit(cout," "); copy (i,j,oit);

**********
exception handling:

- implement real exception handling (gcc-2.8.0)
- exceptions in speechstream.cc and feature.cc
- exceptions in cursesd.*
- what do we do if a new version introduces new cfg[]s that aren't in the
  existing earsrc? exception handling? simply a warning? version option?

==| not so soon |=========================================================

- include new version of rasta, don't use it as library, so people can
  build from the distribution themselves
- modal vocabularies to keep active word list small (M. Ward)
- implement 'Calibrate mic' window for making endpoint detection more robust
- in listen: the possibility to choose from a list of basenames
- check endianness when writing/reading datafiles
- write README section 'What if recognition is unsatisfying?'
- do not train recognizer if words are incomplete
- support /dev/random in myrandom.h
- check existence of all patterns given in recognizer file
- memory leaks in Regex and ncurses, NDTW::ref.patterns aren't freed
  as well as some things in libmrasta.
- play with /proc/cpuinfo on Linux
- speedup: calculate features while recording (enough soundcard buffer?)
  (new option PROCESSING ONLINE/OFFLINE)
  calc DTW while recording!!!
- implement option to hardcode noise level
- automagically always record noise sample(s) for endpoint detection
  maybe a small recurrent net could learn to choose between speech/noise?
- use the EBUSY return value from /dev/dsp
- ioctl(fd,SOUND_PCM_SUBDIVIDE,&div) with div<5 decreases sound buffer
- "SOUND_BITS/SPEED doesn't match the recorded patterns! Proceed?"
- "Changing SOUND_BITS/SPEED when not having recorded all patterns might
  yield unexpected results. Proceed?"
- pattern::read(): ask user if bogus pattern files should be removed
- implement another good, fast recognizer
- DTW: implement a VQ preprocessor (see Rabiner)
- implement robust training (see Rabiner)
- does the rasta vector include energy information?
- implement hybrid approach to endpoint detection (see Rabiner)
- automagically always record noise sample(s) for training
- implement handling of contradictory recognition results (status bar etc.)
- save word.actions in net file - no more need to check for words
- implement Feature=NONE (for recognizers that can/must digest raw data)
- should we switch to lowercase option names?
- during training, output how many words are still to be spoken, percentage
- check config options for legality
- "There are xx samples missing. Do you want to record some of them?"
- write more words files as examples
- make word list a map
- remove all stdio usage, also in cursesw.*
- include Tilman's German message catalog
= implement ears
- ears: should we renice the spawned shell for recog performance?
= implement listen fully
= test ears
= implement cross validation program
- write texinfo
- include Hermansky paper in doc/
- say "Huh?" over the speaker when recognition fails 8-)
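
Note on the stream_iterators item (standard C++ library usage): a minimal
sketch of what the ostream_iterator/copy idiom could look like when applied
to a word list. The Word struct, print_words() and words_to_string() are
hypothetical stand-ins made up for illustration, not the actual ears classes.

```cpp
#include <algorithm>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the project's word type (assumption, not ears code).
struct Word {
    std::string name;
};

// ostream_iterator<Word> writes each element with operator<<, so Word needs one.
std::ostream& operator<<(std::ostream& os, const Word& w) {
    return os << w.name;
}

// Writes every word followed by a single space; replaces a hand-written loop.
void print_words(const std::vector<Word>& words, std::ostream& out) {
    std::ostream_iterator<Word> oit(out, " ");
    std::copy(words.begin(), words.end(), oit);
}

// Convenience wrapper that captures the same output in a string.
std::string words_to_string(const std::vector<Word>& words) {
    std::ostringstream buf;
    print_words(words, buf);
    return buf.str();
}
```

Calling print_words(words, std::cout) would print the list to the terminal.
Note that the delimiter is written after every element, including the last
one, so the output ends with a trailing space.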