📄 todo

ears-0.32, a useful speech signal processing toolkit for Linux
=====| soon |=============================================================

- rec. files should be named <basename>.<recognizer>.<feature extractor>
  filename entries can then also be given without a path.
- "There are new words" + ESC --> bailout
- bailing out of UI_RAW binaries should first send newline
- if no words are trained with UI_RAW: listen says bogus file
- implement flag for train_ears to process WAVs directly into patterns
- use util/configure to check for existence of OSS sound driver, compile
  both into the apps, choose at runtime depending on /dev/sndstat
  new config option: sound driver
- use util/configure to check for existence of UI libraries, compile all
  into the apps, choose at runtime depending on pr_config::check_libs()
  new config option: user interface, remove NO_CURSES
- why is libm linked in with elf version???
- implement test flag (Niels)
- check all comments for kludges
- listen-recw w/ UI_NCURSES has a hole in the border (pr_listen.cc empty; )
- use two different protocols for CL and NCL versions (yarec ui code)
- "Do you want to use the digits..." + NO with other than default as
  basename, should we really quit?
- further fine-tune 16-bit endpoint detection
- SunSound::save_sample() isn't implemented

**********
C++ / OO design issues:

- draw Booch diagram of application
- rename TrainEars-/ListenProtocol to TrainEars-/ListenApplication
- initialization in Oss/Voxwaresound could be handled by device classes
- modularize feature
- modularize endpointer
- implement abstract ui event handler (callback)
- make a ConfigP base class with diff. derived classes for train_ears, listen
- screen::close() is called twice on a normal exit
- remove class messages entirely from pr_config.h
- words should be a parameter to training
- singletons should be implemented otherwise, no delete...new in protocols
  by using AppResourcesList or giving them as ref parameters to others
- write private empty copy ctors/assignment operators for classes that
  aren't copied
- func try_to_open(ears_file) for words and other files
- there are still too many dependencies on ears/*.h, esp. in modules!
- take yarec/modules/sample.h and make ears/sample a has-a sample class

**********
standard C++ library usage:

- use vector<>/valarray<> for arrays (valarray needs gcc-2.8.0)
- use vector<Word*> instead of list<Word*>, derive words from it publicly
  all other members should be private
- use random_shuffle to find words
- use stream_iterators throughout the code: ostream_iterator<X> oit(cout," ");
  copy (i,j,oit);

**********
exception handling:

- implement real exception handling (gcc-2.8.0)
- exceptions in speechstream.cc and feature.cc
- exceptions in cursesd.*
- what do we do if a new version introduces new cfg[]s that aren't in
  the existing earsrc.
    exception handling?  simply a warning?  version option?

==| not so soon |=========================================================

- include new version of rasta, don't use it as library, so people can
  build from the distribution themselves
- modal vocabularies to keep active word list small (M. Ward)
- implement 'Calibrate mic' window for making endpoint detection more robust
- in listen: the possibility to choose from a list of basenames
- check endianness when writing/reading datafiles
- write README section 'What if recognition is unsatisfying?'
- do not train recognizer if words are incomplete
- support /dev/random in myrandom.h
- check existence of all patterns given in recognizer file
- memory leaks in Regex and ncurses, NDTW::ref.patterns aren't freed as
  well as some things in libmrasta.
- play with /proc/cpuinfo on Linux
- speedup: calculate features while recording (enough soundcard buffer?)
           (new option PROCESSING ONLINE/OFFLINE)
           calc DTW while recording!!!
- implement option to hardcode noise level
- automagically always record noise sample(s) for endpoint detection
  maybe a small recurrent net could learn to choose between speech/noise?
- use the EBUSY return value from /dev/dsp
- ioctl(fd,SOUND_PCM_SUBDIVIDE,&div) with div<5 decreases sound buffer
- "SOUND_BITS/SPEED doesn't match the recorded patterns!  Proceed?"
- "Changing SOUND_BITS/SPEED when not having recorded all patterns
   might yield unexpected results.  Proceed?"
- pattern::read(): ask user if bogus pattern files should be removed
- implement another good, fast recognizer
- DTW: implement a VQ preprocessor (see Rabiner)
- implement robust training (see Rabiner)
- does the rasta vector include energy information?
- implement hybrid approach to endpoint detection (see Rabiner)
- automagically always record noise sample(s) for training
- implement handling of contradictory recognition results (status bar etc.)
- save word.actions in net file - no more need to check for words
- implement Feature=NONE (for recognizers that can/must digest raw data)
- should we switch to lowercase option names?
- during training, output how many words are still to be spoken, percentage
- check config options for legality
- "There are xx samples missing.  Do you want to record some of them?"
- write more words files as examples
- make word list a map
- remove all stdio usage, also in cursesw.*
- include Tilman's German message catalog
= implement ears
- ears: should we renice the spawned shell for recog performance?
= implement listen fully
= test ears
= implement cross validation program
- write texinfo
- include Hermansky paper in doc/
- say "Huh?" over the speaker when recognition fails  8-)
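
A minimal sketch of the stream-iterator idiom named in the "standard C++
library usage" list above, assuming the words are plain std::string values
held in a std::vector. The names print_words and join_words are hypothetical
and not part of ears; the real Word class is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: write every word to `out`, each followed by one space.
// This is the ostream_iterator + copy pattern from the TODO item.
void print_words(const std::vector<std::string>& words, std::ostream& out) {
    std::ostream_iterator<std::string> oit(out, " ");
    std::copy(words.begin(), words.end(), oit);
}

// Hypothetical helper: same output, collected into a string instead of
// being sent to a stream such as std::cout.
std::string join_words(const std::vector<std::string>& words) {
    std::ostringstream buf;
    print_words(words, buf);
    return buf.str();
}
```

Calling print_words(words, std::cout) prints the list directly. Note that
ostream_iterator appends the delimiter after every element, including the
last one, so the output ends with a trailing space.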