
📄 encode_wordvec.1

.\" Process this file with
.\"    groff -man -Tascii encode_wordvec.1
.TH ENCODE_WORDVEC 1 "February 2004" "Infomap Project" "Infomap NLP Manual"
.SH NAME
encode_wordvec \- compress word vectors from textual to binary format
.SH SYNOPSIS
.B encode_wordvec
.BI -m \ <model_data_dir>
.SH DESCRIPTION
.B encode_wordvec
reads the word vectors written to the
.I left
file by
.BR svdinterface (1).
It converts this textual input to a binary format that is more
efficient for lookup, and generates the DBM files that index
the binary-format word vectors.
.SH OPTIONS
.TP
.BI -m \ <model_data_dir>
The directory from which input is read and to which output is written.
.\" .SH EXAMPLES
.SH INPUT FILES
These files are read from the model data directory, specified as
the argument to the
.B -m
option.
.PP
.I left
.RS
The word vectors (left singular vectors from the SVD) in a textual
format.  See
.BR svdinterface (1).
.RE
.PP
.I dic
.RS
The dictionary file listing the different terms (word types) found
in the corpus, with the term frequency and document frequency of each.
See
.BR prepare_corpus (1).
.RE
.PP
.I model_params.bin
.RS
Read to obtain the parameters of the model being built.
See
.BR prepare_corpus (1).
.RE
.SH OUTPUT FILES
These files are written to the model data directory, specified as
the argument to the
.B -m
option.
.PP
.I wordvec.bin
.RS
The binary word vectors.  These are the core of the model.
Please note that this file is
.I not
portable to other machine architectures.
.RE
.PP
.I word2offset.{dir,pag}
.RS
These files make up a DBM database.  Each key in this database is
a word; the corresponding value is the offset into
.I wordvec.bin
at which that word's vector begins.  Thus this database and
.I wordvec.bin
can be used to obtain a word's vector given the word.
.RE
.PP
.I offset2word.{dir,pag}
.RS
These files make up a DBM database.  Each key in this database
is an offset into the
.I wordvec.bin
file at which a word vector begins; the corresponding value is
the word whose vector begins at that offset.
Thus this database can be used to obtain a word given the offset
of its vector in
.IR wordvec.bin .
.RE
.SH SEE ALSO
.BR prepare_corpus (1),
.BR count_wordvec (1),
.BR svdinterface (1),
.BR count_artvec (1),
.BR write_text_params (1).
.SH DIAGNOSTICS
Returns 0 to indicate success, 1 to indicate error.
.SH BUGS
Please report bugs to
.BR infomap-nlp-users@lists.sourceforge.net .
.SH CREDITS
The Infomap NLP software was written by Stefan Kaufmann, Hinrich
Schuetze, Dominic Widdows, Beate Dorow, and Scott Cederberg.  The
Infomap algorithm was originally developed by Hinrich Schuetze.
.SH AUTHOR
This manual page was written by Scott Cederberg.  Please direct
inquiries and bug reports to
.BR infomap-nlp-users@lists.sourceforge.net .
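
The indexing scheme described under OUTPUT FILES (a flat binary file of vectors plus two DBM databases, one mapping word to offset and one mapping offset to word) can be illustrated with a small Python sketch. This is not the Infomap implementation and does not read or write Infomap's actual files: the vector length DIM, the 32-bit float layout in FMT, the decimal-string encoding of offsets, and the function names encode, lookup, and word_at are all assumptions made for illustration. It also uses Python's dbm.dumb module, which produces .dat/.dir/.bak files rather than the .dir/.pag pair mentioned above.

```python
import dbm.dumb
import os
import struct
import tempfile

DIM = 4              # hypothetical vector length (assumption, not from the manual)
FMT = f"={DIM}f"     # assumed layout: DIM 32-bit floats in native byte order


def encode(vectors, model_dir):
    """Write vectors to wordvec.bin and index them by word and by offset."""
    bin_path = os.path.join(model_dir, "wordvec.bin")
    w2o_path = os.path.join(model_dir, "word2offset")
    o2w_path = os.path.join(model_dir, "offset2word")
    with open(bin_path, "wb") as out, \
            dbm.dumb.open(w2o_path, "n") as w2o, \
            dbm.dumb.open(o2w_path, "n") as o2w:
        for word, vec in vectors.items():
            offset = out.tell()              # byte position where this vector starts
            out.write(struct.pack(FMT, *vec))
            w2o[word] = str(offset)          # word -> offset
            o2w[str(offset)] = word          # offset -> word


def lookup(word, model_dir):
    """Return the vector for a word via word2offset and wordvec.bin."""
    with dbm.dumb.open(os.path.join(model_dir, "word2offset"), "r") as w2o:
        offset = int(w2o[word].decode("utf-8"))
    with open(os.path.join(model_dir, "wordvec.bin"), "rb") as f:
        f.seek(offset)
        return struct.unpack(FMT, f.read(struct.calcsize(FMT)))


def word_at(offset, model_dir):
    """Return the word whose vector begins at the given offset."""
    with dbm.dumb.open(os.path.join(model_dir, "offset2word"), "r") as o2w:
        return o2w[str(offset)].decode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as model_dir:
        encode({"cat": (1.0, 0.5, -2.0, 0.25),
                "dog": (0.0, 1.0, 0.0, -1.0)}, model_dir)
        print(lookup("dog", model_dir))
        print(word_at(struct.calcsize(FMT), model_dir))
```

Storing fixed-size records in one file and keeping only offsets in the index is what makes lookup a single seek and read. Writing floats in native byte order is also one plausible reason a file like wordvec.bin would not be portable across machine architectures, though the manual does not state the cause.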
