
📄 associate.1

.\" Process this file with
.\"    groff -man -Tascii associate.1
.TH ASSOCIATE 1 "FEBRUARY 2004" "Infomap Project" "Infomap User Manual"
.SH NAME
associate \- find words or documents most similar to a query, \
or print query vector
.SH SYNOPSIS
.B associate
.RB [ -w " | " -d " | " -q ]
.RB [ -i
.IR type_of_input \ (w "|" d)]
(
.RB [ -t ]|
.RB [ -m
.IR model_dir ])
.RB [ -c
.IR model_tag ]
.RB [ -n
.IR num_neighbors ]
.RB [ -f
.IR vector_output_file ]
.RI < pos_term_1 >
.RI " [" pos_term_2 " ... " pos_term_n ]
.RB [ NOT
.IR neg_term_1 \ ... \ neg_term_n ]
.PP
.B associate
.RI -c \ <model_tag>
<query_terms>
.SH DESCRIPTION
.B associate
finds either the words
.RB ( -w )
or the documents
.RB ( -d )
best matching the query composed of the query terms
.IR pos_term_1 \ ... \ pos_term_n
and, optionally, the negative query terms
.IR neg_term_1 \ ... \ neg_term_n ;
or simply prints the query vector
.RB ( -q )
for the given query.
The query is composed either of words
.RB ( "-i\ w" )
or of documents
.RB ( "-i\ d" ),
in which case
.IR pos_term_1 \ ... \ pos_term_n
and
.IR neg_term_1 \ ... \ neg_term_n
are document names (if the corpus consists of many files) or
document IDs (if the corpus is a single file).
.PP
.B associate
prints the matching words, or the IDs of the matching documents, one per
line on its standard output, in descending order of similarity to the
query; a similarity score is printed for each word or document.
If the
.B -f
option is used, the vectors for the matching words or documents are
also written to the specified file.
.PP
.B associate
can be run without any query terms to display a command usage
synopsis and the model data directory that is being used.
.SH OPTIONS
.TP
.BI -c \ model_tag
Use the model specified by model tag
.IR model_tag ;
this tag is the name of the model data directory containing the
various files that make up a model.
For details on how this directory is located, see the section
.B FINDING THE MODEL
below.
If
.B -c
is not specified, then
.B -m
must be specified in order to perform a search.
In this case, the argument to
.B -m
will be interpreted as the model data directory, rather than its parent.
.TP
.B -d
Retrieve documents, rather than words.
.TP
.BI -f \ vector_output_file
Write the vectors of the words or documents retrieved to
.IR vector_output_file .
.TP
.BI -i \ query_type
.I query_type
is either
.B w
(for words) or
.B d
(for documents).
.TP
.BI -m \ model_dir
Use the model stored in
.I model_dir
for this search; when this option
is used, the model tag given as an argument to the
.B -c
option must name a subdirectory of
.IR model_dir .
If
.B -m
is specified and
.B -c
is not specified, then the argument to
.B -m
must be the actual model data directory that contains
the model files.
If this option is specified, the
.B INFOMAP_MODEL_PATH
environment variable and the
.B -t
option are ignored.
.TP
.BI -n \ num_neighbors
Print the
.I num_neighbors
documents or words most similar to the query.
.TP
.B -q
Rather than performing retrieval, simply compute and print the query
vector for the specified query.  The
.BR -f " and " -n
options are irrelevant if
.B -q
is specified, and will be ignored if given.
.TP
.B -t
Look for the model data directory in the directory
specified by the
.B INFOMAP_WORKING_DIR
environment variable; ignore the
.B INFOMAP_MODEL_PATH
environment variable.  Note that this option will
be ignored if the
.B -m
option is also specified.
.TP
.B -w
Retrieve words, rather than documents.  This is the default.
.\" .SH EXAMPLES
.SH FINDING THE MODEL
An Infomap NLP model, typically created by
.BR infomap-install (1),
consists of a number of files (see the
.B FILES
section below).  All of these files must be in a single
directory, the
.IR "model data directory" .
Since all of
.BR associate 's
operations are performed with respect to a model, it must
somehow locate the appropriate model data directory.
.PP
The model data directory's name is the
.IR "model tag" ,
and is given as an argument to the
.B -c
option.
The normal way of finding this directory is to search the directories
given in the
.B INFOMAP_MODEL_PATH
environment variable, a colon-separated list of directories.  The
first directory in this list containing a subdirectory whose name is the
model tag will be chosen, and that subdirectory's contents will be
used as the model.
.PP
This normal mode of operation can be overridden in two ways.
First, if the
.B -m
option is given, its argument must be a directory, and the
model data directory must be a subdirectory of that directory.
Second, if no
.B -m
option is given, but the
.B -t
option is given, then the model data directory must be a subdirectory
of the directory given as the value of the
.B INFOMAP_WORKING_DIR
environment variable.  The
.B -t
option makes it convenient to run
.B associate
using a model that has just been built with
.BR infomap-build (1)
but has not yet been installed with
.BR infomap-install (1).
The
.B -m
option is provided to handle exceptional cases.
.PP
Finally, if
.B -m
is specified and
.B -c
is not, then the argument to
.B -m
is treated as the actual model data directory; that is, the
files making up the model are sought in this directory.
.SH FILES
.I model_params.bin
.RS
Reads this file to obtain parameters for the model being used for search.
See
.BR prepare_corpus (1).
.RE
.I wordvec.bin
.RS
File containing the word vectors in a binary format.  See
.BR encode_wordvec (1).
.RE
.I word2offset.{dir,pag}
.RS
DBM database allowing the lookup of the vector corresponding
to a given word.  See
.BR encode_wordvec (1).
.RE
.I offset2word.{dir,pag}
.RS
DBM database allowing the lookup of the word corresponding to
a given word vector.  See
.BR encode_wordvec (1).
.RE
.I artvec.bin
.RS
File containing the document vectors in a binary format.  See
.BR count_artvec (1).
.RE
.I art2offset.{dir,pag}
.RS
DBM database allowing the lookup of the vector corresponding to
a given document.  See
.BR count_artvec (1).
.RE
.I offset2art.{dir,pag}
.RS
DBM database allowing the lookup of the document corresponding to
a given document vector.  See
.BR count_artvec (1).
.RE
.SH ENVIRONMENT VARIABLES
.B INFOMAP_MODEL_PATH
.RS
Path to search for Infomap model directories.  Will be
ignored if
.B -m
or
.B -t
is specified.
.RE
.B INFOMAP_WORKING_DIR
.RS
Directory in which to find the Infomap model directory, if the
.B -t
command-line option is given.
.RE
.SH SEE ALSO
.BR infomap-build (1),
.BR infomap-install (1),
.BR prepare_corpus (1),
.BR count_wordvec (1),
.BR svdinterface (1),
.BR encode_wordvec (1),
.BR count_artvec (1),
.BR write_text_params (1).
.SH DIAGNOSTICS
Returns 0 to indicate success; 1 to indicate error.
.SH BUGS
Please report bugs to
.BR infomap-nlp-users@lists.sourceforge.net .
.SH CREDITS
The Infomap NLP software was written by Stefan Kaufmann, Hinrich
Schuetze, Dominic Widdows, Beate Dorow, and Scott Cederberg.  The
Infomap algorithm was originally developed by Hinrich Schuetze.
.SH AUTHOR
This manual page was written by Scott Cederberg.  Please direct
inquiries and bug reports to
.BR infomap-nlp-users@lists.sourceforge.net .
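.SH NOTES
The lookup order described under
.B FINDING THE MODEL
can be summarised in a short Python sketch.
This is an illustration only, not code from the Infomap distribution:
the function name
.I resolve_model_dir
and its parameter names are hypothetical, and the sketch assumes the
precedence stated in that section (the -m option first, then -t, then
the INFOMAP_MODEL_PATH search).
.PP
.nf
```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_model_dir(
    model_tag: Optional[str] = None,    # argument to -c
    model_dir: Optional[str] = None,    # argument to -m
    use_working_dir: bool = False,      # True when -t is given
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Path]:
    """Hypothetical helper mirroring the documented lookup order."""
    env = os.environ if env is None else env

    # -m without -c: the argument is itself the model data directory.
    if model_dir is not None and model_tag is None:
        return Path(model_dir)

    # Neither -m nor -c: there is no model to look for.
    if model_tag is None:
        return None

    # -m with -c: the tag names a subdirectory of model_dir;
    # -t and INFOMAP_MODEL_PATH are ignored.
    if model_dir is not None:
        return Path(model_dir) / model_tag

    # -t: look under INFOMAP_WORKING_DIR, ignoring INFOMAP_MODEL_PATH.
    if use_working_dir:
        working = env.get("INFOMAP_WORKING_DIR")
        return Path(working) / model_tag if working else None

    # Default: the first directory in the colon-separated
    # INFOMAP_MODEL_PATH that contains a subdirectory named by the tag.
    for entry in env.get("INFOMAP_MODEL_PATH", "").split(":"):
        if entry and (Path(entry) / model_tag).is_dir():
            return Path(entry) / model_tag
    return None
```
.fi
.PP
Under these assumptions, calling resolve_model_dir(model_tag="news",
model_dir="/data") yields the path /data/news, which corresponds to
giving both the -m and -c options on the command line.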
