.\" $Id: ngram-merge.1,v 1.7 2004/12/03 17:59:01 stolcke Exp $
.TH ngram-merge 1 "$Date: 2004/12/03 17:59:01 $"  "SRILM Tools"
.SH NAME
ngram-merge \- merge N-gram counts
.SH SYNOPSIS
.B ngram-merge
[\c
.BR \-help ]
[\c
.B \-write
.IR outfile ]
[\c
.BR \-float-counts ]
[\c
.BR -- ]
.I infile1
.I infile2
\&...
.SH DESCRIPTION
.B ngram-merge
reads two or more lexicographically sorted N-gram count files
(as produced by
.BR "ngram-count -sort" )
and outputs the merged, sorted counts.
The output is thus suitable for subsequent merging steps.
.PP
The input format consists of one N-gram count per line,
.br
.I word1 word2 ... wordn count
.P
.br
The lines must be sorted lexicographically on the words, leftmost first.
The input may contain N-grams of different lengths.
.PP
Each filename argument can be a plain ASCII count file, or a
compressed file (name ending in .Z or .gz), or ``-'' to indicate
stdin/stdout.
.PP
.B ngram-merge
is recommended in cases where the full counts would far exceed
available real memory.
Although an arbitrary number of input count files is accepted,
it is best to use the program as follows.
First, partition the input text into the largest chunks so that
.B ngram-count
can run in real memory.
Then merge the resulting sorted counts using
.B ngram-merge
pairwise, and continue doing so in a binary tree pattern until a
single count file containing all N-grams remains.
This procedure is automated by the
.B make-batch-counts
and
.B merge-batch-counts
scripts.
.SH OPTIONS
.PP
Each filename argument can be an ASCII file, or a
compressed file (name ending in .Z or .gz), or ``-'' to indicate
stdin/stdout.
.TP
.B \-help
Print option and usage summary.
.TP
.B \-version
Print version information.
.TP
.BI \-write " outfile"
Write merged counts to
.IR outfile ,
instead of standard output.
.TP
.B \-float-counts
Process counts as floating point numbers.
By default counts are assumed to be unsigned integers.
.TP
.B \-\-
Indicates the end of options, in case the first input filename begins
with ``-''.
.SH "SEE ALSO"
ngram-count(1), ngram(1),
training-scripts(1).
.SH AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>
.br
Copyright 1995\-2004 SRI International
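.SH EXAMPLE
The pairwise, binary tree merging pattern recommended in the DESCRIPTION
can be illustrated with a minimal Python sketch.
It is not the implementation used by
.B ngram-merge
and is given only to show the idea.
The function names (parse_line, merge_counts, merge_tree) are hypothetical.
The sketch assumes integer counts, whitespace-separated fields, and input
whose sort order agrees with Python's tuple-of-strings comparison (true for
plain ASCII words).
It also keeps every intermediate result in memory as a list, whereas the
real tool works on files.
.PP
.nf
```python
import heapq
from itertools import groupby


def parse_line(line):
    """Split a 'word1 ... wordn count' line into (words, count)."""
    ngram, count = line.rsplit(None, 1)
    # Assumption: counts are integers (the default, without -float-counts).
    return tuple(ngram.split()), int(count)


def merge_counts(*sources):
    """Merge sorted count streams, summing counts of identical N-grams."""
    parsed = [map(parse_line, src) for src in sources]
    # Tuples of words compare leftmost word first, and a shorter N-gram
    # sorts before any longer N-gram that it is a prefix of.
    merged = heapq.merge(*parsed, key=lambda item: item[0])
    for words, group in groupby(merged, key=lambda item: item[0]):
        yield words, sum(count for _, count in group)


def merge_tree(chunks):
    """Merge chunks pairwise, level by level, until one list remains."""
    level = [list(chunk) for chunk in chunks]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]  # the last "pair" may hold a single chunk
            next_level.append(
                ["%s %d" % (" ".join(words), count)
                 for words, count in merge_counts(*pair)]
            )
        level = next_level
    return level[0] if level else []
```
.fi
.PP
Calling merge_tree on a list of sorted chunks, each a list of count lines,
returns one sorted list in which the counts of N-grams that occur in several
chunks have been added together.
Because every intermediate list is itself sorted, it can serve as input to
the next merging level.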
