
📄 make-kn-discounts.gawk

📁 A very handy toolkit
💻 GAWK
#!/usr/local/bin/gawk -f
#
# make-kn-discounts --
#	generate modified Kneser-Ney discounting parameters from a
#	count-of-count file
#
#	The purpose of this script is to do the KN computation off-line,
#	without ngram-count having to read all counts into memory.
#	The output is compatible with the ngram-count -kn<n> options.
#
# $Header: /home/srilm/devel/utils/src/RCS/make-kn-discounts.gawk,v 1.2 2004/11/02 02:00:35 stolcke Exp $
#
# usage: make-kn-discounts min=<mincount> countfile
#
BEGIN {
    min=1;
}
/^#/ {
    # skip comments
    next;
}
{
    countOfCounts[$1] = $2;
}
END {
    # Code below is essentially identical to ModKneserNey::estimate()
    # (Discount.cc).

    if (countOfCounts[1] == 0 || \
        countOfCounts[2] == 0 || \
        countOfCounts[3] == 0 || \
        countOfCounts[4] == 0) \
    {
        printf "error: one of required counts of counts is zero\n" \
                                                        >> "/dev/stderr";
        exit(2);
    }

    Y = countOfCounts[1]/(countOfCounts[1] + 2 * countOfCounts[2]);

    print "mincount", min;
    print "discount1", 1 - 2 * Y * countOfCounts[2] / countOfCounts[1];
    print "discount2", 2 - 3 * Y * countOfCounts[3] / countOfCounts[2];
    print "discount3+", 3 - 4 * Y * countOfCounts[4] / countOfCounts[3];
}
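
For readers who prefer to see the arithmetic of the END block as formulas, it can be restated as below. The notation is an assumption introduced here for illustration and does not appear in the script: n_k stands for countOfCounts[k], the number of distinct n-grams that occur exactly k times, and D_1, D_2, D_{3+} stand for the values printed as discount1, discount2 and discount3+.

```latex
% n_k = countOfCounts[k] (illustrative notation, not used in the script itself)
\begin{align*}
Y      &= \frac{n_1}{n_1 + 2\,n_2} \\
D_1    &= 1 - 2\,Y\,\frac{n_2}{n_1} \\
D_2    &= 2 - 3\,Y\,\frac{n_3}{n_2} \\
D_{3+} &= 3 - 4\,Y\,\frac{n_4}{n_3}
\end{align*}
% Worked example with made-up counts n_1 = 100, n_2 = 40, n_3 = 20, n_4 = 10:
%   Y      = 100 / (100 + 80)          = 5/9
%   D_1    = 1 - 2 * (5/9) * (40/100)  = 5/9   (about 0.556)
%   D_2    = 2 - 3 * (5/9) * (20/40)   = 7/6   (about 1.167)
%   D_{3+} = 3 - 4 * (5/9) * (10/20)   = 17/9  (about 1.889)
```

Since n_1 through n_4 each appear as a divisor in at least one of these expressions, the script's check that none of them is zero prevents a division by zero before any discount is printed.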
