📄 make-lm-subset.gawk

📁 A very handy toolkit
💻 GAWK
#!/usr/local/bin/gawk -f
#
# filter a backoff model with a count file, so that only ngrams
# in the countfile are represented in the output
#
# usage: make-lm-subset count-file bo-file
#
# $Header: /home/srilm/devel/utils/src/RCS/make-lm-subset,v 1.3 1999/10/17 06:10:10 stolcke Exp $
#
ARGIND==1 {
	ngram = $0;
	sub("[ 	]*[0-9]*$", "", ngram);
	count[ngram] = 1;
	next;
}
ARGIND==2 && /^$/ {
	print; next;
}
ARGIND==2 && /^\\/ {
	print; next;
}
ARGIND==2 && /^ngram / {
	print; next;
}
ARGIND==2 {
	ngram = $0;
	# strip numeric stuff
	sub("^[-.e0-9]*[ 	]*", "", ngram);
	sub("[ 	]*[-.e0-9]*$", "", ngram);
	if (count[ngram]) print;
	next;
}