
📄 continuous-ngram-count.gawk

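#!/usr/local/bin/gawk -f
#
# continuous-ngram-count --
#	Generate ngram counts ignoring line breaks
#
# usage: continuous-ngram-count order=ORDER textfile | ngram-count -read -
#
# $Header: /home/srilm/devel/utils/src/RCS/continuous-ngram-count,v 1.1 1998/08/24 00:52:30 stolcke Exp $
#
BEGIN {
	order = 3;	# default N-gram order; override with order=ORDER

	head = 0;	# next position in ring buffer
}

# Store w in the ring buffer, then print every N-gram (length 1..order)
# that ends at w, each with a count of 1.
function process_word(w) {
	buffer[head] = w;

	ngram = "";
	for (j = 0; j < order; j ++) {
		# walk backwards from the newest word
		w1 = buffer[(head + order - j) % order];
		if (w1 == "") {
			# slot not filled yet: fewer than `order` words seen so far
			break;
		}
		ngram = w1 " " ngram;
		print ngram 1;
	}
	head = (head + 1) % order;
}

# The buffer is never reset between input lines, so N-grams span line breaks.
{
	for (i = 1; i <= NF; i ++) {
		process_word($i);
	}
}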
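
To see what the script emits, the sketch below restates the same ring-buffer logic as an inline awk program wrapped in a shell function, so it runs without the original file. The function name `ngram_demo`, the use of plain `awk` instead of `gawk`, and passing the order with `-v` are assumptions made for this illustration; they are not part of the original script.

```shell
#!/bin/sh
# Hypothetical demo wrapper (not part of the original script): restates the
# listing's ring-buffer logic inline so the example is self-contained.
# Usage: ngram_demo ORDER < textfile
ngram_demo() {
  awk -v order="${1:-3}" '
    BEGIN { head = 0 }               # next position in ring buffer
    function process_word(w,   j, w1, ngram) {
      buffer[head] = w
      ngram = ""
      for (j = 0; j < order; j++) {
        # walk backwards from the newest word
        w1 = buffer[(head + order - j) % order]
        if (w1 == "") break          # buffer not yet full
        ngram = w1 " " ngram
        print ngram 1                # N-gram followed by a count of 1
      }
      head = (head + 1) % order
    }
    { for (i = 1; i <= NF; i++) process_word($i) }
  '
}

# Two input lines; the trigram "a b c" spans the line break.
printf 'a b\nc d\n' | ngram_demo 3
# prints:
#   a 1
#   b 1
#   a b 1
#   c 1
#   b c 1
#   a b c 1
#   d 1
#   c d 1
#   b c d 1
```

Every N-gram is printed with a count of 1, so duplicates are not merged here; per the usage line in the header, the output is piped to `ngram-count -read -`, which reads the counts.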
