⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 htklat-vocab.gawk

📁 这是一款很好用的工具包
💻 GAWK
字号:
#!/usr/local/bin/gawk -f## htklat-vocab --#	extract vocabulary used in an HTK lattice## usage: htklat-vocab HTK-LATTICE ... > VOCAB## $Header: /home/srilm/devel/utils/src/RCS/htklat-vocab.gawk,v 1.3 2004/02/27 21:42:28 stolcke Exp $#BEGIN {	null = "!NULL";	quotes = 0;}{	for (i = 1; i <= NF; i ++) {		# skip comments		if ($i ~ /^#/) next;		# Note: this doesn't handle quoted spaces		# (as SRILM generally doesn't)		if ($i ~ /^W=/ || $i ~ /^WORD=/) {		    word = substr($i, index($i, "=") + 1);		    if (quotes) {			# HTK quoting conventions			if (word ~ /^['"]/) {			    word = substr(word, 2, length(word)-2);			}			if (word ~ /\\/) {			    gsub(/\\\\/, "@QuOtE@", word);			    gsub(/\\/, "", word);			    gsub(/@QuOtE@/, "\\", word);			}		    }		    if (word != null) {			is_word[word] = 1;		    }		}	}}END {	for (word in is_word) {		print word;	}}

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -