📄 comfix.awk

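📁 This operating system also comes from Bell Labs and shares roots with UNIX; its simple design and implementation make it easy to study and understand.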
💻 AWK
# when raw index has a lot of entries like
# 1578324	problematico, a, ci, che
# apply this algorithm:
#  treat things after comma as suffixes
#  for each suffix:
#      if single letter, replace last letter
#      else search backwards for beginning of suffix
#      and if it leads to an old suffix of approximately
#      the same length, put replace that suffix
# This will still leave some commas to fix by hand
# Usage: awk -F'	' -f comfix.awk rawindex > newrawindex

NF == 2	{
		i = index($2, ",")
		if(i == 0 || length($2) == 0)
			print $0
		else {
			n = split($2, a, /,[ ]*/)
			w = a[1]
			printf "%s\t%s\n", $1, w
			for(i = 2; i <= n; i++) {
				suf = a[i]
				m = matchsuflen(w, suf)
				if(m) {
					nw = substr(w, 1, length(w)-m) suf
					printf "%s\t%s\n", $1, nw
				} else
					printf "%s\t%s\n", $1, w ", " suf
			}
		}
	}
NF != 2 {
	print $0
	}

function matchsuflen(w, suf,		wlen,suflen,c,pat,k,d)
{
	wlen = length(w)
	suflen = length(suf)
	if(suflen == 1)
		return 1
	else {
		c = substr(suf, 1, 1)
		for (k = 1; k <= wlen ; k++)
			if(substr(w, wlen-k+1, 1) == c)
				break
		if(k > wlen)
			return 0
		d = k-suflen
		if(d < 0)
			d = -d
		if(d > 3)
			return 0
		return k
	}
}
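
To see what the suffix rule does to the sample entry from the header comment, here is a minimal Python sketch of the same logic. It is an illustrative port, not part of the original script: the names `match_suffix_len` and `expand_entry` are hypothetical, and it assumes each input line is a single tab-separated record, as with `-F'	'`.

```python
import re


def match_suffix_len(word, suffix):
    """Port of matchsuflen: number of trailing characters of `word`
    to replace with `suffix`, or 0 if no plausible match is found."""
    if len(suffix) == 1:
        return 1  # a single letter always replaces the last letter
    first = suffix[:1]
    # Search backwards from the end of the word for the suffix's first letter.
    for k in range(1, len(word) + 1):
        if word[len(word) - k] == first:
            break
    else:
        return 0  # first letter of the suffix never occurs in the word
    # Accept only if the old suffix is within 3 characters of the new one's length.
    if abs(k - len(suffix)) > 3:
        return 0
    return k


def expand_entry(line):
    """Port of the main rules: expand 'key<TAB>word, suf, suf' into one line per form."""
    fields = line.split("\t")
    if len(fields) != 2 or "," not in fields[1]:
        return [line]  # pass through anything that is not a two-field comma entry
    key, entry = fields
    parts = re.split(r", *", entry)
    word = parts[0]
    out = [f"{key}\t{word}"]
    for suffix in parts[1:]:
        m = match_suffix_len(word, suffix)
        if m:
            out.append(f"{key}\t{word[:len(word) - m]}{suffix}")
        else:
            out.append(f"{key}\t{word}, {suffix}")  # left for fixing by hand
    return out


for row in expand_entry("1578324\tproblematico, a, ci, che"):
    print(row)
```

For the sample entry this prints `problematico`, `problematica`, `problematici`, and `problematiche`, each prefixed by the key `1578324` and a tab. A suffix whose first letter never occurs in the headword (for example `casa, xyz`) is emitted unchanged with its comma, which is the case the header comment says must still be fixed by hand.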
