#!/usr/local/bin/gawk -f
#
# sort-lm --
#    sort the ngrams in an LM in lexicographic order, as required for
#    some other LM software (notably CMU's).
#
# usage: sort-lm lm-file > sorted-lm-file
#
# $Header: /home/srilm/devel/utils/src/RCS/sort-lm.gawk,v 1.2 2004/11/02 02:00:35 stolcke Exp $
#
BEGIN {
    sorter = "";
    currorder = 0;
}
NF==0 {
    print;
    next;
}
/^ngram *[0-9][0-9]*=/ {
    order = substr($2,1,index($2,"=")-1);
    print;
    next;
}
/^\\[0-9]-grams:/ {
    if (sorter) {
        close(sorter);
    }

    currorder = substr($0,2,1);
    print;
    fflush();

    # set up new sorting pipeline;
    sorter = "sort";
    for (i = 1; i <= currorder; i ++) {
        sorter = sorter " +" i " -" (i+1);
    }
    # print sorter >> "/dev/stderr";
    next;
}
/^\\/ {
    if (sorter) {
        close(sorter);
        sorter = "";
    }
    currorder = 0;
    print;
    next;
}
currorder && NF > 1 {
    print | sorter;
    next;
}
{
    print;
}