pymmseg

From "A word segmentation program written in Python that implements the maximum matching method; simple and easy to use" · Code · 48 lines

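#!/usr/bin/env python

import sys
import getopt
from os.path import dirname, join

# Make the mmseg package in the parent directory importable.
sys.path.append(join(dirname(__file__), '..'))

import mmseg


def print_usage():
    # Print the help text and exit.
    sys.stdout.write("""\
mmseg  Segment Chinese text. Read from stdin and print to stdout.

Options:
  -h
  --help      Print this message

  -s
  --separator Select the separator of the segmented text. Default is
              space.
""")
    sys.exit(0)


separator = " "

# Accept the long options that the help text advertises as well as the
# short ones; the original option string 'hs:' only handled -h and -s.
optlst, args = getopt.getopt(sys.argv[1:], 'hs:', ['help', 'separator='])
for opt, val in optlst:
    if opt in ('-h', '--help'):
        print_usage()
    elif opt in ('-s', '--separator'):
        separator = val

# load default dictionaries
mmseg.dict_load_defaults()

# Segment everything read from stdin.
algor = mmseg.Algorithm(sys.stdin.read())

# Write the tokens, placing the separator between them but not before the
# first one.
first = True
for tk in algor:
    if not first:
        sys.stdout.write(separator)
    first = False
    sys.stdout.write(tk.text)

# Finish the output with a newline.
sys.stdout.write("\n")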
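
The description above says the program implements the maximum matching method. As a rough illustration of that idea only, here is a minimal sketch of forward maximum matching over a toy dictionary. It is not the mmseg library's implementation: TOY_DICT and forward_max_match are hypothetical names introduced for this example, and the dictionary contents are made up.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: a toy forward maximum matching segmenter.
# TOY_DICT and forward_max_match are hypothetical and not part of mmseg.

TOY_DICT = {u"研究", u"研究生", u"生命", u"起源"}


def forward_max_match(text, words, max_len=None):
    """Greedily take the longest dictionary word at each position."""
    if max_len is None:
        max_len = max(len(w) for w in words)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in words:
                tokens.append(piece)
                i += size
                break
    return tokens


tokens = forward_max_match(u"研究生命起源", TOY_DICT)
```

Joining the result with u" ".join(tokens) mirrors the script's default space separator. With this toy dictionary the greedy pass picks the three-character word first and leaves a single character stranded after it, which is the kind of ambiguity that plain longest-match segmentation cannot resolve on its own.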
