analyzertest.java

A program related to Chinese word segmentation.
/*
 * Created on 2005-12-28
 * author 谢骋超
 * 
 */
package cn.edu.zju.dartsplitter.analysis;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.net.URL;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.springframework.util.StopWatch;

import cn.edu.zju.dartsplitter.appcontext.AppContextFactory;

public class AnalyzerTest {

    public static void main(String[] args) throws IOException {

        // Warm-up run on a short string; presumably this first call is what
        // loads the dictionary.
        System.out.println("DictTreeAnalyzer dict preload");
        StringBuilder sb1 = new StringBuilder("感冒清胶囊");
        doTreeDictAnalyzer(sb1);

        StringBuilder sb = loadString();
        // The second run should be much faster than the warm-up run above.
        doTreeDictAnalyzer(sb);
        doSimpleAnalyzer(sb);
        doStandardAnalyzer(sb);
    }


    private static void doStandardAnalyzer(StringBuilder sb) throws IOException {
        System.out.println("StandardAnalyzer");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start("StandardAnalyzer");
        AnalyzerUtils.displayTokensWithFullDetails(new StandardAnalyzer(),
                sb.toString());
        stopWatch.stop();
        System.out.println(stopWatch.prettyPrint());
        // AnalyzerUtils.displayTokensWithPositions(new StandardAnalyzer(),
        // sb.toString());
    }

    private static void doSimpleAnalyzer(StringBuilder sb) throws IOException {
        System.out.println("SimpleAnalyzer");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start("SimpleAnalyzer");
        AnalyzerUtils.displayTokensWithFullDetails(new SimpleAnalyzer(),
                sb.toString());
        stopWatch.stop();
        System.out.println(stopWatch.prettyPrint());
        // AnalyzerUtils.displayTokensWithPositions(new SimpleAnalyzer(),
        // sb.toString());
    }

    private static void doTreeDictAnalyzer(StringBuilder sb) throws IOException {
        Analyzer cnAnalyzer = (Analyzer) AppContextFactory.getContext()
                .getBean("dictTreeAnalyzer");

        // cnAnalyzer.setFilterTagList(getFilterTagList());
        StopWatch stopWatch = new StopWatch();
        stopWatch.start("dictTreeAnalyzer");
        AnalyzerUtils.displayTokensWithFullDetails(cnAnalyzer, sb.toString());
        stopWatch.stop();
        System.out.println(stopWatch.prettyPrint());
        // AnalyzerUtils.displayTokensWithPositions(cnAnalyzer,
        // sb.toString());
        // System.out.println("\n----");
    }

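    private static StringBuilder loadString() throws FileNotFoundException,
            IOException {
        URL url = AnalyzerTest.class.getResource("analysisTest.txt");
        if (url == null) {
            throw new FileNotFoundException("analysisTest.txt not found on classpath");
        }
        File f = new File(url.getFile());
        StringBuilder sb = new StringBuilder();
        // Note: FileReader decodes with the platform default charset.
        BufferedReader reader = new BufferedReader(new FileReader(f));
        try {
            String line;
            // Test for end of stream before appending, so the literal
            // "null" is never added to the buffer.
            while ((line = reader.readLine()) != null) {
                // readLine() strips the line terminator; restore it so that
                // text on adjacent lines is not fused together.
                sb.append(line).append('\n');
            }
        } finally {
            reader.close();
        }
        return sb;
    }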
}
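
The comment in main expects the second dictTreeAnalyzer run to be much faster than the first. A minimal sketch of why that can happen, assuming the cost comes from a one-time dictionary load on first use: `LazyDictionary` and `timeNanos` below are hypothetical names invented for illustration and are not part of DartSplitter, Lucene, or Spring. The sketch uses only the JDK, with `System.nanoTime()` standing in for `StopWatch`.

```java
import java.util.HashSet;
import java.util.Set;

public class WarmUpTimingSketch {

    /** Hypothetical stand-in for a dictionary-backed analyzer (not the DartSplitter API). */
    static final class LazyDictionary {
        private Set<String> words;   // stays null until the first lookup
        private int loadCount = 0;   // how many times the load actually ran

        private Set<String> words() {
            if (words == null) {
                // Simulated expensive one-time load; a real dictionary would
                // be read from disk instead.
                Set<String> loaded = new HashSet<>();
                for (int i = 0; i < 200_000; i++) {
                    loaded.add("word" + i);
                }
                words = loaded;
                loadCount++;
            }
            return words;
        }

        boolean contains(String w) {
            return words().contains(w);
        }

        int loadCount() {
            return loadCount;
        }
    }

    /** Runs the task once and returns the elapsed time in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        LazyDictionary dict = new LazyDictionary();
        // The first call pays for the load; the second only does the lookup.
        long first = timeNanos(() -> dict.contains("word42"));
        long second = timeNanos(() -> dict.contains("word42"));
        System.out.println("first call  (includes load): " + first + " ns");
        System.out.println("second call (warm):          " + second + " ns");
        System.out.println("dictionary loads: " + dict.loadCount());
    }
}
```

Single-shot timings like these are noisy, so the numbers are only a rough indication. The load counter is the reliable signal that the expensive step ran exactly once.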
