📄 dsanalyzer.java

📁 dspace 用j2ee架构的一个数字图书馆.开源程序
💻 JAVA
字号:
/* * DSAnalyzer.java * * Version: $Revision: 1.8 $ * * Date: $Date: 2005/04/20 14:23:23 $ * * Copyright (c) 2002-2005, Hewlett-Packard Company and Massachusetts * Institute of Technology.  All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * - Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * - Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * - Neither the name of the Hewlett-Packard Company nor the name of the * Massachusetts Institute of Technology nor the names of their * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH * DAMAGE. */package org.dspace.search;import java.io.Reader;import java.util.Set;import org.apache.lucene.analysis.Analyzer;import org.apache.lucene.analysis.LowerCaseFilter;import org.apache.lucene.analysis.PorterStemFilter;import org.apache.lucene.analysis.StopFilter;import org.apache.lucene.analysis.TokenStream;import org.apache.lucene.analysis.standard.StandardFilter;/** * Custom Lucene Analyzer that combines the standard filter, lowercase filter, * stemming and stopword filters. */public class DSAnalyzer extends Analyzer{    /*     * An array containing some common words that are not usually useful for     * searching.     */    private static final String[] STOP_WORDS =    {    // new stopwords (per MargretB)            "a", "am", "and", "are", "as", "at", "be", "but", "by", "for",            "if", "in", "into", "is", "it", "no", "not", "of", "on", "or",            "the", "to", "was"    // old stopwords (Lucene default)    /*     * "a", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",     * "into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "t",     * "that", "the", "their","then", "there","these", "they", "this", "to",     * "was", "will", "with"     */    };    /*     * Stop table     */    final static private Set stopSet = StopFilter.makeStopSet(STOP_WORDS);    /*     * Create a token stream for this analyzer.     */    public final TokenStream tokenStream(String fieldName, final Reader reader)    {        TokenStream result = new DSTokenizer(reader);        result = new StandardFilter(result);        result = new LowerCaseFilter(result);        result = new StopFilter(result, stopSet);        result = new PorterStemFilter(result);        return result;    }}
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -