📄 indexmanager.java

package sample.dw.paper.lucene.index;

import java.io.File;
import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import sample.dw.paper.lucene.util.HTMLDocParser;

/**
 * Builds a Lucene index over a directory tree of HTML files.
 */
public class IndexManager {
	

    // Directory that holds the HTML files to index
    private final String dataDir = "D:\\做搜索引擎\\Eclipse_projects\\Eclipse工程\\heritrixProject\\jobs\\1-20070629030555484\\mirror\\";

    // Directory where the Lucene index is stored
    public final String indexDir = "D:\\indexDir";
    
    /**
     * Recursively walks {@code file} and adds every .html/.htm file found
     * to the index through {@code indexWriter}.
     */
    public void ccreateIndex(File file, IndexWriter indexWriter) throws IOException {
        if (file.isDirectory()) {
            File[] files = file.listFiles();
            // listFiles() returns null if the directory cannot be read
            if (files == null) {
                return;
            }
            for (int i = 0; i < files.length; i++) {
                ccreateIndex(files[i], indexWriter);
            }
        } else {
            String htmlPath = file.getAbsolutePath();
            if (htmlPath.endsWith(".html") || htmlPath.endsWith(".htm")) {
                addDocument(htmlPath, indexWriter);
            }
        }
    }

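    /**
     * Creates the index if it does not exist yet.
     *
     * @return true if the index already existed or was built,
     *         false if the data directory is missing
     */
    public boolean createIndex() throws IOException {
        if (ifIndexExist()) {
            return true;
        }

        File dataRoot = new File(dataDir);
        if (!dataRoot.exists()) {
            return false;
        }

        Directory fsDirectory = FSDirectory.getDirectory(indexDir, true);
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriter indexWriter = new IndexWriter(fsDirectory, analyzer, true);
        try {
            ccreateIndex(dataRoot, indexWriter);
        } finally {
            // Close the writer even if indexing fails, so the index lock is released
            indexWriter.close();
        }
        return true;
    }
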
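    /**
     * Parses one HTML file and adds it to the Lucene index as a document
     * with the fields "path", "title" and "content".
     */
    public void addDocument(String htmlPath, IndexWriter indexWriter) {
        HTMLDocParser htmlParser = new HTMLDocParser(htmlPath);
        String path    = htmlParser.getPath();
        String title   = htmlParser.getTitle();
        Reader content = htmlParser.getContent();

        Document document = new Document();
        // path: stored so it can be shown in results, but not searchable
        document.add(new Field("path", path, Field.Store.YES, Field.Index.NO));
        // title: stored and tokenized for full-text search
        document.add(new Field("title", title, Field.Store.YES, Field.Index.TOKENIZED));
        // content: tokenized and indexed from the Reader, not stored
        document.add(new Field("content", content));
        try {
            indexWriter.addDocument(document);
        } catch (IOException e) {
            // A failure on one file is reported and indexing continues with the next
            e.printStackTrace();
        }
    }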
    
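    /**
     * Checks whether the index directory already contains files.
     */
    public boolean ifIndexExist() {
        File directory = new File(indexDir);
        File[] files = directory.listFiles();
        // listFiles() returns null if indexDir does not exist or is not a directory
        return files != null && files.length > 0;
    }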
    
    public String getDataDir() {
        return this.dataDir;
    }

    public String getIndexDir() {
        return this.indexDir;
    }
}
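
The recursive walk in `ccreateIndex` can be tried out without Lucene. The following is a minimal sketch that only collects the matching files instead of indexing them. `HtmlFileCollector`, `collect` and `isHtml` are hypothetical names that are not part of the original class, and the case-insensitive extension match is an assumption that differs from the case-sensitive `endsWith` check above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of IndexManager.
final class HtmlFileCollector {

    private HtmlFileCollector() {
    }

    /** Returns every .html/.htm file found under root (or root itself if it is one). */
    static List<File> collect(File root) {
        List<File> result = new ArrayList<>();
        walk(root, result);
        return result;
    }

    private static void walk(File file, List<File> out) {
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children == null) {
                return; // directory could not be read
            }
            for (File child : children) {
                walk(child, out);
            }
        } else if (file.isFile() && isHtml(file)) {
            out.add(file);
        }
    }

    /** Assumption: extensions are matched case-insensitively. */
    static boolean isHtml(File file) {
        String name = file.getName().toLowerCase(Locale.ROOT);
        return name.endsWith(".html") || name.endsWith(".htm");
    }
}
```

Calling `HtmlFileCollector.collect(new File(dataDir))` would return the files that `ccreateIndex` would pass to `addDocument`, which can be a convenient way to check the data directory before building an index.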
