
📄 traverse.java

📁 Code for the preprocessing stage of a search engine
💻 JAVA
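package test;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;

/**
 * Walks a mirrored site directory and, for every .html/.htm file, writes a
 * .txt file holding the page URL followed by the output of HtmlParser.
 */
public class Traverse {
	// Prefix for output files; it is concatenated directly with the file
	// name, so it must end with a path separator.
	private String outputPath="";
	// Root of the mirrored site; its path must contain the segment "mirror".
	private String inputFilePath="";

	public void traverse(File path)throws Exception
	{
		if(path==null){
			return;
		}
		if(path.isDirectory()){
			// list() returns null when the directory cannot be read
			String[] files=path.list();
			if(files==null){
				return;
			}
			for(String file:files){
				traverse(new File(path,file));
			}
		}
		else{
			String name=path.getName();
			if(name.endsWith(".html")||name.endsWith(".htm")){
				System.out.println(path);

				// Strip only the extension, so "a.b.html" becomes "a.b"
				String txtName=name.substring(0, name.lastIndexOf('.'));

				// Everything after "mirror" in the absolute path is host + page path
				int mirrorPos=getInputFilePath().indexOf("mirror");
				if(mirrorPos<0){
					throw new IllegalStateException("inputFilePath must contain \"mirror\": "+getInputFilePath());
				}
				String url_seg=path.getAbsolutePath().substring(mirrorPos+"mirror".length());
				url_seg=url_seg.replace('\\', '/');
				// url_seg starts with "/", so this yields "http://host/..."
				String url="http:/"+url_seg;
				System.out.println(url);

				HtmlParser htmlParser=new HtmlParser();
				// try-with-resources closes the writer even if parsing or writing fails
				try(BufferedWriter bw=new BufferedWriter(new FileWriter(new File(this.getOutputPath()+txtName+".txt")))){
					bw.write(url+"\r\n"+htmlParser.HParser(path.getAbsolutePath()));
				}
			}
		}
	}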

	public String getOutputPath() {
		return outputPath;
	}

	public void setOutputPath(String outputPath) {
		this.outputPath = outputPath;
	}

	public String getInputFilePath() {
		return inputFilePath;
	}

	public void setInputFilePath(String inputFilePath) {
		this.inputFilePath = inputFilePath;
	}
	
}
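
The path-to-URL step in `traverse` can be tried in isolation. The sketch below is illustrative only: the class `MirrorUrlSketch` and the method `toUrl` are hypothetical names that do not appear in the original file, and it assumes (as the listing seems to) that the mirror root is an absolute path containing the segment "mirror" and that the page file lives under that root.

```java
// Hypothetical stand-alone sketch of the URL reconstruction used in Traverse.
final class MirrorUrlSketch {

    // Assumed marker directory name, taken from the original listing.
    private static final String MARKER = "mirror";

    /**
     * Rebuilds a page URL from the mirror root and the absolute path of a
     * mirrored file. Both arguments are assumed to share the same prefix.
     */
    static String toUrl(String mirrorRoot, String absolutePath) {
        int markerPos = mirrorRoot.indexOf(MARKER);
        if (markerPos < 0) {
            throw new IllegalArgumentException("mirror root must contain \"" + MARKER + "\"");
        }
        int startPos = markerPos + MARKER.length();
        if (startPos > absolutePath.length()) {
            throw new IllegalArgumentException("file path is shorter than the mirror root");
        }
        // The remainder begins with a separator, e.g. "\www.example.com\index.html"
        String urlSeg = absolutePath.substring(startPos).replace('\\', '/');
        // "http:/" + "/host/page" gives "http://host/page"
        return "http:/" + urlSeg;
    }

    public static void main(String[] args) {
        System.out.println(toUrl("D:\\mirror", "D:\\mirror\\www.example.com\\index.html"));
    }
}
```

Because the offset is computed from the mirror root but applied to the file's absolute path, the two must share the same prefix; a relative root or a file outside the mirror directory would yield a meaningless URL.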
