
📄 urlstore.java

📁 A spider written in Java, and a very good one! It holds up under heavy load, with a daily crawl volume of around 100 million.
💻 JAVA
package cn.yicha.subject.spider.store;

import java.sql.*;

import cn.yicha.common.db.DAOFactory;
import cn.yicha.common.db.DataAccessObject;
import cn.yicha.common.error.DataObjectNotFoundException;
import cn.yicha.subject.spider.URLObject;



public class URLStore
{
	// Guards ID allocation so that concurrent callers in this JVM never reuse an ID.
	private static final Object urlLock = new Object();

	public static void storeURL(URLObject urlObject) throws SQLException
	{
		String url = urlObject.getSourceURL().toString();
		String title = ParseUrl.getTitle(urlObject.getStringContent());
		// Fall back to the URL when the page has no usable title.
		if (title == null || title.equals(""))
			title = url;

		String context = ParseUrl.exportContext(urlObject.getStringContent());
		String path = urlObject.convertToFileName();

		// Hold the lock across both the max(ID) lookup and the insert. Releasing it
		// in between would let two threads read the same maximum and insert the same ID.
		synchronized (urlLock)
		{
			int id = getMaxTableID("URLStore") + 1;

			DataAccessObject dao = null;
			try {
				dao = DAOFactory.getDAO();

				// Build the insert statement; every string field has its single quotes doubled.
				StringBuffer sql = new StringBuffer();
				sql.append("insert into URLStore(ID, Url, Title, Context, Path) values (")
					.append(id).append(",'")
					.append(transSqlFieldWithQuote(url)).append("','")
					.append(transSqlFieldWithQuote(title)).append("','")
					.append(transSqlFieldWithQuote(context)).append("','")
					.append(transSqlFieldWithQuote(path)).append("')");

				System.out.println("begin to insert " + url + " ...");
				dao.executeUpdate(sql.toString());
			}
			catch (Exception e)
			{
				// Keep the original exception as the cause so the stack trace is not lost.
				throw new SQLException(e.getMessage(), e);
			}
			finally
			{
				if (dao != null)
					dao.dispose();
			}
		}
	}

	/**
	* Returns the largest ID value in a database table.
	* @param tableName name of the database table
	* @return the largest ID, or 0 if the table is empty
	*/
	private static int getMaxTableID(String tableName) throws SQLException
	{
		int maxID = 0;
		String sql = "select max(ID) from " + tableName;

		DataAccessObject dao = null;
		try
		{
			// Query the current maximum ID through the DAO
			dao = DAOFactory.getDAO();
			ResultSet rs = dao.getResult(sql);
			if (rs.next()) {
				maxID = rs.getInt(1);
			}
			rs.close();
		}
		catch (Exception ex)
		{
			throw new SQLException(ex.getMessage(), ex);
		}
		finally
		{
			if (dao != null)
				dao.dispose();
		}

		return maxID;
	}

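	/**
	* Escapes single quotes in a string by doubling them, so that the value
	* can be embedded in a SQL string literal.
	* @param field the string to escape
	* @return the escaped string
	*/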
	public static String transSqlFieldWithQuote(String field)
	{
		StringBuffer str = new StringBuffer();

		for (int i=0; i < field.length(); i++)
		{
			if (field.charAt(i) == '\'')
				str.append('\'');
			str.append(field.charAt(i));
		}
		return str.toString();
	}	
}
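
The insert in storeURL builds its SQL by string concatenation and relies on transSqlFieldWithQuote for escaping. As a point of comparison, here is a minimal sketch of the same insert written with a JDBC PreparedStatement, which leaves quoting to the driver. It assumes a plain java.sql.Connection is available; the DataAccessObject API in cn.yicha.common.db is not shown in this file, so whether it exposes one is unknown. The class name ParameterizedUrlInsert and its method are hypothetical and not part of the original source.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper for illustration only; not part of the original project.
final class ParameterizedUrlInsert {
    // Same table and columns as the statement built in URLStore.storeURL,
    // with one placeholder per column instead of concatenated values.
    static final String INSERT_SQL =
        "insert into URLStore(ID, Url, Title, Context, Path) values (?, ?, ?, ?, ?)";

    // Assumes the caller supplies an open Connection and an already allocated ID.
    // Returns the number of rows inserted.
    static int insert(Connection conn, int id, String url, String title,
                      String context, String path) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setInt(1, id);
            ps.setString(2, url);
            ps.setString(3, title);
            ps.setString(4, context);
            ps.setString(5, path);
            return ps.executeUpdate();
        }
    }
}
```

With bound parameters the values never become part of the SQL text, so no manual quote doubling is needed for them.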

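storeURL derives each new ID from select max(ID) plus one, which costs one extra query per stored page. One possible alternative, sketched below under the assumption that a single JVM is the only writer to the URLStore table, is to read the maximum once at startup and hand out later IDs from an in-memory counter. The class UrlIdAllocator is hypothetical and not part of the original source; with several writer processes, a database sequence or auto-increment column would be needed instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of the original project.
final class UrlIdAllocator {
    // Holds the most recently issued ID; starts at the table's current maximum.
    private final AtomicInteger lastId;

    // currentMaxId would come from a one-time "select max(ID)" lookup at startup.
    UrlIdAllocator(int currentMaxId) {
        this.lastId = new AtomicInteger(currentMaxId);
    }

    // Atomically increments the counter and returns the new value,
    // so concurrent callers in this JVM never receive the same ID.
    int nextId() {
        return lastId.incrementAndGet();
    }
}
```

An allocator seeded with 41 would hand out 42 on the first call and 43 on the second.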