⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 collect_itemcollecsteady.asp

📁 SK信息采集2.0功能介绍: 1.可针对任何静态网页,动态网页进行采集。包括htm,html,shtml,ASP,ASPX,JSP,PHP等。 2.增加自定文件采集.用户可采集网页中的所有文件.
💻 ASP
字号:
<%@LANGUAGE="VBSCRIPT" CODEPAGE="936"%>
<%option explicit%>
<!--#include file="../../Conn.asp"-->
<!--#include file="../../SysCls/KS_CommonCls.asp"-->
<!--#include file="../../SysCls/KS_CollectCommonCls.asp"-->
<!--#include file="../Inc/Session.asp"-->
<%
'===================================================================================================================
'软件名称:科汛网站管理系统
'当前版本:科汛网站管理系统 V2.2 0628个人Access版
'Copyright (C) 2006-2008 Kesion.Com  All rights reserved.
'产品咨询QQ:9537636,41904294
'技术支持QQ:111394,54004407 
'程序版权:科汛网络
'程序开发:科汛网络开发组(总策划:林文仲)
'E-Mail  :kesioncms@hotmail.com webmaster@kesion.com
'官方网站:http://www.kesion.com  
'演示站点:http://test.kesion.com 
'郑重声明:
'    ①、免费版本请在程序首页保留版权信息,并做上本站LOGO友情连接,商业版本无此要求;
'    ②、任何个人或组织不得在授权允许的情况下删除、修改、拷贝本软件及其他副本上一切关于版权的信息;
'    ③、科汛网络保留此软件的法律追究权利
'===================================================================================================================
Response.Buffer = True
Server.ScriptTimeout = 999
Response.Expires = -1
Response.ExpiresAbsolute = Now() - 1
Response.Expires = 0
Response.CacheControl = "no-cache"
Dim KSCls
Set KSCls = New Collect_ItemCollecSteady
KSCls.Execute()
Set KSCls = Nothing

Class Collect_ItemCollecSteady
        Private KSCMS
		Private KMCObj
		Private ConnItem
		Private ItemNum, ListNum, ListSuccesNum, ListFalseNum, NewsNumAll
		Private Rs, Sql, RsItem, SqlItem, FoundErr, ErrMsg, ItemEnd, ListEnd
		
		'项目变量
		Private ItemID, ItemName, LoginType, LoginUrl, LoginPostUrl, LoginUser, LoginPass, LoginFalse, ClassID
		Private ListStr, LsString, LoString, ListPageType, LPsString, LPoString, ListPageStr1, ListPageStr2, ListPageID1, ListPageID2, ListPageStr3, HsString, HoString, HttpUrlType, HttpUrlStr, CollecListNum, CollecNewsNum
		
		
		'采集相关的变量
		Private Arr_i, NewsUrl
		
		'其它变量
		Private LoginData, LoginResult
		Private Arr_Item, CacheTemp, CollecOrder, OrderTemp
		
		'执行时间变量
		Private StartTime, OverTime
		
		'列表
		Private ListUrl, ListCode, NewsArrayCode, NewsArray, ListArray, ListPageNext, ListPageTemp
		Private Sub Class_Initialize()
		  Set KSCMS=New CommonCls
		  Set KMCObj=New CollectCommonCls
		  Set ConnItem = KSCMS.ConnItem()
		End Sub
        Private Sub Class_Terminate()
		 Call KSCMS.CloseConnItem()
		 Call KSCMS.Closeconn
		 Set KSCMS=Nothing
		 Set KMCObj=Nothing
		End Sub
		Sub Execute()
		
		CacheTemp = LCase(Trim(Request.ServerVariables("SCRIPT_NAME")))
		CacheTemp = Left(CacheTemp, InStrRev(CacheTemp, "/"))
		CacheTemp = Replace(CacheTemp, "\", "_")
		CacheTemp = Replace(CacheTemp, "/", "_")
		CacheTemp = "ansir" & CacheTemp
		
		ItemNum = CLng(Trim(Request("ItemNum")))
		ListNum = CLng(Trim(Request("ListNum")))
		ListSuccesNum = CLng(Trim(Request("ListSuccesNum")))
		ListFalseNum = CLng(Trim(Request("ListFalseNum")))
		NewsNumAll = CLng(Trim(Request("NewsNumAll")))
		ListPageNext = Trim(Request("ListPageNext"))
		
		FoundErr = False
		ItemEnd = False
		ListEnd = False
		CollecListNum = 0
		CollecNewsNum = 0
		
		Call SetCache
		
		If ItemEnd <> True Then
		   If (ItemNum - 1) > UBound(Arr_Item, 2) Then
			  ItemEnd = True
		   Else
			  Call SetItems
		   End If
		End If
		
		If ItemEnd <> True Then
		   If ListPageType = 0 Then
			  If ListNum = 1 Then
				 ListUrl = ListStr
			  Else
				 ListEnd = True
			  End If
		   ElseIf ListPageType = 1 Then
			  If ListNum = 1 Then
				 ListUrl = ListStr
			  Else
				 If ListPageNext = "" Or ListPageNext = "Error" Then
					ListEnd = True
				 Else
					ListPageNext = Replace(ListPageNext, "{$ID}", "&")
					ListUrl = ListPageNext
				 End If
			  End If
		   ElseIf ListPageType = 2 Then
			  If ListNum = 1 Then
					 ListUrl = ListStr
			 Else
					If ListPageID1 > ListPageID2 Then
					   If (ListPageID1 - ListNum + 1) < ListPageID2 Or (ListPageID1 - ListNum + 1) < 0 Then
						  ListEnd = True
					   Else
						  ListUrl = Replace(ListPageStr2, "{$ID}", CStr(ListPageID1 - ListNum + 1))
					   End If
					Else
					   If (ListPageID1 + ListNum - 1) > ListPageID2 Then
						  ListEnd = True
					   Else
						  ListUrl = Replace(ListPageStr2, "{$ID}", CStr(ListPageID1 + ListNum - 1))
					   End If
					End If
			  End If
		   ElseIf ListPageType = 3 Then
			  ListArray = Split(ListPageStr3, "|")
			  If (ListNum - 1) > UBound(ListArray) Then
				 ListEnd = True
			  Else
				 ListUrl = ListArray(ListNum - 1)
			  End If
		   End If
		   If ListNum > CollecListNum And CollecListNum <> 0 Then
			  ListEnd = True
		   End If
		End If
		
		If ItemEnd = True Then
		   ErrMsg = "<br>列表分析完成"
		   ErrMsg = ErrMsg & "<br>成功分析: " & ListSuccesNum & "  页列表,失败: " & ListFalseNum & "  页,文章:" & NewsNumAll & "  篇"
		   ErrMsg = ErrMsg & "<br>正在整理数据,稍后进行文章的采集..."
		   ErrMsg = ErrMsg & "<meta http-equiv=""refresh"" content=""3;url=Collect_CollectStable.asp?ItemNum=1&NewsNum=1&NewsSuccesNum=0&NewsFalseNum=0&ImagesNumAll=0&NewsNumAll=" & NewsNumAll & """>"
		Else
		   If ListEnd = True Then
			  ItemNum = ItemNum + 1
			  ListNum = 1
			  ErrMsg = "<br>" & ItemName & "  项目所有列表分析完成,正在整理数据请稍后..."
			  ErrMsg = ErrMsg & "<meta http-equiv=""refresh"" content=""3;url=Collect_ItemCollecSteady.asp?ItemNum=" & ItemNum & "&ListNum=" & ListNum & "&ListSuccesNum=" & ListSuccesNum & "&ListFalseNum=" & ListFalseNum & "&NewsNumAll=" & NewsNumAll & """>"
		   End If
		End If
		
		Call TopItem
		If ItemEnd <> True And ListEnd <> True Then
		   FoundErr = False
		   ErrMsg = ""
		   Call StartCollection
		End If
		
		Call KMCObj.WriteCollectSuccedStart(ErrMsg)
		Response.Flush
		End Sub
		
		'==================================================
		'过程名:StartCollection
		'作  用:开始采集
		'参  数:无
		'==================================================
		Sub StartCollection()
		Dim Rs
		'第一次采集时登录
		If LoginType = 1 And ListNum = 1 Then
		   LoginData = KMCObj.UrlEncoding(LoginUser & "&" & LoginPass)
		   LoginResult = KMCObj.PostHttpPage(LoginUrl, LoginPostUrl, LoginData)
		   If InStr(LoginResult, LoginFalse) > 0 Then
			  FoundErr = True
			  ErrMsg = ErrMsg & "<br><li>在登录网站时发生错误,请确保登录信息的正确性!</li>"
		   End If
		End If
		
		Set Rs = Server.CreateObject("Adodb.Recordset")
			 Rs.Open "Select ID From KS_Class Where ID='" & ClassID & "'", conn, 1, 1
			If Rs.EOF And Rs.BOF Then
				  FoundErr = True
				  ErrMsg = ErrMsg & "<br>系统检测到栏目ID[<font color=red>" & ClassID & "</font>]在主数据库中已删除,请修改项目属性的所属栏目后,再采集"
				  Call KMCObj.WriteCollectSuccedStart(ErrMsg)
			   Response.End
			End If
			Rs.Close
			Set Rs = Nothing
			
		If FoundErr <> True Then
		   ListCode = KMCObj.GetHttpPage(ListUrl)
		   Call GetListPage
		   If ListCode = "Error" Then
			  FoundErr = True
			  ErrMsg = ErrMsg & "<br><li>在获取列表:" & ListUrl & "网页源码时发生错误!</li>"
		   Else
			  ListCode = KMCObj.GetBody(ListCode, LsString, LoString, False, False)
			  If ListCode = "Error" Or ListCode = "" Then
				 FoundErr = True
				 ErrMsg = ErrMsg & "<br><li>在截取:" & ListUrl & "的文章列表时发生错误!</li>"
			  End If
		   End If
		End If
		
		If FoundErr <> True Then
		   NewsArrayCode = KMCObj.GetArray(ListCode, HsString, HoString, False, False)
		   If NewsArrayCode = "Error" Then
			  FoundErr = True
			  ErrMsg = ErrMsg & "<br><li>在分析:" & ListUrl & "文章列表时发生错误!</li>"
		   Else
			  NewsArray = Split(NewsArrayCode, "$Array$")
			  For Arr_i = 0 To UBound(NewsArray)
				 If HttpUrlType = 1 Then
					NewsArray(Arr_i) = Trim(Replace(HttpUrlStr, "{$ID}", NewsArray(Arr_i)))
				 Else
					NewsArray(Arr_i) = Trim(KMCObj.DefiniteUrl(NewsArray(Arr_i), ListUrl))
				 End If
				 NewsArray(Arr_i) = KMCObj.CheckUrl(NewsArray(Arr_i))
			  Next
			  If CollecOrder = True Then
				 For Arr_i = 0 To Fix(UBound(NewsArray) / 2)
					OrderTemp = NewsArray(Arr_i)
					NewsArray(Arr_i) = NewsArray(UBound(NewsArray) - Arr_i)
					NewsArray(UBound(NewsArray) - Arr_i) = OrderTemp
				 Next
			  End If
		   End If
		End If
		
		If FoundErr <> True Then
		   ErrMsg = ErrMsg & "<br>本次运行 " & UBound(Arr_Item, 2) + 1 & " 个项目"
		   ErrMsg = ErrMsg & "<br>从第 " & ItemNum & " 个项目 " & ItemName & " 的第 " & ListNum & " 页列表分析出 " & UBound(NewsArray) + 1 & " 篇文章"
		   If CollecNewsNum <> 0 Then
			  ErrMsg = ErrMsg & ",限制 " & CollecNewsNum & " 篇。"
			  If (CollecNewsNum - 1) > UBound(NewsArray) Then
				 CollecNewsNum = UBound(NewsArray) + 1
			  Else
				 '保持不变CollecNewsNum
			  End If
		   Else
			  CollecNewsNum = UBound(NewsArray) + 1
		   End If
		   ListSuccesNum = ListSuccesNum + 1
		   NewsNumAll = NewsNumAll + CollecNewsNum
		   Call SaveNewsList
		Else
		   ListFalseNum = ListFalseNum + 1
		End If
		ErrMsg = ErrMsg & "<br>" & "<meta http-equiv=""refresh"" content=""3;url=Collect_ItemCollecSteady.asp?ItemNum=" & ItemNum & "&ListNum=" & ListNum + 1 & "&ListSuccesNum=" & ListSuccesNum & "&ListFalseNum=" & ListFalseNum & "&NewsNumAll=" & NewsNumAll & "&ListPageNext=" & ListPageNext & """>"
		
		End Sub
		
		
		'==================================================
		'过程名:SetCache
		'作  用:存取缓存
		'参  数:无
		'==================================================
		Sub SetCache()
		   Dim myCache
		   Set myCache = New ClsCache
		
		   '项目信息
		   myCache.name = CacheTemp & "items"
		   If myCache.valid Then
			  Arr_Item = myCache.value
		   Else
			  ItemEnd = True
		   End If
		   Set myCache = Nothing
		End Sub
		
		Sub SetItems()
			  Dim ItemNumTemp
			  ItemNumTemp = ItemNum - 1
			  ItemID = Arr_Item(0, ItemNumTemp)
			  ItemName = Arr_Item(1, ItemNumTemp)
			  ClassID = Arr_Item(4, ItemNumTemp) '目标栏目ID
			  LoginType = Arr_Item(9, ItemNumTemp)
			  LoginUrl = Arr_Item(10, ItemNumTemp)       '登录
			  LoginPostUrl = Arr_Item(11, ItemNumTemp)
			  LoginUser = Arr_Item(12, ItemNumTemp)
			  LoginPass = Arr_Item(13, ItemNumTemp)
			  LoginFalse = Arr_Item(14, ItemNumTemp)
			  ListStr = Arr_Item(15, ItemNumTemp)         '列表地址
			  LsString = Arr_Item(16, ItemNumTemp)       '列表
			  LoString = Arr_Item(17, ItemNumTemp)
			  ListPageType = Arr_Item(18, ItemNumTemp)
			  LPsString = Arr_Item(19, ItemNumTemp)
			  LPoString = Arr_Item(20, ItemNumTemp)
			  ListPageStr1 = Arr_Item(21, ItemNumTemp)
			  ListPageStr2 = Arr_Item(22, ItemNumTemp)
			  ListPageID1 = Arr_Item(23, ItemNumTemp)
			  ListPageID2 = Arr_Item(24, ItemNumTemp)
			  ListPageStr3 = Arr_Item(25, ItemNumTemp)
			  HsString = Arr_Item(26, ItemNumTemp)
			  HoString = Arr_Item(27, ItemNumTemp)
			  HttpUrlType = Arr_Item(28, ItemNumTemp)
			  HttpUrlStr = Arr_Item(29, ItemNumTemp)
			  CollecListNum = Arr_Item(80, ItemNumTemp)
			  CollecNewsNum = Arr_Item(81, ItemNumTemp)
			  CollecOrder = Arr_Item(84, ItemNumTemp)
		End Sub
		
		'==================================================
		'过程名:GetListPage
		'作  用:获取列表下一页
		'参  数:无
		'==================================================
		Sub GetListPage()
		   If ListPageType = 1 Then
			  ListPageNext = KMCObj.GetPage(ListCode, LPsString, LPoString, False, False)
			  ListPageNext = KMCObj.FpHtmlEnCode(ListPageNext)
			  If ListPageNext <> "Error" And ListPageNext <> "" Then
				 If ListPageStr1 <> "" Then
					ListPageNext = Replace(ListPageStr1, "{$ID}", ListPageNext)
				 Else
					ListPageNext = KMCObj.DefiniteUrl(ListPageNext, ListUrl)
				 End If
				 ListPageNext = Replace(ListPageNext, "&", "{$ID}")
			  End If
		   Else
			  ListPageNext = "Error"
		   End If
		End Sub
		
		'==================================================
		'过程名:SaveNewsList
		'作  用:保存文章
		'参  数:无
		'==================================================
		Sub SaveNewsList()
		   Set Rs = Server.CreateObject("adodb.recordset")
		   Sql = "select top 1 * From KS_NewsList"
		   Rs.Open Sql, ConnItem, 1, 3
		   For Arr_i = 1 To CollecNewsNum
			  Rs.AddNew
			  Rs("ItemID") = ItemID
			  Rs("NewsUrl") = NewsArray(Arr_i - 1)
			  Rs.Update
		   Next
		   Rs.Close
		   Set Rs = Nothing
		End Sub
		
		'==================================================
		'过程名:TopItem
		'作  用:显示导航信息
		'参  数:无
		'==================================================
		Sub TopItem()
		
		Response.Write "<html>"
		Response.Write "<head>"
		Response.Write "<title>文章采集系统</title>"
		Response.Write "<meta http-equiv=""Content-Type"" content=""text/html; charset=gb2312"">"
		Response.Write "<link rel=""stylesheet"" type=""text/css"" href=""../inc/Admin_Style.css"">"
		Response.Write "</head>"
		Response.Write "<body leftmargin=""0"" topmargin=""0"" marginwidth=""0"" marginheight=""0"" oncontextmenu=""return false"">"
		Response.Write "<table width=""100%"" border=""0"" align=""center"" cellpadding=""0"" cellspacing=""1"" class=""sortbutton"">"
		Response.Write "  <tr>"
		Response.Write "    <td height=""22"" colspan=""2"" align=""center""><strong>采 集 系 统 采 集 管 理</strong></td>"
		  Response.Write "</tr>"
		Response.Write "</table>"
		
		End Sub
End Class
%>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -