OpenWebSpider v0.7
- Rewrote some functions from scratch
- Changed names of the fields in various tables
- Enhanced indexing limits (depth level, number of pages, seconds, bytes downloaded, and errors)
- Limits can be set via command line and/or via DB (new table hostlist_extras [host_id, max_pages, max_level, max_seconds, max_bytes])
- Added field textlinks in the table rels
- Released Sers: Experimental Clustered Full-Text Search Engine
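A minimal Python sketch of how per-host limits like those in v0.7 might be combined and checked. This is not OpenWebSpider's actual code: the column names come from hostlist_extras, while max_errors, the CrawlStats counters, the "0 means no limit" convention (borrowed from the '-l' default), and the rule that a non-zero DB value overrides the command-line value are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlLimits:
    # Names mirror the hostlist_extras columns; 0 is assumed to mean "no limit".
    max_pages: int = 0
    max_level: int = 0
    max_seconds: int = 0
    max_bytes: int = 0
    max_errors: int = 0  # hypothetical: the changelog names no column for errors


@dataclass
class CrawlStats:
    # Hypothetical running counters for one host.
    pages: int = 0
    level: int = 0
    seconds: float = 0.0
    bytes: int = 0
    errors: int = 0


def merge_limits(cli: CrawlLimits, db: Optional[CrawlLimits]) -> CrawlLimits:
    """Assumed precedence: a non-zero per-host DB value overrides the CLI value."""
    if db is None:
        return cli
    return CrawlLimits(
        max_pages=db.max_pages or cli.max_pages,
        max_level=db.max_level or cli.max_level,
        max_seconds=db.max_seconds or cli.max_seconds,
        max_bytes=db.max_bytes or cli.max_bytes,
        max_errors=db.max_errors or cli.max_errors,
    )


def limit_reached(limits: CrawlLimits, stats: CrawlStats) -> Optional[str]:
    """Return the name of the first limit that has been reached, or None."""
    checks = [
        ("max_pages", limits.max_pages, stats.pages),
        ("max_level", limits.max_level, stats.level),
        ("max_seconds", limits.max_seconds, stats.seconds),
        ("max_bytes", limits.max_bytes, stats.bytes),
        ("max_errors", limits.max_errors, stats.errors),
    ]
    for name, limit, value in checks:
        # Inclusive comparison is an assumption; a limit of 0 is skipped.
        if limit > 0 and value >= limit:
            return name
    return None
```

For example, with a command-line limit of 100 pages and a DB row setting max_pages to 10 for one host, the merged limit for that host would be 10 pages under the assumed precedence.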

OpenWebSpider v0.6.1
- Fixed some bugs
- Enhanced modules support

OpenWebSpider v0.6
- New argument: '-S' (OWS Server)
- New argument: '-n' (Don't index pages)
- New argument: '-F' (Free Indexing Mode)
- New argument: '-o' (OWS Own Index)
- Argument '-l' is now safe
- Fixed a lot of bugs

OpenWebSpider v0.5.3
- Fixed a bug in LookForUrls() [Thanks to golove]
- Fixed a bug in sqlTextToUTF8()
- Fixed a bug in IsPageIndexed()
- Fixed a bug in ParseHTTPRequest()

OpenWebSpider v0.5.2
- Initial support for UTF-8 and UCS-2
- Fixed a bug in UnHtml()
- Fixed a deadlock in InitIndexing()
- Enhanced ParseUrl()
- Added two variables in the structure used by external modules: htmlLength and textLength
- Upgraded the fields 'html' and 'htmlcache' from 'text' and 'blob' to 'longtext' and 'longblob' in the table 'pagelist' and in the temporary tables
  (REMEMBER TO EXECUTE THIS QUERY TO UPGRADE YOUR TABLE: "ALTER TABLE <your_spiderdb_here>.pagelist CHANGE html html longtext NOT NULL, CHANGE htmlcache htmlcache longblob NULL, CHARSET utf8;")
- Now OpenWebSpider compiles with Microsoft Visual C++ 2005

OpenWebSpider v0.5.1
- Extended the set of the characters indexed
- Fixed a bug in InitIndexing() and ReturnFirstUrl()
- Optimized KillThreads()
- Optimized the module handler
- Modified modRegexFilter and mod_regex.conf structure

OpenWebSpider v0.5
- New argument: '-f' (Support for external modules (.dll OR .so))
- New argument: '-d' (Crawl delay in milliseconds) [Default: 0 (No delay)] [Thanks to Craig Atkins for the feedback]
- New argument: '-l' (Limit the maximum number of pages indexed per site) [Default: 0 (No limit)] [Thanks to Constantinos Laitsas for the feedback]
- New argument: '-x' (store full html of pages)
- New argument: '-z' (store full html of pages (compressed))
- New argument: '-u' (index only new pages (Update))
- New argument: '-r' (Save relationships between pages: which pages link to a page and which pages it links to)
- robots.txt: Support for "Crawl-Delay" [Thanks to Craig Atkins for the feedback]
- Support for tag STYLE in UnHtml() [Thanks to Constantinos Laitsas for the feedback]
- Support for tag BASE (<base href="http://www.openwebspider.org/">) in ParseUrl()
- Fixed a bug in FlushTempTable() [Thanks to Constantinos Laitsas for the feedback]
- Fixed a bug in ParseHTTPRequest() [Thanks to Constantinos Laitsas for the feedback]
- Optimized AddExternalHost()
- Optimized pRelationships()
- Optimized GetTickCount() for Linux; it now returns milliseconds
- Optimized UnHtml()
- Optimized OnlyOneSpace()
- Optimized RemoveShit()
- Optimized IndexPage()
- Optimized thrdBlock()
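Two of the v0.5 items above, "Crawl-Delay" in robots.txt and the BASE tag, can be illustrated with a short Python sketch built on the standard library. It only shows the general technique and does not reproduce OpenWebSpider's ParseUrl() or its robots.txt handling; crawl_delay_ms and resolve_link are hypothetical helper names, and converting the delay to milliseconds (to match the unit of '-d') is an assumption.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def crawl_delay_ms(robots_txt: str, user_agent: str = "*") -> int:
    """Return the Crawl-Delay for user_agent in milliseconds, or 0 if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # whole seconds, or None
    return int(delay * 1000) if delay is not None else 0


class _BaseHrefFinder(HTMLParser):
    """Records the href of the first <base> tag seen in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.base_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "base" and self.base_href is None:
            self.base_href = dict(attrs).get("href")


def resolve_link(page_url: str, html: str, link: str) -> str:
    """Resolve link against the page's <base href> if present, else the page URL."""
    finder = _BaseHrefFinder()
    finder.feed(html)
    base = urljoin(page_url, finder.base_href) if finder.base_href else page_url
    return urljoin(base, link)
```

With the BASE tag from the entry above, a relative link such as docs/index.html would resolve against http://www.openwebspider.org/ rather than against the URL of the page that contains it.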

OpenWebSpider v0.4.1
- Support for tag IFRAME

OpenWebSpider v0.4
- GUI
- Support for "starting/stopping" threads
- Support for DELAYED INSERT
- Added parameter "-e"; with it, the spider doesn't add external hosts to the DB
- Fixed a memory leak in IndexPage()
- Fixed a memory leak in ReturnFirstUrl()
- Fixed a memory leak in AddExternalHost()
- Fixed a memory leak in GetHostRank()

OpenWebSpider v0.3
- Support for robots.txt
- Fixed a bug in function BetweenTag
- Support for levels   (see related docs)
- Support for HostRank (see related docs)
- Support for PageRank (see related docs)
- Added parameter -m   (see ows_arguments)
- Fixed a bug in IndexPage

OpenWebSpider v0.2.1
- Support for HTTP 302
- Corrected some bugs

OpenWebSpider v0.2
- Multithreaded
- Corrected a lot of bugs
- Most functions rewritten
- List of hosts
- Changed the indexing-mode
- Changed the Search engine (Full-text)
- Removed realtime search
