
📄 simulator.py

# -- coding: utf-8
""" Simulator plugin for HarvestMan. This
plugin changes the behaviour of HarvestMan
to only simulate crawling without actually
downloading anything.

Author: Anand B Pillai <abpillai at gmail dot com>

Created Feb 7 2007  Anand B Pillai <abpillai at gmail dot com>

Copyright (C) 2007 Anand B Pillai

"""

__version__ = '2.0 b1'
__author__ = 'Anand B Pillai'

from harvestman.lib import hooks
from harvestman.lib.common.common import *
from harvestman.lib.common.macros import CONNECTOR_DATA_MODE_INMEM

def save_url(self, urlobj):

    # For simulation, we need to modify the behaviour
    # of save_url function in HarvestManUrlConnector class.
    # This is achieved by injecting this function as a plugin
    # Note that the signatures of both functions have to
    # be the same.

    url = urlobj.get_full_url()
    self.connect(urlobj, True, self._cfg.retryfailed)

    return 6

def apply_plugin():
    """ All plugin modules need to define this method """

    # This method is expected to perform the following steps.
    # 1. Register the required hook function
    # 2. Get the config object and set/override any required settings
    # 3. Print any informational messages.

    # The first step is required, the last two are of course optional
    # depending upon the required application of the plugin.

    cfg = objects.config
    cfg.simulate = True
    cfg.localise = 0
    hooks.register_plugin_function('connector:save_url_plugin', save_url)
    # Turn off caching, since no files are saved
    cfg.pagecache = 0
    # Turn off header dumping, since no files are saved
    cfg.urlheaders = 0
    # For simulator, we need in-mem data mode
    # since files are never saved!
    cfg.datamode = CONNECTOR_DATA_MODE_INMEM
    logconsole('Simulation mode turned on. Crawl will be simulated and no files will be saved.')
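
The comments in save_url describe replacing a method by injecting a function with the same signature. The sketch below illustrates that general idea in isolation. It is not HarvestMan's implementation: HookRegistry, its apply method, Connector, and simulated_save_url are hypothetical names invented for this example, and only the hook name string and the return value 6 are taken from the listing above.

```python
# Hypothetical stand-in for a plugin hook registry; NOT HarvestMan's code.

class HookRegistry:
    """Maps hook names to replacement functions (assumed design)."""

    def __init__(self):
        self._plugins = {}

    def register_plugin_function(self, name, func):
        # Remember the replacement under its hook name.
        self._plugins[name] = func

    def apply(self, name, target, attr):
        """Rebind target.attr to the registered function, if there is one."""
        func = self._plugins.get(name)
        if func is not None:
            setattr(target, attr, func)
        return func is not None


class Connector:
    """Toy connector whose save_url normally records a saved file."""

    def __init__(self):
        self.connected = []
        self.saved = []

    def connect(self, url):
        self.connected.append(url)

    def save_url(self, url):
        self.connect(url)
        self.saved.append(url)
        return 0


def simulated_save_url(self, url):
    # Same signature as Connector.save_url, but nothing is ever saved.
    self.connect(url)
    return 6  # mirrors the value returned by the plugin in the listing


registry = HookRegistry()
registry.register_plugin_function('connector:save_url_plugin', simulated_save_url)
registry.apply('connector:save_url_plugin', Connector, 'save_url')

conn = Connector()
result = conn.save_url('http://example.com/')
```

Because the replacement is bound on the class, every Connector instance picks it up, which is why the two functions need matching signatures: callers keep calling save_url exactly as before.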
