⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 abstractfrontier.html

📁 一个开源的网页爬虫一个开源的网页爬虫一个开源的网页爬虫一个开源的网页爬虫一个开源的网页爬虫一个开源的网页爬虫
💻 HTML
📖 第 1 页 / 共 5 页
字号:
<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Frontier is empty only if all queues are empty and no URIs are in-process</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#kickUpdate()">kickUpdate</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Notify Frontier that it should consider updating configuration info that may have changed in external files.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#loadSeeds()">loadSeeds</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Load up the seeds.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#log(org.archive.crawler.datamodel.CrawlURI)">log</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Log to the main crawl.log</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#logLocalizedErrors(org.archive.crawler.datamodel.CrawlURI)">logLocalizedErrors</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Take note of any processor-local errors that have been entered into the CrawlURI.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#needsRetrying(org.archive.crawler.datamodel.CrawlURI)">needsRetrying</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Checks if a recently completed CrawlURI that did not finish successfully needs to be retried (processed again after some time elapses)</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#noteAboutToEmit(org.archive.crawler.datamodel.CrawlURI, org.archive.crawler.frontier.WorkQueue)">noteAboutToEmit</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi,                <A HREF="../../../../org/archive/crawler/frontier/WorkQueue.html" title="class in org.archive.crawler.frontier">WorkQueue</A>&nbsp;q)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Perform fixups on a CrawlURI about to be returned via next().</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#overMaxRetries(org.archive.crawler.datamodel.CrawlURI)">overMaxRetries</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#pause()">pause</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Notify Frontier that it should not release any URIs, instead holding all threads, until instructed otherwise.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#politenessDelayFor(org.archive.crawler.datamodel.CrawlURI)">politenessDelayFor</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Update any scheduling structures with the new information in this CrawlURI.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#preNext(long)">preNext</A></B>(long&nbsp;now)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#queuedUriCount()">queuedUriCount</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(non-Javadoc)</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#reportTo(java.io.PrintWriter)">reportTo</A></B>(java.io.PrintWriter&nbsp;writer)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Make a default report to the passed-in Writer.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#retryDelayFor(org.archive.crawler.datamodel.CrawlURI)">retryDelayFor</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Return a suitable value to wait before retrying the given URI.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#saveIgnoredItems(java.lang.String, java.io.File)">saveIgnoredItems</A></B>(java.lang.String&nbsp;ignoredItems,                 java.io.File&nbsp;dir)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Dump ignored seed items (if any) to disk; delete file otherwise.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;java.io.File</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#scratchDirFor(java.lang.String)">scratchDirFor</A></B>(java.lang.String&nbsp;key)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Utility method to return a scratch dir for the given key's temp files.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#singleLineReport()">singleLineReport</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Return a short single-line summary report as a String.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#start()">start</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Request that Frontier allow crawling to begin.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#succeededFetchCount()">succeededFetchCount</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(non-Javadoc)</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#terminate()">terminate</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Notify Frontier that it should end the crawl, giving any worker ToeThread that askss for a next() an  EndedException.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#totalBytesWritten()">totalBytesWritten</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Total number of bytes contained in all URIs that have been processed.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AbstractFrontier.html#unpause()">unpause</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Resumes the release of URIs to crawl, allowing worker ToeThreads to proceed.</TD></TR></TABLE>&nbsp;<A NAME="methods_inherited_from_class_org.archive.crawler.settings.ModuleType"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Methods inherited from class org.archive.crawler.settings.<A HREF="../../../../org/archive/crawler/settings/ModuleType.html" title="class in org.archive.crawler.settings">ModuleType</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/settings/ModuleType.html#addElement(org.archive.crawler.settings.CrawlerSettings, org.archive.crawler.settings.Type)">addElement</A>, <A HREF="../../../../org/archive/crawler/settings/ModuleType.html#listUsedFiles(java.util.List)">listUsedFiles</A></CODE></TD></TR></TABLE>&nbsp;<A NAME="methods_inherited_from_class_org.archive.crawler.settings.ComplexType"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Methods inherited from class org.archive.crawler.settings.<A HREF="../../../../org/archive/crawler/settings/ComplexType.html" title="class in org.archive.crawler.settings">ComplexType</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/settings/ComplexType.html#addElementToDefinition(org.archive.crawler.settings.Type)">addElementToDefinition</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#checkValue(org.archive.crawler.settings.CrawlerSettings, java.lang.String, java.lang.Object)">checkValue</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#earlyInitialize(org.archive.crawler.settings.CrawlerSettings)">earlyInitialize</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAbsoluteName()">getAbsoluteName</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttribute(java.lang.Object, java.lang.String)">getAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttribute(java.lang.String)">getAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttribute(java.lang.String, org.archive.crawler.datamodel.CrawlURI)">getAttribute</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributeInfo(org.archive.crawler.settings.CrawlerSettings, java.lang.String)">getAttributeInfo</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributeInfo(java.lang.String)">getAttributeInfo</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributeInfoIterator(java.lang.Object)">getAttributeInfoIterator</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getAttributes(java.lang.String[])">getAttributes</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getDataContainerRecursive(org.archive.crawler.settings.ComplexType.Context)">getDataContainerRecursive</A>, <A HREF="../../../../org/archive/crawler/settings/ComplexType.html#getDataContainerRecursive(org.archive.crawler.setti

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -