
📄 adaptiverevisitfrontier.html

📁 Written in Java. This was left over from an experiment; I was going to delete it, but I am uploading it instead so everyone can share it.
<TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#batchSchedule(org.archive.crawler.datamodel.CandidateURI)">batchSchedule</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CandidateURI.html" title="class in org.archive.crawler.datamodel">CandidateURI</A>&nbsp;caUri)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#calculateSnoozeTime(org.archive.crawler.datamodel.CrawlURI)">calculateSnoozeTime</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Calculates how long a host queue needs to be snoozed following the crawling of a URI.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#canonicalize(org.archive.crawler.datamodel.CandidateURI)">canonicalize</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CandidateURI.html" title="class in org.archive.crawler.datamodel">CandidateURI</A>&nbsp;cauri)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Canonicalize passed CandidateURI.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#canonicalize(org.archive.net.UURI)">canonicalize</A></B>(<A HREF="../../../../org/archive/net/UURI.html" title="class in org.archive.net">UURI</A>&nbsp;uuri)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Canonicalize passed uuri.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;float</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#congestionRatio()">congestionRatio</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#considerIncluded(org.archive.net.UURI)">considerIncluded</A></B>(<A HREF="../../../../org/archive/net/UURI.html" title="class in org.archive.net">UURI</A>&nbsp;u)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Notify Frontier that it should consider the given UURI as if already scheduled.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#crawlCheckpoint(java.io.File)">crawlCheckpoint</A></B>(java.io.File&nbsp;checkpointDir)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Called by <A HREF="../../../../org/archive/crawler/framework/CrawlController.html" title="class in org.archive.crawler.framework"><CODE>CrawlController</CODE></A> when checkpointing.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#crawlEnded(java.lang.String)">crawlEnded</A></B>(java.lang.String&nbsp;sExitMessage)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Called when a CrawlController has ended a crawl and is about to exit.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#crawlEnding(java.lang.String)">crawlEnding</A></B>(java.lang.String&nbsp;sExitMessage)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Called when a CrawlController is ending a crawl (for any reason)</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#crawlPaused(java.lang.String)">crawlPaused</A></B>(java.lang.String&nbsp;statusMessage)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Called when a CrawlController is actually paused (all threads are idle).</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#crawlPausing(java.lang.String)">crawlPausing</A></B>(java.lang.String&nbsp;statusMessage)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Called when a CrawlController is going to be paused.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#crawlResuming(java.lang.String)">crawlResuming</A></B>(java.lang.String&nbsp;statusMessage)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Called when a CrawlController is resuming a crawl that had been paused.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#crawlStarted(java.lang.String)">crawlStarted</A></B>(java.lang.String&nbsp;message)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Called on crawl start.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;<A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html" title="interface in org.archive.crawler.datamodel">UriUniqFilter</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#createAlreadyIncluded()">createAlreadyIncluded</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Create a UriUniqFilter that will serve as record  of already seen URIs.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#deepestUri()">deepestUri</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#deleted(org.archive.crawler.datamodel.CrawlURI)">deleted</A></B>(<A 
HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Notify Frontier that a CrawlURI has been deleted outside of the normal next()/finished() lifecycle.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#deleteURIs(java.lang.String)">deleteURIs</A></B>(java.lang.String&nbsp;match)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Delete any URI that matches the given regular expression from the list of discovered and pending URIs.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#discoveredUriCount()">discoveredUriCount</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Number of <i>discovered</i> URIs.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#disregardDisposition(org.archive.crawler.datamodel.CrawlURI)">disregardDisposition</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#disregardedUriCount()">disregardedUriCount</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Number of URIs that were scheduled at one point but have been <i>disregarded</i>.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#failedFetchCount()">failedFetchCount</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Number of URIs that <i>failed</i> to process.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/frontier/AdaptiveRevisitFrontier.html#failureDisposition(org.archive.crawler.datamodel.CrawlURI)">failureDisposition</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The CrawlURI has encountered a problem, and will not be retried.</TD></TR>
