linksscoper.html

From "Web Crawler Open Source Code" · HTML code · 554 lines total · page 1 of 3

<DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#org.archive.crawler.postprocessor.LinksScoper.ATTR_PREFERENCE_DEPTH_HOPS">Constant Field Values</A></DL></DL><!-- ========= CONSTRUCTOR DETAIL ======== --><A NAME="constructor_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="1"><FONT SIZE="+2"><B>Constructor Detail</B></FONT></TH></TR></TABLE><A NAME="LinksScoper(java.lang.String)"><!-- --></A><H3>LinksScoper</H3><PRE>public <B>LinksScoper</B>(java.lang.String&nbsp;name)</PRE><DL><DL><DT><B>Parameters:</B><DD><CODE>name</CODE> - Name of this filter.</DL></DL><!-- ============ METHOD DETAIL ========== --><A NAME="method_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="1"><FONT SIZE="+2"><B>Method Detail</B></FONT></TH></TR></TABLE><A NAME="innerProcess(org.archive.crawler.datamodel.CrawlURI)"><!-- --></A><H3>innerProcess</H3><PRE>protected void <B>innerProcess</B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</PRE><DL><DD><B>Description copied from class: <CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html#innerProcess(org.archive.crawler.datamodel.CrawlURI)">Processor</A></CODE></B></DD><DD>Classes subclassing this one should override this method to perform their custom actions on the CrawlURI.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html#innerProcess(org.archive.crawler.datamodel.CrawlURI)">innerProcess</A></CODE> in class <CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html" title="class in org.archive.crawler.framework">Processor</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>curi</CODE> - The CrawlURI being 
processed.</DL></DD></DL><HR><A NAME="handlePrerequisite(org.archive.crawler.datamodel.CrawlURI)"><!-- --></A><H3>handlePrerequisite</H3><PRE>protected void <B>handlePrerequisite</B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi)</PRE><DL><DD>The CrawlURI has a prerequisite; apply scoping and update Link to CandidateURI in manner analogous to outlink handling.<P><DD><DL></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>curi</CODE> - CrawlURI with prereq to consider</DL></DD></DL><HR><A NAME="outOfScope(org.archive.crawler.datamodel.CandidateURI)"><!-- --></A><H3>outOfScope</H3><PRE>protected void <B>outOfScope</B>(<A HREF="../../../../org/archive/crawler/datamodel/CandidateURI.html" title="class in org.archive.crawler.datamodel">CandidateURI</A>&nbsp;caUri)</PRE><DL><DD><B>Description copied from class: <CODE><A HREF="../../../../org/archive/crawler/framework/Scoper.html#outOfScope(org.archive.crawler.datamodel.CandidateURI)">Scoper</A></CODE></B></DD><DD>Called when a CandidateUri is ruled out of scope. 
Override if you don't want logs as coming from this class.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A HREF="../../../../org/archive/crawler/framework/Scoper.html#outOfScope(org.archive.crawler.datamodel.CandidateURI)">outOfScope</A></CODE> in class <CODE><A HREF="../../../../org/archive/crawler/framework/Scoper.html" title="class in org.archive.crawler.framework">Scoper</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>caUri</CODE> - CandidateURI that is out of scope.</DL></DD></DL><HR><A NAME="getRejectLogRules(java.lang.Object)"><!-- --></A><H3>getRejectLogRules</H3><PRE>protected <A HREF="../../../../org/archive/crawler/deciderules/DecideRule.html" title="class in org.archive.crawler.deciderules">DecideRule</A> <B>getRejectLogRules</B>(java.lang.Object&nbsp;o)</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="getSchedulingFor(org.archive.crawler.datamodel.CrawlURI, org.archive.crawler.extractor.Link, int)"><!-- --></A><H3>getSchedulingFor</H3><PRE>protected int <B>getSchedulingFor</B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi,                               <A HREF="../../../../org/archive/crawler/extractor/Link.html" title="class in org.archive.crawler.extractor">Link</A>&nbsp;wref,                               int&nbsp;preferenceDepthHops)</PRE><DL><DD>Determine scheduling for the  <code>curi</code>. As with the LinksScoper in general, this only handles extracted links, seeds do not pass through here, but are given MEDIUM priority.   
Imports into the frontier similarly do not pass through here,  but are given NORMAL priority.<P><DD><DL></DL></DD><DD><DL></DL></DD></DL><!-- ========= END OF CLASS DATA ========= --><HR><!-- ======= START OF BOTTOM NAVBAR ====== --><A NAME="navbar_bottom"><!-- --></A><A HREF="#skip-navbar_bottom" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY=""><TR><TD COLSPAN=2 BGCOLOR="#EEEEFF" CLASS="NavBarCell1"><A NAME="navbar_bottom_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY="">  <TR ALIGN="center" VALIGN="top">  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> &nbsp;<FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="class-use/LinksScoper.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../index-all.html"><FONT CLASS="NavBarFont1"><B>Index</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A>&nbsp;</TD>  </TR></TABLE></TD><TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM></EM></TD></TR><TR><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">&nbsp;<A HREF="../../../../org/archive/crawler/postprocessor/ImageWaitEvaluator.html" title="class in 
org.archive.crawler.postprocessor"><B>PREV CLASS</B></A>&nbsp;&nbsp;<A HREF="../../../../org/archive/crawler/postprocessor/LowDiskPauseProcessor.html" title="class in org.archive.crawler.postprocessor"><B>NEXT CLASS</B></A></FONT></TD><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">  <A HREF="../../../../index.html?org/archive/crawler/postprocessor/LinksScoper.html" target="_top"><B>FRAMES</B></A>  &nbsp;&nbsp;<A HREF="LinksScoper.html" target="_top"><B>NO FRAMES</B></A>  &nbsp;&nbsp;<SCRIPT type="text/javascript">  <!--  if(window==top) {    document.writeln('<A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A>');  }  //--></SCRIPT><NOSCRIPT>  <A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A></NOSCRIPT></FONT></TD></TR><TR><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">  SUMMARY:&nbsp;<A HREF="#nested_classes_inherited_from_class_org.archive.crawler.settings.ComplexType">NESTED</A>&nbsp;|&nbsp;<A HREF="#field_summary">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_summary">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_summary">METHOD</A></FONT></TD><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">DETAIL:&nbsp;<A HREF="#field_detail">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_detail">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_detail">METHOD</A></FONT></TD></TR></TABLE><A NAME="skip-navbar_bottom"></A><!-- ======== END OF BOTTOM NAVBAR ======= --><HR>Copyright &copy; 2003-2007 Internet Archive. All Rights Reserved.</BODY></HTML>
