robotsexclusionpolicy.html

From "Web Crawler Open-Source Code" · HTML code · 400 lines total · page 1 of 2

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><!--NewPage--><HTML><HEAD><!-- Generated by javadoc (build 1.5.0_07) on Sun May 06 17:59:54 GMT 2007 --><TITLE>RobotsExclusionPolicy (Heritrix 1.12.1)</TITLE><META NAME="keywords" CONTENT="org.archive.crawler.datamodel.RobotsExclusionPolicy class"><LINK REL ="stylesheet" TYPE="text/css" HREF="../../../../stylesheet.css" TITLE="Style"><SCRIPT type="text/javascript">function windowTitle(){    parent.document.title="RobotsExclusionPolicy (Heritrix 1.12.1)";}</SCRIPT><NOSCRIPT></NOSCRIPT></HEAD><BODY BGCOLOR="white" onload="windowTitle();"><!-- ========= START OF TOP NAVBAR ======= --><A NAME="navbar_top"><!-- --></A><A HREF="#skip-navbar_top" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY=""><TR><TD COLSPAN=2 BGCOLOR="#EEEEFF" CLASS="NavBarCell1"><A NAME="navbar_top_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY="">  <TR ALIGN="center" VALIGN="top">  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> &nbsp;<FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="class-use/RobotsExclusionPolicy.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../index-all.html"><FONT 
CLASS="NavBarFont1"><B>Index</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A>&nbsp;</TD>  </TR></TABLE></TD><TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM></EM></TD></TR><TR><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">&nbsp;<A HREF="../../../../org/archive/crawler/datamodel/LocalizedError.html" title="class in org.archive.crawler.datamodel"><B>PREV CLASS</B></A>&nbsp;&nbsp;<A HREF="../../../../org/archive/crawler/datamodel/RobotsHonoringPolicy.html" title="class in org.archive.crawler.datamodel"><B>NEXT CLASS</B></A></FONT></TD><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">  <A HREF="../../../../index.html?org/archive/crawler/datamodel/RobotsExclusionPolicy.html" target="_top"><B>FRAMES</B></A>  &nbsp;&nbsp;<A HREF="RobotsExclusionPolicy.html" target="_top"><B>NO FRAMES</B></A>  &nbsp;&nbsp;<SCRIPT type="text/javascript">  <!--  if(window==top) {    document.writeln('<A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A>');  }  //--></SCRIPT><NOSCRIPT>  <A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A></NOSCRIPT></FONT></TD></TR><TR><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">  SUMMARY:&nbsp;NESTED&nbsp;|&nbsp;<A HREF="#field_summary">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_summary">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_summary">METHOD</A></FONT></TD><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">DETAIL:&nbsp;<A HREF="#field_detail">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_detail">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_detail">METHOD</A></FONT></TD></TR></TABLE><A NAME="skip-navbar_top"></A><!-- ========= END OF TOP NAVBAR ========= --><HR><!-- ======== START OF CLASS DATA ======== --><H2><FONT SIZE="-1">org.archive.crawler.datamodel</FONT><BR>Class RobotsExclusionPolicy</H2><PRE>java.lang.Object  <IMG SRC="../../../../resources/inherit.gif" ALT="extended by 
"><B>org.archive.crawler.datamodel.RobotsExclusionPolicy</B></PRE><DL><DT><B>All Implemented Interfaces:</B> <DD>java.io.Serializable</DD></DL><HR><DL><DT><PRE>public class <B>RobotsExclusionPolicy</B><DT>extends java.lang.Object<DT>implements java.io.Serializable</DL></PRE><P>RobotsExclusionPolicy represents the actual policy adopted with  respect to a specific remote server, usually constructed from  consulting the robots.txt, if any, the server provided.   (The similarly named RobotsHonoringPolicy, on the other hand,  describes the strategy used by the crawler to determine to what extent it respects exclusion rules.)  The expiration of policies after a suitable amount of time has elapsed since last fetch is handled outside this class, in  CrawlServer itself.<P><P><DL><DT><B>Author:</B></DT>  <DD>gojomo</DD><DT><B>See Also:</B><DD><A HREF="../../../../serialized-form.html#org.archive.crawler.datamodel.RobotsExclusionPolicy">Serialized Form</A></DL><HR><P><!-- =========== FIELD SUMMARY =========== --><A NAME="field_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Field Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;<A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html" title="class in org.archive.crawler.datamodel">RobotsExclusionPolicy</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html#ALLOWALL">ALLOWALL</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;<A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html" title="class in 
org.archive.crawler.datamodel">RobotsExclusionPolicy</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html#DENYALL">DENYALL</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>(package private) &nbsp;<A HREF="../../../../org/archive/crawler/datamodel/RobotsHonoringPolicy.html" title="class in org.archive.crawler.datamodel">RobotsHonoringPolicy</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html#honoringPolicy">honoringPolicy</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR></TABLE>&nbsp;<!-- ======== CONSTRUCTOR SUMMARY ======== --><A NAME="constructor_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Constructor Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html#RobotsExclusionPolicy(org.archive.crawler.settings.CrawlerSettings, java.util.LinkedList, java.util.HashMap, org.archive.crawler.datamodel.RobotsHonoringPolicy)">RobotsExclusionPolicy</A></B>(<A HREF="../../../../org/archive/crawler/settings/CrawlerSettings.html" title="class in org.archive.crawler.settings">CrawlerSettings</A>&nbsp;settings,                      java.util.LinkedList&lt;java.lang.String&gt;&nbsp;u,                      java.util.HashMap&lt;java.lang.String,java.util.List&lt;java.lang.String&gt;&gt;&nbsp;d,                      <A HREF="../../../../org/archive/crawler/datamodel/RobotsHonoringPolicy.html" title="class in 
org.archive.crawler.datamodel">RobotsHonoringPolicy</A>&nbsp;honoringPolicy)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html#RobotsExclusionPolicy(int)">RobotsExclusionPolicy</A></B>(int&nbsp;type)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR></TABLE>&nbsp;<!-- ========== METHOD SUMMARY =========== --><A NAME="method_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Method Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;boolean</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/datamodel/RobotsExclusionPolicy.html#disallows(org.archive.crawler.datamodel.CrawlURI, java.lang.String)">disallows</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A>&nbsp;curi,          java.lang.String&nbsp;userAgent)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR>
