
📄 heritrix.html

📁 Written in Java. This was left over from an experiment; I was going to delete it, but I am uploading it instead so everyone can share it.
💻 HTML
📖 Page 1 of 5
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><!--NewPage--><HTML><HEAD><!-- Generated by javadoc (build 1.5.0_06) on Wed Sep 27 16:03:03 PDT 2006 --><TITLE>Heritrix (Heritrix 1.10.1)</TITLE><META NAME="keywords" CONTENT="org.archive.crawler.Heritrix class"><LINK REL ="stylesheet" TYPE="text/css" HREF="../../../stylesheet.css" TITLE="Style"><SCRIPT type="text/javascript">function windowTitle(){    parent.document.title="Heritrix (Heritrix 1.10.1)";}</SCRIPT><NOSCRIPT></NOSCRIPT></HEAD><BODY BGCOLOR="white" onload="windowTitle();"><!-- ========= START OF TOP NAVBAR ======= --><A NAME="navbar_top"><!-- --></A><A HREF="#skip-navbar_top" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY=""><TR><TD COLSPAN=2 BGCOLOR="#EEEEFF" CLASS="NavBarCell1"><A NAME="navbar_top_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY="">  <TR ALIGN="center" VALIGN="top">  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> &nbsp;<FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="class-use/Heritrix.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../index-all.html"><FONT CLASS="NavBarFont1"><B>Index</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" 
CLASS="NavBarCell1">    <A HREF="../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A>&nbsp;</TD>  </TR></TABLE></TD><TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM></EM></TD></TR><TR><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">&nbsp;<A HREF="../../../org/archive/crawler/CommandLineParser.HeritrixHelpFormatter.html" title="class in org.archive.crawler"><B>PREV CLASS</B></A>&nbsp;&nbsp;<A HREF="../../../org/archive/crawler/SimpleHttpServer.html" title="class in org.archive.crawler"><B>NEXT CLASS</B></A></FONT></TD><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">  <A HREF="../../../index.html?org/archive/crawler/Heritrix.html" target="_top"><B>FRAMES</B></A>  &nbsp;&nbsp;<A HREF="Heritrix.html" target="_top"><B>NO FRAMES</B></A>  &nbsp;&nbsp;<SCRIPT type="text/javascript">  <!--  if(window==top) {    document.writeln('<A HREF="../../../allclasses-noframe.html"><B>All Classes</B></A>');  }  //--></SCRIPT><NOSCRIPT>  <A HREF="../../../allclasses-noframe.html"><B>All Classes</B></A></NOSCRIPT></FONT></TD></TR><TR><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">  SUMMARY:&nbsp;NESTED&nbsp;|&nbsp;<A HREF="#field_summary">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_summary">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_summary">METHOD</A></FONT></TD><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">DETAIL:&nbsp;<A HREF="#field_detail">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_detail">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_detail">METHOD</A></FONT></TD></TR></TABLE><A NAME="skip-navbar_top"></A><!-- ========= END OF TOP NAVBAR ========= --><HR><!-- ======== START OF CLASS DATA ======== --><H2><FONT SIZE="-1">org.archive.crawler</FONT><BR>Class Heritrix</H2><PRE>java.lang.Object  <IMG SRC="../../../resources/inherit.gif" ALT="extended by "><B>org.archive.crawler.Heritrix</B></PRE><DL><DT><B>All Implemented Interfaces:</B> <DD>javax.management.DynamicMBean, javax.management.MBeanRegistration</DD></DL><HR><DL><DT><PRE>public class 
<B>Heritrix</B><DT>extends java.lang.Object<DT>implements javax.management.DynamicMBean, javax.management.MBeanRegistration</DL></PRE><P>Main class for the Heritrix crawler. Heritrix is usually launched by a shell script that backgrounds Heritrix and redirects all stdout and stderr it emits to a log file.  So that startup messages emitted after the redirection of stdout and stderr still show on the console, this class prints usage or startup output, such as where the web UI can be found, to a STARTLOG file that the shell script is waiting on.  As soon as the shell script sees output in this file, it prints its content and breaks out of its wait. See ${HERITRIX_HOME}/bin/heritrix.  <p>Heritrix can also be embedded or launched by webapp initialization or by JMX bootstrapping.  So far I count four methods of instantiation: <ol> <li>From this class's main method (the method usually used);</li> <li>From the Heritrix UI (the local-instances.jsp page);</li> <li>By a JMX agent at the behest of a remote JMX client; and</li> <li>By a container such as Tomcat or JBoss.</li> </ol><P><P><DL><DT><B>Author:</B></DT>  <DD>gojomo, Kristinn Sigurdsson, Stack</DD></DL><HR><P><!-- =========== FIELD SUMMARY =========== --><A NAME="field_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Field Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>static&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#DEFAULT_ENCODING">DEFAULT_ENCODING</A></B></CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Default encoding.</TD></TR></TABLE>&nbsp;<!-- ======== CONSTRUCTOR SUMMARY ======== --><A NAME="constructor_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" 
SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Constructor Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#Heritrix()">Heritrix</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Constructor.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#Heritrix(boolean)">Heritrix</A></B>(boolean&nbsp;jmxregister)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#Heritrix(java.lang.String, boolean)">Heritrix</A></B>(java.lang.String&nbsp;name,         boolean&nbsp;jmxregister)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Constructor.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#Heritrix(java.lang.String, boolean, org.archive.crawler.admin.CrawlJobHandler)">Heritrix</A></B>(java.lang.String&nbsp;name,         boolean&nbsp;jmxregister,         <A HREF="../../../org/archive/crawler/admin/CrawlJobHandler.html" title="class in org.archive.crawler.admin">CrawlJobHandler</A>&nbsp;cjh)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Constructor.</TD></TR></TABLE>&nbsp;<!-- ========== METHOD SUMMARY =========== --><A NAME="method_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Method Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;<A HREF="../../../org/archive/crawler/admin/CrawlJob.html" title="class in 
org.archive.crawler.admin">CrawlJob</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addCrawlJob(org.archive.crawler.admin.CrawlJob)">addCrawlJob</A></B>(<A HREF="../../../org/archive/crawler/admin/CrawlJob.html" title="class in org.archive.crawler.admin">CrawlJob</A>&nbsp;job)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addCrawlJob(java.io.File, java.lang.String, java.lang.String, java.lang.String)">addCrawlJob</A></B>(java.io.File&nbsp;order,            java.lang.String&nbsp;name,            java.lang.String&nbsp;description,            java.lang.String&nbsp;seeds)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addCrawlJob(java.lang.String, java.lang.String, java.lang.String, java.lang.String)">addCrawlJob</A></B>(java.lang.String&nbsp;orderPathOrUrl,            java.lang.String&nbsp;name,            java.lang.String&nbsp;description,            java.lang.String&nbsp;seeds)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;This method is called when we have an order file to hand that we want to base a job on.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addCrawlJob(java.net.URL, java.net.HttpURLConnection, java.lang.String, java.lang.String, java.lang.String)">addCrawlJob</A></B>(java.net.URL&nbsp;url,            
java.net.HttpURLConnection&nbsp;connection,            java.lang.String&nbsp;name,            java.lang.String&nbsp;description,            java.lang.String&nbsp;seeds)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;<A HREF="../../../org/archive/crawler/admin/CrawlJob.html" title="class in org.archive.crawler.admin">CrawlJob</A></CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addCrawlJobBasedOn(java.io.File, java.lang.String, java.lang.String, java.lang.String)">addCrawlJobBasedOn</A></B>(java.io.File&nbsp;orderFile,                   java.lang.String&nbsp;name,                   java.lang.String&nbsp;description,                   java.lang.String&nbsp;seeds)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addCrawlJobBasedOn(java.lang.String, java.lang.String, java.lang.String, java.lang.String)">addCrawlJobBasedOn</A></B>(java.lang.String&nbsp;jobUidOrProfile,                   java.lang.String&nbsp;name,                   java.lang.String&nbsp;description,                   java.lang.String&nbsp;seeds)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addCrawlJobBasedonJar(java.io.File, java.lang.String, java.lang.String, java.lang.String)">addCrawlJobBasedonJar</A></B>(java.io.File&nbsp;jarFile,                      java.lang.String&nbsp;name,                      
java.lang.String&nbsp;description,                      java.lang.String&nbsp;seeds)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Unpack the jar file and use it as the basis for a new job.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected static&nbsp;javax.management.ObjectName</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addGuiPort(javax.management.ObjectName)">addGuiPort</A></B>(javax.management.ObjectName&nbsp;name)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected static&nbsp;javax.management.ObjectName</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#addVitals(javax.management.ObjectName)">addVitals</A></B>(javax.management.ObjectName&nbsp;name)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add vital stats to the passed-in ObjectName.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;javax.management.openmbean.OpenMBeanInfoSupport</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#buildMBeanInfo()">buildMBeanInfo</A></B>()</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Build up the MBean info for Heritrix main.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected &nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#checkForEmptyPlaceHolder(java.lang.String)">checkForEmptyPlaceHolder</A></B>(java.lang.String&nbsp;str)</CODE><BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;If the passed str has the placeholder for the empty string, return the empty string; otherwise return the 
original.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected static&nbsp;void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../org/archive/crawler/Heritrix.html#configureTrustStore()">configureTrustStore</A></B>()</CODE>
