📄 lexicalcrawlmapper.html

Written in Java. This was left over from an experiment; I was going to delete it, but I am uploading it here to share with everyone.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><!--NewPage--><HTML><HEAD><!-- Generated by javadoc (build 1.5.0_06) on Wed Sep 27 16:03:11 PDT 2006 --><TITLE>LexicalCrawlMapper (Heritrix 1.10.1)</TITLE><META NAME="keywords" CONTENT="org.archive.crawler.processor.LexicalCrawlMapper class"><LINK REL ="stylesheet" TYPE="text/css" HREF="../../../../stylesheet.css" TITLE="Style"><SCRIPT type="text/javascript">function windowTitle(){    parent.document.title="LexicalCrawlMapper (Heritrix 1.10.1)";}</SCRIPT><NOSCRIPT></NOSCRIPT></HEAD><BODY BGCOLOR="white" onload="windowTitle();"><!-- ========= START OF TOP NAVBAR ======= --><A NAME="navbar_top"><!-- --></A><A HREF="#skip-navbar_top" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY=""><TR><TD COLSPAN=2 BGCOLOR="#EEEEFF" CLASS="NavBarCell1"><A NAME="navbar_top_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY="">  <TR ALIGN="center" VALIGN="top">  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> &nbsp;<FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="class-use/LexicalCrawlMapper.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../index-all.html"><FONT 
CLASS="NavBarFont1"><B>Index</B></FONT></A>&nbsp;</TD>  <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1">    <A HREF="../../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A>&nbsp;</TD>  </TR></TABLE></TD><TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM></EM></TD></TR><TR><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">&nbsp;<A HREF="../../../../org/archive/crawler/processor/HashCrawlMapper.html" title="class in org.archive.crawler.processor"><B>PREV CLASS</B></A>&nbsp;&nbsp;NEXT CLASS</FONT></TD><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2">  <A HREF="../../../../index.html?org/archive/crawler/processor/LexicalCrawlMapper.html" target="_top"><B>FRAMES</B></A>  &nbsp;&nbsp;<A HREF="LexicalCrawlMapper.html" target="_top"><B>NO FRAMES</B></A>  &nbsp;&nbsp;<SCRIPT type="text/javascript">  <!--  if(window==top) {    document.writeln('<A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A>');  }  //--></SCRIPT><NOSCRIPT>  <A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A></NOSCRIPT></FONT></TD></TR><TR><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">  SUMMARY:&nbsp;<A HREF="#nested_classes_inherited_from_class_org.archive.crawler.settings.ComplexType">NESTED</A>&nbsp;|&nbsp;<A HREF="#field_summary">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_summary">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_summary">METHOD</A></FONT></TD><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">DETAIL:&nbsp;<A HREF="#field_detail">FIELD</A>&nbsp;|&nbsp;<A HREF="#constructor_detail">CONSTR</A>&nbsp;|&nbsp;<A HREF="#method_detail">METHOD</A></FONT></TD></TR></TABLE><A NAME="skip-navbar_top"></A><!-- ========= END OF TOP NAVBAR ========= --><HR><!-- ======== START OF CLASS DATA ======== --><H2><FONT SIZE="-1">org.archive.crawler.processor</FONT><BR>Class LexicalCrawlMapper</H2><PRE>java.lang.Object  <IMG SRC="../../../../resources/inherit.gif" ALT="extended by ">javax.management.Attribute      <IMG 
SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/settings/Type.html" title="class in org.archive.crawler.settings">org.archive.crawler.settings.Type</A>          <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/settings/ComplexType.html" title="class in org.archive.crawler.settings">org.archive.crawler.settings.ComplexType</A>              <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/settings/ModuleType.html" title="class in org.archive.crawler.settings">org.archive.crawler.settings.ModuleType</A>                  <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/framework/Processor.html" title="class in org.archive.crawler.framework">org.archive.crawler.framework.Processor</A>                      <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><A HREF="../../../../org/archive/crawler/processor/CrawlMapper.html" title="class in org.archive.crawler.processor">org.archive.crawler.processor.CrawlMapper</A>                          <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><B>org.archive.crawler.processor.LexicalCrawlMapper</B></PRE><DL><DT><B>All Implemented Interfaces:</B> <DD>java.io.Serializable, javax.management.DynamicMBean, <A HREF="../../../../org/archive/crawler/datamodel/FetchStatusCodes.html" title="interface in org.archive.crawler.datamodel">FetchStatusCodes</A></DD></DL><HR><DL><DT><PRE>public class <B>LexicalCrawlMapper</B><DT>extends <A HREF="../../../../org/archive/crawler/processor/CrawlMapper.html" title="class in org.archive.crawler.processor">CrawlMapper</A></DL></PRE><P>A simple crawl splitter/mapper, dividing up CandidateURIs/CrawlURIs between crawlers by diverting some range of URIs to local log files (which can then be imported to other crawlers).   
The mapper may operate on a CrawlURI (typically early in the processing chain), on its CandidateURI outlinks (late in the processing chain, after LinksScoper), or on both (if inserted and configured in both places). <p>Uses lexical comparisons of classKeys to map URIs to crawlers. The 'map' is specified via either a local or an HTTP-fetchable file. Each line of this file should contain two space-separated tokens: first a key, then a crawler node name (which should be legal as part of a filename). Each URI is mapped to the crawler node name associated with the nearest mapping key equal to or subsequent to the URI's own classKey. If no mapping key is equal to or after the classKey, the mapping 'wraps around' to the first mapping key. <p>One crawler name is distinguished as the 'local name'; URIs mapped to this name are not diverted, but continue to be processed normally. <p>For example, assume a SurtAuthorityQueueAssignmentPolicy and a simple mapping file: <pre>
  d crawlerA
  ~ crawlerB
</pre> <p>All URIs with "com," classKeys will find the 'd' key as the nearest subsequent mapping key, and thus be mapped to 'crawlerA'. If that is the 'local name', the URIs will be processed normally; otherwise, each URI will be written to a diversion log destined for 'crawlerA'.   
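<p>The lookup rule described above can be sketched with a sorted map. This is a minimal illustration under stated assumptions, not the Heritrix implementation: the class name <code>LexicalMapSketch</code> and its methods <code>addLine</code> and <code>nodeFor</code> are hypothetical, keys are compared with Java's natural <code>String</code> ordering, and reading the map from a file or URL and writing diversion logs are left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of "nearest key equal to or after the classKey,
// wrapping around to the first key". Not taken from Heritrix source.
public class LexicalMapSketch {
    // Sorted by natural String order, which is a lexical comparison.
    private final TreeMap<String, String> keyToNode = new TreeMap<>();

    // Parses one mapping line of the form "<key> <crawler node name>".
    public void addLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != 2) {
            throw new IllegalArgumentException("expected two tokens: " + line);
        }
        keyToNode.put(tokens[0], tokens[1]);
    }

    // Returns the node name for the nearest mapping key >= classKey,
    // or the node of the first mapping key if no such key exists.
    public String nodeFor(String classKey) {
        if (keyToNode.isEmpty()) {
            throw new IllegalStateException("no mapping keys loaded");
        }
        Map.Entry<String, String> entry = keyToNode.ceilingEntry(classKey);
        if (entry == null) {
            entry = keyToNode.firstEntry(); // wrap around
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        LexicalMapSketch mapper = new LexicalMapSketch();
        mapper.addLine("d crawlerA");
        mapper.addLine("~ crawlerB");
        // Sample classKeys are illustrative SURT-style strings.
        for (String key : new String[] {"com,example,", "org,archive,", "~~"}) {
            System.out.println(key + " -> " + mapper.nodeFor(key));
        }
    }
}
```

<p>With the two-line mapping file from the example, a classKey such as <code>com,example,</code> resolves to crawlerA and <code>org,archive,</code> resolves to crawlerB, while a key sorting after <code>~</code> (here <code>~~</code>) wraps around to crawlerA. A caller would then compare the returned name with the configured 'local name' to decide whether to process the URI or divert it.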
<p>If using the JMX importUris operation importing URLs dropped by a <A HREF="../../../../org/archive/crawler/processor/LexicalCrawlMapper.html" title="class in org.archive.crawler.processor"><CODE>LexicalCrawlMapper</CODE></A> instance, use <code>recoveryLog</code> style.<P><P><DL><DT><B>Version:</B></DT>  <DD>$Date: 2006/08/08 07:15:24 $, $Revision: 1.1 $</DD><DT><B>Author:</B></DT>  <DD>gojomo</DD><DT><B>See Also:</B><DD><A HREF="../../../../serialized-form.html#org.archive.crawler.processor.LexicalCrawlMapper">Serialized Form</A></DL><HR><P><!-- ======== NESTED CLASS SUMMARY ======== --><A NAME="nested_class_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Nested Class Summary</B></FONT></TH></TR></TABLE>&nbsp;<A NAME="nested_classes_inherited_from_class_org.archive.crawler.settings.ComplexType"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Nested classes/interfaces inherited from class org.archive.crawler.settings.<A HREF="../../../../org/archive/crawler/settings/ComplexType.html" title="class in org.archive.crawler.settings">ComplexType</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/settings/ComplexType.MBeanAttributeInfoIterator.html" title="class in org.archive.crawler.settings">ComplexType.MBeanAttributeInfoIterator</A></CODE></TD></TR></TABLE>&nbsp;<!-- =========== FIELD SUMMARY =========== --><A NAME="field_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Field Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT 
SIZE="-1"><CODE>static&nbsp;java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/processor/LexicalCrawlMapper.html#ATTR_MAP_SOURCE">ATTR_MAP_SOURCE</A></B></CODE>
