📄 fpmergeuriuniqfilter.html
<TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#createFp(java.lang.CharSequence)">createFp</A></B>(java.lang.CharSequence key)</CODE><BR> Create a fingerprint from the given key</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected abstract void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#finishFpMerge()">finishFpMerge</A></B>()</CODE><BR> Complete the merge of candidate and previously-known FPs (closing files/iterators as appropriate).</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#flush()">flush</A></B>()</CODE><BR> Perform a merge of all 'pending' items to the overall fingerprint list.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#forget(java.lang.String, org.archive.crawler.datamodel.CandidateURI)">forget</A></B>(java.lang.String key, <A HREF="../../../../org/archive/crawler/datamodel/CandidateURI.html" title="class in org.archive.crawler.datamodel">CandidateURI</A> value)</CODE><BR> Forget item was seen</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#note(java.lang.String)">note</A></B>(java.lang.String key)</CODE><BR> Note item as seen, without passing through to receiver.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#pend(long, org.archive.crawler.datamodel.CandidateURI)">pend</A></B>(long fp, <A HREF="../../../../org/archive/crawler/datamodel/CandidateURI.html" title="class in org.archive.crawler.datamodel">CandidateURI</A> value)</CODE><BR> Place the given FP/CandidateURI pair into the pending set, awaiting a merge to determine if it's actually accepted.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#pending()">pending</A></B>()</CODE><BR> Count of items added, but not yet filtered in or out.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#profileLog(java.lang.String)">profileLog</A></B>(java.lang.String key)</CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#requestFlush()">requestFlush</A></B>()</CODE><BR> Request that any pending items be added/dropped.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#setDestination(org.archive.crawler.datamodel.UriUniqFilter.HasUriReceiver)">setDestination</A></B>(<A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.HasUriReceiver.html" title="interface in org.archive.crawler.datamodel">UriUniqFilter.HasUriReceiver</A> receiver)</CODE><BR> Receiver of uniq URIs.</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" 
WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#setMaxPending(int)">setMaxPending</A></B>(int max)</CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE> void</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/util/FPMergeUriUniqFilter.html#setProfileLog(java.io.File)">setProfileLog</A></B>(java.io.File logfile)</CODE><BR> Set a File to receive a log for replay profiling.</TD></TR></TABLE> <A NAME="methods_inherited_from_class_java.lang.Object"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Methods inherited from class java.lang.Object</B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE>clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait</CODE></TD></TR></TABLE> <A NAME="methods_inherited_from_class_org.archive.crawler.datamodel.UriUniqFilter"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#EEEEFF" CLASS="TableSubHeadingColor"><TH ALIGN="left"><B>Methods inherited from interface org.archive.crawler.datamodel.<A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html" title="interface in org.archive.crawler.datamodel">UriUniqFilter</A></B></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html#count()">count</A></CODE></TD></TR></TABLE> <P><!-- ============ FIELD DETAIL =========== --><A NAME="field_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="1"><FONT SIZE="+2"><B>Field Detail</B></FONT></TH></TR></TABLE><A NAME="receiver"><!-- 
--></A><H3>receiver</H3><PRE>protected <A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.HasUriReceiver.html" title="interface in org.archive.crawler.datamodel">UriUniqFilter.HasUriReceiver</A> <B>receiver</B></PRE><DL><DL></DL></DL><HR><A NAME="profileLog"><!-- --></A><H3>profileLog</H3><PRE>protected java.io.PrintWriter <B>profileLog</B></PRE><DL><DL></DL></DL><HR><A NAME="quickDuplicateCount"><!-- --></A><H3>quickDuplicateCount</H3><PRE>protected long <B>quickDuplicateCount</B></PRE><DL><DL></DL></DL><HR><A NAME="quickDupAtLast"><!-- --></A><H3>quickDupAtLast</H3><PRE>protected long <B>quickDupAtLast</B></PRE><DL><DL></DL></DL><HR><A NAME="pendDuplicateCount"><!-- --></A><H3>pendDuplicateCount</H3><PRE>protected long <B>pendDuplicateCount</B></PRE><DL><DL></DL></DL><HR><A NAME="pendDupAtLast"><!-- --></A><H3>pendDupAtLast</H3><PRE>protected long <B>pendDupAtLast</B></PRE><DL><DL></DL></DL><HR><A NAME="mergeDuplicateCount"><!-- --></A><H3>mergeDuplicateCount</H3><PRE>protected long <B>mergeDuplicateCount</B></PRE><DL><DL></DL></DL><HR><A NAME="mergeDupAtLast"><!-- --></A><H3>mergeDupAtLast</H3><PRE>protected long <B>mergeDupAtLast</B></PRE><DL><DL></DL></DL><HR><A NAME="pendingSet"><!-- --></A><H3>pendingSet</H3><PRE>protected java.util.TreeSet <B>pendingSet</B></PRE><DL><DD>items awaiting merge TODO: consider only sorting just pre-merge TODO: consider using a fastutil long->Object class TODO: consider actually writing items to disk file, as in Najork/Heydon<P><DL></DL></DL><HR><A NAME="maxPending"><!-- --></A><H3>maxPending</H3><PRE>protected int <B>maxPending</B></PRE><DL><DD>size at which to force flush of pending items<P><DL></DL></DL><HR><A NAME="DEFAULT_MAX_PENDING"><!-- --></A><H3>DEFAULT_MAX_PENDING</H3><PRE>public static final int <B>DEFAULT_MAX_PENDING</B></PRE><DL><DL><DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#org.archive.crawler.util.FPMergeUriUniqFilter.DEFAULT_MAX_PENDING">Constant Field 
Values</A></DL></DL><HR><A NAME="nextFlushAllowableAfter"><!-- --></A><H3>nextFlushAllowableAfter</H3><PRE>protected long <B>nextFlushAllowableAfter</B></PRE><DL><DD>time-based throttle on flush-merge operations<P><DL></DL></DL><HR><A NAME="FLUSH_DELAY_FACTOR"><!-- --></A><H3>FLUSH_DELAY_FACTOR</H3><PRE>public static final long <B>FLUSH_DELAY_FACTOR</B></PRE><DL><DL><DT><B>See Also:</B><DD><A HREF="../../../../constant-values.html#org.archive.crawler.util.FPMergeUriUniqFilter.FLUSH_DELAY_FACTOR">Constant Field Values</A></DL></DL><HR><A NAME="quickCache"><!-- --></A><H3>quickCache</H3><PRE>protected <A HREF="../../../../org/archive/util/fingerprint/ArrayLongFPCache.html" title="class in org.archive.util.fingerprint">ArrayLongFPCache</A> <B>quickCache</B></PRE><DL><DD>cache of most recently seen FPs<P><DL></DL></DL><!-- ========= CONSTRUCTOR DETAIL ======== --><A NAME="constructor_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="1"><FONT SIZE="+2"><B>Constructor Detail</B></FONT></TH></TR></TABLE><A NAME="FPMergeUriUniqFilter()"><!-- --></A><H3>FPMergeUriUniqFilter</H3><PRE>public <B>FPMergeUriUniqFilter</B>()</PRE><DL></DL><!-- ============ METHOD DETAIL ========== --><A NAME="method_detail"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="1"><FONT SIZE="+2"><B>Method Detail</B></FONT></TH></TR></TABLE><A NAME="setMaxPending(int)"><!-- --></A><H3>setMaxPending</H3><PRE>public void <B>setMaxPending</B>(int max)</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="pending()"><!-- --></A><H3>pending</H3><PRE>public long <B>pending</B>()</PRE><DL><DD><B>Description copied from interface: <CODE><A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html#pending()">UriUniqFilter</A></CODE></B></DD><DD>Count of items added, 
but not yet filtered in or out. Some implementations may buffer up large numbers of pending items to be evaluated in a later large batch/scan/merge with disk files.<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html#pending()">pending</A></CODE> in interface <CODE><A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html" title="interface in org.archive.crawler.datamodel">UriUniqFilter</A></CODE></DL></DD><DD><DL><DT><B>Returns:</B><DD>Count of items added not yet evaluated</DL></DD></DL><HR><A NAME="setDestination(org.archive.crawler.datamodel.UriUniqFilter.HasUriReceiver)"><!-- --></A><H3>setDestination</H3><PRE>public void <B>setDestination</B>(<A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.HasUriReceiver.html" title="interface in org.archive.crawler.datamodel">UriUniqFilter.HasUriReceiver</A> receiver)</PRE><DL><DD><B>Description copied from interface: <CODE><A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html#setDestination(org.archive.crawler.datamodel.UriUniqFilter.HasUriReceiver)">UriUniqFilter</A></CODE></B></DD><DD>Receiver of uniq URIs. Items that have not been seen before are passed through to this object.<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html#setDestination(org.archive.crawler.datamodel.UriUniqFilter.HasUriReceiver)">setDestination</A></CODE> in interface <CODE><A HREF="../../../../org/archive/crawler/datamodel/UriUniqFilter.html" title="interface in org.archive.crawler.datamodel">UriUniqFilter</A></CODE></DL>
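<P>The pend/flush/receiver behaviour described on this page can be illustrated with a minimal, self-contained sketch. It is not the Heritrix implementation: the class <CODE>FpMergeSketch</CODE>, its <CODE>add</CODE> method, the use of <CODE>String.hashCode()</CODE> as a toy fingerprint, the <CODE>Consumer</CODE> receiver, and the meaning given to the value returned by <CODE>flush()</CODE> are all assumptions made for illustration.</P>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Consumer;

/**
 * Hypothetical, simplified stand-in for a fingerprint-merge uniqueness
 * filter. All names here are illustrative, not the Heritrix API.
 */
class FpMergeSketch {
    // Stands in for the overall list of previously known fingerprints.
    private final TreeSet<Long> known = new TreeSet<>();
    // Items awaiting merge, kept sorted by fingerprint.
    private final TreeMap<Long, String> pending = new TreeMap<>();
    // Size at which a flush of pending items is forced.
    private final int maxPending;
    // Destination for items that turn out to be new.
    private Consumer<String> receiver = uri -> { };

    FpMergeSketch(int maxPending) {
        this.maxPending = maxPending;
    }

    void setDestination(Consumer<String> receiver) {
        this.receiver = receiver;
    }

    // Toy fingerprint (assumption): a real filter would use a 64-bit hash.
    static long createFp(CharSequence key) {
        return key.toString().hashCode();
    }

    // Queue an item; a repeat of an already pending fingerprint is dropped.
    void add(String uri) {
        pending.putIfAbsent(createFp(uri), uri);
        if (pending.size() >= maxPending) {
            flush();
        }
    }

    // Count of items added but not yet filtered in or out.
    long pending() {
        return pending.size();
    }

    // Merge pending items into the known set, passing new ones through.
    // Returning the number of newly accepted items is an assumption.
    long flush() {
        long accepted = 0;
        for (Map.Entry<Long, String> e : pending.entrySet()) {
            if (known.add(e.getKey())) {
                receiver.accept(e.getValue());
                accepted++;
            }
        }
        pending.clear();
        return accepted;
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        FpMergeSketch filter = new FpMergeSketch(100);
        filter.setDestination(out::add);
        filter.add("http://example.com/a");
        filter.add("http://example.com/a"); // duplicate while still pending
        filter.flush();
        System.out.println(out);
    }
}
```

<P>In this sketch nothing reaches the receiver until a flush runs, which is why <CODE>pending()</CODE> can be non-zero while the receiver has seen no items yet.</P>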