<p> This method is guaranteed to be called after the crawl is set up, but before any URI-processing has occurred.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html#initialTasks()">initialTasks</A></CODE> in class <CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html" title="class in org.archive.crawler.framework">Processor</A></CODE></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="getSerialNo()"><!-- --></A><H3>getSerialNo</H3><PRE>protected java.util.concurrent.atomic.AtomicInteger <B>getSerialNo</B>()</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="setupPool(java.util.concurrent.atomic.AtomicInteger)"><!-- --></A><H3>setupPool</H3><PRE>protected abstract void <B>setupPool</B>(java.util.concurrent.atomic.AtomicInteger serialNo)</PRE><DL><DD>Set up pool of files.<P><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="innerProcess(org.archive.crawler.datamodel.CrawlURI)"><!-- --></A><H3>innerProcess</H3><PRE>protected abstract void <B>innerProcess</B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi)</PRE><DL><DD>Writes a CrawlURI and its associated data to a store file. 
Currently this method understands the following URI types: dns, http, and https.<P><DD><DL><DT><B>Overrides:</B><DD><CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html#innerProcess(org.archive.crawler.datamodel.CrawlURI)">innerProcess</A></CODE> in class <CODE><A HREF="../../../../org/archive/crawler/framework/Processor.html" title="class in org.archive.crawler.framework">Processor</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>curi</CODE> - CrawlURI to process.</DL></DD></DL><HR><A NAME="checkBytesWritten()"><!-- --></A><H3>checkBytesWritten</H3><PRE>protected void <B>checkBytesWritten</B>()</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="getHostAddress(org.archive.crawler.datamodel.CrawlURI)"><!-- --></A><H3>getHostAddress</H3><PRE>protected java.lang.String <B>getHostAddress</B>(<A HREF="../../../../org/archive/crawler/datamodel/CrawlURI.html" title="class in org.archive.crawler.datamodel">CrawlURI</A> curi)</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="getAttributeUnchecked(java.lang.String)"><!-- --></A><H3>getAttributeUnchecked</H3><PRE>public java.lang.Object <B>getAttributeUnchecked</B>(java.lang.String name)</PRE><DL><DD>Version of getAttributes that catches and logs exceptions and returns null if the attribute cannot be fetched.<P><DD><DL></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>name</CODE> - Attribute name.<DT><B>Returns:</B><DD>Attribute or null.</DL></DD></DL><HR><A NAME="getMaxSize()"><!-- --></A><H3>getMaxSize</H3><PRE>public int <B>getMaxSize</B>()</PRE><DL><DD>Maximum size, in bytes, that we want files to be. Default is ARCConstants.DEFAULT_MAX_ARC_FILE_SIZE. 
Note that ARC files will usually be bigger than maxSize; they'll be maxSize + length to next boundary.<P><DD><DL></DL></DD><DD><DL><DT><B>Returns:</B><DD>ARC maximum size.</DL></DD></DL><HR><A NAME="getPrefix()"><!-- --></A><H3>getPrefix</H3><PRE>public java.lang.String <B>getPrefix</B>()</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="getOutputDirs()"><!-- --></A><H3>getOutputDirs</H3><PRE>public java.util.List <B>getOutputDirs</B>()</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="isCompressed()"><!-- --></A><H3>isCompressed</H3><PRE>public boolean <B>isCompressed</B>()</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="getPoolMaximumActive()"><!-- --></A><H3>getPoolMaximumActive</H3><PRE>public int <B>getPoolMaximumActive</B>()</PRE><DL><DD><DL></DL></DD><DD><DL><DT><B>Returns:</B><DD>Returns the poolMaximumActive.</DL></DD></DL><HR><A NAME="getPoolMaximumWait()"><!-- --></A><H3>getPoolMaximumWait</H3><PRE>public int <B>getPoolMaximumWait</B>()</PRE><DL><DD><DL></DL></DD><DD><DL><DT><B>Returns:</B><DD>Returns the poolMaximumWait.</DL></DD></DL><HR><A NAME="getSuffix()"><!-- --></A><H3>getSuffix</H3><PRE>public java.lang.String <B>getSuffix</B>()</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="getMaxToWrite()"><!-- --></A><H3>getMaxToWrite</H3><PRE>public long <B>getMaxToWrite</B>()</PRE><DL><DD><DL></DL></DD><DD><DL></DL></DD></DL><HR><A NAME="crawlEnding(java.lang.String)"><!-- --></A><H3>crawlEnding</H3><PRE>public void <B>crawlEnding</B>(java.lang.String sExitMessage)</PRE><DL><DD><B>Description copied from interface: <CODE><A HREF="../../../../org/archive/crawler/event/CrawlStatusListener.html#crawlEnding(java.lang.String)">CrawlStatusListener</A></CODE></B></DD><DD>Called when a CrawlController is ending a crawl (for any reason)<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../org/archive/crawler/event/CrawlStatusListener.html#crawlEnding(java.lang.String)">crawlEnding</A></CODE> in 
interface <CODE><A HREF="../../../../org/archive/crawler/event/CrawlStatusListener.html" title="interface in org.archive.crawler.event">CrawlStatusListener</A></CODE></DL></DD><DD><DL><DT><B>Parameters:</B><DD><CODE>sExitMessage</CODE> - Type of exit. Should be one of the STATUS constants defined in CrawlJob.<DT><B>See Also:</B><DD><A HREF="../../../../org/archive/crawler/admin/CrawlJob.html" title="class in org.archive.crawler.admin"><CODE>CrawlJob</CODE></A></DL></DD></DL><HR><A NAME="crawlEnded(java.lang.String)"><!-- --></A><H3>crawlEnded</H3><PRE>public void <B>crawlEnded</B>(java.lang.String sExitMessage)</PRE><DL><DD><B>Description copied from interface: <CODE><A HREF="../../../../org/archive/crawler/event/CrawlStatusListener.html#crawlEnded(java.lang.String)">CrawlStatusListener</A></CODE></B></DD><DD>Called when a CrawlController has ended a crawl and is about to exit.<P><DD><DL><DT><B>Specified by:</B><DD><CODE><A HREF="../../../../org/archive/crawler/event/CrawlStatusListener.html#crawlEnded(java.lang.String)">crawlEnded</A></CODE> in interface <CODE><A HREF="../../../../org/archive/crawler/event/CrawlStatusListener.html" title="interface in org.archive.crawler.event">CrawlStatusListener</A></CODE></DL>