type of the <code class="literal">path</code> changed from <code class="literal">string</code> to <code class="literal">stringList</code>):<pre class="programlisting">+++ order.xml   2005-02-01 13:12:34.000000000 -0800
@@ -162,7 +162,9 @@
         &lt;string name="prefix"&gt;BT&lt;/string&gt;
         &lt;string name="suffix"&gt;&lt;/string&gt;
         &lt;integer name="max-size-bytes"&gt;100000000&lt;/integer&gt;
-        &lt;string name="path"&gt;arcs&lt;/string&gt;
+        &lt;stringList name="path"&gt;
+          &lt;string&gt;arcs&lt;/string&gt;
+        &lt;/stringList&gt;
         &lt;integer name="pool-max-active"&gt;5&lt;/integer&gt;
         &lt;integer name="pool-max-wait"&gt;300000&lt;/integer&gt;
       &lt;/newObject&gt;</pre>        </p></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="cme_frontier"></a>5.1.4.&nbsp;<a href="https://sourceforge.net/tracker/?func=detail&atid=539099&aid=1119644&group_id=73833" target="_top">[ 1119644 ] frontier ConcurrentModificationException</a></h4></div></div></div><p>Sometimes you'll get a ConcurrentModificationException when you view or refresh the Frontier's report page. The workaround is to retry; the page should eventually come up.      </p></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="arcfile_suffix"></a>5.1.5.&nbsp;New ARC file suffix</h4></div></div></div><p>Pre-release 1.2.0, ARC files that the crawler currently had open for writing were distinguished by an '.open' suffix. When the crawler finished writing, the suffix was removed. A new suffix has been introduced -- '.invalid' -- which the crawler uses to mark ARC files it considers suspect, usually because an IOException was thrown while writing an ARC record. Such ARCs need to be checked for validity.  
Run <code class="literal">% gzip -t</code> and <code class="literal">% ARCReader --strict</code> against all files with an '.invalid' suffix, and against any unclosed '.open' files still present after a crawl has ended, to check for corruption.        </p></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="1149470"></a>5.1.6.&nbsp;DNS lookups fail (-6 in crawl.log)</h4></div></div></div><p>        <a href="https://sourceforge.net/tracker/index.php?func=detail&aid=1149470&group_id=73833&atid=539099" target="_top">[1149470] all DNS attempts fail -6</a> discusses badly formatted DNS records, returned on the Windows platform, that Heritrix fails to parse. It also includes a pointer to a mailing list discussion of failed lookups on non-English Windows. The issue includes a description of a workaround.        </p></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="1178102"></a>5.1.7.&nbsp;FatalConfigurationException creating new job based on old</h4></div></div></div><p>Older Sun JVMs -- pre-beta3 versions of the Sun JVM 1.5.0, for instance -- had an issue copying files using NIO. Try upgrading your JVM.  See <a href="http://sourceforge.net/tracker/index.php?func=detail&aid=1178102&group_id=73833&atid=539099" target="_top">[1178102] FCE on creation of new job based on job w/ overrides</a> for more on this.        
</p></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="oome142"></a>5.1.8.&nbsp;OutOfMemoryErrors (OOMEs)</h4></div></div></div><p>Unusual pages -- pages of unorthodox structure, or pages that contain thousands upon thousands of links -- will on occasion produce OOMEs.</p><p>There have been improvements in memory usage when running multiple jobs in series (see <a href="1_2_0.html#oome_pending_jobs" title="6.1.3.&nbsp;Running more than one job in series throws OOME">Section&nbsp;6.1.3, &ldquo;Running more than one job in series throws OOME&rdquo;</a>), but starting a new job after a long-running job can still prompt OOMEs. The workaround for now is to restart Heritrix between big jobs.</p></div></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="1_4_0_changes"></a>5.2.&nbsp;Changes</h3></div></div></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="bdbfrontier"></a>5.2.1.&nbsp;Berkeley DB Based Frontier</h4></div></div></div><p>The BdbFrontier -- a frontier that keeps its queues of
