<html><head><META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>5. Release 1.4.0 - 04/28/2005</title><link href="../docbook.css" rel="stylesheet" type="text/css"><meta content="DocBook XSL Stylesheets V1.67.2" name="generator"><link rel="start" href="index.html" title="Heritrix Release Notes"><link rel="up" href="index.html" title="Heritrix Release Notes"><link rel="prev" href="1_6_0.html" title="4. Release 1.6.0 - 12/01/2005"><link rel="next" href="1_2_0.html" title="6. Release 1.2.0 - 11/16/2004"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table summary="Navigation header" width="100%"><tr><th align="center" colspan="3">5. Release 1.4.0 - 04/28/2005</th></tr><tr><td align="left" width="20%"><a accesskey="p" href="1_6_0.html">Prev</a> </td><th align="center" width="60%"> </th><td align="right" width="20%"> <a accesskey="n" href="1_2_0.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="1_4_0"></a>5. Release 1.4.0 - 04/28/2005</h2></div></div></div><div class="abstract"><p class="title"><b>Abstract</b></p><p>Much improved memory usage, new scoping/filter model, and a new revisiting frontier. Over 90 bugs fixed.</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="1_4_0_limitations"></a>5.1. Known Limitations/Issues</h3></div></div></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="glibc2_3_2"></a>5.1.1. Glibc 2.3.2 and NPTL</h4></div></div></div><p>NPTL is the 'new' Linux threading model. It replaces <span class="emphasis"><em>linuxthreads</em></span>, the 'old' model. You can tell you're running NPTL if your Java process shows as a single process in the process listing. With linuxthreads, all Java threads show as distinct Linux processes. Linux threading is integral to glibc. 
</p><p>On rare occasions we've seen the crawler hang without obvious explanation when running with NPTL threading on Linux. In a thread dump of one such hung crawler, threads are waiting to obtain a lock that no one apparently holds. Our reading is that these rare, crawl-killing hangs are a problem in glibc 2.3.2 when running with NPTL (NPTL 0.60). (We used to hang frequently, but workarounds seem to have made lockups extremely rare.) An upgrade to glibc 2.3.3+ seems to do away with these hangs. Glibc 2.3.3 has NPTL 0.61. Fedora 3 has glibc 2.3.4. If an upgrade is not possible (for example, the new glibc is not currently available for Debian), you can disable NPTL and run with the old threads by setting the environment variable <code class="literal">LD_ASSUME_KERNEL=2.4.1</code> (you can set this environment variable on a per-process basis). </p><p>NPTL is usually the default threading model on Linux and is usually what you want: threads are more lightweight and Java throughput seems to be slightly higher with NPTL enabled. There are several ways to see which threading model you are using. Run ldd on the java executable to see which shared libraries it's using. Note the location of the glibc shared library. Executing <code class="literal">PATH_TO_GLIBC/libc.so.6</code>, usually <code class="literal">/lib/libc.so.6</code>, will list details on glibc. Look in the listing for either 'nptl' or 'linuxthreads'. On Debian systems, libc.so.6 is not executable but you can make it so. You can also do the following to determine library versions and which threading you are using: <code class="literal">% getconf GNU_LIBC_VERSION</code> and <code class="literal">% getconf GNU_LIBPTHREAD_VERSION</code>. 
</p><p>See <a href="http://sourceforge.net/tracker/index.php?func=detail&aid=1086554&group_id=73833&atid=539099" target="_top">[ 1086554 ] glibc 2.3.2 NPTL hang (Was bdbfrontier stall in...)</a> for more on the issue.</p></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="1093962"></a>5.1.2. <a href="http://sourceforge.net/tracker/index.php?func=detail&aid=1093962&group_id=73833&atid=539099" target="_top">[1093962] SSL handshake fails when server requests switch to SSL V3</a></h4></div></div></div><p>When connecting to a secure server, the connection fails if the server wants to switch from SSL V2 to SSL V3 and the client is using a Sun JVM. See issue 1093962 for more. </p></div><div class="sect3" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="old_jobs_profiles"></a>5.1.3. Using old jobs or profiles with 1.4</h4></div></div></div><p>You'll need to make one change for your old order.xml files and profiles to run with Heritrix 1.4.x. Below is a diff that shows the change that needs to be made (The