<html><head><META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>10.&nbsp;Release 0.10.0 - 2004-06-04</title><link href="../docbook.css" rel="stylesheet" type="text/css"><meta content="DocBook XSL Stylesheets V1.67.2" name="generator"><link rel="start" href="index.html" title="Heritrix Release Notes"><link rel="up" href="index.html" title="Heritrix Release Notes"><link rel="prev" href="1_0_0.html" title="9.&nbsp;Release 1.0.0 - 2004-08-06"><link rel="next" href="0_8_1.html" title="11.&nbsp;Release 0.10.0 - 2004-06-04"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table summary="Navigation header" width="100%"><tr><th align="center" colspan="3">10.&nbsp;Release 0.10.0 - 2004-06-04</th></tr><tr><td align="left" width="20%"><a accesskey="p" href="1_0_0.html">Prev</a>&nbsp;</td><th align="center" width="60%">&nbsp;</th><td align="right" width="20%">&nbsp;<a accesskey="n" href="0_8_1.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="0_10_0"></a>10.&nbsp;Release 0.10.0 - 2004-06-04</h2></div></div></div><div class="abstract"><p class="title"><b>Abstract</b></p><p>Release for the second Heritrix workshop, Copenhagen 06/2004 (1.0.0 first release candidate). Added site-first prioritization, fixed link extraction of multibyte URIs, added metadata to ARCs as XML, changed the ARC naming template, new user and developer manuals, added basic/digest auth and HTTP POST/GET login facility, and added help to the UI.
Bug fixes.</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="N11DF9"></a>10.1.&nbsp;Changes</h3></div></div></div><p><div class="table"><a name="N11DFD"></a><p class="title"><b>Table&nbsp;10.&nbsp;Changes</b></p><table summary="Changes" border="1"><colgroup><col><col><col></colgroup><thead><tr><th>ID</th><th>Type</th><th>Summary</th></tr></thead><tbody><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=896769" target="_top">896769</a></td><td>Add</td><td>job report: show 'active' hosts, show more size totals</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=896772" target="_top">896772</a></td><td>Add</td><td>"Site-first"/'frontline' prioritization</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=956614" target="_top">956614</a></td><td>Add</td><td>multiple open http connections per host needed</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=896674" target="_top">896674</a></td><td>Add</td><td>Add help to web UI</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=964931" target="_top">964931</a></td><td>Add</td><td>When a host last had a completed URI shown in crawl report</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=958335" target="_top">958335</a></td><td>Add</td><td>Encode multibyte URIs using page charset before queuing</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=909246" target="_top">909246</a></td><td>Add</td><td>One src for site, help, and readme docs.</td></tr><tr><td><a
href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=936684" target="_top">936684</a></td><td>Add</td><td>identifying ARCs: unique names, header records</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=930667" target="_top">930667</a></td><td>Add</td><td>Resetting arc file counter for every job.</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=863318" target="_top">863318</a></td><td>Add</td><td>ARCs need better headers</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=908507" target="_top">908507</a></td><td>Add</td><td>Specify location of jobs dir</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=914301" target="_top">914301</a></td><td>Add</td><td>Logging in (HTTP POST, Basic Auth, etc.)</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=944066" target="_top">944066</a></td><td>Add</td><td>Update dnsjava from 1.5 to 1.6.2 (Fix NPE)</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=966168" target="_top">966168</a></td><td>Fix</td><td>crawl.log entries without annotations end with a space</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=966172" target="_top">966172</a></td><td>Fix</td><td>An issue with arc names' date and serial number alignment</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=957963" target="_top">957963</a></td><td>Fix</td><td>Output of warning message leads to NullPointerExceptions</td></tr><tr><td><a
href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=963965" target="_top">963965</a></td><td>Fix</td><td>Either UURI or ExtractHTML should strip whitespace better</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=965267" target="_top">965267</a></td><td>Fix</td><td>Maximum documents not enforced</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=965308" target="_top">965308</a></td><td>Fix</td><td>NPE in path depth filter</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=934549" target="_top">934549</a></td><td>Fix</td><td>embed/speculative inclusion too loose</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=962899" target="_top">962899</a></td><td>Fix</td><td>UnsupportedCharsetException handled awkwardly</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=962892" target="_top">962892</a></td><td>Fix</td><td>UURI accepting/creating unUsable URIs (bad hosts)</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=860733" target="_top">860733</a></td><td>Fix</td><td>CachingDiskLongFPSet UI availability</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=954130" target="_top">954130</a></td><td>Fix</td><td>Crawls slow till change a setting</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=961867" target="_top">961867</a></td><td>Fix</td><td>zero link-hops should work</td></tr><tr><td><a
href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=942627" target="_top">942627</a></td><td>Fix</td><td>multiple robots.txt URLs in the "default" frontier</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=957941" target="_top">957941</a></td><td>Fix</td><td>NPE in ExtractorHTML#isHtmlExpectedHere</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=953718" target="_top">953718</a></td><td>Fix</td><td>Unwanted behavior with seed redirection</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=952636" target="_top">952636</a></td><td>Fix</td><td>Link extraction failing</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=863315" target="_top">863315</a></td><td>Fix</td><td>Memory issues: Frontier.snoozeQueue</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=903838" target="_top">903838</a></td><td>Fix</td><td>Transitive scope confusion, may not work as expected</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=955345" target="_top">955345</a></td><td>Fix</td><td>Wrong stats after deleting URIs from Frontier</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=952276" target="_top">952276</a></td><td>Fix</td><td>NoSuchElementException in admin/reports/frontier.jsp</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=952665" target="_top">952665</a></td><td>Fix</td><td>Alert: Authentication scheme(s) not supported</td></tr><tr><td><a
href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=936702" target="_top">936702</a></td><td>Fix</td><td>IP validity: units, TTL vs. setting</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=951582" target="_top">951582</a></td><td>Fix</td><td>ConcurrentModificationException in DomainScope focus filter</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=949489" target="_top">949489</a></td><td>Fix</td><td>ConcurrentModificationException terminate job</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=949551" target="_top">949551</a></td><td>Fix</td><td>Authentication bug</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=948898" target="_top">948898</a></td><td>Fix</td><td>terminate running crawl == NPE</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=927940" target="_top">927940</a></td><td>Fix</td><td>java.net.URI parses %20 but getHost null</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=874220" target="_top">874220</a></td><td>Fix</td><td>NPE in java.net.URI.encode</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=808270" target="_top">808270</a></td><td>Fix</td><td>java.net.URI chokes on hosts_with_underscores</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=788277" target="_top">788277</a></td><td>Fix</td><td>Doing separate DNS lookup for same host</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=910120"
target="_top">910120</a></td><td>Fix</td><td>java.net.URI#getHost fails when leading digit</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=949548" target="_top">949548</a></td><td>Fix</td><td>Constraining java URI class</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=943373" target="_top">943373</a></td><td>Fix</td><td>Same CrawlServer instance for http &amp; https.</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=887999" target="_top">887999</a></td><td>Fix</td><td>Broad crawl/ too many open files</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=926912" target="_top">926912</a></td><td>Fix</td><td>multiple charset headers + long lines</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=926338" target="_top">926338</a></td><td>Fix</td><td>Corrupted blue image in progress bars</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=896757" target="_top">896757</a></td><td>Fix</td><td>NPEs in Andy's Th-Fri Crawl + NPE in RIS</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=922080" target="_top">922080</a></td><td>Fix</td><td>IllegalArgumentEx/ReplayCharSequenceFactory (offset vs.
size)</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=935271" target="_top">935271</a></td><td>Fix</td><td>FTP URIs in seeds interpreted as HTTP</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=945923" target="_top">945923</a></td><td>Fix</td><td>maven rc2 won't make src distribution</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=947754" target="_top">947754</a></td><td>Fix</td><td>Corrupted arc files on termination of job</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=931269" target="_top">931269</a></td><td>Fix</td><td>https exception: java.io.IOException: SSL failure</td></tr><tr><td><a href="http://sourceforge.net/tracker/index.php?func=detail&group_id=73833&atid=539099&aid=935146" target="_top">935146</a></td><td>Fix</td><td>Excessive ARCWriterPool timeouts:</td></tr></tbody></table></div></p></div></div><div class="navfooter"><hr><table summary="Navigation footer" width="100%"><tr><td align="left" width="40%"><a accesskey="p" href="1_0_0.html">Prev</a>&nbsp;</td><td align="center" width="20%">&nbsp;</td><td align="right" width="40%">&nbsp;<a accesskey="n" href="0_8_1.html">Next</a></td></tr><tr><td valign="top" align="left" width="40%">9.&nbsp;Release 1.0.0 - 2004-08-06&nbsp;</td><td align="center" width="20%"><a accesskey="h" href="index.html">Home</a></td><td valign="top" align="right" width="40%">&nbsp;11.&nbsp;Release 0.10.0 - 2004-06-04</td></tr></table></div></body></html>
