<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><!--NewPage--><HTML><HEAD><!-- Generated by javadoc (build 1.5.0_06) on Wed Sep 27 16:03:04 PDT 2006 --><TITLE>StatisticsSummary (Heritrix 1.10.1)</TITLE><META NAME="keywords" CONTENT="org.archive.crawler.admin.StatisticsSummary class"><LINK REL ="stylesheet" TYPE="text/css" HREF="../../../../stylesheet.css" TITLE="Style"><SCRIPT type="text/javascript">function windowTitle(){ parent.document.title="StatisticsSummary (Heritrix 1.10.1)";}</SCRIPT><NOSCRIPT></NOSCRIPT></HEAD><BODY BGCOLOR="white" onload="windowTitle();"><!-- ========= START OF TOP NAVBAR ======= --><A NAME="navbar_top"><!-- --></A><A HREF="#skip-navbar_top" title="Skip navigation links"></A><TABLE BORDER="0" WIDTH="100%" CELLPADDING="1" CELLSPACING="0" SUMMARY=""><TR><TD COLSPAN=2 BGCOLOR="#EEEEFF" CLASS="NavBarCell1"><A NAME="navbar_top_firstrow"><!-- --></A><TABLE BORDER="0" CELLPADDING="0" CELLSPACING="3" SUMMARY=""> <TR ALIGN="center" VALIGN="top"> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../../overview-summary.html"><FONT CLASS="NavBarFont1"><B>Overview</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="package-summary.html"><FONT CLASS="NavBarFont1"><B>Package</B></FONT></A> </TD> <TD BGCOLOR="#FFFFFF" CLASS="NavBarCell1Rev"> <FONT CLASS="NavBarFont1Rev"><B>Class</B></FONT> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="class-use/StatisticsSummary.html"><FONT CLASS="NavBarFont1"><B>Use</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="package-tree.html"><FONT CLASS="NavBarFont1"><B>Tree</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../../deprecated-list.html"><FONT CLASS="NavBarFont1"><B>Deprecated</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" CLASS="NavBarCell1"> <A HREF="../../../../index-all.html"><FONT CLASS="NavBarFont1"><B>Index</B></FONT></A> </TD> <TD BGCOLOR="#EEEEFF" 
CLASS="NavBarCell1"> <A HREF="../../../../help-doc.html"><FONT CLASS="NavBarFont1"><B>Help</B></FONT></A> </TD> </TR></TABLE></TD><TD ALIGN="right" VALIGN="top" ROWSPAN=3><EM></EM></TD></TR><TR><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2"> <A HREF="../../../../org/archive/crawler/admin/SeedRecord.html" title="class in org.archive.crawler.admin"><B>PREV CLASS</B></A> <A HREF="../../../../org/archive/crawler/admin/StatisticsTracker.html" title="class in org.archive.crawler.admin"><B>NEXT CLASS</B></A></FONT></TD><TD BGCOLOR="white" CLASS="NavBarCell2"><FONT SIZE="-2"> <A HREF="../../../../index.html?org/archive/crawler/admin/StatisticsSummary.html" target="_top"><B>FRAMES</B></A> <A HREF="StatisticsSummary.html" target="_top"><B>NO FRAMES</B></A> <SCRIPT type="text/javascript"> <!-- if(window==top) { document.writeln('<A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A>'); } //--></SCRIPT><NOSCRIPT> <A HREF="../../../../allclasses-noframe.html"><B>All Classes</B></A></NOSCRIPT></FONT></TD></TR><TR><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2"> SUMMARY: NESTED | <A HREF="#field_summary">FIELD</A> | <A HREF="#constructor_summary">CONSTR</A> | <A HREF="#method_summary">METHOD</A></FONT></TD><TD VALIGN="top" CLASS="NavBarCell3"><FONT SIZE="-2">DETAIL: <A HREF="#field_detail">FIELD</A> | <A HREF="#constructor_detail">CONSTR</A> | <A HREF="#method_detail">METHOD</A></FONT></TD></TR></TABLE><A NAME="skip-navbar_top"></A><!-- ========= END OF TOP NAVBAR ========= --><HR><!-- ======== START OF CLASS DATA ======== --><H2><FONT SIZE="-1">org.archive.crawler.admin</FONT><BR>Class StatisticsSummary</H2><PRE>java.lang.Object <IMG SRC="../../../../resources/inherit.gif" ALT="extended by "><B>org.archive.crawler.admin.StatisticsSummary</B></PRE><HR><DL><DT><PRE>public class <B>StatisticsSummary</B><DT>extends java.lang.Object</DL></PRE><P>This class provides descriptive statistics of a finished crawl job by using the crawl report files generated by 
StatisticsTracker. Any formatting changes to the way StatisticsTracker writes to the summary crawl reports will require changes to this class. <p> The following statistics are accessible from this class: <ul> <li> Successfully downloaded documents per fetch status code <li> Successfully downloaded documents per document mime type <li> Amount of data per mime type <li> Successfully downloaded documents per host <li> Amount of data per host <li> Successfully downloaded documents per top-level domain name (TLD) <li> Disposition of all seeds <li> Successfully downloaded documents per host per source </ul> <p>TODO: Make it so summarizing is not done all in RAM so we avoid OOME.<P><P><DL><DT><B>Author:</B></DT> <DD>Frank McCown</DD><DT><B>See Also:</B><DD><A HREF="../../../../org/archive/crawler/admin/StatisticsTracker.html" title="class in org.archive.crawler.admin"><CODE>StatisticsTracker</CODE></A></DL><HR><P><!-- =========== FIELD SUMMARY =========== --><A NAME="field_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Field Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#bandwidthKbytesPerSec">bandwidthKbytesPerSec</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#dnsStatusCodeDistribution">dnsStatusCodeDistribution</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.lang.String</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#durationTime">durationTime</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#hostsBytes">hostsBytes</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#hostsDistribution">hostsDistribution</A></B></CODE><BR> Keep track of hosts</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#hostsDnsBytes">hostsDnsBytes</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#hostsDnsDistribution">hostsDnsDistribution</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#mimeTypeBytes">mimeTypeBytes</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#mimeTypeDistribution">mimeTypeDistribution</A></B></CODE><BR> Keep track of the file types we see (mime type -> 
count)</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#mimeTypeDnsBytes">mimeTypeDnsBytes</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#mimeTypeDnsDistribution">mimeTypeDnsDistribution</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#processedDocsPerSec">processedDocsPerSec</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Map</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#processedSeedsRecords">processedSeedsRecords</A></B></CODE><BR> Keep track of processed seeds</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#statusCodeDistribution">statusCodeDistribution</A></B></CODE><BR> Keep track of status codes</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#tldBytes">tldBytes</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT 
SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#tldDistribution">tldDistribution</A></B></CODE><BR> Keep track of TLDs</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.util.Hashtable</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#tldHostDistribution">tldHostDistribution</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected java.lang.String</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalDataWritten">totalDataWritten</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalDnsHostDocuments">totalDnsHostDocuments</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalDnsHostSize">totalDnsHostSize</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalDnsMimeSize">totalDnsMimeSize</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalDnsMimeTypeDocuments">totalDnsMimeTypeDocuments</A></B></CODE><BR> 
</TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalDnsStatusCodeDocuments">totalDnsStatusCodeDocuments</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalFileTypeDocuments">totalFileTypeDocuments</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalHostDocuments">totalHostDocuments</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalHosts">totalHosts</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalHostSize">totalHostSize</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalMimeSize">totalMimeSize</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A 
HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalMimeTypeDocuments">totalMimeTypeDocuments</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalStatusCodeDocuments">totalStatusCodeDocuments</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalTldDocuments">totalTldDocuments</A></B></CODE><BR> </TD></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD ALIGN="right" VALIGN="top" WIDTH="1%"><FONT SIZE="-1"><CODE>protected long</CODE></FONT></TD><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#totalTldSize">totalTldSize</A></B></CODE><BR> </TD></TR></TABLE> <!-- ======== CONSTRUCTOR SUMMARY ======== --><A NAME="constructor_summary"><!-- --></A><TABLE BORDER="1" WIDTH="100%" CELLPADDING="3" CELLSPACING="0" SUMMARY=""><TR BGCOLOR="#CCCCFF" CLASS="TableHeadingColor"><TH ALIGN="left" COLSPAN="2"><FONT SIZE="+2"><B>Constructor Summary</B></FONT></TH></TR><TR BGCOLOR="white" CLASS="TableRowColor"><TD><CODE><B><A HREF="../../../../org/archive/crawler/admin/StatisticsSummary.html#StatisticsSummary(org.archive.crawler.admin.CrawlJob)">StatisticsSummary</A></B>(<A HREF="../../../../org/archive/crawler/admin/CrawlJob.html" title="class in org.archive.crawler.admin">CrawlJob</A> cjob)</CODE><BR> Constructor</TD></TR></TABLE> <!-- ========== METHOD SUMMARY =========== --><A NAME="method_summary"><!-- --></A>