
📄 mkgathererstats.pl

📁 Harvest is a robot that downloads HTML web pages
: # *-*-perl-*-*
	eval 'exec perl -S $0 "$@"'
	if $running_under_some_shell;
#
#  Generates SOIF stats body for a gatherer database.
#
#  Usage:
#  gzip -dc All-Templates.gz | mkgathererstats.pl 'GathererID' > INFO.soif
#
#  mkgathererstats.pl,v 1.13 1995/03/22 17:41:41 wessels Exp
#
$ENV{'HARVEST_HOME'} = "/usr/local/harvest" if (!defined($ENV{'HARVEST_HOME'}));
unshift(@INC, "$ENV{'HARVEST_HOME'}/lib");      # use local files
require 'soif.pl';

$gid = shift(@ARGV);
exit 0 if ($gid eq "");

$n = 0;
$min_time=9999999999;
$max_time=0;

$soif'input = 'STDIN';
$skip = <STDIN>;	# skips @DELETE
$skip = <STDIN>;	# skips @REFRESH
$skip = <STDIN>;	# skips @UPDATE
undef $skip;

while (($ttype, $url, %SOIF) = &soif'parse()) {
	next if (%SOIF == ());
	$n++;
	$GathererName = $SOIF{'Gatherer-Name'} if (!defined($GathererName));
	$GathererHost = $SOIF{'Gatherer-Host'} if (!defined($GathererHost));
	$time         = $SOIF{'Update-Time'};
	$min_time = $time if ($time < $min_time);
	$max_time = $time if ($time > $max_time);
	foreach $k (keys %SOIF) {
		$k =~ s/^Embed.\d+..//i;
		$att_hist{$k}++;
	}
	undef %SOIF;		# fully delete the GDBM storage
}

$OUT{'Gatherer-Name'}		= $GathererName;
$OUT{'Gatherer-Host'}		= $GathererHost;
$OUT{'Object-Count'}		= $n;
$OUT{'Min-Update-Time'}		= $min_time;
$OUT{'Max-Update-Time'}		= $max_time;
$OUT{'Attribute-Histogram'}	= "";
foreach $k (sort att_cmp keys %att_hist) {
	$OUT{'Attribute-Histogram'} .= sprintf ("%5d %s\n", $att_hist{$k}, $k);
}

&soif'print ('INFO', $gid, %OUT);
exit 0;

sub usage {
	print STDERR "Usage: mkgathererstats.pl GathererID\n";
	exit 1;
}

sub att_cmp {
	$att_hist{$b} <=> $att_hist{$a}
}
