
📄 newsarticle.sum

📁 Harvest is a robot that downloads HTML web pages
: # *-*-perl-*-*
    eval 'exec perl -S $0 "$@"'
    if $running_under_some_shell;
#
#  NewsArticle.sum - Summarizes a USENET article
#
#  NewsArticle.sum,v 1.5 1995/11/07 01:34:38 duane Exp
#
######################################################################
#  Copyright (c) 1994, 1995.  All rights reserved.
#
#    The Harvest software was developed by the Internet Research Task
#    Force Research Group on Resource Discovery (IRTF-RD):
#
#          Mic Bowman of Transarc Corporation.
#          Peter Danzig of the University of Southern California.
#          Darren R. Hardy of the University of Colorado at Boulder.
#          Udi Manber of the University of Arizona.
#          Michael F. Schwartz of the University of Colorado at Boulder.
#          Duane Wessels of the University of Colorado at Boulder.
#
#    This copyright notice applies to software in the Harvest
#    ``src/'' directory only.  Users should consult the individual
#    copyright notices in the ``components/'' subdirectories for
#    copyright information about other software bundled with the
#    Harvest source code distribution.
#
#  TERMS OF USE
#
#    The Harvest software may be used and re-distributed without
#    charge, provided that the software origin and research team are
#    cited in any use of the system.  Most commonly this is
#    accomplished by including a link to the Harvest Home Page
#    (http://harvest.cs.colorado.edu/) from the query page of any
#    Broker you deploy, as well as in the query result pages.  These
#    links are generated automatically by the standard Broker
#    software distribution.
#
#    The Harvest software is provided ``as is'', without express or
#    implied warranty, and with no support nor obligation to assist
#    in its use, correction, modification or enhancement.  We assume
#    no liability with respect to the infringement of copyrights,
#    trade secrets, or any patents, and are not responsible for
#    consequential damages.  Proper use of the Harvest software is
#    entirely the responsibility of the user.
#
#  DERIVATIVE WORKS
#
#    Users may make derivative works from the Harvest software, subject
#    to the following constraints:
#
#      - You must include the above copyright notice and these
#        accompanying paragraphs in all forms of derivative works,
#        and any documentation and other materials related to such
#        distribution and use acknowledge that the software was
#        developed at the above institutions.
#
#      - You must notify IRTF-RD regarding your distribution of
#        the derivative work.
#
#      - You must clearly notify users that your are distributing
#        a modified version and not the original Harvest software.
#
#      - Any derivative product is also subject to these copyright
#        and use restrictions.
#
#    Note that the Harvest software is NOT in the public domain.  We
#    retain copyright, as specified above.
#
#  HISTORY OF FREE SOFTWARE STATUS
#
#    Originally we required sites to license the software in cases
#    where they were going to build commercial products/services
#    around Harvest.  In June 1995 we changed this policy.  We now
#    allow people to use the core Harvest software (the code found in
#    the Harvest ``src/'' directory) for free.  We made this change
#    in the interest of encouraging the widest possible deployment of
#    the technology.  The Harvest software is really a reference
#    implementation of a set of protocols and formats, some of which
#    we intend to standardize.  We encourage commercial
#    re-implementations of code complying to this set of standards.
#
#

$TTL = 86400 * 14;       # 2 weeks
$len = length($TTL);
print "Time-to-Live\{$len\}:\t$TTL\n";

$file = shift(@ARGV);
open(IN, "<$file") || die "NewsArticle.sum: Cannot read $file.\n";

#  Read the header
$att = "";
while (<IN>) {
    s/\r//g;                # strip CR
    last if (/^\s*$/o);     # end of the header
    if (/^\t/o && $att ne "") {
        chop $_;
        $SOIF{$att} .= "\n$_";
        next;
    }
    $i = index ($_, ":");
    next if ($i == $[-1);
    $att = substr ($_, 0, $i);
    $val = substr ($_, $i+2);
    chop $val;
    $SOIF{$att} = $val;
}

#  Text summarize the body
$| = 1;
open(TEXTSUM, "| Text.sum") || die "NewsArticle.sum: Cannot run Text summarizer.\n";
print TEXTSUM <IN>;
close(TEXTSUM);
close(IN);

$SOIF{'Description'} = $SOIF{'Subject'} if (defined($SOIF{'Subject'}));
$SOIF{'Description'} = $SOIF{'Summary'} if (defined($SOIF{'Summary'}));

while (($k, $v) = each %SOIF) {
    next if ($k eq "Path");
    next if ($k =~ /^X-/o);
    next if ($k =~ /[ \t]/o);  # don't output atts with whitespace
    $len = length($v);
    next if ($len < 1);
    print "$k\{$len\}:\t$v\n";
}
exit 0;        # END OF PROGRAM
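The listing above does two things with the article header: it folds tab-indented continuation lines into the preceding attribute, and it prints each attribute as name{length}:<TAB>value. The following is a minimal sketch of just those two steps on an in-memory sample, not a replacement for the script. The helper names `parse_header` and `soif_line` and the sample header are hypothetical, introduced only for illustration. The sketch also uses `chomp`, guards against a header line with nothing after the colon, and sorts the keys, where the original uses `chop` and `each`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): format one attribute
# the way the listing's final loop prints it: name{length}:<TAB>value.
sub soif_line {
    my ($name, $value) = @_;
    return sprintf("%s{%d}:\t%s\n", $name, length($value), $value);
}

# Hypothetical helper: parse header lines up to the first blank line,
# folding tab-indented continuation lines into the previous attribute.
sub parse_header {
    my (@lines) = @_;                     # work on a private copy
    my (%attr, $att);
    for my $line (@lines) {
        $line =~ s/\r//g;                 # strip CR
        last if $line =~ /^\s*$/;         # blank line ends the header
        chomp $line;
        if ($line =~ /^\t/ && defined $att) {
            $attr{$att} .= "\n$line";     # continuation line
            next;
        }
        my $i = index($line, ':');
        next if $i < 0;                   # not a "Name: value" line
        $att = substr($line, 0, $i);
        # Skip the colon and the space after it, as the original does.
        $attr{$att} = ($i + 2 <= length($line)) ? substr($line, $i + 2) : '';
    }
    return %attr;
}

# Made-up sample article, for illustration only.
my @sample = (
    "Subject: Harvest gatherer question\n",
    "Newsgroups: comp.infosystems.harvest\n",
    "X-Newsreader: example\n",
    "\n",
    "Body text starts here.\n",
);

my %soif = parse_header(@sample);
$soif{Description} = $soif{Subject} if defined $soif{Subject};

for my $k (sort keys %soif) {
    # Same filters as the listing: drop Path, X- headers, odd names, empties.
    next if $k eq 'Path' || $k =~ /^X-/ || $k =~ /[ \t]/;
    next if length($soif{$k}) < 1;
    print soif_line($k, $soif{$k});
}
```

Run on the sample, this prints the Description, Newsgroups and Subject attributes and skips the X-Newsreader header. The body line is never parsed, because the blank line ends the header.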
