
📄 displaysoif.in

📁 Harvest is a robot that downloads HTML web pages
💻 IN
#!@PERL@
#----------------------------------------------------------------------
#  displaySOIF - 20020218 beta version :)
#
#  Take a SOIF file and print it
#
#  (c) 2002 RedIRIS. The author may be contacted by the email
#                    address: javier.masa @ rediris.es
#
#  You can use, distribute and/or modify this file under the
#  terms of the GNU General Public License as published by
#  the Free Software Foundation (http://www.fsf.org/copyleft/gpl.html).
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
#----------------------------------------------------------------------

use IO::File;
use Socket;

$prefixDir = "@prefix@";
$regFile = '^\@FILE { (.+)$';
$regLine = '(.*){(.+)}:\s*';

&ReadParse;
$rFile = "$in{'object'}";
$rFile =~ s/\/Harvest/$prefixDir/;
$searchText = $in{'query'};

print &Cabecera;

#select STDERR ; $| = 1;
#select STDOUT ; $| = 1;

#--- We download the SOIF page using standard libraries
#
open FILE,"<$rFile";
$body = join "", <FILE>;
close FILE;

=We don't use this yet

$sock = new IO::File;
$rWebServer= $ENV{'SERVER_NAME'};
$rPort= $ENV{'SERVER_PORT'};
$rPage= $in{'object'};

$oldAlarmHandler = $SIG{'ALRM'} || 'DEFAULT';
$SIG{'ALRM'} = sub
{
  $SIG{'ALRM'}=$oldAlarmHandler;
  print "ERR-ERR-2\n";
  exit;
};
alarm(9);

$a=&connect($rWebServer, $rPort);
if (defined $a)
{
  alarm(0);
  $SIG{'ALRM'}=$oldAlarmHandler;
  print "ERR-ERR-1\n";
  exit;
}

&_sock_write($sock, "GET $rPage HTTP/1.0\nHost: $rWebServer\n\n");

while (($line=<$sock>) && ($line ne "\r\n"))
{
#  print "<p>$line";
}

$body = join "", <$sock>;

&disconnect();
alarm(0);
$SIG{'ALRM'}=$oldAlarmHandler;

=cut

@a = ($body =~ /$regFile/comig);
$nFile = $a[0];
$len = length("\@FILE { ")+length($nFile);

if ("$in{'style'}" ne "plain") {
    print_formatted ();
} else {
    print_plain ();
}
print_footer ();
exit;

sub print_formatted {

#--- We print the header
#
print << "HEADER";
<html>
<head>
<style type="text/css">
<!--
  body {color: #666666; font-family: arial, helvetica, sans-serif; font-size: small;}
  h3 {color: #666666; font-size: large;}
  .col0 {color:#000000; background-color:#ffcc00; text-decoration:none}
  .col1 {color:#ffffff; background-color:#f35d00; text-decoration:none}
  .col2 {color:#ffffff; background-color:#000099; text-decoration:none}
  .col3 {color:#ffffff; background-color:#009900; text-decoration:none}
  .col4 {color:#ffffff; background-color:#999000; text-decoration:none}
  .col5 {color:#ffff00; background-color:#dd2233; text-decoration:none}
  .col6 {color:#ffffff; background-color:#00dd00; text-decoration:none}
  .col7 {color:#ffffff; background-color:#abcdef; text-decoration:none}
  .col8 {color:#333333; background-color:#ffddaa; text-decoration:none}
  .col9 {color:#cccccc; background-color:#222222; text-decoration:none}
  .title {font-size: xx-large;}
  .meta {color: #666666; font-weight: bold;}
  .metaMenu {color: #6666ff; font-size: small;}
  .url {font-size: small;}
-->
</style>
<title>Detailed metadata for $nFile</title>
</head>
<body bgcolor="#ffffff">
<p><span class="title">Detailed metadata for</span><br>
<span class="url">$nFile</span>
HEADER

#--- Searched text
#
print "<h3>Searched text</h3>\n";
$searchText =~ s/ AND | OR | NOT / /g; #--- If $searchText is like: aceite AND toxico
# If $searchText is like: guerra AND civil OR ( "url" : "hispanianova\.rediris\.es" )
$searchText =~ s/\(|\)|:|"//g;
$searchText =~ s/[ ]+/ /g;
@colorQuery = split / /, $searchText;

#--- We need to order elements in searched text in reverse length order
#    to avoid problems in coloring texts like "de" and "index"
@colorQuery = sort bylength @colorQuery;

print "Search terms are colored in order to ease their identification in the text. Click on one of these elements to directly access its first occurrence in the document.<p>";
print "&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;";
for ($xx=0; $xx<= $#colorQuery; $xx++)
{
  print "<a class=\"col$xx\" href=\"#$colorQuery[$xx]\"><b>&nbsp;$colorQuery[$xx]&nbsp;</b></a> ";
}

#--- We add key and length
#    update-time{10} ....
#
@b = ($body =~ /$regLine/comig);
$i = "1";
foreach $m (@b)
{
  if ($i eq "1")
  {
    $tagName=$m;
    $i="0";
  }
  else
  {
    $aux{$tagName}=$m;
    $i="1";
  }
  $x++;
}

#--- Print Meta-tags index in 3 columns
#
print "\n\n<a name=\"meta\"></a>\n";
print "<h3>Meta-tags index</h3>\n";
print "<dl><dd>\n";
print "<table border=\"0\" cellspacing=\"4\">\n<tr valign=\"top\">\n<td>\n";
$x = 0;
$numElem = keys %aux;
foreach $key (sort keys %aux)
{
  print "</td>\n<td>\n" if ($x == int($numElem/3)+1);
  print "</td>\n<td>\n" if ($x == int($numElem*2/3)+1);
  if ($key eq "")
  {
    #--- No tag name
    print "  <li><a class=\"metaMenu\" href=\"#no_name\">