    ($site, $status) = (parseLogEntry())[0, 9];

    if ($status eq '401') {
        $siteList{$site}++;
    }
}
close(LOGFILE);

@sortedSites = sort(keys(%siteList));

if (scalar(@sortedSites) == 0) {
    print("There were no unauthorized access attempts.\n");
}
else {
    foreach $site (@sortedSites) {
        $count = $siteList{$site};
        write;
    }
}
</PRE>
</BLOCKQUOTE>
<P>
This program displays:
<BLOCKQUOTE>
<PRE>
Unauthorized Access Report Pg 1
Remote Site Name Access Count
--------------------------------------- ------------
ip48-max1-fitch.zipnet.net 1
kairos.algonet.se 4
</PRE>
</BLOCKQUOTE>
<P>
You can expand this program's usefulness by also displaying the
logName and fullName items from the log file.
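<P>
For instance, assuming the same <TT>parseLogEntry()</TT> function shown in
Listing 21.5, the logName and fullName items are the second and third
elements of the returned list. The following sketch uses a made-up log
entry so that it can run stand-alone:

```perl
# A sketch of pulling extra items out of a log entry. It assumes the
# parseLogEntry() function from Listing 21.5, which matches against
# $_ and returns 11 items.
sub parseLogEntry {
    my($w) = "(.+?)";

    m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w/;
    return($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11);
}

# A made-up log entry for illustration.
$_ = 'kairos.algonet.se - jdoe [08/Nov/1996:10:15:00 -0500] ' .
     '"GET /secret.htm HTTP/1.0" 401 0';

# The logName and fullName items are at indexes 1 and 2 of the
# returned list, alongside the site (0) and status (9) items.
($site, $logName, $fullName, $status) = (parseLogEntry())[0, 1, 2, 9];
print("$site $logName $fullName $status\n");
```

A report that includes these extra columns would tell you not only which
sites made unauthorized requests, but which authenticated user names were
involved.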
<H3><A NAME="ExampleConvertingtheReporttoaWebPage">
Example: Converting the Report to a Web Page</A></H3>
<P>
Creating nice reports for your own use is all well and good. But
suppose your boss wants the statistics updated hourly and available
on demand? Printing the report and faxing it to the head office is
probably a bad idea. One solution is to convert the report into
a Web page. Listing 21.5 contains a program that does just that.
The program creates a Web page that displays the access counts
for documents whose names begin with an 's'. Figure 21.1 shows
the resulting Web page.
<P>
<A HREF="f21-1.gif"><B>Figure 21.1 : </B><I>The Web page that displayed the Access
Counts</I>.</A>
<P>
<IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE>
<I>Turn on the warning option.<BR>
Define the </I><TT><I>parseLogEntry()</I></TT><I>
function.<BR>
Declare a local variable to hold the pattern that matches a single
item.<BR>
Use the matching operator to extract information into pattern
memory.<BR>
Return a list that contains the 11 items extracted from the log
entry.<BR>
Initialize some variables to be used later. The file name of the
accesslog, the web page file name, and the email address of the
web page maintainer.<BR>
Open the logfile.<BR>
Iterate over each line of the logfile.<BR>
Parse the entry to extract the 11 items but only keep the file
specification that was requested.<BR>
Put the filename into pattern memory.<BR>
Store the filename into </I><TT><I>$fileName</I></TT><I>.
<BR>
Test to see if </I><TT><I>$fileName</I></TT><I>
is defined.<BR>
Increment the file specification's value in the </I><TT><I>%docList</I></TT><I>
hash.<BR>
Close the log file.<BR>
Open the output file that will become the web page.<BR>
Output the HTML header.<BR>
Start the body of the HTML page.<BR>
Output the current time.<BR>
Start an unordered list so the subsequent table is indented.<BR>
Start an HTML table.<BR>
Output the heading for the two columns the table will use.<BR>
Iterate over the hash that holds the document list.<BR>
Output a table row for each hash entry.<BR>
End the HTML table.<BR>
End the unordered list.<BR>
Output a message about who to contact if questions arise.<BR>
End the body of the page.<BR>
End the HTML.<BR>
Close the web page file.</I>
</BLOCKQUOTE>
<HR>
<P>
<B>Listing 21.5 21LST05.PL-Creating a Web Page to View
Access Counts<BR>
</B>
<BLOCKQUOTE>
<PRE>#!/usr/bin/perl -w

sub parseLogEntry {
    my($w) = "(.+?)";

    m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w/;
    return($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11);
}

$LOGFILE  = "access.log";
$webPage  = "acescnt.htm";
$mailAddr = 'medined@planet.net';

open(LOGFILE) or die("Could not open log file.");
foreach (&lt;LOGFILE&gt;) {
    $fileSpec = (parseLogEntry())[7];

    $fileSpec =~ m!.+/(.+)!;
    $fileName = $1;

    # some requests don't specify a filename, just a directory.
    if (defined($fileName)) {
        $docList{$fileSpec}++ if $fileName =~ m/^s/i;
    }
}
close(LOGFILE);

open(WEBPAGE, ">$webPage") or die("Could not open web page file.");
print WEBPAGE ("<HTML>");
print WEBPAGE ("<HEAD><TITLE>Access Counts</TITLE></HEAD>");
print WEBPAGE ("<BODY>");
print WEBPAGE ("<H1>", scalar(localtime), "</H1>");
print WEBPAGE ("<UL>");
print WEBPAGE ("<TABLE BORDER=1 CELLPADDING=10>");
print WEBPAGE ("<TR><TH>Document</TH><TH>Access<BR>Count</TH></TR>");
foreach $document (sort(keys(%docList))) {
    $count = $docList{$document};

    print WEBPAGE ("<TR>");
    print WEBPAGE ("<TD><FONT SIZE=2><TT>$document</TT></FONT></TD>");
    print WEBPAGE ("<TD ALIGN=right>$count</TD>");
    print WEBPAGE ("</TR>");
}
print WEBPAGE ("</TABLE><P>");
print WEBPAGE ("</UL>");
print WEBPAGE ("Have questions? Contact <A HREF=\"mailto:$mailAddr\">$mailAddr</A>");
print WEBPAGE ("</BODY></HTML>");
close(WEBPAGE);
</PRE>
</BLOCKQUOTE>
<HR>
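<P>
Notice that Listing 21.5 extracts the filename with
<TT>m!.+/(.+)!</TT>. Using <TT>!</TT> as the delimiter of the match
operator means the <TT>/</TT> characters in the path don't need to be
escaped. This small stand-alone sketch, with a made-up file
specification, shows the idiom by itself:

```perl
# Extracting the filename portion of a file specification, as in
# Listing 21.5. Using ! as the match delimiter avoids escaping the
# / characters in the path.
$fileSpec = "/docs/samples/search.htm";

# The greedy .+ before the last / consumes the directory portion,
# leaving only the filename in pattern memory.
$fileSpec =~ m!.+/(.+)!;
$fileName = $1;

print("$fileName\n");
```

If the request names only a directory, the match fails and
<TT>$fileName</TT> stays undefined, which is exactly the case the
<TT>defined()</TT> test in Listing 21.5 guards against.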
<H3><A NAME="ExistingLogFileAnalyzingPrograms">
Existing Log File Analyzing Programs</A></H3>
<P>
Now that you've learned some of the basics of log file statistics,
you should check out a program called Statbot, which can be used
to automatically generate statistics and graphs. You can find
it at:
<BLOCKQUOTE>
<PRE>http://www.xmission.com:80/~dtubbs/
</PRE>
</BLOCKQUOTE>
<P>
Statbot is a WWW log analyzer, statistics generator, and database
program. It works by "snooping" on the logfiles generated
by most WWW servers and creating a database that contains information
about the WWW server. This database is then used to create a
statistics page and GIF charts that can be "linked to"
by other WWW resources.
<P>
Because Statbot "snoops" on the server logfiles, it
does not require the use of the server's cgi-bin capability.
It simply runs from the user's own directory, automatically updating
statistics. Statbot uses a text-based configuration file for setup,
so it is very easy to install and operate, even for people with
no programming experience. Most importantly, Statbot is fast.
Once it is up and running, updating the database and creating
the new HTML page can take as little as 10 seconds. Because of
this, many Statbot users run Statbot once every 5-10 minutes,
which provides them with the very latest statistical information
about their site.
<P>
Another fine log analysis program is AccessWatch, written by Dave
Maher. AccessWatch is a World Wide Web utility that provides a
comprehensive view of daily accesses for individual users. It
is equally capable of gathering statistics for an entire server.
It provides a regularly updated summary of WWW server hits and
accesses, and gives a graphical representation of available statistics.
It generates statistics for hourly server load, page demand, accesses
by domain, and accesses by host. AccessWatch parses the WWW server
log and searches for a common set of documents, usually specified
by a user's root directory, such as /~username/ or /users/username.
AccessWatch displays results in a graphical, compact format.
<P>
If you'd like to look at <I>all</I> of the available log file
analyzers, go to Yahoo's Log Analysis Tools page:
<BLOCKQUOTE>
<PRE>http://www.yahoo.com/Computers_and_Internet/Internet/
World_Wide_Web/HTTP/Servers/Log_Analysis_Tools/
</PRE>
</BLOCKQUOTE>
<P>
This page lists all types of log file analyzers, from simple Perl
scripts to full-blown graphical applications.
<H3><A NAME="CreatingYourOwnCGILogFile">
Creating Your Own CGI Log File</A></H3>
<P>
It is generally a good idea to keep track of who executes your
CGI scripts. You've already been introduced to the environment
variables that are available within your CGI script. Using the
information provided by those environment variables, you can create
your own log file.
<P>
<IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE>
<I>Turn on the warning option.<BR>
Define the </I><TT><I>writeCgiEntry()</I></TT><I>
function.<BR>
Initialize the log file name.<BR>
Initialize the name of the current script.<BR>
Create local versions of environment variables.<BR>
Open the log file in append mode.<BR>
Output the variables using ! as a field delimiter.<BR>
Close the log file.<BR>
Call the </I><TT><I>writeCgiEntry()</I></TT><I>
function.<BR>
Create a test HTML page.</I>
</BLOCKQUOTE>
<P>
Listing 21.6 shows how to create your own CGI log file based on
environment variables.
<HR>
<P>
<B>Listing 21.6 21LST06.PL-Creating Your Own CGI Log
File Based on Environment Variables<BR>
</B>
<BLOCKQUOTE>
<PRE>#!/usr/bin/perl -w

sub writeCgiEntry {
    my($logFile) = "cgi.log";
    my($script)  = __FILE__;
    my($name)    = $ENV{'REMOTE_HOST'};
    my($addr)    = $ENV{'REMOTE_ADDR'};
    my($browser) = $ENV{'HTTP_USER_AGENT'};
    my($time)    = time;

    open(LOGFILE, ">>$logFile") or die("Can't open cgi log file.\n");
    print LOGFILE ("$script!$name!$addr!$browser!$time\n");
    close(LOGFILE);
}

writeCgiEntry();

# do some CGI activity here.
print "Content-type: text/html\n\n";
print "<HTML>";
print "<TITLE>CGI Test</TITLE>";
print "<BODY><H1>Testing!</H1></BODY>";
print "</HTML>";
</PRE>
</BLOCKQUOTE>
<HR>
<P>
Every time this script is called, an entry will be made in the
CGI log file. If you place a call to the <TT>writeCgiEntry()</TT>
function in all of your CGI scripts, after a while you will be
able to perform some statistical analysis on who uses your CGI scripts.
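<P>
As one example of such an analysis, the following sketch counts how
many times each browser appears in the log. It assumes the
<TT>!</TT>-delimited entry format written by Listing 21.6; the sample
entries are made up so that the sketch can run stand-alone:

```perl
# A sketch of analyzing the cgi.log file written by Listing 21.6.
# Build a small sample log first so the sketch is self-contained;
# in practice the file would already exist.
open(LOGFILE, ">cgi.log") or die("Can't create cgi log file.\n");
print LOGFILE ("test.pl!host.a!10.0.0.1!Mozilla/3.0!846800000\n");
print LOGFILE ("test.pl!host.b!10.0.0.2!Lynx/2.7!846800100\n");
print LOGFILE ("form.pl!host.a!10.0.0.1!Mozilla/3.0!846800200\n");
close(LOGFILE);

# Split each entry on the ! field delimiter and tally the browsers.
open(LOGFILE, "<cgi.log") or die("Can't open cgi log file.\n");
while (<LOGFILE>) {
    chomp;
    ($script, $name, $addr, $browser, $time) = split(/!/);
    $browserList{$browser}++;
}
close(LOGFILE);

foreach $browser (sort(keys(%browserList))) {
    print("$browser: $browserList{$browser}\n");
}
```

The same loop could just as easily tally the <TT>$script</TT> or
<TT>$addr</TT> fields to see which scripts are most popular or which
hosts call them most often.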
<H2><A NAME="CommunicatingwithUsers"><FONT SIZE=5 COLOR=#FF0000>
Communicating with Users</FONT></A></H2>
<P>
So far in this chapter, we've looked at examining server log
files. Perl is also very useful for creating the Web pages that
users will view.
<H3><A NAME="ExampleGeneratingaWhatsNewPage">
Example: Generating a What's New Page</A></H3>
<P>
One of the most common features of a Web site is a What's New
page. This page typically lists all of the files modified in the
last week or month along with a short description of the document.
<P>
A What's New page is usually automatically generated using a scheduler
program, like <TT>cron</TT>. If you
try to generate the What's New page via a CGI script, your server
will quickly be overrun by the large number of disk accesses that
will be required and your users will be upset that a simple What's
New page takes so long to load.
<P>
Perl is an excellent tool for creating a What's New page. It has
good directory access functions and regular expressions that can
be used to search for titles or descriptions in HTML pages. Listing
21.7 contains a Perl program that will start at a specified base
directory and search for files that have been modified since the
last time that the script was run. When the search is complete,
an HTML page is generated. You can have your home page point to
the automatically generated What's New page.
<P>
This program uses a small data file, called <TT>new.log</TT>, to
keep track of the last time that the program was run. Any files
that have changed since that date are displayed on the HTML page.
<BR>
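<P>
The core test behind a What's New page is simple: compare each file's
modification time against the time stored in <TT>new.log</TT>. Here is
a minimal sketch of just that test; the scratch file and the
one-day-ago timestamp are made up for illustration:

```perl
# $lastRunTime would normally be read from the new.log data file;
# here it is simply set to one day ago.
$lastRunTime = time - (24 * 60 * 60);

# Create a scratch file so the sketch can be run stand-alone.
open(FILE, ">scratch.htm") or die("Can't create scratch file.\n");
print FILE ("<TITLE>Scratch</TITLE>\n");
close(FILE);

# Item 9 of the list returned by stat() is the modification time.
$modTime = (stat("scratch.htm"))[9];

if ($modTime > $lastRunTime) {
    print("scratch.htm is new or was recently changed.\n");
}
```

Listing 21.7 applies this same comparison to every file it finds while
walking the directory tree.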
<p>
<CENTER>
<TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Note</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
This program contains the first significant use of recursion in this book. Recursion happens when a function calls itself and will be fully explained after the program listing.</BLOCKQUOTE>
</TD></TR>
</TABLE>