it at:
<BLOCKQUOTE><PRE>http://www.xmission.com:80/~dtubbs/</PRE></BLOCKQUOTE>
<P>Statbot is a WWW log analyzer, statistics generator, and database program. It works by &quot;snooping&quot; on the logfiles generated by most WWW servers and creating a database that contains information about the WWW server. This database is then used to create a statistics page and GIF charts that can be &quot;linked to&quot; by other WWW resources.
<P>Because Statbot &quot;snoops&quot; on the server logfiles, it does not require the use of the server's cgi-bin capability. It simply runs from the user's own directory, automatically updating statistics. Statbot uses a text-based configuration file for setup, so it is very easy to install and operate, even for people with no programming experience. Most importantly, Statbot is fast. Once it is up and running, updating the database and creating the new HTML page can take as little as 10 seconds. Because of this, many Statbot users run Statbot once every 5-10 minutes, which provides them with the very latest statistical information about their site.
<P>Another fine log analysis program is AccessWatch, written by Dave Maher. AccessWatch is a World Wide Web utility that provides a comprehensive view of daily accesses for individual users. It is equally capable of gathering statistics for an entire server. It provides a regularly updated summary of WWW server hits and accesses, and gives a graphical representation of available statistics. It generates statistics for hourly server load, page demand, accesses by domain, and accesses by host.
AccessWatch parses the WWW server log and searches for a common set of documents, usually specified by a user's root directory, such as /~username/ or /users/username. AccessWatch displays results in a graphical, compact format.
<P>If you'd like to look at <I>all</I> of the available log file analyzers, go to Yahoo's Log Analysis Tools page:
<BLOCKQUOTE><PRE>http://www.yahoo.com/Computers_and_Internet/Internet/
    World_Wide_Web/HTTP/Servers/Log_Analysis_Tools/</PRE></BLOCKQUOTE>
<P>This page lists all types of log file analyzers, from simple Perl scripts to full-blown graphical applications.
<H3><A NAME="CreatingYourOwnCGILogFile">Creating Your Own CGI Log File</A></H3>
<P>It is generally a good idea to keep track of who executes your CGI scripts. You've already been introduced to the environment variables that are available within your CGI script. Using the information provided by those environment variables, you can create your own log file.
<P><IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE><I>Turn on the warning option.<BR><I>Define the </I></I><TT><I>writeCgiEntry()</I></TT><I> function.<BR>
Initialize the log file name.<BR>
Initialize the name of the current script.<BR>
Create local versions of environment variables.<BR>
Open the log file in append mode.<BR>
Output the variables using ! as a field delimiter.<BR>
Close the log file.<BR>
Call the </I><TT><I>writeCgiEntry()</I></TT><I> function.<BR>
Create a test HTML page.</I></BLOCKQUOTE>
<P>Listing 21.6 shows how to create your own CGI log file based on environment variables.
<HR>
<P><B>Listing 21.6&nbsp;&nbsp;21LST06.PL-Creating Your Own CGI Log File Based on Environment Variables<BR></B>
<BLOCKQUOTE><PRE>#!/usr/bin/perl -w

sub writeCgiEntry {
    my($logFile) = &quot;cgi.log&quot;;
    my($script)  = __FILE__;
    my($name)    = $ENV{'REMOTE_HOST'};
    my($addr)    = $ENV{'REMOTE_ADDR'};
    my($browser) = $ENV{'HTTP_USER_AGENT'};
    my($time)    = time;

    # append one !-delimited record per call.
    open(LOGFILE,&quot;&gt;&gt;$logFile&quot;) or die(&quot;Can't open cgi log file.\n&quot;);
    print LOGFILE (&quot;$script!$name!$addr!$browser!$time\n&quot;);
    close(LOGFILE);
}

writeCgiEntry();

# do some CGI activity here.

print &quot;Content-type: text/html\n\n&quot;;
print &quot;&lt;HTML&gt;&quot;;
print &quot;&lt;TITLE&gt;CGI Test&lt;/TITLE&gt;&quot;;
print &quot;&lt;BODY&gt;&lt;H1&gt;Testing!&lt;/H1&gt;&lt;/BODY&gt;&quot;;
print &quot;&lt;/HTML&gt;&quot;;</PRE></BLOCKQUOTE>
<HR>
<P>Every time this script is called, an entry will be made in the CGI log file. If you place a call to the <TT>writeCgiEntry()</TT> function in all of your CGI scripts, after a while you will be able to perform some statistical analysis on who uses your CGI scripts.
<H2><A NAME="CommunicatingwithUsers"><FONT SIZE=5 COLOR=#FF0000>Communicating with Users</FONT></A></H2>
<P>So far in this chapter, we've been examining the server log files. Perl is also very useful for creating the Web pages that the user will view.
<H3><A NAME="ExampleGeneratingaWhatsNewPage">Example: Generating a What's New Page</A></H3>
<P>One of the most common features of a Web site is a What's New page.
This page typically lists all of the files modified in the last week or month along with a short description of each document.
<P>A What's New page is usually generated automatically by a scheduler program, like <TT>cron</TT>. If you try to generate the What's New page via a CGI script, your server will quickly be overrun by the large number of disk accesses that will be required, and your users will be upset that a simple What's New page takes so long to load.
<P>Perl is an excellent tool for creating a What's New page. It has good directory access functions and regular expressions that can be used to search for titles or descriptions in HTML pages. Listing 21.7 contains a Perl program that will start at a specified base directory and search for files that have been modified since the last time that the script was run. When the search is complete, an HTML page is generated. You can have your home page point to the automatically generated What's New page.
<P>This program uses a small data file, called <TT>new.log</TT>, to keep track of the last time that the program was run. Any files that have changed since that date are displayed on the HTML page.<BR><p>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%><TR><TD><B>Note</B></TD></TR><TR><TD><BLOCKQUOTE>This program contains the first significant use of recursion in this book. Recursion happens when a function calls itself and will be fully explained after the program listing.</BLOCKQUOTE></TD></TR></TABLE></CENTER>
<P><P><IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE><I>Turn on the warning option.<BR>
Turn on the strict pragma.<BR>
Declare some variables.<BR>
Call the </I><TT><I>checkFiles()</I></TT><I> function to find modified files.<BR>
Call the </I><TT><I>setLastTime()</I></TT><I> function to update the log file.<BR>
Call the </I><TT><I>createHTML()</I></TT><I> function to create the web page.<BR><BR>
Define the </I><TT><I>getLastTime()</I></TT><I> function.<BR>
Declare local variables to hold the parameters.<BR>
If the data file can't be opened, use the current time as the default.<BR>
Read in the time of the last running of the program.<BR>
Close the data file.<BR>
Return the time.<BR>
Define the </I><TT><I>setLastTime()</I></TT><I> function.<BR>
Declare local variables to hold the parameters.<BR>
Open the data file for writing.<BR>
Output </I><TT><I>$time</I></TT><I>, which is the time of the current run of this program.
<BR>Close the data file.<BR>
Define the </I><TT><I>checkFiles()</I></TT><I> function.<BR>
Declare local variables to hold the parameters.<BR>
Declare more local variables.<BR>
Create an array containing the files in the </I><TT><I>$path</I></TT><I> directory.<BR>
Iterate over the list of files.<BR>
If current file is current dir or parent dir, move on to next file.<BR>
Create full filename by joining dir (</I><TT><I>$path</I></TT><I>) with filename (</I><TT><I>$_</I></TT><I>).<BR>
If current file is a directory, then recurse and move to next file.<BR>
Get last modification time of current file.<BR>
Provide a default value for the file's title.<BR>
If the file has been changed since the last running of this program, open the file, look for a title HTML tag, and close the file.<BR>
Create an anonymous array and assign it to a hash entry.<BR>
Define the </I><TT><I>createHTML()</I></TT><I> function.<BR>
Declare local variables to hold the parameters.<BR>
Declare more local variables.<BR>
Open the HTML file for output.<BR>
Output the HTML header and title tags.<BR>
Output an H1 header tag.<BR>
If no files have changed, output a message.<BR>
Otherwise output the HTML tags to begin a table.<BR>
Iterate over the list of modified files.<BR>
Output info about modified file as an HTML table row.<BR>
Output the HTML tags to end a table.<BR>
Output the HTML tags to end the document.<BR>
Close the HTML file.</I></BLOCKQUOTE>
<HR>
<P><B>Listing 21.7&nbsp;&nbsp;21LST07.PL-Generating a Primitive What's New Page<BR></B>
<BLOCKQUOTE><PRE>#!/usr/bin/perl -w
use strict;

my($root)     = &quot;/website/root&quot;;          # root of server
my($newLog)   = &quot;new.log&quot;;                # file w/time of last run.
my($htmlFile) = &quot;$root/whatnew.htm&quot;;      # output file.
my($lastTime) = getLastTime($newLog);     # time of last run.
my(%modList);                             # hash of modified files.

checkFiles($root, $root, $lastTime, \%modList);
setLastTime($newLog, time());
createHTML($htmlFile, $lastTime, \%modList);

sub getLastTime {
    my($newLog) = shift;        # filename of log file.
    my($time)   = time();       # the current time is the default.

    if (open(NEWLOG, &quot;&lt;$newLog&quot;)) {
        chomp($time = &lt;NEWLOG&gt;);
        close(NEWLOG);
    }
    return($time);
}

sub setLastTime {
    my($newLog) = shift;        # filename of log file.
    my($time)   = shift;        # the time of this run.

    open(NEWLOG, &quot;&gt;$newLog&quot;) or die(&quot;Can't write What's New log file.&quot;);
    print NEWLOG (&quot;$time\n&quot;);
    close(NEWLOG);
}

sub checkFiles {
    my($base)    = shift;   # the root of the dir tree to search
    my($path)    = shift;   # the current dir as we recurse
    my($time)    = shift;   # the time of the last run of this script
    my($hashRef) = shift;   # the hash where modified files are listed.
    my($fullFilename);      # a combo of $path and the current filename.
    my(@files);             # holds a list of files in current dir.
    my($title);             # the HTML title of a modified doc.
    my($modTime);           # the modification time of a modified doc.

    opendir(ROOT, $path) or return;   # skip directories that can't be read.
    @files = readdir(ROOT);
    closedir(ROOT);

    foreach (@files) {
        next if /^\.\.?$/;              # skip only the . and .. entries.

        $fullFilename    = &quot;$path/$_&quot;;

        if (-d $fullFilename) {
            checkFiles($base, $fullFilename, $time, $hashRef);
            next;
        }

        $modTime = (stat($fullFilename))[9]; # only need the mod time.
        $title   = 'Untitled';               # provide a default value

        if ($modTime &gt; $time) {
            if (open(FILE, &quot;&lt;$fullFilename&quot;)) {
                while (&lt;FILE&gt;) {
                    if (m!&lt;title&gt;(.+)&lt;/title&gt;!i) {
                        $title = $1;
                        last;
                    }
                }
                close(FILE);
            }

            # key is the path relative to $base; value is an anonymous array.
            $hashRef-&gt;{substr($fullFilename, length($base))} =
                [ $modTime, $title ];
        }
    }
}

sub createHTML {
    my($htmlFile)   = shift;
    my($lastTime)   = shift;
    my($hashRef)    = shift;
    my($htmlTitle)  = &quot;What's New Since &quot; . scalar(localtime($lastTime)). &quot;!&quot;;
    my(@sortedList) = sort(keys(%{$hashRef}));

    open(HTML, &quot;&gt;$htmlFile&quot;) or die(&quot;Can't write What's New page.\n&quot;);

    print HTML (&quot;&lt;HTML&gt;\n&quot;);
    print HTML (&quot;&lt;HEAD&gt;&lt;TITLE&gt;$htmlTitle&lt;/TITLE&gt;&lt;/HEAD&gt;\n&quot;);
    print HTML (&quot;&lt;BODY&gt;\n&quot;);
    print HTML (&quot;&lt;H1&gt;$htmlTitle&lt;/H1&gt;&lt;P&gt;\n&quot;);

    if (scalar(@sortedList) == 0) {
        print HTML (&quot;There are no new files.\n&quot;);
    }
    else {
        print HTML (&quot;&lt;TABLE BORDER=1 CELLPADDING=10&gt;\n&quot;);
        print HTML (&quot;&lt;TR&gt;\n&quot;);
        print HTML (&quot;  &lt;TH&gt;Filename&lt;/TH&gt;\n&quot;);
        print HTML (&quot;  &lt;TH&gt;Modification&lt;BR&gt;Date&lt;/TH&gt;\n&quot;);
        print HTML (&quot;  &lt;TH&gt;Title&lt;/TH&gt;\n&quot;);
        print HTML (&quot;&lt;/TR&gt;\n&quot;);

        foreach (@sortedList) {
            my($modTime, $title) = @{$hashRef-&gt;{$_}};

            $modTime = scalar(localtime($modTime));
            print HTML (&quot;&lt;TR&gt;\n&quot;);
            print HTML (&quot;  &lt;TD&gt;&lt;FONT SIZE=2&gt;&lt;A HREF=\&quot;$_\&quot;&gt;$_&lt;/A&gt;&lt;/FONT&gt;&lt;/TD&gt;\n&quot;);
            print HTML (&quot;  &lt;TD&gt;&lt;FONT SIZE=2&gt;$modTime&lt;/FONT&gt;&lt;/TD&gt;\n&quot;);
            print HTML (&quot;  &lt;TD&gt;&lt;FONT SIZE=2&gt;$title&lt;/FONT&gt;&lt;/TD&gt;\n&quot;);
            print HTML (&quot;&lt;/TR&gt;\n&quot;);
        }

        print HTML (&quot;&lt;/TABLE&gt;\n&quot;);
    }

    print HTML (&quot;&lt;/BODY&gt;\n&quot;);
    print HTML (&quot;&lt;/HTML&gt;\n&quot;);
    close(HTML);
}</PRE></BLOCKQUOTE>
<HR>
<P>The program from Listing 21.7 will generate an HTML file that can be displayed in any browser capable of handling HTML tables. Figure 21.2 shows how the page looks in Netscape Navigator.
<P><A HREF="f21-2.gif"><B>Figure 21.2 : </B><I>A What's New page</I>.</A>
<P>You might wonder why I end the HTML lines with newline characters when newlines are ignored by Web browsers. The newline characters will help you to edit the resulting HTML file with a standard text editor if you need to make an emergency change. For example, a document might change status from <TT>visible</TT> to <TT>for internal use only</TT> and you'd like to remove it from the What's New page. It is much easier to fire up a text editor and remove the reference than to rerun the What's New script.
<P>I think the only tricky code in Listing 21.7 is where it creates an anonymous array that is stored into the hash that holds the changed files. Look at that line of code closely.
<BLOCKQUOTE><PRE>$hashRef-&gt;{substr($fullFilename, length($base))} = [ $modTime, $title ];</PRE></BLOCKQUOTE>
<P>The <TT>$hashRef</TT> variable holds a reference to <TT>%modList</TT> that was passed from the main program. The key part of the key-value pair for this hash is the relative path and file name. The value part is an anonymous array that holds the modification time and the document title.<BR><p>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%><TR><TD><B>Tip</B></TD></TR><TR><TD><BLOCKQUOTE>An array was used to store the information about the modified file so that you can easily change the program to display additional information.
You might also want to display the file size or perhaps some category information.</BLOCKQUOTE>
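<P>As a rough illustration of the tip above, the following sketch stores a third element, the file size, in the anonymous array and then unpacks it again. The <TT>recordFile()</TT> function, the sample paths, and the sample times and sizes are hypothetical and are not part of Listing 21.7; in the real program the size could come from <TT>stat()</TT>.

```perl
#!/usr/bin/perl -w
use strict;

# Sketch only: recordFile() and the sample data below are hypothetical
# and do not appear in Listing 21.7.
sub recordFile {
    my($hashRef) = shift;   # reference to the hash of modified files.
    my($relPath) = shift;   # key: path relative to the base directory.
    my($modTime) = shift;   # modification time in epoch seconds.
    my($title)   = shift;   # the HTML title of the document.
    my($size)    = shift;   # the extra field: file size in bytes.

    # The value is a reference to an anonymous array, so adding a
    # field only means adding one more element to the list.
    $hashRef->{$relPath} = [ $modTime, $title, $size ];
}

my(%modList);

recordFile(\%modList, "/docs/index.htm", 838000000, "Home Page", 2048);
recordFile(\%modList, "/docs/faq.htm",   838000500, "FAQ",       512);

foreach my $path (sort(keys(%modList))) {
    # Dereference the anonymous array and unpack its three elements.
    my($modTime, $title, $size) = @{$modList{$path}};

    print("$path: $title, $size bytes, modified ",
          scalar(localtime($modTime)), "\n");
}
```

<P>Because each hash value is a list, the code that reads the hash only needs one more variable on the left side of the assignment to pick up the new field.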
