</I>Define a format for the report's detail line.<BR>
Define a format for the report's header line.<BR>
Define the <TT>parseLogEntry()</TT> function.<BR>
Declare a local variable to hold the pattern that matches a single item.<BR>
Use the matching operator to extract information into pattern memory.<BR>
Return a list that contains the 11 items extracted from the log entry.<BR>
Open the log file.<BR>
Iterate over each line of the log file.<BR>
Parse the entry to extract the 11 items but only keep the file specification that was requested.<BR>
Put the filename into pattern memory.<BR>
Store the filename into <TT>$fileName</TT>.<BR>
Test to see if <TT>$fileName</TT> is defined.<BR>
Increment the file specification's value in the <TT>%docList</TT> hash.<BR>
Close the log file.<BR>
Iterate over the hash that holds the file specifications.<BR>
Write out each hash entry in a report.</BLOCKQUOTE>
<HR>
<P><B>Listing 21.3 21LST03.PL-Creating a Report of the Access Counts for Documents that Start with the Letter S<BR></B>
<BLOCKQUOTE><PRE>
#!/usr/bin/perl -w

# Detail line: document name (left-justified) and access count (right-justified).
format =
  @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<  @>>>>>>>
  $document,                               $count
.

# Header lines, printed at the top of each page; $% holds the page number.
format STDOUT_TOP =
  @||||||||||||||||||||||||||||||||||||  Pg @<
  "Access Counts for S* Documents",          $%
  Document                                 Access Count
  ---------------------------------------  ------------
.

# Split the log entry held in $_ into its 11 items.
sub parseLogEntry {
    my($w) = "(.+?)";
    m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w/;
    return($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11);
}

$LOGFILE = "access.log";
open(LOGFILE) or die("Could not open log file.");
foreach (<LOGFILE>) {
    # Element 7 of the returned list is the requested file specification.
    $fileSpec = (parseLogEntry())[7];
    $fileSpec =~ m!.+/(.+)!;
    $fileName = $1;
    # some requests don't specify a filename, just a directory.
    if (defined($fileName)) {
        $docList{$fileSpec}++ if $fileName =~ m/^s/i;
    }
}
close(LOGFILE);

foreach $document (sort(keys(%docList))) {
    $count = $docList{$document};
    write;
}
</PRE></BLOCKQUOTE>
<HR>
<P>This program displays:<BR>
<BLOCKQUOTE><PRE>
     Access Counts for S* Documents      Pg 1
  Document                                 Access Count
  ---------------------------------------  ------------
  /~bamohr/scapenow.gif                           1
  /~jltinche/songs/song2.gif                      5
  /~mtmortoj/mortoja_html/song.html               1
  /~scmccubb/pics/shock.gif                       1
</PRE></BLOCKQUOTE>
<P>This program has a couple of points that deserve a comment or two. First, notice that the program takes advantage of the fact that Perl's variables default to a global scope. The main program sets <TT>$_</TT> to each log file entry, and <TT>parseLogEntry()</TT> also accesses <TT>$_</TT> directly. This is okay for a small program, but larger programs should use local variables. Second, notice that it takes two steps to select files that start with a particular letter. The filename must first be extracted from <TT>$fileSpec</TT>, and only then can it be filtered inside the <TT>if</TT> statement. If the request has no filename, the server will probably default to <TT>index.html</TT>. However, this program doesn't take this into account; it simply ignores the log file entry if no file was explicitly requested.
<P>You can use this same counting technique to display the most frequent remote sites that contact your server. You can also check the status code to see how many requests have been rejected. The next section looks at status codes.
<H3><A NAME="ExampleLookingattheStatusCode">Example: Looking at the Status Code</A></H3>
<P>It is important for you to periodically check the server's log file in order to determine if unauthorized people are trying to access secured documents. This is done by checking the status code in the log file entries.
<P>Every status code is a three-digit number. The first digit defines how your server responded to the request.
The last two digits do not have any categorization role. There are five values for the first digit:
<UL>
<LI><B>1xx</B>: Informational - Not used, but reserved for future use.
<LI><B>2xx</B>: Success - The action was successfully received, understood, and accepted.
<LI><B>3xx</B>: Redirection - Further action must be taken in order to complete the request.
<LI><B>4xx</B>: Client Error - The request contains bad syntax or cannot be fulfilled.
<LI><B>5xx</B>: Server Error - The server failed to fulfill an apparently valid request.
</UL>
<P>Table 21.1 contains a list of the most common status codes that can appear in your log file. You can find a complete list on the <B>http://www.w3.org/pub/WWW/Protocols/HTTP/1.0/spec.html</B> Web page.<BR>
<P><CENTER><B>Table 21.1 The Most Common Server Status Codes</B></CENTER>
<p>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD WIDTH=97><CENTER><I>Status Code</I></CENTER></TD><TD WIDTH=234><I>Description</I></TD></TR>
<TR><TD WIDTH=97><CENTER>200</CENTER></TD><TD WIDTH=234>OK</TD></TR>
<TR><TD WIDTH=97><CENTER>204</CENTER></TD><TD WIDTH=234>No content</TD></TR>
<TR><TD WIDTH=97><CENTER>301</CENTER></TD><TD WIDTH=234>Moved permanently</TD></TR>
<TR><TD WIDTH=97><CENTER>302</CENTER></TD><TD WIDTH=234>Moved temporarily</TD></TR>
<TR><TD WIDTH=97><CENTER>400</CENTER></TD><TD WIDTH=234>Bad request</TD></TR>
<TR><TD WIDTH=97><CENTER>401</CENTER></TD><TD WIDTH=234>Unauthorized</TD></TR>
<TR><TD WIDTH=97><CENTER>403</CENTER></TD><TD WIDTH=234>Forbidden</TD></TR>
<TR><TD WIDTH=97><CENTER>404</CENTER></TD><TD WIDTH=234>Not found</TD></TR>
<TR><TD WIDTH=97><CENTER>500</CENTER></TD><TD WIDTH=234>Internal server error</TD></TR>
<TR><TD WIDTH=97><CENTER>501</CENTER></TD><TD WIDTH=234>Not implemented</TD></TR>
<TR><TD WIDTH=97><CENTER>503</CENTER></TD><TD WIDTH=234>Service unavailable</TD></TR>
</TABLE></CENTER>
<P>Status code 401 is logged when a user requests a secured document without valid credentials, for example by entering an incorrect password.
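<P>The five classes listed above can be read directly from the first digit of a status code. The following is a minimal sketch rather than one of the chapter's listings; the function name <TT>statusClass()</TT> and the <TT>%classes</TT> hash are hypothetical names chosen for illustration.
<BLOCKQUOTE><PRE>
#!/usr/bin/perl -w
use strict;

# Hypothetical helper (not part of the chapter's listings): map a
# three-digit status code to the class named by its first digit.
sub statusClass {
    my($status) = @_;
    my(%classes) = (
        1 => 'Informational',
        2 => 'Success',
        3 => 'Redirection',
        4 => 'Client Error',
        5 => 'Server Error',
    );

    # Anything that is not exactly three digits is not a status code.
    return('Unknown') unless defined($status) and $status =~ m/^(\d)\d\d$/;

    # $1 holds the first digit captured by the match above.
    return(exists($classes{$1}) ? $classes{$1} : 'Unknown');
}

# Print the class of a few codes taken from Table 21.1.
foreach my $status (200, 302, 401, 404, 500) {
    print("$status: ", statusClass($status), "\n");
}
</PRE></BLOCKQUOTE>
<P>Keeping the class names in a hash means the last two digits are never inspected, which matches their lack of a categorization role. Status code 401, for example, is reported as a Client Error.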
By searching the log file for status code 401, you can create a report of the failed attempts to gain entry into your site. Listing 21.4 shows how the log file could be searched for a specific error code, in this case 401.
<P><IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE><I>Turn on the warning option.<BR>
Define a format for the report's detail line.<BR>
Define a format for the report's header line.<BR>
Define the </I><TT><I>parseLogEntry()</I></TT><I> function.<BR>
Declare a local variable to hold the pattern that matches a single item.<BR>
Use the matching operator to extract information into pattern memory.<BR>
Return a list that contains the 11 items extracted from the log entry.<BR>
Open the log file.<BR>
Iterate over each line of the log file.<BR>
Parse the entry to extract the 11 items but only keep the site information and the status code.<BR>
If the status code is 401, increment the counter for that site.<BR>
Close the log file.<BR>
Check the list of site names to see if it has any entries.
If not, display a message that says no unauthorized accesses took place.<BR>
Iterate over the hash that holds the site names.<BR>
Write out each hash entry in a report.</I></BLOCKQUOTE>
<HR>
<P><B>Listing 21.4 21LST04.PL-Checking for Unauthorized Access Attempts<BR></B>
<BLOCKQUOTE><PRE>
#!/usr/bin/perl -w

# Detail line: remote site name (left-justified) and count (right-justified).
format =
  @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<  @>>>>>>>
  $site,                                   $count
.

# Header lines, printed at the top of each page; $% holds the page number.
format STDOUT_TOP =
  @||||||||||||||||||||||||||||||||||||  Pg @<
  "Unauthorized Access Report",             $%
  Remote Site Name                         Access Count
  ---------------------------------------  ------------
.

# Split the log entry held in $_ into its 11 items.
sub parseLogEntry {
    my($w) = "(.+?)";
    m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w/;
    return($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11);
}

$LOGFILE = "access.log";
open(LOGFILE) or die("Could not open log file.");
foreach (<LOGFILE>) {
    # Elements 0 and 9 are the remote site name and the status code.
    ($site, $status) = (parseLogEntry())[0, 9];
    if ($status eq '401') {
        $siteList{$site}++;
    }
}
close(LOGFILE);

@sortedSites = sort(keys(%siteList));
if (scalar(@sortedSites) == 0) {
    print("There were no unauthorized access attempts.\n");
}
else {
    foreach $site (@sortedSites) {
        $count = $siteList{$site};
        write;
    }
}
</PRE></BLOCKQUOTE>
<P>This program displays:
<BLOCKQUOTE><PRE>
       Unauthorized Access Report        Pg 1
  Remote Site Name                         Access Count
  ---------------------------------------  ------------
  ip48-max1-fitch.zipnet.net                      1
  kairos.algonet.se                               4
</PRE></BLOCKQUOTE>
<P>You can expand this program's usefulness by also displaying the logName and fullName items from the log file.
<H3><A NAME="ExampleConvertingtheReporttoaWebPage">Example: Converting the Report to a Web Page</A></H3>
<P>Creating nice reports for your own use is all well and good. But suppose your boss wants the statistics updated hourly and available on demand? Printing the report and faxing it to the head office is probably a bad idea. One solution is to convert the report into a Web page. Listing 21.5 contains a program that does just that. The program will create a Web page that displays the access counts for the documents that start with the letter 's.'
Figure 21.1 shows the Web page that displayed the access counts.
<P><A HREF="f21-1.gif"><B>Figure 21.1 : </B><I>The Web page that displayed the Access Counts</I>.</A>
<P><IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE><I>Turn on the warning option.<BR>
Define the </I><TT><I>parseLogEntry()</I></TT><I> function.<BR>
Declare a local variable to hold the pattern that matches a single item.<BR>
Use the matching operator to extract information into pattern memory.<BR>
Return a list that contains the 11 items extracted from the log entry.<BR>
Initialize some variables to be used later: the file name of the access log, the web page file name, and the email address of the web page maintainer.<BR>
Open the log file.<BR>
Iterate over each line of the log file.<BR>
Parse the entry to extract the 11 items but only keep the file specification that was requested.<BR>
Put the filename into pattern memory.<BR>
Store the filename into </I><TT><I>$fileName</I></TT><I>.<BR>
Test to see if </I><TT><I>$fileName</I></TT><I> is defined.<BR>
Increment the file specification's value in the </I><TT><I>%docList</I></TT><I> hash.<BR>
Close the log file.<BR>
Open the output file that will become the web page.<BR>
Output the HTML header.<BR>
Start the body of the HTML page.<BR>
Output the current time.<BR>
Start an unordered list so the subsequent table is indented.<BR>
Start an HTML table.<BR>
Output the heading for the two columns the table will use.<BR>
Iterate over the hash that holds the document list.<BR>
Output a table row for each hash entry.<BR>
End the HTML table.<BR>
End the unordered list.<BR>
Output a message about who to contact if questions arise.<BR>
End the body of the page.<BR>
End the HTML.<BR>
Close the web page file.</I></BLOCKQUOTE>
<HR>
<P><B>Listing 21.5 21LST05.PL-Creating a Web Page to View Access Counts<BR></B>
<BLOCKQUOTE><PRE>
#!/usr/bin/perl -w

# Split the log entry held in $_ into its 11 items.
sub parseLogEntry {
    my($w) = "(.+?)";
    m/^$w $w $w \[$w:$w $w\] "$w $w $w" $w $w/;
    return($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11);
}

$LOGFILE  = "access.log";
$webPage  = "acescnt.htm";
$mailAddr = 'medined@planet.net';

open(LOGFILE) or die("Could not open log file.");
foreach (<LOGFILE>) {
    # Element 7 of the returned list is the requested file specification.
    $fileSpec = (parseLogEntry())[7];
    $fileSpec =~ m!.+/(.+)!;
    $fileName = $1;
    # some requests don't specify a filename, just a directory.
    if (defined($fileName)) {
        $docList{$fileSpec}++ if $fileName =~ m/^s/i;
    }
}
close(LOGFILE);

open(WEBPAGE, ">$webPage") or die("Could not open web page file.");
print WEBPAGE ("<HTML><HEAD><TITLE>Access Counts</TITLE></HEAD>");
print WEBPAGE ("<BODY>");
print WEBPAGE ("<H1>", scalar(localtime), "</H1>");
print WEBPAGE ("<UL>");
print WEBPAGE ("<TABLE BORDER=1 CELLPADDING=10>");
print WEBPAGE ("<TR><TH>Document</TH><TH>Access<BR>Count</TH></TR>");
foreach $document (sort(keys(%docList))) {
    $count = $docList{$document};
    print WEBPAGE ("<TR>");
    print WEBPAGE ("<TD><FONT SIZE=2><TT>$document</TT></FONT></TD>");
    print WEBPAGE ("<TD ALIGN=right>$count</TD>");
    print WEBPAGE ("</TR>");
}
print WEBPAGE ("</TABLE><P>");
print WEBPAGE ("</UL>");
print WEBPAGE ("Have questions? Contact <A HREF=\"mailto:$mailAddr\">$mailAddr</A>");
print WEBPAGE ("</BODY></HTML>");
close(WEBPAGE);
</PRE></BLOCKQUOTE>
<HR>
<H3><A NAME="ExistingLogFileAnalyzingPrograms">Existing Log File Analyzing Programs</A></H3>
<P>Now that you've learned some of the basics of log file statistics, you should check out a program called Statbot, which can be used to automatically generate statistics and graphs. You can find