
📄 ch20_14.htm

By Tom Christiansen and Nathan Torkington. ISBN 1-56592-243-3. First Edition, published August 1998.
<HTML>
<HEAD>
<TITLE>Recipe 20.13. Processing Server Logs (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:46:03Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch20_01.htm" TITLE="20. Web Automation">
<LINK REL="prev" HREF="ch20_13.htm" TITLE="20.12. Parsing a Web Server Log File">
<LINK REL="next" HREF="ch20_15.htm" TITLE="20.14. Program: htmlsub">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" />
<map name="banner-map">
<area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook">
<area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" />
</map>
<div class="navbar"><p>
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_13.htm" TITLE="20.12. Parsing a Web Server Log File"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 20.12. Parsing a Web Server Log File" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"><A CLASS="chapter" REL="up" HREF="ch20_01.htm" TITLE="20. Web Automation"></A></FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_15.htm" TITLE="20.14. Program: htmlsub"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 20.14. Program: htmlsub" BORDER="0"></A></TD>
</TR>
</TABLE>
</DIV>
<DIV CLASS="sect1">
<H2 CLASS="sect1"><A CLASS="title" NAME="ch20-16638">20.13.
Processing Server Logs</A></H2>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-1395">Problem<A CLASS="indexterm" NAME="ch20-idx-1000002688-0"></A><A CLASS="indexterm" NAME="ch20-idx-1000002688-1"></A><A CLASS="indexterm" NAME="ch20-idx-1000002688-2"></A><A CLASS="indexterm" NAME="ch20-idx-1000002688-3"></A></A></H3>
<P CLASS="para">You need to summarize your server logs, but you don't have a customizable program to do it.</P>
</DIV>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-1401">Solution</A></H3>
<P CLASS="para">Parse the access log yourself with regular expressions, or use the Logfile modules from CPAN.</P>
</DIV>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-1407">Discussion</A></H3>
<P CLASS="para"><A CLASS="xref" HREF="ch20_14.htm#ch20-16450" TITLE="sumwww">Example 20.9</A> is a sample report generator for an Apache web server log.</P>
<DIV CLASS="example">
<H4 CLASS="example"><A CLASS="title" NAME="ch20-16450">Example 20.9: sumwww</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w
# sumwww - summarize web server log activity

$lastdate = &quot;&quot;;
daily_logs();
summary();
exit;

# read CLF files and tally hits from the host and to the URL
sub daily_logs {
    while (&lt;&gt;) {
        ($type, $what) = /&quot;(GET|POST)\s+(\S+?) \S+&quot;/ or next;
        ($host, undef, undef, $datetime) = split;
        ($bytes) = /\s(\d+)\s*$/ or next;
        ($date)  = ($datetime =~ /\[([^:]*)/);
        # flush the previous day's tallies before counting this hit
        if ($date ne $lastdate) {
            if ($lastdate) { write_report()     }
            else           { $lastdate = $date  }
        }
        $posts  += ($type eq &quot;POST&quot;);
        $home++ if m, / ,;
        $count++;
        $hosts{$host}++;
        $what{$what}++;
        $bytesum += $bytes;
    }
    write_report() if $count;
}

# use *typeglob aliasing of global variables for cheap copy
sub summary  {
    $lastdate = &quot;Grand Total&quot;;
    *count   = *sumcount;
    *bytesum = *bytesumsum;
    *hosts   = *allhosts;
    *posts   = *allposts;
    *what    = *allwhat;
    *home    = *allhome;
    write;
}

# display the tallies of hosts and URLs, using formats
sub write_report {
    write;

    # add to summary data
    $lastdate    = $date;
    $sumcount   += $count;
    $bytesumsum += $bytesum;
    $allposts   += $posts;
    $allhome    += $home;

    # reset daily data
    $posts = $count = $bytesum = $home = 0;
    @allwhat{keys %what}   = keys %what;
    @allhosts{keys %hosts} = keys %hosts;
    %hosts = %what = ();
}

format STDOUT_TOP =
@|||||||||| @|||||| @||||||| @||||||| @|||||| @|||||| @|||||||||||||
&quot;Date&quot;,     &quot;Hosts&quot;, &quot;Accesses&quot;, &quot;Unidocs&quot;, &quot;POST&quot;, &quot;Home&quot;, &quot;Bytes&quot;
----------- ------- -------- -------- ------- ------- --------------
.

format STDOUT =
@&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt; @&gt;&gt;&gt;&gt;&gt;&gt; @&gt;&gt;&gt;&gt;&gt;&gt;&gt; @&gt;&gt;&gt;&gt;&gt;&gt;&gt; @&gt;&gt;&gt;&gt;&gt;&gt; @&gt;&gt;&gt;&gt;&gt;&gt; @&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt;
$lastdate, scalar(keys %hosts), $count, scalar(keys %what), $posts, $home, $bytesum
.
</PRE>
</DIV>
<P CLASS="para">Here's sample output from that program:</P>
<PRE CLASS="programlisting"><CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>    Date      Hosts  Accesses Unidocs   POST    Home       Bytes</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>----------- ------- -------- -------- ------- ------- --------------</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>19/May/1998     353     6447     3074     352      51       16058246</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>20/May/1998    1938    23868     4288     972     350       61879643</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>21/May/1998    1775    27872     6596    1064     376       64613798</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>22/May/1998    1680    21402     4467     735     285       52437374</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>23/May/1998    1128    21260     4944     592     186       55623059</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>Grand Total    6050   100849    10090    3715    1248      250612120</I></CODE></B></CODE>
</PRE>
<P CLASS="para">Use the <A CLASS="indexterm" NAME="ch20-idx-1000002689-0"></A>Logfile::Apache module from CPAN, shown in <A CLASS="xref" HREF="ch20_14.htm#ch20-35579" TITLE="aprept">Example 20.10</A>, to write a similar, but less specific, program.
This module is distributed with other Logfile modules in a single Logfile distribution (Logfile-0.115.tar.gz at the time of writing).</P>
<DIV CLASS="example">
<H4 CLASS="example"><A CLASS="title" NAME="ch20-35579">Example 20.10: aprept</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w
# aprept - report on Apache logs

use Logfile::Apache;

$l = Logfile::Apache-&gt;new(
    File  =&gt; &quot;-&quot;,                   # STDIN
    Group =&gt; [ Domain, File ]);

$l-&gt;report(Group =&gt; Domain, Sort =&gt; Records);
$l-&gt;report(Group =&gt; File,   List =&gt; [Bytes,Records]);
</PRE>
</DIV>
<P CLASS="para">The <CODE CLASS="literal">new</CODE> constructor reads a log file and builds indices internally. Supply a filename with the parameter named <CODE CLASS="literal">File</CODE> and the fields to index in the <CODE CLASS="literal">Group</CODE> parameter. The possible fields are <CODE CLASS="literal">Date</CODE> (date of the request), <CODE CLASS="literal">Hour</CODE> (time of day the request was received), <CODE CLASS="literal">File</CODE> (file requested), <CODE CLASS="literal">User</CODE> (username parsed from the request), <CODE CLASS="literal">Host</CODE> (hostname requesting the document), and <CODE CLASS="literal">Domain</CODE> (<CODE CLASS="literal">Host</CODE> translated into "France", "Germany", etc.).</P>
<P CLASS="para">To produce a report on STDOUT, call the <CODE CLASS="literal">report</CODE> method.
Give it the index to use with the <CODE CLASS="literal">Group</CODE> parameter, and optionally say how to sort (<CODE CLASS="literal">Records</CODE> is by number of hits, <CODE CLASS="literal">Bytes</CODE> is by number of bytes transferred) or how to further break it down (by number of bytes or number of records).</P>
<P CLASS="para">Here's some sample output:</P>
<PRE CLASS="programlisting"><CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>Domain                  Records </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>===============================</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>US Commercial        222 38.47% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>US Educational       115 19.93% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>Network               93 16.12% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>Unresolved            54  9.36% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>Australia             48  8.32% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>Canada                20  3.47% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>Mexico                 8  1.39% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>United Kingdom         6  1.04% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>File                               Bytes          Records </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>=========================================================</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>/                           13008  0.89%         6  1.04% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>/cgi-bin/MxScreen           11870  0.81%         2  0.35% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>/cgi-bin/pickcards          39431  2.70%        48  8.32% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>/deckmaster                143793  9.83%        21  3.64% </I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>/deckmaster/admin           54447  3.72%         3  0.52% </I></CODE></B></CODE>
<A CLASS="indexterm" NAME="ch20-idx-1000002691-0"></A><A CLASS="indexterm" NAME="ch20-idx-1000002691-1"></A><A CLASS="indexterm" NAME="ch20-idx-1000002691-2"></A><A CLASS="indexterm" NAME="ch20-idx-1000002691-3"></A><A CLASS="indexterm" NAME="ch20-idx-1000002691-4"></A><A CLASS="indexterm" NAME="ch20-idx-1000002691-5"></A></PRE>
</DIV>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch20-pgfId-1000003550">See Also</A></H3>
<P CLASS="para">The documentation for the CPAN module Logfile::Apache; <I CLASS="filename">perlform</I>(1) and the section on "Formats" in <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>.</P>
</DIV>
</DIV>
<DIV CLASS="htmlnav">
<P></P>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_13.htm" TITLE="20.12. Parsing a Web Server Log File"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 20.12. Parsing a Web Server Log File" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch20_15.htm" TITLE="20.14. Program: htmlsub"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 20.14. Program: htmlsub" BORDER="0"></A></TD>
</TR>
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">20.12.
Parsing a Web Server Log File</TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">20.14. Program: htmlsub</TD>
</TR>
</TABLE>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
</DIV>
<!-- LIBRARY NAV BAR -->
<img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links">
<p><FONT SIZE="-1"><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</FONT></p>
<map name="library-map">
<area shape="rect" coords="1,0,85,94" href="../index.htm">
<area shape="rect" coords="86,1,178,103" href="../lwp/index.htm">
<area shape="rect" coords="180,0,265,103" href="../lperl/index.htm">
<area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm">
<area shape="rect" coords="354,1,446,115" href="../prog/index.htm">
<area shape="rect" coords="448,0,526,132" href="../tk/index.htm">
<area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm">
<area shape="rect" coords="617,0,690,135" href="../pxml/index.htm">
</map>
</BODY>
</HTML>
