<HTML>

<HEAD>

<TITLE>Chapter 14 -- Perl and Tracking</TITLE>



<META>

</HEAD>

<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#CE2910">

<H1><FONT SIZE=6 COLOR=#FF0000>Chapter&nbsp;14</FONT></H1>

<H1><FONT SIZE=6 COLOR=#FF0000>Perl and Tracking</FONT></H1>

<HR>

<P>

<CENTER><B><FONT SIZE=5><A NAME="CONTENTS">CONTENTS</A></FONT></B></CENTER>

<UL>

<LI><A HREF="#Logging">

Logging</A>

<UL>

<LI><A HREF="#TheLogFile">

The Log File</A>

<LI><A HREF="#HTTPStatusCodes">

HTTP Status Codes </A>

</UL>

<LI><A HREF="#TrackingandEnvironmentalVariables">

Tracking and Environmental Variables</A>

<UL>

<LI><A HREF="#Browsers">

Browsers</A>

<LI><A HREF="#IPAddressesDomainNames">

IP Addresses/Domain Names</A>

<LI><A HREF="#TheReferURL">

The Refer URL</A>

</UL>

<LI><A HREF="#TrackingHitswiththeLog">

Tracking Hits with the Log</A>

<LI><A HREF="#CountersRevisited">

Counters Revisited</A>

<UL>

<LI><A HREF="#ManagingDBMFiles">

Managing DBM Files</A>

</UL>

<LI><A HREF="#NTPerlChecklistScript">

NT Perl Checklist Script</A>

<LI><A HREF="#ChapterInReview">

Chapter In Review</A>

</UL>

</UL>



<HR>

<P>

So far this book has said little about logs and other methods of
tracking. Once you have a complete Web site, you need to find out
how users travel through it. One way to do this is to track usage
with logs. To do this accurately, you can place a Perl script at
the top of the Web page to be tracked. To demonstrate this use of
Perl, this chapter adds a tracking element to the Goo Goo Records
Web site.

<H2><A NAME="Logging"><FONT SIZE=5 COLOR=#FF0000>

Logging</FONT></A></H2>

<P>

A computer keeps all kinds of logs, or lists of actions that happen
inside it. Logs tend to be divided up by purpose: a system log
records actions (called events) performed by the NT system, and an
application log records events caused by applications. These logs
can be used to keep track of anything that might be of interest to
a Web Master or Network Administrator. For example, every time the
computer is asked to start an application, a note of that event is
made in the application log. This log, like most others, can be
viewed using Event Viewer.

<P>

Within a Web site, logs can be used to monitor who is visiting
your site, and even whether visitors are trying to get into places
they're not supposed to.

<P>

One of the early problems with tracking and logging on the Web
was unrealistically high hit counts for Web sites. These inflated
numbers were, and still are, caused by simplistic uses of counters
to record hits on a particular page. It is quite common for a Web
page to contain two or three hypertext links, an image link,
and a next-page link. If the user follows all of these links,
a hit count of four or five may be recorded, giving a skewed picture
of site usage. This happens only if the links on the page are
used; their presence alone will not skew the hit count.

<P>

There are several solutions available to avoid this problem. One

is to add a short Perl script to the top of the larger Perl script

that delivers the Web pages that are being monitored for user

traffic. There are two ways to do this:

<OL>

<LI>Have a form call a CGI script; that script loads the
page and records the hit to that page.
<LI>Have one of the links to the HTML documents call a URL, like
this one from the Goo Goo Records site: http://www.googoo.com/cgi-bin/page.pl?next.htm.

</OL>

<P>

This second method calls a Perl script, page.pl, which reads the
name of the HTML document from the query string (in this case,
next.htm). The script then delivers that HTML document.
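<P>
As a rough sketch of that lookup step, the following Perl fragment decodes a
query string and turns it into a path under a document root. The subroutine
name resolve_page, the document root passed to it, and the decision to refuse
any request containing .. are assumptions made for illustration; they are not
part of the Goo Goo Records script.

```perl
use strict;
use warnings;

# Hypothetical helper: map a raw query string to a file under $root.
# Returns undef when the request is empty or looks unsafe.
sub resolve_page {
    my ($query, $root) = @_;
    return unless defined $query && length $query;

    # Undo URL encoding: %XX is a hex-encoded byte.
    my $name = $query;
    $name =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

    # Assumption: refuse ".." so a request cannot climb out of $root.
    return if $name =~ /\.\./;

    # Drop leading slashes so the name is always relative to $root.
    $name =~ s{^[/\\]+}{};

    # Forward slashes are used here; Perl accepts them in paths on Windows.
    return "$root/$name";
}

print resolve_page('next.htm', 'c:/googoo'), "\n";
print resolve_page('/newstuff/new/new%2Ehtml', 'c:/googoo'), "\n";
```

<P>
The check for .. runs after decoding, so an encoded form such as %2E%2E is
refused as well.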

<P>

The second method is a little more flexible because you need only
one Perl script to deliver any page: the page to deliver changes
with the query string. One drawback is that all of your links
point to a Perl script, which makes the response time longer. Also,
you have to do all of your logging from the Perl script,
because the Web server log records only that each user called
the script <I>x</I> number of times, without recording what the
destination was. This may be desirable, though, because
this method allows you to make the Web site's log files as minimal
or as detailed as you like. This method is explored in detail
later in this chapter, as it is the method used by the Goo
Goo Records Web Master on their Web site. This is the script
that performs the logging task:

<BLOCKQUOTE>

<PRE>

#!/usr/bin/perl

     ###################################################

     #

     # This is the Page delivery script.

     #

     # This script takes the query string information as the filename and

     # delivers the file to the browser.  A link to deliver the page new.html would

     # look like this:

     #

     # &lt;A HREF=&quot;http://www.googoo.com/cgi-bin/page.pl?new.html&quot;&gt;new&lt;/a&gt;

     #

     # Path information is also valid, and necessary to get lower in the directory

     # structure:

     #

     # &lt;A HREF=&quot;http://www.googoo.com/cgi-bin/page.pl?/newstuff/new/new.html&quot;&gt;new&lt;/a&gt;

     #

     # This will allow more flexible logging of any page that is delivered with this

     # script.  With a little work, you can even get this script to process server

     # side includes, counter, and all that jazz.

     # The trouble here is that the server logs will now only show the user hitting

     # page.pl, no matter which page they request.  This is fine if you are creating

     # your own logs, but can be frustrating if you are not. This script generates

     # a log similar to the one generated by the EMWAC server.

     #####################################################

     if ($ENV{'REQUEST_METHOD'} eq 'GET') {

          $file=$ENV{'QUERY_STRING'};

          $file=~s/%([a-fA-F0-9][a-fA-F0-9])/pack(&quot;C&quot;,hex($1))/eg;

          print &quot;Content-type: text/html\n\n&quot;;

          # Double the backslashes; a lone \$ would stop $file from interpolating.

          $file=&quot;c:\\googoo\\$file&quot;;

          if (-e $file) {

               open(LOG,&quot;&gt;&gt;c:\\logs\\access&quot;);

               $t=localtime;

               # Write the entry to the LOG filehandle, not to the browser.

               print LOG &quot;$t $ENV{'SERVER_NAME'} $ENV{'REMOTE_HOST'} $ENV{'REQUEST_METHOD'} $file $ENV{'SERVER_PROTOCOL'}\n&quot;;

               close(LOG);

               open(HTML,&quot;$file&quot;);

               while ($line=&lt;HTML&gt;) {

                    print $line;

               }

               close(HTML);

          }

          else {

               print &lt;&lt;EOF;

&lt;HTML&gt;

&lt;HEAD&gt;

&lt;TITLE&gt;Error! File not found&lt;/TITLE&gt;

&lt;/HEAD&gt;

&lt;H1&gt;Error! File not found&lt;/H1&gt;

&lt;HR&gt;&lt;P&gt;

The file you requested was not found.  Please contact &lt;address&gt;&lt;A

HREF=&quot;mailto:webmaster@googoo.com&quot;&gt;webmaster@googoo.com&lt;/a&gt;&lt;/address&gt;

&lt;/HTML&gt;

EOF

          }

     }

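     else {

          # Every CGI response needs a header, including error pages.

          print &quot;Content-type: text/html\n\n&quot;;

          print &quot;&lt;HTML&gt;\n&quot;;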

          print &quot;&lt;title&gt;Error - Script Error&lt;/title&gt;\n&quot;;

          print &quot;&lt;h1&gt;Error: Script Error&lt;/h1&gt;\n&quot;;

          print &quot;&lt;P&gt;&lt;hr&gt;&lt;P&gt;\n&quot;;

          print &quot;There was an error with the Server Script. Please\n&quot;;

          print &quot;contact GooGoo Records at &lt;address&gt;&lt;a

href=\&quot;mailto:support@googoo.com\&quot;&gt;support@googoo.com&lt;/a&gt;&lt;/address&gt;\n&quot;;

          print &quot;&lt;/HTML&gt;\n&quot;;

          exit;

     }

</PRE>

</BLOCKQUOTE>
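<P>
The logging step in the script above can be isolated into a small subroutine.
This is only a sketch: the name log_hit, its parameter list, the fallback
values in the usage example, and the use of flock to keep simultaneous
requests from interleaving their entries are assumptions for illustration,
not the Goo Goo Records code. The line it writes has the same fields, in the
same order, as the script's log entry.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: append one entry to the access log.
# Fields: timestamp, server name, remote host, method, file, protocol.
# Returns 1 on success and 0 if the log could not be opened.
sub log_hit {
    my ($logfile, $server, $remote, $method, $file, $protocol) = @_;

    open(my $log, '>>', $logfile) or return 0;  # caller decides how to react
    flock($log, LOCK_EX);                       # one writer at a time
    my $t = localtime;                          # scalar context gives a date string
    print {$log} "$t $server $remote $method $file $protocol\n";
    close($log);                                # closing releases the lock
    return 1;
}

# In a CGI script the values would come from %ENV. The fallbacks after ||
# are placeholders so the example also runs from the command line.
log_hit('access.log',
        $ENV{'SERVER_NAME'}     || 'localhost',
        $ENV{'REMOTE_HOST'}     || '-',
        $ENV{'REQUEST_METHOD'}  || 'GET',
        '/next.htm',
        $ENV{'SERVER_PROTOCOL'} || 'HTTP/1.0');
```

<P>
Returning a status instead of calling die lets the page still be delivered
when the log cannot be written.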

<P>

Another method of tracking is to read information from a log file
and build your tracking data from its entries.
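<P>
A minimal sketch of that approach follows. It reads entries in the format
written by the page delivery script above (a five-field timestamp, then
server name, remote host, method, file, and protocol) and counts requests
per file. The subroutine name count_hits and the choice to count only files
ending in .htm or .html are assumptions for illustration. The sample entries
are taken from the log excerpt in this chapter.

```perl
use strict;
use warnings;

# Hypothetical helper: tally page requests from a list of log lines.
sub count_hits {
    my (@lines) = @_;
    my %hits;
    for my $line (@lines) {
        # The timestamp takes five fields, followed by server name,
        # remote host, method, requested file, and protocol.
        my @f = split ' ', $line;
        next unless @f >= 10;
        my ($method, $file) = @f[7, 8];
        next unless $method eq 'GET';
        $file =~ s/\?.*//;                 # ignore any query string
        next unless $file =~ /\.html?$/i;  # assumption: pages only, no images
        $hits{lc $file}++;                 # treat /A.HTM and /a.htm as one page
    }
    return %hits;
}

my @sample = (
    'Thu May 09 20:09:17 1996 wait.pspt.fi 194.100.26.175 GET /ACEINDEX.HTM HTTP/1.0',
    'Thu May 09 20:09:18 1996 wait.pspt.fi 194.100.26.175 GET /gif/AMKVLOGO.GIF HTTP/1.0',
    'Fri May 10 04:32:03 1996 wait.pspt.fi 192.83.26.48 GET /ACEINDEX.HTM HTTP/1.0',
    'Fri May 10 04:32:19 1996 wait.pspt.fi 192.83.26.48 GET /ACEVF.HTM HTTP/1.0',
);

my %hits = count_hits(@sample);
printf "%-16s %d\n", $_, $hits{$_} for sort keys %hits;
```

<P>
For the four sample entries, /aceindex.htm is counted twice and /acevf.htm
once, while the GIF request is skipped.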

<H3><A NAME="TheLogFile">

The Log File</A></H3>

<P>

The file that contains the important information about the Goo
Goo Records site is known as the log file. Because the site runs on
the EMWAC HTTP service, a new log file is created
each day and kept in the log file directory. On the Goo Goo Records
server, that directory is C:\WINNT35\system32\LogFiles.
Each log file is named for the date it was
created, following the general format HSyymmdd.LOG. For example,
a log file created on July 6, 1996 has the filename
HS960706.LOG. A log file's contents resemble
this excerpt from the log file HS960509.LOG, from a server
in Finland:

<BLOCKQUOTE>

<PRE>

Thu May 09 20:09:17 1996 wait.pspt.fi 194.100.26.175 GET /ACEINDEX.HTM HTTP/1.0

Thu May 09 20:09:18 1996 wait.pspt.fi 194.100.26.175 GET /gif/AMKVLOGO.GIF HTTP/1.0

Thu May 09 20:09:19 1996 wait.pspt.fi 194.100.26.175 GET /gif/RNBW.GIF HTTP/1.0

Thu May 09 20:09:19 1996 wait.pspt.fi 194.100.26.175 GET /gif/RNBWBAR.GIF HTTP/1.0

Thu May 09 22:35:09 1996 wait.pspt.fi 194.215.82.227 GET /gif/WLOGO.GIF HTTP/1.0

Thu May 09 22:35:11 1996 wait.pspt.fi 194.215.82.227 GET /gif/BLUEBUL.GIF HTTP/1.0

Thu May 09 22:35:11 1996 wait.pspt.fi 194.215.82.227 GET /cgi-bin/counter.exe?-smittari+-w5+./DEFAULT.HTM HTTP/1.0

Thu May 09 22:35:13 1996 wait.pspt.fi 194.215.82.227 GET /gif/EHI.JPG HTTP/1.0

Thu May 09 22:35:17 1996 wait.pspt.fi 194.215.82.227 GET /gif/NAPPI1.gif HTTP/1.0

Thu May 09 22:35:17 1996 wait.pspt.fi 194.215.82.227 GET /gif/NAPPI2.gif HTTP/1.0

Thu May 09 22:35:19 1996 wait.pspt.fi 194.215.82.227 GET /AVIVF.HTM HTTP/1.0

Thu May 09 22:35:23 1996 wait.pspt.fi 194.215.82.227 GET /gif/virtlogo.gif HTTP/1.0

Thu May 09 22:35:23 1996 wait.pspt.fi 194.215.82.227 GET /gif/NAPPI1.gif HTTP/1.0

Thu May 09 22:35:29 1996 wait.pspt.fi 194.215.82.227 GET /gif/KOULU.GIF HTTP/1.0

Thu May 09 22:35:32 1996 wait.pspt.fi 194.215.82.227 GET /gif/NAPPI2.gif HTTP/1.0

Thu May 09 22:35:45 1996 wait.pspt.fi 194.215.82.227 GET /gif/VF21.GIF HTTP/1.0

Thu May 09 22:36:02 1996 wait.pspt.fi 194.215.82.227 GET /gif/NAPPI3.gif HTTP/1.0

Thu May 09 22:36:14 1996 wait.pspt.fi 194.215.82.227 GET /gif/LETTER.GIF HTTP/1.0

Thu May 09 22:37:46 1996 wait.pspt.fi 194.215.82.227 GET /AVIONGEL.HTM HTTP/1.0

Thu May 09 22:37:52 1996 wait.pspt.fi 194.215.82.227 GET /gif/PIRUNLG.GIF HTTP/1.0

Thu May 09 22:44:43 1996 wait.pspt.fi 194.215.82.227 GET /AVIPELI1.HTM HTTP/1.0

Thu May 09 22:44:45 1996 wait.pspt.fi 194.215.82.227 GET /gif/STRESSLG.GIF HTTP/1.0

Fri May 10 04:29:29 1996 wait.pspt.fi 192.83.26.48 GET /gif/NAPPI3.gif HTTP/1.0

Fri May 10 04:29:30 1996 wait.pspt.fi 192.83.26.48 GET /gif/LETTER.GIF HTTP/1.0

Fri May 10 04:29:31 1996 wait.pspt.fi 192.83.26.48 GET /gif/engflag.jpg HTTP/1.0

Fri May 10 04:30:21 1996 wait.pspt.fi 192.83.26.48 GET /AVIVF.HTM HTTP/1.0

Fri May 10 04:30:26 1996 wait.pspt.fi 192.83.26.48 GET /gif/virtlogo.gif HTTP/1.0

Fri May 10 04:30:27 1996 wait.pspt.fi 192.83.26.48 GET /gif/VF21.GIF HTTP/1.0

Fri May 10 04:30:30 1996 wait.pspt.fi 192.83.26.48 GET /gif/KOULU.GIF HTTP/1.0

Fri May 10 04:31:11 1996 wait.pspt.fi 192.83.26.48 GET /AVIPELI2.HTM HTTP/1.0

Fri May 10 04:31:13 1996 wait.pspt.fi 192.83.26.48 GET /gif/LAITE.GIF HTTP/1.0

Fri May 10 04:31:14 1996 wait.pspt.fi 192.83.26.48 GET /gif/KOKOONP.JPG HTTP/1.0

Fri May 10 04:31:32 1996 wait.pspt.fi 192.83.26.48 GET /AVIPELI3.HTM HTTP/1.0

Fri May 10 04:31:33 1996 wait.pspt.fi 192.83.26.48 GET /gif/TIKI1.GIF HTTP/1.0

Fri May 10 04:31:33 1996 wait.pspt.fi 192.83.26.48 GET /gif/TPIRU1.GIF HTTP/1.0

Fri May 10 04:31:33 1996 wait.pspt.fi 192.83.26.48 GET /gif/TSTRE1.GIF HTTP/1.0

Fri May 10 04:31:46 1996 wait.pspt.fi 192.83.26.48 GET /AVIPELI4.HTM HTTP/1.0

Fri May 10 04:32:03 1996 wait.pspt.fi 192.83.26.48 GET /ACEINDEX.HTM HTTP/1.0

Fri May 10 04:32:19 1996 wait.pspt.fi 192.83.26.48 GET /ACEVF.HTM HTTP/1.0

Fri May 10 04:32:21 1996 wait.pspt.fi 192.83.26.48 GET /gif/ROBOCOP1.GIF HTTP/1.0

Fri May 10 04:33:01 1996 wait.pspt.fi 192.83.26.48 GET /ACEINDEX.HTM HTTP/1.0

Fri May 10 07:54:44 1996 wait.pspt.fi 193.166.48.136 GET /gif/NAPPI1.gif HTTP/1.0

Fri May 10 07:54:45 1996 wait.pspt.fi 193.166.48.136 GET /gif/NAPPI2.gif HTTP/1.0

Fri May 10 07:54:45 1996 wait.pspt.fi 193.166.48.136 GET /gif/NAPPI3.gif HTTP/1.0

Fri May 10 07:54:45 1996 wait.pspt.fi 193.166.48.136 GET /cgi-bin/counter.exe?-smittari+-w5+./DEFAULT.HTM HTTP/1.0

Fri May 10 07:54:45 1996 wait.pspt.fi 193.166.48.136 GET /gif/LETTER.GIF HTTP/1.0

Fri May 10 10:08:25 1996 wait.pspt.fi 192.89.123.26 GET /gif/VFLOGO.GIF HTTP/1.0

Fri May 10 10:08:25 1996 wait.pspt.fi 192.89.123.26 GET /gif/AMKVLOGO.GIF HTTP/1.0

Fri May 10 10:08:37 1996 wait.pspt.fi 192.89.123.26 GET /AVIVF.HTM HTTP/1.0
