A common problem with CGI scripts is that malformed headers are
sent back when a request for data arrives from a browser. Normally,
the output of a script begins with a MIME header, which the server
passes on to the browser. For example, to send an HTML document back,
the header is of the form <TT><FONT FACE="Courier">Content-type: text/html \n\n</FONT></TT>;
for a GIF image it is <TT><FONT FACE="Courier">Content-type: image/gif
\n\n</FONT></TT>, and so on. A script that has errors in it or
simply does not run will not return this header to the browser.
<P>
Also, don't forget to send two newlines at the end of every header.
The server expects a blank line following the MIME header, so
print the header like this:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">print "Content-type: image/gif \n\n";</FONT></TT>
</BLOCKQUOTE>
<P>
The <TT><FONT FACE="Courier">\n\n</FONT></TT> construct may not
work under all conditions, especially those that require an explicit
carriage-return/line-feed pair. In this case you should use the
construct <TT><FONT FACE="Courier">\r\n\r\n</FONT></TT> instead
of <TT><FONT FACE="Courier">\n\n</FONT></TT>.
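<P>
As a minimal sketch of the advice above, the header and its terminating
blank line can be built in one place so that the blank line is never
forgotten. The helper name <TT><FONT FACE="Courier">cgi_header</FONT></TT>
is hypothetical (it is not a Perl built-in), and the sketch assumes that
your server accepts the explicit carriage-return/line-feed form.

```perl
#!/usr/bin/perl -w
use strict;

# cgi_header is a hypothetical helper: it returns a Content-type
# header followed by the blank line that ends the header block.
sub cgi_header {
    my ($type) = @_;
    $type = 'text/html' unless defined $type;
    # \015\012 is an explicit carriage-return/line-feed pair.
    return "Content-type: $type\015\012\015\012";
}

print cgi_header('text/html');
print "<HTML><BODY>Hello from CGI</BODY></HTML>\n";
```

Calling <TT><FONT FACE="Courier">cgi_header('image/gif')</FONT></TT> before
writing image data works the same way.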
<H2><A NAME="FlushOutputBuffersImmediately"><B><FONT SIZE=5 COLOR=#FF0000>Flush
Output Buffers Immediately</FONT></B></A></H2>
<P>
It's important to flush the output buffers used by CGI scripts
immediately. Perl and the underlying operating system may hold output
written to a file handle such as <TT><FONT FACE="Courier">STDOUT</FONT></TT>
in a buffer for some time. This time may be longer than a browser is
willing to wait for a response. The simplest way to force immediate
flushing is to select the output file handle and then set the <TT><FONT FACE="Courier">$|</FONT></TT>
variable to <TT><FONT FACE="Courier">1</FONT></TT>.
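<P>
A minimal sketch of that idiom follows. Saving and restoring the
previously selected handle is an extra precaution added here, not
something the text requires.

```perl
#!/usr/bin/perl -w
use strict;

# Select STDOUT and turn on autoflush for it. select() returns the
# previously selected handle, which is restored afterwards.
my $previous = select(STDOUT);
$| = 1;
select($previous);

print "Content-type: text/html\n\n";
print "<HTML><BODY>Working...</BODY></HTML>\n";
```

With autoflush on, each <TT><FONT FACE="Courier">print</FONT></TT> reaches
the server as soon as it is executed instead of waiting for the buffer to
fill.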
<H2><A NAME="DontSetEnvironmentVariables"><B><FONT SIZE=5 COLOR=#FF0000>Don't
Set Environment Variables</FONT></B></A></H2>
<P>
A CGI script is the child process of the Web server running on
a system. Being a child process, it cannot set its environment
variables for a period longer than its own execution time. That
is, any environment variables set using statements like the following
will only set the value of the environment variable <TT><FONT FACE="Courier">GEEPERS</FONT></TT>
for the script while it's executing:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">$ENV{'GEEPERS'} = "creepers";</FONT></TT>
</BLOCKQUOTE>
<P>
The value of <TT><FONT FACE="Courier">GEEPERS</FONT></TT> is not
available to the parent (server) process, which invoked this CGI
script in the first place. In fact, the next time the same CGI
script is run, the value of the environment variable <TT><FONT FACE="Courier">GEEPERS</FONT></TT>
will be the value set in the server, not one set by an earlier
run of the script.
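<P>
The effect is easy to reproduce outside a Web server. In the following
sketch, an ordinary Perl program plays the role of the server and starts
a child Perl process that sets <TT><FONT FACE="Courier">GEEPERS</FONT></TT>;
the use of <TT><FONT FACE="Courier">system</FONT></TT> and
<TT><FONT FACE="Courier">$^X</FONT></TT> here is only for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Start from a known state in the parent.
delete $ENV{'GEEPERS'};

# Run a child Perl process ($^X is the running interpreter) that
# sets GEEPERS in its own environment and then exits.
system($^X, '-e', '$ENV{"GEEPERS"} = "creepers";');

# The child's change ended with the child; the parent never sees it.
my $seen = defined $ENV{'GEEPERS'} ? $ENV{'GEEPERS'} : '(not set)';
print "GEEPERS in parent: $seen\n";
```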
<P>
A possible way to track information between successive runs of
a CGI script is to use an HTML <TT><FONT FACE="Courier">FORM</FONT></TT>
object to store variables. HTML <TT><FONT FACE="Courier">FORM</FONT></TT>
handling is covered in detail in <A HREF="ch20.htm" tppabs="http://www.mcp.com/815097600/0-672/0-672-30891-6/ch20.htm" >Chapter 20</A>.
Basically, you store intermediate values in an <TT><FONT FACE="Courier">INPUT</FONT></TT>
field whose <TT><FONT FACE="Courier">TYPE</FONT></TT> is
<TT><FONT FACE="Courier">hidden</FONT></TT>, which the browser does not
display. Each call to the CGI script writes the updated value into the
hidden field of the page it returns, and the browser sends that value
back with the next request. Alternatively, you can save intermediate
results to disk, at the cost of some disk space.
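<P>
The following sketch shows one way a script might carry a counter from
one run to the next through a form. The helper name
<TT><FONT FACE="Courier">state_form</FONT></TT>, the field name
<TT><FONT FACE="Courier">count</FONT></TT>, and the script path
<TT><FONT FACE="Courier">/cgi-bin/counter.pl</FONT></TT> are all
hypothetical, and the form is assumed to use the
<TT><FONT FACE="Courier">GET</FONT></TT> method so that the value comes
back in <TT><FONT FACE="Courier">QUERY_STRING</FONT></TT>.

```perl
#!/usr/bin/perl -w
use strict;

# state_form is a hypothetical helper: it returns a form whose hidden
# field hands $count back to the script on the next submission.
sub state_form {
    my ($action, $count) = @_;
    return qq{<FORM METHOD="GET" ACTION="$action">\n}
         . qq{<INPUT TYPE="hidden" NAME="count" VALUE="$count">\n}
         . qq{<INPUT TYPE="submit" VALUE="Again">\n}
         . qq{</FORM>\n};
}

# Recover the previous count from the query string if one was sent.
# Only digits are accepted, so nothing else can reach the page.
my $query = defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
my $count = ($query =~ /(?:^|&)count=(\d+)/) ? $1 + 1 : 1;

print "Content-type: text/html\n\n";
print "<HTML><BODY><P>Run number $count</P>\n";
print state_form('/cgi-bin/counter.pl', $count);
print "</BODY></HTML>\n";
```

Because the value travels through the browser, a user can change it, so
treat anything read back this way as untrusted input.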
<H2><A NAME="CleanUpAfterYourself"><B><FONT SIZE=5 COLOR=#FF0000>Clean
Up After Yourself</FONT></B></A></H2>
<P>
There are occasions when CGI scripts use temporary files to store
information. Don't forget to delete these files after your script
is done. After some time, such temporary files can accumulate
and use up valuable disk space. It's a good idea to exit from
one point in the code by calling a subroutine and to remove all
temporary files in that subroutine before exiting.
<P>
Keeping temporary files on a server also poses the problem of
synchronizing the temporary file with the process that created
it. Normally, the name of the temporary file is derived from the
process ID of the creating application. This, in turn, means that
only the process that created the file knows the filename and
when to delete the file. Even if a common prefix, such as <TT><FONT FACE="Courier">CGI</FONT></TT>,
is used for all temporary filenames, processes within the same
process group should not arbitrarily delete all temporary files
beginning with <TT><FONT FACE="Courier">CGI</FONT></TT>. For one
thing, other CGI applications might be using the temporary files
when another process deletes them. Also, there might be other
unrelated processes using <TT><FONT FACE="Courier">CGI</FONT></TT>
as the prefix for their filenames.
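<P>
One way to follow both pieces of advice is sketched below: every
temporary file is named after the process ID and recorded in a list,
and a single cleanup routine deletes only the files in that list. The
helper names <TT><FONT FACE="Courier">make_temp</FONT></TT>,
<TT><FONT FACE="Courier">remove_temp_files</FONT></TT>, and
<TT><FONT FACE="Courier">clean_exit</FONT></TT> are hypothetical, and the
sketch assumes a writable <TT><FONT FACE="Courier">/tmp</FONT></TT>
directory.

```perl
#!/usr/bin/perl -w
use strict;

my @temp_files;    # every temporary file this process has created

# make_temp is a hypothetical helper. It names the file after this
# process's ID ($$) and records the name for later cleanup.
# Assumption: /tmp exists and is writable, as on most UNIX systems.
sub make_temp {
    my ($tag) = @_;
    my $name = '/tmp/CGI' . $$ . '.' . $tag;
    open(my $fh, '>', $name) or die "Cannot create $name: $!";
    push @temp_files, $name;
    return ($fh, $name);
}

# Remove only the files recorded above, never "everything named CGI*".
# Returns the number of files actually deleted.
sub remove_temp_files {
    my $removed = unlink @temp_files;
    @temp_files = ();
    return $removed;
}

# The single exit point for the script.
sub clean_exit {
    my ($status) = @_;
    remove_temp_files();
    exit $status;
}

my ($fh, $name) = make_temp('work');
print $fh "intermediate results\n";
close($fh);

# A real script would end with clean_exit(0). The cleanup step is
# called directly here so its effect can be inspected afterwards.
my $removed = remove_temp_files();
print "Removed $removed temporary file(s)\n";
```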
<H2><A NAME="ConfigureYourServerCorrectly"><B><FONT SIZE=5 COLOR=#FF0000>Configure
Your Server Correctly</FONT></B></A></H2>
<P>
Another common problem with CGI scripts is that beginning Webmasters
forget to make the path to these scripts visible to the Web server.
Most servers look in the <TT><FONT FACE="Courier">cgi-bin</FONT></TT>
subdirectory as the top of the path for a CGI script to execute.
If the named file in the path does not follow the rules for the
server you happen to be running, the server will not execute the
script. Instead, it will ship the script back to the browser as a text
file. In most cases, this is simply an annoyance. In some cases, the
source of your CGI script may give away valuable directory information
to the end user at the browser.
<P>
To avoid such problems, you should edit the configuration files
for the server you are running. For the NCSA server, this entails
editing the <TT><FONT FACE="Courier">srm.conf</FONT></TT> file
in the <TT><FONT FACE="Courier">conf</FONT></TT> subdirectory of the
directory where you installed the server distribution. The <TT><FONT FACE="Courier">ScriptAlias</FONT></TT>
directive in the <TT><FONT FACE="Courier">srm.conf</FONT></TT>
file controls which directories contain server scripts. The format
for the <TT><FONT FACE="Courier">ScriptAlias</FONT></TT> directive
in the <TT><FONT FACE="Courier">srm.conf</FONT></TT> file is
<BLOCKQUOTE>
<TT><FONT FACE="Courier">ScriptAlias fakename realname</FONT></TT>
</BLOCKQUOTE>
<P>
For example, the following setting will make the <TT><FONT FACE="Courier">/home/webserver/httpd/cgi-bin/</FONT></TT>
directory look like the <TT><FONT FACE="Courier">/cgi-bin</FONT></TT>
directory to the Web server:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">ScriptAlias /cgi-bin/ /home/webserver/httpd/cgi-bin/</FONT></TT>
</BLOCKQUOTE>
<P>
Also, if you want to execute files at locations other than those
specified in the <TT><FONT FACE="Courier">ScriptAlias</FONT></TT>
path, you can specify what file extensions are allowed with the
<TT><FONT FACE="Courier">AddType</FONT></TT> directive. For example,
the following directive allows executable scripts ending in <TT><FONT FACE="Courier">.cgi</FONT></TT>
to be executed; add a similar line for <TT><FONT FACE="Courier">.pl</FONT></TT>
if you want that extension treated the same way:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">AddType application/x-httpd-cgi .cgi</FONT></TT>
</BLOCKQUOTE>
<P>
In general, use absolute pathnames for all the files your CGI script
accesses. Specifying a relative pathname causes every search that uses
it to start from the "root" named by
<TT><FONT FACE="Courier">DocumentRoot</FONT></TT>. The
<TT><FONT FACE="Courier">DocumentRoot</FONT></TT> directive in
the <TT><FONT FACE="Courier">srm.conf</FONT></TT> file sets the
base directory from which the server looks up files.
The benefit of using an external base directory is that
an entire directory tree can be moved by simply moving the root
of that tree. This way you do not have the agony of resetting
every pathname when the location of the tree changes. However, the
downside is that relative pathnames make the movable tree susceptible
to hackers, who can substitute their own directory tree for yours so
that your relative references resolve to their versions of your
documents.
<P>
Finally, the configuration file <TT><FONT FACE="Courier">access.conf</FONT></TT>
has a <TT><FONT FACE="Courier">FollowSymLinks</FONT></TT> option, set
with the <TT><FONT FACE="Courier">Options</FONT></TT> directive.
If this option is enabled, the server follows
symbolic links when it's resolving pathnames to find a document.
If a document or script is reached through a symbolic link, the
server will not serve it unless this option allows links to be
followed. Unfortunately, enabling it
opens up a security hole big enough to drive a virtual bus through.
If someone symbolically links a document to the <TT><FONT FACE="Courier">/bin</FONT></TT>
or <TT><FONT FACE="Courier">/sbin</FONT></TT> directory on your
system, he or she has free run of the system.<P>
<CENTER>
<TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR VALIGN=TOP><TD><B>Warning</B></TD></TR>
<TR VALIGN=TOP><TD>
<BLOCKQUOTE>
Never put <TT><FONT FACE="Courier">perl.exe</FONT></TT> in the <TT><FONT FACE="Courier">httpd</FONT></TT> directories in the heat of debugging. It's a major mistake that will let users at the browser run anything on your system! Don't even symbolically
link to an executable program such as <TT><FONT FACE="Courier">perl</FONT></TT>, <TT><FONT FACE="Courier">sh</FONT></TT>, or something similar that a user could run off the command line.
</BLOCKQUOTE>
</TD></TR>
</TABLE></CENTER>
<P>
<H2><A NAME="AlwaysCreateanindexhtmlFile"><B><FONT SIZE=5 COLOR=#FF0000>Always
Create an </FONT></B><TT><B><FONT SIZE=5 COLOR=#FF0000 FACE="Courier">index.html</FONT></B></TT><B><FONT SIZE=5 COLOR=#FF0000>
File</FONT></B></A></H2>
<P>
Almost all servers append <TT><FONT FACE="Courier">/index.html</FONT></TT>
or <TT><FONT FACE="Courier">index.html</FONT></TT> to a given URL that
references a directory. Therefore, the following URLs both become
<TT><FONT FACE="Courier">http://www.ikra.com/index.html</FONT></TT>:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">http://www.ikra.com/<BR>
http://www.ikra.com</FONT></TT>
</BLOCKQUOTE>
<P>
Guess what happens when there is no <TT><FONT FACE="Courier">index.html</FONT></TT>
file in the directory being referenced? The server typically returns an
FTP-style listing of <I>all</I> files in the directory! This type of
exposure of your directory subtree to the world might not be what
you want.
<P>
CGI scripts can often return HTML pages as responses. One of the
first things you should do is to check all URLs generated in these
scripts that refer to your server. Make sure that there is an
<TT><FONT FACE="Courier">index.html</FONT></TT> file in all the
directories that a URL generated at your server can refer to.
It's a good idea for all URLs that you generate to be absolute
pathnames instead of relative pathnames.
<P>
One very important directory to place an <TT><FONT FACE="Courier">index.html</FONT></TT>
file in is the logs subdirectory in the <TT><FONT FACE="Courier">httpd</FONT></TT>
tree. Not placing an <TT><FONT FACE="Courier">index.html</FONT></TT>
file in the logs subdirectory will expose all your Web server
logs.
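<P>
Checking every directory by hand is tedious, so a small script can do
the walk for you. The following is a sketch using the standard
<TT><FONT FACE="Courier">File::Find</FONT></TT> module; the subroutine
name <TT><FONT FACE="Courier">dirs_without_index</FONT></TT> is
hypothetical, and the script assumes you pass it the top of your own
document tree as its first argument.

```perl
#!/usr/bin/perl -w
use strict;
use File::Find;

# Return every directory at or below $root that has no index.html.
sub dirs_without_index {
    my ($root) = @_;
    my @missing;
    find(sub {
        # Inside this callback, $_ is the entry's name relative to
        # the directory that File::Find has changed into.
        return unless -d $_;
        push @missing, $File::Find::name unless -e "$_/index.html";
    }, $root);
    my @sorted = sort @missing;
    return @sorted;
}

# Default to the current directory; pass your document root instead.
my $root = @ARGV ? $ARGV[0] : '.';
print "$_\n" for dirs_without_index($root);
```

Each line of output names a directory that a browser could list in full.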
<H2><A NAME="Summary"><B><FONT SIZE=5 COLOR=#FF0000>Summary</FONT></B></A>
</H2>
<P>
This chapter is a synopsis of some of the problems you can run
into when coding CGI scripts using Perl. I cannot possibly enumerate
all the problems you might run into when debugging CGI applications;
however, this checklist will help you in debugging some common
problems:
<UL>
<LI><FONT COLOR=#000000>Make sure your CGI script has execute
permissions.</FONT>
<LI><FONT COLOR=#000000>Make sure that any data files used by
the CGI script are readable by user "nobody."</FONT>
<LI><FONT COLOR=#000000>The CGI script should compile and run
correctly. Do use the </FONT><TT><FONT FACE="Courier">-w</FONT></TT>
switch on your CGI scripts, especially when testing, to make sure
that your Perl script does not have any embarrassing bugs.
<LI><FONT COLOR=#000000>On UNIX systems, make sure the script starts
with the Perl interpreter line (</FONT><TT><FONT FACE="Courier">#!/usr/local/bin/perl</FONT></TT>,
<TT><FONT FACE="Courier">#!/usr/bin/perl</FONT></TT>, or wherever Perl
is installed on your system), and on NT systems, take this line out. If you forget
to place this line at the top of your Perl script in UNIX, a
browser will get a <TT><FONT FACE="Courier">500 Server Error</FONT></TT>.
<LI><FONT COLOR=#000000>Libraries for dynamically loaded modules
must exist in the </FONT><TT><FONT FACE="Courier">@INC </FONT></TT>path
of the Perl script.
<LI><FONT COLOR=#000000>Any system calls in the CGI script must
be supported in the underlying system.</FONT>
<LI><FONT COLOR=#000000>Return valid MIME headers in all cases
from a CGI script.</FONT>
<LI><FONT COLOR=#000000>Flush output buffers immediately by setting
</FONT><TT><FONT FACE="Courier">$|</FONT></TT> to <TT><FONT FACE="Courier">1</FONT></TT>.
<LI><FONT COLOR=#000000>Don't rely on environment variables being
set on successive runs of the same CGI script.</FONT>
<LI><FONT COLOR=#000000>Clean up any temporary files created by
CGI scripts.</FONT>
<LI><FONT COLOR=#000000>Make the CGI directory visible by configuring
</FONT><TT><FONT FACE="Courier">ScriptAlias</FONT></TT> in the
configuration file.
</UL>
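<P>
The first few items on this checklist can be verified mechanically
before a script ever goes near the server. The following sketch checks
the execute permission and the interpreter line; the subroutine name
<TT><FONT FACE="Courier">check_script</FONT></TT> is hypothetical, and
the checks assume a UNIX system.

```perl
#!/usr/bin/perl -w
use strict;

# check_script is a hypothetical helper: it returns a list of
# problems found with one CGI script (an empty list means none).
sub check_script {
    my ($path) = @_;
    my @problems;
    push @problems, 'not executable' unless -x $path;
    if (open(my $fh, '<', $path)) {
        my $first = <$fh>;
        close($fh);
        push @problems, 'no #! interpreter line'
            unless defined $first && $first =~ /^#!/;
    } else {
        push @problems, "cannot read: $!";
    }
    return @problems;
}

# Check every script named on the command line.
for my $script (@ARGV) {
    my @problems = check_script($script);
    my $report = @problems ? join(', ', @problems) : 'ok';
    print "$script: $report\n";
}
```

Running the script itself with <TT><FONT FACE="Courier">perl -wc</FONT></TT>
covers the "compile correctly" item on the same list.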
<P>
<HR WIDTH="100%"></P>
</BODY>
</HTML>