print "$_ = $ENV{$_}";<BR>
}</FONT></TT>
</BLOCKQUOTE>
</TD></TR>
</TABLE></CENTER>
<P>
<H3><A NAME="WhattheRawDataLooksLike">What the Raw Data Looks
Like</A></H3>
<P>
Let's take a closer look at the raw data produced by simply e-mailing
the form output with <TT><FONT FACE="Courier">mailto:</FONT></TT>.
This data is in the same form as that which will be passed to
the CGI:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">applicant=John+Doe&address=520+Main+St.+Apt.%23204&city=Cicily&state=Alaska&zipcode=9021
<BR>
0&country=USA&email=jdoe@KBHR.org&zines=Cooking+With+Soylent+Green&zines=Asteroid+Living
<BR>
&gift=TriCorder&payment_method=Frontiers+Credit+Card&card_number=123456789123&suggestion
<BR>
s=Can+you+offer+%22Asteroid+Living+For+Kids%22%3F%0D%0A</FONT></TT>
</BLOCKQUOTE>
<P>
The data is URL encoded. The name/value pairs established by the
form are separated by ampersands (<TT><FONT FACE="Courier">&</FONT></TT>), spaces
have been turned into pluses (<TT><FONT FACE="Courier">+</FONT></TT>),
and some characters are expressed as hexadecimal codes (%<I>xx</I>).
Our first objective will be to make this data readable.
<H2><A NAME="ProcessingtheDatawithPerl"><FONT SIZE=5 COLOR=#FF0000>Processing
the Data with Perl</FONT></A></H2>
<P>
Perl specializes in string handling and manipulation. This makes
it the language of choice for a great many CGI applications. Another
powerful feature of Perl is associative arrays, which allow us
to reference elements of an array by using a string instead of
by index.
<P>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Note</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
Some tools already exist to aid you in form processing as well as other functions performed with CGIs. cgi-lib.pl, a library of Perl functions by Steven Brenner, makes processing forms a lot easier. The latest version of Perl
5 includes CGI.pm, a library of on-the-fly form parsing and processing features. Because these tools are not as widely available as standard Perl 4, all the code in this chapter is written in standard Perl 4 for maximum portability.</BLOCKQUOTE>
</TD></TR>
</TABLE></CENTER>
<P>
<H3><A NAME="ASimpleParsingCGI">A Simple Parsing CGI</A></H3>
<P>
This Perl program reads the order form we've submitted with <TT><FONT FACE="Courier">METHOD=POST</FONT></TT>
and displays the name/value pairs on-screen. This code should
work with any form submitted via <TT><FONT FACE="Courier">METHOD=POST</FONT></TT>.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">#!/usr/bin/perl<BR>
print "Content-type: text/html\n\n";<BR>
<BR>
read(STDIN,$input,$ENV{'CONTENT_LENGTH'});<BR>
<BR>
@input = split (/&/,$input);<BR>
<BR>
foreach $i (0 .. $#input) {<BR>
$input[$i] =~ s/\+/ /g;<BR>
($name, $value) = split(/=/,$input[$i],2);<BR>
$input{$name} .= $value;<BR>
}<BR>
<BR>
print <<EOT;<BR>
<HTML><HEAD><BR>
<TITLE>Order Output</TITLE><BR>
</HEAD><BR>
<BR>
<BODY><BR>
EOT<BR>
<BR>
foreach (keys %input) {<BR>
print "$_ = $input{$_}<br>\n";<BR>
}<BR>
<BR>
print <<EOT;<BR>
</BODY><BR>
</HTML><BR>
EOT</FONT></TT>
</BLOCKQUOTE>
<P>
Let's take a closer look at this code step by step.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">#!/usr/bin/perl<BR>
print "Content-type: text/html\n\n";</FONT></TT>
</BLOCKQUOTE>
<P>
These are the first two lines of just about any CGI written in
Perl. The first line simply states that the CGI is a Perl program
and indicates the path for the Perl interpreter. You need to know
where the Perl interpreter has been installed on your system.
You can generally find this out by typing
<BLOCKQUOTE>
<TT><FONT FACE="Courier">which perl</FONT></TT>
</BLOCKQUOTE>
<P>
from your shell. If you still can't find the proper path, ask
your system administrator.
<P>
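The second line of code sends the HTTP <TT><FONT FACE="Courier">Content-type</FONT></TT>
header, which tells the server and the browser that this particular
Perl program will be producing HTML. The two newlines (<TT><FONT FACE="Courier">\n\n</FONT></TT>)
end the header line and supply the blank line that must separate the
headers from the body of the document.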
<P>
Because we are using <TT><FONT FACE="Courier">METHOD=POST</FONT></TT>,
the information from the form is being passed to the CGI via STDIN
(standard input). The form data will be read into the variable
<TT><FONT FACE="Courier">$input</FONT></TT>. We need to use the
environment variable <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT>
to determine the length of the input stream we will be reading.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">read(STDIN,$input,$ENV{'CONTENT_LENGTH'});
<BR>
</FONT></TT>
</BLOCKQUOTE>
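<P>
The read step above trusts <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT> completely. The following is a minimal sketch of a more defensive read, not part of the chapter's program. The helper name <TT><FONT FACE="Courier">read_post_body</FONT></TT> is hypothetical, and the sketch uses Perl 5 features (<TT><FONT FACE="Courier">my</FONT></TT>, <TT><FONT FACE="Courier">use strict</FONT></TT>, references) that the Perl 4 code in this chapter deliberately avoids. It assumes the server sets the standard CGI variable <TT><FONT FACE="Courier">REQUEST_METHOD</FONT></TT>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a POST body from a filehandle, using
# CONTENT_LENGTH only after checking that it is a plain number.
# Returns an empty string when there is nothing sensible to read.
sub read_post_body {
    my ($fh, $env) = @_;

    my $method = $env->{'REQUEST_METHOD'} || '';
    return '' unless $method eq 'POST';

    my $length = $env->{'CONTENT_LENGTH'};
    return '' unless defined $length && $length =~ /^\d+$/;

    my $input = '';
    # read() returns the number of bytes actually read (possibly fewer
    # than requested) or undef on error.
    my $got = read($fh, $input, $length);
    return defined $got ? $input : '';
}
```

<P>
In a CGI, this might be called as <TT><FONT FACE="Courier">$input = read_post_body(\*STDIN, \%ENV);</FONT></TT>.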
<P>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Caution</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
Whenever you are using associative arrays, make sure you remember to use curly brackets <TT><FONT FACE="Courier">{}</FONT></TT> to enclose the name of the array element you are referencing. Unless your screen display is particularly good, the difference
between curly brackets <TT><FONT FACE="Courier">{}</FONT></TT> and round brackets <TT><FONT FACE="Courier">()</FONT></TT> can be difficult to see.
</BLOCKQUOTE>
</TD></TR>
</TABLE></CENTER>
<P>
<P>
As you saw earlier when we used <TT><FONT FACE="Courier">mailto:</FONT></TT>
to e-mail the output of the form, the name/value pairs are separated
by ampersands (<TT><FONT FACE="Courier">&</FONT></TT>). Take
the input string, <TT><FONT FACE="Courier">$input</FONT></TT>,
and break it into an array, separating elements at ampersands.
Each element of this array will contain a name/value pair separated
by an equal sign. Note that the ampersands themselves do not appear
in the resulting array elements.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">@input = split (/&/,$input);</FONT></TT>
</BLOCKQUOTE>
<P>
We will now cycle through our input array (<TT><FONT FACE="Courier">@input</FONT></TT>),
making the data more readable, separating the name/value pairs,
and putting them into an associative array. In Perl, the index
of the last element of an array (one less than the number of elements,
because indexes start at 0) is contained in the variable <TT><FONT FACE="Courier">$#<I>array_name</I></FONT></TT>.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">foreach $i (0 .. $#input) {</FONT></TT>
</BLOCKQUOTE>
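<P>
To make the loop bounds concrete, here is a small sketch (Perl 5 syntax, with a hypothetical helper named <TT><FONT FACE="Courier">split_pairs</FONT></TT>) that splits a shortened version of the sample data and compares <TT><FONT FACE="Courier">$#pairs</FONT></TT> with the element count.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: break a URL-encoded string into name=value pairs.
sub split_pairs {
    my ($input) = @_;
    return split(/&/, $input);
}

my @pairs = split_pairs('city=Cicily&state=Alaska&country=USA');

# $#pairs is the index of the last element; scalar(@pairs) is the count.
print "last index: $#pairs\n";              # prints "last index: 2"
print "count: ", scalar(@pairs), "\n";      # prints "count: 3"

# 0 .. $#pairs therefore visits every element exactly once.
foreach my $i (0 .. $#pairs) {
    print "$i: $pairs[$i]\n";
}
```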
<P>
In the URL encoding process used to transmit the form data, all
the spaces have been converted to pluses (<TT><FONT FACE="Courier">+</FONT></TT>).
Perform a global substitution on each element of <TT><FONT FACE="Courier">@input</FONT></TT>,
replacing <TT><FONT FACE="Courier">+</FONT></TT> with a blank
space. It is important to precede the <TT><FONT FACE="Courier">+</FONT></TT>
with a backslash <TT><FONT FACE="Courier">\</FONT></TT> in this
instance because an unescaped <TT><FONT FACE="Courier">+</FONT></TT>
in a pattern is a quantifier meaning "one or more of the preceding item." The <TT><FONT FACE="Courier">g</FONT></TT>
indicates that this is a "global" replace: all <TT><FONT FACE="Courier">+</FONT></TT>
signs will be replaced, not just the first match.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">$input[$i] =~ s/\+/ /g;<BR>
</FONT></TT>
</BLOCKQUOTE>
<P>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Note</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
Perl's substitution command <TT><FONT FACE="Courier">s</FONT></TT> is a very powerful tool and is one of the things that makes Perl the language of choice for a great deal of CGI programming.
</BLOCKQUOTE>
<BLOCKQUOTE>
To perform a substitution on a string, the format is as follows:</BLOCKQUOTE>
<BLOCKQUOTE>
<TT><FONT FACE="Courier">$string =~ s/PATTERN/REPLACEMENT/[g][i][e]</FONT></TT>
</BLOCKQUOTE>
<BLOCKQUOTE>
The <TT><FONT FACE="Courier">g</FONT></TT> option substitutes globally; all matches of <TT><FONT FACE="Courier">PATTERN</FONT></TT> within <TT><FONT FACE="Courier">$string</FONT></TT> will be replaced. If <TT><FONT FACE="Courier">g</FONT></TT> is not
indicated, only the first match of <TT><FONT FACE="Courier">PATTERN</FONT></TT> will be replaced. The <TT><FONT FACE="Courier">i</FONT></TT> option indicates that the match will be insensitive to case: <TT><FONT
FACE="Courier">PATTERN</FONT></TT> will also match <TT><FONT FACE="Courier">PatTern</FONT></TT>, and both will be replaced. The <TT><FONT FACE="Courier">e</FONT></TT> option causes <TT><FONT FACE="Courier">REPLACEMENT</FONT></TT> to be evaluated as a Perl expression, and the result of that expression is used as the replacement text. You'll see an example of
this when we get rid of the hexadecimal codes sent to our CGI in the section "Parsing the Data: Round 2."
</BLOCKQUOTE>
</TD></TR>
</TABLE></CENTER>
<P>
<P>
We will now split the name/value pair into the scalar variables
<TT><FONT FACE="Courier">$name</FONT></TT> and <TT><FONT FACE="Courier">$value</FONT></TT>
and place the values in an associative array for easier reference.
Because name and value are separated by an equal sign (<TT><FONT FACE="Courier">=</FONT></TT>),
we split each pair on that character. Because we know how
many fields we're breaking each element into, we indicate <TT><FONT FACE="Courier">2</FONT></TT>
as the <TT><FONT FACE="Courier">LIMIT</FONT></TT> parameter. Note
that this wasn't possible in our previous use of the <TT><FONT FACE="Courier">split</FONT></TT>
function because we didn't know how many times we would be splitting
the STDIN stream.
<P>
The fact that we are putting these variables into an associative
array is indicated by assigning the elements of that array using
curly brackets. Notice that we are not simply assigning values
to the associative array but appending them by using <TT><FONT FACE="Courier">.=</FONT></TT>;
the reason for this will become clear shortly.
<BLOCKQUOTE>
<TT><FONT FACE="Courier"> ($name, $value) = split(/=/,$input[$i],2);
<BR>
$input{$name} .= $value;</FONT></TT>
</BLOCKQUOTE>
<BLOCKQUOTE>
<TT><FONT FACE="Courier">}</FONT></TT>
</BLOCKQUOTE>
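<P>
The effect of the <TT><FONT FACE="Courier">LIMIT</FONT></TT> parameter is easiest to see when a value itself contains an equal sign. The sketch below uses Perl 5 syntax, and both helper names and the sample pair are made up for illustration. It compares the two forms of <TT><FONT FACE="Courier">split</FONT></TT>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split with LIMIT 2, as in the chapter's code.
# Everything after the first "=" stays in the value.
sub split_pair {
    my ($pair) = @_;
    my ($name, $value) = split(/=/, $pair, 2);
    return ($name, $value);
}

# Hypothetical helper: the same split without a LIMIT. Any text after
# a second "=" lands in a third field that the assignment discards.
sub split_pair_unlimited {
    my ($pair) = @_;
    my ($name, $value) = split(/=/, $pair);
    return ($name, $value);
}

print join('|', split_pair('note=E=mc2')), "\n";            # prints "note|E=mc2"
print join('|', split_pair_unlimited('note=E=mc2')), "\n";  # prints "note|E"
```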
<P>
With Perl, you can easily embed HTML within your CGIs because
a "here document" lets you print a whole block of text, exactly as
written, up to a terminating marker. The block printed here is
standard HTML file header information. I use <TT><FONT FACE="Courier">EOT</FONT></TT>
to indicate "End Of Text," but you can use any unique
string you wish.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">print <<EOT;<BR>
<HTML><HEAD><BR>
<TITLE>Order Output</TITLE><BR>
</HEAD><BR>
<BR>
<BODY><BR>
EOT<BR>
</FONT></TT>
</BLOCKQUOTE>
<P>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Caution</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
A here document such as <TT><FONT FACE="Courier">print <<EOT</FONT></TT> ends only at a line that consists of the marker and nothing else. If you are indenting your code, make sure that you do not indent the line containing the <TT><FONT FACE="Courier">EOT</FONT></TT> marker.
Perl will not recognize the end marker unless it is identical to the marker for which it is looking. That means you cannot use preceding or trailing spaces or tabs.
</BLOCKQUOTE>
</TD></TR>
</TABLE></CENTER>
<P>
<P>
We will now print the contents of the associative array <TT><FONT FACE="Courier">%input</FONT></TT>.
The element names of an associative array are referred to as keys.
The <TT><FONT FACE="Courier">keys</FONT></TT> function conveniently
returns a list of all the keys of an associative array. In each iteration of
the <TT><FONT FACE="Courier">foreach</FONT></TT> loop, the current
key is stored in the special variable <TT><FONT FACE="Courier">$_</FONT></TT>.
The <TT><FONT FACE="Courier">print</FONT></TT> statement contains
both an HTML linebreak (<TT><FONT FACE="Courier"><br></FONT></TT>)
and a Perl newline (<TT><FONT FACE="Courier">\n</FONT></TT>).
This will force a break in both the screen output and in the source
listing.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">foreach (keys %input) {<BR>
print "$_ = $input{$_}<br>\n";<BR>
}</FONT></TT>
</BLOCKQUOTE>
<P>
The rest of the code simply cleans up the bottom of our HTML.
<H3><A NAME="TheOutputoftheSimpleCGI">The Output of the Simple
CGI</A></H3>
<P>
By taking a look at the data produced by this simple program example,
you can already see a vast improvement in readability. This is
some typical data that would be displayed after pressing the submit
button on our form:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">state = CA<BR>
card_number = 123456789<BR>
country = USA<BR>
address = 520 Main St. %23204<BR>
email = ken.hunt@westbevhigh.edu<BR>
city = Beverly Hills<BR>
zines = Cooking With Soylent GreenAsteroid Living<BR>
gift = TriCorder<BR>
suggestions = Can you offer %22Asteroid Living for Kids%22 Magazine%3F%0D%0A
<BR>
applicant = Ken Hunt<BR>
payment_method = Frontiers Credit Card<BR>
zipcode = 90210</FONT></TT>
</BLOCKQUOTE>
<H3><A NAME="ParsingtheDataRound2">Parsing the Data: Round 2</A>
</H3>
<P>
A few problems are still evident in this data. It's in a fairly
random order, the two magazines indicated by the checkboxes in
our form have been concatenated into one <TT><FONT FACE="Courier">zines</FONT></TT>
variable by our use of the appending operator, and some characters
are still in the form of HEX codes.
<P>
The seemingly random order results from the way Perl stores
associative arrays internally: in hash tables, which do not preserve
the order in which elements were added. We could have them listed
in alphabetical order by sorting the keys in our <TT><FONT FACE="Courier">foreach</FONT></TT>
loop (not that alphabetical order is necessarily any more helpful
in this case).
<BLOCKQUOTE>
<TT><FONT FACE="Courier">foreach (sort keys %input)</FONT></TT>
</BLOCKQUOTE>
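<P>
As a brief illustration of sorted output, the sketch below (Perl 5 syntax) builds the same <TT><FONT FACE="Courier">name = value</FONT></TT> lines in key order. The helper name <TT><FONT FACE="Courier">format_pairs</FONT></TT> is an assumption made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return one "name = value<br>" line per key,
# with the keys in alphabetical (string) order.
sub format_pairs {
    my (%input) = @_;
    return map { "$_ = $input{$_}<br>\n" } sort keys %input;
}

print format_pairs(
    state     => 'CA',
    city      => 'Beverly Hills',
    applicant => 'Ken Hunt',
);
# The lines come out in the order applicant, city, state,
# regardless of how Perl stores the keys internally.
```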
<P>
In cases such as <TT><FONT FACE="Courier">zines</FONT></TT>,
where it's possible to have more than one value per variable,
we need a method of separating those values. We'll do this when
we're building the associative array by checking to see if a particular
variable is already defined. If so, we'll add a comma (<TT><FONT FACE="Courier">,</FONT></TT>)
to the end of the existing value before we append the next value.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">$input{$name} .= ',' if (defined($input{$name}));</FONT></TT>
</BLOCKQUOTE>
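<P>
The following sketch shows where that line fits in the loop. It is written in Perl 5 syntax with a hypothetical helper named <TT><FONT FACE="Courier">collect_pairs</FONT></TT>, and it assumes the pairs have already been separated at the ampersands and had their pluses converted to spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the associative array from a list of
# "name=value" strings, joining repeated names with commas.
sub collect_pairs {
    my (@pairs) = @_;
    my %input;
    foreach my $pair (@pairs) {
        my ($name, $value) = split(/=/, $pair, 2);
        next unless defined $name && length $name;   # skip empty pairs
        $value = '' unless defined $value;           # "name" with no "="

        # Add the separator only if this name has been seen before.
        $input{$name} .= ',' if defined $input{$name};
        $input{$name} .= $value;
    }
    return %input;
}

my %input = collect_pairs(
    'zines=Cooking With Soylent Green',
    'zines=Asteroid Living',
    'gift=TriCorder',
);
print "zines = $input{'zines'}\n";
# prints "zines = Cooking With Soylent Green,Asteroid Living"
```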
<P>
The HEX codes are listed in the URL encoded data using the format
<TT><FONT FACE="Courier">%<I>xx</I></FONT></TT> where<I> </I><TT><I><FONT FACE="Courier">xx</FONT></I></TT>
is a hexadecimal number. We convert sequences like this into characters
by using an evaluated substitution.
<BLOCKQUOTE>
<TT><FONT FACE="Courier">$input[$i] =~ s/%([0-9A-Fa-f][0-9A-Fa-f])/pack("C",hex($1))/ge;</FONT></TT>
</BLOCKQUOTE>
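<P>
Here is a minimal sketch of the whole decoding step as one function, tried on values from the raw data shown earlier. The function name <TT><FONT FACE="Courier">url_decode</FONT></TT> is hypothetical, and the sketch uses Perl 5 syntax. It matches only hexadecimal digits after the percent sign and uses the unsigned <TT><FONT FACE="Courier">"C"</FONT></TT> template of <TT><FONT FACE="Courier">pack</FONT></TT>, so that codes above 7F are handled as plain byte values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: undo URL encoding on one name or value.
sub url_decode {
    my ($text) = @_;

    # Convert pluses first, so that a literal plus sent as %2B
    # is not turned into a space afterwards.
    $text =~ s/\+/ /g;

    # Replace each %xx with the character whose code is hex xx.
    # The e option makes Perl evaluate pack(...) for every match.
    $text =~ s/%([0-9A-Fa-f]{2})/pack("C", hex($1))/ge;

    return $text;
}

print url_decode('520+Main+St.+Apt.%23204'), "\n";   # prints "520 Main St. Apt.#204"
print url_decode('1%2B1'), "\n";                     # prints "1+1"
```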
<P>
With these three changes implemented, the output now looks like this:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">address = 520 Main St. #204<BR>
applicant = Ken Hunt<BR>