<P>
Pseudocode is just plain language with a little bit of techie
added for good measure. It's still little more than an application
sketch, but it's the first phase where you begin to draw in the
elements you'll have to deal with when coding the application.
What you start adding to your overall statement of What are the
Hows and Wheres.
<P>
First of all, you have to wonder how data is going to find itself
heading toward your program: Does your program want user input
or not? Is it from a user filling out a form, like in the example
we're building? Does it come from a database, like a random link
program? Does it just do the same thing every time, like a fish-cam?
<P>
In this example, information is coming from a form that might
look roughly like the one shown in Figure 5.1.
<P>
<A HREF="f5-1.gif" ><B>Figure 5.1: </B><I>Shows name, e-mail address, product version, and comments fields.</I></A>
<P>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Note</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
Forms are one of the most common methods of allowing users to enter data to be used by your CGI program. To understand all the things that forms can do for you, you'll want to make sure you look at the material presented in <A HREF="ch8.htm" >Chapter 
8</A>, &quot;Forms and How to Handle Them.&quot;
</BLOCKQUOTE>

</TD></TR>
</TABLE></CENTER>
<P>
This particular form specifies the four bits of data we're interested
in. When the user clicks on the Submit Survey button, it tells
the server to execute the CGI program. This is where the processes
of CGI come into play.
<H3><A NAME="PlanningforProcessing">Planning for Processing</A>
</H3>
<P>
Changing your general program sketch into something a little more
on the techie side can best be approached in steps. In <A HREF="ch3.htm" >Chapter 3</A>,
&quot;Crash Course in CGI,&quot; you were introduced to the processes
of CGI and where data gets placed. Now you just need to define
where those processes fit into your sketch. This is a little bit
of a jump, in some cases, but not much.
<P>
What you want to do is break your sketch into sections, and then
deal with each one of those sections individually before putting
them back together into a real listing of pseudocode. What kind
of sections should you be breaking it up into? Well, there are
really four types of operations for a CGI program:
<OL>
<LI>Initialization/Termination
<LI>Gathering Input
<LI>Processing Input
<LI>Generating Output
</OL>
<P>
Out of these four possible phases, you're normally most concerned
with parts 2 through 4. The initialization and termination of
the CGI program involve memory and process allocation by the server,
as well as some other background processes. While they're important
steps, they're taken care of by the server software and the operating
system; and they're out of your hands other than providing someone
with a way to start your script through a hard-coded link or a
form action. Sure, if you do something very strange with allocated
memory or file locking, you'll want to be certain you clear that
up (in case the server can't do it for you); but for the most
part, you're out of the loop, if you're careful.
<H3><A NAME="GatheringInput">Gathering Input</A></H3>
<P>
There are very few CGI programs out there that don't take input
of one kind or another. Whether it's from a user's form, a link,
or even from an external file or device, something's normally
being read in. In your program, where is data coming from? For
every possible source, you need to apply acceptable methods of
going in and getting that data.
<P>
The example we've been batting about, where you're obtaining the
name, e-mail address, and comments from users, involves a form.
To read the data in, the program has to determine where the data
is. In this case, it's user data, so it's coming from environment
variables and possibly standard input (STDIN), as well. You'll
need to isolate where the data is by determining how it's getting
to you, and then you'll be able to read it in. Because there are
really two methods for a client to send data through an HTTP request,
you want to determine if it's one of the two, and then act on
it. If it's not either of the conditions you were expecting, bail
out. Listing 5.1 shows pseudocode that checks where the data is
coming from and produces an error if it's neither
of the expected situations.
<HR>
<BLOCKQUOTE>
<B>Listing 5.1. Determining the source of user data.<BR>
</B>
</BLOCKQUOTE>
<BLOCKQUOTE>
<TT><FONT FACE="Courier">read in REQUEST_METHOD environment variable
<BR>
if REQUEST_METHOD is GET<BR>
&nbsp;&nbsp;&nbsp;&nbsp;read in QUERY_STRING environment variable
<BR>
if REQUEST_METHOD is POST<BR>
&nbsp;&nbsp;&nbsp;&nbsp;read in CONTENT_LENGTH environment variable
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;read CONTENT_LENGTH bytes from Standard
Input (STDIN)<BR>
otherwise<BR>
&nbsp;&nbsp;&nbsp;&nbsp;create an error message and end the program</FONT></TT>
</BLOCKQUOTE>
<HR>
<P>
Environment variables, like <TT><FONT FACE="Courier">REQUEST_METHOD</FONT></TT>,
provide storage for information about the client's request. When
the client makes its request through the <TT><FONT FACE="Courier">GET</FONT></TT>
method, all the data is stored in the <TT><FONT FACE="Courier">QUERY_STRING</FONT></TT>
environment variable. When the client uses the <TT><FONT FACE="Courier">POST</FONT></TT>
method, the data is sent to your program on STDIN, and a count of how many
bytes were sent is made available to your program in the environment
variable <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT>.
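<P>
The pseudocode in Listing 5.1 can be sketched in a real language.
What follows is a minimal sketch in Python (not the Perl used elsewhere
in this chapter). The function name <TT><FONT FACE="Courier">read_form_data</FONT></TT>
is hypothetical, and the environment and input stream are passed in as
arguments so the example can be tried without a Web server.

```python
import io


def read_form_data(environ, stdin):
    """Return the raw, still-encoded form data for a GET or POST request."""
    method = environ.get("REQUEST_METHOD", "")
    if method == "GET":
        # GET data arrives in the QUERY_STRING environment variable.
        return environ.get("QUERY_STRING", "")
    if method == "POST":
        # POST data arrives on standard input; CONTENT_LENGTH says how much.
        # For plain URL-encoded (ASCII) data, bytes and characters match.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length)
    # Neither expected method: bail out, as the pseudocode does.
    raise ValueError("unsupported request method: %r" % method)


# Simulated requests; a real script would pass os.environ and sys.stdin.
get_data = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "name1=value1"},
    io.StringIO(""),
)
post_data = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "12"},
    io.StringIO("name1=value1&ignored"),
)
```

<P>
Reading only <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT> characters
is deliberate: the program shouldn't assume the input stream ends where
the form data does.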
<H3><A NAME="Processing">Processing</A></H3>
<P>
The Processing phase of a CGI program is where you let your design
run wild. There's nothing that says you have to do your task in
a specific way or what the result of it all has to be. The two
things you need to pay attention to are making sure you correctly
interpret information that's sent to the application and that
it finishes the tasks you assign it.
<H4>Dealing with Input</H4>
<P>
By the time you get hold of incoming user data, two steps have
been applied to it by the client and the server. It's up to your
program to undo those steps and get the information back that
it needs, but to do that you need to understand what's already
been done.
<H5>Ordered Pairs</H5>
<BLOCKQUOTE>
Information comes to your program in <I>ordered pairs</I>. That
means that wherever applicable, there's a named chunk of data
and a value that goes along with that name. The format looks something
like this:
</BLOCKQUOTE>
<BLOCKQUOTE>
<TT><FONT FACE="Courier">name1=value1&amp;name2=value2&amp;name3=more+values+here</FONT></TT>
</BLOCKQUOTE>
<BLOCKQUOTE>
In this example, there are three separate pairs of information,
each separated by an ampersand (<TT><FONT FACE="Courier">&amp;</FONT></TT>).
As you'll see in <A HREF="ch8.htm" >Chapter 8</A>, you can control
what the names are for these value pairs, which will make it easier
to do things with the data and identify what the values are really
there for.
</BLOCKQUOTE>
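<BLOCKQUOTE>
Pulling the pairs apart might look something like the following. This is
a minimal Python sketch, not code from any particular library, and
<TT><FONT FACE="Courier">split_pairs</FONT></TT> is a hypothetical name.
The values it returns are still encoded.
</BLOCKQUOTE>

```python
def split_pairs(raw):
    """Split 'a=1&b=2' into a list of (name, value) tuples, still encoded."""
    pairs = []
    for chunk in raw.split("&"):
        if not chunk:
            continue  # skip empty pieces, e.g. from a trailing '&'
        # Split on the first '=' only; a missing '=' yields an empty value.
        name, _, value = chunk.partition("=")
        pairs.append((name, value))
    return pairs


pairs = split_pairs("name1=value1&name2=value2&name3=more+values+here")
```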
<H5>URL Encoding</H5>
<BLOCKQUOTE>
The other step that takes place when the data is sent is the replacement
of special characters with a substitute value. In the preceding
example, for the name and value pairs, you'll notice that in the
last pair of information there are plus signs (<TT><FONT FACE="Courier">+</FONT></TT>)
between the words <TT><FONT FACE="Courier">more</FONT></TT>, <TT><FONT FACE="Courier">values</FONT></TT>,
and <TT><FONT FACE="Courier">here</FONT></TT>. This is the tip
of the iceberg: When sending data that has spaces in it, those
spaces are changed over to plus signs (<TT><FONT FACE="Courier">+</FONT></TT>)
so that the data is one continuous string with no information
that could be interpreted as a break.
</BLOCKQUOTE>
<BLOCKQUOTE>
Other special characters include back and forward slashes, ampersands,
line feeds/carriage returns, tildes, percent signs, and a variety
of others. Whenever one of these characters is encoded, you'll
see a percent sign followed by two hexadecimal digits, such as <TT><FONT FACE="Courier">%25</FONT></TT>.
Those two digits are the hexadecimal
value of the character that originally went there.
</BLOCKQUOTE>
<BLOCKQUOTE>
For instance, because you use the percent sign as a special character,
you'd need to encode it. Instead of seeing the character <TT><FONT FACE="Courier">%</FONT></TT>
in the data, you'd see its encoded equivalent, which is <TT><FONT FACE="Courier">%25</FONT></TT>.
</BLOCKQUOTE>
<BLOCKQUOTE>
So, before your program tries to do anything with that
data, split it into its name and value pairs. Then, in each piece, convert all plus signs (<TT><FONT FACE="Courier">+</FONT></TT>)
to spaces, find all the <TT><FONT FACE="Courier">%##</FONT></TT>
combinations, and convert them back to their original form, using
whatever's available in your programming language.
</BLOCKQUOTE>
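<BLOCKQUOTE>
The decoding steps above can be sketched as follows, again in Python
and only as an illustration. The hand-written
<TT><FONT FACE="Courier">decode_manually</FONT></TT> function is a
hypothetical helper that handles single-byte (ASCII) escapes only; the
standard library's <TT><FONT FACE="Courier">urllib.parse.unquote_plus</FONT></TT>
does the same job and is shown alongside it.
</BLOCKQUOTE>

```python
import string
from urllib.parse import unquote_plus

HEX = set(string.hexdigits)


def decode_manually(text):
    """Undo URL encoding by hand; handles single-byte (ASCII) escapes only."""
    text = text.replace("+", " ")  # plus signs first, so %2B survives as '+'
    out = []
    i = 0
    while i < len(text):
        digits = text[i + 1:i + 3]
        if text[i] == "%" and len(digits) == 2 and set(digits) <= HEX:
            out.append(chr(int(digits, 16)))  # e.g. %25 becomes '%'
            i += 3
        else:
            out.append(text[i])  # ordinary character, or a stray '%'
            i += 1
    return "".join(out)


value = decode_manually("more+values+here")
percent = decode_manually("100%25")
library_value = unquote_plus("a%2Fb+%26+c")
```

<BLOCKQUOTE>
Decoding each name and value after the pairs have been split apart
matters: an encoded ampersand or equal sign inside a value would
otherwise be mistaken for a separator.
</BLOCKQUOTE>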
<H4>Completing Your Tasks</H4>
<P>
What's the point of having a CGI program if it doesn't do what
you want it to do in the first place? While you have complete
control over what's done, and how, keep these things in mind:
<OL>
<LI>Provide error checking at every complex step.
<LI>Don't get fancy when simple will work just as well.
<LI>Be prepared for the unexpected: provide time-outs and other
failsafes to ensure that your program doesn't just sit there.
<LI>Be concerned about security: don't leave a hole that you think
no one will find. They'll find it.
<LI>Make sure you've provided for all possible cases of data.
</OL>
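<P>
As one illustration of checking for all possible cases of data, here
is a minimal Python sketch that looks over the decoded survey fields
before anything is done with them. The field names
(<TT><FONT FACE="Courier">name</FONT></TT>, <TT><FONT FACE="Courier">email</FONT></TT>,
<TT><FONT FACE="Courier">comments</FONT></TT>), the length limit, and
the function name are all assumptions made for the example, not part
of the survey form itself.

```python
MAX_COMMENT_LENGTH = 2000  # assumed limit, chosen only for illustration


def check_survey(fields):
    """Return a list of problems found in the decoded survey fields."""
    problems = []
    name = fields.get("name", "").strip()
    email = fields.get("email", "").strip()
    comments = fields.get("comments", "")
    if not name:
        problems.append("Please enter your name.")
    # Deliberately loose check: one '@' with text on both sides.
    local, at, domain = email.partition("@")
    if not (at and local and domain and "@" not in domain):
        problems.append("Please enter a valid e-mail address.")
    if len(comments) > MAX_COMMENT_LENGTH:
        problems.append("Please shorten your comments.")
    return problems


ok = check_survey({"name": "Ann", "email": "ann@example.com", "comments": "hi"})
missing = check_survey({})
```

<P>
Missing fields are treated the same as empty ones, so a request that
leaves a field out entirely doesn't crash the program.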
<H3><A NAME="GeneratingOutput">Generating Output</A></H3>
<P>
Is your program going to tell the user when it's done doing what
it was doing? Most likely it will, unless you're playing around
with server-push images and just letting it sit there forever.
Because output is a very important part of the application, give
it at least as much thought as you give to accepting input. If
your program has error handling, consider what kinds of errors
you're going to return to the user. Would <TT><FONT FACE="Courier">Error
4A</FONT></TT> give the user any idea what to do next? How about
<TT><FONT FACE="Courier">I'm sorry, I can't do that right now</FONT></TT>?
Feedback is either data that the user was expecting or information
the user needs to know, such as an execution error. If you've
taken the time to check for the errors in the first place, take
a little more time and help create errors that make sense, or
at least don't impart a feeling of hopelessness in the user.
<P>
Output the user was expecting can vary, as well. Any type of output
you send back to the server and the client needs to be prefaced
with some instructions telling the server what kind of data it
is. For instance, if you're thanking a user for filling out the
survey, you're normally sending back HTML. The way to do this
is to instruct the server that you're sending back HTML, and then
send it. You can do this in Perl, as shown in Listing 5.2.
<HR>
<BLOCKQUOTE>
<B>Listing 5.2. Sample HTML response in Perl.<BR>
</B>
</BLOCKQUOTE>
<BLOCKQUOTE>
<TT><FONT FACE="Courier"># The blank line after the header (\n\n) marks the end of the headers.
<BR>
print &quot;Content-type: text/html\n\n&quot;;
<BR>
print &quot;&lt;h1&gt;Survey Received&lt;/h1&gt;\n&quot;;
<BR>
print &quot;Thanks for submitting the survey, we appreciate it.\n&quot;;</FONT></TT>
</BLOCKQUOTE>
<HR>
<P>
All that's needed is a <TT><FONT FACE="Courier">Content-type:</FONT></TT>
header followed by a blank line. The header names the MIME (Multipurpose
Internet Mail Extensions) type of the information, which gives the server
and the client some clue as to what to do with it.
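<P>
The same idea can be wrapped in a small helper so that every response,
including error messages, goes out with its header. This is a minimal
Python sketch; <TT><FONT FACE="Courier">build_response</FONT></TT> is a
hypothetical name, and escaping the text with the standard
<TT><FONT FACE="Courier">html.escape</FONT></TT> function is a choice
made for the example.

```python
import html


def build_response(title, message):
    """Return a complete CGI response: header, blank line, then HTML body."""
    header = "Content-type: text/html\n\n"  # the blank line ends the headers
    body = "<h1>%s</h1>\n<p>%s</p>\n" % (
        html.escape(title),
        html.escape(message),
    )
    return header + body


response = build_response(
    "Survey Received", "Thanks for submitting the survey."
)
```

<P>
A real script would print the returned string to standard output,
where the server picks it up.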
<H2><A NAME="TheFinePrint"><FONT SIZE=5 COLOR=#FF0000>The Fine
Print</FONT></A></H2>
<P>
With pseudocode in hand, roll up your sleeves and sit down in
front of the machine. The time of reckoning has come: It's time
to let the code hit the machine. What you need to consider now
are the ways of performing the tasks you've laid out for yourself
and make sure everything is going to work smoothly, without too
much effort on your part.
<H3><A NAME="Libraries">Libraries</A></H3>
<P>
Let's take a look at &quot;without too much effort&quot; for a
moment. Looking at your application, are there things in it that
you're not sure you know how to do, things that could be a real
pain? For instance, you might face writing your own special code to generate
images on-the-fly, or creating a whole URL decoding sequence just
for one tiny, little three-line program that's just supposed to
echo someone's name back as a cool example of what you did with
CGI. Don't worry; you're not alone.
<P>
CGI libraries are very common because there are so many people
doing CGI programming, and people have found easy ways of getting
some of the most repetitive and complex tasks done without too
much suffering. In fact, <A HREF="ch4.htm" >Chapter 4</A>, &quot;Comparison
of the Various CGI Programming Libraries,&quot; is devoted entirely
to the topic of libraries. They're everywhere. The point of libraries
is to save you time and effort by providing you (normally at no
charge) premade and pretested routines that perform certain tasks
for you.
<P>
A great example of this is the classic cgi-lib.pl library for
Perl, written by Steven Brenner and in current use by more people
and their programs than can be counted easily. This simple library
takes the drudge work out of reading in data and turns lines upon
lines of code that beginning programmers may not be comfortable
with into one reference to a subroutine that does everything for
you. Imagine being able to find several pieces of code like this
that people have made freely available that do the things you've