one of two ways, depending on how the user ended up starting the
conversation with the server. The user doesn't really have any
control over this, so they can't help you. What you need to do
is find where the server wrote down how information was sent.
That place is the <TT><FONT FACE="Courier">REQUEST_METHOD</FONT></TT>
environment variable.
<P>
There are two commonly used values for the <TT><FONT FACE="Courier">REQUEST_METHOD</FONT></TT>
environment variable: <TT><FONT FACE="Courier">GET</FONT></TT>
and <TT><FONT FACE="Courier">POST</FONT></TT>. By looking at <TT><FONT FACE="Courier">REQUEST_METHOD</FONT></TT>
and seeing which of these methods the request used, the CGI program
can then decide where its data is hiding and can go out and get
it. What's different about the two methods? Find out in the following
sections.
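<P>
The check described above can be sketched in a few lines. This is only an illustration in Python, not code from this chapter; the helper names <TT><FONT FACE="Courier">request_method</FONT></TT> and <TT><FONT FACE="Courier">data_source</FONT></TT> are made up for the example.

```python
import os


def request_method(environ=os.environ):
    """Return the method the server recorded, upper-cased.

    Returns an empty string when the variable is missing, for example
    when the script is run from a shell instead of by a web server.
    """
    return environ.get("REQUEST_METHOD", "").upper()


def data_source(environ=os.environ):
    """Say where the request data should be read from (hypothetical helper)."""
    method = request_method(environ)
    if method == "GET":
        return "QUERY_STRING"
    if method == "POST":
        return "STDIN"
    return "unknown"


if __name__ == "__main__":
    print(data_source())
```

Passing the environment in as a parameter is a design choice for the sketch: it lets you try the function with a plain dictionary before a real server is involved.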
<H4><TT><FONT FACE="Courier">GET</FONT></TT></H4>
<P>
When a request uses the <TT><FONT FACE="Courier">GET</FONT></TT>
method, the data being sent by the user travels in a very simple
way: The browser tacks it on to the end of the URL (the location
of your script). So, let's say that the
URL to get to your script is <TT><FONT FACE="Courier">http://yourplace.com/cgi-bin/mine.pl</FONT></TT>,
and you just want to pass it the word <TT><FONT FACE="Courier">&quot;catapult&quot;</FONT></TT>.
The new URL that gets sent to the server is <TT><FONT FACE="Courier">http://yourplace.com/cgi-bin/mine.pl?catapult</FONT></TT>.
The question mark (<TT><FONT FACE="Courier">?</FONT></TT>) is
what the server uses to tell where the script's location ends and
the data starts. Everything that comes after the question mark
(<TT><FONT FACE="Courier">?</FONT></TT>) is placed by the server into the
<TT><FONT FACE="Courier">QUERY_STRING</FONT></TT> environment
variable. So in this case, the environment variable <TT><FONT FACE="Courier">QUERY_STRING</FONT></TT>
would just contain the word <TT><FONT FACE="Courier">&quot;catapult&quot;</FONT></TT>.
<P>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Note</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
If you've ever seen an entry in an HTML file that looks something like <TT><FONT FACE="Courier">&lt;a href=/cgi-bin/mine.pl?data&gt;</FONT></TT>, then you've seen an example of how the <TT><FONT FACE="Courier">GET</FONT></TT> method works. Following a link always sends a <TT><FONT FACE="Courier">GET</FONT></TT> request, so if the link's URL has a question mark, everything after it arrives in <TT><FONT FACE="Courier">QUERY_STRING</FONT></TT>. This easy method of calling a script with fixed data is often used for things like random link programs, viewing stock values for a specific company's stock, and other such things where the result may change and the users may change, but the data that goes into the program stays the same.
</BLOCKQUOTE>

</TD></TR>
</TABLE></CENTER>
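<P>
To make the <TT><FONT FACE="Courier">GET</FONT></TT> case concrete, here is a small Python sketch (an illustration, not code from this chapter). It shows the split at the question mark using the standard <TT><FONT FACE="Courier">urllib.parse.urlsplit</FONT></TT> function, and then reads <TT><FONT FACE="Courier">QUERY_STRING</FONT></TT> from a dictionary standing in for the environment a server would set up. The names <TT><FONT FACE="Courier">get_query_string</FONT></TT> and <TT><FONT FACE="Courier">fake_environ</FONT></TT> are hypothetical.

```python
import os
from urllib.parse import urlsplit


def get_query_string(environ=os.environ):
    """Return the raw text after the '?' in the URL, or '' if none was sent."""
    return environ.get("QUERY_STRING", "")


# The same split the server performs: everything after '?' is the query.
url = "http://yourplace.com/cgi-bin/mine.pl?catapult"
query_part = urlsplit(url).query

# A stand-in for the environment the server would hand to the script.
fake_environ = {"REQUEST_METHOD": "GET", "QUERY_STRING": query_part}

print(get_query_string(fake_environ))  # prints: catapult
```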
<P>
<H4><TT><FONT FACE="Courier">POST</FONT></TT></H4>
<P>
Using the <TT><FONT FACE="Courier">POST</FONT></TT> method allows
the server to accept more information, so you'll normally see
it used with forms and anything else that has lots of data
to send. The difficulty is that it's a little harder to get the
data that's been sent in. When a request uses the <TT><FONT FACE="Courier">POST</FONT></TT>
method, the data isn't attached to the URL; instead, the server hands
it to your program on Standard
Input (<TT><FONT FACE="Courier">STDIN</FONT></TT>). While it seems
that it would then be just as easy to read <TT><FONT FACE="Courier">STDIN</FONT></TT>
as it is to read <TT><FONT FACE="Courier">QUERY_STRING</FONT></TT>,
it's not. The server isn't required to mark the end of the data,
so a program that simply reads <TT><FONT FACE="Courier">STDIN</FONT></TT> until it
runs dry can end up waiting forever for something that never arrives.
<P>
To help you out, the environment variable <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT>
tells you how much data was placed into <TT><FONT FACE="Courier">STDIN</FONT></TT>.
If there were 500 bytes, <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT>
will be the value 500. If there were 10 bytes, <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT>
will be the value 10. What this allows you to do is use your programming
language's easiest method to read that number of bytes of data
from <TT><FONT FACE="Courier">STDIN</FONT></TT> and then do something
with it.
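<P>
Reading exactly <TT><FONT FACE="Courier">CONTENT_LENGTH</FONT></TT> bytes might look like the following Python sketch. It is a minimal illustration under a few assumptions: the function name <TT><FONT FACE="Courier">read_post_body</FONT></TT> is made up, a missing or malformed length is treated as "no data," and the input stream can be swapped out so the function can be tried without a server.

```python
import io
import os
import sys


def read_post_body(environ=os.environ, stream=None):
    """Read exactly CONTENT_LENGTH bytes of POST data and return them as bytes."""
    if stream is None:
        stream = sys.stdin.buffer  # binary view of standard input
    try:
        length = int(environ.get("CONTENT_LENGTH", "0"))
    except ValueError:
        length = 0  # assumption: treat a malformed value as "no data"
    if length <= 0:
        return b""
    # Ask for the stated number of bytes only; never read "until the end".
    return stream.read(length)


# Try it with an in-memory stream that holds more than the stated length.
sample = b"name=Bill&company=Aimtech"
demo_environ = {"CONTENT_LENGTH": str(len(sample))}
demo_body = read_post_body(demo_environ, io.BytesIO(sample + b"leftover"))
print(demo_body.decode("ascii"))  # prints: name=Bill&company=Aimtech
```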
<H3><A NAME="StrangeLookingData">Strange Looking Data</A></H3>
<P>
When you get hold of the data, from either <TT><FONT FACE="Courier">QUERY_STRING</FONT></TT>
or <TT><FONT FACE="Courier">STDIN</FONT></TT>, you may notice
that it looks kind of strange. For instance, you might end up
with a long string of data that looks like this:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">name=Bill&amp;company=Aimtech&amp;email=bills&amp;stuff=%25+signs</FONT></TT>
</BLOCKQUOTE>
<P>
What you've run into are the two steps that the client and server
run before giving you access to the data. You should really thank
them for doing it because it's designed to remove possible &quot;problem&quot;
characters that could make the CGI program misbehave and also
to organize everything into one convenient group. Let's look at
what it has really done.
<H4><TT><FONT FACE="Courier">Name=Value</FONT></TT> Pairs</H4>
<P>
To try to help you get organized, the behind-the-scenes CGI mechanisms
have arranged everything into pairs of information, separated
by ampersands (<TT><FONT FACE="Courier">&amp;</FONT></TT>). If
you've ever seen an HTML form (and you definitely will in <A HREF="ch8.htm" >Chapter 8</A>,
&quot;Forms and How to Handle Them&quot;), you may be familiar
with the fact that each possible area where information can be
entered has a name associated with it. For instance, a form might
have fields for <TT><FONT FACE="Courier">&quot;name&quot;</FONT></TT>,
<TT><FONT FACE="Courier">&quot;company&quot;</FONT></TT>, <TT><FONT FACE="Courier">&quot;email&quot;</FONT></TT>,
and <TT><FONT FACE="Courier">&quot;stuff&quot;</FONT></TT>. This
helps the CGI program make sense of what information comes from
where. You wouldn't want it to start confusing e-mail addresses
and names, would you?
<P>
What happens is that every piece of data that can have a name
associated with it gets one. This all happens automatically; you just
see the results. This kind of formatting is usually called &quot;Name=Value
pairs&quot;; this chapter also calls them &quot;ordered pairs&quot;
because each one is ordered in the <TT><FONT FACE="Courier">Name=Value</FONT></TT>
fashion, and whatever data is sent first is the first pair in
the order. So if the <TT><FONT FACE="Courier">&quot;email&quot;</FONT></TT>
field was first, the order would be <TT><FONT FACE="Courier">email=bills&amp;name=...</FONT></TT>
and so on, until all the fields and their information had been
accounted for.
<H4>URL Encoding</H4>
<P>
The other process that's already happened to the data as it comes
in is called <I>URL encoding</I>, or <I>escaping</I>, of special
characters. The reason this has been done is to prevent any accidental
interpretation of characters like percent signs (<TT><FONT FACE="Courier">%</FONT></TT>),
backslashes (<TT><FONT FACE="Courier">\</FONT></TT>), and other
pieces that would toss the server or the CGI program for a loop.
<P>
So how do you know when things have been encoded and what's a
special character? Well, anything that has an ASCII value greater
than 127 or lower than 33 is going to get encoded, and so is any
reserved character (such as an ampersand or an equal sign) that
shows up inside the user's data rather than as a separator. But what the
heck does that mean? Knowing that most of us haven't memorized
the ASCII character table (because it's not something you really
need to know during parties or casual conversation), all that
you really need to know is this: Anything that's in the format
<TT><FONT FACE="Courier">%##</FONT></TT> (such as <TT><FONT FACE="Courier">%25</FONT></TT>),
where <TT><FONT FACE="Courier">##</FONT></TT> is a pair of hexadecimal digits,
is a special character that's been encoded. How do you know someone
didn't accidentally put a percent sign (<TT><FONT FACE="Courier">%</FONT></TT>)
in a string and cause confusion? Because when percent signs are
used as part of the information being sent by the user, the percent
signs get encoded, too. In fact, <TT><FONT FACE="Courier">%25</FONT></TT>
is actually the percent sign when it's encoded.
<P>
You might wonder, though, what kind of special characters show
up in the data that aren't encoded because they mean something
special. For instance, let's look at the sample data shown previously:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">name=Bill&amp;company=Aimtech&amp;email=bills&amp;stuff=%25+signs</FONT></TT>
</BLOCKQUOTE>
<P>
You can see the <TT><FONT FACE="Courier">%25</FONT></TT> there
near the end, which means there was originally a percent sign there
that got encoded. But what about the plus sign (<TT><FONT FACE="Courier">+</FONT></TT>),
the equal signs (<TT><FONT FACE="Courier">=</FONT></TT>), and the
ampersands (<TT><FONT FACE="Courier">&amp;</FONT></TT>)? They all
have special reserved functionality, as you see in Table 3.4.
Each one marks either a break in the data or
a special piece of encoding.<BR>
<P>
<CENTER><B>Table 3.4. Characters reserved for use in URL encoded
strings.</B></CENTER>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><I>Name</I></TD><TD WIDTH=83><CENTER><I>Character</I></CENTER>
</TD><TD WIDTH=320><I>Purpose</I></TD></TR>
<TR><TD WIDTH=95>Ampersand</TD><TD WIDTH=83><CENTER><TT><FONT FACE="Courier">&amp;</FONT></TT></CENTER>
</TD><TD WIDTH=320>Joins ordered pairs together.</TD></TR>
<TR><TD WIDTH=95>Equal</TD><TD WIDTH=83><CENTER><TT><FONT FACE="Courier">=</FONT></TT></CENTER>
</TD><TD WIDTH=320>Separates pair names from values.</TD></TR>
<TR><TD WIDTH=95>Percent</TD><TD WIDTH=83><CENTER><TT><FONT FACE="Courier">%</FONT></TT></CENTER>
</TD><TD WIDTH=320>Marks the beginning of an encoded character.
</TD></TR>
<TR><TD WIDTH=95>Plus</TD><TD WIDTH=83><CENTER><TT><FONT FACE="Courier">+</FONT></TT></CENTER>
</TD><TD WIDTH=320>Substitutes for space.</TD></TR>
</TABLE></CENTER>
<P>
<P>
The plus sign (<TT><FONT FACE="Courier">+</FONT></TT>) is kind
of strange because <TT><FONT FACE="Courier">%20</FONT></TT> also
means that there should be a space, but because spaces are so
common, it looks nicer to just have a little <TT><FONT FACE="Courier">+</FONT></TT>
sign instead of <TT><FONT FACE="Courier">%20</FONT></TT> over
and over again.
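<P>
You can watch the encoding happen with Python's standard <TT><FONT FACE="Courier">urllib.parse</FONT></TT> module. This is only an illustration of the rules in Table 3.4, not the code a browser actually runs; the field values are the ones from the sample string, with the original text <TT><FONT FACE="Courier">% signs</FONT></TT> assumed for the last field.

```python
from urllib.parse import quote_plus, unquote_plus, urlencode

# The four form fields, in the order they would be sent.
fields = [
    ("name", "Bill"),
    ("company", "Aimtech"),
    ("email", "bills"),
    ("stuff", "% signs"),
]

# urlencode joins pairs with '&', names and values with '=',
# turns spaces into '+', and escapes everything else as %XX.
encoded = urlencode(fields)
print(encoded)  # prints: name=Bill&company=Aimtech&email=bills&stuff=%25+signs

# Reserved characters inside the data itself are escaped, so they
# cannot be mistaken for separators.
tricky = quote_plus("a+b&c=d")
print(tricky)  # prints: a%2Bb%26c%3Dd

# '+' and '%20' both decode to a space.
same_text = unquote_plus("%25+signs") == unquote_plus("%25%20signs")
print(same_text)  # prints: True
```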
<P>
By now you might be thinking &quot;Okay, so it's encoded; but
as what? And how do I decode it to make sense of it all?&quot;
What it's encoded as is easy: The two characters after the percent
sign are the character's ASCII value written in hexadecimal. How you decode it,
well, that's a little bit more involved. What you want to do is
break up the data into individual ordered pairs, which means that
every time you see an ampersand (<TT><FONT FACE="Courier">&amp;</FONT></TT>),
you want to make a new pair. Then you want to split each ordered
pair into the Name and the Value by breaking it apart at the
equal sign (<TT><FONT FACE="Courier">=</FONT></TT>). Next, you
want to substitute spaces for any of the plus signs (<TT><FONT FACE="Courier">+</FONT></TT>)
you see. Now you're ready to use whatever method your programming
language makes available to convert all the special <TT><FONT FACE="Courier">%##</FONT></TT>
characters into their real values.
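<P>
The four steps above can be sketched by hand as follows. This Python version is an illustration only: the function names <TT><FONT FACE="Courier">decode_value</FONT></TT> and <TT><FONT FACE="Courier">parse_pairs</FONT></TT> are made up, and the sketch assumes each <TT><FONT FACE="Courier">%##</FONT></TT> stands for a single one-byte character, so it does not handle multibyte encodings.

```python
import string


def decode_value(text):
    """Turn '+' into spaces, then replace each %XX with the character it names."""
    text = text.replace("+", " ")  # step 3: plus signs become spaces
    out = []
    i = 0
    while i < len(text):
        hex_digits = text[i + 1:i + 3]
        is_escape = (
            text[i] == "%"
            and len(hex_digits) == 2
            and all(c in string.hexdigits for c in hex_digits)
        )
        if is_escape:
            # step 4: two hex digits give the character's numeric value
            out.append(chr(int(hex_digits, 16)))
            i += 3
        else:
            out.append(text[i])  # ordinary character, or a stray '%'
            i += 1
    return "".join(out)


def parse_pairs(data):
    """Split raw form data into a list of (name, value) tuples, in order."""
    pairs = []
    for pair in data.split("&"):  # step 1: one pair per ampersand
        if not pair:
            continue  # skip empty pieces such as a trailing '&'
        name, _, value = pair.partition("=")  # step 2: split at first '='
        pairs.append((decode_value(name), decode_value(value)))
    return pairs


sample = "name=Bill&company=Aimtech&email=bills&stuff=%25+signs"
for name, value in parse_pairs(sample):
    print(name, "->", value)
```

The order of the last two steps matters: plus signs are replaced before the <TT><FONT FACE="Courier">%##</FONT></TT> sequences are decoded, so a real plus sign that arrived as <TT><FONT FACE="Courier">%2B</FONT></TT> isn't turned into a space by mistake.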
<P>
<CENTER><TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR><TD><B>Tip</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
There are a large number of data-processing libraries that will do all that for you, so you never even have to worry about it. All you do is insert a line that calls the library's parsing function and let it do all the work for you.</BLOCKQUOTE>

</TD></TR>
</TABLE></CENTER>
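<P>
As one example of such a library, Python's standard <TT><FONT FACE="Courier">urllib.parse</FONT></TT> module does the whole job in a single call. This is just one possibility, shown as a sketch; the library you use will depend on your language.

```python
from urllib.parse import parse_qs, parse_qsl

raw = "name=Bill&company=Aimtech&email=bills&stuff=%25+signs"

# parse_qsl keeps the pairs in the order they were sent.
ordered = parse_qsl(raw)
print(ordered)

# parse_qs groups values by name; each name maps to a list because
# a form may send the same name more than once.
by_name = parse_qs(raw)
print(by_name["stuff"][0])  # prints: % signs
```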
<P>
<H2><A NAME="RSVP"><FONT SIZE=5 COLOR=#FF0000>RSVP</FONT></A>
</H2>
<P>
When people start a conversation, you normally respond, unless
you're ignoring them. With conversations between a user and your
CGI program, it's important to make sure that you do something
to let the user know the conversation is over, preferably without
slamming the connection closed on them. So how do you eloquently
end the conversation? It all depends on what you want to say in
closing.
<H3><A NAME="TypesofResponses">Types of Responses</A></H3>
<P>
There are a lot of good reasons to generate output that gets sent
back to the user. Normally, the whole purpose of the application
is to obtain that information and then send it along, as with
what happens when people use a search engine. In this case, the
general idea is that the program has accomplished the mission
it was assigned by evaluating the submitted data and coming up
with something useful in return, and it's ready to call it a day.
<P>
Fortunately, or unfortunately, users can't stop the CGI program
once it decides to generate the output and cease and desist; they
can only grumble and restart the process all over again. To prevent
them from having to do that needlessly, it's important for CGI
programmers to make sure that output is carefully thought through
so there are no surprises.
<P>
Normally, output falls into three classes: successful, not successful,
and something else. <I>Successful</I> is the kind of result you
get back when a search engine finds some matches to your inquiry
and presents them to you in what it hopes is an orderly fashion.
<I>Not successful</I> is pretty self-explanatory; it means that
something went wrong, and you're not going to get what you're
