content-length: 27
http-headers: whatever

fname=John&lname=Doe+a+Deer
</PRE>
<P>

<P>In this case the same encoding takes place, but the query string is
submitted as the content of the request, not as part of the resource name.
You might also notice the request includes an HTTP header specifying the
length of the content - this is important as we'll see soon...</P>

</TD> </TR> </TABLE>

<HR>

<A NAME=query>
<H3>How does the web server pass the query to my CGI program?</H3>
</A>

<P>Once again, the answer depends on whether your CGI program is referenced
with a GET request, or with a POST request. </P>

<P><B>GET method: </B> When a GET request is received by the web server
and the resource specified (everything before the '?') is your CGI
program, the web server will grab everything after the '?' and stuff
it in to the environment variable named QUERY_STRING. The web server
will also set the environment variable REQUEST_METHOD to the value
"GET".  Then the web server will start up your CGI program connecting
STDOUT of your program to a pipe the server can read. Your program
should get the query by reading the environment variable QUERY_STRING,
and then process the query and send the results to the web server by
simply writing to STDOUT.</P>
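<P>A minimal sketch of the GET case in C, assuming the server has set
REQUEST_METHOD and QUERY_STRING as described above. The helper name
<CODE>select_query</CODE> is made up for illustration; a CGI program would
call it with the two <CODE>getenv</CODE> results.</P>

```c
#include <string.h>

/* Decide which query string to use for a GET request.
 * method and query are the values of REQUEST_METHOD and QUERY_STRING,
 * e.g. select_query(getenv("REQUEST_METHOD"), getenv("QUERY_STRING")).
 * select_query is a hypothetical helper, not part of any CGI library. */
const char *select_query(const char *method, const char *query)
{
    if (method == NULL || strcmp(method, "GET") != 0)
        return NULL;        /* not a GET request */
    if (query == NULL)
        return "";          /* GET without a '?' part: treat as empty query */
    return query;
}
```

<P>A NULL result means the request was not a GET, so the program should
look for the query on STDIN instead (or report an error).</P>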

<P>Many operating systems limit the size of environment
variables - this might get in the way if you have large queries, since
the entire query must be able to fit in QUERY_STRING. Most non-trivial
queries are submitted using the POST method.</P>

<P><B>POST method:</B> When a POST request is received by the web server
and the resource specified is your CGI program, the web server will
read the HTTP headers (including the one specifying the content
length) and set the environment variable CONTENT_LENGTH. The
REQUEST_METHOD environment variable will be set to POST. Now your CGI
program will be started up with STDIN and STDOUT attached to pipes
going back to the web server. The server will now write the entire query string
(the content of the POST) to the pipe connected to your STDIN.</P>

<P>Your program should get the length of the query string from the
CONTENT_LENGTH environment variable so it knows how much to read from
STDIN (coming from the web server). BE CAREFUL! Don't use a static
array unless you are willing to refuse to read the entire query (it
might be larger than your array and could screw up your program and
make it possible for bad guys to break into your machine, delete all
your files, send mail from you to the FBI suggesting that you might be
someone they are looking for, and worst of all - create a really lousy
project and submit it to netprog-submit@cs.rpi.edu...).</P>
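<P>One way to avoid the static array is to size the buffer from
CONTENT_LENGTH itself. The following is only a sketch under some
assumptions: the name <CODE>read_post_body</CODE> and the cap
<CODE>MAX_BODY</CODE> are invented for this example, and a real program
should pick a limit that suits its own queries.</P>

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_BODY 65536L   /* hypothetical upper limit on a query */

/* Read exactly the number of bytes announced in content_length from fd.
 * Returns a malloc'd, NUL-terminated buffer, or NULL on any error.
 * In a CGI program: read_post_body(0, getenv("CONTENT_LENGTH")). */
char *read_post_body(int fd, const char *content_length)
{
    char *end, *buf;
    long len, got = 0;

    if (content_length == NULL)
        return NULL;
    errno = 0;
    len = strtol(content_length, &end, 10);
    if (errno != 0 || end == content_length || *end != '\0' ||
        len < 0 || len > MAX_BODY)
        return NULL;                    /* malformed, negative, or too big */

    buf = malloc((size_t)len + 1);      /* sized from the request */
    if (buf == NULL)
        return NULL;

    while (got < len) {                 /* a pipe may deliver short reads */
        ssize_t n = read(fd, buf + got, (size_t)(len - got));
        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted, try again */
        if (n <= 0) {                   /* error or input ended early */
            free(buf);
            return NULL;
        }
        got += n;
    }
    buf[len] = '\0';
    return buf;
}
```

<P>The caller owns the returned buffer and should <CODE>free</CODE> it
when done.</P>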


<HR>

<A NAME=content>
<H3>How does my CGI program send content back to the client
(browser)?</H3>
</A>

<P>For starters, you have to tell the browser what kind of document you
are sending.  In most cases you will be sending HTML, and you need to
tell the browser this by sending the string:</P>

<PRE>
Content-type: text/html
</PRE>

<P>before sending the content. This is actually an HTTP header you are
sending, so assuming it is the only header you want to send you need
to also send a blank line (and remember,
all header lines should end with <CODE>\r\n</CODE>).</P>

<P>To send this header and the content back to the browser you simply
write to STDOUT, which actually goes back to the web server via a
pipe, and the web server forwards it to the browser. The web server
will probably add a bunch of headers as well. Generally we don't need
to worry about this, although there are ways to configure the server
to not send any extra headers. </P>

<P>In short - just use <CODE>printf</CODE> to send the content back to
the browser.</P>
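<P>As a rough illustration, the header, the blank line and a tiny HTML
document could be written like this. The function name
<CODE>send_html</CODE> and its parameters are assumptions made for the
example; in a CGI program you would pass <CODE>stdout</CODE> as the
first argument.</P>

```c
#include <stdio.h>

/* Write a minimal CGI response: one header, a blank line, then content.
 * send_html is a hypothetical helper. title and message are copied
 * as-is, so pass only text you trust (nothing straight from the query). */
void send_html(FILE *out, const char *title, const char *message)
{
    fprintf(out, "Content-type: text/html\r\n");  /* the only header */
    fprintf(out, "\r\n");                          /* blank line ends headers */
    fprintf(out, "<HTML><HEAD><TITLE>%s</TITLE></HEAD>\n", title);
    fprintf(out, "<BODY><P>%s</P></BODY></HTML>\n", message);
}
```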

<HR>

<A NAME=security>
<H3>Security</H3>
</A>

<P>When writing a CGI program you need to realize you are allowing anyone
to run your program any time they want. This doesn't sound terribly
harmful, but you should keep in mind that they can (and will) send 
all kinds of crazy stuff as the request. You must make sure that you 
don't allow an unexpected request to screw up your system.</P>

<P>Just about the worst thing you can do is to 
blindly construct a Unix command line based on a request, and
give the command to a Unix shell to run (using <CODE>popen()</CODE>,
<CODE>system()</CODE>, etc).
In the example shown in class, the CGI program expects a keyword as a
request, and greps a dictionary for all words that contain
the keyword. So the intent was that if a user sends the request "foo",
the Unix command <CODE>grep foo /usr/dict/words</CODE> is constructed
and run (using <CODE>popen()</CODE>). However, if a user enters the query "; rm *"
the resulting command would look like this:</P>

<PRE>
grep; rm * /usr/dict/words
</PRE>

<P>and you might lose a bunch of files...</P>
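<P>One simple defense is to refuse any keyword that contains anything
other than letters and digits before it gets near a command line. This
is a sketch, not the code shown in class, and the name
<CODE>is_safe_keyword</CODE> is invented here.</P>

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if keyword is non-empty and made only of letters and digits,
 * so it cannot carry shell metacharacters such as ';', '*' or spaces.
 * Return 0 otherwise. is_safe_keyword is a hypothetical helper. */
int is_safe_keyword(const char *keyword)
{
    size_t i;

    if (keyword == NULL || keyword[0] == '\0')
        return 0;
    for (i = 0; keyword[i] != '\0'; i++) {
        if (!isalnum((unsigned char)keyword[i]))
            return 0;
    }
    return 1;
}
```

<P>The CGI program would call this on the decoded keyword and return an
error page instead of running <CODE>grep</CODE> whenever the result is 0.
Listing the characters you accept is generally safer than trying to list
every character that is dangerous.</P>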

<P>One common theme among well-known cracks (of many Unix services) is to
overflow an input buffer in the server. If all servers were written
correctly, this would not be a problem. You must make sure that read
is never called with an input buffer smaller than the maximum size
given to the read system call. </P>

<P>Here is a classic example of the problem. This code (from a CGI
program) is handling a POST request, so it checks to see how large
the request is (by getting the environment variable CONTENT_LENGTH)
and then reads that much stuff from STDIN:</P>

<PRE>
char buff[1000];
int len;
char *cl;

cl = getenv("CONTENT_LENGTH");
if (cl==NULL) {
  /* Error */
  exit(1);
}
len = atoi(cl);

read(0,buff,len);   /* fd 0 is STDIN; len is never checked! */
</PRE>

<P>This code never makes sure that <CODE>len</CODE> is less than 1000!!!!
At a minimum this makes it easy to crash your server, in the worst
case some clever hacker (with lots of spare time) could use this to
break in to your computer.</P>
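<P>A minimal fix is to check the length against the size of the buffer
before reading. The sketch below assumes a helper called
<CODE>checked_length</CODE> (a made-up name). It also rejects negative
and non-numeric values, which <CODE>atoi</CODE> would quietly accept.</P>

```c
#include <stdlib.h>

/* Parse the CONTENT_LENGTH string cl and accept it only if the body
 * fits in a buffer of bufsize bytes. Returns the length, or -1 if the
 * value is missing, malformed, negative or too large.
 * Intended use (checked_length is a hypothetical helper):
 *   len = checked_length(getenv("CONTENT_LENGTH"), sizeof(buff));
 *   if (len < 0) exit(1);
 *   read(0, buff, len);
 */
long checked_length(const char *cl, size_t bufsize)
{
    char *end;
    long len;

    if (cl == NULL)
        return -1;
    len = strtol(cl, &end, 10);
    if (end == cl || *end != '\0')
        return -1;                  /* not a plain decimal number */
    if (len < 0 || (unsigned long)len > bufsize)
        return -1;                  /* would overflow the buffer */
    return len;
}
```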

<P>For more information about CGI security check out:</P>

<UL>
  <LI><A HREF=../../../../../www.w3.org/Security/Faq/wwwsf4.html>
  The CGI section of the WWW Security FAQ
   </A>
</UL>

<HR>

<A NAME=cookies>
<H3>Cookies</H3>
</A>

<P>Cookies are simply (name,value) pairs that you (you being a CGI
program) can store on the client, and that will subsequently be sent
along by the client with each request. In the simplest case you could
use cookies to identify <I>sessions</I>. Once a request is made to
your <I>service</I> you could see if the request included the cookie
named "SESSION_ID". If so, you already assigned an ID to that
user/browser and you can keep track of all the requests coming from
that user/browser. If there is no SESSION_ID, you assume it is a new
user, and assign a SESSION_ID by generating a unique string (all
cookies are ASCII strings) and telling the browser to remember the
SESSION_ID and to send it along with all subsequent requests. </P>

<P><B>Setting Cookies</B>: To set a cookie, you need to include an HTTP header
line in the response. The header field name is
<CODE>Set-cookie</CODE>. The entire header line can look something
like this (all on a single line):</P>

<PRE>
Set-cookie: SESSION_ID=018365; path=/; domain=.rpi.edu; expires=Sunday,
12-April-98 12:00:00 GMT
</PRE>

<P>The first part of the cookie specifies the cookie name (in this case
SESSION_ID) and value (in this case 018365). The domain and path
control when the client will send the cookie along with a request. The
domain is a DNS domain name (or hostname) and identifies those servers
that should be sent the cookie. Once the browser has received the
cookie it will only be sent along with requests to web servers in the
specified domain. The path allows us to further specify that only some
of the entities in the domain indicated should receive the cookie. 
The expires field specifies how long the client should hold on to the
cookie before tossing it. If there is no expires field the cookie will
never be saved to disk (by the browser) and will be gone once the
browser exits. In other words the browser will <I style="color:green">toss its cookies</I>.
You should remember that all cookies are subject to removal by the
user, and your service should never <em>require</em> that cookies persist
between sessions.</P>

<P>Since you need to use an HTTP header to set a cookie, remember that
this header must come before any content is sent back by your program,
and must be before a blank line is sent back (because a blank line
tells the browser there are no more headers).</P>
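<P>To make the ordering concrete, here is a small sketch that sends a
session cookie followed by the content type and the blank line. The
function name <CODE>send_headers_with_cookie</CODE> is an assumption for
this example. Since it sends no expires field, the browser would drop the
cookie when it exits.</P>

```c
#include <stdio.h>

/* Emit a Set-cookie header, then Content-type, then the blank line that
 * ends the headers. Content may be written only after this returns.
 * send_headers_with_cookie is a hypothetical helper; in a CGI program
 * pass stdout as out. */
void send_headers_with_cookie(FILE *out, const char *session_id)
{
    fprintf(out, "Set-cookie: SESSION_ID=%s; path=/\r\n", session_id);
    fprintf(out, "Content-type: text/html\r\n");
    fprintf(out, "\r\n");           /* no more headers after this line */
}
```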

<P><B>Getting Cookies</B> First you must get the kids outside or
otherwise occupied. It is also usually good to make sure the dog is 
not watching since she might later attempt to get the cookies herself.
You'll need a brown paper bag to hide the cookies,
 I suggest a grocery bag for large cookie excursions,
although a small lunch style bag is perfect if you only plan on
nabbing a few cookies at once. OK - head into the kitchen, and...
&nbsp; &nbsp; &nbsp; Oops, sorry!</P>

<P><B>Getting Cookies from the <I>browser</I></B> 
The browser sends its cookies as
an HTTP header. Your web server will store the cookies in the
environment variable "HTTP_COOKIE" before starting your CGI
program. Since there may be many cookies, this environment variable
will hold them all in the form "name1=value1; name2=value2; ...". We
can see right away that the cookie name and value can't include the
characters '=' or ';'. </P>

<P>You need to parse the HTTP_COOKIE string to get at the cookie you are
interested in - check out the code <A HREF=../../code/CGI/showcookies> here </A>
for a simple example.</P>
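<P>The parsing can be sketched roughly as follows. This is not the linked
example code, just an illustration under the "name1=value1; name2=value2"
format described above; the function name <CODE>get_cookie</CODE> and its
parameters are invented for this sketch.</P>

```c
#include <stddef.h>
#include <string.h>

/* Look up name in a "name1=value1; name2=value2" string and copy its
 * value into value (at most size-1 bytes plus a NUL).
 * Returns 1 if the cookie was found, 0 otherwise.
 * In a CGI program pass getenv("HTTP_COOKIE") as cookies.
 * get_cookie is a hypothetical helper. */
int get_cookie(const char *cookies, const char *name,
               char *value, size_t size)
{
    size_t name_len, value_len;
    const char *p = cookies;

    if (cookies == NULL || name == NULL || value == NULL || size == 0)
        return 0;
    name_len = strlen(name);

    while (*p != '\0') {
        const char *end;

        while (*p == ' ' || *p == ';')      /* skip separators */
            p++;
        end = strchr(p, ';');               /* end of this name=value pair */
        if (end == NULL)
            end = p + strlen(p);

        if ((size_t)(end - p) > name_len &&
            strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            value_len = (size_t)(end - p) - name_len - 1;
            if (value_len >= size)
                value_len = size - 1;       /* truncate, never overflow */
            memcpy(value, p + name_len + 1, value_len);
            value[value_len] = '\0';
            return 1;
        }
        p = end;                            /* move on to the next pair */
    }
    return 0;
}
```

<P>Note that the caller supplies the size of the output buffer, so a
very long cookie value is cut short instead of running off the end of
the array.</P>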

<P>For more information on cookies, here are some links:</P>

<UL>
<LI>
<A HREF=../../../../../www.cookiecentral.com/unofficial_cookie_faq.htm>
Unofficial Cookie FAQ</A> at <A HREF=../../../../../www.cookiecentral.com/default.htm>
www.cookiecentral.com</A>.

<LI><A HREF=../../../../../www.cis.ohio-state.edu/htbin/rfc/rfc2109.html>
RFC 2109: HTTP State Management Mechanism</A>
<LI>The (original, I believe)
<A HREF=../../../../../www.netscape.com/newsref/std/cookie_spec.html> 
Preliminary Specification</A> from the originators (Netscape).
</UL>

<HR>

<A NAME=forms>
<H3>HTML Forms</H3>
</A>

<P>Here are some links to get you started writing HTML forms:</P>
<UL>
  <LI><A HREF=../../../../../robot0.ge.uiuc.edu/~carlosp/cs317/cft.html>A Forms
Tutorial</A>
  <LI><A HREF=../../../../../www.cis.ohio-state.edu/htbin/rfc/rfc1866.html>RFC 1866 -
  HTML 2.0</A>
</UL>

<HR>

<P style="text-align: center; color: red; font-weight: bold;
font-size: 18pt"><BLINK>REMEMBER</BLINK></P>

<P>Your CGI program can be run with any kind of query, not
just the queries you expect (based on forms you've created). It's
often a good idea to build in some default behavior, so that your
CGI does something intelligent (like return an HTML error message) 
if the query is not valid.</P>


<HR>
<P><B>Q:</B> Dave, I actually read this far but I still don't know how to get
started!</P>

<P><B>A:</B> The best way to get started is to copy one of the example
CGI programs to your CS public.html directory, build it and run it. Then
play around making changes to the HTML and CGI program. This might
help:</P>

<UL>
<LI>Log on to a CS sun. 
<LI><CODE>cd public.html</CODE>
<LI><CODE>cp -r ~hollingd/public.html/netprog/code/CGI/pizza .</CODE>
<LI><CODE>cd pizza</CODE>
<LI><CODE>make</CODE>
<LI><CODE>chmod go+r *</CODE>
<LI>Now point your browser at
http://cgi.cs.rpi.edu/~yourname/pizza/pizza.cgi
and give it a try.
</UL>

<P><B>NOTE:</B> As far as I know you can't run user CGI programs 
on RCS (the web server on www.rpi.edu is not configured to 
allow you to have your own CGI programs). You can have your own
cgi programs on the CS machines - the web server cgi.cs.rpi.edu
is configured to provide this. The machine cgi2.cs.rpi.edu will
also work (it's a faster machine!).</P>

<P><B>BUT</B>: cgi.cs.rpi.edu and cgi2.cs.rpi.edu are Sun
workstations, so your cgi programs must be executable on a Sun. If you
build your C program on monica or another BSD machine it won't run on
cgi.cs.rpi.edu or any other Sun (nor will Microsoft Internet Explorer,
but then - that's a feature...).</P>