<HTML>
<HEAD>
<TITLE>Chapter 20 -- Introduction to Web Pages and CGI</TITLE>
<META NAME="GENERATOR" CONTENT="Mozilla/3.0b5aGold (WinNT; I) [Netscape]">
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#CE2910">
<H1><FONT COLOR=#FF0000>Chapter 20</FONT></H1>
<H1><B><FONT SIZE=5 COLOR=#FF0000>Introduction to Web Pages and CGI</FONT></B>
</H1>
<P>
<HR WIDTH="100%"></P>
<P>
<H3 ALIGN=CENTER><FONT COLOR="#000000"><FONT SIZE=+2>CONTENTS<A NAME="CONTENTS"></A>
</FONT></FONT></H3>
<UL>
<LI><A HREF="#HTMLCGIandMIME" >HTML, CGI, and MIME</A>
<LI><A HREF="#ASimpleHTMLDocument" >A Simple HTML Document</A>
<LI><A HREF="#FormattingLists" >Formatting Lists </A>
<LI><A HREF="#UsingHTMLTags" >Using HTML Tags</A>
<LI><A HREF="#PreformattedText" >Preformatted Text</A>
<LI><A HREF="#SpecialCharactersinHTMLDocuments" >Special Characters in HTML Documents</A>
<LI><A HREF="#WhatIsaURL" >What Is a URL?</A>
<LI><A HREF="#CGIScripts" >CGI Scripts</A>
<UL>
<LI><A HREF="#CONTENT_LENGTH" >CONTENT_LENGTH</A>
<LI><A HREF="#CONTENT_TYPE" >CONTENT_TYPE</A>
<LI><A HREF="#GATEWAY_INTERFACE" >GATEWAY_INTERFACE</A>
<LI><A HREF="#PATH_INFO" >PATH_INFO</A>
<LI><A HREF="#QUERY_STRING" >QUERY_STRING</A>
</UL>
<LI><A HREF="#InputandOutputtoCGIScripts" >Input and Output to CGI Scripts</A>
<LI><A HREF="#ATestCGIScript" >A Test CGI Script</A>
<LI><A HREF="#UsingFrames" >Using Frames</A>
<LI><A HREF="#Summary" >Summary</A>
</UL>
<HR>
<P>
This chapter offers a brief introduction to the HyperText Markup
Language (HTML) and the Common Gateway Interface (CGI). The information
in this chapter provides the basis for the rest of the chapters
about Web pages in this book, especially for the topic of writing
CGI scripts in Perl. This chapter assumes that you have a cursory
knowledge of what the World Wide Web (WWW) is about and how to
use a browser.
<P>
I also assume you're somewhat familiar with HTML code. Covering
HTML in more detail would take us too far from the subject of
this book, Perl programming. Therefore, I stick to the very basic
HTML elements for formatting text and listing items rather than
covering a lot of HTML issues.
<P>
Reading this one chapter won't make you a Webmaster, but you'll
learn enough to create Web pages you can subsequently use in conjunction
with Perl scripts. With these basics, you'll be able to easily
incorporate other HTML page-layout elements in your documents.
<P>
If you are not familiar with HTML or would like more information,
don't worry. Several documents on the Internet describe how to
write HTML pages. For up-to-date documentation on HTML, search
the Web for the keywords <I>HTMLPrimer</I> and <I>html-primer</I>.
<P>
For more information in printed form, you might want to
consult these titles:
<UL>
<LI><I>Teach Yourself Web Publishing with HTML 3.0 in a Week</I>,
Laura Lemay, Sams.net Publishing, 1-57521-064-9, 1996.
<LI><I>HTML & CGI Unleashed</I>, John December and Mark Ginsburg,
Sams.net Publishing, 0-672-30745-6, 1995.
<LI><I>Using HTML</I>, Neil Randall, Que, 0-7897-0622-9, 1995.
</UL>
<H2><A NAME="HTMLCGIandMIME"><FONT SIZE=5 COLOR=#FF0000>HTML,
CGI, and MIME</FONT></A></H2>
<P>
HTML is the de facto standard language for writing Web pages on
the Internet. HTML documents are written as text files and are
meant to be interpreted by a Web browser. A Web browser displays
the data in HTML pages by reading in the tags around the data.
Web browsers reside on client machines, and Web server daemons
run on Web servers. The protocol used by Web servers and clients
to talk to each other is called the HyperText Transfer Protocol
(HTTP).
<P>
An HTML page contains uniform resource locators (URLs) in addition
to the tags. A URL tells the browser where to get certain data.
URLs can point to other Web documents, FTP sites, Gopher sites,
and even executable programs on the server side. The Common Gateway
Interface (CGI) is the standard used to run programs for a client
on the server.
<P>
A Web server gets a request for action from the browser when the
user selects the URL. The server processes the request by
running a program. The program is often referred to as a CGI script
because many programs that handle CGI requests are Perl scripts.
The output of the CGI script is sent back to the browser that made
the request, and the browser displays it to the user.
Results can be plain text, binary data, or HTML documents.
<P>
The browser reading the output from the CGI script has to know
the type of data it is receiving. The type is sent back as a
Multipurpose Internet Mail Extensions (MIME) header at the start
of the output. For example, to send back plain text, you use <TT><FONT FACE="Courier">"Content-Type:
text/plain\n\n"</FONT></TT> at the start of the document.
To send back HTML data, you use <TT><FONT FACE="Courier">"Content-Type:
text/html\n\n"</FONT></TT>. <P>
<CENTER>
<TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=80%>
<TR VALIGN=TOP><TD ><B>Note</B></TD></TR>
<TR VALIGN=TOP><TD >
<BLOCKQUOTE>
Ending the header with two newlines is very important. The HTTP and CGI specifications, not HTML itself, require a blank line between the <TT><FONT FACE="Courier">Content-Type</FONT></TT> header and the data that follows it. This is why we have <TT><FONT FACE="Courier">"\n\n"</FONT></TT>
appended to <TT><FONT FACE="Courier">Content-Type</FONT></TT>. In most cases, the <TT><FONT FACE="Courier">"\n\n"</FONT></TT> will work as intended to produce a blank line. Sometimes this will not work, and the data being sent back
to the browser will not be shown, because the server handles line endings using the carriage-return/line-feed string <TT><FONT FACE="Courier">"\r\n"</FONT></TT> instead of <TT><FONT FACE="Courier">"\n"</FONT></TT>. To allow for
inconsistencies in the way operating systems handle carriage-return/line-feed pairs, you should use the string <TT><FONT FACE="Courier">"\r\n\r\n"</FONT></TT>.
</BLOCKQUOTE>
</TD></TR>
</TABLE></CENTER>
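<P>
The header and blank line described above can be sketched in a
few lines of Perl. This is only a minimal illustration, not a
complete CGI script: the <TT><FONT FACE="Courier">cgi_response</FONT></TT>
function is a hypothetical helper written for this example, and
the sketch assumes the safer <TT><FONT FACE="Courier">"\r\n\r\n"</FONT></TT>
ending recommended in the note.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# cgi_response() is a hypothetical helper for this sketch, not a standard function.
# It returns a complete CGI reply: one MIME header, a blank line, then the body.
sub cgi_response {
    my ($mime_type, $body) = @_;
    # The first "\r\n" ends the header line; the second produces the blank line.
    return "Content-Type: $mime_type\r\n\r\n" . $body;
}

# Keep Perl from rewriting line endings on systems that translate "\n".
binmode STDOUT;
print cgi_response('text/plain', "Hello from a CGI script.\n");
```

<P>
To send back an HTML page instead, you would pass
<TT><FONT FACE="Courier">'text/html'</FONT></TT> as the first
argument and the text of the page as the second.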
<H2><A NAME="ASimpleHTMLDocument"><FONT SIZE=5 COLOR=#FF0000>A
Simple HTML Document</FONT></A></H2>
<P>
An HTML document uses markup tags to specify special areas of
the text. The format of an HTML document is as follows:
<BLOCKQUOTE>
<TT><FONT FACE="Courier"><HTML><BR>
<HEAD><BR>
<TITLE>Title of the page</TITLE><BR>
</HEAD><BR>
<BODY><BR>
The body of the document.<BR>
</BODY><BR>
</HTML></FONT></TT>
</BLOCKQUOTE>
<P>
All text for the HTML document is shown between the <TT><FONT FACE="Courier"><HTML></FONT></TT>
and <TT><FONT FACE="Courier"></HTML></FONT></TT> tags. Inside
them, a document has exactly two sections: one pair of <TT><FONT FACE="Courier"><HEAD></FONT></TT>
and <TT><FONT FACE="Courier"></HEAD></FONT></TT> tags, which
hold information about the document such as its title, and one
pair of <TT><FONT FACE="Courier"><BODY></FONT></TT> and <TT><FONT FACE="Courier"></BODY></FONT></TT>
tags, which hold the text matter for the HTML document. The <TT><FONT FACE="Courier"><TITLE></FONT></TT>
and <TT><FONT FACE="Courier"></TITLE></FONT></TT> tags hold
the text shown in the title bar of your browser and are
the only required element within the <TT><FONT FACE="Courier"><HEAD></FONT></TT>
and <TT><FONT FACE="Courier"></HEAD></FONT></TT> tags.
<P>
In practice, browsers will display a page even when the <TT><FONT FACE="Courier"><HEAD></FONT></TT> and
<TT><FONT FACE="Courier"><TITLE></FONT></TT> tags are left out.
However, for compatibility with some browsers, you should include
them. The <TT><FONT FACE="Courier"><BODY></FONT></TT> and
<TT><FONT FACE="Courier"></BODY></FONT></TT> tags are required
in all cases. Most HTML tags are paired, so if you have <TT><FONT FACE="Courier"><HEAD></FONT></TT>,
then you should have <TT><FONT FACE="Courier"></HEAD></FONT></TT>.
There are exceptions to this rule. For example, the paragraph
tag <TT><FONT FACE="Courier"><P></FONT></TT> and the line
break <TT><FONT FACE="Courier"><BR></FONT></TT> tag are
used by themselves and do not require any accompanying <TT><FONT FACE="Courier"></P></FONT></TT>
or <TT><FONT FACE="Courier"></BR></FONT></TT> tags. (The
<TT><FONT FACE="Courier"></P></FONT></TT> tag is sometimes
used to terminate a paragraph, but the <TT><FONT FACE="Courier"></BR></FONT></TT>
tag does not exist.)
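<P>
Because later chapters generate pages from Perl, it may help to
see the same skeleton produced by a script. The following is a
minimal sketch, assuming a hypothetical helper named
<TT><FONT FACE="Courier">html_page</FONT></TT> that wraps a title
and some body text in the paired tags just described. It does no
escaping of special characters, so treat it as an illustration only.

```perl
use strict;
use warnings;

# html_page() is a hypothetical helper: it wraps a title and body text
# in the HTML, HEAD, TITLE, and BODY tags, each with its closing tag.
sub html_page {
    my ($title, $body) = @_;
    return join "\n",
        '<HTML>',
        '<HEAD>',
        "<TITLE>$title</TITLE>",
        '</HEAD>',
        '<BODY>',
        $body,
        '</BODY>',
        '</HTML>',
        '';    # the empty last item gives the page a trailing newline
}

print html_page('Futures Contracts in Sugar',
                'Summary of Contracts available.<P>');
```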
<P>
Tags are not case sensitive, and extra spaces and line breaks in
between the tags are almost always ignored. Therefore, the tag <TT><FONT FACE="Courier"><html></FONT></TT>
is the same as <TT><FONT FACE="Courier"><HtMl></FONT></TT>
and <TT><FONT FACE="Courier"><HTML></FONT></TT>.
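<P>
This matters when a Perl script has to look for tags in a page,
because a plain string comparison is case sensitive. One simple
approach, sketched here with a hypothetical helper named
<TT><FONT FACE="Courier">same_tag</FONT></TT>, is to convert both
tags to lowercase before comparing them.

```perl
use strict;
use warnings;

# same_tag() is a hypothetical helper: it compares two tags while
# ignoring the case of the letters, as a browser does.
sub same_tag {
    my ($first, $second) = @_;
    return lc($first) eq lc($second);
}

# All three spellings are treated as the same tag.
for my $tag ('<html>', '<HtMl>', '<HTML>') {
    print "$tag matches <HTML>\n" if same_tag($tag, '<HTML>');
}
```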
<P>
It's the presence of <TT><FONT FACE="Courier"><HTML></FONT></TT>,
<TT><FONT FACE="Courier"><HEAD></FONT></TT>, and <TT><FONT FACE="Courier"><BODY></FONT></TT>
tags in the page that distinguishes an HTML page from a simple
text page. Figure 20.1 shows a sample text page, with no
formatting on it whatsoever, loaded into an
HTML browser.
<P>
<A HREF="f20-1.gif" tppabs="http://www.mcp.com/815097600/0-672/0-672-30891-6/f20-1.gif" ><B>Figure 20.1: </B><I>An unformatted document.</I></A>
<P>
The text shown in Figure 20.1 is laid out exactly as it was in the
original text document. In some cases, the text would instead
have been run together in one long paragraph. Here is the text for
the document shown in Figure 20.1:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">Futures Contracts in Sugar<BR>
<BR>
Test Test HTML Test HTML<BR>
<BR>
Summary of Contracts available.<BR>
<BR>
[Image] Sugar Contracts<BR>
[Image] Sugar Options<BR>
[Image] Combination<BR>
----------------------------------------------------------------------------
<BR>
<BR>
Ordered list of particulars<BR>
<BR>
* Price per cent move of Sugar prices: $1120.00<BR>
* Appox min. deposit for contract required by broker: $5000 to
$10000.<BR>
* Appox min. deposit for option required by broker: $1500 to $3000.
<BR>
* Appox commissions cost: $35 to $75<BR>
<BR>
----------------------------------------------------------------------------
<BR>
<BR>
Some Detailed Information in Description Lists.<BR>
<BR>
[Image] Risks with open contracts<BR>
One cent move equals $1120 in your profits.
Therefore a 4 cent move can<BR>
</FONT></TT> <TT><FONT FACE="Courier">
either make you a handsome profit or break your bank. A flood
in sugar<BR>
growing area may cause prices to drop
sharply. If you are holding a<BR>
</FONT></TT> <TT><FONT FACE="Courier">
long contract, this drop in price will have to be covered at the
end of<BR>
the trading day or your position will
be liquidated.<BR>
[Image] Sugar<BR>
Options cost a fixed amount of money.
However, the money spent on an<BR>
</FONT></TT> <TT><FONT FACE="Courier">
option should be treated like insurance. No matter where the price
goes<BR>
your loss will be limited to the price
of the option. Of course, with<BR>
limiting risk you are also limiting profits.</FONT></TT>
</BLOCKQUOTE>
<P>
To make the text more presentable, you can add some HTML tags
to the document, as shown in Listing 20.1. First, we'll delimit
the paragraphs with a <TT><FONT FACE="Courier"><P></FONT></TT>
tag and then add some headings to the document. HTML provides six levels
of headings, numbered <TT><FONT FACE="Courier">H1</FONT></TT>
through <TT><FONT FACE="Courier">H6</FONT></TT>. <TT><FONT FACE="Courier">H1</FONT></TT>
is the top-level heading in a document's hierarchy and <TT><FONT FACE="Courier">H6</FONT></TT>
is the bottom. Generally, you use <TT><FONT FACE="Courier">H2</FONT></TT>
headers inside <TT><FONT FACE="Courier">H1</FONT></TT> headers,
<TT><FONT FACE="Courier">H3</FONT></TT> headers inside <TT><FONT FACE="Courier">H2</FONT></TT>
headers, and so on. Do not skip heading levels unless you have
a compelling reason to switch heading styles. Use the tags <TT><FONT FACE="Courier"><H1>Text
for heading</H1></FONT></TT> for defining a heading.
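<P>
When a Perl script writes headings, a small function can keep the
level within the six that HTML provides. The sketch below assumes
a hypothetical helper named <TT><FONT FACE="Courier">heading</FONT></TT>;
the name and the error message are my own choices for this example.

```perl
use strict;
use warnings;

# heading() is a hypothetical helper: it wraps text in <Hn>...</Hn> tags
# and refuses any level outside H1 through H6.
sub heading {
    my ($level, $text) = @_;
    die "Heading level must be 1 through 6\n"
        unless $level =~ /\A[1-6]\z/;
    return "<H$level>$text</H$level>";
}

# An H1 for the page title, then an H2 for a section inside it.
print heading(1, 'Futures Contracts in Sugar'), "\n";
print heading(2, 'Summary of Contracts available.'), "\n";
```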
<P>
A sample HTML page is shown in Listing 20.1. See the output in
Figure 20.2.
<P>
<A HREF="f20-2.gif" tppabs="http://www.mcp.com/815097600/0-672/0-672-30891-6/f20-2.gif" ><B>Figure 20.2: </B><I>Using tags to enhance the appearance of HTML documents.</I></A>
<HR>
<BLOCKQUOTE>
<B>Listing 20.1. Formatted text.<BR>
</B>