<!-- From the Perl tutorial "Perl CGI Web Pages for WINNT", published by Macmillan (USA) -->
<HTML>

<HEAD>

<TITLE>Chapter 2 -- Principles of General Text Processing - The Backbone of Perl

</TITLE>



<META>

</HEAD>

<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#CE2910">

<H1><FONT SIZE=6 COLOR=#FF0000>Chapter&nbsp;2</FONT></H1>

<H1><FONT SIZE=6 COLOR=#FF0000>Principles of General Text Processing-<br>The

Backbone of Perl</FONT></H1>

<HR>

<P>

<CENTER><B><FONT SIZE=5><A NAME="CONTENTS">CONTENTS</A></FONT></B></CENTER>

<UL>

<LI><A HREF="#ScalarData">

Scalar Data</A>

<UL>

<LI><A HREF="#FloatsIntegersandLiterals">

Floats, Integers, and Literals</A>

<LI><A HREF="#Integers">

Integers</A>

<LI><A HREF="#CharacterStrings">

Character Strings</A>

<LI><A HREF="#Operators">

Operators</A>

</UL>

<LI><A HREF="#ScalarVariables">

Scalar Variables</A>

<UL>

<LI><A HREF="#TheChopOperator">

The Chop Operator</A>

<LI><A HREF="#InterpolationofScalarsintoStrings">

Interpolation of Scalars into Strings</A>

<LI><A HREF="#StandardInputltSTDINgt">

Standard Input &lt;STDIN&gt;</A>

<LI><A HREF="#ThePrintOperator">

The Print Operator</A>

<LI><A HREF="#TheValueundef">

The Value undef</A>

</UL>

<LI><A HREF="#ArraysDefined">

Arrays Defined</A>

<UL>

<LI><A HREF="#ArrayVariables">

Array Variables</A>

<LI><A HREF="#ArrayOperators">

Array Operators</A>

<LI><A HREF="#ArraysandltSTDINgt">

Arrays and &lt;STDIN&gt;</A>

<LI><A HREF="#ArrayInterpolation">

Array Interpolation</A>

</UL>

</UL>



<HR>

<P>

The first chapter of this book briefly described what text is,
how it serves as the primary building block for CGI communication,
and how well suited Perl is to dealing with it. Text, to reiterate,
is data made up of letters, digits, and non-alphanumeric
characters, the kind you use in creating text files, HTML files, and
Perl scripts.

<P>

Getting into Perl means getting into text manipulation, which

is what you're going to do in this chapter. You are also going

to explore basic programming concepts as they apply to Perl and

its building blocks, also called data structures.

<P>

The areas of text processing and programming you should understand

from this chapter are scalar data, arrays and list data, control

structures, associative arrays, regular expressions, functions,

filehandles and file tests, and formats. All of these will be

covered in this chapter as they apply to Perl.

<H2><A NAME="ScalarData"><FONT SIZE=5 COLOR=#FF0000>

Scalar Data</FONT></A></H2>

<P>

The term <I>scalar</I> in Perl applies to either a number,
like 12 or 4.3213e32, or a string of characters, like the words
&quot;Hey, now!&quot; or the entire text of the play <I>Hamlet</I>.
Perl draws no hard line between numbers and character strings and
converts between them as needed. Any collection of these numbers or
strings is collectively called <I>scalar data</I>.

<P>

Scalar values are manipulated with operators, and this
manipulation may produce another scalar value. Scalar values are
stored in scalar variables. They can also be read from
files or written to them.

<P>

While numbers and strings are treated the same by Perl, there

are some fine details that you should be aware of, if only for

the fact that knowing fine details is something that separates

the programmers from the hackers.

<H3><A NAME="FloatsIntegersandLiterals">

Floats, Integers, and Literals</A></H3>

<P>

Some numbers are written as whole values,
like 4, and some are written with a decimal point or an exponent, like 2.5 (which
is two and a half) or -3.453e32 (which is negative three point
four five three times ten to the power of thirty-two). Exponent
notation keeps very large and very small numbers short. In
Perl the whole numbers are called integers.
Numbers written with a fractional part or an exponent are called
floats.

<P>

Both integers and floats can be written as literals. A <I>literal</I>
is the way a value is designated in the actual coding of a program.
This is the data that is fed to the Perl compiler. Perl will accept
the following forms (fractions, negatives,
and exponents) as floats:

<UL>

<LI>2.5 - Two and a half
<LI>5.321e7 - 5.321 times 10 to the power of 7
<LI>-8.34e8 - Negative 8.34 times 10 to the power of 8
<LI>-4.76e-13 - Negative 4.76 times 10 to the power of negative 13

</UL>

<P>

When using this notation you can substitute an uppercase &quot;E&quot;

for the lowercase &quot;e&quot; without changing the value of

the number.
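<P>
The float forms above can be tried in a short script. This is only a minimal sketch; the variable names are made up for illustration and do not come from the book.

```perl
#!/usr/bin/perl
# Float literals. All variable names here are illustrative only.
my $half  = 2.5;        # two and a half
my $big   = 5.321e7;    # 5.321 times 10 to the power of 7
my $upper = 5.321E7;    # uppercase E, same value as $big
my $tiny  = -4.76e-13;  # negative number with a negative exponent

print "$half\n";
print "$big\n";
print "e and E give the same value\n" if $big == $upper;
```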

<H3><A NAME="Integers">

Integers</A></H3>

<P>

Integers use the familiar notation:

<BLOCKQUOTE>

<PRE>

18

-32

1000032458

</PRE>

</BLOCKQUOTE>

<P>

but you don't put the number 0 at the beginning of a decimal integer
literal, because Perl uses leading zeros to mark other number bases:
a leading 0 denotes an octal number, and a leading 0x denotes a
hexadecimal number.
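<P>
A short sketch of why a leading zero matters in an integer literal. The variable names are illustrative assumptions, not part of the original text.

```perl
#!/usr/bin/perl
# Integer literals in three bases. Variable names are illustrative only.
my $decimal  = 255;    # plain decimal integer
my $octal    = 0377;   # leading 0 means octal: also 255
my $hex      = 0xff;   # leading 0x means hexadecimal: also 255
my $surprise = 010;    # octal 10, which is decimal 8, not ten

print "$decimal $octal $hex $surprise\n";   # prints: 255 255 255 8
```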

<H3><A NAME="CharacterStrings">

Character Strings</A></H3>

<P>

The characters used to make up character strings, or just strings,

each have an 8-bit value. There is a 256 character set that is

recognized by Perl.

<P>

A string can range in size from no characters at all to one
that fills all of your available memory. This reflects

one of the premises of Perl, and that is to have &quot;no built-in

limits&quot; in its various abilities whenever possible.

<P>

This ability to process strings, regardless of the characters
that make them up, is part of what makes Perl adept at CGI programming.
<P>
Like numbers, strings can be written as literals. There are two
kinds of literal strings: single- and double-quoted (see Figure
2.1).

<P>

<A HREF="f2-1.gif"><B>Figure 2.1:</B> <I>Examples of single- and double-quoted strings</I>.</A>

<H4>Single-Quoted Strings</H4>

<P>

If a string is enclosed in a pair of single quotes, like 'Hey,
now!', it is called a single-quoted string. These quote marks
are not part of the string; they merely indicate to Perl where
the string starts and ends. If you want to put a single quote
inside a string (and not have it treated as the delimiter for
your string), precede it with a backslash, since the backslash
is used to denote special characters. If you want to put a backslash
into your string, precede the backslash with a backslash as well.
These are the only two cases in which a backslash has special meaning
inside a single-quoted string.
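<P>
The two single-quote rules can be sketched as follows. The strings and variable names are hypothetical examples chosen for illustration.

```perl
#!/usr/bin/perl
# Single-quoted strings. Variable names and values are illustrative only.
my $plain     = 'Hey, now!';
my $quote     = 'don\'t';       # \' puts a single quote in the string
my $backslash = 'C:\\perl';     # \\ puts one backslash in the string
my $no_escape = 'Hey, now!\n';  # \n is NOT a newline here: backslash, then n

print "$plain\n$quote\n$backslash\n$no_escape\n";
```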

<H4>Double-Quoted Strings</H4>

<P>

When a string is enclosed in a pair of double quotes, like &quot;Hey,
now!&quot;, it is a double-quoted string. With double-quoted strings
the backslash has much more &quot;oomph.&quot; Inside the double
quotes, a backslash can be used to indicate control characters,
or octal and even hexadecimal representations of special characters.
Examples of such might be:

<UL>

<LI>&quot;Hey, now!\n&quot; - where the string Hey, now! is followed
by a newline character
<LI>&quot;No flipping! \177&quot; - where the string No flipping!
is followed by octal 177, the delete character
<LI>&quot;Live \tto tape&quot; - where the string Live to tape is
split by a tab, outputting:<BR>
Live    to tape

</UL>

<P>

The backslash can place many special characters and effects inside a string.
These are called backslash escapes. Table 2.1 outlines them.<BR>

<P>

<CENTER><B>Table 2.1 Double-Quoted String Backslash Escapes</B></CENTER>

<P>

<CENTER>

<TABLE BORDERCOLOR=#000000 BORDER=1 WIDTH=60% CELLPADDING=3>

<TR VALIGN=TOP><TD><CENTER><B>Backslash Escape</B></CENTER></TD>
<TD><B>Meaning</B></TD></TR>

<TR VALIGN=TOP><TD><CENTER>\n</CENTER></TD><TD>Newline

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\r</CENTER></TD><TD>Return

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\t</CENTER></TD><TD>Tab</TD>

</TR>

<TR VALIGN=TOP><TD><CENTER>\f</CENTER></TD><TD>Formfeed

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\b</CENTER></TD><TD>Backspace

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\v</CENTER></TD><TD>Vertical tab (not recognized inside Perl 5 strings; write \013 instead)
</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\a</CENTER></TD><TD>Bell</TD>

</TR>

<TR VALIGN=TOP><TD><CENTER>\e</CENTER></TD><TD>Escape

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\177</CENTER></TD><TD>Any octal ASCII value, such as 177, the delete character
</TD></TR>
<TR VALIGN=TOP><TD><CENTER>\x7f </CENTER></TD><TD>Any hex ASCII value, such as 7f, the delete character

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\cC</CENTER></TD><TD>Any control character, like control C

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\\ </CENTER></TD><TD>Backslash

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\&quot;</CENTER></TD><TD>Double-quote

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\l</CENTER></TD><TD>Make the next letter lowercase

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\L</CENTER></TD><TD>Make all the next letters lowercase until \E

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\u</CENTER></TD><TD>Make the next letter uppercase 

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\U</CENTER></TD><TD>Make all the next letters uppercase until \E

</TD></TR>

<TR VALIGN=TOP><TD><CENTER>\E </CENTER></TD><TD>Terminate \L or \U

</TD></TR>

</TABLE></CENTER>
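<P>
A few of the escapes from Table 2.1 can be tried in a short script. This is a minimal sketch; the variable names and sample strings are assumptions made for illustration.

```perl
#!/usr/bin/perl
# Backslash escapes in double-quoted strings. Names are illustrative only.
my $tabbed = "Live \tto tape";        # contains one tab character
my $octal  = "\177";                  # delete character, written in octal
my $hex    = "\x7f";                  # the same character, written in hex
my $title  = "\uperl \Ucgi\E pages";  # becomes "Perl CGI pages"
my $lower  = "\LHEY, NOW!\E";         # becomes "hey, now!"

print "$title\n$lower\n";
print "octal and hex escapes match\n" if $octal eq $hex;
```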

<P>

<P>

One other facet of double-quoted strings is that they are variable
interpolated. This means that the name of a variable inside the string
is replaced with the variable's current value when the string is evaluated.
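<P>
Variable interpolation can be sketched with two otherwise identical strings. The variable $name and its value are hypothetical.

```perl
#!/usr/bin/perl
# Interpolation happens in double quotes only. Names are illustrative.
my $name   = "Perl";
my $double = "Hello, $name!";   # $name is replaced by its value
my $single = 'Hello, $name!';   # the characters $name are kept as typed

print "$double\n";   # prints: Hello, Perl!
print "$single\n";   # prints: Hello, $name!
```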

<P>

Relating this back to what we know about CGI, we can write a script

that demonstrates the difference between single- and double-quoted

strings and puts it up as an HTML page to our browser. In Listing

2.1 we will also encounter some Perl commands, which are explained

in a comment line that starts with the &quot;#&quot; symbol.

<HR>

<BLOCKQUOTE>

<B>Listing 2.1 Perl Command Script<BR>

</B>

</BLOCKQUOTE>

<BLOCKQUOTE>

<PRE>

#!/usr/bin/perl
# quotes_examples.pl

# print is a command that outputs data. The string being printed is a
# common header used in CGI for returning HTML documents. The second
# \n produces the blank line that ends the header.
print &quot;Content-type: text/html\n\n&quot;;

# date is a system command and $date is a scalar variable.
# The backquotes run the command and capture its output.
$date = `date`;

# chop is an operator; here it removes the trailing newline
chop($date);

print &lt;&lt;&quot;eop&quot;; # the end of perl (eop) tag using double quotes
&lt;HTML&gt;
&lt;HEAD&gt;
&lt;TITLE&gt;Examples of single and double quoted strings&lt;/TITLE&gt;
&lt;/HEAD&gt;
&lt;BODY&gt;
&lt;H2&gt;Examples of single and double quoted strings&lt;/H2&gt;
&lt;P&gt;
Hey, now!
&lt;BR&gt;
Today the date is $date.
&lt;HR NOSHADE&gt;
eop

print &lt;&lt;'eop'; # the eop tag using single quotes
&lt;H2&gt;Examples of single and double quoted strings&lt;/H2&gt;
&lt;P&gt;
Hey, now!
&lt;BR&gt;
Today the date is $date.
&lt;HR NOSHADE&gt;
&lt;/BODY&gt;
&lt;/HTML&gt;
eop

</PRE>

</BLOCKQUOTE>

<HR>

<P>

Right away you will see that using double quotes on the eop string
has a much different effect than the single quotes. The scalar
variable $date is set by the system command enclosed in backquotes (`).
Backquotes, unlike single quotes, direct Perl to execute the system
command within them. The &quot;=&quot; symbol is the assignment operator, which
tells Perl to assign the output of the system command to the scalar
variable $date.

<P>

The Perl operator chop removes the last character from
the argument within its parentheses. In quotes_examples.pl chop
takes off the last character of the scalar variable $date, which is
the newline that ends the output of the date command. There are more
handy uses for chop listed in the
next section.
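<P>
A minimal sketch of chop at work. The date string below is a hypothetical stand-in for the output of the date command, so that the example runs the same way on any system.

```perl
#!/usr/bin/perl
# chop removes the last character of a string, whatever that character is.
# The date value is a made-up stand-in for real command output.
my $date = "Mon Jan  1 12:00:00 1996\n";   # ends with a newline
chop($date);                               # the newline is removed
print "Today the date is $date.\n";

my $word = "Perl";
chop($word);                               # no newline here, so the l is lost
print "$word\n";                           # prints: Per
```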

<P>

The Perl operator print sends its arguments to standard output.
In this case the argument is not a variable but a here-document: all of
the text between the print line and the line containing only the tag eop.
When the print operator is used in Perl it should really have a set of
parentheses around its arguments, which for our example gives:
<BLOCKQUOTE>
<PRE>
print(&lt;&lt;&quot;eop&quot;);
</PRE>
</BLOCKQUOTE>

<P>

which is better syntax than:

<BLOCKQUOTE>

<PRE>

print &lt;&lt;&quot;eop&quot;;

</PRE>

</BLOCKQUOTE>

<P>

but in almost all cases leaving off the parentheses will not affect

your script. The parentheses help get rid of any ambiguity that

may exist in a larger Perl program. Keep this in mind if you are

having trouble with your larger scripts.

<P>

In the first print statement double quotes are used around the eop tag.
This tells Perl to interpolate any variables that occur in the text
between the print line and the closing eop tag. Perl therefore replaces
$date in the text with the variable's current value, which is the output
of the system command &quot;date&quot;.

<P>

When single quotes are used around the eop tag, they tell
Perl to ignore all variables inside the text. This makes
the characters $date part of the HTML document text, and so they are
presented on the page with the other text. Amazing what one little
pair of quotes can do to you if you're not careful, eh?
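<P>
The effect of the quotes around the eop tag can be isolated in a smaller script. This is only a sketch: the date value is a hypothetical stand-in for real command output, and the text is stored in variables rather than printed directly so the two results are easy to compare.

```perl
#!/usr/bin/perl
# Here-documents with double-quoted and single-quoted tags.
# The date value is a made-up stand-in for the output of the date command.
my $date = "Mon Jan  1 12:00:00 1996";

# Double-quoted tag: $date is interpolated.
my $interpolated = <<"eop";
Today the date is $date.
eop

# Single-quoted tag: the text is taken exactly as written.
my $literal = <<'eop';
Today the date is $date.
eop

print $interpolated;
print $literal;
```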

<P>

Both chop and print are Perl operators. There are more commands

like this in Perl that will help you get things done in your scripts.

<H3><A NAME="Operators">

Operators</A></H3>

<P>

An operator in Perl makes a new value, called a <I>result</I>,
from one or more other values, called operands. An example of this
might be the plus sign used in simple addition. The operator &quot;+&quot;
can take two values, like &quot;1&quot; and &quot;2,&quot; and
make the result &quot;3,&quot; as in 1 + 2 = 3. Perl has operators
for numbers and operators for character strings, and each
expects suitable operands.

<P>

If you accidentally use a numeric operator with a string, Perl will
convert the string based on the operator, not on whether the value
looks like a number or a string. Perl uses any number at the start of
the string and ignores whatever follows it; a string that does not
start with a number is given the value of 0. If you put a &quot;+&quot;
operator between &quot;90210 Beverly Hills&quot; and &quot;11 Oceans&quot;
you'll end up with the numeric result 90221, while &quot;Beverly Hills
90210&quot; plus &quot;Oceans 11&quot; gives 0.
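<P>
A minimal sketch of how Perl converts strings under a numeric operator: only a number at the start of each string counts. The variable names are illustrative, and warnings are deliberately left off so the conversions happen silently.

```perl
#!/usr/bin/perl
# String-to-number conversion under the + operator.
# Variable names are illustrative only.
my $leading  = "90210 Beverly Hills" + "11 Oceans";   # 90210 + 11
my $trailing = "Beverly Hills 90210" + "Oceans 11";   # 0 + 0

print "$leading\n";    # prints: 90221
print "$trailing\n";   # prints: 0
```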

<P>

If you, with equal abandon, put a string operator between two numbers,
each number is converted into whatever its string
equivalent might be. An example would be putting the string concatenation
operator &quot;.&quot; (concatenation is a pretty fancy word that means
putting two strings together) between &quot;The Dirty&quot; and (2*6) like this:
