<TD ALIGN="LEFT"></TD>
<TD WIDTH="102" ALIGN="LEFT">field</TD>
<TD ALIGN="LEFT"></TD>
<TD ALIGN="LEFT">ends with b</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>.</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">Matches any single character</TD>
<TD ALIGN="LEFT"><TT>$3 ~ /i.m/</TT></TD>
<TD ALIGN="LEFT">Matches any record whose third field contains i, any one character, and then m</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>|</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">Or</TD>
<TD ALIGN="LEFT"><TT>/cat|CAT/</TT></TD>
<TD ALIGN="LEFT">Matches cat or CAT</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>*</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">Zero or more repetitions of a character</TD>
<TD ALIGN="LEFT"><TT>/UNI*X/</TT></TD>
<TD ALIGN="LEFT">Matches UNX, UNIX, UNIIX, UNIIIX, and so on</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>+</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">One or more repetitions of a character</TD>
<TD ALIGN="LEFT"><TT>/UNI+X/</TT></TD>
<TD ALIGN="LEFT">Matches UNIX, UNIIX, and so on, but not UNX</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>{a,b}</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">Between a and b repetitions (both integers)</TD>
<TD ALIGN="LEFT"><TT>/UNI{1,3}X/</TT></TD>
<TD ALIGN="LEFT">Matches only UNIX, UNIIX, and UNIIIX (<TT>gawk</TT> releases before 4.0 need the <TT>--re-interval</TT> option)</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>?</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">Zero or one repetition of a character</TD>
<TD ALIGN="LEFT"><TT>/UNI?X/</TT></TD>
<TD ALIGN="LEFT">Matches UNX and UNIX only</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>[]</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">Any one of the listed characters</TD>
<TD ALIGN="LEFT"><TT>/I[BDG]M/</TT></TD>
<TD ALIGN="LEFT">Matches IBM, IDM, and IGM</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>[^]</TT></TD>
<TD WIDTH="102" ALIGN="LEFT">Not in the set</TD>
<TD ALIGN="LEFT"><TT>/I[^DE]M/</TT></TD>
<TD ALIGN="LEFT">Matches any three-character string that starts with I and ends with M, except IDM and IEM</TD>
</TR>
</TABLE>
</CENTER>
<P><BR>
Some of these metacharacters are used frequently. You will see some examples later
in this chapter.
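<P>As a quick check of a few rows in the table, the following sketch runs three of the patterns against a small word list. The word list is hypothetical, the <TT>^</TT> and <TT>$</TT> anchors are added so that each pattern must match the whole line, and the <TT>AWK</TT> variable is only a convenience that falls back to the system <TT>awk</TT> when <TT>gawk</TT> is not installed.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical sample data: one word per line.
words='UNX
UNIX
UNIIX
IBM
IDM
IEM'

# * allows zero or more I characters, so UNX matches as well.
star=$(printf '%s\n' "$words" | "$AWK" '/^UNI*X$/')
# + requires at least one I, so UNX is skipped.
plus=$(printf '%s\n' "$words" | "$AWK" '/^UNI+X$/')
# [^DE] accepts any middle character except D or E.
notset=$(printf '%s\n' "$words" | "$AWK" '/^I[^DE]M$/')

printf 'star:\n%s\nplus:\n%s\nnotset:\n%s\n' "$star" "$plus" "$notset"
```

<P>A pattern with no action prints every matching line, which is why none of the three commands needs a <TT>{print}</TT>.</P>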
<CENTER>
<H3><A NAME="Heading15"></A><FONT COLOR="#000077">Calling gawk Programs</FONT></H3>
</CENTER>
<P>Typing pattern-action pairs one or two at a time on the command line quickly becomes
tedious and error-prone, so <TT>gawk</TT> lets you store them in a file. A <TT>gawk</TT>
program (called a script) is a set of pattern-action pairs stored in a plain-text file.
For example, this could be the contents of a valid <TT>gawk</TT> script:
<PRE><FONT COLOR="#0066FF">/tparker/{print $6}
$2 != "foo" {print}
</FONT></PRE>
<P>The first line looks for <TT>tparker</TT> anywhere in a line and prints the sixth field; the
second line finds lines whose second field is not the string <TT>"foo"</TT> and displays
the entire line. In a script, you don't need the quotation marks that surrounded the
pattern-action pairs on the command line: the shell never sees the pairs, because
<TT>gawk</TT> reads them directly from the file. After you have saved all of the
pattern-action pairs in a file, run them by giving <TT>gawk</TT> the <TT>-f</TT> option
on the command line:</P>
<PRE><FONT COLOR="#0066FF">gawk -f script filename
</FONT></PRE>
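<P>To try this out end to end, the sketch below writes the two-line script shown earlier to a file named <TT>script</TT>, creates a small data file named <TT>filename</TT>, and runs one against the other. The two data lines and the temporary directory are made up for illustration; the <TT>AWK</TT> variable falls back to the system <TT>awk</TT> if <TT>gawk</TT> is missing.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)
workdir=$(mktemp -d)

# Hypothetical data file with whitespace-separated fields.
cat > "$workdir/filename" <<'EOF'
tparker foo 3 4 5 /home/tparker
ychow bar 3 4 5 /home/ychow
EOF

# The script from the text, saved without any surrounding shell quotes.
cat > "$workdir/script" <<'EOF'
/tparker/{print $6}
$2 != "foo" {print}
EOF

# Run every pattern-action pair in the script against the data file.
result=$("$AWK" -f "$workdir/script" "$workdir/filename")
printf '%s\n' "$result"

rm -r "$workdir"
```

<P>The first data line matches only the <TT>/tparker/</TT> pattern, so just its sixth field is printed. The second matches only the <TT>$2 != "foo"</TT> pattern, so the whole line is printed.</P>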
<P>This command causes <TT>gawk</TT> to read all of the pattern-action pairs from
the file <TT>script</TT> and apply them to the file called <TT>filename</TT>. This is how most
<TT>gawk</TT> programs are run. Don't confuse the <TT>-f</TT> and <TT>-F</TT>
options: lowercase <TT>-f</TT> names the script file, whereas uppercase <TT>-F</TT> sets the field separator.</P>
<P>If you want to specify a different field separator on the command line (it can also
be set in the script, using a special format you'll see later), add the <TT>-F</TT>
option. It can come before or after the <TT>-f</TT> option, as long as both precede the names of the data files:</P>
<PRE><FONT COLOR="#0066FF">gawk -f script -F":" filename
</FONT></PRE>
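<P>Here is a small sketch of the <TT>-F</TT> option at work on colon-separated data. The data lines (loosely modeled on <TT>/etc/passwd</TT>) and the one-line script are hypothetical. The second command is included only to show that, as far as I know, current <TT>gawk</TT> releases accept the two options in either order.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)
workdir=$(mktemp -d)

# Hypothetical colon-separated data and a one-line script.
printf 'tparker:x:501:100\nychow:x:502:100\n' > "$workdir/filename"
printf '{print $1, $3}\n' > "$workdir/script"

# -F after -f, as in the command shown in the text.
after=$("$AWK" -f "$workdir/script" -F":" "$workdir/filename")
# -F before -f gives the same result.
before=$("$AWK" -F":" -f "$workdir/script" "$workdir/filename")

printf '%s\n' "$after"
rm -r "$workdir"
```

<P>Without <TT>-F":"</TT>, each line would be a single field, because it contains no whitespace.</P>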
<P>If you want to process more than one file using the script, just append the names
of the files:<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">gawk -f script filename1 filename2 filename3 ...
</FONT></PRE>
<P>By default, all output from the <TT>gawk</TT> command is displayed on the screen.
You can redirect it to a file with the usual UNIX redirection operators:
<PRE><FONT COLOR="#0066FF">gawk -f script filename > save_file
</FONT></PRE>
<P>There is another way of specifying the output file from within the script, but
we'll come back to that in a moment.
<CENTER>
<H4><A NAME="Heading16"></A><FONT COLOR="#000077">BEGIN and END</FONT></H4>
</CENTER>
<P>Two special patterns supported by <TT>gawk</TT> are useful when writing scripts.
The <TT>BEGIN</TT> pattern marks actions that should take place once,
before <TT>gawk</TT> reads any input. It is usually used to initialize
values, set parameters such as field separators, and so on. The <TT>END</TT> pattern
marks actions to execute after all of the input has been processed,
typically to print summaries or completion notices.</P>
<P>The instructions following each of the <TT>BEGIN</TT> and <TT>END</TT> patterns are enclosed
in curly braces to show which instructions belong to that pattern. Both <TT>BEGIN</TT>
and <TT>END</TT> must appear in capitals. Here's a simple example of a <TT>gawk</TT>
script that uses <TT>BEGIN</TT> and <TT>END</TT>, albeit only for sending a message
to the terminal:
<PRE><FONT COLOR="#0066FF">BEGIN { print "Starting to process the file" }
$1 == "UNIX" {print}
$2 > 10 {printf "This line has a value of %d\n", $2}
END { print "Finished processing the file. Bye!"}
</FONT></PRE>
<P>In this script, a message is printed first. Then every line is tested against both
patterns: a line that has <TT>UNIX</TT> in the first field is echoed to the screen, and a
line whose second field is greater than 10 produces a message showing that
value. Finally, the <TT>END</TT> pattern prints a message that the program is finished.
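<P>To see the order in which the pieces run, the sketch below feeds three hypothetical input lines to the same four pattern-action pairs. It passes the program on the command line for brevity, and it ends the <TT>printf</TT> format with <TT>\n</TT> so that each message lands on its own line. The <TT>AWK</TT> variable falls back to the system <TT>awk</TT> if <TT>gawk</TT> is missing.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical input: the first field is a name, the second a number.
result=$(printf 'UNIX 5\nLinux 12\nUNIX 20\n' | "$AWK" '
BEGIN { print "Starting to process the file" }
$1 == "UNIX" { print }
$2 > 10 { printf "This line has a value of %d\n", $2 }
END { print "Finished processing the file. Bye!" }')

printf '%s\n' "$result"
```

<P>The third input line satisfies both patterns, so it produces two lines of output: the echoed line and then the message about its value.</P>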
<CENTER>
<H4><A NAME="Heading17"></A><FONT COLOR="#000077">Variables</FONT></H4>
</CENTER>
<P>If you have used any programming language before, you know that a variable is
a storage location for a value. Each variable has a name and an associated value,
which may change.</P>
<P>With <TT>gawk</TT>, you assign a variable a value by using <TT>=</TT>, the assignment
operator:<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">var1 = 10
</FONT></PRE>
<P>This assigns the value 10 (numeric, not string) to the variable <TT>var1</TT>.
With <TT>gawk</TT>, you don't have to declare variable types before you use them
as you must with most other languages. This makes it easy to work with variables
in <TT>gawk</TT>.
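<P>A minimal sketch of what the lack of declarations means in practice: the same variable can hold a number and later a string. The second assignment and the printed values are illustrative only; <TT>var1</TT> is the name used in the text.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# A BEGIN-only program needs no input file.
result=$("$AWK" 'BEGIN {
    var1 = 10          # numeric value, no declaration needed
    var1 = var1 * 2    # arithmetic works straight away
    print var1
    var1 = "ten"       # the same variable can now hold a string
    print var1
}')

printf '%s\n' "$result"
```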
<DL>
<DT></DT>
</DL>
<DL>
<DD>
<HR>
<A NAME="Heading18"></A><FONT COLOR="#000077"><B>NOTE:</B> </FONT>Don't confuse the
assignment operator, <TT>=</TT>, which assigns a value, with the comparison operator,
<TT>==</TT>, which compares two values. This is a common error that takes a little
practice to overcome.
<HR>
</DL>
<P>The <TT>gawk</TT> language lets you use variables within actions, so the pattern-action
pair<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">$1 == "Plastic" { count = count + 1 }
</FONT></PRE>
<P>checks to see if the first field is equal to the string "<TT>Plastic</TT>",
and if it is, increments the value of <TT>count</TT> by one. Somewhere above this
line we should set a preliminary value for the variable <TT>count</TT> (usually in
the <TT>BEGIN</TT> section), or we will be adding one to an unknown value.
<DL>
<DT></DT>
</DL>
<DL>
<DD>
<HR>
<A NAME="Heading19"></A><FONT COLOR="#000077"><B>NOTE:</B> </FONT>Actually, <TT>gawk</TT>
gives every variable an empty value the first time it is used, and that value counts as
zero in arithmetic, so you don't really have to define the value before you use it. It is,
however, good programming practice to initialize the variable anyway.
<HR>
</DL>
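<P>The behavior of unset variables is easy to confirm. In this sketch <TT>count</TT> is never initialized, and <TT>unused</TT> is a hypothetical variable that is never assigned at all. The three input lines are made up for illustration.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# count starts out unset; the first count + 1 therefore yields 1.
result=$(printf 'Plastic\nMetal\nPlastic\n' | "$AWK" '
$1 == "Plastic" { count = count + 1 }
END { print count, unused + 0, "[" unused "]" }')

printf '%s\n' "$result"
```

<P>The output shows the count of matching lines, then the unset variable used as a number (zero) and as text (an empty string between the brackets).</P>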
<P>Here's a more complete example:<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">BEGIN { count = 0 }
$5 == "UNIX" { count = count + 1 }
END { printf "%d occurrences of UNIX were found\n", count }
</FONT></PRE>
<P>In the <TT>BEGIN</TT> section, the variable <TT>count</TT> is set to zero. Then the
pattern-action pair is applied to each line, and every line whose fifth field is "<TT>UNIX</TT>"
adds one to the value of <TT>count</TT>. After the entire file has been processed,
the <TT>END</TT> statement displays the total number.</P>
<P>Variables can be used in combination with fields and values, so all of the following
statements are legal:<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">count = count + $6
count = $5 - 8
count = $5 + var1
</FONT></PRE>
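<P>The three statements can be exercised together in one short program. In this sketch the two input lines, the starting value of <TT>var1</TT>, and the names <TT>total</TT>, <TT>diff</TT>, and <TT>sum</TT> are all hypothetical; separate variables are used so that each result can be printed.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical input: field 5 and field 6 hold numbers.
result=$(printf 'a b c d 12 3\na b c d 20 4\n' | "$AWK" '
BEGIN { var1 = 10; total = 0 }
{
    total = total + $6    # running sum of field 6
    diff = $5 - 8         # field 5 minus a constant
    sum = $5 + var1       # field 5 plus a variable
    print diff, sum
}
END { print "total", total }')

printf '%s\n' "$result"
```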
<P>Variables can also be part of a pattern. The following are all valid as pattern-action
pairs:<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">$2 > max_value {print "Max value exceeded by ", $2 - max_value}
$4 - var1 < min_value {print "Illegal value of ", $4}
</FONT></PRE>
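<P>For these patterns to be meaningful, the variables need values before the first line is read, which is a natural job for <TT>BEGIN</TT>. The following sketch assumes limits of 10 and 0 and an offset of 5, all hypothetical, and runs the two pairs against two made-up input lines.</P>

```shell
# Use gawk when it is installed; otherwise fall back to the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical input: field 2 and field 4 hold numbers.
result=$(printf 'a 15 c 9\nb 8 c 2\n' | "$AWK" '
BEGIN { max_value = 10; min_value = 0; var1 = 5 }
$2 > max_value { print "Max value exceeded by", $2 - max_value }
$4 - var1 < min_value { print "Illegal value of", $4 }')

printf '%s\n' "$result"
```

<P>The first line exceeds <TT>max_value</TT> by 5. In the second line, field 4 minus <TT>var1</TT> is negative, so it falls below <TT>min_value</TT>.</P>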