<P>Two special operators are used with variables to increment and decrement by one,
because these are common operations. Both of these special operators are borrowed
from the C language:
<TABLE BORDER="0">
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>count++</TT></TD>
<TD ALIGN="LEFT">Increments count by one</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>count--</TT></TD>
<TD ALIGN="LEFT">Decrements count by one</TD>
</TR>
</TABLE>
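<P>As a minimal sketch of the increment operator in use, the following command counts
the records that contain the word <TT>error</TT>. The sample input and the variable
name <TT>count</TT> are made up for illustration.</P>
<PRE><FONT COLOR="#0066FF"># count the records containing "error" (sample input is hypothetical)
printf 'ok\nerror one\nerror two\n' |
gawk '/error/ { count++ } END { print count }'
</FONT></PRE>
<P>With this input the command should print <TT>2</TT>, because two of the three sample
lines match the pattern.</P>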
<CENTER>
<H4><A NAME="Heading20"></A><FONT COLOR="#000077">Built-In Variables</FONT></H4>
</CENTER>
<P>The <TT>gawk</TT> language has a few built-in variables that are used to represent
things such as the total number of records processed. These are useful when you want
to get totals. Table 26.7 shows the important built-in variables. <BR>
<CENTER>
<P><FONT SIZE="4"><B>Table 26.7. The important built-in variables. </B></FONT>
<TABLE BORDER="0">
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><I>Variable</I></TD>
<TD ALIGN="LEFT"><I>Description</I></TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>NR</TT></TD>
<TD ALIGN="LEFT">The number of records read so far</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>FNR</TT></TD>
<TD ALIGN="LEFT">The number of records read from the current file</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>FILENAME</TT></TD>
<TD ALIGN="LEFT">The name of the current input file</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>FS</TT></TD>
<TD ALIGN="LEFT">Field separator (default is whitespace)</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>RS</TT></TD>
<TD ALIGN="LEFT">Record separator (default is newline)</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>OFMT</TT></TD>
<TD ALIGN="LEFT">Output format for numbers (default is <TT>%.6g</TT>)</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>OFS</TT></TD>
<TD ALIGN="LEFT">Output field separator (default is a single space)</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>ORS</TT></TD>
<TD ALIGN="LEFT">Output record separator (default is newline)</TD>
</TR>
<TR ALIGN="LEFT" rowspan="1">
<TD ALIGN="LEFT"><TT>NF</TT></TD>
<TD ALIGN="LEFT">The number of fields in the current record</TD>
</TR>
</TABLE>
<BR>
</CENTER>
<P>The <TT>NR</TT> and <TT>FNR</TT> values are the same if you are processing only
one file, but if you are processing more than one file, <TT>NR</TT> is a running total
across all files, while <TT>FNR</TT> counts the records in the current file only.</P>
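<P>The difference is easiest to see by printing both counters side by side. The sketch
below builds two small throwaway files (the names <TT>one.txt</TT> and <TT>two.txt</TT>
and their contents are invented for this demonstration) and prints <TT>FILENAME</TT>,
<TT>FNR</TT>, and <TT>NR</TT> for every record.</P>
<PRE><FONT COLOR="#0066FF"># create two hypothetical sample files in a temporary directory
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\ne\n' > "$dir/two.txt"
cd "$dir"

# print the file name, the per-file count, and the running count
gawk '{ print FILENAME, FNR, NR }' one.txt two.txt
</FONT></PRE>
<P>The last line of output should be <TT>two.txt 3 5</TT>: <TT>FNR</TT> started again
at 1 for the second file, while <TT>NR</TT> kept counting from where the first file
left off.</P>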
<P>The <TT>FS</TT> variable is useful because it controls the input file's field
separator. To use the colon for the <TT>/etc/passwd</TT> file, for example, you would
use the assignment
<PRE><FONT COLOR="#0066FF">FS=":"
</FONT></PRE>
<P>in the script, usually as part of the <TT>BEGIN</TT> pattern so that it takes effect
before the first record is read.</P>
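<P>Put together, a complete command might look like the following sketch. The two
input lines are made-up entries in <TT>/etc/passwd</TT> format, not real accounts;
the script prints the first field (the login name) and the seventh (the shell).</P>
<PRE><FONT COLOR="#0066FF"># two hypothetical lines in /etc/passwd format
printf 'root:x:0:0:root:/root:/bin/bash\nguest:x:1001:1001::/home/guest:/bin/sh\n' |
gawk 'BEGIN { FS=":" } { print $1, $7 }'
</FONT></PRE>
<P>This should print <TT>root /bin/bash</TT> followed by <TT>guest /bin/sh</TT>. Without
the <TT>BEGIN</TT> assignment, each line would be treated as a single field, because
the sample lines contain no whitespace.</P>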
<P>You can use these built-in variables as you would any other. For example, the
command</P>
<PRE><FONT COLOR="#0066FF">NF <= 5 {print "Not enough fields in the record"}
</FONT></PRE>
<P>gives you a way to check the number of fields in each record you are processing and
print an error message for any record that has too few (here, five or fewer).</P>
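<P>Built-in variables can also be combined in the message itself. As a small sketch,
assuming input that is supposed to have exactly three fields per record (an invented
requirement for this example), the following reports the record number and field
count of each bad line:</P>
<PRE><FONT COLOR="#0066FF"># report every record that does not have exactly three fields
printf 'a b c\nd e\n' |
gawk 'NF != 3 { print "Record " NR " has " NF " fields, expected 3" }'
</FONT></PRE>
<P>Only the second sample line is short, so the output should be
<TT>Record 2 has 2 fields, expected 3</TT>.</P>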
<CENTER>
<H3><A NAME="Heading21"></A><FONT COLOR="#000077">Control Structures</FONT></H3>
</CENTER>
<P>Enough of the details have been covered to allow us to start doing some real <TT>gawk</TT>
programming. Although we have not covered all of <TT>gawk</TT>'s pattern and action
considerations, we have seen all the important material. Now we can look at writing
control structures.</P>
<P>If you have any programming experience at all, or have tried some shell script
writing, many of these control structures will appear familiar. Follow the examples
and try a few test programs of your own.</P>
<P>Incidentally, <TT>gawk</TT> enables you to place comments anywhere in your scripts.
A comment starts with a <TT>#</TT> sign and runs to the end of the line. You should
use comments to indicate what is going on in your scripts if it is not immediately obvious.
<CENTER>
<H4><A NAME="Heading22"></A><FONT COLOR="#000077">The if Statement</FONT></H4>
</CENTER>
<P>The <TT>if</TT> statement is used to allow <TT>gawk</TT> to test some condition
and, if it is true, execute a set of commands. The general syntax for the <TT>if</TT>
statement is<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">if (expression) {commands} else {commands}
</FONT></PRE>
<P>The expression is evaluated only to see whether it is true or false: zero and the
empty string count as false, and any other value counts as true. Here's a simple
<TT>if</TT> script:
<PRE><FONT COLOR="#0066FF"># a simple if statement
{
    if ($1 == 0) {
        print "This cell has a value of zero"
    }
    else {
        printf "The value is %d\n", $1
    }
}
</FONT></PRE>
<P>You will notice that I used the curly braces to lay out the program in a readable
manner. Of course, this could all have been typed on one line and <TT>gawk</TT> would
have understood it, but writing in a nicely formatted manner makes it easier to understand
what is going on, and debugging the program becomes much easier if the need arises.</P>
<P>In this simple script, we test the first field to see if the value is zero. If
it is, a message to that effect is printed. If not, the <TT>printf</TT> statement
prints the value of the field.</P>
<P>The flow of the <TT>if</TT> statement is quite simple to follow. There can be
several commands in each part, as long as the curly braces mark the start and end
of each group of commands. There is no need to have an <TT>else</TT> section. It can
be left out entirely, if desired. For example, this is a complete and valid <TT>gawk</TT>
script:
<PRE><FONT COLOR="#0066FF">{
    if ($1 == 0) {
        print "This cell has a value of zero"
    }
}
</FONT></PRE>
<P>Because every <TT>gawk</TT> rule is a pattern followed by an action, a simple
comparison can also be written as a pattern, with no <TT>if</TT> keyword at all.
This quick-and-dirty structure is harder to read for novices, and I don't recommend
it if you are new to the language. For example, here's the <TT>if</TT> statement
written the proper way:
<PRE><FONT COLOR="#0066FF"># a nicely formatted if statement
{
    if ($1 > $2) {
        print "The first field is larger"
    }
    else {
        print "The second field is larger"
    }
}
</FONT></PRE>
<P>Here's the quick-and-dirty method:
<PRE><FONT COLOR="#0066FF"># if syntax from hell
$1 > $2 {
    print "The first field is larger"
}
!($1 > $2) {
    print "The second field is larger"
}
</FONT></PRE>
<P>You will notice that the keywords <TT>if</TT> and <TT>else</TT> are left off.
Each comparison becomes a pattern, and the commands in the braces that follow it run
only for records that match. There is no real <TT>else</TT> in this form: a block with
no pattern in front of it runs for every record, so the false case needs its own
pattern, here the negated comparison. However, this is much less readable if you do
not know that it is standing in for an <TT>if</TT> statement! Unless the test is
trivial, you should be using the more verbose method of writing <TT>if</TT> statements
for readability's sake.
<CENTER>
<H4><A NAME="Heading23"></A><FONT COLOR="#000077">The while Loop</FONT></H4>
</CENTER>
<P>The <TT>while</TT> statement allows a set of commands to be repeated as long as
some condition is true. The condition is evaluated each time the program loops. The
general format of the <TT>gawk</TT> <TT>while</TT> loop is<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">while (expression){
commands
}
</FONT></PRE>
<P>For example, the <TT>while</TT> loop can be used in a program that calculates
the value of an investment over several years (the formula for the calculation is
<TT>value = amount * (1 + interest_rate)^years</TT>):</P>
<PRE><FONT COLOR="#0066FF"># interest calculation computes compound interest
# inputs from a file are the amount, interest_rate, and years
{
    var = 1
    while (var <= $3) {
        printf("%f\n", $1*(1+$2)^var)
        var++
    }
}
</FONT></PRE>
<P>You can see in this script that we set the variable <TT>var</TT> to 1 before
entering the <TT>while</TT> loop. If we hadn't done this, <TT>gawk</TT> would have
started it at zero for the first record and left it at whatever value it had reached
for every record after that. The amount, interest rate, and number of years are read
from the first three fields of each input record. The increment operator (<TT>var++</TT>)
adds one to <TT>var</TT> each time through the loop.
<CENTER>
<H4><A NAME="Heading24"></A><FONT COLOR="#000077">The for Loop</FONT></H4>
</CENTER>
<P>The <TT>for</TT> loop is commonly used when you want to initialize a counter, test
it, and update it all in one place. The syntax of the <TT>gawk</TT> <TT>for</TT> loop is
<PRE><FONT COLOR="#0066FF">for (initialization; expression; increment) {
command
}
</FONT></PRE>
<P>The initialization is executed only once, before the loop starts. The expression
is evaluated before each pass through the loop, and the loop ends as soon as it is
false. The increment is executed at the end of each pass. Usually the increment updates
a counter of some type, but it can be any valid expression. Here's an example of a
<TT>for</TT> loop, which is the same basic program as shown for the <TT>while</TT> loop:
<PRE><FONT COLOR="#0066FF"># interest calculation computes compound interest
# inputs from a file are the amount, interest_rate, and years
{
    for (var = 1; var <= $3; var++) {
        printf("%f\n", $1*(1+$2)^var)
    }
}
</FONT></PRE>
<P>In this case, <TT>var</TT> is initialized when the <TT>for</TT> loop starts. The
expression is evaluated, and if true, the loop runs. Then the value of <TT>var</TT>
is incremented and the expression is tested again.</P>
<P>The format of the <TT>for</TT> loop might look strange if you haven't encountered
programming languages before, but it is the same as the <TT>for</TT> loop used in
C, for example.
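<P>A <TT>for</TT> loop can count down as well as up. As an illustrative sketch (the
variable names <TT>i</TT> and <TT>line</TT> and the sample input are invented here),
the following command prints the fields of each record in reverse order. It uses
<TT>$NF</TT> for the last field, since <TT>NF</TT> holds the number of fields, and
the decrement operator to step backward.</P>
<PRE><FONT COLOR="#0066FF"># print the fields of each record in reverse order
printf 'one two three\n' |
gawk '{
    line = $NF                        # start with the last field
    for (i = NF - 1; i >= 1; i--) {   # walk back toward the first field
        line = line " " $i
    }
    print line
}'
</FONT></PRE>
<P>For the sample line this should print <TT>three two one</TT>.</P>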
<CENTER>
<H4><A NAME="Heading25"></A><FONT COLOR="#000077">next and exit</FONT></H4>
</CENTER>
<P>The <TT>next</TT> instruction tells <TT>gawk</TT> to process the next record in
the file, regardless of what it was doing. For example, in the script<FONT COLOR="#0066FF"></FONT>
<PRE><FONT COLOR="#0066FF">{ command1
command2
command3
next
command4
}
</FONT></PRE>
<P>as soon as the <TT>next</TT> statement is reached, <TT>gawk</TT> reads the next
record from the file and starts again at the first rule of the script. In this example,
command4 will never be executed because the <TT>next</TT> statement sends <TT>gawk</TT>
back to command1 with a new record each time.</P>
<P>The <TT>next</TT> statement is usually used inside an <TT>if</TT> statement, where
you may want execution to return to the start of the script if some condition is
met.</P>
<P>The <TT>exit</TT> statement makes <TT>gawk</TT> behave as though it has reached
the end of the input: it stops reading records and then executes the <TT>END</TT>
pattern, if one exists. This is a useful method of aborting processing if there was
an error in the file.
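<P>The two statements can be seen working together in the sketch below. It adds up the
first field of each record, skips any line whose first field is <TT>#</TT>, and stops
when it meets a line reading <TT>STOP</TT>. The <TT>STOP</TT> marker, the variable
<TT>total</TT>, and the sample input are all assumptions made for this example.</P>
<PRE><FONT COLOR="#0066FF"># sum the first field, skipping comment lines and stopping at STOP
printf '# comment\n10\nSTOP\n20\n' |
gawk '{
    if ($1 == "#") next      # skip this record and read the next one
    if ($1 == "STOP") exit   # stop reading input and go to END
    total += $1
}
END { print "total:", total }'
</FONT></PRE>
<P>This should print <TT>total: 10</TT>. The comment line is skipped by <TT>next</TT>,
and the <TT>20</TT> is never read because <TT>exit</TT> ends the input early while
still running the <TT>END</TT> pattern.</P>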
<CENTER>
<H4><A NAME="Heading26"></A><FONT COLOR="#000077">Arrays</FONT></H4>
</CENTER>
<P>The <TT>gawk</TT> language supports arrays and enables you to access any element
in the array easily. No special initialization is neces