<HTML>

<HEAD>

<TITLE>Linux Unleashed, Third Edition:gawk</TITLE>

<SCRIPT>
<!--
function displayWindow(url, width, height) {
        var Win = window.open(url,"displayWindow",'width=' + width +
',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>





<!--ISBN=0672313723//-->

<!--TITLE=Linux Unleashed, Third Edition//-->

<!--AUTHOR=Tim Parker//-->

<!--PUBLISHER=Macmillan Computer Publishing//-->

<!--IMPRINT=Sams//-->

<!--CHAPTER=25//-->

<!--PAGES=453-456//-->

<!--UNASSIGNED1//-->

<!--UNASSIGNED2//-->



<CENTER>

<TABLE BORDER>

<TR>

<TD><A HREF="450-453.html">Previous</A></TD>

<TD><A HREF="../ewtoc.html">Table of Contents</A></TD>

<TD><A HREF="456-459.html">Next</A></TD>

</TR>

</TABLE>

</CENTER>

<P><BR></P>

<H3><A NAME="Heading11"></A><FONT COLOR="#000077">Calling gawk Programs</FONT></H3>

<P>Running pattern-action pairs one or two at a time from the command line would be pretty difficult (and time-consuming), so <TT>gawk</TT> allows you to store pattern-action pairs in a file. A <TT>gawk</TT> program (called a <I>script</I>) is a set of pattern-action pairs stored in an ASCII file. For example, this could be the contents of a valid <TT>gawk</TT> script:</P>

<!-- CODE SNIP //-->

<PRE>

/tparker/&#123;print &#36;6&#125;

&#36;2 != &quot;foo&quot; &#123;print&#125;

</PRE>

<!-- END CODE SNIP //-->

<P>The first pair prints the sixth column of every line that contains <TT>tparker</TT>. The second pair prints the entire line whenever the second column is not the string <TT>&quot;foo&quot;</TT>. The input is read only once: each line is tested against every pattern-action pair, in the order the pairs appear in the script, before the next line is read. When you are writing a script, you don&#146;t need the quotation marks around the pattern-action pairs that you used on the command line, because the pairs are read from the file instead of being passed through the shell.</P>

<P>After you have saved the pattern-action pairs in a file, you run them by giving <TT>gawk</TT> the <TT>-f</TT> option on the command line:</P>

<!-- CODE SNIP //-->

<PRE>

gawk -f <I>script filename</I>

</PRE>

<!-- END CODE SNIP //-->

<P>This command causes <TT>gawk</TT> to read all of the pattern-action pairs from the file <I>script</I> and process them against the file called <I>filename</I>. This is how most <TT>gawk</TT> programs are run. Don&#146;t confuse the <TT>-f</TT> and <TT>-F</TT> options: <TT>-f</TT> names the script file, whereas <TT>-F</TT> sets the field separator.</P>

<P>If you want to specify a different field separator on the command line (one can also be specified in the script, using a special format you&#146;ll see later), add the <TT>-F</TT> option. Like <TT>-f</TT>, it must appear before the name of the file to be processed:</P>
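<P>As a minimal sketch of the steps above, the following shell session saves the two pattern-action pairs shown earlier in a file and runs them with <TT>-f</TT>. The file names <TT>demo.awk</TT> and <TT>users.txt</TT> and the sample data are made up for illustration, and the <TT>AWK</TT> variable is only a convenience that falls back to <TT>awk</TT> if <TT>gawk</TT> is not installed.</P>

```shell
# Work in a scratch directory so no existing file is overwritten.
cd "$(mktemp -d)"
AWK=$(command -v gawk || command -v awk)

# Hypothetical script file holding the two pattern-action pairs.
cat > demo.awk <<'EOF'
/tparker/ {print $6}
$2 != "foo" {print}
EOF

# Hypothetical six-column input file.
printf 'tparker foo c d e /home/tparker\n' >  users.txt
printf 'ychow bar c d e /home/ychow\n'     >> users.txt

# Each input line is tested against both pairs, in script order.
"$AWK" -f demo.awk users.txt
```

<P>With this sample data, the first line matches only the first pair, so <TT>/home/tparker</TT> is printed; the second line matches only the second pair, so it is printed in full.</P>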

<!-- CODE SNIP //-->

<PRE>

gawk -f <I>script</I> -F&quot;:&quot; <I>filename</I>

</PRE>

<!-- END CODE SNIP //-->
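<P>A small sketch of the <TT>-F</TT> option in use follows. The script name <TT>fields.awk</TT>, the data file <TT>passwd.sample</TT>, and its single line of colon-separated data are assumptions invented for this example; the <TT>AWK</TT> variable falls back to <TT>awk</TT> if <TT>gawk</TT> is not installed.</P>

```shell
# Work in a scratch directory so no existing file is overwritten.
cd "$(mktemp -d)"
AWK=$(command -v gawk || command -v awk)

# Hypothetical script: print the first and sixth fields of every line.
cat > fields.awk <<'EOF'
{print $1, $6}
EOF

# Made-up data in the colon-separated style of /etc/passwd.
printf 'tparker:x:501:100:Tim Parker:/home/tparker:/bin/bash\n' > passwd.sample

# The options come first, the input file name last.
"$AWK" -f fields.awk -F":" passwd.sample
```

<P>Because the colon is the separator here, the fifth field (<TT>Tim Parker</TT>) is kept whole despite the space in it, and the command prints <TT>tparker /home/tparker</TT>.</P>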

<P>If you want to process more than one file using the script, just append the names of the files:

</P>

<!-- CODE SNIP //-->

<PRE>

gawk -f <I>script filename1 filename2 filename3</I> ...

</PRE>

<!-- END CODE SNIP //-->

<P>By default, all output from the <TT>gawk</TT> command is displayed on the screen. You can redirect it to a file with the shell&#146;s usual redirection operators:</P>

<!-- CODE SNIP //-->

<PRE>

gawk -f <I>script filename</I> &gt; save_file

</PRE>

<!-- END CODE SNIP //-->

<P>There is another way of specifying the output file from within the script, but we&#146;ll come back to that in a moment.

</P>

<H4 ALIGN="LEFT"><A NAME="Heading12"></A><FONT COLOR="#000077">BEGIN and END</FONT></H4>

<P>Two special patterns supported by <TT>gawk</TT> are useful when writing scripts. The <TT>BEGIN</TT> pattern is used to indicate any actions that should take place before <TT>gawk</TT> starts processing a file. This is typically used to initialize values, set parameters such as field separators, and so on. The <TT>END</TT> pattern is used to execute any instructions after the file has been completely processed. Typically, this can be for summaries or completion notices.</P>

<P>The instructions that follow each of the <TT>BEGIN</TT> and <TT>END</TT> patterns are enclosed in curly braces to identify which instructions belong to that pattern. Both <TT>BEGIN</TT> and <TT>END</TT> must appear in capitals. Here&#146;s a simple example of a <TT>gawk</TT> script that uses <TT>BEGIN</TT> and <TT>END</TT>, albeit only for sending a message to the terminal:</P>

<!-- CODE SNIP //-->

<PRE>

BEGIN &#123; print &quot;Starting to process the file&quot; &#125;

&#36;1 == &quot;UNIX&quot; &#123;print&#125;

&#36;2 &gt; 10 &#123;printf &quot;This line has a value of %d\n&quot;, &#36;2&#125;

END &#123; print &quot;Finished processing the file. Bye!&quot;&#125;

</PRE>

<!-- END CODE SNIP //-->

<P>In this script, a message is printed before any input is read. Each line that has the word <TT>UNIX</TT> in the first column is then echoed to the screen, and the same line is also checked to see whether its second column is greater than 10; if it is, a message showing that value is generated. The file is read only once, with both tests applied to each line in turn. Finally, the <TT>END</TT> pattern prints out a message that the program is finished.</P>
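<P>To see the order in which the messages appear, the script can be run against a few lines of sample input. This is only a sketch: the file names <TT>report.awk</TT> and <TT>os.txt</TT> and the data are invented for the example, each <TT>printf</TT> format ends in <TT>\n</TT> so that every message lands on its own line, and the <TT>AWK</TT> variable falls back to <TT>awk</TT> if <TT>gawk</TT> is not installed.</P>

```shell
# Work in a scratch directory so no existing file is overwritten.
cd "$(mktemp -d)"
AWK=$(command -v gawk || command -v awk)

# Hypothetical script file containing the BEGIN/END example.
cat > report.awk <<'EOF'
BEGIN { print "Starting to process the file" }
$1 == "UNIX" {print}
$2 > 10 {printf "This line has a value of %d\n", $2}
END { print "Finished processing the file. Bye!" }
EOF

# Made-up two-column input.
printf 'UNIX 5\nLinux 20\nUNIX 15\n' > os.txt

"$AWK" -f report.awk os.txt
```

<P>For this input the output is, in order: <TT>Starting to process the file</TT>, <TT>UNIX 5</TT>, <TT>This line has a value of 20</TT>, <TT>UNIX 15</TT>, <TT>This line has a value of 15</TT>, and <TT>Finished processing the file. Bye!</TT>. The third input line produces two consecutive output lines because it satisfies both patterns.</P>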

<H4 ALIGN="LEFT"><A NAME="Heading13"></A><FONT COLOR="#000077">Variables</FONT></H4>

<P>If you have used any programming language before, you know that a <I>variable</I> is a storage location for a value. Each variable has a name and an associated value, which may change.</P>

<P>With <TT>gawk</TT>, you assign a variable a value using the assignment operator (<TT>=</TT>):</P>

<!-- CODE SNIP //-->

<PRE>

var1 = 10

</PRE>

<!-- END CODE SNIP //-->

<BLOCKQUOTE>

<P><FONT SIZE="-1"><HR><B>Note:&nbsp;&nbsp;</B><BR>Don&#146;t confuse the assignment operator, <TT>=</TT>, which assigns a value, with the comparison operator, <TT>==</TT>, which compares two values. This is a common error that takes a little practice to overcome.<HR></FONT>

</BLOCKQUOTE>

<P>This assigns the value 10 (numeric, not string) to the variable <TT>var1</TT>. With <TT>gawk</TT>, you don&#146;t have to declare variable types before you use them as you must with most other languages. This makes it easy to work with variables in <TT>gawk</TT>.</P>

<P>The <TT>gawk</TT> language lets you use variables within actions:</P>

<!-- CODE SNIP //-->

<PRE>

&#36;1 == &quot;Plastic&quot; &#123; count = count &#43; 1 &#125;

</PRE>

<!-- END CODE SNIP //-->

<BLOCKQUOTE>

<P><FONT SIZE="-1"><HR><B>Note:&nbsp;&nbsp;</B><BR>Actually, <TT>gawk</TT> treats a variable that has not yet been assigned as zero when it is used as a number (and as an empty string when it is used as text), so you don&#146;t really have to define the value before you use it. It is, however, good programming practice to initialize the variable anyway.<HR></FONT>

</BLOCKQUOTE>

<P>This pattern-action pair checks to see if the first column is equal to the string &#147;<TT>Plastic</TT>&#148; and, if it is, increments the value of <TT>count</TT> by one. Somewhere above this line we should set a preliminary value for the variable <TT>count</TT> (usually in the <TT>BEGIN</TT> section). An unset <TT>count</TT> would still be treated as zero, but an explicit starting value makes the intent of the script clear.</P>
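<P>The behavior of a variable that has never been assigned can be checked with a short experiment. In this sketch the three input lines are made up, the program is given on the command line inside single quotes, and the <TT>AWK</TT> variable falls back to <TT>awk</TT> if <TT>gawk</TT> is not installed.</P>

```shell
AWK=$(command -v gawk || command -v awk)

# count is never given a starting value; it begins empty,
# and an empty variable acts as 0 in arithmetic.
printf 'Plastic\nMetal\nPlastic\n' |
  "$AWK" '$1 == "Plastic" { count = count + 1 } END { print count }'
```

<P>Two of the three lines have <TT>Plastic</TT> in the first column, so the command prints <TT>2</TT>.</P>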

<P>Here&#146;s a more complete example:</P>

<!-- CODE SNIP //-->

<PRE>

BEGIN &#123; count = 0 &#125;

&#36;5 == &quot;UNIX&quot; &#123; count = count &#43; 1 &#125;

END &#123; printf &quot;%d occurrences of UNIX were found\n&quot;, count &#125;

</PRE>

<!-- END CODE SNIP //-->

<P>In the <TT>BEGIN</TT> section, the variable <TT>count</TT> is set to zero. Then, the <TT>gawk</TT> pattern-action pair is applied to each line, with every line whose fifth column is &#147;<TT>UNIX</TT>&#148; adding one to the value of <TT>count</TT>. After the entire file has been processed, the <TT>END</TT> statement displays the total number.</P>

<P>Variables can be used in combination with columns and values, so all of the following statements are legal:</P>

<!-- CODE SNIP //-->

<PRE>

count = count &#43; &#36;6



count = &#36;5 - 8



count = &#36;5 &#43; var1

</PRE>

<!-- END CODE SNIP //-->
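<P>As an illustration of the first form, a running total of one column can be kept in a variable. This is a sketch only: the item names and quantities are invented, the second column is used instead of the sixth to keep the sample data short, and the <TT>AWK</TT> variable falls back to <TT>awk</TT> if <TT>gawk</TT> is not installed.</P>

```shell
AWK=$(command -v gawk || command -v awk)

# Made-up input: item name in column 1, quantity in column 2.
printf 'bolts 4\nnuts 10\nwashers 6\n' |
  "$AWK" 'BEGIN { count = 0 }
          { count = count + $2 }
          END { printf "Total quantity: %d\n", count }'
```

<P>The action has no pattern, so it runs for every line, and the command prints <TT>Total quantity: 20</TT>.</P>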

<P>Variables can also be part of a pattern. The following are both valid as pattern-action pairs:

</P>

<!-- CODE SNIP //-->

<PRE>

&#36;2 &gt; max_value &#123;print &quot;Max value exceeded by &quot;, &#36;2 - max_value&#125;



&#36;4 - var1 &lt; min_value &#123;print &quot;Illegal value of &quot;, &#36;4&#125;

</PRE>

<!-- END CODE SNIP //-->
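<P>A variable used in a pattern needs a value from somewhere, and the <TT>BEGIN</TT> section is a natural place to set it. The following sketch assumes a limit of 50 and three invented sensor readings; the <TT>AWK</TT> variable falls back to <TT>awk</TT> if <TT>gawk</TT> is not installed.</P>

```shell
AWK=$(command -v gawk || command -v awk)

# Made-up input: a label in column 1, a reading in column 2.
printf 'sensor1 40\nsensor2 75\nsensor3 52\n' |
  "$AWK" 'BEGIN { max_value = 50 }
          $2 > max_value { print "Max value exceeded by", $2 - max_value }'
```

<P>Only the second and third readings are above the limit, so the command prints <TT>Max value exceeded by 25</TT> followed by <TT>Max value exceeded by 2</TT>. The comma in the <TT>print</TT> statement already puts a space between the two items.</P>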

<P>Two special operators are used with variables to increment and decrement by one, because these are common operations. Both of these special operators are borrowed from the C language:

</P>

<CENTER>

<TABLE WIDTH="80%"><TR>

<TD WIDTH="40%"><TT>count&#43;&#43;</TT>

<TD WIDTH="60%">Increments count by one

<TR>

<TD><TT>count--</TT>

<TD>Decrements count by one

</TABLE>

</CENTER>
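<P>Both operators can be tried in one short command. In this sketch the variable name <TT>remaining</TT>, its starting value of 4, and the four input lines are all assumptions chosen for the example; the <TT>AWK</TT> variable falls back to <TT>awk</TT> if <TT>gawk</TT> is not installed.</P>

```shell
AWK=$(command -v gawk || command -v awk)

# remaining counts down once per line; count goes up once per UNIX line.
printf 'UNIX\nLinux\nUNIX\nUNIX\n' |
  "$AWK" 'BEGIN { count = 0; remaining = 4 }
          { remaining-- }
          $1 == "UNIX" { count++ }
          END { print count, remaining }'
```

<P>Three of the four lines are <TT>UNIX</TT> and all four lines decrement <TT>remaining</TT>, so the command prints <TT>3 0</TT>.</P>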






</td>
</tr>
</table>

<!-- begin footer information -->





</body></html>
