<HTML>
<HEAD>
<TITLE>Developer.com - Online Reference Library - 0672311739:RED HAT LINUX 2ND EDITION:gawk Programming</TITLE>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
var Win = window.open(url,"displayWindow",'width=' + width +
',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<!-- ISBN=0672311739 //-->
<!-- TITLE=RED HAT LINUX 2ND EDITION //-->
<!-- AUTHOR=DAVID PITTS ET AL //-->
<!-- PUBLISHER=MACMILLAN //-->
<!-- IMPRINT=SAMS PUBLISHING //-->
<!-- PUBLICATION DATE=1998 //-->
<!-- CHAPTER=27 //-->
<!-- PAGES=0545-0582 //-->
<!-- UNASSIGNED1 //-->
<!-- UNASSIGNED2 //-->
<P><CENTER>
<a href="../ch26/0544-0544.html">Previous</A> | <a href="../ewtoc.html">Table of Contents</A> | <a href="0549-0551.html">Next</A>
</CENTER></P>
<A NAME="PAGENUM-545"><P>Page 545</P></A>
<H3><A NAME="ch27_ 1">
CHAPTER 27<BR>
</A></H3>
<H2>
gawk Programming
</H2>
<P><B>by David B. Horvath, CCP
</B></P>
<H3><A NAME="ch27_ 2">
IN THIS CHAPTER</A></H3>
<UL>
<LI> Applications 546
<LI> Features 547
<LI> awk Fundamentals 547
<LI> Actions 555
<LI> Advanced Input and Output 569
<LI> Functions 574
<LI> Writing Reports 577
<LI> Commands On-the-Fly 579
<LI> One Last Built-in Function: system 580
</UL>
<A NAME="PAGENUM-546"><P>Page 546</P></A>
<P>gawk, or GNU awk, is one of the newer versions of the
awk programming language created for UNIX by Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan in 1977. The
name awk comes from the initials of the creators' last names. Kernighan was also involved with
the creation of the C programming language and UNIX; Aho and Weinberger were involved
with the development of UNIX. Because of their backgrounds, you will see many similarities
between awk and C.
</P>
<P>There are several versions of awk: the original
awk, nawk, POSIX awk, and of course, gawk. nawk was
created in 1985 and is the version described in The
awk Programming Language (see the complete reference to this book later in the chapter in the section "Summary").
POSIX awk is defined in the IEEE Standard for Information
Technology, Portable Operating System Interface, Part 2: Shell and Utilities Volume
2, ANSI-approved April 5, 1993 (IEEE is the Institute
of Electrical and Electronics Engineers, Inc.). GNU
awk is based on POSIX awk.
</P>
<P>The awk language (in all of its versions) is a pattern-matching and processing language with
a lot of power. It searches a file (or multiple files) for records that match a
specified pattern. When a match is found, a specified action is performed. As a programmer, you
do not have to worry about opening the file, looping through it to read each record, handling
end-of-file, or closing it when done. These details are handled automatically for you.
</P>
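<P>As a minimal sketch of the pattern-and-action model just described, the following command prints only the records that match a pattern. The sample records and the pattern /Linux/ are made up for illustration, and plain awk is used here; gawk should behave the same way for this program.</P>

```shell
# Hypothetical sample input: three records, two of which contain "Linux".
# The program has the form  pattern { action }.
matches=$(printf 'Red Hat Linux\nSolaris\nDebian Linux\n' |
    awk '/Linux/ { print "match: " $0 }')
printf '%s\n' "$matches"
```

<P>Notice that the program never opens, reads, or closes anything itself; awk reads each record and applies the action only where the pattern matches.</P>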
<P>It is easy to create short awk programs because of this functionality—many of the details
are handled by the language automatically. There are also many functions and built-in features
to handle many of the tasks of processing files.
</P>
<H3><A NAME="ch27_ 3">
Applications
</A></H3>
<P>There are many possible uses for awk, including extracting data from a file, counting
occurrences within a file, and creating reports.
</P>
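<P>One of these uses, counting occurrences, might look like the following sketch. The log-style input and the word "error" are invented for the example; the END pattern runs its action once, after the last record has been read.</P>

```shell
# Hypothetical log data: count the records that contain "error".
# "n + 0" forces a numeric result, so 0 is printed when nothing matches.
count=$(printf 'ok\nerror: disk\nok\nerror: net\n' |
    awk '/error/ { n++ } END { print n + 0 }')
printf 'records with errors: %s\n' "$count"
```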
<P>The basic syntax of the awk language closely resembles that of the C programming language; if you
already know C, you know most of awk. In many ways,
awk is an easier version of C because it handles strings and arrays dynamically. If you do not know C yet, learning
awk will make learning C a little easier.
</P>
<P>awk is also very useful for rapid prototyping or trying out an idea that will be implemented
in another language like C. Instead of your having to worry about some of the minute
details, the built-in automation takes care of them. You worry about the basic functionality.
</P>
<P>
<CENTER>
<TABLE BGCOLOR="#FFFF99">
<TR><TD><B>
TIP
</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
awk works with text files, not binary. Because binary data can contain values that look
like record terminators (newline characters)—or not have any at
all—awk will get confused. If you need to process binary files, look into Perl or use a traditional programming
language like C.
</BLOCKQUOTE></TD></TR>
</TABLE></CENTER>
</P>
<A NAME="PAGENUM-547"><P>Page 547</P></A>
<H3><A NAME="ch27_ 4">
Features
</A></H3>
<P>Like the UNIX environment itself, awk is flexible: it contains predefined variables, automates many
programming tasks, provides conventional variables, supports C-style formatted
output, and is easy to use. awk lets you combine the best of shell scripts and C programming.
</P>
<P>There are usually many different ways to perform the same task within
awk. Programmers get to decide which method is best suited to their applications. With the built-in variables
and functions, many of the normal programming tasks are automatically performed.
awk will automatically read each record, split it up into fields, and perform type conversions whenever
needed. The way a variable is used determines its type—there is no need (or method) to declare
variables of any type.
</P>
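<P>The automatic field splitting and type conversion can be seen in a small sketch like this one. The input record and the variable names cost and label are hypothetical. The same field, $2, is used once as a number and once as a string, with no declarations.</P>

```shell
# Hypothetical record: item name, quantity, unit price.
# $2 is multiplied as a number and then concatenated as a string.
result=$(printf 'widget 3 2.50\n' |
    awk '{ cost = $2 * $3; label = $1 "-" $2; print label, cost }')
printf '%s\n' "$result"
```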
<P>Of course, the "normal" C programming constructs like
if/else, do/while, for, and while are supported.
The awk language defined by POSIX doesn't include the switch/case construct, although current
versions of gawk provide it as an extension. awk supports C's
printf() for formatted output and also has a print command for
simpler output.
</P>
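<P>A short sketch of these constructs, using if/else together with printf. The names, scores, and the passing mark of 60 are made up for illustration.</P>

```shell
# Hypothetical input: a name and a score on each record.
# %-6s left-justifies the name in 6 columns; %3d right-justifies the score in 3.
report=$(printf 'alice 91\nbob 58\n' |
    awk '{
        if ($2 >= 60)
            grade = "pass"
        else
            grade = "fail"
        printf("%-6s|%3d|%s\n", $1, $2, grade)
    }')
printf '%s\n' "$report"
```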
<H3><A NAME="ch27_ 5">
awk Fundamentals
</A></H3>
<P>Unlike some of the other UNIX tools (shell,
grep, and so on), awk requires a program (known as an
"awk script"). This program can be as simple as one line or as complex as several
thousand lines. (I once developed an awk program that summarizes data at several levels with
multiple control breaks; it was just short of 1000 lines.)
</P>
<P>The awk program can be entered in a number of ways: on the command line or in a
program file. awk can accept input from a file, piped in from another program, or even directly from
the keyboard. Output normally goes to the standard output device, but that can be redirected to
a file or piped into another program. An awk program can also send output directly to a file
itself instead of to standard output.
</P>
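<P>As a sketch of both output paths mentioned above (shell redirection, and output sent to a file from inside the program), the following writes the first field to one file and the second field to another. The file names names.txt and numbers.txt and the variable out are arbitrary choices for this example.</P>

```shell
workdir=$(mktemp -d)    # scratch directory for the example files

# Input is piped in. Inside the program, "print $1 > out" writes to the file
# named by the awk variable out; the plain print goes to standard output,
# which the shell redirects to numbers.txt.
printf 'alpha 1\nbeta 2\n' |
    awk -v out="$workdir/names.txt" '{ print $1 > out; print $2 }' > "$workdir/numbers.txt"

names=$(cat "$workdir/names.txt")
numbers=$(cat "$workdir/numbers.txt")
rm -rf "$workdir"
printf '%s\n%s\n' "$names" "$numbers"
```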
<H4><A NAME="ch27_ 6">
Using awk from the Command Line
</A></H4>
<P>The simplest way to use awk is to code the program on the command line, accept input
from the standard input device (keyboard), and send output to the standard output device
(screen). Listing 27.1 shows this in its simplest form; it prints the number of fields in the input
record along with that record.
</P>
<P>Listing 27.1. Simplest use of awk.
</P>
<!-- CODE //-->
<PRE>$ gawk '{print NF ": " $0}'
Now is the time for all
Good Americans to come to the Aid
of Their Country.
Ask not what you can do for awk, but rather what awk can do for you.
Ctrl+d
</PRE>
<!-- END CODE //-->
<A NAME="PAGENUM-548"><P>Page 548</P></A>
<!-- CODE //-->
<PRE>6: Now is the time for all
7: Good Americans to come to the Aid
3: of Their Country.
16: Ask not what you can do for awk, but rather what awk can do for you.
$ _
</PRE>
<!-- END CODE //-->
<P>
<CENTER>
<TABLE BGCOLOR="#FFFF99">
<TR><TD><B>
NOTE
</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
Ctrl+D is one way of showing that you should press (and hold) the Ctrl (or Control) key
and then press the D key. This is the default end-of-file key for UNIX. If this doesn't work on
your system, use stty -a to determine which key to press. Another way this action or key
is shown on the screen is ^d.<BR>
The entire awk script is contained within single quotes
(') to prevent the shell from interpreting its contents. This is a requirement of the operating system or shell, not the
awk language.
</BLOCKQUOTE></TD></TR>
</TABLE></CENTER>
</P>
<P>NF is a predefined variable that is set to the number of fields in the current record.
$0 is the entire record. The individual fields can be
referenced as $1, $2, and so on.
</P>
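<P>A small sketch of these variables, using one line of the sample input from Listing 27.1. Because NF holds the number of fields, $NF refers to the last field of the record.</P>

```shell
# $1 is the first field, $NF the last field, NF the number of fields.
fields=$(printf 'of Their Country.\n' |
    awk '{ print "first=" $1, "last=" $NF, "count=" NF }')
printf '%s\n' "$fields"
```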
<P>You can also store your awk script in a file and specify that filename on the command line
by using the -f flag. If you do that, you don't have to enclose the program in single quotes.
</P>
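<P>The following sketch stores the program from Listing 27.1 in a file and runs it with -f. The file name count.awk is a hypothetical choice; any name will do.</P>

```shell
workdir=$(mktemp -d)    # scratch directory for the example script

# Inside a program file no shell quoting is needed, and # starts a comment.
cat > "$workdir/count.awk" <<'EOF'
# Print the number of fields, then the record itself.
{ print NF ": " $0 }
EOF

from_file=$(printf 'Now is the time for all\n' | awk -f "$workdir/count.awk")
rm -rf "$workdir"
printf '%s\n' "$from_file"
```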
<P>
<CENTER>
<TABLE BGCOLOR="#FFFF99">
<TR><TD><B>
NOTE
</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
gawk and other versions of awk that meet the POSIX standard let you specify
more than one program file by repeating the
-f option. The files are concatenated, in the order given, into a single
awk program that is run on the input. Personally, I tend to avoid this just because it
gets a bit confusing.
</BLOCKQUOTE></TD></TR>
</TABLE></CENTER>
</P>
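<P>A sketch of repeated -f options, assuming the POSIX behavior in which the named files are joined, in order, into one program. The file names sum.awk and report.awk are hypothetical.</P>

```shell
workdir=$(mktemp -d)    # scratch directory for the two example scripts

# First file: add up the fields of every record.
printf '{ total += NF }\n' > "$workdir/sum.awk"
# Second file: print the total once all input has been read.
printf 'END { print "fields: " total }\n' > "$workdir/report.awk"

combined=$(printf 'a b c\nd e\n' |
    awk -f "$workdir/sum.awk" -f "$workdir/report.awk")
rm -rf "$workdir"
printf '%s\n' "$combined"
```

<P>Because both files become part of the same program, the variable total set in the first file is visible in the second.</P>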
<P>You can use the normal UNIX shell redirection or just specify the filename on the
command line to accept the input from a file instead of the keyboard:
</P>
<!-- CODE SNIP //-->
<PRE>gawk '{print NF ": " $0}' &lt; inputs
gawk '{print NF ": " $0}' inputs
</PRE>
<!-- END CODE SNIP //-->
<P>Multiple files can be specified by just listing them on the command line as shown in the
second form above—they will be processed in the order specified. Output can be redirected
through the normal UNIX shell facilities to send it to a file or pipe it into another program:
</P>
<!-- CODE SNIP //-->
<PRE>gawk '{print NF ": " $0}' &gt; outputs
gawk '{print NF ": " $0}' | more
</PRE>
<!-- END CODE SNIP //-->
<P>Of course, both input and output can be redirected at the same time.
</P>
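<P>For example, the following sketch redirects both at once. It creates its own sample file first; the names inputs and outputs simply follow the earlier examples.</P>

```shell
workdir=$(mktemp -d)    # scratch directory for the example files
printf 'Now is the time\nfor all\n' > "$workdir/inputs"

# Read from inputs and write to outputs in a single command.
awk '{ print NF ": " $0 }' < "$workdir/inputs" > "$workdir/outputs"

redirected=$(cat "$workdir/outputs")
rm -rf "$workdir"
printf '%s\n' "$redirected"
```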
</td>
</tr>
</table>
<!-- begin footer information -->
</body></html>