<HTML>
<HEAD>
<TITLE>Developer.com - Online Reference Library - 0672311739:RED HAT LINUX 2ND EDITION:gawk Programming</TITLE>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
var Win = window.open(url,"displayWindow",'width=' + width +
',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<!-- ISBN=0672311739 //-->
<!-- TITLE=RED HAT LINUX 2ND EDITION //-->
<!-- AUTHOR=DAVID PITTS ET AL //-->
<!-- PUBLISHER=MACMILLAN //-->
<!-- IMPRINT=SAMS PUBLISHING //-->
<!-- PUBLICATION DATE=1998 //-->
<!-- CHAPTER=27 //-->
<!-- PAGES=0545-0582 //-->
<!-- UNASSIGNED1 //-->
<!-- UNASSIGNED2 //-->
<P><CENTER>
<a href="0574-0577.html">Previous</A> | <a href="../ewtoc.html">Table of Contents</A> | <a href="0581-0582.html">Next</A>
</CENTER></P>
<A NAME="PAGENUM-578"><P>Page 578</P></A>
<H4><A NAME="ch27_ 46">
Complex Reports
</A></H4>
<P>Using awk, it is possible to quickly create complex reports. It is much easier to perform
string comparisons, build arrays on-the-fly, and take advantage of associative arrays than to code
in another language (like C). Instead of having to search through an array for a match with a
text key, that key can be used as the array subscript.
</P>
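<P>As a minimal sketch of using a text key as the array subscript, the following totals a
number per key. The function name count_by_key and the two-column sample data are assumptions
made for illustration; they are not taken from the report described in this section.
</P>

```shell
# Total the second field for each distinct value of the first field.
# The key text itself is the subscript, so no search loop is needed.
count_by_key() {
    awk '{ total[$1] += $2 }
         END { for (key in total) print key, total[key] }' | sort
}

# Hypothetical sample data: department name, then an amount.
printf 'sales 10\nengineering 5\nsales 7\n' | count_by_key
```

<P>With this sample input the result is engineering 5 followed by sales 17. The sort is there
only because the order in which for (key in total) visits the subscripts is unspecified.
</P>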
<P>I have produced reports using awk with three levels of control breaks, multiple sections of
reports in the same control break, and multiple totaling pages. The totaling pages were for
each level of control break plus a final page; if the control break didn't have a particular type of
data, then the total page didn't have it either. If there was only one member of a control break,
then the total page for that level wasn't created. (This saved a lot of paper when there was really
only one level of control break—the highest.)
</P>
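<P>A single level of control break might be sketched as follows, assuming the input is already
sorted on the key in the first field. The function name control_break and the sample data are
hypothetical, and the real report was far larger than this.
</P>

```shell
# One-level control break: print a subtotal whenever the key in $1 changes.
# Assumes the input is already sorted on $1.
control_break() {
    awk '
    NR > 1 && $1 != prev {          # key changed: close out the previous group
        printf "  total %s: %d\n", prev, subtotal
        subtotal = 0
    }
    {
        print $1, $2                # detail line
        subtotal += $2
        grand += $2
        prev = $1
    }
    END {
        if (NR > 0) printf "  total %s: %d\n", prev, subtotal
        printf "grand total: %d\n", grand
    }'
}

# Hypothetical sample data: region, then an amount.
printf 'east 3\neast 4\nwest 5\n' | control_break
```

<P>Further levels follow the same pattern: each level keeps its own previous-key variable and
subtotal, and a change at a higher level also closes out every level beneath it.
</P>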
<P>This report ended up being more than 1,000 lines of
awk (nawk to be specific) code. It takes a little longer to run than the equivalent C program, but it took a lot less programmer time
to create. Because it was easy to create and modify, it was developed using prototypes. The
users briefly described what they wanted, and I produced a report. They decided they needed
more control breaks, and I added them; then they realized a lot of paper was wasted on total
pages, so the report was modified as described.
</P>
<P>Because the report could be developed incrementally, without knowing the final result in
advance, the work was easier and more fun for me. And because I could respond quickly to their
changes, the users were happy!
</P>
<H4><A NAME="ch27_ 47">
Extracting Data
</A></H4>
<P>As mentioned early in this chapter, many systems don't produce data in the desired
format. When working with data stored in relational databases, there are two main ways to get
data out: Use a query tool with SQL or write a program to get the data from the database and
output it in the desired form. SQL query tools have limited formatting ability but can
provide quick and easy access to the data.
</P>
<P>One technique I have found very useful is to extract the data from the database into a file
that is then manipulated by an awk script to produce the exact format required. When required,
an awk script can even create the SQL statements used to query the database (specifying the
key values for the rows to select).
</P>
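<P>Generating SQL from a list of key values might look something like the following sketch.
The table name customers, the column customer_id, and the function name make_queries are all
hypothetical; substitute whatever your database and query tool expect.
</P>

```shell
# Read one key value per line and emit one SELECT statement per key.
# %d forces the key to be treated as a number, so stray text in the
# input cannot end up inside the generated statement.
make_queries() {
    awk '{ printf "SELECT * FROM customers WHERE customer_id = %d;\n", $1 }'
}

printf '1001\n1002\n' | make_queries
```

<P>The generated statements can be saved to a file and passed to the query tool, or piped into
it directly if it reads standard input.
</P>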
<P>The following example is used when the query tool places a space before a numeric field,
and that space must be removed for a program that will use the data in another system (mainframe COBOL):
</P>
<!-- CODE SNIP //-->
<PRE>{ printf("%s%s%-25.25s\n", $1, $2, $3); }
</PRE>
<!-- END CODE SNIP //-->
<P>awk automatically removes the field separator (the space character) when splitting the
input record into individual fields, and the %s format specifiers in printf are contiguous
(there are no spaces between them), so the fields are written with nothing between them.
</P>
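<P>To see the effect, the same printf can be run against a made-up record. The function name
squeeze_fields and the sample line are assumptions for illustration; tr is used only to make
the trailing padding visible.
</P>

```shell
# Join the first two fields and pad or truncate the third to exactly 25 characters.
squeeze_fields() {
    awk '{ printf("%s%s%-25.25s\n", $1, $2, $3); }'
}

# Hypothetical input with extra spaces before the numeric field.
# tr turns each padding space into a dot so the fixed width can be seen.
printf 'AB  42 Smith\n' | squeeze_fields | tr ' ' '.'
```

<P>Each output record is 29 characters long here: two for AB, two for 42, and 25 for the
left-justified third field.
</P>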
<A NAME="PAGENUM-579"><P>Page 579</P></A>
<H3><A NAME="ch27_ 48">
Commands On-the-Fly
</A></H3>
<P>The ability to pipe the output of a command into another is very powerful because the
output from the first becomes the input that the second can manipulate. A frequent use of
one-line awk programs is the creation of commands based on a list.
</P>
<P>The find command can be used to produce a list of files that match its conditions, or it
can execute a single command that takes a single command-line argument. You can see files in
a directory (and subdirectories) that match specific conditions with the following:
</P>
<!-- CODE SNIP //-->
<PRE>$ find . -name "*.prn" -print
</PRE>
<!-- END CODE SNIP //-->
<P>This outputs
</P>
<!-- CODE SNIP //-->
<PRE>./exam2.prn
./exam1.prn
./exam3.prn
</PRE>
<!-- END CODE SNIP //-->
<P>Or you can print the contents of those files with the following:
</P>
<!-- CODE SNIP //-->
<PRE>find . -name "*.prn" -exec lp {} \;
</PRE>
<!-- END CODE SNIP //-->
<P>The find command inserts the individual filenames that it locates in place of the
{} and executes the lp command. But if you wanted to execute a command that required two
arguments (to copy files to a new name) or execute multiple commands at once, you couldn't do it
with find alone. You could create a shell script that would accept the single argument and use it
in multiple places, or you could create an awk single-line program:
</P>
<!-- CODE SNIP //-->
<PRE>$ find . -name "*.prn" -print |
awk '{print "echo bak" $1; print "cp " $1 " " $1".bak";}'
</PRE>
<!-- END CODE SNIP //-->
<P>This outputs
</P>
<!-- CODE SNIP //-->
<PRE>echo bak./exam2.prn
cp ./exam2.prn ./exam2.prn.bak
echo bak./exam1.prn
cp ./exam1.prn ./exam1.prn.bak
echo bak./exam3.prn
cp ./exam3.prn ./exam3.prn.bak
</PRE>
<!-- END CODE SNIP //-->
<P>To get the commands to actually execute, you need to pipe the commands into one of the
shells. The following example uses the Korn shell; you can use the one you prefer:
</P>
<!-- CODE SNIP //-->
<PRE>$ find . -name "*.prn" -print |
awk '{print "echo bak" $1; print "cp " $1 " " $1".bak";}' |
ksh
</PRE>
<!-- END CODE SNIP //-->
<P>This outputs
</P>
<!-- CODE SNIP //-->
<PRE>bak./exam2.prn
bak./exam1.prn
bak./exam3.prn
</PRE>
<!-- END CODE SNIP //-->
<A NAME="PAGENUM-580"><P>Page 580</P></A>
<P>Before each copy takes place, the message is shown. This is also handy if you want to search
for a string (using the grep command) in the files of multiple subdirectories. Many versions of
the grep command don't show the name of the file searched unless you use wildcards (or
specify multiple filenames on the command line). The following uses
find to search for C source files, awk to create
grep commands to look for an error message, and the shell
echo command to show the file being searched:
</P>
<!-- CODE SNIP //-->
<PRE>$ find . -name "*.c" -print |
awk '{print "echo " $1; print "grep error-message " $1;}' |
ksh
</PRE>
<!-- END CODE SNIP //-->
<P>The same technique can be used to perform lint checks on source code in a series
of subdirectories. I execute the following in a shell script periodically to check all C code:
</P>
<!-- CODE SNIP //-->
<PRE>$ find . -name "*.c" -print |
awk '{print "lint " $1 " > " $1".lint"}' |
ksh
</PRE>
<!-- END CODE SNIP //-->
<P>The lint version on one system prints the code error as a heading line and then the parts
of code in question as a list below. grep shows the heading but not the detail lines. The
awk script prints all lines from the heading until the first blank line (end of the
lint section).
</P>
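<P>A script of the kind just described, printing from a heading line to the first blank line,
could be sketched as follows. The function name show_section, the heading text, and the sample
lint-style output are all made up for illustration; the format of real lint output varies from
system to system.
</P>

```shell
# Print every line from the first line matching the pattern in $1
# up to, but not including, the next blank line.
show_section() {
    awk -v pat="$1" '
    $0 ~ pat          { printing = 1 }   # heading found: start printing
    printing && /^$/  { printing = 0 }   # blank line: end of the section
    printing          { print }
    '
}

# Hypothetical lint-style output: a heading followed by detail lines.
printf 'noise\nunused variable\n  foo.c(10)\n  foo.c(22)\n\nother\n' |
show_section 'unused variable'
```

<P>Only the heading and its two detail lines are printed; the lines before the heading and
after the blank line are skipped.
</P>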
<P>When in doubt, pipe the output into more or pg to view the created commands before you
pipe them into a shell for execution.
</P>
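<P>One thing a preview can reveal is a filename containing a space: $1 holds only the first
whitespace-delimited word, so such a name would be cut short. A hedged variation uses $0 (the
whole line) and wraps each name in double quotes. The function name make_backup_commands is
hypothetical, and names containing double quotes, dollar signs, or backquotes would still need
more care than this sketch takes.
</P>

```shell
# Build one cp command per input line, quoting the name so that
# embedded spaces survive when the command is later run by a shell.
make_backup_commands() {
    awk '{ printf "cp \"%s\" \"%s.bak\"\n", $0, $0 }'
}

# Preview only: nothing is executed until the output is piped into a shell.
printf './exam 1.prn\n./exam2.prn\n' | make_backup_commands
```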
<H3><A NAME="ch27_ 49">
One Last Built-in Function: system
</A></H3>
<P>There is one more built-in function that doesn't fit in the character or numeric categories:
system. The system function executes the string passed to it as an argument. This allows you to
execute commands or scripts on-the-fly when your
awk code has the need.
</P>
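<P>The value returned by system is the exit status of the command, which the awk code can test.
The following is a small sketch under that assumption; the function name check_dir and the
directory names are chosen only for illustration.
</P>

```shell
# Run "test -d" through awk's system() and act on the exit status.
check_dir() {
    awk -v dir="$1" 'BEGIN {
        status = system("test -d \"" dir "\"")   # 0 means the command succeeded
        print (status == 0 ? "present" : "missing")
    }'
}

check_dir /
check_dir /no/such/directory
```

<P>Output that awk has written to a file may still be buffered when the command starts, so
close the file first if the command needs to read it.
</P>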
<P>You can code a report to automatically print to paper when it is complete. The code
looks something like Listing 27.8.
</P>
<P>Listing 27.8. Using the system function.
</P>
<!-- CODE //-->
<PRE>BEGIN { pageno = 0;
pageno = print_header(pageno);
printf("the page number is now %d\n", pageno);
}
# The production of the report would be coded here
END { close ("report.txt");
system ("lpr -Pmyprinter report.txt");
}
function print_header(page ) {
page++;
printf("This is the header for page %d\n", page) > "report.txt";
</PRE>
<!-- END CODE //-->
</td>
</tr>
</table>
<!-- begin footer information -->
</body></html>