<HTML>
<HEAD>
<TITLE>Developer.com - Online Reference Library - 0672311739:RED HAT LINUX 2ND EDITION:gawk Programming</TITLE>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
var Win = window.open(url,"displayWindow",'width=' + width +
',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<!-- ISBN=0672311739 //-->
<!-- TITLE=RED HAT LINUX 2ND EDITION //-->
<!-- AUTHOR=DAVID PITTS ET AL //-->
<!-- PUBLISHER=MACMILLAN //-->
<!-- IMPRINT=SAMS PUBLISHING //-->
<!-- PUBLICATION DATE=1998 //-->
<!-- CHAPTER=27 //-->
<!-- PAGES=0545-0582 //-->
<!-- UNASSIGNED1 //-->
<!-- UNASSIGNED2 //-->
<P><CENTER>
<a href="0558-0560.html">Previous</A> | <a href="../ewtoc.html">Table of Contents</A> | <a href="0564-0567.html">Next</A>
</CENTER></P>
<A NAME="PAGENUM-561"><P>Page 561</P></A>
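<P>The match(string, reg) function determines whether
string contains a substring matched by the regular expression
reg. If there is a match, the position of the first character of the match is returned, and the variables
RSTART and RLENGTH are set to that position and to the length of the match. If there is no match, the function returns 0, RSTART is set to 0, and RLENGTH is set to -1.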
</P>
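<P>As a minimal sketch of how the return value and the two variables fit together, the following program looks for the first run of digits in a string. The variable names text and pos and the sample string are made up for illustration.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    text = "order 4711 shipped"
    pos = match(text, /[0-9]+/)     # position of the first digit, or 0
    if (pos != 0)
        # RSTART equals pos; RLENGTH is the length of the matched text
        print pos, RLENGTH, substr(text, RSTART, RLENGTH)
}
</PRE>
<!-- END CODE SNIP //-->
<P>With this sample string the program prints 7 4 4711.
</P>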
<P>The printf(format, variables) function writes formatted data converting
variables based on the format string. This function is very similar to the C
printf() function. More information about this function and the formatting strings is provided in the section
"printf" later in this chapter.
</P>
<P>The split(string, store, delim) function splits
string into elements of the array store based on the
delim string. The number of elements in store is returned. If you omit the
delim string, FS is used. To split a slash (/) delimited date into its component parts, code the following:
</P>
<!-- CODE SNIP //-->
<PRE>split("08/12/1962", results, "/");
</PRE>
<!-- END CODE SNIP //-->
<P>After the function call, results[1] contains 08,
results[2] contains 12, and results[3] contains
1962. The array filled by split always begins with element one, not zero.
The same call works for strings that contain text rather than numbers.
</P>
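<P>The following sketch uses the return value of split and also shows the text case, where the delimiter is omitted and FS (whitespace by default) is used. The names n and colors and the color string are illustrative only.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    n = split("08/12/1962", results, "/")
    print n, results[1], results[2], results[3]    # 3 08 12 1962

    # No delimiter given, so the default FS (whitespace) applies
    split("red green blue", colors)
    print colors[2]                                # green
}
</PRE>
<!-- END CODE SNIP //-->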
<P>The sprintf(format, variables) function behaves like the
printf function except that it returns the result string instead of writing output. It produces formatted data converting
variables based on the format string. This function is very similar to the C
sprintf() function. More information about this function and the formatting strings is provided in the
"printf" section of this chapter.
</P>
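<P>As a small illustration, sprintf can build a string that is stored and used later. The variable name label and the values are assumptions chosen for the example.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    # Left-justify the text in 10 columns; print the number 8 wide, 2 decimals
    label = sprintf("%-10s %8.2f", "Total:", 10.15)
    # Nothing has been written yet; label simply holds the formatted string
    print "[" label "]"
}
</PRE>
<!-- END CODE SNIP //-->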
<P>The strftime(format, timestamp) function returns a formatted date or time based on the
format string; timestamp is the number of seconds since midnight on January 1, 1970. The
systime function returns a value in this form. The format is
the same as the C strftime() function.
</P>
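<P>A brief sketch of the two time functions working together follows. The first line of output depends on the current time and on the local time zone. The second call passes a third argument, which current gawk releases treat as a request for UTC; that argument is assumed to be available in your version of gawk.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    now = systime()                            # seconds since the epoch
    print strftime("%Y-%m-%d %H:%M:%S", now)   # local date and time

    # Timestamp 0 formatted as UTC is the start of the epoch: 1970-01-01
    print strftime("%Y-%m-%d", 0, 1)
}
</PRE>
<!-- END CODE SNIP //-->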
<P>The sub(reg, string, target) function substitutes
string for the first occurrence of the text matched by the regular expression
reg within target. The number of substitutions made (0 or 1) is returned by the function. If
target is omitted, the input record, $0, is the target. This is patterned after the
substitute command in the ed text editor.
</P>
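<P>The following sketch shows that only the first occurrence is changed and that the target variable is modified in place. The names line and n are illustrative.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    line = "cat cat cat"
    n = sub(/cat/, "dog", line)    # replaces the first match only
    print n, line                  # 1 dog cat cat
}
</PRE>
<!-- END CODE SNIP //-->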
<P>The substr(string, position, len) function allows you to extract a substring based on a
starting position and length. Character positions are counted from 1. If you omit the len parameter, the rest of the string from position onward is returned.
</P>
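<P>For example, reusing the date string from the split example (the variable name date is an assumption), both forms of the call look like this:
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    date = "08/12/1962"
    print substr(date, 7, 4)    # 1962
    print substr(date, 4)       # 12/1962 (len omitted)
}
</PRE>
<!-- END CODE SNIP //-->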
<P>The tolower(string) function returns the uppercase alphabetic characters in
string converted to lowercase. Any other characters are returned without any conversion.
</P>
<P>The toupper(string) function returns the lowercase alphabetic characters in
string converted to uppercase. Any other characters are returned without any
conversion.
</P>
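<P>A short sketch of both case functions follows; note that the digits, comma, and spaces pass through unchanged. The sample string is made up for illustration.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    state = "Pennsylvania, PA 19104"
    print tolower(state)    # pennsylvania, pa 19104
    print toupper(state)    # PENNSYLVANIA, PA 19104
}
</PRE>
<!-- END CODE SNIP //-->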
<H5><A NAME="ch27_ 20">
Special String Constants
</A></H5>
<P>awk supports special string constants for characters that cannot be entered from the keyboard or that have
special meaning. If you wanted to have a double quote
(") character as a string constant (x =
"""), how would you prevent awk from thinking the second one (the one you really want) is the
end of the string? The answer is by escaping, or telling
awk that the next character has special meaning. This is done with the backslash
(\) character, as in the rest of UNIX, so the assignment is written x = "\"".
</P>
<A NAME="PAGENUM-562"><P>Page 562</P></A>
<P>Table 27.6 shows most of the constants that
gawk supports.
</P>
<P>Table 27.6. gawk special string constants.
</P>
<TABLE WIDTH="360">
<TR><TD>
Expression
</TD><TD>
Meaning
</TD></TR>
<TR><TD>
\\
</TD><TD>
A literal backslash
</TD></TR>
<TR><TD>
\a
</TD><TD>
The alert or bell character
</TD></TR>
<TR><TD>
\b
</TD><TD>
Backspace
</TD></TR>
<TR><TD>
\f
</TD><TD>
Formfeed
</TD></TR>
<TR><TD>
\n
</TD><TD>
Newline
</TD></TR>
<TR><TD>
\r
</TD><TD>
Carriage return
</TD></TR>
<TR><TD>
\t
</TD><TD>
Tab
</TD></TR>
<TR><TD>
\v
</TD><TD>
Vertical tab
</TD></TR>
<TR><TD>
\"
</TD><TD>
Double quote
</TD></TR>
<TR><TD>
\xNN
</TD><TD>
The character whose hexadecimal code is NN
</TD></TR>
<TR><TD>
\NNN
</TD><TD>
The character whose octal code is NNN (one to three octal digits)
</TD></TR>
</TABLE>
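<P>A few of these constants are sketched in use below. The strings are made up for illustration, and the hexadecimal form may not be recognized by other awk implementations.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    x = "\""                # a string holding one double quote
    print x
    print "name\tvalue"     # a tab between the two words
    print "back\\slash"     # prints back\slash
    print "\x41\101"        # hexadecimal 41 and octal 101 are both A
}
</PRE>
<!-- END CODE SNIP //-->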
<H4><A NAME="ch27_ 21">
Arrays
</A></H4>
<P>When you have more than one related piece of data, you have two choices—you can
create multiple variables, or you can use an array. An array enables you to keep a collection of
related data together.
</P>
<P>You access individual elements within an array by enclosing the subscript within square
brackets ([]). In general, you can use an array element any place you can use a regular variable.
</P>
<P>Arrays in awk have special capabilities that are lacking in most other languages: They are
dynamic, they are sparse, and the subscript is actually a string. You don't have to declare a
variable to be an array, and you don't have to define the maximum number of
elements; when you use an element for the first time, it is created dynamically. Because of this, a block of
memory is not allocated in advance. In many other languages, if you want to accumulate sales
for each month in a year, 12 elements are allocated, even if you are only processing
December at the moment. awk arrays are sparse; if you are working with December, only that element
will exist, not the other 11 (empty) months.
</P>
<P>In my experience, the last capability is the most useful: the subscript being a string. In
most programming languages, if you want to accumulate data based on a string (like totaling
sales by state or country), you need to have two arrays: the state or country name (a string) and
the numeric sales array. You search the state or country name array for a match and then use the
same element of the sales array. awk performs this for you. You create an element in the sales
array with the state or country name as the subscript and address it directly like the following:
</P>
<A NAME="PAGENUM-563"><P>Page 563</P></A>
<!-- CODE SNIP //-->
<PRE>total_sales["Pennsylvania"] = 10.15
</PRE>
<!-- END CODE SNIP //-->
<P>This takes much less programming and is much easier to read (and maintain) than the
search-one-array-and-change-another method. An array subscripted by strings in this way is known as an associative array.
</P>
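<P>As a minimal sketch of accumulating totals by name, the following program adds several amounts into elements that are created on first use. The amounts are made up for illustration.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    total_sales["Pennsylvania"] = 10.15
    total_sales["Pennsylvania"] += 4.85    # add to an existing element
    total_sales["Delaware"] += 20          # created on first use, starts empty
    print total_sales["Pennsylvania"], total_sales["Delaware"]    # 15 20
}
</PRE>
<!-- END CODE SNIP //-->
<P>In a program that reads data, the same idea is usually written as a single action such as { total_sales[$1] += $2 }, assuming the name is in the first field and the amount in the second.
</P>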
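<P>However, awk does not directly support multidimensional
arrays. It simulates them: a subscript list such as [i, j] is joined into a single string subscript, with the value of the SUBSEP variable between the parts.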
</P>
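<P>The usual workaround is a comma-separated subscript list, which awk joins into one string subscript. The following sketch assumes a hypothetical sales array keyed by state and month.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    sales["Pennsylvania", "December"] = 10.15
    # The membership test needs parentheses around the subscript list
    if (("Pennsylvania", "December") in sales)
        print sales["Pennsylvania", "December"]    # 10.15
}
</PRE>
<!-- END CODE SNIP //-->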
<H5><A NAME="ch27_ 22">
Array Functions
</A></H5>
<P>gawk provides two language constructs specifically for use with arrays:
in and delete. The in operator tests for membership in an array. The
delete statement removes elements from an array.
</P>
<P>If you have an array with a subscript of states and want to determine if a specific state is in
the list, you would put the following within a conditional test (more about conditional tests in
the "Conditional Flow" section):
</P>
<!-- CODE SNIP //-->
<PRE>"Delaware" in total_sales
</PRE>
<!-- END CODE SNIP //-->
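<P>The following sketch places the test inside an if statement. It also illustrates why the in test is preferable to comparing an element with an empty string: merely referring to an element creates it. The variable x is a throwaway name used only for the demonstration.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    total_sales["Pennsylvania"] = 10.15
    if ("Delaware" in total_sales) print "found"; else print "not found"

    x = total_sales["Delaware"]    # this reference creates an empty element
    if ("Delaware" in total_sales) print "found"; else print "not found"
}
</PRE>
<!-- END CODE SNIP //-->
<P>The first test prints not found; the second prints found, because the reference in between created the element.
</P>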
<P>You can also use the in operator within a loop to step through the elements in an array
(especially if the array is sparse or associative). This is a special case of the
for loop and is described in the section "The
for statement," later in the chapter.
</P>
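<P>As a brief preview, a sketch of that loop follows. The loop variable state receives each subscript in turn; the order in which subscripts are visited is not defined, so the two output lines may appear in either order.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    total_sales["Pennsylvania"] = 10.15
    total_sales["Delaware"] = 20
    for (state in total_sales)
        print state, total_sales[state]
}
</PRE>
<!-- END CODE SNIP //-->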
<P>To delete an array element (the state of Delaware, for example), you code the following:
</P>
<!-- CODE SNIP //-->
<PRE>delete total_sales["Delaware"]
</PRE>
<!-- END CODE SNIP //-->
<P><CENTER>
<TABLE BGCOLOR="#FFFF99">
<TR><TD><B>
CAUTION
</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
When an array element is deleted, it is removed from memory, and its data is
no longer available.
</BLOCKQUOTE></TD></TR>
</TABLE></CENTER>
</P>
<P>It is always good practice to delete elements in an array, or entire arrays, when you are
done with them. Although memory is cheap and large quantities are available (especially with
virtual memory), you will eventually run out if you don't clean up.
</P>
<P><CENTER>
<TABLE BGCOLOR="#FFFF99">
<TR><TD><B>
NOTE
</B></TD></TR>
<TR><TD>
<BLOCKQUOTE>
In traditional awk, you cannot delete an entire array directly; you must loop through all
array elements and delete each one. gawk, however, accepts the following statement as an extension, and it removes every element of the array:
delete total_sales
</BLOCKQUOTE></TD></TR>
</TABLE></CENTER>
</P>
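<P>The element-by-element loop mentioned in the note can be sketched as follows. The counter n is an illustrative name, used only to confirm that the array is empty afterward.
</P>
<!-- CODE SNIP //-->
<PRE>BEGIN {
    total_sales["Pennsylvania"] = 10.15
    total_sales["Delaware"] = 20

    # Delete every element, one subscript at a time
    for (state in total_sales)
        delete total_sales[state]

    # Count what is left
    n = 0
    for (state in total_sales)
        n++
    print n    # 0
}
</PRE>
<!-- END CODE SNIP //-->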
</td>
</tr>
</table>
<!-- begin footer information -->
</body></html>