<HTML>
<HEAD>
<TITLE>Linux Unleashed, Third Edition:gawk</TITLE>
<SCRIPT>
<!--
function displayWindow(url, width, height) {
var Win = window.open(url,"displayWindow",'width=' + width +
',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<!--ISBN=0672313723//-->
<!--TITLE=Linux Unleashed, Third Edition//-->
<!--AUTHOR=Tim Parker//-->
<!--PUBLISHER=Macmillan Computer Publishing//-->
<!--IMPRINT=Sams//-->
<!--CHAPTER=25//-->
<!--PAGES=450-453//-->
<!--UNASSIGNED1//-->
<!--UNASSIGNED2//-->
<CENTER>
<TABLE BORDER>
<TR>
<TD><A HREF="447-450.html">Previous</A></TD>
<TD><A HREF="../ewtoc.html">Table of Contents</A></TD>
<TD><A HREF="453-456.html">Next</A></TD>
</TR>
</TABLE>
</CENTER>
<P><BR></P>
<P>Finally, we can impose some formatting on the output lines themselves. In an earlier example, you saw the use of <TT>\n</TT> to add a newline character. Sequences like this are called <I>escape codes</I>, because the backslash tells <TT>gawk</TT> to treat the character that follows it specially instead of literally. Table 25.5 shows the important escape codes that <TT>gawk</TT> supports.</P>
<TABLE WIDTH="100%"><CAPTION ALIGN=LEFT><B>Table 25.5.</B> Escape codes.
<TR>
<TH COLSPAN="2"><HR>
<TR>
<TH WIDTH="25%" ALIGN="LEFT">Code
<TH WIDTH="75%" ALIGN="LEFT">Description
<TR>
<TH COLSPAN="2"><HR>
<TR>
<TD><TT>\a</TT>
<TD>Bell
<TR>
<TD><TT>\b</TT>
<TD>Backspace
<TR>
<TD><TT>\f</TT>
<TD>Formfeed
<TR>
<TD><TT>\n</TT>
<TD>Newline
<TR>
<TD><TT>\r</TT>
<TD>Carriage return
<TR>
<TD><TT>\t</TT>
<TD>Tab
<TR>
<TD><TT>\v</TT>
<TD>Vertical tab
<TR>
<TD><TT>\ooo</TT>
<TD>The character whose octal code is <TT>ooo</TT>
<TR>
<TD><TT>\xdd</TT>
<TD>The character whose hexadecimal code is <TT>dd</TT>
<TR>
<TD><TT>\c</TT>
<TD>The literal character <TT>c</TT> (for example, <TT>\"</TT> for a quotation mark)
<TR>
<TD COLSPAN="2"><HR>
</TABLE>
<P>You can, for example, escape a quotation mark by using the sequence <TT>\"</TT>, which places a quotation mark in the string instead of ending the string:</P>
<!-- CODE SNIP //-->
<PRE>
{printf "I said \"Hello\" and he said \"Hello\".\n"}
</PRE>
<!-- END CODE SNIP //-->
<P>Awkward-looking, perhaps, but necessary to avoid problems. You’ll see lots more escape characters used in examples later in this chapter.
</P>
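<P>As a minimal sketch of the escape codes in action, the following shell fragment runs three one-line programs from a <TT>BEGIN</TT> block, so no input file is needed. The <TT>AWK</TT> variable is an assumption made for this illustration: it picks <TT>gawk</TT> if it is installed and otherwise falls back to the system <TT>awk</TT>.</P>

```shell
#!/bin/sh
# Assumption: use gawk when available, otherwise the system awk.
AWK=$(command -v gawk || command -v awk)

# \t inserts a tab between the two words; \n ends the line.
tabbed=$("$AWK" 'BEGIN { printf "name\tshell\n" }')

# \" places a quotation mark inside the string without ending it.
quoted=$("$AWK" 'BEGIN { printf "I said \"Hello\".\n" }')

# \101 is the octal code for the letter A (decimal 65).
octal=$("$AWK" 'BEGIN { printf "\101\n" }')

printf '%s\n' "$tabbed" "$quoted" "$octal"
```

<P>Because the whole program sits inside single quotes, the shell passes the backslashes through untouched and <TT>gawk</TT> is the one that interprets them.</P>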
<H4 ALIGN="LEFT"><A NAME="Heading9"></A><FONT COLOR="#000077">Changing Field Separators</FONT></H4>
<P>As I mentioned earlier, the default field separator is whitespace (one or more spaces or tabs). This is often not convenient, as we found with the <TT>/etc/passwd</TT> file. You can change the field separator on the <TT>gawk</TT> command line by using the <TT>-F</TT> option followed by the separator you want to use:</P>
<!-- CODE SNIP //-->
<PRE>
gawk -F":" '/tparker/{print}' /etc/passwd
</PRE>
<!-- END CODE SNIP //-->
<P>This command changes the field separator to a colon and searches the <TT>/etc/passwd</TT> file for lines containing the string <TT>tparker</TT>. The new field separator is put in quotation marks so that the shell passes it to <TT>gawk</TT> unchanged. Also, the <TT>-F</TT> option (it must be a capital F) comes before the first quotation mark enclosing the pattern-action pair. If it comes after, it won’t be applied.</P>
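<P>The effect of <TT>-F</TT> is easier to see when individual fields are printed. The sketch below uses two made-up lines in <TT>/etc/passwd</TT> format (hypothetical data, not read from a real system) and compares the default whitespace splitting with colon splitting. The <TT>AWK</TT> and <TT>sample</TT> variables are assumptions introduced for the example.</P>

```shell
#!/bin/sh
# Assumption: use gawk when available, otherwise the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical sample records in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
tparker:x:501:100:Tim Parker:/home/tparker:/bin/tcsh'

# Default separator: the line is split only at the space in "Tim Parker",
# so $1 is everything up to that space.
unsplit=$(printf '%s\n' "$sample" | "$AWK" '/tparker/ { print $1 }')

# With -F":" placed before the program, $1 is the login name
# and $7 is the login shell.
split=$(printf '%s\n' "$sample" | "$AWK" -F":" '/tparker/ { print $1, $7 }')

echo "default:  $unsplit"
echo "with -F:  $split"
```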
<H4 ALIGN="LEFT"><A NAME="Heading10"></A><FONT COLOR="#000077">Metacharacters</FONT></H4>
<P>Earlier I mentioned that <TT>gawk</TT> is particular about its pattern-matching habits. The string <TT>cat</TT> matches any line that contains those three letters in sequence. Sometimes you want to be more exact in the matching. If you only want to match the word “cat” but not “concatenate,” put spaces on each side of the pattern. Note that this simple approach misses the word when it sits at the very start or end of a line or is followed by punctuation:</P>
<!-- CODE SNIP //-->
<PRE>
/ cat / {print}
</PRE>
<!-- END CODE SNIP //-->
<P>What about matching different cases? That’s where the <TT>or</TT> instruction, represented by a vertical bar, comes in.</P>
<!-- CODE SNIP //-->
<PRE>
/ cat | CAT / {print}
</PRE>
<!-- END CODE SNIP //-->
<P>The preceding pattern will match “cat” or “CAT” on a line. However, what about “Cat”? For that we need to list the acceptable characters at each position of the pattern. With <TT>gawk</TT>, we use square brackets for this. To match any combination of “cat” in upper- or lowercase, write the pattern like this:</P>
<!-- CODE SNIP //-->
<PRE>
/ [Cc][Aa][Tt] / {print}
</PRE>
<!-- END CODE SNIP //-->
<P>This can get pretty awkward, but it’s seldom necessary. To match just “Cat” and “cat,” for example, use the following pattern:
</P>
<!-- CODE SNIP //-->
<PRE>
/ [Cc]at / {print}
</PRE>
<!-- END CODE SNIP //-->
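<P>To see how these patterns differ, the following sketch runs three of them over the same four made-up lines. The input text and the <TT>AWK</TT> and <TT>lines</TT> variables are assumptions chosen for illustration.</P>

```shell
#!/bin/sh
# Assumption: use gawk when available, otherwise the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical input lines.
lines='the cat sat
concatenate files
a Cat naps
CAT scan'

# Bare pattern: also matches the "cat" inside "concatenate".
bare=$(printf '%s\n' "$lines" | "$AWK" '/cat/ { print }')

# Spaces on both sides: only the standalone lowercase word.
spaced=$(printf '%s\n' "$lines" | "$AWK" '/ cat / { print }')

# Bracket set for the first letter: "cat" or "Cat" as a standalone word.
either=$(printf '%s\n' "$lines" | "$AWK" '/ [Cc]at / { print }')

printf 'bare:\n%s\n\nspaced:\n%s\n\neither:\n%s\n' "$bare" "$spaced" "$either"
```

<P>None of the space-delimited patterns would pick up the line “CAT scan” even with <TT>[Cc][Aa][Tt]</TT>, because the word starts the line and so has no space in front of it.</P>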
<P>A useful matching operator is the tilde (<TT>~</TT>). This is used when you want to look for a match in a particular field in a record. Consider the following example:</P>
<!-- CODE SNIP //-->
<PRE>
$5 ~ /tparker/
</PRE>
<!-- END CODE SNIP //-->
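<P>This pattern matches any record whose fifth field contains the string <TT>tparker</TT>. It is similar to the <TT>==</TT> operator, except that <TT>==</TT> requires the whole field to equal the string, whereas <TT>~</TT> succeeds if the pattern matches anywhere in the field. The matching operator can also be negated by writing it as <TT>!~</TT>:</P>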
<!-- CODE SNIP //-->
<PRE>
$5 !~ /tparker/
</PRE>
<!-- END CODE SNIP //-->
<P>This pattern finds any record whose fifth field does not contain <TT>tparker</TT>.</P>
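<P>The difference between <TT>~</TT>, <TT>==</TT>, and <TT>!~</TT> shows up as soon as a field holds more than the search string. The sketch below uses three made-up colon-separated records (hypothetical data) and prints the first field of each record that passes each test. The <TT>AWK</TT> and <TT>data</TT> variables are assumptions introduced for the example.</P>

```shell
#!/bin/sh
# Assumption: use gawk when available, otherwise the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical records; the fifth field is the one being tested.
data='a:x:1:1:tparker
b:x:2:2:tparker-admin
c:x:3:3:root'

# ~ succeeds when the pattern occurs anywhere in field 5 (records a and b).
contains=$(printf '%s\n' "$data" | "$AWK" -F":" '$5 ~ /tparker/ { print $1 }')

# == succeeds only when field 5 is exactly the string (record a).
equals=$(printf '%s\n' "$data" | "$AWK" -F":" '$5 == "tparker" { print $1 }')

# !~ succeeds when the pattern occurs nowhere in field 5 (record c).
lacks=$(printf '%s\n' "$data" | "$AWK" -F":" '$5 !~ /tparker/ { print $1 }')

printf 'contains: %s\nequals: %s\nlacks: %s\n' \
    "$(echo $contains)" "$equals" "$lacks"
```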
<P>A few characters (called <I>metacharacters</I>) have special meaning to <TT>gawk</TT>. Many of these metacharacters are familiar to shell users because they are carried over from UNIX shells. The metacharacters shown in Table 25.6 can be used in <TT>gawk</TT> patterns.</P>
<TABLE WIDTH="100%"><CAPTION ALIGN=LEFT><B>Table 25.6.</B> Metacharacters.
<TR>
<TH COLSPAN="4"><HR>
<TR>
<TH WIDTH="20%" ALIGN="LEFT">Metacharacter
<TH WIDTH="25%" ALIGN="LEFT">Meaning
<TH WIDTH="20%" ALIGN="LEFT">Example
<TH WIDTH="35%" ALIGN="LEFT">Meaning of Example
<TR>
<TH COLSPAN="4"><HR>
<TR>
<TD VALIGN="TOP"><TT>^</TT>
<TD>The beginning of the field
<TD VALIGN="TOP"><TT>$3 ~ /^b/</TT>
<TD>Matches if the third field starts with b
<TR>
<TD VALIGN="TOP"><TT>$</TT>
<TD VALIGN="TOP">The end of the field
<TD VALIGN="TOP"><TT>$3 ~ /b$/</TT>
<TD>Matches if the third field ends with b
<TR>
<TD VALIGN="TOP"><TT>.</TT>
<TD VALIGN="TOP">Matches any single character
<TD VALIGN="TOP"><TT>$3 ~ /i.m/</TT>
<TD>Matches any record whose third field contains i, any one character, and then m
<TR>
<TD><TT>|</TT>
<TD>Or.
<TD><TT>/cat|CAT/</TT>
<TD>Matches cat or CAT
<TR>
<TD VALIGN="TOP"><TT>*</TT>
<TD>Zero or more repetitions of a character
<TD VALIGN="TOP"><TT>/UNI*X/</TT>
<TD VALIGN="TOP">Matches UNX, UNIX, UNIIX, UNIIIX, and so on
<TR>
<TD VALIGN="TOP"><TT>+</TT>
<TD>One or more repetitions of a character
<TD VALIGN="TOP"><TT>/UNI+X/</TT>
<TD>Matches UNIX, UNIIX, and so on, but not UNX
<TR>
<TD VALIGN="TOP"><TT>{a,b}</TT>
<TD>Between a and b repetitions of the preceding character (both integers); older <TT>gawk</TT> versions require the <TT>--re-interval</TT> option
<TD VALIGN="TOP"><TT>/UNI{1,3}X/</TT>
<TD VALIGN="TOP">Matches only UNIX, UNIIX, and UNIIIX
<TR>
<TD VALIGN="TOP"><TT>?</TT>
<TD>Zero or one repetition of the preceding character
<TD VALIGN="TOP"><TT>/UNI?X/</TT>
<TD VALIGN="TOP">Matches UNX and UNIX only
<TR>
<TD VALIGN="TOP"><TT>[]</TT>
<TD>Any one character in the set
<TD VALIGN="TOP"><TT>/I[BDG]M/</TT>
<TD>Matches IBM, IDM, and IGM
<TR>
<TD VALIGN="TOP"><TT>[^]</TT>
<TD VALIGN="TOP">Not in the set
<TD VALIGN="TOP"><TT>/I[^DE]M/</TT>
<TD>Matches any three-character string that starts with I and ends with M, except IDM and IEM
<TR>
<TD COLSPAN="4"><HR>
</TABLE>
<P>Some of these metacharacters are used frequently. You will see some examples later in this chapter.
</P><P><BR></P>
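<P>As a quick check of the repetition metacharacters, the following sketch runs <TT>*</TT>, <TT>+</TT>, and <TT>?</TT> against the same short word list, then uses <TT>^</TT> with the <TT>~</TT> operator to test the start of a field. The word list and the <TT>AWK</TT> and <TT>words</TT> variables are assumptions made for illustration.</P>

```shell
#!/bin/sh
# Assumption: use gawk when available, otherwise the system awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical input, one word per line.
words='UNX
UNIX
UNIIX
LINUX'

# * allows zero or more I characters between UN and X.
star=$(printf '%s\n' "$words" | "$AWK" '/UNI*X/ { print }')

# + requires at least one I, so UNX drops out.
plus=$(printf '%s\n' "$words" | "$AWK" '/UNI+X/ { print }')

# ? allows at most one I, so UNIIX drops out.
optional=$(printf '%s\n' "$words" | "$AWK" '/UNI?X/ { print }')

# ^ anchors the match to the beginning of the first field.
starts_l=$(printf '%s\n' "$words" | "$AWK" '$1 ~ /^L/ { print }')

printf 'star: %s\nplus: %s\noptional: %s\nstarts with L: %s\n' \
    "$(echo $star)" "$(echo $plus)" "$(echo $optional)" "$starts_l"
```

<P>LINUX is never matched by the three repetition patterns because the letters U and N do not appear in that order in the word.</P>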
</td>
</tr>
</table>
<!-- begin footer information -->
</body></html>