📄 textproc.html
CLASS="COMMAND">egrep</B>, <B CLASS="COMMAND">fgrep</B>. They are handy for searching through a mixed set of files, some compressed, some not.</P>
<P><A NAME="BZGREPREF"></A></P>
<P>To search <A HREF="filearchiv.html#BZIPREF">bzipped</A> files, use <B CLASS="COMMAND">bzgrep</B>.</P></TD></TR></TABLE></DIV></DD>
<DT><A NAME="LOOKREF"></A><B CLASS="COMMAND">look</B></DT>
<DD><P>The command <B CLASS="COMMAND">look</B> works like <B CLASS="COMMAND">grep</B>, but does a lookup on a <SPAN CLASS="QUOTE">"dictionary,"</SPAN> a sorted word list. By default, <B CLASS="COMMAND">look</B> searches for a match in <TT CLASS="FILENAME">/usr/dict/words</TT> (<TT CLASS="FILENAME">/usr/share/dict/words</TT> on many current systems), but a different dictionary file may be specified.</P>
<DIV CLASS="EXAMPLE"><HR><A NAME="LOOKUP"></A>
<P><B>Example 15-19. Checking words in a list for validity</B></P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">#!/bin/bash
# lookup: Does a dictionary lookup on each word in a data file.

file=words.data  # Data file from which to read words to test.

echo

while [ "$word" != end ]  # Last word in data file.
do               # ^^^
  read word      # From data file, because of redirection at end of loop.
  look "$word" &gt; /dev/null  # Don't want to display lines in dictionary file.
  lookup=$?      # Exit status of 'look' command.

  if [ "$lookup" -eq 0 ]
  then
    echo "\"$word\" is valid."
  else
    echo "\"$word\" is invalid."
  fi

done &lt;"$file"    # Redirects stdin to $file, so "reads" come from there.

echo

exit 0

# ----------------------------------------------------------------
# Code below line will not execute because of "exit" command above.


# Stephane Chazelas proposes the following, more concise alternative:

while read word &amp;&amp; [[ $word != end ]]
do if look "$word" &gt; /dev/null
   then echo "\"$word\" is valid."
   else echo "\"$word\" is invalid."
   fi
done &lt;"$file"

exit 0</PRE></TD></TR></TABLE><HR></DIV></DD>
<DT><B CLASS="COMMAND">sed</B>, <B CLASS="COMMAND">awk</B></DT>
<DD><P>Scripting languages especially suited for parsing text files and command output. May be embedded singly or in combination in pipes and shell scripts.</P></DD>
<DT><B CLASS="COMMAND"><A HREF="sedawk.html#SEDREF">sed</A></B></DT>
<DD><P>Non-interactive <SPAN CLASS="QUOTE">"stream editor"</SPAN> that permits using many <B CLASS="COMMAND">ex</B> commands in <A HREF="timedate.html#BATCHPROCREF">batch</A> mode. It finds many uses in shell scripts.</P></DD>
<DT><B CLASS="COMMAND"><A HREF="awk.html#AWKREF">awk</A></B></DT>
<DD><P>Programmable file extractor and formatter, good for manipulating and/or extracting fields (columns) in structured text files. Its syntax is similar to C.</P></DD>
<DT><A NAME="WCREF"></A><B CLASS="COMMAND">wc</B></DT>
<DD><P><I CLASS="FIRSTTERM">wc</I> gives a <SPAN CLASS="QUOTE">"word count"</SPAN> on a file or I/O stream:
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>wc /usr/share/doc/sed-4.1.2/README</B></TT>
<TT CLASS="COMPUTEROUTPUT">13  70  447 README</TT>
[13 lines  70 words  447 characters]</PRE></TD></TR></TABLE></P>
<P><TT CLASS="USERINPUT"><B>wc -w</B></TT> gives only the word count.</P>
<P><TT CLASS="USERINPUT"><B>wc -l</B></TT> gives only the line count.</P>
<P><TT CLASS="USERINPUT"><B>wc -c</B></TT> gives only the byte count.</P>
<P><TT CLASS="USERINPUT"><B>wc -m</B></TT> gives only the character count.</P>
<P><TT CLASS="USERINPUT"><B>wc -L</B></TT> gives only the length of the longest line.</P>
<P>Using <B CLASS="COMMAND">wc</B> to count how many <TT CLASS="FILENAME">.txt</TT> files are in the current working directory:
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">$ ls *.txt | wc -l
#  Will work as long as none of the "*.txt" files
#+ have a linefeed embedded in their name.

#  Alternative ways of doing this are:
#      find . -maxdepth 1 -name \*.txt -print0 | grep -cz .
#      (shopt -s nullglob; set -- *.txt; echo $#)

#  Thanks, S.C.</PRE></TD></TR></TABLE>
</P>
<P>Using <B CLASS="COMMAND">wc</B> to total up the size of all the files whose names begin with letters in the range d - h:
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>wc [d-h]* | grep total | awk '{print $3}'</B></TT>
<TT CLASS="COMPUTEROUTPUT">71832</TT>
</PRE></TD></TR></TABLE>
</P>
<P>Using <B CLASS="COMMAND">wc</B> to count the lines containing the word <SPAN CLASS="QUOTE">"Linux"</SPAN> in the main source file for this book:
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>grep Linux abs-book.sgml | wc -l</B></TT>
<TT CLASS="COMPUTEROUTPUT">50</TT>
</PRE></TD></TR></TABLE>
</P>
<P>See also <A HREF="filearchiv.html#EX52">Example 15-38</A> and <A HREF="redircb.html#REDIR4">Example 19-8</A>.</P>
<P>Certain commands include some of the functionality of <B CLASS="COMMAND">wc</B> as options.
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">... | grep foo | wc -l
# This frequently used construct can be more concisely rendered.

... | grep -c foo
# Just use the "-c" (or "--count") option of grep.

# Thanks, S.C.</PRE></TD></TR></TABLE></P></DD>
<DT><A NAME="TRREF"></A><B CLASS="COMMAND">tr</B></DT>
<DD><P>Character translation filter.</P>
<DIV CLASS="CAUTION"><TABLE CLASS="CAUTION" WIDTH="90%" BORDER="0"><TR><TD WIDTH="25" ALIGN="CENTER" VALIGN="TOP"><IMG SRC="common/caution.png" HSPACE="5" ALT="Caution"></TD><TD ALIGN="LEFT" VALIGN="TOP"><P><A HREF="special-chars.html#UCREF">Must use quoting and/or brackets</A>, as appropriate. Quotes prevent the shell from reinterpreting the special characters in <B CLASS="COMMAND">tr</B> command sequences. Brackets should be quoted to prevent expansion by the shell.
</P></TD></TR></TABLE></DIV>
<P>Either <TT CLASS="USERINPUT"><B>tr "A-Z" "*" &lt;filename</B></TT> or <TT CLASS="USERINPUT"><B>tr A-Z \* &lt;filename</B></TT> changes all the uppercase letters in <TT CLASS="FILENAME">filename</TT> to asterisks (writes to <TT CLASS="FILENAME">stdout</TT>). On some systems this may not work, but <TT CLASS="USERINPUT"><B>tr A-Z '[**]'</B></TT> will.</P>
<P><A NAME="TROPTIONS"></A></P>
<P>The <TT CLASS="OPTION">-d</TT> option deletes a range of characters.
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">echo "abcdef"                 # abcdef
echo "abcdef" | tr -d b-d     # aef


tr -d 0-9 &lt;filename
# Deletes all digits from the file "filename".</PRE></TD></TR></TABLE></P>
<P>The <TT CLASS="OPTION">--squeeze-repeats</TT> (or <TT CLASS="OPTION">-s</TT>) option deletes all but the first instance of a run of consecutive identical characters. This option is useful for removing excess <A HREF="special-chars.html#WHITESPACEREF">whitespace</A>.
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>echo "XXXXX" | tr --squeeze-repeats 'X'</B></TT>
<TT CLASS="COMPUTEROUTPUT">X</TT></PRE></TD></TR></TABLE></P>
<P>The <TT CLASS="OPTION">-c</TT> <SPAN CLASS="QUOTE">"complement"</SPAN> option <I CLASS="FIRSTTERM">inverts</I> the character set to match. With this option, <B CLASS="COMMAND">tr</B> acts only upon those characters <SPAN CLASS="emphasis"><I CLASS="EMPHASIS">not</I></SPAN> matching the specified set.</P>
<P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>echo "acfdeb123" | tr -c b-d +</B></TT>
<TT CLASS="COMPUTEROUTPUT">+c+d+b++++</TT></PRE></TD></TR></TABLE>
</P>
<P>Note that <B CLASS="COMMAND">tr</B> recognizes <A HREF="regexp.html#POSIXREF">POSIX character classes</A>.
<A NAME="AEN10535" HREF="#FTN.AEN10535">[1]</A>
</P>
<P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>echo "abcd2ef1" | tr '[:alpha:]' -</B></TT>
<TT CLASS="COMPUTEROUTPUT">----2--1</TT>
</PRE></TD></TR></TABLE>
</P>
<DIV CLASS="EXAMPLE"><HR><A NAME="EX49"></A>
<P><B>Example 15-20. <I CLASS="FIRSTTERM">toupper</I>: Transforms a file to all uppercase.</B></P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">#!/bin/bash
# Changes a file to all uppercase.

E_BADARGS=65

if [ -z "$1" ]  # Standard check for command line arg.
then
  echo "Usage: `basename $0` filename"
  exit $E_BADARGS
fi

tr a-z A-Z &lt;"$1"

# Same effect as above, but using POSIX character set notation:
#        tr '[:lower:]' '[:upper:]' &lt;"$1"
# Thanks, S.C.

exit 0

#  Exercise:
#  Rewrite this script to give the option of changing a file
#+ to *either* upper or lowercase.</PRE></TD></TR></TABLE><HR></DIV>
<DIV CLASS="EXAMPLE"><HR><A NAME="LOWERCASE"></A>
<P><B>Example 15-21. <I CLASS="FIRSTTERM">lowercase</I>: Changes all filenames in working directory to lowercase.</B></P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">#!/bin/bash
#
#  Changes every filename in working directory to all lowercase.
#
#  Inspired by a script of John Dubois,
#+ which was translated into Bash by Chet Ramey,
#+ and considerably simplified by the author of the ABS Guide.


for filename in *                # Traverse all files in directory.
do
   fname=`basename $filename`
   n=`echo $fname | tr A-Z a-z`  # Change name to lowercase.
   if [ "$fname" != "$n" ]       # Rename only files not already lowercase.
   then
     mv $fname $n
   fi
done

exit $?


# Code below this line will not execute because of "exit".
#--------------------------------------------------------#
# To run it, delete script above line.

# The above script will not work on filenames containing blanks or newlines.
# Stephane Chazelas therefore suggests the following alternative: