CLASS="PROGRAMLISTING">grep -c txt *.sgml   # (number of lines containing "txt" in each "*.sgml" file)


#   grep -cz .
#            ^ dot
# means count (-c) zero-separated (-z) items matching "."
# that is, non-empty ones (containing at least 1 character).
#
printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz .     # 3
printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '$'   # 5
printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '^'   # 5
#
printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -c '$'    # 9
# By default, newline chars (\n) separate items to match.

# Note that the -z option is GNU "grep" specific.


# Thanks, S.C.</PRE></TD></TR></TABLE>
</P><P>The <TT CLASS="OPTION">--color</TT> (or <TT CLASS="OPTION">--colour</TT>)
option marks the matching string in color (on the console
or in an <I CLASS="FIRSTTERM">xterm</I> window). Since
<I CLASS="FIRSTTERM">grep</I> prints out each entire line
containing the matching pattern, this lets you see exactly
<SPAN CLASS="emphasis"><I CLASS="EMPHASIS">what</I></SPAN> is being matched. See also
the <TT CLASS="OPTION">-o</TT> option, which shows only the
matching portion of the line(s).</P><DIV CLASS="EXAMPLE"><HR><A NAME="FROMSH"></A><P><B>Example 15-16. Printing out the <I CLASS="FIRSTTERM">From</I> lines in
stored e-mail messages</B></P><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">#!/bin/bash
# from.sh

#  Emulates the useful "from" utility in Solaris, BSD, etc.
#  Echoes the "From" header line in all messages
#+ in your e-mail directory.


MAILDIR=~/mail/*               #  No quoting of variable. Why?
GREP_OPTS="-H -A 5 --color"    #  Show file, plus extra context lines
                               #+ and display "From" in color.
TARGETSTR="^From"              # "From" at beginning of line.

for file in $MAILDIR           #  No quoting of variable.
do
  grep $GREP_OPTS "$TARGETSTR" "$file"
  #    ^^^^^^^^^^              #  Again, do not quote this variable.
  echo
done

exit $?

#  Might wish to pipe the output of this script to 'more' or
#+ redirect it to a file . . .</PRE></TD></TR></TABLE><HR></DIV><P>When invoked with more than one target file,
<B CLASS="COMMAND">grep</B> prefixes each matching line with the
name of the file that contains it.</P><P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>grep Linux osinfo.txt misc.txt</B></TT>
<TT CLASS="COMPUTEROUTPUT">osinfo.txt:This is a file containing information about Linux.
osinfo.txt:The GPL governs the distribution of the Linux operating system.
misc.txt:The Linux operating system is steadily gaining in popularity.</TT></PRE></TD></TR></TABLE>
</P><DIV CLASS="TIP"><TABLE CLASS="TIP" WIDTH="90%" BORDER="0"><TR><TD WIDTH="25" ALIGN="CENTER" VALIGN="TOP"><IMG SRC="common/tip.png" HSPACE="5" ALT="Tip"></TD><TD ALIGN="LEFT" VALIGN="TOP"><P>To force <B CLASS="COMMAND">grep</B> to show the filename
when searching only one target file, simply give
<TT CLASS="FILENAME">/dev/null</TT> as the second file.</P><P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>grep Linux osinfo.txt /dev/null</B></TT>
<TT CLASS="COMPUTEROUTPUT">osinfo.txt:This is a file containing information about Linux.
osinfo.txt:The GPL governs the distribution of the Linux operating system.</TT></PRE></TD></TR></TABLE>
</P></TD></TR></TABLE></DIV><P>If there is a successful match, <B CLASS="COMMAND">grep</B>
returns an <A HREF="exit-status.html#EXITSTATUSREF">exit status</A>
of 0, which makes it useful in a condition test in a
script, especially in combination with the <TT CLASS="OPTION">-q</TT>
option to suppress output.
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">SUCCESS=0                      # if grep lookup succeeds
word=Linux
filename=data.file

grep -q "$word" "$filename"    #  The "-q" option
                               #+ causes nothing to echo to stdout.
if [ $? -eq $SUCCESS ]
# if grep -q "$word" "$filename"   can replace the grep call and the test above.
then
  echo "$word found in $filename"
else
  echo "$word not found in $filename"
fi</PRE></TD></TR></TABLE>
</P><P><A HREF="debugging.html#ONLINE">Example 29-6</A> demonstrates how to use
<B CLASS="COMMAND">grep</B> to search for a word pattern in
a system logfile.</P><DIV CLASS="EXAMPLE"><HR><A NAME="GRP"></A><P><B>Example 15-17. Emulating <I CLASS="FIRSTTERM">grep</I> in a script</B></P><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">#!/bin/bash
# grp.sh: Very crude reimplementation of 'grep'.

E_BADARGS=65

if [ -z "$1" ]    # Check for argument to script.
then
  echo "Usage: `basename $0` pattern"
  exit $E_BADARGS
fi

echo

for file in *     # Traverse all files in $PWD.
do
  output=$(sed -n /"$1"/p "$file")  # Command substitution.

  if [ ! -z "$output" ]           # What happens if "$output" is not quoted?
  then
    echo -n "$file: "
    echo $output
  fi              #  sed -ne "/$1/s|^|${file}: |p"  is equivalent to above.

  echo
done

echo

exit 0

# Exercises:
# ---------
# 1) Add newlines to output, if more than one match in any given file.
# 2) Add features.</PRE></TD></TR></TABLE><HR></DIV><P>How can <B CLASS="COMMAND">grep</B> search for two (or
more) separate patterns? What if you want
<B CLASS="COMMAND">grep</B> to display all lines in a file
or files that contain both <SPAN CLASS="QUOTE">"pattern1"</SPAN>
<SPAN CLASS="emphasis"><I CLASS="EMPHASIS">and</I></SPAN> <SPAN CLASS="QUOTE">"pattern2"</SPAN>?</P><P>One method is to <A HREF="special-chars.html#PIPEREF">pipe</A> the result of <B CLASS="COMMAND">grep
pattern1</B> to <B CLASS="COMMAND">grep pattern2</B>.</P><P>For example, given the following file:</P><P>
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING"># Filename: tstfile

This is a sample file.
This is an ordinary text file.
This file does not contain any unusual text.
This file is not unusual.
Here is some text.</PRE></TD></TR></TABLE>
</P><P>Now, let's search this file for lines containing
<SPAN CLASS="emphasis"><I CLASS="EMPHASIS">both</I></SPAN> <SPAN CLASS="QUOTE">"file"</SPAN> and
<SPAN CLASS="QUOTE">"text"</SPAN> . . . </P><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>grep file tstfile</B></TT>
<TT CLASS="COMPUTEROUTPUT"># Filename: tstfile
This is a sample file.
This is an ordinary text file.
This file does not contain any unusual text.
This file is not unusual.</TT>
<TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>grep file tstfile | grep text</B></TT>
<TT CLASS="COMPUTEROUTPUT">This is an ordinary text file.
This file does not contain any unusual text.</TT></PRE></TD></TR></TABLE><P>--</P><P><A NAME="EGREPREF"></A><B CLASS="COMMAND">egrep</B>
-- <I CLASS="FIRSTTERM">extended grep</I> -- is the same
as <B CLASS="COMMAND">grep -E</B>. This uses a somewhat
different, extended set of <A HREF="regexp.html#REGEXREF">Regular
Expressions</A>, which can make the search a bit more
flexible. It also allows the boolean |
(<I CLASS="FIRSTTERM">or</I>) operator.
<TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="SCREEN"><TT CLASS="PROMPT">bash$ </TT><TT CLASS="USERINPUT"><B>egrep 'matches|Matches' file.txt</B></TT>
<TT CLASS="COMPUTEROUTPUT">Line 1 matches.
Line 3 Matches.
Line 4 contains matches, but also Matches</TT></PRE></TD></TR></TABLE>
</P><P><A NAME="FGREPREF"></A><B CLASS="COMMAND">fgrep</B> --
<I CLASS="FIRSTTERM">fast grep</I> -- is the same as
<B CLASS="COMMAND">grep -F</B>. It does a literal string search
(no <A HREF="regexp.html#REGEXREF">Regular Expressions</A>),
which generally speeds things up a bit.</P><DIV CLASS="NOTE"><TABLE CLASS="NOTE" WIDTH="90%" BORDER="0"><TR><TD WIDTH="25" ALIGN="CENTER" VALIGN="TOP"><IMG SRC="common/note.png" HSPACE="5" ALT="Note"></TD><TD ALIGN="LEFT" VALIGN="TOP"><P>On some Linux distros, <B CLASS="COMMAND">egrep</B> and
<B CLASS="COMMAND">fgrep</B> are symbolic links to, or aliases for,
<B CLASS="COMMAND">grep</B>, invoked with the
<TT CLASS="OPTION">-E</TT> and <TT CLASS="OPTION">-F</TT> options,
respectively.</P></TD></TR></TABLE></DIV><DIV CLASS="EXAMPLE"><HR><A NAME="DICTLOOKUP"></A><P><B>Example 15-18.
Looking up definitions in <I CLASS="CITETITLE">Webster's 1913 Dictionary</I></B></P><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><PRE CLASS="PROGRAMLISTING">#!/bin/bash
# dict-lookup.sh

#  This script looks up definitions in the 1913 Webster's Dictionary.
#  This Public Domain dictionary is available for download
#+ from various sites, including
#+ Project Gutenberg (http://www.gutenberg.org/etext/247).
#
#  Convert it from DOS to UNIX format (only LF at end of line)
#+ before using it with this script.
#  Store the file in plain, uncompressed ASCII.
#  Set DEFAULT_DICTFILE variable below to path/filename.


E_BADARGS=65
MAXCONTEXTLINES=50                        # Maximum number of lines to show.
DEFAULT_DICTFILE="/usr/share/dict/webster1913-dict.txt"
                                          # Default dictionary file pathname.
                                          # Change this as necessary.
#  Note:
#  ----
#  This particular edition of the 1913 Webster's
#+ begins each entry with an uppercase letter
#+ (lowercase for the remaining characters).
#  Only the *very first line* of an entry begins this way,
#+ and that's why the search algorithm below works.



if [[ -z $(echo "$1" | sed -n '/^[A-Z]/p') ]]
#  Must at least specify word to look up, and
#+ it must start with an uppercase letter.
then
  echo "Usage: `basename $0` Word-to-define [dictionary-file]"
  echo
  echo "Note: Word to look up must start with capital letter,"
  echo "with the rest of the word in lowercase."
  echo "--------------------------------------------"
  echo "Examples: Abandon, Dictionary, Marking, etc."
  exit $E_BADARGS
fi


if [ -z "$2" ]                            #  May specify different dictionary
                                          #+ as an argument to this script.
then
  dictfile=$DEFAULT_DICTFILE
else
  dictfile="$2"
fi

# ---------------------------------------------------------
Definition=$(fgrep -A $MAXCONTEXTLINES "$1 \\" "$dictfile")
#                  Definitions in form "Word \..."
#
#  And, yes, "fgrep" is fast enough
#+ to search even a very large text file.


# Now, snip out just the definition block.

echo "$Definition" |
sed -n '1,/^[A-Z]/p' |
#  Print from first line of output
#+ to the first line of the next entry.
sed '$d' | sed '$d'
#  Delete last two lines of output
#+ (blank line and first line of next entry).
# ---------------------------------------------------------

exit 0

# Exercises:
# ---------
# 1)  Modify the script to accept any type of alphabetic input
#   + (uppercase, lowercase, mixed case), and convert it
#   + to an acceptable format for processing.
#
# 2)  Convert the script to a GUI application,
#   + using something like 'gdialog' or 'zenity' . . .
#     The script will then no longer take its argument(s)
#   + from the command line.
#
# 3)  Modify the script to parse one of the other available
#   + Public Domain Dictionaries, such as the U.S. Census Bureau Gazetteer.</PRE></TD></TR></TABLE><HR></DIV><P><A NAME="AGREPREF"></A></P><P><B CLASS="COMMAND">agrep</B> (<I CLASS="FIRSTTERM">approximate
grep</I>) extends the capabilities of
<B CLASS="COMMAND">grep</B> to approximate matching.
The search
string may differ by a specified number of characters
from the resulting matches. This utility is not part of
the core Linux distribution.</P><P><A NAME="ZEGREPREF"></A></P><DIV CLASS="TIP"><TABLE CLASS="TIP" WIDTH="90%" BORDER="0"><TR><TD WIDTH="25" ALIGN="CENTER" VALIGN="TOP"><IMG SRC="common/tip.png" HSPACE="5" ALT="Tip"></TD><TD ALIGN="LEFT" VALIGN="TOP"><P>To search compressed files, use
<B CLASS="COMMAND">zgrep</B>, <B CLASS="COMMAND">zegrep</B>, or
<B CLASS="COMMAND">zfgrep</B>. These also work on non-compressed
files, though slower than plain <B CLASS="COMMAND">grep</B>,
<B