
📄 ch26_04.htm

📁 the unix power tools
><P CLASS="para">was the first word on a line,</P></LI><LI CLASS="listitem"><P CLASS="para">the second letter was a lowercase letter,</P></LI><LI CLASS="listitem"><P CLASS="para">was three letters long (followed by a space character (<IMG SRC="../chars/squ.gif" ALT=" ">)), and</P></LI><LI CLASS="listitem"><P CLASS="para">the third letter was a lowercase vowel,</P></LI></UL><P CLASS="para">the regular expression would be: <CODE CLASS="literal">^T[a-z][aeiou]</CODE><IMG SRC="../chars/squ.gif" ALT=" ">.</P><P CLASS="para">[To be specific: A range is a contiguous series of characters, from low to high, in the <SPAN CLASS="link">ASCII chart (<A CLASS="linkend" HREF="ch51_03.htm" TITLE="ASCII Characters: Listing and Getting Values ">51.3</A>)</SPAN>. For example, <CODE CLASS="literal">[z-a]</CODE> is <EM CLASS="emphasis">not</EM> a range because it's backwards. The range <CODE CLASS="literal">[A-z]</CODE> does match both uppercase and lowercase letters, but it also matches the six characters that fall between uppercase and lowercase letters in the ASCII chart: <CODE CLASS="literal">[</CODE>, <CODE CLASS="literal">\</CODE>, <CODE CLASS="literal">]</CODE>, <CODE CLASS="literal">^</CODE>, <CODE CLASS="literal">_</CODE>, and <CODE CLASS="literal">`</CODE>. <EM CLASS="emphasis">-JP</EM>&nbsp;]</P></DIV><DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="UPT-ART-427-SECT-1.5">26.4.5 Exceptions in a Character Set </A></H3><P CLASS="para"><A CLASS="indexterm" NAME="AUTOID-28629"></A><A CLASS="indexterm" NAME="AUTOID-28632"></A>You can easily search for all characters except those in square brackets by putting a caret (<CODE CLASS="literal">^</CODE>) as the first character after the left square bracket (<CODE CLASS="literal">[</CODE>). To match all characters except lowercase vowels use: <CODE CLASS="literal">[^aeiou]</CODE>.</P><P CLASS="para">Like the anchors in places that can't be considered an anchor, the right square bracket (<CODE CLASS="literal">]</CODE>) and dash (<CODE CLASS="literal">-</CODE>) do not have a special 
meaning if they directly follow a&nbsp;<CODE CLASS="literal">[</CODE>. <A CLASS="xref" HREF="ch26_04.htm#UPT-ART-427-TAB-1" TITLE="Regular Expression Character Set Examples">Table 26.2</A> has some examples.</P><TABLE CLASS="table"><CAPTION CLASS="table"><A CLASS="title" NAME="UPT-ART-427-TAB-1">Table 26.2: Regular Expression Character Set Examples</A></CAPTION><THEAD CLASS="thead"><TR CLASS="row" VALIGN="TOP"><TH CLASS="entry" ALIGN="LEFT" ROWSPAN="1" COLSPAN="1">Regular Expression</TH><TH CLASS="entry" ALIGN="LEFT" ROWSPAN="1" COLSPAN="1">Matches</TH></TR></THEAD><TBODY CLASS="tbody"><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[0-9]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any digit</TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[^0-9]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any character other than a digit</TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[-0-9]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any digit or a <CODE CLASS="literal">-</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[0-9-]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any digit or a <CODE CLASS="literal">-</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[^-0-9]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any character except a digit or a <CODE CLASS="literal">-</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[]0-9]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any digit or a <CODE CLASS="literal">]</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[0-9]]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any digit followed by a <CODE CLASS="literal">]</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[0-99-z]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">Any digit or any character <SPAN CLASS="link">between 9 and z (<A CLASS="linkend" HREF="ch51_03.htm" TITLE="ASCII 
Characters: Listing and Getting Values ">51.3</A>)</SPAN></P></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">[]0-9-]</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any digit, a <CODE CLASS="literal">-</CODE>, or a <CODE CLASS="literal">]</CODE></TD></TR></TBODY></TABLE></DIV><DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="UPT-ART-427-SECT-1.6">26.4.6 Repeating Character Sets with <CODE CLASS="literal">*</CODE> </A></H3><P CLASS="para"><A CLASS="indexterm" NAME="AUTOID-28691"></A>The third part of a regular expression is the modifier. It is used to specify how many times you expect to see the previous character set. The special character <CODE CLASS="literal">*</CODE>&nbsp;(asterisk) matches <EM CLASS="emphasis">zero or more</EM> copies. That is, the regular expression <CODE CLASS="literal">0*</CODE> matches zero or more zeros, while the expression <CODE CLASS="literal">[0-9]*</CODE> matches zero or more digits.</P><P CLASS="para">This explains why the pattern <CODE CLASS="literal">^#*</CODE> is useless, as it matches any number of <CODE CLASS="literal">#</CODE>'s at the beginning of the line, including <EM CLASS="emphasis">zero</EM>. Therefore, this will match every line, because every line starts with zero or more <CODE CLASS="literal">#</CODE>'s.</P><P CLASS="para">At first glance, it might seem that starting the count at zero is stupid. Not so. Looking for an unknown number of characters is very important. Suppose you wanted to look for a digit at the beginning of a line, and there may or may not be spaces before the digit. Just use <CODE CLASS="literal">^</CODE><IMG SRC="../chars/squ.gif" ALT=" "><CODE CLASS="literal">*</CODE> to match zero or more spaces at the beginning of the line. If you need to match one or more, just repeat the character set. That is, <CODE CLASS="literal">[0-9]*</CODE> matches zero or more digits and <CODE CLASS="literal">[0-9][0-9]*</CODE> matches one or more 
digits.</P></DIV><DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="UPT-ART-427-SECT-1.7">26.4.7 Matching a Specific Number of Sets with \{ and \} </A></H3><P CLASS="para"><A CLASS="indexterm" NAME="AUTOID-28711"></A><A CLASS="indexterm" NAME="AUTOID-28714"></A>You cannot specify a maximum number of sets with the <CODE CLASS="literal">*</CODE> modifier. However, <SPAN CLASS="link">some programs (<A CLASS="linkend" HREF="ch26_09.htm" TITLE="Valid Metacharacters for Different UNIX Programs ">26.9</A>)</SPAN> recognize a special pattern you can use to specify the minimum and maximum number of repeats. This is done by putting those two numbers between <CODE CLASS="literal">\{</CODE> and <CODE CLASS="literal">\}</CODE>.</P><P CLASS="para">Having convinced you that <CODE CLASS="literal">\{</CODE> isn't a plot to confuse you, an example is in order. The regular expression to match four, five, six, seven, or eight lowercase letters is: <CODE CLASS="literal">[a-z]\{4,8\}</CODE>. Any numbers between 0 and 255 can be used. The second number may be omitted, which removes the upper limit. If the comma and the second number are omitted, the pattern must be duplicated the exact number of times specified by the first number.</P><BLOCKQUOTE CLASS="caution"><P CLASS="para"><STRONG>CAUTION:</STRONG> The backslashes deserve a special discussion. Normally a backslash <EM CLASS="emphasis">turns off</EM> the special meaning for a character. For example, a literal period is matched by <CODE CLASS="literal">\.</CODE> and a literal asterisk is matched by <CODE CLASS="literal">\*</CODE>. However, if a backslash is placed before a <CODE CLASS="literal">&lt;</CODE>, <CODE CLASS="literal">&gt;</CODE>, <CODE CLASS="literal">{</CODE>, <CODE CLASS="literal">}</CODE>, <CODE CLASS="literal">(</CODE>, or <CODE CLASS="literal">)</CODE> or before a digit, the backslash <EM CLASS="emphasis">turns on</EM> a special meaning. This was done because these special functions were added late in the life of regular expressions. 
Changing the meaning of <CODE CLASS="literal">{</CODE>, <CODE CLASS="literal">}</CODE>, <CODE CLASS="literal">(</CODE>, <CODE CLASS="literal">)</CODE>, <CODE CLASS="literal">&lt;</CODE>, and <CODE CLASS="literal">&gt;</CODE> would have broken old expressions. (This is a horrible crime punishable by a year of hard labor writing COBOL programs.) Instead, adding a backslash added functionality without breaking old programs. Rather than complain about the change, view it as evolution.</P></BLOCKQUOTE><P CLASS="para">You must remember that modifiers like <CODE CLASS="literal">*</CODE> and <CODE CLASS="literal">\{1,5\}</CODE> only act as modifiers if they follow a character set. If they were at the beginning of a pattern, they would not be modifiers. <A CLASS="xref" HREF="ch26_04.htm#UPT-ART-427-TAB-2" TITLE="Regular Expression Pattern Repetition Examples">Table 26.3</A> is a list of examples, and the exceptions.</P><TABLE CLASS="table"><CAPTION CLASS="table"><A CLASS="title" NAME="UPT-ART-427-TAB-2">Table 26.3: Regular Expression Pattern Repetition Examples</A></CAPTION><THEAD CLASS="thead"><TR CLASS="row" VALIGN="TOP"><TH CLASS="entry" ALIGN="LEFT" ROWSPAN="1" COLSPAN="1">Regular Expression</TH><TH CLASS="entry" ALIGN="LEFT" ROWSPAN="1" COLSPAN="1">Matches</TH></TR></THEAD><TBODY CLASS="tbody"><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">*</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line with a <CODE CLASS="literal">*</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">\*</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line with a <CODE CLASS="literal">*</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">\\</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line with a <CODE CLASS="literal">\</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^*</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line starting with a 
<CODE CLASS="literal">*</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^A*</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line</TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^A\*</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line starting with an A<CODE CLASS="literal">*</CODE></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^AA*</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line starting with one A</TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^AA*B</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">Any line starting with one or more A's followed by a B</P></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^A\{4,8\}B</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">Any line starting with four, five, six, seven, or eight A's followed by a B</P></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^A\{4,\}B</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">Any line starting with four or more A's followed by a B</P></TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">^A\{4\}B</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line starting with an AAAAB</TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">\{4,8\}</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line with a {4,8}</TD></TR><TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">A{4,8}</TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1">Any line with an A{4,8}</TD></TR></TBODY></TABLE></DIV><DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="UPT-ART-427-SECT-1.8">26.4.8 Matching Words with \&lt; and \&gt; </A></H3><P CLASS="para"><A CLASS="indexterm" NAME="AUTOID-28804"></A

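<P CLASS="para">The bracket and repetition rules in Tables 26.2 and 26.3 can be tried out with a short script. What follows is only a sketch in Python, not something from the original article. It assumes Python's <CODE CLASS="literal">re</CODE> module treats these bracket expressions the way the tables describe; the helper <CODE CLASS="literal">chars_matched</CODE> and the sample strings are made up for illustration. Note that Python spells the interval modifier <CODE CLASS="literal">{4,8}</CODE> without the backslashes that grep and sed expect.</P>

```python
import re


def chars_matched(pattern, candidates="0123456789-]az^"):
    """Return the candidate characters that a one-character pattern matches."""
    return "".join(c for c in candidates if re.fullmatch(pattern, c))


# Character sets from Table 26.2 (hypothetical candidate string above).
print(chars_matched(r"[-0-9]"))    # a leading dash is an ordinary character
print(chars_matched(r"[]0-9-]"))   # a leading ] and a trailing dash are ordinary too
print(chars_matched(r"[^-0-9]"))   # caret first: everything except digits and dash

# [0-9]] is a set followed by a literal ], so it needs two characters.
print(bool(re.fullmatch(r"[0-9]]", "5]")))

# [A-z] also takes in the six ASCII characters between Z and a.
extras = [chr(n) for n in range(ord("Z") + 1, ord("a"))]
print(all(re.fullmatch(r"[A-z]", c) for c in extras))

# Repetition from Table 26.3, using made-up sample lines.
lines = ["B", "AB", "AAAAB", "A" * 9 + "B"]
any_line = [s for s in lines if re.search(r"^A*", s)]          # zero A's is enough
one_or_more = [s for s in lines if re.search(r"^AA*B", s)]     # at least one A, then B
four_to_eight = [s for s in lines if re.search(r"^A{4,8}B", s)]  # grep/sed: ^A\{4,8\}B
print(any_line)
print(one_or_more)
print(four_to_eight)
```

<P CLASS="para">Running the same patterns through grep or sed is the real test for those programs, since support for the interval modifier varies from one tool to another.</P>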