</td><td><p>Matches at the end of the string (or line, if <tt class="literal">/m</tt> is used)</p></td></tr><tr><td><p><tt class="literal">\b</tt></p></td><td><p>Matches at word boundary (between <tt class="literal">\w</tt> and <tt class="literal">\W</tt>)</p></td></tr><tr><td><p><tt class="literal">\B</tt></p></td><td><p>Matches except at word boundary</p></td></tr><tr><td><p><tt class="literal">\A</tt></p></td><td><p>Matches at the beginning of the string</p></td></tr><tr><td><p><tt class="literal">\Z</tt></p></td><td><p>Matches at the end of the string or before a newline</p></td></tr><tr><td><p><tt class="literal">\z</tt></p></td><td><p>Matches only at the end of the string</p></td></tr><tr><td><p><tt class="literal">\G</tt></p></td><td><p>Matches where previous <tt class="literal">m//g</tt> left off</p></td></tr><tr><td><p><tt class="literal">/c</tt></p></td><td><p>A match modifier rather than an assertion: suppresses resetting of the search position when a match with <tt class="literal">/g</tt> fails. Without <tt class="literal">/c</tt>, a failed match resets the search position to the beginning of the string.</p></td></tr></table><p><p>The <tt class="literal">$</tt> and <tt class="literal">\Z</tt> assertions can match not only at the end of the string, but also one character earlier than that, if the last character of the string is a newline.</p></div><a name="perlnut2-CHP-4-SECT-6.6" /><div class="sect2"><h3 class="sect2">4.6.6. Quantifiers</h3><p><a name="INDEX-682" /><a name="INDEX-683" /><a name="INDEX-684" /><a name="INDEX-685" />Quantifiers are used to specify the number of instances of the previous element that can match. For instance, you could say "match any number of a's, including none" (<tt class="literal">a*</tt>), or "match between 5 and 10 instances of the word 'owie'" (<tt class="literal">(owie){5,10}</tt>).</p><p><a name="INDEX-686" />Quantifiers, by nature, are greedy. That is, the Perl regular expression engine looks for the longest match it can find (extending as far to the right as possible) unless you tell it not to. 
Say you are searching a string that reads:</p><blockquote><pre class="code">a whatever foo, b whatever foo</pre></blockquote><p>and you want to find <tt class="literal">a</tt> and <tt class="literal">foo</tt> with something in between. You might use:</p><blockquote><pre class="code">/a.*foo/</pre></blockquote><p>A <tt class="literal">.</tt> followed by a <tt class="literal">*</tt> looks for any character, any number of times, until <tt class="literal">foo</tt> is found. But since Perl will look as far to the right as possible to find <tt class="literal">foo</tt>, the first instance of <tt class="literal">foo</tt> is swallowed up by the greedy <tt class="literal">.*</tt> expression.</p><p><a name="INDEX-687" /><a name="INDEX-688" /><a name="INDEX-689" />Therefore, <a name="INDEX-690" />all the quantifiers have a notation that allows for minimal matching, so they are nongreedy. This notation uses a question mark immediately following the quantifier to force Perl to look for the earliest available match (farthest to the left). 
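</p><p>As a minimal sketch of the difference, the following applies both forms to the sample string shown earlier. The variable names <tt class="literal">$str</tt>, <tt class="literal">$greedy</tt>, and <tt class="literal">$minimal</tt> are illustrative assumptions, not part of the original text:</p><blockquote><pre class="code">use strict;
use warnings;

my $str = "a whatever foo, b whatever foo";

# Greedy: .* runs as far right as it can, so the match ends at the last "foo".
my ($greedy) = $str =~ /(a.*foo)/;

# Nongreedy: .*? stops at the first "foo" it can reach.
my ($minimal) = $str =~ /(a.*?foo)/;

print "greedy:  $greedy\n";     # greedy:  a whatever foo, b whatever foo
print "minimal: $minimal\n";    # minimal: a whatever foo</pre></blockquote><p>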
The following table lists the regular expression quantifiers and their nongreedy forms:</p><a name="ch04-14-fm2xml" /><table border="1" cellpadding="3"><tr><th><p>Maximal</p></th><th><p>Minimal</p></th><th><p>Allowed range</p></th></tr><tr><td><p><tt class="literal">{n</tt>,<tt class="literal">m}</tt></p></td><td><p><tt class="literal">{n</tt>,<tt class="literal">m}?</tt></p></td><td><p>Must occur at least <tt class="literal">n</tt> times but no more than <tt class="literal">m</tt> times</p></td></tr><tr><td><p><tt class="literal">{n,}</tt></p></td><td><p><tt class="literal">{n,}?</tt></p></td><td><p>Must occur at least <tt class="literal">n</tt> times</p></td></tr><tr><td><p><tt class="literal">{n}</tt></p></td><td><p><tt class="literal">{n}?</tt></p></td><td><p>Must match exactly <tt class="literal">n</tt> times</p></td></tr><tr><td><p><tt class="literal">*</tt></p></td><td><p><tt class="literal">*?</tt></p></td><td><p>0 or more times (same as <tt class="literal">{0,}</tt>)</p></td></tr><tr><td><p><tt class="literal">+</tt></p></td><td><p><tt class="literal">+?</tt></p></td><td><p>1 or more times (same as <tt class="literal">{1,}</tt>)</p></td></tr><tr><td><p><tt class="literal">?</tt></p></td><td><p><tt class="literal">??</tt></p></td><td><p>0 or 1 time (same as <tt class="literal">{0,1}</tt>)</p></td></tr></table><p></div><a name="perlnut2-CHP-4-SECT-6.7" /><div class="sect2"><h3 class="sect2">4.6.7. Pattern Match Variables</h3><p><a name="INDEX-691" /><a name="INDEX-692" />Parentheses not only group elements in a regular expression, they also remember the patterns they match. Every match from a parenthesized element is saved to a special, read-only variable indicated by a number. You can recall and reuse a match by using these variables.</p><p><a name="INDEX-693" /><a name="INDEX-694" /><a name="INDEX-695" />Within a pattern, each parenthesized element saves its match to a numbered variable, in order starting with <tt class="literal">1</tt>. 
You can recall these matches within the expression by using <tt class="literal">\1</tt>, <tt class="literal">\2</tt>, and so on.</p><p>Outside of the matching pattern, the matched variables are recalled with the usual dollar sign, i.e., <tt class="literal">$1</tt>, <tt class="literal">$2</tt>, etc. The dollar sign notation should be used in the replacement expression of a substitution and anywhere else you might want to use the variables in your program. For example, to implement "i before e, except after c":</p><blockquote><pre class="code">s/([^c])ei/$1ie/g;</pre></blockquote><p>The backreferencing variables are:</p><dl><dt><b><tt class="literal">$+</tt></b></dt><dd>Returns the last parenthesized pattern match</p></dd><dt><b><tt class="literal">$&amp;</tt></b></dt><dd>Returns the entire matched string</p></dd><dt><b><tt class="literal">$`</tt></b></dt><dd>Returns everything before the matched string</p></dd><dt><b><tt class="literal">$'</tt></b></dt><dd>Returns everything after the matched string</p></dd></dl><p>Using these variables anywhere in your program can noticeably slow down all of its regular expression matches.</p></div><a name="perlnut2-CHP-4-SECT-6.8" /><div class="sect2"><h3 class="sect2">4.6.8. Extended Regular Expressions</h3><p><a name="INDEX-696" /><a name="INDEX-697" /><a name="INDEX-698" /><a name="INDEX-699" />Perl defines an extended syntax for regular expressions. The syntax is a pair of parentheses with a question mark as the first thing within the parentheses. The character after the question mark gives the function of the extension. The extensions are:</p><dl><dt><b><tt class="literal">(?#</tt><em class="replaceable">text</em><tt class="literal">)</tt></b></dt><dd>A comment. 
The text is ignored.</p></dd><dt><b><tt class="literal">(?:...)</tt></b></dt><dt><b><tt class="literal">(?imsx-imsx:...)</tt></b></dt><dd>This groups things like <tt class="literal">(...)</tt> but doesn't make backreferences.</p></dd><dt><b><tt class="literal">(?=...)</tt></b></dt><dd>A zero-width positive lookahead assertion. For example, <tt class="literal">/\w+(?=\t)/</tt> matches a word followed by a tab, without including the tab in <tt class="literal">$&amp;</tt>.</p></dd><dt><b><tt class="literal">(?!...)</tt></b></dt><dd>A zero-width negative lookahead assertion. For example, <tt class="literal">/foo(?!bar)/</tt> matches any occurrence of <tt class="literal">foo</tt> that isn't followed by <tt class="literal">bar</tt>.</p></dd><dt><b><tt class="literal">(?&lt;=...)</tt></b></dt><dd>A zero-width positive lookbehind assertion. For example, <tt class="literal">/(?&lt;=bad)boy/</tt> matches the word <tt class="literal">boy</tt> that follows <tt class="literal">bad</tt>, without including <tt class="literal">bad</tt> in <tt class="literal">$&amp;</tt>. This works only for fixed-width lookbehind.</p></dd><dt><b><tt class="literal">(?{</tt><em class="replaceable">code</em><tt class="literal">})</tt></b></dt><dd>An experimental regular expression feature to evaluate any embedded Perl code. This evaluation always succeeds, and <em class="replaceable"><tt>code</tt></em> is not interpolated.</p></dd><dt><b><tt class="literal">(?&lt;!...)</tt></b></dt><dd>A zero-width negative lookbehind assertion. For example, <tt class="literal">/(?&lt;!bad)boy/</tt> matches any occurrence of <tt class="literal">boy</tt> that doesn't follow <tt class="literal">bad</tt>. 
This works only for fixed-width lookbehind.</p></dd><dt><b><tt class="literal">(?&gt;...)</tt></b></dt><dd>Matches the substring that the standalone pattern would match if anchored at the given position.</p></dd><dt><b><tt class="literal">(?(</tt><em class="replaceable">condition</em><tt class="literal">)</tt><em class="replaceable">yes-pattern</em>|<em class="replaceable">no-pattern</em><tt class="literal">)</tt></b></dt><dt><b><tt class="literal">(?(</tt><em class="replaceable">condition</em>)<em class="replaceable">yes-pattern</em><tt class="literal">)</tt></b></dt><dd>Matches a pattern determined by a condition. <em class="replaceable"><tt>condition</tt></em> should be either an integer, which is true if the pair of parentheses corresponding to the integer has matched, or a lookahead, lookbehind, or evaluate zero-width assertion. <em class="replaceable"><tt>no-pattern</tt></em> is used to match if the condition was not met, but it is optional.</p></dd><dt><b><tt class="literal">(?imsx-imsx)</tt></b></dt><dd>One or more embedded pattern-match modifiers. Modifiers are switched off if they follow a <tt class="literal">-</tt> (dash). 
The modifiers are defined as follows<a name="INDEX-700" /><a name="INDEX-701" /><a name="INDEX-702" /><a name="INDEX-703" />:</p><a name="ch04-15-fm2xml" /><table border="1" cellpadding="3"><tr><th><p>Modifier</p></th><th><p>Meaning</p></th></tr><tr><td><p><tt class="literal">i</tt></p></td><td><p>Do case-insensitive pattern matching.</p></td></tr><tr><td><p><tt class="literal">m</tt></p></td><td><p>Treat string as multiple lines.</p></td></tr><tr><td><p><tt class="literal">s</tt></p></td><td><p>Treat string as single line.</p></td></tr><tr><td><p><tt class="literal">x</tt></p></td><td><p>Use extended regular expressions.</p></td></tr></table><p></dd></dl></div><hr width="684" align="left" /><p><p><font size="-1"><a href="copyrght.htm">Copyright © 2002</a> O'Reilly &amp; Associates. 
All rights reserved.</font></p></body></html>