(like <tt class="literal">\t</tt>) eat up some of the string as they match, and others (like <tt class="literal">\b</tt>) don't. But we usually reserve the term "assertion" for the zero-width assertions. To avoid confusion, we'll call the thing with width an <em class="emphasis">atom</em>. (If you're a physicist, you can think of nonzero-width atoms as massive, in contrast to the zero-width assertions, which are massless like photons.)</p><p><a name="INDEX-1285"></a><a name="INDEX-1286"></a>You'll also see some metacharacters that aren't assertions; rather, they're structural (just as braces and semicolons define the structure of ordinary Perl code, but don't really do anything). These structural metacharacters are in some ways the most important ones because the crucial first step in learning to read regular expressions is to teach your eyes to pick out the structural metacharacters. Once you've learned that, reading regular expressions is a breeze.<a href="#FOOTNOTE-1">[1]</a></p><blockquote class="footnote"><a name="FOOTNOTE-1"></a><p>[1] Admittedly, a stiff breeze at times, but not something that will blow you away.</p></blockquote><p><a name="INDEX-1287"></a><a name="INDEX-1288"></a><a name="INDEX-1289"></a>One such structural metacharacter is the vertical bar, which indicates <em class="emphasis">alternation</em>:<blockquote><pre class="programlisting">/Frodo|Pippin|Merry|Sam/</pre></blockquote><a name="INDEX-1290"></a><a name="INDEX-1291"></a></p><p>That means that any of those strings can trigger a match; this is covered in <a href="ch05_08.htm#ch05-sect-alt">Section 5.8, "Alternation"</a> later in the chapter. 
And in <a href="ch05_07.htm#ch05-sect-candc">Section 5.7, "Capturing and Clustering"</a>, which comes just before that section, we'll show you how to use parentheses around portions of your pattern to do <em class="emphasis">grouping</em>:<blockquote><pre class="programlisting">/(Frodo|Drogo|Bilbo) Baggins/</pre></blockquote>or even:<blockquote><pre class="programlisting">/(Frod|Drog|Bilb)o Baggins/</pre></blockquote></p><p><a name="INDEX-1292"></a><a name="INDEX-1293"></a>Another thing you'll see is what we call <em class="emphasis">quantifiers</em>, which say how many of the previous thing should match in a row. Quantifiers look like this:<blockquote><pre class="programlisting">* + ? *? {3} {2,5}</pre></blockquote><a name="INDEX-1294"></a><a name="INDEX-1295"></a>You'll never see them in isolation like that, though. Quantifiers only make sense when attached to atoms--that is, to assertions that have width.<a href="#FOOTNOTE-2">[2]</a> Quantifiers attach to the previous atom only, which in human terms means they normally quantify only one character. If you want to match three copies of "<tt class="literal">bar</tt>" in a row, you need to group the individual characters of "<tt class="literal">bar</tt>" into a single "molecule" with parentheses, like this:<blockquote><pre class="programlisting">/(bar){3}/</pre></blockquote></p><blockquote class="footnote"><a name="FOOTNOTE-2"></a><p>[2] Quantifiers are a bit like the statement modifiers in <a href="ch04_01.htm">Chapter 4, "Statements and Declarations"</a>, which can only attach to a single statement. Attaching a quantifier to a zero-width assertion would be like trying to attach a <tt class="literal">while</tt> modifier to a declaration--either of which makes about as much sense as asking your local apothecary for a pound of photons. Apothecaries only deal in atoms and such.</p></blockquote><p>That will match "<tt class="literal">barbarbar</tt>". 
If you'd said <tt class="literal">/bar{3}/</tt>, that would match "<tt class="literal">barrr</tt>"--which might qualify you as Scottish but disqualify you as barbarbaric. (Then again, maybe not. Some of our favorite metacharacters are Scottish.) For more on quantifiers, see "Quantifiers" later.</p><p>Now that you've seen a few of the beasties that inhabit regular expressions, you're probably anxious to start taming them. However, before we discuss regular expressions in earnest, we need to backtrack a little and talk about the pattern-matching operators that make use of regular expressions. (And if you happen to spot a few more regex beasties along the way, just leave a decent tip for the tour guide.)</p><a name="INDEX-1296"></a><a name="INDEX-1297"></a><a name="INDEX-1780"></a><a name="INDEX-1781"></a><!-- BOTTOM NAV BAR --><hr width="515" align="left"><div class="navbar"><table width="515" border="0"><tr><td align="left" valign="top" width="172"><a href="ch04_09.htm"><img src="../gifs/txtpreva.gif" alt="Previous" border="0"></a></td><td align="center" valign="top" width="171"><a href="index.htm"><img src="../gifs/txthome.gif" alt="Home" border="0"></a></td><td align="right" valign="top" width="172"><a href="ch05_02.htm"><img src="../gifs/txtnexta.gif" alt="Next" border="0"></a></td></tr><tr><td align="left" valign="top" width="172">4.9. Pragmas</td><td align="center" valign="top" width="171"><a href="index/index.htm"><img src="../gifs/index.gif" alt="Book Index" border="0"></a></td><td align="right" valign="top" width="172">5.2. Pattern-Matching Operators</td></tr></table></div><hr width="515" align="left"><!-- LIBRARY NAV BAR --><img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links"><p><font size="-1"><a href="copyrght.htm">Copyright © 2001</a> O'Reilly &amp; Associates. 
All rights reserved.</font></p><map name="library-map"> <area shape="rect" coords="2,-1,79,99" href="../index.htm"><area shape="rect" coords="84,1,157,108" href="../perlnut/index.htm"><area shape="rect" coords="162,2,248,125" href="../prog/index.htm"><area shape="rect" coords="253,2,326,130" href="../advprog/index.htm"><area shape="rect" coords="332,1,407,112" href="../cookbook/index.htm"><area shape="rect" coords="414,2,523,103" href="../sysadmin/index.htm"></map><!-- END OF BODY --></body></html>