<tr><td><tt class="literal">/x</tt></td><td>Ignore (most) whitespace and permit comments in pattern.<a name="INDEX-1376"></a></td></tr>
<tr><td><tt class="literal">/o</tt></td><td>Compile pattern once only.<a name="INDEX-1377"></a></td></tr>
<tr><td><tt class="literal">/g</tt><a name="INDEX-1378"></a></td><td>Globally find all matches.</td></tr>
<tr><td><tt class="literal">/cg</tt><a name="INDEX-1379"></a></td><td>Allow continued search after failed <tt class="literal">/g</tt> match.</td></tr>
</table>
<p><a name="INDEX-1380"></a><a name="INDEX-1381"></a>The first five modifiers apply to the regex and were described earlier. The last two change the behavior of the match operation itself. The <tt class="literal">/g</tt> modifier specifies global matching, that is, matching as many times as possible within the string. How it behaves depends on context. In list context, <tt class="literal">m//g</tt> returns a list of all matches found. Here we find all the places someone mentioned "<tt class="literal">perl</tt>", "<tt class="literal">Perl</tt>", "<tt class="literal">PERL</tt>", and so on:
<blockquote><pre class="programlisting">if (@perls = $paragraph =~ /perl/gi) {
    printf "Perl mentioned %d times.\n", scalar @perls;
}</pre></blockquote>
If there are no capturing parentheses within the <tt class="literal">/g</tt> pattern, then the complete matches are returned. If there are capturing parentheses, then only the strings captured are returned. Imagine a string like:
<blockquote><pre class="programlisting">$string = "password=xyzzy verbose=9 score=0";</pre></blockquote>
Also imagine you want to use that to initialize a hash like this:
<blockquote><pre class="programlisting">%hash = (password => "xyzzy", verbose => 9, score => 0);</pre></blockquote>
<a name="INDEX-1382"></a>Except, of course, you don't have a list, you have a string.
To get the corresponding list, you can use the <tt class="literal">m//g</tt> operator in list context to capture all of the key/value pairs from the string:
<blockquote><pre class="programlisting">%hash = $string =~ /(\w+)=(\w+)/g;</pre></blockquote>
The <tt class="literal">(\w+)</tt> sequence captures an alphanumeric word. See <a href="ch05_07.htm#ch05-sect-candc">Section 5.7, "Capturing and Clustering"</a>.<a name="INDEX-1383"></a><a name="INDEX-1384"></a></p>
<p><a name="INDEX-1385"></a><a name="INDEX-1386"></a><a name="INDEX-1387"></a><a name="INDEX-1388"></a>Used in scalar context, the <tt class="literal">/g</tt> modifier indicates a <em class="emphasis">progressive match</em>, which makes Perl start the next match on the same variable at a position just past where the last one stopped. The <tt class="literal">\G</tt> assertion represents that position in the string; see <a href="ch05_06.htm#ch05-sect-posit">Section 5.6, "Positions"</a> later in this chapter for a description of <tt class="literal">\G</tt>. If you use the <tt class="literal">/c</tt> (for "continue") modifier in addition to <tt class="literal">/g</tt>, then when the <tt class="literal">/g</tt> search runs out of matches, the failed match doesn't reset the position pointer.</p>
<p>If a <tt class="literal">?</tt> is the delimiter, as in <tt class="literal">?</tt><em class="replaceable">PATTERN</em><tt class="literal">?</tt>, this works just like a normal <tt class="literal">/</tt><em class="replaceable">PATTERN</em><tt class="literal">/</tt> search, except that it matches only once between calls to the <tt class="literal">reset</tt> operator. This can be a convenient optimization when you want to match only the first occurrence of the pattern during the run of the program, not all occurrences.
The operator runs the search every time you call it, up until it finally matches something, after which it turns itself off, returning false until you explicitly turn it back on with <tt class="literal">reset</tt>. Perl keeps track of the match state for you.</p>
<p><a name="INDEX-1389"></a>The <tt class="literal">??</tt> operator is most useful when an ordinary pattern match would find the last rather than the first occurrence:
<blockquote><pre class="programlisting">open DICT, "/usr/dict/words" or die "Can't open words: $!\n";
while (&lt;DICT&gt;) {
    $first = $1 if ?(^neur.*)?;
    $last  = $1 if /(^neur.*)/;
}
print $first,"\n";          # prints "neurad"
print $last,"\n";           # prints "neurypnology"</pre></blockquote>
The <tt class="literal">reset</tt> operator will reset only those instances of <tt class="literal">??</tt> compiled in the same package as the call to <tt class="literal">reset</tt>. Saying <tt class="literal">m??</tt> is equivalent to saying <tt class="literal">??</tt>.<a name="INDEX-1390"></a></p>
<a name="INDEX-1391"></a><h3 class="sect2">5.2.3. The s/// Operator (Substitution)</h3>
<p><blockquote><pre class="programlisting"><em class="replaceable">LVALUE</em> =~ s/<em class="replaceable">PATTERN</em>/<em class="replaceable">REPLACEMENT</em>/egimosx
s/<em class="replaceable">PATTERN</em>/<em class="replaceable">REPLACEMENT</em>/egimosx</pre></blockquote>
<a name="INDEX-1392"></a><a name="INDEX-1393"></a><a name="INDEX-1394"></a><a name="INDEX-1395"></a><a name="INDEX-1396"></a>This operator searches a string for <em class="replaceable">PATTERN</em> and, if found, replaces the matched substring with the <em class="replaceable">REPLACEMENT</em> text.
(Modifiers are described later in this section.)
<blockquote><pre class="programlisting">$lotr = $hobbit;           # Just copy The Hobbit
$lotr =~ s/Bilbo/Frodo/g;  # and write a sequel the easy way.</pre></blockquote>
The return value of an <tt class="literal">s///</tt> operation (in scalar and list contexts alike) is the number of times it succeeded (which can be more than once if used with the <tt class="literal">/g</tt> modifier, as described earlier). On failure, since it substituted zero times, it returns false (<tt class="literal">""</tt>), which is numerically equivalent to <tt class="literal">0</tt>.
<blockquote><pre class="programlisting">if ($lotr =~ s/Bilbo/Frodo/) { print "Successfully wrote sequel." }
$change_count = $lotr =~ s/Bilbo/Frodo/g;</pre></blockquote>
<a name="INDEX-1397"></a><a name="INDEX-1398"></a><a name="INDEX-1399"></a><a name="INDEX-1400"></a><a name="INDEX-1401"></a><a name="INDEX-1402"></a><a name="INDEX-1403"></a>The replacement portion is treated as a double-quoted string. You may use any of the dynamically scoped pattern variables described earlier (<tt class="literal">$`</tt>, <tt class="literal">$&</tt>, <tt class="literal">$'</tt>, <tt class="literal">$1</tt>, <tt class="literal">$2</tt>, and so on) in the replacement string, as well as any other double-quote gizmos you care to employ. For instance, here's an example that finds all the strings "<tt class="literal">revision</tt>", "<tt class="literal">version</tt>", or "<tt class="literal">release</tt>", and replaces each with its capitalized equivalent, using the <tt class="literal">\u</tt> escape in the replacement portion:
<blockquote><pre class="programlisting">s/revision|version|release/\u$&/g;  # Use | to mean "or" in a pattern</pre></blockquote>
<a name="INDEX-1404"></a><a name="INDEX-1405"></a>All scalar variables expand in double-quote context, not just these strange ones.
Suppose you had a <tt class="literal">%Names</tt> hash that mapped revision numbers to internal project names; for example, <tt class="literal">$Names{"3.0"}</tt> might be code-named "<tt class="literal">Isengard</tt>". You could use <tt class="literal">s///</tt> to find version numbers and replace them with their corresponding project names:
<blockquote><pre class="programlisting">s/version ([0-9.]+)/the $Names{$1} release/g;</pre></blockquote>
<a name="INDEX-1406"></a>In the replacement string, <tt class="literal">$1</tt> returns what the first (and only) pair of parentheses captured. (You could also use <tt class="literal">\1</tt> as you would in the pattern, but that usage is deprecated in the replacement. In an ordinary double-quoted string, <tt class="literal">\1</tt> means a Control-A.)</p>
<p>If <em class="replaceable">PATTERN</em> is a null string, the last successfully executed regular expression is used instead. Both <em class="replaceable">PATTERN</em> and <em class="replaceable">REPLACEMENT</em> are subject to variable interpolation, but a <em class="replaceable">PATTERN</em> is interpolated each time the <tt class="literal">s///</tt> operator is evaluated as a whole, while the <em class="replaceable">REPLACEMENT</em> is interpolated every time the pattern matches. (The <em class="replaceable">PATTERN</em> can match multiple times in one evaluation if you use the <tt class="literal">/g</tt> modifier.)<a name="INDEX-1407"></a></p>
<p><a name="INDEX-1408"></a><a name="INDEX-1409"></a>As before, the first five modifiers in <a href="ch05_02.htm#perl3-tab-smods">Table 5-2</a> alter the behavior of the regex; they're the same as in <tt class="literal">m//</tt> and <tt class="literal">qr//</tt>. The last two alter the substitution operator itself.</p>
<a name="perl3-tab-smods"></a><h4 class="objtitle">Table 5.2.
s/// Modifiers</h4>
<table border="1">
<tr><th>Modifier</th><th>Meaning</th></tr>
<tr><td><tt class="literal">/i</tt><a name="INDEX-1410"></a></td><td>Ignore alphabetic case (when matching).</td></tr>
<tr><td><tt class="literal">/m</tt><a name="INDEX-1411"></a></td><td>Let <tt class="literal">^</tt> and <tt class="literal">$</tt> match next to embedded <tt class="literal">\n</tt>.</td></tr>
<tr><td><tt class="literal">/s</tt><a name="INDEX-1412"></a></td><td>Let <tt class="literal">.</tt> match newline and ignore deprecated <tt class="literal">$*</tt>.</td></tr>
<tr><td><tt class="literal">/x</tt></td><td>Ignore (most) whitespace and permit comments in pattern.<a name="INDEX-1413"></a></td></tr>
<tr><td><p><tt class="literal">/o</tt><a name="INDEX-1414"></a></p></td><td>Compile pattern once only.</td></tr>
<tr><td><p><tt class="literal">/g</tt><a name="INDEX-1415"></a></p></td><td>Replace globally, that is, all occurrences.</td></tr>
<tr><td><tt class="literal">/e</tt></td><td>Evaluate the right side as an expression.<a name="INDEX-1416"></a></td></tr>
</table>
<p><a name="INDEX-1417"></a><a name="INDEX-1418"></a>The <tt class="literal">/g</tt> modifier is used with <tt class="literal">s///</tt> to replace every match of <em class="replaceable">PATTERN</em> with the <em class="replaceable">REPLACEMENT</em> value, not just the first one found. A <tt class="literal">s///g</tt> operator acts as a global search and replace, making all the changes at once, much like list <tt class="literal">m//g</tt>, except that <tt class="literal">m//g</tt> doesn't change anything. (And <tt class="literal">s///g</tt> is not a progressive match as scalar <tt class="literal">m//g</tt> was.)</p>
<p><a name="INDEX-1419"></a><a name="INDEX-1420"></a><a name="INDEX-1421"></a>The <tt class="literal">/e</tt> modifier treats the <em class="replaceable">REPLACEMENT</em> as a chunk of Perl code rather than as an interpolated string. The result of executing that code is used as the replacement string.
For example, <tt class="literal">s/([0-9]+)/sprintf("%#x", $1)/ge</tt> would convert all numbers into hexadecimal, changing, say, <tt class="literal">2851</tt> into <tt class="literal">0xb23</tt>. Or suppose that, in our earlier example, you weren't sure that you had names for all the versions, so you wanted to leave any others unchanged. With a little creative <tt class="literal">/x</tt> formatting, you could say:
<blockquote><pre class="programlisting">s{
    version
    \s+
    (
        [0-9.]+
    )
}{
    $Names{$1}
        ? "the $Names{$1} release"
        : $&
}xge;</pre></blockquote>
The righthand side of your <tt class="literal">s///e</tt> (or in this case, the lower side) is syntax-checked and compiled at compile time along with the rest of your program. Any syntax error is detected during compilation, and run-time exceptions are left uncaught. Each additional <tt class="literal">/e</tt> after the first one (like <tt class="literal">/ee</tt>, <tt class="literal">/eee</tt>, and so on) is equivalent to calling <tt class="literal">eval</tt> <em class="replaceable">STRING</em> on the result of the code, once per extra <tt class="literal">/e</tt>. This evaluates the result of the code expression and traps exceptions in the special <tt class="literal">$@</tt> variable. See <a href="ch05_10.htm#ch05-sect-pp">Section 5.10.3, "Programmatic Patterns"</a> later in the chapter for more details.</p>
<h3 class="sect3">5.2.3.1. Modifying strings en passant</h3>
<p><a name="INDEX-1422"></a>Sometimes you want a new, modified string without clobbering the old one upon which the new one was based. Instead of writing:
<blockquote><pre class="programlisting">$lotr = $hobbit;
$lotr =~ s/Bilbo/Frodo/g;</pre></blockquote>
you can combine these into one statement.
Due to precedence, parentheses are required around the assignment, as they are with most combinations applying <tt class="literal">=~</tt> to an expression.
<blockquote><pre class="programlisting">($lotr = $hobbit) =~ s/Bilbo/Frodo/g;</pre></blockquote>
Without the parentheses around the assignment, you'd only change <tt class="literal">$hobbit</tt> and get the number of replacements stored into <tt class="literal">$lotr</tt>,