<html><head><title>The split Operator (Learning Perl, 3rd Edition)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" /><meta name="DC.Creator" content="Randal L. Schwartz and Tom Phoenix" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly &amp; Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="0596001320L" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Learning Perl, 3rd Edition" /><meta name="DC.Type" content="Text.Monograph" /></head><body bgcolor="#ffffff"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Learning Perl, 3rd Edition" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch09_06.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"></a></td><td align="right" valign="top" width="228"><a href="ch09_08.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div><h2 class="sect1">9.7. The split Operator</h2><p>Another operator that uses regular expressions is <tt class="literal">split</tt><a name="INDEX-631" />, which breaks up a string according to a separator. This is useful for tab-separated data, or colon-separated, whitespace-separated, or <em class="emphasis">anything</em>-separated data, really.<a href="#FOOTNOTE-211">[211]</a> So long as you can specify the separator with a regular expression (and generally, it's a simple regular expression), you can use <tt class="literal">split</tt>. 
It looks like this:</p><blockquote class="footnote"><a name="FOOTNOTE-211" /><p>[211] Except "comma-separated values," normally called CSV files. Those are a pain to do with <tt class="literal">split</tt>; you're better off getting the <tt class="literal">Text::CSV</tt> module from CPAN.</p> </blockquote><blockquote><pre class="code">@fields = split /separator/, $string;</pre></blockquote><p>The <tt class="literal">split</tt> operator<a href="#FOOTNOTE-212">[212]</a> drags the pattern through a string and returns a list of fields (substrings) that were separated by the separators. Whenever the pattern matches, that's the end of one field and the start of the next. So, anything that matches the pattern will never show up in the returned fields. Here's a typical <tt class="literal">split</tt> pattern, splitting on colons:</p><blockquote class="footnote"> <a name="FOOTNOTE-212" /><p>[212] It's an operator, even though it acts a lot like a function, and everyone generally calls it a function. But the technical details of the difference are beyond the scope of this book.</p> </blockquote><blockquote><pre class="code">@fields = split /:/, "abc:def:g:h"; # gives ("abc", "def", "g", "h")</pre></blockquote><p>You could even have an empty field, if there were two delimiters together:</p><blockquote><pre class="code">@fields = split /:/, "abc:def::g:h"; # gives ("abc", "def", "", "g", "h")</pre></blockquote><p>Here's a rule that seems odd at first, but it rarely causes problems: Leading empty fields are always returned, but trailing empty fields are discarded:<a href="#FOOTNOTE-213">[213]</a></p><blockquote class="footnote"> <a name="FOOTNOTE-213" /><p>[213] This is merely the default. It's this way for efficiency. 
If you worry about losing trailing empty fields, use <tt class="literal">-1</tt> as a third argument to <tt class="literal">split</tt> and they'll be kept; see the <em class="emphasis">perlfunc</em> manpage.</p> </blockquote><blockquote><pre class="code">@fields = split /:/, ":::a:b:c:::"; # gives ("", "", "", "a", "b", "c")</pre></blockquote><p>It's also common to <tt class="literal">split</tt> on <a name="INDEX-632" />whitespace, using <tt class="literal">/\s+/</tt> as the pattern. Under that pattern, all whitespace runs are equivalent to a single space:</p><blockquote><pre class="code">my $some_input = "This is a \t test.\n";
my @args = split /\s+/, $some_input; # ("This", "is", "a", "test.")</pre></blockquote><p>The default for <tt class="literal">split</tt> is to break up <tt class="literal">$_</tt> on whitespace:</p><blockquote><pre class="code">my @fields = split; # like split /\s+/, $_;</pre></blockquote><p>This is almost the same as using <tt class="literal">/\s+/</tt> as the pattern, except that a leading empty field is suppressed -- so, if the line starts with whitespace, you won't see an empty field at the start of the list. (If you'd like to get the same behavior when splitting another string on whitespace, just use a single space in place of the pattern: <tt class="literal">split</tt> <tt class="literal">' ', $other_string</tt>. Using a space instead of the pattern is a special kind of <tt class="literal">split</tt>.)</p><p>Generally, the <a name="INDEX-633" />patterns used for <tt class="literal">split</tt> are as simple as the ones you see here. 
But if the pattern becomes more complex, be sure to avoid using memory parentheses in the pattern, because whatever they capture is returned as extra fields in the list; see the <tt class="literal">perlfunc</tt> manpage for more information.<a href="#FOOTNOTE-214">[214]</a></p><blockquote class="footnote"> <a name="FOOTNOTE-214" /><p>[214] And you might want to check out the nonmemory grouping-only parenthesis notation as well, in the <em class="emphasis">perlre</em> manpage.</p> </blockquote><hr width="684" align="left" /><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch09_06.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"><img alt="Home" border="0" src="../gifs/txthome.gif" /></a></td><td align="right" valign="top" width="228"><a href="ch09_08.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr><tr><td align="left" valign="top" width="228">9.6. Substitutions with s///</td><td align="center" valign="top" width="228"><a href="index/index.htm"><img alt="Book Index" border="0" src="../gifs/index.gif" /></a></td><td align="right" valign="top" width="228">9.8. The join Function</td></tr></table></div><hr width="684" align="left" /><img alt="Library Navigation Links" border="0" src="../gifs/navbar.gif" usemap="#library-map" /><p><p><font size="-1"><a href="copyrght.htm">Copyright © 2002</a> O'Reilly &amp; Associates. 
All rights reserved.</font></p><map name="library-map"><area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map></body></html>