<html><head><title>Answers to Chapter 7 Exercises (Learning Perl, 3rd Edition)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" /><meta name="DC.Creator" content="Randal L. Schwartz and Tom Phoenix" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly & Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="0596001320L" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Learning Perl, 3rd Edition" /><meta name="DC.Type" content="Text.Monograph" /></head><body bgcolor="#ffffff"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Learning Perl, 3rd Edition" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="appa_05.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"></a></td><td align="right" valign="top" width="228"><a href="appa_07.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div><h2 class="sect1">A.6. Answers to Chapter 7 Exercises</h2><ol><li><p>Here's one way to do it:</p><blockquote><pre class="code">/fred/</pre></blockquote><p>Of course, you have to put that into the test program! This is pretty simple. The more important part of this exercise is trying it out on the sample strings. It doesn't match <tt class="literal">Fred</tt>, showing that regular expressions are case-sensitive. (We'll see how to change that later.) It does match <tt class="literal">frederick</tt> and <tt class="literal">Alfred</tt>, since both of those strings contain the four-letter string <tt class="literal">fred</tt>.
(Matching whole words only, so that <tt class="literal">frederick</tt> and <tt class="literal">Alfred</tt> wouldn't match, is another feature we'll see later.)</p><p>If the test program is working correctly,<a href="#FOOTNOTE-388">[388]</a> it should show those two matches as something like <tt class="literal">|&lt;fred&gt;erick|</tt> and <tt class="literal">|Al&lt;fred&gt;|</tt>, using the angle brackets to show where <tt class="literal">fred</tt> was found inside each string.</p><blockquote class="footnote"> <a name="FOOTNOTE-388" /><p>[388] If the test program didn't work correctly, you probably didn't download it as we suggested. And you probably didn't test what you typed, as we also suggested. But in that case, you probably didn't do the exercises either; you're just reading these answers in the back of the book, and so the test program (which you didn't actually run) performed flawlessly. In that case, this footnote is pointless.</p> </blockquote></li><li><p>Here's one way to do it:</p><blockquote><pre class="code">/a+b*/</pre></blockquote><p>That matches the letter <tt class="literal">a</tt> one or more times (that's the plus), followed by <tt class="literal">b</tt> zero or more times (that's the star). Well, that's what the exercise asked for, but you may have come up with something different. After all, if you're looking for <em class="emphasis">any</em> number of <tt class="literal">b</tt>'s, you know you'll always find what you're looking for. So you could have written <tt class="literal">/a+/</tt> instead, and matched the same strings.<a href="#FOOTNOTE-389">[389]</a></p><blockquote class="footnote"> <a name="FOOTNOTE-389" /><p>[389] To be sure, you'll match different parts of the strings. But any string that matches <tt class="literal">/a+b*/</tt> will also match <tt class="literal">/a+/</tt>, and vice versa.</p> </blockquote><p>For that matter, when you want to match one or more <tt class="literal">a</tt>'s, you know that the match will succeed when you find even the first one.
So, <tt class="literal">/a/</tt> will match the same set of strings as the first two. The description "any string containing at least one <tt class="literal">a</tt> followed by any number of <tt class="literal">b</tt>'s" means the exact same thing as "any string containing <tt class="literal">a</tt>." Of the sample strings, this matches all of them except <tt class="literal">fred</tt>.</p><p>There are even more ways to make this pattern than we show here. Often, in trying to write a pattern, you will need to decide which one of many possible patterns best suits your needs.</p></li><li><p>Here's one way to do it:</p><blockquote><pre class="code">/\\*\**/</pre></blockquote><p>That's what the text asked for: a backslash (typed twice, since we mean a <em class="emphasis">real</em> backslash<a href="#FOOTNOTE-390">[390]</a>) zero or more times (that's the first star), followed by an asterisk (backslashed, since star is a metacharacter) zero or more times (that's the last star). Whew!</p><blockquote class="footnote"><a name="FOOTNOTE-390" /><p>[390] Whenever you mean a real backslash in Perl, type two of them. A lone backslash may try to do something magical, but two of them will always mean a real backslash.</p> </blockquote><p>And what about the sample strings? Did it match any of them? You bet: it matches all of them! It's because the backslashes and asterisks aren't required in the pattern; that is, this pattern can match the empty string. Here's a rule you can rely upon: when a pattern <em class="emphasis">may</em> freely match the empty string, it'll <em class="emphasis">always</em> match, since the empty string can be found in any string. In fact, it'll always match in the <em class="emphasis">first</em> place that you look.</p><p>So, this pattern matches all four characters in <tt class="literal">\\**</tt>, as you'd expect. It matches the empty string at the beginning of <tt class="literal">fred</tt>, which you may not have expected.
In the string <tt class="literal">barney \\\***</tt>, it matches the empty string at the beginning. You might wish it would hunt down the backslashes and stars at the end of that string, but it doesn't bother. It looks at the beginning, sees zero backslashes followed by zero asterisks, declares the match a success, and goes home to watch television. And in <tt class="literal">*wilma\</tt>, it matches just the star at the beginning; as you see, this pattern never gets away from the beginning of the string, since it always matches at the first opportunity.</p><p>Now, if someone asked you for a pattern to match any number of backslashes followed by any number of asterisks, you'd be technically correct to give them this one. But chances are, that's not what they really wanted. Spoken languages like English may be ambiguous and not say exactly what they mean, but regular expressions always mean exactly what they say they mean.</p><p>In this case, maybe the person who asked for the pattern forgot to say that he or she always wants to match at least one character, when the pattern matches at all. We can do that. If there's at least one backslash, <tt class="literal">/\\+\**/</tt> will match. (That's just like what we had before, but there's a plus in place of the first star, meaning one or more backslashes.) If there's not at least one backslash, then in order to match at least one character, we'll need at least one asterisk, so we want <tt class="literal">/\*+/</tt>. When you put those two possibilities together, you get:</p><blockquote><pre class="code">/\\+\**|\*+/</pre></blockquote><p>Ugly, isn't it? Regular expressions are powerful but not beautiful. And they've contributed to Perl being maligned as a "write-only language." To be sure that no one criticizes your code in that way, though, it's kind to put an explanatory comment near any pattern that's not obvious.
On the other hand, when you've been using these for a year, you will have a different definition of "obvious" than you have today.</p><p>How does this new pattern work with the sample strings? With <tt class="literal">\\**</tt>, it matches all four characters, just like the last one. It won't match <tt class="literal">fred</tt>, which is probably the right behavior given the problem description. For <tt class="literal">barney \\\***</tt>, it matches the six characters at the end, as you hoped. And for <tt class="literal">*wilma\</tt>, it matches the asterisk at the beginning.</p></li><li><p>Here's one way to do it:</p><blockquote><pre class="code">while (&lt;&gt;) {
  if (/wilma/) {
    print;
  }
}</pre></blockquote><p>This is a <i class="command">grep</i>-like program. For each line of text (contained in <tt class="literal">$_</tt>), we check to see whether the pattern matches. If it matches, we print it. This program uses <tt class="literal">print</tt>'s default: if you don't tell it to print something else, it prints <tt class="literal">$_</tt>. So we have written a program that uses <tt class="literal">$_</tt> all the way through, but never mentions it anywhere. Perl folks love to use the defaults and save time typing, so you'll see a lot of programs that do this.</p><p>And if, for extra credit, you wanted to match a capitalized <tt class="literal">Wilma</tt> as well, <tt class="literal">/wilma|Wilma/</tt> would do the job. Or, more simply, you could have written <tt class="literal">/(w|W)ilma/</tt>. People who have used other regular expression implementations and already know about character classes, which we'll discuss in the next chapter, could make that last one even shorter (and more efficient).<a href="#FOOTNOTE-391">[391]</a></p><blockquote class="footnote"> <a name="FOOTNOTE-391" /><p>[391] If you made the whole pattern case-insensitive, shame on you. We haven't learned that yet.
Besides, that would match <tt class="literal">WILMA</tt>, which shouldn't match, according to the exercise description.</p> </blockquote></li><li><p>Here's one way to do it:</p><blockquote><pre class="code">while (&lt;&gt;) {
  if (/wilma/) {
    if (/fred/) {
      print;
    }
  }
}</pre></blockquote><p>This tests <tt class="literal">/fred/</tt> only after we find <tt class="literal">/wilma/</tt> matches, but <tt class="literal">fred</tt> could appear before or after <tt class="literal">wilma</tt> in the line; each test is independent of the other.</p><p>If you wanted to avoid the extra nested <tt class="literal">if</tt> test, you might have written something like this:<a href="#FOOTNOTE-392">[392]</a></p><blockquote class="footnote"> <a name="FOOTNOTE-392" /><p>[392] Folks who know about the logical-and operator, which we'll see in <a href="ch10_01.htm">Chapter 10, "More Control Structures"</a>, could do both tests <tt class="literal">/fred/</tt> and <tt class="literal">/wilma/</tt> in the same <tt class="literal">if</tt> conditional. That's more efficient, more scalable, and an all-around better way than the ones given here. But we haven't seen logical-and yet.</p> </blockquote><blockquote><pre class="code">while (&lt;&gt;) {
  if (/wilma.*fred|fred.*wilma/) {
    print;
  }
}</pre></blockquote><p>This works because we'll either have <tt class="literal">wilma</tt> before <tt class="literal">fred</tt>, or <tt class="literal">fred</tt> before <tt class="literal">wilma</tt>. If we had written just <tt class="literal">/wilma.*fred/</tt>, that wouldn't have matched a line like <tt class="literal">fred and wilma flintstone</tt>, even though that line mentions both of them.</p><p>We made this an extra-credit exercise because many folks have a mental block here. We showed you an "or" operation (with the vertical bar, "<tt class="literal">|</tt>"), but we never showed you an "and" operation.
That's because there isn't one in regular expressions.<a href="#FOOTNOTE-393">[393]</a> If you want to know whether one pattern and another are both successful, just test both of them.</p><blockquote class="footnote"> <a name="FOOTNOTE-393" /><p>[393] But there are some tricky and advanced ways of doing what some folks would call an "and" operation. These are generally less efficient than using Perl's logical-and, though, depending upon what optimizations Perl and its regular expression engine can make.</p></blockquote></li></ol><hr width="684" align="left" /><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="appa_05.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"><img alt="Home" border="0" src="../gifs/txthome.gif" /></a></td><td align="right" valign="top" width="228"><a href="appa_07.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr><tr><td align="left" valign="top" width="228">A.5. Answers to Chapter 6 Exercises</td><td align="center" valign="top" width="228"><a href="index/index.htm"><img alt="Book Index" border="0" src="../gifs/index.gif" /></a></td><td align="right" valign="top" width="228">A.7. Answers to Chapter 8 Exercises</td></tr></table></div><hr width="684" align="left" /><img alt="Library Navigation Links" border="0" src="../gifs/navbar.gif" usemap="#library-map" /><p><p><font size="-1"><a href="copyrght.htm">Copyright © 2002</a> O'Reilly & Associates.
All rights reserved.</font></p><map name="library-map"><area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map></body></html>