<blockquote><pre class="programlisting">sub bark {
    my DOG $spot = shift;
    my %parm = @_;
    my $quality  = $parm{QUALITY}  || "yapping";
    my $quantity = $parm{QUANTITY} || "nonstop";
    ...
}
$fido-&gt;bark( QUANTITY =&gt; "once",
             QUALITY =&gt; "woof" );</pre></blockquote>
Named parameters are often an affordable luxury.  And with Perl, you
get them for free, if you don't count the cost of the hash assignment.</p></li>
<li><p>Repeat Boolean expressions until false.</p></li>
<li><p>Use minimal matching when appropriate.</p></li>
<li><p>Use the <tt class="literal">/e</tt> modifier to evaluate a replacement expression:
<blockquote><pre class="programlisting">#!/usr/bin/perl -p
1 while s/^(.*?)(\t+)/$1 . ' ' x (length($2) * 4 - length($1) % 4)/e;</pre></blockquote>
This program fixes any file you receive from someone who mistakenly
thinks they can redefine hardware tabs to occupy 4 spaces instead
of 8.  It makes use of several important idioms.  First, the <tt class="literal">1 while</tt> idiom
is handy when all the work you want to do in the loop is actually done
by the conditional.  (Perl is smart enough not to warn you that you're
using <tt class="literal">1</tt> in a void context.) We have to repeat this substitution because
each time we substitute some number of spaces in for tabs, we have to
recalculate the column position of the next tab from the beginning.</p>
<p>The <tt class="literal">(.*?)</tt> matches the smallest string it can up until the first tab,
using the minimal matching modifier (the question mark).  In this case,
we could have used an ordinary greedy <tt class="literal">*</tt> like this: <tt class="literal">([^\t]*)</tt>.  But
that only works because a tab is a single character, so we can use a
negated character class to avoid running past the first tab.
In general, the minimal matcher is much more elegant, and doesn't break if the next
thing that must match happens to be longer than one character.</p>
<p>The <tt class="literal">/e</tt> modifier does a substitution using an expression rather than
a mere string.  This lets us do the calculations we need right when
we need them.</p></li>
<li><p>Use creative formatting and comments on complex substitutions:
<blockquote><pre class="programlisting">#!/usr/bin/perl -p
1 while s{
    ^               # anchor to beginning
    (               # start first subgroup
        .*?         # match minimal number of characters
    )               # end first subgroup
    (               # start second subgroup
        \t+         # match one or more tabs
    )               # end second subgroup
}{
    my $spacelen = length($2) * 4;  # account for full tabs
    $spacelen -= length($1) % 4;    # account for the uneven tab
    $1 . ' ' x $spacelen;           # make correct number of spaces
}ex;</pre></blockquote>
This is probably overkill, but some people find it more impressive
than the previous one-liner.  Go figure.</p></li>
<li><p>Go ahead and use <tt class="literal">$`</tt> if you feel like it:
<blockquote><pre class="programlisting">1 while s/(\t+)/' ' x (length($1) * 4 - length($`) % 4)/e;</pre></blockquote>
Here's the shorter version, which uses <tt class="literal">$`</tt>, which is
known to impact performance.  Except that we're only using the length
of it, so it doesn't really count as bad.</p></li>
<li><p>Use the offsets directly from the <tt class="literal">@-</tt>
(<tt class="literal">@LAST_MATCH_START</tt>) and <tt class="literal">@+</tt>
(<tt class="literal">@LAST_MATCH_END</tt>) arrays:
<blockquote><pre class="programlisting">1 while s/\t+/' ' x (($+[0] - $-[0]) * 4 - $-[0] % 4)/e;</pre></blockquote>
This one's even shorter.  (If you don't see any arrays there, try looking for array elements instead.)
See <tt class="literal">@-</tt> and <tt class="literal">@+</tt> in <a href="ch28_01.htm">Chapter 28, "Special Names"</a>.</p></li>
<li><p>Use <tt class="literal">eval</tt> with a constant return value:
<blockquote><pre class="programlisting">sub is_valid_pattern {
    my $pat = shift;
    return eval { "" =~ /$pat/; 1 } || 0;
}</pre></blockquote>
You don't have to use the <tt class="literal">eval {}</tt> operator to return a real value.
Here we always return <tt class="literal">1</tt> if it gets to the end.  However, if the pattern
contained in <tt class="literal">$pat</tt> blows up, the <tt class="literal">eval</tt> catches it and returns <tt class="literal">undef</tt>
to the Boolean conditional of the <tt class="literal">||</tt> operator, which turns it into
a defined <tt class="literal">0</tt> (just to be polite, since <tt class="literal">undef</tt> is also false but might
lead someone to believe that the <tt class="literal">is_valid_pattern</tt> subroutine is
misbehaving, and we wouldn't want that, now would we?).</p></li>
<li><p>Use modules to do all the dirty work.</p></li>
<li><p>Use object factories.</p></li>
<li><p>Use callbacks.</p></li>
<li><p>Use stacks to keep track of context.</p></li>
<li><p>Use negative subscripts to access the end of an array or string:
<blockquote><pre class="programlisting">use XML::Parser;
$p = new XML::Parser Style =&gt; 'subs';
setHandlers $p Char =&gt; sub { $out[-1] .= $_[1] };
push @out, "";

sub literal {
    $out[-1] .= "C&lt;";
    push @out, "";
}

sub literal_ {
    my $text = pop @out;
    $out[-1] .= $text . "&gt;";
}
...</pre></blockquote>
This is a snippet from the 250-line program we used to translate the
XML version of the old Camel book back into pod format so we could edit
it for this edition with a Real Text Editor.</p>
<p>The first thing you'll notice is that we rely on the <tt class="literal">XML::Parser</tt>
module (from CPAN) to parse our XML correctly, so we don't have to
figure out how.
That cuts a few thousand lines out of our program
right there (presuming we're reimplementing in Perl everything
<tt class="literal">XML::Parser</tt> does for us,<a href="#FOOTNOTE-2">[2]</a>
including translation from almost any character set into UTF-8).</p>
<blockquote class="footnote"><a name="FOOTNOTE-2"></a><p>[2] Actually, <tt class="literal">XML::Parser</tt> is just a
fancy wrapper around James Clark's <em class="emphasis">expat</em> XML parser.</p></blockquote>
<p><tt class="literal">XML::Parser</tt> uses a high-level idiom called an <em class="emphasis">object factory</em>.  In
this case, it's a parser factory.  When we create an <tt class="literal">XML::Parser</tt>
object, we tell it which style of parser interface we want, and it
creates one for us.  This is an excellent way to build a testbed
application when you're not sure which kind of interface will turn out
to be the best in the long run.  The <tt class="literal">subs</tt> style is just one of
<tt class="literal">XML::Parser</tt>'s interfaces.  In fact, it's one of the oldest
interfaces, and probably not even the most popular one these days.</p>
<p>The <tt class="literal">setHandlers</tt> line shows a method call on the parser, not in arrow
notation, but in "indirect object" notation, which lets you omit the
parens on the arguments, among other things.  The line also uses the
named parameter idiom we saw earlier.</p>
<p>The line also shows another powerful concept, the notion of a
callback.  Instead of us calling the parser to get the next item, we
tell it to call us.  For named XML tags like <tt class="literal">&lt;literal&gt;</tt>, this
interface style will automatically call a subroutine of that name (or the name with an underline on the end for the corresponding end tag).
But the
data between tags doesn't have a name, so we set up a <tt class="literal">Char</tt> callback
with the <tt class="literal">setHandlers</tt> method.</p>
<p>Next we initialize the <tt class="literal">@out</tt> array, which is a stack of outputs.  We
put a null string into it to represent that we haven't collected any
text at the current tag embedding level (0 initially).</p>
<p>Now is when that callback comes back in.  Whenever we see text, it
automatically gets appended to the final element of the array, via the
<tt class="literal">$out[-1]</tt> idiom in the callback.  At the outer tag level, <tt class="literal">$out[-1]</tt>
is the same as <tt class="literal">$out[0]</tt>, so <tt class="literal">$out[0]</tt> ends up with our whole
output.  (Eventually.  But first we have to deal with tags.)</p>
<p>Suppose we see a <tt class="literal">&lt;literal&gt;</tt> tag.  Then the <tt class="literal">literal</tt> subroutine
gets called, appends some text to the current output, then pushes a new
context onto the <tt class="literal">@out</tt> stack.  Now any text up until the closing tag
gets appended to that new end of the stack.
When we hit the closing
tag, we pop the <tt class="literal">$text</tt> we've collected back off the <tt class="literal">@out</tt> stack,
and append the rest of the transmogrified data to the new (that is, the
old) end of stack, the result of which is to translate the XML string, <tt class="literal">&lt;literal&gt;</tt><em class="replaceable">text</em><tt class="literal">&lt;/literal&gt;</tt>, into the corresponding pod string, <tt class="literal">C&lt;</tt><em class="replaceable">text</em><tt class="literal">&gt;</tt>.</p>
<p>The subroutines for the other tags are just the same, only different.</p></li>
<li><p>Use <tt class="literal">my</tt> without assignment to create an empty array or hash.</p></li>
<li><p>Split the default string on whitespace.</p></li>
<li><p>Assign to lists of variables to collect however many you want.</p></li>
<li><p>Use autovivification of undefined references to create them.</p></li>
<li><p>Autoincrement undefined array and hash elements to create them.</p></li>
<li><p>Use autoincrement of a <tt class="literal">%seen</tt> hash to determine uniqueness.</p></li>
<li><p>Assign to a handy <tt class="literal">my</tt> temporary in the conditional.</p></li>
<li><p>Use the autoquoting behavior of braces.</p></li>
<li><p>Use an alternate quoting mechanism to interpolate double quotes.</p></li>
<li><p>Use the <tt class="literal">?:</tt> operator to switch between two arguments to a <tt class="literal">printf</tt>.</p></li>
<li><p>Line up <tt class="literal">printf</tt> args with their <tt class="literal">%</tt> field:
<blockquote><pre class="programlisting">my %seen;
while (&lt;&gt;) {
    my ($a, $b, $c, $d) = split;
    print unless $seen{$a}{$b}{$c}{$d}++;
}
if (my $tmp = $seen{fee}{fie}{foe}{foo}) {
    printf qq(Saw "fee fie foe foo" [sic] %d time%s.\n),
                                          $tmp,  $tmp == 1 ? "" : "s";
}</pre></blockquote>
These nine lines are just chock full of idioms.  The first line makes
an empty hash because we don't assign anything to it.
We iterate over
input lines setting "it", that is, <tt class="literal">$_</tt>, implicitly,
then using an argumentless <tt class="literal">split</tt> which splits "it"
on whitespace.  Then we pick off the first four words with a list
assignment, throwing any subsequent words away.  Then we remember the
first four words in a four-dimensional hash, which automatically
creates (if necessary) the first three reference elements and final
count element for the autoincrement to increment.  (Under <tt class="literal">use
warnings</tt>, the autoincrement will never warn that you're
using undefined values, because autoincrement is an accepted way to
define undefined values.) We then print out the line if we've never
seen a line starting with these four words before, because the
autoincrement is a postincrement, which, in addition to incrementing
the hash value, will return the old true value if there was one.</p>
<p>After the loop, we test <tt class="literal">%seen</tt> again to see if a
particular combination of four words was seen.  We make use of the
fact that we can put a literal identifier into braces and it will be
autoquoted.  Otherwise, we'd have to say
<tt class="literal">$seen{"fee"}{"fie"}{"foe"}{"foo"}</tt>, which is a drag
even when you're not running from a giant.</p>
<p>We assign the result of <tt class="literal">$seen{fee}{fie}{foe}{foo}</tt>
to a temporary variable even before testing it in the Boolean context
provided by the <tt class="literal">if</tt>. Because assignment returns its
left value, we can still test the value to see if it was true.  The
<tt class="literal">my</tt> tells your eye that it's a new variable, and
we're not testing for equality but doing an assignment.  It would also
work fine without the <tt class="literal">my</tt>, and an expert Perl
programmer would still immediately notice that we used one
<tt class="literal">=</tt> instead of two <tt class="literal">==</tt>.  (A
semiskilled Perl programmer might be fooled, however.
Pascal
programmers of any skill level will foam at the mouth.)</p>
<p>Moving on to the <tt class="literal">printf</tt> statement, you can see the
<tt class="literal">qq()</tt> form of double quotes we used so that we could
interpolate ordinary double quotes as well as a newline.  We could've
directly interpolated <tt class="literal">$tmp</tt> there as well, since
it's effectively a double-quoted string, but we chose to do further
interpolation via <tt class="literal">printf</tt>.  Our temporary
<tt class="literal">$tmp</tt> variable is now quite handy, particularly
since we don't just want to interpolate it, but also test it in the
conditional of a <tt class="literal">?:</tt> operator to see whether we
should pluralize the word "time".  Finally, note that we lined up the
two fields with their corresponding <tt class="literal">%</tt> markers in
the <tt class="literal">printf</tt> format.  If an argument is too long to
fit, you can always go to the next line for the next argument, though
we didn't have to in this case.</p></li></ul>
<p>Whew! Had enough?  There are many more idioms we could discuss, but
this book is already sufficiently heavy.  But we'd like to
talk about one more idiomatic use of Perl, the writing of program
generators.</p>
<a name="INDEX-4281"></a><a name="INDEX-4282"></a><a name="INDEX-4283"></a>
<p><font size="-1"><a href="copyrght.htm">Copyright &copy; 2001</a> O'Reilly &amp; Associates. All rights reserved.</font></p>
<!-- END OF BODY --></body></html>