ch15_04.htm

来自「by Randal L. Schwartz and Tom Phoenix I」· HTM 代码 · 共 442 行 · 第 1/2 页

HTM
442
字号
<html><head><title>Advanced Sorting (Learning Perl, 3rd Edition)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" /><meta name="DC.Creator" content="Randal L. Schwartz and Tom Phoenix" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly &amp; Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="0596001320L" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Learning Perl, 3rd Edition" /><meta name="DC.Type" content="Text.Monograph" /></head><body bgcolor="#ffffff"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Learning Perl, 3rd Edition" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch15_03.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"></a></td><td align="right" valign="top" width="228"><a href="ch15_05.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div><h2 class="sect1">15.4. Advanced Sorting</h2><p><a name="INDEX-1018" />Earlier,in <a href="ch03_01.htm">Chapter 3, "Lists and Arrays "</a>, we showed that you could sort a listin ascending ASCIIbetical order by using the builtin<tt class="literal">sort</tt> operator. What if you want a numeric sort?Or a case-insensitive sort? Or maybe you want to sort items accordingto information stored in a hash. Well, Perl lets you sort a list inwhatever order you'd need; we'll see all of thoseexamples by the end of the chapter.</p><p><a name="INDEX-1019" />You'll tell Perl what orderyou want by making a <em class="firstterm">sort-definitionsubroutine</em>, or <em class="firstterm">sort subroutine</em> forshort. Now, when you first hear the term "sortsubroutine," if you've been through any computer sciencecourses, visions of bubble sort and shell sort and quick sort racethrough your head, and you say, "No, never again!"Don't worry; it's not that bad. In fact, it'spretty simple. Perl already knows how to sort a list of items; itmerely doesn't know which order you want. So thesort-definition subroutine simply tells it the order.</p><p>Why is this necessary? Well, if you think about it, sorting isputting a bunch of things in order by comparing them all. Since youcan't compare them all at once, you need to compare two at atime, eventually using what you find out about each pair'sorder to put the whole kit'n'caboodle in line. Perlalready understands all of those steps <em class="emphasis">except</em>for the part about how you'd like to compare the items, sothat's all you have to write.</p><p>This means that the sort subroutine doesn't need to sort manyitems after all. It merely has to be able to compare two items. If itcan put two items in the proper order, Perl will be able to tell (byrepeatedly consulting the sort subroutine) what order you want foryour data.</p><p>The sort subroutine is defined like an ordinary subroutine (well,almost). This routine will be called repeatedly, each time checkingon a pair of elements from the list to be sorted.</p><p>Now, if you were writing a subroutine that's expecting to gettwo parameters that need sorting, you might write something like thisto start:</p><blockquote><pre class="code">sub any_sort_sub {  # It doesn't really work this way  my($a, $b) = @_;  # Get and name the two parameters  # start comparing $a and $b here  ...}</pre></blockquote><p>But the sort subroutine will be called again and again, oftenhundreds or thousands of times. Declaring the variables<tt class="literal">$a</tt> and <tt class="literal">$b</tt> and assigning themvalues at the top of the subroutine will take just a little time, butmultiply that by the thousands of times that the routine will becalled, and you can see that it contributes significantly to theoverall execution speed.</p><p>We don't do it like that. (In fact, if you did it that way, itwouldn't work.) Instead, it is as if Perl has done this for us,before our subroutine's code has even started.<a href="#FOOTNOTE-340">[340]</a>You'll really write a sort subroutine without that first line;both <tt class="literal">$a</tt> and <tt class="literal">$b</tt> have beenassigned for you. When the sort subroutine starts running,<tt class="literal">$a</tt> and <tt class="literal">$b</tt> are two elements fromthe original list.</p><blockquote class="footnote"><a name="FOOTNOTE-340" /><p>[340]To be honest, it's closer to being as if Perl has used<tt class="literal">local($a, $b) </tt>in a private block around the<tt class="literal">sort </tt>invocation, because these variables arereally globals rather than lexical variables. (Unless you dosomething unusual, though, you can't tell the difference insidethe sort subroutine; you can pretend these are <tt class="literal">my</tt>variables. <tt class="literal">use strict</tt> makes a specialexception for these two globals, so you don't need to declarethem in any way.) This means that if your program already has its own<tt class="literal">$a</tt> or <tt class="literal">$b</tt>, you won't beable to access those while Perl is sorting the list. Also, be sure tonote that the two items to be sorted are <em class="emphasis">not</em>passed in via <tt class="literal">@_</tt> (unless you use a subroutineprototype, which we won't cover in this book, but see thedocumentation for the full details). Inside the sort subroutine, justuse <tt class="literal">$a</tt> and <tt class="literal">$b</tt>, and try not toworry too much about where they came from. And as if thatwasn't enough, if there's a lexical <tt class="literal">$a</tt>or <tt class="literal">$b</tt> somewhere in scope, the subroutinedefinition doesn't work either. Whew!</p> </blockquote><p>The subroutine returns a coded value describing how the elementscompare (like C's <tt class="literal">qsort(3)</tt> does, butit's Perl's own internal sort implementation). If<tt class="literal">$a</tt> should appear before <tt class="literal">$b</tt> inthe final list, the sort subroutine returns <tt class="literal">-1</tt> tosay so. If <tt class="literal">$b</tt> should appear before<tt class="literal">$a</tt>, it returns <tt class="literal">1</tt>.</p><p>If the order of <tt class="literal">$a</tt> and <tt class="literal">$b</tt>doesn't matter, the subroutine returns <tt class="literal">0</tt>.Why would it not matter? Perhaps you're doing acase-insensitive sort and the two strings are <tt class="literal">fred</tt>and <tt class="literal">Fred</tt>. Or perhaps you're doing a numericsort, and the two numbers are equal.</p><p>We could now write a numeric sort subroutine like this:</p><blockquote><pre class="code">sub by_number {  # a sort subroutine, expect $a and $b  if ($a &lt; $b) { -1 } elsif ($a &gt; $b) { 1 } else { 0 }}</pre></blockquote><p>To use the sort subroutine, just put its name (without an ampersand)between the keyword <tt class="literal">sort</tt> and the list to besorted. This example puts a numerically sorted list of numbers into<tt class="literal">@result</tt>:</p><blockquote><pre class="code">my @result = sort by_number @some_numbers;</pre></blockquote><p>We called the subroutine <tt class="literal">by_number</tt> because thatdescribes how it's sorting. But more importantly, you can readthe line of code that uses it with sort as saying "sort bynumber," as you would in English. Many sort-subroutine namesbegin with <tt class="literal">by_</tt> to describe how they sort. Or wecould have called this one <tt class="literal">numerically</tt>, for asimilar reason, but that's more typing and more chance to messsomething up.</p><p>Notice that we don't have to do anything in the sort subroutineto declare <tt class="literal">$a</tt> and <tt class="literal">$b</tt>, and toset their values -- and if we did, the subroutine wouldn'twork right. We just let Perl set up <tt class="literal">$a</tt> and<tt class="literal">$b</tt> for us, and so all we need to write is thecomparison.</p><p>In fact, we can make it even simpler (and more efficient). Since thiskind of three-way comparison is frequent, Perl has a convenientshortcut to use to write it. In this case, we use the<a name="INDEX-1020" /><a name="INDEX-1021" />spaceship operator(<tt class="literal">&lt;=&gt;</tt>).<a href="#FOOTNOTE-341">[341]</a> This operator compares two numbers andreturns <tt class="literal">-1</tt>, <tt class="literal">0</tt>, or<tt class="literal">1</tt> as needed to sort them numerically. So we couldhave written that sort subroutine better, like this:</p><blockquote class="footnote"> <a name="FOOTNOTE-341" /><p>[341]We call it thatbecause it looks like one of the Tie-fighters from <em class="emphasis">StarWars</em><a name="INDEX-1022" />. Well, it looks like that to us,anyway.</p> </blockquote><blockquote><pre class="code">sub by_number { $a &lt;=&gt; $b }</pre></blockquote><p>Since the spaceship compares<a name="INDEX-1023" />numbers, you may have guessedthat there's a corresponding <a name="INDEX-1024" /><a name="INDEX-1025" />three-way string-comparison operator:<tt class="literal">cmp</tt>. These two are easy to remember and keepstraight. The spaceship has a family resemblance to the numeric<a name="INDEX-1026" /><a name="INDEX-1027" />comparisonoperators like <tt class="literal">&gt;=</tt>, but it's threecharacters long instead of two because it has three possible returnvalues instead of two. And <tt class="literal">cmp</tt> has a familyresemblance to the string comparison operators like<tt class="literal">ge</tt>, but it's three characters long insteadof two because it <em class="emphasis">also</em> has three possible returnvalues instead of two.<a href="#FOOTNOTE-342">[342]</a></p><blockquote class="footnote"> <a name="FOOTNOTE-342" /><p>[342]This is no accident. Larrydoes things like this on purpose, to make Perl easier to learn andremember. Remember, he's a linguist at heart, so he'sstudied how people think of languages.</p> </blockquote><p>Of course, <tt class="literal">cmp</tt> by itself provides the same orderas the default sort. You'd never need to write this subroutine,which yields merely the default sort order:<a href="#FOOTNOTE-343">[343]</a></p><blockquote class="footnote"><a name="FOOTNOTE-343" /><p>[343]You'd never need to write this unless, of course, youwere writing an introductory Perl book and needed it for anexample.</p> </blockquote><blockquote><pre class="code">sub ASCIIbetically { $a cmp $b }my @strings = sort ASCIIbetically @any_strings;</pre></blockquote><p>But you can use <tt class="literal">cmp</tt> to build a more complex sortorder, like a case-insensitive sort:</p><blockquote><pre class="code">sub case_insensitive { "\L$a" cmp "\L$b" }</pre></blockquote><p>In this case, we're comparing the string from<tt class="literal">$a</tt> (forced to lowercase) against the string from<tt class="literal">$b</tt> (forced to lowercase), giving acase-insensitive sort order.</p><p>Note that we're not modifying the elements themselves;we're merely using their values. That's actuallyimportant: for efficiency reasons, <tt class="literal">$a</tt> and

⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?