
📄 ch10_18.htm

📁 By Tom Christiansen and Nathan Torkington ISBN 1-56592-243-3 First Edition, published August 1998
<HTML><HEAD><TITLE>Recipe 10.17. Program: Sorting Your Mail (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:40:10Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch10_01.htm" TITLE="10. Subroutines">
<LINK REL="prev" HREF="ch10_17.htm" TITLE="10.16. Nesting Subroutines">
<LINK REL="next" HREF="ch11_01.htm" TITLE="11. References and Records">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<DIV CLASS="sect1"><H2 CLASS="sect1"><A CLASS="title" NAME="ch10-chap10_program_0">10.17. 
Program: Sorting Your Mail</A></H2>
<P CLASS="para"><A CLASS="indexterm" NAME="ch10-idx-1000006301-0"></A><A CLASS="indexterm" NAME="ch10-idx-1000006301-1"></A><A CLASS="indexterm" NAME="ch10-idx-1000006301-2"></A>The program in <A CLASS="xref" HREF="ch10_18.htm#ch10-24677" TITLE="bysub1">Example 10.1</A> sorts a mailbox by subject by reading input a paragraph at a time, looking for one with a <CODE CLASS="literal">&quot;From&quot;</CODE> at the start of a line. When it finds one, it searches for the subject, strips it of any <CODE CLASS="literal">&quot;Re: &quot;</CODE> marks, and stores its lowercased version in the <CODE CLASS="literal">@sub</CODE> array. Meanwhile, the messages themselves are stored in a corresponding <CODE CLASS="literal">@msgs</CODE> array. The <CODE CLASS="literal">$msgno</CODE> variable keeps track of the message number.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch10-24677">Example 10.1: bysub1</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl
# <A CLASS="indexterm" NAME="ch10-idx-1000006317-0"></A>bysub1 - simple sort by subject
my(@msgs, @sub);
my $msgno = -1;
$/ = '';                    # paragraph reads
while (&lt;&gt;) {
    if (/^From/m) {
        /^Subject:\s*(?:Re:\s*)*(.*)/mi;
        $sub[++$msgno] = lc($1) || '';
    }
    $msgs[$msgno] .= $_;
}
for my $i (sort { $sub[$a] cmp $sub[$b] || $a &lt;=&gt; $b } (0 .. $#msgs)) {
    print $msgs[$i];
}</PRE></DIV>
<P CLASS="para">That <CODE CLASS="literal">sort</CODE> is only sorting array indices. If the subjects are the same, <CODE CLASS="literal">cmp</CODE> returns 0, so the second part of the <CODE CLASS="literal">||</CODE> is taken, which compares the message numbers in the order they originally appeared.</P>
<P CLASS="para">If <CODE CLASS="literal">sort</CODE> were fed a list like <CODE CLASS="literal">(0,1,2,3)</CODE>, that list would get sorted into a different permutation, perhaps <CODE CLASS="literal">(2,1,3,0)</CODE>. 
We iterate across them with a <CODE CLASS="literal">for</CODE> loop to print out each message.</P>
<P CLASS="para"><A CLASS="xref" HREF="ch10_18.htm#ch10-30783" TITLE="bysub2">Example 10.2</A> shows how an <I CLASS="filename">awk</I> programmer might code this program, using the <B CLASS="emphasis.bold">-00</B> switch to read paragraphs instead of lines.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch10-30783">Example 10.2: bysub2</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -n00
# <A CLASS="indexterm" NAME="ch10-idx-1000004769-0"></A>bysub2 - awkish sort-by-subject
BEGIN { $msgno = -1 }
$sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From/m;
$msg[$msgno] .= $_;
END { print @msg[ sort { $sub[$a] cmp $sub[$b] || $a &lt;=&gt; $b } (0 .. $#msg) ] }</PRE></DIV>
<P CLASS="para"><A CLASS="indexterm" NAME="ch10-idx-1000004764-0"></A>Parallel arrays have been a Perl idiom since the language's early days, but keeping each message in a hash is a more elegant solution. We'll build an anonymous hash for each message, as described in <A CLASS="xref" HREF="ch11_01.htm" TITLE="References and Records">Chapter 11</A>, and sort on its fields.</P>
<P CLASS="para"><A CLASS="xref" HREF="ch10_18.htm#ch10-11145" TITLE="bysub3">Example 10.3</A> is a program similar in spirit to <A CLASS="xref" HREF="ch10_18.htm#ch10-24677" TITLE="bysub1">Example 10.1</A> and <A CLASS="xref" HREF="ch10_18.htm#ch10-30783" TITLE="bysub2">Example 10.2</A>.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch10-11145">Example 10.3: bysub3</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -00
# bysub3<A CLASS="indexterm" NAME="ch10-idx-1000004771-0"></A> - sort by subject using hash records
use strict;
my @msgs = ();
while (&lt;&gt;) {
    push @msgs, {
        # a failed match returns an empty list here, so fall back to ''
        SUBJECT =&gt; (/^Subject:\s*(?:Re:\s*)*(.*)/mi ? $1 : ''),
        NUMBER  =&gt; scalar @msgs,   # which msgno this is
        TEXT    =&gt; '',
    } if /^From/m;
    $msgs[-1]{TEXT} .= $_;
}

for my $msg (sort {
                      $a-&gt;{SUBJECT} cmp $b-&gt;{SUBJECT}
                              ||
                      $a-&gt;{NUMBER}  &lt;=&gt; $b-&gt;{NUMBER}
                  } @msgs
         )
{
    print $msg-&gt;{TEXT};
}</PRE></DIV>
<P CLASS="para"><A CLASS="indexterm" NAME="ch10-idx-1000004763-0"></A>Once we have real hashes, adding further sorting criteria is simple. A common way to sort a folder is by subject first and by date second. The hard part is figuring out how to parse and compare dates. Date::Manip does this, returning a string we can compare; however, the <EM CLASS="emphasis">datesort</EM> program in <A CLASS="xref" HREF="ch10_18.htm#ch10-26587" TITLE="datesort (continued)">Example 10.4</A>, which uses Date::Manip, runs more than 10 times slower than the previous one. Parsing dates in unpredictable formats is extremely slow.</P>
<DIV CLASS="example"><H4 CLASS="example"><A CLASS="title" NAME="ch10-26587">Example 10.4: datesort</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -00
# <A CLASS="indexterm" NAME="ch10-idx-1000004780-0"></A>datesort - sort mbox by subject then date
use strict;
use Date::Manip;
my @msgs = ();
while (&lt;&gt;) {
    next unless /^From/m;
    my $date = '';
    if (/^Date:\s*(.*)/m) {
        ($date = $1) =~ s/\s+\(.*//;  # library hates (MST)
        $date = ParseDate($date);
    }
    push @msgs, {
        # a failed match returns an empty list here, so fall back to ''
        SUBJECT =&gt; (/^Subject:\s*(?:Re:\s*)*(.*)/mi ? $1 : ''),
        DATE    =&gt; $date,
        NUMBER  =&gt; scalar @msgs,
        TEXT    =&gt; '',
    };
} continue {
    $msgs[-1]{TEXT} .= $_;
}

for my $msg (sort {
                      $a-&gt;{SUBJECT} cmp $b-&gt;{SUBJECT}
                              ||
                      $a-&gt;{DATE}    cmp $b-&gt;{DATE}
                              ||
                      $a-&gt;{NUMBER}  &lt;=&gt; $b-&gt;{NUMBER}
                  } @msgs
         )
{
    print $msg-&gt;{TEXT};
}</PRE></DIV>
<P CLASS="para"><A CLASS="xref" HREF="ch10_18.htm#ch10-26587" TITLE="datesort (continued)">Example 10.4</A> is written to draw attention to the 
<CODE CLASS="literal">continue</CODE> block. When control reaches the end of the loop body, whether by falling through or by way of a <CODE CLASS="literal">next</CODE>, the whole <CODE CLASS="literal">continue</CODE> block is executed. It corresponds to the third portion of a three-part <CODE CLASS="literal">for</CODE> loop, except that the <CODE CLASS="literal">continue</CODE> block isn't restricted to an expression. It's a full block, with separate <A CLASS="indexterm" NAME="ch10-idx-1000004759-0"></A><A CLASS="indexterm" NAME="ch10-idx-1000004759-1"></A><A CLASS="indexterm" NAME="ch10-idx-1000004759-2"></A>statements.</P>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch10-pgfId-1000005909">See Also</A></H3>
<P CLASS="para">The <A CLASS="olink" HREF="../prog/ch03_153.htm"><CODE CLASS="literal">sort</CODE></A> function in <A CLASS="olink" HREF="../prog/ch03_01.htm">Chapter 3</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A> and in <I CLASS="filename">perlfunc</I>(1); the discussion of the <A CLASS="olink" HREF="../prog/ch02_09.htm#PERL2-CH-2-SECT-9.3"><CODE CLASS="literal">$/</CODE></A> variable in <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>, <I CLASS="filename">perlvar</I>(1), and the Introduction to <A CLASS="xref" HREF="ch08_01.htm" TITLE="File Contents">Chapter 8, <CITE CLASS="chapter">File Contents</CITE></A>; <A CLASS="xref" HREF="ch03_08.htm" TITLE="Parsing Dates and Times from Strings">Recipe 3.7</A>; <A CLASS="xref" HREF="ch04_16.htm" TITLE="Sorting a List by Computable Field">Recipe 4.15</A>; <A CLASS="xref" HREF="ch05_10.htm" TITLE="Sorting a Hash">Recipe 5.9</A>; <A CLASS="xref" HREF="ch11_10.htm" TITLE="Constructing Records">Recipe 11.9</A> 
<A CLASS="indexterm" NAME="ch10-idx-1000004629-0"></A></P></DIV></DIV>
<DIV CLASS="htmlnav"><HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
<P><FONT SIZE="-1"><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</FONT></P></DIV>
</BODY></HTML>
