ch04_07.htm

<HTML><HEAD><TITLE>Recipe 4.6. Extracting Unique Elements from a List (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:31:26Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch04_01.htm" TITLE="4. Arrays">
<LINK REL="prev" HREF="ch04_06.htm" TITLE="4.5. Iterating Over an Array by Reference">
<LINK REL="next" HREF="ch04_08.htm" TITLE="4.7. Finding Elements in One Array but Not Another">
</HEAD><BODY BGCOLOR="#FFFFFF"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook"><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map>
<div class="navbar"><p><TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch04_06.htm" TITLE="4.5. Iterating Over an Array by Reference"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 4.5. Iterating Over an Array by Reference" BORDER="0"></A></TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"><A CLASS="chapter" REL="up" HREF="ch04_01.htm" TITLE="4. Arrays"></A></FONT></B></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch04_08.htm" TITLE="4.7. Finding Elements in One Array but Not Another"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 4.7. Finding Elements in One Array but Not Another" BORDER="0"></A></TD></TR></TABLE></DIV>
<DIV CLASS="sect1"><H2 CLASS="sect1"><A CLASS="title" NAME="ch04-17421">4.6. Extracting Unique Elements from a List</A></H2>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch04-pgfId-663">Problem <A CLASS="indexterm" NAME="ch04-idx-1000006652-0"></A><A CLASS="indexterm" NAME="ch04-idx-1000006652-1"></A><A CLASS="indexterm" NAME="ch04-idx-1000006652-2"></A><A CLASS="indexterm" NAME="ch04-idx-1000006652-3"></A></A></H3>
<P CLASS="para">You want to eliminate duplicate values from a list, such as when you build the list from a file or from the output of another command. This recipe is equally applicable to removing duplicates as they occur in input and to removing duplicates from an array you've already populated.</P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch04-pgfId-669">Solution</A></H3>
<P CLASS="para">Use a hash to record which items have been seen, then <CODE CLASS="literal">keys</CODE> to extract them.
You can use Perl's idea of truth to shorten and speed up your code.</P>
<DIV CLASS="sect3"><H4 CLASS="sect3"><A CLASS="title" NAME="ch04-16198">Straightforward</A></H4>
<PRE CLASS="programlisting">%seen = ();
@uniq = ();
foreach $item (@list) {
    unless ($seen{$item}) {
        # if we get here, we have not seen it before
        $seen{$item} = 1;
        push(@uniq, $item);
    }
}</PRE></DIV>
<DIV CLASS="sect3"><H4 CLASS="sect3"><A CLASS="title" NAME="ch04-19255">Faster</A></H4>
<PRE CLASS="programlisting">%seen = ();
foreach $item (@list) {
    push(@uniq, $item) unless $seen{$item}++;
}</PRE></DIV>
<DIV CLASS="sect3"><H4 CLASS="sect3"><A CLASS="title" NAME="ch04-27116">Similar but with user function</A></H4>
<PRE CLASS="programlisting">%seen = ();
foreach $item (@list) {
    some_func($item) unless $seen{$item}++;
}</PRE></DIV>
<DIV CLASS="sect3"><H4 CLASS="sect3"><A CLASS="title" NAME="ch04-38870">Faster but different</A></H4>
<PRE CLASS="programlisting">%seen = ();
foreach $item (@list) {
    $seen{$item}++;
}
@uniq = keys %seen;</PRE></DIV>
<DIV CLASS="sect3"><H4 CLASS="sect3"><A CLASS="title" NAME="ch04-17367">Faster and even more different</A></H4>
<PRE CLASS="programlisting">%seen = ();
@uniq = grep { ! $seen{$_} ++ } @list;</PRE></DIV></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch04-pgfId-1000005726">Discussion</A></H3>
<P CLASS="para"><A CLASS="indexterm" NAME="ch04-idx-1000006658-0"></A>The question at the heart of the matter is "Have I seen this element before?" Hashes are ideally suited to such lookups. The first technique (<A CLASS="xref" HREF="ch04_07.htm#ch04-16198" TITLE="Straightforward">"Straightforward</A>") builds up the array of unique values as we go along, using a hash to record whether something is already in the array.</P>
<P CLASS="para">The second technique (<A CLASS="xref" HREF="ch04_07.htm#ch04-19255" TITLE="Faster">"Faster</A>") is the most natural way to write this sort of thing in Perl.
It creates a new entry in the hash every time it sees an element that hasn't been seen before, using the <CODE CLASS="literal">++</CODE> operator. This has the side effect of making the hash record the number of times the element was seen. This time we only use the hash for its property of working like a set.</P>
<P CLASS="para">The third example (<A CLASS="xref" HREF="ch04_07.htm#ch04-27116" TITLE="Similar but with user function">"Similar but with user function</A>") is similar to the second, but rather than storing the item away, we call some user-defined function with that item as its argument. If that's all we're doing, keeping a spare array of those unique values is unnecessary.</P>
<P CLASS="para">The next mechanism (<A CLASS="xref" HREF="ch04_07.htm#ch04-38870" TITLE="Faster but different">"Faster but different</A>") waits until it's done processing the list to extract the unique keys from the <CODE CLASS="literal">%seen</CODE> hash. This may be convenient, but the original order has been lost.</P>
<P CLASS="para">The final approach (<A CLASS="xref" HREF="ch04_07.htm#ch04-17367" TITLE="Faster and even more different">"Faster and even more different</A>") merges the construction of the <CODE CLASS="literal">%seen</CODE> hash with the extraction of unique elements. This preserves the original order of elements.</P>
<P CLASS="para">Using a hash to record the values has two side effects: processing long lists can take a lot of memory, and the list returned by <CODE CLASS="literal">keys</CODE> is not in alphabetical, numeric, or insertion order.</P>
<P CLASS="para">Here's an example of processing input as it is read.
We use <CODE CLASS="literal">`who`</CODE> to gather information on the current user list, and then we extract the username from each line before updating the hash:</P>
<PRE CLASS="programlisting"># generate a list of users logged in, removing duplicates
%ucnt = ();
for (`who`) {
    s/\s.*\n//;   # kill from first space till end-of-line, yielding username
    $ucnt{$_}++;  # record the presence of this user
}
# extract and print unique keys
@users = sort keys %ucnt;
print &quot;users logged in: @users\n&quot;;</PRE></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch04-pgfId-711">See Also</A></H3>
<P CLASS="para">The "Foreach Loops" section of <I CLASS="filename">perlsyn</I>(1) and <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the <CODE CLASS="literal">keys</CODE> function in <I CLASS="filename">perlfunc</I>(1) and <A CLASS="olink" HREF="../prog/ch03_01.htm">Chapter 3</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the <A CLASS="olink" HREF="../prog/ch02_03.htm#PERL2-CH-2-SECT-3.5">"Hashes (Associative Arrays)"</A> section of <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; <A CLASS="xref" HREF="ch05_01.htm" TITLE="Hashes">Chapter 5</A>; we use hashes in a similar fashion in <A CLASS="xref" HREF="ch04_08.htm" TITLE="Finding Elements in One Array but Not Another">Recipe 4.7</A> and <A CLASS="xref" HREF="ch04_09.htm" TITLE="Computing Union, Intersection, or Difference of Unique Lists">Recipe 4.8</A>
<A CLASS="indexterm" NAME="ch04-idx-1000006654-0"></A><A CLASS="indexterm" NAME="ch04-idx-1000006654-1"></A><A CLASS="indexterm" NAME="ch04-idx-1000006654-2"></A><A CLASS="indexterm" NAME="ch04-idx-1000006654-3"></A></P></DIV></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="684" TITLE="footer"><TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch04_06.htm" TITLE="4.5. Iterating Over an Array by Reference"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 4.5. Iterating Over an Array by Reference" BORDER="0"></A></TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch04_08.htm" TITLE="4.7. Finding Elements in One Array but Not Another"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 4.7. Finding Elements in One Array but Not Another" BORDER="0"></A></TD></TR><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">4.5. Iterating Over an Array by Reference</TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">4.7. Finding Elements in One Array but Not Another</TD></TR></TABLE><HR ALIGN="LEFT" WIDTH="684" TITLE="footer"><FONT SIZE="-1"></DIV>
<!-- LIBRARY NAV BAR --> <img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links"><p> <a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font> </p>
<map name="library-map"> <area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map> </BODY></HTML>
Source file ch04_07.htm from the Perl Cookbook by Tom Christiansen and Nathan Torkington, ISBN 1-56592-243-3, First Edition, published August 1998.