
ch01_18.htm

By Tom Christiansen and Nathan Torkington. ISBN 1-56592-243-3. First Edition, published August 1998.
<HTML><HEAD><TITLE>Recipe 1.17. Program: fixstyle (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:29:23Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch01_01.htm" TITLE="1. Strings">
<LINK REL="prev" HREF="ch01_17.htm" TITLE="1.16. Soundex Matching">
<LINK REL="next" HREF="ch01_19.htm" TITLE="1.18. Program: psgrep">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" />
<map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook"><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map>
<div class="navbar"><p>
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch01_17.htm" TITLE="1.16. Soundex Matching"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 1.16. Soundex Matching" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"><A CLASS="chapter" REL="up" HREF="ch01_01.htm" TITLE="1. Strings"></A></FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch01_19.htm" TITLE="1.18. Program: psgrep"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 1.18. Program: psgrep" BORDER="0"></A></TD>
</TR>
</TABLE></DIV>
<DIV CLASS="sect1">
<H2 CLASS="sect1"><A CLASS="title" NAME="ch01-97488">1.17. Program: fixstyle</A></H2>
<P CLASS="para"><A CLASS="indexterm" NAME="ch01-idx-1000011382-0"></A><A CLASS="indexterm" NAME="ch01-idx-1000011382-1"></A><A CLASS="indexterm" NAME="ch01-idx-1000011382-2"></A>Imagine you have a table with both old and new strings, such as the following.</P>
<TABLE CLASS="informaltable" BORDER="1" CELLPADDING="3">
<THEAD CLASS="thead">
<TR CLASS="row" VALIGN="TOP"><TH CLASS="entry" ALIGN="LEFT" ROWSPAN="1" COLSPAN="1"><P CLASS="para">Old Words</P></TH><TH CLASS="entry" ALIGN="LEFT" ROWSPAN="1" COLSPAN="1"><P CLASS="para">New Words</P></TH></TR>
</THEAD>
<TBODY CLASS="tbody">
<TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">bonnet</P></TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">hood</P></TD></TR>
<TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">rubber</P></TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">eraser</P></TD></TR>
<TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">lorry</P></TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">truck</P></TD></TR>
<TR CLASS="row" VALIGN="TOP"><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">trousers</P></TD><TD CLASS="entry" ROWSPAN="1" COLSPAN="1"><P CLASS="para">pants</P></TD></TR>
</TBODY>
</TABLE>
<P CLASS="para">The program in <A CLASS="xref" HREF="ch01_18.htm#ch01-38470" TITLE="fixstyle">Example 1.4</A> is a filter that changes all occurrences of each element in the first set to the corresponding element in the second set.</P>
<P CLASS="para">When called without filename arguments, the program is a simple filter. If filenames are supplied on the command line, an in-place edit writes the changes to the files, with each original version saved in a file with a &quot;<CODE CLASS="literal">.orig</CODE>&quot; extension. See <A CLASS="xref" HREF="ch07_10.htm" TITLE="Modifying a File in Place with -i Switch">Recipe 7.9</A> for a description.
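</P>
<P CLASS="para">As a rough illustration of the transformation itself, apart from any file handling, the sketch below applies the word pairs from the table to a single string. The <CODE CLASS="literal">fix_text</CODE> helper and the <CODE CLASS="literal">@pairs</CODE> list are hypothetical names chosen for this illustration; they are not part of fixstyle.</P>
<PRE CLASS="programlisting">use strict;
use warnings;

# Word pairs from the table above as a flat list: old, new, old, new, ...
my @pairs = qw(bonnet hood rubber eraser lorry truck trousers pants);

# fix_text is a hypothetical helper for this sketch, not part of fixstyle.
sub fix_text {
    my ($text) = @_;
    my @todo = @pairs;                       # work on a copy of the list
    while (my ($old, $new) = splice(@todo, 0, 2)) {
        $text =~ s/\Q$old\E/$new/g;          # \Q...\E quotes metacharacters
    }
    return $text;
}

print fix_text("The lorry driver opened the bonnet.\n");
</PRE>
<P CLASS="para">Run as written, this should print &quot;The truck driver opened the hood.&quot; The patterns carry no word-boundary anchors, so in this sketch a longer word that merely contains an old string is changed as well.</P>
<P CLASS="para">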
A <B CLASS="emphasis.bold">-v</B> command-line option writes notification of each change to standard error.</P>
<P CLASS="para">The table of original strings and their replacements is stored below <CODE CLASS="literal">__END__</CODE> in the main program as described in <A CLASS="xref" HREF="ch07_07.htm" TITLE="Storing Files Inside Your Program Text">Recipe 7.6</A>. Each pair of strings is converted into carefully escaped substitutions and accumulated into the <CODE CLASS="literal">$code</CODE> variable, as in the <EM CLASS="emphasis">popgrep2</EM> program in <A CLASS="xref" HREF="ch06_11.htm" TITLE="Speeding Up Interpolated Matches">Recipe 6.10</A>.</P>
<P CLASS="para">A <CODE CLASS="literal">-t</CODE> test for an interactive run tells whether we're expecting to read from the keyboard when no arguments are supplied. That way, if users forget to give an argument, they aren't left wondering why the program appears to be hung.</P>
<DIV CLASS="example">
<H4 CLASS="example"><A CLASS="title" NAME="ch01-38470">Example 1.4: fixstyle</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w
# fixstyle - switch first set of &lt;DATA&gt; strings to second set
#   usage: $0 [-v] [files ...]
use strict;
my $verbose = (@ARGV &amp;&amp; $ARGV[0] eq '-v' &amp;&amp; shift);

if (@ARGV) {
    $^I = &quot;.orig&quot;;          # preserve old files
} else {
    warn &quot;$0: Reading from stdin\n&quot; if -t STDIN;
}

my $code = &quot;while (&lt;&gt;) {\n&quot;;
# read in config, build up code to eval
while (&lt;DATA&gt;) {
    chomp;
    my ($in, $out) = split /\s*=&gt;\s*/;
    next unless $in &amp;&amp; $out;
    $code .= &quot;s{\\Q$in\\E}{$out}g&quot;;
    $code .= &quot;&amp;&amp; printf STDERR qq($in =&gt; $out at \$ARGV line \$.\\n)&quot;
                                                         if $verbose;
    $code .= &quot;;\n&quot;;
}
$code .= &quot;print;\n}\n&quot;;

eval &quot;{ $code } 1&quot; || die;

__END__
analysed        =&gt; analyzed
built-in        =&gt; builtin
chastized       =&gt; chastised
commandline     =&gt; command-line
de-allocate     =&gt; deallocate
dropin          =&gt; drop-in
hardcode        =&gt; hard-code
meta-data       =&gt; metadata
multicharacter  =&gt; multi-character
multiway        =&gt; multi-way
non-empty       =&gt; nonempty
non-profit      =&gt; nonprofit
non-trappable   =&gt; nontrappable
pre-define      =&gt; predefine
preextend       =&gt; pre-extend
re-compiling    =&gt; recompiling
reenter         =&gt; re-enter
turnkey         =&gt; turn-key
</PRE>
</DIV>
<P CLASS="para">One caution: this program is fast, but it doesn't scale if you need to make hundreds of changes. The larger the <CODE CLASS="literal">DATA</CODE> section, the longer it takes. A few dozen changes won't slow it down; in fact, the version in Example 1.4 is the faster one for that case. But if you run the program on hundreds of changes, it will bog down.</P>
<P CLASS="para"><A CLASS="xref" HREF="ch01_18.htm#ch01-36847" TITLE="fixstyle2">Example 1.5</A> is a version that's slower for few changes but faster when there are many changes.</P>
<DIV CLASS="example">
<H4 CLASS="example"><A CLASS="title" NAME="ch01-36847">Example 1.5: fixstyle2</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w
# <A CLASS="indexterm" NAME="ch01-idx-1000010675-0"></A>fixstyle2 - like fixstyle but faster for many many matches
use strict;
my $verbose = (@ARGV &amp;&amp; $ARGV[0] eq '-v' &amp;&amp; shift);

my %change = ();
while (&lt;DATA&gt;) {
    chomp;
    my ($in, $out) = split /\s*=&gt;\s*/;
    next unless $in &amp;&amp; $out;
    $change{$in} = $out;
}

if (@ARGV) {
    $^I = &quot;.orig&quot;;
} else {
    warn &quot;$0: Reading from stdin\n&quot; if -t STDIN;
}

while (&lt;&gt;) {
    my $i = 0;
    s/^(\s+)// &amp;&amp; print $1;         # emit leading whitespace
    for (split /(\s+)/, $_, -1) {   # preserve trailing whitespace
        print( ($i++ &amp; 1) ? $_ : ($change{$_} || $_));
    }
}

__END__
analysed        =&gt; analyzed
built-in        =&gt; builtin
chastized       =&gt; chastised
commandline     =&gt; command-line
de-allocate     =&gt; deallocate
dropin          =&gt; drop-in
hardcode        =&gt; hard-code
meta-data       =&gt; metadata
multicharacter  =&gt; multi-character
multiway        =&gt; multi-way
non-empty       =&gt; nonempty
non-profit      =&gt; nonprofit
non-trappable   =&gt; nontrappable
pre-define      =&gt; predefine
preextend       =&gt; pre-extend
re-compiling    =&gt; recompiling
reenter         =&gt; re-enter
turnkey         =&gt; turn-key
</PRE>
</DIV>
<P CLASS="para">This version breaks each line into chunks of whitespace and words, which isn't a fast operation. It then uses those words to look up their replacements in a hash, which is much faster than a substitution. So the first part is slower, the second faster. The difference in speed depends on the number of matches.</P>
<P CLASS="para">If we don't care about preserving the amount of whitespace between words, the second version can run as fast as the first even for a few changes. If you know a lot about your input, you can collapse whitespace into single blanks by plugging in this loop:</P>
<PRE CLASS="programlisting"># very fast, but whitespace collapse
while (&lt;&gt;) {
    for (split) {
        print $change{$_} || $_, &quot; &quot;;
    }
    print &quot;\n&quot;;
}
</PRE>
<P CLASS="para">That leaves an extra blank at the end of each line. If that's a problem, you could use the technique from <A CLASS="xref" HREF="ch16_15.htm" TITLE="Sending a Signal">Recipe 16.14</A> to install an output filter.
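</P>
<P CLASS="para">Before turning to the filter, one alternative that is not part of the original program is to build each output line with <CODE CLASS="literal">join</CODE>, which puts blanks only between words. The sketch below is a minimal illustration under that assumption; the <CODE CLASS="literal">collapse_line</CODE> helper and the two-entry <CODE CLASS="literal">%change</CODE> table are hypothetical stand-ins.</P>
<PRE CLASS="programlisting">use strict;
use warnings;

# A small sample of the replacement table, written as old, new pairs.
my %change = ('analysed', 'analyzed', 'built-in', 'builtin');

# collapse_line is a hypothetical helper for this sketch.
sub collapse_line {
    my ($line) = @_;
    # split ' ' skips leading whitespace and splits on whitespace runs,
    # so join puts exactly one blank between words and none at the end.
    my @words = map { $change{$_} || $_ } split ' ', $line;
    return join(' ', @words) . "\n";
}

print collapse_line("  it was   analysed by a built-in   tool \n");
</PRE>
<P CLASS="para">This keeps everything in one process, at the cost of assembling each line in memory before printing it.</P>
<P CLASS="para">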
Place the following code in front of the <CODE CLASS="literal">while</CODE> loop that collapses whitespace:</P>
<PRE CLASS="programlisting">my $pid = open(STDOUT, &quot;|-&quot;);
die &quot;cannot fork: $!&quot; unless defined $pid;
unless ($pid) {             # child
    while (&lt;STDIN&gt;) {
        s/ $//;
        print;
    }
    exit;
}
<A CLASS="indexterm" NAME="ch01-idx-1000010348-0"></A><A CLASS="indexterm" NAME="ch01-idx-1000010348-1"></A><A CLASS="indexterm" NAME="ch01-idx-1000010348-2"></A></PRE>
</DIV>
<DIV CLASS="htmlnav"><P></P>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch01_17.htm" TITLE="1.16. Soundex Matching"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 1.16. Soundex Matching" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch01_19.htm" TITLE="1.18. Program: psgrep"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 1.18. Program: psgrep" BORDER="0"></A></TD>
</TR>
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">1.16. Soundex Matching</TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">1.18. Program: psgrep</TD>
</TR>
</TABLE>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
</DIV>
<!-- LIBRARY NAV BAR -->
<img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links">
<p><FONT SIZE="-1"><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</FONT></p>
<map name="library-map">
<area shape="rect" coords="1,0,85,94" href="../index.htm">
<area shape="rect" coords="86,1,178,103" href="../lwp/index.htm">
<area shape="rect" coords="180,0,265,103" href="../lperl/index.htm">
<area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm">
<area shape="rect" coords="354,1,446,115" href="../prog/index.htm">
<area shape="rect" coords="448,0,526,132" href="../tk/index.htm">
<area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm">
<area shape="rect" coords="617,0,690,135" href="../pxml/index.htm">
</map>
</BODY></HTML>
