
📄 ch16_07.htm

📁 By Tom Christiansen and Nathan Torkington ISBN 1-56592-243-3 First Edition, published August 1998
<HTML><HEAD><TITLE>Recipe 16.6. Preprocessing Input (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:43:43Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch16_01.htm" TITLE="16. Process Management and Communication">
<LINK REL="prev" HREF="ch16_06.htm" TITLE="16.5. Filtering Your Own Output">
<LINK REL="next" HREF="ch16_08.htm" TITLE="16.7. Reading STDERR from a Program">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" />
<map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook"><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map>
<div class="navbar"><p>
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch16_06.htm" TITLE="16.5. Filtering Your Own Output"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 16.5. Filtering Your Own Output" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"><A CLASS="chapter" REL="up" HREF="ch16_01.htm" TITLE="16. Process Management and Communication"></A></FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch16_08.htm" TITLE="16.7. Reading STDERR from a Program"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 16.7. Reading STDERR from a Program" BORDER="0"></A></TD>
</TR></TABLE></DIV>
<DIV CLASS="sect1"><H2 CLASS="sect1"><A CLASS="title" NAME="ch16-56421">16.6. Preprocessing Input</A></H2>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch16-pgfId-1210">Problem<A CLASS="indexterm" NAME="ch16-idx-1000006281-0"></A><A CLASS="indexterm" NAME="ch16-idx-1000006281-1"></A><A CLASS="indexterm" NAME="ch16-idx-1000006281-2"></A><A CLASS="indexterm" NAME="ch16-idx-1000006281-3"></A><A CLASS="indexterm" NAME="ch16-idx-1000006281-4"></A></A></H3>
<P CLASS="para">You'd like your programs to work on files with funny formats, such as compressed files or remote web documents specified with a URL, but your program only knows how to access regular text in local files.</P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch16-pgfId-1216">Solution</A></H3>
<P CLASS="para">Take advantage of Perl's easy pipe handling by changing your input files' names to pipes before opening them.</P>
<P CLASS="para">To autoprocess gzipped or compressed files by decompressing them with <EM CLASS="emphasis">gzip</EM>, use:</P>
<PRE CLASS="programlisting">@ARGV = map { /\.(gz|Z)$/ ? &quot;gzip -dc $_ |&quot; : $_  } @ARGV;
while (&lt;&gt;) {
    # .......
}
</PRE>
<P CLASS="para">To fetch URLs before processing them, use the <EM CLASS="emphasis">GET</EM> program from LWP (see <A CLASS="xref" HREF="ch20_01.htm" TITLE="Web Automation">Chapter 20, <CITE CLASS="chapter">Web Automation</CITE></A>):</P>
<PRE CLASS="programlisting">@ARGV = map { m#^\w+://# ? &quot;GET $_ |&quot; : $_ } @ARGV;
while (&lt;&gt;) {
    # .......
}
</PRE>
<P CLASS="para">You might prefer to fetch just the text, of course, not the HTML.
That just means using a different command, perhaps <EM CLASS="emphasis">lynx -dump</EM>.</P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch16-pgfId-1244">Discussion</A></H3>
<P CLASS="para">As shown in <A CLASS="xref" HREF="ch16_02.htm" TITLE="Gathering Output from a Program">Recipe 16.1</A>, Perl's built-in <CODE CLASS="literal">open</CODE> function is magical: you don't have to do anything special to get Perl to open a pipe instead of a file. (That's why it's sometimes called <EM CLASS="emphasis">magic open</EM><A CLASS="indexterm" NAME="ch16-idx-1000006307-0"></A><A CLASS="indexterm" NAME="ch16-idx-1000006307-1"></A><A CLASS="indexterm" NAME="ch16-idx-1000006307-2"></A><A CLASS="indexterm" NAME="ch16-idx-1000006307-3"></A> and, when applied to implicit ARGV processing, <EM CLASS="emphasis">magic ARGV</EM>.) If it looks like a pipe, Perl will open it like a pipe. We take advantage of this by rewriting certain filenames to include a decompression or other preprocessing stage. For example, the file <CODE CLASS="literal">&quot;09tails.gz&quot;</CODE> becomes <CODE CLASS="literal">&quot;gzip</CODE> <CODE CLASS="literal">-dc</CODE> <CODE CLASS="literal">09tails.gz |&quot;</CODE>.</P>
<P CLASS="para">This technique has further applications. Suppose you wanted to read <EM CLASS="emphasis">/etc/passwd</EM> if the machine isn't using NIS, and the output of <EM CLASS="emphasis">ypcat passwd</EM> if it is. You'd use the output of the <EM CLASS="emphasis">domainname</EM> program to decide if you're running NIS, and then set the filename to open to be either <CODE CLASS="literal">&quot;&lt;</CODE> <CODE CLASS="literal">/etc/passwd&quot;</CODE> or <CODE CLASS="literal">&quot;ypcat</CODE> <CODE CLASS="literal">passwd|&quot;</CODE>:</P>
<PRE CLASS="programlisting">$pwdinfo = `domainname` =~ /^(\(none\))?$/ ?
    '&lt; /etc/passwd'
    : 'ypcat  passwd |';
open(PWD, $pwdinfo)
    or die &quot;can't open $pwdinfo: $!&quot;;</PRE>
<P CLASS="para">The wonderful thing is that even if you didn't think to build such processing into your program, Perl already did it for you. Imagine a snippet of code like this:</P>
<PRE CLASS="programlisting">print &quot;File, please? &quot;;
chomp($file = &lt;&gt;);
open (FH, $file)
    or die &quot;can't open $file: $!&quot;;</PRE>
<P CLASS="para">The user can enter a regular filename, or something like <CODE CLASS="literal">&quot;webget</CODE> <CODE CLASS="literal">http://www.perl.com</CODE> <CODE CLASS="literal">|&quot;</CODE> instead, and your program would suddenly be reading from the output of some <EM CLASS="emphasis">webget</EM> program. They could even enter -, a lone minus sign, which, when opened for reading, refers to standard input instead.</P>
<P CLASS="para">This also comes in handy with the automatic ARGV processing we saw in <A CLASS="xref" HREF="ch07_08.htm" TITLE="Writing a Filter">Recipe 7.7</A>.<A CLASS="indexterm" NAME="ch16-idx-1000006283-0"></A><A CLASS="indexterm" NAME="ch16-idx-1000006283-1"></A><A CLASS="indexterm" NAME="ch16-idx-1000006283-2"></A><A CLASS="indexterm" NAME="ch16-idx-1000006283-3"></A><A CLASS="indexterm" NAME="ch16-idx-1000006283-4"></A></P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch16-pgfId-1000005186">See Also</A></H3>
<P CLASS="para"><A CLASS="xref" HREF="ch07_08.htm" TITLE="Writing a Filter">Recipe 7.7</A>; <A CLASS="xref" HREF="ch16_05.htm" TITLE="Reading or Writing to Another Program">Recipe 16.4</A></P></DIV></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch16_06.htm" TITLE="16.5. Filtering Your Own Output"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 16.5. Filtering Your Own Output" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch16_08.htm" TITLE="16.7. Reading STDERR from a Program"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 16.7. Reading STDERR from a Program" BORDER="0"></A></TD>
</TR><TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">16.5. Filtering Your Own Output</TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">16.7. Reading STDERR from a Program</TD>
</TR></TABLE>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer"><FONT SIZE="-1"></DIV>
<!-- LIBRARY NAV BAR --> <img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links"><p> <a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font> </p>
<map name="library-map"> <area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map>
</BODY></HTML>
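
The two filename rewrites in the recipe's Solution can be folded into one helper, sketched below. This is only an illustration: the name to_magic_name and the sample filenames are assumptions, not part of the recipe. The script prints the rewritten names instead of opening them, so it runs without gzip or GET installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the recipe): turn one input name into a
# string that magic open will treat as a pipe where preprocessing is needed.
sub to_magic_name {
    my ($name) = @_;
    # URLs are checked first; a compressed URL would still arrive compressed.
    return "GET $name |"      if $name =~ m#^\w+://#;      # fetched by LWP's GET
    return "gzip -dc $name |" if $name =~ /\.(?:gz|Z)$/;   # gzipped or compressed
    return $name;                                          # plain file, or "-" for STDIN
}

# Sample names are made up for illustration.
my @names = map { to_magic_name($_) }
            ('notes.txt', '09tails.gz', 'http://www.perl.com/', '-');
print "$_\n" for @names;
```

In a real filter you would assign the result back to @ARGV, as the recipe does, and then read with while (<>).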
