ch01_19.htm

From "By Tom Christiansen and Nathan Torkingto" · HTM source · 854 lines · page 1 of 2
><B><CODE CLASS="replaceable"><I>100100     0 10116  9564   0   0   1412   928 setup_frame T   p3  0:00 ssh -C www</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>100100     0 26560 26554   0   0   1076   572 setup_frame T   p2  0:00 less</I></CODE></B></CODE>
<CODE CLASS="userinput"><B><CODE CLASS="replaceable"><I>100000   101 19058  9562   0   0   1396   900 setup_frame T   p1  0:02 nvi /tmp/a</I></CODE></B></CODE></PRE></DIV>
<P CLASS="para">The <EM CLASS="emphasis">psgrep</EM> program integrates many techniques presented throughout this book. Stripping strings of leading and trailing whitespace is covered in <A CLASS="xref" HREF="ch01_15.htm" TITLE="Trimming Blanks from the Ends of a String">Recipe 1.14</A>. Converting cut marks into an <CODE CLASS="literal">unpack</CODE> format to extract fixed fields is in <A CLASS="xref" HREF="ch01_02.htm" TITLE="Accessing Substrings">Recipe 1.1</A>. Matching strings with regular expressions is the entire topic of <A CLASS="xref" HREF="ch06_01.htm" TITLE="Pattern Matching">Chapter 6</A>.</P>
<P CLASS="para">The multiline string in the here document passed to <CODE CLASS="literal">die</CODE> is discussed in <A CLASS="xref" HREF="ch01_11.htm" TITLE="Interpolating Functions and Expressions Within Strings">Recipe 1.10</A> and <A CLASS="xref" HREF="ch01_12.htm" TITLE="Indenting Here Documents">Recipe 1.11</A>. The assignment to <CODE CLASS="literal">@fields{@fieldnames}</CODE> sets many values at once in the hash named <CODE CLASS="literal">%fields</CODE>. Hash slices are discussed in <A CLASS="xref" HREF="ch04_08.htm" TITLE="Finding Elements in One Array but Not Another">Recipe 4.7</A> and <A CLASS="xref" HREF="ch05_11.htm" TITLE="Merging Hashes">Recipe 5.10</A>.</P>
<P CLASS="para">The sample program input contained beneath <CODE CLASS="literal">__END__</CODE> is described in <A CLASS="xref" HREF="ch07_07.htm" TITLE="Storing Files Inside Your Program Text">Recipe 7.6</A>.
During development, we used canned input from the <CODE CLASS="literal">DATA</CODE> filehandle for testing purposes. Once the program worked properly, we changed it to read from a piped-in <EM CLASS="emphasis">ps</EM> command but left a remnant of the original filter input to aid in future porting and maintenance. Launching other programs over a pipe is covered in <A CLASS="xref" HREF="ch16_01.htm" TITLE="Process Management and Communication">Chapter 16, <CITE CLASS="chapter">Process Management and Communication</CITE></A>, including <A CLASS="xref" HREF="ch16_11.htm" TITLE="Communicating Between Related Processes">Recipe 16.10</A> and <A CLASS="xref" HREF="ch16_14.htm" TITLE="Listing Available Signals">Recipe 16.13</A>.</P>
<P CLASS="para">The real power and expressiveness in <EM CLASS="emphasis">psgrep</EM> derive from Perl's use of string arguments not as mere strings but directly as Perl code. This is similar to the technique in <A CLASS="xref" HREF="ch09_10.htm" TITLE="Renaming Files">Recipe 9.9</A>, except that in <EM CLASS="emphasis">psgrep</EM>, the user's arguments are wrapped with a routine called <CODE CLASS="literal">is_desirable</CODE>. That way, the cost of compiling strings into Perl code is paid only once, before the program whose output we'll process has even started. For example, asking for UIDs under 10 creates this string to <CODE CLASS="literal">eval</CODE>:</P>
<PRE CLASS="programlisting">eval &quot;sub is_desirable { uid &lt; 10 } &quot; . 1;</PRE>
<P CLASS="para">The mysterious &quot;<CODE CLASS="literal">. 1</CODE>&quot; at the end appends a <CODE CLASS="literal">1</CODE> to the code string so that if the user code compiles, the whole <CODE CLASS="literal">eval</CODE> returns true. That way we don't even have to check <CODE CLASS="literal">$@</CODE> for compilation errors as we do in <A CLASS="xref" HREF="ch10_13.htm" TITLE="Handling Exceptions">Recipe 10.12</A>.</P>
<P CLASS="para">Specifying arbitrary Perl code in a filter to select records is a breathtakingly powerful approach, but it's not entirely original.
Perl owes much to the <EM CLASS="emphasis">awk</EM> programming language, which is often used for such filtering. One problem with <EM CLASS="emphasis">awk</EM> is that it can't easily treat input as fixed-size fields instead of fields separated by something. Another is that the fields are not mnemonically named: <EM CLASS="emphasis">awk</EM> uses <CODE CLASS="literal">$1</CODE>, <CODE CLASS="literal">$2</CODE>, etc. Besides, Perl can do much that <EM CLASS="emphasis">awk</EM> cannot.</P>
<P CLASS="para">The user criteria don't even have to be simple expressions. For example, this call initializes a variable <CODE CLASS="literal">$id</CODE> to user <EM CLASS="emphasis">nobody</EM>'s number to use later in its expression:</P>
<PRE CLASS="programlisting">% psgrep 'no strict &quot;vars&quot;;
          BEGIN { $id = getpwnam(&quot;nobody&quot;) }
          uid == $id '</PRE>
<P CLASS="para">How can we use unquoted words without even a dollar sign, like <CODE CLASS="literal">uid</CODE>, <CODE CLASS="literal">command</CODE>, and <CODE CLASS="literal">size</CODE>, to represent those respective fields in each input record? We directly manipulate the symbol table by assigning closures to indirect <A CLASS="indexterm" NAME="ch01-idx-1000011522-0"></A>typeglobs, which creates functions with those names. The function names are created using both uppercase and lowercase names, allowing both &quot;<CODE CLASS="literal">UID</CODE> <CODE CLASS="literal">&lt;</CODE> <CODE CLASS="literal">10</CODE>&quot; and &quot;<CODE CLASS="literal">uid</CODE> <CODE CLASS="literal">&lt;</CODE> <CODE CLASS="literal">10</CODE>&quot;. Closures are described in <A CLASS="xref" HREF="ch11_05.htm" TITLE="Taking References to Functions">Recipe 11.4</A>, and assigning them to typeglobs to create function aliases is shown in <A CLASS="xref" HREF="ch10_15.htm" TITLE="Redefining a Function">Recipe 10.14</A>.</P>
<P CLASS="para">One twist here not seen in those recipes is empty parentheses on the closure.
These allow us to use the function in an expression anywhere we'd use a single term, like a string or a numeric constant. The empty parentheses create a void prototype so the field-accessing function named <CODE CLASS="literal">uid</CODE> accepts no arguments, just like the built-in function <CODE CLASS="literal">time</CODE>. If these functions weren't prototyped void, expressions like &quot;<CODE CLASS="literal">uid</CODE> <CODE CLASS="literal">&lt;</CODE> <CODE CLASS="literal">10</CODE>&quot; or &quot;<CODE CLASS="literal">size</CODE> <CODE CLASS="literal">/</CODE> <CODE CLASS="literal">2</CODE> <CODE CLASS="literal">&gt;</CODE> <CODE CLASS="literal">rss</CODE>&quot; would confuse the parser because it would see the unterminated start of a wildcard glob and of a pattern match, respectively. Prototypes are discussed in <A CLASS="xref" HREF="ch10_12.htm" TITLE="Prototyping Functions">Recipe 10.11</A>.</P>
<P CLASS="para">The version of <EM CLASS="emphasis">psgrep</EM> demonstrated here expects the output from Red Hat Linux's <EM CLASS="emphasis">ps</EM>. To port it to other systems, check the columns at which the headers begin. This approach isn't limited to <EM CLASS="emphasis">ps</EM> or to Unix systems. It's a generic technique for filtering input records using Perl expressions, easily adapted to other record layouts. The input format could be in columns, space separated, comma separated, or the result of a pattern match with capturing parentheses.</P>
<P CLASS="para">The program could even be modified to handle a user-defined database with a small change to the selection functions.
If you had an array of records as described in <A CLASS="xref" HREF="ch11_10.htm" TITLE="Constructing Records">Recipe 11.9</A>, you could let users specify arbitrary selection criteria, such as:</P>
<PRE CLASS="programlisting">sub id()         { $_-&gt;{ID}   }
sub title()      { $_-&gt;{TITLE} }
sub executive()  { title =~ /(?:vice-)?president/i }

# user search criteria go in the grep clause
@slowburners = grep { id &lt; 10 &amp;&amp; !executive } @employees;</PRE>
<P CLASS="para">For reasons of security and performance, this kind of power is seldom found in database engines like those described in <A CLASS="xref" HREF="ch14_01.htm" TITLE="Database Access">Chapter 14, <CITE CLASS="chapter">Database Access</CITE></A>. SQL doesn't support this, but given Perl and a small bit of ingenuity, it's easy to roll your own. The search engine at <A CLASS="systemitem.url" HREF="http://mox.perl.com/cgi-bin/MxScreen">http://mox.perl.com/cgi-bin/MxScreen</A> uses such a technique, but instead of output from <EM CLASS="emphasis">ps</EM>, its records are Perl hashes loaded from a database. <A CLASS="indexterm" NAME="ch01-idx-1000010111-0"></A><A CLASS="indexterm" NAME="ch01-idx-1000010111-1"></A><A CLASS="indexterm" NAME="ch01-idx-1000010111-2"></A></P></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="684" TITLE="footer"><TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch01_18.htm" TITLE="1.17. Program: fixstyle"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 1.17. Program: fixstyle" BORDER="0"></A></TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="chapter" HREF="ch02_01.htm" TITLE="2. Numbers"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 2. Numbers" BORDER="0"></A></TD></TR><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">1.17.
Program: fixstyle</TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">2. Numbers</TD></TR></TABLE><HR ALIGN="LEFT" WIDTH="684" TITLE="footer"><FONT SIZE="-1"></DIV><!-- LIBRARY NAV BAR --> <img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links"><p> <a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</font> </p> <map name="library-map"> <area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map> </BODY></HTML>
ch01_19.htm - Source notes

This page shows the source file ch01_19.htm from "By Tom Christiansen and Nathan Torkington ISBN 1-56592-243-3 First Edition, published August 1998". It is an HTM (HTML) file of 854 lines.