By Tom Christiansen and Nathan Torkington. ISBN 1-56592-243-3. First Edition, published August 1998.
<HTML><HEAD><TITLE>Recipe 6.8. Extracting a Range of Lines (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:34:19Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch06_01.htm" TITLE="6. Pattern Matching">
<LINK REL="prev" HREF="ch06_08.htm" TITLE="6.7. Reading Records with a Pattern Separator">
<LINK REL="next" HREF="ch06_10.htm" TITLE="6.9. Matching Shell Globs as Regular Expressions">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook"><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map>
<div class="navbar"><p><TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch06_08.htm" TITLE="6.7. Reading Records with a Pattern Separator"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 6.7. Reading Records with a Pattern Separator" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"><A CLASS="chapter" REL="up" HREF="ch06_01.htm" TITLE="6. Pattern Matching"></A></FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch06_10.htm" TITLE="6.9. Matching Shell Globs as Regular Expressions"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 6.9. 
Matching Shell Globs as Regular Expressions" BORDER="0"></A></TD></TR></TABLE></DIV>
<DIV CLASS="sect1"><H2 CLASS="sect1"><A CLASS="title" NAME="ch06-chap06_extracting_0">6.8. Extracting a Range of Lines</A></H2>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch06-pgfId-921">Problem<A CLASS="indexterm" NAME="ch06-idx-1000007599-0"></A><A CLASS="indexterm" NAME="ch06-idx-1000007599-1"></A><A CLASS="indexterm" NAME="ch06-idx-1000007599-2"></A></A></H3>
<P CLASS="para">You want to extract all lines from a starting pattern through an ending pattern, or from a starting line number through an ending line number.</P>
<P CLASS="para">Common examples are extracting the first 10 lines of a file (line numbers 1 to 10) or just the body of a mail message (everything past the first blank line).</P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch06-pgfId-929">Solution</A></H3>
<P CLASS="para"><A CLASS="indexterm" NAME="ch06-idx-1000007605-0"></A><A CLASS="indexterm" NAME="ch06-idx-1000007605-1"></A>Use the operators <CODE CLASS="literal">..</CODE> or <CODE CLASS="literal">...</CODE> with patterns or line numbers. The <CODE CLASS="literal">..</CODE> operator tests its right operand on the same line on which its left operand becomes true, so a range can begin and end on a single line.</P>
<PRE CLASS="programlisting">while (&lt;&gt;) {
    if (/BEGIN PATTERN/ .. /END PATTERN/) {
        # line falls between BEGIN and END in the
        # text, inclusive.
    }
}

while (&lt;&gt;) {
    # $FIRST_LINE_NUM and $LAST_LINE_NUM stand for literal numbers, as in 15 .. 17
    if ($FIRST_LINE_NUM .. $LAST_LINE_NUM) {
        # operate only between first and last line, inclusive.
    }
}</PRE>
<P CLASS="para">The <CODE CLASS="literal">...</CODE> operator waits until the next line before it first tests its right operand, so its range never ends on the line where it began.</P>
<PRE CLASS="programlisting">while (&lt;&gt;) {
    if (/BEGIN PATTERN/ ... /END PATTERN/) {
        # line is between BEGIN and END on different lines
    }
}

while (&lt;&gt;) {
    if ($FIRST_LINE_NUM ... 
        $LAST_LINE_NUM) {
        # operate only between first and last line, which must be different lines
    }
}</PRE></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch06-pgfId-985">Discussion</A></H3>
<P CLASS="para">The range operators, <CODE CLASS="literal">..</CODE> and <CODE CLASS="literal">...</CODE>, are probably the least understood of Perl's myriad operators. They were designed to allow easy extraction of ranges of lines without forcing the programmer to retain explicit state information. When used in scalar context, such as in the test of <CODE CLASS="literal">if</CODE> and <CODE CLASS="literal">while</CODE> statements, these operators return a true or false value that depends partly on what they last returned. The expression <CODE CLASS="literal">left_operand .. right_operand</CODE> returns false until <CODE CLASS="literal">left_operand</CODE> is true. Once that test has been met, it stops evaluating <CODE CLASS="literal">left_operand</CODE> and keeps returning true until <CODE CLASS="literal">right_operand</CODE> becomes true, after which it restarts the cycle. To put it another way, the first operand turns on the construct as soon as it returns a true value, whereas the second one turns it off as soon as <EM CLASS="emphasis">it</EM> returns true.</P>
<P CLASS="para">The operands can be arbitrary expressions. In fact, you could write <CODE CLASS="literal">mytestfunc1() .. mytestfunc2()</CODE>, although in practice this is seldom done. Instead, the range operators are usually used with line numbers as operands (the first example below), patterns as operands (the second example), or both.</P>
<PRE CLASS="programlisting"># command line to print lines 15 through 17 inclusive (see below)
perl -ne 'print if 15 .. 17' datafile

# print out all &lt;XMP&gt; .. &lt;/XMP&gt; displays from HTML doc
while (&lt;&gt;) {
    print if m#&lt;XMP&gt;#i .. 
        m#&lt;/XMP&gt;#i;
}

# same, but as shell command
% perl -ne 'print if m#&lt;XMP&gt;#i .. m#&lt;/XMP&gt;#i' document.html</PRE>
<P CLASS="para">If either operand is a numeric literal, the range operators implicitly compare it against the <CODE CLASS="literal">$.</CODE> variable (<CODE CLASS="literal">$NR</CODE> or <CODE CLASS="literal">$INPUT_LINE_NUMBER</CODE> if you <CODE CLASS="literal">use English</CODE>). Be careful with these implicit line number comparisons. You must specify literal numbers in your code, not variables containing line numbers, because a variable operand is simply tested for truth rather than compared with <CODE CLASS="literal">$.</CODE>. That means you can say <CODE CLASS="literal">3 .. 5</CODE> in a conditional, but not <CODE CLASS="literal">$n .. $m</CODE> where <CODE CLASS="literal">$n</CODE> and <CODE CLASS="literal">$m</CODE> are 3 and 5 respectively. With variables, you have to be more explicit and test the <CODE CLASS="literal">$.</CODE> variable directly.</P>
<PRE CLASS="programlisting">perl -ne 'BEGIN { $top=3; $bottom=5 } print if $top .. $bottom' /etc/passwd    # FAILS

perl -ne 'BEGIN { $top=3; $bottom=5 } \
    print if $. == $top .. $. == $bottom' /etc/passwd    # works

perl -ne 'print if 3 .. 5' /etc/passwd    # also works</PRE>
<P CLASS="para">The difference between <CODE CLASS="literal">..</CODE> and <CODE CLASS="literal">...</CODE> is their behavior when both operands can be true on the same line. Consider these two cases:</P>
<PRE CLASS="programlisting">print if /begin/ .. /end/;
print if /begin/ ... /end/;</PRE>
<P CLASS="para">Given the line <CODE CLASS="literal">&quot;You may not end ere you begin&quot;</CODE>, both the double- and triple-dot versions of the range operator above return true. 
But the code using <CODE CLASS="literal">..</CODE> will not print any further lines. That's because <CODE CLASS="literal">..</CODE> tests both conditions on the same line once the first test matches, and the second test tells it that it has reached the end of its region. On the other hand, <CODE CLASS="literal">...</CODE> will continue until the <EM CLASS="emphasis">next</EM> line that matches <CODE CLASS="literal">/end/</CODE>, because it never tests both operands on the same line.</P>
<P CLASS="para">You may mix and match conditions of different sorts, as in:</P>
<PRE CLASS="programlisting">while (&lt;&gt;) {
    $in_header =   1  .. /^$/;
    $in_body   = /^$/ .. eof();
}</PRE>
<P CLASS="para">The first assignment sets <CODE CLASS="literal">$in_header</CODE> to be true from the first input line until after the blank line separating the header from the body, as in a mail message, a news posting, or even an HTTP header. (Technically speaking, an HTTP header should have both linefeeds and carriage returns as network line terminators, but in practice, servers are liberal in what they accept.) The second assignment sets <CODE CLASS="literal">$in_body</CODE> to be true starting as soon as the first blank line is encountered, up through end of file. Because range operators do not retest their initial condition, any further blank lines (such as those between paragraphs) won't be noticed.</P>
<P CLASS="para">Here's an example. It reads files containing mail messages and prints the addresses it finds in headers. Each address is printed only once. The header extends from a line beginning with <CODE CLASS="literal">&quot;From:&quot;</CODE> up through the first blank line. Lines outside that range are skipped. The pattern isn't a full RFC 822 notion of an address, but it's easy to write.</P>
<PRE CLASS="programlisting">%seen = ();
while (&lt;&gt;) {
    next unless /^From:?\s/i .. 
        /^$/;
    while (/([^&lt;&gt;(),;\s]+\@[^&lt;&gt;(),;\s]+)/g) {
        print &quot;$1\n&quot; unless $seen{$1}++;
    }
}</PRE>
<P CLASS="para">If all this range business seems mighty strange, chalk it up to trying to support the <EM CLASS="emphasis">s2p</EM> and <EM CLASS="emphasis">a2p</EM> translators for converting <EM CLASS="emphasis">sed</EM> and <EM CLASS="emphasis">awk</EM> code into Perl. Both of those tools have range operators, which had to work in Perl as well.<A CLASS="indexterm" NAME="ch06-idx-1000007601-0"></A><A CLASS="indexterm" NAME="ch06-idx-1000007601-1"></A><A CLASS="indexterm" NAME="ch06-idx-1000007601-2"></A></P></DIV>
<DIV CLASS="sect2"><H3 CLASS="sect2"><A CLASS="title" NAME="ch06-pgfId-1059">See Also</A></H3>
<P CLASS="para">The <CODE CLASS="literal">..</CODE> and <CODE CLASS="literal">...</CODE> operators in the "Range Operator" sections of <I CLASS="filename">perlop</I>(1) and <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the entry for <CODE CLASS="literal">$NR</CODE> in <I CLASS="filename">perlvar</I>(1) and the <A CLASS="olink" HREF="../prog/ch02_09.htm">"Special Variables"</A> section of <A CLASS="olink" HREF="../prog/ch02_01.htm">Chapter 2</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A></P></DIV></DIV>
<DIV CLASS="htmlnav"><P></P><HR ALIGN="LEFT" WIDTH="684" TITLE="footer"><TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch06_08.htm" TITLE="6.7. Reading Records with a Pattern Separator"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 6.7. 
Reading Records with a Pattern Separator" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="book" HREF="index.htm" TITLE="Perl Cookbook"><IMG SRC="../gifs/txthome.gif" ALT="Perl Cookbook" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch06_10.htm" TITLE="6.9. Matching Shell Globs as Regular Expressions"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 6.9. Matching Shell Globs as Regular Expressions" BORDER="0"></A></TD></TR>
<TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228">6.7. Reading Records with a Pattern Separator</TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><A CLASS="index" HREF="index/index.htm" TITLE="Book Index"><IMG SRC="../gifs/index.gif" ALT="Book Index" BORDER="0"></A></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228">6.9. Matching Shell Globs as Regular Expressions</TD></TR></TABLE>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer"></DIV>
<!-- LIBRARY NAV BAR --> <img src="../gifs/smnavbar.gif" usemap="#library-map" border="0" alt="Library Navigation Links">
<p><FONT SIZE="-1"><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</FONT></p>
<map name="library-map"> <area shape="rect" coords="1,0,85,94" href="../index.htm"><area shape="rect" coords="86,1,178,103" href="../lwp/index.htm"><area shape="rect" coords="180,0,265,103" href="../lperl/index.htm"><area shape="rect" coords="267,0,353,105" href="../perlnut/index.htm"><area shape="rect" coords="354,1,446,115" href="../prog/index.htm"><area shape="rect" coords="448,0,526,132" href="../tk/index.htm"><area shape="rect" coords="528,1,615,119" href="../cookbook/index.htm"><area shape="rect" coords="617,0,690,135" href="../pxml/index.htm"></map>
</BODY></HTML>
