ch08_09.htm

By Tom Christiansen and Nathan Torkington. ISBN 1-56592-243-3. First Edition, published August 1998.
<HTML>
<HEAD>
<TITLE>Recipe 8.8. Reading a Particular Line in a File (Perl Cookbook)</TITLE>
<META NAME="DC.title" CONTENT="Perl Cookbook">
<META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington">
<META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc.">
<META NAME="DC.date" CONTENT="1999-07-02T01:38:44Z">
<META NAME="DC.type" CONTENT="Text.Monograph">
<META NAME="DC.format" CONTENT="text/html" SCHEME="MIME">
<META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN">
<META NAME="DC.language" CONTENT="en-US">
<META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0">
<LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments">
<LINK REL="up" HREF="ch08_01.htm" TITLE="8. File Contents">
<LINK REL="prev" HREF="ch08_08.htm" TITLE="8.7. Randomizing All Lines">
<LINK REL="next" HREF="ch08_10.htm" TITLE="8.9. Processing Variable-Length Text Fields">
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" />
<map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook"><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map>
<div class="navbar"><p>
<TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch08_08.htm" TITLE="8.7. Randomizing All Lines"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 8.7. Randomizing All Lines" BORDER="0"></A></TD>
<TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"><A CLASS="chapter" REL="up" HREF="ch08_01.htm" TITLE="8. File Contents"></A></FONT></B></TD>
<TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch08_10.htm" TITLE="8.9. Processing Variable-Length Text Fields"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 8.9. Processing Variable-Length Text Fields" BORDER="0"></A></TD>
</TR>
</TABLE>
</DIV>
<DIV CLASS="sect1">
<H2 CLASS="sect1"><A CLASS="title" NAME="ch08-22659">8.8.
Reading a Particular Line in a File</A></H2>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch08-pgfId-702">Problem<A CLASS="indexterm" NAME="ch08-idx-1000004663-0"></A></A></H3>
<P CLASS="para">You want to extract a single line from a file.</P>
</DIV>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch08-pgfId-708">Solution</A></H3>
<P CLASS="para">The simplest solution is to read the lines until you get to the one you want:</P>
<PRE CLASS="programlisting"># looking for line number $DESIRED_LINE_NUMBER
$. = 0;
do { $LINE = &lt;HANDLE&gt; } until $. == $DESIRED_LINE_NUMBER || eof;</PRE>
<P CLASS="para">If you are going to be doing this a lot and the file fits into memory, read the file into an array:</P>
<PRE CLASS="programlisting">@lines = &lt;HANDLE&gt;;
# @lines is indexed from 0, so the first line of the file is $lines[0]
$LINE = $lines[$DESIRED_LINE_NUMBER];</PRE>
<P CLASS="para">If you will be retrieving lines by number often and the file doesn't fit into memory, build a byte-address index to let you <CODE CLASS="literal">seek</CODE> directly to the start of the line:</P>
<PRE CLASS="programlisting"># usage: build_index(*DATA_HANDLE, *INDEX_HANDLE)
sub build_index {
    my $data_file  = shift;
    my $index_file = shift;
    my $offset     = 0;

    while (&lt;$data_file&gt;) {
        print $index_file pack(&quot;N&quot;, $offset);
        $offset = tell($data_file);
    }
}

# usage: line_with_index(*DATA_HANDLE, *INDEX_HANDLE, $LINE_NUMBER)
# returns line or undef if LINE_NUMBER was out of range
sub line_with_index {
    my $data_file   = shift;
    my $index_file  = shift;
    my $line_number = shift;

    my $size;               # size of an index entry
    my $i_offset;           # offset into the index of the entry
    my $entry;              # index entry
    my $d_offset;           # offset into the data file

    $size = length(pack(&quot;N&quot;, 0));
    $i_offset = $size * ($line_number-1);
    seek($index_file, $i_offset, 0) or return;
    # a short read means the index has no entry for this line
    (read($index_file, $entry, $size) || 0) == $size or return;
    $d_offset = unpack(&quot;N&quot;, $entry);
    seek($data_file, $d_offset, 0);
    return scalar(&lt;$data_file&gt;);
}

# usage:
open(FILE, &quot;&lt; $file&quot;)
        or die &quot;Can't open $file for reading: $!\n&quot;;
open(INDEX, &quot;+&gt;$file.idx&quot;)
        or die &quot;Can't open $file.idx for read/write: $!\n&quot;;
build_index(*FILE, *INDEX);
$line = line_with_index(*FILE, *INDEX, $seeking);</PRE>
<P CLASS="para">If you have the DB_File module, its <CODE CLASS="literal">DB_RECNO</CODE><A CLASS="indexterm" NAME="ch08-idx-1000004664-0"></A> access method ties an array to a file, one line per array element:</P>
<PRE CLASS="programlisting">use DB_File;
use Fcntl;

$tie = tie(@lines, &quot;DB_File&quot;, $FILE, O_RDWR, 0666, $DB_RECNO) or die
    &quot;Cannot open file $FILE: $!\n&quot;;
# extract it
$line = $lines[$sought-1];</PRE>
</DIV>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch08-pgfId-818">Discussion</A></H3>
<P CLASS="para">Each strategy has different features, useful in different circumstances. The linear access approach is easy to write and best for short files. The index method gives quick two-step lookup, but requires that the index be pre-built, so it is best when the file being indexed doesn't change often compared to the number of lookups. The DB_File mechanism has some initial overhead, but subsequent accesses are much faster than with linear access, so use it for long files that are accessed more than once and are accessed out of order.</P>
<P CLASS="para">It is important to know whether you're counting lines from 0 or 1. The <CODE CLASS="literal">$.</CODE> variable is 1 after the first line is read, so count from 1 when using linear access. An in-memory array is indexed from 0, so the first line is element 0. The index file's entries and DB_File's records are also numbered from 0; <CODE CLASS="literal">line_with_index</CODE> and the DB_File code here subtract 1 from the line number, so their callers count lines from 1.</P>
<P CLASS="para">Here are three different implementations of the same program, <EM CLASS="emphasis">print_line</EM>.
The program takes two arguments: a filename and a line number to extract.</P>
<P CLASS="para">The version in <A CLASS="xref" HREF="ch08_09.htm#ch08-41197" TITLE="print_line-v1">Example 8.1</A> simply reads lines until it finds the one it's looking for.</P>
<DIV CLASS="example">
<H4 CLASS="example"><A CLASS="title" NAME="ch08-41197">Example 8.1: print_line-v1</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w
# <A CLASS="indexterm" NAME="ch08-idx-1000004824-0"></A>print_line-v1 - linear style

@ARGV == 2 or die &quot;usage: print_line FILENAME LINE_NUMBER\n&quot;;

($filename, $line_number) = @ARGV;
open(INFILE, &quot;&lt; $filename&quot;) or die &quot;Can't open $filename for reading: $!\n&quot;;
while (&lt;INFILE&gt;) {
    $line = $_;
    last if $. == $line_number;
}
if ($. != $line_number) {
    die &quot;Didn't find line $line_number in $filename\n&quot;;
}
print $line;</PRE>
</DIV>
<P CLASS="para">The index version in <A CLASS="xref" HREF="ch08_09.htm#ch08-19472" TITLE="print_line-v2">Example 8.2</A> must build an index. For many lookups, you could build the index once and then use it for all subsequent lookups:</P>
<DIV CLASS="example">
<H4 CLASS="example"><A CLASS="title" NAME="ch08-19472">Example 8.2: print_line-v2</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w
# print_line-v2 - index style
# build_index and line_with_index from above
use Fcntl;                  # for the O_CREAT and O_RDWR constants

@ARGV == 2 or
    die &quot;usage: print_line FILENAME LINE_NUMBER&quot;;

($filename, $line_number) = @ARGV;
open(ORIG, &quot;&lt; $filename&quot;)
        or die &quot;Can't open $filename for reading: $!&quot;;

# open the index and build it if necessary
# there's a race condition here: two copies of this
# program can notice there's no index for the file and
# try to build one.  This would be easily solved with
# locking
$indexname = &quot;$filename.index&quot;;
sysopen(IDX, $indexname, O_CREAT|O_RDWR)
        or die &quot;Can't open $indexname for read/write: $!&quot;;
build_index(*ORIG, *IDX) if -z $indexname;  # XXX: race unless lock

$line = line_with_index(*ORIG, *IDX, $line_number);
die &quot;Didn't find line $line_number in $filename&quot; unless defined $line;
print $line;</PRE>
</DIV>
<P CLASS="para">The DB_File version in <A CLASS="xref" HREF="ch08_09.htm#ch08-23822" TITLE="print_line-v3">Example 8.3</A> is indistinguishable from magic.</P>
<DIV CLASS="example">
<H4 CLASS="example"><A CLASS="title" NAME="ch08-23822">Example 8.3: print_line-v3</A></H4>
<PRE CLASS="programlisting">#!/usr/bin/perl -w
# print_line-v3 - DB_File style
use DB_File;
use Fcntl;

@ARGV == 2 or
    die &quot;usage: print_line FILENAME LINE_NUMBER\n&quot;;

($filename, $line_number) = @ARGV;
$tie = tie(@lines, &quot;DB_File&quot;, $filename, O_RDWR, 0666, $DB_RECNO)
        or die &quot;Cannot open file $filename: $!\n&quot;;

# lines are counted from 1, so the last valid line number equals the length
unless ($line_number &lt;= $tie-&gt;length) {
    die &quot;Didn't find line $line_number in $filename\n&quot;
}

print $lines[$line_number-1];                        # easy, eh?<A CLASS="indexterm" NAME="ch08-idx-1000004666-0"></A></PRE>
</DIV>
</DIV>
<DIV CLASS="sect2">
<H3 CLASS="sect2"><A CLASS="title" NAME="ch08-pgfId-944">See Also</A></H3>
<P CLASS="para">The documentation for the standard DB_File module (also in <A CLASS="olink" HREF="../prog/ch07_01.htm">Chapter 7</A> of <CITE CLASS="citetitle">Programming Perl</CITE>); the <CODE CLASS="literal">tie</CODE> function in <I CLASS="filename">perlfunc</I>(1) and in <A CLASS="olink" HREF="../prog/ch03_01.htm">Chapter 3</A> of <A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>; the entry on <CODE CLASS="literal">$.</CODE> in <I CLASS="filename">perlvar</I>(1) and in the "Special Variables" section of Chapter 2 of
<A CLASS="citetitle" HREF="../prog/index.htm" TITLE="Programming Perl"><CITE CLASS="citetitle">Programming Perl</CITE></A>.</P>
</DIV>
</DIV>
<DIV CLASS="htmlnav">
<P></P>
<HR ALIGN="LEFT" WIDTH="684" TITLE="footer">
<FONT SIZE="-1">
<p><a href="copyrght.htm">Copyright &copy; 2002</a> O'Reilly &amp; Associates. All rights reserved.</p>
</FONT>
</DIV>
</BODY>
</HTML>
