ch08_01.htm

<HTML><HEAD><META NAME="DC.title" CONTENT="Perl Cookbook"><META NAME="DC.creator" CONTENT="Tom Christiansen &amp; Nathan Torkington"><META NAME="DC.publisher" CONTENT="O'Reilly &amp; Associates, Inc."><META NAME="DC.date" CONTENT="1999-07-02T01:38:04Z"><META NAME="DC.type" CONTENT="Text.Monograph"><META NAME="DC.format" CONTENT="text/html" SCHEME="MIME"><META NAME="DC.source" CONTENT="1-56592-243-3" SCHEME="ISBN"><META NAME="DC.language" CONTENT="en-US"><META NAME="generator" CONTENT="Jade 1.1/O'Reilly DocBook 3.0 to HTML 4.0"><LINK REV="made" HREF="mailto:online-books@oreilly.com" TITLE="Online Books Comments"><LINK REL="up" HREF="index.htm" TITLE="Perl Cookbook"><LINK REL="prev" HREF="ch07_23.htm" TITLE="7.22. Program: lockarea"><LINK REL="next" HREF="ch08_02.htm" TITLE="8.1. Reading Lines with Continuation Characters"></HEAD><BODY BGCOLOR="#FFFFFF"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Perl Cookbook"><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><p><TABLE WIDTH="684" BORDER="0" CELLSPACING="0" CELLPADDING="0"><TR><TD ALIGN="LEFT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch07_23.htm" TITLE="7.22. Program: lockarea"><IMG SRC="../gifs/txtpreva.gif" ALT="Previous: 7.22. Program: lockarea" BORDER="0"></A></TD><TD ALIGN="CENTER" VALIGN="TOP" WIDTH="228"><B><FONT FACE="ARIEL,HELVETICA,HELV,SANSERIF" SIZE="-1"></FONT></B></TD><TD ALIGN="RIGHT" VALIGN="TOP" WIDTH="228"><A CLASS="sect1" HREF="ch08_02.htm" TITLE="8.1. Reading Lines with Continuation Characters"><IMG SRC="../gifs/txtnexta.gif" ALT="Next: 8.1. Reading Lines with Continuation Characters" BORDER="0"></A></TD></TR></TABLE></DIV><DIV CLASS="chapter"><H1 CLASS="chapter"><A CLASS="title" NAME="ch08-11143">8. File Contents</A></H1><DIV CLASS="htmltoc"><P><B>Contents:</B><BR><A CLASS="sect1" HREF="#ch08-23799" TITLE="8.0. 
Introduction">Introduction</A><BR><A CLASS="sect1" HREF="ch08_02.htm" TITLE="8.1. Reading Lines with Continuation Characters">Reading Lines with Continuation Characters</A><BR><A CLASS="sect1" HREF="ch08_03.htm" TITLE="8.2. Counting Lines (or Paragraphs or Records) in a File">Counting Lines (or Paragraphs or Records) in a File</A><BR><A CLASS="sect1" HREF="ch08_04.htm" TITLE="8.3. Processing Every Word in a File">Processing Every Word in a File</A><BR><A CLASS="sect1" HREF="ch08_05.htm" TITLE="8.4. Reading a File Backwards by Line or Paragraph">Reading a File Backwards by Line or Paragraph</A><BR><A CLASS="sect1" HREF="ch08_06.htm" TITLE="8.5. Trailing a Growing File">Trailing a Growing File</A><BR><A CLASS="sect1" HREF="ch08_07.htm" TITLE="8.6. Picking a Random Line from a File">Picking a Random Line from a File</A><BR><A CLASS="sect1" HREF="ch08_08.htm" TITLE="8.7. Randomizing All Lines">Randomizing All Lines</A><BR><A CLASS="sect1" HREF="ch08_09.htm" TITLE="8.8. Reading a Particular Line in a File">Reading a Particular Line in a File</A><BR><A CLASS="sect1" HREF="ch08_10.htm" TITLE="8.9. Processing Variable-Length Text Fields">Processing Variable-Length Text Fields</A><BR><A CLASS="sect1" HREF="ch08_11.htm" TITLE="8.10. Removing the Last Line of a File">Removing the Last Line of a File</A><BR><A CLASS="sect1" HREF="ch08_12.htm" TITLE="8.11. Processing Binary Files">Processing Binary Files</A><BR><A CLASS="sect1" HREF="ch08_13.htm" TITLE="8.12. Using Random-Access I/O">Using Random-Access I/O</A><BR><A CLASS="sect1" HREF="ch08_14.htm" TITLE="8.13. Updating a Random-Access File">Updating a Random-Access File</A><BR><A CLASS="sect1" HREF="ch08_15.htm" TITLE="8.14. Reading a String from a Binary File">Reading a String from a Binary File</A><BR><A CLASS="sect1" HREF="ch08_16.htm" TITLE="8.15. Reading Fixed-Length Records">Reading Fixed-Length Records</A><BR><A CLASS="sect1" HREF="ch08_17.htm" TITLE="8.16. Reading Configuration Files">Reading Configuration Files</A><BR><A CLASS="sect1" HREF="ch08_18.htm" TITLE="8.17. 
Testing a File for Trustworthiness">Testing a File for Trustworthiness</A><BR><A CLASS="sect1" HREF="ch08_19.htm" TITLE="8.18. Program: tailwtmp">Program: tailwtmp</A><BR><A CLASS="sect1" HREF="ch08_20.htm" TITLE="8.19. Program: tctee">Program: tctee</A><BR><A CLASS="sect1" HREF="ch08_21.htm" TITLE="8.20. Program: laston">Program: laston</A></P><P></P></DIV><DIV CLASS="epigraph" ALIGN="right"><P CLASS="para" ALIGN="right"><I>The most brilliant decision in all of Unix was the choice of a single character for the newline sequence.</I></P><P CLASS="attribution" ALIGN="right">-&nbsp;Mike O'Dell, only half jokingly </P></DIV><DIV CLASS="sect1"><H2 CLASS="sect1"><A CLASS="title" NAME="ch08-23799">8.0. Introduction</A></H2><P CLASS="para"><A CLASS="indexterm" NAME="ch08-idx-1000004574-0"></A>Before the Unix Revolution, every kind of data source and destination was inherently different. Getting two programs merely to understand each other required heavy wizardry and the occasional sacrifice of a virgin stack of punch cards to an itinerant mainframe repairman. This computational Tower of Babel made programmers dream of quitting the field to take up a less painful hobby, like autoflagellation.</P><P CLASS="para">These days, such cruel and unusual programming is largely behind us. Modern operating systems work hard to provide the illusion that I/O devices, network connections, process control information, other programs, the system console, and even users' terminals are all abstract streams of bytes called <EM CLASS="emphasis">files</EM>. This lets you easily write programs that don't care where their input came from or where their output goes.</P><P CLASS="para">Because programs read and write via byte streams of simple text, every program can communicate with every other program. It is difficult to overstate the power and elegance of this approach. 
No longer dependent upon troglodyte gnomes with secret tomes of JCL (or COM) incantations, users can now create custom tools from smaller ones by using simple command-line I/O redirection, pipelines, and backticks.</P><P CLASS="para">Treating files as unstructured byte streams necessarily governs what you can do with them. You can read and write sequential, fixed-size blocks of data at any location in the file, increasing its size if you write past the current end. Perl uses the standard C I/O library to implement reading and writing of variable-length records like lines, paragraphs, and words.</P><P CLASS="para">What can't you do to an unstructured file? Because you can't insert or delete bytes anywhere but at end of file, you can't change the length of, insert, or delete records. An exception is the last record, which you can delete by truncating the file to the end of the previous record. For other modifications, you need to use a temporary file or work with a copy of the file in memory. If you need to do this a lot, a database system may be a better solution than a raw file (see <A CLASS="xref" HREF="ch14_01.htm" TITLE="Database Access">Chapter 14, <CITE CLASS="chapter">Database Access</CITE></A>).</P><P CLASS="para">The most common files are text files, and the most common operations on text files are reading and writing lines. <A CLASS="indexterm" NAME="ch08-idx-1000004596-0"></A><A CLASS="indexterm" NAME="ch08-idx-1000004596-1"></A>Use <CODE CLASS="literal">&lt;FH&gt;</CODE> (or the internal function implementing it, <CODE CLASS="literal">readline</CODE>) to read lines, and use <CODE CLASS="literal">print</CODE> to write them. These functions can also be used to read or write any record that has a specific record separator. 
Lines are simply records that end in <CODE CLASS="literal">&quot;\n&quot;</CODE>.</P><P CLASS="para">The <CODE CLASS="literal">&lt;FH&gt;</CODE> operator returns <CODE CLASS="literal">undef</CODE> on error or when end of the file is reached, so use it in loops like this:</P><PRE CLASS="programlisting">while (defined ($line = &lt;DATAFILE&gt;)) {
    chomp $line;
    $size = length $line;
    print &quot;$size\n&quot;;                # output size of line
}</PRE><P CLASS="para">Because this is a common operation and that's a lot to type, Perl gives it a shorthand notation. This shorthand reads lines into <CODE CLASS="literal">$_</CODE> instead of <CODE CLASS="literal">$line</CODE>. Many other string operations use <CODE CLASS="literal">$_</CODE> as a default value to operate on, so this is more useful than it may appear at first:</P><PRE CLASS="programlisting">while (&lt;DATAFILE&gt;) {
    chomp;
    print length, &quot;\n&quot;;             # output size of line
}</PRE><P CLASS="para">Call <CODE CLASS="literal">&lt;FH&gt;</CODE> in scalar context to read the next line. Call it in list context to read all remaining lines:</P><PRE CLASS="programlisting">@lines = &lt;DATAFILE&gt;;</PRE><P CLASS="para">Each time <CODE CLASS="literal">&lt;FH&gt;</CODE> reads a record from a filehandle, it increments the special variable <CODE CLASS="literal">$.</CODE> (the "current input record number"). This variable is only reset when <CODE CLASS="literal">close</CODE> is called explicitly, which means that it's not reset when you reopen an already opened filehandle.</P><P CLASS="para">Another special variable is <CODE CLASS="literal">$/</CODE><A CLASS="indexterm" NAME="ch08-idx-1000004604-0"></A>, the input record separator. It is set to <CODE CLASS="literal">&quot;\n&quot;</CODE>, the default end-of-line marker. You can set it to any string you like, for instance <CODE CLASS="literal">&quot;\0&quot;</CODE> to read null-terminated records. 
Read paragraphs by setting <CODE CLASS="literal">$/</CODE> to the empty string, <CODE CLASS="literal">&quot;&quot;</CODE>. This is almost like setting <CODE CLASS="literal">$/</CODE> to <CODE CLASS="literal">&quot;\n\n&quot;</CODE>, in that blank lines function as record separators, but <CODE CLASS="literal">&quot;&quot;</CODE> treats two or more consecutive empty lines as a single record separator, whereas <CODE CLASS="literal">&quot;\n\n&quot;</CODE> returns empty records when more than two consecutive empty lines are read. Undefine <CODE CLASS="literal">$/</CODE> to read the rest of the file as one scalar:</P><PRE CLASS="programlisting">undef $/;
$whole_file = &lt;FILE&gt;;               # 'slurp' mode</PRE><P CLASS="para">The<A CLASS="indexterm" NAME="ch08-idx-1000004605-0"></A> <B CLASS="emphasis.bold">-0</B> option to Perl lets you set <CODE CLASS="literal">$/</CODE> from the command line:</P><PRE CLASS="programlisting">% perl -040 -e '$word = &lt;&gt;; print &quot;First word is $word\n&quot;;'</PRE><P CLASS="para">The digits after <B CLASS="emphasis.bold">-0</B> are the octal value of the single character that <CODE CLASS="literal">$/</CODE> is to be set to. If you specify an illegal value (e.g., with <B CLASS="emphasis.bold">-0777</B>) Perl will set <CODE CLASS="literal">$/</CODE> to <CODE CLASS="literal">undef</CODE>. If you specify <B CLASS="emphasis.bold"
This page shows the source file ch08_01.htm from "By Tom Christiansen and Nathan Torkington ISBN 1-56592-243-3 First Edition, published August 1998", written in HTM, 828 lines in total.