<html><head><title>Variable-length (Text) Databases (Learning Perl, 3rd Edition)</title><link rel="stylesheet" type="text/css" href="../style/style1.css" /><meta name="DC.Creator" content="Randal L. Schwartz and Tom Phoenix" /><meta name="DC.Format" content="text/xml" scheme="MIME" /><meta name="DC.Language" content="en-US" /><meta name="DC.Publisher" content="O'Reilly & Associates, Inc." /><meta name="DC.Source" scheme="ISBN" content="0596001320L" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="Learning Perl, 3rd Edition" /><meta name="DC.Type" content="Text.Monograph" /></head><body bgcolor="#ffffff"><img alt="Book Home" border="0" src="gifs/smbanner.gif" usemap="#banner-map" /><map name="banner-map"><area shape="rect" coords="1,-2,616,66" href="index.htm" alt="Learning Perl, 3rd Edition" /><area shape="rect" coords="629,-11,726,25" href="jobjects/fsearch.htm" alt="Search this book" /></map><div class="navbar"><table width="684" border="0"><tr><td align="left" valign="top" width="228"><a href="ch16_03.htm"><img alt="Previous" border="0" src="../gifs/txtpreva.gif" /></a></td><td align="center" valign="top" width="228"><a href="index.htm"></a></td><td align="right" valign="top" width="228"><a href="ch16_05.htm"><img alt="Next" border="0" src="../gifs/txtnexta.gif" /></a></td></tr></table></div>
<h2 class="sect1">16.4. Variable-length (Text) Databases</h2>
<p><a name="INDEX-1067" /> <a name="INDEX-1068" /> <a name="INDEX-1069" />Many simple databases are merely text files written in a format that allows a program to read and maintain them. For example, a configuration file for some program might be a text file, with one configuration parameter being set on each line. Or maybe the file is a mailing list, with one name and address on each line (probably with the components of the name and address separated by tab characters).</p>
<p>Updating text files is more difficult than it probably seems at first.
But that's only because we're used to seeing text files rendered as pages (or screens) of text. If you could see the file as it is written in the filesystem, the difficulty would be more apparent. Since we can't show you the file as it's actually written without opening up a disk drive, here's our rendition of a piece of a text file<a href="#FOOTNOTE-356">[356]</a>:</p>
<blockquote class="footnote"> <a name="FOOTNOTE-356" /><p>[356] Of course, the real file wouldn't have lines at all; it's one long stream of text. And the newline character should really be a single-character code. But these differences don't hurt this as an example.</p> </blockquote>
<blockquote><pre class="code">He had bought a large map representing the sea,\n Without the least vestige of land:\nAnd the crew were much pleased when they found it to be\n A map they could all understand.\n\n"What's the good of Mercator's North Poles and Equators,\n Tropics, Zones, and Meridian Lines?"\nSo the Bellman would cry: and the crew would reply\n "They are merely conventional signs!\n\n"Other maps are such shapes, with their islands and capes!\n But we've got our brave Captain to thank:"\n(So the crew would protest) "that he's bought us the best-\n A perfect and absolute blank!"\n\n</pre></blockquote>
<p>If you had this file open in your text editor, it would be easy to change a word, add a comma, or fix a misspelling. If your editor is powerful enough, in fact, you could change the indentation of each line with a single command. But the text file is a stream of bytes; if you wanted to add even a single comma, the remainder of the text file (possibly thousands or millions of bytes) would have to move over to make room. Nearly every tiny change would mean lots of slow copying operations on the file. So how can we edit the file efficiently?</p>
<p>The most common way of programmatically updating a text file is by writing an entirely new file that looks similar to the old one, but making whatever changes we need as we go along.
As you'll see, this technique gives nearly the same result as updating the file itself, but it has some beneficial side effects as well.</p>
<p>In this example, we've got hundreds of files with a similar format. One of them is <em class="filename">fred03.dat</em>, and it's full of lines like these:</p>
<blockquote><pre class="code">Program name: granite
Author: Gilbert Bates
Company: RockSoft
Department: R&D
Phone: +1 503 555-0095
Date: Tues March 9, 1999
Version: 2.1
Size: 21k
Status: Final beta</pre></blockquote>
<p>We need to fix this file so that it has some different information. Here's roughly what this one should look like when we're done:</p>
<blockquote><pre class="code">Program name: granite
Author: Randal L. Schwartz
Company: RockSoft
Department: R&D
Date: June 12, 2002 6:38 pm
Version: 2.1
Size: 21k
Status: Final beta</pre></blockquote>
<p>In short, we need to make three changes. The name of the <tt class="literal">Author</tt> should be changed; the <tt class="literal">Date</tt> should be updated to today's date, and the <tt class="literal">Phone</tt> should be removed completely. And we have to make these changes in hundreds of similar files as well.</p>
<p>Perl supports a way of in-place editing of <a name="INDEX-1070" />files with a little extra help from the <a name="INDEX-1071" />diamond operator ("<tt class="literal"><></tt>"). Here's a program to do what we want, although it may not be obvious how it works at first. This program's only new feature is the special variable <tt class="literal">$^I</tt>; ignore that for now, and we'll come back to it:</p>
<blockquote><pre class="code">#!/usr/bin/perl -w
use strict;

chomp(my $date = `date`);
@ARGV = glob "fred*.dat" or die "no files found";
$^I = ".bak";

while (<>) {
  s/^Author:.*/Author: Randal L. Schwartz/;
  s/^Phone:.*\n//;
  s/^Date:.*/Date: $date/;
  print;
}</pre></blockquote>
<p>Since we need today's date, the program starts by using the system <i class="command">date</i><a name="INDEX-1072" /> command.
A better way to get the date (in a slightly different format) would almost surely be to use Perl's own <tt class="literal">localtime</tt><a name="INDEX-1073" /> function in a scalar context:</p>
<blockquote><pre class="code">my $date = localtime;</pre></blockquote>
<p>To get the list of files for the diamond operator, we read them from a glob. The next line sets <tt class="literal">$^I</tt>, but keep ignoring that for the moment.</p>
<p>The main loop reads, updates, and prints one line at a time. (With what you know so far, that means that all of the files' newly modified contents will be dumped to your terminal, scrolling furiously past your eyes, without the files being changed at all. But stick with us.) Note that the second substitution can replace the entire line containing the phone number with an empty string -- leaving not even a newline -- so when that's printed, nothing comes out, and it's as if the <tt class="literal">Phone</tt> never existed. Most input lines won't match any of the three patterns, and those will be unchanged in the output.</p>
<p>So this result is close to what we want, except that we haven't shown you how the updated information gets back out on to the disk. The answer is in the variable <tt class="literal">$^I</tt><a name="INDEX-1074" />. By default it's <tt class="literal">undef</tt>, and everything is normal. But when it's set to some string, it makes the diamond operator ("<tt class="literal"><></tt>") even more magical than usual.</p>
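<p>To watch that happen without risking real data, here's a minimal sketch that runs the same loop against a scratch file. The temporary directory and the name <tt class="literal">fred_demo.dat</tt> are our own assumptions for illustration, and the sketch uses <tt class="literal">localtime</tt> rather than the system <i class="command">date</i> command. As far as Perl's documentation of <tt class="literal">$^I</tt> goes, setting it to <tt class="literal">".bak"</tt> should keep the original contents under the old name plus that suffix, while <tt class="literal">print</tt> writes to a fresh file with the original name:</p>

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory and file name, so no real data is touched.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/fred_demo.dat";

# Write a small sample in the same format as fred03.dat.
open my $out, '>', $file or die "Cannot create $file: $!";
print $out "Program name: granite\n",
           "Author: Gilbert Bates\n",
           "Phone: +1 503 555-0095\n",
           "Date: Tues March 9, 1999\n",
           "Version: 2.1\n";
close $out or die "Cannot close $file: $!";

my $date = localtime;    # scalar context yields a readable timestamp string

{
    # Localize both globals so the in-place behavior ends with this block.
    local @ARGV = ($file);
    local $^I   = ".bak";
    while (<>) {
        s/^Author:.*/Author: Randal L. Schwartz/;
        s/^Phone:.*\n//;
        s/^Date:.*/Date: $date/;
        print;    # written to the replacement file, not the terminal
    }
}

# Read a whole file into one string so the two versions can be compared.
sub slurp {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/;    # no record separator: one read returns everything
    my $text = <$in>;
    close $in;
    return $text;
}

my $updated = slurp($file);
my $backup  = slurp("$file.bak");

print "Updated file:\n$updated\n";
print "Backup still has the phone line: ",
      ($backup =~ /^Phone:/m ? "yes" : "no"), "\n";
```

<p>We localize <tt class="literal">@ARGV</tt> and <tt class="literal">$^I</tt> inside a block here so that the in-place behavior can't leak into any later reads; the program above sets them globally, which is fine for a short script that does nothing else.</p>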