=head1 NAME
perlfaq5 - Files and Formats ($Revision: 1.24 $, $Date: 1998/07/05 15:07:20 $)
=head1 DESCRIPTION
This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.
=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
The C standard I/O library (stdio) normally buffers characters sent to
devices. This is done for efficiency reasons, so that there isn't a
system call for each byte. Any time you use print() or write() in
Perl, you go through this buffering. syswrite() circumvents stdio and
buffering.
In most stdio implementations, the type of output buffering and the size of
the buffer vary according to the type of device. Disk files are block
buffered, often with a buffer size of more than 2k. Pipes and sockets
are often buffered with a buffer size between 1/2 and 2k. Serial devices
(e.g. modems, terminals) are normally line-buffered, and stdio sends
the entire line when it gets the newline.
Perl does not support truly unbuffered output (except insofar as you can
C<syswrite(OUT, $char, 1)>). What it does instead support is "command
buffering", in which a physical write is performed after every output
command. This isn't as hard on your system as unbuffering, but does
get the output where you want it when you want it.
If you expect characters to get to your device when you print them there,
you'll want to autoflush its handle.
Use select() and the C<$|> variable to control autoflushing
(see L<perlvar/$|> and L<perlfunc/select>):
    $old_fh = select(OUTPUT_HANDLE);
    $| = 1;
    select($old_fh);
Or using the traditional idiom:
    select((select(OUTPUT_HANDLE), $| = 1)[0]);
Or if you don't mind slowly loading several thousand lines of module code
just because you're afraid of the C<$|> variable:
    use FileHandle;
    open(DEV, "+</dev/tty");      # ceci n'est pas une pipe
    DEV->autoflush(1);
or the newer IO::* modules:
    use IO::Handle;
    open(DEV, ">/dev/printer");   # but is this?
    DEV->autoflush(1);
or even this:
    use IO::Socket;               # this one is kinda a pipe?
    $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.com',
                                  PeerPort => 'http(80)',
                                  Proto    => 'tcp');
    die "$!" unless $sock;
    $sock->autoflush();
    print $sock "GET / HTTP/1.0" . "\015\012" x 2;
    $document = join('', <$sock>);
    print "DOC IS: $document\n";
Note the bizarrely hardcoded carriage return and newline in their octal
equivalents. This is the ONLY way (currently) to assure a proper flush
on all platforms, including Macintosh. That's the way things work in
network programming: you really should specify the exact bit pattern
of the network line terminator. In practice, C<"\n\n"> often works,
but this is not portable.
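One way to keep the exact bytes in a single place is to name the terminator once and build every request from it. The following is only a sketch; the names C<$CRLF> and C<http_request> are illustrative and do not come from any module:

```perl
use strict;
use warnings;

# Illustrative name: the network line terminator, spelled out in octal
# so that it does not depend on what "\n" means on the local platform.
my $CRLF = "\015\012";

# Hypothetical helper: build a minimal HTTP/1.0 request for a path.
sub http_request {
    my ($path) = @_;
    return "GET $path HTTP/1.0" . $CRLF x 2;
}

my $request = http_request("/");
printf "%d bytes, ends in CR LF CR LF: %s\n",
    length($request),
    ($request =~ /\015\012\015\012\z/ ? "yes" : "no");
```

The string returned by C<http_request> can then be printed to an autoflushed socket exactly as in the IO::Socket example.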
See L<perlfaq9> for other examples of fetching URLs over the web.
=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
Although humans have an easy time thinking of a text file as being a
sequence of lines that operates much like a stack of playing cards --
or punch cards -- computers usually see the text file as a sequence of
bytes. In general, there's no direct way for Perl to seek to a
particular line of a file, insert text into a file, or remove text
from a file.
(There are exceptions in special circumstances. You can add or remove data at
the very end of the file. Another is replacing a sequence of bytes with
another sequence of the same length. Another is using the C<$DB_RECNO>
array bindings as documented in L<DB_File>. Yet another is manipulating
files with all lines the same length.)
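The same-length replacement mentioned above can be sketched as follows. This assumes a Perl recent enough for lexical filehandles and three-argument open (5.6 or later); the helper name C<overwrite_bytes> is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: overwrite the bytes at $offset in $path with
# $replacement. Nothing is inserted or removed, so the file keeps its
# size as long as the write stays within the existing contents.
sub overwrite_bytes {
    my ($path, $offset, $replacement) = @_;
    open(my $fh, '+<', $path)   or die "can't open $path: $!";
    binmode($fh);                                  # work in raw bytes
    seek($fh, $offset, 0)       or die "can't seek in $path: $!";
    print {$fh} $replacement    or die "can't write to $path: $!";
    close($fh)                  or die "can't close $path: $!";
}
```

For example, C<overwrite_bytes($file, 6, "there")> replaces the five bytes starting at offset 6. If the replacement were longer or shorter than the text it stands in for, you would be back to the copy-and-rename approach described next.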
The general solution is to create a temporary copy of the text file with
the changes you want, then copy that over the original. This assumes
no locking.
    $old = $file;
    $new = "$file.tmp.$$";
    $bak = "$file.bak";
    open(OLD, "< $old")         or die "can't open $old: $!";
    open(NEW, "> $new")         or die "can't open $new: $!";
    # Correct typos, preserving case
    while (<OLD>) {
        s/\b(p)earl\b/${1}erl/i;
        (print NEW $_)          or die "can't write to $new: $!";
    }
    close(OLD)                  or die "can't close $old: $!";
    close(NEW)                  or die "can't close $new: $!";
    rename($old, $bak)          or die "can't rename $old to $bak: $!";
    rename($new, $old)          or die "can't rename $new to $old: $!";
Perl can do this sort of thing for you automatically with the C<-i>
command-line switch or the closely-related C<$^I> variable (see
L<perlrun> for more details). Note that
C<-i> may require a suffix on some non-Unix systems; see the
platform-specific documentation that came with your port.
    # Renumber a series of tests from the command line
    perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t
    # from a script
    local($^I, @ARGV) = ('.bak', glob("*.c"));
    while (<>) {
        if ($. == 1) {
            print "This line should appear at the top of each file\n";
        }
        s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
        print;
        close ARGV if eof;              # Reset $.
    }
If you need to seek to an arbitrary line of a file that changes
infrequently, you could build up an index of byte positions of where
the line ends are in the file. If the file is large, an index of
every tenth or hundredth line end would allow you to seek and read
fairly efficiently. If the file is sorted, try the look.pl library
(part of the standard perl distribution).
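A minimal sketch of such an index follows. It records where every line starts (that is, the position just past each line end) and assumes a Perl with lexical filehandles; C<build_line_index> and C<line_at> are hypothetical names:

```perl
use strict;
use warnings;

# Hypothetical helper: return a reference to an array in which element
# N holds the byte offset at which line N+1 of the file starts.
sub build_line_index {
    my ($fh) = @_;
    my @offsets = (0);
    seek($fh, 0, 0) or die "can't rewind: $!";
    while (defined(my $line = <$fh>)) {
        push @offsets, tell($fh);
    }
    pop @offsets;    # the last entry is end-of-file, not a line start
    return \@offsets;
}

# Hypothetical helper: fetch line $n (counting from 1) via the index.
sub line_at {
    my ($fh, $index, $n) = @_;
    return if $n < 1 || $n > @$index;
    seek($fh, $index->[$n - 1], 0) or die "can't seek: $!";
    return scalar <$fh>;
}
```

Build the index once with C<build_line_index($fh)>, then call C<line_at($fh, $index, 4711)> as often as you like. For a large file, keeping only every tenth or hundredth offset trades a little reading for a much smaller index; the index must be rebuilt whenever the file changes.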
In the unique case of deleting lines at the end of a file, you
can use tell() and truncate(). The following code snippet deletes
the last line of a file without making a copy or reading the
whole file into memory:
    open (FH, "+< $file");
    while ( <FH> ) { $addr = tell(FH) unless eof(FH) }
    truncate(FH, $addr);
Error checking is left as an exercise for the reader.
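For readers who would rather not do the exercise, here is one possible version with the checks filled in. It is a sketch, not the only way to do it; the helper name C<delete_last_line> is hypothetical, and it assumes a Perl with lexical filehandles:

```perl
use strict;
use warnings;

# Hypothetical helper: remove the last line of $file in place.
sub delete_last_line {
    my ($file) = @_;
    open(my $fh, '+<', $file) or die "can't open $file: $!";
    my $addr = 0;                 # offset at which the last line starts
    while (defined(my $line = <$fh>)) {
        # Remember the position after every line except the final one.
        $addr = tell($fh) unless eof($fh);
    }
    truncate($fh, $addr)      or die "can't truncate $file: $!";
    close($fh)                or die "can't close $file: $!";
}
```

Starting C<$addr> at zero means a one-line file is simply emptied.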
=head2 How do I count the number of lines in a file?
One fairly efficient way is to count newlines in the file. The
following program uses a feature of tr///, as documented in L<perlop>.
If your text file doesn't end with a newline, then it's not really a
proper text file, so this may report one fewer line than you expect.
    $lines = 0;
    open(FILE, $filename) or die "Can't open `$filename': $!";
    while (sysread FILE, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close FILE;
This assumes no funny games with newline translations.
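If you would rather count a trailing fragment that lacks its newline as a line, a variation along these lines might do. It is a sketch under the same assumptions as above; C<count_lines> is a hypothetical name, and binmode() is used to keep newline translation out of the picture:

```perl
use strict;
use warnings;

# Hypothetical helper: count lines in $filename, treating a final
# fragment with no trailing newline as one more line.
sub count_lines {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "can't open $filename: $!";
    binmode($fh);                         # no newline translation
    my ($lines, $last, $buffer) = (0, "\n", '');
    while (sysread($fh, $buffer, 4096)) {
        $lines += ($buffer =~ tr/\n//);   # tr returns the match count
        $last = substr($buffer, -1);      # last byte seen so far
    }
    close($fh);
    $lines++ if $last ne "\n";            # unterminated final line
    return $lines;
}
```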
=head2 How do I make a temporary file name?
Use the C<new_tmpfile> class method from the IO::File module to get a
filehandle opened for reading and writing. Use this if you don't
need to know the file's name.
    use IO::File;
    $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";
Or you can use the C<tmpnam> function from the POSIX module to get a
filename that you then open yourself. Use this if you do need to know
the file's name.
    use Fcntl;
    use POSIX qw(tmpnam);
    # try new temporary filenames until we get one that didn't already
    # exist; the check should be unnecessary, but you can't be too careful
    do { $name = tmpnam() }
        until sysopen(FH, $name, O_RDWR|O_CREAT|O_EXCL);
    # install atexit-style handler so that when we exit or die,
    # we automatically delete this temporary file
    END { unlink($name) or die "Couldn't unlink $name : $!" }
    # now go on to use the file ...
If you're committed to doing this by hand, use the process ID and/or
the current time-value. If you need to have many temporary files in
one process, use a counter:
    BEGIN {
        use Fcntl;
        my $temp_dir  = -d '/tmp' ? '/tmp' : $ENV{TMP} || $ENV{TEMP};
        my $base_name = sprintf("%s/%d-%d-0000", $temp_dir, $$, time());
        sub temp_file {
            local *FH;
            my $count = 0;
            until (defined(fileno(FH)) || $count++ > 100) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                sysopen(FH, $base_name, O_WRONLY|O_EXCL|O_CREAT);
            }
            if (defined(fileno(FH))) {
                return (*FH, $base_name);
            } else {
                return ();
            }
        }
    }
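If your Perl ships the File::Temp module (an assumption about your installation; it is bundled with releases later than the one this FAQ was written for), the pick-a-name-and-open-exclusively loop is packaged up for you. A minimal sketch, in which the template C<scratchXXXXXX> is just an illustrative choice:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile() picks an unused name and opens it exclusively in one step.
# TMPDIR => 1 places the file in the system temporary directory, and
# UNLINK => 1 asks for it to be removed when the program exits.
my ($fh, $name) = tempfile("scratchXXXXXX", TMPDIR => 1, UNLINK => 1);

print {$fh} "temporary data\n" or die "can't write to $name: $!";
close($fh)                     or die "can't close $name: $!";
print "wrote ", (-s $name), " bytes to a temporary file\n";
```

This gives you both the handle and the name, so it covers the same ground as the tmpnam() example without the hand-written retry loop or END block.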
=head2 How can I manipulate fixed-record-length files?
The most efficient way is using pack() and unpack(). This is faster than
using substr() when you take many, many strings. It is slower for just a few.
Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:
    # sample input line:
    #   15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
    $PS_T = 'A6 A4 A7 A5 A*';
    open(PS, "ps|");
    print scalar <PS>;
    while (<PS>) {
        ($pid, $tt, $stat, $time, $command) = unpack($PS_T, $_);
        for $var (qw!pid tt stat time command!) {
            print "$var: <$$var>\n";
        }
        print 'line=', pack($PS_T, $pid, $tt, $stat, $time, $command),
            "\n";
    }
We've used C<$$var> in a way that is forbidden by C<use strict 'refs'>.
That is, we've promoted a string to a scalar variable reference using
symbolic references. This is ok in small programs, but doesn't scale
well. It also only works on global variables, not lexicals.
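One way around symbolic references is to unpack into a hash slice keyed by field name, which works under C<use strict> and with lexicals. This is a sketch; C<parse_ps_line> and C<@fields> are hypothetical names, and the input line is built with pack() so that the column widths are known to match the template:

```perl
use strict;
use warnings;

my $PS_T   = 'A6 A4 A7 A5 A*';
my @fields = qw(pid tt stat time command);

# Hypothetical helper: unpack one fixed-format line into a hash
# keyed by field name, instead of into variables named by strings.
sub parse_ps_line {
    my ($line) = @_;
    my %rec;
    @rec{@fields} = unpack($PS_T, $line);
    return \%rec;
}

# Build a line with exactly the widths the template expects.
my $line = pack($PS_T, 15158, 'p5', 'T', '0:00',
                'perl /home/tchrist/scripts/now-what');
my $rec  = parse_ps_line($line);
print "$_: <$rec->{$_}>\n" for @fields;
```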
=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?
The fastest, simplest, and most direct way is to localize the typeglob
of the filehandle in question:
    local *TmpHandle;
Typeglobs are fast (especially compared with the alternatives) and
reasonably easy to use, but they also have one subtle drawback. If you
had, for example, a function named TmpHandle(), or a variable named
%TmpHandle, you just hid it from yourself.
    sub findme {
        local *HostFile;
        open(HostFile, "</etc/hosts") or die "no /etc/hosts: $!";
        local $_;               # <- VERY IMPORTANT
        while (<HostFile>) {
            print if /\b127\.(0\.0\.)?1\b/;
        }
        # *HostFile automatically closes/disappears here
    }
Here's how to use this in a loop to open and store a bunch of
filehandles. We'll use as values of the hash an ordered
pair to make it easy to sort the hash in insertion order.
    @names = qw(motd termcap passwd hosts);
    my $i = 0;
    foreach $filename (@names) {
        local *FH;
        open(FH, "/etc/$filename") || die "$filename: $!";
        $file{$filename} = [ $i++, *FH ];
    }
    # Using the filehandles in the array
    foreach $name (sort { $file{$a}[0] <=> $file{$b}[0] } keys %file) {
        my $fh = $file{$name}[1];
        my $line = <$fh>;
        print "$name $. $line";
    }
For passing filehandles to functions, the easiest way is to
prefix them with a star, as in func(*STDIN). See L<perlfaq7/"Passing
Filehandles"> for details.
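As a small illustration of the star notation, the sketch below passes the same handle to a function first as a typeglob and then as a reference to a typeglob. The names C<first_line> and C<SAMPLE> are hypothetical, and opening a handle on a scalar reference assumes a Perl with in-memory filehandles (5.8 or later):

```perl
use strict;
use warnings;

# Hypothetical helper: read the next line from whatever handle it gets.
sub first_line {
    my ($fh) = @_;
    my $line = <$fh>;
    return $line;
}

# An in-memory file stands in for a real one so the sketch is self-contained.
my $text = "alpha\nbeta\ngamma\n";
open(SAMPLE, '<', \$text) or die "can't open in-memory file: $!";

print "glob:     ", first_line(*SAMPLE);     # reads "alpha"
print "glob ref: ", first_line(\*SAMPLE);    # reads "beta"
close(SAMPLE);
```

Both calls share one underlying handle, which is why the second call picks up where the first left off.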
If you want to create many, anonymous handles, you should check out the
Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent
code with Symbol::gensym, which is reasonably light-weight:
    foreach $filename (@names) {
        use Symbol;
        my $fh = gensym();
        open($fh, "/etc/$filename") || die "open /etc/$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }
Or here using the semi-object-oriented FileHandle, which certainly isn't
light-weight:
    use FileHandle;
    foreach $filename (@names) {
        my $fh = FileHandle->new("/etc/$filename") or die "$filename: $!";
        $file{$filename} = [ $i++, $fh ];
    }
Please understand that whether the filehandle happens to be a (probably
localized) typeglob or an anonymous handle from one of the modules
in no way affects the bizarre rules for managing indirect handles.
See the next question.
=head2 How can I use a filehandle indirectly?
An indirect filehandle is anything other than a symbol used
in a place where a filehandle is expected. Here are ways
to get indirect filehandles:
    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob
Or you can use the C<new> method from the FileHandle or IO modules to
create an anonymous filehandle, store that in a scalar variable,
and use it as though it were a normal filehandle.
    use FileHandle;
    $fh = FileHandle->new();
    use IO::Handle;                     # 5.004 or higher
    $fh = IO::Handle->new();
Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<E<lt>FHE<gt>> diamond operator will accept either a named filehandle
or a scalar variable containing one:
    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    $got = <$ifh>;
    print $efh "What was that: $got";
If you're passing a filehandle to a function, you can write
the function in two ways:
    sub accept_fh {
        my $fh = shift;