=head1 NAME

perlopentut - tutorial on opening things in Perl

=head1 DESCRIPTION

Perl has two simple, built-in ways to open files: the shell way for
convenience, and the C way for precision. The shell way also has 2- and
3-argument forms, which have different semantics for handling the filename.
The choice is yours.

=head1 Open E<agrave> la shell

Perl's C<open> function was designed to mimic the way command-line
redirection in the shell works. Here are some basic examples
from the shell:

    $ myprogram file1 file2 file3
    $ myprogram < inputfile
    $ myprogram > outputfile
    $ myprogram >> outputfile
    $ myprogram | otherprogram
    $ otherprogram | myprogram

And here are some more advanced examples:

    $ otherprogram | myprogram f1 - f2
    $ otherprogram 2>&1 | myprogram -
    $ myprogram <&3
    $ myprogram >&4

Programmers accustomed to constructs like those above can take comfort
in learning that Perl directly supports these familiar constructs using
virtually the same syntax as the shell.

=head2 Simple Opens

The C<open> function takes two arguments: the first is a filehandle,
and the second is a single string comprising both what to open and how
to open it. C<open> returns true when it works, and when it fails,
returns a false value and sets the special variable C<$!> to reflect
the system error. If the filehandle was previously opened, it will
be implicitly closed first.

For example:

    open(INFO, "datafile") || die("can't open datafile: $!");
    open(INFO, "< datafile") || die("can't open datafile: $!");
    open(RESULTS,"> runstats") || die("can't open runstats: $!");
    open(LOG, ">> logfile ") || die("can't open logfile: $!");

If you prefer the low-punctuation version, you could write that this way:

    open INFO, "< datafile" or die "can't open datafile: $!";
    open RESULTS,"> runstats" or die "can't open runstats: $!";
    open LOG, ">> logfile " or die "can't open logfile: $!";

A few things to notice.
First, the leading less-than is optional.
If omitted, Perl assumes that you want to open the file for reading.

Note also that the first example uses the C<||> logical operator, and the
second uses C<or>, which has lower precedence. Using C<||> in the latter
examples would effectively mean

    open INFO, ( "< datafile" || die "can't open datafile: $!" );

which is definitely not what you want.

The other important thing to notice is that, just as in the shell,
any whitespace before or after the filename is ignored. This is good,
because you wouldn't want these to do different things:

    open INFO, "<datafile"
    open INFO, "< datafile"
    open INFO, "<  datafile"

Ignoring surrounding whitespace also helps for when you read a filename
in from a different file, and forget to trim it before opening:

    $filename = <INFO>; # oops, \n still there
    open(EXTRA, "< $filename") || die "can't open $filename: $!";

This is not a bug, but a feature. Because C<open> mimics the shell in
its style of using redirection arrows to specify how to open the file, it
also does so with respect to extra whitespace around the filename itself.
For accessing files with naughty names, see
L<"Dispelling the Dweomer">.

There is also a 3-argument version of C<open>, which lets you put the
special redirection characters into their own argument:

    open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";

In this case, the filename to open is the actual string in C<$datafile>,
so you don't have to worry about C<$datafile> containing characters
that might influence the open mode, or whitespace at the beginning of
the filename that would be absorbed in the 2-argument version. Also,
any reduction of unnecessary string interpolation is a good thing.

=head2 Indirect Filehandles

C<open>'s first argument can be a reference to a filehandle.
As of perl 5.6.0, if the argument is uninitialized, Perl will automatically
create a filehandle and put a reference to it in the first argument,
like so:

    open( my $in, $infile ) or die "Couldn't read $infile: $!";
    while ( <$in> ) {
        # do something with $_
    }
    close $in;

Indirect filehandles make namespace management easier. Since filehandles
are global to the current package, two subroutines trying to open
C<INFILE> will clash. With two functions opening indirect filehandles
like C<my $infile>, there's no clash and no need to worry about future
conflicts.

Another convenient behavior is that an indirect filehandle automatically
closes when it goes out of scope or when you undefine it:

    sub firstline {
        open( my $in, shift ) && return scalar <$in>;
        # no close() required
    }

=head2 Pipe Opens

In C, when you want to open a file using the standard I/O library,
you use the C<fopen> function, but when opening a pipe, you use the
C<popen> function. But in the shell, you just use a different redirection
character. That's also the case for Perl. The C<open> call
remains the same--just its argument differs.

If the leading character is a pipe symbol, C<open> starts up a new
command and opens a write-only filehandle leading into that command.
This lets you write into that handle and have what you write show up on
that command's standard input. For example:

    open(PRINTER, "| lpr -Plp1") || die "can't run lpr: $!";
    print PRINTER "stuff\n";
    close(PRINTER) || die "can't close lpr: $!";

If the trailing character is a pipe, you start up a new command and open a
read-only filehandle leading out of that command. This lets whatever that
command writes to its standard output show up on your handle for reading.
For example:

    open(NET, "netstat -i -n |") || die "can't fork netstat: $!";
    while (<NET>) { } # do something with input
    close(NET) || die "can't close netstat: $!";

What happens if you try to open a pipe to or from a non-existent
command? If possible, Perl will detect the failure and set C<$!> as
usual.
But if the command contains special shell characters, such as
C<E<gt>> or C<*>, called 'metacharacters', Perl does not execute the
command directly. Instead, Perl runs the shell, which then tries to
run the command. This means that it's the shell that gets the error
indication. In such a case, the C<open> call will only indicate
failure if Perl can't even run the shell. See L<perlfaq8/"How can I
capture STDERR from an external command?"> to see how to cope with
this. There's also an explanation in L<perlipc>.

If you would like to open a bidirectional pipe, the IPC::Open2
library will handle this for you. Check out
L<perlipc/"Bidirectional Communication with Another Process">.

=head2 The Minus File

Again following the lead of the standard shell utilities, Perl's
C<open> function treats a file whose name is a single minus, "-", in a
special way. If you open minus for reading, it really means to access
the standard input. If you open minus for writing, it really means to
access the standard output.

If minus can be used as the default input or default output, what happens
if you open a pipe into or out of minus? What's the default command it
would run? The same script as you're currently running! This is actually
a stealth C<fork> hidden inside an C<open> call. See
L<perlipc/"Safe Pipe Opens"> for details.

=head2 Mixing Reads and Writes

It is possible to specify both read and write access. All you do is
add a "+" symbol in front of the redirection. But as in the shell,
using a less-than on a file never creates a new file; it only opens an
existing one. On the other hand, using a greater-than always clobbers
(truncates to zero length) an existing file, or creates a brand-new one
if there isn't an old one. Adding a "+" for read-write doesn't affect
whether it only works on existing files or always clobbers existing ones.

    open(WTMP, "+< /usr/adm/wtmp")
        || die "can't open /usr/adm/wtmp: $!";

    open(SCREEN, "+> lkscreen")
        || die "can't open lkscreen: $!";

    open(LOGFILE, "+>> /var/log/applog")
        || die "can't open /var/log/applog: $!";

The first one won't create a new file, and the second one will always
clobber an old one. The third one will create a new file if necessary
and not clobber an old one, and it will allow you to read at any point
in the file, but all writes will always go to the end. In short,
the first case is substantially more common than the second and third
cases, which are almost always wrong. (If you know C, the plus in
Perl's C<open> is historically derived from the one in C's fopen(3S),
which it ultimately calls.)

In fact, when it comes to updating a file, unless you're working on
a binary file as in the WTMP case above, you probably don't want to
use this approach for updating. Instead, Perl's B<-i> flag comes to
the rescue. The following command takes all the C, C++, or yacc source
or header files and changes all their foo's to bar's, leaving
the old version in the original filename with a ".orig" tacked
on the end:

    $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]

This is a short cut for some renaming games that are really
the best way to update textfiles. See the second question in
L<perlfaq5> for more details.

=head2 Filters

One of the most common uses for C<open> is one you never
even notice. When you process the ARGV filehandle using
C<< <ARGV> >>, Perl actually does an implicit open
on each file in @ARGV. Thus a program called like this:

    $ myprogram file1 file2 file3

can have all its files opened and processed one at a time
using a construct no more complex than:

    while (<>) {
        # do something with $_
    }

If @ARGV is empty when the loop first begins, Perl pretends you've opened
up minus, that is, the standard input.
In fact, $ARGV, the currently
open file during C<< <ARGV> >> processing, is even set to "-"
in these circumstances.

You are welcome to pre-process your @ARGV before starting the loop to
make sure it's to your liking. One reason to do this might be to remove
command options beginning with a minus. While you can always roll the
simple ones by hand, the Getopt modules are good for this:

    use Getopt::Std;

    # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
    getopts("vDo:");

    # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
    getopts("vDo:", \%args);

Or the standard Getopt::Long module to permit named arguments:

    use Getopt::Long;

    GetOptions( "verbose"  => \$verbose,   # --verbose
                "Debug"    => \$debug,     # --Debug
                "output=s" => \$output );
        # --output=somestring or --output somestring

Another reason for preprocessing arguments is to make an empty
argument list default to all files:

    @ARGV = glob("*") unless @ARGV;

You could even filter out all but plain text files. This is a bit
silent, of course, and you might prefer to mention them on the way.

    @ARGV = grep { -f && -T } @ARGV;

If you're using the B<-n> or B<-p> command-line options, you
should put changes to @ARGV in a C<BEGIN{}> block.

Remember that a normal C<open> has special properties, in that it might
call fopen(3S) or it might call popen(3S), depending on what its
argument looks like; that's why it's sometimes called "magic open".
Here's an example:

    $pwdinfo = `domainname` =~ /^(\(none\))?$/
                    ? '< /etc/passwd'
                    : 'ypcat passwd |';

    open(PWD, $pwdinfo) or die "can't open $pwdinfo: $!";

This sort of thing also comes into play in filter processing.
Because C<< <ARGV> >> processing employs the normal, shell-style Perl C<open>,
it respects all the special things we've already seen:

    $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

That program will read from the file F<f1>, the process F<cmd1>, standard
input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command,
and finally the F<f3> file.

Yes, this also means that if you have files named "-" (and so on) in
your directory, they won't be processed as literal files by C<open>.
You'll need to pass them as "./-", much as you would for the I<rm> program,
or you could use C<sysopen> as described below.

One of the more interesting applications is to change files of a certain
name into pipes. For example, to autoprocess gzipped or compressed
files by decompressing them with I<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV;

Or, if you have the I<GET> program installed from LWP,
you can fetch URLs before processing them:

    @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;

It's not for nothing that this is called magic C<< <ARGV> >>.
Pretty nifty, eh?
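The filename-to-pipe rewriting can be wrapped in a small subroutine. What
follows is only a sketch: C<shell_quote> and C<magic_args> are hypothetical
names, not part of Perl, and the single-quote escaping assumes that commands
are run by a POSIX-style shell. The quoting is an addition to the one-liner
above, so that a filename containing spaces or metacharacters is handed to
I<gzip> as a single argument.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a string in single quotes for a POSIX-style
# shell, turning each embedded ' into '\'' so the quoting stays intact.
sub shell_quote {
    my ($s) = @_;          # work on a copy, leave the caller's value alone
    $s =~ s/'/'\\''/g;
    return "'$s'";
}

# Hypothetical helper: turn names ending in .gz or .Z into pipe opens that
# read the decompressed data, and pass every other argument through as is.
sub magic_args {
    return map {
        /\.(?:gz|Z)$/ ? "gzip -dc " . shell_quote($_) . " |" : $_
    } @_;
}
```

Under those assumptions, a program could say C<@ARGV = magic_args(@ARGV);>
before its C<< while (<>) >> loop. Arguments that already use the special
forms, such as "-" or "cmd1|", are left untouched and keep their magic.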