package File::Find;
use 5.005_64;
require Exporter;
require Cwd;

=head1 NAME

find - traverse a file tree

finddepth - traverse a directory structure depth-first

=head1 SYNOPSIS

    use File::Find;
    find(\&wanted, '/foo', '/bar');
    sub wanted { ... }

    use File::Find;
    finddepth(\&wanted, '/foo', '/bar');
    sub wanted { ... }

    use File::Find;
    find({ wanted => \&process, follow => 1 }, '.');

=head1 DESCRIPTION

The first argument to find() is either a hash reference describing the
operations to be performed for each file, or a code reference.

Here are the possible keys for the hash:

=over 3

=item C<wanted>

The value should be a code reference. This code reference is called
I<the wanted() function> below.

=item C<bydepth>

Reports the name of a directory only AFTER all its entries
have been reported. Entry point finddepth() is a shortcut for
specifying C<{ bydepth => 1 }> in the first argument of find().

=item C<preprocess>

The value should be a code reference. This code reference is used to
preprocess a directory; it is called after readdir() but before the loop that
calls the wanted() function. It is called with a list of strings and is
expected to return a list of strings. The code can be used to sort the
strings alphabetically, numerically, or to filter out directory entries based
on their name alone.

=item C<postprocess>

The value should be a code reference. It is invoked just before leaving the
current directory. It is called in void context with no arguments. The name
of the current directory is in $File::Find::dir. This hook is handy for
summarizing a directory, such as calculating its disk usage.

=item C<follow>

Causes symbolic links to be followed. Since directory trees with symbolic
links (followed) may contain files more than once and may even have
cycles, a hash has to be built up with an entry for each file.
This might be expensive both in space and time for a large
directory tree.
See I<follow_fast> and I<follow_skip> below.
If either I<follow> or I<follow_fast> is in effect:

=over 6

=item *

It is guaranteed that an I<lstat> has been called before the user's
I<wanted()> function is called. This enables fast file checks involving S< _>.

=item *

There is a variable C<$File::Find::fullname> which holds the absolute
pathname of the file with all symbolic links resolved.

=back

=item C<follow_fast>

This is similar to I<follow> except that it may report some files more
than once. It does detect cycles, however. Since only symbolic links
have to be hashed, this is much cheaper both in space and time. If
processing a file more than once (by the user's I<wanted()> function)
is worse than just taking time, the option I<follow> should be used.

=item C<follow_skip>

C<follow_skip==1>, which is the default, causes all files which are
neither directories nor symbolic links to be ignored if they are about
to be processed a second time. If a directory or a symbolic link
is about to be processed a second time, File::Find dies.
C<follow_skip==0> causes File::Find to die if any file is about to be
processed a second time.
C<follow_skip==2> causes File::Find to ignore any duplicate files and
directories but to proceed normally otherwise.

=item C<no_chdir>

Does not C<chdir()> to each directory as it recurses. The wanted()
function will need to be aware of this, of course. In this case,
C<$_> will be the same as C<$File::Find::name>.

=item C<untaint>

If find is used in taint-mode (-T command line switch or if EUID != UID
or if EGID != GID) then internally directory names have to be untainted
before they can be cd'ed to. Therefore they are checked against a regular
expression I<untaint_pattern>. Note that all names passed to the
user's I<wanted()> function are still tainted.

=item C<untaint_pattern>

See above. This should be set using the C<qr> quoting operator.
The default is set to C<qr|^([-+@\w./]+)$|>.
Note that the parentheses are vital.

=item C<untaint_skip>

If set, directories (subtrees) which fail the I<untaint_pattern>
are skipped. The default is to 'die' in such a case.

=back

The wanted() function does whatever verifications you want.
C<$File::Find::dir> contains the current directory name, and C<$_> the
current filename within that directory. C<$File::Find::name> contains
the complete pathname to the file. You are chdir()'d to
C<$File::Find::dir> when the function is called, unless C<no_chdir>
was specified. When C<follow> or C<follow_fast> is in effect, there is
also a C<$File::Find::fullname>. The function may set
C<$File::Find::prune> to prune the tree unless C<bydepth> was
specified. Unless C<follow> or C<follow_fast> is specified, for
compatibility reasons (find.pl, find2perl) there are in addition the
following globals available: C<$File::Find::topdir>,
C<$File::Find::topdev>, C<$File::Find::topino>,
C<$File::Find::topmode> and C<$File::Find::topnlink>.

This library is useful for the C<find2perl> tool, which when fed,

    find2perl / -name .nfs\* -mtime +7 \
        -exec rm -f {} \; -o -fstype nfs -prune

produces something like:

    sub wanted {
        /^\.nfs.*\z/s &&
        (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
        int(-M _) > 7 &&
        unlink($_)
        ||
        ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
        $dev < 0 &&
        ($File::Find::prune = 1);
    }

Set the variable C<$File::Find::dont_use_nlink> if you're using AFS,
since AFS does not report directory link counts that match the number
of subdirectories.

Here's another interesting wanted function. It will find all symlinks
that don't resolve:

    sub wanted {
        -l && !-e && print "bogus link: $File::Find::name\n";
    }

See also the script C<pfind> on CPAN for a nice application of this
module.

=head1 CAVEAT

Be aware that the option to follow symbolic links can be dangerous.
Depending on the structure of the directory tree (including symbolic
links to directories) you might traverse a given (physical) directory
more than once (only if C<follow_fast> is in effect).
Furthermore, deleting or changing files in a symbolically linked directory
might cause very unpleasant surprises, since you delete or change files
in an unknown directory.

=cut

@ISA = qw(Exporter);
@EXPORT = qw(find finddepth);

use strict;
my $Is_VMS;

require File::Basename;

my %SLnkSeen;
my ($wanted_callback, $avoid_nlink, $bydepth, $no_chdir, $follow,
    $follow_skip, $full_check, $untaint, $untaint_skip, $untaint_pat,
    $pre_process, $post_process);

# Combine $fn with the directory part of $cdir (everything up to the
# last '/'), collapsing leading '../' components of $fn.
sub contract_name {
    my ($cdir,$fn) = @_;

    return substr($cdir,0,rindex($cdir,'/')) if $fn eq '.';

    $cdir = substr($cdir,0,rindex($cdir,'/')+1);

    $fn =~ s|^\./||;

    my $abs_name= $cdir . $fn;

    if (substr($fn,0,3) eq '../') {
        do 1 while ($abs_name=~ s|/(?>[^/]+)/\.\./|/|);
    }

    return $abs_name;
}

# Resolve $Name relative to $Base. Returns undef if the result is $Base
# itself or one of its ancestor directories.
sub PathCombine($$) {
    my ($Base,$Name) = @_;
    my $AbsName;

    if (substr($Name,0,1) eq '/') {
        $AbsName= $Name;
    }
    else {
        $AbsName= contract_name($Base,$Name);
    }

    # (simple) check for recursion
    my $newlen= length($AbsName);
    if ($newlen <= length($Base)) {
        if (($newlen == length($Base) || substr($Base,$newlen,1) eq '/')
            && $AbsName eq substr($Base,0,$newlen))
        {
            return undef;
        }
    }
    return $AbsName;
}

# Follow a chain of symbolic links starting at $AbsName. Returns the
# resolved name, or undef for a dangling link or a skipped duplicate.
sub Follow_SymLink($) {
    my ($AbsName) = @_;

    my ($NewName,$DEV, $INO);
    ($DEV, $INO)= lstat $AbsName;

    while (-l _) {
        if ($SLnkSeen{$DEV, $INO}++) {
            if ($follow_skip < 2) {
                die "$AbsName is encountered a second time";
            }
            else {
                return undef;
            }
        }
        $NewName= PathCombine($AbsName, readlink($AbsName));
        unless(defined $NewName) {
            if ($follow_skip < 2) {
                die "$AbsName is a recursive symbolic link";
            }
            else {
                return undef;
            }
        }
        else {
            $AbsName= $NewName;
        }
        ($DEV, $INO) = lstat($AbsName);
        return undef unless defined $DEV;  # dangling symbolic link
    }

    if ($full_check && $SLnkSeen{$DEV, $INO}++) {
        if ($follow_skip < 1) {
            die "$AbsName encountered a second time";
        }
        else {
            return undef;
        }
    }

    return $AbsName;
}

our($dir, $name, $fullname, $prune);
sub _find_dir_symlnk($$$);
sub _find_dir($$$);

sub _find_opt {
    my $wanted = shift;
    die "invalid top directory" unless defined $_[0];

    my $cwd =
        $wanted->{bydepth} ? Cwd::fastcwd() : Cwd::cwd();
    my $cwd_untainted = $cwd;
    $wanted_callback  = $wanted->{wanted};
    $bydepth          = $wanted->{bydepth};
    $pre_process      = $wanted->{preprocess};
    $post_process     = $wanted->{postprocess};
    $no_chdir         = $wanted->{no_chdir};
    $full_check       = $wanted->{follow};
    $follow           = $full_check || $wanted->{follow_fast};
    $follow_skip      = $wanted->{follow_skip};
    $untaint          = $wanted->{untaint};
    $untaint_pat      = $wanted->{untaint_pattern};
    $untaint_skip     = $wanted->{untaint_skip};

    # for compatibility reasons (find.pl, find2perl)
    our ($topdir, $topdev, $topino, $topmode, $topnlink);

    # a symbolic link to a directory doesn't increase the link count
    $avoid_nlink      = $follow || $File::Find::dont_use_nlink;

    if ( $untaint ) {
        # list assignment leaves $cwd_untainted undef if the pattern fails
        ($cwd_untainted) = $cwd_untainted =~ m|$untaint_pat|;
        die "insecure cwd in find(depth)" unless defined($cwd_untainted);
    }

    my ($abs_dir, $Is_Dir);

    Proc_Top_Item:
    foreach my $TOP (@_) {
        my $top_item = $TOP;
        $top_item =~ s|/\z|| unless $top_item eq '/';
        $Is_Dir= 0;

        ($topdev,$topino,$topmode,$topnlink) = stat $top_item;

        if ($follow) {
            if (substr($top_item,0,1) eq '/') {
                $abs_dir = $top_item;
            }
            elsif ($top_item eq '.') {
                $abs_dir = $cwd;
            }
            else {  # care about any ../
                $abs_dir = contract_name("$cwd/",$top_item);
            }
            $abs_dir= Follow_SymLink($abs_dir);
            unless (defined $abs_dir) {
                warn "$top_item is a dangling symbolic link\n";
                next Proc_Top_Item;
            }
            if (-d _) {
                _find_dir_symlnk($wanted, $abs_dir, $top_item);
                $Is_Dir= 1;
            }
        }
        else { # no follow
            $topdir = $top_item;
            unless (defined $topnlink) {
                warn "Can't stat $top_item: $!\n";
                next Proc_Top_Item;
            }
            if (-d _) {
                $top_item =~ s/\.dir\z// if $Is_VMS;
                _find_dir($wanted, $top_item, $topnlink);
                $Is_Dir= 1;
            }
            else {
                $abs_dir= $top_item;
            }
        }

        unless ($Is_Dir) {
            unless (($_,$dir) = File::Basename::fileparse($abs_dir)) {
                ($dir,$_) = ('./', $top_item);
            }

            $abs_dir = $dir;
            if ($untaint) {
                my $abs_dir_save = $abs_dir;
                # list assignment leaves $abs_dir undef if the pattern fails
                ($abs_dir) = $abs_dir =~ m|$untaint_pat|;
                unless (defined $abs_dir) {
                    if ($untaint_skip == 0) {
                        die "directory $abs_dir_save is still tainted";
                    }
                    else {
                        next Proc_Top_Item;
                    }
                }
            }

            unless ($no_chdir or chdir $abs_dir) {