📄 tar.pm
字号:
### the gnu tar specification:### http://www.gnu.org/software/tar/manual/tar.html###### and the pax format spec, which tar derives from:### http://www.opengroup.org/onlinepubs/007904975/utilities/pax.htmlpackage Archive::Tar;require 5.005_03;use strict;use vars qw[$DEBUG $error $VERSION $WARN $FOLLOW_SYMLINK $CHOWN $CHMOD $DO_NOT_USE_PREFIX $HAS_PERLIO $HAS_IO_STRING $INSECURE_EXTRACT_MODE ];$DEBUG = 0;$WARN = 1;$FOLLOW_SYMLINK = 0;$VERSION = "1.38";$CHOWN = 1;$CHMOD = 1;$DO_NOT_USE_PREFIX = 0;$INSECURE_EXTRACT_MODE = 0;BEGIN { use Config; $HAS_PERLIO = $Config::Config{useperlio}; ### try and load IO::String anyway, so you can dynamically ### switch between perlio and IO::String eval { require IO::String; import IO::String; }; $HAS_IO_STRING = $@ ? 0 : 1;}use Cwd;use IO::File;use Carp qw(carp croak);use File::Spec ();use File::Spec::Unix ();use File::Path ();use Archive::Tar::File;use Archive::Tar::Constant;=head1 NAMEArchive::Tar - module for manipulations of tar archives=head1 SYNOPSIS use Archive::Tar; my $tar = Archive::Tar->new; $tar->read('origin.tgz',1); $tar->extract(); $tar->add_files('file/foo.pl', 'docs/README'); $tar->add_data('file/baz.txt', 'This is the contents now'); $tar->rename('oldname', 'new/file/name'); $tar->write('files.tar');=head1 DESCRIPTIONArchive::Tar provides an object oriented mechanism for handling tarfiles. It provides class methods for quick and easy files handlingwhile also allowing for the creation of tar file objects for custommanipulation. If you have the IO::Zlib module installed,Archive::Tar will also support compressed or gzipped tar files.An object of class Archive::Tar represents a .tar(.gz) archive fullof files and things.=head1 Object Methods=head2 Archive::Tar->new( [$file, $compressed] )Returns a new Tar object. If given any arguments, C<new()> calls theC<read()> method automatically, passing on the arguments provided tothe C<read()> method.If C<new()> is invoked with arguments and the C<read()> method failsfor any reason, C<new()> returns undef.=cutmy $tmpl = { _data => [ ], _file => 'Unknown',};### install get/set accessors for this object.for my $key ( keys %$tmpl ) { no strict 'refs'; *{__PACKAGE__."::$key"} = sub { my $self = shift; $self->{$key} = $_[0] if @_; return $self->{$key}; }}sub new { my $class = shift; $class = ref $class if ref $class; ### copying $tmpl here since a shallow copy makes it use the ### same aref, causing for files to remain in memory always. my $obj = bless { _data => [ ], _file => 'Unknown' }, $class; if (@_) { unless ( $obj->read( @_ ) ) { $obj->_error(qq[No data could be read from file]); return; } } return $obj;}=head2 $tar->read ( $filename|$handle, $compressed, {opt => 'val'} )Read the given tar file into memory.The first argument can either be the name of a file or a reference toan already open filehandle (or an IO::Zlib object if it's compressed)The second argument indicates whether the file referenced by the firstargument is compressed.The C<read> will I<replace> any previous content in C<$tar>!The second argument may be considered optional if IO::Zlib isinstalled, since it will transparently Do The Right Thing.Archive::Tar will warn if you try to pass a compressed file ifIO::Zlib is not available and simply return.Note that you can currently B<not> pass a C<gzip> compressedfilehandle, which is not opened with C<IO::Zlib>, nor a stringcontaining the full archive information (either compressed oruncompressed). These are worth while features, but not currentlyimplemented. See the C<TODO> section.The third argument can be a hash reference with options. Note thatall options are case-sensitive.=over 4=item limitDo not read more than C<limit> files. This is useful if you havevery big archives, and are only interested in the first few files.=item extractIf set to true, immediately extract entries when reading them. Thisgives you the same memory break as the C<extract_archive> function.Note however that entries will not be read into memory, but writtenstraight to disk.=backAll files are stored internally as C<Archive::Tar::File> objects.Please consult the L<Archive::Tar::File> documentation for details.Returns the number of files read in scalar context, and a list ofC<Archive::Tar::File> objects in list context.=cutsub read { my $self = shift; my $file = shift; my $gzip = shift || 0; my $opts = shift || {}; unless( defined $file ) { $self->_error( qq[No file to read from!] ); return; } else { $self->_file( $file ); } my $handle = $self->_get_handle($file, $gzip, READ_ONLY->( ZLIB ) ) or return; my $data = $self->_read_tar( $handle, $opts ) or return; $self->_data( $data ); return wantarray ? @$data : scalar @$data;}sub _get_handle { my $self = shift; my $file = shift; return unless defined $file; return $file if ref $file; my $gzip = shift || 0; my $mode = shift || READ_ONLY->( ZLIB ); # default to read only my $fh; my $bin; ### only default to ZLIB if we're not trying to /write/ to a handle ### if( ZLIB and $gzip || MODE_READ->( $mode ) ) { ### IO::Zlib will Do The Right Thing, even when passed ### a plain file ### $fh = new IO::Zlib; } else { if( $gzip ) { $self->_error(qq[Compression not available - Install IO::Zlib!]); return; } else { $fh = new IO::File; $bin++; } } unless( $fh->open( $file, $mode ) ) { $self->_error( qq[Could not create filehandle for '$file': $!!] ); return; } binmode $fh if $bin; return $fh;}sub _read_tar { my $self = shift; my $handle = shift or return; my $opts = shift || {}; my $count = $opts->{limit} || 0; my $extract = $opts->{extract} || 0; ### set a cap on the amount of files to extract ### my $limit = 0; $limit = 1 if $count > 0; my $tarfile = [ ]; my $chunk; my $read = 0; my $real_name; # to set the name of a file when # we're encountering @longlink my $data; LOOP: while( $handle->read( $chunk, HEAD ) ) { ### IO::Zlib doesn't support this yet my $offset = eval { tell $handle } || 'unknown'; unless( $read++ ) { my $gzip = GZIP_MAGIC_NUM; if( $chunk =~ /$gzip/ ) { $self->_error( qq[Cannot read compressed format in tar-mode] ); return; } } ### if we can't read in all bytes... ### last if length $chunk != HEAD; ### Apparently this should really be two blocks of 512 zeroes, ### but GNU tar sometimes gets it wrong. See comment in the ### source code (tar.c) to GNU cpio. next if $chunk eq TAR_END; ### according to the posix spec, the last 12 bytes of the header are ### null bytes, to pad it to a 512 byte block. That means if these ### bytes are NOT null bytes, it's a corrrupt header. See: ### www.koders.com/c/fidCE473AD3D9F835D690259D60AD5654591D91D5BA.aspx ### line 111 { my $nulls = join '', "\0" x 12; unless( $nulls eq substr( $chunk, 500, 12 ) ) { $self->_error( qq[Invalid header block at offset $offset] ); next LOOP; } } ### pass the realname, so we can set it 'proper' right away ### some of the heuristics are done on the name, so important ### to set it ASAP my $entry; { my %extra_args = (); $extra_args{'name'} = $$real_name if defined $real_name; unless( $entry = Archive::Tar::File->new( chunk => $chunk, %extra_args ) ) { $self->_error( qq[Couldn't read chunk at offset $offset] ); next LOOP; } } ### ignore labels: ### http://www.gnu.org/manual/tar/html_node/tar_139.html next if $entry->is_label; if( length $entry->type and ($entry->is_file || $entry->is_longlink) ) { if ( $entry->is_file && !$entry->validate ) { ### sometimes the chunk is rather fux0r3d and a whole 512 ### bytes ends up in the ->name area. ### clean it up, if need be my $name = $entry->name; $name = substr($name, 0, 100) if length $name > 100; $name =~ s/\n/ /g; $self->_error( $name . qq[: checksum error] ); next LOOP; } my $block = BLOCK_SIZE->( $entry->size ); $data = $entry->get_content_by_ref; ### just read everything into memory ### can't do lazy loading since IO::Zlib doesn't support 'seek' ### this is because Compress::Zlib doesn't support it =/ ### this reads in the whole data in one read() call. if( $handle->read( $$data, $block ) < $block ) { $self->_error( qq[Read error on tarfile (missing data) ']. $entry->full_path ."' at offset $offset" ); next LOOP; } ### throw away trailing garbage ### substr ($$data, $entry->size) = "" if defined $$data; ### part II of the @LongLink munging -- need to do /after/ ### the checksum check. if( $entry->is_longlink ) { ### weird thing in tarfiles -- if the file is actually a ### @LongLink, the data part seems to have a trailing ^@ ### (unprintable) char. to display, pipe output through less. ### but that doesn't *always* happen.. so check if the last ### character is a control character, and if so remove it ### at any rate, we better remove that character here, or tests ### like 'eq' and hashlook ups based on names will SO not work ### remove it by calculating the proper size, and then ### tossing out everything that's longer than that size. ### count number of nulls my $nulls = $$data =~ tr/\0/\0/; ### cut data + size by that many bytes $entry->size( $entry->size - $nulls ); substr ($$data, $entry->size) = ""; } } ### clean up of the entries.. posix tar /apparently/ has some ### weird 'feature' that allows for filenames > 255 characters ### they'll put a header in with as name '././@LongLink' and the ### contents will be the name of the /next/ file in the archive ### pretty crappy and kludgy if you ask me ### set the name for the next entry if this is a @LongLink; ### this is one ugly hack =/ but needed for direct extraction if( $entry->is_longlink ) { $real_name = $data; next LOOP; } elsif ( defined $real_name ) { $entry->name( $$real_name ); $entry->prefix(''); undef $real_name; } $self->_extract_file( $entry ) if $extract && !$entry->is_longlink && !$entry->is_unknown && !$entry->is_label; ### Guard against tarfiles with garbage at the end last LOOP if $entry->name eq ''; ### push only the name on the rv if we're extracting ### -- for extract_archive push @$tarfile, ($extract ? $entry->name : $entry); if( $limit ) { $count-- unless $entry->is_longlink || $entry->is_dir; last LOOP unless $count; } } continue { undef $data; } return $tarfile;}=head2 $tar->contains_file( $filename )Check if the archive contains a certain file.It will return true if the file is in the archive, false otherwise.Note however, that this function does an exact match using C<eq>on the full path. So it cannot compensate for case-insensitive file-systems or compare 2 paths to see if they would point to the sameunderlying file.=cutsub contains_file { my $self = shift; my $full = shift; return unless defined $full; ### don't warn if the entry isn't there.. that's what this function ### is for after all. local $WARN = 0; return 1 if $self->_find_entry($full); return;}=head2 $tar->extract( [@filenames] )Write files whose names are equivalent to any of the names inC<@filenames> to disk, creating subdirectories as necessary. Thismight not work too well under VMS.Under MacPerl, the file's modification time will be converted to theMacOS zero of time, and appropriate conversions will be done to thepath. However, the length of each element of the path is notinspected to see whether it's longer than MacOS currently allows (32characters).If C<extract> is called without a list of file names, the entirecontents of the archive are extracted.Returns a list of filenames extracted.=cutsub extract { my $self = shift; my @args = @_; my @files; # use the speed optimization for all extracted files local($self->{cwd}) = cwd() unless $self->{cwd}; ### you requested the extraction of only certian files if( @args ) { for my $file ( @args ) { ### it's already an object? if( UNIVERSAL::isa( $file, 'Archive::Tar::File' ) ) { push @files, $file; next; ### go find it then } else { my $found; for my $entry ( @{$self->_data} ) { next unless $file eq $entry->full_path; ### we found the file you're looking for push @files, $entry; $found++;
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -