@c This is part of the paxutils manual.
@c Copyright (C) 2006 Free Software Foundation, Inc.
@c This file is distributed under GFDL 1.1 or any later version
@c published by the Free Software Foundation.

@menu
* Standard::           Basic Tar Format
* Extensions::         @acronym{GNU} Extensions to the Archive Format
* Sparse Formats::     Storing Sparse Files
* Snapshot Files::
* Dumpdir::
@end menu

@node Standard
@unnumberedsec Basic Tar Format
@UNREVISED

While an archive may contain many files, the archive itself is a
single ordinary file.  Like any other file, an archive file can be
written to a storage device such as a tape or disk, sent through a
pipe or over a network, saved on the active file system, or even
stored in another archive.  An archive file is not easy to read or
manipulate without using the @command{tar} utility or Tar mode in
@acronym{GNU} Emacs.

Physically, an archive consists of a series of file entries terminated
by an end-of-archive entry, which consists of two 512-byte blocks of
zero bytes.  A file entry usually describes one of the files in the
archive (an @dfn{archive member}), and consists of a file header and
the contents of the file.  File headers contain file names and
statistics, checksum information which @command{tar} uses to detect
file corruption, and information about file types.

Archives are permitted to have more than one member with the same
member name.  One way this situation can occur is if more than one
version of a file has been stored in the archive.  For information
about adding new versions of a file to an archive, see @ref{update}.

In addition to entries describing archive members, an archive may
contain entries which @command{tar} itself uses to store information.
@xref{label}, for an example of such an archive entry.

A @command{tar} archive file contains a series of blocks.  Each block
contains @code{BLOCKSIZE} bytes.
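
As a rough illustration of the end-of-archive entry described above,
the following C sketch tests whether two consecutive blocks consist
entirely of zero bytes.  The function names @code{is_zero_block} and
@code{is_end_of_archive} are hypothetical, and @code{BLOCKSIZE} is
assumed here to be 512, matching the 512-byte blocks mentioned in the
text; this is not code taken from @command{tar} itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLOCKSIZE 512   /* assumption: matches the 512-byte blocks in the text */

/* Return true if all BLOCKSIZE bytes starting at `block` are zero. */
static bool is_zero_block(const unsigned char *block)
{
    assert(block != NULL);
    for (size_t i = 0; i < BLOCKSIZE; i++) {
        if (block[i] != 0)
            return false;
    }
    return true;
}

/* Hypothetical helper: `two_blocks` points at 2 * BLOCKSIZE bytes read
   from a position where a header block would otherwise be expected.
   Return true if both blocks are zero-filled, i.e. they form the
   end-of-archive entry. */
static bool is_end_of_archive(const unsigned char *two_blocks)
{
    return is_zero_block(two_blocks)
        && is_zero_block(two_blocks + BLOCKSIZE);
}
```

A reader would call such a check only at a position where a header is
expected, since the contents of an archive member may themselves
contain zero-filled blocks.
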
Although this format may be thought
of as being on magnetic tape, other media are often used.

Each file archived is represented by a header block which describes
the file, followed by zero or more blocks which give the contents
of the file.  At the end of the archive file there are two 512-byte blocks
filled with binary zeros as an end-of-file marker.  A reasonable system
should write such an end-of-file marker at the end of an archive, but
must not assume that such a block exists when reading an archive.  In
particular, @GNUTAR{} always issues a warning if it does not encounter it.

The blocks may be @dfn{blocked} for physical I/O operations.
Each record of @var{n} blocks (where @var{n} is set by the
@option{--blocking-factor=@var{512-size}} (@option{-b @var{512-size}})
option to @command{tar}) is written with a single
@w{@samp{write ()}} operation.  On magnetic tapes, the result of
such a write is a single record.  When writing an archive,
the last record of blocks should be written at the full size, with
blocks after the zero block containing all zeros.  When reading
an archive, a reasonable system should properly handle an archive
whose last record is shorter than the rest, or which contains garbage
records after a zero block.

The header block is defined in C as follows.  In the @GNUTAR{}
distribution, this is part of file @file{src/tar.h}:

@smallexample
@include header.texi
@end smallexample

All characters in header blocks are represented by using 8-bit
characters in the local variant of ASCII.  Each field within the
structure is contiguous; that is, there is no padding used within
the structure.  Each character on the archive medium is stored
contiguously.

Bytes representing the contents of files (after the header block
of each file) are not translated in any way and are not constrained
to represent characters in any character set.
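
To make the ``no padding'' property of the header concrete, the sketch
below declares a header using the field widths of the POSIX
@samp{ustar} layout and checks the offsets at compile time.  The type
names @code{ustar_header} and @code{tar_block} are hypothetical, and
the layout is an assumption based on the POSIX format: it is not a
copy of the definition in @file{src/tar.h} and omits the
@acronym{GNU}-specific fields.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

#define BLOCKSIZE 512  /* assumption: size of one archive block */

/* Assumed POSIX ustar field layout.  Every member is a char or an
   array of char, so the compiler inserts no padding between fields. */
struct ustar_header {        /* byte offset */
    char name[100];          /*   0 */
    char mode[8];            /* 100 */
    char uid[8];             /* 108 */
    char gid[8];             /* 116 */
    char size[12];           /* 124 */
    char mtime[12];          /* 136 */
    char chksum[8];          /* 148 */
    char typeflag;           /* 156 */
    char linkname[100];      /* 157 */
    char magic[6];           /* 257 */
    char version[2];         /* 263 */
    char uname[32];          /* 265 */
    char gname[32];          /* 297 */
    char devmajor[8];        /* 329 */
    char devminor[8];        /* 337 */
    char prefix[155];        /* 345 */
};                           /* 500 bytes in use */

/* One block on the archive medium, viewed raw or as a header. */
union tar_block {
    char buffer[BLOCKSIZE];
    struct ustar_header header;
};

static_assert(sizeof(struct ustar_header) == 500, "fields must be contiguous");
static_assert(offsetof(struct ustar_header, chksum) == 148, "chksum offset");
static_assert(offsetof(struct ustar_header, typeflag) == 156, "typeflag offset");
static_assert(sizeof(union tar_block) == BLOCKSIZE, "a header fills one block");
```

Declaring every field as a character array is what makes the in-memory
structure match the on-medium layout byte for byte, so a block can be
read directly into the union.
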
The @command{tar} format
does not distinguish text files from binary files, and no translation
of file contents is performed.

The @code{name}, @code{linkname}, @code{magic}, @code{uname}, and
@code{gname} fields are null-terminated character strings.  All other fields
are zero-filled octal numbers in ASCII.  Each numeric field of width
@var{w} contains @var{w} minus 1 digits, and a null.

The @code{name} field is the file name of the file, with directory names
(if any) preceding the file name, separated by slashes.
@FIXME{how big a name before field overflows?}

The @code{mode} field provides nine bits specifying file permissions
and three bits to specify the Set @acronym{UID}, Set @acronym{GID}, and Save Text
(@dfn{sticky}) modes.  Values for these bits are defined above.
When special permissions are required to create a file with a given
mode, and the user restoring files from the archive does not hold such
permissions, the mode bit(s) specifying those special permissions
are ignored.  Modes which are not supported by the operating system
restoring files from the archive will be ignored.  Unsupported modes
should be approximated when creating or updating an archive; e.g., the
group permission could be copied from the @emph{other} permission.

The @code{uid} and @code{gid} fields are the numeric user and group
@acronym{ID} of the file owners, respectively.  If the operating system does
not support numeric user or group @acronym{ID}s, these fields should
be ignored.

The @code{size} field is the size of the file in bytes; linked files
are archived with this field specified as zero.

The @code{mtime} field is the data modification time of the file at
the time it was archived.  It is the ASCII representation of the octal
value of the last time the file's contents were modified, represented
as an integer number of
seconds since January 1, 1970, 00:00 Coordinated Universal Time.

The @code{chksum} field is the ASCII representation of the octal value
of the simple sum of all bytes in the header block.
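
The following C sketch shows one way a reader might decode a
zero-filled octal field and recompute the header checksum, counting
the bytes of the @code{chksum} field itself as blanks.  The names
@code{parse_octal} and @code{header_checksum} are hypothetical, and
the offset (148) and width (8) of @code{chksum} are assumptions taken
from the POSIX @samp{ustar} layout rather than from this section.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCKSIZE     512
#define CHKSUM_OFFSET 148   /* assumption: POSIX ustar offset of chksum */
#define CHKSUM_WIDTH  8     /* assumption: POSIX ustar width of chksum */

/* Decode an ASCII octal field of `width` bytes.  Leading blanks are
   skipped and decoding stops at the first NUL or blank after the
   digits.  Returns -1 if a character is not an octal digit. */
static long long parse_octal(const char *field, size_t width)
{
    assert(field != NULL);
    long long value = 0;
    size_t i = 0;

    while (i < width && field[i] == ' ')
        i++;
    for (; i < width && field[i] != '\0' && field[i] != ' '; i++) {
        if (field[i] < '0' || field[i] > '7')
            return -1;
        value = value * 8 + (field[i] - '0');
    }
    return value;
}

/* Sum every byte of the header block as an unsigned value, treating
   the chksum field as if it held blanks.  The largest possible sum is
   512 * 255 = 130560, which needs 17 bits; unsigned long is wider. */
static unsigned long header_checksum(const unsigned char *block)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < BLOCKSIZE; i++) {
        if (i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_WIDTH)
            sum += ' ';
        else
            sum += block[i];
    }
    return sum;
}
```

A reader would compare @code{header_checksum (block)} with the value
obtained by passing the stored @code{chksum} field to
@code{parse_octal}, and treat a mismatch as a damaged header.
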
Each 8-bit
byte in the header is added to an unsigned integer, initialized to
zero, the precision of which shall be no less than seventeen bits.
When calculating the checksum, the @code{chksum} field is treated as
if it were all blanks.

The @code{typeflag} field specifies the type of file archived.  If a
particular implementation does not recognize or permit the specified
type, the file will be extracted as if it were a regular file.  As this
action occurs, @command{tar} issues a warning to the standard error.

The @code{atime} and @code{ctime} fields are used in making incremental
backups; they store, respectively, the particular file's access and
status change times.

The @code{offset} field is used by the @option{--multi-volume}
(@option{-M}) option, when making a multi-volume archive.  The offset
is the number of bytes into the file at which to resume in order to
continue the file on the next tape; that is, it records the position
at which a continued file resumes.

The following fields were added to deal with sparse files.  A file
is @dfn{sparse} if it contains unallocated blocks which end up being
represented as zeros, i.e., no useful data.  A test to see if a file
is sparse is to look at the number of blocks allocated for it versus the
number of characters in the file; if there are fewer blocks allocated
for the file than would normally be allocated for a file of that
size, then the file is sparse.  This is the method @command{tar} uses to
detect a sparse file, and once such a file is detected, it is treated
differently from non-sparse files.

Sparse files are often @code{dbm} files, or other database-type files
which have data at some points and emptiness in the greater part of
the file.  Such files can appear to be very large when an @samp{ls
-l} is done on them, when in truth, there may be a very small amount
of important data contained in the file.
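
The block-count test for sparseness described above can be sketched
in C as follows.  The function names @code{looks_sparse} and
@code{file_is_sparse} are hypothetical, and the code assumes that
@code{st_blocks} is counted in 512-byte units, which is common but not
guaranteed on every system; it illustrates the idea rather than
reproducing the code in @command{tar}.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

#define ST_UNIT 512   /* assumption: st_blocks counts 512-byte units */

/* Return true if fewer blocks are allocated than a file of `size`
   bytes would normally need (its size rounded up to whole units). */
static bool looks_sparse(off_t size, blkcnt_t blocks_allocated)
{
    assert(size >= 0);
    off_t needed = size / ST_UNIT + (size % ST_UNIT != 0);
    return blocks_allocated < needed;
}

/* Hypothetical wrapper: 1 if `path` looks sparse, 0 if it does not,
   -1 if stat() fails. */
static int file_is_sparse(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return looks_sparse(st.st_size, st.st_blocks) ? 1 : 0;
}
```

For instance, a file whose size is one megabyte but which has only
eight 512-byte units allocated would be reported as sparse by this
test, while a fully allocated 1024-byte file would not.
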
It is thus undesirable
to have @command{tar} think that it must back up this entire file, as
great quantities of room are wasted on empty blocks, which can lead
to running out of room on a tape far earlier than is necessary.
Thus, sparse files are dealt with so that these empty blocks are
not written to the tape.  Instead, what is written to the tape is a
description, of sorts, of the sparse file: where the holes are, how
big the holes are, and how much data is found at the end of the hole.
This way, the file takes up potentially far less room on the tape,
and when the file is extracted later on, it will look exactly the way
it looked beforehand.  The following is a description of the fields
