<pre> the mapping of the external attributes is
host-system dependent (see 'version made by'). for
ms-dos, the low order byte is the ms-dos directory
attribute byte. if input came from standard input, this
field is set to zero.</pre>
<pre> relative offset of local header: (4 bytes)</pre>
<pre> this is the offset from the start of the first disk on
which this file appears, to where the local header should
be found.</pre>
<pre> filename: (variable)</pre>
<pre> the name of the file, with optional relative path.
the path stored should not contain a drive or
device letter, or a leading slash. all slashes
should be forward slashes '/' as opposed to
backwards slashes '\' for compatibility with amiga
and unix file systems etc. if input came from standard
input, there is no filename field.</pre>
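<pre> The path rules above can be sketched as a small normalizing helper.
This is only an illustration: the function name to_stored_name is
hypothetical, and it assumes that a "drive or device letter" means a
single letter followed by a colon (longer device names are not handled).</pre>

```python
import re


def to_stored_name(path: str) -> str:
    """Return `path` in the form the filename field asks for (hypothetical helper)."""
    # Use forward slashes only, for compatibility with Amiga and Unix systems.
    name = path.replace("\\", "/")
    # Drop a leading single-letter drive such as "C:" (assumption: longer
    # device names are out of scope for this sketch).
    name = re.sub(r"^[A-Za-z]:", "", name)
    # Drop any leading slashes so the stored path is relative.
    return name.lstrip("/")


print(to_stored_name("C:\\docs\\readme.txt"))
```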
<pre> extra field: (variable)</pre>
<pre> this is for future expansion. if additional information
needs to be stored in the future, it should be stored
here. earlier versions of the software can then safely
skip this field, and find the next file or header. this
field will be 0 length in version 1.0.</pre>
<pre> in order to allow different programs and different types
of information to be stored in the 'extra' field in .zip
files, the following structure should be used for all
programs storing data in this field:</pre>
<pre> header1+data1 + header2+data2 . . .</pre>
<pre> each header should consist of:</pre>
<pre> header id - 2 bytes
data size - 2 bytes</pre>
<pre> note: all fields stored in intel low-byte/high-byte order.</pre>
<pre> the header id field indicates the type of data that is in
the following data block.</pre>
<pre> header id's of 0 thru 31 are reserved for use by pkware.
the remaining id's can be used by third party vendors for
proprietary usage.</pre>
<pre> the current header id mappings defined by pkware are:</pre>
<pre> 0x0007 av info
0x0009 os/2
0x000c vax/vms
0x000d reserved for unix</pre>
<pre> several third party mappings commonly used are:</pre>
<pre> 0x4b46 fwkcs md5 (see below)
0x07c8 macintosh
0x4341 acorn/sparkfs
0x4453 windows nt security descriptor (binary acl)
0x4704 vm/cms
0x470f mvs
0x4c41 os/2 access control list (text acl)
0x4d49 info-zip vms (vax or alpha)
0x5455 extended timestamp
0x5855 info-zip unix (original, also os/2, nt, etc)
0x6542 beos/bebox
0x756e asi unix
0x7855 info-zip unix (new)
0xfd4a sms/qdos</pre>
<pre> the data size field indicates the size of the following
data block. programs can use this value to skip to the
next header block, passing over any data blocks that are
not of interest.</pre>
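<pre> A minimal sketch of walking the header+data pairs described above,
using the data size to step over each block. The name iter_extra_blocks
is hypothetical, and raising on truncated input is a choice made for
this example rather than something the text requires.</pre>

```python
import struct


def iter_extra_blocks(extra: bytes):
    """Yield (header_id, data) for each block in an 'extra' field."""
    pos = 0
    while pos < len(extra):
        if len(extra) - pos < 4:
            raise ValueError("truncated extra-field header")
        # header id and data size are 2 bytes each, low byte first.
        header_id, size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        if pos + size > len(extra):
            raise ValueError("data block runs past the end of the extra field")
        yield header_id, extra[pos:pos + size]
        # Skip to the next header, whether or not this block was of interest.
        pos += size


sample = b"\x55\x54\x05\x00" + b"\x01abcd" + b"\x09\x00\x00\x00"
for header_id, data in iter_extra_blocks(sample):
    print(hex(header_id), len(data))
```

<pre> A caller interested only in one block type would compare header_id
against the values in the tables above and ignore everything else.</pre>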
<pre> note: as stated above, the size of the entire .zip file
header, including the filename, comment, and extra
field should not exceed 64k in size.</pre>
<pre> in case two different programs should appropriate the same
header id value, it is strongly recommended that each
program place a unique signature of at least two bytes in
size (and preferably 4 bytes or bigger) at the start of
each data area. every program should verify that its
unique signature is present, in addition to the header id
value being correct, before assuming that it is a block of
known type.</pre>
<pre> -os/2 extra field:</pre>
<pre> the following is the layout of the os/2 attributes "extra" block.
(last revision 09/05/95)</pre>
<pre> note: all fields stored in intel low-byte/high-byte order.
</pre>
<pre>          value     size       description
          -----     ----       -----------
  (os/2)  0x0009    short      tag for this "extra" block type
          tsize     short      size for the following data block
          bsize     long       uncompressed block size
          ctype     short      compression type
          eacrc     long       crc value for uncompressed block
          (var)     variable   compressed block
</pre>
<pre> the os/2 extended attribute structure (fea2list) is compressed and then stored
in its entirety within this structure. there will only ever be one "block" of data
in varfields[].</pre>
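<pre> As an illustration of the os/2 layout, the sketch below splits the
data block that follows the tag and tsize fields. It assumes "short"
means 2 bytes and "long" means 4 bytes; parse_os2_block and the
dictionary keys are hypothetical names. The compressed block is
returned as is, since its decompression is not covered here.</pre>

```python
import struct


def parse_os2_block(data: bytes) -> dict:
    """Split the data part of an os/2 (0x0009) extra block into its fields."""
    if len(data) < 10:
        raise ValueError("os/2 extra block is shorter than its fixed fields")
    # bsize (4 bytes), ctype (2 bytes), eacrc (4 bytes), all low byte first.
    bsize, ctype, eacrc = struct.unpack_from("<LHL", data, 0)
    return {
        "uncompressed_size": bsize,
        "compression_type": ctype,
        "ea_crc": eacrc,
        "compressed": data[10:],  # the single variable-length block
    }


print(parse_os2_block(b"\x10\x00\x00\x00\x08\x00\xef\xbe\xad\xdexyz"))
```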
<pre> -vax/vms extra field:</pre>
<pre> the following is the layout of the vax/vms attributes "extra"
block. (last revision 12/17/91)</pre>
<pre> note: all fields stored in intel low-byte/high-byte order.</pre>
<pre>          value     size       description
          -----     ----       -----------
  (vms)   0x000c    short      tag for this "extra" block type
          tsize     short      size of the total "extra" block
          crc       long       32-bit crc for remainder of the block
          tag1      short      vms attribute tag value #1
          size1     short      size of attribute #1, in bytes
          (var.)    size1      attribute #1 data
          .
          .
          .
          tagn      short      vms attribute tag value #n
          sizen     short      size of attribute #n, in bytes
          (var.)    sizen      attribute #n data</pre>
<pre> rules:</pre>
<pre> 1. there will be one or more attributes present, which will
each be preceded by the above tagx & sizex values. these
values are identical to the atr$c_xxxx and atr$s_xxxx constants
which are defined in atr.h under vms c. neither of these values
will ever be zero.</pre>
<pre> 2. no word alignment or padding is performed.</pre>
<pre> 3. a well-behaved pkzip/vms program should never produce more than
one sub-block with the same tagx value. also, there will never
be more than one "extra" block of type 0x000c in a particular
directory record.</pre>
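<pre> The layout and rules above could be checked with a parser along
these lines. It is a sketch under assumptions: parse_vms_block is a
hypothetical name, its input is taken to be the bytes that follow the
tsize field, and the crc is returned without being verified because
the crc parameters are not given in this section.</pre>

```python
import struct


def parse_vms_block(data: bytes):
    """Return (crc, {tag: attribute_bytes}) for a vax/vms (0x000c) block."""
    if len(data) < 4:
        raise ValueError("block too short to hold the crc")
    (crc,) = struct.unpack_from("<L", data, 0)
    attrs = {}
    pos = 4
    while pos < len(data):
        if len(data) - pos < 4:
            raise ValueError("truncated tag/size pair")
        tag, size = struct.unpack_from("<HH", data, pos)
        pos += 4  # rule 2: no alignment or padding between items
        if tag == 0 or size == 0:
            raise ValueError("rule 1: tag and size are never zero")
        if tag in attrs:
            raise ValueError("rule 3: duplicate tag value")
        if pos + size > len(data):
            raise ValueError("attribute data runs past the end of the block")
        attrs[tag] = data[pos:pos + size]
        pos += size
    if not attrs:
        raise ValueError("rule 1: at least one attribute is expected")
    return crc, attrs


print(parse_vms_block(b"\x78\x56\x34\x12" + b"\x04\x00\x02\x00ab"))
```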
<pre> - fwkcs md5 extra field:</pre>
<pre> the fwkcs contents_signature system, used in
automatically identifying files independent of filename,
optionally adds and uses an extra field to support the
rapid creation of an enhanced contents_signature:</pre>
<pre> header id = 0x4b46
data size = 0x0013
preface = 'm','d','5'
followed by 16 bytes containing the uncompressed
file's 128_bit md5 hash(1), low byte first.</pre>
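<pre> For illustration, the field described above might be built as
follows. fwkcs_md5_field is a hypothetical name. Two assumptions are
made: the preface bytes are uppercase "MD5" (this page shows all text
in lower case), and the byte order produced by hashlib's digest() is
the "low byte first" order the text refers to.</pre>

```python
import hashlib
import struct


def fwkcs_md5_field(content: bytes) -> bytes:
    """Build a 0x4b46 extra-field block for the uncompressed `content`."""
    digest = hashlib.md5(content).digest()  # 16 bytes
    # header id 0x4b46, data size 0x0013 (3 preface bytes + 16 hash bytes).
    header = struct.pack("<HH", 0x4B46, 0x0013)
    # Assumption: the preface is the uppercase ASCII bytes "MD5".
    return header + b"MD5" + digest


field = fwkcs_md5_field(b"abc")
print(len(field), field[7:].hex())
```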
<pre> when fwkcs revises a zipfile central directory to add
this extra field for a file, it also replaces the
central directory entry for that file's uncompressed
filelength with a measured value.</pre>
<pre> fwkcs provides an option to strip this extra field, if
present, from a zipfile central directory. in adding
this extra field, fwkcs preserves zipfile authenticity
verification; if stripping this extra field, fwkcs
preserves all versions of av through pkzip version 2.04g.</pre>
<pre> fwkcs, and fwkcs contents_signature system, are
trademarks of frederick w. kantor.</pre>
<pre> (1) r. rivest, rfc1321.txt, mit laboratory for computer
science and rsa data security, inc., april 1992.
ll.76-77: "the md5 algorithm is being placed in the
public domain for review and possible adoption as a
standard."</pre>
<pre> file comment: (variable)</pre>
<pre> the comment for this file.</pre>
<pre> number of this disk: (2 bytes)</pre>
<pre> the number of this disk, which contains the central
directory end record.</pre>
<pre> number of the disk with the start of the central directory: (2 bytes)</pre>
<pre> the number of the disk on which the central
directory starts.</pre>
<pre> total number of entries in the central dir on this disk: (2 bytes)</pre>
<pre> the number of central directory entries on this disk.</pre>
<pre> total number of entries in the central dir: (2 bytes)</pre>
<pre> the total number of files in the zipfile.
</pre>
<pre> size of the central directory: (4 bytes)</pre>
<pre> the size (in bytes) of the entire central directory.</pre>
<pre> offset of start of central directory with respect to
the starting disk number: (4 bytes)</pre>
<pre> offset of the start of the central directory on the
disk on which the central directory starts.</pre>
<pre> zipfile comment length: (2 bytes)</pre>
<pre> the length of the comment for this zipfile.</pre>
<pre> zipfile comment: (variable)</pre>
<pre> the comment for this zipfile.
</pre>
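<pre> The end record fields listed above, from "number of this disk"
through the zipfile comment, can be read with one struct format. This
is a sketch: parse_end_record and the dictionary keys are hypothetical,
and the 4-byte signature 0x06054b50 that precedes these fields is
assumed from the part of the specification outside this section.
Searching backwards for the signature is a simplification that could
be fooled by a comment containing the same bytes.</pre>

```python
import struct

# Assumption: the end record starts with signature 0x06054b50, low byte first.
EOCD_SIGNATURE = b"PK\x05\x06"


def parse_end_record(buf: bytes) -> dict:
    """Locate and decode the central directory end record in `buf`."""
    pos = buf.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + 22 > len(buf):
        raise ValueError("end of central directory record not found")
    # Four 2-byte fields, two 4-byte fields, then the 2-byte comment length.
    (disk_no, cd_start_disk, entries_here, entries_total,
     cd_size, cd_offset, comment_len) = struct.unpack_from("<HHHHLLH", buf, pos + 4)
    # The comment is not null terminated; its length is given explicitly.
    comment = buf[pos + 22:pos + 22 + comment_len]
    return {
        "disk_no": disk_no,
        "cd_start_disk": cd_start_disk,
        "entries_here": entries_here,
        "entries_total": entries_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment": comment,
    }


sample = (b"junk" + EOCD_SIGNATURE + b"\x00\x00\x00\x00\x02\x00\x02\x00"
          + b"\x64\x00\x00\x00\x04\x00\x00\x00\x05\x00" + b"hello")
print(parse_end_record(sample))
```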
<pre> d. general notes:</pre>
<pre> 1) all fields unless otherwise noted are unsigned and stored
in intel low-byte:high-byte, low-word:high-word order.</pre>
<pre> 2) string fields are not null terminated, since the
length is given explicitly.</pre>
<pre> 3) local headers should not span disk boundaries. also, even
though the central directory can span disk boundaries, no
single record in the central directory should be split
across disks.</pre>
<pre> 4) the entries in the central directory may not necessarily
be in the same order that files appear in the zipfile.</pre>
<pre>unshrinking - method 1
----------------------</pre>
<pre>shrinking is a dynamic ziv-lempel-welch compression algorithm
with partial clearing. the initial code size is 9 bits, and
the maximum code size is 13 bits. shrinking differs from
conventional dynamic ziv-lempel-welch implementations in several
respects:</pre>
<pre>1) the code size is controlled by the compressor, and is not
automatically increased when codes larger than the current
code size are created (but not necessarily used). when
the decompressor encounters the code sequence 256
(decimal) followed by 1, it should increase the code size
read from the input stream to the next bit size. no
blocking of the codes is performed, so the next code at
the increased size should be read from the input stream
immediately after where the previous code at the smaller
bit size was read. again, the decompressor should not
increase the code size used until the sequence 256,1 is
encountered.</pre>
<pre>2) when the table becomes full, total clearing is not
performed. rather, when the compressor emits the code
sequence 256,2 (decimal), the decompressor should clear
all leaf nodes from the ziv-lempel tree, and continue to
use the current code size. the nodes that are cleared
from the ziv-lempel tree are then re-used, with the lowest
code value re-used first, and the highest code value
re-used last. the compressor can emit the sequence 256,2
at any time.
</pre>
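<pre>The two differences above can be sketched in isolation, without a
full decompressor. iter_shrink_codes reads codes at the current code
size and reacts to the 256,1 and 256,2 sequences; partial_clear
removes leaf nodes from a table held as a {code: prefix code}
dictionary. Both names and the table representation are hypothetical.
The sketch also assumes that codes are packed starting at the least
significant bit of each byte, which this section does not state.</pre>

```python
def iter_shrink_codes(data: bytes):
    """Yield ("code", n) for data codes and ("clear", None) for 256,2."""
    bitbuf = 0       # unread bits; the least significant bit is next
    nbits = 0        # number of valid bits in bitbuf
    pos = 0          # index of the next byte to load from data
    size = 9         # initial code size in bits
    escaped = False  # True when the previous code was 256

    while True:
        while nbits < size and pos < len(data):
            bitbuf |= data[pos] << nbits
            nbits += 8
            pos += 1
        if nbits < size:
            return  # not enough bits left for another code
        code = bitbuf & ((1 << size) - 1)
        bitbuf >>= size
        nbits -= size

        if escaped:
            escaped = False
            if code == 1:
                if size == 13:
                    raise ValueError("code size is already 13 bits")
                size += 1  # the next code is read at the larger size
            elif code == 2:
                yield ("clear", None)  # code size stays the same
            else:
                raise ValueError("unexpected code after 256")
        elif code == 256:
            escaped = True
        else:
            yield ("code", code)


def partial_clear(prefix_of: dict) -> list:
    """Delete leaf nodes in place; return freed codes, lowest first."""
    parents = set(prefix_of.values())
    # A leaf is a code that no other entry uses as its prefix.
    leaves = sorted(code for code in prefix_of if code not in parents)
    for code in leaves:
        del prefix_of[code]
    return leaves  # re-use these in order, lowest code value first


table = {257: 65, 258: 257, 259: 66}
print(partial_clear(table), table)
```

<pre>A decompressor built on these pieces would call partial_clear when
it sees ("clear", None) and assign new strings to the returned codes
in order before using any higher code values.</pre>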
<pre>expanding - methods 2-5
-----------------------</pre>
<pre>the reducing algorithm is actually a combination of two
distinct algorithms. the first algorithm compresses repeated
byte sequences, and the second algorithm takes the compressed
stream from the first algorithm and applies a probabilistic
compression method.</pre>
<pre>the probabilistic compression stores an array of 'follower
sets' s(j), for j=0 to 255, corresponding to each possible
ascii character. each set contains between 0 and 32
characters, to be denoted as s(j)[0],...,s(j)[m], where m<32.
the sets are stored at the beginning of the data area for a
reduced file, in reverse order, with s(255) first, and s(0)
last.</pre>
<pre>the sets are encoded as { n(j), s(j)[0],...,s(j)[n(j)-1] },