    <pre>          the mapping of the external attributes is
          host-system dependent (see 'version made by').  for
          ms-dos, the low order byte is the ms-dos directory
          attribute byte.  if input came from standard input, this
          field is set to zero.</pre>
    <pre>      relative offset of local header: (4 bytes)</pre>
    <pre>          this is the offset from the start of the first disk on
          which this file appears, to where the local header should
          be found.</pre>
    <pre>      filename: (variable)</pre>
    <pre>          the name of the file, with optional relative path.
          the path stored should not contain a drive or
          device letter, or a leading slash.  all slashes
          should be forward slashes '/' as opposed to
          backwards slashes '\' for compatibility with amiga
          and unix file systems etc.  if input came from standard
          input, there is no filename field.</pre>
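The path rules above can be sketched as a small normalizer. This is a minimal illustration, not part of the specification: the function name `normalize_zip_name` is hypothetical, and it assumes a drive or device prefix always has the form of a single letter followed by a colon.

```python
import re

def normalize_zip_name(path: str) -> str:
    """Return `path` in the form the filename field expects."""
    # All slashes should be forward slashes.
    name = path.replace("\\", "/")
    # Drop a leading drive or device letter such as "C:" (assumed form).
    name = re.sub(r"^[A-Za-z]:", "", name)
    # Drop leading slashes so that the stored path is relative.
    return name.lstrip("/")

examples = [
    normalize_zip_name("C:\\docs\\readme.txt"),
    normalize_zip_name("/usr/lib/libz.a"),
    normalize_zip_name("src/main.c"),
]
```

A path that is already relative and uses forward slashes passes through unchanged, so the function can be applied to every name before it is written.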
    <pre>      extra field: (variable)</pre>
    <pre>          this is for future expansion.  if additional information
          needs to be stored in the future, it should be stored
          here.  earlier versions of the software can then safely
          skip this file, and find the next file or header.  this
          field will be 0 length in version 1.0.</pre>
    <pre>          in order to allow different programs and different types
          of information to be stored in the 'extra' field in .zip
          files, the following structure should be used for all
          programs storing data in this field:</pre>
    <pre>          header1+data1 + header2+data2 . . .</pre>
    <pre>          each header should consist of:</pre>
    <pre>            header id - 2 bytes
            data size - 2 bytes</pre>
    <pre>          note: all fields stored in intel low-byte/high-byte order.</pre>
    <pre>          the header id field indicates the type of data that is in
          the following data block.</pre>
    <pre>          header id's of 0 thru 31 are reserved for use by pkware.
          the remaining id's can be used by third party vendors for
          proprietary usage.</pre>
    <pre>          the current header id mappings defined by pkware are:</pre>
    <pre>          0x0007        av info
          0x0009        os/2
          0x000c        vax/vms
          0x000d        reserved for unix</pre>
    <pre>          several third party mappings commonly used are:</pre>
    <pre>          0x4b46        fwkcs md5 (see below)
          0x07c8        macintosh
          0x4341        acorn/sparkfs 
          0x4453        windows nt security descriptor (binary acl)
          0x4704        vm/cms
          0x470f        mvs
          0x4c41        os/2 access control list (text acl)
          0x4d49        info-zip vms (vax or alpha)
          0x5455        extended timestamp
          0x5855        info-zip unix (original, also os/2, nt, etc)
          0x6542        beos/bebox
          0x756e        asi unix
          0x7855        info-zip unix (new)
          0xfd4a        sms/qdos</pre>
    <pre>          the data size field indicates the size of the following
          data block. programs can use this value to skip to the
          next header block, passing over any data blocks that are
          not of interest.</pre>
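The header/data layout described above lends itself to a simple walk over the extra field. The following is a sketch under stated assumptions: the name `iter_extra_blocks` is made up for illustration, and the choice to reject a truncated block (rather than ignore it) is not dictated by the text.

```python
import struct

def iter_extra_blocks(extra: bytes):
    """Yield (header_id, data) for each header+data pair in an extra field."""
    pos = 0
    # Fewer than 4 trailing bytes cannot hold a header; they are ignored here.
    while pos + 4 <= len(extra):
        # header id and data size are 2 bytes each, low byte first.
        header_id, size = struct.unpack_from("<HH", extra, pos)
        pos += 4
        data = extra[pos:pos + size]
        if len(data) < size:
            raise ValueError("extra field block is truncated")
        yield header_id, data
        # The data size is what lets a reader skip blocks it does not know.
        pos += size

# Two blocks: an extended timestamp (0x5455) and an os/2 block (0x0009),
# with placeholder payloads.
sample = (struct.pack("<HH", 0x5455, 5) + b"\x01\x00\x00\x00\x00"
          + struct.pack("<HH", 0x0009, 2) + b"ab")
blocks = list(iter_extra_blocks(sample))
```

A reader interested in only one header id can filter the generator and never needs to understand the payload of the other blocks.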
    <pre>          note: as stated above, the size of the entire .zip file
                header, including the filename, comment, and extra
                field should not exceed 64k in size.</pre>
    <pre>          in case two different programs should appropriate the same
          header id value, it is strongly recommended that each
          program place a unique signature of at least two bytes in
          size (and preferably 4 bytes or bigger) at the start of
          each data area.  every program should verify that its
          unique signature is present, in addition to the header id
          value being correct, before assuming that it is a block of
          known type.</pre>
    <pre>         -os/2 extra field:</pre>
    <pre>          the following is the layout of the os/2 attributes &quot;extra&quot; block.
          (last revision  09/05/95)</pre>
    <pre>          note: all fields stored in intel low-byte/high-byte order.
</pre>
    <pre>          value         size            description
          -----         ----            -----------
  (os/2)  0x0009        short           tag for this &quot;extra&quot; block type
          tsize         short           size for the following data block
          bsize         long            uncompressed block size
          ctype         short           compression type
          eacrc         long            crc value for uncompressed block
          (var)         variable        compressed block
</pre>
    <pre>        the os/2 extended attribute structure (fea2list) is compressed and then stored
        in its entirety within this structure.  there will only ever be one &quot;block&quot; of data
        in varfields[].</pre>
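As a rough illustration of the os/2 layout above, the fixed fields can be unpacked in one call. The sketch assumes its input is the data block that follows the 0x0009 tag and the tsize field; the names `Os2Extra` and `parse_os2_extra` are hypothetical, and decompressing the block is left out because it depends on ctype.

```python
import struct
from typing import NamedTuple

class Os2Extra(NamedTuple):
    bsize: int         # uncompressed block size
    ctype: int         # compression type
    eacrc: int         # crc value for the uncompressed block
    compressed: bytes  # the compressed fea2list

def parse_os2_extra(data: bytes) -> Os2Extra:
    """Split the data part of a 0x0009 block into its fields."""
    if len(data) < 10:
        raise ValueError("os/2 extra block is shorter than its fixed fields")
    # long, short, long: 4 + 2 + 4 bytes, low byte first, no padding.
    bsize, ctype, eacrc = struct.unpack_from("<IHI", data, 0)
    return Os2Extra(bsize, ctype, eacrc, data[10:])

sample = struct.pack("<IHI", 100, 8, 0xDEADBEEF) + b"xyz"
parsed = parse_os2_extra(sample)
```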
    <pre>         -vax/vms extra field:</pre>
    <pre>          the following is the layout of the vax/vms attributes &quot;extra&quot;
          block.  (last revision 12/17/91)</pre>
    <pre>          note: all fields stored in intel low-byte/high-byte order.</pre>
    <pre>          value         size            description
          -----         ----            -----------
  (vms)   0x000c        short           tag for this &quot;extra&quot; block type
          tsize         short           size of the total &quot;extra&quot; block
          crc           long            32-bit crc for remainder of the block
          tag1          short           vms attribute tag value #1
          size1         short           size of attribute #1, in bytes
          (var.)        size1           attribute #1 data
          .
          .
          .
          tagn          short           vms attribute tag value #n
          sizen         short           size of attribute #n, in bytes
          (var.)        sizen           attribute #n data</pre>
    <pre>          rules:</pre>
    <pre>          1. there will be one or more attributes present, which will
             each be preceded by the above tagx &amp; sizex values.  these
             values are identical to the atr$c_xxxx and atr$s_xxxx constants
             which are defined in atr.h under vms c.  neither of these values
             will ever be zero.</pre>
    <pre>          2. no word alignment or padding is performed.</pre>
    <pre>          3. a well-behaved pkzip/vms program should never produce more than
             one sub-block with the same tagx value.  also, there will never
             be more than one &quot;extra&quot; block of type 0x000c in a particular
             directory record.</pre>
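The vax/vms layout and its three rules can be checked with a short parser. This is a sketch only. It assumes the input is the data that follows the 0x000c tag and the tsize field, and it assumes the "32-bit crc" is the common CRC-32 computed by `zlib.crc32`, which the text above does not state. The name `parse_vms_extra` is hypothetical.

```python
import struct
import zlib

def parse_vms_extra(data: bytes, verify_crc: bool = True) -> dict:
    """Split the data part of a 0x000c block into {tag: attribute bytes}."""
    if len(data) < 4:
        raise ValueError("block too short to hold the crc field")
    (crc,) = struct.unpack_from("<I", data, 0)
    body = data[4:]
    # Assumption: the crc is the usual CRC-32 over the remainder of the block.
    if verify_crc and zlib.crc32(body) != crc:
        raise ValueError("crc mismatch in vax/vms extra block")
    attrs = {}
    pos = 0
    while pos < len(body):
        if pos + 4 > len(body):
            raise ValueError("truncated tag/size pair")
        tag, size = struct.unpack_from("<HH", body, pos)
        pos += 4  # rule 2: no alignment or padding between sub-blocks
        if tag == 0 or size == 0:
            raise ValueError("rule 1: tag and size are never zero")
        if tag in attrs:
            raise ValueError("rule 3: duplicate tag in one block")
        if pos + size > len(body):
            raise ValueError("attribute data runs past the end of the block")
        attrs[tag] = body[pos:pos + size]
        pos += size
    if not attrs:
        raise ValueError("rule 1: at least one attribute is expected")
    return attrs

# Two made-up attributes; real tag values come from atr.h.
body = struct.pack("<HH", 4, 2) + b"hi" + struct.pack("<HH", 7, 1) + b"z"
sample = struct.pack("<I", zlib.crc32(body)) + body
attributes = parse_vms_extra(sample)
```

Passing `verify_crc=False` skips the checksum comparison, which may be useful if the crc variant turns out to differ from the one assumed here.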
    <pre>          - fwkcs md5 extra field:</pre>
    <pre>          the fwkcs contents_signature system, used in
          automatically identifying files independent of filename,
          optionally adds and uses an extra field to support the
          rapid creation of an enhanced contents_signature:</pre>
    <pre>              header id = 0x4b46
              data size = 0x0013
              preface   = 'm','d','5'
              followed by 16 bytes containing the uncompressed
                  file's 128_bit md5 hash(1), low byte first.</pre>
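A block with this layout can be assembled in a few lines. The sketch below is illustrative and is not the fwkcs implementation. It assumes the preface is the uppercase bytes "MD5" (this page shows all text in lowercase, so the case cannot be read off it) and that "low byte first" matches the byte order `hashlib` already produces for a digest, as defined in RFC 1321. The function name is hypothetical.

```python
import hashlib
import struct

def build_fwkcs_md5_extra(content: bytes, preface: bytes = b"MD5") -> bytes:
    """Return header id, data size, preface and md5 hash as one extra block."""
    digest = hashlib.md5(content).digest()  # 16 bytes
    data = preface + digest                 # 3 + 16 = 19 = 0x0013 bytes
    # header id and data size are 2 bytes each, low byte first.
    return struct.pack("<HH", 0x4B46, len(data)) + data

block = build_fwkcs_md5_extra(b"abc")
```

The hash is taken over the uncompressed file contents, so it has to be computed before compression or after extraction, never over the stored bytes.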
    <pre>          when fwkcs revises a zipfile central directory to add
          this extra field for a file, it also replaces the
          central directory entry for that file's uncompressed
          filelength with a measured value.</pre>
    <pre>          fwkcs provides an option to strip this extra field, if
          present, from a zipfile central directory. in adding
          this extra field, fwkcs preserves zipfile authenticity
          verification; if stripping this extra field, fwkcs
          preserves all versions of av through pkzip version 2.04g.</pre>
    <pre>          fwkcs, and fwkcs contents_signature system, are
          trademarks of frederick w. kantor.</pre>
    <pre>          (1) r. rivest, rfc1321.txt, mit laboratory for computer
              science and rsa data security, inc., april 1992.
              ll.76-77: &quot;the md5 algorithm is being placed in the
              public domain for review and possible adoption as a
              standard.&quot;</pre>
    <pre>      file comment: (variable)</pre>
    <pre>          the comment for this file.</pre>
    <pre>      number of this disk: (2 bytes)</pre>
    <pre>          the number of this disk, which contains the end of
          central directory record.</pre>
    <pre>      number of the disk with the start of the central directory: (2 bytes)</pre>
    <pre>          the number of the disk on which the central
          directory starts.</pre>
    <pre>      total number of entries in the central dir on this disk: (2 bytes)</pre>
    <pre>          the number of central directory entries on this disk.</pre>
    <pre>      total number of entries in the central dir: (2 bytes)</pre>
    <pre>          the total number of files in the zipfile.
</pre>
    <pre>      size of the central directory: (4 bytes)</pre>
    <pre>          the size (in bytes) of the entire central directory.</pre>
    <pre>      offset of start of central directory with respect to
      the starting disk number:  (4 bytes)</pre>
    <pre>          offset of the start of the central directory on the
          disk on which the central directory starts.</pre>
    <pre>      zipfile comment length: (2 bytes)</pre>
    <pre>          the length of the comment for this zipfile.</pre>
    <pre>      zipfile comment: (variable)</pre>
    <pre>          the comment for this zipfile.
</pre>
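The fields of the end of central directory record listed above can be read with one `struct` call. This is a sketch under assumptions. The 4-byte signature "PK\x05\x06" that precedes these fields comes from the record layout given earlier in the specification, not from this section. Searching backwards for it is a simplification, since a zipfile comment could itself contain those bytes. The names `EndRecord` and `parse_end_record` are hypothetical.

```python
import struct
from typing import NamedTuple

END_SIGNATURE = b"PK\x05\x06"  # assumed from the record layout, see lead-in

class EndRecord(NamedTuple):
    disk_number: int       # number of this disk
    cd_start_disk: int     # disk with the start of the central directory
    entries_on_disk: int   # central dir entries on this disk
    total_entries: int     # total entries in the central dir
    cd_size: int           # size of the central directory in bytes
    cd_offset: int         # offset of the central directory on its disk
    comment: bytes         # zipfile comment

def parse_end_record(buf: bytes) -> EndRecord:
    """Locate and decode the end of central directory record in `buf`."""
    pos = buf.rfind(END_SIGNATURE)
    if pos < 0:
        raise ValueError("end of central directory record not found")
    # Four 2-byte fields, two 4-byte fields, one 2-byte comment length.
    fields = struct.unpack_from("<HHHHIIH", buf, pos + 4)
    start = pos + 22  # 4-byte signature plus 18 bytes of fixed fields
    comment = buf[start:start + fields[6]]
    return EndRecord(*fields[:6], comment)

sample = (b"...earlier archive bytes..." + END_SIGNATURE
          + struct.pack("<HHHHIIH", 0, 0, 3, 3, 150, 1024, 5) + b"hello")
record = parse_end_record(sample)
```

For a single-disk archive both disk numbers are zero and the two entry counts agree, as in the sample.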
    <pre>  d.  general notes:</pre>
    <pre>      1)  all fields unless otherwise noted are unsigned and stored
          in intel low-byte:high-byte, low-word:high-word order.</pre>
    <pre>      2)  string fields are not null terminated, since the
          length is given explicitly.</pre>
    <pre>      3)  local headers should not span disk boundaries.  also, even
          though the central directory can span disk boundaries, no
          single record in the central directory should be split
          across disks.</pre>
    <pre>      4)  the entries in the central directory may not necessarily
          be in the same order that files appear in the zipfile.</pre>
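Notes 1 and 2 translate directly into a few helper functions. The helper names below are made up for this illustration. Low-byte:high-byte, low-word:high-word order is what Python calls "little" byte order.

```python
def read_u16(buf: bytes, pos: int) -> int:
    """Decode a 2-byte unsigned field stored low byte first."""
    return int.from_bytes(buf[pos:pos + 2], "little")

def read_u32(buf: bytes, pos: int) -> int:
    """Decode a 4-byte unsigned field: low word first, low byte first in each."""
    return int.from_bytes(buf[pos:pos + 4], "little")

def read_string(buf: bytes, pos: int, length: int) -> bytes:
    """Slice a string field by its explicit length; no terminator is stored."""
    return buf[pos:pos + length]

# A 2-byte field, a 4-byte field, then a 10-byte string.
sample = bytes([0x34, 0x12, 0x78, 0x56, 0x34, 0x12]) + b"readme.txt"
short_value = read_u16(sample, 0)        # 0x1234
long_value = read_u32(sample, 2)         # 0x12345678
name = read_string(sample, 6, 10)        # b"readme.txt"
```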
    <pre>unshrinking - method 1
----------------------</pre>
    <pre>shrinking is a dynamic ziv-lempel-welch compression algorithm
with partial clearing.  the initial code size is 9 bits, and
the maximum code size is 13 bits.  shrinking differs from
conventional dynamic ziv-lempel-welch implementations in several
respects:</pre>
    <pre>1)  the code size is controlled by the compressor, and is not
    automatically increased when codes larger than the current
    code size are created (but not necessarily used).  when
    the decompressor encounters the code sequence 256
    (decimal) followed by 1, it should increase the code size
    read from the input stream to the next bit size.  no
    blocking of the codes is performed, so the next code at
    the increased size should be read from the input stream
    immediately after where the previous code at the smaller
    bit size was read.  again, the decompressor should not
    increase the code size used until the sequence 256,1 is
    encountered.</pre>
    <pre>2)  when the table becomes full, total clearing is not
    performed.  rather, when the compressor emits the code
    sequence 256,2 (decimal), the decompressor should clear
    all leaf nodes from the ziv-lempel tree, and continue to
    use the current code size.  the nodes that are cleared
    from the ziv-lempel tree are then re-used, with the lowest
    code value re-used first, and the highest code value
    re-used last.  the compressor can emit the sequence 256,2
    at any time.

</pre>
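The two control sequences above can be sketched without a full decoder. The code below is an illustration under assumptions, not a complete unshrink implementation. It assumes codes are packed into bytes starting from the least significant bit, which this section does not state. It also represents the ziv-lempel tree as a dictionary from each allocated code to its prefix code. All names are hypothetical.

```python
def pack_codes(pairs):
    """Pack (code, width) pairs into bytes, low bits first (assumed order)."""
    acc, nbits = 0, 0
    for code, width in pairs:
        acc |= code << nbits
        nbits += width
    return acc.to_bytes((nbits + 7) // 8, "little")

def read_codes(data: bytes):
    """Yield ("code", n) for data codes and ("clear", None) for 256,2."""
    bitpos, code_size = 0, 9          # the initial code size is 9 bits
    total_bits = len(data) * 8

    def take(width):
        nonlocal bitpos
        if bitpos + width > total_bits:
            return None               # not enough bits left for a full code
        value = (int.from_bytes(data, "little") >> bitpos) & ((1 << width) - 1)
        bitpos += width
        return value

    while (code := take(code_size)) is not None:
        if code != 256:
            yield ("code", code)
            continue
        sub = take(code_size)         # the sub-code is read at the old size
        if sub == 1:
            if code_size >= 13:       # the maximum code size is 13 bits
                raise ValueError("code size would exceed 13 bits")
            code_size += 1            # the next code follows immediately
        elif sub == 2:
            yield ("clear", None)
        else:
            raise ValueError("unexpected code after 256")

def partial_clear(parent: dict) -> list:
    """Remove leaf codes from `parent`; return them in re-use order."""
    has_child = set(parent.values())
    leaves = sorted(code for code in parent if code not in has_child)
    for code in leaves:
        del parent[code]
    return leaves                     # lowest code value is re-used first

stream = pack_codes([(65, 9), (256, 9), (1, 9), (300, 10),
                     (256, 10), (2, 10), (66, 10)])
events = list(read_codes(stream))
tree = {257: 65, 258: 257, 259: 66, 260: 258}
freed = partial_clear(tree)
```

A real decoder would stop after producing the uncompressed size recorded in the header rather than at the end of the bit stream, since padding bits at the end could otherwise be read as a code.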
    <pre>expanding - methods 2-5
-----------------------</pre>
    <pre>the reducing algorithm is actually a combination of two
distinct algorithms.  the first algorithm compresses repeated
byte sequences, and the second algorithm takes the compressed
stream from the first algorithm and applies a probabilistic
compression method.</pre>
    <pre>the probabilistic compression stores an array of 'follower
sets' s(j), for j=0 to 255, corresponding to each possible
ascii character.  each set contains between 0 and 32
characters, to be denoted as s(j)[0],...,s(j)[m], where m&lt;32.
the sets are stored at the beginning of the data area for a
reduced file, in reverse order, with s(255) first, and s(0)
last.</pre>
    <pre>the sets are encoded as { n(j), s(j)[0],...,s(j)[n(j)-1] },