   bits 1-2: block type
      00 (0) - block is stored - all stored data is byte aligned.
               skip bits until next byte, then next word = block length,
               followed by the one's complement of the block length word.
               remaining data in block is the stored data.</pre>
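    <pre>as a rough illustration of the stored block layout above, the
following python sketch reads the length word and its one's complement
and returns the stored bytes.  the name read_stored_block and the sample
data are hypothetical.  the sketch assumes that each word is 16 bits in
intel low-byte/high-byte order and that the caller has already skipped
to the byte boundary.</pre>

```python
import struct


def read_stored_block(data, pos):
    """Return (stored bytes, offset just past the block).

    pos is the byte offset of the block length word (hypothetical
    calling convention: the header bits were consumed by the caller).
    """
    # Two 16-bit little-endian words: the length and its one's complement.
    length, check = struct.unpack_from("<HH", data, pos)
    if length != (~check & 0xFFFF):
        raise ValueError("stored block length check failed")
    start = pos + 4
    end = start + length
    if end > len(data):
        raise ValueError("stored block is truncated")
    return data[start:end], end


# Sample block: length 5, its complement, the payload, then other data.
sample = struct.pack("<HH", 5, 5 ^ 0xFFFF) + b"hello" + b"rest"
payload, next_pos = read_stored_block(sample, 0)
print(payload, next_pos)
```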
    <pre>      01 (1) - use fixed huffman codes for literal and distance codes.
               lit code    bits             dist code   bits
               ---------   ----             ---------   ----
                 0 - 143    8                 0 - 31      5
               144 - 255    9
               256 - 279    7
               280 - 287    8</pre>
    <pre>               literal codes 286-287 and distance codes 30-31 are never
               used but participate in the huffman construction.</pre>
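    <pre>the fixed code lengths in the table above can be written out as
two lists, one entry per code.  this is only a sketch; the function name
fixed_code_lengths is hypothetical.  the final check shows that the 288
literal lengths form a complete code, which is why codes 286-287 must
take part in the construction even though they are never used.</pre>

```python
def fixed_code_lengths():
    """Return (literal lengths, distance lengths) for fixed huffman blocks."""
    literal = [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8  # codes 0 - 287
    distance = [5] * 32                                    # codes 0 - 31
    return literal, distance


literal, distance = fixed_code_lengths()

# A set of code lengths is complete when the sum of 2^-length equals 1.
kraft_sum = sum(2.0 ** -n for n in literal)
print(len(literal), len(distance), kraft_sum)
```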
    <pre>      10 (2) - dynamic huffman codes.  (see expanding huffman codes)</pre>
    <pre>      11 (3) - reserved - flag an &quot;error in compressed data&quot; if seen.</pre>
    <pre>expanding huffman codes
-----------------------
if the data block is stored with dynamic huffman codes, the huffman
codes are sent in the following compressed format:</pre>
    <pre>   5 bits: # of literal codes sent - 257 (257 - 286)
           all other codes are never sent.
   5 bits: # of dist codes - 1           (1 - 32)
   4 bits: # of bit length codes - 4     (4 - 19)</pre>
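    <pre>a small sketch of reading these three counts follows.  the
BitReader class and the function name are hypothetical.  the sketch
assumes that bits are taken from each byte starting with the least
significant bit, and that the counts are stored as (count - 257),
(count - 1) and (count - 4), as in the deflate format.</pre>

```python
class BitReader:
    """Reads bits from a byte string, least significant bit first (assumed)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits from the start of data

    def read(self, count):
        value = 0
        for i in range(count):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value


def read_dynamic_counts(reader):
    """Return (# literal codes, # distance codes, # bit length codes)."""
    num_literal = reader.read(5) + 257
    num_distance = reader.read(5) + 1
    num_bit_length = reader.read(4) + 4
    return num_literal, num_distance, num_bit_length


# Pack the raw field values 29, 29 and 15 into two bytes for the demo.
packed = (29 | (29 << 5) | (15 << 10)).to_bytes(2, "little")
counts = read_dynamic_counts(BitReader(packed))
print(counts)
```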
    <pre>the huffman codes are sent as bit lengths and the codes are built as
described in the implode algorithm.  the bit lengths themselves are
compressed with huffman codes.  there are 19 bit length codes:</pre>
    <pre>   0 - 15: represent bit lengths of 0 - 15
       16: copy the previous bit length 3 - 6 times.
           the next 2 bits indicate repeat length (0 = 3, ... ,3 = 6)
              example:  codes 8, 16 (+2 bits 11), 16 (+2 bits 10) will
                        expand to 12 bit lengths of 8 (1 + 6 + 5)
       17: repeat a bit length of 0 for 3 - 10 times. (3 bits of length)
       18: repeat a bit length of 0 for 11 - 138 times (7 bits of length)</pre>
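    <pre>the repeat codes can be expanded as sketched below.  the function
name expand_bit_lengths is hypothetical, and the input is assumed to be
a list of (code, extra) pairs in which extra is the value of the extra
bits that followed the code (0 for codes 0 - 15).  the demo reproduces
the example above.</pre>

```python
def expand_bit_lengths(symbols):
    """Expand bit length codes 0 - 18 into a flat list of bit lengths."""
    lengths = []
    for code, extra in symbols:
        if code <= 15:
            lengths.append(code)                 # literal bit length
        elif code == 16:
            if not lengths:
                raise ValueError("code 16 with no previous bit length")
            lengths.extend([lengths[-1]] * (3 + extra))   # 3 - 6 copies
        elif code == 17:
            lengths.extend([0] * (3 + extra))    # 3 - 10 zeros
        elif code == 18:
            lengths.extend([0] * (11 + extra))   # 11 - 138 zeros
        else:
            raise ValueError("invalid bit length code")
    return lengths


# Codes 8, 16 (+2 bits 11), 16 (+2 bits 10) from the example above.
expanded = expand_bit_lengths([(8, 0), (16, 0b11), (16, 0b10)])
print(expanded)
```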
    <pre>the lengths of the bit length codes are sent packed 3 bits per value
(0 - 7) in the following order:</pre>
    <pre>   16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15</pre>
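    <pre>because the values arrive in this permuted order, a decoder has
to place each one in the slot of the code it belongs to.  a minimal
sketch follows; the function name is hypothetical, and codes whose
lengths are not sent are assumed to have length 0.</pre>

```python
ORDER = [16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15]


def place_bit_length_code_lengths(sent):
    """Map the 3-bit values, in the order sent, to codes 0 - 18."""
    lengths = [0] * 19
    for code, value in zip(ORDER, sent):
        lengths[code] = value
    return lengths


# Only four values sent: they belong to codes 16, 17, 18 and 0.
placed = place_bit_length_code_lengths([2, 3, 3, 1])
print(placed)
```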
    <pre>the huffman codes should be built as described in the implode algorithm
except codes are assigned starting at the shortest bit length, i.e. the
shortest code should be all 0's rather than all 1's.  also, codes with
a bit length of zero do not participate in the tree construction.  the
codes are then used to decode the bit lengths for the literal and distance
tables.</pre>
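    <pre>one common way to assign such codes is sketched below.  the
implode description is not repeated here, so the ordering used is an
assumption: within one bit length, codes are handed out in increasing
symbol order, as the deflate format does.  the function name
assign_codes is hypothetical.</pre>

```python
def assign_codes(lengths):
    """Return {symbol: code string} for a list of bit lengths.

    Symbols with a bit length of zero get no code.
    """
    max_len = max(lengths, default=0)
    count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            count[n] += 1

    # First code of each bit length, starting from all 0's at the shortest.
    next_code = [0] * (max_len + 2)
    code = 0
    for n in range(1, max_len + 1):
        code = (code + count[n - 1]) << 1
        next_code[n] = code

    codes = {}
    for symbol, n in enumerate(lengths):
        if n:
            codes[symbol] = format(next_code[n], "0{}b".format(n))
            next_code[n] += 1
    return codes


codes = assign_codes([3, 3, 3, 3, 3, 2, 4, 4])
print(codes)
```

    <pre>in the demo the single 2-bit code is 00, the shortest code and
all 0's, and the two 4-bit codes come last.</pre>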
    <pre>the bit lengths for the literal tables are sent first with the number
of entries sent described by the 5 bits sent earlier.  there are up
to 286 literal characters; the first 256 represent the respective 8
bit character, code 256 represents the end-of-block code, the remaining
29 codes represent copy lengths of 3 thru 258.  there are up to 30
distance codes representing distances from 1 thru 32k as described
below.</pre>
    <pre>                             length codes
                             ------------
      extra             extra              extra              extra
 code bits length  code bits lengths  code bits lengths  code bits length(s)
 ---- ---- ------  ---- ---- -------  ---- ---- -------  ---- ---- ---------
  257   0     3     265   1   11,12    273   3   35-42    281   5  131-162
  258   0     4     266   1   13,14    274   3   43-50    282   5  163-194
  259   0     5     267   1   15,16    275   3   51-58    283   5  195-226
  260   0     6     268   1   17,18    276   3   59-66    284   5  227-257
  261   0     7     269   2   19-22    277   4   67-82    285   0    258
  262   0     8     270   2   23-26    278   4   83-98
  263   0     9     271   2   27-30    279   4   99-114
  264   0    10     272   2   31-34    280   4  115-130</pre>
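    <pre>the length table is regular enough to be generated rather than
typed in.  the sketch below rebuilds it as {code: (extra bits, first
length)}; the function name is hypothetical.  a copy length is then the
first length plus the value of the extra bits.</pre>

```python
def build_length_table():
    """Return {code: (extra bits, first length)} for codes 257 - 285."""
    table = {}
    base = 3
    for code in range(257, 285):
        # Codes 257 - 264 have no extra bits; after that the number of
        # extra bits grows by one every four codes.
        extra = 0 if code < 265 else (code - 261) // 4
        table[code] = (extra, base)
        base += 1 << extra
    table[285] = (0, 258)  # special case: the maximum length
    return table


length_table = build_length_table()

# Code 273 followed by extra bits with value 5 means a copy length of 40.
extra_bits, first = length_table[273]
print(extra_bits, first + 5)
```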
    <pre>                            distance codes
                            --------------
      extra           extra             extra               extra
 code bits dist  code bits  dist   code bits distance  code bits distance
 ---- ---- ----  ---- ---- ------  ---- ---- --------  ---- ---- --------
   0   0    1      8   3   17-24    16    7  257-384    24   11  4097-6144
   1   0    2      9   3   25-32    17    7  385-512    25   11  6145-8192
   2   0    3     10   4   33-48    18    8  513-768    26   12  8193-12288
   3   0    4     11   4   49-64    19    8  769-1024   27   12 12289-16384
   4   1   5,6    12   5   65-96    20    9 1025-1536   28   13 16385-24576
   5   1   7,8    13   5   97-128   21    9 1537-2048   29   13 24577-32768
   6   2   9-12   14   6  129-192   22   10 2049-3072
   7   2  13-16   15   6  193-256   23   10 3073-4096</pre>
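    <pre>the distance table follows a similar pattern and can be
generated the same way.  the sketch below returns a list indexed by
distance code, holding (extra bits, first distance); the function name
is hypothetical.</pre>

```python
def build_distance_table():
    """Return a list of (extra bits, first distance) for codes 0 - 29."""
    table = []
    base = 1
    for code in range(30):
        # Codes 0 - 3 have no extra bits; after that the number of
        # extra bits grows by one every two codes.
        extra = 0 if code < 2 else (code - 2) // 2
        table.append((extra, base))
        base += 1 << extra
    return table


distance_table = build_distance_table()

# The largest distance is the first distance of code 29 plus all extra bits set.
extra_bits, first = distance_table[29]
largest = first + (1 << extra_bits) - 1
print(largest)
```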
    <pre>the compressed data stream begins immediately after the
compressed header data.  the compressed data stream can be
interpreted as follows:</pre>
    <pre>do
   read header from input stream.</pre>
    <pre>   if stored block
      skip bits until byte aligned
      read count and one's complement of count
      copy count bytes data block
   otherwise
      loop until end of block code sent
         decode literal character from input stream
         if literal &lt; 256
            copy character to the output stream
         otherwise
            if literal = end of block
               break from loop
            otherwise
               decode copy length from the literal code and its extra bits
               decode distance from input stream</pre>
    <pre>               move backwards distance bytes in the output stream, and
               copy length characters from this position to the output
               stream.
      end loop
while not last block</pre>
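    <pre>the copy step at the end of the loop deserves a short sketch,
because the length may be larger than the distance.  in that case the
copy reads bytes that it has itself just written, so it is done one
byte at a time here.  the function name copy_match is hypothetical.</pre>

```python
def copy_match(output, distance, length):
    """Append length bytes, starting distance bytes back in output."""
    if not 1 <= distance <= len(output):
        raise ValueError("distance reaches before the start of the output")
    start = len(output) - distance
    for i in range(length):
        # output grows as we go, so start + i is always a valid index.
        output.append(output[start + i])


out = bytearray(b"ab")
copy_match(out, 2, 5)  # distance 2, length 5: the pair "ab" repeats
print(out)
```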
    <pre>if data descriptor exists
   skip bits until byte aligned
   read crc and sizes
endif</pre>
    <pre>decryption
----------</pre>
    <pre>the encryption used in pkzip was generously supplied by roger
schlafly.  pkware is grateful to mr. schlafly for his expert
help and advice in the field of data encryption.</pre>
    <pre>pkzip encrypts the compressed data stream.  encrypted files must
be decrypted before they can be extracted.</pre>
    <pre>each encrypted file has an extra 12 bytes stored at the start of
the data area defining the encryption header for that file.  the
encryption header is originally set to random values, and then
itself encrypted, using three 32-bit keys.  the key values are
initialized using the supplied encryption password.  after each byte
is encrypted, the keys are then updated using pseudo-random number
generation techniques in combination with the same crc-32 algorithm
used in pkzip and described elsewhere in this document.</pre>
    <pre>the following are the basic steps required to decrypt a file:</pre>
    <pre>1) initialize the three 32-bit keys with the password.
2) read and decrypt the 12-byte encryption header, further
   initializing the encryption keys.
3) read and decrypt the compressed data stream using the
   encryption keys.
</pre>
    <pre>step 1 - initializing the encryption keys
-----------------------------------------</pre>
    <pre>key(0) &lt;- 305419896
key(1) &lt;- 591751049
key(2) &lt;- 878082192</pre>
    <pre>loop for i &lt;- 0 to length(password)-1
    update_keys(password(i))
end loop
</pre>
    <pre>where update_keys() is defined as:
</pre>
    <pre>update_keys(char):
  key(0) &lt;- crc32(key(0),char)
  key(1) &lt;- key(1) + (key(0) &amp; 000000ffh)
  key(1) &lt;- key(1) * 134775813 + 1
  key(2) &lt;- crc32(key(2),key(1) &gt;&gt; 24)
end update_keys
</pre>
    <pre>where crc32(old_crc,char) is a routine that given a crc value and a
character, returns an updated crc value after applying the crc-32
algorithm described elsewhere in this document.
</pre>
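    <pre>step 1 can be sketched in python as follows.  the crc-32 table
below uses the common reflected polynomial edb88320h, which is assumed
here to be the crc-32 algorithm described elsewhere in this document.
the keys are held in a list, and all arithmetic is masked to 32 bits
because python integers do not wrap.  the name init_keys is
hypothetical.</pre>

```python
def make_crc_table():
    """Build the 256-entry table for the reflected crc-32 (assumed polynomial)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_crc_table()


def crc32(old_crc, char):
    """Update a crc value with one byte, with no inversion before or after."""
    return CRC_TABLE[(old_crc ^ char) & 0xFF] ^ (old_crc >> 8)


def update_keys(keys, char):
    keys[0] = crc32(keys[0], char)
    keys[1] = (keys[1] + (keys[0] & 0xFF)) & 0xFFFFFFFF
    keys[1] = (keys[1] * 134775813 + 1) & 0xFFFFFFFF
    keys[2] = crc32(keys[2], keys[1] >> 24)


def init_keys(password):
    """Return the three keys after processing the password bytes."""
    keys = [305419896, 591751049, 878082192]
    for char in password:
        update_keys(keys, char)
    return keys


keys = init_keys(b"secret")
print([format(k, "08x") for k in keys])
```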
    <pre>step 2 - decrypting the encryption header
-----------------------------------------</pre>
    <pre>the purpose of this step is to further initialize the encryption
keys, based on random data, to render a plaintext attack on the
data ineffective.
</pre>
    <pre>read the 12-byte encryption header into buffer, in locations
buffer(0) thru buffer(11).</pre>
    <pre>loop for i &lt;- 0 to 11
    c &lt;- buffer(i) ^ decrypt_byte()
    update_keys(c)
    buffer(i) &lt;- c
end loop
</pre>
    <pre>where decrypt_byte() is defined as:
</pre>
    <pre>unsigned char decrypt_byte()
    local unsigned short temp
    temp &lt;- key(2) | 2
    decrypt_byte &lt;- (temp * (temp ^ 1)) &gt;&gt; 8
end decrypt_byte
</pre>
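    <pre>a python rendering of decrypt_byte follows.  passing key(2) as
a parameter is a choice made for this sketch only.  temp is an unsigned
short in the pseudocode, so it is masked to 16 bits here, and the
result is masked to one byte.</pre>

```python
def decrypt_byte(key2):
    """Return the next keystream byte derived from key(2)."""
    temp = (key2 | 2) & 0xFFFF              # unsigned short
    return ((temp * (temp ^ 1)) >> 8) & 0xFF  # unsigned char


# Keystream byte for the initial value of key(2), before any password.
first = decrypt_byte(878082192)
print(first)
```

    <pre>only the low 16 bits of key(2) affect the result, and the two
lowest bits do not matter at all: bit 1 is forced to 1, and flipping
bit 0 merely swaps the two factors of the product.</pre>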
    <pre>after the header is decrypted,  the last 1 or 2 bytes in buffer
should be the high-order word/byte of the crc for the file being
decrypted, stored in intel low-byte/high-byte order.  versions of
pkzip prior to 2.0 used a 2 byte crc check; a 1 byte crc check is
used on versions after 2.0.  this can be used to test if the password
supplied is correct or not.
</pre>
    <pre>step 3 - decrypting the compressed data stream
----------------------------------------------</pre>
    <pre>the compressed data stream can be decrypted as follows:
</pre>
    <pre>loop until done
    read a character into c
    temp &lt;- c ^ decrypt_byte()
    update_keys(temp)
    output temp
end loop
</pre>
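    <pre>the three steps can be combined into one small sketch.  the
class name ZipCrypto is hypothetical.  the single-byte crc update is
obtained from python's zlib.crc32 by undoing the inversion it applies
before and after, which assumes that the crc-32 in this document is the
one zlib implements.  the encrypt method is not part of the description
above; it is simply the inverse of the decryption loop, added so that
the demo can make a round trip.</pre>

```python
import zlib


class ZipCrypto:
    def __init__(self, password):
        # Step 1: initialize the three keys with the password.
        self.keys = [305419896, 591751049, 878082192]
        for char in password:
            self._update_keys(char)

    @staticmethod
    def _crc32(old_crc, char):
        # zlib.crc32 inverts the value on entry and exit; undo both.
        return zlib.crc32(bytes([char]), old_crc ^ 0xFFFFFFFF) ^ 0xFFFFFFFF

    def _update_keys(self, char):
        k = self.keys
        k[0] = self._crc32(k[0], char)
        k[1] = (k[1] + (k[0] & 0xFF)) & 0xFFFFFFFF
        k[1] = (k[1] * 134775813 + 1) & 0xFFFFFFFF
        k[2] = self._crc32(k[2], k[1] >> 24)

    def _decrypt_byte(self):
        temp = (self.keys[2] | 2) & 0xFFFF
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

    def decrypt(self, data):
        # Steps 2 and 3: the 12-byte header and the data use the same loop.
        out = bytearray()
        for c in data:
            temp = c ^ self._decrypt_byte()
            self._update_keys(temp)
            out.append(temp)
        return bytes(out)

    def encrypt(self, data):
        # Inverse of decrypt (assumption): keys are updated with plaintext.
        out = bytearray()
        for p in data:
            out.append(p ^ self._decrypt_byte())
            self._update_keys(p)
        return bytes(out)


header_and_data = bytes(12) + b"compressed bytes"
sealed = ZipCrypto(b"secret").encrypt(header_and_data)
opened = ZipCrypto(b"secret").decrypt(sealed)
print(opened == header_and_data)
```

    <pre>a real decoder would also compare the last byte or bytes of the
decrypted header with the high-order part of the file's crc, as
described in step 2, before trusting the password.</pre>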
    <pre>in addition to the above mentioned contributors to pkzip and pkunzip,
i would like to extend special thanks to robert mahoney for suggesting
the extension .zip for this software.
</pre>
    <pre>references:</pre>
    <pre>    fiala, edward r., and greene, daniel h., &quot;data compression with
       finite windows&quot;,  communications of the acm, volume 32, number 4,
       april 1989, pages 490-505.</pre>
    <pre>    held, gilbert, &quot;data compression, techniques and applications,
                    hardware and software considerations&quot;,
       john wiley &amp; sons, 1987.</pre>
    <pre>    huffman, d.a., &quot;a method for the construction of minimum-redundancy
       codes&quot;, proceedings of the ire, volume 40, number 9, september 1952,
       pages 1098-1101.</pre>
    <pre>    nelson, mark, &quot;lzw data compression&quot;, dr. dobbs journal, volume 14,
       number 10, october 1989, pages 29-37.</pre>
    <pre>    nelson, mark, &quot;the data compression book&quot;,  m&amp;t books, 1991.</pre>
    <pre>    storer, james a., &quot;data compression, methods and theory&quot;,
       computer science press, 1988.</pre>
    <pre>    welch, terry, &quot;a technique for high-performance data compression&quot;,
       ieee computer, volume 17, number 6, june 1984, pages 8-19.</pre>
    <pre>    ziv, j. and lempel, a., &quot;a universal algorithm for sequential data
       compression&quot;, ieee transactions on information theory,
       volume 23, number 3, may 1977, pages 337-343.</pre>
    <pre>    ziv, j. and lempel, a., &quot;compression of individual sequences via
       variable-rate coding&quot;, ieee transactions on information theory,
       volume 24, number 5, september 1978, pages 530-536.</pre>
    </td>
  </tr>
</table>
</center></div>

<p align="center"><a href="../index.htm">Back</a></p>
</body>
</html>