⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 zlib_how.html

📁 ZIP压缩算法源代码,可以直接加入C++Project中编译调用
💻 HTML
📖 第 1 页 / 共 2 页
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
  "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>zlib Usage Example</title>
<!--  Copyright (c) 2004 Mark Adler.  -->
</head>
<body bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#00A000">
<h2 align="center"> zlib Usage Example </h2>
We often get questions about how the <tt>deflate()</tt> and <tt>inflate()</tt> functions should be used.
Users wonder when they should provide more input, when they should use more output,
what to do with a <tt>Z_BUF_ERROR</tt>, how to make sure the process terminates properly, and
so on.  So for those who have read <tt>zlib.h</tt> (a few times), and
would like further edification, below is an annotated example in C of simple routines to compress and decompress
from an input file to an output file using <tt>deflate()</tt> and <tt>inflate()</tt> respectively.  The
annotations are interspersed between lines of the code.  So please read between the lines.
We hope this helps explain some of the intricacies of <em>zlib</em>.
<p>
Without further adieu, here is the program <a href="zpipe.c"><tt>zpipe.c</tt></a>:
<pre><b>
/* zpipe.c: example of proper use of zlib's inflate() and deflate()
   Not copyrighted -- provided to the public domain
   Version 1.2  9 November 2004  Mark Adler */

/* Version history:
   1.0  30 Oct 2004  First version
   1.1   8 Nov 2004  Add void casting for unused return values
                     Use switch statement for inflate() return values
   1.2   9 Nov 2004  Add assertions to document zlib guarantees
 */
</b></pre><!-- -->
We now include the header files for the required definitions.  From
<tt>stdio.h</tt> we use <tt>fopen()</tt>, <tt>fread()</tt>, <tt>fwrite()</tt>,
<tt>feof()</tt>, <tt>ferror()</tt>, and <tt>fclose()</tt> for file i/o, and
<tt>fputs()</tt> for error messages.  From <tt>string.h</tt> we use
<tt>strcmp()</tt> for command line argument processing.
From <tt>assert.h</tt> we use the <tt>assert()</tt> macro.
From <tt>zlib.h</tt>
we use the basic compression functions <tt>deflateInit()</tt>,
<tt>deflate()</tt>, and <tt>deflateEnd()</tt>, and the basic decompression
functions <tt>inflateInit()</tt>, <tt>inflate()</tt>, and
<tt>inflateEnd()</tt>.
<pre><b>
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;assert.h&gt;
#include "zlib.h"
</b></pre><!-- -->
<tt>CHUNK</tt> is simply the buffer size for feeding data to and pulling data
from the <em>zlib</em> routines.  Larger buffer sizes would be more efficient,
especially for <tt>inflate()</tt>.  If the memory is available, buffers sizes
on the order of 128K or 256K bytes should be used.
<pre><b>
#define CHUNK 16384
</b></pre><!-- -->
The <tt>def()</tt> routine compresses data from an input file to an output file.  The output data
will be in the <em>zlib</em> format, which is different from the <em>gzip</em> or <em>zip</em>
formats.  The <em>zlib</em> format has a very small header of only two bytes to identify it as
a <em>zlib</em> stream and to provide decoding information, and a four-byte trailer with a fast
check value to verify the integrity of the uncompressed data after decoding.
<pre><b>
/* Compress from file source to file dest until EOF on source.
   def() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_STREAM_ERROR if an invalid compression
   level is supplied, Z_VERSION_ERROR if the version of zlib.h and the
   version of the library linked do not match, or Z_ERRNO if there is
   an error reading or writing the files. */
int def(FILE *source, FILE *dest, int level)
{
</b></pre>
Here are the local variables for <tt>def()</tt>.  <tt>ret</tt> will be used for <em>zlib</em>
return codes.  <tt>flush</tt> will keep track of the current flushing state for <tt>deflate()</tt>,
which is either no flushing, or flush to completion after the end of the input file is reached.
<tt>have</tt> is the amount of data returned from <tt>deflate()</tt>.  The <tt>strm</tt> structure
is used to pass information to and from the <em>zlib</em> routines, and to maintain the
<tt>deflate()</tt> state.  <tt>in</tt> and <tt>out</tt> are the input and output buffers for
<tt>deflate()</tt>.
<pre><b>
    int ret, flush;
    unsigned have;
    z_stream strm;
    char in[CHUNK];
    char out[CHUNK];
</b></pre><!-- -->
The first thing we do is to initialize the <em>zlib</em> state for compression using
<tt>deflateInit()</tt>.  This must be done before the first use of <tt>deflate()</tt>.
The <tt>zalloc</tt>, <tt>zfree</tt>, and <tt>opaque</tt> fields in the <tt>strm</tt>
structure must be initialized before calling <tt>deflateInit()</tt>.  Here they are
set to the <em>zlib</em> constant <tt>Z_NULL</tt> to request that <em>zlib</em> use
the default memory allocation routines.  An application may also choose to provide
custom memory allocation routines here.  <tt>deflateInit()</tt> will allocate on the
order of 256K bytes for the internal state.
(See <a href="zlib_tech.html"><em>zlib Technical Details</em></a>.)
<p>
<tt>deflateInit()</tt> is called with a pointer to the structure to be initialized and
the compression level, which is an integer in the range of -1 to 9.  Lower compression
levels result in faster execution, but less compression.  Higher levels result in
greater compression, but slower execution.  The <em>zlib</em> constant Z_DEFAULT_COMPRESSION,
equal to -1,
provides a good compromise between compression and speed and is equivalent to level 6.
Level 0 actually does no compression at all, and in fact expands the data slightly to produce
the <em>zlib</em> format (it is not a byte-for-byte copy of the input).
More advanced applications of <em>zlib</em>
may use <tt>deflateInit2()</tt> here instead.  Such an application may want to reduce how
much memory will be used, at some price in compression.  Or it may need to request a
<em>gzip</em> header and trailer instead of a <em>zlib</em> header and trailer, or raw
encoding with no header or trailer at all.
<p>
We must check the return value of <tt>deflateInit()</tt> against the <em>zlib</em> constant
<tt>Z_OK</tt> to make sure that it was able to
allocate memory for the internal state, and that the provided arguments were valid.
<tt>deflateInit()</tt> will also check that the version of <em>zlib</em> that the <tt>zlib.h</tt>
file came from matches the version of <em>zlib</em> actually linked with the program.  This
is especially important for environments in which <em>zlib</em> is a shared library.
<p>
Note that an application can initialize multiple, independent <em>zlib</em> streams, which can
operate in parallel.  The state information maintained in the structure allows the <em>zlib</em>
routines to be reentrant.
<pre><b>
    /* allocate deflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    ret = deflateInit(&amp;strm, level);
    if (ret != Z_OK)
        return ret;
</b></pre><!-- -->
With the pleasantries out of the way, now we can get down to business.  The outer <tt>do</tt>-loop
reads all of the input file and exits at the bottom of the loop once end-of-file is reached.
This loop contains the only call of <tt>deflate()</tt>.  So we must make sure that all of the
input data has been processed and that all of the output data has been generated and consumed
before we fall out of the loop at the bottom.
<pre><b>
    /* compress until end of file */
    do {
</b></pre>
We start off by reading data from the input file.  The number of bytes read is put directly
into <tt>avail_in</tt>, and a pointer to those bytes is put into <tt>next_in</tt>.  We also
check to see if end-of-file on the input has been reached.  If we are at the end of file, then <tt>flush</tt> is set to the
<em>zlib</em> constant <tt>Z_FINISH</tt>, which is later passed to <tt>deflate()</tt> to
indicate that this is the last chunk of input data to compress.  We need to use <tt>feof()</tt>
to check for end-of-file as opposed to seeing if fewer than <tt>CHUNK</tt> bytes have been read.  The
reason is that if the input file length is an exact multiple of <tt>CHUNK</tt>, we will miss
the fact that we got to the end-of-file, and not know to tell <tt>deflate()</tt> to finish
up the compressed stream.  If we are not yet at the end of the input, then the <em>zlib</em>
constant <tt>Z_NO_FLUSH</tt> will be passed to <tt>deflate</tt> to indicate that we are still
in the middle of the uncompressed data.
<p>
If there is an error in reading from the input file, the process is aborted with
<tt>deflateEnd()</tt> being called to free the allocated <em>zlib</em> state before returning
the error.  We wouldn't want a memory leak, now would we?  <tt>deflateEnd()</tt> can be called
at any time after the state has been initialized.  Once that's done, <tt>deflateInit()</tt> (or
<tt>deflateInit2()</tt>) would have to be called to start a new compression process.  There is
no point here in checking the <tt>deflateEnd()</tt> return code.  The deallocation can't fail.
<pre><b>
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)deflateEnd(&amp;strm);
            return Z_ERRNO;
        }
        flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
        strm.next_in = in;
</b></pre><!-- -->
The inner <tt>do</tt>-loop passes our chunk of input data to <tt>deflate()</tt>, and then
keeps calling <tt>deflate()</tt> until it is done producing output.  Once there is no more
new output, <tt>deflate()</tt> is guaranteed to have consumed all of the input, i.e.,
<tt>avail_in</tt> will be zero.
<pre><b>
        /* run deflate() on input until output buffer not full, finish
           compression if all of source has been read in */
        do {
</b></pre>
Output space is provided to <tt>deflate()</tt> by setting <tt>avail_out</tt> to the number
of available output bytes and <tt>next_out</tt> to a pointer to that space.
<pre><b>
            strm.avail_out = CHUNK;
            strm.next_out = out;
</b></pre>
Now we call the compression engine itself, <tt>deflate()</tt>.  It takes as many of the
<tt>avail_in</tt> bytes at <tt>next_in</tt> as it can process, and writes as many as
<tt>avail_out</tt> bytes to <tt>next_out</tt>.  Those counters and pointers are then
updated past the input data consumed and the output data written.  It is the amount of
output space available that may limit how much input is consumed.
Hence the inner loop to make sure that
all of the input is consumed by providing more output space each time.  Since <tt>avail_in</tt>
and <tt>next_in</tt> are updated by <tt>deflate()</tt>, we don't have to mess with those
between <tt>deflate()</tt> calls until it's all used up.
<p>
The parameters to <tt>deflate()</tt> are a pointer to the <tt>strm</tt> structure containing
the input and output information and the internal compression engine state, and a parameter
indicating whether and how to flush data to the output.  Normally <tt>deflate</tt> will consume
several K bytes of input data before producing any output (except for the header), in order
to accumulate statistics on the data for optimum compression.  It will then put out a burst of
compressed data, and proceed to consume more input before the next burst.  Eventually,
<tt>deflate()</tt>
must be told to terminate the stream, complete the compression with provided input data, and
write out the trailer check value.  <tt>deflate()</tt> will continue to compress normally as long
as the flush parameter is <tt>Z_NO_FLUSH</tt>.  Once the <tt>Z_FINISH</tt> parameter is provided,
<tt>deflate()</tt> will begin to complete the compressed output stream.  However depending on how
much output space is provided, <tt>deflate()</tt> may have to be called several times until it
has provided the complete compressed stream, even after it has consumed all of the input.  The flush
parameter must continue to be <tt>Z_FINISH</tt> for those subsequent calls.
<p>
There are other values of the flush parameter that are used in more advanced applications.  You can
force <tt>deflate()</tt> to produce a burst of output that encodes all of the input data provided
so far, even if it wouldn't have otherwise, for example to control data latency on a link with
compressed data.  You can also ask that <tt>deflate()</tt> do that as well as erase any history up to
that point so that what follows can be decompressed independently, for example for random access
applications.  Both requests will degrade compression by an amount depending on how often such
requests are made.
<p>
<tt>deflate()</tt> has a return value that can indicate errors, yet we do not check it here.  Why
not?  Well, it turns out that <tt>deflate()</tt> can do no wrong here.  Let's go through
<tt>deflate()</tt>'s return values and dispense with them one by one.  The possible values are
<tt>Z_OK</tt>, <tt>Z_STREAM_END</tt>, <tt>Z_STREAM_ERROR</tt>, or <tt>Z_BUF_ERROR</tt>.  <tt>Z_OK</tt>
is, well, ok.  <tt>Z_STREAM_END</tt> is also ok and will be returned for the last call of
<tt>deflate()</tt>.  This is already guaranteed by calling <tt>deflate()</tt> with <tt>Z_FINISH</tt>
until it has no more output.  <tt>Z_STREAM_ERROR</tt> is only possible if the stream is not
initialized properly, but we did initialize it properly.  There is no harm in checking for
<tt>Z_STREAM_ERROR</tt> here, for example to check for the possibility that some
other part of the application inadvertently clobbered the memory containing the <em>zlib</em> state.
<tt>Z_BUF_ERROR</tt> will be explained further below, but
suffice it to say that this is simply an indication that <tt>deflate()</tt> could not consume
more input or produce more output.  <tt>deflate()</tt> can be called again with more output space
or more available input, which it will be in this code.
<pre><b>
            ret = deflate(&amp;strm, flush);    /* no bad return value */
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
</b></pre>
Now we compute how much output <tt>deflate()</tt> provided on the last call, which is the
difference between how much space was provided before the call, and how much output space
is still available after the call.  Then that data, if any, is written to the output file.
We can then reuse the output buffer for the next call of <tt>deflate()</tt>.  Again if there
is a file i/o error, we call <tt>deflateEnd()</tt> before returning to avoid a memory leak.
<pre><b>
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)deflateEnd(&amp;strm);
                return Z_ERRNO;
            }
</b></pre>
The inner <tt>do</tt>-loop is repeated until the last <tt>deflate()</tt> call fails to fill the
provided output buffer.  Then we know that <tt>deflate()</tt> has done as much as it can with
the provided input, and that all of that input has been consumed.  We can then fall out of this
loop and reuse the input buffer.
<p>
The way we tell that <tt>deflate()</tt> has no more output is by seeing that it did not fill
the output buffer, leaving <tt>avail_out</tt> greater than zero.  However suppose that
<tt>deflate()</tt> has no more output, but just so happened to exactly fill the output buffer!
<tt>avail_out</tt> is zero, and we can't tell that <tt>deflate()</tt> has done all it can.
As far as we know, <tt>deflate()</tt>
has more output for us.  So we call it again.  But now <tt>deflate()</tt> produces no output
at all, and <tt>avail_out</tt> remains unchanged as <tt>CHUNK</tt>.  That <tt>deflate()</tt> call
wasn't able to do anything, either consume input or produce output, and so it returns
<tt>Z_BUF_ERROR</tt>.  (See, I told you I'd cover this later.)  However this is not a problem at
all.  Now we finally have the desired indication that <tt>deflate()</tt> is really done,
and so we drop out of the inner loop to provide more input to <tt>deflate()</tt>.
<p>
With <tt>flush</tt> set to <tt>Z_FINISH</tt>, this final set of <tt>deflate()</tt> calls will
complete the output stream.  Once that is done, subsequent calls of <tt>deflate()</tt> would return
<tt>Z_STREAM_ERROR</tt> if the flush parameter is not <tt>Z_FINISH</tt>, and do no more processing

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -