until the state is reinitialized.
<p>
Some applications of <em>zlib</em> have two loops that call <tt>deflate()</tt>
instead of the single inner loop we have here. The first loop would call
<tt>deflate()</tt> without flushing and feed it all of the data. The second loop would call
<tt>deflate()</tt> with no more
data and the <tt>Z_FINISH</tt> parameter to complete the process. As you can see from this
example, that can be avoided by simply keeping track of the current flush state.
<pre><b>
        } while (strm.avail_out == 0);
        assert(strm.avail_in == 0);     /* all input will be used */
</b></pre><!-- -->
Now we check to see if we have already processed all of the input file. That information was
saved in the <tt>flush</tt> variable, so we see if that was set to <tt>Z_FINISH</tt>. If so,
then we're done and we fall out of the outer loop. We're guaranteed to get <tt>Z_STREAM_END</tt>
from the last <tt>deflate()</tt> call, since we ran it until the last chunk of input was
consumed and all of the output was generated.
<pre><b>
        /* done when last data in file processed */
    } while (flush != Z_FINISH);
    assert(ret == Z_STREAM_END);        /* stream will be complete */
</b></pre><!-- -->
The process is complete, but we still need to deallocate the state to avoid a memory leak
(or rather more like a memory hemorrhage if you didn't do this). Then
finally we can return with a happy return value.
<pre><b>
    /* clean up and return */
    (void)deflateEnd(&strm);
    return Z_OK;
}
</b></pre><!-- -->
Now we do the same thing for decompression in the <tt>inf()</tt> routine. <tt>inf()</tt>
decompresses what is hopefully a valid <em>zlib</em> stream from the input file and writes the
uncompressed data to the output file. Much of the discussion above for <tt>def()</tt>
applies to <tt>inf()</tt> as well, so the discussion here will focus on the differences between
the two.
<pre><b>
/* Decompress from file source to file dest until stream ends or EOF.
   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_DATA_ERROR if the deflate data is
   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
   the version of the library linked do not match, or Z_ERRNO if there
   is an error reading or writing the files. */
int inf(FILE *source, FILE *dest)
{
</b></pre>
The local variables have the same functionality as they do for <tt>def()</tt>. The
only difference is that there is no <tt>flush</tt> variable, since <tt>inflate()</tt>
can tell from the <em>zlib</em> stream itself when the stream is complete.
<pre><b>
    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];
</b></pre><!-- -->
The initialization of the state is the same, except that there is no compression level,
of course, and two more elements of the structure are initialized. <tt>avail_in</tt>
and <tt>next_in</tt> must be initialized before calling <tt>inflateInit()</tt>. This
is because the application has the option to provide the start of the <em>zlib</em> stream in
order for <tt>inflateInit()</tt> to have access to information about the compression
method to aid in memory allocation. In the current implementation of <em>zlib</em>
(up through versions 1.2.x), the method-dependent memory allocations are deferred to the first call of
<tt>inflate()</tt> anyway. However, those fields must be initialized, since later versions
of <em>zlib</em> that provide more compression methods may take advantage of this interface.
In any case, no decompression is performed by <tt>inflateInit()</tt>, so the
<tt>avail_out</tt> and <tt>next_out</tt> fields do not need to be initialized before calling.
<p>
Here <tt>avail_in</tt> is set to zero and <tt>next_in</tt> is set to <tt>Z_NULL</tt> to
indicate that no input data is being provided.
<pre><b>
    /* allocate inflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit(&strm);
    if (ret != Z_OK)
        return ret;
</b></pre><!-- -->
The outer <tt>do</tt>-loop decompresses input until <tt>inflate()</tt> indicates
that it has reached the end of the compressed data and has produced all of the uncompressed
output. This is in contrast to <tt>def()</tt>, which processes all of the input file.
If end-of-file is reached before the compressed data self-terminates, then the compressed
data is incomplete and an error is returned.
<pre><b>
    /* decompress until deflate stream ends or end of file */
    do {
</b></pre>
We read input data and set the <tt>strm</tt> structure accordingly. If we've reached the
end of the input file, then we leave the outer loop and report an error, since the
compressed data is incomplete. Note that we may read more data than is eventually consumed
by <tt>inflate()</tt>, if the input file continues past the <em>zlib</em> stream.
For applications where <em>zlib</em> streams are embedded in other data, this routine would
need to be modified to return the unused data, or at least indicate how much of the input
data was not used, so the application would know where to pick up after the <em>zlib</em> stream.
<pre><b>
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)inflateEnd(&strm);
            return Z_ERRNO;
        }
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;
</b></pre><!-- -->
The inner <tt>do</tt>-loop has the same function it did in <tt>def()</tt>, which is to
keep calling <tt>inflate()</tt> until it has generated all of the output it can with the
provided input.
<pre><b>
        /* run inflate() on input until output buffer not full */
        do {
</b></pre>
Just like in <tt>def()</tt>, the same output space is provided for each call of <tt>inflate()</tt>.
<pre><b>
            strm.avail_out = CHUNK;
            strm.next_out = out;
</b></pre>
Now we run the decompression engine itself. There is no need to adjust the flush parameter, since
the <em>zlib</em> format is self-terminating. The main difference here is that there are
return values that we need to pay attention to. <tt>Z_DATA_ERROR</tt>
indicates that <tt>inflate()</tt> detected an error in the <em>zlib</em> compressed data format,
which means that either the data is not a <em>zlib</em> stream to begin with, or that the data was
corrupted somewhere along the way since it was compressed. The other error to be processed is
<tt>Z_MEM_ERROR</tt>, which can occur since memory allocation is deferred until <tt>inflate()</tt>
needs it, unlike <tt>deflate()</tt>, whose memory is allocated at the start by <tt>deflateInit()</tt>.
<p>
Advanced applications may use
<tt>deflateSetDictionary()</tt> to prime <tt>deflate()</tt> with a set of likely data to improve the
first 32K or so of compression. This is noted in the <em>zlib</em> header, so <tt>inflate()</tt>
requests that that dictionary be provided before it can start to decompress. Without the dictionary,
correct decompression is not possible. For this routine, we have no idea what the dictionary is,
so the <tt>Z_NEED_DICT</tt> indication is converted to a <tt>Z_DATA_ERROR</tt>.
<p>
<tt>inflate()</tt> can also return <tt>Z_STREAM_ERROR</tt>, which should not be possible here,
but could be checked for as noted above for <tt>def()</tt>. <tt>Z_BUF_ERROR</tt> does not need to be
checked for here, for the same reasons noted for <tt>def()</tt>. <tt>Z_STREAM_END</tt> will be
checked for later.
<pre><b>
            ret = inflate(&strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                ret = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&strm);
                return ret;
            }
</b></pre>
The output of <tt>inflate()</tt> is handled identically to that of <tt>deflate()</tt>.
<pre><b>
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&strm);
                return Z_ERRNO;
            }
</b></pre>
The inner <tt>do</tt>-loop ends when <tt>inflate()</tt> has no more output as indicated
by not filling the output buffer, just as for <tt>deflate()</tt>. In this case, we cannot
assert that <tt>strm.avail_in</tt> will be zero, since the deflate stream may end before the file
does.
<pre><b>
        } while (strm.avail_out == 0);
</b></pre><!-- -->
The outer <tt>do</tt>-loop ends when <tt>inflate()</tt> reports that it has reached the
end of the input <em>zlib</em> stream, has completed the decompression and integrity
check, and has provided all of the output. This is indicated by the <tt>inflate()</tt>
return value <tt>Z_STREAM_END</tt>. The inner loop is guaranteed to leave <tt>ret</tt>
equal to <tt>Z_STREAM_END</tt> if the last chunk of the input file read contained the end
of the <em>zlib</em> stream. So if the return value is not <tt>Z_STREAM_END</tt>, the
loop continues to read more input.
<pre><b>
        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);
</b></pre><!-- -->
At this point, either decompression completed successfully or we broke out of the loop because no
more data was available from the input file. If the last <tt>inflate()</tt> return value
is not <tt>Z_STREAM_END</tt>, then the <em>zlib</em> stream was incomplete and a data error
is returned. Otherwise, we return with a happy return value. Of course, <tt>inflateEnd()</tt>
is called first to avoid a memory leak.
<pre><b>
    /* clean up and return */
    (void)inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}
</b></pre><!-- -->
That ends the routines that directly use <em>zlib</em>. The following routines make this
a command-line program by running data through the above routines from <tt>stdin</tt> to
<tt>stdout</tt>, and handling any errors reported by <tt>def()</tt> or <tt>inf()</tt>.
<p>
<tt>zerr()</tt> is used to interpret the possible error codes from <tt>def()</tt>
and <tt>inf()</tt>, as detailed in their comments above, and print out an error message.
Note that these are only a subset of the possible return values from <tt>deflate()</tt>
and <tt>inflate()</tt>.
<pre><b>
/* report a zlib or i/o error */
void zerr(int ret)
{
    fputs("zpipe: ", stderr);
    switch (ret) {
    case Z_ERRNO:
        if (ferror(stdin))
            fputs("error reading stdin\n", stderr);
        if (ferror(stdout))
            fputs("error writing stdout\n", stderr);
        break;
    case Z_STREAM_ERROR:
        fputs("invalid compression level\n", stderr);
        break;
    case Z_DATA_ERROR:
        fputs("invalid or incomplete deflate data\n", stderr);
        break;
    case Z_MEM_ERROR:
        fputs("out of memory\n", stderr);
        break;
    case Z_VERSION_ERROR:
        fputs("zlib version mismatch!\n", stderr);
    }
}
</b></pre><!-- -->
Here is the <tt>main()</tt> routine used to test <tt>def()</tt> and <tt>inf()</tt>. The
<tt>zpipe</tt> command is simply a compression pipe from <tt>stdin</tt> to <tt>stdout</tt>, if
no arguments are given, or it is a decompression pipe if <tt>zpipe -d</tt> is used. If any other
arguments are provided, no compression or decompression is performed. Instead a usage
message is displayed. Examples are <tt>zpipe < foo.txt > foo.txt.z</tt> to compress, and
<tt>zpipe -d < foo.txt.z > foo.txt</tt> to decompress.
<pre><b>
/* compress or decompress from stdin to stdout */
int main(int argc, char **argv)
{
    int ret;

    /* do compression if no arguments */
    if (argc == 1) {
        ret = def(stdin, stdout, Z_DEFAULT_COMPRESSION);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* do decompression if -d specified */
    else if (argc == 2 && strcmp(argv[1], "-d") == 0) {
        ret = inf(stdin, stdout);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* otherwise, report usage */
    else {
        fputs("zpipe usage: zpipe [-d] < source > dest\n", stderr);
        return 1;
    }
}
</b></pre>
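<p>
The argument handling in <tt>main()</tt> reduces to a three-way decision, which can be separated
from the I/O so that it is easy to check on its own. The sketch below is one way to do that. The
<tt>zpipe_mode</tt> enumeration and the <tt>parse_mode()</tt> function are hypothetical names, not
part of <tt>zpipe.c</tt>.

```c
#include <string.h>

/* Hypothetical names for the three outcomes that main() distinguishes. */
enum zpipe_mode { MODE_COMPRESS, MODE_DECOMPRESS, MODE_USAGE };

/* Decide what to do from the command line, using the same tests as main():
   no arguments means compress, a lone "-d" means decompress, and anything
   else asks for the usage message. */
enum zpipe_mode parse_mode(int argc, char **argv)
{
    if (argc == 1)
        return MODE_COMPRESS;
    if (argc == 2 && strcmp(argv[1], "-d") == 0)
        return MODE_DECOMPRESS;
    return MODE_USAGE;
}
```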
<hr>
<i>Copyright (c) 2004 by Mark Adler<br>Last modified 13 November 2004</i>
</body>
</html>