\input texinfo @c -*- Texinfo -*-
@setfilename bzip2.info
@ignore
This file documents bzip2 version 1.0, and associated library
libbzip2, written by Julian Seward (jseward@acm.org).
Copyright (C) 1996-2000 Julian R Seward
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.
Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for verbatim copies.
@end ignore
@ifinfo
@format
START-INFO-DIR-ENTRY
* Bzip2: (bzip2). A program and library for data compression.
END-INFO-DIR-ENTRY
@end format
@end ifinfo
@iftex
@c @finalout
@settitle bzip2 and libbzip2
@titlepage
@title bzip2 and libbzip2
@subtitle a program and library for data compression
@subtitle copyright (C) 1996-2000 Julian Seward
@subtitle version 1.0 of 21 March 2000
@author Julian Seward
@end titlepage
@parindent 0mm
@parskip 2mm
@end iftex
@node Top, Overview, (dir), (dir)
This program, @code{bzip2},
and associated library @code{libbzip2}, are
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
@itemize @bullet
@item
Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
@item
The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
@item
Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
@item
The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
permission.
@end itemize
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Julian Seward, Cambridge, UK.
@code{jseward@@acm.org}
@code{http://sourceware.cygnus.com/bzip2}
@code{http://www.cacheprof.org}
@code{http://www.muraroa.demon.co.uk}
@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000.
PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented
algorithms. However, I do not have the resources available to carry out
a full patent search. Therefore I cannot give any guarantee of the
above statement.
@node Overview, Implementation, Top, Top
@chapter Introduction
@code{bzip2} compresses files using the Burrows-Wheeler
block-sorting text compression algorithm, and Huffman coding.
Compression is generally considerably better than that
achieved by more conventional LZ77/LZ78-based compressors,
and approaches the performance of the PPM family of statistical compressors.

@code{bzip2} is built on top of @code{libbzip2}, a flexible library
for handling compressed data in the @code{bzip2} format. This manual
describes both how to use the program and
how to work with the library interface. Most of the
manual is devoted to this library, not the program,
which is good news if your interest is only in the program.

Chapter 2 describes how to use @code{bzip2}; this is the only part
you need to read if you just want to know how to operate the program.
Chapter 3 describes the programming interfaces in detail, and
Chapter 4 records some miscellaneous notes which I thought
ought to be recorded somewhere.

@chapter How to use @code{bzip2}
This chapter contains a copy of the @code{bzip2} man page,
and nothing else.
@quotation
@unnumberedsubsubsec NAME
@itemize
@item @code{bzip2}, @code{bunzip2}
- a block-sorting file compressor, v1.0
@item @code{bzcat}
- decompresses files to stdout
@item @code{bzip2recover}
- recovers data from damaged bzip2 files
@end itemize
@unnumberedsubsubsec SYNOPSIS
@itemize
@item @code{bzip2} [ -cdfkqstvzVL123456789 ] [ filenames ... ]
@item @code{bunzip2} [ -fkvsVL ] [ filenames ... ]
@item @code{bzcat} [ -s ] [ filenames ... ]
@item @code{bzip2recover} filename
@end itemize
@unnumberedsubsubsec DESCRIPTION
@code{bzip2} compresses files using the Burrows-Wheeler block-sorting
text compression algorithm, and Huffman coding. Compression is
generally considerably better than that achieved by more conventional
LZ77/LZ78-based compressors, and approaches the performance of the PPM
family of statistical compressors.

The command-line options are deliberately very similar to those of GNU
@code{gzip}, but they are not identical.

@code{bzip2} expects a list of file names to accompany the command-line
flags. Each file is replaced by a compressed version of itself, with
the name @code{original_name.bz2}. Each compressed file has the same
modification date, permissions, and, when possible, ownership as the
corresponding original, so that these properties can be correctly
restored at decompression time. File name handling is naive in the
sense that there is no mechanism for preserving original file names,
permissions, ownerships or dates in filesystems which lack these
concepts, or have serious file name length restrictions, such as MS-DOS.

@code{bzip2} and @code{bunzip2} will by default not overwrite existing
files. If you want this to happen, specify the @code{-f} flag.

If no file names are specified, @code{bzip2} compresses from standard
input to standard output. In this case, @code{bzip2} will decline to
write compressed output to a terminal, as this would be entirely
incomprehensible and therefore pointless.

@code{bunzip2} (or @code{bzip2 -d}) decompresses all
specified files. Files which were not created by @code{bzip2}
will be detected and ignored, and a warning issued.
@code{bzip2} attempts to guess the filename for the decompressed file
from that of the compressed file as follows:
@itemize
@item @code{filename.bz2 } becomes @code{filename}
@item @code{filename.bz } becomes @code{filename}
@item @code{filename.tbz2} becomes @code{filename.tar}
@item @code{filename.tbz } becomes @code{filename.tar}
@item @code{anyothername } becomes @code{anyothername.out}
@end itemize
If the file does not end in one of the recognised endings,
@code{.bz2}, @code{.bz},
@code{.tbz2} or @code{.tbz}, @code{bzip2} complains that it cannot
guess the name of the original file, and uses the original name
with @code{.out} appended.
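
The renaming rules above can be sketched in a few lines of Python. This
is only an illustration of the suffix list, not the code that
@code{bzip2} itself uses; the function name @code{guess_output_name} and
the sample file names are invented for the example.

```python
# Suffix rules taken from the list above: (compressed suffix, replacement).
SUFFIX_RULES = (
    (".bz2", ""),
    (".bz", ""),
    (".tbz2", ".tar"),
    (".tbz", ".tar"),
)


def guess_output_name(name):
    """Return the name a decompressed file would get under the rules above."""
    for suffix, replacement in SUFFIX_RULES:
        if name.endswith(suffix):
            return name[: len(name) - len(suffix)] + replacement
    # No recognised ending: keep the whole name and append ".out".
    return name + ".out"


for example in ("notes.bz2", "notes.bz", "src.tbz2", "src.tbz", "data.gz"):
    print(example, "->", guess_output_name(example))
```

The last sample name has no recognised ending, so it falls through to
the @code{.out} case.
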
As with compression, supplying no
filenames causes decompression from standard input to standard output.
@code{bunzip2} will correctly decompress a file which is the
concatenation of two or more compressed files. The result is the
concatenation of the corresponding uncompressed files. Integrity
testing (@code{-t}) of concatenated compressed files is also supported.
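
The same property can be observed from Python's standard @code{bz2}
module, which is a separate implementation choice and not part of
@code{bzip2} itself. The sketch below assumes Python 3.3 or later, where
@code{bz2.decompress} accepts input made of several streams; the sample
byte strings are invented.

```python
import bz2

first = b"first file\n"
second = b"second file\n"

# Two independent compressed streams, joined byte for byte
# (the equivalent of `cat a.bz2 b.bz2 > both.bz2`).
combined = bz2.compress(first) + bz2.compress(second)

# bz2.decompress reads every stream in its input, so the result
# is the concatenation of the two originals.
restored = bz2.decompress(combined)
print(restored == first + second)
```
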
You can also compress or decompress files to the standard output by
giving the @code{-c} flag. Multiple files may be compressed and
decompressed like this. The resulting outputs are fed sequentially to
stdout. Compression of multiple files in this manner generates a stream
containing multiple compressed file representations. Such a stream
can be decompressed correctly only by @code{bzip2} version 0.9.0 or
later. Earlier versions of @code{bzip2} will stop after decompressing
the first file in the stream.
@code{bzcat} (or @code{bzip2 -dc}) decompresses all specified files to
the standard output.
@code{bzip2} will read arguments from the environment variables
@code{BZIP2} and @code{BZIP}, in that order, and will process them
before any arguments read from the command line. This gives a
convenient way to supply default arguments.
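
One way to picture the ordering is the small Python sketch below. It is
a model of the behaviour described above, not @code{bzip2}'s own code:
the function name @code{effective_arguments} is hypothetical, and
splitting each variable on whitespace is an assumption.

```python
def effective_arguments(environ, command_line):
    """Combine BZIP2, then BZIP, then the command line, in that order."""
    arguments = []
    for variable in ("BZIP2", "BZIP"):
        # Assumption: the variable's value is split on whitespace.
        arguments.extend(environ.get(variable, "").split())
    arguments.extend(command_line)
    return arguments


print(effective_arguments(dict(BZIP2="-v -k", BZIP="-s"), ["-9", "log.txt"]))
```

In this model the options from @code{BZIP2} come first, those from
@code{BZIP} second, and the command-line arguments last.
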
Compression is always performed, even if the compressed file is slightly
larger than the original. Files of less than about one hundred bytes
tend to get larger, since the compression mechanism has a constant
overhead in the region of 50 bytes. Random data (including the output
of most file compressors) is coded at about 8.05 bits per byte, giving
an expansion of around 0.5%.
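
The effect is easy to observe with Python's standard @code{bz2} module,
used here purely as a convenient way to produce @code{bzip2}-format
data. This is a rough experiment rather than a benchmark: the exact
sizes depend on the input, and the 200000-byte sample size is an
arbitrary choice.

```python
import bz2
import os

tiny = b"hello"
random_block = os.urandom(200000)

for label, data in (("tiny", tiny), ("random", random_block)):
    packed = bz2.compress(data)
    # Both inputs are expected to grow: the tiny one because of the fixed
    # overhead, the random one because it has no redundancy to exploit.
    print(label, len(data), "->", len(packed))

# Bits of output per byte of random input; the text above quotes about 8.05.
bits_per_byte = 8.0 * len(bz2.compress(random_block)) / len(random_block)
print(round(bits_per_byte, 3))
```
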
As a self-check for your protection, @code{bzip2} uses 32-bit CRCs to
make sure that the decompressed version of a file is identical to the
original. This guards against corruption of the compressed data, and
against undetected bugs in @code{bzip2} (hopefully very unlikely). The
chance of data corruption going undetected is microscopic, about one
in four billion for each file processed. Be aware, though, that
the check occurs upon decompression, so it can only tell you that
something is wrong. It can't help you recover the original uncompressed
data. You can use @code{bzip2recover} to try to recover data from
damaged files.
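
The check can be seen in action with Python's standard @code{bz2}
module. The sketch below damages the stored checksum of a compressed
block and confirms that decompression is refused. It assumes the stream
layout in which the 4-byte header and 6-byte block marker are followed
by the block's 32-bit CRC, so that byte offset 10 falls inside the
stored CRC; the sample text is invented.

```python
import bz2

original = b"The quick brown fox jumps over the lazy dog.\n" * 200
packed = bytearray(bz2.compress(original))

# Assumption: bytes 10..13 hold the stored block CRC. Flip one bit there,
# so the decoded data no longer matches its recorded checksum.
packed[10] ^= 0x01

try:
    bz2.decompress(bytes(packed))
    detected = False
except (OSError, ValueError):
    # The decoder reports the damage; it cannot reconstruct the original.
    detected = True

print("corruption detected:", detected)
```

As the text says, the failure only signals that something is wrong; the
error carries no information that would help rebuild the data.
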
Return values: 0 for a normal exit, 1 for environmental problems (file
not found, invalid flags, I/O errors, etc.), 2 to indicate a corrupt
compressed file, 3 for an internal consistency error (e.g., a bug) which
caused @code{bzip2} to panic.
@unnumberedsubsubsec OPTIONS
@table @code
@item -c --stdout
Compress or decompress to standard output.
@item -d --decompress
Force decompression. @code{bzip2}, @code{bunzip2} and @code{bzcat} are
really the same program, and the decision about what actions to take is
done on the basis of which name is used. This flag overrides that
mechanism, and forces @code{bzip2} to decompress.
@item -z --compress
The complement to @code{-d}: forces compression, regardless of the
invocation name.
@item -t --test
Check integrity of the specified file(s), but don't decompress them.
This really performs a trial decompression and throws away the result.
@item -f --force
Force overwrite of output files. Normally, @code{bzip2} will not overwrite
existing output files. Also forces @code{bzip2} to break hard links
to files, which it otherwise wouldn't do.
@item -k --keep
Keep (don't delete) input files during compression
or decompression.
@item -s --small
Reduce memory usage, for compression, decompression and testing. Files
are decompressed and tested using a modified algorithm which only
requires 2.5 bytes per block byte. This means any file can be
decompressed in 2300k of memory, albeit at about half the normal speed.
During compression, @code{-s} selects a block size of 200k, which limits
memory use to around the same figure, at the expense of your compression
ratio. In short, if your machine is low on memory (8 megabytes or
less), use @code{-s} for everything. See MEMORY MANAGEMENT below.
@item -q --quiet
Suppress non-essential warning messages. Messages pertaining to
I/O errors and other critical events will not be suppressed.
@item -v --verbose
Verbose mode -- show the compression ratio for each file processed.
Further @code{-v}'s increase the verbosity level, spewing out lots of
information which is primarily of interest for diagnostic purposes.
@item -L --license -V --version
Display the software version, license terms and conditions.
@item -1 to -9
Set the block size to 100 k, 200 k, ..., 900 k when compressing. Has no
effect when decompressing. See MEMORY MANAGEMENT below.
@item --
Treats all subsequent arguments as file names, even if they start
with a dash. This is so you can handle files with names beginning
with a dash, for example: @code{bzip2 -- -myfilename}.
@item --repetitive-fast
@item --repetitive-best
These flags are redundant in versions 0.9.5 and above. They provided
some coarse control over the behaviour of the sorting algorithm in
earlier versions, which was sometimes useful. 0.9.5 and above have an
improved algorithm which renders these flags irrelevant.
@end table
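
The effect of the @code{-1} to @code{-9} options in the table above can
be seen in the compressed stream itself. The sketch below uses Python's
standard @code{bz2} module, whose @code{compresslevel} parameter plays
the role of these flags. It relies on the @code{bzip2} format's stream
header being the three characters @code{BZh} followed by the level
digit; the sample data is invented.

```python
import bz2

sample = b"example data " * 1000

for level in (1, 5, 9):
    stream = bz2.compress(sample, compresslevel=level)
    # The first four bytes are "BZh" followed by the level digit, so a
    # decompressor can learn the block size from the stream itself.
    header = stream[:4]
    block_size = int(header[3:4]) * 100000
    print(level, header, block_size)
```

Because the digit travels with the data, no block-size flag is needed
when decompressing.
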
@unnumberedsubsubsec MEMORY MANAGEMENT
@code{bzip2} compresses large files in blocks. The block size affects
both the compression ratio achieved, and the amount of memory needed for
compression and decompression. The flags @code{-1} through @code{-9}
specify the block size to be 100,000 bytes through 900,000 bytes (the
default) respectively. At decompression time, the block size used for
compression is read from the header of the compressed file, and
@code{bunzip2} then allocates itself just enough memory to decompress
the file. Since block sizes are stored in compressed files, it follows
that the flags @code{-1} to @code{-9} are irrelevant to and so ignored