SHORTEN(1)                                                  SHORTEN(1)

NAME
       shorten - fast compression for waveform files

SYNOPSIS
       shorten [-hlu] [-a #bytes] [-b #samples] [-c #channels]
               [-d #bytes] [-m #blocks] [-n #dB] [-p #order] [-q #bits]
               [-r #bits] [-t filetype] [-v #version]
               [waveform-file [shortened-file]]

       shorten -x [-hl] [-a #bytes] [-d #bytes]
               [shortened-file [waveform-file]]

DESCRIPTION
       shorten reduces the size of waveform files (such as audio) using
       Huffman coding of prediction residuals and optional additional
       quantisation.  In lossless mode the amount of compression
       obtained depends on the nature of the waveform.  Waveforms
       composed of low frequencies and low amplitudes give the best
       compression, which may be 2:1 or better.  Lossy compression is
       selected by specifying a minimum acceptable segmental signal to
       noise ratio or a maximum bit rate.  It operates by zeroing the
       lower order bits of the waveform, so retaining waveform shape.

       If both file names are specified then these are used as the
       input and output files.  The first file name can be replaced by
       "-" to read from standard input and likewise the second file
       name can be replaced by "-" to write to standard output.  Under
       UNIX, if only one file name is specified, then that name is used
       for input and the output file name is generated by adding the
       suffix ".shn" on compression and removing the ".shn" suffix on
       decompression.  In these cases the input file is removed on
       completion.  The use of automatic file name generation is not
       currently supported under DOS.  If no file names are specified,
       shorten reads from standard input and writes to standard output.
       Whenever possible, the output file inherits the permissions,
       owner, group, access and modification times of the input file.

OPTIONS
       -a align bytes
              Specify the number of bytes to be copied verbatim before
              compression begins.  This option can be used to preserve
              fixed length ASCII headers on waveform files, and may be
              necessary if the header length is an odd number of bytes.

       -b block size
              Specify the number of samples to be grouped into a block
              for processing.  Within a block the signal elements are
              expected to have the same spectral characteristics.  The
              default option works well for a large range of audio
              files.

       -c channels
              Specify the number of independent interwoven channels.
              For two signals, a(t) and b(t), the original data format
              is assumed to be a(0),b(0),a(1),b(1)...

       -d discard bytes
              Specify the number of bytes to be discarded before
              compression or decompression.  This may be used to delete
              header information from a file.  Refer to the -a option
              for storing the header information in the compressed
              file.

       -h     Give a short message specifying usage options.

       -l     Print the software license specifying the conditions for
              the distribution and usage of this software.

       -m blocks
              Specify the number of past blocks to be used to estimate
              the mean and power of the signal.  The value of zero
              disables this prediction and the mean is assumed to lie
              in the middle of the range of the relevant data type
              (i.e. at zero for signed quantities).  The default value
              is non-zero for format versions 2.0 and above.

       -n noise level
              Specify the minimum acceptable segmental signal to noise
              ratio in dB.  The signal power is taken as the variance
              of the samples in the current block.  The noise power is
              the quantisation noise incurred by coding the current
              block assuming that samples are uniformly distributed
              over the quantisation interval.  The bit rate is
              dynamically changed to maintain the desired signal to
              noise ratio.  The default value represents lossless
              coding.

       -p prediction order
              Specify the maximum order of the linear predictive
              filter.  The default value of zero disables the use of
              linear prediction and a polynomial interpolation method
              is used instead.  The use of the linear predictive filter
              generally results in a small improvement in compression
              ratio at the expense of execution time.
              This is the only option to use a significant amount of
              floating point processing during compression.
              Decompression still uses a minimal number of floating
              point operations.  Decompression time is normally about
              twice that of the default polynomial interpolation.  For
              versions 0 and 1, compression time is linear in the
              specified maximum order as all lower values are searched
              for the greatest expected compression (the number of bits
              required to transmit the prediction residual is
              monotonically decreasing with prediction order, but
              transmitting each filter coefficient requires about 7
              bits).  For version 2 and above, the search is started at
              zero order and terminated when the last two prediction
              orders give a larger expected bit rate than the minimum
              found to date.  This is a reasonable strategy for many
              real world signals - you may revert to the exhaustive
              algorithm by setting -v1 to check that this works for
              your signal type.

       -q quantisation level
              Specify the number of low order bits in each sample which
              can be discarded (set to zero).  This is useful if these
              bits carry no information, for example when the signal is
              corrupted by noise.

       -r bit rate
              Specify the expected maximum number of bits per sample.
              The upper bound on the bit rate is achieved by setting
              the low order bits of the sample to zero, hence
              maximising the segmental signal to noise ratio.

       -t file type
              Gives the type of the sound sample file as one of
              {ulaw,alaw,s8,u8,s16,u16,s16x,u16x,s16hl,u16hl,s16lh,u16lh}.
              ulaw is the natural file type of ulaw encoded files (such
              as the default Sun .au files) and alaw is a similar
              byte-packed scheme.  All the other types have initial s
              or u for signed or unsigned data, followed by 8 or 16 as
              the number of bits per sample.  No further extension
              means the data is in the natural byte order, a trailing x
              specifies byte swapped data, hl explicitly states the
              byte order as high byte followed by low byte and lh the
              converse.
              The default is s16, meaning signed 16 bit integers in the
              natural byte order.

              Specific optimisations are applied to ulaw and alaw
              files.  If lossless compression is specified with ulaw
              files then a check is made that the whole dynamic range
              is used (useful for files recorded on a SparcStation with
              the volume set too high).  Lossless coding of both file
              types uses an internal format with a monotonic mapping to
              linear.  If lossy compression is specified then the data
              is internally converted to linear.  The lossy option
              "-r4" has been observed to give little degradation.

       -u     The ulaw standard (ITU G.711) has two codes which both
              map onto the zero value on a linear scale.  The "-u" flag
              maps the negative zero onto the positive zero and so
              yields marginally better compression for format version 2
              (the gain is significant for older format versions).

       -v version
              Specify the binary format version number of compressed
              files.  Legal values are currently 1 and 2, higher
              numbers generally giving better compression.  Detection
              of format version on decode is automatic.

       -x extract
              Reconstruct the original file.  All other command line
              options except -a and -d are ignored.

METHODOLOGY
       shorten works by blocking the signal, making a model of each
       block in order to remove temporal redundancy, then Huffman
       coding the quantised prediction residual.

   Blocking
       The signal is read in blocks of about 128 or 256 samples, and
       converted to integers with expected mean of zero.
       Sample-wise-interleaved data is converted to separate channels,
       which are assumed independent.

   Decorrelation
       Four functions are computed, corresponding to the signal,
       difference signal, second and third order differences.  The one
       with the lowest variance is coded.  The variance is measured by
       summing absolute values for speed and to avoid overflow.

   Compression
       It is assumed the signal has the Laplacian probability density
       function of exp(-abs(x)).
       There is a computationally efficient way of mapping this density
       to Huffman codes.  The code is in two parts: a run of zeros, a
       bounding one and a fixed number of bits mantissa.  The number of
       leading zeros gives the offset from zero.  Signed numbers are
       stored by calling the function for unsigned numbers with the
       sign in the lowest bit.  Some examples for a 2 bit mantissa:

              100        0
              101        1
              110        2
              111        3
              0100       4
              0111       7
              00100      8
              0000100   16

       This Huffman code was first used by Robert Rice; for more
       details see the technical report CUED/F-INFENG/TR.156 included
       with the shorten distribution as files tr154.tex and tr154.ps.

SEE ALSO
       compress(1), pack(1).

DIAGNOSTICS
       Exit status is normally 0.  A warning is issued if the file is
       not properly aligned, i.e. a whole number of records could not
       be read at the end of the file.

BUGS
       Large values of '-c' or '-b' cause MS-DOS to throw a wobbly.
       Presumably this is a (lack of) memory management problem.

       An easy way to test shorten for your system is to use "make
       test"; if this fails, for whatever reason, please report it.

       No check is made for increasing file size, but valid waveform
       files generally achieve some compression.  Even compressing a
       file of random bytes (which represents the worst case waveform
       file) only results in a small increase in the file length (about
       6% for 8 bit data and 3% for 16 bit data).  There is one
       condition that is known to be problematic, that is the lossy
       compression of unsigned data without mean estimation - large
       file sizes may result if the mean is far from the middle range
       value.  For these files the value of the -m switch should be
       non-zero, as it is by default in format version 2.

       There is no provision for different channels containing
       different data types.  Normally, this is not a restriction, but
       it does mean that if lossy coding is selected for the ulaw type,
       then all channels use lossy coding.
       It would be possible for all options to be channel specific as
       in the -r option.  I could do this if anyone has a really good
       need for it.

       See the file "change.log" for a history of bug fixes.  Please
       mail me immediately at the address below if you do find a bug.

AVAILABILITY
       The latest version can be obtained by anonymous FTP from
       svr-ftp.eng.cam.ac.uk, in directory comp.speech/coding.  The
       sources are available for UNIX machines in files shorten.tar.Z
       and shorten.tar.gz and for DOS machines as file shorten.zip.
       All distributions contain a DOS executable.

AUTHOR
       Copyright (C) 1992-1995 by Tony Robinson and SoftSound Ltd
       (ajr@softsound.com)

       Shorten is available for non-commercial use without fee.  See
       the LICENSE file for the formal copying and usage restrictions.

                              22 Dec 1995
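
As an illustration of the Decorrelation and Compression steps described under METHODOLOGY, here is a minimal Python sketch. It is not taken from the shorten sources. The function names are hypothetical, samples before the block are assumed to be zero (a real coder would presumably carry history over from the previous block), and the exact sign folding shown is only one way of putting "the sign in the lowest bit".

```python
def difference_signals(block):
    """Return [signal, 1st, 2nd, 3rd order differences] for one block.

    Assumption: samples before the start of the block are zero.
    """
    signals = [list(block)]
    for _ in range(3):
        prev = signals[-1]
        signals.append([x - (prev[i - 1] if i else 0)
                        for i, x in enumerate(prev)])
    return signals


def pick_order(block):
    """Pick the difference order with the smallest sum of absolute values."""
    signals = difference_signals(block)
    sums = [sum(abs(x) for x in s) for s in signals]
    order = sums.index(min(sums))  # ties go to the lowest order
    return order, signals[order]


def rice_encode_unsigned(value, mantissa_bits):
    """Code a non-negative integer: run of zeros, bounding one, mantissa."""
    if value < 0:
        raise ValueError("value must be non-negative")
    zeros = value >> mantissa_bits
    mantissa = value & ((1 << mantissa_bits) - 1)
    low = format(mantissa, "0{}b".format(mantissa_bits)) if mantissa_bits else ""
    return "0" * zeros + "1" + low


def rice_decode_unsigned(code, mantissa_bits):
    """Inverse of rice_encode_unsigned for a single code word."""
    zeros = code.index("1")
    low = code[zeros + 1:zeros + 1 + mantissa_bits]
    return (zeros << mantissa_bits) | (int(low, 2) if low else 0)


def fold_sign(value):
    """Assumed folding: magnitude in the upper bits, sign in the lowest bit."""
    return (value << 1) if value >= 0 else (((-value - 1) << 1) | 1)


def rice_encode_signed(value, mantissa_bits):
    """Code a signed integer by folding the sign, then coding as unsigned."""
    return rice_encode_unsigned(fold_sign(value), mantissa_bits)


if __name__ == "__main__":
    for n in (0, 1, 2, 3, 4, 7, 8, 16):
        print(rice_encode_unsigned(n, 2), n)
```

With a 2 bit mantissa, rice_encode_unsigned gives the same code words as the examples under METHODOLOGY, for instance 0111 for 7 and 0000100 for 16. For a slowly rising block such as 10, 12, 14, 16, 18, pick_order selects the first order difference, whose residual 10, 2, 2, 2, 2 has a much smaller sum of absolute values than the signal itself.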