/******************************************************************************/
/*                                                                            */
/*                                 LZRW3-A.C                                  */
/*                                                                            */
/******************************************************************************/
/*                                                                            */
/* Author  : Ross Williams.                                                   */
/* Date    : 15-Jul-1991.                                                     */
/* Release : 1.                                                               */
/*                                                                            */
/******************************************************************************/
/*                                                                            */
/* This file contains an implementation of the LZRW3-A data compression      */
/* algorithm in the C programming language.                                   */
/*                                                                            */
/* The LZRW3-A algorithm has the following features:                          */
/*                                                                            */
/*    1 Requires only 16K of memory (for both compression and decompression). */
/*    2 The compressor runs about two times faster than Unix compress's.      */
/*    3 The decompressor runs about three times faster than Unix compress's.  */
/*    4 Yields a few percent better compression than Unix compress for        */
/*      most files.                                                           */
/*    5 Allows you to dial up extra compression at a speed cost in the        */
/*      compressor. The speed of the decompressor is not affected.            */
/*    6 Algorithm is deterministic.                                           */
/*    7 Algorithm is free of patent problems. The algorithm has not been      */
/*      patented (nor will it be) and is of the LZ77 class which is fairly    */
/*      clear of patents.                                                     */
/*    8 This implementation in C is in the public domain.                     */
/*                                                                            */
/* (Timing tests for the speed comparison were performed on a Pyramid 9820.)  */
/*                                                                            */
/* LZRW3-A is LZRW3 with a deepened hash table. This simple change yields     */
/* about a 6% (absolute) improvement in compression.                          */
/*                                                                            */
/* Here are the results of applying this code, compiled under THINK C 4.0     */
/* and running on a Mac-SE (8MHz 68000), to the standard Calgary corpus.      */
/*                                                                            */
/* +----------------------------------------------------------------+        */
/* |                     DATA COMPRESSION TEST                      |        */
/* |                     =====================                      |        */
/* | Time of run     : Mon 15-Jul-1991 05:29PM                      |        */
/* | Timing accuracy : One part in 100                              |        */
/* | Context length  : 262144 bytes (= 256.0000K)                   |        */
/* | Test suite      : Calgary Corpus Suite                         |        */
/* | Files in suite  : 14                                           |        */
/* | Algorithm       : LZRW3-A                                      |        */
/* | Note: All averages are calculated from the un-rounded values.  |        */
/* +----------------------------------------------------------------+        */
/* | File Name   Length  CxB  ComLen  %Remn  Bits  Com K/s  Dec K/s |        */
/* | ----------  ------  ---  ------  -----  ----  -------  ------- |        */
/* | rpus:Bib.D  111261    1   49044   44.1  3.53     8.47    31.19 |        */
/* | us:Book1.D  768771    3  420464   54.7  4.38     7.27    30.07 |        */
/* | us:Book2.D  610856    3  277955   45.5  3.64     8.51    33.40 |        */
/* | rpus:Geo.D  102400    1   84218   82.2  6.58     4.23    15.04 |        */
/* | pus:News.D  377109    2  192880   51.1  4.09     7.08    25.89 |        */
/* | pus:Obj1.D   21504    1   12651   58.8  4.71     5.23    17.44 |        */
/* | pus:Obj2.D  246814    1  108044   43.8  3.50     8.01    28.11 |        */
/* | s:Paper1.D   53161    1   24526   46.1  3.69     8.11    30.24 |        */
/* | s:Paper2.D   82199    1   39483   48.0  3.84     8.11    32.04 |        */
/* | rpus:Pic.D  513216    2  111622   21.7  1.74    10.64    49.31 |        */
/* | us:Progc.D   39611    1   17923   45.2  3.62     8.06    29.01 |        */
/* | us:Progl.D   71646    1   24362   34.0  2.72    10.74    39.51 |        */
/* | us:Progp.D   49379    1   16805   34.0  2.72    10.64    37.58 |        */
/* | us:Trans.D   93695    1   30296   32.3  2.59    11.02    38.06 |        */
/* +----------------------------------------------------------------+        */
/* | Average     224401    1  100733   45.8  3.67     8.29    31.21 |        */
/* +----------------------------------------------------------------+        */
/*                                                                            */
/******************************************************************************/

                            /* INCLUDE FILES */
                            /* ============= */

//#include "port.h"      /* Defines symbols for the non portable stuff.       */
#include "compress.h"    /* Defines single exported function "compress".      */
#include "fast_copy.h"   /* Fast memory copy routine.                         */

/******************************************************************************/
/* The following structure is returned by the "compress" function below when  */
/* the user asks the function to return identifying information.              */
/* The most important field in the record is the working memory field which   */
/* tells the calling program how much working memory should be passed to      */
/* "compress" when it is called to perform a compression or decompression.    */
/* LZRW3-A uses the same amount of memory during compression and              */
/* decompression. For more information on this structure see "compress.h".    */
/* The alignment fudge below really only needs to be 4 (but I play it safe!). */
/* The id looks non-random, but it really was generated by coin tossing!      */

#define U(X)            ((ULONG) X)
#define SIZE_P_BYTE     (U(sizeof(UBYTE *)))
#define ALIGNMENT_FUDGE (U(16))
#define MEM_REQ         ( U(4096)*(SIZE_P_BYTE) + ALIGNMENT_FUDGE )

static struct compress_identity identity =
{
 U(0x01B90B91),            /* Algorithm identification number. */
 MEM_REQ,                  /* Working memory (bytes) required. */
 "LZRW3-A",                /* Name of algorithm.               */
 "1.0",                    /* Version number of algorithm.     */
 "15-Jul-1991",            /* Date of algorithm.               */
 "Public Domain",          /* Copyright notice.                */
 "Ross N. Williams",       /* Author of algorithm.             */
 "Renaissance Software",   /* Affiliation of author.           */
 "Public Domain"           /* Vendor of algorithm.             */
};

void compress_compress  (UBYTE *,UBYTE *,ULONG,UBYTE *,ULONG *);
void compress_decompress(UBYTE *,UBYTE *,ULONG,UBYTE *,ULONG *);

/******************************************************************************/
/* This function is the only function exported by this module.                */
/* Depending on its first parameter, the function can be requested to         */
/* compress a block of memory, decompress a block of memory, or to identify   */
/* itself. For more information, see the specification file "compress.h".     */

EXPORT void compress(action,wrk_mem,src_adr,src_len,dst_adr,p_dst_len)
UWORD      action;     /* Action to be performed.                             */
UBYTE    *wrk_mem;     /* Address of working memory we can use.               */
UBYTE    *src_adr;     /* Address of input data.                              */
ULONG     src_len;     /* Length of input data.                               */
UBYTE    *dst_adr;     /* Address to put output data.                         */
ULONG  *p_dst_len;     /* Address of longword for length of output data.      */
{
 switch (action)
   {
    case COMPRESS_ACTION_IDENTITY:
       *p_dst_len=(ULONG) &identity;
       break;
    case COMPRESS_ACTION_COMPRESS:
       compress_compress(wrk_mem,src_adr,src_len,dst_adr,p_dst_len);
       break;
    case COMPRESS_ACTION_DECOMPRESS:
       compress_decompress(wrk_mem,src_adr,src_len,dst_adr,p_dst_len);
       break;
   }
}

/******************************************************************************/
/*                                                                            */
/*                 BRIEF DESCRIPTION OF THE LZRW3-A ALGORITHM                 */
/*                 ==========================================                 */
/* Note: Before attempting to understand this algorithm, you should first     */
/* understand the LZRW3 algorithm from which this algorithm is derived.       */
/*                                                                            */
/* The LZRW3-A algorithm is identical to the LZRW3 algorithm except that the  */
/* hash table has been "deepened". The LZRW3 algorithm has a hash table of    */
/* 4096 pointers which point to strings in the buffer. LZRW3-A generalizes    */
/* this to 4096/(2^n) partitions each of which contains (2^n) pointers.       */
/* In LZRW3-A, the hash function hashes to a partition number.                */
/*                                                                            */
/* During the processing of each phrase, LZRW3 overwrites the pointer in the  */
/* position selected by the hash function. LZRW3-A overwrites one of the      */
/* pointers in the partition that was selected by the hash function.          */
/*                                                                            */
/* When searching for a match, LZRW3-A matches against all (2^n) strings      */
/* pointed to by the pointers in the target partition.                        */
/*                                                                            */
/* Deep hash tables were used in early versions of LZRW1 in late 1989, but    */
/* were discarded in an effort to increase speed (which was the primary       */
/* requirement for LZRW1). They were revived for use in LZRW3-A in order to   */
/* produce an algorithm with compression performance competitive with Unix    */
/* compress.                                                                  */
/*                                                                            */
/* Until 14-Jul-1991, deep hash tables used in prototype LZRW* algorithms     */
/* used a queue discipline within each partition. Upon the arrival of a new   */
/* pointer, the pointers in the partition would be block copied back one      */
/* position (with the oldest pointer being overwritten) and the new pointer   */
/* would be inserted in the space at the front (the youngest position).       */
/* This meant that pointers to the (2^n) most recent phrases corresponding to */
/* each hash were kept. The only flaw in this system was the time-consuming   */
/* block copy operation which was cheap for shallow tables but expensive for  */
/* deep tables.                                                               */
/*                                                                            */
/* The traditional solution to ring buffer block copy problems is to maintain */
/* a cyclic counter which points to the "head" of the queue. However, this    */
/* would have required one counter to be stored for each partition and would  */
/* have been slightly messy. After some thought (on 14-Jul-1991) a better     */
/* solution was found. Instead of maintaining a counter for each partition,   */
/* LZRW3-A maintains a single counter for all partitions! This counter is     */
/* maintained in both the compressor and decompressor and means that the      */
/* algorithm (effectively) overwrites a RANDOM element of the partition to be */
/* updated. The result was to increase the speed of the compressor and        */
/* decompressor, to make the decompressor's speed independent of whatever     */
/* depth was selected, and to impair compression by less than 1% absolute.    */
/*                                                                            */
/* Setting the depth is a speed/compression tradeoff. The table below gives   */
/* the tradeoff observed for a typical 50K text file on a Mac-SE.             */
/* Note: %Rem=Percentage Remaining (after compression).                       */
/*                                                                            */
/*        Depth    %Rem    CmpK/s    DecK/s                                   */
/*          1      45.2     14.77     32.24                                   */
/*          2      42.6     12.12     31.26                                   */
/*          4      40.9     10.28     31.91                                   */
/*          8      40.0      7.81     32.36                                   */
/*         16      39.5      5.30     32.47                                   */
/*         32      39.0      3.23     32.59                                   */
/*                                                                            */
/* I have chosen a depth of 8 as the "default" depth for LZRW3-A. If you use  */
/* a depth different to this (e.g. 4), you should use the name LZRW3-A(4) to  */
/* indicate that a different depth is being used. LZRW3-A(8) is an acceptable */
/* longhand for LZRW3-A.                                                      */
/*                                                                            */
/* To change the depth, search for "HERE IT IS" in the rest of this file.     */
/*                                                                            */
/*                                +---+                                       */
/*                                |___|4095                                   */
/*                                |===|                                       */
/*          +---------------------*_|<---+             /----+---\             */
/*          |                     |___|  +-------------|Hash    |             */
/*          |   512 partitions    |___|                |Function|             */
/*          |   of 8 pointers     |===|                \--------/             */
/*          |   each (or any      |___|0                    ^                 */
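The partition discipline described above can be sketched in a few lines. This is an illustrative model only, not code from this file: the real implementation stores `UBYTE *` pointers into the history buffer and compares bounded-length phrases, whereas this sketch uses NUL-terminated strings and the hypothetical names `table`, `cycle`, `insert`, and `search`. It shows the two key ideas: updates overwrite the slot selected by one counter shared by ALL partitions (so no per-partition head pointer and no block copy), and searches probe every pointer in the hashed partition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PARTITIONS 512   /* 4096/DEPTH partitions                        */
#define DEPTH      8     /* (2^n) pointers per partition; n=3 here       */

/* The 4096-pointer table, viewed as 512 rows of 8 slots. Static storage
   is zero-initialized, so every slot starts out empty (NULL).           */
static const char *table[PARTITIONS * DEPTH];

/* The single cyclic counter shared by all partitions. Compressor and
   decompressor step it identically, so both agree on which slot of a
   partition gets overwritten, even though the choice looks random.     */
static unsigned cycle = 0;

/* Overwrite one slot of the hashed partition and bump the counter.     */
static void insert(unsigned hash, const char *p)
{
    table[(hash % PARTITIONS) * DEPTH + cycle] = p;
    cycle = (cycle + 1) % DEPTH;  /* no per-partition bookkeeping       */
}

/* Probe all DEPTH slots of the hashed partition and return the stored
   string sharing the longest common prefix with s (NULL if none).      */
static const char *search(unsigned hash, const char *s)
{
    const char **part = &table[(hash % PARTITIONS) * DEPTH];
    const char *best = NULL;
    size_t best_len = 0;
    unsigned i;
    for (i = 0; i < DEPTH; i++) {
        const char *cand = part[i];
        size_t len = 0;
        if (cand == NULL) continue;
        while (cand[len] != '\0' && cand[len] == s[len]) len++;
        if (len > best_len) { best_len = len; best = cand; }
    }
    return best;
}
```

With this discipline a depth-8 update costs one store and one counter increment, versus the seven-element block copy of the old queue discipline; the comment block above reports that this trade costs under 1% (absolute) compression.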