📄 lzrw1-a.txt

📁 一个基于lZW压缩算法的高效实现
💻 TXT
字号:
RELEASE OF THE LZRW1-A ALGORITHM================================Author : Ross Williams.Date   : 25-Jun-1991.Abstract--------This file announces  the release of my LZRW1-A  algorithm. The LZRW1-Aalgorithm is a direct descendant  of the LZRW1 algorithm, bettering ita little  in both  speed and  compression. The  only reason  for usingLZRW1 now  that LZRW1-A exists  is backwards compatibility  (LZRW1 andLZRW1-A  cannot  decompress  each   other's  compressed  files).  Thisdocument specifies the difference between  LZRW1 and LZRW1-A and givesdetails  of the  LZRW1-A  algorithm's implementation  in  C and  68000assembly language.The LZRW1 Algorithm-------------------In April 1991, I published my LZRW1 algorithm by presenting it at thedata compression conference,   "An Extremely Fast Ziv-Lempel Data Compression Algorithm", IEEE   Data Compression Conference, Snowbird, Utah, 8-11 April 1991.and by advertising it on the internet:   FTP Archive Access:   Machine   : sirius.itd.adelaide.edu.au   [IP=129.127.40.3]   Directory : ~pub/misc   Files     : lzrw1.tex   - My conference paper describing the algorithm.               lzrw1.c     - An implementation in C.               lzrw1.68000 - An implementation in 68000 assembly language.The LZRW1 algorithm  gives not-good compression but  is fast, requiresonly 16K of memory and is patent free.Suggested Improvements to LZRW1-------------------------------After publication of  LZRW1, at least the following  people noted thatthe 4-bit copy length range field was not being used to the full.   Dean Long    dlong@oak.ucsc.edu   Kurt Haenen  ghgaea8@cc1.kuleuven.ac.be(I received  some other  suggested improvements  to the  algorithm butmost of  them were  ideas that I  had tried  and discarded  during thealgorithm's  development  (e.g.  deep hash  tables,  different  codingschemes).The following table  shows that LZRW1 was wasting two  elements of itsfour bit length space. A simple revision changed the range from [3,16]to [3,18].Field Value  :  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15--------------------------------------------------------------LZRW1        :  X  X  3  4  5  6  7  8  9 10 11 12 13 14 15 16Revision     :  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18The  missing values  were  intentionally designed  into the  algorithmbecause  of  misguided  concerns  for time  efficiency.  The  originalversion  of  the  algorithm  was  in 68000  machine  code  and  I  wasoptimizing like  crazy. At some  stage I  decided that I  could save asubtraction instruction in the  compressor and an addition instructionin the decompressor  by shifting the range by only  one (the decrementand jump  instruction on the  68000 stops at  -1, not 0).  This designdecision was  reflected in  the C version  with which  I wanted  to becompatible (so as not to confuse the meaning of the name "LZRW1").Things changed when Dean Long not  only pointed out the deficiency (ofwhich I  was aware), but how  removing it could actually  speed up thecode. When I had a look, I saw that he was right and that I had missedan optimization. I  then decided to create a  new algorithm containingthe extended  length range  and the optimizations.  The result  is theLZRW1-A  algorithm which  I have  implemented in  C and  68000 machinecode.LZRW1-A Implementations-----------------------Implementations of LZRW1-A in C and  in 68000 assembly language can befound in the following archive (files  will appear within a day or twoof 25-Jun-1991):   FTP Archive Access:   Machine   : sirius.itd.adelaide.edu.au   [IP=129.127.40.3]   Directory : ~pub/misc   Files     : lzrw1-a.txt    - Release notes on the file.               lzrw1-a.c      - An implementation in C.               lzrw1-a.68000  - An implementation in 68000 assembly language.               lzrw_headers.h - Useful header files.68000 Assembly Language Version of LZRW1-A------------------------------------------The  68000 code  for  LZRW1-A was  obtained  by  making the  followingchanges to the 68000 code for LZRW1:Compressor:1. The value of GROUPRAW was changed from 256 to 288.2. Two extra COMPARE_BYTE macro calls were inserted in the list of calls.3. Two extra DO_COPY macro calls were inserted in the list of calls.4. A constant of two was subtracted from the second parameter of all calls   to DO_COPY.Decompressor:5. The code:          COPYLOOP:             move.b (A_T1)+,  (A_DST)+   ;Perform the copy operation.             dbra   D_COPLEN, COPYLOOPwas replaced by the more unrolled version:          move.b (A_T1)+,  (A_DST)+   ;Copy first  byte of copy item.          move.b (A_T1)+,  (A_DST)+   ;Copy second byte of copy item.          COPYLOOP:             move.b (A_T1)+,  (A_DST)+   ;Perform the copy operation.             dbra   D_COPLEN, COPYLOOPThis last modification was Dean Long's  idea, and made it efficient touse the shifted range. Thanks  Dean --- your modification improved thedecompression speed by about 10%.6. Some minor cosmetic changes such as updating comments were made.When compared  to the  68000 version  of LZRW1,  the 68000  version ofLZRW1-A gives  about 0.5%  absolute better compression,  compresses atthe same speed, and decompresses 10% faster. The following table givesthe results  of applying the 68000  code to the corpus  WITH A CONTEXTBLOCK  SIZE  OF  16K.  These results  correspond  to  Figure-8  of  myconference paper on LZRW1. I  did not bother with instruction countingthis time.--<Start of Table>--+--------------------+---------------+---------+| File       K  /16K | CmpLen   %Rem |   K/s   |+--------------------+---------------+---------+| bib      109     7 |  67600   60.8 | 43  162 || book1    751    47 | 534086   69.5 | 40  148 || book2    597    38 | 368621   60.3 | 45  157 || geo      100     7 |  87618   85.6 | 31  146 || news     368    24 | 236432   62.7 | 41  164 || obj1      21     2 |  13240   61.6 | 41  173 || obj2     241    16 | 127885   51.8 | 50  171 || paper1    52     4 |  31520   59.3 | 45  159 || paper2    80     6 |  51130   62.2 | 44  153 || pic      501    32 | 126828   24.7 | 89  202 || progc     39     3 |  21932   55.4 | 48  164 || progl     70     5 |  31474   43.9 | 58  173 || progp     48     4 |  21536   43.6 | 59  173 || trans     91     6 |  44158   47.1 | 54  176 |+--------------------+---------------+---------+| Average  219    14 |          56.3 | 49  166 |+--------------------+---------------+---------++--------------------+---------------+---------+| zeros     78     5 |   9508   11.9 |144  219 || noise     78     5 |  80030  100.0 | 23 1178 |+--------------------+---------------+---------+This table  gives the performance  of a hand-optimized  68000 assemblylanguage implementation  of LZRW1-A  running on  an 8MHz,  24-bit bus,Macintosh-SE computer.  The input files  were divided into  16K blockswhich  were  compressed  independently.  The  %Rem  column  gives  thecompression as  a percentage  remaining.. The  K/Sec column  gives thecompression and  decompression speeds in kilobytes  (1024) per second.Decompression  speeds  are  given relative  to  OUTPUT  (uncompressed)bytes. The  speeds are for execution  only and do not  include IO. Allresults were rounded to the given accuracy.--<End of Table>--The  following table  gives the  performance of  the 68000  version ofLZRW1-A  when applied  to the  corpus using  context block  lengths of256K. The longer blocks improves speed and compression a little.Details: The  run is  on a Mac-SE  (8MHz 68000  with 24-bit  bus). Theaverage values  are the average of  exactly the values printed  in thecolumn above.  The times are for  compression only and do  not includeany IO.BENCHMARKERS: THIS TABLE HOLDS THE "OFFICIAL" RESULTS FOR LZRW1-A.+----------------------------------------------------------------+| DATA COMPRESSION TEST                                          || =====================                                          || Time of run     : Tue 25-Jun-1991 08:21PM                      || Timing accuracy : One part in 100                              || Context length  : 262144 bytes (= 256.0000K)                   || Test suite      : Calgary Corpus Suite                         || Files in suite  : 14                                           || Algorithm       : LZRW1-A (68000 assembly language)            |+----------------------------------------------------------------+| File Name   Length  CxB  ComLen  %Remn  Bits  Com K/s  Dec K/s || ----------  ------  ---  ------  -----  ----  -------  ------- || rpus:Bib.D  111261    1   65801   59.1  4.73    44.35   162.17 || us:Book1.D  768771    3  522797   68.0  5.44    40.58   147.21 || us:Book2.D  610856    3  359715   58.9  4.71    45.74   156.85 || rpus:Geo.D  102400    1   86432   84.4  6.75    31.91   144.93 || pus:News.D  377109    2  230298   61.1  4.89    42.57   163.84 || pus:Obj1.D   21504    1   13189   61.3  4.91    41.14   173.39 || pus:Obj2.D  246814    1  124666   50.5  4.04    50.61   171.89 || s:Paper1.D   53161    1   30626   57.6  4.61    46.35   158.72 || s:Paper2.D   82199    1   50048   60.9  4.87    44.84   152.90 || rpus:Pic.D  513216    2  125718   24.5  1.96    89.43   202.04 || us:Progc.D   39611    1   21494   54.3  4.34    48.52   164.22 || us:Progl.D   71646    1   30634   42.8  3.42    59.97   173.47 || us:Progp.D   49379    1   20935   42.4  3.39    60.49   173.60 || us:Trans.D   93695    1   42484   45.3  3.63    55.36   177.09 |+----------------------------------------------------------------+| Average     224401    1  123202   55.1  4.41    50.13   165.88 |+----------------------------------------------------------------+C Version of LZRW1-A--------------------The C version of LZRW1-A was  obtained by making the following changesto the C version of LZRW1:Compressor:1. The overrun specification in the header comments was changed from 256   (=16x16) to 288 (=16x18).2. The value of ITEMMAX was changed from 16 to 18.3. The unrolled matching loop was unrolled a bit more. Two more "PS ||"   were appended to the list of them.4. The term (len-1) was replaced by (len-3) in the line "*p_dst...".Decompressor:5. "len=1+.." was changed to "len=3+.."At this  stage I had  a working version  of LZRW1-A whose  code lookedvery like the code for LZRW1. However, I continued and:6. Performed extensive optimizations on  both the compressor and   decompressor. The same code structure is present, but a lot has   changed.The  C version  runs about  two to  three times  more slowly  than theassembler version.  The results  are given  below. The  output lengthsvary by a  few bytes from the  256K run of the 68000  version; this isalmost certainly a result of the non-determinism of the algorithm.Details: The  run is  on a Mac-SE  (8MHz 68000  with 24-bit  bus). Theaverage values  are the average of  exactly the values printed  in thecolumn above.  The times are for  compression only and do  not includeany IO. The compiler used was THINK C V4.0.+----------------------------------------------------------------+| DATA COMPRESSION TEST                                          || =====================                                          || Time of run     : Tue 25-Jun-1991 08:59PM                      || Timing accuracy : One part in 100                              || Context length  : 262144 bytes (= 256.0000K)                   || Test suite      : Calgary Corpus Suite                         || Files in suite  : 14                                           || Algorithm       : LZRW1-A (C version)                          |+----------------------------------------------------------------+| File Name   Length  CxB  ComLen  %Remn  Bits  Com K/s  Dec K/s || ----------  ------  ---  ------  -----  ----  -------  ------- || rpus:Bib.D  111261    1   65813   59.2  4.73    14.95    67.67 || us:Book1.D  768771    3  522800   68.0  5.44    13.68    62.96 || us:Book2.D  610856    3  359715   58.9  4.71    15.40    66.59 || rpus:Geo.D  102400    1   86432   84.4  6.75    10.81    59.55 || pus:News.D  377109    2  230298   61.1  4.89    14.33    67.52 || pus:Obj1.D   21504    1   13189   61.3  4.91    13.70    69.36 || pus:Obj2.D  246814    1  124666   50.5  4.04    17.07    71.36 || s:Paper1.D   53161    1   30626   57.6  4.61    15.65    67.09 || s:Paper2.D   82199    1   50048   60.9  4.87    15.12    65.34 || rpus:Pic.D  513216    2  125718   24.5  1.96    30.01    84.19 || us:Progc.D   39611    1   21494   54.3  4.34    16.29    69.28 || us:Progl.D   71646    1   30634   42.8  3.42    20.18    73.43 || us:Progp.D   49379    1   20944   42.4  3.39    20.23    73.56 || us:Trans.D   93695    1   42501   45.4  3.63    18.61    73.53 |+----------------------------------------------------------------+| Average     224401    1  123205   55.1  4.41    16.86    69.39 |+----------------------------------------------------------------+Nomenclature------------I am  keen to  retain a  sharp definitions  for the  algorithms that Iproduce.The  term  "LZRW1" should  be  used  only  to  refer to  the  originalalgorithm published in my data compression conference paper. This termdoes  not refer  to  a generic  family of  algorithms  but rather  thespecific algorithm described in the paper.The term "LZRNW1" I have used accidentally in the past in referring to"LZRW1". It should never be used but  if you see it used, it means thesame as "LZRW1".The term "LZRW1-A" refers to  the particular algorithm described abovewhich is based on the "LZRW1" algorithm.--<End of Report on LZRW1-A>--
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -