📄 lzrw1-a.txt
字号:
RELEASE OF THE LZRW1-A ALGORITHM================================Author : Ross Williams.Date : 25-Jun-1991.Abstract--------This file announces the release of my LZRW1-A algorithm. The LZRW1-Aalgorithm is a direct descendant of the LZRW1 algorithm, bettering ita little in both speed and compression. The only reason for usingLZRW1 now that LZRW1-A exists is backwards compatibility (LZRW1 andLZRW1-A cannot decompress each other's compressed files). Thisdocument specifies the difference between LZRW1 and LZRW1-A and givesdetails of the LZRW1-A algorithm's implementation in C and 68000assembly language.The LZRW1 Algorithm-------------------In April 1991, I published my LZRW1 algorithm by presenting it at thedata compression conference, "An Extremely Fast Ziv-Lempel Data Compression Algorithm", IEEE Data Compression Conference, Snowbird, Utah, 8-11 April 1991.and by advertising it on the internet: FTP Archive Access: Machine : sirius.itd.adelaide.edu.au [IP=129.127.40.3] Directory : ~pub/misc Files : lzrw1.tex - My conference paper describing the algorithm. lzrw1.c - An implementation in C. lzrw1.68000 - An implementation in 68000 assembly language.The LZRW1 algorithm gives not-good compression but is fast, requiresonly 16K of memory and is patent free.Suggested Improvements to LZRW1-------------------------------After publication of LZRW1, at least the following people noted thatthe 4-bit copy length range field was not being used to the full. Dean Long dlong@oak.ucsc.edu Kurt Haenen ghgaea8@cc1.kuleuven.ac.be(I received some other suggested improvements to the algorithm butmost of them were ideas that I had tried and discarded during thealgorithm's development (e.g. deep hash tables, different codingschemes).The following table shows that LZRW1 was wasting two elements of itsfour bit length space. A simple revision changed the range from [3,16]to [3,18].Field Value : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15--------------------------------------------------------------LZRW1 : X X 3 4 5 6 7 8 9 10 11 12 13 14 15 16Revision : 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18The missing values were intentionally designed into the algorithmbecause of misguided concerns for time efficiency. The originalversion of the algorithm was in 68000 machine code and I wasoptimizing like crazy. At some stage I decided that I could save asubtraction instruction in the compressor and an addition instructionin the decompressor by shifting the range by only one (the decrementand jump instruction on the 68000 stops at -1, not 0). This designdecision was reflected in the C version with which I wanted to becompatible (so as not to confuse the meaning of the name "LZRW1").Things changed when Dean Long not only pointed out the deficiency (ofwhich I was aware), but how removing it could actually speed up thecode. When I had a look, I saw that he was right and that I had missedan optimization. I then decided to create a new algorithm containingthe extended length range and the optimizations. The result is theLZRW1-A algorithm which I have implemented in C and 68000 machinecode.LZRW1-A Implementations-----------------------Implementations of LZRW1-A in C and in 68000 assembly language can befound in the following archive (files will appear within a day or twoof 25-Jun-1991): FTP Archive Access: Machine : sirius.itd.adelaide.edu.au [IP=129.127.40.3] Directory : ~pub/misc Files : lzrw1-a.txt - Release notes on the file. lzrw1-a.c - An implementation in C. lzrw1-a.68000 - An implementation in 68000 assembly language. lzrw_headers.h - Useful header files.68000 Assembly Language Version of LZRW1-A------------------------------------------The 68000 code for LZRW1-A was obtained by making the followingchanges to the 68000 code for LZRW1:Compressor:1. The value of GROUPRAW was changed from 256 to 288.2. Two extra COMPARE_BYTE macro calls were inserted in the list of calls.3. Two extra DO_COPY macro calls were inserted in the list of calls.4. A constant of two was subtracted from the second parameter of all calls to DO_COPY.Decompressor:5. The code: COPYLOOP: move.b (A_T1)+, (A_DST)+ ;Perform the copy operation. dbra D_COPLEN, COPYLOOPwas replaced by the more unrolled version: move.b (A_T1)+, (A_DST)+ ;Copy first byte of copy item. move.b (A_T1)+, (A_DST)+ ;Copy second byte of copy item. COPYLOOP: move.b (A_T1)+, (A_DST)+ ;Perform the copy operation. dbra D_COPLEN, COPYLOOPThis last modification was Dean Long's idea, and made it efficient touse the shifted range. Thanks Dean --- your modification improved thedecompression speed by about 10%.6. Some minor cosmetic changes such as updating comments were made.When compared to the 68000 version of LZRW1, the 68000 version ofLZRW1-A gives about 0.5% absolute better compression, compresses atthe same speed, and decompresses 10% faster. The following table givesthe results of applying the 68000 code to the corpus WITH A CONTEXTBLOCK SIZE OF 16K. These results correspond to Figure-8 of myconference paper on LZRW1. I did not bother with instruction countingthis time.--<Start of Table>--+--------------------+---------------+---------+| File K /16K | CmpLen %Rem | K/s |+--------------------+---------------+---------+| bib 109 7 | 67600 60.8 | 43 162 || book1 751 47 | 534086 69.5 | 40 148 || book2 597 38 | 368621 60.3 | 45 157 || geo 100 7 | 87618 85.6 | 31 146 || news 368 24 | 236432 62.7 | 41 164 || obj1 21 2 | 13240 61.6 | 41 173 || obj2 241 16 | 127885 51.8 | 50 171 || paper1 52 4 | 31520 59.3 | 45 159 || paper2 80 6 | 51130 62.2 | 44 153 || pic 501 32 | 126828 24.7 | 89 202 || progc 39 3 | 21932 55.4 | 48 164 || progl 70 5 | 31474 43.9 | 58 173 || progp 48 4 | 21536 43.6 | 59 173 || trans 91 6 | 44158 47.1 | 54 176 |+--------------------+---------------+---------+| Average 219 14 | 56.3 | 49 166 |+--------------------+---------------+---------++--------------------+---------------+---------+| zeros 78 5 | 9508 11.9 |144 219 || noise 78 5 | 80030 100.0 | 23 1178 |+--------------------+---------------+---------+This table gives the performance of a hand-optimized 68000 assemblylanguage implementation of LZRW1-A running on an 8MHz, 24-bit bus,Macintosh-SE computer. The input files were divided into 16K blockswhich were compressed independently. The %Rem column gives thecompression as a percentage remaining.. The K/Sec column gives thecompression and decompression speeds in kilobytes (1024) per second.Decompression speeds are given relative to OUTPUT (uncompressed)bytes. The speeds are for execution only and do not include IO. Allresults were rounded to the given accuracy.--<End of Table>--The following table gives the performance of the 68000 version ofLZRW1-A when applied to the corpus using context block lengths of256K. The longer blocks improves speed and compression a little.Details: The run is on a Mac-SE (8MHz 68000 with 24-bit bus). Theaverage values are the average of exactly the values printed in thecolumn above. The times are for compression only and do not includeany IO.BENCHMARKERS: THIS TABLE HOLDS THE "OFFICIAL" RESULTS FOR LZRW1-A.+----------------------------------------------------------------+| DATA COMPRESSION TEST || ===================== || Time of run : Tue 25-Jun-1991 08:21PM || Timing accuracy : One part in 100 || Context length : 262144 bytes (= 256.0000K) || Test suite : Calgary Corpus Suite || Files in suite : 14 || Algorithm : LZRW1-A (68000 assembly language) |+----------------------------------------------------------------+| File Name Length CxB ComLen %Remn Bits Com K/s Dec K/s || ---------- ------ --- ------ ----- ---- ------- ------- || rpus:Bib.D 111261 1 65801 59.1 4.73 44.35 162.17 || us:Book1.D 768771 3 522797 68.0 5.44 40.58 147.21 || us:Book2.D 610856 3 359715 58.9 4.71 45.74 156.85 || rpus:Geo.D 102400 1 86432 84.4 6.75 31.91 144.93 || pus:News.D 377109 2 230298 61.1 4.89 42.57 163.84 || pus:Obj1.D 21504 1 13189 61.3 4.91 41.14 173.39 || pus:Obj2.D 246814 1 124666 50.5 4.04 50.61 171.89 || s:Paper1.D 53161 1 30626 57.6 4.61 46.35 158.72 || s:Paper2.D 82199 1 50048 60.9 4.87 44.84 152.90 || rpus:Pic.D 513216 2 125718 24.5 1.96 89.43 202.04 || us:Progc.D 39611 1 21494 54.3 4.34 48.52 164.22 || us:Progl.D 71646 1 30634 42.8 3.42 59.97 173.47 || us:Progp.D 49379 1 20935 42.4 3.39 60.49 173.60 || us:Trans.D 93695 1 42484 45.3 3.63 55.36 177.09 |+----------------------------------------------------------------+| Average 224401 1 123202 55.1 4.41 50.13 165.88 |+----------------------------------------------------------------+C Version of LZRW1-A--------------------The C version of LZRW1-A was obtained by making the following changesto the C version of LZRW1:Compressor:1. The overrun specification in the header comments was changed from 256 (=16x16) to 288 (=16x18).2. The value of ITEMMAX was changed from 16 to 18.3. The unrolled matching loop was unrolled a bit more. Two more "PS ||" were appended to the list of them.4. The term (len-1) was replaced by (len-3) in the line "*p_dst...".Decompressor:5. "len=1+.." was changed to "len=3+.."At this stage I had a working version of LZRW1-A whose code lookedvery like the code for LZRW1. However, I continued and:6. Performed extensive optimizations on both the compressor and decompressor. The same code structure is present, but a lot has changed.The C version runs about two to three times more slowly than theassembler version. The results are given below. The output lengthsvary by a few bytes from the 256K run of the 68000 version; this isalmost certainly a result of the non-determinism of the algorithm.Details: The run is on a Mac-SE (8MHz 68000 with 24-bit bus). Theaverage values are the average of exactly the values printed in thecolumn above. The times are for compression only and do not includeany IO. The compiler used was THINK C V4.0.+----------------------------------------------------------------+| DATA COMPRESSION TEST || ===================== || Time of run : Tue 25-Jun-1991 08:59PM || Timing accuracy : One part in 100 || Context length : 262144 bytes (= 256.0000K) || Test suite : Calgary Corpus Suite || Files in suite : 14 || Algorithm : LZRW1-A (C version) |+----------------------------------------------------------------+| File Name Length CxB ComLen %Remn Bits Com K/s Dec K/s || ---------- ------ --- ------ ----- ---- ------- ------- || rpus:Bib.D 111261 1 65813 59.2 4.73 14.95 67.67 || us:Book1.D 768771 3 522800 68.0 5.44 13.68 62.96 || us:Book2.D 610856 3 359715 58.9 4.71 15.40 66.59 || rpus:Geo.D 102400 1 86432 84.4 6.75 10.81 59.55 || pus:News.D 377109 2 230298 61.1 4.89 14.33 67.52 || pus:Obj1.D 21504 1 13189 61.3 4.91 13.70 69.36 || pus:Obj2.D 246814 1 124666 50.5 4.04 17.07 71.36 || s:Paper1.D 53161 1 30626 57.6 4.61 15.65 67.09 || s:Paper2.D 82199 1 50048 60.9 4.87 15.12 65.34 || rpus:Pic.D 513216 2 125718 24.5 1.96 30.01 84.19 || us:Progc.D 39611 1 21494 54.3 4.34 16.29 69.28 || us:Progl.D 71646 1 30634 42.8 3.42 20.18 73.43 || us:Progp.D 49379 1 20944 42.4 3.39 20.23 73.56 || us:Trans.D 93695 1 42501 45.4 3.63 18.61 73.53 |+----------------------------------------------------------------+| Average 224401 1 123205 55.1 4.41 16.86 69.39 |+----------------------------------------------------------------+Nomenclature------------I am keen to retain a sharp definitions for the algorithms that Iproduce.The term "LZRW1" should be used only to refer to the originalalgorithm published in my data compression conference paper. This termdoes not refer to a generic family of algorithms but rather thespecific algorithm described in the paper.The term "LZRNW1" I have used accidentally in the past in referring to"LZRW1". It should never be used but if you see it used, it means thesame as "LZRW1".The term "LZRW1-A" refers to the particular algorithm described abovewhich is based on the "LZRW1" algorithm.--<End of Report on LZRW1-A>--
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -