📄 crc_v3.txt

📁 iscsi源代码 UNH的progect 有initiator端和target端的源码
💻 TXT
📖 第 1 页 / 共 5 页
字号:
TABLE algorithm.10. A Slightly Mangled Table-Driven Implementation--------------------------------------------------Despite the terse beauty of the line   r=0; while (len--) r = ((r << 8) | *p++) ^ t[(r >> 24) & 0xFF];those optimizing hackers couldn't leave it alone. The trouble, yousee, is that this loop operates upon the AUGMENTED message and inorder to use this code, you have to append W/8 zero bytes to the endof the message before pointing p at it. Depending on the run-timeenvironment, this may or may not be a problem; if the block of datawas handed to us by some other code, it could be a BIG problem. Onealternative is simply to append the following line after the aboveloop, once for each zero byte:      for (i=0; i<W/4; i++) r = (r << 8) ^ t[(r >> 24) & 0xFF];This looks like a sane enough solution to me. However, at the furtherexpense of clarity (which, you must admit, is already a pretty scarecommodity in this code) we can reorganize this small loop further soas to avoid the need to either augment the message with zero bytes, orto explicitly process zero bytes at the end as above. To explain theoptimization, we return to the processing diagram given earlier.                   3    2    1    0   Bytes                +----+----+----+----+         +-----<|    |    |    |    | <----- Augmented message         |      +----+----+----+----+         |                ^         |                |         |               XOR         |                |         |     0+----+----+----+----+       Algorithm         v      +----+----+----+----+       ---------         |      +----+----+----+----+       1. Shift the register left by         |      +----+----+----+----+          one byte, reading in a new         |      +----+----+----+----+          message byte.         |      +----+----+----+----+       2. Use the top byte just rotated         |      +----+----+----+----+          out of the register to index         +----->+----+----+----+----+          the table of 256 32-bit values.                +----+----+----+----+       3. XOR the table value into the                +----+----+----+----+          register.                +----+----+----+----+       4. Goto 1 iff more augmented                +----+----+----+----+          message bytes.             255+----+----+----+----+Now, note the following facts:TAIL: The W/4 augmented zero bytes that appear at the end of the      message will be pushed into the register from the right as all      the other bytes are, but their values (0) will have no effect      whatsoever on the register because 1) XORing with zero does not      change the target byte, and 2) the four bytes are never      propagated out the left side of the register where their      zeroness might have some sort of influence. Thus, the sole      function of the W/4 augmented zero bytes is to drive the      calculation for another W/4 byte cycles so that the end of the      REAL data passes all the way through the register.HEAD: If the initial value of the register is zero, the first four      iterations of the loop will have the sole effect of shifting in      the first four bytes of the message from the right. This is      because the first 32 control bits are all zero and so nothing is      XORed into the register. Even if the initial value is not zero,      the first 4 byte iterations of the algorithm will have the sole      effect of shifting the first 4 bytes of the message into the      register and then XORing them with some constant value (that is      a function of the initial value of the register).These facts, combined with the XOR property   (A xor B) xor C = A xor (B xor C)mean that message bytes need not actually travel through the W/4 bytesof the register. Instead, they can be XORed into the top byte justbefore it is used to index the lookup table. This leads to thefollowing modified version of the algorithm.         +-----<Message (non augmented)         |         v         3    2    1    0   Bytes         |      +----+----+----+----+        XOR----<|    |    |    |    |         |      +----+----+----+----+         |                ^         |                |         |               XOR         |                |         |     0+----+----+----+----+       Algorithm         v      +----+----+----+----+       ---------         |      +----+----+----+----+       1. Shift the register left by         |      +----+----+----+----+          one byte, reading in a new         |      +----+----+----+----+          message byte.         |      +----+----+----+----+       2. XOR the top byte just rotated         |      +----+----+----+----+          out of the register with the         +----->+----+----+----+----+          next message byte to yield an                +----+----+----+----+          index into the table ([0,255]).                +----+----+----+----+       3. XOR the table value into the                +----+----+----+----+          register.                +----+----+----+----+       4. Goto 1 iff more augmented             255+----+----+----+----+          message bytes.Note: The initial register value for this algorithm must be theinitial value of the register for the previous algorithm fed throughthe table four times. Note: The table is such that if the previousalgorithm used 0, the new algorithm will too.This is an IDENTICAL algorithm and will yield IDENTICAL results. The Ccode looks something like this:   r=0; while (len--) r = (r<<8) ^ t[(r >> 24) ^ *p++];and THIS is the code that you are likely to find inside currenttable-driven CRC implementations. Some FF masks might have to be ANDedin here and there for portability's sake, but basically, the aboveloop is IT. We will call this the DIRECT TABLE ALGORITHM.During the process of trying to understand all this stuff, I managedto derive the SIMPLE algorithm and the table-driven version derivedfrom that. However, when I compared my code with the code found inreal-implementations, I was totally bamboozled as to why the byteswere being XORed in at the wrong end of the register! It took quite awhile before I figured out that theirs and my algorithms were actuallythe same. Part of why I am writing this document is that, while thelink between division and my earlier table-driven code is vaguelyapparent, any such link is fairly well erased when you start pumpingbytes in at the "wrong end" of the register. It looks all wrong!If you've got this far, you not only understand the theory, thepractice, the optimized practice, but you also understand the realcode you are likely to run into. Could get any more complicated? Yesit can.11. "Reflected" Table-Driven Implementations--------------------------------------------Despite the fact that the above code is probably optimized about asmuch as it could be, this did not stop some enterprising individualsfrom making things even more complicated. To understand how thishappened, we have to enter the world of hardware.DEFINITION: A value/register is reflected if it's bits are swappedaround its centre. For example: 0101 is the 4-bit reflection of 1010.0011 is the reflection of 1100.0111-0101-1010-1111-0010-0101-1011-1100 is the reflection of0011-1101-1010-0100-1111-0101-1010-1110.Turns out that UARTs (those handy little chips that perform serial IO)are in the habit of transmitting each byte with the least significantbit (bit 0) first and the most significant bit (bit 7) last (i.e.reflected). An effect of this convention is that hardware engineersconstructing hardware CRC calculators that operate at the bit leveltook to calculating CRCs of bytes streams with each of the bytesreflected within itself. The bytes are processed in the same order,but the bits in each byte are swapped; bit 0 is now bit 7, bit 1 isnow bit 6, and so on. Now this wouldn't matter much if this conventionwas restricted to hardware land. However it seems that at some stagesome of these CRC values were presented at the software level andsomeone had to write some code that would interoperate with thehardware CRC calculation.In this situation, a normal sane software engineer would simplyreflect each byte before processing it. However, it would seem thatnormal sane software engineers were thin on the ground when this earlyground was being broken, because instead of reflecting the bytes,whoever was responsible held down the byte and reflected the world,leading to the following "reflected" algorithm which is identical tothe previous one except that everything is reflected except the inputbytes.             Message (non augmented) >-----+                                           |           Bytes   0    1    2    3        v                +----+----+----+----+      |                |    |    |    |    |>----XOR                +----+----+----+----+      |                          ^                |                          |                |                         XOR               |                          |                |                +----+----+----+----+0     |                +----+----+----+----+      v                +----+----+----+----+      |                +----+----+----+----+      |                +----+----+----+----+      |                +----+----+----+----+      |                +----+----+----+----+      |                +----+----+----+----+<-----+                +----+----+----+----+                +----+----+----+----+                +----+----+----+----+                +----+----+----+----+                +----+----+----+----+255Notes:   * The table is identical to the one in the previous algorithm   except that each entry has been reflected.   * The initial value of the register is the same as in the previous   algorithm except that it has been reflected.   * The bytes of the message are processed in the same order as   before (i.e. the message itself is not reflected).   * The message bytes themselves don't need to be explicitly   reflected, because everything else has been!At the end of execution, the register contains the reflection of thefinal CRC value (remainder). Actually, I'm being rather hard onwhoever cooked this up because it seems that hardware implementationsof the CRC algorithm used the reflected checksum value and soproducing a reflected CRC was just right. In fact reflecting the worldwas probably a good engineering solution - if a confusing one.We will call this the REFLECTED algorithm.Whether or not it made sense at the time, the effect of havingreflected algorithms kicking around the world's FTP sites is thatabout half the CRC implementations one runs into are reflected and theother half not. It's really terribly confusing. In particular, itwould seem to me that the casual reader who runs into a reflected,table-driven implementation with the bytes "fed in the wrong end"would have Buckley's chance of ever connecting the code to the conceptof binary mod 2 division.It couldn't get any more confusing could it? Yes it could.12. "Reversed" Polys--------------------As if reflected implementations weren't enough, there is anotherconcept kicking around which makes the situation bizaarly confusing.The concept is reversed Polys.It turns out that the reflection of good polys tend to be good polystoo! That is, if G=11101 is a good poly value, then 10111 will be aswell. As a consequence, it seems that every time an organization (suchas CCITT) standardizes on a particularly good poly ("polynomial"),those in the real world can't leave the poly's reflection aloneeither. They just HAVE to use it. As a result, the set of "standard"poly's has a corresponding set of reflections, which are also in use.To avoid confusion, we will call these the "reversed" polys.   X25   standard: 1-0001-0000-0010-0001   X25   reversed: 1-0000-1000-0001-0001   CRC16 standard: 1-1000-0000-0000-0101   CRC16 reversed: 1-0100-0000-0000-0011Note that here it is the entire poly that is being reflected/reversed,not just the bottom W bits. This is an important distinction. In thereflected algorithm described in the previous section, the poly usedin the reflected algorithm was actually identical to that used in thenon-reflected algorithm; all that had happened is that the bytes hadeffectively been reflected. As such, all the 16-bit/32-bit numbers inthe algorithm had to be reflected. In contrast, the ENTIRE polyincludes the implicit one bit at the top, and so reversing a poly isnot the same as reflecting its bottom 16 or 32 bits.The upshot of all this is that a reflected algorithm is not equivalentto the original algorithm with the poly reflected. Actually, this isprobably less confusing than if they were duals.If all this seems a bit unclear, don't worry, because we're going tosort it all out "real soon now". Just one more section to go beforethat.13. Initial and Final Values----------------------------In addition to the complexity already seen, CRC algorithms differ fromeach other in two other regards:   * The initial value of the register.   * The value to be XORed with the final register value.For example, the "CRC32" algorithm initializes its register toFFFFFFFF and XORs the final register value with FFFFFFFF.Most CRC algorithms initialize their register to zero. However, someinitialize it to a non-zero value. In theory (i.e. with no assumptionsabout the message), the initial value has no affect on the strength ofthe CRC algorithm, the initial value merely providing a fixed startingpoint from which the register value can progress. However, inpractice, some messages are more likely than others, and it is wise toinitialize the CRC algorithm register to a value that does not have"blind spots" that are likely to occur in practice. By "blind spot" ismeant a sequence of message bytes that do not result in the registerchanging its value. In particular, any CRC algorithm that initializesits register to zero will have a blind spot of zero when it starts upand will be unable to "count" a leading run of zero bytes. As aleading run of zero bytes is quite common in real messages, it is wise
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -