📄 rfc1071.txt

📁 <VC++网络游戏建摸与实现>源代码
💻 TXT
📖 第 1 页 / 共 4 页
字号:
     Internet Experiment Note  45                          5 June 1978     TCP Checksum Function Design                   William W. Plummer     1.      Introduction     Checksums  are  included  in  packets  in   order   that   errors     encountered  during  transmission  may be detected.  For Internet     protocols such as TCP [1,9] this is especially important  because     packets  may  have  to cross wireless networks such as the Packet     Radio Network  [2]  and  Atlantic  Satellite  Network  [3]  where     packets  may  be  corrupted.  Internet protocols (e.g., those for     real time speech transmission) can tolerate a  certain  level  of     transmission  errors  and  forward error correction techniques or     possibly no checksum at all might be better.  The focus  in  this     paper  is  on  checksum functions for protocols such as TCP where     the required reliable delivery is achieved by retransmission.     Even if the checksum appears good on a  message  which  has  been     received, the message may still contain an undetected error.  The     probability of this is bounded by 2**(-C) where  C  is the number     of  checksum bits.  Errors can arise from hardware (and software)     malfunctions as well as transmission  errors.   Hardware  induced     errors  are  usually manifested in certain well known ways and it     is desirable to account for this in the design  of  the  checksum     function.  Ideally no error of the "common hardware failure" type     would go undetected.     An  example  of  a  failure  that  the  current checksum function     handles successfully is picking up a bit in the network interface     (or I/O buss, memory channel, etc.).  This will always render the     checksum bad.  For an example of  how  the  current  function  is     inadequate, assume that a control signal stops functioning in the     network  interface and the interface stores zeros in place of the     real data.  These  "all  zero"  messages  appear  to  have  valid     checksums.   Noise  on the "There's Your Bit" line of the ARPANET     Interface [4] may go undetected because the extra bits input  may     cause  the  checksum  to be perturbed (i.e., shifted) in the same     way as the data was.     Although messages containing undetected errors will  occasionally     be  passed  to  higher levels of protocol, it is likely that they     will not make sense at that level.  In the case of TCP most  such     messages will be ignored, but some could cause a connection to be     aborted.   Garbled  data could be viewed as a problem for a layer     of protocol above TCP which itself may have a checksuming scheme.     This paper is the first step in design of a new checksum function     for TCP  and  some  other  Internet  protocols.   Several  useful     properties  of  the current function are identified.  If possible                                   - 1 -Braden, Borman, & Partridge                                    [Page 13]RFC 1071            Computing the Internet Checksum       September 1988     Internet Experiment Note  45                          5 June 1978     TCP Checksum Function Design                   William W. Plummer     these should be retained  in  any  new  function.   A  number  of     plausible  checksum  schemes are investigated.  Of these only the     "product code" seems to be simple enough for consideration.     2.      The Current TCP Checksum Function     The current function is  oriented  towards  sixteen-bit  machines     such  as  the PDP-11 but can be computed easily on other machines     (e.g., PDP-10).  A packet is thought of as  a  string  of  16-bit     bytes  and the checksum function is the one's complement sum (add     with  end-around  carry)  of  those  bytes.   It  is  the   one's     complement  of  this sum which is stored in the checksum field of     the TCP header.  Before computing the checksum value, the  sender     places  a  zero  in  the  checksum  field  of the packet.  If the     checksum value computed by a receiver of the packet is zero,  the     packet  is  assumed  to  be  valid.  This is a consequence of the     "negative" number in the checksum field  exactly  cancelling  the     contribution of the rest of the packet.     Ignoring  the  difficulty  of  actually  evaluating  the checksum     function for a given  packet,  the  way  of  using  the  checksum     described  above  is quite simple, but it assumes some properties     of the checksum operator (one's complement addition, "+" in  what     follows):       (P1)    +  is commutative.  Thus, the  order  in  which             the   16-bit   bytes   are  "added"  together  is             unimportant.       (P2)    +  has  at  least  one  identity  element  (The             current  function  has  two:  +0  and  -0).  This             allows  the  sender  to  compute   the   checksum             function by placing a zero in the packet checksum             field before computing the value.       (P3)    +  has an  inverse.   Thus,  the  receiver  may             evaluate the checksum function and expect a zero.       (P4)    +  is associative, allowing the checksum  field             to be anywhere in the packet and the 16-bit bytes             to be scanned sequentially.     Mathematically, these properties of the binary operation "+" over     the set of 16-bit numbers forms an Abelian group [5].  Of course,     there  are  many Abelian groups but not all would be satisfactory     for  use  as  checksum  operators.   (Another  operator   readily                                   - 2 -Braden, Borman, & Partridge                                    [Page 14]RFC 1071            Computing the Internet Checksum       September 1988     Internet Experiment Note  45                          5 June 1978     TCP Checksum Function Design                   William W. Plummer     available  in  the  PDP-11  instruction set that has all of these     properties is exclusive-OR, but XOR is unsatisfactory  for  other     reasons.)     Albeit imprecise, another property which must be preserved in any     future checksum scheme is:       (P5)    +  is fast to compute on a variety of  machines             with limited storage requirements.     The  current  function  is  quite  good  in this respect.  On the     PDP-11 the inner loop looks like:             LOOP:   ADD (R1)+,R0    ; Add the next 16-bit byte                     ADC R0          ; Make carry be end-around                     SOB R2,LOOP     ; Loop over entire packet.              ( 4 memory cycles per 16-bit byte )     On the PDP-10 properties  P1-4  are  exploited  further  and  two     16-bit bytes per loop are processed:     LOOP: ILDB THIS,PTR   ; Get 2 16-bit bytes           ADD SUM,THIS    ; Add into current sum           JUMPGE SUM,CHKSU2  ; Jump if fewer than 8 carries           LDB THIS,[POINT 20,SUM,19] ; Get left 16 and carries           ANDI SUM,177777 ; Save just low 16 here           ADD SUM,THIS    ; Fold in carries     CHKSU2: SOJG COUNT,LOOP ; Loop over entire packet     ( 3.1 memory cycles per 16-bit byte )     The  "extra"  instruction  in  the  loops  above  are required to     convert the two's complement  ADD  instruction(s)  into  a  one's     complement  add  by  making  the  carries  be  end-around.  One's     complement arithmetic is better than two's complement because  it     is  equally  sensitive  to errors in all bit positions.  If two's     complement addition were used, an even number  of  1's  could  be     dropped  (or  picked  up)  in  the  most  significant bit channel     without affecting the value of the checksum.   It  is  just  this     property  that makes some sort of addition preferable to a simple     exclusive-OR which is frequently used but permits an even  number     of drops (pick ups) in any bit channel.  RIM10B paper tape format     used  on PDP-10s [10] uses two's complement add because space for     the loader program is extremely limited.                                   - 3 -Braden, Borman, & Partridge                                    [Page 15]RFC 1071            Computing the Internet Checksum       September 1988     Internet Experiment Note  45                          5 June 1978     TCP Checksum Function Design                   William W. Plummer     Another property of the current checksum scheme is:       (P6)    Adding the checksum to a packet does not change             the information bytes.  Peterson [6] calls this a             "systematic" code.     This property  allows  intermediate  computers  such  as  gateway     machines  to  act  on  fields  (i.e.,  the  Internet  Destination     Address) without having to first  decode  the  packet.   Cyclical     Redundancy  Checks  used  for error correction are not systematic     either.  However, most applications of  CRCs  tend  to  emphasize     error  detection rather than correction and consequently can send     the message unchanged, with the CRC check bits being appended  to     the  end.   The  24-bit CRC used by ARPANET IMPs and Very Distant     Host Interfaces [4] and the ANSI standards for 800 and 6250  bits     per inch magnetic tapes (described in [11]) use this mode.     Note  that  the  operation  of higher level protocols are not (by     design) affected by anything that may be done by a gateway acting     on possibly invalid packets.  It is permissible for  gateways  to     validate  the  checksum  on  incoming  packets,  but  in  general     gateways will not know how to  do  this  if  the  checksum  is  a     protocol-specific feature.     A final property of the current checksum scheme which is actually     a consequence of P1 and P4 is:       (P7)    The checksum may be incrementally modified.     This  property permits an intermediate gateway to add information     to a packet, for instance a timestamp, and "add"  an  appropriate     change  to  the  checksum  field  of  the  packet.  Note that the     checksum  will  still  be  end-to-end  since  it  was  not  fully     recomputed.     3.      Product Codes     Certain  "product  codes"  are potentially useful for checksuming     purposes.  The following is a brief description of product  codes     in  the  context  of TCP.  More general treatment can be found in     Avizienis [7] and probably other more recent works.     The basic concept of this coding is that the message (packet)  to     be sent is formed by transforming the original source message and     adding  some  "check"  bits.   By  reading  this  and  applying a     (possibly different) transformation, a receiver  can  reconstruct                                   - 4 -Braden, Borman, & Partridge                                    [Page 16]RFC 1071            Computing the Internet Checksum       September 1988     Internet Experiment Note  45                          5 June 1978     TCP Checksum Function Design                   William W. Plummer     the  original  message  and  determine  if  it has been corrupted     during transmission.              Mo              Ms              Mr             -----           -----           -----             | A |  code     | 7 |   decode  | A |             | B |    ==>    | 1 |     ==>   | B |             | C |           | 4 |           | C |             -----           |...|           -----                             | 2 | check     plus "valid" flag                             ----- info             Original        Sent            Reconstructed     With product codes the transformation is  Ms = K * Mo .  That is,     the message sent is simply the product of  the  original  message     Mo   and  some  well known constant  K .  To decode, the received     Ms  is divided by  K  which will yield  Mr  as the  quotient  and     0   as the remainder if  Mr is to be considered the same as  Mo .     The first problem is selecting a "good" value for  K, the  "check     factor".   K  must  be  relatively  prime  to  the base chosen to     express  the  message.   (Example:  Binary   messages   with    K     incorrectly  chosen  to be 8.  This means that  Ms  looks exactly     like  Mo  except that three zeros have been appended.   The  only     way  the message could look bad to a receiver dividing by 8 is if     the error occurred in one of those three bits.)     For TCP the base  R  will be chosen to be 2**16.  That is,  every     16-bit byte (word on the PDP-11) will be considered as a digit of     a big number and that number is the message.  Thus,                     Mo =  SIGMA [ Bi * (R**i)]   ,   Bi is i-th byte                          i=0 to N                     Ms = K * Mo     Corrupting a single digit  of   Ms   will  yield   Ms' =  Ms +or-     C*(R**j)  for some radix position  j .  The receiver will compute     Ms'/K = Mo +or- C(R**j)/K. Since R  and  K  are relatively prime,     C*(R**j) cannot be any exact  multiple  of   K.   Therefore,  the     division will result in a non-zero remainder which indicates that     Ms'   is  a  corrupted  version  of  Ms.  As will be seen, a good     choice for  K  is (R**b - 1), for some  b  which  is  the  "check     length"  which  controls  the  degree  of detection to be had for                                   - 5 -Braden, Borman, & Partridge                                    [Page 17]RFC 1071            Computing the Internet Checksum       September 1988     Internet Experiment Note  45                          5 June 1978     TCP Checksum Function Design                   William W. Plummer     burst errors which affect a string of digits (i.e., 16-bit bytes)     in the message.  In fact  b  will be chosen to be  1, so  K  will     be  2**16 - 1 so that arithmetic operations will be simple.  This     means  that  all  bursts  of  15  or fewer bits will be detected.     According to [7] this choice for  b   results  in  the  following     expression for the fraction of undetected weight 2 errors:      f =  16(k-1)/[32(16k-3) + (6/k)]  where k is the message length.     For  large messages  f  approaches  3.125 per cent as  k  goes to     infinity.     Multiple precision multiplication and division are normally quite     complex operations, especially on small machines which  typically     lack  even  single precision multiply and divide operations.  The     exception to this is exactly the case being dealt  with  here  --     the  factor  is  2**16  - 1  on machines with a word length of 16     bits.  The reason for this is due to the following identity:             Q*(R**j)  =  Q, mod (R-1)     0 <= Q < R     That is, any digit  Q  in the selected  radix  (0,  1,  ...  R-1)     multiplied  by any power of the radix will have a remainder of  Q     when divided by the radix minus 1.     Example:  In decimal R = 10.  Pick  Q = 6.                     6  =   0 * 9  +  6  =  6, mod 9                    60  =   6 * 9  +  6  =  6, mod 9                   600  =  66 * 9  +  6  =  6, mod 9   etc.        More to the point, rem(31415/9) = rem((30000+1000+400+10+5)/9)           = (3 mod 9) + (1 mod 9) + (4 mod 9) + (1 mod 9) + (5 mod 9)           = (3+1+4+1+5) mod 9           = 14 mod 9           = 5     So, the remainder of a number divided by the radix minus one  can     be  found  by simply summing the digits of the number.  Since the     radix in the TCP case has been chosen to be  2**16 and the  check     factor is  2**16 - 1, a message can quickly be checked by summing     all  of  the  16-bit  words  (on  a  PDP-11),  with carries being     end-around.  If zero is the result, the message can be considered     valid.  Thus, checking a product coded  message  is  exactly  the     same complexity as with the current TCP checksum!                                   - 6 -Braden, Borman, & Partridge                                    [Page 18]RFC 1071            Computing the Internet Checksum       September 1988
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -