📄 rfc1071.txt
字号:
Internet Experiment Note 45 5 June 1978 TCP Checksum Function Design William W. Plummer In order to form Ms, the sender must multiply the multiple precision "number" Mo by 2**16 - 1. Or, Ms = (2**16)Mo - Mo. This is performed by shifting Mo one whole word's worth of precision and subtracting Mo. Since carries must propagate between digits, but it is only the current digit which is of interest, one's complement arithmetic is used. (2**16)Mo = Mo0 + Mo1 + Mo2 + ... + MoX + 0 - Mo = - ( Mo0 + Mo1 + ......... + MoX) --------- ---------------------------------- Ms = Ms0 + Ms1 + ... - MoX A loop which implements this function on a PDP-11 might look like: LOOP: MOV -2(R2),R0 ; Next byte of (2**16)Mo SBC R0 ; Propagate carries from last SUB SUB (R2)+,R0 ; Subtract byte of Mo MOV R0,(R3)+ ; Store in Ms SOB R1,LOOP ; Loop over entire message ; 8 memory cycles per 16-bit byte Note that the coding procedure is not done in-place since it is not systematic. In general the original copy, Mo, will have to be retained by the sender for retransmission purposes and therefore must remain readable. Thus the MOV R0,(R3)+ is required which accounts for 2 of the 8 memory cycles per loop. The coding procedure will add exactly one 16-bit word to the message since Ms < (2**16)Mo . This additional 16 bits will be at the tail of the message, but may be moved into the defined location in the TCP header immediately before transmission. The receiver will have to undo this to put Ms back into standard format before decoding the message. The code in the receiver for fully decoding the message may be inferred by observing that any word in Ms contains the difference between two successive words of Mo minus the carries from the previous word, and the low order word contains minus the low word of Mo. So the low order (i.e., rightmost) word of Mr is just the negative of the low order byte of Ms. The next word of Mr is the next word of Ms plus the just computed word of Mr plus the carry from that previous computation. A slight refinement of the procedure is required in order to protect against an all-zero message passing to the destination. This will appear to have a valid checksum because Ms'/K = 0/K - 7 -Braden, Borman, & Partridge [Page 19]RFC 1071 Computing the Internet Checksum September 1988 Internet Experiment Note 45 5 June 1978 TCP Checksum Function Design William W. Plummer = 0 with 0 remainder. The refinement is to make the coding be Ms = K*Mo + C where C is some arbitrary, well-known constant. Adding this constant requires a second pass over the message, but this will typically be very short since it can stop as soon as carries stop propagating. Chosing C = 1 is sufficient in most cases. The product code checksum must be evaluated in terms of the desired properties P1 - P7. It has been shown that a factor of two more machine cycles are consumed in computing or verifying a product code checksum (P5 satisfied?). Although the code is not systematic, the checksum can be verified quickly without decoding the message. If the Internet Destination Address is located at the least significant end of the packet (where the product code computation begins) then it is possible for a gateway to decode only enough of the message to see this field without having to decode the entire message. Thus, P6 is at least partially satisfied. The algebraic properties P1 through P4 are not satisfied, but only a small amount of computation is needed to account for this -- the message needs to be reformatted as previously mentioned. P7 is satisfied since the product code checksum can be incrementally updated to account for an added word, although the procedure is somewhat involved. Imagine that the original message has two halves, H1 and H2. Thus, Mo = H1*(R**j) + H2. The timestamp word is to be inserted between these halves to form a modified Mo' = H1*(R**(j+1)) + T*(R**j) + H2. Since K has been chosen to be R-1, the transmitted message Ms' = Mo'(R-1). Then, Ms' = Ms*R + T(R-1)(R**j) + P2((R-1)**2) = Ms*R + T*(R**(j+1)) + T*(R**j) + P2*(R**2) - 2*P2*R - P2 Recalling that R is 2**16, the word size on the PDP-11, multiplying by R means copying down one word in memory. So, the first term of Ms' is simply the unmodified message copied down one word. The next term is the new data T added into the Ms' being formed beginning at the (j+1)th word. The addition is fairly easy here since after adding in T all that is left is propagating the carry, and that can stop as soon as no carry is produced. The other terms can be handle similarly. - 8 -Braden, Borman, & Partridge [Page 20]RFC 1071 Computing the Internet Checksum September 1988 Internet Experiment Note 45 5 June 1978 TCP Checksum Function Design William W. Plummer 4. More Complicated Codes There exists a wealth of theory on error detecting and correcting codes. Peterson [6] is an excellent reference. Most of these "CRC" schemes are designed to be implemented using a shift register with a feedback network composed of exclusive-ORs. Simulating such a logic circuit with a program would be too slow to be useful unless some programming trick is discovered. One such trick has been proposed by Kirstein [8]. Basically, a few bits (four or eight) of the current shift register state are combined with bits from the input stream (from Mo) and the result is used as an index to a table which yields the new shift register state and, if the code is not systematic, bits for the output stream (Ms). A trial coding of an especially "good" CRC function using four-bit bytes showed showed this technique to be about four times as slow as the current checksum function. This was true for both the PDP-10 and PDP-11 machines. Of the desirable properties listed above, CRC schemes satisfy only P3 (It has an inverse.), and P6 (It is systematic.). Placement of the checksum field in the packet is critical and the CRC cannot be incrementally modified. Although the bulk of coding theory deals with binary codes, most of the theory works if the alphabet contains q symbols, where q is a power of a prime number. For instance q taken as 2**16 should make a great deal of the theory useful on a word-by-word basis. 5. Outboard Processing When a function such as computing an involved checksum requires extensive processing, one solution is to put that processing into an outboard processor. In this way "encode message" and "decode message" become single instructions which do not tax the main host processor. The Digital Equipment Corporation VAX/780 computer is equipped with special hardware for generating and checking CRCs [13]. In general this is not a very good solution since such a processor must be constructed for every different host machine which uses TCP messages. It is conceivable that the gateway functions for a large host may be performed entirely in an "Internet Frontend Machine". This machine would be responsible for forwarding packets received - 9 -Braden, Borman, & Partridge [Page 21]RFC 1071 Computing the Internet Checksum September 1988 Internet Experiment Note 45 5 June 1978 TCP Checksum Function Design William W. Plummer either from the network(s) or from the Internet protocol modules in the connected host, and for reassembling Internet fragments into segments and passing these to the host. Another capability of this machine would be to check the checksum so that the segments given to the host are known to be valid at the time they leave the frontend. Since computer cycles are assumed to be both inexpensive and available in the frontend, this seems reasonable. The problem with attempting to validate checksums in the frontend is that it destroys the end-to-end character of the checksum. If anything, this is the most powerful feature of the TCP checksum! There is a way to make the host-to-frontend link be covered by the end-to-end checksum. A separate, small protocol must be developed to cover this link. After having validated an incoming packet from the network, the frontend would pass it to the host saying "here is an Internet segment for you. Call it #123". The host would save this segment, and send a copy back to the frontend saying, "Here is what you gave me as #123. Is it OK?". The frontend would then do a word-by-word comparison with the first transmission, and tell the host either "Here is #123 again", or "You did indeed receive #123 properly. Release it to the appropriate module for further processing." The headers on the messages crossing the host-frontend link would most likely be covered by a fairly strong checksum so that information like which function is being performed and the message reference numbers are reliable. These headers would be quite short, maybe only sixteen bits, so the checksum could be quite strong. The bulk of the message would not be checksumed of course. The reason this scheme reduces the computing burden on the host is that all that is required in order to validate the message using the end-to-end checksum is to send it back to the frontend machine. In the case of the PDP-10, this requires only 0.5 memory cycles per 16-bit byte of Internet message, and only a few processor cycles to setup the required transfers. 6. Conclusions There is an ordering of checksum functions: first and simplest is none at all which provides no error detection or correction. Second, is sending a constant which is checked by the receiver. This also is extremely weak. Third, the exclusive-OR of the data may be sent. XOR takes the minimal amount of computer time to generate and check, but is not a good checksum. A two's complement sum of the data is somewhat better and takes no more - 10 -Braden, Borman, & Partridge [Page 22]RFC 1071 Computing the Internet Checksum September 1988 Internet Experiment Note 45 5 June 1978 TCP Checksum Function Design William W. Plummer computer time to compute. Fifth, is the one's complement sum which is what is currently used by TCP. It is slightly more expensive in terms of computer time. The next step is a product code. The product code is strongly related to one's complement sum, takes still more computer time to use, provides a bit more protection against common hardware failures, but has some objectionable properties. Next is a genuine CRC polynomial code, used for checking purposes only. This is very expensive for a program to implement. Finally, a full CRC error correcting and detecting scheme may be used. For TCP and Internet applications the product code scheme is viable. It suffers mainly in that messages must be (at least partially) decoded by intermediate gateways in order that they can be forwarded. Should product codes not be chosen as an improved checksum, some slight modification to the existing scheme might be possible. For instance the "add and rotate" function used for paper tape by the PDP-6/10 group at the Artificial Intelligence Laboratory at M.I.T. Project MAC [12] could be useful if it can be proved that it is better than the current scheme and that it can be computed efficiently on a variety of machines. - 11 -Braden, Borman, & Partridge [Page 23]RFC 1071 Computing the Internet Checksum September 1988 Internet Experiment Note 45 5 June 1978 TCP Checksum Function Design William W. Plummer References [1] Cerf, V.G. and Kahn, Robert E., "A Protocol for Packet Network Communications," IEEE Transactions on Communications, vol. COM-22, No. 5, May 1974. [2] Kahn, Robert E., "The Organization of Computer Resources into a Packet Radio Network", IEEE Transactions on Communications, vol. COM-25, no. 1, pp. 169-178, January 1977. [3] Jacobs, Irwin, et al., "CPODA - A Demand Assignment Protocol for SatNet", Fifth Data Communications Symposium, September 27-9, 1977, Snowbird, Utah [4] Bolt Beranek and Newman, Inc. "Specifications for the Interconnection of a Host and an IMP", Report 1822, January 1976 edition. [5] Dean, Richard A., "Elements of Abstract Algebra", John Wyley and Sons, Inc., 1966 [6] Peterson, W. Wesley, "Error Correcting Codes", M.I.T. Press Cambridge MA, 4th edition, 1968. [7] Avizienis, Algirdas, "A Study of the Effectiveness of Fault- Detecting Codes for Binary Arithmetic", Jet Propulsion Laboratory Technical Report No. 32-711, September 1, 1965. [8] Kirstein, Peter, private communication [9] Cerf, V. G. and Postel, Jonathan B., "Specification of Internetwork Transmission Control Program Version 3", University of Southern California Information Sciences Institute, January 1978. [10] Digital Equipment Corporation, "PDP-10 Reference Handbook", 1970, pp. 114-5. [11] Swanson, Robert, "Understanding Cyclic Redundancy Codes", Computer Design, November, 1975, pp. 93-99. [12] Clements, Robert C., private communication. [13] Conklin, Peter F., and Rodgers, David P., "Advanced Minicomputer Designed by Team Evaluation of Hardware/Software Tradeoffs", Computer Design, April 1978, pp. 136-7. - 12 -Braden, Borman, & Partridge [Page 24]
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -