md5.txt

来自「MD5英文原版官方文档及源码,详细介绍了md5算法的实现过程。」· 文本代码 · 共 1,194 行 · 第 1/3 页
TXT
1,194 行
介绍MD5加密算法基本情况MD5的全称是Message-Digest Algorithm 5，在90年代初由MIT的计算机科学实验室和RSA Data Security Inc发明，经MD2、MD3和MD4发展而来。 
Message-Digest泛指字节串(Message)的Hash变换，就是把一个任意长度的字节串变换成一定长的大整数。请注意我使用了"字节串"而不是"字符串"这个词，是因为这种变换只与字节的值有关，与字符集或编码方式无关。 

MD5将任意长度的"字节串"变换成一个128bit的大整数，并且它是一个不可逆的字符串变换算法，换句话说就是，即使你看到源程序和算法描述，也无法将一个MD5的值变换回原始的字符串，从数学原理上说，是因为原始的字符串有无穷多个，这有点象不存在反函数的数学函数。 

MD5的典型应用是对一段Message(字节串)产生fingerprint(指纹)，以防止被"篡改"。举个例子，你将一段话写在一个叫readme.txt文件中，并对这个readme.txt产生一个MD5的值并记录在案，然后你可以传播这个文件给别人，别人如果修改了文件中的任何内容，你对这个文件重新计算MD5时就会发现。如果再有一个第三方的认证机构，用MD5还可以防止文件作者的"抵赖"，这就是所谓的数字签名应用。 

MD5还广泛用于加密和解密技术上，在很多操作系统中，用户的密码是以MD5值（或类似的其它算法）的方式保存的，用户Login的时候，系统是把用户输入的密码计算成MD5值，然后再去和系统中保存的MD5值进行比较，而系统并不"知道"用户的密码是什么。 

一些黑客破获这种密码的方法是一种被称为"跑字典"的方法。有两种方法得到字典，一种是日常搜集的用做密码的字符串表，另一种是用排列组合方法生成的，先用MD5程序计算出这些字典项的MD5值，然后再用目标的MD5值在这个字典中检索。 

即使假设密码的最大长度为8，同时密码只能是字母和数字，共26+26+10=62个字符，排列组合出的字典的项数则是P(62,1)+P(62,2)....+P(62,8)，那也已经是一个很天文的数字了，存储这个字典就需要TB级的磁盘组，而且这种方法还有一个前提，就是能获得目标账户的密码MD5值的情况下才可以。 

在很多电子商务和社区应用中，管理用户的Account是一种最常用的基本功能，尽管很多Application Server提供了这些基本组件，但很多应用开发者为了管理的更大的灵活性还是喜欢采用关系数据库来管理用户，懒惰的做法是用户的密码往往使用明文或简单的变换后直接保存在数据库中，因此这些用户的密码对软件开发者或系统管理员来说可以说毫无保密可言，本文的目的是介绍MD5的Java Bean的实现，同时给出用MD5来处理用户的Account密码的例子，这种方法使得管理员和程序设计者都无法看到用户的密码，尽管他们可以初始化它们。但重要的一点是对于用户密码设置习惯的保护





Network Working Group                                          R. Rivest
Request for Comments: 1321           MIT Laboratory for Computer Science
                                             and RSA Data Security, Inc.
                                                              April 1992


                     The MD5 Message-Digest Algorithm

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard.  Distribution of this memo is
   unlimited.

Acknowlegements

   We would like to thank Don Coppersmith, Burt Kaliski, Ralph Merkle,
   David Chaum, and Noam Nisan for numerous helpful comments and
   suggestions.

Table of Contents

   1. Executive Summary                                                1
   2. Terminology and Notation                                         2
   3. MD5 Algorithm Description                                        3
   4. Summary                                                          6
   5. Differences Between MD4 and MD5                                  6
   References                                                          7
   APPENDIX A - Reference Implementation                               7
   Security Considerations                                            21
   Author's Address                                                   21

1. Executive Summary

   This document describes the MD5 message-digest algorithm. The
   algorithm takes as input a message of arbitrary length and produces
   as output a 128-bit "fingerprint" or "message digest" of the input.
   It is conjectured that it is computationally infeasible to produce
   two messages having the same message digest, or to produce any
   message having a given prespecified target message digest. The MD5
   algorithm is intended for digital signature applications, where a
   large file must be "compressed" in a secure manner before being
   encrypted with a private (secret) key under a public-key cryptosystem
   such as RSA.







Rivest                                                          [Page 1]

RFC 1321              MD5 Message-Digest Algorithm            April 1992


   The MD5 algorithm is designed to be quite fast on 32-bit machines. In
   addition, the MD5 algorithm does not require any large substitution
   tables; the algorithm can be coded quite compactly.

   The MD5 algorithm is an extension of the MD4 message-digest algorithm
   1,2]. MD5 is slightly slower than MD4, but is more "conservative" in
   design. MD5 was designed because it was felt that MD4 was perhaps
   being adopted for use more quickly than justified by the existing
   critical review; because MD4 was designed to be exceptionally fast,
   it is "at the edge" in terms of risking successful cryptanalytic
   attack. MD5 backs off a bit, giving up a little in speed for a much
   greater likelihood of ultimate security. It incorporates some
   suggestions made by various reviewers, and contains additional
   optimizations. The MD5 algorithm is being placed in the public domain
   for review and possible adoption as a standard.

   For OSI-based applications, MD5's object identifier is

   md5 OBJECT IDENTIFIER ::=
     iso(1) member-body(2) US(840) rsadsi(113549) digestAlgorithm(2) 5}

   In the X.509 type AlgorithmIdentifier [3], the parameters for MD5
   should have type NULL.

2. Terminology and Notation

   In this document a "word" is a 32-bit quantity and a "byte" is an
   eight-bit quantity. A sequence of bits can be interpreted in a
   natural manner as a sequence of bytes, where each consecutive group
   of eight bits is interpreted as a byte with the high-order (most
   significant) bit of each byte listed first. Similarly, a sequence of
   bytes can be interpreted as a sequence of 32-bit words, where each
   consecutive group of four bytes is interpreted as a word with the
   low-order (least significant) byte given first.

   Let x_i denote "x sub i". If the subscript is an expression, we
   surround it in braces, as in x_{i+1}. Similarly, we use ^ for
   superscripts (exponentiation), so that x^i denotes x to the i-th
   power.

   Let the symbol "+" denote addition of words (i.e., modulo-2^32
   addition). Let X <<< s denote the 32-bit value obtained by circularly
   shifting (rotating) X left by s bit positions. Let not(X) denote the
   bit-wise complement of X, and let X v Y denote the bit-wise OR of X
   and Y. Let X xor Y denote the bit-wise XOR of X and Y, and let XY
   denote the bit-wise AND of X and Y.





Rivest                                                          [Page 2]

RFC 1321              MD5 Message-Digest Algorithm            April 1992


3. MD5 Algorithm Description

   We begin by supposing that we have a b-bit message as input, and that
   we wish to find its message digest. Here b is an arbitrary
   nonnegative integer; b may be zero, it need not be a multiple of
   eight, and it may be arbitrarily large. We imagine the bits of the
   message written down as follows:

          m_0 m_1 ... m_{b-1}

   The following five steps are performed to compute the message digest
   of the message.

3.1 Step 1. Append Padding Bits

   The message is "padded" (extended) so that its length (in bits) is
   congruent to 448, modulo 512. That is, the message is extended so
   that it is just 64 bits shy of being a multiple of 512 bits long.
   Padding is always performed, even if the length of the message is
   already congruent to 448, modulo 512.

   Padding is performed as follows: a single "1" bit is appended to the
   message, and then "0" bits are appended so that the length in bits of
   the padded message becomes congruent to 448, modulo 512. In all, at
   least one bit and at most 512 bits are appended.

3.2 Step 2. Append Length

   A 64-bit representation of b (the length of the message before the
   padding bits were added) is appended to the result of the previous
   step. In the unlikely event that b is greater than 2^64, then only
   the low-order 64 bits of b are used. (These bits are appended as two
   32-bit words and appended low-order word first in accordance with the
   previous conventions.)

   At this point the resulting message (after padding with bits and with
   b) has a length that is an exact multiple of 512 bits. Equivalently,
   this message has a length that is an exact multiple of 16 (32-bit)
   words. Let M[0 ... N-1] denote the words of the resulting message,
   where N is a multiple of 16.

3.3 Step 3. Initialize MD Buffer

   A four-word buffer (A,B,C,D) is used to compute the message digest.
   Here each of A, B, C, D is a 32-bit register. These registers are
   initialized to the following values in hexadecimal, low-order bytes
   first):




Rivest                                                          [Page 3]

RFC 1321              MD5 Message-Digest Algorithm            April 1992


          word A: 01 23 45 67
          word B: 89 ab cd ef
          word C: fe dc ba 98
          word D: 76 54 32 10

3.4 Step 4. Process Message in 16-Word Blocks

   We first define four auxiliary functions that each take as input
   three 32-bit words and produce as output one 32-bit word.

          F(X,Y,Z) = XY v not(X) Z
          G(X,Y,Z) = XZ v Y not(Z)
          H(X,Y,Z) = X xor Y xor Z
          I(X,Y,Z) = Y xor (X v not(Z))

   In each bit position F acts as a conditional: if X then Y else Z.
   The function F could have been defined using + instead of v since XY
   and not(X)Z will never have 1's in the same bit position.) It is
   interesting to note that if the bits of X, Y, and Z are independent
   and unbiased, the each bit of F(X,Y,Z) will be independent and
   unbiased.

   The functions G, H, and I are similar to the function F, in that they
   act in "bitwise parallel" to produce their output from the bits of X,
   Y, and Z, in such a manner that if the corresponding bits of X, Y,
   and Z are independent and unbiased, then each bit of G(X,Y,Z),
   H(X,Y,Z), and I(X,Y,Z) will be independent and unbiased. Note that
   the function H is the bit-wise "xor" or "parity" function of its
   inputs.

   This step uses a 64-element table T[1 ... 64] constructed from the
   sine function. Let T[i] denote the i-th element of the table, which
   is equal to the integer part of 4294967296 times abs(sin(i)), where i
   is in radians. The elements of the table are given in the appendix.

   Do the following:

   /* Process each 16-word block. */
   For i = 0 to N/16-1 do

     /* Copy block i into X. */
     For j = 0 to 15 do
       Set X[j] to M[i*16+j].
     end /* of loop on j */

     /* Save A as AA, B as BB, C as CC, and D as DD. */
     AA = A
     BB = B



Rivest                                                          [Page 4]

RFC 1321              MD5 Message-Digest Algorithm            April 1992


     CC = C
     DD = D

     /* Round 1. */
     /* Let [abcd k s i] denote the operation
          a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
     /* Do the following 16 operations. */
     [ABCD  0  7  1]  [DABC  1 12  2]  [CDAB  2 17  3]  [BCDA  3 22  4]
     [ABCD  4  7  5]  [DABC  5 12  6]  [CDAB  6 17  7]  [BCDA  7 22  8]
     [ABCD  8  7  9]  [DABC  9 12 10]  [CDAB 10 17 11]  [BCDA 11 22 12]
     [ABCD 12  7 13]  [DABC 13 12 14]  [CDAB 14 17 15]  [BCDA 15 22 16]

     /* Round 2. */
     /* Let [abcd k s i] denote the operation
          a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */
     /* Do the following 16 operations. */
     [ABCD  1  5 17]  [DABC  6  9 18]  [CDAB 11 14 19]  [BCDA  0 20 20]
     [ABCD  5  5 21]  [DABC 10  9 22]  [CDAB 15 14 23]  [BCDA  4 20 24]
     [ABCD  9  5 25]  [DABC 14  9 26]  [CDAB  3 14 27]  [BCDA  8 20 28]
     [ABCD 13  5 29]  [DABC  2  9 30]  [CDAB  7 14 31]  [BCDA 12 20 32]

     /* Round 3. */
     /* Let [abcd k s t] denote the operation
          a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */
     /* Do the following 16 operations. */
     [ABCD  5  4 33]  [DABC  8 11 34]  [CDAB 11 16 35]  [BCDA 14 23 36]
     [ABCD  1  4 37]  [DABC  4 11 38]  [CDAB  7 16 39]  [BCDA 10 23 40]
     [ABCD 13  4 41]  [DABC  0 11 42]  [CDAB  3 16 43]  [BCDA  6 23 44]
     [ABCD  9  4 45]  [DABC 12 11 46]  [CDAB 15 16 47]  [BCDA  2 23 48]

     /* Round 4. */
     /* Let [abcd k s t] denote the operation
          a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */
     /* Do the following 16 operations. */
     [ABCD  0  6 49]  [DABC  7 10 50]  [CDAB 14 15 51]  [BCDA  5 21 52]
     [ABCD 12  6 53]  [DABC  3 10 54]  [CDAB 10 15 55]  [BCDA  1 21 56]
     [ABCD  8  6 57]  [DABC 15 10 58]  [CDAB  6 15 59]  [BCDA 13 21 60]
     [ABCD  4  6 61]  [DABC 11 10 62]  [CDAB  2 15 63]  [BCDA  9 21 64]

     /* Then perform the following additions. (That is increment each
        of the four registers by the value it had before this block
        was started.) */
     A = A + AA
     B = B + BB
     C = C + CC
     D = D + DD

   end /* of loop on i */



Rivest                                                          [Page 5]

RFC 1321              MD5 Message-Digest Algorithm            April 1992


3.5 Step 5. Output

   The message digest produced as output is A, B, C, D. That is, we
   begin with the low-order byte of A, and end with the high-order byte
   of D.

   This completes the description of MD5. A reference implementation in
   C is given in the appendix.

4. Summary

   The MD5 message-digest algorithm is simple to implement, and provides
   a "fingerprint" or message digest of a message of arbitrary length.
   It is conjectured that the difficulty of coming up with two messages
   having the same message digest is on the order of 2^64 operations,
   and that the difficulty of coming up with any message having a given
   message digest is on the order of 2^128 operations. The MD5 algorithm
   has been carefully scrutinized for weaknesses. It is, however, a
   relatively new algorithm and further security analysis is of course
   justified, as is the case with any new proposal of this sort.

5. Differences Between MD4 and MD5

     The following are the differences between MD4 and MD5:

       1.   A fourth round has been added.

       2.   Each step now has a unique additive constant.

       3.   The function g in round 2 was changed from (XY v XZ v YZ) to
       (XZ v Y not(Z)) to make g less symmetric.

       4.   Each step now adds in the result of the previous step.  This
       promotes a faster "avalanche effect".

       5.   The order in which input words are accessed in rounds 2 and
       3 is changed, to make these patterns less like each other.

       6.   The shift amounts in each round have been approximately
       optimized, to yield a faster "avalanche effect." The shifts in
       different rounds are distinct.










Rivest                                                          [Page 6]

RFC 1321              MD5 Message-Digest Algorithm            April 1992


References

   [1] Rivest, R., "The MD4 Message Digest Algorithm", RFC 1320, MIT and
       RSA Data Security, Inc., April 1992.

   [2] Rivest, R., "The MD4 message digest algorithm", in A.J.  Menezes
       and S.A. Vanstone, editors, Advances in Cryptology - CRYPTO '90
       Proceedings, pages 303-311, Springer-Verlag, 1991.

   [3] CCITT Recommendation X.509 (1988), "The Directory -
       Authentication Framework."

APPENDIX A - Reference Implementation

   This appendix contains the following files taken from RSAREF: A
   Cryptographic Toolkit for Privacy-Enhanced Mail:

     global.h -- global header file

     md5.h -- header file for MD5

     md5c.c -- source code for MD5

   For more information on RSAREF, send email to <rsaref@rsa.com>.

   The appendix also includes the following file:

     mddriver.c -- test driver for MD2, MD4 and MD5

   The driver compiles for MD5 by default but can compile for MD2 or MD4
   if the symbol MD is defined on the C compiler command line as 2 or 4.

   The implementation is portable and should work on many different
   plaforms. However, it is not difficult to optimize the implementation
   on particular platforms, an exercise left to the reader. For example,
   on "little-endian" platforms where the lowest-addressed byte in a 32-
   bit word is the least significant and there are no alignment
   restrictions, the call to Decode in MD5Transform can be replaced with
   a typecast.

A.1 global.h

/* GLOBAL.H - RSAREF types and constants
md5.txt - 源码说明

本页面展示了「MD5英文原版官方文档及源码,详细介绍了md5算法的实现过程。」中的 md5.txt 源码文件，采用文本编程语言编写，共 1,194 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫开发者社区收录了大量与MD5算法相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?