📄 blowfishvhdl.html
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"><HTML><HEAD> <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=iso-8859-1"> <TITLE></TITLE> <META NAME="GENERATOR" CONTENT="StarOffice/5.2 (Linux)"> <META NAME="AUTHOR" CONTENT="Wesley Landaker"> <META NAME="CREATED" CONTENT="20001018;12124800"> <META NAME="CHANGEDBY" CONTENT="Wesley Landaker"> <META NAME="CHANGED" CONTENT="20001020;15263000"> <STYLE> <!-- @page { margin-left: 1.25in; margin-right: 1.25in; margin-top: 1in; margin-bottom: 1in } P { margin-bottom: 0.08in } H1 { margin-bottom: 0.08in; font-family: "helvetica", sans-serif; font-size: 16pt } H2 { margin-bottom: 0.08in; font-family: "helvetica", sans-serif; font-size: 14pt; font-style: italic } H6 { margin-bottom: 0.08in; font-family: "helvetica", sans-serif; font-size: 10pt } P.text-body-indent { margin-left: 0.2in } --> </STYLE></HEAD><BODY><p>Click here to go to the blowfishvhdl sourceforge page: <A href="http://sourceforge.net/projects/blowfishvhdl"> <IMG src="http://sourceforge.net/sflogo.php?group_id=13143" width="88" height="31" border="0" alt="SourceForge Logo"></A><p>Below is a paper I wrote when I coded up the original blowfish algorithm. I apologize in advance for any problems with this document; I wrote it up in a couple hours and never really edited it. =)<P ALIGN=RIGHT STYLE="margin-bottom: 0in"><BR></P><H1 ALIGN=CENTER>Blowfish Implementation in VHDL</H1><P ALIGN=CENTER>By Wesley J. Landaker <wjl@icecavern.net></P><P ALIGN=CENTER STYLE="margin-bottom: 0in"><BR></P><H2>Introduction</H2><P> Because something comparable was not readily and freelyavailible, I implemented the Blowfish encryption algorithm insynthesizable VHDL. My implementation is certainly not the fastestpossible implementation, and it is also not the smallest.</P><P> My implementation is somewhere middle-of-the-line, where I madesignificant tradeoffs between speed, size, and ease ofimplementation. The main focus was to make an implementation that wasusable, moderately compact, and would still run at an acceptableclock speed.</P><H2>Background</H2><P> I chose to implement Blowfish for several reasons. The Blowfishcipher is extremely strong, is completely free, and currently doesnot have any public VHDL or hardware implementation of any sortreadily available anywhere.</P><P> As described by Counterpane, the home of the algorithm, "Blowfishis a symmetric block cipher that can be used as a drop-in replacementfor DES or IDEA. It takes a variable-length key, from 32 bits to 448bits, making it ideal for both domestic and exportable use. Blowfishwas designed in 1993 by Bruce Schneier as a fast, free alternative toexisting encryption algorithms. Since then it has been analyzedconsiderably, and it is slowly gaining acceptance as a strongencryption algorithm. Blowfish is unpatented and license-free, andis available free for all uses."(<A HREF="http://www.counterpane.com/blowfish.html">http://www.counterpane.com/blowfish.html</A>)</P><H2>Implementation Details</H2><P> The Blowfish algorithm is block cipher consisting of a standardFiestel network with 16 rounds, plus some initial and finalencryption functions. The main feature that sets it off from othersimilar algorithms is that the non-linear substitution boxes (calledsboxes) are key-dependent. Because of this, there is an expensiveinitialization step required on every key change, requiring theequivalent of 520 encryptions to initialize the sboxes. Because ofthis step, any Blowfish implementation requires a substantial numberof finite state machines to control initialization and encryption.</P><P> In any implementation, there are several separate circuit piecesthat can be identified. First is the encryption core (furtherreferred to as 'crypt') that implements the actual Fiestel network.Second is the function F(xL) that crypt relies on for each round ofthe Fiestel network. Third is a generated array of sub-keys, calledthe parray, which is also used by crypt each round. Fourth are thefour key-dependent sboxes that are read by the F(xL) function alsoeach round. Fifth would be any control logic necessary to initializethe parray and sboxes.</P><P> As with all circuits, there are many ways to implement thisalgorithm. There are, however, two main paradigms for Blowfishimplementation. The first is a purely combination or fully-pipelinedcore (which both have the same problems as discussed below) with asynchronous control. With this approach, the main problem is thatsince all 16 rounds will be running simultaneously, the parray andthe sboxes must be able to have all of their data read at everylocation, every clock cycle. This translates to 1024 32-bit memoryports with 8-bit addresses, and 18 more with 5-bit addresses, whichcould be an extremely expensive size cost. This also means that thecrypt and F(xL) hardware must be duplicated 16 times. The advantagesto this approach, however, are that one data block could be processedevery clock cycle, and a new key could be initialized in 520 clockcycles.</P><P> The other obvious way to implement this algorithm is the way thatI have done it. That is, to have a single register that feeds back toitself through each round. First, this means that the parray andsboxes only need to have 1 memory port each--a total of 5 memoryports, compared to the 1042 required in the previously discussedimplementation. Second, this also means that the only hardware thathas to exist in crypt is that for a single round, and the hardwarefor F(xL) only has to exist once. The downside of this sort ofimplementation is that running optimally, one round per cycle, it cannever take less than 16 clock cycles per data block, and 8320 cyclesto initialize a key. In my implementation, I actually takeapproximately twice as long, because my sboxes and parray areimplemented synchronously. For more information about the exacttiming of my design, see the section on timing below.</P><P> The final large issue is the issue of supporting variable keylengths. It is viable for an implementation to hard-code a key lengthinto the circuit if it is known that only keys of a certain size willbe used. However, to implement the algorithm the most generally, itrequires a very large structure of muxes or something similar toroute various bits of the key to the proper places depending on thekey length. In my implementation, I have added this large muxingstructure, but if the key length is hard coded to a constant valuebefore synthesis, this hardware will not be synthesized, and aconsiderable space savings will be achieved.</P><H2>Testing</H2><P> In order to test this implementation of Blowfish, I used asoftware version of Blowfish I have written that has been checkedagainst some standard test vectors and the official referenceimplementation of Blowfish. This allowed me to run the same datathrough the software and VHDL implementations of Blowfish and comparethe results.</P><P> I tested this implementation with several key lengths between 128and 448, each with 2 or 3 random data blocks. Each block encryptedwas then decrypted to make sure that the original data was restoredcorrectly.</P><P> Although it would be nearly impossible to exhaustively test allcombinations of inputs (448 bits of key, 4 bits of key_size, and 64bits of data = 114688 bits!!), because of the nature of encryptionalgorithms (they are designed to have every output bit depend onevery input bit), several tests is usually all that is necessary todetermine that the system works correctly.</P><H2>Interfacing with Blowfish</H2><P> Interfacing with this implementation Blowfish is verystraightforward. Below is a description of the inputs and outputs andhow they are used.</P><H6>Inputs</H6><P CLASS="text-body-indent"><B>clk </B> : 1 - The input clocksignal.</P><P CLASS="text-body-indent"><B>key</B> : 448 - Theencryption/decryption key. If less than a 448 bit key is desired,this signal must be padded up to 448 bits. Typically, this paddingconsists of all 0's.</P><P CLASS="text-body-indent"><B>key_size</B> : 4 - The size of the<B>key</B> being used, in 32-bit words. Valid inputs are between 1and 14. If 0 or 15 are used, <B>key_size</B> of 14 is assumed. Forexample, 4 corresponds to a 128-bit <B>key</B>, 14 corresponds to a448-bit <B>key</B>. NOTE: If this is hard_coded before synthesis, iteliminate a huge matrix of multiplexers needed to implement keypermutations of variable size.</P><P CLASS="text-body-indent"><B>data_in</B> : 64 - The input data.In encryption mode, this is the plaintext. In decryption mode, thisis the ciphertext. This is only read when the <B>ready</B> signal isasserted.</P><P CLASS="text-body-indent"><B>new_data</B> : 1 - This signalshould be asserted when new data is presented on the <B>data_in</B>port. This signal is ignored when the circuit is not asserting <B>ready</B>.This should not be asserted at the same time as <B>new_key</B>.</P><P CLASS="text-body-indent"><B>new_key</B> : 1 - This signalshould be asserted when a new <B>key</B> or <B>key_size</B> ispresented. This signal is ignored when the circuit is not asserting<B>ready</B>. This should not be asserted at the same time as<B>new_data</B>.</P><P CLASS="text-body-indent"><B>encrypt</B> : 1 - This signaltoggles between encryption and decryption operation. <B>1</B> meansencrypt, <B>0</B> means decrypt. This signal is only read the samecycle that <B>new_data</B> is asserted and read.</P><H6>Outputs</H6><P CLASS="text-body-indent"><B>data_out </B>: 64 - The output data.In encryption mode, this is the ciphertext. In decryption mode, thisis the plaintext. This is only modified during key initialization andthe same cycle that <B>ready</B> is raised after an encryption ordecryption sequence.</P><P CLASS="text-body-indent"><B>ready</B> : 1 - This signal isasserted when the circuit is ready to receive a new data block or a<SPAN STYLE="font-weight: medium">new </SPAN><B>key</B> (or<B>key_size</B>). The circuit will ONLY read <B>new_key</B> or<B>new_data</B> when this signal is asserted. This signal alsosignifies that the <B>data_out</B> signal is valid if an encryptionor decryption sequence was previously running.</P><H6>Initialization</H6><P> To properly initialize the circuit, <B>new_key</B> and <B>new_data</B>must be low before the first rising edge of the clock. After thefirst clock cycle, the circuit will assert <B>ready</B> and it canthen be used. If this constraint is not met, the circuit will stillinitialize, but can take up to 21106 clock cycles before <B>ready</B>is asserted. However, any time that<B> ready</B> is asserted afterthe first clock cycle, the circuit is usable.</P><H6>Timing</H6><P> The following timing is given as a reference. Generally, ininterfacing with this implementation, it is not necessary to keeptrack of any timing or have any external counters--it is sufficientto assert the proper signals, then wait for the ready signal to comeback on before performing the next operation.</P><P CLASS="text-body-indent">Time from when <B>new_key</B> is assertedand read to when <B>ready</B> is asserted again (i.e. time toinitialize the algorithm with a new key): <B>21106</B> clock cycles.</P><P CLASS="text-body-indent">Time from when <B>new_data</B> isasserted and read to when <B>ready</B> is asserted again (i.e. timeto process one block of data): <B>37</B> clock cycles.</P><P CLASS="text-body-indent">(NOTE: <B>new_key</B> and <B>new_data</B>are only read when <B>ready</B> is asserted.)</P><H2>Conclusions</H2><P> My implementation of Blowfish has been designed to completely andcorrectly implement the algorithm with a focus on ease-of-design andease-of-use without sacrificing too much speed or size. Certainlybetter implementations could be used and my existing circuit coulddefinitely be optimized. However, the circuit does work as designedand follows the Blowfish algorithm specification exactly, which wasmy main goal.</P><P> This VHDL implementation will be cleaned up a bit with codecomments and some minor documentation and licensing information andthen will be released publicly under the GNU Public License, whichwill provide almost unlimited use of this VHDL model to the public.</P></BODY></HTML>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -