📄 pycrypt.tex
字号:
\documentclass{howto}\title{Python Cryptography Toolkit}\release{2.0.1}\author{A.M. Kuchling}\authoraddress{\url{www.amk.ca}}\begin{document}\maketitle\begin{abstract}\noindentThe Python Cryptography Toolkit describes a package containing variouscryptographic modules for the Python programming language. Thisdocumentation assumes you have some basic knowledge about the Pythonlanguage, but not necessarily about cryptography.\end{abstract}\tableofcontents%======================================================================\section{Introduction}\subsection{Design Goals}The Python cryptography toolkit is intended to provide a reliable andstable base for writing Python programs that require cryptographicfunctions.A central goal of the author's has been to provide a simple,consistent interface for similar classes of algorithms. For example,all block cipher objects have the same methods and return values, andsupport the same feedback modes. Hash functions have a differentinterface, but it too is consistent over all the hash functionsavailable. Some of these interfaces have been codified as PythonEnhancement Proposal documents, as \pep{247}, ``API for CryptographicHash Functions'', and \pep{272}, ``API for Block EncryptionAlgorithms''. This is intended to make it easy to replace old algorithms with newer,more secure ones. If you're given a bit of portably-written Pythoncode that uses the DES encryption algorithm, you should be able to useAES instead by simply changing \code{from Crypto.Cipher import DES} to\code{from Crypto.Cipher import AES}, and changing all references to\code{DES.new()} to \code{AES.new()}. It's also fairly simple towrite your own modules that mimic this interface, thus letting you usecombinations or permutations of algorithms.Some modules are implemented in C for performance; others are writtenin Python for ease of modification. Generally, low-level functionslike ciphers and hash functions are written in C, while lessspeed-critical functions have been written in Python. This divisionmay change in future releases. When speeds are quoted in thisdocument, they were measured on a 500 MHz Pentium II running Linux.The exact speeds will obviously vary with different machines,different compilers, and the phase of the moon, but they provide acrude basis for comparison. Currently the cryptographicimplementations are acceptably fast, but not spectacularly good. Iwelcome any suggestions or patches for faster code.I have placed the code under no restrictions; you can redistribute thecode freely or commercially, in its original form or with anymodifications you make, subject to whatever local laws may apply in yourjurisdiction. Note that you still have to come to some agreement withthe holders of any patented algorithms you're using. If you'reintensively using these modules, please tell me about it; there's littleincentive for me to work on this package if I don't know of anyone usingit.I also make no guarantees as to the usefulness, correctness, or legalityof these modules, nor does their inclusion constitute an endorsement oftheir effectiveness. Many cryptographic algorithms are patented;inclusion in this package does not necessarily mean you are allowed toincorporate them in a product and sell it. Some of these algorithms mayhave been cryptanalyzed, and may no longer be secure. While I willinclude commentary on the relative security of the algorithms in thesections entitled "Security Notes", there may be more recent analysesI'm not aware of. (Or maybe I'm just clueless.) If you're implementingan important system, don't just grab things out of a toolbox and putthem together; do some research first. On the other hand, if you'rejust interested in keeping your co-workers or your relatives out of yourfiles, any of the components here could be used.This document is very much a work in progress. If you have anyquestions, comments, complaints, or suggestions, please send them to me.\subsection{Acknowledgements}Much of the code that actually implements the various cryptographicalgorithms was not written by me. I'd like to thank all the people whoimplemented them, and released their work under terms which allowed meto use their code. These individuals are credited in the relevantchapters of this documentation. Bruce Schneier's book \emph{AppliedCryptography} was also very useful in writing this toolkit; I highlyrecommend it if you're interested in learning more about cryptography.Good luck with your cryptography hacking!A.M.K.\email{comments@amk.ca}Washington DC, USAJune 2005%======================================================================\section{Crypto.Hash: Hash Functions}Hash functions take arbitrary strings as input, and produce an outputof fixed size that is dependent on the input; it should never bepossible to derive the input data given only the hash function'soutput. One simple hash function consists of simply adding togetherall the bytes of the input, and taking the result modulo 256. For ahash function to be cryptographically secure, it must be verydifficult to find two messages with the same hash value, or to find amessage with a given hash value. The simple additive hash functionfails this criterion miserably and the hash functions described belowmeet this criterion (as far as we know). Examples ofcryptographically secure hash functions include MD2, MD5, and SHA1.Hash functions can be used simply as a checksum, or, in association with apublic-key algorithm, can be used to implement digital signatures. The hashing algorithms currently implemented are:\begin{tableii}{c|l}{}{Hash function}{Digest length}\lineii{MD2}{128 bits}\lineii{MD4}{128 bits}\lineii{MD5}{128 bits}\lineii{RIPEMD}{160 bits}\lineii{SHA1}{160 bits}\lineii{SHA256}{256 bits}\end{tableii}All hashing modules share the same interface. After importing a givenhashing module, call the \function{new()} function to create a newhashing object. You can now feed arbitrary strings into the objectwith the \method{update()} method, and can ask for the hash value atany time by calling the \method{digest()} or \method{hexdigest()}methods. The \function{new()} function can also be passed an optionalstring parameter that will be immediately hashed into the object'sstate.Hash function modules define one variable:\begin{datadesc}{digest_size}An integer value; the size of the digestproduced by the hashing objects. You could also obtain this value bycreating a sample object, and taking the length of the digest stringit returns, but using \member{digest_size} is faster.\end{datadesc}The methods for hashing objects are always the following:\begin{methoddesc}{copy}{}Return a separate copy of this hashing object. An \code{update} tothis copy won't affect the original object.\end{methoddesc}\begin{methoddesc}{digest}{}Return the hash value of this hashing object, as a string containing8-bit data. The object is not altered in any way by this function;you can continue updating the object after calling this function.\end{methoddesc}\begin{methoddesc}{hexdigest}{}Return the hash value of this hashing object, as a string containingthe digest data as hexadecimal digits. The resulting string will betwice as long as that returned by \method{digest()}. The object is notaltered in any way by this function; you can continue updating theobject after calling this function.\end{methoddesc}\begin{methoddesc}{update}{arg}Update this hashing object with the string \var{arg}.\end{methoddesc}Here's an example, using the MD5 algorithm:\begin{verbatim}>>> from Crypto.Hash import MD5>>> m = MD5.new()>>> m.update('abc')>>> m.digest()'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'>>> m.hexdigest()'900150983cd24fb0d6963f7d28e17f72'\end{verbatim}\subsection{Security Notes}Hashing algorithms are broken by developing an algorithm to compute astring that produces a given hash value, or to find two messages thatproduce the same hash value. Consider an example where Alice and Bobare using digital signatures to sign a contract. Alice computes thehash value of the text of the contract and signs the hash value withher private key. Bob could then compute a different contract that hasthe same hash value, and it would appear that Alice signed that boguscontract; she'd have no way to prove otherwise. Finding such amessage by brute force takes \code{pow(2, b-1)} operations, where thehash function produces \emph{b}-bit hashes.If Bob can only find two messages with the same hash value but can'tchoose the resulting hash value, he can look for two messages withdifferent meanings, such as "I will mow Bob's lawn for $10" and "I oweBob $1,000,000", and ask Alice to sign the first, innocuous contract.This attack is easier for Bob, since finding two such messages by bruteforce will take \code{pow(2, b/2)} operations on average. However,Alice can protect herself by changing the protocol; she can simplyappend a random string to the contract before hashing and signing it;the random string can then be kept with the signature.None of the algorithms implemented here have been completely broken.There are no attacks on MD2, but it's rather slow at 1250 K/sec. MD4is faster at 44,500 K/sec but there have been some partial attacks onit. MD4 makes three iterations of a basic mixing operation; two ofthe three rounds have been cryptanalyzed, but the attack can't beextended to the full algorithm. MD5 is a strengthened version of MD4with four rounds; an attack against one round has been found XXXupdate this. MD5 is still believed secure at the moment, but peopleare gravitating toward using SHA1 in new software because there are noknown attacks against SHA1. The MD5 implementation is moderatelywell-optimized and thus faster on x86 processors, running at 35,500K/sec. MD5 may even be faster than MD4, depending on the processorand compiler you use.All the MD\var{n} algorithms produce 128-bit hashes; SHA1 produces alarger 160-bit hash, and there are no known attacks against it. Thefirst version of SHA had a weakness which was later corrected; thecode used here implements the second, corrected, version. It operatesat 21,000 K/sec. SHA256 is about as half as fast as SHA1. RIPEMD hasa 160-bit output, the same output size as SHA1, and operates at 17,600K/sec.\subsection{Credits}The MD2 and MD4 implementations were written by A.M. Kuchling, and theMD5 code was implemented by Colin Plumb. The SHA1 code was originallywritten by Peter Gutmann. The RIPEMD code was written by AntoonBosselaers, and adapted for the toolkit by Hirendra Hindocha. TheSHA256 code was written by Tom St.~Denis and is part of theLibTomCrypt library (\url{http://www.libtomcrypt.org/}); it wasadapted for the toolkit by Jeethu Rao and Taylor Boon.%======================================================================\section{Crypto.Cipher: Encryption Algorithms}Encryption algorithms transform their input data, or \dfn{plaintext},in some way that is dependent on a variable \dfn{key}, producing\dfn{ciphertext}. This transformation can easily be reversed, if (and,hopefully, only if) one knows the key. The key can be varied by theuser or application and chosen from some very large space of possiblekeys.For a secure encryption algorithm, it should be very difficult todetermine the original plaintext without knowing the key; usually, noclever attacks on the algorithm are known, so the only way of breakingthe algorithm is to try all possible keys. Since the number of possiblekeys is usually of the order of 2 to the power of 56 or 128, this is nota serious threat, although 2 to the power of 56 is now consideredinsecure in the face of custom-built parallel computers and distributedkey guessing efforts.\dfn{Block ciphers} take multibyte inputs of a fixed size(frequently 8 or 16 bytes long) and encrypt them. Block ciphers canbe operated in various modes. The simplest is Electronic Code Book(or ECB) mode. In this mode, each block of plaintext is simplyencrypted to produce the ciphertext. This mode can be dangerous,because many files will contain patterns greater than the block size;for example, the comments in a C program may contain long strings ofasterisks intended to form a box. All these identical blocks willencrypt to identical ciphertext; an adversary may be able to use thisstructure to obtain some information about the text.To eliminate this weakness, there are various feedback modes in whichthe plaintext is combined with the previous ciphertext beforeencrypting; this eliminates any repetitive structure in theciphertext. One mode is Cipher Block Chaining (CBC mode); another is CipherFeedBack (CFB mode). CBC mode still encrypts in blocks, and thus isonly slightly slower than ECB mode. CFB mode encrypts on abyte-by-byte basis, and is much slower than either of the other twomodes. The chaining feedback modes require an initialization value tostart off the encryption; this is a string of the same length as theciphering algorithm's block size, and is passed to the \code{new()}function. There is also a special PGP mode, which is an oddballvariant of CFB used by the PGP program. While you can use it innon-PGP programs, it's quite non-standard.The currently available block ciphers are listed in the following table,and are in the \code{Crypto.Cipher} package:
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -