schools to manually add, subtract, multiply and divide.

\subsection{The Need for Multiple Precision Arithmetic}
The most prevalent need for multiple precision arithmetic, often referred to as ``bignum'' math, is within the implementation of public-key cryptography algorithms.  Algorithms such as RSA \cite{RSAREF} and Diffie-Hellman \cite{DHREF} require integers of significant magnitude to resist known cryptanalytic attacks.  For example, at the time of this writing a typical RSA modulus would have a magnitude of at least $10^{309}$.  However, modern programming languages such as ISO C \cite{ISOC} and Java \cite{JAVA} only provide intrinsic support for integers which are relatively small and single precision.

\begin{figure}[!here]
\begin{center}
\begin{tabular}{|r|c|}
\hline \textbf{Data Type} & \textbf{Range} \\
\hline char  & $-128 \ldots 127$ \\
\hline short & $-32768 \ldots 32767$ \\
\hline long  & $-2147483648 \ldots 2147483647$ \\
\hline long long & $-9223372036854775808 \ldots 9223372036854775807$ \\
\hline
\end{tabular}
\end{center}
\caption{Typical Data Types for the C Programming Language}
\label{fig:ISOC}
\end{figure}

The largest data type guaranteed to be provided by the ISO C programming language\footnote{As per the ISO C standard.  However, each compiler vendor is allowed to augment the precision as they see fit.} can only represent values up to approximately $10^{19}$ as shown in figure \ref{fig:ISOC}.  On its own the C language is insufficient to accommodate the magnitude required for the problem at hand.  An RSA modulus of magnitude $10^{19}$ could be trivially factored\footnote{A Pollard's Rho factorization would take only about $2^{16}$ steps.} on the average desktop computer, rendering any protocol based on the algorithm insecure.
Multiple precision algorithms solve this very problem by extending the range of representable integers while using single precision data types.

Most advancements in fast multiple precision arithmetic stem from the need for faster and more efficient cryptographic primitives.  Faster modular reduction and exponentiation algorithms such as Barrett's algorithm, which have appeared in various cryptographic journals, can render algorithms such as RSA and Diffie-Hellman more efficient.  In fact, several major companies such as RSA Security, Certicom and Entrust have built entire product lines on the implementation and deployment of efficient algorithms.

However, cryptography is not the only field of study that can benefit from fast multiple precision integer routines.  Another auxiliary use of multiple precision integers is high precision floating point data types.  The basic IEEE \cite{IEEE} standard floating point type is made up of an integer mantissa $q$, an exponent $e$ and a sign bit $s$.  Numbers are given in the form $n = (-1)^s \cdot q \cdot b^e$ where $b = 2$ is the most common base for IEEE.  Since IEEE floating point is meant to be implemented in hardware the precision of the mantissa is often fairly small (\textit{23, 52 and 64 bits}).  The mantissa is merely an integer and a multiple precision integer could be used to create a mantissa of much larger precision than hardware alone can efficiently support.  This approach could be useful where scientific applications must minimize the total output error over long calculations.

Yet another use for large integers is within arithmetic on polynomials of large characteristic (i.e. $GF(p)[x]$ for large $p$).  In fact the library discussed within this text has already been used to form a polynomial basis library\footnote{See \url{http://poly.libtomcrypt.org} for more details.}.

\subsection{Benefits of Multiple Precision Arithmetic}
\index{precision}
The benefit of multiple precision representations over single or fixed precision representations is that no precision is lost while representing the result of an operation which requires excess precision.  For example, the product of two $n$-bit integers requires up to $2n$ bits of precision to be represented faithfully.  A multiple precision algorithm would augment the precision of the destination to accommodate the result while a single precision system would truncate excess bits to maintain a fixed level of precision.

It is possible to implement algorithms which require large integers with fixed precision algorithms.  For example, elliptic curve cryptography (\textit{ECC}) is often implemented on smartcards by fixing the precision of the integers to the maximum size the system will ever need.  Such an approach can lead to vastly simpler algorithms which can accommodate the integers required even if the host platform cannot natively accommodate them\footnote{For example, the average smartcard processor has an 8 bit accumulator.}.  However, as efficient as such an approach may be, the resulting source code is not normally very flexible.  It cannot, at runtime, accommodate inputs of higher magnitude than the designer anticipated.

Multiple precision algorithms have the most overhead of any style of arithmetic.  For the most part the overhead can be kept to a minimum with careful planning, but overall, it is not well suited for most memory starved platforms.  However, multiple precision algorithms do offer the most flexibility in terms of the magnitude of the inputs.
That is, the same algorithms based on multiple precision integers can accommodate any reasonable size input without the designer's explicit forethought.  This leads to lower cost of ownership for the code as it only has to be written and tested once.

\section{Purpose of This Text}
The purpose of this text is to instruct the reader regarding how to implement efficient multiple precision algorithms.  That is to not only explain a limited subset of the core theory behind the algorithms but also the various ``housekeeping'' elements that are neglected by authors of other texts on the subject.  Several well renowned texts \cite{TAOCPV2,HAC} give considerably detailed explanations of the theoretical aspects of algorithms and often very little information regarding the practical implementation aspects.

In most cases how an algorithm is explained and how it is actually implemented are two very different concepts.  For example, the Handbook of Applied Cryptography (\textit{HAC}), algorithm 14.7 on page 594, gives a relatively simple algorithm for performing multiple precision integer addition.  However, the description lacks any discussion concerning the fact that the two integer inputs may be of differing magnitudes.  As a result the implementation is not as simple as the text would lead people to believe.  Similarly the division routine (\textit{algorithm 14.20, pp. 598}) does not discuss how to handle sign or the dividend's decreasing magnitude in the main loop (\textit{step \#3}).

Neither text discusses several key optimal algorithms required such as ``Comba'' and Karatsuba multipliers and fast modular inversion, which we consider practical oversights.  These optimal algorithms are vital to achieve any form of useful performance in non-trivial applications.

To solve this problem the focus of this text is on the practical aspects of implementing a multiple precision integer package.
As a case study the ``LibTomMath''\footnote{Available at \url{http://math.libtomcrypt.org}} package is used to demonstrate algorithms with real implementations\footnote{In the ISO C programming language.} that have been field tested and work very well.  The LibTomMath library is freely available on the Internet for all uses and this text discusses a very large portion of the inner workings of the library.

The algorithms that are presented will always include at least one ``pseudo-code'' description followed by the actual C source code that implements the algorithm.  The pseudo-code can be used to implement the same algorithm in other programming languages as the reader sees fit.  This text shall also serve as a walkthrough of the creation of multiple precision algorithms from scratch, showing the reader how the algorithms fit together as well as where to start on various tasks.

\section{Discussion and Notation}
\subsection{Notation}
A multiple precision integer of $n$-digits shall be denoted as $x = (x_{n-1}, \ldots, x_1, x_0)_{ \beta }$ and represents the integer $x \equiv \sum_{i=0}^{n-1} x_i\beta^i$.  The elements of the array $x$ are said to be the radix $\beta$ digits of the integer.  For example, $x = (1,2,3)_{10}$ would represent the integer $1\cdot 10^2 + 2\cdot10^1 + 3\cdot10^0 = 123$.

\index{mp\_int}
The term ``mp\_int'' shall refer to a composite structure which contains the digits of the integer it represents, as well as auxiliary data required to manipulate the data.  These additional members are discussed further in section \ref{sec:MPINT}.  For the purposes of this text a ``multiple precision integer'' and an ``mp\_int'' are assumed to be synonymous.  When an algorithm is specified to accept an mp\_int variable it is assumed the various auxiliary data members are present as well.

An expression of the type \textit{variablename.item} implies that it should evaluate to the member named ``item'' of the variable.
For example, a string of characters may have a member ``length'' which would evaluate to the number of characters in the string.  If the string $a$ equals ``hello'' then it follows that $a.length = 5$.

For certain discussions more generic algorithms are presented to help the reader understand the final algorithm used to solve a given problem.  When an algorithm is described as accepting an integer input it is assumed the input is a plain integer with no additional multiple precision members.  That is, algorithms that use integers as opposed to mp\_ints as inputs do not concern themselves with the housekeeping operations required such as memory management.  These algorithms will be used to establish the relevant theory which will subsequently be used to describe a multiple precision algorithm to solve the same problem.

\subsection{Precision Notation}
The variable $\beta$ represents the radix of a single digit of a multiple precision integer and must be of the form $q^p$ for $q, p \in \Z^+$.  A single precision variable must be able to represent integers in the range $0 \le x < q \beta$ while a double precision variable must be able to represent integers in the range $0 \le x < q \beta^2$.  The extra radix-$q$ factor allows additions and subtractions to proceed without truncation of the carry.  Since all modern computers are binary, it is assumed that $q$ is two.

\index{mp\_digit} \index{mp\_word}
Within the source code that will be presented for each algorithm, the data type \textbf{mp\_digit} will represent a single precision integer type, while the data type \textbf{mp\_word} will represent a double precision integer type.  In several algorithms (notably the Comba routines) temporary results will be stored in arrays of double precision mp\_words.  For the purposes of this text $x_j$ will refer to the $j$'th digit of a single precision array and $\hat x_j$ will refer to the $j$'th digit of a double precision array.
Whenever an expression is to be assigned to a double precision variable it is assumed that all single precision variables are promoted to double precision during the evaluation.  Expressions that are assigned to a single precision variable are truncated to fit within the precision of a single precision data type.

For example, if $\beta = 10^2$ a single precision data type may represent a value in the range $0 \le x < 10^3$, while a double precision data type may represent a value in the range $0 \le x < 10^5$.  Let $a = 23$ and $b = 49$ represent two single precision variables.  The single precision product shall be written as $c \leftarrow a \cdot b$ while the double precision product shall be written as $\hat c \leftarrow a \cdot b$.  In this particular case, $\hat c = 1127$ and $c = 127$.  The most significant digit of the product would not fit in a single precision data type and as a result $c \ne \hat c$.

\subsection{Algorithm Inputs and Outputs}
Within the algorithm descriptions all variables are assumed to be scalars of either single or double precision as indicated.  The only exception to this rule is when variables have been indicated to be of type mp\_int.  This distinction is important as scalars are often used as array indices and various other counters.

\subsection{Mathematical Expressions}
The $\lfloor \mbox{ } \rfloor$ brackets imply an expression truncated to an integer not greater than the expression itself.  For example, $\lfloor 5.7 \rfloor = 5$.  Similarly the $\lceil \mbox{ } \rceil$ brackets imply an expression rounded to an integer not less than the expression itself.  For example, $\lceil 5.1 \rceil = 6$.  Typically when the $/$ division symbol is used the intention is to perform an integer division with truncation.  For example, $5/2 = 2$ which will often be written as $\lfloor 5/2 \rfloor = 2$ for clarity.  When an expression is written as a fraction a real value division is implied, for example ${5 \over 2} = 2.5$.
The norm of a multiple precision integer, for example $\vert \vert x \vert \vert$, will be used to represent the number of digits in the representation of the integer.  For example, $\vert \vert 123 \vert \vert = 3$ and $\vert \vert 79452 \vert \vert = 5$.

\subsection{Work Effort}
\index{big-Oh}
To measure the efficiency of the specified algorithms, a modified big-Oh notation is used.  In this system all single precision operations are considered to have the same cost\footnote{Except where explicitly noted.}.  That is, a single precision addition, multiplication and division are assumed to take the same time to complete.  While this is generally not true in practice, it will simplify the discussions considerably.

Some algorithms have slight advantages over others which is why some constants will not be removed in the notation.  For example, a normal baseline multiplication (section \ref{sec:basemult}) requires $O(n^2)$ work while a baseline squaring (section \ref{sec:basesquare}) requires $O({{n^2 + n}\over 2})$ work.  In standard big-Oh notation these would both be said to be equivalent to $O(n^2)$.  However, in the context of this text this is not the case as the magnitude of the inputs will typically be rather small.  As a result small constant factors in the work effort will make an observable difference in algorithm efficiency.

All of the algorithms presented in this text have a polynomial time work level.  That is, they are of the form $O(n^k)$ for $n, k \in \Z^{+}$.  This will help make useful comparisons in terms of the speed of the algorithms and how various optimizations will pay off in the long run.

\section{Exercises}
Within the more advanced chapters a section will be set aside to give the reader some challenging exercises related to the discussion at hand.  These exercises are not designed to be prize winning problems, but instead to be thought provoking.
Wherever possible the problems are forward minded, stating problems that will be answered in subsequent chapters.  The reader is encouraged to finish the exercises as they appear to get a better understanding of the subject material.  That being said, the problems are designed to affirm knowledge of a particular subject matter.  Students in particular are encouraged to verify they can answer the problems correctly before moving on.

Similar to the exercises of \cite[pp. ix]{TAOCPV2} these exercises are given a scoring system based on the difficulty of the problem.  However, unlike \cite{TAOCPV2} the problems do not get nearly as hard.  The scoring of these exercises ranges from one (the easiest) to five (the hardest).  The following table summarizes the scoring system used.

\begin{figure}[here]
\begin{center}
\begin{small}
\begin{tabular}{|c|l|}
\hline $\left [ 1 \right ]$ & An easy problem that should only take the reader a matter of \\
                            & minutes to solve.  Usually does not involve much computer time \\
                            & to solve. \\
\hline $\left [ 2 \right ]$ & An easy problem that involves a marginal amount of computer \\
                            & time usage.  Usually requires a program to be written to \\
                            & solve the problem. \\
\hline $\left [ 3 \right ]$ & A moderately hard problem that requires a non-trivial amount \\
                            & of work.  Usually involves trivial research and development of \\
                            & new theory from the perspective of a student. \\
